Commit Graph

427 Commits (81d9e2353217182669a623e1e0503ccc31fbb159)

Author SHA1 Message Date
Michael Peter Christen 81bb50118e found and fixed a huge memory leak in solr caching (inside Solr). The
11 years ago
Michael Peter Christen 7f768b42d3 we do not need the load-image flag any more since this is now controlled
11 years ago
Michael Peter Christen f1bfe64361 integrated startpage to compare_yacy
11 years ago
Michael Peter Christen 9bb7eab389 hacks to prevent storage of data longer than necessary during search and
11 years ago
orbiter 3c3cb78555 - removed a lot of garbage and bloated code from GuiHandler.
11 years ago
Michael Peter Christen 6aabc4e5c8 reduced logging line memory, 10000 lines had filled up 450MB! grrr.
11 years ago
Michael Peter Christen 1b4fa2947d - fixed a problem which occurred when a document was not recognized with
11 years ago
Michael Peter Christen 820b896146 Replaced the inframe loading from yacy.net for donations with the
11 years ago
Michael Peter Christen 90c8577840 enhanced ranking; patches to replace old ranking
11 years ago
Michael Peter Christen 1b61bd40ed - Added new solr field url_file_name_tokens_t which stores the file name
11 years ago
orbiter 5f5a97bafc added the anchor text within web pages to the searchable entities of a
11 years ago
Michael Peter Christen 21aa6a0321 migration to Solr 4.5.0
11 years ago
Michael Peter Christen b28d43decc added two more fields source_cr_host_norm_i,target_cr_host_norm_i in
11 years ago
Michael Peter Christen 4f83d5f18c added the new field harvestkey_s to the collection index and the
11 years ago
orbiter 8ac2e8c8c9 added location navigator which causes that the image to the map search
11 years ago
Michael Peter Christen 61c5e40687 - replaced the properties object in AnchorURL with distinct variables
11 years ago
Michael Peter Christen 85456f46b2 added two new fields, exact_signature_copycount_i and
11 years ago
Michael Peter Christen a2511b5600 turned images_alt_txt back to images_alt_sxt because it is not necessary
11 years ago
Michael Peter Christen 69f85265e1 added an option to put image links to the crawl queue and handle these
11 years ago
orbiter f106345eef link strings should not be tokenized
11 years ago
orbiter deadeb406e image alt tag strings should be tokenized
11 years ago
Michael Peter Christen 1a3e42eca4 index migration to lucene 4.4
11 years ago
Michael Peter Christen 765943a4b7 Redesign of crawler identification and robots steering. A non-p2p user
11 years ago
sixcooler 1bc6003057 raise autoCommit maxTime to 3 minutes to reduce IO
11 years ago
orbiter 944ae5686c added donation plea to the about box as default (you can replace this in
11 years ago
Michael Peter Christen 58fe986cca Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
11 years ago
Michael Peter Christen cf12835f20 replaced the single-text description solr field with a multi-value
11 years ago
orbiter e7fcb81cea we should not do too much greedylearning at this time as we don't have
11 years ago
orbiter bf0ad04e1b apply load limitation also to dht-in
11 years ago
orbiter f50b596e0b do not run dht distribution if system load is over 2.5
11 years ago
orbiter e24016e30a added the property federated.service.solr.indexing.timeout to yacy.init
11 years ago
Roland Haeder 98e10f95e2 Added some cora package loggers
11 years ago
orbiter 1b43e02b86 Merge branch 'master' of git://gitorious.org/~quix0r/yacy/quix0rs-yacy-rc1
12 years ago
orbiter a548354c71 replaced type of solr schema object sku of text_en_splitting_tight by
12 years ago
Roland Haeder ebbb3bc5c1 Fixed CHMOD on many files + added missing loggers (e.g. jena) and made some noisy loggers quiet
12 years ago
orbiter e609ec388a metager whitelist update
12 years ago
Michael Peter Christen 2716dfc46c increase crawler speed by reduction of the busysleep time
12 years ago
Michael Peter Christen 57ffdfad4c added a crawl option to obey html-meta-robots-noindex. This is on by
12 years ago
Michael Peter Christen 5a5d411ec0 new robots_i attribute fields
12 years ago
orbiter 7c6ccc426c set crawlingQ to true by default because most webpages are dynamic and
12 years ago
Michael Peter Christen 16d1d744fa added url_file_name_s in default collection schema for the file name
12 years ago
orbiter 8792e6c6e9 stub for better image indexing
12 years ago
Michael Peter Christen 570511f3c8 removed fields references_internal_id_sxt and
12 years ago
Michael Peter Christen fd1776a3b0 added a new 'Citations' function: each search result item can now be
12 years ago
Michael Peter Christen 7754a1263b switching back to the merge factor 10; the solr default.
12 years ago
Michael Peter Christen 1762911f57 added synchronizations and timeouts in solr api; missing
12 years ago
Michael Peter Christen 959ccc4675 increased the solr merge factor because 4 was too much IO load for
12 years ago
Michael Peter Christen 20fab1feb6 allip net has greedy learning disabled
12 years ago
Michael Peter Christen 6115bef335 added a 'greedy learning' mechanism which will cause that a 'fresh'
12 years ago
Michael Peter Christen 856e5c42ae the line "Web Search by the People, for the People" is more generic for
12 years ago