Commit Graph

346 Commits (2d9e577ad07c3fbb0c8bb1abec578a03ab31db24)

Author SHA1 Message Date
Michael Peter Christen d09d9f2364 filter old peers from bootstrap (now stronger: 60 minutes instead of 13 years ago
Michael Peter Christen b0c408788b made class methods static where possible 13 years ago
Michael Peter Christen 7c1ba99755 removed more unused method parameters 13 years ago
Michael Peter Christen 0301aba1e9 removed unused method parameters 13 years ago
Michael Peter Christen 241dd8410a removed snippet pattern filter - it was not used 13 years ago
Michael Peter Christen d3964253ae - added @SuppressWarnings to unused servlet method parameters 13 years ago
Michael Peter Christen ea10766bfd cleaned unnecessary nested code 13 years ago
orbiter fc0f9543fe More SentenceReader cleanup 13 years ago
orbiter d4291ac1f3 more tolerance when creating solr document 13 years ago
orbiter 78fc3cf8f8 refactoring and new usage of SentenceReader: this class appeared as one 13 years ago
Michael Peter Christen 613b45f604 - better data structures in secondary search 13 years ago
Michael Peter Christen de903a53a0 parser refactoring & hacks 13 years ago
Michael Peter Christen 8a82609360 - smaller caches to save memory 13 years ago
Michael Peter Christen 7249d9c9de bugfix for concurrent seed loader 13 years ago
Michael Peter Christen c72d3b12cd concurrently initialize the seed list during p2p network bootstrap 13 years ago
Michael Peter Christen 1825f165b8 better integration of blacklist according to use case 13 years ago
Michael Peter Christen c18fa9fa75 Merge branch 'master' of git://gitorious.org/~reger/yacy/bbyacy-rc1 13 years ago
Michael Peter Christen ce8d4b87d9 fixes for new eclipse 'Juno' warning 'Resource leak'. 13 years ago
Michael Peter Christen 0c345d1559 giving threads name so it's easier to see what's happening during 13 years ago
reger 067728bccc add search result heuristic. adding a crawl job with depth-1 for every displayed search result (crawling every external linked page of displayed search result pages) 13 years ago
Michael Peter Christen 03280fb161 removed segments-concept and the Segments class: 13 years ago
Michael Peter Christen 508a81b86c added solr field 'refresh_s' which stores the refresh url contained in 13 years ago
Michael Peter Christen 9116013c64 - allow lazy initialization of solr value (if using 'lazy', then no 13 years ago
Michael Peter Christen 0294a53459 - add canonical field only if requested by solr schema 13 years ago
Michael Peter Christen 3fd4a01286 added option to record urls that are forwarded to the solr index 13 years ago
Michael Peter Christen 96aeb127e3 generalized localhost naming. 13 years ago
Michael Peter Christen 77f795756c fixing redirects and status codes: storing of status code in 13 years ago
Michael Peter Christen 8dd469b9dd added option to configure the autocommit delay time of solr on-the-fly 13 years ago
Michael Peter Christen b9dfca4b0a - fixed IndexFederated Servlet / an embedded Solr can now be selected 13 years ago
Michael Peter Christen fad3b14813 added jetty libraries, needed for future use as web server and as 13 years ago
Michael Peter Christen a38b0a2c46 extended embedded solr tests to ensure that it will be usable within a 13 years ago
Michael Peter Christen b9d42fd9c8 using com.google.common.io.Files instead of homebrew methods 13 years ago
Michael Peter Christen a5eb91fa60 refactoring 13 years ago
Michael Peter Christen 1be0025a9c - added test for EmbeddedSolrConnector 13 years ago
Michael Peter Christen e12bb254b4 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 13 years ago
Michael Peter Christen 3f55dc7c1e - added solr core and libraries that solr needs (lucene is missing, will 13 years ago
Michael Peter Christen 786be7d175 better integration of RDFaParser 13 years ago
Michael Peter Christen 0752983fbd - automatic periodic saving of triplestore 13 years ago
Michael Peter Christen 9264d8b4af removed old navigation practice using subject tags in favor of 13 years ago
Michael Peter Christen 64c0268b2b show triplestore metadata in yacydoc and viewfile 13 years ago
cominch a95127c9af Triplestore: initialize per-user triplestores 13 years ago
Michael Peter Christen e89747bb67 - added automated generation of vocabularies from url stubs 13 years ago
Michael Peter Christen 8b53771db2 changed behavior of navigation processing: 13 years ago
Michael Peter Christen 5fc6524ca8 - moved triple store to net.yacy.cora.lod (should be generalized there 13 years ago
Michael Peter Christen 4ee6fb1de9 added missing blacklist dht cache storage (maybe due to mistakes in 13 years ago
Roland 'Quix0r' Haeder edaa09b9b1 Rewrote all String blacklist types to enum 'BlacklistType', closes bug 13 years ago
Roland 'Quix0r' Haeder af5a597e47 Scroogle is not coming back, remove dead code 13 years ago
cominch 65c5826d93 bugfix 13 years ago
Michael Peter Christen cde20911bb saved a bit more ram using UTF8 String compression for OpenGeoDB and 13 years ago
Michael Peter Christen 2280a7b276 - changed initialization order to prefer allocation of memory for table 13 years ago
Michael Peter Christen 0746308bc2 only the metadata tables shall be able to use the tail cache 13 years ago
Michael Peter Christen 41c02cb10e - less restrictions for usage of Table RAM copy 13 years ago
Michael Peter Christen dd14b19c26 lazy initialization of block rank table ... only normal web search uses 13 years ago
Michael Peter Christen 701b9a28a0 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 13 years ago
Michael Peter Christen ab7107b34b fixed RWIProcess queue limits: now discovering hidden results for mass 13 years ago
Michael Peter Christen b0095c8d3c flush the compressor cache when a cleanup is done 13 years ago
Michael Peter Christen a61f44f9e4 lazy initialization of block rank table. 13 years ago
Michael Peter Christen 96e9d77270 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 13 years ago
Michael Peter Christen 00f2df1120 a variety of possible memory leak fixes 13 years ago
Michael Peter Christen d0ec8018f5 fixes for bad long computation 13 years ago
Michael Peter Christen 461a0ce052 removed warnings 13 years ago
Michael Peter Christen 407fdf6968 more bug fixes and performance hacks for search process 13 years ago
Michael Peter Christen a1fe65b115 performance hacks 13 years ago
Michael Peter Christen 2fe207f813 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 13 years ago
Michael Peter Christen 5e562dcdb7 adopted vocabulary usage within annotation/navigation feature of search 13 years ago
Michael Peter Christen 240045cf7c fix for bad distance computation 13 years ago
Michael Peter Christen e0d8643226 - performance hacks 13 years ago
Michael Peter Christen 9b4c699526 enhanced location search: 13 years ago
Michael Peter Christen 834dc6b263 store more data from interface access 13 years ago
Michael Peter Christen 10da7335ea performance hack: use a hash cache for all hashes that are computed by a 13 years ago
Michael Peter Christen 7c1feefb28 introduced a default 10 second time-out in rwi normalization time 13 years ago
Michael Peter Christen c846e9ca14 redesign of the crawler monitor page: show crawled pages instead of 13 years ago
Michael Peter Christen c15fcde1c8 add-on to latest commit 13 years ago
Michael Peter Christen cf47d94888 performance hack to parse numbers inside of substrings without actually 13 years ago
Michael Peter Christen 7e0ddbd275 added a "fromCache" flag in Response object to omit one cache.has() 13 years ago
Michael Peter Christen 7bf421b9dd - fixed image search page navigation 13 years ago
Michael Peter Christen fb94b47b1a changed queue sizes to have less memory occupied during indexing 13 years ago
Michael Peter Christen 76157dc2c3 bugfix for http://bugs.yacy.net/view.php?id=173 13 years ago
Michael Peter Christen c6558cba08 more classification bugs 13 years ago
Michael Peter Christen 082831b9d6 search contentdom was checked in wrong way - fixed 13 years ago
reger ee553d971e correct typo in scripts_txt comment 13 years ago
Michael Peter Christen f294f2e295 bugfix to http://bugs.yacy.net/view.php?id=181 13 years ago
Michael Peter Christen acf8d521a2 fix for http://bugs.yacy.net/view.php?id=126 13 years ago
Michael Peter Christen bb88878b4d the last commit was incomplete.. 13 years ago
Michael Peter Christen d320a31ae1 bugfix for http://bugs.yacy.net/view.php?id=186 13 years ago
Michael Peter Christen 3e1bc9477f Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 13 years ago
Roland 'Quix0r' Haeder d10627d591 More sync in close() methods 13 years ago
Roland 'Quix0r' Haeder b3ae2aa41f With or without 'final'? At least please try it in other methods 13 years ago
Roland 'Quix0r' Haeder fbb946f913 Made a method static (Eclipse suggested it), removed unused import, pk=null check does now output a warning in logfile 13 years ago
Michael Peter Christen 52d307c735 prevent that the snippet fetch process removes catchall entries 13 years ago
Michael Peter Christen 89142d1e8d removed (not all) warnings 13 years ago
Michael Peter Christen 5deebd02ea added serialization 13 years ago
reger b2175ea4ef Add possibility to set custom Solr field names for the YaCy default Solr attributes. 13 years ago
Michael Peter Christen e7e381d110 added configuration to switch off redirection following in crawler 13 years ago
Michael Peter Christen 2717c1b749 fixed bug in solr interface 13 years ago
Michael Peter Christen f150bc218b fixed bug in solr error document 13 years ago
Michael Peter Christen cb54c1737b solrj connector bugfix 13 years ago
Roland 'Quix0r' Haeder a093ccf5eb Now used synchronization in all close() methods to make sure all objects 13 years ago
Michael Peter Christen 0d58fea210 made multiple connector default 13 years ago
Michael Peter Christen adeb33bb36 better abstraction for solr objects 13 years ago
Michael Peter Christen 8864141872 more abstraction in solr connection classes 13 years ago
Michael Peter Christen c00efc2717 made the solr connection more generic 13 years ago
Michael Peter Christen ea2bd43b28 patch for broken configurations 13 years ago
Michael Peter Christen ba6aaabc51 refactoring + parser bugfixes 13 years ago
Michael Peter Christen 453010bd68 - solved problems with backpath normalization 13 years ago
Michael Peter Christen 5f5ed33ed8 patch for media search (audio, video apps) 13 years ago
Michael Peter Christen 19efbf1b0f - apply directDocByURL to NOLOAD Queue 13 years ago
Michael Peter Christen 659178942f - Redesigned crawler and parser to accept embedded links from the NOLOAD 13 years ago
Michael Peter Christen a3badd3205 changed search process for images: no more media snippet load process, 13 years ago
Michael Peter Christen f8cd57c92f new indexing strategy: ALL links that appear anywhere are indexed, not 13 years ago
Michael Peter Christen 14f67f217c refactoring of ContentDomain: now subclass of Classification 13 years ago
Michael Peter Christen a1a5b015d8 refactoring: moved document Classification to cora package 13 years ago
Michael Peter Christen 33d1062c79 refactoring: the cache belongs to the crawler 13 years ago
Michael Peter Christen 7b5b9baee0 added citation rank to ranking profile 13 years ago
Michael Christen 02e4dedff2 fix to url citation collection 13 years ago
Michael Christen e32055aa15 added stub classes for 13 years ago
Michael Christen ac5d124ee0 experimental implementation of a citation ranking as post-ranking 13 years ago
Michael Christen 8fc86fe397 added storage of full anchor link structure: 13 years ago
Lotus 0b3f39136e allow custom ppm lower than minimum button on /Crawler_p.html 13 years ago
Michael Peter Christen 8aba045ba1 if a new pop-up page is set in config portal, then this page applies 13 years ago
Michael Peter Christen 36e4d82b27 changed ranking 13 years ago
Michael Peter Christen 096c17e7cd added test code 13 years ago
Michael Peter Christen 9ad1d8dde2 complete redesign of crawl queue monitoring: do not look at a 13 years ago
Michael Peter Christen e2f8f263e8 changed storage of search words: keep order 13 years ago
Michael Peter Christen 2e5cd6a1b2 fixed parser extension deny list generation and usage 13 years ago
Michael Peter Christen 3cd6dcd352 do not add new solr fields as activated fields 13 years ago
Michael Peter Christen e3bb73c3d6 serialized some database access methods 13 years ago
Michael Peter Christen 355ecf330f reduced target file size to 64mb 13 years ago
Michael Peter Christen 2ea585d616 fix for host navigator 13 years ago
Michael Peter Christen 4c5edab1ec added option to have exception search result windows 13 years ago
Michael Peter Christen ef78f22ee1 performance hack 13 years ago
Michael Peter Christen 41536eb4a2 performance hack 13 years ago
Michael Peter Christen f91487fc50 added delete-button for host navigation 13 years ago
Michael Peter Christen e8d24fd802 author navigator can be switched off 13 years ago
Michael Peter Christen 558ab7bd4e made the protocol navigator reversible 13 years ago
Michael Peter Christen 96cb75f1d4 made the filetype navigator be able to deselect the search constraint 13 years ago
Lotus c73af39e54 refactoring of tray icon class, 13 years ago
Michael Peter Christen 4eff0e26f1 npe bugfix 13 years ago
Michael Peter Christen 1a0b6b3913 get more navigation details to search results 13 years ago
Michael Peter Christen 83009d86f7 added the vocabulary navigator. It can be very simply tested by 13 years ago
Michael Peter Christen 254adea51c small fixes 13 years ago
Michael Peter Christen c602eaaf46 enhanced search process 13 years ago
Michael Christen eff966f396 fix for search process (it was aborted too early during remote search) 13 years ago
Marek Otahal 72adbeae90 !Important: move from Hashtable to HashMap 13 years ago
Marek Otahal f40efb39af Blacklist loadList() remove duplicates by using Set 13 years ago
Michael Peter Christen 2ee8cbeb2c Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git 13 years ago
Michael Peter Christen 992dbdf4bb added noload statistic to servlets 13 years ago
Michael Christen 216a287a85 Merge commit '6d4e08ed06c5cd28c45981b2ebe31c7f7ec6fd83' into quix0r 13 years ago
stbrumm d18095dc48 Patch for Issue 0000102 13 years ago
Michael Christen 585a8f3c44 fixed a bug in search sequence (caused empty results) 13 years ago
Roland 'Quix0r' Haeder a3083d13bf Blacklist checks are now always turned on, in media searches (e.g. image search) images matching blacklist entries are no longer shown to the user 13 years ago
Michael Christen 52184a1170 fix for search process 13 years ago
Michael Christen 0797b0de99 new handling of remote search processes: looking for seeds will now not 13 years ago
Michael Christen 9e5894c784 Removed handling of components objects for URIMetadataRows. 13 years ago
Michael Christen c04bfaa51b refactoring 13 years ago
Michael Christen e9dc99fe15 added rules to set specific RWIs as private RWIs which are not 13 years ago
Michael Peter Christen 0bcef2d156 added feature as requested in 13 years ago
Michael Christen 3eccdca63c protection against too long running snippet fetch processes 13 years ago
Michael Christen 86b3385847 fixed a deadlock during secondary remote search 13 years ago
Michael Christen c715d19c09 fixes for dependency on svn 13 years ago
Michael Christen 0bc5d76bee ups 13 years ago
Michael Christen 044f83feed added some pauses into the search process which shall produce 13 years ago
Michael Christen f14faf503b better ranking because we wait a very little time during the search 13 years ago
orbiter f9216e388c - faster ping to clean up old peers faster 13 years ago
orbiter d9c066227a fix for npe 13 years ago
orbiter ebd840ebf6 - enhanced description on search front page 13 years ago
orbiter e22f8497c9 - tested the ARC methods 13 years ago
orbiter bc5df0eef5 updated ranking tables (fresh computation) 13 years ago
orbiter 5a55397f99 some last-minute performance hacks 13 years ago
orbiter c9216d5adf fixed secondary remote search (the process that finds distributed join situations) 13 years ago
orbiter 64fd20b857 new default ranking profile 13 years ago
orbiter 0cf9ebc3b0 speed enhancements when parsing RWI rows (makes search slightly faster) 13 years ago
orbiter ee8b1d4de1 fixed unresolved pattern and unwanted local/global switch when using votes on search results 13 years ago
orbiter c584db991f creating a bookmark from the search results now works again .. with new YMarks 13 years ago
orbiter 6cd27473f5 - better default values for caching and cache usage 13 years ago
orbiter 1019c36dad bug fixes and speed enhancements for search 13 years ago
orbiter 507c9d478d much better timing when search globally; less blocking; more results earlier! 13 years ago
orbiter 8e0b2c5832 fixed cluster search 13 years ago
orbiter 804e48888b smaller bug fixes for search behavior; should produce less unnecessary removals and an exact number of results as shown in counter 14 years ago
orbiter 84c3fc9d97 local/global fixes in search, better abstraction 14 years ago
orbiter 06352b8d6b more logging 14 years ago
orbiter 017a01714d - enhanced logging in robots.txt parser for remote debugging 14 years ago
orbiter 3a15e58e28 - increased stability when opening the robots table 14 years ago
orbiter 78ce3b13be typo 14 years ago
orbiter 85d6bf4ac4 fixed urls to media content during indexing 14 years ago
orbiter 0d858d48ec replaced String with StringBuilder in suggestion process 14 years ago
orbiter 3a807e10cf - added a cache for active crawl profiles to the crawl switchboard 14 years ago
orbiter e58438c01c - added a new retry connector for solr (for cases where solr responses are slow) 14 years ago
orbiter 4ad9fc2bff new snippet strategy for search hits in metadata: show beginning of text instead of hit position 14 years ago
orbiter 5af9598bd1 enhanced exported row parsing during row import 14 years ago
orbiter a7df70221e refactoring 14 years ago
orbiter cf4fd525ee added directDocByURL attribute in crawl profile 14 years ago
orbiter 035ebfbf3b - performance hacks (should affect the crawl balancer and reduce CPU load during crawl stack re-fill) 14 years ago
orbiter b250e6466d implemented crawl restrictions for IP pattern and country lists 14 years ago
orbiter 2c3161b4ac refactoring: 14 years ago
orbiter d2ea250d99 refactoring: 14 years ago