Commit Graph

5766 Commits (b3aad6cc350c120ba58ea5b147e1744fe74513ca)

Author SHA1 Message Date
Michael Peter Christen dcc72799c4 better abstraction for result writers using controlled vocabularies and
12 years ago
Michael Peter Christen 136fcb1ad9 refactoring
12 years ago
Michael Peter Christen a12f693ec9 added two response writer for embedded solr interface:
12 years ago
Michael Peter Christen bca4a16603 replaced the multivalue generic string field name suffix _ss by _txt
12 years ago
orbiter 67edfd991c Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
12 years ago
orbiter d9173ba7ed added more solr fields to integrate values from URIMetadataRow. All
12 years ago
Michael Peter Christen 3276508d1b Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
12 years ago
Michael Peter Christen 3ce04cecf3 bad hack to prevent a bug appearing in solr
12 years ago
sixcooler f32aa9a49c prevent merge of blobs that can't be handled in memory
12 years ago
Michael Peter Christen bbd242afb4 fix for a NPE
12 years ago
Michael Peter Christen 24d9db1613 snippet retrieval loading processes may use a smaller minimum load time
12 years ago
Michael Peter Christen ef488a15f7 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
12 years ago
Michael Peter Christen 1687737771 Abstraction of HandleMap and HandleSet
12 years ago
sixcooler 76b037a20a check content domain fix:
12 years ago
sixcooler 9cd409682f close augmented stream if filled from cache to get its content
12 years ago
Michael Peter Christen e432bb9cd9 better calculation of possible saving in HeapReader index data structure
12 years ago
Michael Peter Christen 9549984c65 documentation/comments
12 years ago
Michael Peter Christen 3bcd9d622b cleaned up classes and methods which are either superfluous at this time
12 years ago
Michael Peter Christen 6f1ddb2519 Moved solr index-add method to the same method where the YaCy index is
12 years ago
Michael Peter Christen 315d83cfa0 cleanup
12 years ago
Michael Peter Christen 1f41d9c6f5 bugfix for a NPE
12 years ago
Michael Peter Christen 76202f068e extended abstraction of local and remote solr index using one front-end
12 years ago
Michael Peter Christen d3f243e2e1 fixed node type calculation for principal peers
12 years ago
Michael Peter Christen 826967513b changed options in IndexFederated_p to switch on/off parts of the index
12 years ago
Michael Peter Christen cba4ab862e fix for http://bugs.yacy.net/view.php?id=202
12 years ago
orbiter 69e743d9e3 - more abstraction for the RWI index as preparation for solr integration
12 years ago
orbiter 05a3ffd03a patches to ensure that solr connectors are active only if they have a
13 years ago
orbiter 5a3c829872 embedded solr is only initiated if it is activated with
13 years ago
Michael Peter Christen 97b7bcf2a6 added a solr search index
13 years ago
Michael Peter Christen f0a079ac9f allow larger log entries
13 years ago
Michael Peter Christen 9b48c9fe2e removed a crawler overhead (terminated loop which searches greatest
13 years ago
Michael Peter Christen 784a4abb18 enhancement in internal data organization which should generate less
13 years ago
Michael Peter Christen f78ce93a80 collection of speed and memory saving hacks
13 years ago
orbiter c00a3cf74d less usage of generic logger to avoid logger generation overhead
13 years ago
orbiter a196f24f60 prevent enqueueing of non-loggeable logging entries
13 years ago
orbiter 482afed07c reduced logging overhead (a bit)
13 years ago
orbiter e76159040b Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
13 years ago
orbiter bbfa497a3c replaced more size() > 0 by !isEmpty()
13 years ago
Michael Peter Christen 58e7d1952f reduction of logging to prevent too much IO caused by logging
13 years ago
Michael Peter Christen 83da68c4c1 fixed a memory leak inside the logger which appeared if the log was
13 years ago
orbiter 0cbda0b2b8 - replaced all length() == 0 and size() == 0 with isEmpty()
13 years ago
orbiter 28b30231c3 fix for url matcher of multiple amp& in an url, see:
13 years ago
Roland 'Quix0r' Haeder aef9dd0350 - removed cleaning of blacklist cache on startup
13 years ago
orbiter c7afa8bc48 using SwitchboardConstants for solr attributes
13 years ago
orbiter c6d8950651 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
13 years ago
orbiter 5f3b8dc040 fix for RSS reader
13 years ago
orbiter 62202e2d71 refactoring of query attribute variable names for better consistency
13 years ago
Michael Peter Christen 1addbc792c use less memory for md5 cache
13 years ago
Michael Peter Christen f32de94723 more logging
13 years ago
Michael Peter Christen d09d9f2364 filter old peers from bootstrap (now stronger: 60 minutes instead of
13 years ago
Michael Peter Christen 434ee90c59 added classification for control file types which shall not be loaded
13 years ago
Michael Peter Christen a90bcb48f6 added webm
13 years ago
Michael Peter Christen 801972fe6f fix for url camel case parser and sentence reader
13 years ago
Michael Peter Christen fbc1a2030d fix for sitemap importer: can now also import very large sitemaps within
13 years ago
Michael Peter Christen 92731e5287 fix for sevenzip parser
13 years ago
Michael Peter Christen 45641b0c23 catch and log a warning in RasterPlotter
13 years ago
Michael Peter Christen 8efc1c1078 - fixed a memory leak (or bad usage) during parsing/snippet fetch
13 years ago
Michael Peter Christen c3db015410 prevent loading of content from the cache when retrieval with IFFRESH is
13 years ago
Michael Peter Christen b1e7c11fba fix for pattern matcher in html parser
13 years ago
Michael Peter Christen 8a6edc0031 fix for solr shutdown
13 years ago
Michael Peter Christen b8bcc06283 fix for urls beginning with "//"
13 years ago
Michael Peter Christen b0c408788b made class methods static where possible
13 years ago
Michael Peter Christen 5bd3c90907 - removed unnecessary semicolons
13 years ago
Michael Peter Christen 132afaf687 removed unaccessible code
13 years ago
Michael Peter Christen 7c1ba99755 removed more unused method parameters
13 years ago
Michael Peter Christen 83701a1b4c removed unused ImageReference package
13 years ago
Michael Peter Christen 0301aba1e9 removed unused method parameters
13 years ago
Michael Peter Christen 241dd8410a removed snippet pattern filter - it was not used
13 years ago
Michael Peter Christen d3964253ae - added @SuppressWarnings to unused servlet method parameters
13 years ago
Michael Peter Christen ea10766bfd cleaned unnecessary nested code
13 years ago
Michael Peter Christen 1481037820 replaced non-generic array with collection
13 years ago
orbiter fc0f9543fe More SentenceReader cleanup
13 years ago
orbiter 586bb0eb6a Simplified SentenceReader (no more Reader inside..)
13 years ago
orbiter 7f851d62a7 replaced HashARC with SizeLimited Objects which are less costly
13 years ago
orbiter d4291ac1f3 more tolerance when creating solr document
13 years ago
orbiter 78fc3cf8f8 refactoring and new usage of SentenceReader: this class appeared as one
13 years ago
orbiter bb8dcb4911 automatically adopt size of word cache to available memory
13 years ago
Michael Peter Christen ad09b786bf clean up parser data
13 years ago
Michael Peter Christen 276a66a793 Adding a limit of 1000 links that a parser shall store during indexing.
13 years ago
Michael Peter Christen 613b45f604 - better data structures in secondary search
13 years ago
Michael Peter Christen de903a53a0 parser refactoring & hacks
13 years ago
Michael Peter Christen 8a82609360 - smaller caches to save memory
13 years ago
Michael Peter Christen 7249d9c9de bugfix for concurrent seed loader
13 years ago
Michael Peter Christen c72d3b12cd concurrently initialize the seed list during p2p network bootstrap
13 years ago
Michael Peter Christen 1825f165b8 better integration of blacklist according to use case
13 years ago
Michael Peter Christen c18fa9fa75 Merge branch 'master' of git://gitorious.org/~reger/yacy/bbyacy-rc1
13 years ago
Michael Peter Christen ce8d4b87d9 fixes for new eclipse 'Juno' warning 'Resource leak'.
13 years ago
Michael Peter Christen 0c345d1559 giving threads names so it's easier to see what's happening during
13 years ago
reger 067728bccc add search result heuristic. adding a crawl job with depth-1 for every displayed search result (crawling every external linked page of displayed search result pages)
13 years ago
Michael Peter Christen 03280fb161 removed segments-concept and the Segments class:
13 years ago
Michael Peter Christen 508a81b86c added solr field 'refresh_s' which stores the refresh url contained in
13 years ago
Michael Peter Christen f3167def64 do not fill the keywords with title content if keywords do not exist.
13 years ago
Michael Peter Christen 9116013c64 - allow lazy initialization of solr value (if using 'lazy', then no
13 years ago
sixcooler 97f60010d8 fix crawl start from file
13 years ago
Michael Peter Christen 0294a53459 - add canonical field only if requested by solr schema
13 years ago
Michael Peter Christen 3fd4a01286 added option to record urls that are forwarded to the solr index
13 years ago
Michael Peter Christen d763e4d94b fixed bad referer computation in SSIs which causes a NPE during host
13 years ago
Michael Peter Christen 358b04885e more IPv6 hacks
13 years ago
Michael Peter Christen 96aeb127e3 generalized localhost naming.
13 years ago
Michael Peter Christen 77f795756c fixing redirects and status codes: storing of status code in
13 years ago