Commit Graph

468 Commits (5db97a892856265737f67978a5175c36c46e5cee)

Author SHA1 Message Date
Michael Peter Christen 087df05e24 added option to Config_Network_p.html to enable remote search while
11 years ago
Michael Peter Christen 1a4a69c226 set more logger to 'final static'
11 years ago
Michael Peter Christen 69b8d61c47 fix for search requests in GSA interface which contain 'funny'
11 years ago
reger 7b17cdf6dd add content_type:image/* to image search
11 years ago
Michael Peter Christen 1b4fa2947d - fixed a problem which occurred when a document was not recognized with
11 years ago
Michael Peter Christen 78e7aadb26 removed unused initialization method
11 years ago
Michael Peter Christen 4fbc4740df removed warnings
11 years ago
orbiter 8ac2e8c8c9 added location navigator which causes that the image to the map search
11 years ago
Michael Peter Christen 5e31bad711 - the webgraph shall store all links which appear on a web page and not
11 years ago
Michael Peter Christen 85456f46b2 added two new fields, exact_signature_copycount_i and
11 years ago
Michael Peter Christen a2511b5600 turned images_alt_txt back to images_alt_sxt because it is not necessary
11 years ago
Michael Peter Christen 85b1922244 activated image type navigation for image search
11 years ago
Michael Peter Christen 9e12fdff23 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
11 years ago
Michael Peter Christen ab1201fdfd fixed wrong facet count
11 years ago
Michael Peter Christen 049c3b3f2e added an option to exclude image search results from text search. This
11 years ago
Michael Peter Christen a8c5bfcf58 avoid creating unnecessary objects
11 years ago
Michael Peter Christen dc179bd61f fix for catchall query goal for image search
11 years ago
reger 392174de8c remove all_words, all_strings lists from QueryGoal
11 years ago
Michael Peter Christen 169ef8963d one more fix for image search
11 years ago
Michael Peter Christen cb85b22725 redesign of the image search process (with much better results,
11 years ago
reger 29967102a2 optimized QueryGoal (reducing mem and computation by removing all_hashes)
11 years ago
orbiter f106345eef link strings should not be tokenized
11 years ago
reger a5019bc470 make Vocabulary Navigator tags a hard result entry filter
11 years ago
reger a67a4b7d86 improve tld: query modifier filter pattern (to prevent tld:net accepting www.abcinet.org)
11 years ago
Roland Haeder 841a28ae76 Added 'final' for all exception blocks as this helps the Java compiler
11 years ago
Michael Peter Christen 5878c1d599 - refactoring of log to ConcurrentLog:
12 years ago
Michael Peter Christen a2c8116a8f accept (but ignore) a '+' sign in front of search words
12 years ago
sixcooler d5d8936f9d For indexes that are changing rapidly in NRT situations, fcs (stands for
12 years ago
Michael Peter Christen 32aa1d4569 removed unused option for queries
12 years ago
Michael Peter Christen 8caaf6203a fixed false multiple-generation of remote facet search which
12 years ago
reger d367b1f4d9 add null pointer check to stopword fix
12 years ago
reger 7480e87386 - fix stopword handling for RWI see example http://bugs.yacy.net/view.php?id=247
12 years ago
Michael Peter Christen 409d6edf53 Store node/solr search threads to be able to send them an interrupt
12 years ago
Michael Peter Christen 0c1a018bbd removed 'later' tactic because it used too much RAM, reduced number of
12 years ago
orbiter da621e827e prevent NPE in case RWI is disabled
12 years ago
Michael Peter Christen c2b1075dcf activating pollImmediately in case that DHT receive is off. This will
12 years ago
Michael Peter Christen 06d3063dc9 - no downcase when using collection modifier
12 years ago
Michael Peter Christen 8dbc80da70 redesign of index.exist-test: this shall now not be done using a single
12 years ago
Michael Peter Christen 4058369288 fixed query expressions for collection selection (added quotes)
12 years ago
Michael Peter Christen cca19d94d4 re-declared some fields to be of type string rather than text which
12 years ago
Michael Peter Christen 3841854c97 abstraction of catchall term
12 years ago
Michael Peter Christen bb4bf3d8fd infinity timeout bug protection patch
12 years ago
Michael Peter Christen c091000165 added collection attribute also to the rss feed reader
12 years ago
orbiter f7571386a3 added a 'collection' property attribute in yacysearch.html which can be
12 years ago
Michael Peter Christen 97775fbebc fixed ranking for add-function queries: this did not work. The option
12 years ago
Michael Peter Christen 082e3274d6 - setting the same default ranking in the solr interface as for YaCy
12 years ago
Michael Peter Christen edc0b33f6d - showing references count and clickdepth in host browser
12 years ago
reger 566a3b0294 fix: Index Administration > Reverse Word Index (IndexControlRWIs_p) corrected use of word search to word-hash search
12 years ago
Michael Peter Christen cf0acd2cb4 upgrade to solr 4.2.1
12 years ago
orbiter 940c6849ee enhanced did-you-mean (a bit): can now remember previously searched
12 years ago
Michael Peter Christen 9406a2e438 fixed NPE during index abstract computation
12 years ago
Michael Peter Christen 2d36a7eaf5 - do not create a new query for all remote peers
12 years ago
Michael Peter Christen 4af0839be2 use appropriate ranking for each search situation:
12 years ago
Michael Peter Christen addba047e2 changes in ranking computation
12 years ago
Michael Peter Christen 25300913fa fixes to search debugging after testing with the different search
12 years ago
Michael Peter Christen 81380ae5c8 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
12 years ago
Michael Peter Christen c2fde018b5 concurrent snippet fetching from solr results which do not have snippets
12 years ago
orbiter b1140e3d82 added debug switches for detailed search testing
12 years ago
orbiter cdbfddf091 added filter queries for better image, audio and video results
12 years ago
Michael Peter Christen 587ef83eab added missing cleanup statements for short memory cases during search
12 years ago
Michael Peter Christen ae734b3f8d enhanced the search result processing
12 years ago
Michael Peter Christen 221ed7d764 - enhanced concurrency during search without IO blocking
12 years ago
orbiter 0f7ea7ad9f - enhanced solr.add procedure for mass adds
12 years ago
orbiter 9c09fd7d0b better/less requests to local solr; the request is made in chunks which
12 years ago
orbiter d74472f562 corrected result counter
12 years ago
Michael Peter Christen c95a84103a complete redesign of search process:
12 years ago
Michael Peter Christen 35fa718b77 testing to use solr for portalsearch caused some bugfixing but no full
12 years ago
Michael Peter Christen 788288eb9e added the generation of 50 (!!) new solr field in the core 'webgraph'.
12 years ago
Michael Peter Christen 91a0401d59 introduced a second core named 'webgraph'. This core will hold the link
12 years ago
Michael Peter Christen b6de1f42dc Full redesign of solr connection architecture. This was done to support
12 years ago
Michael Peter Christen d3508fa8ff fixed json search, quotes, auto-facets, urls etc. for
12 years ago
Michael Peter Christen c34af7fe94 extended JSON Response Writer and Opensearch Response Writer for the
12 years ago
Michael Peter Christen e8f7b85b98 fixes to internal RWI usage if RWI is switched off (NPE etc)
12 years ago
Michael Peter Christen 3834829b37 bugfixes and more logging for solr connector
12 years ago
Michael Peter Christen 592adf7ccb fix for domain navigation
12 years ago
Michael Peter Christen 8651ec35fe turned author_s into the multi-valued field author_sxt
12 years ago
Michael Peter Christen 0fe7b6fd3b migrated the index export methods from the old metadata to solr. Now
12 years ago
Michael Peter Christen 4735bd47f4 - changed solr commit call and added an optimize option. Since Solr
12 years ago
Michael Peter Christen cba038f97b one more NPE fix
12 years ago
Michael Peter Christen c3d50d91f8 relaxing site operator for www prefix:
12 years ago
Michael Peter Christen db49e91724 fixed a NPE which may appear for freeworld peers without any rwi index
12 years ago
Michael Peter Christen 4faa07c214 added a timeout for topic computation (solr is here much slower than the
12 years ago
Michael Peter Christen d2d5be032d added a 'inlink' search option according to the suggestion in the YaCy
12 years ago
reger 3897bb4409 added (manual) urldb migration (link on: Index Administration -> Federated Solr Index)
12 years ago
reger f143804382 fix configuration for search page navigators
12 years ago
orbiter fe50702eb0 added a filterscannerfail attribute to QueryParams which causes that a
12 years ago
Michael Peter Christen eb90d38cd7 added missing extension 'mkv' for navigation
12 years ago
Michael Peter Christen 4a9182ae16 use the search configuration to default the cacheStrategy to the value
12 years ago
Michael Peter Christen e1f89efd0d - made image search in interactive search using the ViewImage servlet -
12 years ago
Michael Peter Christen 433143ba40 removed protocol, tld, ext from the urlmask and created specific
12 years ago
Michael Peter Christen 84f82541e8 search process enhancements
12 years ago
Michael Peter Christen 02020b590b - removed all extension types from extension navigation which are not
12 years ago
Michael Peter Christen 01200f06cc using the author field as solr-native facet. this makes it necessary to
12 years ago
Michael Peter Christen bab573361f - using a filter query for facet restriction
12 years ago
Michael Peter Christen 1052263af3 - added a new solr field references_i which stores the number of
12 years ago
Michael Peter Christen 34f8786508 removed dependency of vocabulary navigation from Jena and its
12 years ago
Michael Peter Christen 9319b90d8a - fixes for host navigation
12 years ago
Michael Peter Christen cb5cbec14d distinguishing modified query string and original query string
12 years ago
Michael Peter Christen 8aa08261a7 update to Solr Boost handling
12 years ago
Michael Peter Christen 72f165d58b added a Boost class which stores solr query boost values. The class can
12 years ago
Michael Peter Christen 8fc3679c66 using more pre-compile pattern for split methods
12 years ago
Michael Peter Christen d48e9788d2 enhanced search result processing behavior
12 years ago
reger 469efcdb9d fix: display and calculate authors and namespace search navigator if configured (otherwise skip overhead)
12 years ago
orbiter ee612e8b93 start the local search only if this peer is doing a remote search or
12 years ago
Michael Peter Christen 4eab3aae60 removed overhead by preventing generation of full search results when
12 years ago
Michael Peter Christen d6b82840f8 added a feature to find similarities in documents.
12 years ago
Michael Peter Christen 46be4af5b9 Merge commit '2bb8f045cc92f31fc7e720cc30b38af417563890'
12 years ago
Michael Peter Christen 952e143580 FINALLY YaCy can now search for full strings using double- or
12 years ago
orbiter 5dfd6359cb redesign of the QueryParams class: introduced QueryGoal which holds the
12 years ago
Michael Peter Christen d64445c3cb because we have the inurl:<term> - searchmodifier, we don't actually
12 years ago
cominch d2a94cc55e refactor package
12 years ago
cominch 21df1ad9e0 update and generalization of the SMW import and content control routines
12 years ago
Michael Peter Christen 842faf96a2 fixed media search
12 years ago
Michael Peter Christen 93001586a0 removed warnings, removed too-fast pausing of crawls
12 years ago
Michael Peter Christen 8041742e48 added matching of path to query pattern
12 years ago
Michael Peter Christen 570e42c4e3 fix for filetype navigator
12 years ago
Michael Peter Christen 158732af37 automatically delete entries from the crawl profile list if crawl is
12 years ago
Michael Peter Christen 2371ef031c added solr faceted search support to YaCy search results
12 years ago
Michael Peter Christen 619bf7e875 fixed filetype modified for media types in text search
12 years ago
Michael Peter Christen 8fb370d9f8 renovated the way search results are counted. should be correct now...
12 years ago
Michael Peter Christen b764de424a code cleanup
12 years ago
Michael Peter Christen 1168d09de8 more refactoring - integrated the code of SnippetProcess into
12 years ago
Michael Peter Christen 6629e37685 tried to clean up the search process mess
12 years ago
Michael Peter Christen c5f67a5d6d fixed a problem with local search from solr results: now all results
12 years ago
orbiter 276dd6452b removed warnings
12 years ago
Michael Peter Christen ce0e5b1e17 - more refactoring / private methods
12 years ago
Michael Peter Christen ccc3760a47 Refactoring and redesign of data architecture to make URIMetadataRow
12 years ago
Michael Peter Christen e5b3c172ff removed hack which translated Solr documents to virtual RWI entries
12 years ago
Michael Peter Christen 5d16c23a1f specified more URIMetadata as URIMetadataNode
12 years ago
Michael Peter Christen 43f3345c90 - removed dependencies from URIMetadataRow and made direct access to
12 years ago
Michael Peter Christen 36c13ed15b less solr prefetch
12 years ago
Michael Peter Christen 5f0ab25382 removed the option to prevent removal of &amp; parts inside of the
12 years ago
Michael Peter Christen 584663ae8c - redesign of solr query construction
12 years ago
orbiter 4fed4a86d8 another fix to location search
12 years ago
orbiter 0f7a54452d fix for location search query encoding
12 years ago
Michael Peter Christen f8a3ab2d82 added the usage of synonyms to the GSA search interface
12 years ago
Michael Peter Christen ca313e404f - if a "/date" modifier is used, the solr remote query applies an
12 years ago
Michael Peter Christen 5ac61591f3 better abstraction for solr query params
12 years ago
Michael Peter Christen 1533bfd63b refactoring
12 years ago
Michael Peter Christen e49359cc95 removed tenant query attribute since it is not used any more and is
12 years ago
Michael Peter Christen 872f83ebe0 refactoring
12 years ago
Michael Peter Christen fb9460f0a8 using the search filter to drill down search to file types.
12 years ago
Michael Peter Christen e57bf2ca39 simplified DHT classes
12 years ago
Michael Peter Christen 8219a445f3 refactoring
12 years ago
Michael Peter Christen 00c1c777fa refactoring
12 years ago
orbiter 563d584420 removed more dependencies in cora from kelondro
12 years ago
orbiter 63762d8f89 removed kelondro dependencies from cora
12 years ago
Michael Peter Christen 4d29f59a27 removed warnings
12 years ago
Michael Peter Christen 31d4d38804 - extended the solr interface by a references-by-word-count method
12 years ago
Michael Peter Christen 75d5e3475d Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
12 years ago
cominch dc468dad01 add content control features for custom filter lists
12 years ago
Michael Peter Christen 316b5fe116 - added a solr type definition verifier
12 years ago
Michael Peter Christen 4521d63c92 added boosts to solr search queries
12 years ago
Michael Peter Christen e8acd542b5 - added faceted drill-down for host and geolocation to solr queries
12 years ago
Michael Peter Christen 48a82bc705 log queries anonymous from gsa+solr requests
12 years ago
Michael Peter Christen ab6ec4ec52 added snippet computation to solr/rss and gsa result writer
12 years ago
Michael Peter Christen 653645c1cf corrected solr query syntax
12 years ago
Michael Peter Christen a049761e0c fixed double-check
12 years ago
Michael Peter Christen f42a57cd7d gsa format update
12 years ago
Michael Peter Christen ff3eaa21b0 added remote search to solr on YaCy peers!
12 years ago
Michael Peter Christen a06123aec6 more abstraction and less parameter overhead for remote search
12 years ago
Michael Peter Christen f00733186b code simplifications
12 years ago
Michael Peter Christen db0d438709 fix for http://bugs.yacy.net/view.php?id=206
12 years ago
orbiter 404b0aab09 refactoring in remote search and stub for remote node peer selection
12 years ago
orbiter 99ef57f103 reduced sleep times
12 years ago
Michael Peter Christen 0cab06c47c refactoring
12 years ago
Michael Peter Christen 40c0856489 refactoring
12 years ago
Michael Peter Christen 06a78eecb7 code simplification
12 years ago
Michael Peter Christen 9bece5ac5f enhanced snippet fetch - removed a bug that caused documents to be
12 years ago
Michael Peter Christen 18f989dfb1 - refactoring (load -> getMetadata)
12 years ago
Michael Peter Christen 395b78a0d8 using the solr search index to concurrently search within solr and the
12 years ago
Michael Peter Christen 6197caf698 added clear-text search words in query params
12 years ago
Michael Peter Christen e5ef840f40 - renamed DoubleSolrConnector to MirrorSolrConnector and added a
12 years ago
Michael Peter Christen 136fcb1ad9 refactoring
12 years ago
Michael Peter Christen 24d9db1613 snippet retrieval loading processes may use a smaller minimum load time
12 years ago
Michael Peter Christen ef488a15f7 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
12 years ago
Michael Peter Christen 1687737771 Abstraction of HandleMap and HandleSet
12 years ago
sixcooler 76b037a20a check content domain fix:
12 years ago
Michael Peter Christen 3bcd9d622b cleaned up classes and methods which are either superfluous at this time
12 years ago
Michael Peter Christen 6f1ddb2519 Moved solr index-add method to the same method where the YaCy index is
12 years ago
Michael Peter Christen 76202f068e extended abstraction of local and remote solr index using one front-end
12 years ago
orbiter 69e743d9e3 - more abstraction for the RWI index as preparation for solr integration
12 years ago
orbiter c00a3cf74d less usage of generic logger to avoid logger generation overhead
13 years ago
orbiter 0cbda0b2b8 - replaced all length() == 0 and size() == 0 with isEmpty()
13 years ago
orbiter c7afa8bc48 using SwitchboardConstants for solr attributes
13 years ago
orbiter 62202e2d71 refactoring of query attribute variable names for better consistency
13 years ago
Michael Peter Christen 0301aba1e9 removed unused method parameters
13 years ago
Michael Peter Christen 241dd8410a removed snippet pattern filter - it was not used
13 years ago
Michael Peter Christen ea10766bfd cleaned unnecessary nested code
13 years ago
Michael Peter Christen 613b45f604 - better data structures in secondary search
13 years ago
Michael Peter Christen ce8d4b87d9 fixes for new eclipse 'Juno' warning 'Resource leak'.
13 years ago
Michael Peter Christen 0c345d1559 giving threads names so it's easier to see what's happening during
13 years ago
Michael Peter Christen b9dfca4b0a - fixed IndexFederated Servlet / an embedded Solr can now be selected
13 years ago
Michael Peter Christen 9264d8b4af removed old navigation practice using subject tags in favor of
13 years ago
Michael Peter Christen 64c0268b2b show triplestore metadata in yacydoc and viewfile
13 years ago
Michael Peter Christen 8b53771db2 changed behavior of navigation processing:
13 years ago
Michael Peter Christen 5fc6524ca8 - moved triple store to net.yacy.cora.lod (should be generalized there
13 years ago
Roland 'Quix0r' Haeder edaa09b9b1 Rewrote all String blacklist types to enum 'BlacklistType', closes bug
13 years ago
cominch 65c5826d93 bugfix
13 years ago
Michael Peter Christen 701b9a28a0 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
13 years ago
Michael Peter Christen ab7107b34b fixed RWIProcess queue limits: now discovering hidden results for mass
13 years ago
Michael Peter Christen 96e9d77270 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
13 years ago
Michael Peter Christen 00f2df1120 a variety of possible memory leak fixes
13 years ago
Michael Peter Christen 461a0ce052 removed warnings
13 years ago
Michael Peter Christen 407fdf6968 more bug fixes and performance hacks for search process
13 years ago
Michael Peter Christen a1fe65b115 performance hacks
13 years ago
Michael Peter Christen 2fe207f813 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
13 years ago
Michael Peter Christen 5e562dcdb7 adopted vocabulary usage within annotation/navigation feature of search
13 years ago
Michael Peter Christen 240045cf7c fix for bad distance computation
13 years ago
Michael Peter Christen e0d8643226 - performance hacks
13 years ago
Michael Peter Christen 9b4c699526 enhanced location search:
13 years ago
Michael Peter Christen 834dc6b263 store more data from interface access
13 years ago
Michael Peter Christen 7c1feefb28 introduced a default 10 second time-out in rwi normalization time
13 years ago
Michael Peter Christen 7bf421b9dd - fixed image search page navigation
13 years ago
Michael Peter Christen c6558cba08 more classification bugs
13 years ago
Michael Peter Christen 082831b9d6 search contentdom was checked in wrong way - fixed
13 years ago
Michael Peter Christen f294f2e295 bugfix to http://bugs.yacy.net/view.php?id=181
13 years ago
Michael Peter Christen 3e1bc9477f Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
13 years ago
Michael Peter Christen 52d307c735 prevent that the snippet fetch process removes catchall entries
13 years ago
Michael Peter Christen 89142d1e8d removed (not all) warnings
13 years ago
reger b2175ea4ef Add possibility to set custom Solr field names for the YaCy default Solr attributes.
13 years ago
Michael Peter Christen c00efc2717 made the solr connection more generic
13 years ago
Michael Peter Christen ba6aaabc51 refactoring + parser bugfixes
13 years ago
Michael Peter Christen a3badd3205 changed search process for images: no more media snippet load process,
13 years ago
Michael Peter Christen f8cd57c92f new indexing strategy: ALL links that appear anywhere are indexed, not
13 years ago
Michael Peter Christen 14f67f217c refactoring of ContentDomain: now subclass of Classification
13 years ago
Michael Peter Christen 33d1062c79 refactoring: the cache belongs to the crawler
13 years ago
Michael Peter Christen 7b5b9baee0 added citation rank to ranking profile
13 years ago
Michael Christen ac5d124ee0 experimental implementation of a citation ranking as post-ranking
13 years ago
Michael Peter Christen e2f8f263e8 changed storage of search words: keep order
13 years ago
Michael Peter Christen 2ea585d616 fix for host navigator
13 years ago
Michael Peter Christen 41536eb4a2 performance hack
13 years ago
Michael Peter Christen f91487fc50 added delete-button for host navigation
13 years ago
Michael Peter Christen e8d24fd802 author navigator can be switched off
13 years ago
Michael Peter Christen 558ab7bd4e made the protocol navigator reversible
13 years ago
Michael Peter Christen 96cb75f1d4 made the filetype navigator be able to deselect the search constraint
13 years ago
Michael Peter Christen 4eff0e26f1 npe bugfix
13 years ago
Michael Peter Christen 1a0b6b3913 get more navigation details to search results
13 years ago
Michael Peter Christen 83009d86f7 added the vocabulary navigator. It can be very simply tested by
13 years ago
Michael Peter Christen c602eaaf46 enhanced search process
13 years ago
Michael Christen eff966f396 fix for search process (it was aborted too early during remote search)
13 years ago
Michael Christen 585a8f3c44 fixed a bug in search sequence (caused empty results)
13 years ago
Michael Christen 52184a1170 fix for search process
13 years ago
Michael Christen 0797b0de99 new handling of remote search processes: looking for seeds will now not
13 years ago
Michael Christen 9e5894c784 Removed handling of components objects for URIMetadataRows.
13 years ago
Michael Christen c04bfaa51b refactoring
13 years ago
Michael Christen e9dc99fe15 added rules to set specific RWIs as private RWIs which are not
13 years ago
Michael Peter Christen 0bcef2d156 added feature as requested in
13 years ago
Michael Christen 3eccdca63c protection against too long running snippet fetch processes
13 years ago
Michael Christen 86b3385847 fixed a deadlock during secondary remote search
13 years ago