yacy_search_server

Commit Graph

Author	SHA1	Message	Date
Michael Peter Christen	00c1c777fa	refactoring	12 years ago
Michael Peter Christen	f75b3f8a47	added more patches to work without RWI data structure	12 years ago
Michael Peter Christen	e8acd542b5	- added faceted drill-down for host and geolocation to solr queries - added a new geolocation field to index schema, the old values are migrated if possible	12 years ago
Michael Peter Christen	4716546ef5	- reduced memory usage in index transmission using a transformation of Node to Row objects - removed peerDeparture in solr remote search in case that peer does not answer (this may be normal because it is allowed to switch this off)	12 years ago
Michael Peter Christen	653645c1cf	corrected solr query syntax	12 years ago
Michael Peter Christen	f42a57cd7d	gsa format update	12 years ago
Michael Peter Christen	b3aad6cc35	bugfix for remote search when search is done to solr	12 years ago
Michael Peter Christen	ff3eaa21b0	added remote search to solr on YaCy peers! - when doing a remote search, node peers are selected for solr queries - the solr query is done concurrently to the standard YaCy rwi search - the solr search result is fed into the same data structure that prepares the rwi search result - the same remote search that is done to several outside peers is done to the local solr index - the search process works now also without any 'old' RWI data using solr	12 years ago
Michael Peter Christen	a06123aec6	more abstraction and less parameter overhead for remote search	12 years ago
Michael Peter Christen	f00733186b	code simplifications	12 years ago
orbiter	404b0aab09	refactoring in remote search and stub for remote node peer selection	12 years ago
Michael Peter Christen	0cab06c47c	refactoring	12 years ago
Michael Peter Christen	18f989dfb1	- refactoring (load -> getMetadata) - added getDocument to retrieve Solr documents which shall replace getMetadata	12 years ago
Michael Peter Christen	703f427303	fixed some peer-ping connection details - larger time-out - removed too old seedlist - fixed a bug in connection test	12 years ago
orbiter	e816b88b55	changed behaviour of metadata storage: in case that any solr is attached, the metadata is not written to the metadata-db, even if it is enabled but instead to solr. This prevents that metadata is written in two store systems at the same time. It is also the next step to migrate the current metadata-db to solr.	12 years ago
Michael Peter Christen	f9c0e6e950	- Implemented and integrated the URIMetadataNode object which is a metadata representation from the solr index. This shall replace metadata from the built-in database in the future. - added the Solr-driven metadata into the search index of YaCy which makes it now possible to run YaCy without the old metadata index. This is a major step forward to a full migration to Solr.	12 years ago
orbiter	67edfd991c	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	12 years ago
orbiter	d9173ba7ed	added more solr fields to integrate values from URIMetadataRow. All writings to the Metadata-DB are now also done to solr. This includes metadata transfer during search and rwi transfer. The new/added solr fields are: ## time when resource was loaded load_date_dt ## date until resource shall be considered as fresh fresh_date_dt ## id of the host, a 6-byte hash that is part of the document id host_id_s ## ids of referrer to this document referrer_id_ss ## the md5 of the raw source md5_s ## the name of the publisher of the document publisher_t ## the language used in the document; starts with primary language language_ss ## an external ranking value ranking_i ## the size of the raw source size_i ## number of links to audio resources audiolinkscount_i ## number of links to video resources videolinkscount_i ## number of links to application resources applinkscount_i	12 years ago
Michael Peter Christen	1687737771	Abstraction of HandleMap and HandleSet	12 years ago
Michael Peter Christen	6f1ddb2519	Moved solr index-add method to the same method where the YaCy index is written. Also done some code-cleanup.	12 years ago
Michael Peter Christen	1f41d9c6f5	bugfix for a NPE	12 years ago
Michael Peter Christen	d3f243e2e1	fixed node type calculation for principal peers	12 years ago
orbiter	69e743d9e3	- more abstraction for the RWI index as preparation for solr integration - added options in search index to switch parts of the index on or off	12 years ago
orbiter	0cbda0b2b8	- replaced all length() == 0 and size() == 0 with isEmpty() - replaced some length() > 0 and size() > 0 with !isEmpty() - cannot be done automatically - implemented some isEmpty() methods	13 years ago
Michael Peter Christen	0301aba1e9	removed unused method parameters	13 years ago
Michael Peter Christen	241dd8410a	removed snippet pattern filter - it was not used	13 years ago
Michael Peter Christen	1481037820	replaced non-generic array with collection	13 years ago
Roland 'Quix0r' Haeder	edaa09b9b1	Rewrote all String blacklist types to enum 'BlacklistType', closes bug #143 Conflicts: htroot/Supporter.java htroot/yacy/crawlReceipt.java htroot/yacy/transferRWI.java htroot/yacy/transferURL.java source/de/anomic/crawler/CrawlStacker.java source/de/anomic/data/ListManager.java source/net/yacy/peers/Protocol.java source/net/yacy/repository/Blacklist.java source/net/yacy/repository/LoaderDispatcher.java source/net/yacy/search/Switchboard.java source/net/yacy/search/index/MetadataRepository.java source/net/yacy/search/index/Segment.java source/net/yacy/search/query/RWIProcess.java source/net/yacy/search/snippet/MediaSnippet.java	13 years ago
Michael Peter Christen	e0d8643226	- performance hacks - added log warnings in case that search processes run into time-out situations - better concurrency for Integer formatter (used a non-synchronized formatter before) - bugfix for search termination (a poison pill was missing) - added timeout parameters for search (again) -> target is, that they are never reached.	13 years ago
Michael Peter Christen	7c1feefb28	introduced a default 10 second time-out in rwi normalization time during search process to prevent endless deadlocks after a very long running search	13 years ago
Michael Peter Christen	3c2bec681f	added a root node flag: identifies peers with short ping time	13 years ago
Michael Peter Christen	ba6aaabc51	refactoring + parser bugfixes	13 years ago
Michael Peter Christen	f8cd57c92f	new indexing strategy: ALL links that appear anywhere are indexed, not only links where the content can be parsed. All non-parseable links are placed into the noload queue. The search process must therefore be able to filter out non-text search results. - This fixes the problem that image search results appeared in the text search. - The interactive search can retrieve now ALL types of links - The p2p interface is now extended to retrieve only certain types of links (text, image, video, apps) - The search process has an extension to filter the right document type according to the search query	13 years ago
Michael Peter Christen	14f67f217c	refactoring of ContentDomain: now subclass of Classification	13 years ago
Michael Peter Christen	c639248c23	protection against strange answers from remote peers during search	13 years ago
Michael Christen	9e5894c784	Removed handling of components objects for URIMetadataRows. This is a preparation to replace these rows with nodes from the node store.	13 years ago
Michael Christen	044f83feed	added some pauses into the search process which shall produce better-ranked search results. without those pauses the result page will only contain links from the peer that answers first which is not a good average picture of all the peers that provided results	13 years ago
Michael Christen	f14faf503b	better ranking because we wait a little more time during the search process to get better remote search results into the ranking priority stack	13 years ago
sixcooler	aeeae75b8a	the timeout of httpclient is not absolute, but till a connection is established or between bytes sent; trying this to reduce count of client-connections to /yacy/search.html of other peers	13 years ago
orbiter	ebd840ebf6	- enhanced description on search front page - fixed language and heuristic modifier - added hint to crawl start that we can do also ftp and smb crawls - added a protocol extension to remote crawls to transport all search modifiers to remote peers git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8108 6c8d7289-2bf4-0310-a012-ef5d649a1542	13 years ago
orbiter	c9216d5adf	fixed secondary remote search (the process that finds distributed join situations) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8098 6c8d7289-2bf4-0310-a012-ef5d649a1542	13 years ago
orbiter	8e0b2c5832	fixed cluster search git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8083 6c8d7289-2bf4-0310-a012-ef5d649a1542	13 years ago
orbiter	a7df70221e	refactoring git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7987 6c8d7289-2bf4-0310-a012-ef5d649a1542	13 years ago

43 Commits (f879a344e7822012eca8cccba371f3d7643e631e)