yacy_search_server

Commit Graph

Author	SHA1	Message	Date
Michael Peter Christen	a4214694df	We assert that no other metadata storage than solr is used now. Therefore a property like solrConnected() must be true all the time. Removal of this method causes removal of all write operations to the old metadata index.	13 years ago
Michael Peter Christen	abab291162	made the index schema retrieval public and allow cross-domain retrieval	13 years ago
Michael Peter Christen	0cec7e761a	enhanced snippet extractor to find snippets also inside of tokens of an url	13 years ago
sixcooler	c65b576a6f	added filename for missing crawlname when crawling from file	13 years ago
sixcooler	6c50d016ed	pdf- and zipParser should not use forced Memory-Limits	13 years ago
Michael Peter Christen	562183932b	- removed ip_s from the default profile since it needs a DNS lookup to create a document entry. This makes remote search much slower. - removed synchronization of the add method if ip_s is activated, to prevent a user configuration from causing bad behavior. The disadvantage is that an index dump can cause data loss if indexing is running during the index dump - caught more exceptions and more NPEs - better abstraction in MirrorSolrConnector - slight performance enhancement when only the index count is requested (rows=0 is sufficient to get a total count)	13 years ago
Michael Peter Christen	24f4ca4d85	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	13 years ago
apfelmaennchen	7efe9eb37b	adding CORS access header for Network.xml to overcome cross domain restriction (e.g. necessary to build a JavaScript YaCy client).	13 years ago
apfelmaennchen	116f429e35	fix for java.lang.RuntimeException: TableColumnIndex not available...	13 years ago
Michael Peter Christen	5ac61591f3	better abstraction for solr query params	13 years ago
Michael Peter Christen	c913b2ba77	- fix for NPEs during remote solr configuration - fixed remote solr setting switch - added more logging	13 years ago
Michael Peter Christen	b5192e03d7	fixed bad output in stopYACY.sh	13 years ago
Michael Peter Christen	882d54067a	added dummy update servlet	13 years ago
Michael Peter Christen	1533bfd63b	refactoring	13 years ago
Michael Peter Christen	e49359cc95	removed tenant query attribute since it is not used any more and is replaced by the site-operator in the GSA interface. This operator can also be simulated in the Solr interface using the collections_sxt field.	13 years ago
Michael Peter Christen	872f83ebe0	refactoring	13 years ago
Michael Peter Christen	fb9460f0a8	using the search filter to drill down search to file types. A search like "mp3 filetype:mp3" will now maybe surprise you.	13 years ago
Michael Peter Christen	bc865ab816	more cleaning (yacy-cora)	13 years ago
Michael Peter Christen	640339ee21	added the indexrestore.sh script which must be called with the path of the index dump. This is the reverse of indexdump.sh: it takes the output of indexdump.sh as input to restore an index. Now it should be possible to transfer a complete YaCy Solr index from one peer yacy1 to another peer yacy2 with the following command: yacy2/bin/indexrestore.sh `yacy1/bin/indexdump.sh`	13 years ago
Michael Peter Christen	15ea053c3a	- added xml output in IndexControlURLs to get the storage page of index dump commands - adjusted the apicall.sh script to get the downloaded text as output to stdout which is necessary to parse the content out of it - added indexdump.sh script which creates a solr dump and prints out the storage path for the index dump - added synchronization to the Fulltext class to prevent that data is stored to a non-existing solr index while this index is disabled during the storage of the dump	13 years ago
Michael Peter Christen	1b474139dd	used the new zip writer/reader to add a solr dump process: the whole solr index can be written to a zip dump and also restored during runtime	13 years ago
Michael Peter Christen	4a3e684f8c	added a directory-to-zip writer and zip-to-directory reader	13 years ago
Michael Peter Christen	d9ebf4a40f	a bit more logging	13 years ago
Michael Peter Christen	5683162bd3	simplifications in DHT Distribution class and more documentation	13 years ago
Michael Peter Christen	e57bf2ca39	simplified DHT classes	13 years ago
orbiter	a053b356ee	added new classes to renovate the YaCy protocol based on simple data structures in cora: - added the Peer object, which is a fresh version of Seed - added the Peers object, which is a fresh version of Network - added the Network api access class to retrieve a list of peers based on the Network.xml servlet in all YaCy peers.	13 years ago
orbiter	14897d4bfc	fixed mistake in wt-option which caused that the yacy json format overlapped the solr built-in json format	13 years ago
Michael Peter Christen	8219a445f3	refactoring	13 years ago
Michael Peter Christen	f879a344e7	fix for no depth limit default value	13 years ago
Michael Peter Christen	fa7f6f0be8	added HostBrowser servlet (stub)	13 years ago
Michael Peter Christen	00c1c777fa	refactoring	13 years ago
orbiter	563d584420	removed more dependencies in cora from kelondro	13 years ago
orbiter	aa65282259	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	13 years ago
orbiter	63762d8f89	removed kelondro dependencies from cora	13 years ago
orbiter	39564fddbd	more ignore	13 years ago
orbiter	6e0f4557f8	added ftp to getName	13 years ago
cominch	23204d2245	change parameter to support the smw extension for list import	13 years ago
Michael Peter Christen	c235d5c0f1	fixed size parsing in RSS message parser (for YaCy size parameter)	13 years ago
orbiter	089a03114e	full memory usage for debian and when changing the size: debian seems to dislike the big difference between xmx and xms (I have crashes here which stop if both values are same)	13 years ago
Michael Peter Christen	5bc8f34150	fix for success query counter	13 years ago
orbiter	60b1e23f05	added new crawl options: - indexUrlMustMatch and indexUrlMustNotMatch which can be used to select loaded pages for indexing. The default patterns are set so that all loaded pages are also indexed (as before), but when doing an expert crawl start the user may select only specific urls to be indexed. - crawlerNoDepthLimitMatch is a new pattern that can be used to remove the crawl depth limitation. This filter is a never-match by default (so the depth limit is used), but the user can select paths which will be loaded completely even if the crawl depth is reached.	13 years ago
orbiter	4987921d3d	fixed the size() method which counted also failed pages (which are also inside the solr index)	13 years ago
Michael Peter Christen	6ec02deec6	added new crawl attributes in crawl profile (not active yet)	13 years ago
Michael Peter Christen	a13e5153ac	- added the possibility to have not one but a list of crawl start urls - the list of urls is entered in the expert crawl start in a text field; the one-line input field was replaced with a text box - start urls can also be given in one single line where the urls are separated by a '|' character - as an effect, the crawl profile cannot carry a single start url for identification because there may be more than one. Therefore the url was removed from the crawl profile - this affects all servlets which display a crawl profile: removed the url field from all these servlets - to work consistently with several start urls and the other crawl starts which computed crawl start url lists from sitelists or sitemaps, the crawl start servlet was restructured completely - new rules for must-match patterns were created to make it possible that site crawl starts also work with several crawl starts at once	13 years ago
Michael Peter Christen	975bc95ddf	added default facet fields for json response format (stub)	13 years ago
Michael Peter Christen	2f218df55d	added missing license headers	13 years ago
Michael Peter Christen	a30653a864	added a regular expression test servlet which is linked within the parser/crawler error page whenever a problem with regular expression occurs. This makes it easy to correct and enhance the must-match and must-not-match patterns just by trying out which pattern could be correct.	13 years ago
Michael Peter Christen	0504b01bdc	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	13 years ago
orbiter	9413f77b65	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	13 years ago
orbiter	a55e77a115	added twitter search heuristic	13 years ago


8906 Commits (a4214694df9610e88aa08480c8835d71664ac373)