yacy_search_server

Commit Graph

Author	SHA1	Message	Date
orbiter	6e0f4557f8	added ftp to getName	12 years ago
cominch	23204d2245	change parameter to support the smw extension for list import	12 years ago
Michael Peter Christen	c235d5c0f1	fixed size parsing in RSS message parser (for YaCy size parameter)	12 years ago
orbiter	089a03114e	full memory usage for debian and when changing the size: debian seems to dislike the big difference between xmx and xms (I have crashes here which stop if both values are same)	12 years ago
Michael Peter Christen	5bc8f34150	fix for success query counter	12 years ago
orbiter	60b1e23f05	added new crawl options: - indexUrlMustMatch and indexUrlMustNotMatch which can be used to select loaded pages for indexing. The default patterns are such that all loaded pages are also indexed (as before), but when doing an expert crawl start the user may select only specific urls to be indexed. - crawlerNoDepthLimitMatch is a new pattern that can be used to remove the crawl depth limitation. This filter is a never-match by default (so the crawl depth is used), but the user can select paths which will be loaded completely even if the crawl depth is reached.	12 years ago
orbiter	4987921d3d	fixed the size() method which counted also failed pages (which are also inside the solr index)	12 years ago
Michael Peter Christen	6ec02deec6	added new crawl attributes in crawl profile (not active yet)	12 years ago
Michael Peter Christen	a13e5153ac	- added the possibility to have not one but a list of crawl start urls - the list of urls is entered in the expert crawl start in a textfield; the one-line input field was replaced with a text box - start urls can also be given in one single line where the urls are separated by a '|'-character - as an effect, the crawl profile cannot carry a single start url for identification because it is possible to have more. Therefore the url was removed from the crawl profile - this affects all servlets which display a crawl profile: removed the url field from all these servlets - to work consistently with several start urls and the other crawl starts which computed crawl start url lists from sitelists or sitemaps, the crawl start servlet was restructured completely - new rules for must-match patterns were created to make it possible that site crawl starts also work with several crawl starts at once	12 years ago
Michael Peter Christen	975bc95ddf	added default facet fields for json response format (stub)	12 years ago
Michael Peter Christen	2f218df55d	added missing license headers	12 years ago
Michael Peter Christen	a30653a864	added a regular expression test servlet which is linked within the parser/crawler error page whenever a problem with regular expression occurs. This makes it easy to correct and enhance the must-match and must-not-match patterns just by trying out which pattern could be correct.	12 years ago
Michael Peter Christen	0504b01bdc	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	12 years ago
orbiter	9413f77b65	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	12 years ago
orbiter	a55e77a115	added twitter search heuristic	12 years ago
Michael Peter Christen	e54ac38095	- some corrections in usage of getFile() and getFileName() - added more attributes in json response writer according to yacy servlet	12 years ago
Michael Peter Christen	62add1d564	added the protocol and the file name extension to the solr fields since these fields are probably facets in file search	12 years ago
Michael Peter Christen	e072632a54	no complaints about memory if the database is empty	12 years ago
Michael Peter Christen	b846f585fa	fixed a bug with size_i field usage	12 years ago
Michael Peter Christen	9db032664e	activate two solr fields which will be used by administration interface (later)	12 years ago
orbiter	fcd5c7eec3	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	12 years ago
orbiter	6171143b4a	added facet stub in JsonResponseWriter	12 years ago
Michael Peter Christen	e6330f648a	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	12 years ago
Michael Peter Christen	e84ffdb4f3	enhanced solr writers	12 years ago
Michael Peter Christen	9644c186a4	added search functionality to ViewFile.html servlet	12 years ago
Marc Nause	03f3a8b647	*) fix for http://www.yacy-forum.org/viewtopic.php?f=2&t=759	12 years ago
Michael Peter Christen	b69ed96f0b	- added collections to yacydoc - changed yacydoc.htm to yacydoc.json - added query logging in solr and gsa search result	12 years ago
Michael Peter Christen	5df553c152	- added a json writer for solr (yes there was one using xslt but this one writes the same way as yacysearch.json) - using the new json solr result to change the ajax search in IndexControlURLs to the new solr search	12 years ago
Michael Peter Christen	4634f0e626	fix for images_withalt	12 years ago
Michael Peter Christen	e65cecc419	- updated lucene libraries to 3.6.1 - added lucene-grouping which enables faceted search; try this: http://localhost:8090/solr/select?q=*:*&start=0&rows=3&facet=true&facet.field=host_s	12 years ago
Michael Peter Christen	1754fbb6d9	Merge remote-tracking branch 'reger/master'	12 years ago
Michael Peter Christen	4d29f59a27	removed warnings	12 years ago
Michael Peter Christen	8c099d2106	Merge remote-tracking branch 'origin/master' Conflicts: htroot/api/ymarks/import_ymark.java source/de/anomic/data/ymark/YMarkEntry.java source/de/anomic/data/ymark/YMarkTables.java	12 years ago
apfelmaennchen	59bd478ed1	Added more sophisticated RDF output for YMarks, including the folder structure (b:Topic) and support for multiple tags (dc:subject) and folders (b:hasTopic) via rdf:Bag container.	12 years ago
apfelmaennchen	d31a632951	- added dmoz RDF dump importer - added indexing to Tables columns to support larger bookmark collections - added RDF output (HTTP) for public bookmarks at /YMarks.rdf - YMarkRDF also provides a Jena RDF Model as "internal" API - various other changes/fixes for YMarks (mainly backend)	12 years ago
reger	40d8086bf7	keep input order of translation entries within one file section. This allows, on translation conflicts (translation of words contained in another sentence), putting the shorter key at the end of the translation list.	12 years ago
Michael Peter Christen	10b911eed4	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	12 years ago
Michael Peter Christen	be67c70a47	added Solr fields: inboundlinks_text_chars_val inboundlinks_text_words_val inboundlinks_alttag_txt outboundlinks_text_chars_val outboundlinks_text_words_val outboundlinks_alttag_txt	12 years ago
orbiter	d73fff0e0e	added solr field images_withalt_i	12 years ago
orbiter	66ac4076c2	added disjunction '|' option to site parameter in GSA API	12 years ago
sixcooler	a975bcffcb	clear fulltext-cache and stop crawling if running out of memory	12 years ago
sixcooler	e78fe3f477	also do a clearcache on the solr-connector-caches	12 years ago
sixcooler	9ee2e09983	statistics for solr-cache	12 years ago
Michael Peter Christen	d8425e6809	added collections to crawl monitor	12 years ago
Michael Peter Christen	ee23fc7a32	added h1..h6 counter fields	12 years ago
Michael Peter Christen	4b36a2c3b4	small style changes	12 years ago
Michael Peter Christen	8ca842b137	added new button design to more buttons	12 years ago
Michael Peter Christen	04709e91d7	add nice submit buttons to pdblue skin	12 years ago
Michael Peter Christen	ef6de52ab5	dependency is java6 only	12 years ago
Michael Peter Christen	b2b516cc3e	added a collection attribute to crawls and searches: - a solr field collection_sxt can be used to store a set of crawl tags - when this field is activated, a crawl tag can be assigned when crawls are started - the content of the collection field can be comma-separated, all of them are assigned to the documents when they are indexed as result of such a crawl start - a search result can be drilled down to a specific collection; this is currently only available in the solr interface and also in the gsa interface using the 'site' option - this adds a mandatory field for gsa queries (the google api demands that field all the time)	12 years ago

9571 Commits (b85db72a73da4797d09dc72155c56ca00dd5da0f)