yacy_search_server

Commit Graph

Author	SHA1	Message	Date
reger	8768896975	extract lastmodified from openoffice doc set lastmod date in office document parsers	10 years ago
Michael Peter Christen	c40c302748	when many crawl queues are generated, this NPE can occur; probably caused as concurrency issue: W 2015/09/05 14:09:10 ConcurrentLog java.lang.NullPointerException java.lang.NullPointerException at java.util.TreeMap.rotateRight(TreeMap.java:2239) at java.util.TreeMap.fixAfterInsertion(TreeMap.java:2271) at java.util.TreeMap.put(TreeMap.java:582) at net.yacy.kelondro.table.Table.<init>(Table.java:235) at net.yacy.crawler.HostQueue.openStack(HostQueue.java:229) at net.yacy.crawler.HostQueue.getStack(HostQueue.java:204) at net.yacy.crawler.HostQueue.push(HostQueue.java:397) at net.yacy.crawler.HostBalancer.push(HostBalancer.java:237) at net.yacy.crawler.data.NoticedURL.push(NoticedURL.java:184) at net.yacy.crawler.CrawlStacker.stackCrawl(CrawlStacker.java:355) at net.yacy.crawler.CrawlStacker.job(CrawlStacker.java:134) at sun.reflect.GeneratedMethodAccessor6.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:483) at net.yacy.kelondro.workflow.InstantBlockingThread.job(InstantBlockingThread.java:101) at net.yacy.kelondro.workflow.AbstractBlockingThread.run(AbstractBlockingThread.java:82) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745)	10 years ago
Michael Peter Christen	94cfa63c46	Merge branch 'master' of git@github.com:yacy/yacy_search_server.git	10 years ago
Michael Peter Christen	0a37d8af89	in case that a site crawl is started for urls with file:// path, the host filter does not work because there is no host given in such urls. In that case, patch the filter to be a sub-path filter.	10 years ago
reger	367fe388b9	fix exception throw after sendError in DefaultServlet - reduce debug exception logs in crawler	10 years ago
Michael Peter Christen	348b8db9d2	Merge pull request #12 from luccioman/master Updated french locale and added new translator utils	10 years ago
luccioman	9df249296a	Return to main repository version	10 years ago
luccioman	9752bd5f88	Added utils to help translation without launching full YaCy application : - translate all source files with a locale - list all non translated files with a locale	10 years ago
luccioman	2f0f0180e2	Added a function to list files recursively.	10 years ago
luccioman	7e4c1d2282	Translator refactoring : - deleted useless new StringBuilder allocation - use of a new reusable FileNameFilter - added javadoc	10 years ago
luccioman	c1d937a90c	Merge branch 'master' of ssh://git@github.com/yacy/yacy_search_server	10 years ago
reger	7c1da173e0	fix missing license in image search see http://mantis.tokeek.de/view.php?id=522	10 years ago
luccioman	f17863588f	Updated french translations for yacysearchitem.html, yacysearchtrailer.html and Steering.html files. Corrected various labels.	10 years ago
luccioman	918ef72bbe	Corrected br markup	10 years ago
luccioman	f88bb2277e	Corrected bookmark link title	10 years ago
luccioman	802ea66d19	Merge branch 'master' of ssh://git@github.com/yacy/yacy_search_server	10 years ago
reger	5297e80cda	fix missing onclick in ConfigPortal to enable checkbox	10 years ago
luccioman	cc8d6ad75f	Merge branch 'master' of ssh://git@github.com/yacy/yacy_search_server	10 years ago
reger	802ccaead6	fix init of error cache, use latest faildates => load_date_dt	10 years ago
reger	dba7f15073	apply same size constraint on result image from doc as for linked images see `19f1308bf0`	10 years ago
reger	5e45f1a460	enable Solr schema dynamicField _p (type=location) for YaCy coordinate_p field	10 years ago
luccioman	70e483ecc6	Merge branch 'master' of ssh://git@github.com/yacy/yacy_search_server	10 years ago
reger	4cf875336c	complete TODO: getFileExtension handle dot in query part + testcase	10 years ago
sixcooler	87e4abe393	fight the fieldcache by using DocValues: in Solr-5.x the fieldcache has moved and was not cleared anymore. This results in a huge fieldcache. (http://lucene.apache.org/#highlights-of-the-lucene-release-include https://issues.apache.org/jira/browse/LUCENE-5666) Here I try to use DocValues where it is possible. For this I used the Api-Scheme as new basis for the Solr-Schema. This needs at least a complete optimization of the Solr-Index to get a smaller FieldCache. Everything that is indexed with these settings will not use the FieldCache at all.	10 years ago
sixcooler	c729d089b6	French Translation update by Luc: http://forum.yacy-websuche.de/viewtopic.php?f=8&t=5671	10 years ago
luccioman	e0dda0c01c	Merge branch 'master' of ssh://git@github.com/yacy/yacy_search_server.git	10 years ago
reger	eaf0e8ff2c	start recording/indexing pixel size for image document as for linked images	10 years ago
reger	c33229fc0c	check mime prior to ext for metadata modification for images	10 years ago
reger	19f1308bf0	enforce the result image limit to > 16x16px for linked images http://mantis.tokeek.de/view.php?id=594	10 years ago
luccioman	a4509ea2ca	Updated french translation for index.html, yacysearch.html and simpleheader.template. Corrected special characters to use HTML entities instead.	10 years ago
reger	250f6457f0	remove expired domain titan.deep-one.in from bootstrap.seedlist	10 years ago
luccioman	67799ce867	Updated translation of index.html, yacysearch.html and simpleheader.template, corrected some special characters not written as HTML entities.	10 years ago
reger	0e4ba0360b	fix NPE on .yacyh result url of disconnected peer (cleanup yacyshare remaining)	10 years ago
reger	7ed812a2bf	log missing seed.port in favour of exception to prevent repeating throws	10 years ago
reger	206883f80d	fix: Preserve protocol in url proxy to connect to http/https. Display warning if https target is viewed over http	10 years ago
reger	f7b0b3b7b3	avoid runtime exception by earlier testing for seed.ip=null	10 years ago
reger	0f80bc8309	upd to jsoup-1.8.3	10 years ago
Michael Peter Christen	906b5fd742	Merge branch 'master' of git@github.com:yacy/yacy_search_server.git	10 years ago
Michael Peter Christen	8f90767889	fix for filesystem crawl	10 years ago
sixcooler	a3dd4be749	added / corrected charset to be 1.7 compatible. @Orbiter: please check if this is ok for you	10 years ago
Michael Peter Christen	8028410ab7	Merge branch 'master' of git@github.com:yacy/yacy_search_server.git	10 years ago
Michael Peter Christen	df3314ac1a	added a new facet type based on a probabilistic classifier using bayesian filters. This can be used to classify documents during indexing-time using a predefined bayesian filter. New wordings: - a context is a class where different categories are possible. The context name is equal to a facet name. - a category is a facet type within a facet navigation. Each context must have several categories, at least one custom name (things you want to discover) and one with the exact name "negative". To use this, you must do: - for each context, you must create a directory within DATA/CLASSIFICATION with the name of the context (the facet name) - within each context directory, you must create text files with one document each per line for every category. One of these categories MUST have the name 'negative.txt'. Then, each new document is classified to match within one of the given categories for each context.	10 years ago
reger	1409cabe8b	exclude more default search fields from text copy to text_t for metadata index documents	10 years ago
reger	e2e73258ca	remove obsolete interface SearchAccumulator and unused SRURSSConnector Thread inheritance	10 years ago
Michael Peter Christen	dbbad23e12	removed warnings	10 years ago
Michael Peter Christen	500cfa9457	enhanced logging	10 years ago
Michael Peter Christen	c14bc8d9b7	revert of fq transformation (recent fix)	10 years ago
Michael Peter Christen	203df5a750	Merge branch 'master' of git@github.com:yacy/yacy_search_server.git	10 years ago
reger	fa08ca207e	! finish running crawls before applying ! Allow crawl urls up to 2048 character fix for http://mantis.tokeek.de/view.php?id=575	10 years ago
reger	ee77f24e52	use some more declared HeaderFramework constants	10 years ago

12007 Commits (81f53fc83a339e60c57be4899b98b09e33a48be2)