yacy_search_server

Commit Graph

Author	SHA1	Message	Date
Roland 'Quix0r' Haeder	a093ccf5eb	Now used synchronization in all close() methods to make sure all objects are 'closed' in an ordered way Conflicts: source/de/anomic/http/server/ChunkedInputStream.java source/de/anomic/http/server/ChunkedOutputStream.java source/de/anomic/http/server/ContentLengthInputStream.java source/net/yacy/cora/protocol/Domains.java source/net/yacy/cora/services/federated/solr/SolrShardingConnector.java source/net/yacy/cora/services/federated/solr/SolrSingleConnector.java source/net/yacy/document/content/dao/PhpBB3Dao.java source/net/yacy/document/parser/html/AbstractTransformer.java source/net/yacy/kelondro/blob/BEncodedHeap.java source/net/yacy/kelondro/blob/HeapReader.java source/net/yacy/kelondro/index/RAMIndexCluster.java source/net/yacy/kelondro/io/ByteCountInputStream.java source/net/yacy/kelondro/logging/ConsoleOutErrHandler.java source/net/yacy/kelondro/table/SQLTable.java	13 years ago
Michael Peter Christen	f5efdb21fd	refactoring	13 years ago
Michael Peter Christen	a1a5b015d8	refactoring: moved document Classification to cora package	13 years ago
Michael Peter Christen	33d1062c79	refactoring: the cache belongs to the crawler	13 years ago
Michael Peter Christen	8aba045ba1	if a new pop-up page is set in config portal, then this page applies also to the default page configuration for the httpd if no path is given.	13 years ago
low012	2120db289a	*) Small change which should solve problem with cgitb module in Python CGI scripts.	13 years ago
Michael Peter Christen	9ad1d8dde2	complete redesign of crawl queue monitoring: do not look at a ready-prepared crawl list but at the stacks of the domains that are stored for balanced crawling. This affects also the balancer since that does not need to prepare the pre-selected crawl list for monitoring. As a effect: - it is no more possible to see the correct order of next to-be-crawled links, since that depends on the actual state of the balancer stack the next time another url is requested for loading - the balancer works better since the next url can be selected according to the current situation and not according to a pre-selected order.	13 years ago
Michael Peter Christen	4540174fe0	memory hacks	13 years ago
Michael Peter Christen	9ebcae2fbc	enhanced url parser to understand urls with &amp; instead of & in post urls	13 years ago
Marek Otahal	72adbeae90	!Important: move from Hashtable to HashMap Hashtable is an obsolete collection v1, now since v2 offers HashMap with same or better functionality. Please review, almost all code was already moved, so only a few changes. That is not the issue, but I found notices that some (ugly big) helper classes had to be created in past to compensate missing Hashtable's functionality. I'd like input if we can remove some of them. look for //FIX: if these commits Signed-off-by: Marek Otahal <markotahal@gmail.com>	13 years ago
Marek Otahal	ed253b7aff	update javadoc, does not throw IOException Signed-off-by: Marek Otahal <markotahal@gmail.com>	13 years ago
Michael Christen	3eccdca63c	protection against too long running snippet fetch processes	13 years ago
low012	7cfdc2c092	Improved CGI capabilities: *) CGI respects shebang now (should solve problems with MS Windows) *) better error handling (more correct HTTP error codes) *) logging git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8136 6c8d7289-2bf4-0310-a012-ef5d649a1542	13 years ago
orbiter	35a9e8f307	- fixed network graphic - debuged evaluation tables - changed cache settings in template engine - some speed hacks - changed int angles for peer positions in network graphic to double angles git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8124 6c8d7289-2bf4-0310-a012-ef5d649a1542	13 years ago
orbiter	8895d8c1cd	removed unnecessary log entries git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8117 6c8d7289-2bf4-0310-a012-ef5d649a1542	13 years ago
orbiter	5a55397f99	some last-minute performance hacks git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8101 6c8d7289-2bf4-0310-a012-ef5d649a1542	13 years ago
orbiter	c50f8f9a06	code cleanup git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8055 6c8d7289-2bf4-0310-a012-ef5d649a1542	13 years ago
orbiter	37e35f2741	normalization of url using urlencoding/decoding git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8017 6c8d7289-2bf4-0310-a012-ef5d649a1542	13 years ago
orbiter	9e4875230f	performance hacks git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@8001 6c8d7289-2bf4-0310-a012-ef5d649a1542	13 years ago
orbiter	a7df70221e	refactoring git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7987 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
f1ori	e207c41c8e	* fix urlproxy for urls containing dolar signs git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7979 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	d2ea250d99	refactoring: - moved many classes from de.anomic to net.yacy - made more sub-packages for search classes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7973 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	6b22865dbc	- removed some warinings - removed a dead update location git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7970 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
f1ori	87e6abd168	* fix urls containing a port number in urlproxy git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7964 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
f1ori	97045022fa	* pass cookies to Server Side Includes * User.html a bit more usable git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7963 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	610b01e1c3	- added a 'add every media object linked in a html document as a new document' to the html parser. This causes that all image, app, video or audio file that is linked in a html file is added as document. In fact that means that parsing a single html document may cause that a number of documents is inserted into the search index. - some refactoring for mime type discovery git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7919 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
sixcooler	169236c6d9	almost revert changes in this class of 7880 and 7882 since MemoryControl does handle negative value requests git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7887 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	c64faf41e2	addon to svn 7880 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7882 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
sixcooler	06408a9428	since many POST-requests come as gzip they report a contentlength of -1 request memory of -1 * 3 look useless to me so I added some megs to it - even correct report of contentlength should not be harmed by this git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7880 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
sixcooler	a311596881	finishing up my commits (7855-7858) which could be helpful for not declaring inside loops (helps GC of some VMs) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7859 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
f1ori	3a5fa73008	* revert parts of previous commit, because it breaks the trickle-feature git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7851 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
f1ori	6e79675ff3	* use gzip-encoding in more cases * send Expire-Header for static content * should improve webserver-performance for slow connections * fixes #37 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7850 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
cominch	09bb7a390c	do not replace malformed or invalid URLs in urlproxy git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7835 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
f1ori	96957375cc	* fix url proxy for relative links and chromium git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7805 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	f7ca84cfc0	enhanced template engine git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7800 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	7db208c992	performance hacks: more pre-allocated StringBuilder git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7790 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	87bd559c42	fixed warning git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7789 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	f30d36b101	enhanced template engine git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7783 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	0c1b29f3c9	- applied many small performance hacks - added a memory limitation in the zip parser and the pdf parser - added a search throttling: if there are too many search queries are still to be computed, then new requests are not accepted for some time. if after a one second still no space is there to perform another search, the search terminates with no results. this case should only happen in case of DoS-like situations and in case of strong load on a peer like if it is integrated in metager. - added a search cache deletion process that removes search requests in case that throttling happens git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7766 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
f1ori	900dacbf97	* improve link rewriting in proxy-url * only rewrites links, which are in current search domain git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7765 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
f1ori	dc855d881b	* further improve proxyurl git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7762 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	4bea3f9714	hack to reduce resource contention caused by massive UTF8 decodings which use java.nio resources: used a ASCII String <-> byte[] conversion wherever possible. Many Strings in YaCy are hashes which are pure ASCII (base64 hashes). The new ASCII String <-> byte[] conversion method have less computation overhead than the UTF8 conversion. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7746 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	746e3c3b06	Replaced a widely-used Property Object in the httpd with HashMap<String, Object> which is not synchronized like Properties A synchronization is not needed here and applies an overhead to the httpd process which is now removed. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7745 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
f1ori	14e1666b21	* fix replacing regexes in url proxy git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7742 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	d1dbbd956a	always use a template method cache even if the template cache flag is set to false. This flag is only used to make dynamic updates to the template files, to not dynamic updates to the rewrite methods (which is not possible without recompiling). low memory usage is guaranteed by the usage of soft references which are dropped before an OOM is thrown git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7735 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	b45701d20f	this is a re-implementation of the YaCy Block Rank feature This time it works like this: - each peer provides its ranking information using the yacy/idx.json servlet - peers with more than 1 GB ram will load this information from all other peers, combine that into one ranking table and store it locally. This happens during the start-up of the peer concurrently. The new generated file with the ranking information is at DATA/INDEX/<network>/QUEUES/hostIndex.blob - this index is then computed to generate a new fresh ranking table. Peers which can calculate their own ranking table will do that every start-up to get latest feature updates until the feature is stable - I computed new ranking tables as part of the distribition and commit it here also - the YBR feature must be enabled manually by setting the YBR value in the ranking servlet to level 15. A default configuration for that is also in the commit but it does not affect your current installation only fresh peers - a recursive block rank refinement is implemented but disabled at this point. it needs more testing Please play around with the ranking settings and see if this helped to make search results better. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7729 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	9248a4eef4	reduce teh effect of 'Bildersuche findet generierte HTML-Seiten als Bilder' see http://bugs.yacy.net/view.php?id=9 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7705 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	6e42d4de88	- added full-String search function: find things that match exactly what is quoted in the query - re-structuring authentification methods to fix a problem with API steering git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7697 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	6fa439c82b	- refactoring of robots - added option to crawler to send error-URLs to solr - changed solr scheme slightly (no multi-value fields where no multi values are) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7693 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago
orbiter	d8e934c085	better abstraction of http client identification git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7675 6c8d7289-2bf4-0310-a012-ef5d649a1542	14 years ago

858 Commits (5b3acc12cd4b4343c4e7d7f0a20a1da8ea8d5f6a)