- by default, an (empty) Solr storage instance is created at
SEGMENTS/solr_36
- the index is only written if the flag "embedded solr search index" is
switched on in /IndexFederated_p.html
- a standard Solr query interface is now available through a new servlet
at http://127.0.0.1:8090/solr/select
To test this, do the following:
- switch to webportal mode
- switch on the feature as described above
- do a crawl; this fills the Solr index. The normal YaCy search will NOT
work now!
- do a Solr query, like:
http://127.0.0.1:8090/solr/select?q=*:*
http://127.0.0.1:8090/solr/select?q=text_t:Help
and play with the different search fields that are listed in
/IndexFederated_p.html
You can use the standard Solr query attributes as described at
http://wiki.apache.org/solr/SearchHandler
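A minimal sketch of such a query done from Java instead of a browser,
assuming the peer runs on the default port 8090 (the query string is
only an example; everything else is plain JDK):

    import java.io.BufferedReader;
    import java.io.InputStreamReader;
    import java.net.URL;
    import java.net.URLEncoder;

    public class SolrSelectDemo {
        public static void main(String[] args) throws Exception {
            // encode the Solr query, e.g. a search in the text_t field
            String q = URLEncoder.encode("text_t:Help", "UTF-8");
            URL url = new URL("http://127.0.0.1:8090/solr/select?q=" + q);
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(url.openStream(), "UTF-8"))) {
                String line;
                while ((line = in.readLine()) != null) {
                    System.out.println(line); // print the raw Solr response
                }
            }
        }
    }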
A limit was necessary because some web pages have such a huge number of
links that the sheer number of links alone can easily cause an OOM. The
question whether a limit of 1000 links is sufficient or too restrictive
can only be answered by testing this feature.
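A minimal sketch of the kind of cap that is meant here, assuming the
parser collects outgoing links into a list (the class and constant names
are hypothetical; only the value 1000 is taken from the text above):

    import java.util.ArrayList;
    import java.util.List;

    public class LinkCollector {
        // whether 1000 is the right value is exactly the open question
        private static final int MAX_LINKS = 1000;
        private final List<String> links = new ArrayList<String>();

        public void addLink(String url) {
            // drop further links instead of risking an OOM
            if (links.size() >= MAX_LINKS) return;
            links.add(url);
        }

        public List<String> getLinks() {
            return links;
        }
    }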
computation. This error was there before the latest IPv6 hack but did
not cause an NPE. The IPv6 hack was not the cause of this bug, but it
uncovered the misconfiguration of the 'referer' (referrer) value.
- do not replace malformed or invalid URLs in urlproxy
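A minimal sketch of the guard that is meant here, assuming the proxy
rewrites link targets given as plain strings (the class name, method
name and proxy servlet path are hypothetical; only java.net.URI is used
for the validity check):

    import java.io.UnsupportedEncodingException;
    import java.net.URI;
    import java.net.URISyntaxException;
    import java.net.URLEncoder;

    public class UrlProxyRewriter {

        /** Rewrite a link target for the proxy, but leave anything that
         *  does not parse as a valid absolute URL untouched. */
        public static String rewriteOrKeep(String link) {
            try {
                URI uri = new URI(link);
                if (uri.getHost() == null) {
                    return link; // relative or incomplete URL: do not replace
                }
                // hypothetical proxy servlet path, only for illustration
                return "/proxy.html?url="
                        + URLEncoder.encode(uri.toASCIIString(), "UTF-8");
            } catch (URISyntaxException e) {
                return link; // malformed URL: do not replace
            } catch (UnsupportedEncodingException e) {
                return link; // UTF-8 is always present; keep the link anyway
            }
        }
    }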
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7835 6c8d7289-2bf4-0310-a012-ef5d649a1542
Conflicts:
source/de/anomic/http/server/HTTPDFileHandler.java
Rewrite engine to customize existing webpages; originally implemented by
Florian Richter.
Conflicts:
source/de/anomic/http/server/HTTPDProxyHandler.java
ready-prepared crawl list but at the stacks of the domains that are
stored for balanced crawling. This also affects the balancer, since it
no longer needs to prepare the pre-selected crawl list for monitoring.
As an effect:
- it is no longer possible to see the correct order of the next
to-be-crawled links, since that depends on the actual state of the
balancer stacks at the time the next URL is requested for loading
- the balancer works better, since the next URL can be selected
according to the current situation and not according to a pre-selected
order (see the sketch after this list).
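A minimal sketch of selecting the next URL from per-domain stacks at
request time, assuming a simple round-robin over the domains (all class,
field and method names are hypothetical, not the actual Balancer code):

    import java.util.ArrayDeque;
    import java.util.Deque;
    import java.util.HashMap;
    import java.util.Map;

    public class DomainStacks {
        // one queue of pending URLs per domain plus a round-robin order
        private final Map<String, Deque<String>> stacks = new HashMap<String, Deque<String>>();
        private final Deque<String> domainOrder = new ArrayDeque<String>();

        public void push(String domain, String url) {
            Deque<String> stack = stacks.get(domain);
            if (stack == null) {
                stack = new ArrayDeque<String>();
                stacks.put(domain, stack);
                domainOrder.add(domain);
            }
            stack.add(url);
        }

        /** Select the next URL only when it is requested, based on the
         *  current state of the stacks. */
        public String next() {
            while (!domainOrder.isEmpty()) {
                String domain = domainOrder.poll();
                Deque<String> stack = stacks.get(domain);
                if (stack == null || stack.isEmpty()) {
                    stacks.remove(domain); // domain exhausted
                    continue;
                }
                String url = stack.poll();
                domainOrder.add(domain); // rotate the domain to the back
                return url;
            }
            return null; // nothing left to crawl
        }
    }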
Hashtable is an obsolete collection from Java 1.0; since the Java 2
collections framework, HashMap offers the same or better functionality.
Please review: almost all code was already moved, so only a few changes
remain. That is not the issue, but I found notes that some (ugly, big)
helper classes had to be created in the past to compensate for
functionality missing from Hashtable. I'd like input on whether we can
remove some of them.
Look for //FIX: in these commits.
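A minimal sketch of the kind of replacement that is meant, assuming the
map may also be shared between threads (the field names are
hypothetical):

    import java.util.HashMap;
    import java.util.Hashtable;
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    public class MigrationExample {
        // before: Hashtable, the legacy (Java 1.0) synchronized collection
        private final Map<String, byte[]> legacy = new Hashtable<String, byte[]>();

        // after, single-threaded use: plain HashMap
        private final Map<String, byte[]> local = new HashMap<String, byte[]>();

        // after, shared between threads: ConcurrentHashMap instead of
        // relying on Hashtable's per-method synchronization
        private final Map<String, byte[]> shared = new ConcurrentHashMap<String, byte[]>();
    }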
Signed-off-by: Marek Otahal <markotahal@gmail.com>