yacy_search_server

Commit Graph

Author	SHA1	Message	Date
orbiter	daf0f74361	joined anomic.net.URL, plasmaURL and url hash computation: search profiling showed, that a major amount of time is wasted by computing url hashes. The computation does an intranet-check, which needs a DNS lookup. This caused that each urlhash computation needed 100-200 milliseconds, which caused remote searches to delay at least 1 second more that necessary. The solution to this problem is to attach a URL hash to the URL data structure, because that means that the url hash value can be filled after retrieval of the URL from the database. The redesign of the url/urlhash management caused a major redesign of many parts of the software. Since some parts had been decided to be given up they had been removed during this change to avoid unnecessary maintenance of unused code. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4074 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	24e25e1141	enhanced SSI server-side support: - SSIs may now refer to servlets, not only files - calling a servlet, the servlet/SSI engine is called recursively - SSIs now work also for non-chunked-encoding supporting clients This will support the new search page functionality, to show search results dynamically without using javascript. To test this method, a test page has been added http://localhost:8080/ssitest.html ..calls dynamicalls 3 servlets, which produce some delays during their execution please verify that you can see the result step-by-step on your browser To implement this feature, some refactoring had been taken place, mostly code had been made static and will execute faster. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4037 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	9ca46a8c69	indexing of local (intranet) urls enabled To do this, one must create a separate YaCy network that has a local URL domain A description how to do this is here: http://www.yacy-websuche.de/wiki/index.php/De:Netzdefinition git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4001 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	40b0547611	- documentaton changes (removed old forum links) - different handling of link quotation - different handling of link normalization - enhanced html/unicode en/de-coding git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3993 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	26ddf797eb	added bmp and ico image format to all parser/viewing methods git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3969 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	154ffd7c2c	fix for wrong http connection version and SSIs git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3928 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	1782ef57e5	- added SSI parser and include directive for <!--# include virtual="<file>" --> - added chunked file transfer for non-yacy clients - SSIs are streamed using chunked transfer, partly delivered pages can be seen in browser before transmission is finished - added client-side network unit identification - cleaned up code git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3926 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
allo	465145cb6f	revert to insecure, but dau-proof defaults git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3898 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
allo	7ad11ceaaa	security fix for peers without password. allow access only from localhost git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3897 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	66ec8b63c1	added a httpd access tracker: - all requests to the own httdp can now be listed in the access tracker menu - the search statistics had been renamed to access tracker and extended by this tracker git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3861 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	33ad0c8246	added a web structure computation and logging: - all web page parsing operations will now increase a web structure file - the file is computed in memory and dumped at shutdown-time to PLASMASB/webStructure.map in readable form (not a database) - the file can be used externally to analyse the link structure of the crawled pages - the web structure can also be retrieved using a xml-interface at http://localhost:8080/xml/webstructure.xml - the short-term purpose is the computation of a link-graph image (before linuxtag!) - a long-term purpose could be a decentralized computation of the citation rank git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3746 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
karlchenofhell	601fc7d1c5	- added source to J7Zip-modifed.jar and it's license (changelog is still to come) - moved HTML-*replace-methods from wikiCode to de.anomic.data.htmlTools - prepared use of different wiki parsers as suggested here: http://www.yacy-forum.de/viewtopic.php?p=34444#34444 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3741 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	d755a8026d	- better OOM protection - better memory allocation for FlexTable indexes - splitting between static index and dynamic index (only the dynamic part must grow) - to enable a merge-iteration of new splittet index, a huge number of classes needed to be adopted for new iterator classes - added new iterator classes that support cloneable iterators - adopted all iterator classes to implement cloneable itarators git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3453 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	bf69a721cb	more protection against mis-use of YaCyHop interface: - target must not be at port 80 - target access not more than every 3 seconds - requester may not access more than every 10 seconds git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3357 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	b4aa195c27	added user-agent check for yacy-hop proxy authentication git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3343 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	d25caa07bf	redesigned some parts of http authentication added another access check for peer hops git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3340 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
karlchenofhell	2401e748a3	- fixed wrong replacement of POST-parameters in httpd ('<' and '>' are still replaced, don't know why): http://www.yacy-forum.de/viewtopic.php?t=3466 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3324 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
karlchenofhell	e68cdeeeb3	- reverted parseArg(String) to use a byte-array to handle correct UTF-8 parsing - arguments aren't passed html-escaped to the servlets anymore, bug-fix for http://www.yacy-forum.de/viewtopic.php?p=30573 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3321 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	47ab83a7c0	added flag for YaCyHop - proxy access for all paths that start with /yacy/ git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3304 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
karlchenofhell	7c40197e42	- fixed error pages and <label>s for index.html git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3226 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
allo	0c81bd39d4	XSS-safe put as default. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3217 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
karlchenofhell	340dc52a9d	- ConfigProfile_p.html now transmits usable encoding for other than 7-bit ASCII charset, see TODO in httpd.parseArg(String) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3174 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
karlchenofhell	00aa9472d6	- added decode of HTML-entities in request lines - removed Bookmark symbol on search pages and surftips if not authenticated git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3172 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	d0c32c6aeb	better protection against fraud peers git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3104 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
karlchenofhell	e17591acc3	- parse HTML arguments as UTF-8 strings git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3085 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
karlchenofhell	d30932c7d8	- fix for fix... sry git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3084 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
karlchenofhell	6118fb73ec	- added decode of UTF-16 escapes in url-arguments (%u0123), bugfix for http://www.yacy-forum.de/viewtopic.php?t=2762 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3083 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
theli	f77d624b94	*) bugfix for persistent connection support on transfer-encoded requests git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2942 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
theli	29a1f132ec	*) some strings replaced by constants git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2910 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
theli	532c23b5c7	*) soap handler - better errorhandling - adding support for outgoing transfer- and content-encoding - avoid holding outgoing messages into memory before sending them git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2872 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
theli	68204ff729	*) Suppressing for bad client requests. See: http://www.yacy-forum.de/viewtopic.php?p=26918 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2814 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
theli	cd5f349666	*) Better handling of large files during parsing Extracted text of files that are larger than 5MB is stored in a temp file instead of keeping it in memory *) plasmaParserDocument.java; getText now returnes an inputStream instead of a byte array *) plasmaParserDocument.java: new function getTextBytes returns the parsed content as byte array Attention: the caller of this function has to ensure that enough memory is available to do this to avoid OutOfMemory Exceptions *) httpd.java: better error handling if the soaphander is not installed *) pdfParser.java: - better handling of documents with exotic charsets - better handling of large documents - better error logging of encrypted documents *) rtfParser.java: Bugfix for UTF-8 support *) tarParser.java: better handling of large documents *) zipParser.java: better handling of large documents *) plasmaCrawlEURL.java: new errorcode for encrypted documents *) plasmaParserDocument.java: the extracted text can now be passed to this object as byte array or temp file git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2679 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	df1629b05a	- code cleanup - version 0.471 - moved surftipps to own web page git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2676 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
theli	2a06ce5538	*) next bugfix for UTF-8 - Sending UFT-8 messages to other peers did not work - httpd.java: minor corrections for UTF-8 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2570 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	3879a0ecd0	replaced java.net.URL usage by use of new class de.anomic.net.URL This shall be seen as an experiment to exclude all cases where there could be a DNS lookup during URL comparisment. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2290 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	ed2cb040d1	*) Bugfix for http connection header validation - Connection header was not handled correctly if it contains multiple values, e.g. Connection: TE, close git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2219 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
allo	d7a3fdb18b	no white pages, when clicking cancel on the password-dialog git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2198 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	b4ab183518	*) Bugfix for NullpointerException if the seeds IP could not be resolved git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2099 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
allo	9938c252dd	better Errorhandling for proxyAccounts git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2082 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	727aac4768	*) Bugfix for Transparent-Proxy-Support <-> Port Forwarding problem See: http://www.yacy-forum.de/viewtopic.php?p=20358 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2039 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
rramthun	42b0b10a95	-Adding Windows Media to types which are not sended compressed -Renaming writeandzip to writeandgzip to avoid confusion about type of compression -Adding new startup message to windows script -The usual language "enhancements" ;-) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1953 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	c7ececbfb2	*) httpd.mime: adding jar mimetype *) httpd.java: charset is only appended to mimetype for text mimetypes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1839 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
allo	3b4a99ff6a	fix for java 1.4.x git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1685 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	9b941fb773	*) bugfix for usage of yacy with extended port binding (e.g. #eth0:8080, 192.168.0.1:8080, etc.) - port was reported incorrectly to other peers git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1678 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
allo	2d4e1325cf	UTF-8 fix git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1676 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
hermens	c8f5adea4d	- don't send Message Body on HEAD requests, even in the case of an error git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1669 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	a7248fbb0a	*) bugfix for http/0.9 responses git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1668 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	a354bc2ec1	*) Bugfix for content length check git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1666 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
hermens	e974d0cb99	Improve compliance to rfc *) There is no status line in HTTP/0.9 *) Answers to HEAD requests should return the same headers as a GET request git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1664 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	62ffb5ece0	*) httpdFileHandler.java: adding real streaming support for lage files - avoid to read the whole file into memory - support of chunked transfer-encoding for http/1.1 clients - support of gzip content-encoding suitable clients See: http://www.yacy-forum.de/viewtopic.php?p=17058#17058 *) MessageSend_p.html: better highlighting of peer response/status messages git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1646 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	eeba8b055e	*) guessing, testing and suggesting alternative hostnames on "unknown host" error See: http://www.yacy-forum.de/viewtopic.php?t=1879 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1636 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	44996afd79	*) Bugfix for handling of http/0.9 clients. - nothing was send as response git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1610 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
(no author)	001513cc1f	Now custom httpHeader can be created and filled with cookies and so on. This header one can set into serverObjects Check CookieTest.html and CookieTest.java for details. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1334 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
(no author)	55f3232219	Patch for the Coockie management. Version 0.1 Start Yacy, go to localhost:8080/CookieTest.html Play around with cookies Look into CookieTest.java to See, how it works This behavior will be changed such that httpHeader will be responsible for the cookies in the future git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1332 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
(no author)	1d3249e787	handle UTF-8 correctly git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1323 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	9544c47684	added some UTF-8 handling. hope this will help somehow.. for shure not THE solution to our UTF-8 problem git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1308 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	fed92d364b	introduced USAGE object for counter synchronization in kelondroRecords git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1199 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
hermens	35cf6712b2	*) fixes for httpd - don't send Body on HEAD requests - don't send a Last-modified: date, that is later then Date: - Use Cache-control instead of Pragma with HTTP/1.1 - don't send header with HTTP/0.9 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1198 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
hermens	ec1202edbe	*) Fixes for httpd - Fix for local timezone in http header See: http://www.yacy-forum.de/viewtopic.php?t=836 - Allow static content to be cached by browser See: http://www.yacy-forum.de/viewtopic.php?t=1311 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1184 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	37f88b4017	code cleanup git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1176 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	76618442e0	code cleanup git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1173 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	1c3750de57	*) Bugfix for code cleanup git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1161 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	1d6a6d1f85	code cleanup git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1159 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	a04930f025	code cleanup git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1158 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	7e670894d9	*) Suppressing stackTraces in proxyError message for "connect timed out" errors See: http://www.yacy-forum.de/viewtopic.php?t=1504 *) Increasing default http client timeout git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1129 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	fb766413d1	*) Changes on httpc dns caching - Bugfix: old dns cache did not handle case insensitive hostnames correctly. - adding a possibility to set domain name patterns defining hostnames that should not be cached by the httpc dns cache e.g. borg-300.dyndns.org This can be done by setting the new httpc.nameCacheNoCachingPatterns property - using httpc.dnsResolve wherever possible within the sourcecode [httpd.java,plasmaCrawlStacker.java] git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1044 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
hydrox	cb69047b91	*)cleanup access static methods and fields git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1016 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
hydrox	56b9f34411	*)removed unused imports git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1015 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	40777556c5	*) Connection Tracking - adding automatic refresh - accepts new parameter nameLookup which can be used to deactivate yacy-peer name lookup (because we have problems with this on large seed-dbs) *) ViewFile New page that can be used to view - original content - plain text content - parsed content - parsed sentences of a webpage specified by there url hash Mainly for debugging purpose at the moment *) Robots.txt Bugfix for if-modified-since usage TODO: synchronization of downloads to avoid loading the same robots-file multiple times in parallel by different threads *) Shutdown Better abortion of transferRWI and transferURL sessions on server shutdown *) Status Page Adding icon to start/stop crawling via status page git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@950 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
allo	43a127ff3a	allow httpsTunnels to other Ports than 443. (if secureHttps=false) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@940 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
allo	4320425a17	ipAuth (this does not work yet) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@937 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
allo	b88a9584f8	New Errorpage git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@928 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	b177a80bb7	*) Bugfix for sendRespondError StackOverFlowException problem git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@927 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	c8a35a0130	*) Adding new connection tracking page (currently only for incoming connections) *) Displaying statistic for incoming connections on status page *) Bugfix for Loop-Access Bug when trying to access the yacy page while yacy is configured as proxy See: http://www.yacy-forum.de/viewtopic.php?p=6826 *) Bugfix for Referer Bug See: http://www.yacy-forum.de/viewtopic.php?p=11098#11098 *) Adding reverse Name lookup for yacy-domain names (used by the connection tracking page) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@916 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
allo	f1ff33177d	reset Timelimits on Daychange git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@904 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	a9e25c26e1	*) adding new sendRespondError method to httpd which accepts a template include file for individual error messages git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@902 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
allo	5605cc8018	TimeLimits git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@901 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
allo	f65c939a60	userDB Auth git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@874 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	a2fa75e688	*) Asynchronous queuing of crawl job URLs (stackCrawl) various checks like the blacklist check or the robots.txt disallow check are now done by a separate thread to unburden the indexer thread(s) TODO: maybe we have to introduce a threadpool here if it turn out that this single thread is a bottleneck because of the time consuming robots.txt downloads *) improved index transfer The index selection and transmission is done in parallel now to improve index transfer performance. TODO: maybe we could speed up performance by unsing multiple transmission threads in parallel instead of only a single one. *) gzip encoded post requests it is now configureable if a gzip encoded post request should be send on intex transfer/distribution *) storage Peer (very experimentell and not optimized yet) Now it's possible to send the result of the yacy indexer thread to a remote peer istead of storing the indexed words locally. This could be done by setting the property "storagePeerHash" in the yacy config file - Please note that if the index transfer fails, the index ist stored locally. - TODO: currently this index transfer is done by the indexer thread. To seedup the indexer a) this transmission should be done in parallel and b) multiple chunks should be bundled and transfered together *) general performance improvements - better memory cleanup after http request processing has finished - replacing some string concatenations with stringBuffers - replacing BufferedInputStreams with serverByteBuffer - replacing vectors with arraylists wherever possible - replacing hashtables with hashmaps wherever possible This was done because function calls to verctor or hashtable functions take 3 time longer than calls to functions of arraylists or hashmaps. TODO: we should take a look on the class serverObject which is inherited from hashmap Do we realy need a synchronization for this class? TODO: replace arraylists with linkedLists if random access to the list elements is not needed *) Robots Parser supports if-modified-since downloads now If the downloaded robots.txt file is older than 7 days the robots parser tries to download the robots.txt with the if-modified-since header to avoid unnecessary downloads if the file was not changed. Additionally the ETag header is used to detect changes. *) Crawler: better handling of unsupported mimeTypes + FileExtension *) Bugfix: plasmaWordIndexEntity was not closed correctly in - query.java - plasmaswitchboard.java *) function minimizeUrlDB added to yacy.java this function tests the current urlHashDB for unused urls ATTENTION: please don't use this function at the moment because it causes the wordIndexDB to flush all words into the word directory! git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@853 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
allo	cd77078aa0	old Version restored before Release git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@842 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
allo	a4b747fe97	ProxyAccounts based on userDB git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@841 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	d388292f24	*) adding function for user accounting which is called after each http request git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@827 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	595e0c7e56	*) Bugfix for ProxyErrormsg: Wrong base URL See: http://www.yacy-forum.de/viewtopic.php?p=9905#9905 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@815 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	5f95a1cf62	*) Bugfix for ProxyErrormsg: Wrong http host header See: http://www.yacy-forum.de/viewtopic.php?p=9905#9905 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@795 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	1dc94e7753	*) Adding support for gzip content-encoding of http post requests used to transferRWIs and transferURLs. See: http://www.yacy-forum.de/viewtopic.php?t=1167#10020 *) adding yacyVersion.java containing constants defining yacy versions that support a given feature. Needed to determine if a remote peer is able to decode gzip content-encoded http post bodies properly. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@772 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	b990dc1ad1	*) Replacing jsch 0.1.19 lib with newer version 0.1.21 *) Replacing PDFBox 0.7.1 lib with newer version 0.7.2 *) Refactoring of classes httpd/httpc/httpHeaders to make many methods for httpHeader/Requestline parsing reusable for new icap implementation *) adding chunked input stream support - needed by new icap implementation - needed by future httpc HTTP/1.1 support *) httpd.java - moving all connection property contants to class httpHeader - moving readHeader function to class httpHeader - moving parseQuery function to class httpHeader - moving handleTransparentProxy function to class httpHeader *) httpHeader.java - adding new fuction to parse the http response line - adding new function to converte http headers to a string that can be send to the client - adding a function that generates a proper url using all parsed connection properties *) ICAP Support - yacy now supports handling of icap response modification requests - this feature can be used by other icap enabled proxies to contact yacy as icap server, and to handover the downloaded content to yacy.logging for indexing - functionality was successfully tested with squid 2.5Stable 10 + icap patch - further icap services e.g. URL filtering based on yacy's blacklists are possible *) plasmaSwitchboard.java - htcache entries that are still needed for indexing are now properly registered as in use after system restart - extended logging: log message now shows parsing and indexing time for each sb. entry git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@757 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	7809b382bf	*) Bugfix for Blacklist support for https (only initial connect) See: http://www.yacy-forum.de/viewtopic.php?p=9419 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@684 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	a7256e8f4e	*) Adding X-Forwarded-For Header See: http://www.yacy-forum.de/viewtopic.php?t=1118&highlight=xforwardedfor *) httpc.java: Bugfix for incorrect http response statuscode parsing In some situations the statustext whas chopped *) Adding a lot of fileheaders containing YaCy copyright and license *) httpd.java: Adding additional debugging http header that should help du detect the "binary data in browser window" bug. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@653 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	4fd5b95b1f	*) Renaming Logger function names to reflect the proper Java Logging API Loglevels - please use logFine instead of logDebug - please use logSevere instead of logFailure and logError See: http://www.yacy-forum.de/viewtopic.php?p=8726#8726 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@615 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	6adf8a4bde	*) Renaming Logger function names to reflect the proper Java Logging API Loglevels - please use logFine instead of logDebug - please use logFailure instead of logError See: http://www.yacy-forum.de/viewtopic.php?p=8726#8726 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@614 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	8132a44305	*) Better error handling if yacy SOAP extension is not installed See: http://www.yacy-forum.de/viewtopic.php?t=1040 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@594 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
allo	66ebce1109	use staticIP more often git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@592 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	858cb983d7	*) Printout date and system name on proxy error page git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@581 6c8d7289-2bf4-0310-a012-ef5d649a1542	20 years ago
theli	cb97d2972e	*) Bugfix for "peer not accessible via .yacy name if Transparent Proxy Support is enabled" bug See: per Browser nicht erreichbare Peers *) Proxy Error Page now displays the Peer Name on top git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@575 6c8d7289-2bf4-0310-a012-ef5d649a1542	20 years ago
theli	8c62fb49ba	*) Bugfix for httpdSoapHandler Initialisation. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@545 6c8d7289-2bf4-0310-a012-ef5d649a1542	20 years ago
orbiter	2d8557cb10	minor changes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@487 6c8d7289-2bf4-0310-a012-ef5d649a1542	20 years ago
theli	13eeaa08f3	*) httpc.java: - Now it's possible to interrupt pending httpc-actions on server shutdown - this is possible because of a newly introduced registration mechanism for open sockets *) yacyCore.java - blocking peerPing threads can now be interrupted on server shutdown *) serverCore.java - restructuring shutdown code *) error.html - port number is now set correctly if port forwarding was enabled git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@389 6c8d7289-2bf4-0310-a012-ef5d649a1542	20 years ago
orbiter	5159a090b0	fixed parser bug with lowercase force (appeared in: http://spellbound.sourceforge.net/) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@367 6c8d7289-2bf4-0310-a012-ef5d649a1542	20 years ago
theli	6e97f70549	*) httpd.java: improved errorhandling git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@333 6c8d7289-2bf4-0310-a012-ef5d649a1542	20 years ago
theli	9d8c66fb5e	*) adding possibility to forward received yacy-messages (htroot/yacy/message.java) via a command-line email program (e.g. sendmail) to a configured email address - the configuration dialog is reachable via Settings_p.html#messageForwarding git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@332 6c8d7289-2bf4-0310-a012-ef5d649a1542	20 years ago

175 Commits (d6a5c98080240771f3877b49c400f60004e646db)