- remove unnecessary generation of Calendar and Date objects
- synchronized access to SimpleDateFormat objects in blogBoard, messageBoard and wikiBoard
- correct use of TimeZones and SimpleDateFormats
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4288 6c8d7289-2bf4-0310-a012-ef5d649a1542
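A minimal sketch of why the synchronization matters: SimpleDateFormat keeps mutable internal state, so a shared instance has to be guarded when several threads use it. The class name SharedDateFormat and the pattern are assumptions for illustration only and are not taken from the YaCy sources.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper, not the YaCy implementation.
final class SharedDateFormat {
    // One shared formatter; every access is synchronized on the instance.
    private static final SimpleDateFormat FORMAT =
            new SimpleDateFormat("yyyyMMddHHmmss", Locale.US);

    static {
        // Set the time zone explicitly instead of relying on the JVM default.
        FORMAT.setTimeZone(TimeZone.getTimeZone("GMT"));
    }

    static String format(final Date date) {
        synchronized (FORMAT) {
            return FORMAT.format(date);
        }
    }

    static Date parse(final String text) {
        synchronized (FORMAT) {
            try {
                return FORMAT.parse(text);
            } catch (final ParseException e) {
                throw new IllegalArgumentException("unparseable date: " + text, e);
            }
        }
    }
}
```

Reusing one guarded instance also avoids creating a new formatter (and its internal Calendar) for every call.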
RFC 2616 requires a client to support RFC 1123 (default), RFC 1036 and ANSI C formatted date strings (we only supported 1123 before).
Closes: http://forum.yacy-websuche.de/viewtopic.php?f=6&t=525 (and probably others). Some servers break the standards; please report those "DATE ERROR" messages if they contain a "sane" date string.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4243 6c8d7289-2bf4-0310-a012-ef5d649a1542
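The three date formats named above can be tried one after another. The following is a sketch under stated assumptions, not the code from this revision: the class name HttpDateParser is hypothetical, and the two-digit year window is pinned to start in 1970 so that results do not depend on the current date.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper, not the YaCy implementation.
final class HttpDateParser {
    private static final String[] PATTERNS = {
        "EEE, dd MMM yyyy HH:mm:ss zzz", // RFC 1123: Sun, 06 Nov 1994 08:49:37 GMT
        "EEEE, dd-MMM-yy HH:mm:ss zzz",  // RFC 1036: Sunday, 06-Nov-94 08:49:37 GMT
        "EEE MMM d HH:mm:ss yyyy"        // ANSI C asctime(): Sun Nov  6 08:49:37 1994
    };

    /** Returns the parsed date, or null if none of the three formats matches. */
    static Date parse(final String value) {
        final String s = value.trim();
        for (final String pattern : PATTERNS) {
            // A fresh formatter per attempt avoids sharing mutable state between threads.
            final SimpleDateFormat format = new SimpleDateFormat(pattern, Locale.US);
            format.setTimeZone(TimeZone.getTimeZone("GMT"));
            // Assumption: two-digit years are read as 1970..2069.
            format.set2DigitYearStart(new Date(0L));
            final ParsePosition pos = new ParsePosition(0);
            final Date date = format.parse(s, pos);
            // Accept only if the whole string was consumed.
            if (date != null && pos.getIndex() == s.length()) {
                return date;
            }
        }
        return null;
    }
}
```

A null result corresponds to the "DATE ERROR" case mentioned above, where the server sent something outside all three formats.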
Search profiling showed that a large share of time is wasted computing URL hashes. The computation performs an intranet check, which needs a DNS lookup. As a result, each URL hash computation took 100-200 milliseconds, which delayed remote searches by at least 1 second more than necessary. The solution is to attach the URL hash to the URL data structure, so that the hash value can be filled in after the URL is retrieved from the database. The redesign of the url/urlhash management required a major redesign of many parts of the software. Parts that had already been slated for removal were removed during this change to avoid unnecessary maintenance of unused code.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4074 6c8d7289-2bf4-0310-a012-ef5d649a1542
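The idea of attaching the hash to the URL object can be sketched as follows. Everything here is an assumption for illustration: the class name HashedUrl is hypothetical, and an MD5 hex digest stands in for the real YaCy URL hash (which involves the intranet check and DNS lookup described above). The counter exists only to show how often the expensive path runs.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical sketch, not the YaCy URL class.
final class HashedUrl {
    // Demo counter: how often the expensive computation actually ran.
    static int computations = 0;

    private final String url;
    private String hash; // null until supplied or computed

    HashedUrl(final String url) {
        this(url, null);
    }

    // Used when the hash is already known, e.g. it was stored next to the URL.
    HashedUrl(final String url, final String storedHash) {
        this.url = url;
        this.hash = storedHash;
    }

    String url() {
        return url;
    }

    // Computes the hash at most once and reuses it afterwards.
    synchronized String hash() {
        if (hash == null) {
            hash = computeHash(url);
        }
        return hash;
    }

    // Stand-in for the expensive hash computation (assumption: MD5 hex digest).
    private static String computeHash(final String url) {
        computations++;
        try {
            final byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(url.getBytes(StandardCharsets.UTF_8));
            final StringBuilder sb = new StringBuilder(digest.length * 2);
            for (final byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (final NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

An object built with a stored hash never triggers the computation, which is the effect the redesign aims for on the search path.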
- added chunked file transfer for non-yacy clients
- SSIs are streamed using chunked transfer; partially delivered pages can be seen in the browser before transmission is finished
- added client-side network unit identification
- cleaned up code
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3926 6c8d7289-2bf4-0310-a012-ef5d649a1542
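For reference, a minimal sketch of the HTTP/1.1 chunked transfer coding that makes this kind of streaming possible: each chunk is sent as its length in hexadecimal, CRLF, the data, CRLF, and a zero-length chunk ends the body. The class name ChunkedWriter and the encode helper are assumptions for illustration, not the YaCy implementation.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of a chunked body writer.
final class ChunkedWriter {
    private static final byte[] CRLF = {'\r', '\n'};
    private final OutputStream out;

    ChunkedWriter(final OutputStream out) {
        this.out = out;
    }

    // Writes one chunk and flushes, so the client can render it immediately.
    void writeChunk(final byte[] data) throws IOException {
        if (data.length == 0) {
            return; // a zero-length chunk would terminate the body
        }
        out.write(Integer.toHexString(data.length).getBytes(StandardCharsets.US_ASCII));
        out.write(CRLF);
        out.write(data);
        out.write(CRLF);
        out.flush();
    }

    // Writes the terminating zero-length chunk (no trailer headers).
    void finish() throws IOException {
        out.write("0\r\n\r\n".getBytes(StandardCharsets.US_ASCII));
        out.flush();
    }

    // Demo helper: encodes the given parts into one chunked body string.
    static String encode(final String... parts) {
        final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        final ChunkedWriter writer = new ChunkedWriter(buffer);
        try {
            for (final String part : parts) {
                writer.writeChunk(part.getBytes(StandardCharsets.US_ASCII));
            }
            writer.finish();
        } catch (final IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(buffer.toByteArray(), StandardCharsets.US_ASCII);
    }
}
```

Because the total length does not have to be known in advance, the response can start before the page has been fully generated.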
- better error handling
- adding support for outgoing transfer- and content-encoding
- avoid holding outgoing messages in memory before sending them
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2872 6c8d7289-2bf4-0310-a012-ef5d649a1542
until EOF even if a persistent connection is used
*) httpdByteCountInputStream.java: adding skip method
*) httpHeader.java: adding getCharacterEncoding function
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2616 6c8d7289-2bf4-0310-a012-ef5d649a1542
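A function like getCharacterEncoding typically extracts the charset parameter from a Content-Type header value. The sketch below shows one way to do that; the class and method names are hypothetical, and the actual behaviour of httpHeader.getCharacterEncoding is not documented in this log.

```java
import java.util.Locale;

// Hypothetical sketch, not the httpHeader implementation.
final class ContentTypeCharset {
    /** Returns the charset parameter of a Content-Type value, or null if absent. */
    static String charsetOf(final String contentType) {
        if (contentType == null) {
            return null;
        }
        final String[] parts = contentType.split(";");
        // parts[0] is the media type; parameters follow.
        for (int i = 1; i < parts.length; i++) {
            final String param = parts[i].trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = param.substring("charset=".length()).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }
}
```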
A new port forwarding method for UPnP was added.
If this method is enabled, yacy automatically determines a UPnP
capable internet gateway and configures the gateway port forwarding
settings properly.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2328 6c8d7289-2bf4-0310-a012-ef5d649a1542
This should be seen as an experiment to exclude all cases where
there could be a DNS lookup during URL comparison.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2290 6c8d7289-2bf4-0310-a012-ef5d649a1542
and filled with cookies and so on.
This header can be set in serverObjects.
Check CookieTest.html and CookieTest.java for details.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1334 6c8d7289-2bf4-0310-a012-ef5d649a1542
to prevent yacy from indexing the response that belongs to a request where
X-YACY-Index-Contro is set to "no-index"
*) Bugfix for Seed-List download via Remote Proxy.
Now the pragma and cache-control http headers of the request are properly set to "no-cache"
See: http://www.yacy-forum.de/viewtopic.php?p=11639#11639
*) Bugfix for http-Proxy
yacy ignored "no-cache" pragma and cache-control http headers that were sent in requests.
Now, these request headers are evaluated properly
TODO: Missing evaluation of "no-store" request headers
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@971 6c8d7289-2bf4-0310-a012-ef5d649a1542
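The request-header check described above can be sketched as follows. This is an illustration under assumptions, not the proxy code from this revision: the class and method names are hypothetical, and headers are modelled as a plain map. In line with the TODO, "no-store" is deliberately not evaluated.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch, not the YaCy proxy implementation.
final class RequestCachePolicy {
    /** True if the request carries "no-cache" in Pragma or Cache-Control. */
    static boolean forbidsCachedResponse(final Map<String, String> requestHeaders) {
        for (final Map.Entry<String, String> header : requestHeaders.entrySet()) {
            if (header.getKey() == null || header.getValue() == null) {
                continue;
            }
            // Header names are case-insensitive.
            final String name = header.getKey().toLowerCase(Locale.ROOT);
            if (!name.equals("pragma") && !name.equals("cache-control")) {
                continue;
            }
            // Both headers may carry a comma-separated list of directives.
            for (final String directive : header.getValue().split(",")) {
                if (directive.trim().equalsIgnoreCase("no-cache")) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

A proxy would consult this before answering from its cache and fetch a fresh copy when it returns true.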
- adding automatic refresh
- accepts new parameter nameLookup which can be used to deactivate
yacy-peer name lookup (because we have problems with this on large seed-dbs)
*) ViewFile
New page that can be used to view
- original content
- plain text content
- parsed content
- parsed sentences
of a webpage specified by its url hash
Mainly for debugging purposes at the moment
*) Robots.txt
Bugfix for if-modified-since usage
TODO: synchronization of downloads to avoid loading the same robots-file
multiple times in parallel by different threads
*) Shutdown
Better abortion of transferRWI and transferURL sessions on server shutdown
*) Status Page
Adding icon to start/stop crawling via status page
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@950 6c8d7289-2bf4-0310-a012-ef5d649a1542
*) Replacing PDFBox 0.7.1 lib with newer version 0.7.2
*) Refactoring of classes httpd/httpc/httpHeaders to
make many methods for httpHeader/Requestline parsing
reusable for new icap implementation
*) adding chunked input stream support
- needed by new icap implementation
- needed by future httpc HTTP/1.1 support
*) httpd.java
- moving all connection property constants to class httpHeader
- moving readHeader function to class httpHeader
- moving parseQuery function to class httpHeader
- moving handleTransparentProxy function to class httpHeader
*) httpHeader.java
- adding new function to parse the http response line
- adding new function to convert http headers to a string that
can be sent to the client
- adding a function that generates a proper url using all parsed
connection properties
*) ICAP Support
- yacy now supports handling of icap response modification requests
- this feature can be used by other icap enabled proxies to contact
yacy as icap server, and to hand over the downloaded content to yacy.logging
for indexing
- functionality was successfully tested with squid 2.5Stable 10 + icap patch
- further icap services e.g. URL filtering based on yacy's blacklists are possible
*) plasmaSwitchboard.java
- htcache entries that are still needed for indexing are now properly registered
as in use after system restart
- extended logging: log message now shows parsing and indexing time for each sb. entry
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@757 6c8d7289-2bf4-0310-a012-ef5d649a1542
See: http://www.yacy-forum.de/viewtopic.php?t=1118&highlight=xforwardedfor
*) httpc.java: Bugfix for incorrect http response statuscode parsing
In some situations the statustext was chopped
*) Adding a lot of fileheaders containing YaCy copyright and license
*) httpd.java: Adding additional debugging http header that should help to detect
the "binary data in browser window" bug.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@653 6c8d7289-2bf4-0310-a012-ef5d649a1542
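To illustrate the status-line issue mentioned above, here is a sketch of a parser that keeps the complete status text, including phrases that contain spaces. The class name StatusLine is an assumption for illustration; this is not the httpc.java code.

```java
// Hypothetical sketch, not the httpc implementation.
final class StatusLine {
    final String version;
    final int code;
    final String reason;

    private StatusLine(final String version, final int code, final String reason) {
        this.version = version;
        this.code = code;
        this.reason = reason;
    }

    /** Parses e.g. "HTTP/1.1 404 Not Found" into version, code and status text. */
    static StatusLine parse(final String line) {
        final String s = line.trim();
        final int firstSpace = s.indexOf(' ');
        if (firstSpace < 0) {
            throw new IllegalArgumentException("malformed status line: " + line);
        }
        final String version = s.substring(0, firstSpace);
        final String rest = s.substring(firstSpace + 1).trim();
        final int secondSpace = rest.indexOf(' ');
        final String codeText = secondSpace < 0 ? rest : rest.substring(0, secondSpace);
        // Everything after the status code is the status text; it is not split further.
        final String reason = secondSpace < 0 ? "" : rest.substring(secondSpace + 1).trim();
        // Integer.parseInt throws NumberFormatException for a non-numeric code.
        return new StatusLine(version, Integer.parseInt(codeText), reason);
    }
}
```

Splitting only at the first two spaces avoids truncating multi-word status texts such as "Not Found".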