- method loadResourceContent defined as deprecated.
Please do not use this function to avoid OutOfMemory Exceptions
when loading large files
- new function getResourceContentStream to get an inputstream of a cache file
- new function getResourceContentLength to get the size of a cached file
*) httpc.java:
- Bugfix: resource content was loaded into memory even if this was not requested
*) Crawler:
- new option to hold loaded resource content in memory
- adding option to use the worker class without the worker pool
(needed by the snippet fetcher)
*) plasmaSnippetCache
- snippet loader does not use a crawl-worker from pool but uses
a newly created instance to avoid blocking by normal crawling
activity.
- now operates on streams instead of byte arrays to avoid OutOfMemory
Exceptions when operating on large files
- snippet loader now forces the crawl-worker to keep the loaded
resource in memory to avoid IO
*) plasmaCondenser: adding new function getWords that can directly operate on input streams
*) Parsers
- keep resource in memory whenever possible (to avoid IO)
- when parsing from stream the content length must be passed to the parser function now.
this length value is needed by the parsers to decide if the parsed resource content is to large
to hold it in memory and must be stored to file
- AbstractParser.java: new function to pass the contentLength of a resource to the parsers
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2701 6c8d7289-2bf4-0310-a012-ef5d649a1542
*) better logging of parser failures
*) simplified usage of plasmaparser through switchboard
*) restructuring of crawler
- crawler now returns an error message if it is used in sync mode (e.g. by snippet fetcher)
*) snippet-fetcher: more verbose error messages
*) serverByteBuffer.java: adding new function append(String,encoding)
*) serverFileUtils.java: adding functions to copy only a given number of bytes between streams
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2641 6c8d7289-2bf4-0310-a012-ef5d649a1542
- added switchh to show or hide surftipps
- more news contribute to surftipps
- added voting system for surftipps
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2638 6c8d7289-2bf4-0310-a012-ef5d649a1542
- serverFileUtils.java:
-- adding methods to copy from stream to writer and readers to writers
-- moving httpc writeX methods into serverFileUtils class
- serverCharBuffer.java: removing inheritance from Writer class
- replacing htmlFilterOutputStream by htmlFilterWriter class which handles
content as char stream
- htmlFilterContentTransformer.java: deactivating getText mode
(still needs to be migrated to use char streams instead of byte streams)
- changes in several classes to use htmlFilterWriter instead of htmlFilterOutputStream
- changes in Scraper and Transformer classes to operate on chars instead of bytes
- httpdProxyHandler.java: bugfix. clientTimeout setting was missing in config file
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2617 6c8d7289-2bf4-0310-a012-ef5d649a1542
- added new image 'B' in front of search results for bookmark generation
- added news generation when a public bookmark is added
- the '+' in front of search results has new meaning: positive rating for that result
- added news generation when a '+' is hit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2613 6c8d7289-2bf4-0310-a012-ef5d649a1542
*) htmlFilterContentScraper.java: using proper charset for document title
*) serverByteBuffer.java: adding new toString which allows to specify the charset for byte encoding
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2593 6c8d7289-2bf4-0310-a012-ef5d649a1542
the ymageMatrix initializer throws an RuntimeException if there is not
enough memory available to generate a new ymage of wanted size
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2541 6c8d7289-2bf4-0310-a012-ef5d649a1542
- catchup of OutOfMemoryError in server threads
- automatic adoption of word cache size after a Short Mem Cycle
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2426 6c8d7289-2bf4-0310-a012-ef5d649a1542
A new port forwarding method for upnp was added.
If this method is enabled, yacy automatically determines an UPnP
capable internet gateway and configures the gateway port forwarding
settings properly.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2328 6c8d7289-2bf4-0310-a012-ef5d649a1542
when extracting from eurl, the html output format is recommended, since
this format adds also the fail reason to the domain/url.
The complete syntax for domain extraction is now
java -Xmx<megabytes>m -classpath classes yacy -domlist [ -source { lurl | eurl } ] [ -format { text | zip | gzip | html } ] [ <path to DATA folder> ]
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2322 6c8d7289-2bf4-0310-a012-ef5d649a1542
* mainly for the transition to the new indexing database structure
* a bugfix for an endless loop inside kelondroTree iteration
* a bugfix for bulk read inside a kelondroTree iteration; the bug caused that some elements had been iterated twice
* very strong speed enhancement for url/domain extraction
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2320 6c8d7289-2bf4-0310-a012-ef5d649a1542
This shall be seen as an experiment to exclude all cases where
there could be a DNS lookup during URL comparisment.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2290 6c8d7289-2bf4-0310-a012-ef5d649a1542
The pause is started by calling intermission(Long.MAX_VALUE)
and can be stopped by calling intermission(0)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2272 6c8d7289-2bf4-0310-a012-ef5d649a1542
instead of creating a new one.
Notes:
This import is done automatically on startup if the following properties
are set in the config file:
pkcs12ImportFile =
pkcs12ImportPwd =
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2139 6c8d7289-2bf4-0310-a012-ef5d649a1542
-Renaming writeandzip to writeandgzip to avoid confusion about type of compression
-Adding new startup message to windows script
-The usual language "enhancements" ;-)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1953 6c8d7289-2bf4-0310-a012-ef5d649a1542
- implemented hand-over of new word index attributes during remote search
- implemented word-distance computation during search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1382 6c8d7289-2bf4-0310-a012-ef5d649a1542
and filled with cookies and so on.
This header one can set into serverObjects
Check CookieTest.html and CookieTest.java for details.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1334 6c8d7289-2bf4-0310-a012-ef5d649a1542
Version 0.1
Start Yacy, go to localhost:8080/CookieTest.html
Play around with cookies
Look into CookieTest.java to See, how it works
This behavior will be changed
such that httpHeader will be responsible for the cookies in the future
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1332 6c8d7289-2bf4-0310-a012-ef5d649a1542
the kelondro database needs specific information about the order of
base64-encoded keys. Since no other package depends on base64
(only the httpd uses base64 for encryption, but does not need to encode these strings)
it is good to move base64 encoding to the new ordering classes in kelondro.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1284 6c8d7289-2bf4-0310-a012-ef5d649a1542
next steps: attach voting and restrict to administrator
to see the deletion button, move the mouse pointer to the left of a search result
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1172 6c8d7289-2bf4-0310-a012-ef5d649a1542