in the network configuration, you can configure a whiteliste and a blacklist
- blacklistet clients cannot search
- whitelistet client get never any search restrictions
- for all other clients: apply DoS search restrictions
Please see the example configuriation in yacy.network.freeworld.unit
by default, all clients from localhosts get whitlistet.
If you have your own YaCy network, please put all the IPs of your peers into the whitelist
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5475 6c8d7289-2bf4-0310-a012-ef5d649a1542
- added dummy servlet class, because otherwise the template engine is not triggered.
thats so because the yacy httpd works much faster as normal file server without a scan
of the served pages. Therefore each page with templates must now have a class file associated to it.
- fixed json output format of yacysearch
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5449 6c8d7289-2bf4-0310-a012-ef5d649a1542
The cache is now flushed only for one second every ten seconds. During a crawl the cache
fills up completely, and is only flushed if space is needed for more documents.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5446 6c8d7289-2bf4-0310-a012-ef5d649a1542
this is a migration step to support a new method to store the web index, which will also based on the same data structure. made also a lot of refactoring for a better structuring of the BLOBHeap class.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5430 6c8d7289-2bf4-0310-a012-ef5d649a1542
- introduced blocking queues in CrawlStacker to make it ready for concurrency
- added a second busy thread for the CrawlStacker
The CrawlStacker is multithreaded. It shall be transformed into a BlockingThread in another step.
The concurrency of the stacker will hopefully solve some problems with cases where DNS blocks.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5395 6c8d7289-2bf4-0310-a012-ef5d649a1542
- modified and activated write buffer
- increased cache flush factor
- fixed a problem with deadlocking of indexing process
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5382 6c8d7289-2bf4-0310-a012-ef5d649a1542
- applied knowledge about concurrent files stream reading and index processing from the wikimedia reader
to the EcoTable initialization process: the file reader is now concurrent to the index generation
- changed also some initialization processes to avoid some pauses during initialization
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5354 6c8d7289-2bf4-0310-a012-ef5d649a1542
- removed never-used secondary crawl depth
- added a must-not-match filter that can be used to exclude urls from a crawl
- added stub for crawl tags which will be used to identify search results that had been produced from specific crawls
please update the yacybar: replace property name 'crawlFilter' with 'mustmatch'.
Additionally, a new parameter named 'mustnotmatch' can be used, which should be by default the empty sring (match-never)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5342 6c8d7289-2bf4-0310-a012-ef5d649a1542
- added hash to HTCACHE storage files which will make it possible to join separate caches by just copying files
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5329 6c8d7289-2bf4-0310-a012-ef5d649a1542
- implemented vertical DHT acceptance ("my own DHT") to accept new targets
- added new target computation for global search: addresses vertical targets also
- enhanced remote crawling: collection of remote crawl urls if queue has less than 100 entries (was: 0 entries)
- better performance value computations for PPM selection in network configuration
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5319 6c8d7289-2bf4-0310-a012-ef5d649a1542
- two different computations (but mathematical equivalent) of the DHT distance had been consolidated
- moved from 0.0 .. 1.0 double-range position computation to 0 .. Long.Max range for DHT targets
- added fast Long - to - hash computation
- high-precision target computation of gaps for new peers
- added new target computation for horizontal and vertical DHT targets (not yet in use)
- old horizontal-only DHT targets will be upwards compatible to new horizontal and vertical DHT positions
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5318 6c8d7289-2bf4-0310-a012-ef5d649a1542
with index.storeCommons=false all currently stored commons are deleted!
Default is now 'true', but in future full releases it will be switched to 'false'
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5315 6c8d7289-2bf4-0310-a012-ef5d649a1542
- removed the now superfluous HT storage thread
- reduced number of file decompression by shifting the compression moment to the future
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5286 6c8d7289-2bf4-0310-a012-ef5d649a1542
- entries are buffered and written as stream with many entries at once (saves many IO accesses)
- entries are compressed with gzip: increases capacity of cache
- concurrency for stream-writing and compression: all writings to the cache are non-blocking
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5284 6c8d7289-2bf4-0310-a012-ef5d649a1542
- files are not stored any more as individual files
- a new database structure using BLOBHeap files stores many cache entries in common files
- all file-writing procedures had been migrated to generate byte[] objects which are written with the new database methods
this is only an intermediate step to the final architecture, where cached files are written together with their metadata in one single database structure.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5276 6c8d7289-2bf4-0310-a012-ef5d649a1542
- small change to network grafics: smaller circles / more URLs necessary for full radius; more PPM necessary for full crawling circles
- fixed exclusion search ('-' did not work any more)
- fixed NPE bug when FTP loader wrote to the error-db
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5218 6c8d7289-2bf4-0310-a012-ef5d649a1542
- removed distinction between header file types for http and ftp; ftp is simulated by using http properties
- removed all old resourceInfo classes that handled this distinction
- introduced a new distinction between http request and http response objects
- unified new response objects with two other object types that had been introduced elsewhere
- changed all servlet call methods to use the new http request header object type
- divided static object keys for http header properties into request and response types
- refactoring here and there (a large number of type changes and many methods merged/moved)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5079 6c8d7289-2bf4-0310-a012-ef5d649a1542
- fixed "Inefficient use of keySet iterator instead of entrySet iterator" [WMI_WRONG_MAP_ITERATOR, FindBugs]
- fixed some possible null pointer accesses
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5063 6c8d7289-2bf4-0310-a012-ef5d649a1542
- refactoring of the HTCache (separation of cache entry)
- added new storage class for BLOBs. (not used yet, this is half-way to a new structure)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5062 6c8d7289-2bf4-0310-a012-ef5d649a1542