- speed enhancements for the cache-only cache policy by using special no-delay rules in the balancer
- fixed some deadlock- and 100% CPU problems in the balancer
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6243 6c8d7289-2bf4-0310-a012-ef5d649a1542
- added special rule to balancer to omit forced delays if cache is used exclusively
- extended the htCache size by default to 32GB
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6241 6c8d7289-2bf4-0310-a012-ef5d649a1542
- removed never-used secondary crawl depth
- added a must-not-match filter that can be used to exclude urls from a crawl
- added stub for crawl tags which will be used to identify search results that had been produced from specific crawls
please update the yacybar: replace property name 'crawlFilter' with 'mustmatch'.
Additionally, a new parameter named 'mustnotmatch' can be used, which should be by default the empty sring (match-never)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5342 6c8d7289-2bf4-0310-a012-ef5d649a1542
- the serverBusyThread checks folders every 60 min. (==> autoReCrawl_idlesleep in yacy.conf)
- added option to create bookmarks from CrawlStart URL
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5033 6c8d7289-2bf4-0310-a012-ef5d649a1542
- fixed loosing of own seed hash (hopefully)
- fixed a bug with crawl start s beginning with (bookmark) files
- added better IP recognition during hello process
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4882 6c8d7289-2bf4-0310-a012-ef5d649a1542
A large refactoring was neccessary
- added another crawl start option: automatic restriction to sub-path
- removed crawlStartSimple and renamed crawl start expert
to crawl start (without expert)
- some changes to texts in crawl start
- added some more deletions when an web index is deleted:
delete also queues and robots cache
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4881 6c8d7289-2bf4-0310-a012-ef5d649a1542