yacy_search_server/source/de/anomic/crawler
orbiter dad5b586a4
added a concurrent warm-up of Table data structures. that should speed up the start-up process but may also cause higher CPU load at that time.
14 years ago
retrieval added new configuration property "crawler.embedLinksAsDocuments". If this is switched on (this is default now), then all embedded image, audio and video links from all parsed documents are added to the search index as individual documents. This will increase the search index size dramatically but will also enable us to create a much faster image, audio and video search. If the flag is switched on, the index entries are also stored to a solr index, if this is also enabled. 14 years ago
Balancer.java added a concurrent warm-up of Table data structures. that should speed up the start-up process but may also cause higher CPU load at that time. 14 years ago
CrawlProfile.java *) Invalid crawl profiles (containing invalid mustmatch/mustnotmatch filters) will be moved from active crawls to invalid crawls (new file: DATA/INDEX/freeworld/QUEUES/crawlProfilesInvalid.heap). This file cannot be edited yet, but it should be easy to extend the CrawlProfileEditor accordingly. 14 years ago
CrawlQueues.java YaCy can now use the solr index to compute text snippets. This makes search result preparation MUCH faster because no document fetching and parsing is necessary any more. 14 years ago
CrawlStacker.java - fix for wrong entries in NOLOAD indexing queue (which caused urls to be indexed based only on their url without being loaded) 14 years ago
CrawlSwitchboard.java *) Invalid crawl profiles (containing invalid mustmatch/mustnotmatch filters) will be moved from active crawls to invalid crawls (new file: DATA/INDEX/freeworld/QUEUES/crawlProfilesInvalid.heap). This file cannot be edited yet, but it should be easy to extend the CrawlProfileEditor accordingly. 14 years ago
ImporterException.java added final where possible 17 years ago
Latency.java - refactoring of robots 14 years ago
NoticedURL.java added a handling of appearances of yacy bot entries in robots.txt if this entry addresses the yacy peer 14 years ago
RSSLoader.java stop loading via http at a defined maximum number of bytes, even if the size is unknown before loading 14 years ago
ResourceObserver.java Implementation of strategies for controlling memory resources. 14 years ago
ResultImages.java - fixed a bug in crawl start with file name (npe in new url) 14 years ago
ResultURLs.java refactoring: moved all score-related classes to new ranking package 14 years ago
RobotsTxt.java - enhanced ybr ranking computation 14 years ago
RobotsTxtEntry.java hack to reduce resource contention caused by massive UTF8 decodings which use java.nio resources: 14 years ago
RobotsTxtParser.java - refactoring of robots 14 years ago
SitemapImporter.java hack to reduce resource contention caused by massive UTF8 decodings which use java.nio resources: 14 years ago
ZURL.java added a concurrent warm-up of Table data structures. that should speed up the start-up process but may also cause higher CPU load at that time. 14 years ago