yacy_search_server/source/de/anomic/crawler
orbiter 22dbbcfa56
Improved and corrected recognition of intranet and internet addresses. This corrects the isLocal property, which network definitions use to restrict index ranges to local or global addresses. Address locations (intranet or internet) had been identified partly by the top-level domain of the host address. Since intranet hosts can also be addressed by a host name in a country domain, a DNS resolution is necessary for each check. The check is backed by a local DNS cache, so the intranet/internet check should not affect network traffic too much. To ensure that the cache works properly, the cache class was upgraded to better concurrent data structures.
15 years ago
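The approach described in the commit message (resolve the host, classify the resulting address, and keep results in a thread-safe cache) can be sketched roughly as follows. This is only an illustration, not the code from this commit: the class name `AddressScope`, its methods, and the choice to treat unresolvable hosts as non-local are assumptions made for the example, and the cache has no expiry.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Locale;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical helper for illustration; not the class changed in this commit. */
final class AddressScope {

    // Thread-safe cache: host name -> resolved address (empty if resolution failed).
    private static final ConcurrentMap<String, Optional<InetAddress>> CACHE =
            new ConcurrentHashMap<>();

    private AddressScope() {}

    /** Resolves a host, consulting the cache first so repeated checks cause no DNS traffic. */
    static Optional<InetAddress> resolve(String host) {
        String key = host.toLowerCase(Locale.ROOT);
        Optional<InetAddress> cached = CACHE.get(key);
        if (cached != null) {
            return cached;
        }
        Optional<InetAddress> resolved;
        try {
            // IP literals are parsed directly; host names trigger a DNS lookup.
            resolved = Optional.of(InetAddress.getByName(key));
        } catch (UnknownHostException e) {
            resolved = Optional.empty();
        }
        // If another thread stored a result in the meantime, keep that one.
        Optional<InetAddress> previous = CACHE.putIfAbsent(key, resolved);
        return previous != null ? previous : resolved;
    }

    /**
     * Decides by the resolved address, not by the top-level domain.
     * Assumption: hosts that cannot be resolved are treated as non-local.
     */
    static boolean isLocal(String host) {
        return resolve(host)
                .map(a -> a.isAnyLocalAddress()
                        || a.isLoopbackAddress()
                        || a.isLinkLocalAddress()
                        || a.isSiteLocalAddress())
                .orElse(false);
    }

    /** Number of cached host entries. */
    static int cacheSize() {
        return CACHE.size();
    }
}
```

The lookup is done outside the map's own update methods so that a slow DNS query does not block other threads writing to the cache; the cost is that two threads may occasionally resolve the same host once each.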
retrieval redesign of parser interface: 15 years ago
AbstractImporter.java
Balancer.java
CrawlProfile.java more abstraction of the htcache when using the LoaderDispatcher: 15 years ago
CrawlQueues.java - added animated visualization for DHT-in and DHT-out in network graphic 15 years ago
CrawlStacker.java better (and corrected) recognition of intranet and internet-addresses. 15 years ago
CrawlSwitchboard.java fixed a bug in snippet fetch strategy: cache only does not help if resource can only be found in web 15 years ago
Importer.java
ImporterException.java
ImporterManager.java
Latency.java better (and corrected) recognition of intranet and internet-addresses. 15 years ago
NoticedURL.java
ResourceObserver.java
ResultImages.java redesign of parser interface: 15 years ago
ResultURLs.java - more abstraction (HashMap -> Map) 15 years ago
RobotsEntry.java
RobotsTxt.java better handling of OOM situations 15 years ago
SitemapImporter.java
ZURL.java
robotsParser.java