Commit Graph

15 Commits (59181e8009f98371f3294ea4f83cb345d40dc60a)

Author SHA1 Message Date
Michael Peter Christen 25573bd5ab added a crawl filter based on <div> tag class names
7 years ago
luccioman bf55f1d6e5 Started support of partial parsing on large streamed resources.
8 years ago
luccioman 6a4d51d8f9 Cleaned up some Javadoc warnings.
8 years ago
reger 06d0e2aeb9 result heuristic (also used in greedy learning mode) to use outbound links if result is full index doc. Otherwise use default loader methode.
9 years ago
Michael Peter Christen fed26f33a8 enhanced timezone managament for indexed data:
10 years ago
Michael Peter Christen b5ac29c9a5 added a html field scraper which reads text from html entities of a
10 years ago
Michael Peter Christen 61c5e40687 - replaced the properties object in AnchorURL with distinct variables
12 years ago
Michael Peter Christen 5e31bad711 - the webgraph shall store all links which appear on a web page and not
12 years ago
Michael Peter Christen 5f0ab25382 removed the option to prevent removal of &amp; parts inside of the
12 years ago
Michael Peter Christen 5fc6524ca8 - moved triple store to net.yacy.cora.lod (should be generalized there
13 years ago
low012 11ea966f9e *) added SID file (Commodore 64) sound file parser
14 years ago
orbiter aacf572a26 - enhancements for search speed
15 years ago
orbiter b6fb239e74 redesign of parser interface:
15 years ago
orbiter 3528b970d6 - refactoring
16 years ago
orbiter b79f4f062f refactoring of yacy documents and parsers: they depend now only on the kelondro classes
16 years ago