yacy_search_server/source/net/yacy/document/parser
orbiter b6d57f06eb
enhanced the apk parser (up to being production-ready).
11 years ago
..
augment
html fix for image alt attachment to AnchorURLs in html parser. 11 years ago
images fix for image alt attachment to AnchorURLs in html parser. 11 years ago
rdfa added an option to set 'obey nofollow' for links with rel="nofollow" 11 years ago
xml
apkParser.java enhanced the apk parser (up to being production-ready). 11 years ago
audioTagParser.java
bzipParser.java - added a new Crawler Balancer: HostBalancer and HostQueues: 11 years ago
csvParser.java
docParser.java extract author and keywords in .doc and .ppt parser 11 years ago
dwgParser.java
genericParser.java
gzipParser.java - added a new Crawler Balancer: HostBalancer and HostQueues: 11 years ago
htmlParser.java fix for image alt attachment to AnchorURLs in html parser. 11 years ago
linkScraperParser.java added linkScraperParser, a parser which ignores the text like the 11 years ago
mmParser.java
odtParser.java
ooxmlParser.java
pdfParser.java optimize pdfParser 11 years ago
pptParser.java extract author and keywords in .doc and .ppt parser 11 years ago
psParser.java
rdfParser.java
rssParser.java simplify rssreader and improve atom feed link extraction 11 years ago
rtfParser.java
sevenzipParser.java - added a new Crawler Balancer: HostBalancer and HostQueues: 11 years ago
sidAudioParser.java
sitemapParser.java fix for image alt attachment to AnchorURLs in html parser. 11 years ago
swfParser.java
tarParser.java - added a new Crawler Balancer: HostBalancer and HostQueues: 11 years ago
torrentParser.java
vcfParser.java added an option to set 'obey nofollow' for links with rel="nofollow" 11 years ago
vsdParser.java
xlsParser.java
zipParser.java - added a new Crawler Balancer: HostBalancer and HostQueues: 11 years ago