Commit Graph

130 Commits (404bc21da980c81ed8ceb0b46286152e17b20683)

Author SHA1 Message Date
orbiter c12bb8a6d0 - refactoring of the http client
16 years ago
orbiter b57c9da1f8 - fixes to doc, ppt, xls parser: better title
16 years ago
orbiter 94110df85a moved logging partially to kelondro
16 years ago
orbiter 024da2916b refactoring of logging
16 years ago
orbiter 83ce65707a (almost) completed partition of classes in kelondro
16 years ago
orbiter 7ee494fde5 more refactoring of kelondro:
16 years ago
orbiter bf93767ec6 refactoring of kelondro database classes
16 years ago
orbiter fc27bf8c4c refactoring of kelondro classes:
16 years ago
f1ori 025094675f * remove empty directory
16 years ago
f1ori 963da8c3f9 * updated tm-extractors to new version 1.0
16 years ago
orbiter 47292e696a more performance hacks
16 years ago
f1ori 5af8923f37 * distribute forgotten jar-file in parser
16 years ago
orbiter 674ad2d55b different handling of error cases that occur during loading files with http or ftp:
16 years ago
orbiter 2f49666908 integrated the character decoding into the parser, removed old code
16 years ago
orbiter 6941bf42b1 performance hacks
16 years ago
orbiter bfcf9b7aa3 - added language detection using metadata from documents: html and odt documents provide this information
16 years ago
orbiter 0cd0fee546 fixed bug with wrong proxy result enqueueing. See:
16 years ago
orbiter 05dbba4bab added logging conditions to all fine and finest log line calls
16 years ago
orbiter 536e77e8b7 modifications towards a single database operation to read/write http header and cached file at once:
16 years ago
f1ori f99c307eff * correct debian build dependencies
16 years ago
danielr 8422ee5ec4 - fixed UnsupportedEncoding (in proxy) using defaultCharset if no characterEncoding can be determined
17 years ago
danielr 621b473b18 * removed some warnings of findbugs (http://findbugs.sf.net)
17 years ago
danielr 17b7845eb5 * refactoring
17 years ago
danielr 3bb870bfcd added final where possible
17 years ago
orbiter c3d461d191 - removed superfluous copyright statement
17 years ago
orbiter e81be7d4f2 added many missing user-agent declarations for yacy http client connections.
17 years ago
danielr 7feae906aa - organize imports
17 years ago
danielr 74b1a60043 fixed "java.lang.NoClassDefFoundError: org/a"
17 years ago
danielr ae03a54d23 pdfParser: updated lib, fixed ClassNotFoundException: CMSError
17 years ago
orbiter 1689030ee8 refactoring: moved all crawler classes into their own package
17 years ago
orbiter 724bbdf9b2 refactoring of RSS reader
17 years ago
danielr 5c3c1fdf41 replaced httpc with Apache Jakarta Commons HttpClient (includes some refactoring ;)
17 years ago
orbiter fa1090113d - next try to fix the networking problem:
17 years ago
orbiter 87a8747ce3 - enhanced recognition, parsing, management and double-occurrence-handling of image tags
17 years ago
orbiter 0f5c4abaca more generics
17 years ago
orbiter efd0b8371a - added parsing of Dublin Core - compliant metadata (see RFC 5013 and ISO 15836) to html parser
17 years ago
orbiter f4e9ff6ce9 more generics
17 years ago
borg-0300 3cab85158c update for last commit
17 years ago
orbiter ecd7f8ba4e - added NEAR operator (must be written in UPPERCASE in search query)
17 years ago
low012 b08f877e97 *) tried to get rid of warnings when compiling parsers (http://forum.yacy-websuche.de/viewtopic.php?t=660)
17 years ago
orbiter e22014dc83 some memory enhancements when generating and displaying ymage objects
17 years ago
fuchsi 69521d92e5 Add another external dependency from PDFBox package ("Bouncy Castle"). This is necessary for parsing of some encrypted PDF files.
17 years ago
fuchsi ca83f5a8d9 Add external lib FontBox which is part of the PDFBox (they extracted the font handling code into this package in 0.7.3).
17 years ago
fuchsi e77aec8c9d fix handling of encrypted PDF-Documents (with default user password "")
17 years ago
orbiter daf0f74361 joined anomic.net.URL, plasmaURL and url hash computation:
17 years ago
orbiter 367fc28928 corrected Brausse->Brausze
18 years ago
orbiter e76fe1c078 - replaced unicode characters in copyright holder name ('Brausse')
18 years ago
orbiter 40b0547611 - documentaton changes (removed old forum links)
18 years ago
orbiter dcb8687904 fix to update cycle
18 years ago
orbiter f323e1813d added commons.logging again (is used by mimeTypeParser)
18 years ago