Commit Graph

43 Commits (7c149e0f9d8f312cd7f2ad52fc74e470e42a6889)

Author SHA1 Message Date
orbiter 3d5104d357 - fixed a bug in crawl start with file name (npe in new url)
14 years ago
orbiter 958ff4778e enhanced location search:
14 years ago
orbiter 0430a94eaa the location search now shows only locations that are attached as metadata to web pages, not re-evaluated locations
14 years ago
orbiter 9b25d07295 - added geo information parsing to html parser
14 years ago
orbiter 78d4c45d09 enhancement during search process: fast fail of search in case that all index feeders have terminated.
14 years ago
orbiter 694fa3a2a5 - replaced more direct string-based UTF-8 conversions by predefined UTF-8 conversion
14 years ago
orbiter 30aed9824a moved getBytes() to UTF8.getBytes() to use a default String encoding
14 years ago
orbiter cb1f49d0f2 replaced all 'new String' with default encoding (missing) or UTF-8 encoding with a String generation method that uses a pre-defined Charset constant for UTF-8. This avoids a cache-lookup for the Charset object using String hashing of the String 'UTF-8'.
14 years ago
orbiter a92d80a545 performance enhancements using an alternative to an insensitive collator (a complex string compare):
14 years ago
orbiter e717bf74ba more logging, more care about OOMs
14 years ago
orbiter 4588b5a291 - fixed document number limitation for crawls that restrict the number of documents per domain
14 years ago
orbiter 88773e4daa changed the default port from 8080 to 8090
14 years ago
low012 3d95981f7d *) cleaning up the code a little bit
14 years ago
orbiter 9b25a33fd9 - fixed numerous bugs
14 years ago
orbiter 56264dcc17 - added CamelCase parser to MultiProtocolURI: generate better to-be-indexed words from urls
14 years ago
orbiter c36da90261 added a very fast ftp file list generator to site crawler:
14 years ago
f1ori a025b1da89 * fix bug when browsing local filesystem (e.g. repository) with yacy
14 years ago
f1ori 7d8de34778 * add a bit of documentation to DigestURI, use DigestURI(string) instead of DigestURI(string, null)
14 years ago
orbiter b8aee6d402 performance hacks for better search performance
15 years ago
orbiter 24502fe3de performance hacks
15 years ago
orbiter 0010cd9db1 Support for indexing of RSS feeds!
15 years ago
orbiter 5924a0d851 - enhanced concurrency in database index access for multicore
15 years ago
orbiter 60e71876ad - more abstraction (HashMap -> Map)
15 years ago
orbiter 11639aef35 - added new protocol loader for 'file'-type URLs
15 years ago
orbiter cf43bdc87e This is a large bugfix and enhancement commit to support a better location detection for data
15 years ago
orbiter 4cd5418963 removed finalize methods because of a hint in
15 years ago
orbiter 25aef069a6 continuing String-hash - to - byte[]-hash redesign that was started in SVN 6775
15 years ago
orbiter e0da0a84b0 performance fix in http parser
15 years ago
orbiter 82f76e1296 removed log line
15 years ago
orbiter 0f8004f9da enhanced html parser to recognize a href tags inside header tags
15 years ago
orbiter 54af9e6b49 - added parsing of robots meta-tag in html headers to detect a noindexing request
15 years ago
orbiter dd459281c8 applied code changes that are recommended by PMD
15 years ago
orbiter a37878b7d5 url parser regex performance hack
15 years ago
orbiter e34e63a039 preset of proper HashMap dimensions: should prevent re-hashing and increase performance
15 years ago
orbiter 4a5100789f replaced _all_ size() == 0 with isEmpty() and all size() > 0 with !isEmpty(). The isEmpty() method is much faster in some cases, especially when used to access badly balanced hashtables where a size() operation becomes a large iteration.
15 years ago
orbiter 969123385b added json and rss output for image search
15 years ago
orbiter d183f8d980 refactoring (moved code from ContentTransformer to TemplateEngine)
15 years ago
orbiter dbdf2570ba added comparator and more fixes for SortStack/SortStore
15 years ago
orbiter 06d0dcde20 more enhancements to image search
15 years ago
orbiter 2d8f3ee301 some performance hacks
15 years ago
orbiter 4431b9767e added about 450 replacements for printStackTrace() methods to pipe such traces into the log at DATA/LOG/
15 years ago
orbiter 52470d0de4 - fix for xls parser
16 years ago
orbiter b79f4f062f refactoring of yacy documents and parsers: they depend now only on the kelondro classes
16 years ago
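
Note on commits cb1f49d0f2 and 30aed9824a: the idea described there, routing String/byte[] conversions through a predefined Charset constant instead of the charset name 'UTF-8', can be sketched roughly as follows. This is a minimal illustration and not the code from those commits. Only the names UTF8 and getBytes() appear in the commit messages; the CHARSET field, the decode() method and the null handling are assumptions made for the example, and StandardCharsets comes from a newer Java version than the one available when the commits were made.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: one shared Charset constant for every conversion,
// so no per-call lookup of the charset by its name "UTF-8" is needed.
final class UTF8 {

    // Assumed field name; resolved once when the class is loaded.
    static final Charset CHARSET = StandardCharsets.UTF_8;

    private UTF8() {
        // static helper, not meant to be instantiated
    }

    // Encodes a String with the shared constant; null stays null (assumption).
    static byte[] getBytes(final String s) {
        return s == null ? null : s.getBytes(CHARSET);
    }

    // Assumed method name: decodes bytes with the shared constant.
    static String decode(final byte[] bytes) {
        return bytes == null ? null : new String(bytes, CHARSET);
    }
}
```

With such a helper, a call like s.getBytes("UTF-8") becomes UTF8.getBytes(s) and new String(bytes, "UTF-8") becomes UTF8.decode(bytes). The Charset-based overloads also do not declare UnsupportedEncodingException, so callers need no try/catch around the conversion.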