234 Commits (631b08e7e26bf9c72ce67fba098518fe5572899b)

Author SHA1 Message Date
f1ori a025b1da89 * fix bug when browsing local filesystem (e.g. repository) with yacy
14 years ago
orbiter 4c72885cba added a sitemap entry parser and loader for sitemaps
14 years ago
orbiter fb92f9ae8e added mime type image/jpeg (image/jpg is wrong but it is left here because it does no harm and this error also exists in the configuration of web servers)
14 years ago
f1ori 7d8de34778 * add a bit of documentation to DigestURI, use DigestURI(string) instead of DigestURI(string, null)
14 years ago
orbiter 58e74282af added a word counter statistic in condenser which is used by the did-you-mean to calculate best matches for given search words.
14 years ago
orbiter 0d363a94d7 more performance hacks
14 years ago
orbiter b8aee6d402 performance hacks for better search performance
14 years ago
orbiter aacf572a26 - enhancements for search speed
14 years ago
orbiter d2fd93135c - moved yacybot user agent string definition to MultiProtocolURI since there are basic access mechanisms where the bot string is needed
14 years ago
f1ori e670e1ef8e add charset auto-detection for htmlParser
14 years ago
f1ori ddcd5ae78c fix for http://forum.yacy-websuche.de/viewtopic.php?f=5&t=2989
14 years ago
f1ori 8fe1102452 fix http://forum.yacy-websuche.de/viewtopic.php?p=20889#p18426
14 years ago
orbiter 84a023cbc8 fixed several search bugs
14 years ago
orbiter 114bdd8ba7 fixed old sitemap importer which was not able to parse urls containing post elements
14 years ago
orbiter c0b08ac59b slightly changed way of pdf parser integration
14 years ago
orbiter 5fe828fa06 - replaced pdfbox and fontbox version 1.1.0 with 1.2.1
14 years ago
orbiter 24502fe3de performance hacks
14 years ago
orbiter 22047ffad5 enhanced computation speed of many replaceAll string operations
14 years ago
orbiter 3988a95fb5 added ability in rss reader to parse atom feeds
14 years ago
orbiter 0010cd9db1 Support for indexing of RSS feeds!
14 years ago
orbiter 844f158686 - removed dependencies in header framework:
14 years ago
orbiter 5e7081cd19 refactoring towards a unified loading mechanism for MultiProtocolURIs
14 years ago
orbiter e10cd115a9 - added a new RSS reader interface. This is not finished but you can now load and look at RSS feeds. It will be used to index RSS feeds in a way that is appropriate for such kind of data.
14 years ago
orbiter 933dc1a600 removed old rss parser (will be replaced with parser from cora package)
14 years ago
orbiter 5924a0d851 - enhanced concurrency in database index access for multicore
14 years ago
orbiter 989948e1a9 fixed generic image parser
15 years ago
orbiter 27d8a8b53e removed wrong com.sun.codec class access in generic image parser
15 years ago
orbiter b6fb239e74 redesign of parser interface:
15 years ago
low012 d4851441b0 *) Added Android packages to parser in order to be able to create a decentralized search for direct downloads of Android apps.
15 years ago
orbiter 37b8827a7a - removed the UPnP library sources from sbbi and added the jar library again. The library was included to get support for fedora releases, but after this time the fact that the sbbi cannot be part of fedora should be re-discussed. If this will still not be possible, then we may integrate the sbbi UPnP package using reflection.
15 years ago
orbiter 7bcfa033c9 more abstraction of the htcache when using the LoaderDispatcher:
15 years ago
orbiter 87087f12fe - scanned remote search process and enhanced some data structure and synchronizations here and there
15 years ago
orbiter de4f30bb2e UTF-8 fix
15 years ago
orbiter 3a1cebb598 bugfixes
15 years ago
orbiter 60e71876ad - more abstraction (HashMap -> Map)
15 years ago
orbiter 2eea806005 fewer errors in image parser
15 years ago
orbiter 11639aef35 - added new protocol loader for 'file'-type URLs
15 years ago
orbiter 9842fab6e4 - fixes to query parameter
15 years ago
orbiter cf43bdc87e This is a large bugfix and enhancement commit to support a better location detection for data
15 years ago
orbiter 6eba2cb96b fix in bmp parser
15 years ago
orbiter 90c3e5d6f6 - cleanup, removed unused imports
15 years ago
orbiter 4cd5418963 removed finalize methods because of a hint in
15 years ago
orbiter f204076d25 removed usage of temporary files: causes too much IO
15 years ago
orbiter 25aef069a6 continuing String-hash - to - byte[]-hash redesign that was started in SVN 6775
15 years ago
orbiter 9ddb8e4a43 set an option for the java-internal image parser that prevents the image from being cached in a temporary file on the file system. This should speed up image parsing during image indexing dramatically and should also improve performance when showing the yacy banner and OSM tiles.
15 years ago
orbiter e0da0a84b0 performance fix in http parser
15 years ago
orbiter 89b4fff1c2 adapted ant script for new exif library
15 years ago
orbiter 24e5faee75 added exif parsing for jpg images
15 years ago
orbiter 82f76e1296 removed log line
15 years ago
orbiter 0f8004f9da enhanced html parser to recognize a href tags inside header tags
15 years ago
orbiter 54af9e6b49 - added parsing of robots meta-tag in html headers to detect a noindexing request
15 years ago
lotus 38a3d55afd added more possible php extensions for html
15 years ago
orbiter 56e0d9bd01 - testings with image parser
15 years ago
orbiter 7d400b17d0 html parser support for .cfm files
15 years ago
orbiter f6731c6240 more logging etc.
15 years ago
orbiter 007f8297de added php3 as extension type for html
15 years ago
orbiter 5df628a2a4 - added BEncoder class
15 years ago
orbiter 2113fcd7e5 - fixed usage of isEmpty() which is not available in java 1.5
15 years ago
orbiter dd459281c8 applied code changes that are recommended by PMD
15 years ago
orbiter 3f771d2a16 fix for rss parser: be lazy when rss is not well-formed
15 years ago
orbiter dff4f95c78 some patches to get the torrent parser working
15 years ago
orbiter fbd24c2d84 integrated the torrent parser
15 years ago
orbiter bd32f8b8cb added a torrent metadata file parser
15 years ago
orbiter a37878b7d5 url parser regex performance hack
15 years ago
orbiter 8281e29963 - more configuration for profiling graph (number of events)
15 years ago
orbiter e34e63a039 preset of proper HashMap dimensions: should prevent re-hashing and increase performance
15 years ago
orbiter 4a5100789f replaced _all_ size() == 0 with isEmpty() and all size() > 0 with !isEmpty(). The isEmpty() method is much faster in some cases, especially when used to access badly balanced hashtables where a size() operation becomes a large iteration.
15 years ago
orbiter 969123385b added json and rss output for image search
15 years ago
orbiter d183f8d980 refactoring (moved code from ContentTransformer to TemplateEngine)
15 years ago
orbiter dbdf2570ba added comparator and more fixes for SortStack/SortStore
15 years ago
orbiter d2938c44a1 - added bmp parser to the document parsers
15 years ago
orbiter 06d0dcde20 more enhancements to image search
15 years ago
orbiter 2d8f3ee301 some performance hacks
15 years ago
orbiter a97fdb4566 catch for NPE in image parser
15 years ago
orbiter cd6745b292 accept rss feeds without channel descriptions
15 years ago
orbiter 08f1cbb125 another update to the pdf parser
15 years ago
orbiter 605e896d6c more details for exception catching when parsing pdfs
15 years ago
orbiter 4431b9767e added about 450 replacements for printStackTrace() methods to pipe such traces into the log at DATA/LOG/
15 years ago
orbiter 11f7da06ed - fixes to csv parser
15 years ago
orbiter 9b6762ec2e - added a csv "comma separated values" parser to parse OAI-PMH sources from
15 years ago
orbiter 52470d0de4 - fix for xls parser
15 years ago
orbiter 26fafd85a5 - more refactoring
15 years ago
orbiter 3528b970d6 - refactoring
15 years ago
orbiter b79f4f062f refactoring of yacy documents and parsers: they depend now only on the kelondro classes
15 years ago