Commit Graph

167 Commits (b3ffcde0c77b07b0f5f74c34751eeb3999ba8edf)

Author SHA1 Message Date
orbiter 777195e8d1 more abstraction for access of LoaderDispatcher and cache
15 years ago
orbiter 7bcfa033c9 more abstraction of the htcache when using the LoaderDispatcher:
15 years ago
orbiter 11639aef35 - added new protocol loader for 'file'-type URLs
15 years ago
orbiter 9842fab6e4 - fixes to query parameter
15 years ago
orbiter 2a8f70f0ca - fix for caching of OSM tiles. if you want that this fix applies to your peer, please delete the crawl profiles
15 years ago
orbiter 2126c03a62 - removed download-limit that can be given for the crawler for non-crawler download tasks. This was necessary because the same procedure was used for other downloads like for the download of dictionary files where a limit is not useful. The limit still stays for the indexer
15 years ago
orbiter c45117f81f fixed dates in metadata
15 years ago
orbiter 06ff0c5b06 fixes for metadata retrieval and presentation
15 years ago
orbiter 7ab207d93a better presentation of search result metadata and fixes to htcache loading
15 years ago
orbiter 25aef069a6 continuing String-hash - to - byte[]-hash redesign that was started in SVN 6775
15 years ago
orbiter 0f8004f9da enhanced html parser to recognize a href tags inside header tags
15 years ago
orbiter 3300930fc5 - (almost) fixed FTP crawler
15 years ago
orbiter 61493a9a9f added more information about metadata in ViewFile.html
15 years ago
orbiter c4bdb1e7f2 added one more option in ViewFile to show an iframe like for the orginal web page content but using the cache than the direct link to the content in the web. Upgraded the very old and previously not any more used CacheResource_p servlet to a new and working version.
15 years ago
orbiter 270fb38674 - fixed some bugs in Table viewer
15 years ago
orbiter 38d7a28cd2 fix in viewfile needed when ViewFile is called only with 'url' parameter
15 years ago
orbiter fe41a84330 some enhancements in web caching: avoid double loading of response metadata and/or content
15 years ago
orbiter 4431b9767e added about 450 replacements for printStackTrace() methods to pipe such traces into the log at DATA/LOG/
15 years ago
orbiter 5e8038ac4d - refactoring of blacklists
15 years ago
orbiter 26fafd85a5 - more refactoring
15 years ago
orbiter b79f4f062f refactoring of yacy documents and parsers: they depend now only on the kelondro classes
15 years ago
orbiter 5841ee83d3 refactoring
15 years ago
orbiter ce8dc575ca refactoring
15 years ago
orbiter bea3b99aff moved table and util classes
15 years ago
orbiter 735e2737e3 * added index segments
15 years ago
orbiter 6aa474f529 - better logging for web cache access and fail reasons
15 years ago
orbiter 72e5407115 refactoring of snippet cache
15 years ago
orbiter 161d2fd2ef redesign of access to the HTCache (now http.client.Cache):
16 years ago
orbiter 1d8d51075c refactoring:
16 years ago
orbiter 5bb8074150 removed the indexing queue. This queue was superfluous since the introduction of the blocking queues last year, where documents are parsed, analysed and stored in the index with concurrency.
16 years ago
orbiter ca72ed7526 -removed superfluous crawl cache
16 years ago
orbiter 0e8647d62f refactoring of search classes
16 years ago
orbiter dafffd0153 refactoring of parsers and document processing
16 years ago
orbiter 409538e17a code cleanup and code simplifcation
16 years ago
orbiter 222850414e simplification of the code: removed unused classes, methods and variables
16 years ago
orbiter 88426912ad more refactoring to make the segment object easier to use and to be prepared to integrate author navigation
16 years ago
orbiter 99bf0b8e41 refactoring of plasmaWordIndex:
16 years ago
orbiter a642d6a7b5 - added navigation icons for search result pages
16 years ago
orbiter c2359f20dd refactoring: better abstraction of reference and metadata prototypes.
16 years ago
orbiter 14a1c33823 refactoring of wordIndex class
16 years ago
orbiter 396a4451be increased timeout in ViewFile
16 years ago
orbiter aa44d9bad9 more refactoring of kelondro.text / deleted de.anomic.index
16 years ago
orbiter 76ef5f0f14 refactoring of index package: better names for the classes (to be continued)
16 years ago
orbiter c12bb8a6d0 - refactoring of the http client
16 years ago
orbiter b57c9da1f8 - fixes to doc, ppt, xls parser: better title
16 years ago
orbiter 94110df85a moved logging partially to kelondro
16 years ago
orbiter c4c4c223b9 fixed a problem with attribute flags on RWI entries that prevented proper selection of index-of constraint
16 years ago
orbiter 47292e696a more performance hacks
16 years ago
orbiter 0edec2b760 FULL redesign of algorithms in htmlTools to encode/decode strings from/to unicode and html.
16 years ago
orbiter 47f0c3b002 replaced the cacheAdmin with the ViewFile servlet, because the cacheAdmin was an interface to the old HTCACHE data structure which does not exist any more. Changed links to point to the ViewFile servlets.
16 years ago
low012 77e41da7d2 *) further propagation of display value (see http://forum.yacy-websuche.de/viewtopic.php?f=5&t=1536)
16 years ago
orbiter 536e77e8b7 modifications towards a single database operation to read/write http header and cached file at once:
16 years ago
orbiter 7989335ed6 Preparations to replace the HTCache with a new storage data structure:
16 years ago
danielr 3bb870bfcd added final where possible
17 years ago
orbiter c3d461d191 - removed superfluous copyright statement
17 years ago
orbiter 3ca98fee42 removed superfluous copyright statement
17 years ago
danielr 7feae906aa - organize imports
17 years ago
orbiter cfe6790498 - added option to switch between yacy networks, especially between the two default networks (freeworld and intranet),
17 years ago
orbiter e024e3b9cf added new default profiles to distinguish snippet fetch for local and global search
17 years ago
danielr 5c3c1fdf41 replaced httpc with Apache Jakarta Commons HttpClient (includes some refactoring ;)
17 years ago
orbiter 7f9f639d20 - refactoring and abstraction of index reference (urls) handling: blacklisting is part of reference filtering
17 years ago
orbiter d6050b9ffb - separated the LURL data storage and Crawl result stack for process supervision.
17 years ago
orbiter 541b817502 refactoring of switchboard queueing
17 years ago
orbiter 3f2b18a4e7 fix for http://forum.yacy-websuche.de/viewtopic.php?f=5&t=905&hilit=&p=5979#p5979
17 years ago
orbiter 87a8747ce3 - enhanced recognition, parsing, management and double-occurrence-handling of image tags
17 years ago
orbiter a8a5df4a51 - more dublin core naming of page metadata
17 years ago
orbiter efd0b8371a - added parsing of Dublin Core - compliant metadata (see RFC 5013 and ISO 15836) to html parser
17 years ago
orbiter 03e7782269 more generics
17 years ago
orbiter c527969185 - enhanced monitoring of ranking parameters
17 years ago
orbiter a31b9097a4 preparations for mass remote crawls:
17 years ago
fuchsi 0e1738899f * Complete number localization and provide a more reasonable interface to serverObjects:
17 years ago
fuchsi f717beecb1 - Changed yFormatter handling to be more flexible and produce more readable code for server pages. There are serverObject.putNum() methods to allow adding of number type values in a formatted form, and put() methods for number types that add them without formatting. This reduces the need to transform them into Strings in server pages and removes the HTML encoding step which is unecessary for numbers.
17 years ago
orbiter daf0f74361 joined anomic.net.URL, plasmaURL and url hash computation:
17 years ago
orbiter b5346141b3 made the plasmaHTCache static (there is only one internet, so we need only one cache)
18 years ago
orbiter 947fc46904 refactoring of search process:
18 years ago
orbiter 40b0547611 - documentaton changes (removed old forum links)
18 years ago
karlchenofhell b4bb48132a - fixed -UNRESOLVED_PATTERN- in ViewFile for Link List
18 years ago
orbiter e923cb59b4 added hyperlinks to viewFile
18 years ago
karlchenofhell 601fc7d1c5 - added source to J7Zip-modifed.jar and it's license (changelog is still to come)
18 years ago
orbiter 6b9eea3932 - removed differentiation between longTitle and shortTitle; this cannot be used for search results,
18 years ago
karlchenofhell 3bafd643c0 - fix for http://www.yacy-forum.de/viewtopic.php?t=3483
18 years ago
orbiter f25c0e98d1 - replaced String by StringBuffer in condenser
18 years ago
orbiter 76fab83395 fixed bugs in seach statistics
18 years ago
allo 0c81bd39d4 XSS-safe put as default.
18 years ago
auron_x 454f182ba3 *) fix for http://www.yacy-forum.de/viewtopic.php?p=30118#30118
18 years ago
karlchenofhell 35fb671721 - updated DetailedSearch and ViewFile
18 years ago
orbiter 61798f0ae6 added option to distinguish between text crawl and media crawl
18 years ago
orbiter 937ccd4e76 fix for snippet-generation
18 years ago
orbiter 10d888e70c - added a media search for images, audio, video and applications
18 years ago
orbiter 109ed0a0bb - cleaned up code; removed methods to write the old data structures
18 years ago
orbiter ceb9e3aa17 - enhanced parser: collection of audio, video, image and application links
18 years ago
orbiter 8fa4a01c38 added password check to url retrieval in FiewFile
18 years ago
orbiter 8e7215475b - extended ViewFile to use is as debugging-tool: you can now use the
18 years ago
orbiter bb7d4b5d5e refactoring to prepare new RWI entry object
18 years ago
orbiter b79e06615d - added new LURL.Entry class for next database migration
18 years ago
orbiter a5dd0d41af - refactoring of plasmaCrawlLURL.Entry to prepare new Entry format
18 years ago
orbiter 1969522dc1 removed lowercase of snippets (and other things):
18 years ago
theli f17ce28b6d *) plasmaHTCache:
18 years ago
theli a2e3095044 *) Bugfix. Add missing plasmaParserDocument.close() calls
18 years ago
theli cd5f349666 *) Better handling of large files during parsing
18 years ago