187 Commits (487a733c99078ad85e30b6cafa43b6a63664af6e)

Author SHA1 Message Date
orbiter ca72ed7526 -removed superfluous crawl cache
16 years ago
orbiter 0e8647d62f refactoring of search classes
16 years ago
orbiter dafffd0153 refactoring of parsers and document processing
16 years ago
orbiter 409538e17a code cleanup and code simplification
16 years ago
orbiter 222850414e simplification of the code: removed unused classes, methods and variables
16 years ago
orbiter 88426912ad more refactoring to make the segment object easier to use and to be prepared to integrate author navigation
16 years ago
orbiter 99bf0b8e41 refactoring of plasmaWordIndex:
16 years ago
orbiter a642d6a7b5 - added navigation icons for search result pages
16 years ago
orbiter c2359f20dd refactoring: better abstraction of reference and metadata prototypes.
16 years ago
orbiter 14a1c33823 refactoring of wordIndex class
16 years ago
orbiter 396a4451be increased timeout in ViewFile
16 years ago
orbiter aa44d9bad9 more refactoring of kelondro.text / deleted de.anomic.index
16 years ago
orbiter 76ef5f0f14 refactoring of index package: better names for the classes (to be continued)
16 years ago
orbiter c12bb8a6d0 - refactoring of the http client
16 years ago
orbiter b57c9da1f8 - fixes to doc, ppt, xls parser: better title
16 years ago
orbiter 94110df85a moved logging partially to kelondro
16 years ago
orbiter c4c4c223b9 fixed a problem with attribute flags on RWI entries that prevented proper selection of index-of constraint
16 years ago
orbiter 47292e696a more performance hacks
16 years ago
orbiter 0edec2b760 FULL redesign of algorithms in htmlTools to encode/decode strings from/to unicode and html.
16 years ago
orbiter 47f0c3b002 replaced the cacheAdmin with the ViewFile servlet, because the cacheAdmin was an interface to the old HTCACHE data structure which does not exist any more. Changed links to point to the ViewFile servlets.
16 years ago
low012 77e41da7d2 *) further propagation of display value (see http://forum.yacy-websuche.de/viewtopic.php?f=5&t=1536)
16 years ago
orbiter 536e77e8b7 modifications towards a single database operation to read/write http header and cached file at once:
16 years ago
orbiter 7989335ed6 Preparations to replace the HTCache with a new storage data structure:
16 years ago
danielr 3bb870bfcd added final where possible
17 years ago
orbiter c3d461d191 - removed superfluous copyright statement
17 years ago
orbiter 3ca98fee42 removed superfluous copyright statement
17 years ago
danielr 7feae906aa - organize imports
17 years ago
orbiter cfe6790498 - added option to switch between yacy networks, especially between the two default networks (freeworld and intranet),
17 years ago
orbiter e024e3b9cf added new default profiles to distinguish snippet fetch for local and global search
17 years ago
danielr 5c3c1fdf41 replaced httpc with Apache Jakarta Commons HttpClient (includes some refactoring ;)
17 years ago
orbiter 7f9f639d20 - refactoring and abstraction of index reference (urls) handling: blacklisting is part of reference filtering
17 years ago
orbiter d6050b9ffb - separated the LURL data storage and Crawl result stack for process supervision.
17 years ago
orbiter 541b817502 refactoring of switchboard queueing
17 years ago
orbiter 3f2b18a4e7 fix for http://forum.yacy-websuche.de/viewtopic.php?f=5&t=905&hilit=&p=5979#p5979
17 years ago
orbiter 87a8747ce3 - enhanced recognition, parsing, management and double-occurrence-handling of image tags
17 years ago
orbiter a8a5df4a51 - more dublin core naming of page metadata
17 years ago
orbiter efd0b8371a - added parsing of Dublin Core - compliant metadata (see RFC 5013 and ISO 15836) to html parser
17 years ago
orbiter 03e7782269 more generics
17 years ago
orbiter c527969185 - enhanced monitoring of ranking parameters
17 years ago
orbiter a31b9097a4 preparations for mass remote crawls:
17 years ago
fuchsi 0e1738899f * Complete number localization and provide a more reasonable interface to serverObjects:
17 years ago
fuchsi f717beecb1 - Changed yFormatter handling to be more flexible and produce more readable code for server pages. There are serverObject.putNum() methods to allow adding of number type values in a formatted form, and put() methods for number types that add them without formatting. This reduces the need to transform them into Strings in server pages and removes the HTML encoding step which is unnecessary for numbers.
17 years ago
orbiter daf0f74361 joined anomic.net.URL, plasmaURL and url hash computation:
17 years ago
orbiter b5346141b3 made the plasmaHTCache static (there is only one internet, so we need only one cache)
18 years ago
orbiter 947fc46904 refactoring of search process:
18 years ago
orbiter 40b0547611 - documentation changes (removed old forum links)
18 years ago
karlchenofhell b4bb48132a - fixed -UNRESOLVED_PATTERN- in ViewFile for Link List
18 years ago
orbiter e923cb59b4 added hyperlinks to viewFile
18 years ago
karlchenofhell 601fc7d1c5 - added source to J7Zip-modifed.jar and its license (changelog is still to come)
18 years ago
orbiter 6b9eea3932 - removed differentiation between longTitle and shortTitle; this cannot be used for search results,
18 years ago
karlchenofhell 3bafd643c0 - fix for http://www.yacy-forum.de/viewtopic.php?t=3483
18 years ago
orbiter f25c0e98d1 - replaced String by StringBuffer in condenser
18 years ago
orbiter 76fab83395 fixed bugs in search statistics
18 years ago
allo 0c81bd39d4 XSS-safe put as default.
18 years ago
auron_x 454f182ba3 *) fix for http://www.yacy-forum.de/viewtopic.php?p=30118#30118
18 years ago
karlchenofhell 35fb671721 - updated DetailedSearch and ViewFile
18 years ago
orbiter 61798f0ae6 added option to distinguish between text crawl and media crawl
18 years ago
orbiter 937ccd4e76 fix for snippet-generation
18 years ago
orbiter 10d888e70c - added a media search for images, audio, video and applications
18 years ago
orbiter 109ed0a0bb - cleaned up code; removed methods to write the old data structures
18 years ago
orbiter ceb9e3aa17 - enhanced parser: collection of audio, video, image and application links
18 years ago
orbiter 8fa4a01c38 added password check to url retrieval in ViewFile
18 years ago
orbiter 8e7215475b - extended ViewFile to use it as a debugging tool: you can now use the
18 years ago
orbiter bb7d4b5d5e refactoring to prepare new RWI entry object
18 years ago
orbiter b79e06615d - added new LURL.Entry class for next database migration
18 years ago
orbiter a5dd0d41af - refactoring of plasmaCrawlLURL.Entry to prepare new Entry format
18 years ago
orbiter 1969522dc1 removed lowercase of snippets (and other things):
18 years ago
theli f17ce28b6d *) plasmaHTCache:
18 years ago
theli a2e3095044 *) Bugfix. Add missing plasmaParserDocument.close() calls
18 years ago
theli cd5f349666 *) Better handling of large files during parsing
18 years ago
orbiter df1629b05a - code cleanup
18 years ago
theli b6c7b91582 *) Parser now throws a ParserException instead of returning null on parsing errors (e.g. needed by snippet fetcher)
18 years ago
orbiter 9340dbb501 fixed all possible problems with nullpointer exception for LURLs
18 years ago
orbiter 4866868c0e added write cache for LURLs
18 years ago
theli dae763d8e3 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2495 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli 2126c51906 *) bugfix for ViewFile.java. Wrong http headers were used
18 years ago
theli 3870d615e3 *) setting htCache.Entry fields to private
18 years ago
orbiter 3879a0ecd0 replaced java.net.URL usage by use of new class de.anomic.net.URL
19 years ago
orbiter 015d044c25 tried to fix some problems with latest changes to httpc
19 years ago
low012 97c6a70b71 *) Fixed XSS vulnerability. I was able to crawl a PDF that caused the loading of an image in the admin's browser.
19 years ago
theli dc9174c809 *) Implementing snippet fetching via ajax
19 years ago
orbiter f4ffa9aee5 - implemented more attributes to index entries
19 years ago
orbiter bb79fb5d91 - changed handling of error cases retrieving urls from database
19 years ago
orbiter a04930f025 code cleanup
19 years ago
theli 47a2a8885d *) Display MimeType on URL Info page
19 years ago
theli dd24f0252f *) Searchword highlighting for info page
19 years ago
theli 40777556c5 *) Connection Tracking
19 years ago