Commit Graph

520 Commits (f814e0fa813da361cfb07353a210760b361f1ec5)

Author SHA1 Message Date
orbiter dafffd0153 refactoring of parsers and document processing
16 years ago
orbiter 77d2a3782c removed strange debugging strings
16 years ago
orbiter 409538e17a code cleanup and code simplification
16 years ago
orbiter 1f1399e5c5 extending visibility of objects and methods to avoid synthetic accessor methods and increase performance
16 years ago
orbiter 154bbc3364 code cleanup: call of static methods directly to the class
16 years ago
orbiter 222850414e simplification of the code: removed unused classes, methods and variables
16 years ago
orbiter c5122d6836 completed migration of BLOBTree to BLOBHeaps:
16 years ago
orbiter ae015e8e98 refactoring of blob package classes
16 years ago
orbiter ce1adf9955 serialized all logging using concurrency:
16 years ago
orbiter 27fa6a66ad - completed the author navigation
16 years ago
orbiter c079b18ee7 - refactoring of IntegerHandleIndex and LongHandleIndex: both classes have been merged into the new HandleMap class, which handles (key<byte[]>,n-byte-long) pairs with arbitrary key and value length. This will be useful for memory-enhanced/minimized database table indexing.
16 years ago
orbiter 88426912ad more refactoring to make the segment object easier to use and to be prepared to integrate author navigation
16 years ago
orbiter 99bf0b8e41 refactoring of plasmaWordIndex:
16 years ago
orbiter 3d4b826ca5 migration of all databases that use the deprecated BLOBTree format into the BLOBHeap format. Old databases are migrated automatically.
16 years ago
orbiter 26a46b5521 increased default maximum file size for database files to 2GB
16 years ago
orbiter e005cfea37 fix for bug in -incell option of URLAnalysis
16 years ago
orbiter a7e392f31b The collection index will not be supported any more.
16 years ago
low012 ea27853c59 *) some refactoring
16 years ago
orbiter 89aeb318d3 enhanced the wikimedia dump import process
16 years ago
orbiter c097531e3d added a catch Exception to all threads to check whether any of them silently dies without any other notification
16 years ago
low012 ff5f82d780 *) removed description of removed commands from wikiHelp ([= =])
16 years ago
orbiter 9c6ac43f66 fixes for wiki parser
16 years ago
low012 78ffb61297 *) got rid of unnecessary variable which might also fix IndexOutOfBoundsException
16 years ago
orbiter d079d6dfdb small changes in surrogate reader, wiki code and portal test
16 years ago
low012 f1244264b8 *) hopefully fixed bug reported in http://forum.yacy-websuche.de/viewtopic.php?t=2057
16 years ago
low012 d1116c049f *) added new method "contains()" to Blacklist interface
16 years ago
orbiter c8624903c6 full redesign of index access data model:
16 years ago
orbiter d4d87d90c4 - extended experimental wikipedia dump parser
16 years ago
orbiter c08f9b36a4 refactoring of wiki parser.
16 years ago
low012 9180617dd9 *) Classes to handle import of lists (especially blacklists) from XML files, not used yet, but will be used soon.
16 years ago
orbiter c2359f20dd refactoring: better abstraction of reference and metadata prototypes.
16 years ago
orbiter 96eaecda3e - added migration class to go from index collections to the index cell data structure.
16 years ago
orbiter 7dff1cba62 removed option to use different primary keys in kelondro tables
16 years ago
orbiter 14a1c33823 refactoring of wordIndex class
16 years ago
orbiter d49238a637 more performance hacks: better default values for scaling, less memory usage
16 years ago
orbiter d988204875 better shutdown of tools
16 years ago
orbiter 100247bdda also added an export and delete feature to the URLAnalysis. This completes the clean-up feature for URLs. To do a complete clean-up of the url database, start the following:
16 years ago
orbiter 60078cf322 added next tool for url analysis: check for references that occur in the URL-DB but not in the RICOLLECTIONS
16 years ago
orbiter dbdd10da84 better logging and startup behaviour for referenceHash computation
16 years ago
orbiter d64836c34f added statistical analysis of URL reference
16 years ago
orbiter b80db04667 - refactoring of IntegerHandleIndex and LongHandleIndex (better method names)
16 years ago
orbiter efcd95dc37 simplification of (internal) query process / refactoring
16 years ago
orbiter f1b712c29a small corrections to image loading methods in result presentation
16 years ago
orbiter aa44d9bad9 more refactoring of kelondro.text / deleted de.anomic.index
16 years ago
orbiter 6ffc6e3389 more refactoring of indexer and kelondro classes;
16 years ago
orbiter 76ef5f0f14 refactoring of index package: better names for the classes (to be continued)
16 years ago
orbiter d1d9fbae5c enabling the URLAnalysis to operate on multiple input files; just use a wildcard when calling the class from the command line
16 years ago
orbiter 7ea53fe47b added another url list transformation option:
16 years ago
orbiter 54625360f7 performance update
16 years ago
orbiter d884c4718a added gzip support for URLAnalysis:
16 years ago