- the test phase of the new collection data structure is finished
- test data that had been generated is void. There will be no migration
- the new collection files are located in DATA/INDEX/PUBLIC/TEXT/RICOLLECTION
- the index dump is void. There will be no migration
- the new index dump is in DATA/INDEX/PUBLIC/TEXT/RICACHE
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2983 6c8d7289-2bf4-0310-a012-ef5d649a1542
- moved all url and index(RWI) entries to index package
- better naming to distinguish RWI entries and URL entries
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2937 6c8d7289-2bf4-0310-a012-ef5d649a1542
- added generation of a compressed index within remote peers during global search
- added selection of specific urls within remote peers during secondary global search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2539 6c8d7289-2bf4-0310-a012-ef5d649a1542
for indexing, the plasmaWordIndex.
The new data structure is ready-to-use, but currently disabled.
It can be activated by setting the static
plasmaWordIndex.useCollectionIndex
to true. This shall be done for testing purpose.
The new index is stored to
DATA/INDEX/PUBLIC/TEXT
The directory PLASMA shall be used only for crawler in the future.
Attention: during testing the data structure in INDEX may change,
and created indexes with the new data structure may get useless.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2348 6c8d7289-2bf4-0310-a012-ef5d649a1542
this is the new database structure that is supposed to replace the
plasmaAssortmentCluster AND the plasmaWordIndexFileCluster
The new structure is not yet active and needs to be integrated into
plasmaWordIndex. This has some migration constraints that are not yet
completely solved.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2347 6c8d7289-2bf4-0310-a012-ef5d649a1542
shall replace the current assortment files.
* used the kelondroFlexTable to hold the index of collections
* used kelondroRow definitions to declare all data structures
* fixed several bugs that appeared in kelondroRowSet and kelondroRowCollection during testing
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2344 6c8d7289-2bf4-0310-a012-ef5d649a1542
indexContainers from RAM must be cloned explicitely to prevent
side-effects on stored indexContainer objects in Cache
* changed behaviour of urlReference deletion from indexContainers:
deletion does not user retrieval of all Elements from the assortments
* added textual configuration of kelondroRow and kelondroColumn definition
* update of kelondroRow usage in yacyNews
* modified kelondroAttrSeq to use modified kelondroColumn parser
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2339 6c8d7289-2bf4-0310-a012-ef5d649a1542
an iteration of key elements in kelondroTree databases is no longer supported.
this is now replaced by an iteration of kelondroRow.Entry objects from the database
Iteration of keys from the database was mostly followed by retrieval of the row
from the database, whcih caused unnecessary database load.
The index selection was also redesigned to use the new row iteration methods.
This affects many funktions, most important is the DHT selection routine which is now much faster.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2327 6c8d7289-2bf4-0310-a012-ef5d649a1542
a reverse word index based on a a collection index,
which is an index for a set of array files containing
row collections.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2271 6c8d7289-2bf4-0310-a012-ef5d649a1542