- all requests to the own httdp can now be listed in the access tracker menu
- the search statistics had been renamed to access tracker and extended by this tracker
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3861 6c8d7289-2bf4-0310-a012-ef5d649a1542
- all web page parsing operations will now increase a web structure file
- the file is computed in memory and dumped at shutdown-time to PLASMASB/webStructure.map in readable form (not a database)
- the file can be used externally to analyse the link structure of the crawled pages
- the web structure can also be retrieved using a xml-interface at http://localhost:8080/xml/webstructure.xml
- the short-term purpose is the computation of a link-graph image (before linuxtag!)
- a long-term purpose could be a decentralized computation of the citation rank
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3746 6c8d7289-2bf4-0310-a012-ef5d649a1542
- better memory allocation for FlexTable indexes
- splitting between static index and dynamic index (only the dynamic part must grow)
- to enable a merge-iteration of new splittet index, a huge number of classes needed to be adopted for new iterator classes
- added new iterator classes that support cloneable iterators
- adopted all iterator classes to implement cloneable itarators
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3453 6c8d7289-2bf4-0310-a012-ef5d649a1542
- robots.txt is a servlet now
- no need to rewrite the whole file each time a section is added or removed
- user-defined disallows, added manually, won't be overwritten anymore
- new config-setting: httpd.robots.txt, holding names of the disallowed sections
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3423 6c8d7289-2bf4-0310-a012-ef5d649a1542
- target must not be at port 80
- target access not more than every 3 seconds
- requester may not access more than every 10 seconds
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3357 6c8d7289-2bf4-0310-a012-ef5d649a1542
- less memory usage
- better usage of awt classes
- drawing abstractions: preparations for movable objects for animation class
- test applet for animations
- known bugs: wrong colours for network picture
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3214 6c8d7289-2bf4-0310-a012-ef5d649a1542
- redesign of data storage in plasmaSearchRankingProfile
- profiles are extended by new ranking parameters
- new RWI ranking parameters are considered during ranking
- appearance attributes (i.e. emphasised text) is now considered
- faster ranking
- some attributes that had been checked during post-ranking can now be
checked during pre-ranking phase
- removed old ranking parameter on index.html page (will be replaced by profiles in the future)
- ranking can now consider appearances of media content
- snippet-loading for media types now work correctly (fetches only from the wanted media)
- ranking-profiles can be handed over the remote peers and apply there also
- re-search of same query with different domain now also re-triggers remote search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3105 6c8d7289-2bf4-0310-a012-ef5d649a1542
- added a cache-control for cache miss flush to the dns miss cache
- better naming of cache variables to distinguish hit- and miss- cache
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3015 6c8d7289-2bf4-0310-a012-ef5d649a1542
- the test phase of the new collection data structure is finished
- test data that had been generated is void. There will be no migration
- the new collection files are located in DATA/INDEX/PUBLIC/TEXT/RICOLLECTION
- the index dump is void. There will be no migration
- the new index dump is in DATA/INDEX/PUBLIC/TEXT/RICACHE
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2983 6c8d7289-2bf4-0310-a012-ef5d649a1542
- enhanced local domain detection
- bugfixing for memory assignment in kelondroFlexSplit
- automatic memory assignment to caches according to available RAM
- bugfixes for details during search process
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2924 6c8d7289-2bf4-0310-a012-ef5d649a1542