YaCy: News http://www.yacy.net/yacy/News.html This is essentially the YaCy release change-log. en-us Tue, 30 Jun 2007 09:00:00 GMT http://www.yacy.net/yacy/grafics/yacy.gif YaCy http://www.yacy.net/
New Release V0.52 (20070512_3715) http://www.yacy.net/yacy/News.html#3715 Sat, 12 May 2007 00:12:29 GMT 3715
  • New Functions
    • Added exclusion-search (a search with '-' to exclude specific words from the search results)
    • Added extraction of sitemap-url from robots.txt, which can be used for crawl starts
    • Added a network configuration menu for new cluster configuration functions: a set of peers may now operate as an isle within the YaCy network, without exchanging index data across the border of the isle. Peers within the cluster can trigger internal remote crawls and search only within their own cluster.
    • Added a postscript parser
  • Interface Enhancements
    • Redesigned the status page, which now also shows hints and warnings
    • Better layout for image search results
    • The peer profile can now be displayed as a vCard, e.g. http://localhost:8080/ViewProfile.vcf?hash=localhash
  • Performance Enhancements
    • Added an option to configure a path to a secondary index location. This can be used to store a fragment of the index on another physical device, to split IO load and improve access speed. The index is split in such a way that the LURLs are stored in the secondary location and the RWIs in the primary location.
    • Optimized memory allocation when accessing the web-index (memory throughput is now half of what it was before)
    • Fixed bugs in the database engine that corrupted the data when entries were removed
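Two of the new functions in this release, exclusion-search and the extraction of sitemap URLs from robots.txt, can be illustrated with a minimal Python sketch. This is not YaCy's implementation (YaCy itself is written in Java); the function names `parse_query`, `matches` and `sitemap_urls` are hypothetical, and the matching is a simple whole-word comparison chosen only for illustration.

```python
def parse_query(query):
    """Split a query into words to include and words to exclude ('-' prefix)."""
    include, exclude = [], []
    for token in query.lower().split():
        if token.startswith("-") and len(token) > 1:
            exclude.append(token[1:])
        elif token != "-":
            include.append(token)
    return include, exclude


def matches(text, include, exclude):
    """True if the text contains every included word and no excluded word."""
    words = set(text.lower().split())
    return all(w in words for w in include) and not any(w in words for w in exclude)


def sitemap_urls(robots_txt):
    """Collect the values of all 'Sitemap:' lines in a robots.txt body."""
    urls = []
    for line in robots_txt.splitlines():
        # Drop trailing comments; this also cuts URL fragments, which
        # sitemap URLs normally do not carry.
        line = line.split("#", 1)[0].strip()
        field, sep, value = line.partition(":")
        if sep and field.strip().lower() == "sitemap" and value.strip():
            urls.append(value.strip())
    return urls
```

With this sketch, the query `yacy search -google` yields the include list `yacy`, `search` and the exclude list `google`, and each URL returned by `sitemap_urls` could serve as a crawl start.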
    New Release V0.51 (20070321_3501) http://www.yacy.net/yacy/News.html#3501 Wed, 21 Mar 2007 15:00:00 GMT 3501
  • Better Crawling
    • Higher crawling speed is possible thanks to better RAM cache flush methods
    • The crawl balancer now has a security function which prevents remote web servers from being accessed more than twice per second. When crawling a single domain, this restricts the crawling speed to at most 120 pages per minute
    • The crawl balancer chooses better URLs. Newly added URLs are now prevented from being hidden by masses of links generated by the crawler. The effect is that in most cases the security function described above is not needed.
    • Added a crawling speed button on the crawling monitor page.
    • Crawl targets get informed about the yacy bot; a link to http://yacy.net/yacy/bot.html is attached to each crawl request; the page explains YaCy and that YaCy respects robots.txt
  • Better Monitoring
    • New search result page SearchStatistics_p.html shows local and remote search requests; remote requests are anonymized
    • Added network-wide QPM (queries per minute) computation to show how much the network is used for web search. The statistics are not reported from searching peers, but from searched peers; therefore the accumulation preserves privacy of the searcher
    • New page LogStatistics_p.html which shows an evaluation of entries from the log.
    • New page BlacklistCleaner_p.html to clean up wrong blacklist entries. The page allows categorization of blacklist error cases, correction of errors and the optional deletion of blacklist entries.
    • Added RSS feed for YaCyNews
  • Enhanced User Interface
    • Added a robots.txt configuration menu to allow or deny external crawlers access to the YaCy user interface
    • New wiki-parser
    • Blog entries may now have user-comments
    • The network list page now provides links to the users' blog pages
    • The menu items have been rearranged
  • Less Memory Usage and Better Memory Management
    • All caches (node cache, object cache) now have enhanced self-organization and no longer need fixed size assignments
    • Memory protection by disallowing collection arrays beyond kca-7. Collections larger than those are written to 'common' files.
    • The network picture uses less memory
  • Bugfixes: a very large number of bugfixes were made.
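The crawl balancer limit described for this release (no more than two accesses per second to one server, hence at most 120 pages per minute from a single domain) can be sketched as a per-host minimum delay. The following Python sketch is an illustration under assumptions, not YaCy's code: the class `HostThrottle`, its methods and the injectable `clock` parameter are hypothetical names.

```python
import time

# Two accesses per second correspond to a minimum gap of 0.5 seconds per host.
MIN_DELAY = 0.5


class HostThrottle:
    """Tracks the last access time per host and reports how long to wait."""

    def __init__(self, min_delay=MIN_DELAY, clock=time.monotonic):
        self.min_delay = min_delay
        self.clock = clock          # injectable so the logic can be tested
        self.last_access = {}       # host -> time of the last request

    def wait_time(self, host):
        """Seconds to wait before the next request to this host is allowed."""
        last = self.last_access.get(host)
        if last is None:
            return 0.0
        return max(0.0, self.min_delay - (self.clock() - last))

    def record_access(self, host):
        """Remember that a request to this host was just made."""
        self.last_access[host] = self.clock()


def max_pages_per_minute(min_delay=MIN_DELAY):
    """Upper bound on pages fetched from a single host in one minute."""
    return int(60 / min_delay)
```

A crawler using this sketch would sleep for `wait_time(host)` before each request and call `record_access(host)` afterwards; requests to other hosts are not delayed, which is why a balancer that mixes hosts well rarely hits the limit.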
    New Release V0.50 (20061222_3124) http://www.yacy.net/yacy/News.html#3124 Tue, 30 Jan 2007 09:00:00 GMT 3124
  • Added Media Search
    • Added search pages for Images, Audio, Video and Application search.
    • Added media link presentation during snippet fetch; the Image Search presents search results as image thumbnails.
    • Better recognition of search hits for text snippet generation.
    • Media search results are indexed again after remote search results are collected; only media links are used to update the index.
  • Better Result Ranking
    • New ranking parameters and appearance attributes are now considered.
    • Faster ranking; more references can be ranked and sorted within given search time.
    • Ranking Parameters can be handed over to remote peers and are applied there.
    • Adopted Detailed Search to new ranking parameters.
    • Coefficients from detailed search can be set as default ranking for search page; this replaces the old ranking alternatives.
  • Better Crawl Monitoring
    • After a crawl start is initiated, the Crawl Monitor is shown.
    • The Crawl Monitor now shows all queue elements in one table.
    • Monitoring of index size.
    • The Crawl Profiles are shown; crawls can be interrupted within the profile table.
    • A crawl may now distinguish between text indexing and media link indexing.
  • Migration to new Database Structure
    • The new Collection Database is now the only database structure that can be used; Assortments are switched off.
    • Added functions to migrate Assortment databases and WORDS databases to Collection database.
    • Removed all methods to write Assortment data structures.
    • Migrated DHT position computation to base64-decoded values; this changes the DHT structure slightly and closes the gaps in the old DHT structure.
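The idea of computing a DHT position from base64-decoded values can be sketched as follows: each character of a hash is decoded to its 6-bit value, so the resulting positions cover the range from 0 to 1 without gaps. This Python sketch rests on assumptions: the alphabet ordering, the number of characters used and the function name `dht_position` are hypothetical, and YaCy's actual encoding and computation may differ.

```python
# Assumed 64-character alphabet; YaCy's real ordering may be different.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_"


def dht_position(peer_hash, digits=4):
    """Map the leading characters of a hash to a position in [0, 1).

    Each character is decoded to its 6-bit value (0..63) and the values are
    read as digits of a base-64 number, so neighbouring hashes get
    neighbouring positions and no part of the range stays unused.
    """
    value = 0
    for ch in peer_hash[:digits]:
        value = value * 64 + ALPHABET.index(ch)  # raises ValueError on bad input
    return value / 64 ** digits
```

Under the assumed alphabet, a hash starting with `AAAA` maps to position 0.0 and one starting with `gAAA` maps to 0.5, the middle of the range.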
    New Release V0.49 (20061202_3040) http://www.yacy.net/yacy/News.html#3040 Sat, 02 Dec 2006 09:00:00 GMT 3040
  • Enhanced search service
    • Web searches are faster because of the new data structures implemented in this version (see below) and because bugs were found and fixed.
    • Searches can be re-done with changed search properties. Please use the 'more options' link at the search page.
    • Added search constraints. These are search restrictions to web searches which are applied to information that is scraped from the web pages during page parsing. The first application of search constraints is a search restricted to index pages ('index of'). Please use the flag at the extended search functions.
    • Enabled index-abstracts search; this should solve the distributed-combined search challenge (still being tested).
  • New Database Structures for Index and URL storage
    • The new 'Collections' Data Structure is now the default data structure.
    • Index entries and URL entries carry more ranking and selection attributes, e.g. for image, video, audio and application search.
    • Enhanced Storage of URLs: they are now divided by creation time. This enables easy deletion of outdated URLs, enables an index-limitation function and solves the problem that the URL database was too big to fit into a 2 GB file.
    • Search requests can now be answered in less time.
    • The index organization needs less IO.
    • Index transfers will now only be made to recent peers that support the collection data structure.
    • Index transfers from old peers to new peers are translated automatically to the new data format.
    • Assortments are no longer supported.
  • Enhanced SOAP support
    • Added protocol for peer administration, custom services, status queries, blacklist management, file share management, support for outgoing transfer- and content-encoding, better error handling, function to get and set message forwarding, handling of YaCy bookmarks, log display, manage peer messages, get and set peer profile, query peer status, query the pause/resume state of the crawling queues, and a check if a specific URL is blacklisted.
    • Added new ANT target to allow generation of client stub classes for YaCy SOAP api.
  • Other new Features
    • Added DNS-cache-miss caching.
    • Added Flash (experimental), MS Excel and PowerPoint parsers.
    • New mint-green and dark skin.
    • Better support for non-7-bit ASCII characters.
    • Added ant support for rpms.
    • Added ant target for windows installer.
    • Added template to display file share in xml format.
    • Better object caching for kelondro database (combined read/write object cache with synergy effects).
    • More anonymization in logging.
    • New HTCACHE layout using file hashes; tree- and hash-layout can be used simultaneously; hash-layout is now the default.
    • Access to the wiki can now be limited to the administrator, if wanted. This can be configured at the wiki page.
    • ...and many bugfixes.
  • New 'satellite' Projects: these applications work as service applications for the YaCy application (start-up/experimental status)
    • YaCy admin: a Swing-based client that can administer YaCy using the SOAP interface.
    • YaCy Screen Saver: presentation of the peer status in a screen saver
    • YaCy Updater: automated downloads/updates
    • YaCy logalizer: analyzer for the YaCy log
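The DNS-cache-miss caching listed for this release means that failed lookups are remembered as well as successful ones, so an unresolvable host is not looked up again on every request. The Python sketch below shows the general technique only; the class `DnsCache`, the `miss_ttl` of 300 seconds and the injectable `resolver` and `clock` parameters are assumptions, not details taken from YaCy.

```python
import socket
import time


class DnsCache:
    """Caches successful lookups and, for a limited time, failed ones."""

    def __init__(self, miss_ttl=300.0, resolver=socket.gethostbyname,
                 clock=time.monotonic):
        self.miss_ttl = miss_ttl    # assumed lifetime of a cached miss, seconds
        self.resolver = resolver    # injectable so the logic can be tested
        self.clock = clock
        self.hits = {}              # host -> resolved address
        self.misses = {}            # host -> time the failure was recorded

    def resolve(self, host):
        """Return the address for host, or None if it cannot be resolved."""
        if host in self.hits:
            return self.hits[host]
        failed_at = self.misses.get(host)
        if failed_at is not None and self.clock() - failed_at < self.miss_ttl:
            return None             # recent miss: skip the expensive lookup
        try:
            address = self.resolver(host)
        except OSError:             # socket.gaierror is a subclass of OSError
            self.misses[host] = self.clock()
            return None
        self.misses.pop(host, None)
        self.hits[host] = address
        return address
```

In a crawler, many links often point to the same dead host, so caching the miss avoids repeating a slow failing lookup for each of them; the expiry lets a host that comes back online be resolved again later.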