76 Commits (84ae66f0a163cf74c01019fbf1c53b728688c32f)

Author SHA1 Message Date
orbiter 0a050bc043 enhanced ranking
18 years ago
orbiter 61798f0ae6 added option to distinguish between text crawl and media crawl
18 years ago
orbiter e4570bffaf -implemented a specialized snippet-fetch for media content
18 years ago
low012 694a6e4f44 *) better text snippets: any possible searchword (welt, linux, tag) in welt-linux-tag will be marked correctly now
18 years ago
orbiter bddc197453 reverted by-mistake removed change from low012/SVN 3068
18 years ago
orbiter 1377c53aa3 extraction of media links from search results
18 years ago
low012 586add4c6c *) Better snippets: words like GNU/Linux will not prevent Linux or GNU from being marked if they are searchwords (see http://www.yacy-forum.de/viewtopic.php?t=2891)
18 years ago
orbiter 937ccd4e76 fix for snippet-generation
18 years ago
orbiter bf0d820659 - added correct flagging of word properties
18 years ago
orbiter ceb9e3aa17 - enhanced parser: collection of audio, video, image and application links
18 years ago
orbiter b5a29e9651 - fix for snippets that are too short
18 years ago
orbiter 30888e7a2f implementation of search constraints
18 years ago
orbiter 497428c8ec refactoring
18 years ago
orbiter bb7d4b5d5e refactoring to prepare new RWI entry object
18 years ago
orbiter b79e06615d - added new LURL.Entry class for next database migration
19 years ago
orbiter a5dd0d41af - refactoring of plasmaCrawlLURL.Entry to prepare new Entry format
19 years ago
orbiter c8f3a7d363 added snippet-url re-indexing
19 years ago
low012 2cfd4633ac *) even better handling of searchwords in snippets, words can consist of letters and numbers now
19 years ago
low012 2d3b7251a4 *) better handling of searchwords in snippets (see http://www.yacy-forum.de/viewtopic.php?t=2891 for details)
19 years ago
orbiter 1969522dc1 removed lowercase of snippets (and other things):
19 years ago
theli f17ce28b6d *) plasmaHTCache:
19 years ago
orbiter 630a955674 read snippets from cache in case they are not provided in RAM
19 years ago
orbiter dbc2e039bb added time-out option parameter to call hierarchy
19 years ago
orbiter 00746ca232 identified and fixed search performance problem caused by
19 years ago
theli a2e3095044 *) Bugfix. Add missing plasmaParserDocument.close() calls
19 years ago
low012 f8ac694e51 *) fixed a bug where searchwords in snippets were not displayed bold in front of a punctuation mark (see http://www.yacy-forum.de/viewtopic.php?p=25998)
19 years ago
orbiter df1629b05a - code cleanup
19 years ago
theli 625c2ce6b1 *) bugfix for snippet fetching problem if content but not http header is available in cache
19 years ago
theli 813a8a8179 *) migration of mimeTypeParser to jmimemagic 0.1
19 years ago
theli b6c7b91582 *) Parser now throws a ParserException instead of returning null on parsing errors (e.g. needed by snippet fetcher)
19 years ago
theli 97d2a08ef1 *) restructuring needed to support parsing of documents using various charsets
19 years ago
orbiter 3aac5b26da - added automatic tag generation when a web page from the search results is added
19 years ago
theli d0a5a53789 *) changes needed for multi-language support
19 years ago
orbiter 9340dbb501 fixed all possible problems with nullpointer exception for LURLs
19 years ago
theli dae763d8e3 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2495 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli 393a7d10be *) setting htCache.Entry fields to private
19 years ago
theli 09b106eb04 *) next step of restructuring for new crawlers
19 years ago
theli eb9b138986 *) next step of restructuring for new crawlers
19 years ago
theli 1395aae742 *) starting restructuring which is needed to add crawlers for additional protocols
19 years ago
theli f3ac4dbbb9 *) better handling of server shutdown
19 years ago
orbiter abf22f6e60 removed url normalform computation from htmlFilterContentScraper.
19 years ago
orbiter 3879a0ecd0 replaced java.net.URL usage by use of new class de.anomic.net.URL
19 years ago
orbiter 90d569d70f refactoring of index management:
19 years ago
orbiter a474669338 start with refactoring of index management
19 years ago
orbiter 83e0e765ec redesigned some parts of the html scanner & parser
19 years ago
orbiter d8d0ac29c3 added image-viewer servlet that can do:
19 years ago
orbiter bae3783d38 added a snippet marking
19 years ago
theli dc9174c809 *) Implementing snippet fetching via ajax
19 years ago
orbiter 3d8a5ae652 code cleanup
19 years ago
theli bdf30117c1 *) Redesign of parser configuration
19 years ago