yacy_search_server

Commit Graph

Author	SHA1	Message	Date
orbiter	947fc46904	refactoring of search process: - re-designed remote request result processing - re-designed local result accumulation, will be further enhanced with snippet fetcher - removed search process handling in switchboard - made snippet class static (there is no need for multiple snippet objects) - removed some redundant tasks in server-side search process, should be a little bit faster now git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4043 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	62347b50f4	added security layer for ViewImage: - images may be requested by localhost and authorized users only, if the request is done using a clear-text URL - the image may be requested also using a code that can be a license to retrieve a URL for everyone - some servlets produce URL licenses for ViewImage, like image search results git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4027 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	9ca46a8c69	indexing of local (intranet) urls enabled To do this, one must create a separate YaCy network that has a local URL domain A description how to do this is here: http://www.yacy-websuche.de/wiki/index.php/De:Netzdefinition git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4001 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	511dcbb172	fixed encoding bug made in SVN 3993 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3998 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	40b0547611	- documentation changes (removed old forum links) - different handling of link quotation - different handling of link normalization - enhanced html/unicode en/de-coding git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3993 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	a4e8ad95ab	enhancements to news and switchboard queue processing removed direct access and replaced by iteration git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3961 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	36a37f758b	fix for oom exception during release download see http://forum.yacy-websuche.de/viewtopic.php?f=6&t=101&hilit= git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3950 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
karlchenofhell	71ca9aa6d4	- fix for changed blacklist types git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3857 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
theli	339153d40e	*) favicons that are specified in the document content via html link-tags are now detected and displayed on the search page (requested by allo). git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3845 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
theli	051a65f7af	*) Snippet fetching: Snippets are now fetched synchronously if the query parameter "fetchSnippet=" is appended to the query string on the yacy search page. This is required for the RSS feed. See: http://www.yacy-forum.de/viewtopic.php?t=4051 *) Small changes in the XSLT-stylesheet that is used to generate a html page from the RSS feed. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3787 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
allo	5fc00871a9	getpageinfo/sitemap bugfix git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3781 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
allo	e7da3d2340	fixed sitemap url in getpageinfo added suggested tags/keywords in getpageinfo git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3780 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
(no author)	92351c4dcb	*) SOAP: bookmarks list now indicates if a bookmark is private (requested by KoH) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3775 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	a585b4d41b	added web structure image see http://localhost:8080/WatchWebStructure_p.html git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3747 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	33ad0c8246	added a web structure computation and logging: - all web page parsing operations will now increase a web structure file - the file is computed in memory and dumped at shutdown-time to PLASMASB/webStructure.map in readable form (not a database) - the file can be used externally to analyse the link structure of the crawled pages - the web structure can also be retrieved using a xml-interface at http://localhost:8080/xml/webstructure.xml - the short-term purpose is the computation of a link-graph image (before linuxtag!) - a long-term purpose could be a decentralized computation of the citation rank git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3746 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
karlchenofhell	601fc7d1c5	- added source to J7Zip-modifed.jar and its license (changelog is still to come) - moved HTML-*replace-methods from wikiCode to de.anomic.data.htmlTools - prepared use of different wiki parsers as suggested here: http://www.yacy-forum.de/viewtopic.php?p=34444#34444 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3741 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
theli	7d9259e44d	*) Bugfix for umlaut problem See: http://www.yacy-forum.de/viewtopic.php?t=3932 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3674 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
theli	0b5fc3c28c	*) moving date functions to serverDate class *) Sitemap-parser - logging added - parsing of modDate added git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3667 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
theli	6f46245a51	*) Bookmarks: Ajax icon is displayed while loading title *) First version of a sitemap parser added - currently only autodetection of sitemap files is supported *) DB-Import restructured - pause/resume should work again now git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3666 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	6e7340ef52	added exclusion search (you can now search and exclude words from the result with '-') git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3540 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
theli	91c2a042a7	*) bugfix for wrong proxy traffic accounting git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3484 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	861f41e67e	redesigned NURL-handling: - the general NURL-index for all crawl stack types was split into separate indexes for these stacks - the new NURL-index is managed by the crawl balancer - the crawl balancer does not need an internal index any more, it is replaced by the NURL-index - the NURL.Entry was generalized and is now a new class plasmaCrawlEntry - the new class plasmaCrawlEntry replaces also the preNURL.Entry class, and will also replace the switchboardEntry class in the future - the new class plasmaCrawlEntry is more accurate for date entries (holds milliseconds) and can contain larger 'name' entries (anchor tag names) - the EURL object was replaced by a new ZURL object, which is a container for the plasmaCrawlEntry and some tracking information - the EURL index is now filled with ZURL objects - a new index delegatedURL holds ZURL objects about plasmaCrawlEntry objects to track which url is handed over to other peers - redesigned handling of plasmaCrawlEntry - handover, because there is no need any more to convert one entry object into another - found and fixed numerous bugs in the context of crawl state handling - fixed a serious bug in kelondroCache which caused that entries could not be removed - fixed some bugs in online interface and adapted monitor output to new entry objects - adapted yacy protocol to handle new delegatedURL entries all old crawl queues will disappear after this update! git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3483 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	9f929b5438	better snippet handling in case of snippet load fail see also http://www.yacy-forum.de/viewtopic.php?p=31096#31096 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3475 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
karlchenofhell	bf7a69197d	- fix for possible NPE in queues_p - WatchCrawler_p: - display crawler traffic - pause/resume local- and global crawler git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3389 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	306c50ac40	QPM (queries per minute) statistic stub git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3308 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
allo	29aa7031d3	workaround for the snippets git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3225 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
allo	8803f813c5	partly fixed snippets git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3224 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
allo	0c81bd39d4	XSS-safe put as default. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3217 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
rramthun	00ca6ecf58	-made snippet-timeout for text and media configurable -Now completely working OpenSearch plugin! Please have a look at the search-field of modern browsers (IE 7+, FF2+). It should change its colour when you visit the index/search-page of a peer and you should be able to add your YaCy-peer as search source very easily now. Credits for adapting the plugin to make it work go to Philipp Redeker. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3212 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
karlchenofhell	41bc31d2c2	- ConfigAdvanced_p => XHTML (no invalid IDs) - removed unmappable characters from code git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3133 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	1d2d1854b9	added size of rwi and urls to WatchCrawler git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3112 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	0a050bc043	enhanced ranking - redesign of data storage in plasmaSearchRankingProfile - profiles are extended by new ranking parameters - new RWI ranking parameters are considered during ranking - appearance attributes (i.e. emphasised text) is now considered - faster ranking - some attributes that had been checked during post-ranking can now be checked during pre-ranking phase - removed old ranking parameter on index.html page (will be replaced by profiles in the future) - ranking can now consider appearances of media content - snippet-loading for media types now work correctly (fetches only from the wanted media) - ranking-profiles can be handed over the remote peers and apply there also - re-search of same query with different domain now also re-triggers remote search git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3105 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	61798f0ae6	added option to distinguish between text crawl and media crawl - for each crawl start, there is now a flag for text and media - the localCrawl flag is superfluous - added new crawl profiles - if an image search is done, only media links are crawled for the snippets git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3100 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	febe6b114a	design update of crawler monitor git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3094 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	e4570bffaf	-implemented a specialized snippet-fetch for media content -changed search result preparation for media search presentation git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3073 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	1377c53aa3	extraction of media links from search results these links are mixed to the snippets for testing purpose (a final version will handle this differently) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3069 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	fb9e0f0284	preparations for media snippets git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3064 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	937ccd4e76	fix for snippet-generation git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3060 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	9a85f5abc3	cleanup - removed 'deleteComplete' flag; this was used especially for WORDS indexes - shifted methods from plasmaSwitchboard to plasmaWordIndex git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3051 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	109ed0a0bb	- cleaned up code; removed methods to write the old data structures - added an assortment importer. the old database structures can be imported with java -classpath classes yacy -migrateassortments - modified wordmigration. The indexes from WORDS are now imported to the collection database. The call is java -classpath classes yacy -migratewords (as it was) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3044 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	ceb9e3aa17	- enhanced parser: collection of audio, video, image and application links - enhanced condenser: better handling of utf-8 and pre-formatted texts git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3017 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	b5a29e9651	- fix for snippets that are too short - added keyword to snippet fetch to suppress removal of not-found snippet words (for debugging) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3009 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	30888e7a2f	implementation of search constraints Such constraints may formulate specific restrictions to web searches This is implemented by scraping information for constraints from a web page during parsing, and storing flags to the pages within the web index. In this first step, only information for index pages ("index of", directory listings) are scraped and stored in flags - added new flag class kelondroBitfield - added scraper method in condenser - added bitfield structure for all scrape types (see also condenser) - added bitfield structure for appearance locations (see RWIEntry) - added handover protocol for remote search and index distribution - extended kelondroColumn class to hold bitfield types - added another search attribute on search page (index.html) - extended search-filter to enable filtering of non-matching constraints - set all new database types to be default - refactoring: moved word hash generation to condenser class git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2999 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	d34f10c63d	some tests with reverse dns lookup git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2954 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	497428c8ec	refactoring git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2949 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
allo	a75f895884	memory and traffic information git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2904 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
allo	2ba56f70a8	XML-safe put. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2848 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
allo	a17c43779f	removed wrong part of template git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2830 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
allo	27f9e0b1c6	xml interface for blacklists git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2829 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
allo	74f09a0510	some more xml-backend files. ConfigAdvanced_p.java: list settings after changing. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2784 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago

101 Commits (344911bfaa1079e278faa34b605e00904ce1bbc4)