yacy_search_server

Commit Graph

Author	SHA1	Message	Date
orbiter	11639aef35	- added new protocol loader for 'file'-type URLs - it is now possible to crawl the local file system with an intranet peer - redesign of URL handling - refactoring: created LGPLed package cora: 'content retrieval api' which may be used externally by other applications without yacy core elements because it has no dependencies to other parts of yacy git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6902 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	7b880d73d0	adjustments to granted query size git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6868 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	586bc4d920	- remove superfluous entries in remote search tracker handles - avoid concurrent access from same client this is a fix for http://forum.yacy-websuche.de/viewtopic.php?p=20045#p20045 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6866 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	90c3e5d6f6	- cleanup, removed unused imports - added crawling queue sizes to /api/status_p.xml, syntax same as in queues_p.html - fixed a bug in queue enumeration that caused an out of bounds exception git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6842 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	b18a7606a0	some performance hacks and fixed after reading dump in http://forum.yacy-websuche.de/viewtopic.php?p=19920#p19920 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6837 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	1a8a134e0c	continuing String-hash - to - byte[]-hash redesign that was started in SVN 6775 and continued in SVN 6790 The result should be a less usage of new String() and less memory usage (since a String-encapsulated byte[] has 40 bytes overhead) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6815 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	6c093d6aed	- enhanced domain navigator computation - fixed domain navigator content in case that a mustmatch constraint was given git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6763 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	bfb518cd47	some refactoring to get the LoaderDispatcher a little bit more independent from the switchboard git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6755 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	884b262130	- added a new Wiki Namespace Navigator - some redesign of Navigator data structures git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6716 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	f175f9a2d3	changed way how number of search requests are counted: so far only search requests at the remote search interface had been counted. This was done to protect the privacy of searchers, because counting was not done and published at the own search interface. This caused that no search requests of robinson peers had been counted, because they cannot be counted at remote peer. This change introduces a distinction of locally done search requests at the local search interface from search requests that are on the local interface but had been submitted from a remote IP without authentication. Now 3 counters are maintained: - partial count of remote searches - total count of local searches on robinson peers from non-authenticated clients - total count of local searches on robinson peers from localhost or authenticated clients In the global statistic of search requests now the first two counters of the three cases are added Because we have a large number of robinson peers with a large number of remote non-authenticated requests the statistic should show at least three times of the number of search requests. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6696 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	5d930c96f0	more fixes to search result page navigation git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6575 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	8281e29963	- more configuration for profiling graph (number of events) - more logging for a shutdown: print reason and accessing IP into log git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6520 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	491ba6a1ba	- some refactoring in workflow - some refactoring in search process - fixed image search for json and rss output - search navigation on bottom of search result page in cases where there are more than 6 results on page - fixes for number of displayed documents - disabled pseudostemming git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6504 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	4c6312d103	enhanced image search git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6489 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	36fbfdcb21	more performance for remote search git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6487 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	3528b970d6	- refactoring - added new experimental (not-yet-working) image parser - added new test image git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6431 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	b79f4f062f	refactoring of yacy documents and parsers: they depend now only on the kelondro classes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6426 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	5841ee83d3	refactoring git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6400 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	ce8dc575ca	refactoring git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6398 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	bea3b99aff	moved table and util classes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6397 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	1e4f8b56ed	accumulated classes from different packages into the new rwi package git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6394 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	4446acc8cd	moved kelondro order git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6392 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	735e2737e3	* added index segments This is a major change in the organization of indexes. Please consider a back-up of your data before you run this update. All existing index files will be moved and renamed to a new position. With this change, it will be possible to maintain different indexes for different purposes and it will be possible to have a distinction between DHT-in and DHT-out specific indexes. Tenants may also have their own index, and it may be possible to have histories and back-ups of indexes. This is just the beginning, many servlets must be adopted after this change, but all functions that had been there should still work. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6389 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	e7736d9c8d	more refactoring: made all variables in SearchEvent private to prepare splitting of the class into two parts: local and remote search git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6265 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	d8ca6e6bf1	more refactoring for search git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6263 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	72ac5bd80f	refactoring of search process. this is the beginning of some architecture changes that will hopefully bring some more stability, speed and transparency to the search process. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6260 6c8d7289-2bf4-0310-a012-ef5d649a1542	15 years ago
orbiter	1d8d51075c	refactoring: - removed the plasma package. The name of that package came from a very early pre-version of YaCy, even before YaCy was named AnomicHTTPProxy. The Proxy project introduced search for cache contents using class files that had been developed during the plasma project. Information from 2002 about plasma can be found here: http://web.archive.org/web/20020802110827/http://anomic.de/AnomicPlasma/index.html We still have one class that comes mostly unchanged from the plasma project, the Condenser class. But this is now part of the document package and all other classes in the plasma package can be assigned to other packages. - cleaned up the http package: better structure of that class and clean isolation of server and client classes. The old HTCache becomes part of the client sub-package of http. - because the plasmaSwitchboard is now part of the search package all servlets had to be touched to declare a different package source. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6232 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	5bb8074150	removed the indexing queue. This queue was superfluous since the introduction of the blocking queues last year, where documents are parsed, analysed and stored in the index with concurrency. - The indexing queue was a historic data structure that was introduced at the very beginning at the project as a part of the switchboard organisation object structure. Without the indexing queue the switchboard queue becomes also superfluous. It has been removed as well. - Removing the switchboard queue requires that all servlets are called without an opaque generic ('<?>'). That caused that all servlets had to be modified. - Many servlets displayed the indexing queue or the size of that queue. In the past months the indexer was so fast that mostly the indexing queue appeared empty, so there was no use of it any more. Because the queue has been removed, the display in the servlets had also to be removed. - The surrogate work task had been a part of the indexing queue control structure. Without the indexing queue the surrogates needed its own task management. That has been integrated here. - Because the indexing queue had a special queue entry object and properties attached to this object, the properties had to be moved to the queue entry object which is part of the new indexing queue within the blocking queue, the Response Object. That object has now also the new properties of the removed indexing queue entry object. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6225 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	0e8647d62f	refactoring of search classes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6184 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	dafffd0153	refactoring of parsers and document processing git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6182 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	154bbc3364	code cleanup: call of static methods directly to the class git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6155 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	222850414e	simplification of the code: removed unused classes, methods and variables git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6154 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	57af311627	fix for wrong urls in navigator when a tenant is used git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6119 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	be1c7ddc64	refactoring of search classes -- moved Ranking Profile to search package git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6086 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	bc6dd8194b	refactoring: moved search query class to new search package git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6075 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	e0b3984805	added navigation keys for site and author facets to remote search interface git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6038 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	27fa6a66ad	- completed the author navigation - removed some unused variables git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6037 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	c079b18ee7	- refactoring of IntegerHandleIndex and LongHandleIndex: both classes had been merged into the new HandleMap class, which handles (key<byte[]>,n-byte-long) pairs with arbitrary key and value length. This will be useful to get a memory-enhanced/minimized database table indexing. - added an analysis method that counts bytes that could be saved in case the new HandleMap can be applied in the most efficient way. Look for the log messages beginning with "HeapReader saturation": in most cases we could save about 30% RAM! - removed the old FlexTable database structure. It was not used any more. - removed memory statistics in PerformanceMemory about flex tables and node caches (node caches were used by Tree Tables, which are also not used any more) - add a stub for a steering of navigation functions. That should help to switch off navigation computation in cases where it is not demanded by a client git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6034 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	15fad767c0	some refactoring of topic generation git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6018 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	ab06a6edd2	renamed topwords to topics and enhanced computation methods of topics topics will now only be computed using the document title, not the document url, because the host navigator is now responsible for statistical effects of urls. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6011 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	88426912ad	more refactoring to make the segment object easier to use and to be prepared to integrate author navigation git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5992 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	99bf0b8e41	refactoring of plasmaWordIndex: divided that class into three parts: - the peers object is now hosted by the plasmaSwitchboard - the crawler elements are now in a new class, crawler.CrawlerSwitchboard - the index elements are core of the new segment data structure, which is a bundle of different indexes for the full text and (in the future) navigation indexes and the metadata store. The new class is now in kelondro.text.Segment The refactoring is inspired by the roadmap to create index segments, the option to host different indexes on one peer. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5990 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	fec6f9054f	some refactoring of search methods git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5988 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	63a0255166	- refactoring: added new content package, which will contain connector classes for different types of data sources to import texts into the YaCy index - refactoring: migrated data objects for the new connector classes - added a DAO interface class to specify an abstract interface for database retrieval connector methods git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5977 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	c8624903c6	full redesign of index access data model: terms (words) are not any more retrieved by their word hash string, but by a byte[] containing the word hash. this has strong advantages when RWIs are sorted in the ReferenceContainer Cache and compared with the sun.java TreeMap method, which needed getBytes() and new String() transformations before. Many thousands of such conversions are now omitted every second, which increases the indexing speed by a factor of two. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5812 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	89ec3acb3e	- full abstraction of index content type: the kelondro full text index may now also contain indexes about other content than text, i.e. navigation indexes or reverse linking indexes. - during index joins all word positions are maintained: better ranking for word distance possible; exact phrase match can be implemented soundly git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5804 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	7ba078daa1	- added fast site-operator - refactoring merge into BLOBArray git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5770 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	83792d9233	more refactoring git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5722 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	7f67238f8b	refactoring of plasmaWordIndex: less methods in the class, separated the index to CachedIndexCollection git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5710 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	14a1c33823	refactoring of wordIndex class git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5709 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago


190 Commits (60e71876adea5eaae40c515ec7bcf16788e416bb)