yacy_search_server

Commit Graph

Author	SHA1	Message	Date
orbiter	57af311627	fix for wrong urls in navigator when a tenant is used git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6119 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	be1c7ddc64	refactoring of search classes -- moved Ranking Profile to search package git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6086 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	bc6dd8194b	refactoring: moved search query class to new search package git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6075 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	e0b3984805	added navigation keys for site and author facets to remote search interface git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6038 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	27fa6a66ad	- completed the author navigation - removed some unused variables git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6037 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	c079b18ee7	- refactoring of IntegerHandleIndex and LongHandleIndex: both classes had been merged into the new HandleMap class, which handles (key<byte[]>,n-byte-long) pairs with arbitrary key and value length. This will be useful to get a memory-enhanced/minimized database table indexing. - added an analysis method that counts bytes that could be saved in case the new HandleMap can be applied in the most efficient way. Look for the log messages beginning with "HeapReader saturation": in most cases we could save about 30% RAM! - removed the old FlexTable database structure. It was not used any more. - removed memory statistics in PerformanceMemory about flex tables and node caches (node caches were used by Tree Tables, which are also not used any more) - add a stub for a steering of navigation functions. That should help to switch off navigation computation in cases where it is not demanded by a client git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6034 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	15fad767c0	some refactoring of topic generation git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6018 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	ab06a6edd2	renamed topwords to topics and enhanced computation methods of topics. Topics will now only be computed using the document title, not the document url, because the host navigator is now responsible for statistical effects of urls. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6011 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	88426912ad	more refactoring to make the segment object easier to use and to be prepared to integrate author navigation git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5992 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	99bf0b8e41	refactoring of plasmaWordIndex: divided that class into three parts: - the peers object is now hosted by the plasmaSwitchboard - the crawler elements are now in a new class, crawler.CrawlerSwitchboard - the index elements are core of the new segment data structure, which is a bundle of different indexes for the full text and (in the future) navigation indexes and the metadata store. The new class is now in kelondro.text.Segment The refactoring is inspired by the roadmap to create index segments, the option to host different indexes on one peer. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5990 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	fec6f9054f	some refactoring of search methods git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5988 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	63a0255166	- refactoring: added new content package, which will contain connector classes for different types of data sources to import texts into the YaCy index - refactoring: migrated data objects for the new connector classes - added a DAO interface class to specify an abstract interface for database retrieval connector methods git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5977 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	c8624903c6	full redesign of index access data model: terms (words) are not any more retrieved by their word hash string, but by a byte[] containing the word hash. this has strong advantages when RWIs are sorted in the ReferenceContainer Cache and compared with the sun.java TreeMap method, which needed getBytes() and new String() transformations before. Many thousands of such conversions are now omitted every second, which increases the indexing speed by a factor of two. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5812 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	89ec3acb3e	- full abstraction of index content type: the kelondro full text index may now also contain indexes about other content than text, i.e. navigation indexes or reverse linking indexes. - during index joins all word positions are maintained: better ranking for word distance possible; exact phrase match can be implemented soundly git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5804 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	7ba078daa1	- added fast site-operator - refactoring merge into BLOBArray git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5770 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	83792d9233	more refactoring git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5722 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	7f67238f8b	refactoring of plasmaWordIndex: less methods in the class, separated the index to CachedIndexCollection git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5710 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	14a1c33823	refactoring of wordIndex class git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5709 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	e2e7949feb	replaced old PPM computation with a better one that simply sums up events that had been stored in the profiling table. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5706 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	6ffc6e3389	more refactoring of indexer and kelondro classes; - integrating the indexer into kelondro as package 'text' - renaming of classes in kelondro.index git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5663 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	404bc21da9	simplification of (internal) query process / refactoring git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5662 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	76ef5f0f14	refactoring of index package: better names for the classes (to be continued) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5661 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	c25c334b75	replaced old DHT transmission method with new method. Many things have changed! some of them: - after an index selection is made, the index is split into its vertical components - from different index selections the split components can be accumulated before they are placed into the transmission queue - each split chunk gets its own transmission thread - multiple transmission threads are started concurrently - the process can be monitored with the blocking queue servlet To implement that, a new package de.anomic.yacy.dht was created. Some old files have been removed. The new index distribution model using a vertical DHT was implemented. An abstraction of this model is implemented in the new dht package as interface. The freeworld network has now a configuration of two vertical partitions; sixteen partitions are planned and will be configured if the process is bug-free. This modification has three main targets: - enhance the DHT transmission speed - with a vertical DHT, a search will speed up. With two partitions, two times. With sixteen, sixteen times. - the vertical DHT will apply a semi-dht for URLs, and peers will receive a fraction of the overall URLs they received before. With two partitions, the fraction will be one half. With sixteen partitions, a 1/16 of the previous number of URLs. BE CAREFUL, THIS IS A MAJOR CODE CHANGE, POSSIBLY FULL OF BUGS AND HARMFUL THINGS. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5586 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	83ce65707a	(almost) completed partition of classes in kelondro git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5543 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	7ee494fde5	more refactoring of kelondro: - separated BLOB from table classes - renamed 'coding' package to 'order' git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5542 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	bf93767ec6	refactoring of kelondro database classes (to be continued) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5540 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	47292e696a	more performance hacks git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5379 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
lotus	1951d30a62	addendum to last commit handle words with length < 3 correctly git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5369 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	d0bdcdd57c	small changes to attributes of DoS attack protection parameters git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5246 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	820a03f9d6	- removed some warnings - used fix in SVN 5233 for ysearch.java and search.java git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5237 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	00c1535f84	added ranking and evaluation of language type in a search. The wanted language is taken from the browser user-agent string git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5192 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	536e77e8b7	modifications towards a single database operation to read/write http header and cached file at once: - removed distinction between header file types for http and ftp; ftp is simulated by using http properties - removed all old resourceInfo classes that handled this distinction - introduced a new distinction between http request and http response objects - unified new response objects with two other object types that had been introduced elsewhere - changed all servlet call methods to use the new http request header object type - divided static object keys for http header properties into request and response types - refactoring here and there (a large number of type changes and many methods merged/moved) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5079 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
danielr	621b473b18	* removed some warnings of findbugs (http://findbugs.sf.net ) - removed unnecessary code (unused variables, String.toString) - corrected some calculations (cast int to double or long ;) - improved little performance (using Integer.valueOf() instead of new Integer) - log if some File-actions fail (mkdir(), delete(), ...) and some ignored exceptions - finalized some (more) fields - finally close some streams - made inner classes static if not using environment - generalized some equals (from specificClass to Object) - fixed some potential nullpointer accesses git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5039 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
danielr	3bb870bfcd	added final where possible git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5030 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
danielr	7feae906aa	- organize imports - removed potential null pointer accesses - removed unnecessary casts git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4893 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	0e1c7dfaaf	small fix for testing environment that should not affect a production environment git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4886 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	40d7f485f3	- fixed several NPE bugs - fixed losing of own seed hash (hopefully) - fixed a bug with crawl starts beginning with (bookmark) files - added better IP recognition during hello process git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4882 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	2f381b8d7a	- fixed at least two causes for a NPE after a use case switch. A large refactoring was necessary - added another crawl start option: automatic restriction to sub-path - removed crawlStartSimple and renamed crawl start expert to crawl start (without expert) - some changes to texts in crawl start - added some more deletions when a web index is deleted: delete also queues and robots cache git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4881 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	11e00a0849	- refactoring of seedURL handling - additional check for seedURL pointing to localhost: deny such peers git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4851 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	0c173821fd	more access security regarding database access and snippet retrieval: restrict number of results for not-authorized searchers git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4838 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	3aa69dab94	prevent too high search request frequency submitted from the same peer git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4813 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	cfe6790498	- added option to switch between yacy networks, especially between the two default networks (freeworld and intranet), from the ConfigNetwork online interface - to make this possible, a large refactoring and reorganisation of data structures was necessary git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4803 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	b32736762c	enhanced rssTerminal - 3 lines possible - distinguishing of private and public data, if not authorized only public data is shown - shows now more events, including local searches in clear text if user is logged in - simplified peer events - better recognition of 'real' new peers - presentation of peer pings from other peers git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4771 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	d2ba1fd2ab	major step forward to network switching (target is easy switch to intranet or other networks .. and back) This change is inspired by the need to see a network connected to the index it creates in an indexing team. It is not possible to divide the network and the index. Therefore all control files for the network were moved to the network within the INDEX/<network-name> subfolder. The remaining YACYDB is superfluous and can be deleted. The yacyDB and yacyNews data structures are now part of plasmaWordIndex. Therefore all methods, using static access to yacySeedDB had to be rewritten. A special problem had been all the port forwarding methods which had been tightly mixed with seed construction. It was not possible to move the port forwarding functions to the place, meaning and usage of plasmaWordIndex. Therefore the port forwarding had been deleted (I guess nobody used it and it can be simulated by methods outside of YaCy). The mySeed.txt is automatically moved to the current network position. A new effect causes that every network will create a different local seed file, which is ok, since the seed identifies the peer only against the network (it is the purpose of the seed hash to give a peer a location within the DHT). No other functional change has been made. The next steps to enable network switching are: - shift of crawler tables from PLASMADB into the network (crawls are also network-specific) - possibly shift of plasmaWordIndex code into yacy package (index management is network-specific) - servlet to switch networks git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4765 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	d0b893523e	- protection against RAM overflow caused by new peer rss news - more XSS protection git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4742 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	9935e83c86	added new news window into the status page. At this moment it is just a test. The news inside the window are about peer arrivals and departures, remote search accesses and crawls git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4739 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	d6050b9ffb	- separated the LURL data storage and Crawl result stack for process supervision. this is another step to enable multiple, concurrent fulltext-indexes - another try to make the yacy-httpc more stable git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4602 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	541b817502	refactoring of switchboard queueing git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4591 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	5530b8e1ca	reverted changes to yacy protocol classes: they caused the sciencenet to lose connections. A comparison with the main release 0.57 had been made: this showed a stable network. This is an emergency operation to ensure availability of the sciencenet network. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4553 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	b4ed937f1e	- modified zone navigation (does still not work correctly) - added dht switch in network definition - 0.574 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4550 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	727feb4358	- fixed some bugs in ranking computation - introduced generalized method to organize ranked results (2 new classes) - added a post-ranking after snippet-fetch (before: only listed) using the new ranking data structures - fixed some missing data fields in RWI ranking attributes and correct hand-over between data structures git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4498 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	61a81820e3	- refactoring of search tracker - added link to search history to repeat the search git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4493 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
borg-0300	22485dcca8	rename opeer -> oseed git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4462 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
borg-0300	77ba446332	seedDB helpers update/cleanup git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4461 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	bd63999801	- faster search: using different data structures that avoid multiple calculations - no more table copy for error-eco table - optional table copy for lurl-entries - more abstractions (less single constant strings) - better logging (using host names instead of ips) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4459 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
borg-0300	d1758eb17d	mistake corrected cleanup git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4454 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
borg-0300	85a82950e0	seedDB helpers git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4447 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	7404256997	- no more search time-out! - fixed a bug with last commit git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4430 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	a8a5df4a51	- more dublin core naming of page metadata - better presentation of result counters in search results git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4420 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	45339c3db5	more generics git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4341 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	a5054c038d	- added large number of generics - redesign of ordering structures in kelondro (old did not work with strict generics) - 50% IO reduction during read access on kelondroFlex (omitting of read on index table) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4320 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	ecd7f8ba4e	- added NEAR operator (must be written in UPPERCASE in search query) - more generics - removed unused commons classes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4310 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
fuchsi	21b8d1b918	small cosmetic change for static fields in serverCore (special protocol ASCII entities) to improve readability git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4275 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	270d016d89	fix for missing anonymization in search profiling git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4274 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	f243e338cf	implemented online caution also for local and remote search git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4252 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	b46bcaa5d8	changed method of profiling git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4248 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	cc20870267	fix for constraint handover problem: old yacy versions set a catchall-constraint if no constraint is given, but the new versions expect a null-constraint. see http://forum.yacy-websuche.de/viewtopic.php?f=5&t=565&hilit= git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4238 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	2fcd18a972	- fixed bad behaviour of search event worker processes - fixed export of url lists in xml git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4229 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	c48b73cda2	redesign of ranking data structure - the index administration now uses the same code base for url selection and collection as the search interface. The index administration is therefore a good test environment for ranking order control - removed old postsorting-algorithms, will be replaced with new one - fixed many bugs occurred before during ranking; especially the constraint filtering method removed too many links - fixed media search flags; had been attached to too many urls. The effect should be a better pre-sorting before media load within snippet fetch git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4223 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	6f1308da2f	- some enhancements to IndexControlURLs (shows more links, connects referrer to another query) - some refactoring to search process git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4222 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	c527969185	- enhanced monitoring of ranking parameters for details, please try http://localhost:8080/IndexControlRWIs_p.html - fixed computation of ranking ordering in some cases git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4220 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
fuchsi	0e1738899f	* Complete number localization and provide a more reasonable interface to serverObjects: - put(key, value) methods are now used if a value added to the map should be kept as it is. Numbers are transformed (but not formatted) to an equivalent String representation. - putASIS(...) have been removed, now done with simple put(...) (see above). - putNum(...) can be used for number values which should be stored in a formatted way, either depending on the current locale setting for yacy (default) or in a "none" locale (see javadocs and setLocalize()). - putHTML(...) escapes special characters into corresponding HTML entities ('<' => '&lt;') which was done with put(...) before and so was called too often, because it is necessary only for very few cases. Additionally there is a "forXML" mode which only replaces < > & ". In short: Use put(...) for almost everything, use putXY(...) if you need some special transformation of the value. A few bugs have been fixed as well, and there should be a small performance improvement for complex pages with a lot of values. * added additional Sum/Avg rows to access tracker pages, see http://forum.yacy-websuche.de/viewtopic.php?f=5&t=456 * removed duplicate code (mostly related to the big changes above). TODO: - make sure, number formats work as expected _everywhere_, report overseen stuff http://forum.yacy-websuche.de/viewtopic.php?f=5&t=437 - probably a good idea to add special putDate() methods as they are used in many pages and create duplicated formatting code + maybe some centralized handling for memory value formatting. - further improve the speed of page creation for the WatchCrawler. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4178 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
fuchsi	f717beecb1	- Changed yFormatter handling to be more flexible and produce more readable code for server pages. There are serverObject.putNum() methods to allow adding of number type values in a formatted form, and put() methods for number types that add them without formatting. This reduces the need to transform them into Strings in server pages and removes the HTML encoding step which is unnecessary for numbers. - some minor code cleanups (mostly unnecessary casts, null checks) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4166 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	01e0669264	re-designed some parts of DHT position calculation (effect is the same as before) and replaced old first hash computation by new method that tries to find a gap in the current dht. To do this, it is necessary that the network bootstrapping is done before the own hash is computed. This made further redesigns in peer initialization order necessary git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4117 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	341f7cb327	steps to enhance remote search performance: - added a file size limitation, that disallows parsing of large documents during (offline-) remote search - added profiling information to search result computation, visible at search access tracker. this info shows used time for URL fetch and snippet computation git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4112 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	f4a5c287fe	re-implemented post-ranking of search results (should enhance search result quality) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4080 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	8ff5e2c283	- fixed/re-implemented media search - fixed search tips (topwords, now appearing at the bottom of the page) - added search consequences execution (deletion of bad references some time after the search happened) - added some formatting at network table git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4078 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	daf0f74361	joined anomic.net.URL, plasmaURL and url hash computation: search profiling showed, that a major amount of time is wasted by computing url hashes. The computation does an intranet-check, which needs a DNS lookup. This caused that each urlhash computation needed 100-200 milliseconds, which caused remote searches to delay at least 1 second more that necessary. The solution to this problem is to attach a URL hash to the URL data structure, because that means that the url hash value can be filled after retrieval of the URL from the database. The redesign of the url/urlhash management caused a major redesign of many parts of the software. Since some parts had been decided to be given up they had been removed during this change to avoid unnecessary maintenance of unused code. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4074 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	e90afa9483	fixed search access tracker git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4072 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	4779f314fe	first version of next-generation search interface: - snippets are not fetched by browser using ajax, they are now fetched internally - YaCy-internal threads control existence of snippets and sort out bad results - search results are prepared using SSI includes - the search result page is visible right after the search request, the results drop in when they are detected - no more time-out strategy during search processes, results are shifted within queues when they arrive from remote peers - added result page switching! after the first 10 results, the next page can be retrieved - number of remote results is updated online on the result page as they drop in - removed old snippet servlet (which had been also a security leak btw) - media search is broken now, will be redesigned and fixed in another step git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4071 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	f9e6cf6a3d	more refactoring of search: integrated first version of ssi-using search interface, but the function is currently disabled git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4063 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	a34d9b8609	* added a search history cache that maintains search results for 10 minutes. It is necessary for the new search process that will do automatic re-searches. A positive effect is, that when a re-search is done it can be monitored how many results had been contributed from other peers. The message for this contribution was moved from the end of the result page to the top. * enhanced re-search time when a global search was done and the local index has already a great number of results for this word * re-organised presearch computation; must be further enhanced git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4059 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	ae86d010bb	more refactoring of search processes; also some small speed enhancements git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4058 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	16c203f759	fixed remote search access tracker git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4048 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	947fc46904	refactoring of search process: - re-designed remote request result processing - re-designed local result accumulation, will be further enhanced with snippet fetcher - removed search process handling in switchboard - made snippet class static (there is no need for multiple snippet objects) - removed some redundant tasks in server-side search process, should be a little bit faster now git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4043 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	1af0e3bd84	refactoring git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4031 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	5605887571	refactoring of search processes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4030 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	e76fe1c078	- replaced unicode characters in copyright holder name ('Brausse') - more logging for bootstrap seedlist loading - larger DHT chunks git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4015 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	f40566f9bb	separate YaCy networks: - added server-side network unit identification - added server-side network access authorization - enhanced client-side network authentification essentials generation - implemented first peer-peer salted-magic authentification method git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3953 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	f04add6cb4	limitation of remote search result number git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3880 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	81844e85b2	- fixed more cluster routing problems - fixed a problem in remote search when balancer caused shift process to wait too long git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3627 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	f8de19fb2f	robinson cluster: added client-side protocol implementation - the network configuration page shows a new option: robinson clusters - when a global search is made, all robinson peers are excluded, but: - robinson peers/clusters that provide peer tags and where search words match such tags, they are included in global search. Therefore, robinson peers/clusters support the global yacy network with their indexes, without doin DHT-exchange git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3598 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	657585fe0d	network functions for robinson peers: server-side protection git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3591 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	2e052eb816	fixed a bug in remote search with remote search tracker git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3565 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	b79b4082e2	completed search exclusion: - exclusion on index-level (not only from search snippets) - exclusion hand-over at remote search protocol git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3556 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	40c14a4f0e	- better implementation of search query properties - basic protection against start-up problems when database files are corrupted - auto-delete of not-critical databases during startup when load error occurs - on-the-fly reset option for all database tables - automatic on-the-fly reset for seed tables during enumeration exceptions git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3547 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	30d79d69a6	fix for wrong display of search statistics see http://www.yacy-forum.de/viewtopic.php?p=31242#31242 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3352 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	c464157a6e	replaced some toString() see http://www.yacy-forum.de/viewtopic.php?p=31151#31151 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3345 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	c2d6edf21d	integrated number of remote targets as 'partitions' into remote search protocol git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3317 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	4f6eed5623	QPM increment git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3309 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	f3f99b19c6	extended search statistics git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3249 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	c0851ee943	refactoring: moved and renamed de.anomic.data.searchResults to plasma package git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3248 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	76fab83395	fixed bugs in seach statistics git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3240 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
allo	0c81bd39d4	XSS-safe put as default. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3217 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	52c6461e6b	some bugfix for statistics git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3211 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
(no author)	fe72b772cf	added a monitor page for search requests git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3206 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	0a050bc043	enhanced ranking - redesign of data storage in plasmaSearchRankingProfile - profiles are extended by new ranking parameters - new RWI ranking parameters are considered during ranking - appearance attributes (i.e. emphasised text) is now considered - faster ranking - some attributes that had been checked during post-ranking can now be checked during pre-ranking phase - removed old ranking parameter on index.html page (will be replaced by profiles in the future) - ranking can now consider appearances of media content - snippet-loading for media types now work correctly (fetches only from the wanted media) - ranking-profiles can be handed over the remote peers and apply there also - re-search of same query with different domain now also re-triggers remote search git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3105 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	1377c53aa3	extraction of media links from search results these links are mixed to the snippets for testing purpose (a final version will handle this differently) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3069 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	109ed0a0bb	- cleaned up code; removed methods to write the old data structures - added an assortment importer. the old database structures can be imported with java -classpath classes yacy -migrateassortments - modified wordmigration. The indexes from WORDS are now imported to the collection database. The call is java -classpath classes yacy -migratewords (as it was) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3044 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	ad1e4aa88e	added selection of audio, video, image and application resources to search procedure. This function can currently not used through the search interface, but only through remote search. added accumulation of search attributes to enable the audio, video, image and application selection. fixed a problem with external URL representation generation git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3036 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	ceb9e3aa17	- enhanced parser: collection of audio, video, image and application links - enhanced condenser: better handling of utf-8 and pre-formatted texts git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3017 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	8e7215475b	- extended ViewFile to use is as debugging-tool: you can now use the post-parameter url to submit an url directly - fixed some bugs in text parser (not all parts had been analysed) - fixed a bug in remote search interface (could not handle constraints) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3001 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	30888e7a2f	implementation of search constraints Such constraints may formulate specific restrictions to web searches This is implemented by scraping information for constraints from a web page during parsing, and storing flags to the pages within the web index. In this first step, only information for index pages ("index of", directory listings) are scraped and stored in flags - added new flag class kelondroBitfield - added scraper method in condenser - added bitfield structure for all scrape types (see also condenser) - added bitfield structure for appearance locations (see RWIEntry) - added handover protocol for remote search and index distribution - extended kelondroColumn class to hold bitfield types - added another search attribute on search page (index.html) - extended search-filter to enable filtering of non-matching constraints - set all new database types to be default - refactoring: moved word hash generation to condenser class git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2999 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	497428c8ec	refactoring git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2949 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	bb7d4b5d5e	refactoring to prepare new RWI entry object - moved all url and index(RWI) entries to index package - better naming to distinguish RWI entries and URL entries git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2937 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	8fdefd5c68	generalization of payload definition of index storage this is one step forward to the migration to a new collection data format git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2912 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	d3431433b0	more anonymization in logging git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2876 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	b79e06615d	- added new LURL.Entry class for next database migration - refactoring of affected classes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2802 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	a5dd0d41af	- refactoring of plasmaCrawlLURL.Entry to prepare new Entry format - added test migration method to migrate the old LURL to a new LURL the new LURL will be splitted into different tables for each month this solves several problems: - the biggest table in YaCy is splitted in different parts and can also be managed in filesystems that are limited to 2GB - the oldest entries can easily be identified, used for re-crawl und deleted - The complete database can be limited to a specific size (as wanted many times) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2755 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	afbb547f3d	extended options for abstracts generation in remote search interface git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2739 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	2e4aa6a170	refactoring of Advanced Config: - removed settings that are in Basic Settings - joined pages that belong together - moved include pages from yacy/ to / git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2726 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	dbc2e039bb	added time-out option parameter to call hierarchy git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2691 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	00746ca232	identified and fixed search performance problem caused by snippet loading. Some access to header-db had been twice and even more times in some cases. Snippet resource loading fixed. Furthermore the snippet loading during remote search within the remote peer has been disabled, but can be switched on remotely by new flag 'includesnippet=true' git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2688 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	df1629b05a	- code cleanup - version 0.471 - moved surftipps to own web page git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2676 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	3aac5b26da	- added automatic tag generation when a web page from the search results is added - added new image 'B' in front of search results for bookmark generation - added news generation when a public bookmark is added - the '+' in front of search results has new meaning: positive rating for that result - added news generation when a '+' is hit git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2613 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	6e2907135a	bugfixes for remote search server part git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2573 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	cf9884e22b	first attempt to implement a secondary search this is a set of search processes that shall enrich search results with specialized requests to realize a combination of search results from different peers. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2571 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	75b198bc02	- updated references to indexContainer - more bugfixes and debugging for indexAbstract processing git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2555 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	4f9e42d5ed	more changes towards better join-search - fixed problems with index-abstract generation - added analysis output for index abstract receive git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2551 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	82a6054275	- fixed bug with new indexAbstract generation - added partly evaluation of indexAbstracts during remote searches git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2544 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	74d1dea30b	changes towards better join-search - added generation of a compressed index within remote peers during global search - added selection of specific urls within remote peers during secondary global search git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2539 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	96c6e4e322	- enhancements to detailed search page - enhancements to search ranking computation process - removed bugs in postranking git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2516 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	8a0e35618b	enhancements to search result preparation - added detailed count on remote search results - enhanced search sequence during remote searches (doing local search in sequence) - strict adherence to timout limits git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2497 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	a474669338	start with refactoring of index management git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2110 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	dbe96e6541	added hand-over of search filter and prefer ranking to yacy protocol git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2029 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	00a5d435e2	- fixed some bugs with domain filter - added new ranking filter "prefermask": urls that match the filter are ranked better git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2022 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	bd283b8443	fixed bugs: - null pointer exception during startup of a robinson-configured peer - wrong time calculation of default value of re-crawl option git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2005 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	3286b1f498	re-organisation of lurl-creation and -stacking this was necessary to prevent useless write to the database in case of blacklist appearance of the url git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1905 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	bae3783d38	added a snippet marking (search words are now bold in snippets) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1823 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	fb7411d7bb	re-structuring of ranking application: concentration of all ranking attributes in the plasmaSearchRankingProfile git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1541 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	d98418390b	- introduced rankingProfile Class - selection of ranking and timing profiles for each search git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1539 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	03c65742ba	changes towards the new index storage scheme: - replaced usage of temporary IndexEntity by EntryContainer - added more attributes to word index - added exact-string search (using quotes in query) - disabled writing into WORDS during search; EntryContainers are used instead git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1485 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	f4ffa9aee5	- implemented more attributes to index entries - implemented hand-over of new word index attributes during remote search - implemented word-distance computation during search git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1382 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	37f88b4017	code cleanup git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1176 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	7920e1547d	code cleanup git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1163 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	b9cc9029e3	added ybr selection for remote search git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1119 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
borg-0300	e642a5d8b7	more constants git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@947 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	d29dfb0a12	refactoring of search / preparation for better search methods git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@921 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
borg-0300	a1777788a5	small change git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@879 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
borg-0300	52168fab9b	cleaned, finals, Properties git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@856 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	a2fa75e688	*) Asynchronous queuing of crawl job URLs (stackCrawl) various checks like the blacklist check or the robots.txt disallow check are now done by a separate thread to unburden the indexer thread(s) TODO: maybe we have to introduce a threadpool here if it turn out that this single thread is a bottleneck because of the time consuming robots.txt downloads *) improved index transfer The index selection and transmission is done in parallel now to improve index transfer performance. TODO: maybe we could speed up performance by unsing multiple transmission threads in parallel instead of only a single one. *) gzip encoded post requests it is now configureable if a gzip encoded post request should be send on intex transfer/distribution *) storage Peer (very experimentell and not optimized yet) Now it's possible to send the result of the yacy indexer thread to a remote peer istead of storing the indexed words locally. This could be done by setting the property "storagePeerHash" in the yacy config file - Please note that if the index transfer fails, the index ist stored locally. - TODO: currently this index transfer is done by the indexer thread. To seedup the indexer a) this transmission should be done in parallel and b) multiple chunks should be bundled and transfered together *) general performance improvements - better memory cleanup after http request processing has finished - replacing some string concatenations with stringBuffers - replacing BufferedInputStreams with serverByteBuffer - replacing vectors with arraylists wherever possible - replacing hashtables with hashmaps wherever possible This was done because function calls to verctor or hashtable functions take 3 time longer than calls to functions of arraylists or hashmaps. TODO: we should take a look on the class serverObject which is inherited from hashmap Do we realy need a synchronization for this class? TODO: replace arraylists with linkedLists if random access to the list elements is not needed *) Robots Parser supports if-modified-since downloads now If the downloaded robots.txt file is older than 7 days the robots parser tries to download the robots.txt with the if-modified-since header to avoid unnecessary downloads if the file was not changed. Additionally the ETag header is used to detect changes. *) Crawler: better handling of unsupported mimeTypes + FileExtension *) Bugfix: plasmaWordIndexEntity was not closed correctly in - query.java - plasmaswitchboard.java *) function minimizeUrlDB added to yacy.java this function tests the current urlHashDB for unused urls ATTENTION: please don't use this function at the moment because it causes the wordIndexDB to flush all words into the word directory! git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@853 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	7fc822a59b	changed handling of time-zones git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@801 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	4fd5b95b1f	*) Renaming Logger function names to reflect the proper Java Logging API Loglevels - please use logFine instead of logDebug - please use logSevere instead of logFailure and logError See: http://www.yacy-forum.de/viewtopic.php?p=8726#8726 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@615 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	6adf8a4bde	*) Renaming Logger function names to reflect the proper Java Logging API Loglevels - please use logFine instead of logDebug - please use logFailure instead of logError See: http://www.yacy-forum.de/viewtopic.php?p=8726#8726 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@614 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	2d8557cb10	minor changes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@487 6c8d7289-2bf4-0310-a012-ef5d649a1542	20 years ago
rramthun	85c2f3be8a	Fixed spelling mistakes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@110 6c8d7289-2bf4-0310-a012-ef5d649a1542	20 years ago
theli	e7f7aa0bb9	*) Import statements reorganized Now it's easier to determine which class really uses which other class *) Reogranizing Import Statements git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@83 6c8d7289-2bf4-0310-a012-ef5d649a1542	20 years ago
orbiter	248077d3f0	initial load with yacy 0.36 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1 6c8d7289-2bf4-0310-a012-ef5d649a1542	20 years ago

308 Commits (3d14fb51c5d2c66796d39fa6cece613c8df46531)