yacy_search_server

Commit Graph

Author	SHA1	Message	Date
orbiter	1f1399e5c5	extending visibility of objects and methods to avoid synthetic accessor methods and increase performance git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6156 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	154bbc3364	code cleanup: call of static methods directly to the class git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6155 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	222850414e	simplification of the code: removed unused classes, methods and variables git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6154 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	c5122d6836	completed migration of BLOBTree to BLOBHeaps: - removed migration code - removed BLOBTree. After the removal of the BLOBTree, a lot of dead code appeared: - removed dead code that was needed for BLOBTree. Some more classes may not have much use any more after the removal of BLOBTree, but still have some components that are needed elsewhere. Additional refactoring steps are needed to clean up dependencies, and then more code may appear that is unused and can be removed as well. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6150 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	ae015e8e98	refactoring of blob package classes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6088 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	ce1adf9955	serialized all logging using concurrency: high-performance search query situations as seen in yacy-metager integration showed a deadlock situation caused by synchronization effects inside of sun.java code. It appears that the logger is not completely safe against deadlock situations in concurrent calls of the logger. One possible solution would be an outside-synchronization with 'synchronized' statements, but that would further apply blocking on all high-efficient methods that call the logger. It is much better to do a non-blocking hand-over of logging lines and work off log entries with a concurrent log writer. This also disconnects IO operations from logging, which can also cause IO operations when a log is written to a file. This commit not only moves the logger from kelondro to yacy.logging, it also inserts the concurrency methods to realize non-blocking logging. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6078 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	27fa6a66ad	- completed the author navigation - removed some unused variables git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6037 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	c079b18ee7	- refactoring of IntegerHandleIndex and LongHandleIndex: both classes had been merged into the new HandleMap class, which handles (key<byte[]>,n-byte-long) pairs with arbitrary key and value length. This will be useful to get a memory-enhanced/minimized database table indexing. - added an analysis method that counts bytes that could be saved in case the new HandleMap can be applied in the most efficient way. Look for the log messages beginning with "HeapReader saturation": in most cases we could save about 30% RAM! - removed the old FlexTable database structure. It was not used any more. - removed memory statistics in PerformanceMemory about flex tables and node caches (node caches were used by Tree Tables, which are also not used any more) - added a stub for a steering of navigation functions. That should help to switch off navigation computation in cases where it is not demanded by a client git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6034 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	88426912ad	more refactoring to make the segment object easier to use and to be prepared to integrate author navigation git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5992 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	99bf0b8e41	refactoring of plasmaWordIndex: divided that class into three parts: - the peers object is now hosted by the plasmaSwitchboard - the crawler elements are now in a new class, crawler.CrawlerSwitchboard - the index elements are core of the new segment data structure, which is a bundle of different indexes for the full text and (in the future) navigation indexes and the metadata store. The new class is now in kelondro.text.Segment The refactoring is inspired by the roadmap to create index segments, the option to host different indexes on one peer. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5990 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	3d4b826ca5	migration of all databases that use the deprecated BLOBTree format into the BLOBHeap format. Old databases are migrated automatically. This removes the last very IO-intensive data structures which were still used for Wiki, Blog and Bookmarks. Old database files will still remain in the DATA subdirectory but can be deleted manually if no major bugs appear during migration. There is no need for any user action, all migration is done automatically. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5986 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	26a46b5521	increased default maximum file size for database files to 2GB Other file sizes can now be configured with the attributes filesize.max.win and filesize.max.other the default maximum file size for non-windows OS is now 32GB git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5974 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	e005cfea37	fix for bug in -incell option of URLAnalysis git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5967 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	a7e392f31b	The collection index will not be supported any more. Existing indexes based on the old index collections must be migrated with YaCy 0.8 - removed index collection classes and all migration tools - added an 'incell' reference collection feature in URL analysis git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5966 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
low012	ea27853c59	*) some refactoring *) added one assertion *) no functional changes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5935 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	89aeb318d3	enhanced the wikimedia dump import process enhanced the wiki parser and condenser speed git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5931 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	c097531e3d	added a catch Exception to all thread to check if any of them silently dies without any other notification git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5922 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
low012	ff5f82d780	*) removed description of removed commands from wikiHelp ([= =]) *) used format function of Netbeans for wikiCode to make it more readable, no functional changes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5907 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	9c6ac43f66	fixes for wiki parser git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5905 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
low012	78ffb61297	*) got rid of unnecessary variable which might also fix IndexOutOfBoundsException git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5902 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	d079d6dfdb	small changes in surrogate reader, wiki code and portal test git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5894 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
low012	f1244264b8	*) hopefully fixed bug reported in http://forum.yacy-websuche.de/viewtopic.php?t=2057 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5882 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
low012	d1116c049f	*) added new method "contains()" to Blacklist interface *) implemented contains() in class AbstractBlacklist *) used new method in Blacklist_p to prevent double entries in blacklists git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5832 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	c8624903c6	full redesign of index access data model: terms (words) are not any more retrieved by their word hash string, but by a byte[] containing the word hash. this has strong advantages when RWIs are sorted in the ReferenceContainer Cache and compared with the sun.java TreeMap method, which needed getBytes() and new String() transformations before. Many thousands of such conversions are now omitted every second, which increases the indexing speed by a factor of two. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5812 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	d4d87d90c4	- extended experimental wikipedia dump parser - removed historic, possibly unused code from wiki parser that was in conflict with actual wikipedia wiki code git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5790 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	c08f9b36a4	refactoring of wiki parser. This was done to prepare the wiki parser as parser for wikipedia dumps, which will be used for performance test (to omit crawling) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5785 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
low012	9180617dd9	*) Classes to handle import of lists (especially blacklists) from XML files, not used yet, but will be used soon. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5780 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	c2359f20dd	refactoring: better abstraction of reference and metadata prototypes. This is a preparation to introduce other index tables as used now only for reverse text indexes. Next application of the reverse index is a citation index. Moved to version 0.74 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5777 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	96eaecda3e	- added migration class to go from index collections to the index cell data structure. - added better control over file deletion, because this sometimes fails, especially on windows git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5756 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	7dff1cba62	removed option to use different primary keys in kelondro tables this option was never used and there is also no use to set other columns but the first as the primary key. as a result, access methods to the key do not need to compute key positions, and they work faster. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5711 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	14a1c33823	refactoring of wordIndex class git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5709 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	d49238a637	more performance hacks: better default values for scaling, less memory usage git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5708 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	d988204875	better shutdown of tools git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5695 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	100247bdda	added also an export and delete-feature to the URLAnalysis. This completes the clean-up feature for URLs. To do a complete clean-up of the url database, start the following: java -Xmx1000m -cp classes de.anomic.data.URLAnalysis -incollection DATA/INDEX/freeworld/TEXT/RICOLLECTION used.dump java -Xmx1000m -cp classes de.anomic.data.URLAnalysis -diffurlcol DATA/INDEX/freeworld/TEXT used.dump diffurlcol.dump java -Xmx1000m -cp classes de.anomic.data.URLAnalysis -export DATA/INDEX/freeworld/TEXT xml urls.xml diffurlcol.dump java -Xmx1000m -cp classes de.anomic.data.URLAnalysis -delete DATA/INDEX/freeworld/TEXT diffurlcol.dump The export-feature is optional, the purpose of that function is to provide a back-up function for URLs to be deleted. The export function can also be used to create html files with embedded links and simple text-files. Simply replace the 'xml' word with 'html' or 'text'. The last argument in the call, the diffurlcol.dump value, can also be omitted. This causes the complete URL database to be exported. This is an alternative to the Web-Interface based export function. The delete-feature is the only destructive method of the four presented here. Please use it with care. It is better to make a back-up of the url database files before starting the deletion. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5694 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	60078cf322	added next tool for url analysis: check for references that occur in the URL-DB but not in the RICOLLECTIONS. To use this, you must use the -incollection command before (see SVN 5687) and you need a used.dump file that has been produced with that process. Now you can use that file to do a URL-hash compare with the urls in the URL-DB. To do that, execute java -Xmx1000m -cp classes de.anomic.data.URLAnalysis -diffurlcol DATA/INDEX/freeworld/TEXT used.dump diffurlcol.dump or use different names for the dump files or more memory. As a result, you get the file diffurlcol.dump which contains all the url hashes that occur in the URL database, but not in the collections. The file has the format {hash-12}* that means: 12 byte long hashes are listed without any separation. The next step could be to process this file and delete all these URLs with the computed hashes, or to export them before deletion. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5692 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	dbdd10da84	better logging and startup behaviour for referenceHash computation git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5690 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	d64836c34f	added statistical analysis of URL reference use that with the following command on a linux shell: java -Xmx1000m -cp classes de.anomic.data.URLAnalysis -incollection DATA/INDEX/freeworld/TEXT/RICOLLECTION used.dump for freeworld indexes. For more details please see discussion below: http://forum.yacy-websuche.de/viewtopic.php?p=13204#p13204 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5687 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	b80db04667	- refactoring of IntegerHandleIndex and LongHandleIndex (better method names) - fix for problem in httpdFileHandler: missing close of open Files if template cache was disabled - more memory for DHT selection required - stub for URL reference hash statistics in index collections git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5682 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	efcd95dc37	simplification of (internal) query process / refactoring git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5671 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	f1b712c29a	small corrections to image loading methods in result presentation especially loading of favicons in search results. This is a fix that affects only searches in intranet/repository configurations. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5670 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	aa44d9bad9	more refactoring of kelondro.text / deleted de.anomic.index git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5664 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	6ffc6e3389	more refactoring of indexer and kelondro classes; - integrating the indexer into kelondro as package 'text' - renaming of classes in kelondro.index git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5663 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	76ef5f0f14	refactoring of index package: better names for the classes (to be continued) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5661 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	d1d9fbae5c	enabling the URLAnalysis to operate on multiple input files, just use a wild card when calling the class from the command line git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5658 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	7ea53fe47b	added another url list transformation option: - check the list and kick out entries with lines that contain not valid urls - normalize the urls - remove doubles - sort the list - split the list in smaller chunks This is all done in one process which can be called with a new -sort option git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5655 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	54625360f7	performance update git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5653 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	d884c4718a	added gzip support for URLAnalysis: url lists can also be compressed with gzip If such a file is handed over to URLAnalysis, the output will also be written as .gz-file git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5652 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	cf9b74e6e3	added another method to process url lists: extract hosts only. This can be used like java -Xmx2000m -cp classes de.anomic.data.URLAnalysis -host DATA/EXPORT/20090224213823.txt changed also the call method to generate statistics, please use now java -Xmx2000m -cp classes de.anomic.data.URLAnalysis -stat DATA/EXPORT/20090224213823.txt git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5650 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	89d8e824ed	memory protection for URLAnalysis git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5649 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	0f6fa804ff	performance update to URLAnalysis git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5648 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	e8f5f2f612	added tool to analyse url strings and to generate statistics about words occurring in urls git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5646 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	c12bb8a6d0	- refactoring of the http client - added a protection against memory leaks for the access tracker git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5621 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	411f2212f2	more memory leak fixing hacks git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5599 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	333489420b	- fix for NPE when loading the cytag image - some hacks for less memory usage: -- less usage of buffer and cache memory in EcoFS -- buffer allocation on-demand in BufferedIOChunks -- removed largest ybr idx git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5595 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	c25c334b75	replaced old DHT transmission method with new method. Many things have changed! some of them: - after an index selection is made, the index is split into its vertical components - from different index selections the split components can be accumulated before they are placed into the transmission queue - each split chunk gets its own transmission thread - multiple transmission threads are started concurrently - the process can be monitored with the blocking queue servlet To implement that, a new package de.anomic.yacy.dht was created. Some old files have been removed. The new index distribution model using a vertical DHT was implemented. An abstraction of this model is implemented in the new dht package as interface. The freeworld network has now a configuration of two vertical partitions; sixteen partitions are planned and will be configured if the process is bug-free. This modification has three main targets: - enhance the DHT transmission speed - with a vertical DHT, a search will speed up. With two partitions, two times. With sixteen, sixteen times. - the vertical DHT will apply a semi-dht for URLs, and peers will receive a fraction of the overall URLs they received before. With two partitions, the fraction will be one half. With sixteen partitions, 1/16 of the previous number of URLs. BE CAREFUL, THIS IS A MAJOR CODE CHANGE, POSSIBLY FULL OF BUGS AND HARMFUL THINGS. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5586 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	94110df85a	moved logging partially to kelondro git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5545 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	024da2916b	refactoring of logging git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5544 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	83ce65707a	(almost) completed partition of classes in kelondro git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5543 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	7ee494fde5	more refactoring of kelondro: - separated BLOB from table classes - renamed 'coding' package to 'order' git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5542 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	bf93767ec6	refactoring of kelondro database classes (to be continued) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5540 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	fc27bf8c4c	refactoring of kelondro classes: kelondro shall become independent from other packages. moved bytebuffer, date and memory to kelondro git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5539 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
apfelmaennchen	3484e55be4	- small fix for bookmarksDB git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5527 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
apfelmaennchen	6dd52422ea	- added two dialogs to manage bookmark tags in YaCy-UI - fixed renameTag() in bookmarksDB - added /api/bookmarks/tags/addTag.xml - added /api/bookmarks/tags/editTag.xml git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5525 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
apfelmaennchen	3dc208fad0	bugfix: bookmarks can now handle folder names like /news and /newspaper without getting confused... git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5470 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
low012	f26b8fcb1b	*) comment mode is 'moderated' instead of 'activated' by default now (to avoid spam being visible) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5465 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	e004da48d3	- added fast fingerprint computation for files (any). Will be used in new index dump method - refactoring git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5415 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	7535fd7447	- refactoring of CrawlEntry and CrawlStacker - introduced blocking queues in CrawlStacker to make it ready for concurrency - added a second busy thread for the CrawlStacker The CrawlStacker is multithreaded. It shall be transformed into a BlockingThread in another step. The concurrency of the stacker will hopefully solve some problems with cases where DNS blocks. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5395 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
lotus	18513e2ee2	npe fix: http://forum.yacy-websuche.de/viewtopic.php?t=1646 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5393 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	e1acdb952c	fix for problem with userDB and bookmarksDB which was caused by changes in kelondroRA in SVN 5376 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5385 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	47292e696a	more performance hacks git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5379 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	d39d420b39	performance hacks git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5376 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	0b4808ba3d	added new interactive search feature: - while the user types search queries, the local database is searched - results are presented interactively This was implemented using a new JSON result format for search results in YaCy - added JSON as file format for servlets - refactoring of current search servlets (xml and html) - added JSON output format for search results - added AJAX-based search page that uses the yacysearch.json servlet to print results as a query is typed git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5373 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
low012	e423fa9846	*) added method to only get file names in directory listing which match a filter *) only files which end with .black will be listed as blacklists *) added a little bit of Javadoc git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5366 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
orbiter	dba7ef5144	extended crawling constraints: - removed never-used secondary crawl depth - added a must-not-match filter that can be used to exclude urls from a crawl - added stub for crawl tags which will be used to identify search results that had been produced from specific crawls. Please update the yacybar: replace property name 'crawlFilter' with 'mustmatch'. Additionally, a new parameter named 'mustnotmatch' can be used, which should be by default the empty string (match-never) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5342 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
f1ori	7e1fe05e3c	* added utf8-encoding to many getBytes-calls * utf8 should work now git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5323 6c8d7289-2bf4-0310-a012-ef5d649a1542	16 years ago
low012	baae3d91b1	*) fixed warning when compiling listManager *) fixed display of values of information for which part of YaCy (crawler, proxy, ...) blacklist is activated for *) replaced regular put() with putXML() in several cases git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5305 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
low012	a99a629ed4	*) quick fix to prevent comments for blog entries which don't exist (http://forum.yacy-websuche.de/viewtopic.php?f=5&t=1554 ) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5302 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
low012	00e27e5050	*) fixed bug which made it possible to write files outside of the DATA/LIST directory when creating a new blacklist *) a blacklist will only be created if no blacklist with same name exists (some refactoring has been necessary for this) *) further minor fixes *) to be continued... git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5301 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	0edec2b760	FULL redesign of algorithms in htmlTools to encode/decode strings from/to unicode and html. The old process used a not really efficient way to detect html encoding strings in texts. All calling methods had been adapted to call the new class in an enhanced way with fewer parameters. Many classes in interfaces used an XML encoding only (instead of full html conversion from unicode to html); this behavior was not changed with this commit but should be controlled again since it points out possible XSS leaks git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5295 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	6fb865fbdc	- fix of bug in iterator in kelondroBLOBHeap which caused bug in crawl profile listing - some refactoring of classes that use kelondroMap (Map instead of HashMap) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5262 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
apfelmaennchen	b97ff24b43	bookmarksDB / xbel.xml: - added support for folder=/foldername - it crashes if foldername ends with / git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5207 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
lotus	0a0cc3bf67	added missing classes to build target "run" git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5201 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
lotus	a81cb78211	finally some putHTML on htroot/xml/ git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5188 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
apfelmaennchen	7b63c66a08	- bugfix in bookmarksDB.Tag.hasPublicItems() - this annoying little bug prevented display of public items without admin login for /xml/bookmarks/... git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5151 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	05dbba4bab	added logging conditions to all fine and finest log line calls this will prevent an overhead for the generation of the log lines in case that they then are not printed git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5102 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
apfelmaennchen	aa6ae77e5e	- autoReCrawl: fix for filter settings git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5088 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
apfelmaennchen	8ae29bad57	- fix to previous change of Crawl Profile Names git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5087 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
apfelmaennchen	434104e4a0	- change Crawl profile name for autoreCrawl git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5085 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
lotus	0df2e47012	changed auto recrawl to comply with new date format git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5083 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	536e77e8b7	modifications towards a single database operation to read/write http header and cached file at once: - removed distinction between header file types for http and ftp; ftp is simulated by using http properties - removed all old resourceInfo classes that handled this distinction - introduced a new distinction between http request and http response objects - unified new response objects with two other object types that had been introduced elsewhere - changed all servlet call methods to use the new http request header object type - divided static object keys for http header properties into request and response types - refactoring here and there (a large number of type changes and many methods merged/moved) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5079 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
apfelmaennchen	bd931a82f7	- added dynamic filters to autoReCrawl.conf - Restrict to sub-path: sub - Restrict to start-domain: dom git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5070 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
apfelmaennchen	b3fc5e96a3	- removed unused import from bookmarksDB git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5067 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
apfelmaennchen	bc048db7b6	- bugfix for bookmarksDB's rebuildDates() - dates are now saved as String.valueOf(TimeStamp) - it might be a good idea to delete (backup) bookmarkDates.db and restart YaCy to rebuild it git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5066 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
danielr	3c68905540	remove redundant null checks git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5065 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
danielr	753a1ae430	- changed default browser from netscape to firefox - fixed "Inefficient use of keySet iterator instead of entrySet iterator" [WMI_WRONG_MAP_ITERATOR, FindBugs] - fixed some possible null pointer accesses git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5063 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
danielr	be28af50f5	- fixed "yacy2yacy no proxy"-problem git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5058 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
danielr	621b473b18	* removed some warnings of findbugs (http://findbugs.sf.net ) - removed unnecessary code (unused variables, String.toString) - corrected some calculations (cast int to double or long ;) - slightly improved performance (using Integer.valueOf() instead of new Integer) - log if some File-actions fail (mkdir(), delete(), ...) and some ignored exceptions - finalized some (more) fields - finally close some streams - made inner classes static if not using environment - generalized some equals (from specificClass to Object) - fixed some potential nullpointer accesses git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5039 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
apfelmaennchen	0500b1179e	added a 2 min start up delay to serverBusyThread autoReCrawl to avoid a Null Pointer Exception... git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5035 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
apfelmaennchen	e1574fe02e	- added autoReCrawl folders to bookmarks (DATA/SETTINGS/autoReCrawl.conf) - the serverBusyThread checks folders every 60 min. (==> autoReCrawl_idlesleep in yacy.conf) - added option to create bookmarks from CrawlStart URL git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5033 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
danielr	17b7845eb5	* refactoring - moved constants from plasmaSwitchboard to own class (all 232 ;) - moved remoteProxy-Methods to httpRemoteProxyConfig, better names - removed some unnecessary code (else-statements) * formatting (correct indentation) * minor bugfixes (due to findbugs.sf.net) * hopefully fixed "missing quote" (announcing StringParts as UTF-8) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5031 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
danielr	3bb870bfcd	added final where possible git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5030 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	c3d461d191	- removed superfluous copyright statement - updated my email address git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5011 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	3ca98fee42	removed superfluous copyright statement git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5010 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	7b1c9e6aee	discovered and removed a (possibly large) memory leak: many classes used the kelondroMapDataMining (was: kelondroMapObjects) which adds statistical functions to the kelondroMap (was: kelondroObjects), but these functions were not used by these classes. Especially the HTCACHE and robots.txt database allocate a very large number of objects for statistical use, but never used them. By replacing the kelondroMapDataMining with the kelondroMap object for these classes now less memory is allocated. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4986 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	0f5fe8cc53	refactoring of method calling for objects from kelondroMapDataMining git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4985 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	4acf0a61cd	refactoring of kelondroObjects (mainly renaming to kelondroMap) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4982 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	f7aaeb3fad	created new main menu entry 'Customization and Integration' - moved some already existing servlets to this menu - renamed the skin servlet to appearance - added a set-to-default-button to the search page appearance setting - removed the peer profile servlet which is now replaced by a field in the new appearance servlet git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4980 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	1e6d12f146	Major update to BLOB data structures: - introduced a new BLOB file format: kelondroBLOBHeap. This is a flat file with an index in RAM. very similar to the eco-tables, but with flexible value sizes. It will replace the kelondroBLOBTree, which is based on a kelondroTree, a file-AVL-based index data structure. - the HTCACHE header file was replaced by the new blob heap file structure - the robots.txt file was replaced by the new blob heap file structure - the robots parser was enhanced (bugfixing for double-loading of the same robots.txt) - other BLOB-dependent data structures were prepared to use also the new BLOB heap - fixed a bug in the snippet fetch process: the file header was not written to the header index There should now be less IO during snippet fetch and during crawling git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4978 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	81f75f5056	- removed unnecessary classes (these objects are much easier to handle using generics) - generalized BLOB referencing. This is the preparation to use another BLOB class, the kelondroHeap git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4977 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	a6719dfd2b	- refactoring of robots parser - no more keep-order parameter in remove (it was not possible to make this strict, and not useful) - some small enhancements in balancer - robots parser without references in switchboard - changes synchronization in robots git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4969 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	e81be7d4f2	added many missing user-agent declarations for yacy http client connections. the most important fix was the addition of the yacybot user-agent for robots.txt loading, because web masters look for that access to see if the crawler behaves correctly. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4968 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
danielr	68c38c2d34	- WatchCrawler shows status without JavaScript - Performance can be scaled + DHT-profile - names for pool-threads - some small refactorings git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4923 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	3330181aa0	refactoring: find a better way to store BLOBs; generalize current BLOB data structure (kelondroDyn) and prepare to replace it with something better. The best candidate is the kelondroHeap, which will become the kelondroBLOBHeap; also removed some never-used classes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4902 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
danielr	4b71912e76	fixed wrong class name git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4894 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
danielr	7feae906aa	- organize imports - removed potential null pointer accesses - removed unnecessary casts git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4893 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	cfe6790498	- added option to switch between yacy networks, especially between the two default networks (freeworld and intranet), from the ConfigNetwork online interface - to make this possible, a large refactoring and reorganisation of data structures was necessary git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4803 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
apfelmaennchen	2113672bf2	small fix on tag comparator functions git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4794 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	fbb712c669	refactoring: moved importer classes to crawler and plasma package git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4770 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	1689030ee8	refactoring: moved all crawler classes into their own package git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4768 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	d2ba1fd2ab	major step forward to network switching (target is easy switch to intranet or other networks .. and back) This change is inspired by the need to see a network connected to the index it creates in an indexing team. It is not possible to divide the network and the index. Therefore all control files for the network were moved to the network within the INDEX/<network-name> subfolder. The remaining YACYDB is superfluous and can be deleted. The yacyDB and yacyNews data structures are now part of plasmaWordIndex. Therefore all methods using static access to yacySeedDB had to be rewritten. A special problem was the port forwarding methods, which had been tightly mixed with seed construction. It was not possible to move the port forwarding functions to the place, meaning and usage of plasmaWordIndex. Therefore the port forwarding has been deleted (I guess nobody used it and it can be simulated by methods outside of YaCy). The mySeed.txt is automatically moved to the current network position. As a new effect, every network will create a different local seed file, which is ok, since the seed identifies the peer only against the network (it is the purpose of the seed hash to give a peer a location within the DHT). No other functional change has been made. The next steps to enable network switching are: - shift of crawler tables from PLASMADB into the network (crawls are also network-specific) - possibly shift of plasmaWordIndex code into yacy package (index management is network-specific) - servlet to switch networks git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4765 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
danielr	d4bce6affd	refactoring (initialized static fields, removed empty if/else, serialized some fields in serializable classes) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4755 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	1995faef8d	- refactoring of Colage back-end: move to plasma package - renamed also the plasmaCrawlResults to have a consistent naming for url and image queues - added a double-check for the images - added additional queues for the images: all worse-quality images go there, so the queue can be used also if no sizes are given; no image is lost - added a cleanup for the stacks so they cannot flood the memory git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4722 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	8313d58ae7	- integrated the collage into the Web Visualization menu - added a counter for the public and private queue on the page (testing..) - fixed wrong public/private categorization git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4686 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	82bf9ac1c8	- added Collage servlet from datengrab and modified it: * all images are queued * private/public is respected * inserted into switchboard * added collageQueue class that stores all the queued images git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4683 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	202a3adb3e	refactoring of HttpClient Writer processes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4678 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	e356625b22	- refactoring of stream copy handling to support time-consuming operations - made usage of BufferedStreams explicit to distinguish the different copy methods in serverFileUtils (byte-by-byte and using an own buffer) - introduced another timeout setting (java internal property) - more restrictions to clients accessing a single host (a security setting to prevent DoS by mistake) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4674 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	c3342e1178	- removed class with only one static method - removed connection method with too long time-out git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4672 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
danielr	5c3c1fdf41	replaced httpc with Apache Jakarta Commons HttpClient (includes some refactoring ;) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4640 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	7f9f639d20	- refactoring and abstraction of index reference (urls) handling: blacklisting is part of reference filtering - refactoring of word/phrase handling: word abstraction from condenser becomes part of index element handling - removed unused code parts from condenser git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4603 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	d6050b9ffb	- separated the LURL data storage and Crawl result stack for process supervision. this is another step to enable multiple, concurrent fulltext-indexes - another try to make the yacy-httpc more stable git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4602 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	541b817502	refactoring of switchboard queueing git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4591 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	275a226cc5	refactoring git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4524 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
apfelmaennchen	bc3d3b4c97	fixed rebuildTags() to correctly rebuild folders... git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4523 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	2327451653	- changed order of database initialisation (index first) - removed mainly unused init-time for databases (was only used for tree tables, which are not used any more) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4496 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
lulabad	9ecc17baef	fixed double Blog entries git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4492 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
lulabad	94e256e13b	* removed single Blogview, now links direct to BlogComments.html * some other small changes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4483 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
lulabad	00f5f917de	- more refactoring to blog - fixed moderate comment bug. see http://forum.yacy-websuche.de/viewtopic.php?f=9&t=860 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4478 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	7f445f34a6	please enable the Java 5-specific warnings! (an unboxing warning pointed to a program bug, and a type declaration was missing) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4476 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
lulabad	c1b9a03304	* some refactoring to Blog * changed default sort order to reverse (newest first) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4475 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
lulabad	766a04bc06	fixed sort problem in Blog. see http://forum.yacy-websuche.de/viewtopic.php?f=6&t=639 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4474 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	bd63999801	- faster search: using different data structures that avoid multiple calculations - no more table copy for error-eco table - optional table copy for lurl-entries - more abstractions (less single constant strings) - better logging (using host names instead of ips) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4459 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
lulabad	8358652fa9	some small changes to blog git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4457 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
lulabad	6a85764e1a	Second bugfix for numberbug in Blog. This update automatically fixes existing blog entries. A backup is not needed, but is still a good idea ;) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4451 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
lulabad	40a0591942	Fixed numberbug in Blog, see http://forum.yacy-websuche.de/viewtopic.php?f=6&t=639 . This won't fix existing blog entries (comes later). git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4443 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	7d875290b2	more generics git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4417 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	9d693ee635	more generics git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4415 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	0f5c4abaca	more generics git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4414 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	4a80902081	- added ViewProfile as rdf in foaf syntax - added link to rdf and vCard version on html page - can be seen on http://localhost:8080/ViewProfile.html?hash=localhash - more generics git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4411 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
apfelmaennchen	b1fae9b5af	fixed import Netscape Bookmarks git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4401 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
apfelmaennchen	f3a9e9c542	added getFolderList() to bookmarksDB added cleanTagsString() to bookmarksDB added getFoldersString() to Bookmark modified getTagsString() to exclude folderTags git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4383 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
apfelmaennchen	e81bced2bd	reorganized the code and adjusted getTagIterator() to suit folders git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4357 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
borg-0300	53367d941a	more information (BASE64) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4324 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
apfelmaennchen	704de4dee8	Added new function, needed to restrict the tag cloud: public Iterator getTagIterator(String tagName, boolean priv) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4313 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	03e7782269	more generics git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4305 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
fuchsi	d517e96714	last cleanup bits to serverDate before the release. only safe refactoring (method renaming) changes outside of serverDate. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4289 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
hermens	4748d5c1ab	Some enhancements to time management: - remove unnecessary generation of Calendar and Date objects - synchronized SimpleDateFormat objects in blog-, message- and wikiBoard - correct use of TimeZones and SimpleDateFormats git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4288 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
fuchsi	1cb6e431a6	Replace the ISO8601 aka W3C datetime parser by one that supports every representation allowed by this standard, see http://www.w3.org/TR/NOTE-datetime - useful especially for sitemaps parsing, where this date format is used git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4286 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
fuchsi	33ee6745f6	more cleanup in serverDate - remove direct accesses to SimpleDateFormat fields in serverDate and use the static parse... methods instead - remove nowDate() as a Date doesn't store timezone information and a new Date() is always faster - default formatter methods use a GMT timezone by default now, this is important for interchangeability as some date formats we use don't include a timezone offset. - continued renaming and rearranging (formatter) methods. all should follow the general naming scheme formatWHAT(...) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4285 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
fuchsi	21b8d1b918	small cosmetic change for static fields in serverCore (special protocol ASCII entities) to improve readability git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4275 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	c527969185	- enhanced monitoring of ranking parameters for details, please try http://localhost:8080/IndexControlRWIs_p.html - fixed computation of ranking ordering in some cases git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4220 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	6eaa5a0e64	enhanced local search speed. The ranking process is now 6 times faster than before. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4197 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
fuchsi	425e4ead66	Allow absolute paths in configuration settings. - before absolute paths would be expanded incorrectly, e.g.: fooPath=/a/b/c would become /path/to/yacy/root/a/b/c. Now you can put nearly every dynamically generated data with a configurable path to a location outside of yacy's root dir without having to use symlinks (probably good for third party distribution packaging). - abstractServerSwitch.getConfigPath(setting, default) returns a File instance, either with an absolute path or relative to the application's root path. - exceptions (hardcoded): DATA/LOG/yacy.logging DATA/SETTINGS/httpProxy.conf DATA/SETTINGS/user.db TODO: all of these are the global configuration files and they should probably be put into _one_ command line configurable settings path, so it would be possible to package them in /etc/ for example. - add missing workPath to yacy.init (it was used in code, but there was no default in the file) - fix broken skinPath (was skinsPath in yacy.init but skinsPath in the code) + a few other broken config reading caused by typos. - replaced path setting names and their default values with the related static fields in plasmaSwitchboard where not already done/existing git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4196 6c8d7289-2bf4-0310-a012-ef5d649a1542	17 years ago
orbiter	a31b9097a4	preparations for mass remote crawls: two main changes must be implemented to enable mass remote crawls: - shift control of robots.txt to crawl queue (away from stacker). This is necessary since remote crawls can contain unchecked urls. Each peer must check the robots to prevent that it is misused as crawl agent for unwanted file retrieval - implement new index files that control double-check of remotely crawled urls After removal of robots.txt checking from stacker threads, the multi-threading of this process is void. Multithreading has been removed. Also the thread pools for the crawl threads had been removed, since creation of these threads is not resource-consuming, for a detailed explanation see svn 4106 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4181 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
fuchsi	0e1738899f	* Complete number localization and provide a more reasonable interface to serverObjects: - put(key, value) methods are now used if a value added to the map should be kept as it is. Numbers are transformed (but not formatted) to an equivalent String representation. - putASIS(...) have been removed, now done with simple put(...) (see above). - putNum(...) can be used for number values which should be stored in a formatted way, either depending on the current locale setting for yacy (default) or in a "none" locale (see javadocs and setLocalize()). - putHTML(...) escapes special characters into corresponding HTML entities ('<' => '&lt;') which was done with put(...) before and so was called too often, because it is necessary only for very few cases. Additionally there is a "forXML" mode which only replaces < > & ". In short: Use put(...) for almost everything, use putXY(...) if you need some special transformation of the value. A few bugs have been fixed as well, and there should be a small performance improvement for complex pages with a lot of values. * added additional Sum/Avg rows to access tracker pages, see http://forum.yacy-websuche.de/viewtopic.php?f=5&t=456 * removed duplicate code (mostly related to the big changes above). TODO: - make sure, number formats work as expected _everywhere_, report overlooked stuff http://forum.yacy-websuche.de/viewtopic.php?f=5&t=437 - probably a good idea to add special putDate() methods as they are used in many pages and create duplicated formatting code + maybe some centralized handling for memory value formatting. - further improve the speed of page creation for the WatchCrawler. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4178 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
fuchsi	06e6a1ff62	Add a generalized Formatter class yFormatter inspired by http://forum.yacy-websuche.de/viewtopic.php?f=5&t=437 At the current state it allows formatting of numbers (integer + decimal types) for output according to the Locale derived from the language setting in yacy. Network.(html\|xml) and Status.html have been changed to use it for now (TODO: should be integrated into other servlets as well to reduce duplicate formatting code). NOTE: For now the output format for Network.xml simulates the old behaviour which is wrong (it uses '.' as decimal and grouping separator), to make sure external scripts like the yacystats.de one won't break with this update. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4162 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
fuchsi	9b0948cb4c	gnarf. mixed up the positions. finally fixed... git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4143 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
fuchsi	c0f5fc51ef	bugfix for last commit git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4142 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
fuchsi	c5a8585ac6	fix more encoding problems in yacysearch.rss. - URL encoding for search terms where required - removed "ugly" CDATA escaping - UTF-8 encoding for the XML - no HTML style escaping for XML/RSS element values Note: some unicode characters might still be encoded in a wrong way. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4140 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	01e0669264	re-designed some parts of DHT position calculation (effect is the same as before) and replaced old first hash computation by new method that tries to find a gap in the current dht. to do this, it is necessary that the network bootstrapping is done before the own hash is computed. this made further redesigns in peer initialization order necessary git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4117 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	842308ea97	- redesigned crawl start menu, integrated monitoring pages - removed web structure picture from indexing menu and grouped it together with htcache monitor - added a database for terminated crawls, when a crawl is finished it is automatically moved to the new database - extended crawl profile edit servlet, shows now also terminated crawls - option that was used to delete profiles is now redesigned to a function that moves the current crawl to the terminated crawls and removes all urls from the current queues! - fixed here and there problems with indexing queues - enhanced indexing speed by changing cache flush sizes. - changed behaviour of crawl result servlet: the list of crawled urls is shown if there is one, otherwise the overview window is shown attention: the new profile databases are not compatible with the old one. current crawls will be lost! the web index is not touched. next steps: the database of terminated crawls can be used to start a new crawl with them. This is useful if one wants to re-crawl specific pages and wants to use an old crawl profile. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4113 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	11b4f80bde	- fixed non-closing client connections - added client connection tracker in connections servlet git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4108 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	1488769e1f	cleanup of unmaintained and outdated performance methods: removed object pools in httpc. Object pooling is not recommended, if the creation of the object is not time-intensive. Object pools are only useful, if there is much computation necessary to create some basic data that is stored in the object pool and can be re-used. This does not apply to object pools in YaCy. Object pooling of client sessions would make sense if they would allow re-use of living connections to other yacy clients. But every connection is closed after usage of an object in the client pool, therefore the YaCy server client objects are not such that hold hardware/network-allocated entities. See: http://www.javaperformancetuning.com/news/qotm033.shtml http://java.sun.com/docs/hotspot/HotSpotFAQ.html#gc_pooling http://docs.sun.com/source/816-7159-10/pt_chap5.html http://www.microjava.com/articles/techtalk/recylcle2 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4106 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
fuchsi	5b0c1449e1	various fixes and cleanups for blacklist handling: 1. avoid adding duplicate file name entries in config properties for lists, 2. correctly merge all path masks from all list files for the same host masks, 3. rewrote helper methods to use standard java methods for Collection transformations, 4. merged various methods with identical functionality for different Collection implementations into one, 5. minor refactoring to improve code readability. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4087 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	daf0f74361	joined anomic.net.URL, plasmaURL and url hash computation: search profiling showed, that a major amount of time is wasted by computing url hashes. The computation does an intranet-check, which needs a DNS lookup. This caused that each urlhash computation needed 100-200 milliseconds, which caused remote searches to delay at least 1 second more than necessary. The solution to this problem is to attach a URL hash to the URL data structure, because that means that the url hash value can be filled after retrieval of the URL from the database. The redesign of the url/urlhash management caused a major redesign of many parts of the software. Since some parts had been decided to be given up they had been removed during this change to avoid unnecessary maintenance of unused code. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4074 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	f9e6cf6a3d	more refactoring of search: integrated first version of ssi-using search interface, but the function is currently disabled git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4063 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	e76e996737	fixed umlaute-problem git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4039 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	62347b50f4	added security layer for ViewImage: - images may be requested by localhost and authorized users only, if the request is done using a clear-text URL - the image may be requested also using a code that can be a license to retrieve a URL for everyone - some servlets produce URL licenses for ViewImage, like image search results git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4027 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	57a5b6fa71	some generalization of remote proxy configuration and setting handling in httpc git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4023 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	367fc28928	corrected Brausse->Brausze git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4020 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	e76fe1c078	- replaced unicode characters in copyright holder name ('Brausse') - more logging for bootstrap seedlist loading - larger DHT chunks git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4015 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	40b0547611	- documentation changes (removed old forum links) - different handling of link quotation - different handling of link normalization - enhanced html/unicode en/de-coding git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3993 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	b6d9cca67e	- fixed problem with yacyVersion and own version generation - within this context: generalized date format handling - extended Update interface: * a version lookup can be triggered manually * a complete lookup + download + re-boot process can be triggered with one click git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3986 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	9da0e53fe8	repaired rss feed reader - removed old rss parser - removed unused rss parser libraries - added new rss reader - added previously removed FeedReader_p.java and adopted it to new rss parser - adopted parser interface for rss indexing to new rss parser git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3970 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	bec4dbc753	added options and execution methods for automated updates git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3959 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	a9e73b6852	fixed great mess with localization paths. the problem was: automatic re-translation after update did not work. hopefully now git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3952 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	36a37f758b	fix for oom exception during release download see http://forum.yacy-websuche.de/viewtopic.php?f=6&t=101&hilit= git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3950 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
low012	2158f83d43	*) cosmetics, changed a character to get rid of "warning: unmappable character for encoding UTF8" during compilation git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3946 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	1782ef57e5	- added SSI parser and include directive for <!--# include virtual="<file>" --> - added chunked file transfer for non-yacy clients - SSIs are streamed using chunked transfer, partly delivered pages can be seen in browser before transmission is finished - added client-side network unit identification - cleaned up code git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3926 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
allo	6074264267	dynamic rights. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3847 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
allo	854eb1492f	.yacy /.yacyh urls for the feedreader git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3844 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
allo	7a5b22a0b8	Integration of FeedReader in Bookmarks. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3841 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
allo	7921f07c9d	userDB fix git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3837 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
allo	7b2e1bb8f2	Feedparser with reflection. TODO: This needs a special build.xml entry git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3832 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
karlchenofhell	8bff810d19	- fixed logging output of serverMemory.request() - don't start up if DATA/yacy.running exists as this is usually a sign of an already started yacy-instance git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3831 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
karlchenofhell	f05ca43780	- the wiki-parser works for remote wiki-code now, not displaying links anymore as if they were local (ViewProfile comment) - fixed wrong link to CrawlStart on Status-page git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3816 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
karlchenofhell	30c3d909b1	- fixed charset problem in ConfigProfil_p.html (use accept-charset="UTF-8" in forms) - fixed wrong XML output if no peers are known in Network.xml - simplified parsing of table properties in wikiCode and ZTableToken - reimplemented GC heuristics. They are needed to constantly ensure that an amount of free memory is available which is higher than Java's max. limit for performing a Full GC (please use serverMemory.request(long, boolean) rather than serverMemory.available(long, boolean) to provide data for averaging over the last GCs) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3793 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
allo	4392ee0c51	BugFix for typo and wrong include git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3789 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
allo	d1e1580223	Surftips Blacklist Blacklists List Hardcoded instead of only updated on firststart / migration.java git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3788 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
hydrox	44bac7dea1	*) blog-comments can now be moderated git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3778 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
allo	957a25afff	getRight(rightName) instead of get...Right() git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3774 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
low012	a0149317ac	*) fixed bug where headlines were added to directory of a wiki page multiple times (http://www.yacy-forum.de/viewtopic.php?t=4034 ) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3762 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
karlchenofhell	baa9402b97	- wiki-parser is now configurable via the config setting wikiParser.class which holds the class-name for the parser to use git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3742 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
karlchenofhell	601fc7d1c5	- added source to J7Zip-modifed.jar and it's license (changelog is still to come) - moved HTML-*replace-methods from wikiCode to de.anomic.data.htmlTools - prepared use of different wiki parsers as suggested here: http://www.yacy-forum.de/viewtopic.php?p=34444#34444 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3741 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
theli	b1680ab71f	*) bugfix for ArrayIndexOutOfBoundsException in robots-parser (thanks to low012) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3739 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
theli	9a4375b115	*) robots.txt: adding support for crawl-delay git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3737 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
allo	65a8a9fc58	fix for nullpointer git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3726 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	139c59ebbd	- fixed dht selection problem: the seed tables used a wrong ordering - cleaned some code git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3693 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
theli	cb43ae11ba	*) Bugfix git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3668 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
theli	0b5fc3c28c	*) moving date functions to serverDate class *) Sitemap-parser - logging added - parsing of modDate added git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3667 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
theli	6f46245a51	*) Bookmarks: Ajax icon is displayed while loading title *) First version of a sitemap parser added - currently only autodetection of sitemap files is supported *) DB-Import restructured - pause/resume should work again now git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3666 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	e48189c710	enhanced cluster routing - cluster definitions can now contain an addition for local ip addresses - cluster-cluster communication uses the local ip address instead the global address, if one is given git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3624 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
theli	2399ed817c	*) robots.txt parser now extracts the sitemap-URL (will be used later) *) some javadoc added *) junit testclass for robots.txt parser added git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3602 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
rramthun	e6fb6426a3	*) Some cosmetical changes and corrections git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3582 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	40c14a4f0e	- better implementation of search query properties - basic protection against start-up problems when database files are corrupted - auto-delete of not-critical databases during startup when load error occurs - on-the-fly reset option for all database tables - automatic on-the-fly reset for seed tables during enumeration exceptions git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3547 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
allo	f4af360f7c	bugfix git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3494 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
hydrox	9b5fb3908d	*) a peer-message are now created when a blog-comment is written git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3480 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	6ad39bae1e	fixed shutdown problem this fixes the 'inconsistency' messages during start-up git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3457 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
karlchenofhell	264a82eec8	- fix for http://www.yacy-forum.de/viewtopic.php?t=3657 - fix for http://www.yacy-forum.de/viewtopic.php?p=32758#32758 - Diff takes any objects now, not only strings git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3455 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	d755a8026d	- better OOM protection - better memory allocation for FlexTable indexes - splitting between static index and dynamic index (only the dynamic part must grow) - to enable a merge-iteration of new splittet index, a huge number of classes needed to be adopted for new iterator classes - added new iterator classes that support cloneable iterators - adopted all iterator classes to implement cloneable itarators git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3453 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	1cba31de43	redesigned ram organization for database caches - each cache can now allocate as much memory as is available - no more fixed limits - replaced old performance memory monitor by new one - added supervision methods as static functions into the classes that provide cache functionality - steering of ram allocation is done with two simple limits that are ram availability-relative git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3434 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
theli	bd03c6b874	*) bugfix in bookmarksDB: - NullpointerException when trying to get an unknown bookmark - bookmarks can either start with http or https git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3427 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
karlchenofhell	9623bf7bbe	- removed call of java 1.5 method - added config servlet for local robots.txt - removed YPStats_p as it is of no use anymore - supertemplates use XHTML now - quick-fix for http://www.yacy-forum.de/viewtopic.php?p=32296#32296 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3422 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
karlchenofhell	a1d68fe092	- use .class rather than Class.forName for classes in class-path - added Bost's patch for Diff.findDiagonale() from: http://www.yacy-forum.de//files/patch_685.txt - fixed minor bugs in Blog git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3416 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
hydrox	54fef3574f	*) missing files for last commit git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3406 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
hydrox	cb89c74d52	*) added blog-comments *) removed debug-output when deleting news git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3405 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
karlchenofhell	6fbe31425a	- some code-cleanup (no more syntax-warnings here) - added deletion from loadedURLs of URLs to be blacklisted in IndexControl_p git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3404 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	e3480d4ad3	fix for warning in crawl balancer git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3402 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
karlchenofhell	39a2000d8b	- added support for [[Bookmark:$bookmarkTag|description]]-link-listings (requested by theli) to wiki-parser - added support for <pre>-tags to wiki-parser git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3393 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
karlchenofhell	619653c054	- fix for last commit git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3392 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
karlchenofhell	a5a36d9252	- hopefully last fix fo 1.5 methods (sorry for that, eclipse isn't that helpful in identifying those methods) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3387 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
karlchenofhell	e97b6f0458	- we still use Java 1.4 ... git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3386 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
karlchenofhell	0c7b8cf632	- added first version of new wiki-parser - added blacklist support to manual URLFetcher stack fill - fix for NPE: http://www.yacy-forum.de/viewtopic.php?t=3559 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3385 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
low012	801eea8849	*) Fixed bug where pairReplace() got caught in infinite recursion. (http://www.yacy-forum.de/viewtopic.php?t=3466 ) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3383 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
karlchenofhell	d114a0136e	- crawl profile: don't add null-values - added some settings and statistics for url-fetcher 'server'-mode - added own stack for fetchable URLs - added possibility to fill stack via shift from peer's queues, via POST (addurls=$count and url$num=$url) or via file-upload - added "htroot" to classpath of linux start-script git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3370 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	c464157a6e	replaced some toString() see http://www.yacy-forum.de/viewtopic.php?p=31151#31151 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3345 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
(no author)	e218940293	The copyright sign "\u00A9" is already replaced by "©". String "(C)" is not a unicode sequence! git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3334 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
low012	1bc4d8d470	*) If there is more than one pair of patterns in a line, all of them (and not only one pair) will be replaced. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3333 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
low012	ea7a8cf7aa	*) <hr> and <br> tags are XHTML compliant now. *) Avoid superfluous trailing blank in non-proportional sections. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3332 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
karlchenofhell	f2e6f19b90	- added versioning to Wiki git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3327 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
karlchenofhell	02a73dce87	- added Diff-class for wiki-versioning (forthcoming, first need suitable serverObjects.put() for it) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3325 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	e4910f03d1	tag storage fix git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3302 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	991182b29b	more space for bookmarks git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3299 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	88fa764b64	implemented new kelondroObjects into bookmarkDB - Bookmark-Objects are stored inside the kelondroObjects cache - removed superfluous classes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3298 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	9c05e2a820	re-design ob kelondroMap - this class is replaced by an object that can hold any type of object - this object must be defined as a class that implements kelondroObjectsEntry - the kelodroMap is now implemented as kelondroMapObjects git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3297 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
allo	669c21db05	first version of abstracted kelondroMap Cache. get returns a kelondroCachedObject(or in most cases a subclass of it), or a map, which can be used to construct a kelondroCachedObject. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3295 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
allo	14f2068daf	some more bookmark changes towards multiuser bookmarks git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3291 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
allo	ff79c52fc0	bookmark users can now edit bookmarks. TO COME: tag bookmarks with username, list bookmarks of a special user, filter private bookmarks for users. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3274 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
allo	f40169fcd7	preparing multiuser bookmarks git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3256 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	c0851ee943	refactoring: moved and renamed de.anomic.data.searchResults to plasma package git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3248 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
allo	c39dda2374	finished refactoring of searchtemplates. now plasmaSwitchboard.searchFromLocal calculates a searchResults structure, which is parsed in the yacysearch/detailedSearch Servlets. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3244 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago

717 Commits (9bdee5c71c0057d095818dddb0cd15fcf6953399)