- remove direct accesses to SimpleDateFormat fields in serverDate and use the static parse... methods instead
- remove nowDate() as a Date doesn't store timezone information and a new Date() is always faster
- default formatter methods now use a GMT timezone by default; this is important for interchangeability, as some of the date formats we use don't include a timezone offset.
- continued renaming and rearranging of (formatter) methods; all should follow the general naming scheme formatWHAT(...)
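Below is a minimal sketch of what one such pair of methods could look like; the pattern string, the class name and the formatShortSecond/parseShortSecond names are illustrative only, not the actual serverDate API:

    import java.text.ParseException;
    import java.text.SimpleDateFormat;
    import java.util.Date;
    import java.util.TimeZone;

    public final class ServerDateSketch {

        // formats without a timezone offset are only interchangeable between peers
        // if everyone agrees on one zone, hence the GMT default
        private static final TimeZone TZ_GMT = TimeZone.getTimeZone("GMT");

        // formatWHAT(...) style method: format a date as yyyyMMddHHmmss in GMT
        public static String formatShortSecond(final Date date) {
            final SimpleDateFormat formatter = new SimpleDateFormat("yyyyMMddHHmmss");
            formatter.setTimeZone(TZ_GMT);
            return formatter.format(date);
        }

        // matching static parse method, so callers never touch SimpleDateFormat fields directly
        public static Date parseShortSecond(final String s) throws ParseException {
            // a new instance per call, because SimpleDateFormat is not thread-safe
            final SimpleDateFormat parser = new SimpleDateFormat("yyyyMMddHHmmss");
            parser.setTimeZone(TZ_GMT);
            return parser.parse(s);
        }
    }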
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4285 6c8d7289-2bf4-0310-a012-ef5d649a1542
- before, absolute paths would be expanded incorrectly, e.g.: fooPath=/a/b/c would become /path/to/yacy/root/a/b/c. Now nearly all dynamically generated data with a configurable path can be put in a location outside of yacy's root dir without having to use symlinks (probably good for third-party distribution packaging).
- abstractServerSwitch.getConfigPath(setting, default) returns a File instance, either with an absolute path or relative to the application's root path (see the sketch after this list).
- exceptions (hardcoded):
DATA/LOG/yacy.logging
DATA/SETTINGS/httpProxy.conf
DATA/SETTINGS/user.db
TODO: all of these are global configuration files; they should probably be put into _one_ command-line-configurable settings path, so it would be possible to package them in /etc/, for example.
- add missing workPath to yacy.init (it was used in code, but there was no default in the file)
- fix broken skinPath (yacy.init and the code disagreed on the setting name, skinPath vs. skinsPath) + a few other broken config reads caused by typos.
- replaced path setting names and their default values with the related static fields in plasmaSwitchboard where that had not already been done
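A minimal sketch of the described resolution logic; getConfig and rootPath are assumed to exist on the switch, and the class name is illustrative:

    import java.io.File;

    public abstract class AbstractServerSwitchSketch {

        protected File rootPath; // application root, set at startup

        // assumed existing accessor: plain string config lookup with a default value
        public abstract String getConfig(String key, String defaultValue);

        // resolve a configured path either absolutely or relative to the application root
        public File getConfigPath(final String setting, final String defaultValue) {
            final String value = getConfig(setting, defaultValue);
            final File path = new File(value);
            // absolute values (e.g. fooPath=/a/b/c) are used as-is,
            // relative values are resolved against the application's root path
            return path.isAbsolute() ? path : new File(this.rootPath, value);
        }
    }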
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4196 6c8d7289-2bf4-0310-a012-ef5d649a1542
two main changes must be implemented to enable mass remote crawls:
- shift control of robots.txt to the crawl queue (away from the stacker). This is necessary since remote
crawls can contain unchecked urls. Each peer must check robots.txt itself to prevent it from being misused
as a crawl agent for unwanted file retrieval (a rough sketch of the idea follows below)
- implement new index files that control double-check of remotely crawled urls
After removal of robots.txt checking from the stacker threads, the multi-threading of this process is void,
so multithreading has been removed. The thread pools for the crawl threads have also been removed, since
creation of these threads is not resource-consuming; for a detailed explanation see svn 4106
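A rough sketch of the queue-side check, moving the robots.txt test to the point where an entry leaves the queue; the RobotsTxt interface and the queue class are placeholders, not the actual YaCy classes:

    import java.net.URL;
    import java.util.LinkedList;
    import java.util.Queue;

    // hypothetical view onto the local robots.txt cache
    interface RobotsTxt {
        boolean isDisallowed(URL url); // fetches and caches robots.txt as needed
    }

    final class CrawlQueueSketch {
        private final Queue<URL> queue = new LinkedList<URL>();
        private final RobotsTxt robots;

        CrawlQueueSketch(final RobotsTxt robots) {
            this.robots = robots;
        }

        void push(final URL url) {
            // remote crawls may deliver unchecked urls, so nothing is verified here
            queue.add(url);
        }

        URL pop() {
            // the robots.txt check happens on the receiving peer, right before fetching,
            // so the peer cannot be misused as a crawl agent for unwanted file retrieval
            URL next;
            while ((next = queue.poll()) != null) {
                if (!robots.isDisallowed(next)) return next;
            }
            return null;
        }
    }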
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4181 6c8d7289-2bf4-0310-a012-ef5d649a1542
In its current state it allows formatting of numbers (integer + decimal types) for output according to the Locale derived from the language setting in yacy. Network.(html|xml) and Status.html have been changed to use it for now (TODO: it should be integrated into other servlets as well to reduce duplicate formatting code).
NOTE: For now the output format for Network.xml simulates the old behaviour, which is wrong (it uses '.' as both decimal and grouping separator), to make sure external scripts like the yacystats.de one won't break with this update.
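A minimal sketch of the kind of locale-aware formatting described; the class name and the language-to-Locale mapping are illustrative assumptions:

    import java.text.NumberFormat;
    import java.util.Locale;

    public final class NumberFormatterSketch {

        // hypothetical mapping from the yacy language setting to a Locale
        public static Locale localeFromLanguage(final String language) {
            return "default".equals(language) ? Locale.ENGLISH : new Locale(language);
        }

        // format an integer with the grouping separator of the given locale
        public static String formatInteger(final long value, final Locale locale) {
            return NumberFormat.getIntegerInstance(locale).format(value);
        }

        // format a decimal value with a fixed number of fraction digits
        public static String formatDecimal(final double value, final int digits, final Locale locale) {
            final NumberFormat nf = NumberFormat.getInstance(locale);
            nf.setMinimumFractionDigits(digits);
            nf.setMaximumFractionDigits(digits);
            return nf.format(value);
        }
    }

For example, formatInteger(1234567, Locale.GERMAN) yields "1.234.567", while Locale.ENGLISH yields "1,234,567".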
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4162 6c8d7289-2bf4-0310-a012-ef5d649a1542
search profiling showed that a major amount of time is wasted by computing url hashes. The computation does an intranet-check, which needs a DNS lookup. This caused each urlhash computation to take 100-200 milliseconds, which delayed remote searches by at least 1 second more than necessary. The solution to this problem is to attach a URL hash to the URL data structure, because that means the url hash value can be filled in after retrieval of the URL from the database. The redesign of the url/urlhash management caused a major redesign of many parts of the software. Since it had been decided to give up some parts, they were removed during this change to avoid unnecessary maintenance of unused code.
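The gist of the change, as a hedged sketch: the hash becomes a field of the URL object that is either attached when the record is read from the database or computed lazily on first use, so the expensive computation (including the DNS-based intranet check) is skipped whenever the hash is already known. Class and method names are illustrative:

    // sketch of a URL wrapper that carries its own hash
    public final class IndexedURLSketch {

        private final String url;
        private String hash; // null until known

        // used when loading from the database: the stored hash is attached directly
        public IndexedURLSketch(final String url, final String hash) {
            this.url = url;
            this.hash = hash;
        }

        // used for freshly seen urls: hash not yet computed
        public IndexedURLSketch(final String url) {
            this(url, null);
        }

        public String hash() {
            if (this.hash == null) {
                // only here the expensive computation (with its DNS lookup) happens
                this.hash = computeHash(this.url);
            }
            return this.hash;
        }

        private static String computeHash(final String url) {
            return Integer.toHexString(url.hashCode()); // placeholder for the real hash
        }
    }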
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4074 6c8d7289-2bf4-0310-a012-ef5d649a1542
- SSIs may now refer to servlets, not only files
- calling a servlet, the servlet/SSI engine is called recursively
- SSIs now also work for clients that do not support chunked encoding
This will support the new search page functionality to show search results
dynamically without using javascript. To test this method, a test page has been added:
http://localhost:8080/ssitest.html
It dynamically calls 3 servlets, which produce some delays during their execution;
please verify that you can see the result appear step-by-step in your browser
To implement this feature, some refactoring has taken place; mostly, code
has been made static and will execute faster (a rough sketch of the recursive expansion follows below).
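A rough sketch of the recursive expansion idea, assuming an SSI-style include directive and a render callback that can serve both files and servlets; the directive syntax and all names are placeholders, not the actual SSI engine:

    // expand include directives recursively, so included servlets may themselves use SSIs
    public final class SsiSketch {

        interface PageSource {
            String render(String path); // returns servlet output or file content for a path
        }

        private static final String OPEN = "<!--#include virtual=\"";
        private static final String CLOSE = "\"-->";

        public static String process(final String content, final PageSource source) {
            final StringBuilder out = new StringBuilder();
            int pos = 0;
            int start;
            while ((start = content.indexOf(OPEN, pos)) >= 0) {
                final int end = content.indexOf(CLOSE, start);
                if (end < 0) break; // unterminated directive, keep the rest verbatim
                out.append(content, pos, start);
                final String target = content.substring(start + OPEN.length(), end);
                // the included target may be a servlet or a file and may contain SSIs itself
                out.append(process(source.render(target), source));
                pos = end + CLOSE.length();
            }
            out.append(content.substring(pos));
            return out.toString();
        }
    }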
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4037 6c8d7289-2bf4-0310-a012-ef5d649a1542
- different handling of link quotation
- different handling of link normalization
- enhanced html/unicode en/de-coding
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3993 6c8d7289-2bf4-0310-a012-ef5d649a1542
- within this context: generalized date format handling
- extended Update interface:
* a version lookup can be triggered manually
* a complete lookup + download + re-boot process can be triggered with one click
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3986 6c8d7289-2bf4-0310-a012-ef5d649a1542
- added new system update configuration page
- moved system update from status page to the system update page
- moved shutdown and restart from status page to main menu
- added new configuration properties to yacy.init (not yet actively used)
- added some methods to handle new automatic update process
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3958 6c8d7289-2bf4-0310-a012-ef5d649a1542
** could somebody please test this on their linux platform **
** and give feedback on whether the restart works? **
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3937 6c8d7289-2bf4-0310-a012-ef5d649a1542
- added new download-option for releases on the status page
still missing:
- thomas-style restart for linux/mac
- untar/gunzip on shell basis
(comes next)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3931 6c8d7289-2bf4-0310-a012-ef5d649a1542
we will provide two different releases in the future, one standard release and one 'pro'-release.
The 'pro'-release contains all additional parsers AND has different default performance values.
The pro-version therefore differs from the previous 'all'-version by these default values.
The pro-configuration is automatically chosen if the libx-folder exists. Once a version has been initialized, its configuration persists independently of an existing libx folder (see the sketch below).
The ant targets have been changed. There are now 3 different targets to create standard and pro-releases, and one target to upgrade:
- dist: creates a standard release (only, no libx target any more)
- distPro: creates a pro-release (includes the libx)
- distExt: creates a libx-release which includes the libx-folder only. It may be used to upgrade from standard to pro
Furthermore, the naming of 'dev'-releases has been removed.
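A small sketch of how the once-only detection could look; the 'releaseType' setting name is a hypothetical placeholder, not an actual property:

    import java.io.File;
    import java.util.Properties;

    public final class ProDetectionSketch {

        // decide once whether this installation runs with pro default values
        public static boolean useProDefaults(final File rootPath, final Properties config) {
            final String stored = config.getProperty("releaseType"); // hypothetical setting name
            if (stored != null) {
                // configuration was initialized before: keep it, regardless of the libx folder
                return "pro".equals(stored);
            }
            final boolean pro = new File(rootPath, "libx").exists();
            config.setProperty("releaseType", pro ? "pro" : "standard");
            return pro;
        }
    }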
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3902 6c8d7289-2bf4-0310-a012-ef5d649a1542
this will cause many users to be unsure what to do next and leave them helpless;
simply deleting the control file is the same thing that the user is otherwise forced to do
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3885 6c8d7289-2bf4-0310-a012-ef5d649a1542
- don't start up if DATA/yacy.running exists, as this is usually a sign of an already running yacy instance
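A minimal sketch of such a start-up guard, assuming the control file lives at DATA/yacy.running; the surrounding class and method names are illustrative:

    import java.io.File;
    import java.io.IOException;

    public final class StartupGuardSketch {

        // returns false if another instance appears to be running already
        public static boolean acquireRunLock(final File dataPath) throws IOException {
            final File lock = new File(dataPath, "yacy.running");
            if (lock.exists()) {
                System.err.println("DATA/yacy.running exists - another yacy instance seems to be running.");
                System.err.println("If you are sure none is, delete the file and start again.");
                return false;
            }
            lock.createNewFile();
            lock.deleteOnExit(); // best-effort cleanup on normal shutdown
            return true;
        }
    }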
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3831 6c8d7289-2bf4-0310-a012-ef5d649a1542
automatically acquire release information from download archives:
web pages from latest.yacy-forum.net and yacy.net are retrieved and parsed,
the links within are analysed and sorted, and the most recent developer and main
releases are provided as direct download links on the status page, if it was
discovered that a more recent version than the current one is available.
This process is done only once during the run-time of a peer, to protect our
download archives from DoS by YaCy peers.
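A rough sketch of the sorting step, under the assumption that release file names carry the SVN revision and that a static flag guards the once-per-runtime behaviour; the file name pattern is illustrative only:

    import java.util.List;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public final class ReleaseLookupSketch {

        // assumed name scheme: something like ..._4074.tar.gz, where 4074 is the SVN revision
        private static final Pattern REVISION = Pattern.compile("_(\\d{3,})\\.tar\\.gz$");

        private static boolean lookupDone = false; // query the archives only once per run-time

        // returns the link to the newest release, or null if none is newer than the running one
        public static synchronized String findNewerRelease(final List<String> downloadLinks, final int currentRevision) {
            if (lookupDone) return null;
            lookupDone = true;
            String best = null;
            int bestRevision = currentRevision;
            for (final String link : downloadLinks) {
                final Matcher m = REVISION.matcher(link);
                if (m.find()) {
                    final int revision = Integer.parseInt(m.group(1));
                    if (revision > bestRevision) {
                        bestRevision = revision;
                        best = link;
                    }
                }
            }
            return best;
        }
    }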
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3606 6c8d7289-2bf4-0310-a012-ef5d649a1542
- basic protection against start-up problems when database files are corrupted
- auto-delete of non-critical databases during startup when a load error occurs
- on-the-fly reset option for all database tables
- automatic on-the-fly reset for seed tables during enumeration exceptions
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3547 6c8d7289-2bf4-0310-a012-ef5d649a1542
this shall be used to store a fragment of the index on another physical device,
to split IO load and enhance access speed. The index is split in such a way
that the LURLs are stored to the secondary location, and the RWIs to the primary
location. This is especially useful for environments where symbolic links are
not possible and may cause IO access even if there is no write access to the
device which hosts the symbolic link.
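A tiny sketch of how the split could be expressed, with an optional secondary root that receives the LURL store while the RWIs stay at the primary location; the sub-directory names and the class are assumptions for illustration:

    import java.io.File;

    public final class IndexLocationSketch {

        private final File primaryRoot;
        private final File secondaryRoot; // null if no second device is configured

        public IndexLocationSketch(final File primaryRoot, final File secondaryRoot) {
            this.primaryRoot = primaryRoot;
            this.secondaryRoot = secondaryRoot;
        }

        // RWIs always stay at the primary location
        public File rwiPath() {
            return new File(primaryRoot, "RWI");
        }

        // LURLs go to the secondary location, if one is configured
        public File lurlPath() {
            final File root = (secondaryRoot != null) ? secondaryRoot : primaryRoot;
            return new File(root, "LURL");
        }
    }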
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3519 6c8d7289-2bf4-0310-a012-ef5d649a1542
The new collection index will be more generalized to support other indexes
e.g. YBR block-rank computation. A clean-up of the many conditions to support
the old database was necessary.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3506 6c8d7289-2bf4-0310-a012-ef5d649a1542
- the general NURL-index for all crawl stack types was split into separate indexes for these stacks
- the new NURL-index is managed by the crawl balancer
- the crawl balancer does not need an internal index any more, it is replaced by the NURL-index
- the NURL.Entry was generalized and is now a new class plasmaCrawlEntry
- the new class plasmaCrawlEntry replaces also the preNURL.Entry class, and will also replace the switchboardEntry class in the future
- the new class plasmaCrawlEntry is more accurate for date entries (holds milliseconds) and can contain larger 'name' entries (anchor tag names)
- the EURL object was replaced by a new ZURL object, which is a container for the plasmaCrawlEntry and some tracking information
- the EURL index is now filled with ZURL objects
- a new index delegatedURL holds ZURL objects about plasmaCrawlEntry objects to track which urls have been handed over to other peers
- redesigned handling of plasmaCrawlEntry handover, because there is no need any more to convert one entry object into another (a schematic sketch of the new entry classes follows below)
- found and fixed numerous bugs in the context of crawl state handling
- fixed a serious bug in kelondroCache which caused that entries could not be removed
- fixed some bugs in the online interface and adapted the monitor output to the new entry objects
- adapted the yacy protocol to handle new delegatedURL entries
all old crawl queues will disappear after this update!
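A schematic sketch of the relation between the two new classes, as referenced in the handover item above: a generalized crawl entry with millisecond dates and longer name fields, and a ZURL wrapper that adds tracking information. All field names are illustrative only:

    import java.util.Date;

    // sketch of the generalized crawl entry used by all crawl stacks
    class CrawlEntrySketch {
        String url;
        String referrerHash;
        String name;          // anchor tag name, may be longer than before
        Date appdate;         // millisecond precision instead of day precision
        String profileHandle;
        int depth;
    }

    // sketch of the ZURL container: the entry itself plus tracking information
    class ZurlSketch {
        CrawlEntrySketch entry;
        String executor;      // peer that handled (or delegated) the url
        Date workdate;        // when it was processed or handed over
        String failReason;    // set for error urls, empty for delegated urls
    }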
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3483 6c8d7289-2bf4-0310-a012-ef5d649a1542
- better memory allocation for FlexTable indexes
- splitting between static index and dynamic index (only the dynamic part must grow)
- to enable a merge-iteration of the new split index, a huge number of classes needed to be adapted for new iterator classes
- added new iterator classes that support cloneable iterators
- adapted all iterator classes to implement cloneable iterators
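A small sketch of what a cloneable iterator interface can look like, so that a merge iteration over the static and the dynamic part can fork or reposition its underlying cursors; the interface name and the modifier parameter are illustrative:

    import java.util.Iterator;

    // an iterator that can produce an independent copy of itself, optionally
    // repositioned at a given start key - useful for merge iterations over several indexes
    public interface CloneableIteratorSketch<E> extends Iterator<E> {
        CloneableIteratorSketch<E> clone(byte[] startKey); // null = start from the beginning
    }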
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3453 6c8d7289-2bf4-0310-a012-ef5d649a1542
- each cache can now allocate as much memory as is available
- no more fixed limits
- replaced the old performance memory monitor with a new one
- added supervision methods as static functions into the classes that provide cache functionality
- steering of ram allocation is done with two simple limits that are relative to ram availability
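A sketch of the two-limit steering, assuming the limits are simple fractions of the currently available heap; the threshold values and method names are illustrative:

    public final class CacheSteeringSketch {

        // grow the cache while more than this fraction of the heap is still available
        private static final double GROW_LIMIT = 0.4;
        // start shrinking once less than this fraction of the heap is available
        private static final double SHRINK_LIMIT = 0.2;

        private static double availableFraction() {
            final Runtime rt = Runtime.getRuntime();
            final long available = rt.maxMemory() - rt.totalMemory() + rt.freeMemory();
            return (double) available / (double) rt.maxMemory();
        }

        public static boolean cacheMayGrow() {
            return availableFraction() > GROW_LIMIT;
        }

        public static boolean cacheMustShrink() {
            return availableFraction() < SHRINK_LIMIT;
        }
    }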
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3434 6c8d7289-2bf4-0310-a012-ef5d649a1542