yacy_search_server

Commit Graph

Author	SHA1	Message	Date
orbiter	6396f5971e	bugfixes and migration attempt toward new kelondroFlex db - more synchronization - bugfix for remove in collections - bugfix in kelondroFlex (wrong exception condition!) - options to use RAM, FLEX and TREE tables for Crawl URL stacker - default for Crawl URL stacker is now FLEX (!) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2746 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	c8f3a7d363	added snippet-url re-indexing - snippets will generate an entry in responseHeader.db - there is now another default profile for snippet loading - pages from snippet-loading will be indexed, indexing depth = 0 - better organization of default profiles git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2733 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	0f10bdde22	more generic cache methods git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2721 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
hermens	440c6ee657	Implement alternative htcache layout mostly according to: http://www.yacy-forum.de/viewtopic.php?p=26205#26205 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2718 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
rramthun	ca8ef0ca9f	*)Documented the lng-file format *)Updated language files to the new standard, especially German *)Wrote language highlighting definition for Notepad++ *)Corrected News.html git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2685 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	e03427871e	enhanced surftipps: - added switch to show or hide surftipps - more news contribute to surftipps - added voting system for surftipps git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2638 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
theli	97d2a08ef1	*) restructuring needed to support parsing of documents using various charsets - serverFileUtils.java: -- adding methods to copy from stream to writer and readers to writers -- moving httpc writeX methods into serverFileUtils class - serverCharBuffer.java: removing inheritance from Writer class - replacing htmlFilterOutputStream by htmlFilterWriter class which handles content as char stream - htmlFilterContentTransformer.java: deactivating getText mode (still needs to be migrated to use char streams instead of byte streams) - changes in several classes to use htmlFilterWriter instead of htmlFilterOutputStream - changes in Scraper and Transformer classes to operate on chars instead of bytes - httpdProxyHandler.java: bugfix. clientTimeout setting was missing in config file git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2617 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
theli	63893003be	*) Adding settings page for the crawler which allows to specify a file size limit and the timeout to use. *) adding first version of maximum filesize check for the crawler git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2534 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
(no author)	2dacf63dd9	Spelling correction of the language list "Slovenky" -> "Slovensky" git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2490 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
theli	fce9e7741b	*) next step of restructuring for new crawlers - renaming of http specific crawler settings git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2480 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
theli	09b106eb04	*) next step of restructuring for new crawlers - adding interface class (plasma/crawler/plasmaCrawlWorker.java) for protocol specific crawl-worker threads - moving reusable code into abstract crawl-worker class AbstractCrawlWorker.java - the load method of the worker threads should not be called directly anymore (e.g. by the snippet fetcher) to crawl a page and wait for the result use function plasmaCrawlLoader.loadSync([...]) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2474 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	18b6876860	new cache flush configuration settings git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2460 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
orbiter	b7f4a1521b	added options to switch on or off the kelondroFlexTable for NURL, EURL and PreNURL git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2456 6c8d7289-2bf4-0310-a012-ef5d649a1542	18 years ago
rramthun	38c4248814	Some language updates Removes the ; behind Slovenky in language list git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2430 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	1ce3c22761	better memory control: - added memory monitor for preNURL-db in performanceMemory - changed default memory assignments git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2427 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	eee44be602	*) adding an interface for customized blacklist classes - now it's possible to use a customized blacklist engine instead of the default one - this can be done by configuring the property BlackLists.class See: http://www.yacy-forum.de/viewtopic.php?t=2108 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2397 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	d2e8e76218	*) now it's possible to configure the yacy blacklist separately for dht, search, proxy, crawler See: http://www.yacy-forum.de/viewtopic.php?t=2541 http://www.yacy-forum.de/viewtopic.php?p=24516 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2389 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
rramthun	23a99b8283	Small changes to the language git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2383 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	314021453f	* more logging * option in yacy.init to set useCollectionIndex usage git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2374 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	61b151b083	* added another auto-fix for collection index inconsistency check * fixed words size computation for collection index git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2368 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	c7b6389ca1	*) renaming indexDistribution.dhtReceiptLimitEnabled property to indexDistribution.transferRWIReceiptLimitEnabled so that the default value is taken over by all peers git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2356 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	0baadcadca	*) enable indexDistribution.dhtReceiptLimitEnabled limit per default See: http://www.yacy-forum.de/viewtopic.php?p=24425 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2355 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	279b1d969d	Integrated new indexing data structure 'collections' into the main class for indexing, the plasmaWordIndex. The new data structure is ready-to-use, but currently disabled. It can be activated by setting the static plasmaWordIndex.useCollectionIndex to true. This shall be done for testing purpose. The new index is stored to DATA/INDEX/PUBLIC/TEXT The directory PLASMA shall be used only for crawler in the future. Attention: during testing the data structure in INDEX may change, and created indexes with the new data structure may get useless. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2348 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	6e676224d0	*) adding support for upnp A new port forwarding method for upnp was added. If this method is enabled, yacy automatically determines an UPnP capable internet gateway and configures the gateway port forwarding settings properly. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2328 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
allo	7b0e2521bb	Support for a supertemplate, which can do everything a normal template can do. It's a layer under the servlets; this means #[page]# will be replaced by servlet code, the rest can be set by you. (TODO: if we use this for layout, we need to read "TITLE" from the servlet's tp, to set it outside of the servlet.) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2302 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
rramthun	b5ec7de936	Correction to last commit + spelling git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2296 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
rramthun	ca33eaa442	- Some spelling - Removed unused init value - Set default upload value to "none", which avoids a warning on new installations saying that upload method '' is unknown git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2295 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	92f4cb4d73	added option to configure the start-up delay time for kelondro database files. the start-up delay is used to pre-load the database node cache git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2276 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	b594ee9a5a	*) Adding possibility to configure if the http proxy should send the X-forwarded-for header (requested by TeeSee) See: http://www.yacy-forum.de/viewtopic.php?t=2577 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2257 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	b20496e42b	*) make DHT DoS check configurable (requested by KoH) - check can be disabled via property indexDistribution.dhtReceiptLimitEnabled - upper bound can be configured via indexDistribution.dhtReceiptLimit git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2234 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	9937730560	added localization for Slovenky/Slovenian language, provided by Rostislav Svoboda git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2148 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	24686e50a2	- fixed a caching bug - added counter for cache delete to distinguish between flush and delete - changed some default parameters for cache size settings git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2143 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	a520ab2e8c	*) adding possibility to use an existing PKCS12 certificate for https instead of creating a new one. Notes: This import is done automatically on startup if the following properties are set in the config file: pkcs12ImportFile = pkcs12ImportPwd = git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2139 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	df068cf23c	*) adding first version of native SSL support for yacy VERY EXPERIMENTAL! See: http://www.yacy-forum.de/viewtopic.php?p=18516 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2096 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	0c2cbc572b	reverted a change to idle/busy time default setting. There was a misunderstanding of the meaning of these values: this is not the time that the process may take, instead it is the time that the process pauses after each loop. increased the busysleep time pause from 2 seconds to 10 seconds. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2094 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
rramthun	fb064fdbee	Many additions/corrections to the German language file Increased default time for DHT distribution, because many people complained about the high load on their systems, see http://www.yacy-forum.de/viewtopic.php?p=20922#20922 Avoided problems with some browsers and ampersands, see http://www.htmlhelp.com/tools/validator/problems.html#amp Removed nearly invisible "bug" in menu git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2087 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	461548698c	configuration of index transfer chunk size see http://www.yacy-forum.de/viewtopic.php?p=20951#20951 new properties in yacy.init: indexDistribution.minChunkSize = 5 indexDistribution.maxChunkSize = 1000 indexDistribution.startChunkSize = 50 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2073 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	c67125d8d3	better default values for queue times to prevent (D)DoS-similar situations git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2062 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	fe4ad214f1	configuration of time-out for fetching seed-lists during bootstrapping (because at LinuxTag this fails to work only on my notebook) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2054 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	93eb4f14e6	release 0.45 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2047 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	00e768b259	added Picture button to search results git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2011 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	ac114d69c0	tried to fix some problems with time-outs during search git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1994 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	63f39ac7b5	added 3 new crawling steering options: - re-crawl by age of page (enter in minutes) - auto-domain-filter - maximum number of pages per domain NOT YET TESTED! git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1949 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
allo	59b9540eb6	save the current Skin git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1912 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
hydrox	8da13088e9	*)removed multiple DHT_Distribution_Threads *)boosted DHT_Distribution sending chunk parallel to multiple peers git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1890 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	bcd99fe83e	introduced a second RAM cache for DHT transfer git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1880 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	5ee0125046	*) adding possibility to configure the server port for seed uploading via scp. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1861 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	f0a38873eb	* added yacysearch page with better view on search results the old search page is obsolete and will be removed * ConfigBasic.html is now the default page instead of index.html as long as no password is set git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1815 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	f0041d504d	remove of several results from a single domain is stopped if the result set is smaller than the wanted number of results git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1811 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
rramthun	c2f8318b4f	Ok, the 08.03 commit ;-) Added nice graphic for the 1-2-3-interface. Used one graphic less (check.png-->ok.png). Saves disk/download-space. Updated italian translation. Deleted my old version of the changelog as we have a new one. Many corrections to the spelling. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1791 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
auron_x	8c6f38fe70	*) added Blog to YaCy (atm not reachable through interface) -> Blog.html git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1790 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	87e90b9d8c	refinements in ram cache flush procedure and default timing git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1768 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	fbbbf5f411	*) remote trigger for proxy-crawl - remote crawling can now be enabled for the proxy crawling profile See: http://www.yacy-forum.de/viewtopic.php?p=17753#17753 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1758 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	2336f0f013	*) allow pausing/resuming of crawlJob Threads separately - pausing/resuming localCrawls - pausing/resuming remoteTriggeredCrawls - pausing/resuming globalCrawlTrigger See: http://www.yacy-forum.de/viewtopic.php?t=1591 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1723 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
hydrox	e2af2a3f45	*) it's now possible to run more than one indexDistribution-Thread git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1673 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	8fcb25f9f9	*) Setting via header according to rfc - can be disabled via settings dialog git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1662 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
allo	40199cea1f	migration with svn Numbers git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1623 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
allo	7bd61ab0e5	Locales will now be in DATA/HTDOCS. So it works with readonly htroot. git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1527 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
borg-0300	c5b6154136	added CRDistOn = true/false git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1372 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
borg-0300	f1643228f5	more media extension (7z,asx,db,lx,lxl,scr,tbz,war) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1244 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
borg-0300	395626dbe3	new media extension (dcm) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1242 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	5e2673e86c	new version number to check performance enhancements git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1207 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	6b1a49ea23	fix for last commit git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1181 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	44fa94ac52	*) Modifications for dbImport functionality - dbImporter threads are now shutdown by the switchboard on server shutdown - adding possibility to pause an importer thread via GUI - Bugfix for abort function See: http://www.yacy-forum.de/viewtopic.php?p=13363#13363 *) Modification of content parser configuration - now it's possible to configure which parsers should be enabled for the proxy, crawler, icap, etc. separately - *) htmlFilterContentScraper.java - adding regular expression to normalize URLs containing /../ and /./ parts *) httpc.java - adding functionality to unzip gzipped content - requested by roland: should be used later to allow gzipped seed lists git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1170 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	7e670894d9	*) Suppressing stackTraces in proxyError message for "connect timed out" errors See: http://www.yacy-forum.de/viewtopic.php?t=1504 *) Increasing default http client timeout git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1129 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
allo	4776f3f815	squid like redirectors git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1120 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	7ad4353fc6	fix for last commit git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1111 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	0ec54d9c5f	enhanced CR-file handling and added first RCI-evaluation tests git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1110 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	d0dfccdb77	*) Making CrawlStacker pool configurable via GUI and config file See: http://www.yacy-forum.de/viewtopic.php?t=1448 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1087 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	86a9210264	*) indexing queue slots are now configurable via config file See: http://www.yacy-forum.de/viewtopic.php?t=1480 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1081 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	fd58d5f8e6	*) Adding possibility to specify the interface / IP-Address where YaCy should bind to. - e.g. Port = 192.168.0.1:8080 Port = #eth0:8080 Port = 8080 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1071 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	79818a320f	introduced citation-rank transmission protocol and activate transport for anonymisation git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1055 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	d2731418bf	added creation of global ranking files and changed url normal form usage git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1046 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	fb766413d1	*) Changes on httpc dns caching - Bugfix: old dns cache did not handle case insensitive hostnames correctly. - adding a possibility to set domain name patterns defining hostnames that should not be cached by the httpc dns cache e.g. borg-300.dyndns.org This can be done by setting the new httpc.nameCacheNoCachingPatterns property - using httpc.dnsResolve wherever possible within the sourcecode [httpd.java,plasmaCrawlStacker.java] git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1044 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
hydrox	295aff52a3	*)added offline-browsing-support (onlineMode=0) *)online-mode now can be changed in Status.html git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1010 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	5b0911d7ea	added new performance menu for search sequence configuration and monitoring git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@990 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	02d9af1a70	*) Restructuring and extending of Remote Proxy Support - remote proxy configuration can now be "really" changed on the fly and takes effect immediately - adding possibility to disable remote proxy usage for yacy->yacy communication - adding possibility to disable remote proxy usage for ssl - restructuring proxy configuration so that it is stored in a single place now *) Adding possibility to import a foreign word DB (or even more of them in parallel) at runtime into the peers DB - this can be done by calling IndexImport_p.html - ATTENTION: please note that at the moment this thread must be aborted via gui before a normal server shutdown is done. - TODO: integrating IndexImport Thread into normal server shutdown - TODO: Adding possibility to import crawl-queues, etc. from foreign peers - TODO: removing old import function from yacy.java and calling the new routines instead git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@968 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
allo	43a127ff3a	allow httpsTunnels to other Ports than 443. (if secureHttps=false) git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@940 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
allo	1eb95176b6	-t not needed git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@930 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	461374e175	*) Restricting amount of files that yacy is allowed to open during index transfer/distribution This option is configurable via config file and is set per default to 800 See: http://www.yacy-forum.de/viewtopic.php?p=11137#11137 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@918 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	839db8869c	added high/low priority for index adding git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@899 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	c83594528c	integrated crawl stacker into thread control git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@887 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	959eefbc4f	*) Robots.txt parser/ppt cutting of comments at the line end *) Adding Threadpool for stackCrawl Thread to speedup robots.txt download and double url checks git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@882 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
allo	f65c939a60	userDB Auth git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@874 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	a2fa75e688	*) Asynchronous queuing of crawl job URLs (stackCrawl) various checks like the blacklist check or the robots.txt disallow check are now done by a separate thread to unburden the indexer thread(s) TODO: maybe we have to introduce a threadpool here if it turns out that this single thread is a bottleneck because of the time consuming robots.txt downloads *) improved index transfer The index selection and transmission is done in parallel now to improve index transfer performance. TODO: maybe we could speed up performance by using multiple transmission threads in parallel instead of only a single one. *) gzip encoded post requests it is now configurable if a gzip encoded post request should be sent on index transfer/distribution *) storage Peer (very experimental and not optimized yet) Now it's possible to send the result of the yacy indexer thread to a remote peer instead of storing the indexed words locally. This could be done by setting the property "storagePeerHash" in the yacy config file - Please note that if the index transfer fails, the index is stored locally. - TODO: currently this index transfer is done by the indexer thread. To speed up the indexer a) this transmission should be done in parallel and b) multiple chunks should be bundled and transferred together *) general performance improvements - better memory cleanup after http request processing has finished - replacing some string concatenations with stringBuffers - replacing BufferedInputStreams with serverByteBuffer - replacing vectors with arraylists wherever possible - replacing hashtables with hashmaps wherever possible This was done because function calls to vector or hashtable functions take 3 times longer than calls to functions of arraylists or hashmaps. TODO: we should take a look at the class serverObject which is inherited from hashmap Do we really need a synchronization for this class? TODO: replace arraylists with linkedLists if random access to the list elements is not needed *) Robots Parser supports if-modified-since downloads now If the downloaded robots.txt file is older than 7 days the robots parser tries to download the robots.txt with the if-modified-since header to avoid unnecessary downloads if the file was not changed. Additionally the ETag header is used to detect changes. *) Crawler: better handling of unsupported mimeTypes + FileExtension *) Bugfix: plasmaWordIndexEntity was not closed correctly in - query.java - plasmaswitchboard.java *) function minimizeUrlDB added to yacy.java this function tests the current urlHashDB for unused urls ATTENTION: please don't use this function at the moment because it causes the wordIndexDB to flush all words into the word directory! git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@853 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	3dd7e90cdd	kbytes instead of bytes in performance settings; new default values git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@808 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	3fcc95a82c	integrated crawl-profiles db in memory-performance monitor git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@788 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	fe6a6abc0b	*) Adding robots.txt db to Performance Settings for Memory menu git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@785 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	e6b9b23290	configuration of startup-memory in webinterface git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@771 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	96a5b6e8fb	removed yacy peer types from serverSwitch git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@758 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
orbiter	fb52a82008	added new performance page for memory settings git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@751 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	a6a8af0f04	*) httpdFileHandler templateCache can now be disabled git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@708 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
allo	286442fbc5	do not Use YaCy-Sites as Referer, if useYacyReferer = false http://www.yacy-forum.de/viewtopic.php?p=8896#8896 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@637 6c8d7289-2bf4-0310-a012-ef5d649a1542	19 years ago
theli	b70de495a0	*) Remembering Crawler-isPaused setting git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@586 6c8d7289-2bf4-0310-a012-ef5d649a1542	20 years ago
theli	17be77a468	*) Bugfix for "Crawler data will not be removed from htcache if content parsing failed" See: http://www.yacy-forum.de/viewtopic.php?t=965&highlight=ramdisk *) Making ACCEPT_LANGUAGE configurable for crawler See: http://www.yacy-forum.de/viewtopic.php?p=8327 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@583 6c8d7289-2bf4-0310-a012-ef5d649a1542	20 years ago
orbiter	8d6c288f04	display of peer name in headline; see http://www.yacy-forum.de/viewtopic.php?p=7466#7466 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@535 6c8d7289-2bf4-0310-a012-ef5d649a1542	20 years ago
orbiter	f5259f29e8	word cache behaviour fix and other fixes git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@519 6c8d7289-2bf4-0310-a012-ef5d649a1542	20 years ago
allo	38e65b5a55	more mediaexts git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@518 6c8d7289-2bf4-0310-a012-ef5d649a1542	20 years ago
theli	865b9490a2	*) Making DHT Transfer while Crawling configurable See: http://www.yacy-forum.de/viewtopic.php?p=6904 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@496 6c8d7289-2bf4-0310-a012-ef5d649a1542	20 years ago
theli	470839a16a	*) Crawler/Session pool settings will now be stored properly into configfile Bugfix for: - http://www.yacy-forum.de/viewtopic.php?t=502 - http://www.yacy-forum.de/viewtopic.php?t=778 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@477 6c8d7289-2bf4-0310-a012-ef5d649a1542	20 years ago


189 Commits (52cb3208d09a5449f9584968d84e1c7ffc67fdac)