yacy_search_server

Commit Graph

Author	SHA1	Message	Date
Michael Peter Christen	30276a2b48	prevent a local Solr search and a local RWI search from running concurrently. When an RWI search result is flushed into the result set, it runs Solr queries (which replaced the old-style metadata queries), and these may run concurrently with a previously started Solr search. Both methods may block each other with IO. To enhance the speed, they are now serialized. Because the Solr search may produce better results using the more advanced and configurable ranking methods, its result is preferred over the RWI search result. However, remote RWI search results are still fed concurrently into the search result as well.	10 years ago
Michael Peter Christen	84763126e0	added option to make the YaCy proxy act as if the cache is never stale. If set to 'Always Fresh', the cache is always used if the entry exists in the cache. This is a good way to archive web content and access it without going online again, provided the documents exist. To do so, open /Settings_p.html?page=ProxyAccess and check the "Always Fresh" checkbox. This is set to false by default, which behaves as before. If you set this to true, then you have your web archive in DATA/HTCACHE. Copy this to carry around your private copy of the internet!	10 years ago
reger	1e7ee72240	fix path lookup to ./defaults/yacy.badwords (fix of commit `ee277b9b3e`)	10 years ago
reger	7d863d6254	fix empty text facet entry (noticed on Author facet)	10 years ago
Michael Peter Christen	a39419f2ef	more stacks shall be considered for on-demand loading, not only deep-depth stacks to prevent "too many open files" problem	10 years ago
Michael Peter Christen	5bb52f79be	reduce number of calls to queue.size() because that may be a bottleneck during crawling	10 years ago
Michael Peter Christen	4920ab7b76	optimize usage of size() cache	10 years ago
reger	ee277b9b3e	allow for local yacy.stopwords and yacy.badwords list (in DATA/SETTINGS/): if the file exists in DATA/SETTINGS it is loaded, otherwise the file in ./defaults is loaded (if locale ./defaults/stopwords.xx doesn't exist, take solr/lang/stopwords_xx.txt as default); move yacy.stopwords, yacy.stopwords.de and yacy.badwords.example out of root directory to ./defaults directory	10 years ago
reger	de56266bcb	remove redundant toLower for topwords	10 years ago
Michael Peter Christen	a34f837592	better delete all files in path when removing host crawl stack	10 years ago
Michael Peter Christen	10b1db430a	if we have many hosts, use on-demand earlier	10 years ago
Michael Peter Christen	1324927e66	prevent division by zero	10 years ago
Michael Peter Christen	2beb6abeb6	disabled crazy sleep loop	10 years ago
Michael Peter Christen	70f03f7c8e	do not cache search requests to Solr if the result is used for doublechecking. If a double-check comes from cached results the doublecheck fails.	10 years ago
Michael Peter Christen	a0b84e4def	use a LinkedHashMap for facets to maintain facet order as given by solr	10 years ago
reger	ef5dc68313	include domtype in the searcheventcache id to differentiate between local / global events for reuse of cached events; fix for http://mantis.tokeek.de/view.php?id=493	10 years ago
Michael Peter Christen	0dc6e0a5f2	added option to enrich vocabularies with synonyms from synonym database	10 years ago
Michael Peter Christen	6a2a669db4	added loading of the synonyms file from addon/synonyms into the knowledge loader	10 years ago
Michael Peter Christen	c67c5c0709	added new solr schema fields which record the occurrences of vocabulary matches. These matches can be used for result boosting, i.e. if a document contains words from a specific vocabulary, boost it.	10 years ago
Michael Peter Christen	a67a465415	fix field counter for multi-fields in html writer for the solr servlet	10 years ago
Michael Peter Christen	ec9d021568	added option in vocabulary editor to import CSV files with different encodings (preselected windows-type character encoding which is typical for CSV files). Fixed also other problems with character encoding in dictionary files. Automatically generated vocabularies are now also noted in the API steering.	10 years ago
reger	3c818fc912	add a check of java version string >=1.7 to startup class stopping start with error msg on version < 1.7	10 years ago
Michael Peter Christen	0550b54d56	added fix to postprocessing: avoid caching of postprocessing collection to always get fresh lists of documents. This is necessary since the postprocessing changes the same documents which the postprocessing-collection query selects.	10 years ago
Michael Peter Christen	68e8039fd1	added high-precision scheduler for API processes. This also allows making the execution dependent on available RAM or CPU load. The default value for CPU load is 4.0 and the check runs once a minute.	10 years ago
Michael Peter Christen	8aee7f940e	added missing class for latest changes	10 years ago
Michael Peter Christen	97039049e4	fix in key enumeration methods for cases where the enumeration is done in reverse order.	10 years ago
Michael Peter Christen	7e1b0b6712	fix for wildcard patch in search queries	10 years ago
Michael Peter Christen	0a879c98e7	added new 'firstSeen' database table and necessary data structures which hold a date for each URL to record when a url was first seen. This is then used to overwrite the modification date for urls upon recrawl in case the first-seen date is before the latest document date. This behaviour is necessary because content management systems commonly attach the current date to all documents. Using the firstSeen database it is possible to approximate a real first document creation date, provided the crawler starts frequently for the same domain. As a result, search results ordered by date have a much better quality, and so does the usage of YaCy as a search agent for latest news.	10 years ago
Michael Peter Christen	421ee64f33	another fix to ordering of table indexes; fixes also network stats graphics	10 years ago
Michael Peter Christen	1db476c67e	fix for bad table iteration	10 years ago
reger	e4316e2d74	skip creation of local var in proxyhandler.storetocache	10 years ago
sixcooler	9c6e3a6b1c	fix assertion failure in version string for Solr-4.10.2 by changing the assert (hope that is ok) + add forgotten NB project changes	10 years ago
sixcooler	725b206fb4	update to solr-/lucene-4.10.2	10 years ago
Michael Peter Christen	5c97ecb30f	fix of bad query generation for search facets	10 years ago
Michael Peter Christen	95d87f00b3	fix for bad query generation in doublecheck in postprocessing	10 years ago
orbiter	72c2bc5189	fix for search in case where local peer has no local seed address in portal mode	10 years ago
orbiter	5be352da99	Merge branch 'master' of git@gitorious.org:yacy/rc1.git	10 years ago
orbiter	0fcd8097a3	removed unused options from BusyThreads	10 years ago
Michael Peter Christen	fe8b1d137d	emergency bugfix for 100% CPU in image drawing	10 years ago
Michael Peter Christen	92007e5d2d	more enhancements to postprocessing speed	10 years ago
Michael Peter Christen	9a7fe9e0d1	fix for bad timing computation in postprocessing	10 years ago
Michael Peter Christen	bd16119a00	another fix for postprocessing (the query for "" on numeric field did not work in external solr)	10 years ago
Michael Peter Christen	327e83bfe7	more fixes in postprocessing: partitioning of the complete queue to enable smaller queries	10 years ago
orbiter	2bc6199408	more concurrency for postprocessing	10 years ago
orbiter	a83cf26c38	more fixes and enhancements to postprocessing	10 years ago
orbiter	71758f0d62	enhanced postprocessing by usage of a field-list generation to prevent lazy initialization of the documents. This is useful because the documents must be read completely anyway.	10 years ago
orbiter	7856fbdbe8	fix for npe (in rare cases)	10 years ago
orbiter	8a2b569d7c	fix for literal computation	10 years ago
orbiter	856da2712b	Merge branch 'master' of git@gitorious.org:yacy/rc1.git	10 years ago
orbiter	ca9cd7b58a	more IPv6 fixes	10 years ago
Michael Peter Christen	b4585e9546	added new index size history image in /Status.html page	10 years ago
Michael Peter Christen	167c5a51f0	IPv6 fix	10 years ago
Michael Peter Christen	fe537679de	fix for exact_signature_unique_b, exact_signature_copycount_i, fuzzy_signature_unique_b and fuzzy_signature_copycount_i: apply same criteria for 'valid document' as for title and description uniqueness test.	10 years ago
sixcooler	eb9d2705d2	fix for ConnectionInfo.cleanup of server-connections	10 years ago
Michael Peter Christen	2e5214eb21	added field postprocessing.partialUpdate to settings which can be used to switch on or off partial updates. Both options should cause the same result. Default is on.	10 years ago
Michael Peter Christen	11074d8d24	fix for an SSL bug that appears only in Java 7. The bug was reported in http://forum.yacy-websuche.de/viewtopic.php?f=23&t=5407&p=30956#p30956 and a solution was described in http://teknosrc.com/javax-net-ssl-sslprotocolexception-handshake-alert-unrecognized_name-solved/ which worked for the example given in the yacy forum	10 years ago
Michael Peter Christen	e96490e3a1	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	10 years ago
Michael Peter Christen	77662e08e1	concurrently initialize the error cache; extended also the cache by factor 10 up to 1000 entries. This error cache is only used to catch up paused crawls between shutdown+startup	10 years ago
sixcooler	d8fcc4a2f5	added a timeout on Jetty connectors	10 years ago
Michael Peter Christen	0f0b60404b	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	10 years ago
sixcooler	72561926aa	do not overwrite yacy.conf in case of an exception; may be a fix for http://mantis.tokeek.de/view.php?id=180	10 years ago
Michael Peter Christen	07c5b57953	removed warnings	10 years ago
orbiter	fa2ad101ec	enhanced graphics computation (avoiding long string parsing for colours)	10 years ago
orbiter	ef813cec91	added proper copyright notice to OSM tiles presented at the search result page	10 years ago
Michael Peter Christen	fca11701f0	better profiling of solr queries	10 years ago
Michael Peter Christen	2e09da9832	npe fix	10 years ago
Michael Peter Christen	d80418f1b1	added partial updates to solr during postprocessing: during postprocessing the solr documents are now not completely retrieved; instead, only the fields needed for the postprocessing are extracted. When Solr documents are written, this is done using partial updates. This increases postprocessing speed by about 50% for embedded Solr configurations. For external Solr configurations the enhancement should be much higher because the postprocessing with remote Solr is very slow. When doing partial updates to a remote Solr, this method should perform much better than before; the gain is expected to be even higher than with local Solr.	10 years ago
Michael Peter Christen	b1cfbc4a04	added new solr field url_paths_count_i which can be used to enhance the index browser and maybe also for ranking; possibly also for SEO-with-YaCy applications.	10 years ago
Michael Peter Christen	e69883d5ab	fix-fix for `30d4402cd1`	10 years ago
Michael Peter Christen	30d4402cd1	fixed location search	10 years ago
Michael Peter Christen	6983dff334	explain crawl denial when not switched to intranet mode	10 years ago
Michael Peter Christen	f818f84adb	more ipv6 fixes	10 years ago
Michael Peter Christen	afd5bd5f5f	slightly enhanced Network table computation by using a lazy initialized bitfield for peer flags	10 years ago
Michael Peter Christen	2c2b50e65d	refactoring (class name should start with uppercase letter)	10 years ago
Michael Peter Christen	bc275dca07	added network history graph image /NetworkHistory.png which can show many different statistics about the history of the peer.	10 years ago
Marc Nause	ce9368246b	Merge branch 'master' of gitorious.org:yacy/rc1	10 years ago
Marc Nause	5603809deb	Minor changes: reduced visibility of a method; updated comments	10 years ago
Michael Peter Christen	d8beafba3a	fix for values in CrawlProfileEditor table and xml; now the full profile is available in the xml.	10 years ago
Michael Peter Christen	ec95dfa2e6	fixed crawl profile xml result which did not show the correct crawl status.	10 years ago
Michael Peter Christen	8c1a89cb34	added another decoration flag to switch off network graphics in crawler monitor and index browser: decoration.grafics.linkstructure. Please set this to false to remove the graphics from the interface.	10 years ago
Michael Peter Christen	ee27be3399	misc bugfixes (concurrency, memory protection)	10 years ago
Michael Peter Christen	9b1958e8ca	more ipv6 bugfixes	10 years ago
Michael Peter Christen	7817fc50c9	added a high cpu cycle monitor to PerformanceQueues	10 years ago
Michael Peter Christen	5082feb103	less volume for effect sounds	10 years ago
Michael Peter Christen	e8392e2ff2	fix for local search	10 years ago
Michael Peter Christen	0bfc69b29b	more ipv6 bugfixes	10 years ago
Michael Peter Christen	a27563e5c3	removed the atmo sound clips because they had been too large	10 years ago
Michael Peter Christen	883622306e	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git Conflicts: source/net/yacy/peers/Protocol.java	10 years ago
Michael Peter Christen	97995a1dd9	fix for remote search process	10 years ago
Michael Peter Christen	0843b12ef3	ipv6 fix: avoid that the shrunk own ip set is overwritten with (non-valid) set of local IPs	10 years ago
Michael Peter Christen	92c5d97486	fix for bad node flag setting with IPv6	10 years ago
orbiter	c27bad9326	more ipv6 fixes	10 years ago
orbiter	cddf884bc4	Merge branch 'master' of git@gitorious.org:yacy/rc1.git	10 years ago
Michael Peter Christen	460858fb22	more ipv6 fixes	10 years ago
Michael Peter Christen	5cef88a315	argh.. adding missing java class for latest audio feature	10 years ago
Michael Peter Christen	74957f3760	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	10 years ago
Michael Peter Christen	2a052f446a	Added an experimental audio feedback system. This is the first element of a new 'decoration' component which may hold switches for different external appearance parameters. The first switch in that context is decoration.audio (as usual in yacy.init). This value is set to false by default, which means the audio feedback element is switched off by default. To switch it on, set decoration.audio = true (using /ConfigProperties_p.html). You will then hear sounds for the following events: remote searches, incoming dht transmissions, new documents from the crawler. Sound clips are stored in htroot/env/soundclips/ because a future implementation will read these files using the http client with configurable urls, which will make it very easy for the user to replace the given sounds with their own.	10 years ago
Marc Nause	1e6e69bc40	Finished implementation of UPNP: will try other ports if YaCy standard ports are not available; distinguishes between internal and external port (not sure if this works 100%). Still to add: property in config to enter own external port (in case of manually configured NAT)	10 years ago
Michael Peter Christen	d0358e568b	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	10 years ago
Michael Peter Christen	e1bc768f9d	more IPv6 bugfixes	10 years ago


3077 Commits (ab6cc3c88c29e551fc1f05e728e9d56e5a600735)