yacy_search_server

Commit Graph

Author	SHA1	Message	Date
Michael Peter Christen	181911376c	showing list of all thread in threaddump using the ThreadMXBean counter (this obviously show more threads than before?)	10 years ago
Michael Peter Christen	64887f6b21	show number of threads on status page	10 years ago
Michael Peter Christen	6f0167fac1	get cloned crawl start parameter for snapshots	10 years ago
Michael Peter Christen	97f6089a41	YaCy can now create web page snapshots as pdf documents which can later be transcoded into jpg for image previews. To create such pdfs you must do: Add wkhtmltopdf and imagemagick to your OS, which you can do: On a Mac download wkhtmltox-0.12.1_osx-cocoa-x86-64.pkg from http://wkhtmltopdf.org/downloads.html and downloadh ttp://cactuslab.com/imagemagick/assets/ImageMagick-6.8.9-9.pkg.zip In Debian do "apt-get install wkhtmltopdf imagemagick" Then check in /Settings_p.html?page=ProxyAccess: "Transparent Proxy" and "Always Fresh" - this is used by wkhtmltopdf to fetch web pages using the YaCy proxy. Using "Always Fresh" it is possible to get all pages from the proxy cache. Finally, you will see a new option when starting an expert web crawl. You can set a maximum depth for crawling which should cause a pdf generation. The resulting pdfs are then available in DATA/HTCACHE/SNAPSHOTS/<host>.<port>/<depth>/<shard>/<urlhash>.<date>.pdf	10 years ago
Michael Peter Christen	41d00350e4	moved network configuration to Use Case submenu; this is necessary because the definiton of portal peers within the YaCy freeworld network is otherwise splitted into two different main menus.	10 years ago
reger	221f86dd5e	position api icon (ViewFile.html)	10 years ago
Michael Peter Christen	ad0da5f246	added new web page snapshot infrastructure which will lead to the ability to have web page previews in the search results. (This is a stub, no function available with this yet...)	10 years ago
reger	c475be2937	fix (enable) error msg on empty query	10 years ago
reger	f709132961	remove obsolete alternate link fix api link	10 years ago
Michael Peter Christen	3c71e1c872	show vocabularies in search result (in case of debugging)	10 years ago
Michael Peter Christen	2fce2e2697	larger boost fields for ranking	10 years ago
Michael Peter Christen	6c03ff8355	bold words in snippets should not be coloured black in the base style because there are styles with dark backgrounds which make the bold word invisible	10 years ago
Michael Peter Christen	c0f9f6ac66	added option to change the navbar-default, i.e. usable for dark skins	10 years ago
Michael Peter Christen	84763126e0	added option to make the YaCy proxy act as the cache is never stale. If set to 'Always Fresh' the cache is always used if the entry in the cache exist. This is a good way to archive web content and access it without going online again in case the documents exist. To do so, open /Settings_p.html?page=ProxyAccess and check the "Always Fresh" checkbox. This is set do false which behave as set before. If you set this to true, then you have your web archive in DATA/HTCACHE. Copy this to carry around your private copy of the internet!	10 years ago
Michael Peter Christen	5bb52f79be	reduce number of calls to queue.size() because that may be a bottleneck during crawling	10 years ago
Michael Peter Christen	092d97d7ac	when importing vocabulary csv files, accept also files without semicolon and truncate quotes from literals	10 years ago
Michael Peter Christen	ee9ec40048	added hints to ranking to make ranking boosts using vocabularies easier	10 years ago
Michael Peter Christen	70f03f7c8e	do not cache search requests to Solr if the result is used for doublechecking. If a double-check comes from cached results the doublecheck fails.	10 years ago
Michael Peter Christen	a0b84e4def	use a LinkedHashMap for factes to maintain facet order as given by solr	10 years ago
Michael Peter Christen	0dc6e0a5f2	added option to enrich vocabularies with synonyms from synonym database	10 years ago
Michael Peter Christen	6a2a669db4	added loading of the synonyms file from addon/synonyms into the knowledge loader	10 years ago
Michael Peter Christen	fdba8e2fa0	fix for 2-day network stats table: showing 48 instead of 24 hours from peer history	10 years ago
Michael Peter Christen	ec9d021568	added option in vocabulary editor to import CSV files with different encodings (preselected windows-type character encoding which is typical for CSV files). Fixed also other problems with character encoding in dictionary files. Automatically generated vocabularies are now also noted in the API steering.	10 years ago
reger	b558433211	adjust tag cloud font size calculation to limit max font size to ~ TOPWORDS_MAXSIZE	10 years ago
Michael Peter Christen	0550b54d56	added fix to postprocessing: avoid caching of postprocessing collection to always get fresh lists of documents. This is necessary since the postprocessing changes the same documents which the postprocessing-collection query selects.	10 years ago
Michael Peter Christen	68e8039fd1	added high-precision scheduler for API processes. This allows also to make the execution in dependency of available RAM or CPU load. The default value for CPU load is 4.0 and the check runs once a minute.	10 years ago
Michael Peter Christen	0a879c98e7	added new 'firstSeen' database table and necessary data structures which hold a date for each URL to record when a url was first seen. This is then used to overwrite the modification date for urls upon recrawl in case that the first-seen date is before the latest document date. This behaviour is necessary due to the common behaviour of content management systems which attach always the current date to all documents. Using the firstSeen database it is possible to approximate a real first document creation date in case that the crawler starts frequently for the same domain. As a result the search results ordered by date have a much better quality and the usage of YaCy as search agent for latest news has a better quality.	10 years ago
Michael Peter Christen	487a733c99	fix for catchall handling in search	10 years ago
sixcooler	33b0234454	added a input-field for setting 'fileHost' Set this to avoid error-messages like 'proxy use not allowed / granted' on accessing your Peer by its hostname.	10 years ago
Michael Peter Christen	1db476c67e	fix for bad table iteration	10 years ago
Michael Peter Christen	e05b7332b9	html fix	10 years ago
reger	c1ad265efd	remove not used accordion javascript call for facet navs	10 years ago
Michael Peter Christen	ecdfb35f09	added long variables to debug output in index browser	10 years ago
Michael Peter Christen	95d87f00b3	fix for bad query generation in doublecheck in postprocessing	10 years ago
orbiter	a2b5cfb3cf	added reverse button to tables, by default on now (to see latest entries first)	10 years ago
orbiter	fceac5d2d4	added (missing) Tables_p.xml for table xml api	10 years ago
orbiter	dbafd4865e	enhanced debug code in host browser	11 years ago
Michael Peter Christen	8f6587e87b	fix for broken protocol navigation	11 years ago
Michael Peter Christen	5c962dd009	better scaling of network statistic graphs	11 years ago
orbiter	3ffe19b85c	replaced old /api/table_p.xml servlet with /Tables_p.xml to avoid double code	11 years ago
Michael Peter Christen	b4585e9546	added new index size history image in /Status.html page	11 years ago
Michael Peter Christen	9aebbbebc0	added network history in /Network.html?page=5	11 years ago
Michael Peter Christen	26279b0993	added debug code for statistics about document attributes related to domains	11 years ago
reger	d65e3f2b53	RankingSolr: display only available or configured boost fields	11 years ago
Michael Peter Christen	4e56d79fc8	replaced input text field with text field for index deletion with query and replaced GET with POST method. This should make it possible to tubmit here very large queries for deletion.	11 years ago
orbiter	6f707b4305	removed spaces in seedlist.xml to reduce data	11 years ago
orbiter	78c9d31388	fix for bad json	11 years ago
Michael Peter Christen	8098a86f1d	ipv6 fix for api /yacy/seedlist.[json\|xml], multiple IPs are now attached to the seed info. API clients must be adopted. Documentation will be fixed in http://www.yacy-websuche.de/wiki/index.php/Dev:APIseedlist Also added a new retrieval option for seeds, they can now be retrieved by their name with the get parameter name=<name>	11 years ago
Michael Peter Christen	07c5b57953	removed warnings	11 years ago
Michael Peter Christen	509eba2484	automatically zoom to location/POI	11 years ago
orbiter	fa2ad101ec	enhanced graphics computation (avoiding long string parsing for colours)	11 years ago
orbiter	ef813cec91	added proper copyright notice to OSM tiles presented at the search result page	11 years ago
Michael Peter Christen	1269e77dfa	enhanced location search	11 years ago
Michael Peter Christen	75b5f24be4	make browsing of file://z: - paths in index browser easier - this will now show the root paths on a shared drive	11 years ago
Michael Peter Christen	8ac3e9f890	fix for api icon in yacysearch_location.html	11 years ago
Michael Peter Christen	a1dd0ae62c	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	11 years ago
reger	f5967dfedf	add filter to citation page and a on/off button to display only sentences with citations, while maintaining the sentence number. Make the filtered list the default in search result citation link	11 years ago
Michael Peter Christen	f818f84adb	more ipv6 fixes	11 years ago
Michael Peter Christen	2c2b50e65d	refactoring (class name should start with uppercase letter)	11 years ago
Michael Peter Christen	14385057c2	added also the NetworkHistory servlet...	11 years ago
Michael Peter Christen	d8beafba3a	fix for values in CrawlProfileEditor table and xml; now the full profile is available in the xml.	11 years ago
Michael Peter Christen	ec95dfa2e6	fixed crawl profile xml result which did not show the correct crawl status.	11 years ago
Michael Peter Christen	8c1a89cb34	added another decoration flag to switch off network graphics in crawler monitor and index browser: decoration.grafics.linkstructure Please set this to false to remove the graphics from the interface.	11 years ago
Michael Peter Christen	764e4ed673	fixed appearance of RSS icon on search result page	11 years ago
Michael Peter Christen	9b1958e8ca	more ipv6 bugfixes	11 years ago
Michael Peter Christen	7817fc50c9	added a high cpu cycle monitor to PerformanceQueues	11 years ago
Michael Peter Christen	5082feb103	less volume for effect sounds	11 years ago
Michael Peter Christen	0bfc69b29b	more ipv6 bugfixes	11 years ago
Michael Peter Christen	a27563e5c3	removed the atmo sound clips because they had been too large	11 years ago
Michael Peter Christen	ae58b22f5b	ipv6 fixes for Network.html front page	11 years ago
Michael Peter Christen	e413beac04	fix for latest UPnP update	11 years ago
Michael Peter Christen	74957f3760	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	11 years ago
Michael Peter Christen	2a052f446a	Added an experimental audio feedback system. This is the first element of a new 'decoration' component which may hold switches for different external appearance parameters. The first switch in that context is decoration.audio (as usual in yacy.init). This value is set to false by default, that means the audio feedback element is switched off by default. To switch it on, set decoration.audio = true (using /ConfigProperties_p.html). You will then hear sounds for the following events: - remote searches - incoming dht transmissions - new documents from the crawler Sound clips are stored in htroot/env/soundclips/ which is done so because a future implementation will read these files using the http client and with configurable urls which will make it very easy for the user to replace the given sounds with own sounds.	11 years ago
Marc Nause	1e6e69bc40	Finished implementation of UPNP: ) will try other ports if YaCy standard ports are not available ) distinguish between internal and external port (not sure if this works 100%) Still to add: propery in config to enter own external port (in case of manually configured NAT)	11 years ago
Michael Peter Christen	e1bc768f9d	more IPv6 bugfixes	11 years ago
Michael Peter Christen	961f06c0b6	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	11 years ago
reger	209e0f2fe8	allow url parameter in worktable apicall allow url=wwwl?param=a&param=b (with ?, & encoded) fix: http://mantis.tokeek.de/view.php?id=100 fix double adding of '&' in MultiProtocolURL.escape()	11 years ago
reger	3562b5e3a4	open rejected urls in new browser	11 years ago
reger	b0c87d8240	fix image search expand box, cut-off of 2nd capture line height tested with IE11 and Firefox 32 (change worked for both to show 2nd line without cutting off height) +fix charset parameter in metadataImageParser +update start errMsgTxt to "java 1.7"	11 years ago
reger	fa99b286cc	add html5 autofocus to query input field (leave onload untouched = redundant, for IE9 http://www.w3schools.com/tags/att_input_autofocus.asp) adjust Peer-to-Peer/ Privacy switch label to display "Peer-to-Peer" as 2nd switch option in active stealth mode	11 years ago
Michael Peter Christen	329262231f	unresolved pattern fix	11 years ago
Michael Peter Christen	528f583d72	ipv6 fixes	11 years ago
Michael Peter Christen	e4ccca9497	fix for xss bugs found by CTF365	11 years ago
Michael Peter Christen	247e626083	IPv6 host parsing bugfixes	11 years ago
Michael Peter Christen	fe917deb2d	when pinging other peers, be able to select the right IP option	11 years ago
Michael Peter Christen	65e6ae52fb	IPv6-enhanced Network monitoring page	11 years ago
reger	7c1707872b	search result showPicture update search parameter used parameter &cat=image is obsolete and returns no results - remove &cat=image and &cat=href references - remove &tenant= references (unused) Use contentdom=image and inurl: parameter to make showPicture link display something (open in new window because of used inurl modifier changes original query)	11 years ago
Michael Peter Christen	3073c69aee	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	11 years ago
Michael Peter Christen	6491270b3a	large IPv6 redesign of peer ping methods! removed preferred IPv4 in start options and added a new field IP6 in peer seeds which will contain one or more IPv6 addresses. Now every peer has one or more IP addresses assigned, even several IPv6 addresses are possible. The peer-ping process must check all given and possible IP addresses for a backping and return the one IP which was successful when pinging the peer. The ping-ing peer must be able to recognize which of the given IPs are available for outside access of the peer and store this accordingly. If only one IPv6 address is available and no IPv4, then the IPv6 is stored in the old IP field of the seed DNA. Many methods in Seed.java are now marked as @deprecated because they had been used for a single IP only. There is still a large construction site left in YaCy now where all these deprecated methods must be replaced with new method calls. The 'extra'-IPs, used by cluster assignment had been removed since that can be replaced with IPv6 usage in p2p clusters. All clusters must now use IPv6 if they want an intranet-routing.	11 years ago
reger	0ecbf32134	update to Jetty 9.2.3	11 years ago
reger	46afdf7d21	add link to thread pool settings in status panel	11 years ago
reger	54019313e7	fix NPE in ViewFile - show snippet on document not in index	11 years ago
reger	4873a2d3a4	adjust link to peer in Network list (www path obsolete)	11 years ago
orbiter	3ac31614a3	added option to reverse-sort YaCy tables (internal API change only)	11 years ago
Michael Peter Christen	6d3d4c4ea6	changed the concurrent enumeration of query results in such a way that it is now possible to get the results in two steps: - first retrieve all IDs as given for a query - then retieve each document individually This was necessary for very large result sets where a query may run for hours and is possibly terminated by a solr-internal timeout. This occurs regulary during postprocessing and therefore this commit may fix unwanted postprocessing terminations.	11 years ago
reger	ed0d7a80d5	modifiy description for Field-Reindex to act only on local index http://mantis.tokeek.de/view.php?id=279	11 years ago
Michael Peter Christen	81f9b34da7	increaesed ability ot search for all images on a single server within the p2p remote search	11 years ago
Michael Peter Christen	9b92685771	automatically add a wild card if only a search on a single domain is done. This makes it possible to search all documents on a single domain even if no search word is given. This is in particular interesting when searching for all images on a single domain.	11 years ago
Michael Peter Christen	abde89438b	fix for favicon	11 years ago
Michael Peter Christen	ca8b2bf099	removed www and welcome servlet, these had been demo servlets and are not needed any more	11 years ago

1 2 3 4 5 ...

5111 Commits (3ac1d14a21dc3a41a6372e5d94c8cfeb645f51f8)