yacy_search_server

Commit Graph

Author	SHA1	Message	Date
Michael Peter Christen	e586e423aa	in case that loading from the cache fails, load from wkhtmltopdf without cache using the user agent string given in the crawl profile	10 years ago
Michael Peter Christen	8480641f2d	fix to xvfb-run usage (quotes did not parse in xvfb-run, default values are appropriate)	10 years ago
Michael Peter Christen	68b040e31e	added fail-over missing http proxy service (i.e. overload) and quiet mode	10 years ago
Michael Peter Christen	25a64c51b3	moved snapshot generation out of the html handler to prevent that existing cache entries cause that the handler is not executed	10 years ago
Michael Peter Christen	c35170a305	more logging	10 years ago
Michael Peter Christen	e8be07ec78	grr	10 years ago
Michael Peter Christen	6f81bb756c	wrap wkhtmltopdf with xvfb if necessary	10 years ago
Michael Peter Christen	0119f8665d	more logging when failing to create pdf snapshot	10 years ago
Michael Peter Christen	416fe886e3	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	10 years ago
Michael Peter Christen	60f27bdf49	added the property timeoutrequests to configuration to disable TimeoutRequests. The purpose is to test if YaCy runs better on VMs where there is a limitation of concurrent processes; see /proc/user_beancounters in row numproc; this value is limited and should be low. Try to set timeoutrequests to keep this low. (works only after restart)	10 years ago
Michael Peter Christen	97f6089a41	YaCy can now create web page snapshots as pdf documents which can later be transcoded into jpg for image previews. To create such pdfs you must do: Add wkhtmltopdf and imagemagick to your OS, which you can do: On a Mac download wkhtmltox-0.12.1_osx-cocoa-x86-64.pkg from http://wkhtmltopdf.org/downloads.html and download http://cactuslab.com/imagemagick/assets/ImageMagick-6.8.9-9.pkg.zip In Debian do "apt-get install wkhtmltopdf imagemagick" Then check in /Settings_p.html?page=ProxyAccess: "Transparent Proxy" and "Always Fresh" - this is used by wkhtmltopdf to fetch web pages using the YaCy proxy. Using "Always Fresh" it is possible to get all pages from the proxy cache. Finally, you will see a new option when starting an expert web crawl. You can set a maximum depth for crawling which should cause a pdf generation. The resulting pdfs are then available in DATA/HTCACHE/SNAPSHOTS/<host>.<port>/<depth>/<shard>/<urlhash>.<date>.pdf	10 years ago
reger	ff80700aff	replace deprecated Solr DateField.formatExternal with recommended TrieDateField.formatExternal	10 years ago
Michael Peter Christen	5f5c7d69d1	added image screenshot generator	10 years ago
Michael Peter Christen	10794e8efd	trying facet.method fc instead of fcs to handle large facets	10 years ago
Michael Peter Christen	a0b84e4def	use a LinkedHashMap for facets to maintain facet order as given by solr	10 years ago
Michael Peter Christen	0dc6e0a5f2	added option to enrich vocabularies with synonyms from synonym database	10 years ago
Michael Peter Christen	6a2a669db4	added loading of the synonyms file from addon/synonyms into the knowledge loader	10 years ago
Michael Peter Christen	a67a465415	fix field counter for multi-fields in html writer for the solr servlet	10 years ago
Michael Peter Christen	ec9d021568	added option in vocabulary editor to import CSV files with different encodings (preselected windows-type character encoding which is typical for CSV files). Fixed also other problems with character encoding in dictionary files. Automatically generated vocabularies are now also noted in the API steering.	10 years ago
Michael Peter Christen	95d87f00b3	fix for bad query generation in doublecheck in postprocessing	10 years ago
Michael Peter Christen	92007e5d2d	more enhancements to postprocessing speed	11 years ago
Michael Peter Christen	327e83bfe7	more fixes in postprocessing: partitioning of the complete queue to enable smaller queries	11 years ago
orbiter	2bc6199408	more concurrency for postprocessing	11 years ago
orbiter	a83cf26c38	more fixes and enhancements to postprocessing	11 years ago
orbiter	71758f0d62	enhanced postprocessing by usage of a field-list generation to prevent lazy initialization of the documents. This is useful because the documents must be read completely anyway.	11 years ago
orbiter	7856fbdbe8	fix for npe (in rare cases)	11 years ago
orbiter	8a2b569d7c	fix for literal computation	11 years ago
orbiter	856da2712b	Merge branch 'master' of git@gitorious.org:yacy/rc1.git	11 years ago
orbiter	ca9cd7b58a	more IPv6 fixes	11 years ago
Michael Peter Christen	167c5a51f0	IPv6 fix	11 years ago
Michael Peter Christen	fe537679de	fix for exact_signature_unique_b, exact_signature_copycount_i, fuzzy_signature_unique_b and fuzzy_signature_copycount_i: apply same criteria for 'valid document' as for title and description uniqueness test.	11 years ago
sixcooler	eb9d2705d2	fix for ConnectionInfo.cleanup of server-connections	11 years ago
Michael Peter Christen	fca11701f0	better profiling of solr queries	11 years ago
Michael Peter Christen	d80418f1b1	added partial updates to solr during postprocessing: during postprocessing the solr documents are now not completely retrieved. Instead, only fields needed for the postprocessing are extracted. When Solr documents are written, this is done using partial updates. This increases postprocessing speed by about 50% for embedded Solr configurations. For external Solr configurations the enhancement should be much higher because the postprocessing with remote Solr is very slow. When doing partial updates to a remote Solr, this method should perform much better than before, it is expected that this is even much higher than the increase with local Solr.	11 years ago
Michael Peter Christen	e69883d5ab	fix-fix for `30d4402cd1`	11 years ago
Michael Peter Christen	30d4402cd1	fixed location search	11 years ago
Michael Peter Christen	ee27be3399	misc bugfixes (concurrency, memory protection)	11 years ago
Michael Peter Christen	9b1958e8ca	more ipv6 bugfixes	11 years ago
Michael Peter Christen	97995a1dd9	fix for remote search process	11 years ago
Michael Peter Christen	92c5d97486	fix for bad node flag setting with IPv6	11 years ago
Michael Peter Christen	460858fb22	more ipv6 fixes	11 years ago
Michael Peter Christen	e1bc768f9d	more IPv6 bugfixes	11 years ago
Michael Peter Christen	961f06c0b6	Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git	11 years ago
reger	209e0f2fe8	allow url parameter in worktable apicall allow url=wwwl?param=a&param=b (with ?, & encoded) fix: http://mantis.tokeek.de/view.php?id=100 fix double adding of '&' in MultiProtocolURL.escape()	11 years ago
Michael Peter Christen	2c2ed8bf4e	typo in javadoc	11 years ago
Michael Peter Christen	528f583d72	ipv6 fixes	11 years ago
Michael Peter Christen	247e626083	IPv6 host parsing bugfixes	11 years ago
Michael Peter Christen	6491270b3a	large IPv6 redesign of peer ping methods! removed preferred IPv4 in start options and added a new field IP6 in peer seeds which will contain one or more IPv6 addresses. Now every peer has one or more IP addresses assigned, even several IPv6 addresses are possible. The peer-ping process must check all given and possible IP addresses for a backping and return the one IP which was successful when pinging the peer. The ping-ing peer must be able to recognize which of the given IPs are available for outside access of the peer and store this accordingly. If only one IPv6 address is available and no IPv4, then the IPv6 is stored in the old IP field of the seed DNA. Many methods in Seed.java are now marked as @deprecated because they had been used for a single IP only. There is still a large construction site left in YaCy now where all these deprecated methods must be replaced with new method calls. The 'extra'-IPs, used by cluster assignment had been removed since that can be replaced with IPv6 usage in p2p clusters. All clusters must now use IPv6 if they want an intranet-routing.	11 years ago
orbiter	a922b122a3	added a hack to forward solr search results from an external attached solr to the YaCy built-in solr search servlet. It's not complete and not fully correct (there is still a utf8 encoding problem) but it is a way to get easily requests forwarded through YaCy to an external Solr.	11 years ago
Michael Peter Christen	437ce3b8a0	added internal api for partial updates to Solr	11 years ago

976 Commits (64887f6b21912ac676b9437474b459a0b2d34a9f)