Michael Peter Christen
a304058840
added Image Events as another option to generate images with a mac if no
Ghostscript is available or does not work...
10 years ago
Michael Peter Christen
d83de9ecf5
added another path for the convert command because on older Macs
ImageMagick has a different installation location
10 years ago
Michael Peter Christen
226aea5914
added a servlet which can create preview images, preview thumbnails and
preview pdfs from web pages, e.g.:
http://localhost:8090/api/snapshot.png?url=http://yacy.net/en/&width=128&height=128
http://localhost:8090/api/snapshot.jpg?url=http://yacy.net/en/&width=128&height=128
http://localhost:8090/api/snapshot.pdf?url=http://yacy.net/en/
This also supports on-the-fly generation of the preview documents if
the user is an administrator. Otherwise, the servlet fails.
To enable this, you must add wkhtmltopdf, imagemagick and (on headless
servers) xvfb to your operating system.
For detailed instructions, see
97f6089a41
10 years ago
reger
28456dfc09
skip creation of unused Bluelist contenttransformer
10 years ago
Michael Peter Christen
321840fde3
Replaced all fixed thread pools with cached thread pools. The cached
thread pools will flush their cached (dead) threads after 60 seconds.
As a result, YaCy now runs constantly with about 50 threads,
about 100 at peak times. Previously, about 400 threads had been cached
and kept in a hibernation state, so the numproc counter
in /proc/user_beancounters (exists only in VM-hosted linux) was as high
as the cached number of threads. This caused VM supervisors to
terminate whole VM sessions if a limit was reached. Many VM providers
have limits of numproc=96 which made it virtually impossible to run YaCy
on such machines. With this change, it will be possible to run many YaCy
instances even on VM hosts.
10 years ago
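The thread pool change described in 321840fde3 can be illustrated with a minimal sketch. It is not YaCy's actual code: the class and method names are hypothetical, and the pool is built directly with the parameters that the OpenJDK implementation of Executors.newCachedThreadPool() uses (no core threads, idle threads dropped after 60 seconds).

```java
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class CachedPoolSketch {

    // Hypothetical helper: a pool that keeps no core threads and lets idle
    // threads expire after keepAliveSeconds, like a cached thread pool.
    static ThreadPoolExecutor newCachedPool(long keepAliveSeconds) {
        return new ThreadPoolExecutor(
                0,                    // no threads are kept alive permanently
                Integer.MAX_VALUE,    // grow on demand at peak times
                keepAliveSeconds, TimeUnit.SECONDS,
                new SynchronousQueue<Runnable>());
    }

    public static void main(String[] args) throws InterruptedException {
        ThreadPoolExecutor pool = newCachedPool(60);
        for (int i = 0; i < 8; i++) {
            pool.execute(() -> { /* short-lived work */ });
        }
        System.out.println("largest pool size: " + pool.getLargestPoolSize());
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```

A fixed pool keeps its threads parked once they have been started, whereas this configuration lets the thread count fall back after a burst of work has finished.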
Michael Peter Christen
181911376c
showing list of all threads in threaddump using the ThreadMXBean counter
(this obviously shows more threads than before?)
10 years ago
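For reference, a thread listing based on ThreadMXBean, as mentioned in 181911376c, might look like the following sketch. The class and method names are hypothetical and the output format is an assumption; only the java.lang.management calls are standard JDK API.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ThreadDumpSketch {

    // Number of live threads as reported by the JVM (daemon and non-daemon).
    static int liveThreadCount() {
        return ManagementFactory.getThreadMXBean().getThreadCount();
    }

    // One line per live thread: name and current state.
    static String dump() {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        StringBuilder sb = new StringBuilder();
        for (ThreadInfo info : bean.getThreadInfo(bean.getAllThreadIds())) {
            if (info == null) continue; // thread ended between the two calls
            sb.append(info.getThreadName())
              .append(" [").append(info.getThreadState()).append("]\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println("threads: " + liveThreadCount());
        System.out.print(dump());
    }
}
```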
Michael Peter Christen
7bfab5eb9d
set Busy- and Blocking-Threads to daemon mode (they will now not prevent
YaCy from termination if still running)
10 years ago
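The daemon mode mentioned in 7bfab5eb9d refers to the standard Java thread flag. A minimal sketch, with a hypothetical helper name and thread name, of a worker that does not keep the JVM alive:

```java
public class DaemonThreadSketch {

    // Hypothetical helper: creates a thread marked as daemon.
    // setDaemon must be called before start().
    static Thread newDaemon(String name, Runnable work) {
        Thread t = new Thread(work, name);
        t.setDaemon(true);
        return t;
    }

    public static void main(String[] args) {
        Thread t = newDaemon("busy-thread", () -> {
            try {
                Thread.sleep(60_000); // simulate long-running work
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        t.start();
        // main returns here; the JVM exits although the daemon is still sleeping
        System.out.println("main done, daemon=" + t.isDaemon());
    }
}
```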
Michael Peter Christen
64887f6b21
show number of threads on status page
10 years ago
Michael Peter Christen
e586e423aa
in case that loading from the cache fails, load from wkhtmltopdf without
cache using the user agent string given in the crawl profile
10 years ago
Michael Peter Christen
d5bac64421
recognize more html file types for snapshots
10 years ago
Michael Peter Christen
6f0167fac1
get cloned crawl start parameter for snapshots
10 years ago
Michael Peter Christen
a1ee101079
recognize more html file extensions
10 years ago
Michael Peter Christen
8480641f2d
fix to xvfb-run usage (quotes did not parse in xvfb-run, default values
are appropriate)
10 years ago
Michael Peter Christen
68b040e31e
added fail-over for a missing http proxy service (i.e. overload) and quiet
mode
10 years ago
Michael Peter Christen
25a64c51b3
moved snapshot generation out of the html handler so that existing
cache entries no longer prevent the handler from being executed
10 years ago
Michael Peter Christen
c35170a305
more logging
10 years ago
Michael Peter Christen
e8be07ec78
grr
10 years ago
Michael Peter Christen
6f81bb756c
wrap wkhtmltopdf with xvfb if necessary
10 years ago
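Wrapping wkhtmltopdf with xvfb, as in 6f81bb756c, could be sketched as follows. This is an assumption about the general approach, not the code from the commit: the class and method names are hypothetical, and the command is built as an argument list (suitable for ProcessBuilder) so that no shell quoting is involved.

```java
import java.awt.GraphicsEnvironment;
import java.util.ArrayList;
import java.util.List;

public class SnapshotCommandSketch {

    // Hypothetical helper: builds the argument list for a pdf snapshot.
    // On headless machines the call is prefixed with xvfb-run, which starts
    // a virtual X server with its default settings.
    static List<String> pdfCommand(String url, String pdfPath, boolean headless) {
        List<String> cmd = new ArrayList<>();
        if (headless) cmd.add("xvfb-run");
        cmd.add("wkhtmltopdf");
        cmd.add(url);
        cmd.add(pdfPath);
        return cmd;
    }

    public static void main(String[] args) {
        boolean headless = GraphicsEnvironment.isHeadless();
        List<String> cmd = pdfCommand("http://yacy.net/en/", "snapshot.pdf", headless);
        System.out.println(String.join(" ", cmd));
        // To actually run it: new ProcessBuilder(cmd).inheritIO().start();
    }
}
```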
Michael Peter Christen
0119f8665d
more logging when failing to create pdf snapshot
10 years ago
Michael Peter Christen
416fe886e3
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
10 years ago
Michael Peter Christen
60f27bdf49
added the property timeoutrequests to configuration to disable
TimeoutRequests. The purpose is to test if YaCy runs better on VMs where
there is a limitation of concurrent processes; see
/proc/user_beancounters in row numproc; this value is limited and should
be low. Try to set timeoutrequests to keep this low. (works only after
restart)
10 years ago
Michael Peter Christen
97f6089a41
YaCy can now create web page snapshots as pdf documents which can later
be transcoded into jpg for image previews. To create such pdfs you must
do:
Add wkhtmltopdf and imagemagick to your OS, which you can do as follows:
On a Mac download wkhtmltox-0.12.1_osx-cocoa-x86-64.pkg from
http://wkhtmltopdf.org/downloads.html and download
http://cactuslab.com/imagemagick/assets/ImageMagick-6.8.9-9.pkg.zip
In Debian do "apt-get install wkhtmltopdf imagemagick"
Then check in /Settings_p.html?page=ProxyAccess: "Transparent Proxy" and
"Always Fresh" - this is used by wkhtmltopdf to fetch web pages using
the YaCy proxy. Using "Always Fresh" it is possible to get all pages
from the proxy cache.
Finally, you will see a new option when starting an expert web crawl.
You can set a maximum depth for crawling which should cause a pdf
generation. The resulting pdfs are then available in
DATA/HTCACHE/SNAPSHOTS/<host>.<port>/<depth>/<shard>/<urlhash>.<date>.pdf
10 years ago
Michael Peter Christen
41d00350e4
moved network configuration to Use Case submenu; this is necessary
because the definition of portal peers within the YaCy freeworld network
is otherwise split across two different main menus.
10 years ago
reger
ff80700aff
replace deprecated Solr DateField.formatExternal with recommended TrieDateField.formatExternal
10 years ago
Michael Peter Christen
9ea120dbe5
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
10 years ago
reger
aa7122f079
update to guava.18.0.jar and jsch.0.1.51.jar
10 years ago
reger
0c97cc2440
skip unused call parameter for hashSentence()
10 years ago
reger
221f86dd5e
position api icon (ViewFile.html)
10 years ago
reger
4c14a8b44d
update to poi-3.10.1.jar
10 years ago
reger
ea633a794c
including small junit test case for WordTokenizer
10 years ago
reger
5790c7242e
skip tokenizing punctuation as word in WordTokenizer
remove unused variables in condenser related to Tokenizer
10 years ago
reger
f07392ff17
add. use host port parameter in YaCyApp
10 years ago
Michael Peter Christen
09d2867050
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
10 years ago
Michael Peter Christen
ad0da5f246
added new web page snapshot infrastructure which will lead to the
ability to have web page previews in the search results.
(This is a stub, no function available with this yet...)
10 years ago
reger
aa0faeabc5
adjust translation text of error msg on empty query
(ru: needs correction)
10 years ago
reger
c475be2937
fix (enable) error msg on empty query
10 years ago
reger
ef5c5b4489
update to Jetty 9.2.4
10 years ago
reger
f709132961
remove obsolete alternate link
fix api link
10 years ago
Michael Peter Christen
5f5c7d69d1
added image screenshot generator
10 years ago
Michael Peter Christen
3c71e1c872
show vocabularies in search result (in case of debugging)
10 years ago
Michael Peter Christen
1d45d9405a
security bugfix
10 years ago
Michael Peter Christen
ff728b4aa5
ignore url errors during search
10 years ago
Michael Peter Christen
c94c24638f
disabled postprocessing by default. If you read this: please disable
postprocessing in your peer as well: open /IndexSchema_p.html, then
deselect field process_sxt
10 years ago
Michael Peter Christen
2fce2e2697
larger boost fields for ranking
10 years ago
Michael Peter Christen
6c03ff8355
bold words in snippets should not be coloured black in the base style
because there are styles with dark backgrounds which make the bold word
invisible
10 years ago
Michael Peter Christen
8317914ce3
changed vocabulary navigator object type to TreeMap to get a specific
order into the vocabularies. This is now lexicographic, which is less
random than a hashed order
10 years ago
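The ordering effect described in 8317914ce3 follows from how java.util.TreeMap sorts its keys. A small sketch, with hypothetical vocabulary names and a hypothetical helper, shows that String keys come back in lexicographic order regardless of insertion order:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class VocabularyOrderSketch {

    // Hypothetical helper: returns the vocabulary names in the order a
    // TreeMap iterates them (natural, i.e. lexicographic, order for Strings).
    static List<String> orderedNames(Collection<String> names) {
        Map<String, Integer> navigator = new TreeMap<>();
        for (String name : names) {
            navigator.put(name, 0); // the value is a placeholder for the navigator data
        }
        return new ArrayList<>(navigator.keySet());
    }

    public static void main(String[] args) {
        // Vocabulary names are made up for illustration.
        System.out.println(orderedNames(Arrays.asList("topics", "authors", "places")));
    }
}
```

A HashMap, by contrast, makes no guarantee about iteration order, which is why the listing could appear random before the change.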
Michael Peter Christen
d5c1b07768
Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
10 years ago
Michael Peter Christen
c0f9f6ac66
added option to change the navbar-default, i.e. usable for dark skins
10 years ago
Michael Peter Christen
10794e8efd
trying facet.method fc instead of fcs to handle large facets
10 years ago
Michael Peter Christen
041b605cfe
Merge branch 'master' of git@gitorious.org:yacy/rc1.git
10 years ago