Especially for Turkish speaking users using "tr" as their system default
locale : strings for technical stuff (URLs, tag names, constants...)
must not be lower cased with the default locale, as 'I' doesn't becomes
'i' like in other locales such as "en", but becomes 'ı'.
- ensure use of HTTP POST method : HTTP GET should only be used for
information retrieval and not to perform server side effect operations
(see HTTP standard https://tools.ietf.org/html/rfc7231#section-4.2.1)
- a transaction token is now required for these administrative form
submissions to ensure the request can not be included in an external
site and performed silently/by mistake by the user browser
The main shutdown hook thread was not properly waiting for the main
thread termination which consequently could not properly close resources
and threads. After terminating a running YaCy peer this way (Ctrl+C in
console, or kill <pid> for example), you could see the still existing
DATA/yacy.running file.
Tested with :
- Debian Jessie openjdk 7 and 8 : regular shutdown, Ctrl+C, kill
command, system restart while yacy is running
- Windows 10 Oracle JDK 7 and 8 : non regression on regular shutdown
First fix for mantis 689 (http://mantis.tokeek.de/view.php?id=689).
On Debian Linux, with a headless jre and no open browser,
browser.openBrowserClassic() was called and waited forever the browser
process end (p.waitFor()). YaCy shutdown was therefore not working until
the browser was closed.
Also modified browser opening command for Unix platform to open the
default the browser (with xdg-open util) instead of Firefox.
xdg-open also has the advantage to be asynchronous (not blocking).
version numbers are expressed in a different way as we expect. That
could cause that YaCy does not run on systems which are appropriate but
we simply do not understand the version string.
If language is set to "browser" the client/user browser language is used to choose from
available translation.
simply: one users browser speaks English -> YaCy responds in English, other users browser speaks French -> YaCy responds in French.
! To make a translation/language available you have to activate the language once !
(or manually use the utility class TranslateAll)
In ConfigBasic.html availabel translations are marked green on setting language=Browser
The client language is determined by http header Accept-Language (checked in DefaultServlet)
This is the 1st rudimentary approach to support the translatio utilities.
It allows currently to edit untranslated text and save it in a local translation file
in the DATA/LOCALE directory.
+ refactor Translator (less static's) to leverage on class overrides and support garbage collection for this 1 time routine
+ adjust TranslatorXliff to check for local translations in DATA/LOCALE,
this includes storing manually downloaded translation files in DATA as well
(to keep default untouched)
+ on 1st call of Translator_p a master tanslation file is generated, checking
the supported languages for missing translation text (later this masterfile is planned to part of the distribution, to harmonize translation key text between the languages)
Outlook: the local modifications (possibly as translation fragments instead of complete file) to be shared with maintainer using xlif features.
JVM registers each file in a list regardless of already deleted and never
cleans up the list during runtime.
This accumulates to a considerable amount of mem during large crawls and/or
long uptime.
To tackle this, all temp files are now created in a subdir of java.io.tmpdir
and the jvm tmpdir property is set to this subdir, which is deleted by
code on shutdown.
Additionally let pdfParser use this tmp subdir too.
Background: some user report problem with connecting/crawling some sites via https which require SNI support (by default switched off in YaCy). On the other hand systems not demanding SNI support are sometimes not properly configured and due to a bug/feature in java 1.7 connection is aborted. The later is more often the case, so the default is still fine. With the java start parameter expert user can no alter the startparameter to -Djsse.enableSNIExtension=true (java default) if they crawl more hosts requiring SNI support.
The alternative to let YaCy try both during https handshake (deep inside the httpclient) is not pursut at this time.
*) will try other ports if YaCy standard ports are not available
*) distinguish between internal and external port (not sure if this
works 100%)
Still to add: propery in config to enter own external port (in case of
manually configured NAT)
tested with IE11 and Firefox 32 (change worked for both to show 2nd line without cutting off height)
+fix charset parameter in metadataImageParser
+update start errMsgTxt to "java 1.7"
removed preferred IPv4 in start options and added a new field IP6 in
peer seeds which will contain one or more IPv6 addresses. Now every peer
has one or more IP addresses assigned, even several IPv6 addresses are
possible. The peer-ping process must check all given and possible IP
addresses for a backping and return the one IP which was successful when
pinging the peer. The ping-ing peer must be able to recognize which of
the given IPs are available for outside access of the peer and store
this accordingly. If only one IPv6 address is available and no IPv4,
then the IPv6 is stored in the old IP field of the seed DNA.
Many methods in Seed.java are now marked as @deprecated because they had
been used for a single IP only. There is still a large construction site
left in YaCy now where all these deprecated methods must be replaced
with new method calls. The 'extra'-IPs, used by cluster assignment had
been removed since that can be replaced with IPv6 usage in p2p clusters.
All clusters must now use IPv6 if they want an intranet-routing.
found a situation after crash (reboot) with existing running semaphore but YaCy not running.
Ping generated exception which finally deleted the conf file (during pre-read procedure)
- change to ping (catch exception solved it)
- additionally removed delete yacy.conf file (if needed we need to make a backup)
besides adjustments in code it makes the servlet settings in web.xml significant.
This applies to solr, gsa and proxy servlet. There is no longer a default setup in code during init (as jetty 9 checks for double definition).
- move htroot exist check from old httpdfilehandler to startup, remove from filehandler and legacy proxyhandler
- use SwitchboardConstant.htroot where appropriate
- useful for remote installation to set any config file property
- multipe parameter can be set at once, on Windows enclose parameter in doublequotes
- special handling "adminAccount=adminuser:adminpwd" sets adminusername and md5 encoded admin-pwd
- adjusted windows startbatch to allow command line parameter handling
- remove not needed classpath calculation from startYACY_debug.bat
introduced, it was also used for search facets. The generic search
facets are now deduced from generic solr fields which makes jena as tool
for facet semantics superfluous.
request into a separate thread and ignores the furthure result of a
request if that does not answer within the requested time-out. This is a
try to solve a problem with the peer-ping, which hangs whenever a peer
appears to be dead or blocked.
- the admin user name can be configured, in apiExec calls the default "admin" username is used.
TODO: the bin/apicall.sh script should likely take that into account.
via Jetty IPAccessHandler to allow only configured IP's to access.
Handler is only loaded if a restriction is configured.
Since IPAcessHandler (Jetty 8) does not support IPv6 system property java.net.preferIPv4Stack=true
Testing showed system.setProperty seems to be sensitive to point of calling (earliest possible time seems to be best = early in yacy.main).
Moved the "isrunning..." just open browser check also to the new routine to preread the yacy.config only once.
- userDB is not sync'ed with Jetty credentials as of now only the std. admin account can login
switched initial browser open with ssl active back to std. http port
!!! attention !!! to make sure YaCy can start, https will be disabled if port 8443 is used
- added ping test for above to migration
- as of now port for https is hardcoded to default 8443
- if not urgend required I'd leave it this way (it's standard) to use different ports for http and https
- post https port on ConfigBasic.html (if active)