- added transport of string-search pattern to remote search protocol
- fixed a problem parsing snippets with a '-' inside
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7700 6c8d7289-2bf4-0310-a012-ef5d649a1542
- on client-side: fix bad IP reports from remote Peers by replacing their reported IP with their server IP if the reported IP is bad, broken or disallowed
- on server-side: the same during a peer ping (here the ping'ed server acts also as client during the back-ping) and also when receiving a message or a search where the client sends also its seed. Here the IP is replaced by the client IP if the reported IP is broken or bad
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7687 6c8d7289-2bf4-0310-a012-ef5d649a1542
This is also a protection against the foced-fake of a external IP: if such an IP was faked, then the next ping from the affected peer to another peer looks like a staticIP report. Such a bad staticIP-by-faked-response can now be discovered and fixed by the peer that gets the second ping after the first ping contained a faked response.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7683 6c8d7289-2bf4-0310-a012-ef5d649a1542
- added more properties to solr index
- refactoring
- more constants in switchboard
- fix for some NPEs
- recognition of more images
- removed synchronization in HandleMap (obviously not necessary?)
- added a nolocal configuration to remove excessive dns lookup (works only on allip - default off). Indexes produced with this setting are all flagged with 'local' and are (on purpose) not usable for freeworld because they will be rejected as beeing local.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7672 6c8d7289-2bf4-0310-a012-ef5d649a1542
YaCy supports now the storage to remote solr indexes.
More federated storage (and search) methods may follow.
The remote index scheme is the same as produced by the SolrCell; see
http://wiki.apache.org/solr/ExtractingRequestHandler
Because this default scheme is used, the default example scheme can be used as solr configuration
This is also the same scheme that solr uses if documents are imported with apache tika.
federated solr storage is switched off by default.
To use this, do the following:
- set federated.service.solr.indexing.enabled = true
- download solr from http://www.apache.org/dyn/closer.cgi/lucene/solr/
- extract the solr (3.1) package, 'cd example' and start solr with 'java -jar start.jar'
- start yacy and then start a crawler. The crawler will fill both, YaCy and solr indexes.
- to check whats in solr after indexing, open http://localhost:8983/solr/admin/
Until now it is not possible to use the solr index to search with YaCy in that solr index.
This functionality is now available for two reasons:
1) to compare the functionality of Solr and YaCy and to compare the search speed
2) to use YaCy as a search appliance for people who need a crawler or other source harvesting methods
that YaCy provides (like dublin core reading, wikimedia dump reading, rss feed reader etc) if people still
want to use solr instead of YaCy.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7654 6c8d7289-2bf4-0310-a012-ef5d649a1542
(directly or indirectly) and it grants a crawl-delay of 0. Then all forced pause mechanisms in YaCy are switched off and the domain is crawled at full speed.
crawl delay values can be assigned to either
- all yacy peers using the user-agent yacybot
- a specific peer with peer name <peer-name>.yacy or
- a specific peer with peer hash <peer-hash>.yacyh
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7639 6c8d7289-2bf4-0310-a012-ef5d649a1542
- new concurrent score map using atom operation from java concurrency classes
- redesigned difference beween StaticScore and Dynamic Score into ScoreMap and ReversibleScoreMap allowed that many classes can now use simple ScoreMap Objects which can be used better in concurrent environments using the ConcurrentScoreMap
- switched from DynamicScore to ConcurrentScoreMap usage wherever possible
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7586 6c8d7289-2bf4-0310-a012-ef5d649a1542
This has two aspects: the user who searches may want to increase these values to get more results and more load on the remote side and the user of the server which is accessed for this search may want to restrict the load. Both sides can now be configured. The server-site maximum load parameters are defined by a network definition and the client-side search request load can be defined by each user individually but when the remote search is done the requested service is limited to the network definition.
You can find now in the network definition file:
network.unit.remotesearch.maxcount and network.unit.remotesearch.maxtime
and in the yacy.conf file:
remotesearch.maxcount and remotesearch.maxtime
There is currently no web interface to define the client-side remote search attributes, please set them manually
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7548 6c8d7289-2bf4-0310-a012-ef5d649a1542
Reason: if this is not used in StringBody-Class initialization, a default charset name is parsed.
This is a synchronized process and all classes using default charsets synchronize at that point
Synchronization is omitted if this class is used
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7530 6c8d7289-2bf4-0310-a012-ef5d649a1542
- added more instances of formatter objects to different classes to make them independent in case of lockings that may applay during synchronization of the date formatter object (date formatting is not thread-safe and must be synchronized therefore)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7528 6c8d7289-2bf4-0310-a012-ef5d649a1542