f1ori
399d7d6878
* fix permissions of bin/-folder in debian package
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7647 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
cominch
9ac02caf00
different initialization of empty variables in alternative constructor. This leads to wrong interpretation of user credentials, resulting in unnecessary "@" in front of host, and different urlhash values.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7646 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
a47bdc405b
better logging for robinson selection according to peer tag
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7645 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
cafcb1f9ed
removed the DNS resolving for web structure computation from the indexing queue and placed it in a concurrent computation queue that does not block the crawler. Makes crawling faster and less DNS-speed-dependent
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7644 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
57ce1fb491
reverted synchronization from SVN 7641
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7643 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
17530ca7b5
fix for bug http://bugs.yacy.net/view.php?id=10
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7642 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
7c8e764201
removed synchronization again...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7641 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
96c32e87b0
fixes to crawler and new user-agent crawl-delay handling
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7640 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
b2fe4b7b1a
added handling of yacy bot entries in robots.txt: if such an entry addresses the yacy peer (directly or indirectly) and grants a crawl-delay of 0, all forced pause mechanisms in YaCy are switched off and the domain is crawled at full speed.
Crawl-delay values can be assigned to either
- all yacy peers using the user-agent yacybot
- a specific peer with peer name <peer-name>.yacy or
- a specific peer with peer hash <peer-hash>.yacyh
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7639 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
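The user-agent rule described in the commit above can be sketched roughly as follows. This is a minimal illustration only, not YaCy's actual code: the class name `YacyAgentMatch`, both method names, and the case-sensitivity choices are assumptions, and the "indirectly addressed" case mentioned in the message is not modelled.

```java
// Hypothetical sketch of the matching rule from the commit message; not taken from YaCy.
class YacyAgentMatch {

    // True if a robots.txt user-agent token addresses this peer by one of the
    // three forms listed in the commit message.
    static boolean addressesPeer(String agent, String peerName, String peerHash) {
        String a = agent.trim();
        return a.equalsIgnoreCase("yacybot")              // all yacy peers
            || a.equalsIgnoreCase(peerName + ".yacy")     // a specific peer by name
            || a.equals(peerHash + ".yacyh");             // a specific peer by hash (assumed case-sensitive)
    }

    // True if the forced pauses would be switched off: the entry must address
    // the peer and grant a crawl-delay of 0.
    static boolean fullSpeed(String agent, int crawlDelay, String peerName, String peerHash) {
        return crawlDelay == 0 && addressesPeer(agent, peerName, peerHash);
    }

    public static void main(String[] args) {
        System.out.println(fullSpeed("yacybot", 0, "mypeer", "AbCdEf123456"));
        System.out.println(fullSpeed("mypeer.yacy", 5, "mypeer", "AbCdEf123456"));
    }
}
```

With the made-up peer name and hash used in `main`, the first call grants full speed and the second does not, because its crawl-delay is non-zero.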
f1ori
21fe5e6c6a
* add bin-folder to debian package
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7638 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
low012
e25c1f2ea3
*) preventing whitespace keys in config file
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7637 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
cb6f709a16
- enhancements in surrogate reading
- better display of map in location search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7636 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
low012
1ff9947f91
*) added new user right: extended search right (allows defining users who can query more results than anonymous users)
*) cleaned up code a little bit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7635 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
564184909a
enhanced the surrogate parser: better reading of UTF-8 characters
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7634 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
156cf02703
- added an index constraint 'has location' to the condenser
- added evaluation of the 'has location' constraint to search using the /location operator
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7633 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
41b8d7f655
fix for url normalization (no backpath resolving in post parameters)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7632 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
0430a94eaa
the location search no longer shows re-evaluated locations, only locations that are attached as metadata to web pages
- added parser for in-text appearing geo-locations
- added geo-locations to rss search result
- added evaluation of metadata-attached geo-locations in yacysearch_location to show search results within a map
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7631 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
8412f8787d
fix for http://bugs.yacy.net/view.php?id=8
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7630 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
9b25d07295
- added geo information parsing to html parser
- extended metadata information in index with geolocalisation
- added display of location in yacydoc and ViewFile
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7629 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
f1ori
efcf37a953
* show info in log, if robots.txt is rejected due to wrong mime-type
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7628 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
lotus
cbf87fe72f
write PID to yacy.running
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7627 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
lotus
06afa94f9d
oops
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7626 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
lotus
a9a9db98c8
better rename modified version
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7625 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
lotus
e19ca27004
do not autocomplete on mouseover. this has resulted in unwanted autocomplete.
fixes bug #3
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7624 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
low012
16cd919795
*) fixed Exceptions which caused 500 error when entering invalid URL mask or invalid prefer mask, invalid masks are ignored, error message is displayed on yacysearch.html (what about yacysearch.rss and yacysearch.json?)
*) fixed "more options" link on yacysearch.html
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7623 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
low012
1a24917cea
*) fixed NPE which occurred when an empty String was entered as search word
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7622 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
01b968d836
better concurrency in ViewImage icon cache and OOM protection for too large icon caches
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7621 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
b1a8d0c020
enhancements to web cache and less strict caching rules
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7620 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
f3baaca920
- enhancements to DNS IP caching and crawler speed
- bugfixes (NPEs)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7619 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
low012
e7860b1239
*) <mode="Homer">D'oh!</Homer>
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7618 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
low012
82f1580a60
*) trying to fix ConcurrentModificationException
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7617 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
f1ori
df71776929
* fix bug #7
* log requires poison to finish, so Base64Order main-function doesn't finish, when called from debian configure script
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7616 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
low012
9f0286b380
*) fixed potential "java.lang.IllegalArgumentException: Illegal group reference" which occurred if special characters which are also used as metacharacters in regular expressions were used inside of <pre>...</pre> (see: http://veerasundar.com/blog/2010/01/java-lang-illegalargumentexception-illegal-group-reference-in-string-replaceall/ )
The class still contains a potential ConcurrentModificationException which occurs when the List which contains the elements of the table of contents is modified during a recursion of tagReplace(). Will try to fix this later today.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7615 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
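For context on the "Illegal group reference" fix above: in Java, `String.replaceAll` interprets `$` and `\` in the replacement string as group references and escapes, so literal text containing them can throw. One common remedy is `java.util.regex.Matcher.quoteReplacement`. The sketch below illustrates that general technique; the class `PreBlockReplace`, the helper `insertLiteral`, and the `@@BODY@@` placeholder are made up for illustration, and the commit itself does not say which approach was used.

```java
import java.util.regex.Matcher;

// Hypothetical sketch; names and placeholder are illustrative, not from YaCy.
class PreBlockReplace {

    // Replaces every match of placeholderRegex with the given text, taken literally.
    // Matcher.quoteReplacement escapes '$' and '\' so they are not read as
    // group references or escape characters.
    static String insertLiteral(String template, String placeholderRegex, String literal) {
        return template.replaceAll(placeholderRegex, Matcher.quoteReplacement(literal));
    }

    public static void main(String[] args) {
        // Text as it might appear inside <pre>...</pre>, containing regex-replacement metacharacters.
        String body = "echo $HOME \\ done";
        System.out.println(insertLiteral("<pre>@@BODY@@</pre>", "@@BODY@@", body));
    }
}
```

Without the `quoteReplacement` call, the `$H` in the replacement text would be parsed as a group reference and `replaceAll` would throw an `IllegalArgumentException`.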
orbiter
78d4c45d09
enhancement during search process: fast fail of search in case that all index feeders have terminated.
This change should affect filtering and navigators and should make search navigation faster
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7614 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
ba03ca8620
added more configuration options for search:
- removed configuration button for 'search only for admin' from index.html and added this to ConfigPortal
- added configuration of link verification options (iffresh, cacheonly, nocache, ifexist) to ConfigPortal
- added configuration of navigation options to ConfigPortal
- added an option to switch off automatic index cleaning in case that a link verification method fails
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7613 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
f1ori
e0c7d490f9
* fix bug #6
* exclude signature files from auto-deletion of unknown files in DATA/RELEASE
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7612 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
18ec7fe53c
added a clearall.sh script that deletes the complete index and everything else that belongs to crawling
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7611 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
d98884f1d5
added script for importmediawiki.sh in build.xml
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7610 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
a50f28e6e7
- fixed missing save operation for peer name change
- fixed import of mediawiki dump files
- added script to add mediawiki dump files
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7609 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
2b5f8585bf
performance hack for Balancer and ip address parsing
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7608 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
43e1660512
fix/enhancement in Crawler: do not generate domain match pattern if crawl depth is 0
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7607 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
b1d133b69f
another enhancement to the ThreadDump function: better multiple dumps and filtering out of uninteresting dump parts
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7606 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
a35d513bd8
fix for not-deleted .gap and .idx files
see also: http://forum.yacy-websuche.de/viewtopic.php?p=22128#p22128
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7605 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
a6935e7dc8
fix for active dns resolving: do not resolve in case that the dns server is not available (offline mode)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7604 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
859c99886c
fix for multiple thread dump
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7603 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
61acf55da4
avoided using synchronized(this) for the hash computation so that the lock on the object cannot be (accidentally) taken by another thread; the synchronization now uses the protocol object instead. Also made the protocol object final.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7602 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
c2a968c23f
fix for bug in formatting in ThreadDump
and added hint for linux/Mac users that they may use the LOCKED feature using the start option -l
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7601 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
low012
2861d0888a
*) simplified code
*) fixed potential NumberFormatExceptions
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7600 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
078ecacf61
avoid synchronization in DigestURI hash requests
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7599 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
1989ebc24b
removed more warnings
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7598 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago