orbiter
c7a614830a
several bugfixes
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3899 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
allo
465145cb6f
revert to insecure but foolproof defaults
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3898 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
allo
7ad11ceaaa
security fix for peers without password. allow access only from localhost
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3897 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
71fd972ac0
- reduced default search time
...
- caught case when web structure cannot be painted because of too little data
- better logging when balance fails
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3892 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
684ded0e09
added new news types
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3876 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
d7de0938a6
fix for http://www.yacy-forum.de/viewtopic.php?p=36587#36587
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3870 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
karlchenofhell
22ee85ca02
- specified exceptions thrown by ResourceInfoFactory and plasmaHTCache.loadResourceInfo()
...
- caught possible NPE in CacheAdmin_p and added more error-cases
- sped up deletion of entries in the local crawl queue by crawl profile (it has often been noted that this deletion is slow)
- added a bit of javadoc
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3868 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
dfd5e823c3
automatic limitation of web structure host count
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3867 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
8b0aea6910
fixed automatic deletion of too many referenced hosts in web structure
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3866 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
9a8a87612d
added new qph column to search tracker servlet
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3854 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
e07458bad4
added time-out function to web analysis
...
the default time-out is 1 second
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3852 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
hydrox
4a1bc4743a
*) News-entries with blacklisted URLs are now ignored
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3849 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
339153d40e
*) favicons that are specified in the document content via html link-tags are now detected and displayed on the search page (requested by allo).
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3845 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
karlchenofhell
6265d321bd
- more constants
...
- display why global search is not available on search page
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3839 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
rramthun
18a5380ee3
*) situation-dependent lock-buttons for search-page
...
*) removed one unused import and a duplicate definition of "ogg" as media-type
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3817 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
karlchenofhell
9d6605a83c
- fixed NPE in Blacklist Cleaner during deletion of more than one duplicate entry
...
- don't display responseHeader1.db in CacheAdmin_p anymore
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3814 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
594ff95955
:-(
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3801 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
4ca797401e
fix for ConcurrentModificationException
...
see http://www.yacy-forum.de/viewtopic.php?p=36566#36566
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3800 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
7b904e0077
integrated robots.txt crawlDelay into the crawl balancer
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3797 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
52cb033f01
- slightly different painting of web structure picture:
...
hosts that have many connections of their own are painted farther away (this is not yet cato's idea, this will be implemented in another step)
- doc update
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3796 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
allo
6c9df13552
more debugging
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3791 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
allo
d1e1580223
Surftips Blacklist
...
Blacklists List Hardcoded instead of only updated on firststart / migration.java
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3788 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
(no author)
94cc9f05f5
*) Improvements for restart via update wrapper
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3785 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
borg-0300
2ab020445a
bugfix, I think - http://www.yacy-forum.de/viewtopic.php?t=4059
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3777 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
(no author)
ef24bed406
Sorry...
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3760 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
(no author)
a29cb2e1af
blupp
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3759 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
a585b4d41b
added web structure image
...
see http://localhost:8080/WatchWebStructure_p.html
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3747 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
33ad0c8246
added a web structure computation and logging:
...
- all web page parsing operations will now add to a web structure file
- the file is computed in memory and dumped at shutdown-time to PLASMASB/webStructure.map in readable form (not a database)
- the file can be used externally to analyse the link structure of the crawled pages
- the web structure can also be retrieved using an xml-interface at http://localhost:8080/xml/webstructure.xml
- the short-term purpose is the computation of a link-graph image (before linuxtag!)
- a long-term purpose could be a decentralized computation of the citation rank
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3746 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
karlchenofhell
7904175338
- sorry for typos
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3743 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
karlchenofhell
baa9402b97
- wiki-parser is now configurable via the config setting wikiParser.class which holds the class-name for the parser to use
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3742 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
karlchenofhell
0a64047081
- plasmaParserDocument can process subdocuments now (other archive-parsers may want to use this method)
...
- added 7zip parser
- added 'text/sgml' to realtime parseable mimetypes (sometimes returned by the mime type parser)
- added new cached output stream class, very suitable for parsers because of limited memory
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3740 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
9a4375b115
*) robots.txt: adding support for crawl-delay
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3737 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
karlchenofhell
086239da36
- added servlet: remote crawler queue overview
...
- added servlet: crawl profile editor
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3731 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
b05e2314cf
another dht selection fix
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3725 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
b28e5d0ee9
protection against wrong word hash length
...
see http://www.yacy-forum.de/viewtopic.php?p=35657#35657
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3723 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
0384b8771b
fix for http://www.yacy-forum.de/viewtopic.php?p=35700#35700
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3719 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
578c2ef130
release 0.52
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3715 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
46367afaaa
update of memory-protection values
...
see http://www.yacy-forum.de/viewtopic.php?p=35539#35539
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3709 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
rramthun
ea87fe5d78
*) Updated German translation
...
*) Changed "Lost Handle" error to warning (it occurs in masses when deleting a crawl-profile)
*) Removed unnecessary code from Windows script
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3708 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
26f05d1fd0
avoid division by zero if search is done for no words
...
this case is relevant if the bluewords (yacy.blue) are used
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3698 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
139c59ebbd
- fixed dht selection problem: the seed tables used a wrong ordering
...
- cleaned some code
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3693 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
e602436fda
fixed problem with cluster routing
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3684 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
d6480dc670
fix for long transfer pauses
...
see http://www.yacy-forum.de/viewtopic.php?p=35243#35243
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3672 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
6f46245a51
*) Bookmarks: Ajax icon is displayed while loading title
...
*) First version of a sitemap parser added
- currently only autodetection of sitemap files is supported
*) DB-Import restructured
- pause/resume should work again now
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3666 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
74dd6cac95
*) signal yacy shutdown to updater
...
*) some javadoc added
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3658 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
43748f87fb
*) changes required for the uploader
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3655 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
rramthun
e12e934ade
*) Fixed broken compile process.
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3650 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
7cf8981a98
- added debugging code for wrong DHT target iterator
...
- restricted distance constraint from 0.4 to 0.2
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3644 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
dd44a1394f
disabled automatic performance setting change
...
- during crawl start
- each indexing cycle
- for delay values
- for short memory cycles
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3634 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
b9add5cf37
some bugfixes:
...
- dht iterator start point
- wordIndex synchronization
- surftips url check
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3633 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago