- partly redesign of diskUsage: little bit more functional behavior, less side effects, better error case handling
- the resourceObserver can now show a error message if the diskUsage is 'out of order'
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4973 6c8d7289-2bf4-0310-a012-ef5d649a1542
- imported a random startpoint function from plasmaDHTChunk
in case there was already a gap at the beginning of the index, the transfer process was endless selecting from first startpoint
tested & working on my index
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4970 6c8d7289-2bf4-0310-a012-ef5d649a1542
- no more keep-order parameter in remove (it was not possible to make this strict, and not useful)
- some small enhancements in balancer
- robots parser without references in switchboard
- changes synchronization in robots
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4969 6c8d7289-2bf4-0310-a012-ef5d649a1542
the most important fix was the addition of the yacybot user-agent for robots.txt loading,
because web masters look for that access to see if the crawler behaves correctly.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4968 6c8d7289-2bf4-0310-a012-ef5d649a1542
the skin menue. Additionally an example is given there how to integrate a search page with an iframe.
Please see the skin menu.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4967 6c8d7289-2bf4-0310-a012-ef5d649a1542
- added option to set minimum crawl delta for domains in balancer
- added default values to crawl deltas in yacy.init
- added configuration for these deltas in performance queues
- enhanced performance setting computation (more time for indexing queue for a faster flush
- remote crawling is now enabled during local crawling if indexer has space and time for more links
- added database stub for new distributed file system
- refactoring of time computation to get an abstraction level that will be used by a TTL rule in new distributed file system
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4966 6c8d7289-2bf4-0310-a012-ef5d649a1542
- select also non-robinson (dht-) peers if their peer tags match with search words
- the peer tag '*' can now act as catch-all rule: shall be selected always
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4963 6c8d7289-2bf4-0310-a012-ef5d649a1542
- less dht targets are selected
- more other peers are selected: all robinson peers with more than one million urls
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4962 6c8d7289-2bf4-0310-a012-ef5d649a1542
1. The usage and dependency of the plasmaSwitchboad was used many times in the past but this was
a bad mistake. The classes should be independent from the switchboard to support a better abstraction. Therefore the object was removed. The parameters from the switchboard are computed outside and then handed over.
2. the class is considered as a tightly connected to hardware resources. Classes which handle data that cannot be replicated because it would need to replicate hadware should not support dynamic object allocation, but should be coded as collection of private static methods. Therefore all class objects had been transformed into static private objects.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4961 6c8d7289-2bf4-0310-a012-ef5d649a1542
- some more/better asserts
- slight performance enhancements in remove method in index management. Works for all who do not run using asserts (the majority)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4928 6c8d7289-2bf4-0310-a012-ef5d649a1542
* new message on portchange
* first implementation of external link-update for search page (still inactive)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4915 6c8d7289-2bf4-0310-a012-ef5d649a1542
- no access check when a search is made only local without snippet fetch
- added comment and status message in resourceObserver (this takes very long at startup time!)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4911 6c8d7289-2bf4-0310-a012-ef5d649a1542
find a better way to store BLOBs; generalize current BLOG data structure (kelondroDyn)
and prepare it to replace it with something better. The best candidate is the kelondroHeap,
which will become the kelondroBLOBHeap;
removed also some never-used classes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4902 6c8d7289-2bf4-0310-a012-ef5d649a1542
- prevention of false information of own IP address
- enabled searching before an own IP address is assigned (before first ping happened)
- removed warning about limited search function
- added better time-out settings for peer-ping process (10 seconds complete, 5 seconds for back-ping)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4883 6c8d7289-2bf4-0310-a012-ef5d649a1542
- fixed loosing of own seed hash (hopefully)
- fixed a bug with crawl start s beginning with (bookmark) files
- added better IP recognition during hello process
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4882 6c8d7289-2bf4-0310-a012-ef5d649a1542
A large refactoring was neccessary
- added another crawl start option: automatic restriction to sub-path
- removed crawlStartSimple and renamed crawl start expert
to crawl start (without expert)
- some changes to texts in crawl start
- added some more deletions when an web index is deleted:
delete also queues and robots cache
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4881 6c8d7289-2bf4-0310-a012-ef5d649a1542
the purpose of this list is not only to view attempts from read attacks, but also to show if the
yacy-yacy protocol is working with respect to the bug that was fixed with SVN 4869
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4873 6c8d7289-2bf4-0310-a012-ef5d649a1542
pleas disable before release because 2nd update fails at the moment
and commandline handling has to be improved for windows
* update via new unTar class
please review stream- and exceptionhandling because I'm fairly new to Java
maybe it can be done concurrent
* updated windows startscripts to values from yacy.init
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4832 6c8d7289-2bf4-0310-a012-ef5d649a1542
- access is granted for localhost users to administration pages by default
- the default setting can be changed in the BasicConfig.html page
- if the BasicConfig page was accessed with post and no password was submitted, a random password is generated
- a headless installation MUST give a password upon first call of the configuration page, otherwise they will not be able to access it again
- if no password is given within 10 minutes after start-up, a random password is generated
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4804 6c8d7289-2bf4-0310-a012-ef5d649a1542
from the ConfigNetwork online interface
- to make this possible, a large refactoring and reorganisation of data structures was necessary
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4803 6c8d7289-2bf4-0310-a012-ef5d649a1542
- fixed bug in Collage
- added 'embedded mode' to collage
- integrated Collage to terminal_p as iframe in embedded mode (Pictures now visible in terminal_p)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4797 6c8d7289-2bf4-0310-a012-ef5d649a1542
- new authorization rule: localhost is always authorized for administration. This solves many problems with ajax, and also fixed a problem in rssTerminal
- fix bug in RSSFeed which prevented that entries had been recognized as individual, new entries
- added reloading/updating of status image on status page
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4796 6c8d7289-2bf4-0310-a012-ef5d649a1542
- 3 lines possible
- distinguishing of private and public data, if not authorized only public data is shown
- shows now more events, including local searches in clear text if user is logged in
- simplyfied peer events
- better recognition of 'real' new peers
- presentation of peer pings from other peers
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4771 6c8d7289-2bf4-0310-a012-ef5d649a1542
This change is inspired by the need to see a network connected to the index it creates in a indexing team.
It is not possible to divide the network and the index. Therefore all control files for the network was moved to the network within the INDEX/<network-name> subfolder.
The remaining YACYDB is superfluous and can be deleted.
The yacyDB and yacyNews data structures are now part of plasmaWordIndex. Therefore all methods, using static access to yacySeedDB had to be rewritten. A special problem had been all the port forwarding methods which had been tightly mixed with seed construction. It was not possible to move the port forwarding functions to the place, meaning and usage of plasmaWordIndex. Therefore the port forwarding had been deleted (I guess nobody used it and it can be simulated by methods outside of YaCy).
The mySeed.txt is automatically moved to the current network position. A new effect causes that every network will create a different local seed file, which is ok, since the seed identifies the peer only against the network (it is the purpose of the seed hash to give a peer a location within the DHT).
No other functional change has been made. The next steps to enable network switcing are:
- shift of crawler tables from PLASMADB into the network (crawls are also network-specific)
- possibly shift of plasmaWordIndex code into yacy package (index management is network-specific)
- servlet to switch networks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4765 6c8d7289-2bf4-0310-a012-ef5d649a1542
*) changes in formatting for better readability in BlacklistCleaner_p.java
*) replaced test for necessary Java version (was 1.4.2, is 1.5 now)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4756 6c8d7289-2bf4-0310-a012-ef5d649a1542
- changed isLocal Property in such a way that it is possible to see if a domain is in the internet (and not intranet)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4751 6c8d7289-2bf4-0310-a012-ef5d649a1542
- removed tree data type in kelondroHTCache
- added new class kelondroHeap; may be the core for a storage object that will once replace the many-files strategy of kelondroHTCache
- removed compatibility mode in indexRAMRI
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@4747 6c8d7289-2bf4-0310-a012-ef5d649a1542