orbiter
e4f1820b58
protection against too long authentication strings in switchboard
...
see also: http://www.yacy-forum.de/viewtopic.php?p=23943#23943
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2312 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
3879a0ecd0
replaced java.net.URL usage by use of new class de.anomic.net.URL
...
This should be seen as an experiment to rule out all cases where
a DNS lookup could occur during URL comparison.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2290 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
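The motivation for commit 3879a0ecd0 can be illustrated with a small sketch. `java.net.URL.equals()` and `hashCode()` may resolve host names, so a comparison can block on DNS. The class below is not the `de.anomic.net.URL` class from the commit; `UrlCompare` and `sameLocation` are hypothetical names, and `java.net.URI` is swapped in only to show a comparison that works purely on the string form.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Hypothetical helper: compares two URLs without any name resolution. */
class UrlCompare {
    /**
     * Returns true if both strings denote the same location after
     * syntactic normalization. java.net.URI compares scheme and host
     * case-insensitively and never performs a DNS lookup.
     */
    static boolean sameLocation(String a, String b) throws URISyntaxException {
        URI ua = new URI(a).normalize(); // removes "." and ".." path segments
        URI ub = new URI(b).normalize();
        return ua.equals(ub);
    }
}
```

A call such as `UrlCompare.sameLocation("http://Example.org/a/./b", "http://example.org/a/b")` compares only the parsed components, so it behaves the same whether or not the host can be resolved.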
orbiter
671fd9a5c9
work towards new indexing database structure
...
(no effect on current functionality yet)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2277 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
92f4cb4d73
added option to configure the start-up delay time for kelondro database files.
...
the start-up delay is used to pre-load the database node cache
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2276 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
66964dc015
removed high/med/low from kelondroRecords cache control.
...
This was done because testing showed that cache-delete operations
slowed down record access most, even more than actual IO operations.
Cache-delete operations occurred when entries were shifted from low-priority
positions to high-priority positions. During a fill of x entries into a database,
x/2 delete situations occurred, which caused two or more delete operations.
Removing the cache control means that these delete operations are no longer
necessary, but it is more difficult to decide which cache elements
should be removed when the cache is full. There is not yet a stable
solution for this case, but the advantage of a faster cache is more important
than the flush problem.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2244 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
allo
67a8c74be3
Fix for dynamic login with static password.
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2210 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
allo
ef9eb50c3c
fix for adminlogin
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2209 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
allo
6fe2fed87e
cookieauth works with static Admin.
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2208 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
4ca0857c0c
*) Index transfer now considers the pause time sent by busy peers during
...
index transfer / index distribution
See: http://www.yacy-forum.de/viewtopic.php?p=22647#22491
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2205 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
c75cacda95
added a flex-width-array: this is a table where it is
...
possible to add columns to an existing table
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2163 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
5041d330ce
refactoring
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2150 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
bd057b44dd
- automatic setting of peer-does-not-accept-remote-crawl
...
- increased percentage of object cache to node cache to 30%
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2136 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
cda087f43b
- integrated cache miss storage into object cache
...
- removed cache-miss handling from indexURL
todo: new Monitoring in PerformanceMemory_p
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2132 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
61078b3885
*) adding support for delayed shutdown
...
- needed by Ismael to receive the Steering page properly on shutdown
- now the steering page should always be displayed properly in the web browser
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2129 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
90d569d70f
refactoring of index management:
...
url storage is part of index management; moved plasmaURL to indexURL
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2122 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
a930be4ba3
refactoring of index management:
...
generalized the index entry
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2121 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
hermens
df7e1d9df3
Changes to plasmaURL and subclasses:
...
- Improve performance of plasmaURL.exists() by remembering URL-hashes that are not present
- Use a more realistic estimation of memory usage by the existsIndex cache
- Routine cleanup of the existsIndex to limit its memory usage
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2113 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
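The idea behind commit df7e1d9df3, remembering URL hashes that are known to be absent and limiting the memory this takes, might look roughly like the sketch below. It is not the plasmaURL code: `NegativeLookupCache`, `markStored`, the `slowExists` predicate and the entry-count bound are all assumptions made for illustration, and the commit's actual cleanup strategy is not described in the log.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical sketch: remembers keys known to be absent from a slow store. */
class NegativeLookupCache {
    private final Map<String, Boolean> knownMissing;
    private final Predicate<String> slowExists; // stands in for the database lookup
    int slowLookups = 0;                        // how often the slow store was asked

    NegativeLookupCache(final int maxEntries, Predicate<String> slowExists) {
        this.slowExists = slowExists;
        // Insertion-ordered map that drops its oldest entry once the bound
        // is exceeded (assumed cleanup policy, chosen for simplicity).
        this.knownMissing = new LinkedHashMap<String, Boolean>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                return size() > maxEntries;
            }
        };
    }

    /** Answers from the negative cache if possible, else asks the slow store. */
    boolean exists(String urlHash) {
        if (knownMissing.containsKey(urlHash)) return false;
        slowLookups++;
        boolean found = slowExists.test(urlHash);
        if (!found) knownMissing.put(urlHash, Boolean.TRUE);
        return found;
    }

    /** Must be called when a hash is stored, or the cache would go stale. */
    void markStored(String urlHash) {
        knownMissing.remove(urlHash);
    }
}
```

The important design point in any such cache is invalidation: every write path that adds a URL hash has to clear the corresponding negative entry, otherwise `exists()` keeps reporting a stored URL as missing.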
orbiter
a474669338
start with refactoring of index management
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2110 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
f331def5d8
*) Bugfix for distribution. Incorrect behavior if peerCount == selectedCount
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2098 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
bcc950c533
*) Bugfix for Index Transfer
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2088 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
461548698c
configuration of index transfer chunk size
...
see http://www.yacy-forum.de/viewtopic.php?p=20951#20951
new properties in yacy.init:
indexDistribution.minChunkSize = 5
indexDistribution.maxChunkSize = 1000
indexDistribution.startChunkSize = 50
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2073 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
hermens
51e3bb576f
Don't increase dhtTransferIndexCount when the last transferred index was smaller
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2064 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
hermens
a0ca4c5fb8
Remove a possible race condition between DHT transfer and deQueue
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2059 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
60e5aff9fc
some enhancements to the remote crawl trigger
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2030 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
14d6e476c9
tried to solve some problems with new picture viewer
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2019 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
f0833b0328
introduced simple search interface
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2007 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
83e0e765ec
redesigned some parts of the html scanner & parser
...
to better support image tags
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1995 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
e2e8d0c188
some kind of refactoring of yacysearch:
...
made 'room' for new picture search result presentation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1993 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
rramthun
250864406f
...
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1955 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
63f39ac7b5
added 3 new crawling steering options:
...
- re-crawl by age of page (enter in minutes)
- auto-domain-filter
- maximum number of pages per domain
NOT YET TESTED!
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1949 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
1fc3b34be6
some pre-work (without function yet) to implement:
...
- re-crawl (by age of last crawl)
- auto-crawl-filter by crawl depth (to be explained..)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1948 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
c9e6b5e391
*) check size of indexing-queue and crawler pool before processing remote triggered crawl jobs
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1946 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
1f4412a146
adapted isListed to the new behavior as discussed (url, getFile)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1940 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
063ef4660a
bug?
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1936 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
3286b1f498
re-organisation of lurl-creation and -stacking
...
this was necessary to prevent useless writes to the database
when the url appears on the blacklist
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1905 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
hydrox
8da13088e9
*)removed multiple DHT_Distribution_Threads
...
*)boosted DHT_Distribution by sending chunks in parallel to multiple peers
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1890 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
bcd99fe83e
introduced a second RAM cache for DHT transfer
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1880 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
bae3783d38
added a snippet marking
...
(search words are now bold in snippets)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1823 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
f0a38873eb
* added yacysearch page with better view on search results
...
the old search page is obsolete and will be removed
* ConfigBasic.html is now the default page instead of index.html
as long as no password is set
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1815 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
759800f543
*) Bugfix for storeHTCache problem
...
- content was not indexed if storeHTCache was off
See: http://www.yacy-forum.de/viewtopic.php?p=18269
See: http://www.yacy-forum.de/viewtopic.php?t=1882
See: http://www.yacy-forum.de/viewtopic.php?t=241
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1800 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
1b9b8922d9
* fixed problems with new basic 1-2-3 configuration (now authentication required)
...
* fixed graphics problem
* fixed some other problems with default values
* 1-2-3 config now appears automatically on start-up if no password is set
* added new config menu
* moved profile to new config menu
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1792 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
auron_x
8c6f38fe70
*) added Blog to YaCy (atm not reachable through interface) -> Blog.html
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1790 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
eaffcfefe2
* added more ranking attributes (without function; this will be added later)
...
* added ranking coefficient transmission to remote peer (without evaluation on server side, will be added later)
* changed ranking coefficients slightly
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1770 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
3703f76866
- fixed re-search bug: after a search with several words, a second search could not
...
find the same words as before. This happened because indexContaines stored the url references
with a hashtable. A tree was needed to work with the index conjunction-by-numeration
- added permanent ram cache flush (again)
- removed direct flush of ram cache after a large container is added.
this happens especially during DHT transmission and therefore this fix should
speed up DHT transmission on server side.
- removed unused and out-dated methods
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1765 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
fbbbf5f411
*) remote trigger for proxy-crawl
...
- remote crawling can now be enabled for the proxy crawling profile
See: http://www.yacy-forum.de/viewtopic.php?p=17753#17753
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1758 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
1d8ca6e082
serialized dhtChunk deletion with indexing
...
The dht selection, transmission and deletion are now completely serialized with indexing
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1731 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
2336f0f013
*) allow pausing/resuming of crawlJob Threads separately
...
- pausing/resuming localCrawls
- pausing/resuming remoteTriggeredCrawls
- pausing/resuming globalCrawlTrigger
See: http://www.yacy-forum.de/viewtopic.php?t=1591
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1723 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
60dac4325e
serialized indexing with dht selection
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1719 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
a840755964
moved parts of index transfer logic back to switchboard
...
this is needed to merge the dht selection with the indexing thread
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1718 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
borg-0300
64441b1f78
ADDED: yacy.badwords list to filter the topwords
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1711 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago