theli
1395aae742
*) starting restructuring which is needed to add crawlers for additional protocols
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2472 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
b4acbdaa97
*) better handling of server shutdown
...
See: e.g. http://www.yacy-forum.de/viewtopic.php?p=25234
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2470 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
f3ac4dbbb9
*) better handling of server shutdown
...
See: e.g. http://www.yacy-forum.de/viewtopic.php?t=2584
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2468 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
959b779aba
*) avoid performance loss if log level is greater than 'fine'
...
See: http://www.yacy-forum.de/viewtopic.php?p=25180
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2467 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
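A minimal sketch of the kind of guard this commit message describes, assuming the cost came from building log strings that were then discarded. The class and method names (`GuardedLogging`, `describe`, `logFine`) are hypothetical, and the use of `java.util.logging` here is an assumption, not taken from the commit itself.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical illustration: only build an expensive log message
// when the logger would actually emit it.
final class GuardedLogging {
    // Counts how often the expensive message was built (for demonstration).
    static int expensiveCalls = 0;

    // Stand-in for an expensive string construction.
    static String describe(int[] values) {
        expensiveCalls++;
        StringBuilder sb = new StringBuilder();
        for (int v : values) {
            sb.append(v).append(' ');
        }
        return sb.toString().trim();
    }

    // The guard: skip message construction unless FINE is enabled.
    static void logFine(Logger log, int[] values) {
        if (log.isLoggable(Level.FINE)) {
            log.fine("values: " + describe(values));
        }
    }
}
```

Without the `isLoggable` check, `describe` would run on every call even when the logger is set to `INFO` or higher.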
auron_x
57dda1a92c
*) fixed wrong version display again; now works entirely with double instead of float
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2464 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
auron_x
479b74e1dd
*) fix for a mistake in the new PPM calculation which caused decimal digits to be written to seedinfo
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2463 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
auron_x
348258a557
*) changed PPM-calculation to be much more accurate
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2461 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
18b6876860
new cache flush configuration settings
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2460 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
hermens
f0278b4092
Bugfix for division by zero when the AssortmentCluster is empty
...
See: http://www.yacy-forum.de/viewtopic.php?t=2746
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2459 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
14e0bb0dcf
allow more references per word for new db
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2458 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
985dcbde7f
changed some parameters that may cause better memory usage and more indexing speed
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2457 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
b7f4a1521b
added options to switch on or off the kelondroFlexTable for NURL, EURL and PreNURL
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2456 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
c26da4893b
reverted NURL usage to kelondroTree; kelondroFlexTable still has problems with deleted entries
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2454 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
db1eae0227
* simplified initialization of database objects
...
* replaced kelondroTree for NURLs by kelondroFlex
* replaced kelondroTree for EURLs by kelondroFlex
take care, may be very buggy
please finish crawls before updating. crawls will be lost.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2452 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
hermens
0b73f2b132
Repair DNS prefetch during cacheScan
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2451 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
27a159b401
* documentation update
...
* removed doc from release
* release information in doc/News.html
* release 0.46
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2442 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
f80f776b89
*) Trying to solve NullPointerException problem in function addURLtoErrorDB
...
See: http://www.yacy-forum.de/viewtopic.php?t=2705
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2441 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
d78b824e85
fixed problem with default path after first start-up
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2440 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
hydrox
1c99b5a484
*) fixed logging for urldbcleanup
...
*) changed exception handling in urldbcleanup so that it shows NullPointerException correctly
*) added more Blacklisting to urlcleaner
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2436 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
135e019883
removed one superfluous line from last commit
...
(hasnot is included in remove)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2435 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
1591a55963
added object cache miss-cache use for remove method
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2434 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
8f3f4ab0eb
enhanced synchronisation in plasmaWordIndex
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2433 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
f933f00f09
another patch to URL protocol handling for 'news', 'nntp' etc:
...
reject it! (the java.net.URL class rejects them too)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2432 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
4c6e00d80a
more bugfixes for URL class, see:
...
http://www.yacy-forum.de/viewtopic.php?p=24844#24844
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2431 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
23dd972608
fixed memory calculation in performanceMemory web page
...
fixed also maximum cache size computation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2429 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
b7dc251948
fixed bugs in url class:
...
- correct backpath ('..') handling
- correct absolute path handling
- included https
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2428 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
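A minimal sketch of backpath ('..') handling for the path part of a URL, for illustration only. The class and method names (`PathNormalizer`, `normalize`) are hypothetical; this is not the code from `de.anomic.net.URL`, and it assumes the input is an absolute path without query or fragment.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration of resolving "." and ".." segments in a path.
final class PathNormalizer {
    static String normalize(String path) {
        Deque<String> segments = new ArrayDeque<>();
        for (String seg : path.split("/")) {
            if (seg.isEmpty() || seg.equals(".")) {
                continue;                      // skip empty and current-dir segments
            }
            if (seg.equals("..")) {
                if (!segments.isEmpty()) {
                    segments.removeLast();     // step back one directory
                }                              // ".." above the root is ignored
            } else {
                segments.addLast(seg);
            }
        }
        StringBuilder sb = new StringBuilder();
        for (String seg : segments) {
            sb.append('/').append(seg);
        }
        if (sb.length() == 0) {
            return "/";                        // everything collapsed to the root
        }
        if (path.endsWith("/")) {
            sb.append('/');                    // keep an explicit trailing slash
        }
        return sb.toString();
    }
}
```

Under these assumptions, `normalize("/a/b/../c")` yields `/a/c`, and a backpath that would climb above the root is clamped to the root.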
orbiter
1ce3c22761
better memory control:
...
- added memory monitor for preNURL-db in performanceMemory
- changed default memory assignments
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2427 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
39b4c26bdc
more memory control:
...
- catching of OutOfMemoryError in server threads
- automatic adaptation of word cache size after a Short Mem Cycle
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2426 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
3e9d509c39
some small fixes
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2425 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
276225d79e
fix for URL class
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2423 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
eb633c0a4f
server threads must now supply a method that can be called in case
...
of short memory. This has been realized for the indexing thread.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2421 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
f5720cb2fa
removed most synchronization in wordIndex (for testing)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2420 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
0187c60010
because of a bug in the JRE 1.4.2 there was no memory protection
...
see http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=4686462
this commit fixes the bug by using a memory-computation patch.
All uses of Runtime.maxMemory have been replaced by serverMemory.max
The bug is no longer present in Java 1.5
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2419 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
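The commit message does not say how the memory-computation patch works. The sketch below only illustrates the general idea of routing all memory queries through one helper so that a workaround lives in a single place. The class `MemoryInfo` is hypothetical and is not `serverMemory`; the fallback to `totalMemory()` is an assumption chosen for the example.

```java
// Hypothetical illustration of a central memory helper.
final class MemoryInfo {
    // Upper bound of the heap as far as it can be determined.
    static long max() {
        Runtime rt = Runtime.getRuntime();
        long reported = rt.maxMemory();
        // Runtime.maxMemory() returns Long.MAX_VALUE when there is no
        // inherent limit; fall back to the current heap size in that case
        // (an assumption made for this sketch).
        return reported == Long.MAX_VALUE ? rt.totalMemory() : reported;
    }

    // Memory still obtainable: unused heap plus room for heap growth.
    static long available() {
        Runtime rt = Runtime.getRuntime();
        return max() - rt.totalMemory() + rt.freeMemory();
    }
}
```

Callers would then use `MemoryInfo.max()` instead of calling `Runtime.maxMemory()` directly, which is the shape of the replacement the commit describes.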
auron_x
4eca0f8830
*) fixed PPM calculation for multiple indexer-threads
...
*) fixed totalPPM calculation and added total PPM to Network.html
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2418 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
cfb51fdef1
less synchronization in plasmaWordIndex
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2416 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
d6a928c2da
quickfix for http://www.yacy-forum.de/viewtopic.php?t=2705
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2415 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
6ad471ef96
* applied many compiler warning recommendations
...
* cleaned up code
* added unit test code
* migrated ranking RCI computation to kelondroFlex and kelondroCollectionIndex
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2414 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
allo
cf1186597b
utf fix from theli
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2412 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
hydrox
9da3aa74d3
silly me, fix for the fix as advised by theli
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2408 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
hydrox
bb3d9a5582
*) e.getMessage().indexOf() can only be used if there actually is an exception message.
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2407 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
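A minimal sketch of the null-safe check this commit message refers to. `Throwable.getMessage()` may return `null`, so calling `indexOf` on it directly can itself throw a NullPointerException. The helper name `messageContains` is hypothetical.

```java
// Hypothetical illustration of a null-safe message check.
final class SafeMessage {
    // Returns true only if the throwable has a message containing the needle.
    static boolean messageContains(Throwable e, String needle) {
        String msg = e.getMessage();           // may be null
        return msg != null && msg.indexOf(needle) >= 0;
    }
}
```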
hydrox
7a54010a9c
*) Iterators can't be cast to IndexContainer
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2406 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
5e0b6f8f83
*) sorting peer name list on Blacklist_p.html
...
*) restructuring of sharedBlacklist_p.java
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2405 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
cd5f7e137c
fixed problem with NURL-generation upon first startup
...
(a new kelondroFlexTable was generated, which should not have happened)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2402 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
8418af141a
added several consistency checks and small changes
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2400 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
9d13aeca13
*) removing class. does not work so far
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2399 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
95a84ae469
*) adding missing classes
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2398 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
eee44be602
*) adding an interface for customized blacklist classes
...
- now it's possible to use a customized blacklist engine
instead of the default one
- this can be done by configuring the property BlackLists.class
See: http://www.yacy-forum.de/viewtopic.php?t=2108
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2397 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
6d2f15971a
there is a very strange error that causes the kelondroRecords structure
...
to become corrupted. The cause is that the deleted-records-chain has wrong entries,
and one of the pointers in that chain points to a place beyond the end of the file.
This causes an IndexOutOfBoundsException within an IO operation.
I currently don't know why the deleted-records-chain is
corrupted, but the error can be caught. If this now happens with the
assortment database, the database is deleted.
See also:
http://www.yacy-forum.de/viewtopic.php?p=24586#24586
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2396 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
d2e8e76218
*) now it's possible to configure the yacy blacklist separately for dht, search, proxy, crawler
...
See: http://www.yacy-forum.de/viewtopic.php?t=2541
http://www.yacy-forum.de/viewtopic.php?p=24516
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2389 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
9ae9062bd3
* disabled new kelondroFlex table for NURLs
...
* added new RAM index Class
* fixed possible synchronization problem in kelondroRecords
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2388 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
689bbcf9cd
replaced kelondroTree db for NURLs by new kelondroFlexTable
...
The new database is only created if the old is deleted or does not exist
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2387 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
7fbba41962
synchronization fixes
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2386 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
328f9859a5
more synchronization in plasmaWordIndex
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2385 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
f43c90fa98
fixed handling of null referer in crawlOrder
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2384 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
130e6d4719
generalized index object for eurl, nurl and lurl to prepare move
...
of these tables to new kelondroFlexTable Object
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2382 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
acdf24877f
more synchronization against outOfMemoryError in wordIndex
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2381 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
95160d7f2c
fixed size computation of index elements from the collection index
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2380 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
26116cabde
added missing rowdef assignment
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2379 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
cfbacbbf08
reverted change in robotsParser
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2378 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
abf22f6e60
removed url normalform computation from htmlFilterContentScraper.
...
This method was implemented in de.anomic.net.URL
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2377 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
740d49751d
* strict type and size check in kelondroRow handling
...
* adapted all code to use the declaration form of kelondroRow
* fixed a bug in kelondroRow which caused wrong parsing of encoding type
* the bug caused bad database behaviour in the new indexCollection data structure.
because of this bug, all existing test databases are now invalid. A new database is created
* the kelondroFlexTable and indexCollection data structures now store a declaration of the row definition
into a properties file along the database files.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2375 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
314021453f
* more logging
...
* option in yacy.init to set useCollectionIndex usage
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2374 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
allo
a52f36787f
better templatedebugging
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2371 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
allo
3480d36417
added some debug code
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2369 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
61b151b083
* added another auto-fix for collection index inconsistency check
...
* fixed words size computation for collection index
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2368 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
0bbbd129ef
small fix for exception message
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2367 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
718fbc2dae
enhancements in kelondroCollectionIndex:
...
* synchronized array and index objects
* auto-fix function for slightly corrupted index entries
* generalized internal access methods
also extended kelondroIndex interface to support ordering access
which is used in kelondroCollectionIndex for string comparisons
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2366 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
f58283def2
better control of index flush
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2364 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
4be21a3cab
ups
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2363 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
80b6c90d54
enhancements to prevent blocking during dht transfer receive
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2362 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
9f298083cd
*) adding more urls to the error url
...
- old error strings were replaced with their corresponding constants
See: http://www.yacy-forum.de/viewtopic.php?t=2638
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2360 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
hermens
d56f06401e
- Cache known URLs during indexReceive to avoid getting blocked during loadedURL.exists() whenever possible
...
- Small logging updates
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2359 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
c09f734d06
*) offer router configuration on ConfigBasic.html
...
- checkbox to allow router configuration is shown if
- a) the UPnP forwarder is installed
- b) a UPnP enabled router was found
- c) no other forwarder was configured
See: http://www.yacy-forum.de/viewtopic.php?p=24264
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2358 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
hermens
dcbb4d0a6b
Display the size of HashBlacklistedCache on PerformanceMemory page.
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2357 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
d799622da1
better flush limit for index collections
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2354 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
d468d665c9
some changes that may help to prevent deadlocks that cause an OutOfMemoryError
...
as described in
http://www.yacy-forum.de/viewtopic.php?p=24359
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2353 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
d54767f634
*) last step of removing embedded html from dir class
...
- migration finished
*) dir list now sorts the dirlist entries.
- directories are listed before files
- files are sorted alphabetically, case insensitive
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2351 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
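The ordering described in this commit message (directories before files, names alphabetical and case insensitive) can be sketched as a comparator. This is an illustration only: the `Entry` type and the class name `DirListingOrder` are hypothetical and stand in for whatever the dir class actually sorts, and ordering directories by name among themselves is an assumption.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration of the directory listing order.
final class DirListingOrder {
    // Minimal stand-in for a directory listing entry.
    static final class Entry {
        final String name;
        final boolean isDirectory;

        Entry(String name, boolean isDirectory) {
            this.name = name;
            this.isDirectory = isDirectory;
        }
    }

    // false sorts before true, so directories (key false) come first;
    // ties are broken by name, ignoring case.
    static final Comparator<Entry> ORDER =
        Comparator.comparing((Entry e) -> !e.isDirectory)
                  .thenComparing(e -> e.name, String.CASE_INSENSITIVE_ORDER);

    // Returns the entry names in listing order without modifying the input.
    static List<String> sortedNames(List<Entry> entries) {
        List<Entry> copy = new ArrayList<>(entries);
        copy.sort(ORDER);
        List<String> names = new ArrayList<>();
        for (Entry e : copy) {
            names.add(e.name);
        }
        return names;
    }
}
```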
orbiter
279b1d969d
Integrated new indexing data structure 'collections' into the main class
...
for indexing, the plasmaWordIndex.
The new data structure is ready-to-use, but currently disabled.
It can be activated by setting the static
plasmaWordIndex.useCollectionIndex
to true. This should be done for testing purposes only.
The new index is stored to
DATA/INDEX/PUBLIC/TEXT
The directory PLASMA shall be used only for crawler in the future.
Attention: during testing the data structure in INDEX may change,
and indexes created with the new data structure may become useless.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2348 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
4ff742e42d
implemented indexCollectionRI
...
this is the new database structure that is supposed to replace the
plasmaAssortmentCluster AND the plasmaWordIndexFileCluster
The new structure is not yet active and needs to be integrated into
plasmaWordIndex. This has some migration constraints that are not yet
completely solved.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2347 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
01f95eccd3
re-write of kelondroCollectionIndex. This is the data structure that
...
shall replace the current assortment files.
* used the kelondroFlexTable to hold the index of collections
* used kelondroRow definitions to declare all data structures
* fixed several bugs that appeared in kelondroRowSet and kelondroRowCollection during testing
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2344 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
ebc2233092
* implemented (finished) class indexRowSetContainer
...
* replaced indexTreeMapContainer by indexRowSetContainer
* deleted indexTreeMapContainer and abstract class
This is another step to the new database structure
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2343 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
9183d21f25
renamed new index class to old name
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2342 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
c4e922885a
replaced indexURLEntry by new class that uses a kelondroRow.Entry object
...
to store the index entry. This is another step to move to the new database structure.
A side effect of this change is, that index storage uses much less RAM space,
which affects the index RAM cache.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2341 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
0b7112f8b2
fix for missing topLevelClone in indexRAMCacheRI.wordContainerIterator
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2340 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
e357599f92
* fixed problem with indexContainer iteration from RAM:
...
indexContainers from RAM must be cloned explicitly to prevent
side-effects on stored indexContainer objects in Cache
* changed behaviour of urlReference deletion from indexContainers:
deletion does not use retrieval of all Elements from the assortments
* added textual configuration of kelondroRow and kelondroColumn definition
* update of kelondroRow usage in yacyNews
* modified kelondroAttrSeq to use modified kelondroColumn parser
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2339 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
57fe5cc671
*) code cleanup
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2338 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
allo
4e9f02c8ec
integration of Michael's string-extraction.
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2337 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
8b77afd72c
some fixes to new container merger
...
and some code cleanup
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2336 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
830167596a
bugfix for
...
http://www.yacy-forum.de/viewtopic.php?p=24127#24127
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2333 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
839806a775
*) serverPortForwardingUpnp.java: code cleanup, license header added
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2332 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
03230cd887
*) removing old port forwarding classes
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2330 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
6e676224d0
*) adding support for upnp
...
A new port forwarding method for upnp was added.
If this method is enabled, yacy automatically detects a UPnP-capable
internet gateway and configures the gateway's port forwarding
settings properly.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2328 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
417ed5102e
redesign of database iterators:
...
an iteration of key elements in kelondroTree databases is no longer supported.
this is now replaced by an iteration of kelondroRow.Entry objects from the database
Iteration of keys from the database was mostly followed by retrieval of the row
from the database, which caused unnecessary database load.
The index selection was also redesigned to use the new row iteration methods.
This affects many functions; the most important is the DHT selection routine, which is now much faster.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2327 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
0db237467f
*) bugfix for URL generation from file
...
see: http://www.yacy-forum.de/viewtopic.php?p=24116
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2326 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
ad692fc6c7
implemented option to extract nurls from the database
...
(plus some iteration enhancements for nurls)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2325 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
7fd90ca7c8
* strict handling of NURL entry element generation, storage and stacking
...
* more space for EURL reason strings (you must delete the EURL db to use this)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2324 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
5f72be2a95
some redesign of EURL storage
...
* store() is now called explicitly
* more urls are written to the EURL table
* the EURL stack does not store the complete entry any more, now only the URL hash
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2323 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
1ed3e2daef
added option to extract domains and/or urls from the eurl database
...
when extracting from eurl, the html output format is recommended, since
this format also adds the failure reason to the domain/url.
The complete syntax for domain extraction is now
java -Xmx<megabytes>m -classpath classes yacy -domlist [ -source { lurl | eurl } ] [ -format { text | zip | gzip | html } ] [ <path to DATA folder> ]
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2322 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
7e0a130fb5
new indexURLEntry class 'indexURLEntryNew', to replace old class
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2321 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
58df8b7bbf
a large collection of different changes
...
* mainly for the transition to the new indexing database structure
* a bugfix for an endless loop inside kelondroTree iteration
* a bugfix for bulk read inside a kelondroTree iteration; the bug caused some elements to be iterated twice
* very strong speed enhancement for url/domain extraction
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2320 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
e20ff77c10
another bugfix in new url class
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2318 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
685430a1b5
bugfix in new URL class, better logging for domain extraction
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2317 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
79af283f6c
better debugging in new URL class for wrong port numbers
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2315 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
allo
1b2ea58ee9
wrong substring invocation.
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2313 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
e4f1820b58
protection against too long authentication strings in switchboard
...
see also: http://www.yacy-forum.de/viewtopic.php?p=23943#23943
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2312 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
b3f7e62e03
better handling of whitespace
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2311 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
4149939c02
better handling of whitespace for gettext quotation
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2310 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
97fa6788a1
added gettext support:
...
automatic replacement of string appearances in html files by
gettext quotes.
see also: http://www.yacy-forum.de/viewtopic.php?p=23901#23901
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2309 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
b3c569f706
*) renaming of function getTransferedEntitySpeed to getTransferedEntrySpeed to avoid confusion
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2308 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
67edd80884
removed tabs
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2305 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
allo
67c486a023
some example code showing how supertemplates can be used.
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2304 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
5214f571cd
simplified method call in balancer
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2303 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
allo
7b0e2521bb
Support for a supertemplate, which can do everything a normal template can do.
...
It's a layer under the servlets; this means #[page]# will be replaced by servlet code, and the rest can be set by you.
(TODO: if we use this for layout, we need to read "TITLE" from the servlet's tp, to set it outside of the servlet.)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2302 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
4bd626572b
added hashCode and compareTo to new URL class
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2301 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
abb5264929
fix for
...
http://www.yacy-forum.de/viewtopic.php?p=23868#23868
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2300 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
a70cbd959b
*) further improvements for the anomic.net.url class
...
- relpath starting with javascript: are ignored now
- bugfix for concatenation of relpath starting with # or ?
in this case no slash should be added to the baseURL, otherwise
we get URLs of the form http://test.de/index.html/?param=value
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2298 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
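A minimal sketch of the concatenation rules mentioned in this commit message and the related one below it (relative paths starting with `javascript:`, `#`, `?`, `/` or `http://`). It is an illustration, not the code from the anomic.net.url class: the names `RelPath` and `join` are hypothetical, returning `null` for ignored paths is an assumption, and the base URL is assumed to be absolute with no query or fragment of its own.

```java
// Hypothetical illustration of joining a base URL and a relative path.
final class RelPath {
    static String join(String base, String rel) {
        // javascript: pseudo-links are ignored (signalled here by null).
        if (rel.regionMatches(true, 0, "javascript:", 0, 11)) {
            return null;
        }
        // Already absolute: use as is.
        if (rel.startsWith("http://") || rel.startsWith("https://")) {
            return rel;
        }
        // Fragment or query: append directly, without inserting a slash.
        if (rel.startsWith("#") || rel.startsWith("?")) {
            return base + rel;
        }
        int schemeEnd = base.indexOf("://");
        int hostEnd = base.indexOf('/', schemeEnd + 3);
        String root = hostEnd < 0 ? base : base.substring(0, hostEnd);
        // Host-absolute path: replace the whole path of the base.
        if (rel.startsWith("/")) {
            return root + rel;
        }
        // Plain relative path: resolve against the directory of the base.
        String dir = hostEnd < 0 ? base + "/" : base.substring(0, base.lastIndexOf('/') + 1);
        return dir + rel;
    }
}
```

With these rules, joining `http://test.de/index.html` and `?param=value` gives `http://test.de/index.html?param=value` rather than the malformed `http://test.de/index.html/?param=value` the commit mentions.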
theli
8a1f1d96b3
*) Bugfix for url concatenation. Relative urls with / or http:// at the beginning
...
were not handled correctly on url concatenation via new URL(URL,relPath).
See: http://www.yacy-forum.de/viewtopic.php?t=2623
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2297 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
rramthun
ca33eaa442
- Some spelling
...
- Removed unused init value
- Set default upload value to "none", which avoids a warning on new installations saying that upload method '' is unknown
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2295 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
allo
8795875800
dirlisting for all empty directories.
...
updating dir.java is no longer a problem, because it is only needed in htroot/htdocsdefault.
migration to delete old dir.* files in the fileshare
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2294 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
7935f27038
enhanced synchronization in balancer
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2291 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
3879a0ecd0
replaced java.net.URL usage by use of new class de.anomic.net.URL
...
This shall be seen as an experiment to exclude all cases where
there could be a DNS lookup during URL comparison.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2290 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
07900366ac
deactivated cache-initialization for file-indexes (files in WORDS)
...
see also: http://www.yacy-forum.de/viewtopic.php?p=23801#23801
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2289 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
40aa735520
fixed timing problem causing too long a delay during initialization of kelondroTree objects
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2288 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
d2bb3f442e
fixed timing problem causing a division by zero exception
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2287 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
allo
6acb6a4d8f
tiny performance optimization
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2285 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
allo
2bdf1fc360
totalPPM
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2282 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
24a02cbeef
*) Bugfix for not parsable application/xhtml+xml resources if
...
a URL has no extension
See: http://www.yacy-forum.de/viewtopic.php?p=23687
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2280 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
b0ca5fa784
some correction algorithm for preload time computation during assortment open
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2279 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
e22cbaee97
- extended logging for preload
...
- reduced preload-time for IndexImport_p.java
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2278 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
671fd9a5c9
work towards new indexing database structure
...
(no effect on current functionality yet)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2277 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
92f4cb4d73
added option to configure the start-up delay time for kelondro database files.
...
the start-up delay is used to pre-load the database node cache
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2276 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
ce9dd3e76d
some work in the index construction zone (no effect yet)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2275 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
fe617d7e54
*) adding function to return the protocol type of an SSL connection
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2274 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
018b3e0832
added pause option to server threads.
...
The pause is started by calling intermission(Long.MAX_VALUE)
and can be stopped by calling intermission(0)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2272 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
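A minimal sketch of how such a pause switch could work, assuming the thread checks a deadline before each unit of work. Only the method name intermission and its two argument values come from the commit message above. The class name, the field and the isPaused helper are hypothetical and not taken from the project's code.

```java
class PausableWorker {
    // Point in time (epoch millis) until which the worker should stay idle.
    private volatile long intermissionUntil = 0L;

    // Request a pause of the given length; 0 ends a running pause.
    void intermission(long pauseMillis) {
        long now = System.currentTimeMillis();
        // Guard against overflow so Long.MAX_VALUE means "pause indefinitely".
        intermissionUntil = (pauseMillis >= Long.MAX_VALUE - now)
                ? Long.MAX_VALUE
                : now + pauseMillis;
    }

    // Hypothetical helper: a worker loop would call this before each job.
    boolean isPaused() {
        return System.currentTimeMillis() < intermissionUntil;
    }

    public static void main(String[] args) {
        PausableWorker worker = new PausableWorker();
        worker.intermission(Long.MAX_VALUE);
        System.out.println(worker.isPaused()); // true
        worker.intermission(0);
        System.out.println(worker.isPaused()); // false
    }
}
```

Storing a deadline instead of a boolean flag lets the same call express both timed and indefinite pauses.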
orbiter
e1a52bea22
added a class stub for the new database structure:
...
a reverse word index based on a collection index,
which is an index for a set of array files containing
row collections.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2271 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
3b69b35bf2
added pre-load of node cache entries to kelondroRecords
...
this gives the kelondroTree data structure a start-up
behaviour similar to the kelondroFlexTable: the cache is filled with
routing data in a way that performs better than
reading node records during normal operation.
The pre-load phase stops automatically after a time-out of 500 milliseconds
or if the cache is full.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2270 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
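The stop conditions described in the commit above (time-out reached or cache full) can be sketched as a simple bounded loop. This is an illustration under stated assumptions, not the kelondroRecords implementation. The class name, method signature and the use of a plain Map as the cache are hypothetical.

```java
import java.util.Iterator;
import java.util.Map;

class NodeCachePreloader {
    // Copies entries from source into cache until the source is exhausted,
    // the cache holds `capacity` entries, or `timeoutMillis` has elapsed.
    // Returns the number of entries loaded.
    static <K, V> int preload(Iterator<Map.Entry<K, V>> source,
                              Map<K, V> cache,
                              int capacity,
                              long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        int loaded = 0;
        while (source.hasNext()
                && cache.size() < capacity
                && System.currentTimeMillis() < deadline) {
            Map.Entry<K, V> entry = source.next();
            cache.put(entry.getKey(), entry.getValue());
            loaded++;
        }
        return loaded;
    }
}
```

A caller would pass 500 as timeoutMillis to match the time-out named in the commit message.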
orbiter
85d575e928
enhancements to kelondroRow and kelondroColumn
...
these are changes towards a better indexURLEntry implementation
which are needed for the new database structures
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2268 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
ab1ed053f5
another small correction
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2267 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
b92561fb67
removed unused code
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2266 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
eadbd56fc5
small adjustment to last commit
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2265 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
e9765ac4e6
introduced bulk read for node iterator in kelondroRecords
...
this speeds up the iterator by a factor of 2
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2264 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
6643da3fbd
bugfix for http://www.yacy-forum.de/viewtopic.php?p=23463#23463
...
(affected URL DB Cleaner)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2263 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
866d53ed70
fix for DNS block bug
...
see http://www.yacy-forum.de/viewtopic.php?p=23458#23458
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2262 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
6af70febef
- added kelondroTree index option to kelondroFlexTable
...
- automatic generation of index file when index is too large for RAM
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2261 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
dd2865178a
major bugfix (searched a whole week for the bug) for
...
the kelondroRowBuffer, which mostly affects the
kelondroFlexTable but also all other database functions
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2260 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
f9b9d085c4
just changed testing code
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2259 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
b594ee9a5a
*) Adding possibility to configure if the http proxy should send the
...
X-Forwarded-For header (requested by TeeSee)
See: http://www.yacy-forum.de/viewtopic.php?t=2577
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2257 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
ef84fc4956
added IOException to size() and row()
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2256 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
84dfd76a6a
kelondroFlex bugfix
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2254 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
hydrox
8ba8e2b7d9
*) added cache for blacklisted URL hashes received via DHT. DHT does not request URLs listed in this cache.
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2251 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago