theli
a5ed86105b
*) bugfix for handling of ResourceInfo object in proxy
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2512 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
hermens
ff4362b02d
some more fixes for new plasmaCrawlLURL.load behavior
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2511 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
hermens
7aeadbe7cc
another NullPointerException in http.ResourceInfo
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2510 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
141f9e5bb4
fix for new plasmaCrawlLURL.load behavior
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2509 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
1e7fd48afd
added size method to ftpc
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2508 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
hermens
087f7511f8
prevent NullPointerException in http.ResourceInfo
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2507 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
a2525072f2
bugfix for kelondroRow - property generation
...
this bug affected ranking parameters :-(
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2506 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
hydrox
59a5511dbb
*) added missing static Strings as requested by theli
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2505 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
6578564c9a
*) Ignore more hop by hop http headers
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2504 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
b44514242a
*) crawler/ftp/CrawlWorker.java: better error handling
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2503 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
7d7f30139c
*) crawler/ftp/CrawlWorker.java: delete old cache file
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2502 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
4ae0f122f8
*) ResourceInfo.java: License header added
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2501 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
043edfa4d8
*) ftp/ResourceInfo.java ResourceInfo object for ftp resources added
...
*) ftp/CrawlWorker.java: better error handling for ftp crawler
*) plasmaCrawlEURL.java: some error codes added
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2499 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
4866868c0e
added write cache for LURLs
...
This was necessary to speed up the index receive process during global search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2498 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
8a0e35618b
enhancements to search result preparation
...
- added detailed count on remote search results
- enhanced search sequence during remote searches (doing local search in sequence)
- strict adherence to timeout limits
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2497 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
5c1bb53d2a
Missing description for last commit
...
*) next step of restructuring for new crawlers
> HTCaching should now work independently of the protocol
-- introduction of new ResourceInfo objects containing protocol-specific metadata
of a resource.
-- the ResourceInfo objects now implement old functions like shallIndexCacheForXXX,
shallStoreCacheForXXX in a protocol-dependent manner
> Indexing should now also work independently of the protocol
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2496 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
dae763d8e3
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2495 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
4825bfaaf3
*) Bugfix for PrintWriter Problem
...
See: http://www.yacy-forum.de/viewtopic.php?t=2792
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2494 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
d4c5e2af01
html-dirlist can now also be generated from existing connections
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2493 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
7930839594
*) URL.java: userinfo was not carried over when generating a new url from a base url and a relative path
...
*) CrawlWorker.java: using new dirhtml function of ftpc
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2492 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
17ba468165
added html dirlisting generation in ftpc.java:
...
ftpc.dirhtml() generates a StringBuffer with a complete web page
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2491 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
7a35b8e237
*) direct access to responseheaders of sbQueue.Entry removed to make it more http independent
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2487 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
ffbf416e76
*) direct access to requestheader of htCache.Entry removed to make it more http independent
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2486 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
3870d615e3
*) setting htCache.Entry fields to private
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2485 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
393a7d10be
*) setting htCache.Entry fields to private
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2484 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
ab5a9bee66
*) adding some copyright headers
...
*) next step of restructuring for new crawlers
- adding first testversion of ftp crawler class
-- does not create a htCache entry yet
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2483 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
5847492537
*) next step of restructuring for new crawlers
...
- IndexCreate_p.java: correcting problems with ftp urls
- URL.java does not cutout the userinfo anymore
(needed to transport authentication info in ftp urls, e.g. ftp://username:pwd@ftp.irgendwas.de)
- plasmaCrawlLoader.java:
-- hack to re-enable https urls
-- adding function getSupportedProtocols
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2482 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
6cce47e217
test of ftp-urls in URL class
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2481 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
fce9e7741b
*) next step of restructuring for new crawlers
...
- renaming of http specific crawler settings
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2480 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
e3f0136606
*) next step of restructuring for new crawlers
...
- adding function isSupportedProcotol to plasmaCrawlLoader.java
- disabling robots.txt check for protocols other than http(s)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2479 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
9ded4e8d5a
*) Bugfix for name resolution in proxy mode
...
See: http://www.yacy-forum.de/viewtopic.php?p=25241
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2478 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
1c8300fcec
*) Bugfix for name resolution in proxy mode
...
See: http://www.yacy-forum.de/viewtopic.php?p=25241
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2477 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
4e2a950ac9
*) next step of restructuring for new crawlers
...
- avoid using the http crawler class directly. Using the interface class instead
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2476 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
09b106eb04
*) next step of restructuring for new crawlers
...
- adding interface class (plasma/crawler/plasmaCrawlWorker.java) for protocol specific crawl-worker threads
- moving reusable code into abstract crawl-worker class AbstractCrawlWorker.java
- the load method of the worker threads should no longer be called directly (e.g. by the snippet fetcher);
to crawl a page and wait for the result, use function plasmaCrawlLoader.loadSync([...])
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2474 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
eb9b138986
*) next step of restructuring for new crawlers
...
- conversion of the crawler pool into a keyed object pool
- crawlers are now loaded based on the url protocol (of course works only for http now)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2473 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
1395aae742
*) starting restructuring which is needed to add crawlers for additional protocols
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2472 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
b4acbdaa97
*) better handling of server shutdown
...
See: e.g. http://www.yacy-forum.de/viewtopic.php?p=25234
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2470 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
f3ac4dbbb9
*) better handling of server shutdown
...
See: e.g. http://www.yacy-forum.de/viewtopic.php?t=2584
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2468 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
959b779aba
*) avoid performance loss if log level is greater than 'fine'
...
See: http://www.yacy-forum.de/viewtopic.php?p=25180
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2467 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
auron_x
b515d49f87
*) fix for new combinedVersionString2PrettyString by bost
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2466 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
auron_x
24316ba937
*) improved implementation of combinedVersionString2PrettyString by bost
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2465 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
auron_x
57dda1a92c
*) another fix for the wrong version display, now working entirely with double instead of float
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2464 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
auron_x
479b74e1dd
*) fix for stupid mistake in new ppm-calc which caused decimal digits being written to seedinfo
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2463 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
auron_x
5e558fbaae
*) hopefully fixed the wrong display of yacy-version
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2462 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
auron_x
348258a557
*) changed PPM-calculation to be much more accurate
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2461 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
18b6876860
new cache flush configuration settings
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2460 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
hermens
f0278b4092
Bugfix for / by zero when the AssortmentCluster is empty
...
See: http://www.yacy-forum.de/viewtopic.php?t=2746
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2459 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
14e0bb0dcf
allow more references per word for new db
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2458 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
985dcbde7f
changed some parameters, which may improve memory usage and indexing speed
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2457 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
b7f4a1521b
added options to switch on or off the kelondroFlexTable for NURL, EURL and PreNURL
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2456 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago