orbiter
d98418390b
- introduced rankingProfile Class
...
- selection of ranking and timing profiles for each search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1539 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
eab1805bca
refactoring: plasmaSearchProfile -> plasmaSearchTimingProfile
...
This was done to distinguish this profile from the
(to-be-implemented) plasmaSeachOrderProfile
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1538 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
6eef848954
re-design of post-ranking process
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1537 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
be77fe1a88
code clean-up
...
@Martin: please check why the variable assignments
in plasmaCrawlNURLImporter were there. As they stood, they were redundant.
I'm sure that is not what you intended.
Should they perhaps be global variables?
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1529 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
0bc2aaeb42
added normalization to search attributes
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1528 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
008bcb7fb8
*) simplifying code by moving closeTransferIndexes into finally block
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1522 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
50d85657b8
*) new import function for IndexImport_p.html
...
- can be used to import the crawling queue (noticeUrlDB + stacks)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1518 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
214302284e
*) undoing last commit because of problems with getUpdateTime
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1514 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
408de3beee
*) avoid searching the treemap twice for the same key
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1513 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
borg-0300
139ba4e0c8
Bugfix for getCachePath(URL url)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1510 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
442807cb29
*) Bugfix for last commit
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1506 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
22fd1ca9aa
*) minor changes
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1505 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
6a99304b2b
*) Redesign of db import functionality
...
- restructuring to allow different import tasks to be controlled via one gui
- adding possibility to import a single assortment file
- adding possibility to set the cache size that should be used
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1504 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
3834675084
fixed bug that caused wrong behavior of search result preparation
...
(second search on same topic resulted in fewer links)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1502 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
hermens
31c8476b5d
plasmaWordIndexCache.getContainer:
...
*) Also get entries from cache
*) calculate available remaining time for backend.getContainer correctly
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1501 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
3419b3bcdd
fix for bug that caused the peer-counter problem.
...
See http://www.yacy-forum.de/viewtopic.php?p=16016#16016
The kelondroDyn now uses a generic fill character.
kelondroDyn-Tables containing peer/word/url-hashes must not use '_'
as fill character.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1498 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
hermens
4f43816ec0
*) Fix wrong class cast in indexSize()
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1495 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
a7f0adf6fa
bugfix in entity iterator
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1490 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
fa90c3ca7a
- removed some usage of indexEntity
...
- changed index collection process: indexes are not first flushed to indexEntity,
but now collected directly from ram cache and assortments
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1489 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
aea3e00864
cleanup: removed unused temporary index management in indexEntity.
...
This is replaced by indexContainers
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1486 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
03c65742ba
changes towards the new index storage scheme:
...
- replaced usage of temporary IndexEntity by EntryContainer
- added more attributes to word index
- added exact-string search (using quotes in query)
- disabled writing into WORDS during search; EntryContainers are used instead
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1485 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
ab7a911bb3
*) Trying to solve pool not open problem
...
See: http://www.yacy-forum.de/viewtopic.php?t=1798
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1482 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
hydrox
d665f3c39c
*) fixed Threadnames for stackCrawl-Threads
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1480 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
3d5347bc8e
*) changing loglevel for some messages
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1479 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
0fcd113c42
*) last bugfix part. Seems to work now for the stackCrawler
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1478 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
b9c9eaeb44
*) next try to do a bugfix :-((
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1477 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
4b4b93c413
*) next try to do a bugfix :-(
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1476 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
d9fbad71b9
*) next try to do a bugfix
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1475 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
6da97bd2e4
*) next bugfix for threadpool problem
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1474 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
bea2b9edee
*) further redesign of threadpools to solve too many thread problem
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1473 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
784fd50437
*) more verbose thread names
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1471 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
56e4dbeb71
*) displaying current active + current idle threads in PerformanceQueues_p.html now
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1470 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
859c6a88f5
*) testing various thread pool eviction settings to avoid outOfMemory - Thread creation problem
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1467 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
f2b18cede9
AND-bugfix
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1461 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
b946e28e61
some ranking enhancements
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1460 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
rramthun
6c02f889f7
Cosmetic changes.
...
Corrected version numbering as described in http://www.yacy-websuche.de/wiki/index.php/De:Versionsnummern
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1453 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
b191f06d16
*) Adding additional logging message to locate problems with stackcrawl threads
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1452 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
d9bcd73d93
*) Bugfix for exception
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1448 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
f5abfe8d57
*) more failsafe threadpools
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1446 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
a56fefe0d3
added missing forced-flush for index cache
...
see http://www.yacy-forum.de/viewtopic.php?p=15732#15732
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1434 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
hermens
78bcb8014a
*) Limit range for selection of indexes for distribution to a DHTDistance of 0.2
...
(For wider ranges, finding enough suitable targets is unlikely)
*) Migrate Indexes from ClassicDB back to AssortmentCluster if transfer fails
*) Remove class iterateFiles from plasmaWordIndex
(The class iterateFiles from plasmaWordIndexClassicDB is used instead)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1430 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
hermens
861aae678d
*) cleanup cacheAge database when cleaning up the HTCache
...
*) Log directory deletes with level Fine
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1427 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
b4e2efef10
*) first test of new iteration function
...
ATTENTION: please don't use it at the moment
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1418 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
eabf4a0386
fix for null pointer exception during shut-down
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1415 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
47843e69e2
auto-reset for switchboard queue stack
...
bugfix for http://www.yacy-forum.de/viewtopic.php?p=15684#15684
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1414 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
d6581c445b
added content iterator for corrupted database files
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1406 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
ecdc1f7547
*) Bugfix for crawling URLs with query parameters
...
See: http://www.yacy-forum.de/viewtopic.php?p=14065
*) Preparation for http://www.yacy-forum.de/viewtopic.php?t=1719
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1405 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
fc4ae899f7
added word-position to ranking (this is only a first step)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1395 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
bb2095fe39
assortment files are now not deleted, but shifted to a backup directory.
...
See also: http://www.yacy-forum.de/viewtopic.php?p=15458#15458
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1394 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
7366e39dd3
tried to fix 100% CPU bug.
...
See http://www.yacy-forum.de/viewtopic.php?p=15569#15569
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1393 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
f14d49fae9
enhancements, bugfixes and additions to word index attribute storage
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1392 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
allo
4d33020f56
Migration to WORK
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1389 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
rramthun
1e5feedf0e
Fix for http://www.yacy-forum.de/viewtopic.php?p=15547#15547
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1388 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
f4ffa9aee5
- implemented more attributes to index entries
...
- implemented hand-over of new word index attributes during remote search
- implemented word-distance computation during search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1382 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
90b940e90e
fixed position storage problem.
...
Now the word position is properly stored.
No use of that now, but can be used for better ranking.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1378 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
0371494010
tried to add word position to index
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1377 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
f1cfee7703
removed tabs from condenser
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1376 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
hermens
37791fd529
*) Close indexEntities when "found not enough peers for distribution"
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1375 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
borg-0300
c5b6154136
added CRDistOn = true/false
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1372 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
71d5c2b2ca
better control for target peer selection for RWI transfer
...
see also http://www.yacy-forum.de/viewtopic.php?p=15343#15343
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1370 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
hermens
ca7407b7e1
*) Don't change maxTime if zero or negative
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1363 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
3d7c8aaeae
removed confusing method
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1339 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
4cd0c45a77
code cleanup
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1337 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
hermens
971247b78f
- rotate merged indexes after merging
...
see: http://www.yacy-forum.de/viewtopic.php?t=1717
- fix -rwihashlist to correctly shutdown
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1336 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
e2ff1767b5
fix for last DHT distribution bug-fix
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1330 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
060e5a0df0
fixed problem with DHT target peer selection:
...
- shifted selection in front of distribution
see http://www.yacy-forum.de/viewtopic.php?p=15131#15131
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1327 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
7c22afe3de
*) Bugfix for NullpointerException in deleteOldHTCache
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1326 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
b21b9df2d0
added section headlines generation to html parser
...
can be viewed in cache control, but is not yet included in indexing
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1320 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
rramthun
c4487deba9
Minor changes collected over some time.
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1319 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
allo
6822dce57b
Using Orbiter's function for auth
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1315 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
2028403670
- consolidated different orderings to kelondroNaturalOrder
...
- added another iteration method to rwihash-enumeration
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1309 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
9544c47684
added some UTF-8 handling.
...
hope this will help somehow; for sure not THE solution to our UTF-8 problem
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1308 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
borg-0300
9d8dca750e
BUGFIX for my last commit
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1306 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
borg-0300
5449193167
bugfix for http://www.yacy-forum.de/viewtopic.php?t=1706 (i hope)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1304 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
borg-0300
2a23f5d419
F..., Sorry, no time, later
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1303 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
borg-0300
3a2d13786e
bugfix for http://www.yacy-forum.de/viewtopic.php?t=1706
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1302 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
borg-0300
dc0999ec9c
adapted to new HTCache structure
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1290 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
9086261476
refactoring of base64 encoding:
...
the kelondro database needs specific information about the order of
base64-encoded keys. Since no other package depends on base64
(only the httpd uses base64 for encryption, but does not need to encode these strings)
it is good to move base64 encoding to the new ordering classes in kelondro.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1284 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
borg-0300
b24fcc8ca4
oom
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1281 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
borg-0300
7da232b5b9
HTCache Reset if necessary
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1280 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
borg-0300
4f18f24d81
small change
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1278 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
borg-0300
c652527620
YaCy now removes the old HTCACHE data
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1277 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
borg-0300
69f65210e2
".yacy" has its own directory;
...
happy new year :)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1275 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
allo
351fffc129
DATA/WORK for user-created content
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1274 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
allo
a81cc9d969
no DATA/DATA to avoid confusion.
...
increasing version number
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1273 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
borg-0300
b95c5d5781
BUGFIX for URLs like "/../" ...;
...
new port handling;
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1271 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
allo
9cce3c5709
dates Table for bookmarksdb(needed for del.icio.us api)
...
Files in DATA/DATA
Migration: move bookmarks.db from SETTINGS into DATA
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1270 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
hermens
11fe95832e
avoid division by zero when index transfer is extremely fast
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1269 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
allo
4ac0fd328a
First Version of the Bookmarksmanager
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1248 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
d7b6dcbe2e
*) Bugfix for MalformedURL problem if Location header is empty.
...
See: http://www.yacy-forum.de/viewtopic.php?p=14325#14325
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1247 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
hermens
5b3e01bd3c
avoid division by zero when importing very small indexes (<100 entries)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1238 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
borg-0300
b7f9adc2c9
new filters added
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1231 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
79667a172e
*) Bugfix for additional parser problem
...
See: http://www.yacy-forum.de/viewtopic.php?p=14146#14146
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1221 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
8c594841a8
*) Bugfix for incorrect indexing of URLs that were requested with Cookies in the
...
Request header
See: http://www.yacy-forum.de/viewtopic.php?p=14077
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1214 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
b5d02d649a
fixed bug that caused strange search result behaviour
...
(results from remote peers had not been saved properly after search)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1213 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
4500506735
fixed some bugs concerning url entry retrieval and indexControl interface
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1212 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
83a34b838d
* added Object allocation monitor on performanceMemory page
...
* added some final statements
* changed shutdown sequence order
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1211 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
4ff3d219e8
increased delay for cacheScan start and slowed down scan process
...
to provide more time to other tasks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1210 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
3031903d50
re-design of RAM cache flush into assortment cluster
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1209 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
0c762daf4b
better startup failure handling
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1205 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
f27f9ecf15
* activated write buffer for databases.
...
This should increase IO performance and reduce HD activity
* bugfixes for new exception-on-failure policy
* bugfixes for new IOChunks
* new Object pool for database write-buffer
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1204 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
c59d1b2f5e
- Tests with write buffer (new class kelondroBufferedIOChunks, not yet active)
...
- minor bugfixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1203 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
bb79fb5d91
- changed handling of error cases retrieving urls from database
...
(no more NULL values are returned, instead, an IOException is thrown)
- removed ugly damagedURLS implementation from plasmaCrawlLURL.java
(this inserted a static value into the Object which is not really a good style)
- re-coded damagedURLS collection in yacy.java by catching an exception and evaluating the exception message
to do:
- the urldbcleanup feature must be re-tested
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1200 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
e7d16ef831
*) Corrections in jMimeMagic MagicRule-file to detect some special rss feeds
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1196 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
386d9e45d8
*) Bugfix for code cleanup
...
- Code must be in finally block, otherwise it does not work if an error occurs!
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1193 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
5a1d45715d
*) Bugfix for parser configuration bug
...
- it was not possible to disable all parsers
See: http://www.yacy-forum.de/viewtopic.php?t=1579
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1191 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
rramthun
a1061495d4
Fixed some spelling mistakes and added some text which (should) make it easier to understand the options.
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1187 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
0cdc58aaea
fixed indexing of local domains.
...
see http://www.yacy-forum.de/viewtopic.php?p=13680#13680
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1186 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
e1c2d8ec5f
*) Speedup "removed from queue"
...
See: http://www.yacy-forum.de/viewtopic.php?p=13442#12188
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1183 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
hydrox
96930f0d2b
*) added function to remove malformed URLs from urlHash.db
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1182 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
8862b6ba4b
*) Corrections for code cleanup 1175
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1179 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
13fdebc50d
added authentication for link deletion in search result
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1177 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
37f88b4017
code cleanup
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1176 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
ec2b39c1ce
code cleanup
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1175 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
8f1f2daa5e
implemented interactive link deletion of search results.
...
next steps: attach voting and restrict to administrator
to see the deletion button, move the mouse pointer to the left of a search result
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1172 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
6d0f7e6988
*) Adding missing file
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1171 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
44fa94ac52
*) Modifications for dbImport functionality
...
- dbImporter threads are now shut down by the switchboard on server shutdown
- adding possibility to pause an importer thread via GUI
- Bugfix for abort function
See: http://www.yacy-forum.de/viewtopic.php?p=13363#13363
*) Modification of content parser configuration
- now it's possible to configure which parsers should be enabled for the proxy,
crawler, icap, etc. separately
*) htmlFilterContentScraper.java
- adding regular expression to normalize URLs containing /../ and /./ parts
*) httpc.java
- adding functionality to unzip gzipped content
- requested by roland: should be used later to allow gzipped seed lists
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1170 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
dc778659fb
fixed problem with time-out during result join which caused OR behavior instead of AND behavior
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1167 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
3d8a5ae652
code cleanup
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1166 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
64478b1f02
*) Adding possibility to delete crawler queue entries using regular expressions
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1160 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
a04930f025
code cleanup
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1158 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
low012
90b0eb144e
just a typo...
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1155 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
129b15f3e1
*) Correcting logging output of db importer thread
...
See: http://www.yacy-forum.de/viewtopic.php?t=1555
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1154 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
420d56ce79
extended db-testing
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1152 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
ecf765ec33
temporary fix to make jrpm extension compilable with my netbeans environment
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1151 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
8ed0aaae8d
*) Adding content Parser for RPM Files
...
- at the moment only the metadata is extracted
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1147 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
818d37ce44
*) Removing getSimpleName
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1143 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
b35c5a48bf
*) First version of urlRedirector.pl script
...
- with this script it's possible to pass URLs from squid
to yacy via the squid redirector interface
- these URLs are then used by YaCy to feed the crawler
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1141 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
bdf30117c1
*) Redesign of parser configuration
...
- restructuring of mimeTypes based on the parsers
- displaying parser usage count
- displaying human-readable parser names
- displaying parser version information
*) httpdFileHandler.java
- adding possibility to support "streaming" servlets
which are special servlets that can communicate with
the client via the connection streams autonomously
- the name of these new servlet types must end with the
file extension .stream
- this feature will be needed by the yacy ScreenSaver
class to fetch statistic data from the peer without the
need to reconnect to the server all the time
*) Adding human readable names and version information for
all supported parsers
*) plasmaParser.java
- adding new structure to store parser statistic data
*) Adding openDocument parser
- can be used to parse odt files
*) jmimemagic
- adding rules to detect openDocument formats properly
*) serverLog.java
- adding functions that can be used to query if a given
logging level is enabled or not.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1140 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
d4ac3e25b1
*) Bugfix for file system link bug during detection of invalid URLs
...
See: http://www.yacy-forum.de/viewtopic.php?p=13301
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1134 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
adf75bc9fa
better logging for invalid file path detection
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1133 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
40621a5663
enhancements in ranking preparation and fixed problem with parser/mime recognition
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1132 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
c650b112ea
*) Bugfix for relative URL Bug in Crawler
...
See: http://www.yacy-forum.de/viewtopic.php?p=13266#13266
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1130 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
4e73035aef
*) Bugfix for "too many open files" during index distribution
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1128 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
f57e2d67f5
shortened network overview (fewer columns fit more easily on the page)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1124 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
85282b1d98
enhanced YBR recognition and search result heuristics
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1121 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
b9cc9029e3
added ybr selection for remote search
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1119 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
0e25020f51
added first generation and usage of YBR index-files. Enhanced overall ranking of search results.
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1118 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
90d6c6223b
*) Adding color codes to network graphic legend
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1114 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
bfe51c7228
added generation of domain-list
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1112 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
0ec54d9c5f
enhanced CR-file handling and added first RCI-evaluation tests
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1110 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
c2fe3a1670
*) Updating jMimeMagic Ruleset
...
- to detect some specially formatted html documents correctly
- adding rule to detect vCards
*) plasmaParser now supports parsing of files that have a supported fileExtension
but an unsupported mimeType because the webserver has set it incorrectly to text/plain
*) Adding new vCard Parser
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1107 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
88e3234393
fine-tuning of rci-generation
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1105 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
a12759c1bf
first try to implement a rci-computation from cr-files
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1103 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
4a8e8f269e
refactoring of cr-processing; new kelondro class to handle the attribute file format
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1100 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
24dc0e0760
implemented cr-file processing and further transmission steps
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1099 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
9d9a87f445
limited htcache storage length
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1096 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
d0dfccdb77
*) Making CrawlStacker pool configurable via GUI and config file
...
See: http://www.yacy-forum.de/viewtopic.php?t=1448
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1087 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
3631cb1f6d
*) deleting empty entities during index selection
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1086 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
ca26aab9b1
*) More debugging output for migrateWords
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1085 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
9b35ae9027
*) Correcting wrong % values on IndexTransfer_p page
...
See: http://www.yacy-forum.de/viewtopic.php?p=12646
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1084 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
e6bf9d90a5
*) Fixing Problems with MalformedURLs during Word Selection
...
- removing (lurl.toString() == null) comparison because toString() is never null
- adding (lurl.url() == null) condition because url() is null if we have selected a word entry with
a malformed URL
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1083 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
86a9210264
*) indexing queue slots are now configurable via config file
...
See: http://www.yacy-forum.de/viewtopic.php?t=1480
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1081 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
3c11d7b81c
*) Bugfix for minimizeUrlDB
...
- function didn't work correctly because of new url hash structure
See: http://www.yacy-forum.de/viewtopic.php?p=12753#12753
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1080 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
9913049009
fixed outOfMemory bug caused by loops in kelondroTree during enumeration
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1079 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
bbb936b9ea
*) Bugfix for not human readable content of PDFs while viewing the URL Content via GUI
...
- This Bug also affects the snippet generation on non html/text documents
See: http://www.yacy-forum.de/viewtopic.php?t=1472
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1075 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
445e3a620f
*) Avoid rejection of html content by the crawler when the file extension is not set properly
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1074 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
444a5a9368
*) Bugfix for Entries with null url in GlobalQueue
...
See: http://www.yacy-forum.de/viewtopic.php?p=12675#12675
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1069 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
borg-0300
ebac51df52
restore defaultRemoteProfile
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1063 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
borg-0300
5778428455
move cutUrlText to nxTools,
...
max length from URLs(title) on searchpage now 120 chars
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1060 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
borg-0300
9158845c3b
bugfix for snippet text null bytes
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1059 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
f763923e0a
added missing files for last commit
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1057 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
79818a320f
introduced citation-rank transmission protocol and activate transport for anonymisation
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1055 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
7e0647f692
*) Bugfix for userDB usage during authentication
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1052 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
02f8013013
auto-delete of corrupted word files during word-migration
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1047 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
d2731418bf
added creation of global ranking files and changed url normal form usage
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1046 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
6f9f8ed8f8
*) Automatic Reset of Stack Crawler DB on startup errors
...
See: http://www.yacy-forum.de/viewtopic.php?t=1432
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1045 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
fb766413d1
*) Changes on httpc dns caching
...
- Bugfix: old dns cache did not handle case insensitive hostnames correctly.
- adding a possibility to set domain name patterns defining hostnames that should not be cached by the httpc dns cache
e.g. borg-300.dyndns.org
This can be done by setting the new httpc.nameCacheNoCachingPatterns property
- using httpc.dnsResolve wherever possible within the sourcecode
[httpd.java,plasmaCrawlStacker.java]
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1044 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
bc420c62f6
fixed htcache path generation (never change a running system)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1041 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
dd24f0252f
*) Searchword highlighting for info page
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1036 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
borg-0300
72cde1d894
getCachePath: no logging
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1033 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
borg-0300
1fbd72f9e0
rename "index.html" to "ndx"
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1032 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
borg-0300
cd1107d85e
added support for URLs with '?&'
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1030 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
borg-0300
5fb2b017cb
small change
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1029 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
borg-0300
544e4ea90e
small change
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1027 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
borg-0300
00ab4d8723
cleaned, small change, Properties
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1026 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
b8ceb1ffde
*) Adding better https support for crawler
...
- solving problems with unknown certificates by implementing a dummy trust Manager
- adding https support to robots-parser
- Seed File can now be downloaded from https resources
- adapting plasmaHTCache.java to support https URLs properly
*) URL Normalization
- sub URLs are now normalized properly during indexing
- pointing urlNormalForm function of plasmaParser to htmlFilterContentScraper function
- normalizing URLs which were received by a crawlOrder request
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1024 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
borg-0300
e3179a6394
added getOwnSeedFile()
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1022 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
borg-0300
a803a509ae
bugfix: port handling in HTCache
...
program flow, cleared up
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1021 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
hydrox
cb69047b91
*)cleanup access static methods and fields
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1016 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
hydrox
56b9f34411
*)removed unused imports
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1015 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
5f68b6886b
introduced new url-hashes for better ranking computation
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1013 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
aadace1285
fixed network image in search performance monitor
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1012 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
bb369c98de
fixed search result ordering by date
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1011 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
b058ecf0bc
refactoring of image-generation; added experimental PNG encoder (not active now)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1008 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
d42531e1b2
added auto-reset for NURL-DBs
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1004 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
allo
92c49b406b
adminAuth with userDB and adminAuthenticated (fix for statuspage)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1001 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
rramthun
27f180f24b
Update of YaWoStat to 0.2.
...
Now does not try to make 400000! operations to load a 4MB textfile :-/
Program is not finished yet.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1000 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
d656e2b433
added a memory-profile chart generation to database performance testing
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@993 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
ec3af327f7
*) Bugfix for Proxy-Authentication against remote proxy
...
See: http://www.yacy-forum.de/viewtopic.php?p=11804#11804
*) Adding first version of db test for mysql
NOTES:
- db user + db + db table must be created before starting the test
- db table must be empty. Entries can not be updated at the moment
- db connection properties must be changed in the sourcecode at the moment
TODOs:
- accepting connection properties via command line
- implementing update + remove + read operations
- 'maybe' adding code to create db + table if it doesn't exist
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@991 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
5b0911d7ea
added new performance menu for search sequence configuration and monitoring
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@990 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
allo
ada06b0674
bugfix for Networkimage from Hydrox
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@986 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
1aa4ba8b62
added post-search filtering of redundant urls (longer than existing cited)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@982 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
8d827cdb30
tried to fix problems with order of network list by last-seen (which could also improve the network picture)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@980 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
097009d910
experimental visualization of DHT access during global search (temporary)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@977 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
4dcbc26ef1
introduction of search profiles; very experimental
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@976 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
6c48c3ce39
*) Bugfix for ArithmeticException during IndexTransfer
...
See: http://www.yacy-forum.de/viewtopic.php?t=1362
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@974 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
525c8dcbd4
*) Adding Traffic Statistic for Crawler
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@972 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
9a5ab62928
*) Adding yacy specific X-YACY-Index-Control header which can be used by clients
...
to prevent yacy from indexing the response that belongs to the request where
X-YACY-Index-Control is set to "no-index"
*) Bugfix for Seed-List download via Remote Proxy.
Now the pragma and cache-control http headers of the request are properly set to "no-cache"
See: http://www.yacy-forum.de/viewtopic.php?p=11639#11639
*) Bugfix for http-Proxy
yacy ignored the "no-cache" pragma and cache-control http headers that were sent in requests.
Now, these request headers are evaluated properly
TODO: Missing evaluation of "no-store" request headers
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@971 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli
02d9af1a70
*) Restructuring and extending of Remote Proxy Support
...
- remote proxy configuration can now be "really" changed on the fly and takes effect immediately
- adding possibility to disable remote proxy usage for yacy->yacy communication
- adding possibility to disable remote proxy usage for ssl
- restructuring proxy configuration so that it is stored in a single place now
*) Adding possibility to import a foreign word DB (or even more of them in parallel)
at runtime into the peers DB
- this can be done by calling IndexImport_p.html
- ATTENTION: please note that at the moment this thread must be aborted via gui
before a normal server shutdown is done.
- TODO: integrating IndexImport Thread into normal server shutdown
- TODO: Adding possibility to import crawl-queues, etc. from foreign peers
- TODO: removing old import function from yacy.java and calling the new routines instead
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@968 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago