orbiter
6ffc6e3389
more refactoring of indexer and kelondro classes;
...
- integrating the indexer into kelondro as package 'text'
- renaming of classes in kelondro.index
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5663 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
404bc21da9
simplification of (internal) query process / refactoring
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5662 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
76ef5f0f14
refactoring of index package: better names for the classes (to be continued)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5661 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
2df57b1fd1
refactoring of index collection class
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5660 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
lotus
39a177649b
* added upnp listener for devices that do not respond to discovery but advertise themselves
...
* moved package
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5659 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
d1d9fbae5c
enabling the URLAnalysis to operate on multiple input files, just use a wildcard when calling the class from the command line
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5658 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
c728879ab8
fixes to yacyURL - more exceptions in case that urls are strange
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5657 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
7542336ae5
performance enhancement to yacyURL: omit second processing of resolveBackpath. This method is already applied during initialization of the object and was called a second time when the url was exported.
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5656 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
7ea53fe47b
added another url list transformation option:
...
- check the list and kick out lines that do not contain valid urls
- normalize the urls
- remove doubles
- sort the list
- split the list in smaller chunks
This is all done in one process which can be called with a new -sort option
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5655 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
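The steps listed in the commit above (validate, normalize, remove doubles, sort, split into chunks) can be sketched in a few lines. This is only an illustration, not the code from URLAnalysis: the class name UrlListSketch, the methods normalize and process, and the use of java.net.URI for validation are all assumptions made for this example.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Hypothetical sketch; names and validation rules are not taken from YaCy.
class UrlListSketch {

    // Returns a normalized form of the line, or null if it is not a usable URL.
    static String normalize(String line) {
        try {
            // URI.normalize() resolves "." and ".." path segments.
            URI u = new URI(line.trim()).normalize();
            if (u.getScheme() == null || u.getHost() == null) return null;
            return u.toString();
        } catch (URISyntaxException e) {
            return null; // invalid line, kicked out of the list
        }
    }

    // Validates, normalizes, de-duplicates and sorts, then splits into chunks.
    static List<List<String>> process(List<String> lines, int chunkSize) {
        // A TreeSet drops doubles and keeps the entries sorted.
        TreeSet<String> sorted = new TreeSet<>();
        for (String line : lines) {
            String n = normalize(line);
            if (n != null) sorted.add(n);
        }
        List<List<String>> chunks = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (String url : sorted) {
            current.add(url);
            if (current.size() == chunkSize) {
                chunks.add(current);
                current = new ArrayList<>();
            }
        }
        if (!current.isEmpty()) chunks.add(current);
        return chunks;
    }
}
```

Holding everything in a TreeSet is the simplest way to get uniqueness and order together, but it keeps the whole list in memory; a real tool working on large exports would probably need an external sort.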
orbiter
e521e81148
bugfix in yacyURL (for latest performance hack)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5654 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
54625360f7
performance update
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5653 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
d884c4718a
added gzip support for URLAnalysis:
...
url lists can also be compressed with gzip
If such a file is handed over to URLAnalysis, the output will also be written as .gz-file
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5652 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
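One plausible way to get the behaviour described in the commit above is to choose the stream wrapper by file extension. The sketch below is an assumption about the general technique, not the actual URLAnalysis code; GzipAwareIo, openReader, openWriter and roundTrip are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper; not part of YaCy.
class GzipAwareIo {

    // Opens a line reader, transparently decompressing files ending in ".gz".
    static BufferedReader openReader(File f) throws IOException {
        InputStream in = new FileInputStream(f);
        if (f.getName().endsWith(".gz")) in = new GZIPInputStream(in);
        return new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
    }

    // Opens a writer, compressing the output when the name ends in ".gz".
    static BufferedWriter openWriter(File f) throws IOException {
        OutputStream out = new FileOutputStream(f);
        if (f.getName().endsWith(".gz")) out = new GZIPOutputStream(out);
        return new BufferedWriter(new OutputStreamWriter(out, StandardCharsets.UTF_8));
    }

    // Writes one line to a temporary file with the given suffix and reads it back.
    static String roundTrip(String line, String suffix) {
        try {
            File f = File.createTempFile("urls", suffix);
            f.deleteOnExit();
            try (BufferedWriter w = openWriter(f)) {
                w.write(line);
                w.newLine();
            }
            try (BufferedReader r = openReader(f)) {
                return r.readLine();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With helpers like these, a caller could derive the output name from the input name, so a compressed input leads to a compressed output as the commit describes.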
orbiter
46632f4385
performance update to yacyURL
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5651 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
cf9b74e6e3
added another method to process url lists: extract hosts only
...
This can be used like
java -Xmx2000m -cp classes de.anomic.data.URLAnalysis -host DATA/EXPORT/20090224213823.txt
also changed the call for generating statistics; please now use
java -Xmx2000m -cp classes de.anomic.data.URLAnalysis -stat DATA/EXPORT/20090224213823.txt
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5650 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
89d8e824ed
memory protection for URLAnalysis
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5649 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
0f6fa804ff
performance update to URLAnalysis
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5648 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
8444357291
added new row iterator in kelondro table files that enumerates rows
...
without an order by the primary key. The result is a very fast enumeration of the Eco table data structure. Other table data types are not affected.
The new enumerator is used for the url export function that can be accessed from the online interface (Index Administration -> URL References -> Export). This export should now be much faster if all url database files are of type Eco.
The new enumeration is also used at other functions in YaCy, i.e. the initialization of the crawl balancer and the initialization of YaCy News.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5647 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
e8f5f2f612
added tool to analyse url strings
...
and to generate statistics about words occurring in urls
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5646 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
lotus
6117e083e5
option to customize tray label (tooltip) with tray.label
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5642 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
b8c3803bfc
don't panic when canceling server sessions
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5641 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
de714783b1
- added host, path, filename to search result
...
- modified yacyinteractive, now also shows the date
- added size attribute to export file in xml format
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5639 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
lotus
9519d84372
changed "dooble" variable to "browserintegration" to be less specific
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5636 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
lotus
8429083972
adjusted tray for dooble:
...
you can now set dooble=true in yacy.init to disable the menu and browser popups by default
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5633 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
ef62ec635e
removed overwriting of logging config
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5629 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
c852d2d70e
- reject too old seeds
...
- do not store the complete seed in the reverse name cache, only the hash of the peer
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5628 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
aca973e2d9
catch more exceptions
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5627 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
9559bc23fd
automatic clean-up of dead connections
...
(hope that works well..)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5626 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
hermens
02dfd6183b
Fix logging in serverCore
...
Prevent NPEs from keeping stopped Sessions in the pool and blocking slots
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5625 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
hermens
d30456e2c8
Fix logging in serverCore
...
Prevent NPE:
I 2009/02/20 15:15:56 PLASMA check for Session_77.37.19.225:38812#0: 86515 ms alive, stopping thread
I 2009/02/20 15:15:56 PLASMA Closing main socket of thread 'Session_77.37.19.225:38812#0'
E 2009/02/20 15:15:56 SERVER receive interrupted - exception 2 = Socket closed
Exception in thread "Session_77.37.19.225:38812#0" java.lang.NullPointerException
at de.anomic.server.serverCore$Session.run(serverCore.java:623)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5624 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
4f9dae2571
remove reference in crawl entries
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5623 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
1ba4301920
automated interruption of dead incoming connections, if they are there for more than one minute
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5622 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
c12bb8a6d0
- refactoring of the http client
...
- added a protection against memory leaks for the access tracker
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5621 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
5d3983faae
the soLinger parameter was wrong.
...
With soLinger=true the httpd loses connections
The effect can be seen when crawling the internal repository:
lost connections filled the client process queue until it was full
and no more connections were possible.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5620 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
62505bb3cb
more bugfixes as recommended by findbugs
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5619 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
6b450d09ca
some fixes recommended by findbugs
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5618 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
4db80065ac
select more
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5617 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
94c42691d8
- reject less transmissions as transmission receiver
...
- do not flag too many receivers when something goes wrong during transmission as sender
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5616 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
f887fc159f
try to reduce the large number of unclosed incoming connections
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5615 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
e04a0e05c3
fix for last commit
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5614 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
a9ad863686
second part of 'doubles' fix - better handling of doubles in RAMIndex. More logging.
...
still missing: deletion of double entries in collections
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5613 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
59427064fb
first part of 'doubles' fix (not fully ready yet)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5612 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
26978b2a25
- better memory protection in kelondro caches: computation of needed memory for cache growth
...
- removed excessive gc calls
- step to 16 vertical DHT partitions
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5611 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
lotus
e9e2fff47a
better scaling on performance graph
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5610 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
lotus
4aad461100
added UPnP support
...
YaCy can now automatically forward ports on home routers
off by default
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5609 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
99b9788e54
fix for possible 100% CPU caused by concurrent access of HashMap
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5607 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
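The commit above does not say how the HashMap problem was fixed. As background: java.util.HashMap is not safe for unsynchronized concurrent modification, and a corrupted map can leave threads spinning. A common remedy, shown here purely as an assumption about one possible approach, is to switch to ConcurrentHashMap; the class and method names are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch; not the change made in the commit.
class ConcurrentCounterSketch {

    // ConcurrentHashMap tolerates concurrent writers without external locking.
    static final Map<String, Integer> hits = new ConcurrentHashMap<>();

    // Increments the counter for key from two threads, perThread times each.
    static int countFromTwoThreads(String key, int perThread) {
        hits.remove(key);
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                hits.merge(key, 1, Integer::sum); // atomic read-modify-write
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return hits.getOrDefault(key, 0);
    }
}
```

Wrapping a plain HashMap with Collections.synchronizedMap or guarding it with a synchronized block would be alternative ways to reach the same safety.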
orbiter
be0c492ae5
fix for memory leak bug in new dht transmissions
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5606 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
hermens
2173865f92
Prevent race condition when switching timezones.
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5605 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
40d9849aa4
- better control of chunk size in dht selection
...
- more restrictive values in selection
- step to 4 vertical partitions
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5603 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
30a1de41b3
disabled the BufferedIOChunks, because I consider it broken.
...
I will try to fix that, but it is better to not use a buffer than using a broken buffer.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5600 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
411f2212f2
more memory leak fixing hacks
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@5599 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago