orbiter
d8e934c085
better abstraction of http client identification
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7675 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
4c013d9088
more UTF8 getBytes() performance hacks
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7649 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
694fa3a2a5
- replaced more direct string-based UTF-8 conversions by predefined UTF-8 conversion
...
- changed menu structure slightly
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7583 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
low012
3b40b98256
*) set SVN properties
...
*) minor changes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7567 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
cb1f49d0f2
replaced all 'new String' with default encoding (missing) or UTF-8 encoding with a String generation method that uses a pre-defined Charset constant for UTF-8. This avoids a cache-lookup for the Charset object using String hashing of the String 'UTF-8'.
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7558 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
ad7fcb9d61
Enhanced Base64Order transformation: less overhead (transformation between StringBuilder and byte[])
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7523 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
dec4f36700
- fix for missing favicons in search widgets
...
- fix for bad digest/hash computation in case of interrupts to class
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7518 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
ac6b503adf
untar files without gzip decompression even if the file has gz extension. this is done when the decompression fails.
...
decompressed gzip files with gz extension may appear if the server sets a gzip compression header
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7282 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
d2fd93135c
- moved yacybot user agent string definition to MultiProtocolURI since there are basic access mechanisms where the bot string is needed
...
- migrated the 'yacy' user agent to 'yacybot' in many client methods since the 'yacy' user agent is only used for the proxy
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7199 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
5e7081cd19
refactoring towards a unified loading mechanism for MultiProtocolURIs
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7065 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
orbiter
90531f78ff
refactoring of the cora package to get subpackages for http and ftp (smb to come)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@7063 6c8d7289-2bf4-0310-a012-ef5d649a1542
14 years ago
sixcooler
b7102eff92
... migrating to HttpComponents-Client-4.x ...
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6989 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
3f93a0cc8f
redesign of remote proxy settings
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6903 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
dd459281c8
applied code changes that are recommended by PMD
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6563 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
4a5100789f
replaced _all_ size() == 0 with isEmpty() and all size() > 0 with !isEmpty(). The isEmpty() method is much faster in some cases, especially when used to access badly balanced hashtables where an size() operation becomes a large iteration.
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6510 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
4431b9767e
added about 450 replacements for printStackTrace() methods to pipe such traces into the log at DATA/LOG/
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6458 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
30f108f97d
added stub of oai-pmh importer (not working yet)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6437 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
3528b970d6
- refactoring
...
- added new experimental (not-yet-working) image parser
- added new test image
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6431 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
b79f4f062f
refactoring of yacy documents and parsers: they depend now only on the kelondro classes
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6426 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
e7f18ba24b
refactoring
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6399 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
ce8dc575ca
refactoring
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6398 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
bea3b99aff
moved table and util classes
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6397 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
4446acc8cd
moved kelondro order
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6392 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
f677d534b1
start of a really extensive refactoring which will produce a hierarchical package structure with the domain yacy.net as package root
...
- moved here the logging classes as part of the new net.yacy.kelondro package
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6391 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
lotus
386b9f35f6
activated resource observer for windows 7
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6378 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
04a548a1e3
- temporary integrated the transferURL servlet as static class instead as a class that is called using reflection to investigate the OOM problems in that class
...
- fixes for numerous other problems
- removed dead code
- resdesign of the strings-method, which produces now less memory overhead and may help to prevent OOMs
- another fix for the deadlock problem in SplitTable
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6373 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
68465c37af
added a convenience class to add files into a YaCy index
...
to make this possible, the yacyURL must be able to process file:// urls, which has also been implemented
testing of the new class resulted in some bugfixes in other classes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6313 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
2740d9dd79
added integration of osm maps for search
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6291 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
1762a7bcd6
- moved DidYouMean to the data package
...
- added a DidYouMeanLibrary class that shall support the did you mean function with additional word lists
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6281 6c8d7289-2bf4-0310-a012-ef5d649a1542
15 years ago
orbiter
1d8d51075c
refactoring:
...
- removed the plasma package. The name of that package came from a very early pre-version of YaCy, even before YaCy was named AnomicHTTPProxy. The Proxy project introduced search for cache contents using class files that had been developed during the plasma project. Information from 2002 about plasma can be found here:
http://web.archive.org/web/20020802110827/http://anomic.de/AnomicPlasma/index.html
We stil have one class that comes mostly unchanged from the plasma project, the Condenser class. But this is now part of the document package and all other classes in the plasma package can be assigned to other packages.
- cleaned up the http package: better structure of that class and clean isolation of server and client classes. The old HTCache becomes part of the client sub-package of http.
- because the plasmaSwitchboard is now part of the search package all servlets had to be touched to declare a different package source.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6232 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
ca72ed7526
-removed superfluous crawl cache
...
-refactoring of crawler classes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6221 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
49bbb9bd45
replaced tar library with integrated apache ant tar lib
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6212 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
f1ori
f814e0fa81
enable warnings and fix most of it
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6196 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
57a88d435b
redesign of parser mime type detection and parser steering
...
There is now a mime-blacklist instead of a mime-whitelist
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6190 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
21b8704fb4
refactoring of the ParserDispatcher and ParserConfig: resulted into Idiom, Parser and Classification classes
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6188 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
dafffd0153
refactoring of parsers and document processing
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6182 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
77d2a3782c
removed strange debugging strings
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6177 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
lotus
7f868ca3c2
resource observer: support for yacyroot\DATA on an NTFS hardlink (Windows)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6162 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
1f1399e5c5
extending visibility of objects and methods to avoid synthetic accessor methods and increase performance
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6156 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
154bbc3364
code cleanup: call of static methods directly to the class
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6155 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
222850414e
simplification of the code: removed unused classes, methods and variables
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6154 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
apfelmaennchen
a10c8022d1
DidYouMean:
...
- limit the number of consumer threads to available CPUs
- added some javadoc
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6144 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
fd31a3616a
- more logging in server process
...
- fix for bas ascii in comment
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6084 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
ce1adf9955
serialized all logging using concurrency:
...
high-performance search query situations as seen in yacy-metager integration showed deadlock situation caused by synchronization effects inside of sun.java code. It appears that the logger is not completely safe against deadlock situations in concurrent calls of the logger. One possible solution would be a outside-synchronization with 'synchronized' statements, but that would further apply blocking on all high-efficient methods that call the logger. It is much better to do a non-blocking hand-over of logging lines and work off log entries with a concurrent log writer. This also disconnects IO operations from logging, which can also cause IO operation when a log is written to a file. This commit not only moves the logger from kelondro to yacy.logging, it also inserts the concurrency methods to realize non-blocking logging.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6078 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
apfelmaennchen
39779e4796
DidYouMean: as I moved to only 8 consumer and 4 producer threads, I removed poison pills as it does not make sense anymore - threads are interrupted directly. Having a consumer thread per test case just didn't make sense either (see svn 6070) due to the massive overhead.
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6072 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
apfelmaennchen
c3c4dd0933
DidYouMean - changed to much simpler LinkedBlockingQueue
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6071 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
apfelmaennchen
01ac1b5d7e
- blocking queue implementation of DidYouMean
...
- timeout ist set to 500ms
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6070 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
b8bb1bb364
join with a timeout does not cause that the corresponding thread is stopped after the time-out. It does only cause that the waiting is stopped. Here we need additionally a signal to the thread to stop after we finished waiting.
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6069 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
b69f22e9ca
mistake in last commit: computation of loops in ReversingTwoConsecutiveLetters
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6068 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
3130334932
- start first with threads that run more loops
...
- join first with threads that run less loops
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6067 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago