orbiter
43c8defd79
enhanced parser with more extension + mime attributes
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6214 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
49bbb9bd45
replaced tar library with integrated apache ant tar lib
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6212 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
b2263bc720
enhanced document type recognition
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6209 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
lotus
aa38eb5a20
* maxfilesize -1 for infinite filesize
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6208 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
lotus
9cfe89c8fc
* process content-length as soon as it is received
...
* corrected indentation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6206 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
50cf80056f
removed jmimemagic library
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6203 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
3f113f38a8
removed unused imports
...
removed unused libs from eclipse class path
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6201 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
lotus
9f083bb6b2
check filetype before loading (no more mp4 loading)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6200 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
f1ori
076ae02c44
* added pl and py to extensions accepted by htmlParser
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6198 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
f1ori
d5e51cfd09
* workaround for non-working build property replacements
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6197 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
f1ori
f814e0fa81
enable warnings and fix most of them
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6196 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
f1ori
8931c8d6b4
improvements to debian package:
...
* autoupdate completely disabled, display hint
* restart-button in interface works!
* moved all build-Variables to yacyBuildProperties
* fixed some warnings
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6195 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
low012
fc1dc38b55
*) added spaces to make sure that no words are concatenated by accident
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6194 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
low012
f242e7d7bc
*) using Apache POI library to parse Word documents now
...
*) removed tm-extractors library (can be found at http://www.textmining.org/ if necessary again)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6193 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
caedd72400
- enhanced logging and exception details for parsers
...
- removed inconsistencies in mime type declaration (one mime type should only appear once in all parsers)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6192 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
4b74ad0a46
fixed setting of parser configuration servlets
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6191 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
57a88d435b
redesign of parser mime type detection and parser steering
...
There is now a mime-blacklist instead of a mime-whitelist
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6190 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
lotus
e15d27bc63
avoiding double/wrong parser errors
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6189 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
21b8704fb4
refactoring of the ParserDispatcher and ParserConfig: resulted in Idiom, Parser and Classification classes
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6188 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
8ca1f5d400
- some work to integrate the html parser the same way as the other parsers are integrated (not finished)
...
- added migration of code of settings pages (hmm.. does not work correctly yet, sorry)
- more refactoring
- removed more unused code
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6187 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
low012
1ee109761f
*) added changes which were lost
...
*) additional annotations
*) additional svn properties
*) _no_ functional changes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6186 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
499723891d
removed all non-http daemons; they had not been used and may be a potential security risk.
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6185 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
0e8647d62f
refactoring of search classes
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6184 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
dafffd0153
refactoring of parsers and document processing
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6182 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
low012
8041e91f56
*) Ooops!
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6181 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
low012
69551ff3d9
*) added several MIME types (derived from http://filext.com/ ), some of them might be rather uncommon
...
*) added an annotation forgotten in last commit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6180 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
low012
11dfb2d54f
minor changes:
...
*) added annotations
*) set svn properties and added keywords to comments of parser classes
*) made a variable final to prevent (theoretical case of) change of object instance in synchronized block
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6179 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
77d2a3782c
removed strange debugging strings
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6177 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
lotus
4320f69574
universal handling for crashed parsers
...
reverting r6090/1
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6176 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
024744245c
small refactoring to prepare for new queues
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6173 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
16efcd0366
fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=2252&hilit=&p=16389#p16389
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6172 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
f1ori
0f3246e90a
* fix debian package
...
* add Class containing buildvariables
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6171 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
f1ori
8544cfd5a6
* remove separate build-files for parsers
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6170 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
24cb6d68bc
- renamed Stack to RecordStack to avoid name confusion with new classes
...
- added new Stack class that implements a stack on BLOB files
- added new Stacks class that can be used for a set of Stacks (a 'Stack Database')
- added methods to other classes to support the new stacks
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6169 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
995da28c73
all stack/heap files that had been stored in DATA/PLASMA are now stored in the network-specific QUEUES path
...
There is no migration. All crawls must be restarted.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6167 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
aac89bf8ca
trying to avoid "exceeding limit" message of server
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6166 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
f1ori
48d78166ed
* fix double copy of libraries
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6164 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
lotus
7f868ca3c2
resource observer: support for yacyroot\DATA on an NTFS hardlink (Windows)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6162 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
409538e17a
code cleanup and code simplification
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6161 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
160031758d
fix for problem with initializer
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6160 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
302a02cec8
moved all libraries from libx to lib
...
removed libx directory
all libraries are now in lib, except the test libraries in libt, which are not part of releases
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6157 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
1f1399e5c5
extending visibility of objects and methods to avoid synthetic accessor methods and increase performance
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6156 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
154bbc3364
code cleanup: call static methods directly on the class
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6155 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
222850414e
simplification of the code: removed unused classes, methods and variables
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6154 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
93dfb51fd4
problems with code style
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6153 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
adf01c676e
reduce lookup time when merging a large number of BLOBs
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6152 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
9a674d8047
- After the removal of the Tree class, some code simplifications are possible. This mostly affects the Records class, which can be refactored, reducing the number of classes.
...
- The EcoTable was renamed to Table.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6151 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
c5122d6836
completed migration of BLOBTree to BLOBHeaps:
...
- removed migration code
- removed BLOBTree
after the removal of the BLOBTree, a lot of dead code appeared:
- removed dead code that was needed for BLOBTree
Some more classes may not have much use any more after the removal of BLOBTree, but still have some components that are needed elsewhere. Additional refactoring steps are needed to clean up dependencies; more unused code may then appear that can be removed as well.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6150 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
orbiter
d1083a6913
maybe we have fewer problems with open connections to the server if we don't do BF forced sleeps (just a test)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6149 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago
low012
ebe6c823ac
*) changed svn properties again (hopefully doing it right this time)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@6147 6c8d7289-2bf4-0310-a012-ef5d649a1542
16 years ago