orbiter
985fd807cc
bugfixing in collection methods
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2882 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
c7bea4addb
*) soap api
...
- adding function to get and set message forwarding
- adding new testclass
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2878 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
ee4d4e8567
*) Soap-handler: bugfix. wrong content-length was sent when using content-encoding
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2877 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
d3431433b0
more anonymization in logging
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2876 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
e6044e5198
bugfix for
...
http://www.yacy-forum.de/viewtopic.php?p=27207#27207
and
http://www.yacy-forum.de/viewtopic.php?p=27219#27219
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2875 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
4d19d94348
*) bugfix for nullpointerexception
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2874 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
532c23b5c7
*) soap handler
...
- better error handling
- adding support for outgoing transfer- and content-encoding
- avoid holding outgoing messages in memory before sending them
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2872 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
(no author)
5141fa5942
combinedVersionString2PrettyString(..) renamed to combined2prettyVersion(..), new parameter "computerName" added to identify the source of problems
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2871 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
78b7f6f7fd
bugfix for index remove bug,
...
appeared after search where snippet-loading triggered word removal
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2869 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
(no author)
0e79f2fd7e
name of the file to translate appears ahead of its translation
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2868 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
ebd2d629d8
added missing file for last commit
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2866 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
147d88cf23
re-design of database caching
...
this should reduce IO a lot, because write caches are now activated for all databases
- added new caching class that combines a read- and write-cache.
- removed old read and write cache classes
- removed superfluous RAM index (can be replaced by kelonodroRowSet)
- adapted all current classes that used the old caching methods
- more asserts, more bugfixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2865 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
4e363108e1
- removed bad debug code that caused a large and unnecessary delay during global search
...
- fixed problem that global search results disappear after a search
- removed some stopwords
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2861 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
f21ede312e
bugfixes for internals of database organization
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2860 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
eb4bfb0e9d
fixed problem with cache.profile()
...
see also: http://www.yacy-forum.de/viewtopic.php?p=27109#27109
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2859 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
2a9d868f6d
- removed object cache from kelondroTree
...
- generalized object caching and added new object caching class
- added object caching wherever kelondroTree was used
- added object caching also to usage of kelondroFlex
- added object buffering (a write cache) to NURLs
- added many assert statements; fixed bugs here and there
- added missing close methods to latest added classes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2858 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
7299dc30e3
*) new soap service to manage the yacy file-share
...
- upload / download files (as soap attachment)
- create directory
- receive directory listing
- delete files / directories
- change file comment
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2857 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
777e39cea0
*) new template to display the dir-listing in xml format.
...
This can e.g. be done by using the url http://localhost:8080/share/?format=xml
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2856 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
9e8942a064
*) adding method to implement blacklist from file
...
- file transfer is done via soap attachments (see BlacklistServiceTest for details)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2855 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
4d1f933ea1
*) avoid reading of content body into memory
...
*) Bugfix for soap attachment support
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2854 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
88cfdecd38
*) Bugfix: calling close must not close the wrapped input stream, otherwise
...
keep-alive connections would terminate
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2853 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
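The keep-alive fix in 88cfdecd38 can be illustrated with a minimal sketch. The class name `NonClosingInputStream` and the self-check method are hypothetical and not taken from the YaCy source; the sketch only assumes that a request body is read through a wrapper around the connection's input stream.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Hypothetical wrapper: closing it leaves the wrapped stream open. */
final class NonClosingInputStream extends FilterInputStream {
    private boolean closed = false;

    NonClosingInputStream(InputStream in) {
        super(in);
    }

    @Override
    public int read() throws IOException {
        if (closed) throw new IOException("wrapper already closed");
        return super.read();
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        if (closed) throw new IOException("wrapper already closed");
        return super.read(b, off, len);
    }

    @Override
    public void close() {
        // Only mark the wrapper as closed. The wrapped stream (for example
        // a socket stream of a keep-alive connection) stays usable.
        closed = true;
    }

    /** Self-check: true if the wrapped stream is still open and readable after close(). */
    static boolean wrappedStreamSurvivesClose() {
        final boolean[] wrappedClosed = {false};
        ByteArrayInputStream raw = new ByteArrayInputStream(new byte[] {1, 2}) {
            @Override
            public void close() throws IOException {
                wrappedClosed[0] = true;
                super.close();
            }
        };
        try {
            InputStream body = new NonClosingInputStream(raw);
            body.read();   // consume the first byte through the wrapper
            body.close();  // must not reach raw.close()
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return !wrappedClosed[0] && raw.read() == 2;
    }
}
```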
theli
d38ef0493d
*) be more tolerant against missing ports in url
...
"http://yacy.net:/ " is now interpreted as "http://yacy.net/ "
See: http://www.yacy-forum.de/viewtopic.php?p=27102
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2852 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
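A rough sketch of the tolerance described in d38ef0493d, where "http://yacy.net:/" is read as "http://yacy.net/". The class and method names are hypothetical, and the approach (plain string handling of the authority part) is an assumption, not the code from the commit.

```java
/** Hypothetical helper, not part of the YaCy code base. */
final class UrlPortTolerance {

    /** Drops a dangling ':' at the end of the authority part, i.e. an empty port. */
    static String dropEmptyPort(String url) {
        int schemeEnd = url.indexOf("://");
        if (schemeEnd < 0) return url;               // no scheme, leave untouched
        int authorityStart = schemeEnd + 3;
        int authorityEnd = url.indexOf('/', authorityStart);
        if (authorityEnd < 0) authorityEnd = url.length();
        // The authority ends with ':' when a port separator is present
        // but no port number follows it.
        if (authorityEnd > authorityStart && url.charAt(authorityEnd - 1) == ':') {
            return url.substring(0, authorityEnd - 1) + url.substring(authorityEnd);
        }
        return url;
    }
}
```

URLs that carry a real port, such as "http://localhost:8080/share/", pass through unchanged.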
theli
cfe54fedc7
*) Bugfix for resolveBackpath problem with trailing /..
...
*) Junit testclass for resolveBackpath testing
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2850 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
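To make the trailing "/.." case from cfe54fedc7 concrete, here is a minimal sketch of back-path resolution. Only the name resolveBackpath comes from the commit message; the class name and the implementation are assumptions, and the sketch handles absolute paths only and ignores "." segments.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical illustration, not the YaCy implementation. */
final class BackpathSketch {

    static String resolveBackpath(String path) {
        if (!path.startsWith("/")) return path;   // sketch: absolute paths only
        Deque<String> kept = new ArrayDeque<>();
        // limit -1 keeps a trailing empty segment, so "/a/" and "/a" differ
        String[] segments = path.substring(1).split("/", -1);
        boolean endsWithSlash = false;
        for (int i = 0; i < segments.length; i++) {
            String segment = segments[i];
            boolean last = (i == segments.length - 1);
            if (segment.equals("..")) {
                // step up one level, but never above the root
                if (!kept.isEmpty()) kept.removeLast();
                // a trailing ".." names a directory, so keep the slash
                if (last) endsWithSlash = true;
            } else if (last && segment.isEmpty()) {
                endsWithSlash = true;
            } else {
                kept.addLast(segment);
            }
        }
        StringBuilder result = new StringBuilder();
        for (String segment : kept) result.append('/').append(segment);
        if (endsWithSlash || result.length() == 0) result.append('/');
        return result.toString();
    }
}
```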
orbiter
dc056fabf3
small bugfix
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2847 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
278d8c3c7e
- more asserts
...
- bugfix for reading of previously deleted nodes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2845 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
allo
5a6488256d
catch the "username too short" exception
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2844 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
2d3f1a53fd
handling of Missing byte-order mark exception
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2842 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
ac13fa763a
*) bugfix for blacklist remove (blacklist was not informed about remove)
...
*) adding new soap service class for blacklist management
*) new junit class to test soap blacklist service
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2841 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
karlchenofhell
c5a5a9eb1c
- patch for NullPointerException by Fuchs: see http://www.yacy-forum.de/viewtopic.php?p=27033#27033
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2840 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
allo
8a5c2d0a19
fix for supertemplates, too.
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2839 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
allo
c35793fb46
fix for last commit
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2838 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
3e0516446b
*) new soap function to get the current queue status
...
*) new junit testclass to test soap statusService
*) refactoring of admin service (usage of constants instead of strings)
*) libraries upgraded to newer version + adding missing dependency
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2836 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
allo
a831c83025
create servletProperties, with the servlet specific functions from serverObjects
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2835 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
1825540020
another fix for url-db migration
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2834 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
83a0efc65a
better assert statements and fixes
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2833 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
karlchenofhell
d13b381f83
- added mint-green skin
...
- removed test-urls because of problems with text-encoding
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2832 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
2025e885d6
a fix for problems with remove situations in kelondroFlexSplitTable
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2831 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
b12da510f3
*) adding optional libraries needed for soap attachments
...
(jikes won't compile without them)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2827 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
11843bba7f
fix for Malformed URL Exception in url migration
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2825 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
9eecc9a888
*) libs added to classpath
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2824 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
a1acc9c389
*) new function to configure distributed crawling
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2823 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
0996e550e7
*) deploy soap peer admin service
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2822 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
3ffc5b8793
fixed problem with serverCharBuffer.append(char)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2821 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
8b56887676
removed unused code
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2820 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
06854988da
- full integration of new LURL database in INDEX
...
- added migration method for urlHash.db into INDEX
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2819 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
(no author)
02c66c04f2
*) Missing file from last commit
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2818 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
octoate
e4a3574b77
StringBuffer now resets every time the parser is called
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2817 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
ef912811f1
*) adding new soap service for peer administration
...
- configure dht transfer properties
- configure remote proxy
- configure peer name / peer port
- configure admin username + pwd
- get peer version information
- set/get peer configuration settings
- shutdown peer
*) new function to get the opensearch description via soap call
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2816 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
karlchenofhell
ce237aefad
- assortment-sizes table from PerformanceQueues_p.html is not shown if not used
...
- escape query- and fragment-part of an url as well
- new resolveBackpath for urls: http://www.yacy-forum.de/viewtopic.php?t=2679#24867
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2815 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
68204ff729
*) Suppressing for bad client requests.
...
See: http://www.yacy-forum.de/viewtopic.php?p=26918
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2814 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
c1dff41f99
*) adding possibility to deploy custom SOAP services
...
See: http://www.yacy-forum.de/viewtopic.php?p=26748#26748
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2813 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
df49724f28
*) better error handling for seed upload - test download - problems
...
See: http://www.yacy-forum.de/viewtopic.php?p=26814#26814
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2812 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
a5b9b514c1
*) retry crawling without content-encoding if the content-encoding header was not correct
...
See: http://www.yacy-forum.de/viewtopic.php?p=26917#26917
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2811 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
52466067d8
*) Bugfix for ArrayIndexOutOfBoundsExceptions which occur because SimpleDateFormat is not thread-safe
...
See: http://www.yacy-forum.de/viewtopic.php?t=2995
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2810 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
b357a13e9a
*) adding synchronization block because SimpleDateFormat is not thread-safe
...
See: http://www.yacy-forum.de/viewtopic.php?p=26906#26906
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2809 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
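The synchronization mentioned in b357a13e9a follows a common pattern: a shared SimpleDateFormat keeps mutable internal state, so concurrent format() calls have to be serialized. The sketch below is an illustration only; the class name, the date pattern and the GMT time zone are assumptions, not taken from the commit.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

/** Hypothetical helper showing a synchronized block around a shared formatter. */
final class SafeDateFormatter {

    // One shared instance. SimpleDateFormat is documented as not thread-safe.
    private static final SimpleDateFormat FORMAT =
            new SimpleDateFormat("yyyyMMddHHmmss", Locale.US);

    static {
        FORMAT.setTimeZone(TimeZone.getTimeZone("GMT"));
    }

    static String format(Date date) {
        // Serialize access so two threads never mutate the formatter's
        // internal calendar at the same time.
        synchronized (FORMAT) {
            return FORMAT.format(date);
        }
    }
}
```

An alternative with the same effect is one formatter per thread (for example via ThreadLocal), which trades a little memory for lock-free formatting.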
theli
92f774edd1
*) Better charset encoding detection
...
*) New testclass for charset encoding detection tests
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2808 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
b79e06615d
- added new LURL.Entry class for next database migration
...
- refactoring of affected classes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2802 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
octoate
cc24dde5e0
First version of a MS Excel parser based on Apache POI
...
(event based parsing)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2801 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
karlchenofhell
4c63129136
- stupid mistake...
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2798 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
karlchenofhell
b14a500b88
- removed debug output from PerformanceMemory_p
...
- added URL escaping (tested, nevertheless watch out for possibly broken URLs)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2797 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
karlchenofhell
ebf0da2a45
- now the fix http://www.yacy-forum.de/viewtopic.php?t=2974 works
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2796 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
09337c9751
*) Bugfix wrong chars in soap search result document
...
See: http://www.yacy-forum.de/viewtopic.php?t=2906
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2795 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
3d152bfe43
*) Logging message added
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2794 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
karlchenofhell
b5e40e2fa2
- fix for http://www.yacy-forum.de/viewtopic.php?t=2974 (no cache-sizes for new db)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2792 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
96f45e9b15
*) Bugfix wrong chars in soap search result document
...
See: http://www.yacy-forum.de/viewtopic.php?t=2906
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2791 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
da2ac6fa23
*) adding new ant target to allow generation of client stub classes for yacy soap api
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2789 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
a9cc6df21b
*) adding wsdl files to generate client stub classes with ant
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2788 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
77a59a115d
refactoring of indexing methods
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2787 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
14490f0a83
added missing flush statement
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2786 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
688cbfb776
- bugfixing for flextable bug
...
- bugfixing for collection index bug
- several other bugfixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2785 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
allo
a29b4d4fb5
extended Supertemplates for Headerincludes.
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2780 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
a7e11ada50
*) suppressing stacktrace for "server has closed connection"
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2779 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
allo
72cc082ebe
created password generator for scripts.
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2777 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
5b114249ce
*) Bugfix for ViewLog problem with multiline logging messages
...
See: http://www.yacy-forum.de/viewtopic.php?t=2972
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2774 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
de5e233766
*) Bugfix for GuiHandler sorting problem
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2773 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
fd94aa4bef
*) Bugfix for IndexOutOfBound in GuiHandler
...
*) Bugfix for reversed order displaying of messages
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2772 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
29a1318ef9
bugfixes for wrong database accesses that did not consider deleted entries
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2767 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
cbb1e710b9
*) removing old class
...
- was replaced by plasma/urlPattern/defaultURLPattern
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2765 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
c6d46f7ebd
null pointer bugfix
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2761 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
decb09df6d
*) Trying to be more tolerant against wrong charset names
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2760 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
e9afe39cbb
*) Trying to be more tolerant against wrong charset names
...
See: http://www.yacy-forum.de/viewtopic.php?p=26662
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2759 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
7526c831a8
*) Suppressing stacktrace
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2758 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
50f2578c55
- some bugfixing and code cleanup
...
- now assortments can be left out completely if they do not exist
before startup and collection index is selected.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2757 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
bdf4c7c51e
added missing files for last commit
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2756 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
a5dd0d41af
- refactoring of plasmaCrawlLURL.Entry to prepare new Entry format
...
- added test migration method to migrate the old LURL to a new LURL
the new LURL will be split into different tables for each month
this solves several problems:
- the biggest table in YaCy is split into different parts and can
also be managed in filesystems that are limited to 2GB
- the oldest entries can easily be identified, used for re-crawl, and
deleted
- The complete database can be limited to a specific size (as wanted many times)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2755 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
130cc76927
loop detection and termination in deletedHandles method
...
see also: http://www.yacy-forum.de/viewtopic.php?p=26655#26655
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2754 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
octoate
1c4076da8a
First version of the MS Powerpoint parser based on Apache POI
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2753 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
5b75d64d7d
*) bugfix for last commit
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2750 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
71ed104bc7
*) adding additional rpm mimetype (used by packman)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2749 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
borg-0300
76d959122b
new constants, finals, Stringbuffer, cleanup
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2748 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
rramthun
581dd2ec72
*) Proper arrow-function on Network.html, but ordering is still broken. Perhaps someone could fix that?
...
*) Removed double creation of DATA directory. New warning message in case of insufficient rights.
*) Removed roland-ramthun.de-seedlist temporarily, because of server changes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2747 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
6396f5971e
bugfixes and migration attempt toward new kelondroFlex db
...
- more synchronization
- bugfix for remove in collections
- bugfix in kelondroFlex (wrong exception condition!)
- options to use RAM, FLEX and TREE tables for Crawl URL stacker
- default for Crawl URL stacker is now FLEX (!)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2746 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
hermens
48f81acc0e
reverse SVN 2744, it is not needed
...
(this resulted from a small misunderstanding of the newest cache layout)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2745 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
hermens
1da9aece12
Repair DNS prefetch during cacheScan
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2744 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
918b59dc5e
- bugfix for snippet profile (no delete button)
...
- bugfix for search process (avoided null pointer exception in case other peer does not respond)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2742 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
2bb529cedb
added peer tags for peers in robinson mode
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2741 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
afbb547f3d
extended options for abstracts generation in remote search interface
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2739 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
22649408ad
*) Better error handling for charset encoding problem during content parsing
...
See: http://www.yacy-forum.de/viewtopic.php?t=2952
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2737 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
a9c7e3f061
*) Bugfix for NoSuchElementException
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2735 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
f25f61d9d3
documentation of compile problem. See
...
http://www.yacy-forum.de/viewtopic.php?p=26407#26407
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2734 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
c8f3a7d363
added snippet-url re-indexing
...
- snippets will generate an entry in responseHeader.db
- there is now another default profile for snippet loading
- pages from snippet-loading will be indexed, indexing depth = 0
- better organization of default profiles
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2733 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
low012
2cfd4633ac
*) even better handling of searchwords in snippets, words can consist of letters and numbers now
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2732 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
b062847797
fix for
...
http://www.yacy-forum.de/viewtopic.php?p=26439#26439
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2731 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
e17fea7015
files in htcache are now stored in different hash/tree subdirectories
...
according to storage method
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2730 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
661f005214
fix for seed upload build script
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2729 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
low012
2d3b7251a4
*) better handling of searchwords in snippets (see http://www.yacy-forum.de/viewtopic.php?t=2891 for details)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2728 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
ddf8f220f6
fix for build fail
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2727 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
25ae3d3161
generalized definition of hexhash
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2725 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
86047f439d
removed very bad bug that prevented production of any remote search result
...
:-(((
Please update!
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2724 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
f0d747c723
removed deprecated method
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2723 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
5ff77612ac
bugfix for old WORDS storage method
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2722 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
0f10bdde22
more generic cache methods
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2721 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
72482b1426
fixed scraper
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2720 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
hermens
6557112d8f
small fix for plasmaURLPool.getURL() needed for new alternative htcache layout
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2719 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
hermens
440c6ee657
Implement alternative htcache layout
...
mostly according to: http://www.yacy-forum.de/viewtopic.php?p=26205#26205
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2718 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
allo
226f2c5b2c
first version of the Servlet Debugger
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2717 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
adf1f74ab2
bugfix for java 1.5 compile problem with serverCharBuffer.append(char)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2716 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
fd61209797
lines inside tags without punctuation are extended by a single dot.
...
This enables the condenser to distinguish the lines in a better way.
The result is a better preparation of snippets.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2715 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
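The idea in fd61209797 (append a dot to lines that lack closing punctuation, so that sentence boundaries become visible) might look roughly like this. The class name, the method name and the exact set of characters treated as punctuation are assumptions made for the illustration.

```java
/** Hypothetical illustration, not the YaCy scraper code. */
final class LineTerminator {

    // Assumed set of characters that already end a line properly.
    private static final String PUNCTUATION = ".!?:;,";

    /** Returns the trimmed line, extended by a single dot if it has no closing punctuation. */
    static String terminate(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) return trimmed;   // nothing to terminate
        char last = trimmed.charAt(trimmed.length() - 1);
        if (PUNCTUATION.indexOf(last) >= 0) return trimmed;
        return trimmed + ".";
    }
}
```

With such a rule, text taken from headings or list items ends like a sentence, which gives a sentence splitter a clear boundary.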
allo
1d0c0edda3
first version of posts/get from the del.icio.us api
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2713 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
1969522dc1
removed lowercase of snippets (and other things):
...
- added new sentence parser to condenser
- sentence parsing can now handle charsets
to do: charsets must be handed over to new sentence parser
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2712 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
43614f1b36
bugfix in collection index. the index for collections was not created correctly
...
The bugfix includes a migration function which starts automatically
after startup of yacy.
This applies to you only if you are using the new collection index.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2711 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
1dfab1abe3
more control for seed receive
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2709 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
1c0e65f55f
*) Bugfix for problems with charset detection
...
See: http://www.yacy-forum.de/viewtopic.php?p=26196
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2708 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
db294687ea
enhanced logging
...
- more logging output
- fix in log line preparation
- added filter to log page
- some small bugfixes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2707 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
a9a0f51303
*) suppressing InterruptedException errormessage
...
See: http://www.yacy-forum.de/viewtopic.php?t=2915
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2705 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
ce7ee74316
*) better error handling in filehandler (try catch block now starts before argument parsing)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2704 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
1d4fb680ce
*) CrawlWorker.java: only keep content in memory if size is less than or equal to 5MB
...
TODO: make this limit configurable
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2703 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
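The 5MB rule from 1d4fb680ce can be sketched as a simple size check. Everything here is hypothetical: the class and method names, the reading of "5MB" as 5 * 1024 * 1024 bytes, and the treatment of an unknown length (negative value) as "do not keep in memory". The limit is a parameter because the commit itself notes that it should become configurable.

```java
/** Hypothetical policy object, not taken from CrawlWorker.java. */
final class ContentBufferPolicy {

    // Assumption: "5MB" is read as 5 * 1024 * 1024 bytes.
    static final long DEFAULT_LIMIT_BYTES = 5L * 1024 * 1024;

    /** True if a resource of the given length may be held in memory. */
    static boolean keepInMemory(long contentLength, long limitBytes) {
        // A negative length means "unknown"; the sketch plays safe and
        // does not buffer such content in memory.
        return contentLength >= 0 && contentLength <= limitBytes;
    }

    static boolean keepInMemory(long contentLength) {
        return keepInMemory(contentLength, DEFAULT_LIMIT_BYTES);
    }
}
```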
theli
1586d57187
*) odtParser: better handling of large files
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2702 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
f17ce28b6d
*) plasmaHTCache:
...
- method loadResourceContent defined as deprecated.
Please do not use this function to avoid OutOfMemory Exceptions
when loading large files
- new function getResourceContentStream to get an inputstream of a cache file
- new function getResourceContentLength to get the size of a cached file
*) httpc.java:
- Bugfix: resource content was loaded into memory even if this was not requested
*) Crawler:
- new option to hold loaded resource content in memory
- adding option to use the worker class without the worker pool
(needed by the snippet fetcher)
*) plasmaSnippetCache
- snippet loader does not use a crawl-worker from pool but uses
a newly created instance to avoid blocking by normal crawling
activity.
- now operates on streams instead of byte arrays to avoid OutOfMemory
Exceptions when operating on large files
- snippet loader now forces the crawl-worker to keep the loaded
resource in memory to avoid IO
*) plasmaCondenser: adding new function getWords that can directly operate on input streams
*) Parsers
- keep resource in memory whenever possible (to avoid IO)
- when parsing from stream the content length must be passed to the parser function now.
this length value is needed by the parsers to decide if the parsed resource content is too large
to hold it in memory and must be stored to file
- AbstractParser.java: new function to pass the contentLength of a resource to the parsers
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2701 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
630a955674
read snippets from cache in case they are not provided in RAM
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2700 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
bcf2b800b4
applied UTF-8 encoding parameter to yacy-internal protocol communication
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2694 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
c40fca08a2
fixed bad handling of string separation
...
you can now use a new encoding attribute to create strings from byte arrays
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2693 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
5a40ea7866
refactoring of wget string list generation
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2692 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
dbc2e039bb
added time-out option parameter to call hierarchy
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2691 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
d4c239e4be
- fixed problem in collection index with deletion of single url references
...
- added automatic deletion of not-found snippets after search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2689 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
00746ca232
identified and fixed search performance problem caused by snippet loading.
...
Some accesses to the header-db were made twice or even
more often in some cases. Snippet resource loading fixed.
Furthermore, snippet loading during remote search within the
remote peer has been disabled, but can be switched on remotely with the
new flag 'includesnippet=true'
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2688 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
b033a80750
better control of failure in node seek of kelondroTree
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2686 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
310f1c41cd
added option to see ranking scores in surftipps
...
and some cleanups
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2684 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
a2e3095044
*) Bugfix. Add missing plasmaParserDocument.close() calls
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2680 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
cd5f349666
*) Better handling of large files during parsing
...
Extracted text of files that are larger than 5MB is stored in a temp file instead of keeping it in memory
*) plasmaParserDocument.java: getText now returns an InputStream instead of a byte array
*) plasmaParserDocument.java: new function getTextBytes returns the parsed content as byte array
Attention: the caller of this function has to ensure that enough memory is available to do this
to avoid OutOfMemory Exceptions
*) httpd.java: better error handling if the SOAP handler is not installed
*) pdfParser.java:
- better handling of documents with exotic charsets
- better handling of large documents
- better error logging of encrypted documents
*) rtfParser.java: Bugfix for UTF-8 support
*) tarParser.java: better handling of large documents
*) zipParser.java: better handling of large documents
*) plasmaCrawlEURL.java: new errorcode for encrypted documents
*) plasmaParserDocument.java: the extracted text can now be passed
to this object as byte array or temp file
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2679 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
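Note on the commit above: the size-based choice between memory and a temp file can be sketched roughly as follows. This is only an illustration under stated assumptions, not the YaCy code: the class `TextStore`, its `store` method and the `limit` parameter are hypothetical names, and the 5 MB threshold from the commit message would be passed in by the caller.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper, not part of YaCy: keeps small content in memory
// and writes content above a size limit to a temp file.
public class TextStore {
    public final byte[] bytes; // set when the content was kept in memory
    public final File file;    // set when the content was written to a temp file

    private TextStore(byte[] bytes, File file) {
        this.bytes = bytes;
        this.file = file;
    }

    // contentLength is the size announced by the caller (negative = unknown);
    // limit is the assumed in-memory threshold, e.g. 5 * 1024 * 1024.
    public static TextStore store(InputStream in, long contentLength, long limit) throws IOException {
        if (contentLength >= 0 && contentLength <= limit) {
            ByteArrayOutputStream mem = new ByteArrayOutputStream();
            copy(in, mem);
            return new TextStore(mem.toByteArray(), null);
        }
        // Unknown or too large: stream to disk instead of buffering in memory.
        File tmp = File.createTempFile("parsed", ".tmp");
        tmp.deleteOnExit();
        try (OutputStream out = new FileOutputStream(tmp)) {
            copy(in, out);
        }
        return new TextStore(null, tmp);
    }

    private static void copy(InputStream in, OutputStream out) throws IOException {
        byte[] buf = new byte[4096];
        int n;
        while ((n = in.read(buf)) != -1) {
            out.write(buf, 0, n);
        }
    }
}
```

A caller would check which of `bytes` or `file` is set and open the matching stream; treating an unknown length as "large" is a design choice of this sketch.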
theli
8b2ceddb91
*) Displaying severe and warning logging messages in different colors on ViewLog_p.html
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2678 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
low012
f8ac694e51
*) fixed a bug where search words in snippets were not displayed in bold in front of a punctuation mark (see http://www.yacy-forum.de/viewtopic.php?p=25998 )
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2677 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
df1629b05a
- code cleanup
...
- version 0.471
- moved surftipps to own web page
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2676 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
c665f6cddb
*) handling of quotes in charset string
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2674 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
b73efd5565
*) missing changes needed because of last commit
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2673 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
140ddba93f
*) adding soap functions to pause and resume the crawler
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2668 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
2463e5624a
'quick' release 0.47
...
- documentation update
- necessary bugfixes (missing css for new peers)
- reduced effect of search result redundancy filter
- removed some debug output, but not all
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2665 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
49fbb688df
*) SOAP: old urlInfo renamed to urlInfoByHash, new urlInfo Function added.
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2662 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
8f143d516b
*) make snippet fetcher accessible via soap api
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2661 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
97615af406
*) Restructuring of YaCy SOAP services
...
- general functions moved to abstract service class
- service class split into SearchService, CrawlService, StatusService
*) Bugfix for SOAP search services
- Attention: some XML tags were renamed
See: http://www.yacy-forum.de/viewtopic.php?p=25877
*) New SOAP service function urlInfo to view the parsed content of an URL
See: http://www.yacy-forum.de/viewtopic.php?p=25869
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2660 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
241b881560
*) Redesign of YaCy SOAP handler
...
- should be more fail-safe now
- better handling of compressed request bodies
- better handling of persistent connections
- better handling of AxisFaults
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2659 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
009a33170b
*) Content-Location header added
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2658 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
1aa07a52cd
*) Bugfix for UnsupportedEncodingException if the media type contains multiple parameters
...
See: http://www.yacy-forum.de/viewtopic.php?p=25832#25826
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2654 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
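Note on the commit above: one way to read the charset from a media type that carries several parameters is sketched below. This is an assumption-laden illustration, not the actual fix: `MediaTypeCharset` and `charsetOf` are hypothetical names, quoted values containing a semicolon are not handled, and an unknown charset name falls back to a caller-supplied default instead of raising an exception.

```java
import java.nio.charset.Charset;

// Hypothetical helper, not the YaCy implementation.
public class MediaTypeCharset {
    // Returns the charset named in a Content-Type value, or the fallback
    // if no charset parameter is present or its name is not usable.
    public static Charset charsetOf(String contentType, Charset fallback) {
        if (contentType == null) return fallback;
        String[] parts = contentType.split(";");
        // parts[0] is the media type itself; the rest are parameters.
        for (int i = 1; i < parts.length; i++) {
            String p = parts[i].trim();
            int eq = p.indexOf('=');
            if (eq < 0) continue;
            String name = p.substring(0, eq).trim();
            if (!name.equalsIgnoreCase("charset")) continue;
            String value = p.substring(eq + 1).trim();
            // Strip optional surrounding double quotes.
            if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                value = value.substring(1, value.length() - 1);
            }
            try {
                return Charset.forName(value);
            } catch (IllegalArgumentException e) {
                // Covers illegal and unsupported charset names.
                return fallback;
            }
        }
        return fallback;
    }
}
```

For example, `charsetOf("text/html; charset=UTF-8; format=flowed", fallback)` picks UTF-8 regardless of where the charset parameter appears in the list.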
theli
625c2ce6b1
*) bugfix for snippet fetching problem if content but not http header is available in cache
...
See: http://www.yacy-forum.de/viewtopic.php?p=25748
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2651 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
813a8a8179
*) migration of mimeTypeParser to jmimemagic 0.1
...
- better mimetype detection for rss feeds
- better mimetype detection for odt documents (less memory consuming)
- two new detector classes implementing MagicDetector interface of jmimemagic
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2650 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
hermens
3f5a4153a0
Make Peers more receptive to transferred indexes
...
- Set MaxWordCount for dhtInCache to indexDistribution.dhtReceiptLimit
so that the inCache gets flushed when the limit is passed
- Modify flushCacheSome to flush enough words to get below MaxWordCount immediately
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2649 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
57415b6889
*) Bugfix for surftipps UTF-8 problem
...
See: http://www.yacy-forum.de/viewtopic.php?t=2864
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2647 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
allo
b0a4fcce8c
fix from theli
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2642 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
b6c7b91582
*) Parser now throws a ParserException instead of returning null on parsing errors (e.g. needed by snippet fetcher)
...
*) better logging of parser failures
*) simplified usage of plasmaparser through switchboard
*) restructuring of crawler
- crawler now returns an error message if it is used in sync mode (e.g. by snippet fetcher)
*) snippet-fetcher: more verbose error messages
*) serverByteBuffer.java: adding new function append(String,encoding)
*) serverFileUtils.java: adding functions to copy only a given number of bytes between streams
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2641 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
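Note on the commit above: copying only a given number of bytes between streams can be sketched as below. The class name `BoundedCopy` and the method signature are hypothetical and do not reflect the real serverFileUtils API; the sketch only shows the general technique of capping each read at the remaining byte count.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

// Hypothetical helper, not the serverFileUtils code.
public class BoundedCopy {
    // Copies at most count bytes from in to out and returns the number
    // of bytes actually copied (less than count if EOF is reached first).
    public static long copy(InputStream in, OutputStream out, long count) throws IOException {
        byte[] buf = new byte[4096];
        long total = 0;
        while (total < count) {
            // Never request more than the remaining byte budget.
            int want = (int) Math.min(buf.length, count - total);
            int n = in.read(buf, 0, want);
            if (n == -1) break; // source exhausted early
            out.write(buf, 0, n);
            total += n;
        }
        return total;
    }
}
```

Because no byte beyond the limit is read, the source stream stays positioned right after the copied range, which matters when the rest of the stream belongs to the next message.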
theli
64b2ef5aae
*) Trying to bugfix shutdown problem
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2639 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
e03427871e
enhanced surftipps:
...
- added switch to show or hide surftipps
- more news contribute to surftipps
- added voting system for surftipps
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2638 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
1dc12d6659
*) Bugfix for shutdown problem caused by cacheScan thread
...
See: http://www.yacy-forum.de/viewtopic.php?p=25729
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2636 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
borg-0300
42173462f5
rename cutUrlText to shortenURLString;
...
other little things;
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2635 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
borg-0300
af1d89e381
check url == null added;
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2634 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
cc667b0aa5
*) htmlFilterContentScraper.java: adding support for link tag
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2633 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
26dfbb7499
*) Bugfix for UTF-8: url names are now stored properly in stackcrawl, crawler, indexing queue and should be displayed correctly in the GUI
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2630 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
cf6acff2c2
*) Bugfix. htmlFilterInputStream document analysis did not work properly for documents smaller than the default InputStream buffer size.
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2629 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
borg-0300
f18304ddd3
unused/not needed imports removed;
...
properties added;
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2628 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
ec031eb993
first version of surftipps
...
see http://localhost:8080/index.html
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2627 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
borg-0300
b174fbd0ca
"import ...*" removed;
...
properties added;
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2626 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
807756150e
patch for strange bug reported by email
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2625 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
5c6251bced
*) some improvements for extended html document charset support
...
- new class htmlFilterInputStream.java which allows pre-analyzing the html header to extract
the charset meta data. This is only enabled for the crawler at the moment. Integration into
proxy needs more testing.
- adding event listener interfaces to the htmlscraper to allow other classes to be informed
about detected tags (used by the htmlFilterInputStream.java)
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2624 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
33f0f703c0
*) reinserting type cast again
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2623 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
8c11a543dc
fixed line ending coding
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2622 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
b690597275
*) adding casts to avoid compatibility problems between java 1.4 and java 1.5 writer class usage
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2621 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
5afb0cbce8
*) setting default charset (for unknown documents) to iso-8859-1
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2620 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
f453c14b5d
removed unreachable catch blocks and unused imports
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2619 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
ad7f600f25
*) Bugfix. re-enabling inheritance of serverCharBuffer from writer class
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2618 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
97d2a08ef1
*) restructuring needed to support parsing of documents using various charsets
...
- serverFileUtils.java:
-- adding methods to copy from stream to writer and readers to writers
-- moving httpc writeX methods into serverFileUtils class
- serverCharBuffer.java: removing inheritance from Writer class
- replacing htmlFilterOutputStream by htmlFilterWriter class which handles
content as char stream
- htmlFilterContentTransformer.java: deactivating getText mode
(still needs to be migrated to use char streams instead of byte streams)
- changes in several classes to use htmlFilterWriter instead of htmlFilterOutputStream
- changes in Scraper and Transformer classes to operate on chars instead of bytes
- httpdProxyHandler.java: bugfix. clientTimeout setting was missing in config file
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2617 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
fc594e8eda
*) adding httpContentLengthInputStream.java class to allow reading of http response bodies until EOF even if a persistent connection is used
...
*) httpdByteCountInputStream.java: adding skip method
*) httpHeader.java: adding getCharacterEncoding function
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2616 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
low012
cd636eb00e
*) Fix for the fix...
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2615 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
low012
f9a5b55a9e
*) Fixed bug described in http://www.yacy-forum.de/viewtopic.php?p=25448#25448
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2614 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
3aac5b26da
- added automatic tag generation when a web page from the search results is added
...
- added new image 'B' in front of search results for bookmark generation
- added news generation when a public bookmark is added
- the '+' in front of search results has new meaning: positive rating for that result
- added news generation when a '+' is hit
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2613 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
low012
8a30c5343d
*) Fixed bug where exclamation marks could get lost between [=...=] and <pre>...</pre>
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2612 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
low012
d8f4b17e31
*) Hopefully fixed bug described in http://www.yacy-forum.de/viewtopic.php?t=2825 .
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2611 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
0e84a969d6
*) Bugfix for serverCharBuffer read from file operation
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2607 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
90ef19d778
*) first version of a serverCharBuffer
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2606 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
d374ef2bbe
bugfix for tryRemoveURLs
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2605 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
f644a1c3a7
better evaluation of index abstracts
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2604 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
1b48473bc5
bugfix to utf8 recognition
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2603 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
90f7241b59
serverByteBuffer.trim() can now recognize utf-8 characters
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2602 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
allo
2fd610b556
http://www.yacy-forum.de/viewtopic.php?p=25611#25611
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2601 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
e34d9b3fec
*) charset aware headlines (after the serverByteBuffer.trim problem is solved)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2599 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
8115ac47b5
*) charset aware metadata parsing
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2598 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
3ac30bdf22
*) some todo markers added for additional charset support
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2597 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
06fa891152
*) htmlFilterContentScraper.java: using proper charset for document title
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2595 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
74c3e7cf29
*) storing document charset into plasmaParserDocument object (is needed later by the condenser)
...
*) htmlFilterContentScraper.java: using proper charset for document title
*) serverByteBuffer.java: adding new toString which allows to specify the charset for byte encoding
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2593 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
c5d3020941
*) better errorhandling for last commit
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2592 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
d0a5a53789
*) changes needed for multi-language support
...
- parsers may need to know the charset of the byte stream
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2591 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
d82875c72b
removed removal of 'funny symbols' that may have caused utf-8 problems
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2589 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
26ab1fa885
fixed null pointer exception
...
See http://www.yacy-forum.de/viewtopic.php?p=25598#25598
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2588 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
b0e8ff6eda
*) some TODO markers for UTF-8 problem
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2586 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
41e27b85b7
fix for crawler condition
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2583 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
0ee7e45413
bugfix for merge method (caused by bad refactoring)
...
see http://www.yacy-forum.de/viewtopic.php?p=25529#25529
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2581 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
40965e183e
bugfix for minimizeurldb and urldbcleanup
...
see http://www.yacy-forum.de/viewtopic.php?p=25539#25539
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2580 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
5c2f30eaca
adjustments to dhtInCache write
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2579 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
9ecf7f0da2
*) some TODO markers for UTF-8 problem
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2578 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
e2f8339827
*) some bugfixes for UTF-8 related problems
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2577 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
c89d8142bb
replaced old 'kCache' by a fully controlled cache
...
there are now two fully controlled caches for incoming indexes:
- dhtIn
- dhtOut
during indexing, all indexes that shall not be transported to remote peers
because they belong to the own peer are stored to dhtIn. It is furthermore
ensured that received indexes are not directly transmitted again to other
peers. They may, however, be transmitted later if the network grows.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2574 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
6e2907135a
bugfixes for remote search server part
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2573 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
cf9884e22b
first attempt to implement a secondary search
...
this is a set of search processes that shall enrich search results
with specialized requests to realize a combination of search results
from different peers.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2571 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
2a06ce5538
*) next bugfix for UTF-8
...
- Sending UTF-8 messages to other peers did not work
- httpd.java: minor corrections for UTF-8
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2570 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
bdc51591ae
*) UTF-8 Bug solved (hopefully)
...
See: http://www.yacy-forum.de/viewtopic.php?p=25522
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2569 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
ef751b9d33
*) removing all string operations from the template engine
...
- engine should fully operate on bytes now
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2567 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
7ef80c1026
more debugging
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2566 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
b251076e64
avoid ConcurrentModificationException
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2563 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
75b198bc02
- updated references to indexContainer
...
- more bugfixes and debugging for indexAbstract processing
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2555 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
0bed3b9ac3
removed superfluous interface
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2554 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
b7e7808ea6
wordmigration now works also for new index database
...
if the new database is switched on, no 'too big' messages appear,
all the WORDS files can be completely migrated
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2553 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
a0ddf2ec11
*) AbstractCrawlWorker.java: delete already downloaded data on crawling error
...
*) plasmaSwitchboard.java: log unexpected errors while parsing/indexing
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2552 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
4f9e42d5ed
more changes towards better join-search
...
- fixed problems with index-abstract generation
- added analysis output for index abstract receive
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2551 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
auron_x
005400a137
*) reverted last commit
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2546 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
a7281a9b4d
fix for last commit
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2545 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
82a6054275
- fixed bug with new indexAbstract generation
...
- added partial evaluation of indexAbstracts during remote searches
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2544 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
fded1f4a5d
*) better handling of maximum file size limit in crawler
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2543 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
416b4e5c6b
ups
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2542 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
309accb983
memory control for ymage generation:
...
the ymageMatrix initializer throws a RuntimeException if there is not
enough memory available to generate a new ymage of wanted size
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2541 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
74d1dea30b
changes towards better join-search
...
- added generation of a compressed index within remote peers during global search
- added selection of specific urls within remote peers during secondary global search
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2539 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
auron_x
045ffebbd8
*) added debugline to versionstring-processing to find a possible bug in versiongeneration
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2537 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
ae4e8ce03e
- cut for 'probably last html-interface version': version number update
...
- small enhancement to ranking
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2536 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
64bed59ee8
enhancements to ranking
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2535 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
63893003be
*) Adding settings page for the crawler which allows specifying a file size limit and the timeout to use.
...
*) adding first version of maximum filesize check for the crawler
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2534 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
auron_x
06b1365066
*) fixed existing protection against divbyzero and removed the new one
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2530 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
94d7ced900
fix for last ranking commit
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2529 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
cc97a3e9c6
fixed possible bug with IndexOutOfBoundsException
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2528 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
03835c2ee8
enhanced search result computation
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2527 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
809960ddc6
avoid division by zero
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2526 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
ac3419b65f
better debugging for IndexOutOfBoundsException bug
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2525 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
75b03a4580
fix for new ArrayIndexOutOfBoundsException
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2524 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
a8bc768206
enhancements to ranking evaluation
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2523 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
auron_x
a82e926c5d
*) fix for wrong totalPPM-calculation
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2522 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
33898ae7e9
*) ResourceInfoFactory.java: Bugfix for ClassNotFoundException
...
See: http://www.yacy-forum.de/viewtopic.php?t=2797
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2521 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
406e170e25
*) more verbose error message
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2519 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
b298474e22
*) Bugfix needed because of changed plasmaCrawlLURL.load behavior
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2518 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
c2e6cc8c6b
small part of Bost's patch
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2517 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
96c6e4e322
- enhancements to detailed search page
...
- enhancements to search ranking computation process
- removed bugs in postranking
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2516 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
9340dbb501
fixed all possible problems with nullpointer exception for LURLs
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2513 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
a5ed86105b
*) bugfix for handling of ResourceInfo object in proxy
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2512 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
hermens
ff4362b02d
some more fixes for new plasmaCrawlLURL.load behavior
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2511 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
hermens
7aeadbe7cc
another NullPointerException in http.ResourceInfo
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2510 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago