orbiter
f25c0e98d1
- replaced String by StringBuffer in condenser
...
- added CamelCase parser in condenser
- added option to switch on or off indexing for proxy
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3292 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
karlchenofhell
3863f2dd24
- CacheAdmin: no exception if cached file was not available
...
- fix for http://www.yacy-forum.de/viewtopic.php?t=3412
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3232 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
allo
0c81bd39d4
XSS-safe put as default.
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3217 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
karlchenofhell
89a270757f
- 3 more templates XHTML valid (had to remove all disabled <input>s if no blacklist was selected)
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3135 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
karlchenofhell
41d7e86299
- removed HTML from CacheAdmin_p.java in favour of the corresponding .html file
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3098 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
937ccd4e76
fix for snippet-generation
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3060 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
ad1e4aa88e
added selection of audio, video, image and application resources
...
to search procedure. This function currently cannot be used through the
search interface, but only through remote search.
added accumulation of search attributes to enable the audio, video,
image and application selection.
fixed a problem with external URL representation generation
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3036 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
ceb9e3aa17
- enhanced parser: collection of audio, video, image and application links
...
- enhanced condenser: better handling of utf-8 and pre-formatted texts
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@3017 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
0f10bdde22
more generic cache methods
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2721 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
hermens
440c6ee657
Implement alternative htcache layout
...
mostly according to: http://www.yacy-forum.de/viewtopic.php?p=26205#26205
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2718 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
1969522dc1
removed lowercase of snippets (and other things):
...
- added new sentence parser to condenser
- sentence parsing can now handle charsets
to do: charsets must be handed over to new sentence parser
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2712 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
a2e3095044
*) Bugfix. Add missing plasmaParserDocument.close() calls
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2680 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
f453c14b5d
removed unreachable catch blocks and unused imports
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2619 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
97d2a08ef1
*) restructuring needed to support parsing of documents using various charsets
...
- serverFileUtils.java:
-- adding methods to copy from stream to writer and readers to writers
-- moving httpc writeX methods into serverFileUtils class
- serverCharBuffer.java: removing inheritance from Writer class
- replacing htmlFilterOutputStream by htmlFilterWriter class which handles
content as char stream
- htmlFilterContentTransformer.java: deactivating getText mode
(still needs to be migrated to use char streams instead of byte streams)
- changes in several classes to use htmlFilterWriter instead of htmlFilterOutputStream
- changes in Scraper and Transformer classes to operate on chars instead of bytes
- httpdProxyHandler.java: bugfix. clientTimeout setting was missing in config file
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2617 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
9340dbb501
fixed all possible problems with nullpointer exception for LURLs
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2513 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
theli
dae763d8e3
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2495 6c8d7289-2bf4-0310-a012-ef5d649a1542
18 years ago
orbiter
abf22f6e60
removed url normalform computation from htmlFilterContentScraper.
...
This method was implemented in de.anomic.net.URL
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2377 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
3879a0ecd0
replaced java.net.URL usage by use of new class de.anomic.net.URL
...
This shall be seen as an experiment to exclude all cases where
there could be a DNS lookup during URL comparison.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2290 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
90d569d70f
refactoring of index management:
...
url storage is part of index management; moved plasmaURL to indexURL
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2122 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
51999578bf
fixed bug created with last commit of borg
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2013 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
b21b9df2d0
added section headlines generation to html parser
...
can be viewed in cache control, but is not yet included in indexing
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1320 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
borg-0300
6de5c46702
new CacheAdmin (redesign);
...
bugfix: no more '//' in cachelink;
bugfix: don't leave the htCachePath;
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1318 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
4500506735
fixed some bugs concerning url entry retrieval and indexControl interface
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1212 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
79818a320f
introduced citation-rank transmission protocol and activated transport for anonymisation
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1055 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
9c4306e41e
fixed problem with htcache path
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@811 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
68d5ff2ef1
added stringbuffer in condenser
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@782 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
borg-0300
5543ea08ad
sorted directory/file list;
...
don't list responseHeader.db;
StringBuffers, finals;
cleaned;
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@739 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
borg-0300
d06aa558f5
back to 731
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@736 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
borg-0300
f50d45678e
better values for BIG directories
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@734 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
borg-0300
650764ce62
sorted directory/file list;
...
don't list responseHeader.db;
StringBuffers, finals;
cleaned;
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@733 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
51aa6d0b33
fix for performance problem on CacheAdmin_p.html
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@731 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
borg-0300
bf14e6def5
*) proxyCache, proxyCacheSize can be changed under 'Proxy Indexing'
...
- paths are now absolute
*) move path check from plasmaHTCache to plasmaSwitchboard
- only one path check when starting
*) other small changes
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@606 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
orbiter
2d8557cb10
minor changes
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@487 6c8d7289-2bf4-0310-a012-ef5d649a1542
20 years ago
orbiter
712fe9ef18
bugfixed utf-8 decoding and parser
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@346 6c8d7289-2bf4-0310-a012-ef5d649a1542
20 years ago
orbiter
3addf58046
enhanced snippet-loading with threads
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@322 6c8d7289-2bf4-0310-a012-ef5d649a1542
20 years ago
orbiter
56d28a16f0
bugfixes
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@320 6c8d7289-2bf4-0310-a012-ef5d649a1542
20 years ago
rramthun
4e63456dba
some corrections/enhancements to the webinterface
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@198 6c8d7289-2bf4-0310-a012-ef5d649a1542
20 years ago
rramthun
2d751ba831
Fixed a spelling mistake
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@117 6c8d7289-2bf4-0310-a012-ef5d649a1542
20 years ago
rramthun
85c2f3be8a
Fixed spelling mistakes
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@110 6c8d7289-2bf4-0310-a012-ef5d649a1542
20 years ago
theli
e7f7aa0bb9
*) Import statements reorganized
...
Now it's easier to determine which class really uses which other class.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@83 6c8d7289-2bf4-0310-a012-ef5d649a1542
20 years ago
theli
58b1a0ba40
*) adding a new package for extra content parsers
...
*) adding content parser for
- pdf (using the pdf-box library)
- doc (using the textmining.org library)
*) adding an interface for content parsers
*) adding a configuration file which can be used to configure which parser is used for which mimeType
*) Semaphore class was moved and renamed to serverSemaphore
*) Changing yacy shutdown behaviour
The busy-waiting loop for shutdown was removed and replaced with a blocking call (using the semaphore class mentioned above) to the new switchboard.waitForShutdown method.
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@46 6c8d7289-2bf4-0310-a012-ef5d649a1542
20 years ago
orbiter
e7d055b98e
very experimental integration of the new generic parser and optional disabling of bluelist filtering in proxy. Does not yet work properly. To disable the disable-feature, the presence of a non-empty bluelist is necessary
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@17 6c8d7289-2bf4-0310-a012-ef5d649a1542
20 years ago
orbiter
a87a17a3c8
prepared generic text parser environment
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@15 6c8d7289-2bf4-0310-a012-ef5d649a1542
20 years ago
orbiter
248077d3f0
initial load with yacy 0.36
...
git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@1 6c8d7289-2bf4-0310-a012-ef5d649a1542
20 years ago