Commit Graph

50 Commits (77a73c7475ed24edf2f3e7ff1993d550025fa075)

Author SHA1 Message Date
reger 71d2655c02 downgrade to Jetty 8 to assure support of JRE 1.6
11 years ago
reger fe87fb638a adjust test/ParserTest to dc_description data type
11 years ago
Roland Haeder 841a28ae76 Added 'final' for all exception blocks as this helps the Java compiler
11 years ago
reger 97ab5b90e8 - odt & ooxml (office document) parser correction to add content to fulltext index
12 years ago
reger 160ce568b3 move testing SolrServlet.main to test, making include of jetty*.jar in distribution and classpath obsolete
12 years ago
orbiter d2ea250d99 refactoring:
13 years ago
orbiter 49e5ca579f added new configuration property "crawler.embedLinksAsDocuments". If this is switched on (this is default now), then all embedded image, audio and video links from all parsed documents are added to the search index as individual documents. This will increase the search index size dramatically but will also enable us to create a much faster image, audio and video search. If the flag is switched on, the index entries are also stored to a Solr index, if this is also enabled.
13 years ago
orbiter cb1f49d0f2 replaced all 'new String' with default encoding (missing) or UTF-8 encoding with a String generation method that uses a pre-defined Charset constant for UTF-8. This avoids a cache-lookup for the Charset object using String hashing of the String 'UTF-8'.
14 years ago
f1ori 01cb3bbaec * fix patchCharsetEncoding-test (patchCharsetEncoding now returns null on input null)
14 years ago
orbiter 3197ca42ed preparations to move the HTCache into cora:
14 years ago
orbiter 844f158686 - removed dependencies in header framework:
14 years ago
orbiter b6fb239e74 redesign of parser interface:
15 years ago
orbiter 11639aef35 - added new protocol loader for 'file'-type URLs
15 years ago
orbiter 3528b970d6 - refactoring
15 years ago
orbiter b79f4f062f refactoring of yacy documents and parsers: they depend now only on the kelondro classes
15 years ago
f1ori 34c71b22e8 fix and enable parser unit tests (tested with eclipse)
15 years ago
orbiter ce8dc575ca refactoring
15 years ago
orbiter bea3b99aff moved table and util classes
15 years ago
orbiter ce7924d712 better concurrency for rwi entry parsing during search processing
15 years ago
orbiter 72ac5bd80f refactoring of search process.
15 years ago
f1ori d515bc11e2 added ooxmlparser
15 years ago
f1ori 8c1b02af04 * fix warning in testcase
15 years ago
f1ori 67da20647f * add new odf parser based on sax-xml-parser
16 years ago
f1ori 06557485f5 * added parser unittest!
16 years ago
f1ori 69dfd03985 reactivate unittests
16 years ago
orbiter daf0f74361 joined anomic.net.URL, plasmaURL and url hash computation:
17 years ago
theli 2399ed817c *) robots.txt parser now extracts the sitemap-URL (will be used later)
18 years ago
theli 1b7fda12ee *) SOAP: separate function to get the active/passive/potential peer list
18 years ago
karlchenofhell a1d68fe092 - use .class rather than Class.forName for classes in class-path
18 years ago
orbiter d25caa07bf redesigned some parts of http authentication
18 years ago
theli eb20ec3837 *) soap-service: adding function to check if a specific url is blacklisted
18 years ago
theli 5c0669429e *) soap: adding function to query the peer list
18 years ago
theli 203f2bde9a *) adding function to query the pause/resume state of the crawling queues
18 years ago
theli 6d3a130878 *) bugfix needed because of db refactoring
18 years ago
theli 892b9f2fc4 *) additional soap function to query peer status
18 years ago
theli bd3710a974 *) new xml template to view peer profile as xml
18 years ago
theli d1afe1ce6b *) adding xml template to get the message list as xml
18 years ago
theli f37e2041e8 *) adding soap function to import yacy bookmarks from xml or html (transferred via soap attachments)
18 years ago
theli 4a3ec63e34 *) new soap service to manage yacy bookmarks
18 years ago
theli 5e57e0814d *) new soap function to display log
18 years ago
theli c7bea4addb *) soap api
18 years ago
theli 532c23b5c7 *) soap handler
18 years ago
theli 7299dc30e3 *) new soap service to manage the yacy file-share
18 years ago
theli 9e8942a064 *) adding method to implement blacklist from file
18 years ago
theli d38ef0493d *) be more tolerant against missing ports in url
18 years ago
theli cfe54fedc7 *) Bugfix for resolveBackpath problem with trailing /..
18 years ago
theli ac13fa763a *) bugfix for blacklist remove (blacklist was not informed about remove)
18 years ago
theli 3e0516446b *) new soap function to get the current queue status
18 years ago
theli 92f774edd1 *) Better charset encoding detection
18 years ago
theli eedb898c45 *) adding date parsing test routine to determine if we have a date-parsing bug
18 years ago