24 Commits (d181b9e89b5250facdbf10cb77c2a2d7ebf37d58)

Author SHA1 Message Date
Michael Peter Christen 910a496c9f replaced http links with https
4 months ago
Michael Christen b2af745dd6 Merge pull request #404 from lnceballosz/master
4 years ago
sgaebel c69c462a15 replaces a expensive getLoadTimeURL() by exists()
4 years ago
Lina Ceballos a96752f5ab adding SPDX license and copyright headers
4 years ago
luccioman e45afedee4 Added support for enclosures (media links) to the RSS loader
7 years ago
luccioman aaefd5219c Reduce log verbosity of RSS loader on feed items with no link
7 years ago
luccioman f0639d810c Customized name for Threads still using the default "Thread-n" pattern.
8 years ago
orbiter 22ce4fb4dd better error handling for remote solr queries and exists-checks
10 years ago
Michael Peter Christen 69391e5d9e changed strategy to test existence of documents in Solr: using the
11 years ago
Michael Peter Christen 8b14e92ba4 added button in host browser to re-load 404/failed documents
11 years ago
Michael Peter Christen 5e31bad711 - the webgraph shall store all links which appear on a web page and not
11 years ago
Michael Peter Christen a88a62f7aa added a feature to set a collection for a crawl result based on a
11 years ago
Michael Peter Christen 765943a4b7 Redesign of crawler identification and robots steering. A non-p2p user
11 years ago
Michael Peter Christen bcc623a843 refactoring of load_delay: this is a matter of client identification
12 years ago
Michael Peter Christen 5878c1d599 - refactoring of log to ConcurrentLog:
12 years ago
Michael Peter Christen 8f2d3ce2f9 reduced locking situation in crawler: shifted synchronized location and
12 years ago
Michael Peter Christen 06d3063dc9 - no downcase when using collection modifier
12 years ago
Michael Peter Christen 8dbc80da70 redesign of index.exist-test: this shall now not be done using a single
12 years ago
Michael Peter Christen c091000165 added collection attribute also to the rss feed reader
12 years ago
Michael Peter Christen 788288eb9e added the generation of 50 (!!) new solr field in the core 'webgraph'.
12 years ago
Michael Peter Christen 5f0ab25382 removed the option to prevent removal of & parts inside of the
12 years ago
Michael Peter Christen 1533bfd63b refactoring
12 years ago
Michael Peter Christen 8219a445f3 refactoring
12 years ago
Michael Peter Christen 00c1c777fa refactoring
12 years ago