17 Commits (6db7f5525b153b0ceb9d5c39a38a16772bc60e5b)

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| luccioman | a9cb083fa1 | Improved consistency between loader openInputStream and load functions | 8 years ago |
| luccioman | f66438442e | Extended Mediawiki dump import to remote URLs. | 8 years ago |
| reger | c50e23c495 | reduce creation of empty legacy RequestHeader() in situation where null | 8 years ago |
| reger | 7ab41d4ff1 | use directories original lastmodified date in file- & smbloader in response | 8 years ago |
| luc | 5bbb2e1730 | Ensure resource is closed when reading a full file InputStream | 9 years ago |
| reger | 6932aa4d7a | use configured admin-username for api calls | 11 years ago |
| orbiter | 3cb6c7861f | fixed shutdown authentication problem | 11 years ago |
| Michael Peter Christen | 91a875dff5 | self-healing of mistakenly deactivated crawl profiles. This fixes a bug | 11 years ago |
| Michael Peter Christen | 5e31bad711 | - the webgraph shall store all links which appear on a web page and not | 11 years ago |
| Michael Peter Christen | 765943a4b7 | Redesign of crawler identification and robots steering. A non-p2p user | 11 years ago |
| Roland Haeder | 841a28ae76 | Added 'final' for all exception blocks as this helps the Java compiler | 11 years ago |
| Michael Peter Christen | 5878c1d599 | - refactoring of log to ConcurrentLog: | 12 years ago |
| Michael Peter Christen | 16d1d744fa | added url_file_name_s in default collection schema for the file name | 12 years ago |
| Michael Peter Christen | d6b82840f8 | added a feature to find similarities in documents. | 12 years ago |
| Michael Peter Christen | 5f0ab25382 | removed the option to prevent removal of & parts inside of the | 12 years ago |
| Michael Peter Christen | a06930662c | replaced some more .getBytes() with UTF8/ASCII.getBytes() | 12 years ago |
| Michael Peter Christen | 00c1c777fa | refactoring | 12 years ago |