Commit Graph

104 Commits (81737dcb1810374d54cff7810b404b4009f0182b)

Author SHA1 Message Date
orbiter 65eaf30f77 redesign of crawl profiles data structure. target will be:
15 years ago
orbiter 3197ca42ed preparations to move the HTCache into cora:
15 years ago
orbiter 844f158686 - removed dependencies in header framework:
15 years ago
orbiter 5e7081cd19 refactoring towards a unified loading mechanism for MultiProtocolURIs
15 years ago
orbiter caece04f26 removed System.err and System.out usage from FTPClient; changed logging to log4j (preferred in yacy.cora)
15 years ago
orbiter 90531f78ff refactoring of the cora package to get subpackages for http and ftp (smb to come)
15 years ago
orbiter a82a93f2fc - better url double check in crawler
15 years ago
sixcooler a6ed6e8cb9 ... migrating to HttpComponents-Client-4.x ...
15 years ago
sixcooler 15e8c13526 ... migrating to HttpComponents-Client-4.x ...
15 years ago
orbiter b6fb239e74 redesign of parser interface:
15 years ago
orbiter 150cf42a1b migrated all my LGPL 3 -licensed files to the LGPL 2.1 because LGPL 3 is not compatible to the GPL 2
15 years ago
orbiter 777195e8d1 more abstraction for access of LoaderDispatcher and cache
15 years ago
orbiter 87087f12fe - scanned remote search process and enhanced some data structure and synchronizations here and there
15 years ago
orbiter 3f93a0cc8f redesign of remote proxy settings
15 years ago
orbiter 11639aef35 - added new protocol loader for 'file'-type URLs
15 years ago
orbiter 6950d8a33d fixes to SMB crawler
15 years ago
orbiter 2126c03a62 - removed download-limit that can be given for the crawler for non-crawler download tasks. This was necessary because the same procedure was used for other downloads like for the download of dictionary files where a limit is not useful. The limit still stays for the indexer
15 years ago
orbiter cf43bdc87e This is a large bugfix and enhancement commit to support a better location detection for data
15 years ago
orbiter c45117f81f fixed dates in metadata
15 years ago
orbiter 25aef069a6 continuing String-hash - to - byte[]-hash redesign that was started in SVN 6775
15 years ago
orbiter 3300930fc5 - (almost) fixed FTP crawler
15 years ago
orbiter 9623d9e6d2 added a smb loader component for the YaCy crawler
15 years ago
orbiter 8ce936bcdd added an api recording function: it shall be possible to record
15 years ago
orbiter dff4f95c78 some patches to get the torrent parser working
15 years ago
orbiter 2d8f3ee301 some performance hacks
15 years ago
orbiter b0b7a4f9a5 - added function to OAI-PMH reader that can pull all records from a server using an evaluation of the resumption token to get URL to retrieve remaining records
15 years ago
orbiter a0e891c63d - some redesign in UI menu structure to make room for new 'Content Integration' main menu containing import servlets for Wikimedia Dumps, phpbb3 forum imports and OAI-PMH imports
15 years ago
orbiter 52470d0de4 - fix for xls parser
15 years ago
orbiter 5e8038ac4d - refactoring of blacklists
16 years ago
orbiter 26fafd85a5 - more refactoring
16 years ago
orbiter 3528b970d6 - refactoring
16 years ago
orbiter b79f4f062f refactoring of yacy documents and parsers: they depend now only on the kelondro classes
16 years ago
orbiter e7f18ba24b refactoring
16 years ago
orbiter ce8dc575ca refactoring
16 years ago
orbiter bea3b99aff moved table and util classes
16 years ago
orbiter 194da25a2f moved kelondro index
16 years ago
orbiter 4446acc8cd moved kelondro order
16 years ago
orbiter f677d534b1 start of a really extensive refactoring which will produce a hierarchical package structure with the domain yacy.net as package root
16 years ago
orbiter 735e2737e3 * added index segments
16 years ago
orbiter 6aa474f529 - better logging for web cache access and fail reasons
16 years ago
orbiter 3671c37989 added experimental oai-pmh reader and integrated it with the existing dublin core parser
16 years ago
low012 f65bfaa9af *) Removed base tag from errror page. This has been added by myself a long time ago as a workaround for some weird behavior of my router, but as it turns out, it does more bad than good in general: If HTTPS is used for communication with YaCy, entering a wrong passwort led to an errror page with a form which would send username and password unencrypted with the user possibly being unaware of this.
16 years ago
orbiter 3e9dcfc204 fix for http://forum.yacy-websuche.de/viewtopic.php?p=17504#p17504
16 years ago
orbiter 44579fa06d - fixed a problem loading images through yacy's document loader,
16 years ago
orbiter 72e5407115 refactoring of snippet cache
16 years ago
orbiter 8e56c2ace6 fix for fixes from this afternoon
16 years ago
orbiter c0e17de2fb - fixes for some problems with the new crawling/caching strategies
16 years ago
orbiter 634a01a9a4 replaced wget-requests with caching requests
16 years ago
orbiter 161d2fd2ef redesign of access to the HTCache (now http.client.Cache):
16 years ago
orbiter 4da9042e8a code simplification
16 years ago
orbiter 1d8d51075c refactoring:
16 years ago
orbiter 5bb8074150 removed the indexing queue. This queue was superfluous since the introduction of the blocking queues last year, where documents are parsed, analysed and stored in the index with concurrency.
16 years ago
orbiter b332dfad67 - inserted request object into response object which carries this now instead generating new objects
16 years ago
orbiter ca72ed7526 -removed superfluous crawl cache
16 years ago