Commit Graph

451 Commits (c659310e89e8a4b4b2d1de0b13c67f604843ad1e)

Author SHA1 Message Date
orbiter e4d561971e added more score cluster options and made score cluster usage more transparent
14 years ago
orbiter e8f90201a5 fix for scheduling of rss feeds
14 years ago
orbiter 6a166c2040 patches for bad proxy behaviour
14 years ago
orbiter 45b1ab3d07 custom + generic skins:
14 years ago
orbiter 0d363a94d7 more performance hacks
14 years ago
orbiter 091dd3f6ec - enhanced intranet search speed
14 years ago
orbiter aacf572a26 - enhancements for search speed
14 years ago
orbiter 2c549ae341 fixed a number of small bugs:
14 years ago
orbiter f6eebb6f99 replaced auto-dom filter with easy-to-understand Site Link-List crawler option
14 years ago
orbiter d2fd93135c - moved yacybot user agent string definition to MultiProtocolURI since there are basic access mechanisms where the bot string is needed
14 years ago
orbiter 48c0d508ac fixes for crawling of smb links (file length not always available)
14 years ago
orbiter 461a2a6ec7 enhanced remote crawling:
14 years ago
orbiter 5870b13f3a - code cleanup / added debug line for further investigation in HTTPDemon.parseMultipart
14 years ago
sixcooler 17eebd4ef8 counting crawler traffic again:
14 years ago
orbiter 348dece62f redesign of the SortStack and SortStore classes:
14 years ago
orbiter 114bdd8ba7 fixed old sitemap importer which was not able to parse urls containing post elements
14 years ago
orbiter 5fe828fa06 - replaced pdfbox and fontbox version 1.1.0 with 1.2.1
14 years ago
orbiter 22047ffad5 enhanced computation speed of many replaceAll string operations
14 years ago
orbiter 9d080f387e change in handling of the all-visible home path for storage in YaCy:
14 years ago
orbiter 65eaf30f77 redesign of crawl profiles data structure. target will be:
14 years ago
orbiter 104318d58a - added nice colors to feed indexing state messages
14 years ago
orbiter 0f276dd63f - MapHeap now implements Map<byte[], Map<String, String>>
14 years ago
orbiter c60d0282fd more abstraction for tables stored in heaps:
14 years ago
orbiter 3197ca42ed preparations to move the HTCache into cora:
14 years ago
orbiter 844f158686 - removed dependencies in header framework:
14 years ago
orbiter 5e7081cd19 refactoring towards a unified loading mechanism for MultiProtocolURIs
14 years ago
orbiter caece04f26 removed System.err and System.out usage from FTPClient; changed logging to log4j (preferred in yacy.cora)
14 years ago
orbiter 90531f78ff refactoring of the cora package to get subpackages for http and ftp (smb to come)
14 years ago
sixcooler 661867923a ... migrating to HttpComponents-Client-4.x ...
14 years ago
orbiter 7aa860c505 - more logging
14 years ago
orbiter 70dd26ec95 added the new crawl scheduling function to the crawl start menu:
14 years ago
orbiter 7fdb17bb96 redirect uncaught exceptions to logging + small other changes
14 years ago
orbiter 87b1684211 additional double-check in balancer
14 years ago
orbiter 0d81731e88 fixed crawler bug caused by NPE in logging
14 years ago
orbiter a82a93f2fc - better url double check in crawler
14 years ago
sixcooler a6ed6e8cb9 ... migrating to HttpComponents-Client-4.x ...
14 years ago
orbiter 171f2bd84e - removed unused network oanet
14 years ago
sixcooler d88b9606d1 fix for http://forum.yacy-websuche.de/viewtopic.php?f=6&t=2923
14 years ago
orbiter 6388a58fc7 better memory management and slightly less (in total and temporary) RAM allocation:
14 years ago
orbiter 5924a0d851 - enhanced concurrency in database index access for multicore
14 years ago
sixcooler 15e8c13526 ... migrating to HttpComponents-Client-4.x ...
15 years ago
orbiter 22dbbcfa56 better (and corrected) recognition of intranet and internet-addresses. This corrects the isLocal property that is used by network definitions to restrict index ranges to local and global addresses. Address locations (intranet or internet) had been partly identified by the top level domain of the host address. Since intranet addresses can also be addressed using a host name that is in a country domain it is necessary to do a dns resolving for each check. The check is supported by a local dns cache so the intranet/internet check should not affect network traffic too much. To ensure that the cache works properly the cache class was upgraded to better concurrency data structures.
15 years ago
orbiter b6fb239e74 redesign of parser interface:
15 years ago
orbiter 150cf42a1b migrated all my LGPL 3 -licensed files to the LGPL 2.1 because LGPL 3 is not compatible to the GPL 2
15 years ago
orbiter 5d00888c95 - added animated visualization for DHT-in and DHT-out in network graphic
15 years ago
orbiter dcd01698b4 added a 'transition feature' that shall lower the barrier to move from g**gle to yacy (yes!):
15 years ago
orbiter 777195e8d1 more abstraction for access of LoaderDispatcher and cache
15 years ago
orbiter 7bcfa033c9 more abstraction of the htcache when using the LoaderDispatcher:
15 years ago
orbiter 73f03e05ee fixed a bug in snippet fetch strategy: cache only does not help if resource can only be found in web
15 years ago
orbiter 87087f12fe - scanned remote search process and enhanced some data structure and synchronizations here and there
15 years ago
orbiter b03caaa57a better handling of OOM situations
15 years ago
orbiter 60e71876ad - more abstraction (HashMap -> Map)
15 years ago
orbiter a83772c71b fixes and enhancements for balancer:
15 years ago
orbiter 9cde05418f fixed url crawl list display
15 years ago
orbiter 30b337fa9f fixes to balancer when crawling filesystem (problem was: host == null)
15 years ago
orbiter 844853243a fixed balancer time guessing
15 years ago
orbiter 3f93a0cc8f redesign of remote proxy settings
15 years ago
orbiter 11639aef35 - added new protocol loader for 'file'-type URLs
15 years ago
orbiter 6950d8a33d fixes to SMB crawler
15 years ago
orbiter 2a8f70f0ca - fix for caching of OSM tiles. if you want that this fix applies to your peer, please delete the crawl profiles
15 years ago
orbiter 2126c03a62 - removed download-limit that can be given for the crawler for non-crawler download tasks. This was necessary because the same procedure was used for other downloads like for the download of dictionary files where a limit is not useful. The limit still stays for the indexer
15 years ago
orbiter cf43bdc87e This is a large bugfix and enhancement commit to support a better location detection for data
15 years ago
orbiter c45117f81f fixed dates in metadata
15 years ago
orbiter 7ab207d93a better presentation of search result metadata and fixes to htcache loading
15 years ago
orbiter 40a8d132d9 tried to fix 100% CPU when calling Balancer.top()
15 years ago
orbiter 90c3e5d6f6 - cleanup, removed unused imports
15 years ago
orbiter 4cd5418963 removed finalize methods because of a hint in
15 years ago
orbiter bfa35d6d20 possible fix for ZURL.list counter
15 years ago
orbiter 8c40f1cb8e self-healing for broken table files (may cause other problems, but better than nothing)
15 years ago
orbiter 7b69d79727 enhanced remove() operation: in many cases it is not necessary to return the removed object to the called.
15 years ago
orbiter 64f29f990e a collection of performance hacks and code cleanup:
15 years ago
orbiter 8b8107b2a3 reduced IO-load and synchronization/blocking
15 years ago
orbiter 1a8a134e0c continuing String-hash - to - byte[]-hash redesign that was started in SVN 6775 and continued in SVN 6790
15 years ago
orbiter 48b9371735 changed balancer re-load counter. causes less blocking here doing intranet indexing.
15 years ago
orbiter 55d8e686ea performance hacks
15 years ago
hermens 4ec0092677 more null == proxy fixes
15 years ago
hermens 2f90f0ad56 Remove asserts blocking proxy use cases
15 years ago
orbiter 25aef069a6 continuing String-hash - to - byte[]-hash redesign that was started in SVN 6775
15 years ago
low012 b97ad0f380 *) some minor changes for better code readability
15 years ago
orbiter ba51d140e1 added more info in assert in balancer
15 years ago
orbiter 1e8e79b9ef redesign of reference hash (URL-hash) parameter hand-over:
15 years ago
orbiter 90dd197ae7 - no latency for local crawls
15 years ago
orbiter c855fc48c6 only load robots.txt for http and http protocol
15 years ago
orbiter 3300930fc5 - (almost) fixed FTP crawler
15 years ago
orbiter 9623d9e6d2 added a smb loader component for the YaCy crawler
15 years ago
orbiter b88f5fbb4b slightly changed crawling policy
15 years ago
orbiter 7684a575c4 fix for deletion of error database each time when YaCy starts up
15 years ago
orbiter c09a995930 better logging of double occurrences of urls in the crawler
15 years ago
orbiter 727dd9b193 - fixed a bug in robots.txt parser
15 years ago
orbiter 46c4f8b68a better look-ahead into the crawl queue: show more on crawl monitor
15 years ago
orbiter 564927ce72 redesign of CrawlResult data structures because of OOM occurrences during URL deletion processes.
15 years ago
lotus 945e0ba5a5 allow global search if res. observer disabled index transmission
15 years ago
lotus 11188cd7eb resource observer now uses the Java 6 method to check for free space. thus, disk observing now needs Java 6 installed.
15 years ago
orbiter 8ce936bcdd added an api recording function: it shall be possible to record
15 years ago
orbiter e80e060ca6 - increased thread priority for server threads
15 years ago
orbiter 473b11033d fixed network switch process - crawling did not work after a switch before this fix
15 years ago
orbiter bd05e57d3b fix for http://forum.yacy-websuche.de/viewtopic.php?p=18563#p18563
15 years ago
orbiter 5df628a2a4 - added BEncoder class
15 years ago
orbiter 82f57f79e5 more PMD enhancements
15 years ago
orbiter a06f7ddb33 more PMD recommendations
15 years ago
orbiter 66c0a8e849 more PMD recommendations
15 years ago
orbiter bc96d74813 - clean-up of robots.txt parser
15 years ago
orbiter dd459281c8 applied code changes that are recommended by PMD
15 years ago
lotus eac2daf2e8 * reenable DHT if yet enough memory is available
15 years ago
orbiter d77a8f3b3e added some modifications recommended by PMD for better performance
15 years ago
orbiter d1973bae2a code cleanup: removed unused code and unused methods
15 years ago
orbiter a3b8b7b5c5 some redesign of the main menu structure:
15 years ago
orbiter eeca2ded92 fix for http://forum.yacy-websuche.de/viewtopic.php?p=18500#p18500
15 years ago
orbiter dff4f95c78 some patches to get the torrent parser working
15 years ago
orbiter 362b7a929b added extensive memory protection logic to avoid out of memory errors that may be caused by the RowCollection memory allocation function
15 years ago
orbiter e34e63a039 preset of proper HashMap dimensions: should prevent re-hashing and increase performance
15 years ago
orbiter 4a5100789f replaced _all_ size() == 0 with isEmpty() and all size() > 0 with !isEmpty(). The isEmpty() method is much faster in some cases, especially when used to access badly balanced hashtables where an size() operation becomes a large iteration.
15 years ago
orbiter 2d8f3ee301 some performance hacks
15 years ago
orbiter 4c99d4683d possible fix for lost crawl profile handles: clean-up job did wrong measurement to see if crawl is still running.
15 years ago
lotus 6edc168cfe option to disable dht by memory limit:
15 years ago
orbiter 4431b9767e added about 450 replacements for printStackTrace() methods to pipe such traces into the log at DATA/LOG/
15 years ago
orbiter e3025ee691 - new icon for OAI-PMH loading action
15 years ago
orbiter b0b7a4f9a5 - added function to OAI-PMH reader that can pull all records from a server using an evaluation of the resumption token to get URL to retrieve remaining records
15 years ago
lotus 79251e6f60 configurable disk space hardlimit for dht
15 years ago
orbiter a0e891c63d - some redesign in UI menu structure to make room for new 'Content Integration' main menu containing import servlets for Wikimedia Dumps, phpbb3 forum imports and OAI-PMH imports
15 years ago
orbiter 30f108f97d added stub of oai-pmh importer (not working yet)
15 years ago
orbiter 52470d0de4 - fix for xls parser
15 years ago
orbiter 5e8038ac4d - refactoring of blacklists
15 years ago
orbiter 26fafd85a5 - more refactoring
15 years ago
orbiter 3528b970d6 - refactoring
15 years ago
orbiter a8ce192f63 - shifted main classes to new package net.yacy
15 years ago
orbiter b79f4f062f refactoring of yacy documents and parsers: they depend now only on the kelondro classes
15 years ago
orbiter e7f18ba24b refactoring
15 years ago
orbiter ce8dc575ca refactoring
15 years ago
orbiter bea3b99aff moved table and util classes
15 years ago
orbiter c0e0e1f422 moved blob classes
15 years ago
orbiter 1e4f8b56ed accumulated classes from different packages into the new rwi package
15 years ago
orbiter 194da25a2f moved kelondro index
15 years ago
orbiter 4446acc8cd moved kelondro order
15 years ago
orbiter f677d534b1 start of a really extensive refactoring which will produce a hierarchical package structure with the domain yacy.net as package root
15 years ago
orbiter 735e2737e3 * added index segments
15 years ago
orbiter 6e0dc39a7d - some fixes to prevent blocking situations
15 years ago
orbiter 04a548a1e3 - temporary integrated the transferURL servlet as static class instead as a class that is called using reflection to investigate the OOM problems in that class
15 years ago
orbiter 6aa474f529 - better logging for web cache access and fail reasons
15 years ago
orbiter 3671c37989 added experimental oai-pmh reader and integrated it with the existing dublin core parser
15 years ago
orbiter 2e6bdce086 - added more logging to balancer
15 years ago
hermens 62a7341c4d Fix for http://forum.yacy-websuche.de/viewtopic.php?f=5&t=2204
15 years ago
low012 f65bfaa9af *) Removed base tag from errror page. This has been added by myself a long time ago as a workaround for some weird behavior of my router, but as it turns out, it does more bad than good in general: If HTTPS is used for communication with YaCy, entering a wrong passwort led to an errror page with a form which would send username and password unencrypted with the user possibly being unaware of this.
15 years ago
orbiter 3e9dcfc204 fix for http://forum.yacy-websuche.de/viewtopic.php?p=17504#p17504
15 years ago
orbiter 573d03c7d7 added configuration to enable ram table copy
15 years ago
orbiter ce972ff4ef update to default ranking profile which has now some settings to deny some phpbb3 pages which are redundant in the index when crawling phpbb3.
15 years ago
orbiter 44579fa06d - fixed a problem loading images through yacy's document loader,
15 years ago
orbiter 72e5407115 refactoring of snippet cache
15 years ago
orbiter 8e56c2ace6 fix for fixes from this afternoon
15 years ago
orbiter cf739edc2e fix for possible deadlock, see
15 years ago