Commit Graph

3188 Commits (c3aadcf8999e8c5792160364dc6b69c8ca14119b)

Author SHA1 Message Date
Michael Peter Christen 92007e5d2d more enhancements to postprocessing speed
10 years ago
Michael Peter Christen 9a7fe9e0d1 fix for bad timing computation in postprocessing
10 years ago
Michael Peter Christen bd16119a00 another fix for postprocessing (the query for "" on numeric field did
10 years ago
Michael Peter Christen 327e83bfe7 more fixes in postprocessing: partitioning of the complete queue to
10 years ago
orbiter 2bc6199408 more concurrency for postprocessing
10 years ago
orbiter a83cf26c38 more fixes and enhancements to postprocessing
10 years ago
orbiter 71758f0d62 enhanced postprocessing by usage of a field-list generation to prevent
10 years ago
orbiter 7856fbdbe8 fix for npe (in rare cases)
10 years ago
orbiter 8a2b569d7c fix for literal computation
10 years ago
orbiter 856da2712b Merge branch 'master' of git@gitorious.org:yacy/rc1.git
10 years ago
orbiter ca9cd7b58a more IPv6 fixes
10 years ago
Michael Peter Christen b4585e9546 added new index size history image in /Status.html page
10 years ago
Michael Peter Christen 167c5a51f0 IPv6 fix
10 years ago
Michael Peter Christen fe537679de fix for exact_signature_unique_b, exact_signature_copycount_i,
10 years ago
sixcooler eb9d2705d2 fix for ConnectionInfo.cleanup of server-connections
10 years ago
Michael Peter Christen 2e5214eb21 added field postprocessing.partialUpdate to settings which can be used
10 years ago
Michael Peter Christen 11074d8d24 fix for an ssl bug that appears only in java 7.
10 years ago
Michael Peter Christen e96490e3a1 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
10 years ago
Michael Peter Christen 77662e08e1 concurrently initialize the error cache; extended also the cache by
10 years ago
sixcooler d8fcc4a2f5 added a timeout on Jetty connectors
10 years ago
Michael Peter Christen 0f0b60404b Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
10 years ago
sixcooler 72561926aa do not overwrite yacy.conf in case of an exception
10 years ago
Michael Peter Christen 07c5b57953 removed warnings
10 years ago
orbiter fa2ad101ec enhanced graphics computation (avoiding long string parsing for colours)
10 years ago
orbiter ef813cec91 added proper copyright notice to OSM tiles presented at the search
10 years ago
Michael Peter Christen fca11701f0 better profiling of solr queries
10 years ago
Michael Peter Christen 2e09da9832 npe fix
10 years ago
Michael Peter Christen d80418f1b1 added partial updates to solr during postprocessing: during
10 years ago
Michael Peter Christen b1cfbc4a04 added new solr field url_paths_count_i which can be used to enhance the
10 years ago
Michael Peter Christen e69883d5ab fix-fix for
10 years ago
Michael Peter Christen 30d4402cd1 fixed location search
10 years ago
Michael Peter Christen 6983dff334 explain crawl denial when not switched to intranet mode
10 years ago
Michael Peter Christen f818f84adb more ipv6 fixes
10 years ago
Michael Peter Christen afd5bd5f5f slightly enhanced Network table computation by using a lazy initialized
10 years ago
Michael Peter Christen 2c2b50e65d refactoring (class name should start with uppercase letter)
10 years ago
Michael Peter Christen bc275dca07 added network history graph image /NetworkHistory.png which can show
10 years ago
Marc Nause ce9368246b Merge branch 'master' of gitorious.org:yacy/rc1
10 years ago
Marc Nause 5603809deb Minor changes:
10 years ago
Michael Peter Christen d8beafba3a fix for values in CrawlProfileEditor table and xml; now the full profile
10 years ago
Michael Peter Christen ec95dfa2e6 fixed crawl profile xml result which did not show the correct crawl
10 years ago
Michael Peter Christen 8c1a89cb34 added another decoration flag to switch off network graphics in crawler
10 years ago
Michael Peter Christen ee27be3399 misc bugfixes (concurrency, memory protection)
10 years ago
Michael Peter Christen 9b1958e8ca more ipv6 bugfixes
10 years ago
Michael Peter Christen 7817fc50c9 added a high cpu cycle monitor to PerformanceQueues
10 years ago
Michael Peter Christen 5082feb103 less volume for effect sounds
10 years ago
Michael Peter Christen e8392e2ff2 fix for local search
10 years ago
Michael Peter Christen 0bfc69b29b more ipv6 bugfixes
10 years ago
Michael Peter Christen a27563e5c3 removed the atmo sound clips because they had been too large
10 years ago
Michael Peter Christen 883622306e Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
10 years ago
Michael Peter Christen 97995a1dd9 fix for remote search process
10 years ago
Michael Peter Christen 0843b12ef3 ipv6 fix: avoid that shrinked own ip set is overwritten with (non-valid)
10 years ago
Michael Peter Christen 92c5d97486 fix for bad node flag setting with IPv6
10 years ago
orbiter c27bad9326 more ipv6 fixes
10 years ago
orbiter cddf884bc4 Merge branch 'master' of git@gitorious.org:yacy/rc1.git
10 years ago
Michael Peter Christen 460858fb22 more ipv6 fixes
10 years ago
Michael Peter Christen 5cef88a315 argh.. adding missing java class for latest audio feature
10 years ago
Michael Peter Christen 74957f3760 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
10 years ago
Michael Peter Christen 2a052f446a Added an experimental audio feedback system.
10 years ago
Marc Nause 1e6e69bc40 Finished implementation of UPNP:
10 years ago
Michael Peter Christen d0358e568b Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
10 years ago
Michael Peter Christen e1bc768f9d more IPv6 bugfixes
10 years ago
reger 59c6532a65 add link extraction to pdfParser
10 years ago
reger aa2e15d846 allow url parameter in worktable apicall
10 years ago
orbiter f3a12801f0 Merge branch 'master' of git@gitorious.org:yacy/rc1.git
10 years ago
orbiter d93325a578 lazy handling of process_sxt field (part of postprocessing)
10 years ago
Michael Peter Christen b31db00010 toString fixes
10 years ago
Michael Peter Christen 961f06c0b6 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
10 years ago
reger 209e0f2fe8 allow url parameter in worktable apicall
10 years ago
reger b5ca20de15 preserve content_type (mime) if supplied in preference of construct in from file type.
10 years ago
reger fe9f1c594e fix char encoding parameter in UrlProxy
10 years ago
reger b0c87d8240 fix image search expand box, cut-off of 2nd capture line height
10 years ago
Michael Peter Christen 2c2ed8bf4e typo in javadoc
10 years ago
Michael Peter Christen 528f583d72 ipv6 fixes
10 years ago
Michael Peter Christen 6ee5b4352d Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
10 years ago
Michael Peter Christen 247e626083 IPv6 host parsing bugfixes
10 years ago
reger fb1fcc2b03 handle noarchive tag, skip writing page to cache
10 years ago
Michael Peter Christen fe917deb2d when pinging other peers, be able to select the right IP option
10 years ago
Michael Peter Christen 65e6ae52fb IPv6-enhanced Network monitoring page
10 years ago
Michael Peter Christen 3073c69aee Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
10 years ago
Michael Peter Christen 6491270b3a large IPv6 redesign of peer ping methods!
10 years ago
reger eaccce3467 added metadataImageParser for tif and psd (Photoshop) images.
10 years ago
reger a69f5358ff use javax ImageIO getReader to add supported image extension/mime
10 years ago
reger 8b1ce49ee6 remove unused variable timeout
10 years ago
reger 48aed15c48 skip loader wait cycle on concurrent access in nocache configuration.
10 years ago
Michael Peter Christen 67cd4c37bd activated the new apk parser which was already ready but not included in
10 years ago
orbiter a922b122a3 added a hack to forward solr search results from an external attached
10 years ago
Michael Peter Christen 025516f682 fix for crawl limit for number of pages fail
10 years ago
Michael Peter Christen 2645dc816a added warning for not well-formed postprocessing queries
10 years ago
Michael Peter Christen 437ce3b8a0 added internal api for partial updates to Solr
10 years ago
orbiter 3ac31614a3 added option to reverse-sort YaCy tables (internal API change only)
10 years ago
Michael Peter Christen 6d3d4c4ea6 changed the concurrent enumeration of query results in such a way that
10 years ago
Michael Peter Christen ad35d9294f added a 'stats' table which records some peer statistics twice every
10 years ago
reger 8284ea751a catch TimeoutException during ping and do not delete yacy.conf during prereadconfigfile
10 years ago
reger ffa7c7116f better fix for NPE in image search
10 years ago
Michael Peter Christen 759e7d9538 fix for http://forum.yacy-websuche.de/viewtopic.php?p=30720#p30720
10 years ago
Michael Peter Christen bf18a39d0e replaced warning with info
10 years ago
Michael Peter Christen f1032fb8fe more enhancements to image search in case that a restriction to a single
10 years ago
Michael Peter Christen 475125f9d7 hack to get more results when doing a remote site search
10 years ago
Michael Peter Christen 81f9b34da7 increased ability to search for all images on a single server within
10 years ago
Michael Peter Christen 2c26013c50 better contentdom abstraction
10 years ago
Michael Peter Christen 6a8fb8190b changed default value for maximum number of connections to 50
10 years ago
Michael Peter Christen ca8b2bf099 removed www and welcome servlet, these had been demo servlets and are
10 years ago
reger 03a7a29db3 limit OAI import urn resolver try for Deutsche National Library
10 years ago
Michael Peter Christen 0838326a76 changed error message, see http://mantis.tokeek.de/view.php?id=439
10 years ago
reger b5e0f70197 - remove repositoryPath post from ConfigBasic (obsolete)
10 years ago
reger 8931e14514 fix NPE in image search
10 years ago
Michael Peter Christen 1735dbc9d9 enhanced image search: bugfixes and performance enhancements
10 years ago
Michael Peter Christen ebd0be2cea fixes and speed updates for search process
10 years ago
Michael Peter Christen 7611bf79bd Merge branch 'master' of gitorious.org:yacy/icewindxs-rc1
10 years ago
Michael Peter Christen 524bedc00a fixed text in startup tray icon and added shutdown icon during shutdown
10 years ago
Michael Peter Christen 4709d8417c npe fix for non-tray users
10 years ago
orbiter 5b5635e187 replaced font for boot tray icon with image and added some more images
10 years ago
orbiter aa6cdc4ab5 speed-up of start process if remote DNS waits for timeout
10 years ago
orbiter 40b3977c21 added an animation of the tray icon during the boot phase of YaCy.
10 years ago
Michael Peter Christen ec6082c872 very bad language detection hack fix hack
10 years ago
Michael Peter Christen 39615de3f9 adding the buffer size is not wrong but may cause confusing information
10 years ago
Michael Peter Christen 395edec6f1 changed strategy to count the number of documents: get the max of
10 years ago
Michael Peter Christen e87dc08c0d set the correct fail time in error docs
10 years ago
Michael Peter Christen cfb20bc0ce removing the [] for ipv6 addresses may be a bad idea..
10 years ago
orbiter b6d57f06eb enhanced the apk parser (up to being production-ready).
10 years ago
Michael Peter Christen a7dd89c4de changed method to write the citation index: do not catch up references
10 years ago
Michael Peter Christen 57ce7eeff3 fixed localhost authorization and replaced the adminRealm with an info
10 years ago
orbiter f318d7c285 enhanced date-ordered ranking
10 years ago
reger a6891ff7f8 fix Querygoal.parse exception on +/-null-term
10 years ago
reger c7335318eb remove unused legacy procedure from httpserver
10 years ago
Michael Peter Christen eab0d3e1a9 bugfix for wrong lock display, see
10 years ago
orbiter 49d4f95faf bugfix to latest commit
10 years ago
orbiter 68211f8244 enable Crawler_p servlet if a rss feed or a wiki dump import was
10 years ago
orbiter a65df4ce7e do not push noindex errors into log if in intranet mode. noindex
10 years ago
orbiter 688c6d8954 Merge branch 'master' of git@gitorious.org:yacy/rc1.git
10 years ago
orbiter 4ae7aead28 addon to latest fix
10 years ago
Marc Nause 2af56fa37d Improved UPnP. (still not perfect)
10 years ago
orbiter b3ebd38079 removed the HTDOCS repository concept because the concept to host files
10 years ago
reger 1fdcc2d67b change seedfile upload ip check to allow intranet ip in intranet mode
10 years ago
reger e31b0e6d67 - update javadoc Seed.getIP
10 years ago
reger 350c6b8250 in IntranetMode allow intranet hosted seedlist with Network_Domain "any"
10 years ago
orbiter d68438c3d9 make sure that the postprocessing background thread never dies by any
10 years ago
orbiter b4f2a1db6e added a unlock icon for all protected pages that are unlocked because
10 years ago
reger ea6c9e9b07 reduce mem buffer overhead for gap files during r/w
10 years ago
reger e88537522d allow single quote " ' " in query
10 years ago
orbiter 487021fb0a snippet computation update
10 years ago
orbiter 1c2f1f233a Merge branch 'master' of git@gitorious.org:yacy/rc1.git
10 years ago
reger 5a4995ded3 fill solr rss writer dc:subject tag with keyword content
10 years ago
orbiter 927aaa95a6 concurrency bugfix
10 years ago
orbiter c9e593cf78 removed warnings
10 years ago
reger 7584352e7b use more predefined Solr query parameter constants
10 years ago
reger f9db5dd6c5 reduce doublecontent check document (prevent out of memory)
10 years ago
reger e9eae45b55 simplify rssreader and improve atom feed link extraction
10 years ago
reger a8508417d1 catch NPE during crawl (OAI import)
10 years ago
reger 3dde94422f center searchevent lines on network graph
10 years ago
Michael Peter Christen 3860711aef fix for possible interruption of concurrent queries
10 years ago
Michael Peter Christen 6344718f8b reducing the concurrent query stack size and reduced concurrency of
10 years ago
Michael Peter Christen eca9380e3d bugfix for crawler double-check: if an url is redirected, the
10 years ago
Michael Peter Christen 9ac0c93f17 fix for subpath crawl filter
10 years ago
Michael Peter Christen 66106bdaf0 fix for crawler attribute maxdompages
10 years ago
Michael Peter Christen 49d91b94c3 npe fix in crawler
10 years ago
Michael Peter Christen b7183a7321 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
10 years ago
reger ea2e627662 fix ConfigAccounts del user with uppercase letter in name
10 years ago
Michael Peter Christen c465b791af typo
10 years ago
Michael Peter Christen 191ec8c82a added concurrency to postprocess rewrite process
10 years ago
Michael Peter Christen a1e8bdd5e9 log ppm instead of docs/second
10 years ago
Michael Peter Christen cc0ded7abd set process type of web graph according to fields as defined in the
10 years ago
Michael Peter Christen 12fb9d7cd1 log postprocessing constraints in case that postprocessing is not
10 years ago
Michael Peter Christen 3c23b89823 less logging
10 years ago
Michael Peter Christen a0c53174c5 better solr query logging to detect unnecessary sort requests for more
10 years ago
Michael Peter Christen 338f574bdc no sorting if http/www unique fields are not demanded (makes query
10 years ago
Michael Peter Christen 1609763be5 toString fix
10 years ago
Michael Peter Christen b983e68254 more retries, less sleep
10 years ago
Michael Peter Christen 1503ba7794 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
10 years ago
reger 8f77719091 fix "Ljava.lang.String" in crawl queue anchor name
10 years ago
Michael Peter Christen 0ceeceb35e more logic on Solr queries; usage of the query terms in postprocessing,
10 years ago
orbiter 38864ae004 Merge branch 'master' of git@gitorious.org:yacy/rc1.git
10 years ago
orbiter 4099296b45 added new classes which shall reduce call overhead to Solr (stub)
10 years ago
reger d0c02e1de7 adjust rss lat/lon to double
10 years ago
orbiter 3491ab4c38 removed unused images from webgraph edge computation
10 years ago
orbiter 2371d6b8db target linktexts must be string to enable search facets on these fields
10 years ago
Michael Peter Christen 001e05bb80 do not store failure of loading of robots.txt into the index as a fail
10 years ago
Michael Peter Christen 05d58e4df0 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
10 years ago
Michael Peter Christen 98f45c9032 fix for image alt attachment to AnchorURLs in html parser.
10 years ago
orbiter 22ce4fb4dd better error handling for remote solr queries and exists-checks
10 years ago
Marc Nause 9df14fc126 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
10 years ago
Marc Nause 477be17c51 Replaced old UPNP library with Weupnp. UPNP should
10 years ago
orbiter 738989aab7 reverted commit f94c91315b because the
10 years ago
orbiter e9163e7e10 fix for malformed hostpath names in crawl balancer
10 years ago
Michael Peter Christen c115f3869c enhanced snippet computation and test method in ViewFile
10 years ago
reger 6c10b59f3e move bootstrap peers test systems to its test class
10 years ago
orbiter 1027f3d04a fix for the usage of ready-prepared solr queries, some queries are
10 years ago
Michael Peter Christen f94c91315b if the webgraph is used, then use it also for reference computation to
10 years ago
Michael Peter Christen 6e1dc444c3 added a snippet test function in ViewFile: you can now search for a
10 years ago
orbiter 4b06adb751 fix for file urls
10 years ago
orbiter 08409ec680 no idea why the words max was an ordered one. This change increases speed
10 years ago
reger e5854a5cdb fix localhost link to opensearchdescription.xml
10 years ago
Michael Peter Christen b44626e55b fixed target_alt_t in webgraph
10 years ago
Michael Peter Christen 504327b15c fix for condition for writing the webgraph
10 years ago
Michael Peter Christen 542c20a597 changed handling of crawl profile field crawlingIfOlder: this should be
10 years ago
Michael Peter Christen 4eec1a7452 refactoring (change Metadata name of load time data structure to avoid
10 years ago
reger c95ba52cf0 improve logexception info
10 years ago
orbiter e441831a24 reverted toString() change in AnchorURL to prevent mistakenly used
10 years ago
reger 47f201a6b8 Add Solr default query fields (&qf) to select servlet
10 years ago
reger f96cfdc84d prevent array out of bound exception on getRankingProfile(x)
10 years ago
reger 5f5fb4ecdc remove unused static (RSS)search from protocol
10 years ago
reger 7c1706d83a use CRLF in generated bat command scripts for windows
10 years ago
reger a2cb366b25 Combine /heuristic search modifier with opensearch configured targets
10 years ago
Michael Peter Christen 2de159719b added an option to set 'obey nofollow' for links with rel="nofollow"
10 years ago
Michael Peter Christen bf1b6b93e7 do not write CR values to webgraph if no CR values are computed
10 years ago
Michael Peter Christen e039e78210 small bugfixes
10 years ago
Michael Peter Christen 32a2ff925c Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
10 years ago
Michael Peter Christen d07cdd8c3b added SolrCloud access mode and configuration
10 years ago
Michael Peter Christen 8514bffc22 enhanced postprocessing status report
10 years ago
reger b24572f304 fix GSA filter query assignment
10 years ago
Michael Peter Christen b5fc2b63ea removed exist() retrieval functions from error cache and replaced it
11 years ago
Michael Peter Christen 62c72360ee cleanup of checkAcceptanceInitially in CrawlStacker, should avoid
11 years ago
Michael Peter Christen dd5cdfe212 reverted filter query hack, it did not work
11 years ago
Michael Peter Christen b5d78ba156 reduced number of solr queries during crawling
11 years ago
Michael Peter Christen 5326970d6c enhanced solr queries for single document extraction
11 years ago
Michael Peter Christen 525575bd97 added debugging of filter queries in thread dump thread names
11 years ago
Michael Peter Christen f319ef268f testing filter queries instead of queries to retrieve documents by id
11 years ago
Michael Peter Christen fd87fa1613 removed more unnecessary exist-checks in ErrorCache
11 years ago
Michael Peter Christen f2b476e08b don't do a double check to solr for failed documents if they are not
11 years ago
Michael Peter Christen 06ab72d1af enhanced crawler host round-robin strategy
11 years ago
orbiter dab9a0786a Merge branch 'master' of git@gitorious.org:yacy/rc1.git
11 years ago
orbiter 51bf5c85b0 Renamed the transmission cloud to buffer in dispatcher since the name
11 years ago
Michael Peter Christen a694b6a8fc another fix for unique field computation
11 years ago
Michael Peter Christen fb3dd56b02 fix for processing of noindex flag in http header
11 years ago
Michael Peter Christen b0d941626f fixed bugs in canonical, robots and title/description unique calculation
11 years ago
reger d9472d043a cleanup older unused classes
11 years ago
reger 665e12f88e move startup time from old serverCore to switchboard (most used here)
11 years ago
reger 336425912a remove unused localSearchThread from SearchEvent
11 years ago
reger 32bd2a61c1 add local ip to AbstractRemoteHandler local hostname cache
11 years ago
Michael Peter Christen f3a6b6e21e fix for bad URL decoding
11 years ago
Michael Peter Christen 1092e798a5 fixed double content postprocessing
11 years ago
Michael Peter Christen aee5b108e5 added linkScraperParser, a parser which ignores the text like the
11 years ago
reger 2b8cc5832c fix seek error for 0 file size records file
11 years ago
reger 2ba394333f fix Crawler HostQueue release of stackfile
11 years ago
reger 40133ba2d0 fix NPE in Condenser,
11 years ago
orbiter 59160984cc timeline performance update
11 years ago
orbiter 54bea96e67 Merge branch 'master' of git@gitorious.org:yacy/rc1.git
11 years ago
Michael Peter Christen 841cc77391 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
11 years ago
Michael Peter Christen e09218129c remove check for local solr. This check was made during a time when Solr
11 years ago
orbiter 2073e69034 fix for long periods in timeline
11 years ago
reger 1f94df29e7 fix NPE in solr rss where snippet contains only the title text
11 years ago
Michael Peter Christen 09dcdb9b19 update to solr 4.9.0
11 years ago
Michael Peter Christen 1cd4b2e8be Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
11 years ago
Michael Peter Christen 8c52f0651b refactoring of AccessTracker events & timeline fix
11 years ago
reger 431a5f9c4e added test case for TextSnippet,
11 years ago
Michael Peter Christen 5b94a257ce no timeout for large reference collections
11 years ago
Michael Peter Christen f5b817bac4 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
11 years ago
reger cb2c17d236 extract author and keywords in .doc and .ppt parser
11 years ago
reger a5707cd2eb enable proper Author navigator
11 years ago
Michael Peter Christen 74206a10c7 refactoring
11 years ago