Commit Graph

563 Commits (474e29ce4a831550d06a4fd3d6354fd41e32f121)

Author SHA1 Message Date
orbiter 657585fe0d network functions for robinson peers: server-side protection
18 years ago
orbiter 89c1511738 - added new Network Configuration menu, can be found in basic settings
18 years ago
orbiter ca79362b9d disabling auto-setting of remote crawl performance
18 years ago
michitux 4990909178 Some bugfixes, new layout/style for image search results:
18 years ago
orbiter 78d04bcbcf fixed bug in search statistics
18 years ago
orbiter b79b4082e2 completed search exclusion:
18 years ago
orbiter 06a7978730 moved url pattern matching for search to better place
18 years ago
orbiter 159bd0cab5 miscellaneous; b.o. fix for http://www.yacy-forum.de/viewtopic.php?p=33914#33914
18 years ago
orbiter 40c14a4f0e - better implementation of search query properties
18 years ago
orbiter 5c3afb3202 added option to configure a path to a secondary index location.
18 years ago
orbiter 2cb16824e3 removed support for old database structures.
18 years ago
orbiter 3688ec33e5 release 0.51
18 years ago
orbiter 6b9eea3932 - removed differentiation between longTitle and shortTitle; this cannot be used for search results,
18 years ago
orbiter a738b57b31 added author tag to indexing content
18 years ago
orbiter 861f41e67e redesigned NURL-handling:
18 years ago
orbiter 9f929b5438 better snippet handling in case of snippet load fail
18 years ago
orbiter a5d668c0c6 added speed-buttons for easy performance setting
18 years ago
orbiter 5b0a84ce09 fix for synchronization deadlock with flushMissNameCache.
18 years ago
orbiter 6ad39bae1e fixed shutdown problem
18 years ago
orbiter d755a8026d - better OOM protection
18 years ago
orbiter 33f97cff7a changed startup initialization sequence slightly
18 years ago
karlchenofhell 03c5906ae7 - minor bugfixes for url-fetcher & http://www.yacy-forum.de/viewtopic.php?t=3646
18 years ago
orbiter 1cba31de43 redesigned ram organization for database caches
18 years ago
karlchenofhell 88245e44d8 - improved version of robots.txt (delete your old htroot/robots.txt before updating):
18 years ago
orbiter 10a3c20b8d some more enhancements to R/W Head path optimization
18 years ago
orbiter f4cfd19835 second Generation of collection R/W head path optimization:
18 years ago
hydrox cb89c74d52 *) added blog-comments
18 years ago
orbiter f7803a6ce4 enhanced crawl balancer
18 years ago
orbiter c3e8c23f5d fix for 'CANNOT FETCH ENTRY: hash is null' bug
18 years ago
orbiter 30d79d69a6 fix for wrong display of search statistics
18 years ago
orbiter d25caa07bf redesigned some parts of http authentication
18 years ago
orbiter 819ff21c92 fixed QPM output
18 years ago
auron_x 89e7af037a *) used more switchboard-vars instead of config-vars
18 years ago
orbiter 306c50ac40 QPM (queries per minute) statistic stub
18 years ago
karlchenofhell 9f74b128dd - added many more commented constants (please use constants rather than i.e. config-setting strings directly)
18 years ago
orbiter f25c0e98d1 - replaced String by StringBuffer in condenser
18 years ago
karlchenofhell d311e258f8 - adjusted LogStatistics to nano-seconds
18 years ago
orbiter f3f99b19c6 extended search statistics
18 years ago
orbiter c0851ee943 refactoring: moved and renamed de.anomic.data.searchResults to plasma package
18 years ago
allo c39dda2374 finished refactoring of searchtemplates.
18 years ago
allo 35039982da refactoring of search process: store results in a searchResults structure. At the moment, its just stored in it, and read from it again.
18 years ago
allo 29aa7031d3 workaround for the snippets
18 years ago
(no author) fe72b772cf added a monitor page for search requests
18 years ago
karlchenofhell b873ad51ab - fix for http://www.yacy-forum.de/viewtopic.php?t=3369
18 years ago
borg-0300 1aa74bbd2b update for last commit
18 years ago
karlchenofhell 35fb671721 - updated DetailedSearch and ViewFile
18 years ago
borg-0300 d2be3c674d wrong cache values fixed
18 years ago
orbiter 9b726ac366 release 0.50
18 years ago
orbiter 036a0c828e fix for auto-configuration of crawler thread memory
18 years ago
orbiter c48374d14a new memory limit computation for indexing queue
18 years ago
orbiter 0a050bc043 enhanced ranking
18 years ago
orbiter 61798f0ae6 added option to distinguish between text crawl and media crawl
18 years ago
orbiter febe6b114a design update of crawler monitor
18 years ago
orbiter 7ff86d6ba6 - image search now shows thumbnails (in bad order, but it works)
18 years ago
orbiter ee3d91cb6b print-out of links that result from constraint-filtering
18 years ago
orbiter 1377c53aa3 extraction of media links from search results
18 years ago
orbiter bf0d820659 - added correct flagging of word properties
18 years ago
orbiter a603c4d5e8 more code simplifications
18 years ago
orbiter 9a85f5abc3 cleanup
18 years ago
orbiter 109ed0a0bb - cleaned up code; removed methods to write the old data structures
18 years ago
orbiter 052f28312a removed assortments from indexing data structures
18 years ago
orbiter 2372b4fe0c release 0.49
18 years ago
orbiter ad1e4aa88e added selection of audio, video, image and application resources
18 years ago
orbiter 7cc4cec9c9 bugfix for assertion bugs documented in
18 years ago
orbiter ceb9e3aa17 - enhanced parser: collection of audio, video, image and application links
18 years ago
orbiter 30888e7a2f implementation of search constraints
18 years ago
orbiter f4b547dc13 limited index transfer to peer with version 0.486
18 years ago
orbiter e3d75f42bd final version of collection entry type definition
18 years ago
orbiter c9364246cc introduced new RWI-Object.
18 years ago
orbiter 497428c8ec refactoring
18 years ago
orbiter 76fceb9997 refactoring
18 years ago
orbiter bb7d4b5d5e refactoring to prepare new RWI entry object
18 years ago
orbiter ba967c4875 - bugfixes and debug code
19 years ago
orbiter ee4715a21c - more asserts
19 years ago
orbiter 114a76a86e - added flag to urlhash that shows that domain is a local domain
19 years ago
orbiter 8fdefd5c68 generalization of payload definition of index storage
19 years ago
hydrox 7e8669b15c *) added possibility to "recycle" a DHTChunk that failed to transfer.
19 years ago
auron_x 194d42b6a7 *) changed PPM-calculation to be more accurate
19 years ago
orbiter 2a9d868f6d - removed object cache from kelondroTree
19 years ago
orbiter 06854988da - full integration of new LURL database in INDEX
19 years ago
orbiter b79e06615d - added new LURL.Entry class for next database migration
19 years ago
theli 3d152bfe43 *) Logging message added
19 years ago
orbiter 77a59a115d refactoring of indexing methods
19 years ago
orbiter a5dd0d41af - refactoring of plasmaCrawlLURL.Entry to prepare new Entry format
19 years ago
orbiter 6396f5971e bugfixes and migration attempt toward new kelondroFlex db
19 years ago
orbiter c8f3a7d363 added snippet-url re-indexing
19 years ago
orbiter 0f10bdde22 more generic cache methods
19 years ago
hermens 440c6ee657 Implement alternative htcache layout
19 years ago
orbiter 43614f1b36 bugfix in collection index. the index for collections was not created correctly
19 years ago
theli a9a0f51303 *) suppressing InterruptedException errormessage
19 years ago
theli f17ce28b6d *) plasmaHTCache:
19 years ago
orbiter dbc2e039bb added time-out option parameter to call hierarchy
19 years ago
orbiter 00746ca232 identified and fixed search performance problem caused by
19 years ago
orbiter 310f1c41cd added option to see ranking scores in surftipps
19 years ago
theli a2e3095044 *) Bugfix. Add missing plasmaParserDocument.close() calls
19 years ago
theli cd5f349666 *) Better handling of large files during parsing
19 years ago
orbiter df1629b05a - code cleanup
19 years ago
hermens 3f5a4153a0 Make Peers more receptible to transferred indexes
19 years ago
theli b6c7b91582 *) Parser now throws a ParserException instead of returning null on parsing errors (e.g. needed by snippet fetcher)
19 years ago
borg-0300 42173462f5 rename cutUrlText to shortenURLString;
19 years ago
theli cf6acff2c2 *) Bugfix. htmlFilterInputStream document analysis did not work properly for documents smaller than the
19 years ago
theli 97d2a08ef1 *) restructuring needed to support parsing of documents using various charsets
19 years ago
orbiter 3aac5b26da - added automatic tag generation when a web page from the search results is added
19 years ago
theli d0a5a53789 *) changes needed for multi-language support
19 years ago
theli b0e8ff6eda *) some TODO markers for UTF-8 problem
19 years ago
orbiter c89d8142bb replaced old 'kCache' by a full-controlled cache
19 years ago
orbiter 75b198bc02 - updated references to indexContainer
19 years ago
theli a0ddf2ec11 *) AbstractCrawlWorker.java: delete already downloaded data on crawling error
19 years ago
orbiter 64bed59ee8 enhancements to ranking
19 years ago
orbiter a8bc768206 enhancements to ranking evaluation
19 years ago
orbiter 96c6e4e322 - enhancements to detailed search page
19 years ago
orbiter 9340dbb501 fixed all possible problems with nullpointer exception for LURLs
19 years ago
hermens ff4362b02d some more fixes for new plasmaCrawlLURL.load behavior
19 years ago
orbiter 4866868c0e added write cache for LURLs
19 years ago
theli dae763d8e3 git-svn-id: https://svn.berlios.de/svnroot/repos/yacy/trunk@2495 6c8d7289-2bf4-0310-a012-ef5d649a1542
19 years ago
theli 7a35b8e237 *) direct access to responseheaders of sbQueue.Entry removed to make it more http independent
19 years ago
theli ffbf416e76 *) direct access to requestheader of htCache.Entry removed to make it more http independent
19 years ago
theli 3870d615e3 *) setting htCache.Entry fields to private
19 years ago
theli 393a7d10be *) setting htCache.Entry fields to private
19 years ago
theli ab5a9bee66 *) adding some copyright headers
19 years ago
theli 9ded4e8d5a *) Bugfix for name resolution in proxy mode
19 years ago
theli 09b106eb04 *) next step of restructuring for new crawlers
19 years ago
theli b4acbdaa97 *) better handling of server shutdown
19 years ago
theli f3ac4dbbb9 *) better handling of server shutdown
19 years ago
orbiter 18b6876860 new cache flush configuration settings
19 years ago
orbiter 985dcbde7f changed some parameters that may cause better memory usage and more indexing speed
19 years ago
orbiter b7f4a1521b added options to switch on or off the kelondroFlexTable for NURL, EURL and PreNURL
19 years ago
orbiter c26da4893b turned back NURL usage of kelondroTree, kelondroFlexTable still has problems with deleted entries
19 years ago
theli f80f776b89 *) Trying to solve NullpointerException problem in function addURLtoErrorDB
19 years ago
orbiter 1ce3c22761 better memory control:
19 years ago
orbiter 39b4c26bdc more memory control:
19 years ago
orbiter eb633c0a4f server threads must now supply a method that can be called in case
19 years ago
orbiter 8418af141a added several consistency checks and small changes
19 years ago
theli eee44be602 *) adding an interface for customized blacklist classes
19 years ago
theli d2e8e76218 *) now it's possible to configure the yacy blacklist separately for dht, search, proxy, crawler
19 years ago
orbiter abf22f6e60 removed url normalform computation from htmlFilterContentScraper.
19 years ago
orbiter 314021453f * more logging
19 years ago
orbiter 80b6c90d54 enhancements to prevent blocking during dht transfer receive
19 years ago
theli 9f298083cd *) adding more urls to the error url
19 years ago
orbiter 279b1d969d Integrated new indexing data structure 'collections' into the main class
19 years ago
orbiter ebc2233092 * implemented (finished) class indexRowSetContainer
19 years ago
orbiter 9183d21f25 renamed new index class to old name
19 years ago
orbiter c4e922885a replaced indexURLEntry by new class that uses a kelondroRow.Entry object
19 years ago
orbiter e357599f92 * fixed problem with indexContainer iteration from RAM:
19 years ago
orbiter 5f72be2a95 some redesign of EURL storage
19 years ago
orbiter e4f1820b58 protection against too long authentication strings in switchboard
19 years ago
orbiter 3879a0ecd0 replaced java.net.URL usage by use of new class de.anomic.net.URL
19 years ago
orbiter 671fd9a5c9 work towards new indexing database structure
19 years ago
orbiter 92f4cb4d73 added option to configure the start-up delay time for kelondro database files.
19 years ago
orbiter 66964dc015 removed high/med/low from kelondroRecords cache control.
19 years ago
allo 67a8c74be3 Fix for dynamic login with static password.
19 years ago
allo ef9eb50c3c fix for adminlogin
19 years ago
allo 6fe2fed87e cookieauth works with static Admin.
19 years ago
theli 4ca0857c0c *) Index transfer now considers the pause time sent by busy peers during
19 years ago
orbiter c75cacda95 added a flex-width-array: this is a table where it is
19 years ago
orbiter 5041d330ce refactoring
19 years ago
orbiter bd057b44dd - automatic setting of peer-does-not-accept-remote-crawl
19 years ago
orbiter cda087f43b - integrated cache miss storage into object cache
19 years ago
theli 61078b3885 *) adding support for delayed shutdown
19 years ago
orbiter 90d569d70f refactoring of index management:
19 years ago
orbiter a930be4ba3 refactoring of index management:
19 years ago
hermens df7e1d9df3 Changes to plasmaURL and subclasses:
19 years ago
orbiter a474669338 start with refactoring of index management
19 years ago
theli f331def5d8 *) Bugfix for distribution. Incorrect behavior if peerCount == selectedCount
19 years ago
theli bcc950c533 *) Bugfix for Index Transfer
19 years ago
orbiter 461548698c configuration of index transfer chunk size
19 years ago
hermens 51e3bb576f Don't increase dhtTransferIndexCount when the last transferred index was smaller
19 years ago
hermens a0ca4c5fb8 Remove a possible race condition between DHT transfer and deQueue
19 years ago
orbiter 60e5aff9fc some enhancements to the remote crawl trigger
19 years ago
orbiter 14d6e476c9 tried to solve some problems with new picture viewer
19 years ago
orbiter f0833b0328 introduced simple search interface
19 years ago
orbiter 83e0e765ec redesigned some parts of the html scanner & parser
19 years ago
orbiter e2e8d0c188 some kind of refactoring of yacysearch:
19 years ago
rramthun 250864406f ...
19 years ago
orbiter 63f39ac7b5 added 3 new crawling steering options:
19 years ago
orbiter 1fc3b34be6 some pre-work (without function yet) to implement:
19 years ago
theli c9e6b5e391 *) check size of indexing-queue and crawler pool before processing remote triggered crawl jobs
19 years ago
orbiter 1f4412a146 adopted isListed to discussed new behavior as discussed (url, getFile)
19 years ago
orbiter 063ef4660a bug?
19 years ago
orbiter 3286b1f498 re-organisation of lurl-creation and -stacking
19 years ago
hydrox 8da13088e9 *)removed multiple DHT_Distribution_Threads
19 years ago
orbiter bcd99fe83e introduced a second RAM cache for DHT transfer
19 years ago
orbiter bae3783d38 added a snippet marking
19 years ago
orbiter f0a38873eb * added yacysearch page with better view on search results
19 years ago
theli 759800f543 *) Bugfix for storeHTCache problem
19 years ago
orbiter 1b9b8922d9 * fixed problems with new basic 1-2-3 configuration (now authentication required)
19 years ago
auron_x 8c6f38fe70 *) added Blog to YaCy (atm not reachable through interface) -> Blog.html
19 years ago
orbiter eaffcfefe2 * added more ranking attributes (without function; this will be added later)
19 years ago
orbiter 3703f76866 - fixed re-search bug: after a search with several words, a second search could not
19 years ago
theli fbbbf5f411 *) remote trigger for proxy-crawl
19 years ago
orbiter 1d8ca6e082 serialized dhtChunk deletion with indexing
19 years ago
theli 2336f0f013 *) allow pausing/resuming of crawlJob Threads separately
19 years ago
orbiter 60dac4325e serialized indexing with dht selection
19 years ago
orbiter a840755964 moved parts of index transfer logic back to switchboard
19 years ago
borg-0300 64441b1f78 ADDED: yacy.badwords list to filter the topwords
19 years ago
orbiter 2c4e4ae6a2 further refactoring of dht selection, transfer and flushing
19 years ago
orbiter 73dad68cf1 outsourced thelis DHT flush class into own file
19 years ago
theli 42a5f56723 *) Bugfix for broken dht thread configuration
19 years ago
hydrox e2af2a3f45 *) it's now possible to run more than one indexDistribution-Thread
19 years ago
theli 980e986b64 *) Re enabling short cycle for already removed nurl entries
19 years ago