384 Commits (508050f79cb2fac6b14017f9dfc3535faaa71dcd)

Author SHA1 Message Date
Michael Peter Christen ca8b100f96 run the cleanup process even when load is high, do postprocessing even
11 years ago
Michael Peter Christen 6e59ca4ebf removed jena library and all code that depended on jena. When jena was
11 years ago
Michael Peter Christen 931541d198 re-inserted default value re-set button to performance queues and
11 years ago
reger a71718a459 add config value for ssl/https port (default=8443)
11 years ago
Michael Peter Christen be5e808236 - removed hardcoded load-test which is now handled in BusyQueues
11 years ago
sixcooler 40a4030b55 configurable max-load values for YaCy-Threads:
11 years ago
Michael Peter Christen 77531850b5 reverted crawling strategy from latest commit.
11 years ago
reger 97e84439fb adjusted ConfigHeuristic and changed QueryGoal.getOriginalQueryString to .getQueryString
11 years ago
reger 0c754dd794 implemented DIGEST authentication, which is for remote login more secure
11 years ago
orbiter 2ead4e44d9 introduced a new storage path ARCHIVE inside of DATA which will be used
11 years ago
reger fbdd89e198 Merge origin/master
11 years ago
reger 65a2f3d5e7 tweak Jetty credentials to work with YaCy UserDB
11 years ago
Michael Peter Christen ee17bd0b69 added option to attach remote solr servers in read-only mode
11 years ago
Michael Peter Christen 84167adb49 removed unused anomichttpd code after migration to jetty
11 years ago
reger effea4bca0 Merge origin/master into jetty
11 years ago
Michael Peter Christen a16534cb0a tried to fix timeout and connection-lost problems when using an outside
11 years ago
reger f111f30ace Merge origin/master into jetty
11 years ago
Michael Peter Christen 24a052ecb9 removed debug code for existsByIds
11 years ago
Michael Peter Christen 087df05e24 added option to Config_Network_p.html to enable remote search while
11 years ago
Michael Peter Christen 899e7e92b0 added debug code
11 years ago
reger 1437c45383 merge rc1/master
11 years ago
Michael Peter Christen 7f768b42d3 we do not need the load-image flag any more since this is now controlled
11 years ago
reger f017066197 Merge origin/master into jetty
11 years ago
Michael Peter Christen f1bfe64361 integrated startpage to compare_yacy
11 years ago
Michael Peter Christen 9bb7eab389 hacks to prevent storage of data longer than necessary during search and
11 years ago
Michael Peter Christen 1b4fa2947d - fixed a problem which occurred when a document was not recognized with
11 years ago
reger f46c723398 allow to choose used http server, YaCy-Anomic or Jetty
11 years ago
Michael Peter Christen 820b896146 Replaced the inframe loading from yacy.net for donations with the
11 years ago
Michael Peter Christen 90c8577840 enhanced ranking; patches to replace old ranking
11 years ago
orbiter 8ac2e8c8c9 added location navigator which causes that the image to the map search
11 years ago
Michael Peter Christen 69f85265e1 added an option to put image links to the crawl queue and handle these
11 years ago
Michael Peter Christen 765943a4b7 Redesign of crawler identification and robots steering. A non-p2p user
11 years ago
orbiter 944ae5686c added donation plea to the about box as default (you can replace this in
11 years ago
orbiter bf0ad04e1b apply load limitation also to dht-in
11 years ago
orbiter f50b596e0b do not run dht distribution if system load is over 2.5
11 years ago
orbiter e24016e30a added the property federated.service.solr.indexing.timeout to yacy.init
11 years ago
Michael Peter Christen 2716dfc46c increase crawler speed by reduction of the busysleep time
12 years ago
Michael Peter Christen 57ffdfad4c added a crawl option to obey html-meta-robots-noindex. This is on by
12 years ago
orbiter 7c6ccc426c set crawlingQ to true by default because most webpages are dynamic and
12 years ago
Michael Peter Christen fd1776a3b0 added a new 'Citations' function: each search result item can now be
12 years ago
Michael Peter Christen 1762911f57 added synchronizations and timeouts in solr api; missing
12 years ago
Michael Peter Christen 6115bef335 added a 'greedy learning' mechanism which will cause that a 'fresh'
12 years ago
Michael Peter Christen 856e5c42ae the line "Web Search by the People, for the People" is more generic for
12 years ago
Michael Peter Christen f7a4377812 usage of the new normalized link popularity CRn as default ranking
12 years ago
Michael Peter Christen eb9d0ba5b1 ranking and boost function update, small bugfixes, better default search
12 years ago
Michael Peter Christen a8dc4346e8 default configuration of MMapDirectoryFactory for solr, increased lock
12 years ago
orbiter 4baa0d4a97 Added a default keystore for ssl encryption of the YaCy web interface.
12 years ago
Michael Peter Christen cc90f82dbb increased default proxy client timeout to one minute
12 years ago
Michael Peter Christen d05dc07cff setting of new default values for ranking
12 years ago
Michael Peter Christen 97775fbebc fixed ranking for add-function queries: this did not work. The option
12 years ago
Michael Peter Christen 27d6222880 added new field host_extent_i which, after a crawl and postprocessing,
12 years ago
Michael Peter Christen 2d36a7eaf5 - do not create a new query for all remote peers
12 years ago
Michael Peter Christen 4af0839be2 use appropriate ranking for each search situation:
12 years ago
Michael Peter Christen addba047e2 changes in ranking computation
12 years ago
Michael Peter Christen 25300913fa fixes to search debugging after testing with the different search
12 years ago
orbiter b1140e3d82 added debug switches for detailed search testing
12 years ago
Michael Peter Christen 0d7b4bc891 better protection against OOM during search flush and fixed missing
12 years ago
Michael Peter Christen 3b1d9dc884 made index storage from DHT search result concurrently. This prevents
12 years ago
Michael Peter Christen 56d5946a59 - added flags in IndexFederated_p.html to switch on or off the webgraph
12 years ago
Michael Peter Christen 91a0401d59 introduced a second core named 'webgraph'. This core will hold the link
12 years ago
Michael Peter Christen 4111606654 removed the commitWithin attribute because that is not the way how the
12 years ago
Michael Peter Christen 4735bd47f4 - changed solr commit call and added an optimize option. Since Solr
12 years ago
reger 168b1d130d Adding heuristic to get search results from configured systems which support opensearch specification
12 years ago
reger e9e0d63897 Add config option to show HostBrowser link in search result
12 years ago
Michael Peter Christen 98819ec3d9 use solr boost configuration to select search fields. At this time it is
12 years ago
Michael Peter Christen 72f165d58b added a Boost class which stores solr query boost values. The class can
12 years ago
sixcooler 2d972f289a rise commitWithinMs to default-value from SwitchBoard
12 years ago
Michael Peter Christen 42e525ca9a enhanced the host browser
12 years ago
sof 5cb244b79b Merge remote branch 'origin/master'
12 years ago
apfelmaennchen 88b062210c Added a parser for audio file tags (e.g. ID3 tags for MP3 files) based
12 years ago
Michael Peter Christen 3d33a5bdf6 turned the synonyms_t Text field into a multi-valued String field
12 years ago
orbiter a55e77a115 added twitter search heuristic
12 years ago
Michael Peter Christen b2b516cc3e added a collection attribute to crawls and searches:
12 years ago
cominch dc468dad01 add content control features for custom filter lists
12 years ago
Michael Peter Christen af764c106c re-activated audio and video search because they obviously work (!)
12 years ago
Michael Peter Christen 23226676c6 FOR THE BRAVE.. this is a forced migration to solr which is now ready
12 years ago
cominch e2119f4e76 augmented browsing: replace htmlparser by jsoup, which is more stable
12 years ago
Michael Peter Christen 826967513b changed options in IndexFederated_p to switch on/off parts of the index
12 years ago
Michael Peter Christen 0301aba1e9 removed unused method parameters
13 years ago
reger 067728bccc add search result heuristic. adding a crawl job with depth-1 for every displayed search result (crawling every external linked page of displayed search result pages)
13 years ago
Michael Peter Christen 9116013c64 - allow lazy initialization of solr value (if using 'lazy', then no
13 years ago
Michael Peter Christen c03d306afa shorter autocommit time (now: 1 second) to prevent that user cannot see
13 years ago
Michael Peter Christen 3fd4a01286 added option to record urls that are forwarded to the solr index
13 years ago
Michael Peter Christen 8dd469b9dd added option to configure the autocommit delay time of solr on-the-fly
13 years ago
Michael Peter Christen b9dfca4b0a - fixed IndexFederated Servlet / an embedded Solr can now be selected
13 years ago
Michael Peter Christen 8738336408 set Xms lower than Xmx
13 years ago
Michael Peter Christen 96f6a5869f more robust OAI-PMH client (large time-out, three re-tries). OAI-PMH
13 years ago
Michael Peter Christen 6d17686258 made triplestore persistent by default
13 years ago
cominch 3c255c025b Show tags in search results (if activated in ConfigPortal_p.html)
13 years ago
Michael Peter Christen a5cdfb91de - fixed Cache link (below snippet)
13 years ago
Roland 'Quix0r' Haeder af5a597e47 Scroogle is not coming back, remove dead code
13 years ago
cominch 90512640bf Added config switches for custom parser
13 years ago
cominch 5d20cd324a Add Triplestore and RDF query interface
13 years ago
Michael Peter Christen 41c02cb10e - less restrictions for usage of Table RAM copy
13 years ago
Michael Peter Christen 8002fd2578 use less cache space since a large cache would cause more memory usage
13 years ago
Michael Peter Christen 5aee19daa4 added show from cache in search results (not yet finished)
13 years ago
Michael Peter Christen 0d32a766ed relax verify attribute for search widget to make it faster:
13 years ago
Michael Peter Christen db9d81cb7a ups
13 years ago
Michael Peter Christen e7e381d110 added configuration to switch off redirection following in crawler
13 years ago
Michael Peter Christen 99c74699de removed scroogle (scroogle is dead)
13 years ago
Michael Peter Christen 4c5edab1ec added option to have exception search result windows
13 years ago
Michael Peter Christen 696ee5fc16 removed pdf from default parser deny list
13 years ago
Lotus c73af39e54 refactoring of tray icon class,
13 years ago
Michael Peter Christen 0bcef2d156 added feature as requested in
13 years ago
Michael Christen 17f962fceb translator updates:
13 years ago
Michael Christen c715d19c09 fixes for dependency on svn
13 years ago
Michael Christen f62e6fb438 less frequent DHT distribution to reduce the load a bit on every peer
13 years ago
Michael Christen 9dbc93613e now that the whole world knows that we actually do p2p and not
13 years ago
orbiter f9216e388c - faster ping to clean up old peers faster
13 years ago
orbiter ac5bda205f - removed lower page navigation (it never looks nice)
13 years ago
orbiter c659310e89 - removed option to search for audio, video and applications. These things are still experimental and should not be shown to new users since this would cause them to argue that YaCy does not work. The functions are still available, because:
13 years ago
orbiter 6cd27473f5 - better default values for caching and cache usage
13 years ago
orbiter 5866c73a09 fix for compare search: use scroogle instead of bing and get a default search if configured search engine is not available
13 years ago
orbiter e4a82ddd8b produce a bookmark entry from every crawl start. these bookmarks are always private.
13 years ago
orbiter f183d3822c added a default accept header in http requests since some http fraud detection functions check that this header field exist
13 years ago
orbiter 78ce3b13be typo
13 years ago
orbiter cf4fd525ee added directDocByURL attribute in crawl profile
13 years ago
orbiter 5ad7f9612b added crawl settings for three new filters for each crawl:
13 years ago
orbiter e48ce5d80e - style change for search box: larger font, selected by default
13 years ago
sixcooler ecb4986b38 refactored stuff from last commit to ReferenceContainer
13 years ago
orbiter 49e5ca579f added new configuration property "crawler.embedLinksAsDocuments". If this is switched on (this is default now), then all embedded image, audio and video links from all parsed documents are added to the search index as individual documents. This will increase the search index size dramatically but will also enable us to create a much faster image, audio and video search. If the flag is switched on, the index entries are also stored to a solr index, if this is also enabled.
13 years ago
orbiter 9a8937f8b6 be more liberal when evaluating search results. This may cause that it is possible to fraud content on fresh peers, but that is better than looong waiting times for the evaluation of every link which causes that everybody rejects YaCy as 'too slow'. But this is only because of the high standards that YaCy sets to itself. If we are able to gain more users by lowering the standard, then that is useful. The option to set that flag to verify each link is still there.
13 years ago
sixcooler 4fec99115b Implementation of strategies for controlling memory resources.
13 years ago
orbiter 77a9af99f1 same values for Xmx and Xms: memory extension may be difficult if the OS has not the remaining memory available and may kill the jvm. If the memory is reserved at the start but never used the OS may handle that as well and leave non-used space in swap area (and never swap)
13 years ago
orbiter 768c59740c - replaced solrj 3.1 with solrj 3.3
14 years ago
lotus fa6f2c2b44 use proxy accounts by default for more security
14 years ago
orbiter b6f09a475d - added an index profile editor in the /indexFederated_p.html servlet for solr indexes
14 years ago
f1ori fdc84d8319 small pi link on index page to administration pages
14 years ago
orbiter 84c9658644 added a file type navigator
14 years ago
f1ori 900dacbf97 * improve link rewriting in proxy-url
14 years ago
orbiter cc239b18cd fix for IPv6 localhost proxy client
14 years ago
orbiter 10e2f588f8 - enhanced ybr ranking computation
14 years ago
orbiter 3ed4a09368 small features, some bug fixes and performance hacks
14 years ago
orbiter d8e934c085 better abstraction of http client identification
14 years ago
orbiter b77b8cac0c - enhanced html parser: recognized much more details in the content
14 years ago
orbiter 19fd13d3bc Added federated index storage to solr.
14 years ago
orbiter b1a8d0c020 enhancements to web cache and less strict caching rules
14 years ago
orbiter ba03ca8620 added more configuration options for search:
14 years ago
orbiter bed79402be introduction of a new remote search load control: the remote search has taken 10 results per peer with a time-out of 3 seconds so far. The attributes of number of results per peer and time-out time can now be configured.
14 years ago
f1ori 59dea3a284 * implement url proxy, a proxy via the url http://peer:port/proxy.html?url=http://domain.tld/path
14 years ago
orbiter e3ef4e3021 - increased default peer ping time from 2 minutes to 1 minute
14 years ago
orbiter d28f8040e0 removed unnecessary recording function that also caused a performance problem after serving too many files
14 years ago
orbiter 6c52e31993 new methods to open a browser
14 years ago
orbiter 4588b5a291 - fixed document number limitation for crawls that restrict the number of documents per domain
14 years ago
low012 64f32e8f00 *) replaced all IPs in IP filters for proxy with the proper regular expression
14 years ago
orbiter fe93caac5a added flags and administration options to show advanced search and to show search result attributes (for each search result)
14 years ago
orbiter 88773e4daa changed the default port from 8080 to 8090
14 years ago
orbiter 6c35b68f17 - removed 'peerName' property from the yacy settings file because this information is stored in the yacy seed file
14 years ago
orbiter 786166041a - added recording of all accessed and submitted servlets
14 years ago
orbiter 3fe03f153d - search page becomes default start page (new users are not forced to do configuration since this is not necessary)
14 years ago