Commit Graph

4724 Commits (d1091e79f83591502fdc08444aca84b733300a71)

Author SHA1 Message Date
reger b38de92a16 Merge origin/master into jetty
11 years ago
Michael Peter Christen 434e13b46d in host browser also show the properties of failed documents including
11 years ago
orbiter 1ac504ae51 use html encoding for urls in metadata
11 years ago
reger f017066197 Merge origin/master into jetty
11 years ago
Michael Peter Christen 25951cee14 - fixed opensearchdescription, this delivered an url with missing
11 years ago
Michael Peter Christen f1bfe64361 integrated startpage to compare_yacy
11 years ago
Michael Peter Christen 2f57327f20 added boolean load property to CacheResource_p servlet which causes that
11 years ago
Michael Peter Christen 9bb7eab389 hacks to prevent storage of data longer than necessary during search and
11 years ago
Michael Peter Christen 5afa6e3aee Automatically flush the log cache if a short memory status is reached.
11 years ago
Michael Peter Christen 030d0776ff Enhanced crawl start for very, very large crawl lists (i.e. > 5000)
11 years ago
Michael Peter Christen 4948c39e48 added concurrency for mass crawl check
11 years ago
Michael Peter Christen 1b4fa2947d - fixed a problem which occurred when a document was not recognized with
11 years ago
Michael Peter Christen 16e3b357b3 replaced old tag cloud and adopted design a bit
11 years ago
Michael Peter Christen dc38d35986 added matching in url field in Table_API_p search
11 years ago
Michael Peter Christen 691d7e70fa added hint to development/commit rss feed
11 years ago
Michael Peter Christen b81859c751 Show a RSS icon in the right top corner of search results. This replaces
11 years ago
Michael Peter Christen 1a09771be8 fixed sitemap crawl start
11 years ago
orbiter b743e6d79f - prevent that crawl filter have empty (never-match) content
11 years ago
orbiter f597fdb602 make it easier to filter properties (case insensitive)
11 years ago
reger f46c723398 allow to choose used http server, YaCy-Anomic or Jetty
11 years ago
reger 1adb4b8741 merge rc1/master
11 years ago
reger 37d24f3318 make use of declared static string ACTION_LOCATION
11 years ago
reger eea504c117 update Info.plist
11 years ago
reger a44eede8b8 merge rc1/master
11 years ago
reger 54a0272338 searchpage javascript (latestinfo) causes reset of search statistic after moving to next page
11 years ago
Michael Peter Christen 91fa99e9bb added new icon/image for latest commit
11 years ago
Michael Peter Christen 9fac9249bc - replaced 'edit' link with a clone symbol in Table_API_p since that is
11 years ago
Michael Peter Christen 0f6db6ad5b Merge remote-tracking branch 'jensbees/crawlexpert-post'
11 years ago
Jens Bertram 3252c1ec39 Merge upstream/master into crawlexpert-post
11 years ago
Michael Peter Christen 90c8577840 enhanced ranking; patches to replace old ranking
11 years ago
bhoerdzn a3824dfbaa check URL on initial load, if set
11 years ago
bhoerdzn 52f49d475b add a hidden field for "crawlingstart" since jQuery omits the submit button value
11 years ago
bhoerdzn b0c0ec2dec link recorded crawl starts back to "CrawlStartExpert_p" in "Process Scheduler"
11 years ago
bhoerdzn d64d45361c use integer types for boolean values
11 years ago
bhoerdzn eda123d6fd remove debugging code intercepting post requests
11 years ago
bhoerdzn 5057f27bbd fix typo in parsing "cachePolicy" parameter
11 years ago
bhoerdzn 98f5c9018d Fixed template vars for "deleteold". Fixed parsing "deleteold" parameter. Stop "setState" overwriting "deleteold" state on load.
11 years ago
bhoerdzn a6a62986d4 correct state handling for country code restriction
11 years ago
bhoerdzn 4066b85155 correctly set initial state for load filters
11 years ago
bhoerdzn 8c91c3e7cd set form boolean values to 0 & 1 instead of false & true
11 years ago
bhoerdzn c27fabc88e fixed wrong parameter check
11 years ago
bhoerdzn 2214bf5396 Remove some post parameters, if they are set to default values, as their values are already set by YaCy. Added some documentation.
11 years ago
reger 71d2655c02 downgrade to Jetty 8 to assure support of JRE 1.6
11 years ago
orbiter 705b3338ee list more fields available for search and for ranking boosts
11 years ago
bhoerdzn 405878182f Use list template for all other option lists. Fixed some template expressions.
11 years ago
bhoerdzn 8e74098cd4 Use list template for "reloadIfOlderNumber".
11 years ago
bhoerdzn 52bad7b908 Dynamic toggling of form fields, based on passed in and selected values. This will also cut down the post string by disabling not needed fields.
11 years ago
Michael Peter Christen e56aa4fe93 fixed search navigation
11 years ago
Michael Peter Christen 4fbc4740df removed warnings
11 years ago
bhoerdzn 45cf553bc3 try to guess default crawling mode, if none set
11 years ago
bhoerdzn b4f0c822f2 assign strings before checking contents
11 years ago
bhoerdzn 499abe8f91 set default values for string parameters
11 years ago
bhoerdzn 42ea56eaad made CrawlStartExpert_p aware of post variables; extended template where needed
11 years ago
reger c7c706fd9f merge with rc1/master
11 years ago
Michael Peter Christen 82bfd9e00a - crawl profiles shall be deleted from active and passive stacks if they
11 years ago
orbiter 8ac2e8c8c9 added location navigator which causes that the image to the map search
11 years ago
orbiter d86d2be5c3 automatically removed Places autotagging if no location library is
11 years ago
reger 5c4ba9b5db merge rc1 master
11 years ago
reger 70c51775ae Merge remote-tracking branch 'origin/master' into jetty
11 years ago
orbiter d2effd21db fix for npe during location search
11 years ago
Michael Peter Christen e40671ddb7 better and consistent deletions for error urls
11 years ago
Michael Peter Christen 2602be8d1e - removed ZURL data structure; removed also the ZURL data file
11 years ago
Michael Peter Christen 61c5e40687 - replaced the properties object in AnchorURL with distinct variables
11 years ago
Michael Peter Christen 5e31bad711 - the webgraph shall store all links which appear on a web page and not
11 years ago
reger 13fc86c960 Merge remote-tracking branch 'origin/master' into jetty
11 years ago
reger 127adbf5cf remove references to 10_http thread (legacy http server)
11 years ago
Michael Peter Christen 3e22d05290 added option for daterange properties in GSA interface to use a left-
11 years ago
reger 36b7159282 - remove double initialization of jetty
11 years ago
reger 63ed04260a Merge remote-tracking branch 'origin/master' into jetty
11 years ago
Michael Peter Christen 35ab2cef7b added parsing of 'date', 'dc:date', 'dc.date' and 'last-modified' in
11 years ago
reger aafef72a8a merged current rc1/master into jetty branch to allow further development with latest version
11 years ago
Michael Peter Christen dbef8ccfcb forced deletion of ZURL entries for a specific host for each host that
11 years ago
Michael Peter Christen e137ff4171 refactoring (in preparation for new removeHost method)
11 years ago
Michael Peter Christen 9e12fdff23 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
11 years ago
Michael Peter Christen 049c3b3f2e added an option to exclude image search results from text search. This
11 years ago
Michael Peter Christen 5d71a4c8bc fix for dc:description field
11 years ago
reger 392174de8c remove all_words, all_strings lists from QueryGoal
11 years ago
Michael Peter Christen cb85b22725 redesign of the image search process (with much better results,
11 years ago
Michael Peter Christen 6184fd9d9a fix for solr/gsa result logging
11 years ago
reger 29967102a2 optimized QueryGoal (reducing mem and computation by removing all_hashes)
11 years ago
orbiter f106345eef link strings should not be tokenized
11 years ago
orbiter 5b14bdfffd npe fix
11 years ago
orbiter 1ca4b9612c added special handling of the BinaryResponseWriter in the solr interface
11 years ago
Michael Peter Christen a88a62f7aa added a feature to set a collection for a crawl result based on a
11 years ago
Michael Peter Christen 765943a4b7 Redesign of crawler identification and robots steering. A non-p2p user
11 years ago
Michael Peter Christen 47b1c81d08 - refactoring
11 years ago
Michael Peter Christen e6b423c4d9 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
11 years ago
reger 94bec24d14 add back menu to Surftips page (currently no menu is displayed)
11 years ago
Michael Peter Christen 1f299b0d42 removed link.gif as link button because this image is now shown
11 years ago
Michael Peter Christen 48ddd50a6c html fix
11 years ago
reger 96ae332427 revert del _blank (last commit) in template
11 years ago
reger 43348a98a9 add some href target=_blank to ext. links with external icon
11 years ago
reger 82d81a57bd info msg if no embedded Solr http://bugs.yacy.net/view.php?id=279
11 years ago
reger 02fe8b43ba Field Re-Indexing: display list of fields in reindex queue
11 years ago
sixcooler 7f501b7c38 clear some caches before reporting low Memory
11 years ago
reger 070bf85b33 css fix for IE10 showing border on all img within <a /> tag since introduction of external link icon (commit 112836dcc9)
11 years ago
sixcooler 8a96140f92 fix / workaround for
11 years ago
Michael Peter Christen 2674d28ef4 protection against self-ping (may be caused by fraud attempts)
11 years ago
orbiter f3d001c7ab more space in the about section
11 years ago
Michael Peter Christen e879b97b0a added line to enhance debugging
11 years ago
Michael Peter Christen 76afcccaaf fix for default boolean post values: the default value MUST NOT be TRUE,
11 years ago
orbiter 252c525709 fixed feed api servlet and enhanced RSSReader class
11 years ago
Marc Nause 112836dcc9 Improved external links.
11 years ago
Marc Nause d64a094f0e External links in HTML interface are marked as external with small icon.
11 years ago
Michael Peter Christen 58fe986cca Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
11 years ago
Michael Peter Christen cf12835f20 replaced the single-text description solr field with a multi-value
11 years ago
sixcooler 7d53ac86a3 fix for Blacklist (-Administration)
11 years ago
orbiter f425b2c61c re-try to fetch url after a soft commit
11 years ago
orbiter bf0ad04e1b apply load limitation also to dht-in
11 years ago
Roland Haeder b58ca8622d Some cleanups:
11 years ago
Roland Haeder e2ee412160 Use SwitchboardConstants.LISTS_PATH_DEFAULT instead of 'DATA/LISTS'
11 years ago
Roland Haeder ae19401af0 Removed another duplicate occurrence of Blacklist.BLACKLIST_FILENAME_FILTER
11 years ago
Roland Haeder 59225487ea Fix for blacklist export, also applied the filename filter here
11 years ago
Roland Haeder 952fc0e7bd Removed superfluous check for files ending '.black' as the previous commit already excluded all other files (e.g. .ser dumps), added logging in catch-all block
11 years ago
Roland Haeder 060fec1577 Reuse Blacklist.BLACKLIST_FILENAME_FILTER
11 years ago
Roland Haeder 29049c71f5 Possible fix for ticket http://bugs.yacy.net/view.php?id=270, the filter for only including *.black must be applied
11 years ago
Michael Peter Christen 4c242f9af9 always use a default value for boolean options to have transparency for
11 years ago
orbiter 9c681cc00d added segment sizes, postprocessing status and cpu load to crawler
11 years ago
orbiter 86b514cf46 added load info to status_p.xml
11 years ago
orbiter 056b42f5aa - added information about segment count to status_p.xml
11 years ago
orbiter 6fb2811e68 fixes for problems with remote solr and non-activated webgraph index
11 years ago
orbiter e24016e30a added the property federated.service.solr.indexing.timeout to yacy.init
11 years ago
orbiter 232100301c removed double-occurring value assignments
11 years ago
Roland Haeder aaedc0405d Fixes and avoid of catching bad exceptions (some):
11 years ago
Roland Haeder 841a28ae76 Added 'final' for all exception blocks as this helps the Java compiler
11 years ago
Felix Ableitner 376f9cd9d0 Merge branch 'master' of git://gitorious.org/yacy/rc1 into blacklist_structure
11 years ago
Michael Peter Christen 89c0aa0e74 added collection_sxt to error documents
11 years ago
Michael Peter Christen 0df5195cb0 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
11 years ago
Michael Peter Christen 1fd006cc56 fixes using the embedded connector
11 years ago
orbiter aba7cc5de7 added cpu load information to status page
11 years ago
Roland Haeder 59b4fdd5ad Merge remote-tracking branch 'upstream/master'
12 years ago
orbiter 5493389576 stealth mode shall only be available for authorized users, because
12 years ago
Roland Haeder ebbb3bc5c1 Fixed CHMOD on many files + added missing loggers (e.g. jena) and made some noisy loggers quiet
12 years ago
Michael Peter Christen bcc623a843 refactoring of load_delay: this is a matter of client identification
12 years ago
orbiter 2be456e7fb added a postprocessing field into api/status_p.xml to show if the
12 years ago
orbiter 575f913154 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
12 years ago
orbiter c4efb612e2 added list of crawls to status_p.xml
12 years ago
Lotus bb6caa346c Do not allow automatic update in case YaCy is installed to the Program
12 years ago
orbiter dac88561ae minimum access time has a tight connection to ClientIdentification,
12 years ago
Felix Ableitner a020697d64 Fixed problems with blacklist entry insertion.
12 years ago
sixcooler bff8c753c6 re-insert this file - was deleted by mistake
12 years ago
Michael Peter Christen 5878c1d599 - refactoring of log to ConcurrentLog:
12 years ago
orbiter c79f687110 enhanced the network scanner: find more hosts automatically by removal
12 years ago
orbiter b4677d1cad fix for bug #252
12 years ago
Michael Peter Christen 07261fe274 Merge remote-tracking branch 'nutomics/blacklist_structure'
12 years ago
Michael Peter Christen dea71851d2 - better concurrency for network scanner
12 years ago
orbiter 9f0cc9b401 enhanced network scanner
12 years ago
orbiter f8c28efd66 fix for rssTerminal coloring
12 years ago
Felix Ableitner 44f8fcf62e Changed class structure of Blacklist.
12 years ago
Michael Peter Christen 3054a6d4b9 added a patch from Sebastian M.B., submitted by email for coloring of
12 years ago
Michael Peter Christen 78af998f8f Merge commit 'fd90fcc4e08f80acbfd1c9a7ec62ce04cd309594'
12 years ago
Michael Peter Christen 57ffdfad4c added a crawl option to obey html-meta-robots-noindex. This is on by
12 years ago
Felix Ableitner fd90fcc4e0 Fixes #196.
12 years ago
Michael Peter Christen f1c5338210 preparation for greedy crawl profiles and refactoring
12 years ago
Michael Peter Christen e6f361f474 adding the canonical tag to crawl queues
12 years ago
Michael Peter Christen 203921006a redesign of citation index storage
12 years ago
Michael Peter Christen e92b9275ce Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
12 years ago
Michael Peter Christen 56cdcfa2fa fixed greedy learning mode - global is not a search attribute in
12 years ago
Michael Peter Christen 32aa1d4569 removed unused option for queries
12 years ago
Michael Peter Christen 0c5bed7e2c added configuration option for greedy learning function to ConfigPortal
12 years ago
sixcooler 5d1f619f07 possible helpful closing of solr-requests
12 years ago
Michael Peter Christen 9d291764d1 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
12 years ago
sixcooler e5abccdfe4 added optimize-option
12 years ago
Michael Peter Christen 8ea6ddf636 removed attributes from ConfigPortal.html which are redundant to
12 years ago
Michael Peter Christen 64140f35cd fix for solr requests if no query part is given (prevent npe)
12 years ago
Michael Peter Christen 23fb458963 - fix to gsa searchresult answer in case that no query part is given
12 years ago
Michael Peter Christen 660a196989 refactoring
12 years ago
Michael Peter Christen 54024958ac added url_file_name_s in query for live-search of urls
12 years ago
Michael Peter Christen 16d1d744fa added url_file_name_s in default collection schema for the file name
12 years ago
Michael Peter Christen f542cf7d9c fix for daterange: the to-date is inclusive
12 years ago
Michael Peter Christen c36720d45f added daterange option to gsa api
12 years ago
Michael Peter Christen 4e3007f4a0 typo
12 years ago
Michael Peter Christen 2cb6b6bc21 added target="_blank" to shutdown links
12 years ago
orbiter c8e94ad7c7 fix for citation search in case that the citation is very fresh
12 years ago
orbiter 57dcf68665 added a feed-back message inside the shutdown page
12 years ago
Michael Peter Christen 0600d510e1 show the citation report also in ViewFile
12 years ago
Michael Peter Christen 1a92b61d69 fixed usage of ViewFile which needs a commit before showing latest crawl
12 years ago
Michael Peter Christen 570511f3c8 removed fields references_internal_id_sxt and
12 years ago
Michael Peter Christen fd1776a3b0 added a new 'Citations' function: each search result item can now be
12 years ago
Michael Peter Christen 1762911f57 added synchronizations and timeouts in solr api; missing
12 years ago
Michael Peter Christen 2fd7bbb450 reduced load on solr; no seed update in Status and no exists-check in
12 years ago
Michael Peter Christen 7ee71c2354 changed administration page headline to 'administration'
12 years ago
Michael Peter Christen efd973d29d changed p2p/stealth mode text and links a bit
12 years ago
Michael Peter Christen 6115bef335 added a 'greedy learning' mechanism which will cause that a 'fresh'
12 years ago
Michael Peter Christen a5e328d7c5 new icons
12 years ago
Michael Peter Christen b85db72a73 added another response writer which can present search result with
12 years ago
Michael Peter Christen 5132bf719c added new buttons to search result page in p2p mode which show the
12 years ago
orbiter 2b320313d9 replaced yacydoc servlet usage by a solr result output using an html
12 years ago
orbiter 200769d0c6 show the cache link in search results only if there is actually a cache
12 years ago
Michael Peter Christen f7e77a21bf Added a citation reference computation for intra-domain link structures.
12 years ago
Michael Peter Christen fdcd4e6a6f fixes to index deletion: quoting of host name (a '-' may be part of the
12 years ago
reger 7480e87386 - fix stopword handling for RWI see example http://bugs.yacy.net/view.php?id=247
12 years ago
orbiter 5c7ddc67fe in GSA api enable usage of solr fq-attribute together with GSA
12 years ago
Michael Peter Christen eb9d0ba5b1 ranking and boost function update, small bugfixes, better default search
12 years ago
Michael Peter Christen 5f92c68f1f removed block rank ranking and all YBR files in /ranking
12 years ago
Michael Peter Christen 164603b946 cleanup
12 years ago
Michael Peter Christen 0c1a018bbd removed 'later' tactic because it used too much RAM, reduced number of
12 years ago
Michael Peter Christen 709e9b8ce7 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
12 years ago
Michael Peter Christen 9e07447d47 added new link for SMW
12 years ago
Michael Peter Christen 3c04dd11de removed dead link
12 years ago
Michael Peter Christen 281959a2d7 added option to re-boot the embedded solr during run-time. Added also
12 years ago
Michael Peter Christen 80a7989e8c fixed ClassCastException: [Ljava.lang.Object; cannot be cast to
12 years ago
orbiter da621e827e prevent NPE in case RWI is disabled
12 years ago
Michael Peter Christen 7300d81f40 include API Table deletion requests to the API recorder
12 years ago
Michael Peter Christen d2ade87b49 fixed missing thisaddress in yacysearch.html which caused that the
12 years ago
Michael Peter Christen 179d032181 added a (badly formatted) delete button for process scheduler entries
12 years ago
reger c03f75ebc3 fix DHT url receive see http://bugs.yacy.net/view.php?id=242
12 years ago
Marc Nause 8fb1b1e290 *) simplified banner creation code
12 years ago
Marc Nause cd0b5f31b4 *) updated links to description of regex
12 years ago
Michael Peter Christen 8f2d3ce2f9 reduced locking situation in crawler: shifted synchronized location and
12 years ago
Michael Peter Christen f93501e6e0 nice crawl name if crawl is started with file:// (was: null)
12 years ago
Michael Peter Christen b4f0cac102 added the reindexing job servlet to the submenu structure
12 years ago
Michael Peter Christen 8dbc80da70 redesign of index.exist-test: this shall now not be done using a single
12 years ago
Michael Peter Christen c91c67c3cd reject bad solr requests
12 years ago
Michael Peter Christen 44e363f37f refactoring of WorkflowProcessor, added process counter, update of
12 years ago
reger 79401cb938 added reindex option for documents with disabled or obsolete fields to Solr Schema Editor page (IndexSchema_p.html)
12 years ago
Michael Peter Christen b24d1d18e4 removed synchronization and concurrency in Fulltext class, concurrent
12 years ago
Michael Peter Christen f965d04496 added new peer icons for Mentor peers and Mentee peers (not used yet)
12 years ago
Michael Peter Christen b9b446bca6 - added ssl configuration sign (a lock) to network statistic/table
12 years ago
Michael Peter Christen 7095446ad3 added checkbox (near port) to switch on ssl support (https access) to
12 years ago
Michael Peter Christen e6c8b545c2 Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
12 years ago
orbiter 4baa0d4a97 Added a default keystore for ssl encryption of the YaCy web interface.
12 years ago
Michael Peter Christen 038f956821 fix for sitemap detection: the sitemap url was not visible if it
12 years ago
Michael Peter Christen e26bdd4a52 fixes to deletion methods (removed unnecessary concurrency and added
12 years ago
Michael Peter Christen f7f3e28c5e prevent that the size of the index is computed too many times.
12 years ago
Michael Peter Christen cca19d94d4 re-declared some fields to be of type string rather than text which
12 years ago
Michael Peter Christen ed1d5bace6 draw the names of other peers which receive/send dht into the network
12 years ago
Michael Peter Christen b528448332 enlarge network graph circle according to image height and reduce the
12 years ago
Michael Peter Christen f1bb54943e typo
12 years ago
Michael Peter Christen d7fd346917 - added regular-expression based deletions
12 years ago
Michael Peter Christen 3841854c97 abstraction of catchall term
12 years ago
sixcooler e145afb8d6 fix for PerformanceMemory showing UNRESOLVED_PATTERN by removing
12 years ago
Michael Peter Christen 1b102d98d8 - added index deletion to index administration submenu
12 years ago
Michael Peter Christen 0e2ee00fea added an index deletion servlet and some style changes for the
12 years ago
Michael Peter Christen e4f7e5bcfe fixed bad css change
12 years ago
Michael Peter Christen 3502b4c697 refactoring (renaming) of yacy-solr api
12 years ago
Michael Peter Christen 3a0fcfbeda Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
12 years ago
Michael Peter Christen 25499eead5 - added a new field for the regular expression in crawl start
12 years ago
reger 0a9b0992f3 RankingSolr_p: include warning if boost field not in local index
12 years ago
orbiter e1bfe9d07a - reduction of the concurrently running processes to make YaCy more
12 years ago
Michael Peter Christen c091000165 added collection attribute also to the rss feed reader
12 years ago
orbiter f7571386a3 added a 'collection' property attribute in yacysearch.html which can be
12 years ago
orbiter 3e79bd4b1f Merge branch 'master' of ssh://git@gitorious.org/yacy/rc1.git
12 years ago
orbiter d571e739b6 increased row limitation for authorized users from 10000 to 100000000 in
12 years ago
Michael Peter Christen a1fffe8e86 fixed default ranking values
12 years ago
Michael Peter Christen 1d30082446 added hindi translation configuration
12 years ago
Michael Peter Christen 97775fbebc fixed ranking for add-function queries: this did not work. The option
12 years ago
Michael Peter Christen 298bf2deb5 fix to ranking configuration servlet
12 years ago
Michael Peter Christen 2db058b551 added in RankingSolr_p.html a select box to switch between different
12 years ago
Michael Peter Christen 6fbca35215 fixed api table navigation
12 years ago