Commit Graph

1398 Commits (3b89c232dbe3861244bba42edd4b43900a449ce5)

Author SHA1 Message Date
luccioman ac766327d3 Switched a few more Solr fields from strictly mandatory to optional
8 years ago
Burkhard 4fdc11cae8 Update SearchEvent.java
8 years ago
luccioman cdc7f3e431 Switched some Solr fields from mandatory to optional
8 years ago
luccioman 3475d8c1a9 Merge branch 'master' of https://github.com/yacy/yacy_search_server.git
8 years ago
luccioman c68a8be2d9 Refactored and enforced Solr mandatory fields for proper operation
8 years ago
reger 334c70c37a correct fromDate init value on missing param in api/timeline_p servlet
8 years ago
reger cc770512d5 add hint of query syntax in AccessTracker log (qs=normal querystring,
8 years ago
luccioman e5858bc8c8 Fixed a NullPointerException case possible on Index Export
8 years ago
reger 5e8879beb7 Reduce self generated content for text_t (visible text index field)
8 years ago
luccioman 1857651988 Added a new Debug/Analysis advanced settings subsection.
8 years ago
luccioman 526f2d6a8b Fixed NPE case occurring when local solr index is disabled in search.
8 years ago
luccioman 08de58b6d3 Named a Thread without name for easier monitoring
8 years ago
reger 1f497ccad5 Add consistency check for related index fields upon load and save of
8 years ago
luccioman 68afe900d0 Added user-friendly controls over disk usage configuration settings.
8 years ago
reger 95d2a28599 adjust the Field-Reindex Thread to verify and update the document id
8 years ago
luccioman fc01b69eca Fixed local image search pagination regression.
8 years ago
reger 581b00cc20 remove obsolete lastmodified calculation in WebgraphConfig
8 years ago
luccioman 0da1e6ba16 Factored code re-implementing DigestURL.hosthash() method.
8 years ago
luccioman 6a4d51d8f9 Cleaned up some Javadoc warnings.
8 years ago
luccioman 86dc198698 Fixed some JavaDocs broken links.
8 years ago
reger 4c9be29a55 fix concurrency issue with htmlParser using not current scraper data
8 years ago
reger 68d4dc5cc5 Complete harmonization RequestHeader getCookie with std ServletRequest
8 years ago
reger a1e5f7dbca fix of fulltext.remove() by id of webgraph document
8 years ago
luccioman 1df558a6c6 Fixed YaCy proper shutdown triggered by SIGTERM signal.
8 years ago
reger b522d540b9 Include itemprop latitude/longitude (see schema.org) in attribute
8 years ago
luccioman 3ca695390c FTP crawl start URLs : applied crawl profile depth control
8 years ago
reger 8eb6fba59c activate filetype navigator plugin and restrict config (append) of navs
8 years ago
luccioman c25e48e969 Enabled displaying results after 14th page for local search queries.
8 years ago
reger bab4804d11 add FileTypeNavigator plugin
8 years ago
luccioman 467650c042 Hardened system update checks.
8 years ago
luccioman b5711b8fe1 Added some Javadocs.
8 years ago
reger 0758c868c9 add HostNavigator plugin
8 years ago
reger 60160877f5 bundle initialization of search navigation plugins in separate handler
8 years ago
luccioman d27adc2b92 Fixed language detector initialization and NullPointerException cases.
8 years ago
reger f7e9f9be5f move Digest auth checks from DefaultServlet to adminAuthenticated,
8 years ago
reger 44a6a4e795 fix authentication by hit in userdb (wrong parameter)
8 years ago
luccioman aa9ddf3c23 Added control over Robots.txt active threads maximum number.
8 years ago
reger 59130777a6 add high scored items first to YearNavigator (to make sure to be included
8 years ago
reger 08a0acc35d make a YearNavigator available, useable as SearchEvent.naviator plugin.
8 years ago
reger 7742579ca4 make a LanguageNavigator available, useable for the SearchEvent.naviator
8 years ago
reger bad8f87998 remove old/obsolete clear text "adminAccount" credential entry from init
8 years ago
reger 811cf637f8 fix Jetty9YaCySecurityHandler, length check of Basic credential,
8 years ago
reger 59448461d3 make use of userInRole for quick login verification
8 years ago
reger 2a4d826d9e adjust servlet RequestHeader.getLocale
8 years ago
reger 9db68acb4f remove obsolete X_YACY... header declarations
8 years ago
luccioman 84b81c1af0 Switched more URLs to relative ones when possible.
8 years ago
reger 8fe28a83f2 harmonize used lastmodified date for rwi and fulltext in storeDocument
8 years ago
reger 3d1d297308 refactor namespace navigator as part of navigatorplugin map, this allows
8 years ago
reger 67f660523b Make navigators underlying indexfield name accessible in interface
8 years ago
reger 5eb3ee4e20 Add search navigator interface to allow for additional navigators (plugins)
8 years ago
reger fd3f58fcaa improve query modifier parsing of "collection:" and possible collision
8 years ago
reger af39a76bf6 Reduce number of default max. search navigator lines (from 10000)
8 years ago
reger 3c7220bc7b Refactor rwi reference word position and word distance calculation
8 years ago
luccioman f0639d810c Customized name for Threads still using the default "Thread-n" pattern.
8 years ago
luccioman 7263d17436 Removed mentions of deprecated LURL-db.
8 years ago
reger 31d2a5645e remove obsolete query variable
8 years ago
luccioman 6e1959f469 Merge branch 'master' of https://github.com/yacy/yacy_search_server.git
8 years ago
reger 685d8e86bf Avoid frequent data type casting (float/long) for rwi score
8 years ago
reger e68b00678e prevent negative score on URIMetadataNode - in the special case where no
8 years ago
luccioman 8d57b5b970 Added some javadocs.
8 years ago
luccioman 60df09fff9 Fixed some HTML validation errors : Illegal character in query
8 years ago
luccioman b3b75b0498 Accessibility : add a customizable alternative text to YaCy log
8 years ago
luccioman 3ee4f56c39 Improved ErrorCache behavior when switching networks
8 years ago
luccioman 7d5ba2afa4 Added some JavaDoc and moved crawlStacker close at the right place.
8 years ago
luccioman 8edbcd8ad4 Log eventual Solr instances close errors.
8 years ago
reger 330768c8a2 fix for solr write.lock after mode change http://mantis.tokeek.de/view.php?id=686
8 years ago
Michael Peter Christen df51e4ef07 Merge branch 'master' of git@github.com:yacy/yacy_search_server.git
9 years ago
Michael Peter Christen e063aaf97f enable fuzzy search, solr style (append a ~ to get a fuzziness on the
9 years ago
reger 7f63fc50f3 prepare a IndexSegment test case for RWI index testing
9 years ago
luccioman 06d4f93d03 Merged master into postprocessing branch
9 years ago
reger e310ec5f70 fix posInText ranking calculation to score 0 on no position info
9 years ago
reger 51c077f493 adjust the getTopics() and getTopicNavigator() to current usage
9 years ago
reger cc2d9dd3f1 reactivate the use of included-in-topwords boost in postRanking
9 years ago
reger 6801673a07 apply postranking media search boost only on media queries
9 years ago
luccioman 8c49a755da Postprocessing refactoring
9 years ago
luccioman 42f45760ed Refactored postprocessing
9 years ago
Michael Peter Christen 079112358c Merge branch 'master' of https://github.com/yacy/yacy_search_server.git
9 years ago
Michael Peter Christen efeb592661 don't do solr optimization, this creates high IO load. We should leave
9 years ago
reger 4c7a77662a eliminate dependency on file-extension in storeDocument but use supported mime-type
9 years ago
reger 2910fe35c1 add missing scheduler calc of next exec_date (call of calculateAPIScheduler)
9 years ago
reger 70d47ae38a keep scheduler selection by repeat entry from 07311020d4
9 years ago
reger 7c3f932e5d revert due to conflict with double count recording by scheduler / servlet by the commit under normal operation (no shutdown)
9 years ago
reger 07311020d4 postpone apicall exec date init until actual call
9 years ago
reger fcad2d0744 add uses of config constant INDEX_RECEIVE_ALLOW
9 years ago
reger 35a7d57260 update lucenematchversion to current (5.2.0 -> 5.5.0)
9 years ago
luccioman 893a40995a Merge branch 'master' of https://github.com/yacy/yacy_search_server.git
9 years ago
Michael Peter Christen 7466d390b2 small refactoring + do not accept too old peers during bootstrap
9 years ago
luccioman 6e96c7341a Merge remote-tracking branch 'origin/master'
9 years ago
reger 8d58a48029 remove wrong log line in CrawlSwitchboard
9 years ago
reger b119ff65be clean out not used Switchboard variables
9 years ago
reger bd8f7c11f5 Use transparent addToCrawler in AutoSearch instead of addToIndex
9 years ago
JeremyRand 433217b33e Properly support multiple Boost Queries. (Previous code was broken because it concatenated multiple Boost Queries together rather than passing Solr an array.)
9 years ago
reger d0a571bed2 del cytag trail for own index.html (save resource not used by default)
9 years ago
reger 7097dcbdbd cleanup hack for partial Solr update on multivalued datefields
9 years ago
reger f10ea3c155 clean-out unused SwitchboardConstants
9 years ago
reger ef24593347 delete obsolete SEARCHRESULT busythread constants
9 years ago
reger 6ecc180299 fix rwi doubledom return best (highest) ranking
9 years ago
reger d9adc2c255 load handler for Transparent Proxy on startup only if feature is activated
9 years ago
Michael Peter Christen b89465d952 0N - basic dump upload servlet infrastructure, to share index dumps
9 years ago
Michael Peter Christen 849ab671a9 0n: modified the p2p bootstrapping process - rules had been too tight and
9 years ago
Michael Peter Christen a6bf0b1649 0N - added option to generate index export files for a specific number
9 years ago
reger 06d0e2aeb9 result heuristic (also used in greedy learning mode) to use outbound links if result is full index doc. Otherwise use default loader method.
9 years ago
reger caf9e98f09 put metadata dc_publisher in corresponding schema field
9 years ago
luc 3f338777f7 Also check and index eventual icon url information from metadata.
9 years ago
reger 6f0b073bf3 override detected language (statistic langdetect) only with TLD determined
9 years ago
luc 07222b3e1a Added favicon url transmission in RWI chunks.
9 years ago
luc 480772c070 Fixed json search results from commit "Improved URLLicence reliability"
9 years ago
reger 535d4bf75f respect hidden attribute for file and smb directory listing
9 years ago
luc 3cc5619d93 Improved HTML icons indexing and rendering in search results.
9 years ago
reger a6617ad887 expand initRemoteCrawler() to terminate worker threads if called to deactivate
9 years ago
reger ed3e16e092 apply remote result count config value to Bookmark Autosearch
9 years ago
Ryszard Goń a98c395023 Add the Autocrawl thread
9 years ago
Ryszard Goń 1728cd30c6 Create autocrawl profiles
9 years ago
luc 571bc55937 Refactoring : use StandardCharsets constants instead of hard-coded
9 years ago
reger 1af0e9ef74 remove workaround for Solr bug regarding multivalued date fields
9 years ago
reger a58d34a4e8 check error URL cache before adding errorDoc to index
9 years ago
reger cd26717ba2 fix low memory status hint (dht-in disabled)
9 years ago
sixcooler dce1cb65c4 Merge remote-tracking branch 'choose_remote_name/master'
9 years ago
reger 6d54eb3d36 skip loading document on crawl start for YMark bookmarks
9 years ago
reger 45b9bd8403 adjust MultiProtocolURL.protocol detection to handle mailto with "://" in parameters,
9 years ago
reger dec3e6ad96 fix: adjust urlstub for mailto links
9 years ago
luc 8c4ab9c76b Added an option to eventually limit size of remote solr documents put to
9 years ago
reger 28b8bc290a fix use of NETWORK_SEARCHVERIFY for rwi verification
9 years ago
reger 020630efd8 remove unused network scanner parameter from queryparameter
9 years ago
luc ad5586f8f6 Merge branch 'master' of https://github.com/yacy/yacy_search_server
9 years ago
luc 8ebefa4233 Fixed MediaWiki import : DCEntry conversion to SolrInputDocument was
9 years ago
reger cdb8f3b10d make current ranking score value avail. to search interface / api
9 years ago
Michael Peter Christen ef8cd80593 fix for npe
9 years ago
reger 0347bfa71f Apply collection query constraint/modifier to rwi result stack.
9 years ago
reger ca3d26a401 harmonize wordsintitle & CollectionSchema.title_words_val calculation,
9 years ago
reger 52a9040ae6 Sort out double keywords (dc_subject) early in parsed documents
9 years ago
sixcooler 646afe9183 do not store subfield *_coordinate + make all num-fields being docvalues
9 years ago
sixcooler 194df613de not using 'location' as defaultfacetfield - since we removed it being
9 years ago
sixcooler 4a905ec134 fix to not let the AccessTracker-Log grow too much, but have enough data
9 years ago
reger a60b1fb6c2 differentiate api call getLocalPort() from getConfigInt()
9 years ago
reger 11f3666660 increase use of pre.defined CATCHALL_QUERY string
9 years ago
reger a58ee49307 Optimize internal imagequery focus on using content_type to select images
9 years ago
Michael Peter Christen 151ccd50a9 fix for image size field values (must be multi-valued)
9 years ago
reger 43c27aa550 upd to solr/lucene 5.3.1
9 years ago
Michael Peter Christen 3d7dd9d3aa follow-up to latest commit: also flush the search cache if all crawls
9 years ago
Michael Peter Christen c737ff235d in case that the include_string contains several entries including
9 years ago
reger 7889fc2389 Hack to prevent Solr issue on partial update on a document containing multivalued date field
10 years ago
reger 3428b6f13b improve filtering by filetype navigator.
10 years ago
reger e37a4f0b3d prevent metadata records in index w/o valid url
10 years ago
reger 802ccaead6 fix init of error cache, use latest faildates => load_date_dt
10 years ago
reger dba7f15073 apply same size constrain on result image from doc
10 years ago
sixcooler 87e4abe393 fight the fieldcache by using DocValues: in Solr-5.x the fieldcache has
10 years ago
reger eaf0e8ff2c start recording/indexing pixel size for image document
10 years ago
reger c33229fc0c check mime prior to ext for metadata modification for images
10 years ago
reger 19f1308bf0 enforce the result images limit to > 16x16px
10 years ago