Commit Graph

3391 Commits (5445f38070af55ed56a5c826e20838925cfc2519)

Author SHA1 Message Date
reger c6495a5b62 add a log entry on parsing ajax crawling scheme snapshot
9 years ago
reger 9252e36aeb implement ajax crawling scheme for ajax sites which adhere to the proposed use of hash-bangs to provide html content
9 years ago
Michael Peter Christen d1ae999ef9 replaced HashMap with LinkedHashMap to preserve the object order
9 years ago
Michael Peter Christen 7d075a1d76 added log lines
9 years ago
Michael Peter Christen 092dac086e Merge branch 'master' of https://github.com/luccioman/yacy_search_server
9 years ago
reger 7a64bebb86 init Recrawl job chunk size to max crawl loader during job start, to use some system preferences
9 years ago
luc d6522fa4a2 Integrated haraldk/TwelveMonkeys library to first add TIF image format
9 years ago
Michael Peter Christen 9244694e64 Merge branch 'master' of git@github.com:yacy/yacy_search_server.git
9 years ago
Michael Peter Christen 151ccd50a9 fix for image size field values (must be multi-valued)
9 years ago
reger c9937973e3 unescape MultiProtocolURL getAttributes() return values.
9 years ago
reger 78e8c6f3e5 refactor special handling (static override) of SUPPORTED_EXTENSIONS/MIME_TYPES
9 years ago
reger d54c5d310a add links with image extension not automatically to image links.
9 years ago
reger 851e8f6c8a check jpeg file signature in genericImageParser
9 years ago
reger fb75fea446 use recrawljob w/o sort results by date
9 years ago
reger 43c27aa550 upd to solr/lucene 5.3.1
9 years ago
reger 688f7b2a5c allow/display svg images in image results previews
9 years ago
reger d5330391de remove some unused var allocation in parser
9 years ago
Michael Peter Christen 3d7dd9d3aa follow-up to latest commit: also flush the search cache if all crawls
9 years ago
Michael Peter Christen c737ff235d in case that the include_string contains several entries including
9 years ago
Michael Peter Christen 8e555d79a3 add also 1-character tokens to the token list because that could be also
9 years ago
reger 7c82cd4415 add an end condition to svgParser for wrong content
9 years ago
reger 356d4d1301 remove rdfParser from init (current function identical with genericParser)
9 years ago
reger c647d899e3 add svgParser to parse metadata from svg images
9 years ago
reger bad34804fe optimize parseInt for <img> tag attribute parsing
9 years ago
Michael Peter Christen 6ebc2451a9 Merge pull request #14 from luccioman/master
9 years ago
reger 2f51baff4f check for loading error (includes unsupported formats)
9 years ago
luc 5578886f6f Merge branch 'master' of https://github.com/luccioman/yacy_search_server.git
9 years ago
luc c38d6c1f37 Correction for mantis 535: inurl: parameter doesn't work on URLs with
9 years ago
reger 52e3eb4ce8 harmonize/correct assignment to Ymarkmeta.mime
9 years ago
Michael Peter Christen 87f358058e Fix for index entries which have id's not computed as hash from the url.
9 years ago
reger 3f2b8ab5e5 optionally include mime in p2p url exchange string
9 years ago
reger a3195d78ae add Portuguese month names to date recognition
9 years ago
reger d2cc11ea8f fix html parser taking <style> content as text.
9 years ago
Michael Peter Christen 5f706797cb patch for a bug inside of solr since solr 5.0 when using a boost
9 years ago
reger 7889fc2389 Hack to prevent Solr issue on partial update on a document containing multivalued date field
9 years ago
reger b4cbdea1e7 adapt SolrServerConnector.add to handle error on partial update input document.
9 years ago
reger 98ab655917 on reindex delete index document with invalid url
9 years ago
reger 1e8369e18b use a parsed date in Document.toString
9 years ago
luccioman 199b2ce52d Translator refactoring : to simplify locale files writing, process keys
9 years ago
luccioman 4dd9c0d5d9 Merge from main repository
9 years ago
reger 3428b6f13b improve filtering by filetype navigator.
9 years ago
reger e37a4f0b3d prevent metadata records in index w/o valid url
9 years ago
reger 41c4eade51 extract modification date from vCard (vcfParser)
9 years ago
reger 8768896975 extract lastmodified from openoffice doc
9 years ago
Michael Peter Christen c40c302748 when many crawl queues are generated, this NPE can occur; probably
9 years ago
reger 367fe388b9 fix exception throw after sendError in DefaultServlet
9 years ago
luccioman 9752bd5f88 Added utils to help translation without launching full YaCy application
9 years ago
luccioman 2f0f0180e2 Added a function to list files recursively.
9 years ago
luccioman 7e4c1d2282 Translator refactoring :
9 years ago
reger 802ccaead6 fix init of error cache, use latest faildates => load_date_dt
9 years ago