Commit Graph

7904 Commits (befb2415f8ae6ce8cfb061de047c5a5ef9ed5d41)

Author SHA1 Message Date
reger 112ae013f4 update bzip and bzip parser process,
9 years ago
reger e76a90837b update zip and tar parser process,
9 years ago
luc 4e673ffc9a Ensure closing of InputStream even when an exception occurs.
9 years ago
luc 10696b53f7 Merge branch 'master' of https://github.com/yacy/yacy_search_server
9 years ago
reger 8532565c7d optimize order of parsers to try
9 years ago
reger 681889ae64 use current tar library for untar files
9 years ago
reger 5d71fc70e3 fix tarParser early exit on looping content
9 years ago
luc bcc2e7cb5b Merge branch 'master' of https://github.com/yacy/yacy_search_server
9 years ago
reger 2fcf6f104c fix bzipParser recognition
9 years ago
luc 745e97a575 Merge branch 'master' of https://github.com/yacy/yacy_search_server
9 years ago
reger a60b1fb6c2 differentiate api call getLocalPort() from getConfigInt()
9 years ago
reger 11f3666660 increase use of pre.defined CATCHALL_QUERY string
9 years ago
reger a58ee49307 Optimize internal imagequery focus on using content_type to select images
9 years ago
luc fc3294382e Updated javadocs for warning on target encoding format potential errors.
9 years ago
luc aa70ff4ff6 Corrected images alpha channel rendering
9 years ago
reger d223cf0ae4 adjust MediaWiki importer geo coordinate calculation
9 years ago
reger 2b775d5be6 fix typo in WikiCode coordinate calculation
10 years ago
reger bbe9df2bb3 fix MediawikiImporter for bz2 dump
10 years ago
reger c6687dd560 fix a system.out to log.fine
10 years ago
reger e53c6bbd51 fix init of peer flags
10 years ago
Michael Peter Christen ac034db8bc Merge branch 'master' of https://github.com/luccioman/yacy_search_server
10 years ago
reger 826f14f37f fix unnecessary set null of peer flags, causing reread
10 years ago
luc 5902ce032e Corrected NullPointerException case when ImageIO reader is not found for
10 years ago
reger c6495a5b62 add a log entry on parsing ajax crawling scheme snapshot
10 years ago
reger 9252e36aeb implement ajax crawling scheme for ajax sites which adhere to the proposed use of hash-bangs to provide html content
10 years ago
Michael Peter Christen d1ae999ef9 replaced HashMap with LinkedHashMap to preserve the object order
10 years ago
Michael Peter Christen 7d075a1d76 added log lines
10 years ago
Michael Peter Christen 092dac086e Merge branch 'master' of https://github.com/luccioman/yacy_search_server
10 years ago
reger 7a64bebb86 init Recrawl job chunk size to max crawl loader during job start, to use some system preferences
10 years ago
luc d6522fa4a2 Integrated haraldk/TwelveMonkeys library to first add TIF image format
10 years ago
Michael Peter Christen 9244694e64 Merge branch 'master' of git@github.com:yacy/yacy_search_server.git
10 years ago
Michael Peter Christen 151ccd50a9 fix for image size field values (must be multi-valued)
10 years ago
reger c9937973e3 unescape MultiProtocolURL getAttributes() return values.
10 years ago
reger 78e8c6f3e5 refactor special handling (static override) of SUPPORTED_EXTENSIONS/MIME_TYPES
10 years ago
reger d54c5d310a add links with image extension not automatically to image links.
10 years ago
reger 851e8f6c8a check jpeg file signature in genericImageParser
10 years ago
reger fb75fea446 use recrawljob w/o sort results by date
10 years ago
reger 43c27aa550 upd to solr/lucene 5.3.1
10 years ago
reger 688f7b2a5c allow/display svg images in image results previews
10 years ago
reger d5330391de remove some unused var allocation in parser
10 years ago
Michael Peter Christen 3d7dd9d3aa follow-up to latest commit: also flush the search cache if all crawls
10 years ago
Michael Peter Christen c737ff235d in case that the include_string contains several entries including
10 years ago
Michael Peter Christen 8e555d79a3 add also 1-character tokens to the token list because that could be also
10 years ago
reger 7c82cd4415 add an end condition to svgParser for wrong content
10 years ago
reger 356d4d1301 remove rdfParser from init (current function identical with genericParser)
10 years ago
reger c647d899e3 add svgParser to parse metadata from svg images
10 years ago
reger bad34804fe optimize parseInt for <img> tag attribute parsing
10 years ago
Michael Peter Christen 6ebc2451a9 Merge pull request #14 from luccioman/master
10 years ago
reger 2f51baff4f check for loading error (includes unsupported formats)
10 years ago
luc 5578886f6f Merge branch 'master' of https://github.com/luccioman/yacy_search_server.git
10 years ago
luc c38d6c1f37 Correction for mantis 535: inurl: parameter doesn't work on URLs with
10 years ago
reger 52e3eb4ce8 harmonize/correct assignment to Ymarkmeta.mime
10 years ago
Michael Peter Christen 87f358058e Fix for index entries which have id's not computed as hash from the url.
10 years ago
reger 3f2b8ab5e5 optionally include mime in p2p url exchange string
10 years ago
reger a3195d78ae add Portuguese month names to date recognition
10 years ago
reger d2cc11ea8f fix html parser taking <style> content as text.
10 years ago
Michael Peter Christen 5f706797cb patch for a bug inside of solr since solr 5.0 when using a boost
10 years ago
reger 7889fc2389 Hack to prevent Solr issue on partial update on a document containing multivalued date field
10 years ago
reger b4cbdea1e7 adapt SolrServerConnector.add to handle error on partial update input document.
10 years ago
reger 98ab655917 on reindex delete index document with invalid url
10 years ago
reger 1e8369e18b use a parsed date in Document.toString
10 years ago
luccioman 199b2ce52d Translator refactoring : to simplify locale files writing, process keys
10 years ago
luccioman 4dd9c0d5d9 Merge from main repository
10 years ago
reger 3428b6f13b improve filtering by filetype navigator.
10 years ago
reger e37a4f0b3d prevent metadata records in index w/o valid url
10 years ago
reger 41c4eade51 extract modification date from vCard (vcfParser)
10 years ago
reger 8768896975 extract lastmodified from openoffice doc
10 years ago
Michael Peter Christen c40c302748 when many crawl queues are generated, this NPE can occur; probably
10 years ago
reger 367fe388b9 fix exception throw after sendError in DefaultServlet
10 years ago
luccioman 9752bd5f88 Added utils to help translation without launching full YaCy application
10 years ago
luccioman 2f0f0180e2 Added a function to list files recursively.
10 years ago
luccioman 7e4c1d2282 Translator refactoring :
10 years ago
reger 802ccaead6 fix init of error cache, use latest faildates => load_date_dt
10 years ago
reger dba7f15073 apply same size constraint on result image from doc
10 years ago
reger 4cf875336c complete TODO: getFileExtension handle dot in query part
10 years ago
sixcooler 87e4abe393 fight the fieldcache by using DocValues: in Solr-5.x the fieldcache has
10 years ago
reger eaf0e8ff2c start recording/indexing pixel size for image document
10 years ago
reger c33229fc0c check mime prior to ext for metadata modification for images
10 years ago
reger 19f1308bf0 enforce the result images limit to > 16x16px
10 years ago
reger 0e4ba0360b fix NPE on .yacyh result url of disconnected peer
10 years ago
reger 7ed812a2bf log missing seed.port
10 years ago
reger 206883f80d fix: Preserve protocol in url proxy
10 years ago
reger f7b0b3b7b3 avoid runtime exception by earlier testing for seed.ip=null
10 years ago
Michael Peter Christen 906b5fd742 Merge branch 'master' of git@github.com:yacy/yacy_search_server.git
10 years ago
Michael Peter Christen 8f90767889 fix for filesystem crawl
10 years ago
sixcooler a3dd4be749 added / corrected charset to be 1.7 compatible.
10 years ago
Michael Peter Christen 8028410ab7 Merge branch 'master' of git@github.com:yacy/yacy_search_server.git
10 years ago
Michael Peter Christen df3314ac1a added a new facet type based on a probabilistic classifier using
10 years ago
reger 1409cabe8b exclude more default search fields from text copy to text_t
10 years ago
reger e2e73258ca remove obsolete interface SearchAccumulator
10 years ago
Michael Peter Christen dbbad23e12 removed warnings
10 years ago
Michael Peter Christen 500cfa9457 enhanced logging
10 years ago
Michael Peter Christen c14bc8d9b7 revert of fq transformation (recent fix)
10 years ago
Michael Peter Christen 203df5a750 Merge branch 'master' of git@github.com:yacy/yacy_search_server.git
10 years ago
reger fa08ca207e ! finish running crawls before applying !
10 years ago
reger ee77f24e52 use some more declared HeaderFramework constants
10 years ago
Michael Peter Christen 11a848da5a Merge branch 'master' of git@github.com:yacy/yacy_search_server.git
10 years ago
Michael Peter Christen b94bd7f20a a collection of search query enhancements:
10 years ago
reger dbe2594c38 replace deprecated myPublicLocalIP() in AbstractRemoteHandler
10 years ago
reger 6d3534e725 remove unused Transmission hit counter
10 years ago