3411 Commits (d3b9349b6f2c1ddc0d70fb63debc658257c5bcb5)

Author SHA1 Message Date
sixcooler d3b9349b6f simplification / speedup of GenerationMemoryStrategy
9 years ago
sixcooler 4a905ec134 fix to not let the AccessTracker-Log grow too much, but have enough data
9 years ago
reger 20e18d79f8 harmonize document title for archive parsers
9 years ago
reger 112ae013f4 update bzip and bzip parser process,
9 years ago
reger e76a90837b update zip and tar parser process,
9 years ago
reger 8532565c7d optimize order of parsers to try
9 years ago
reger 681889ae64 use current tar library for untar files
9 years ago
reger 5d71fc70e3 fix tarParser early exit on looping content
9 years ago
reger 2fcf6f104c fix bzipParser recognition
9 years ago
reger a60b1fb6c2 differentiate api call getLocalPort() from getConfigInt()
9 years ago
reger 11f3666660 increase use of predefined CATCHALL_QUERY string
9 years ago
reger a58ee49307 Optimize internal imagequery focus on using content_type to select images
9 years ago
reger d223cf0ae4 adjust MediaWiki importer geo coordinate calculation
9 years ago
reger 2b775d5be6 fix typo in WikiCode coordinate calculation
9 years ago
reger bbe9df2bb3 fix MediawikiImporter for bz2 dump
10 years ago
reger c6687dd560 fix a system.out to log.fine
10 years ago
reger e53c6bbd51 fix init of peer flags
10 years ago
Michael Peter Christen ac034db8bc Merge branch 'master' of https://github.com/luccioman/yacy_search_server
10 years ago
reger 826f14f37f fix unnecessary set null of peer flags, causing reread
10 years ago
luc 5902ce032e Corrected NullPointerException case when ImageIO reader is not found for
10 years ago
reger c6495a5b62 add a log entry on parsing ajax crawling scheme snapshot
10 years ago
reger 9252e36aeb implement ajax crawling scheme for ajax sites which adhere to the proposed use of hash-bangs to provide html content
10 years ago
Michael Peter Christen d1ae999ef9 replaced HashMap with LinkedHashMap to preserve the object order
10 years ago
Michael Peter Christen 7d075a1d76 added log lines
10 years ago
Michael Peter Christen 092dac086e Merge branch 'master' of https://github.com/luccioman/yacy_search_server
10 years ago
reger 7a64bebb86 init Recrawl job chunk size to max crawl loader during job start, to use some system preferences
10 years ago
luc d6522fa4a2 Integrated haraldk/TwelveMonkeys library to first add TIF image format
10 years ago
Michael Peter Christen 9244694e64 Merge branch 'master' of git@github.com:yacy/yacy_search_server.git
10 years ago
Michael Peter Christen 151ccd50a9 fix for image size field values (must be multi-valued)
10 years ago
reger c9937973e3 unescape MultiProtocolURL getAttributes() return values.
10 years ago
reger 78e8c6f3e5 refactor special handling (static override) of SUPPORTED_EXTENSIONS/MIME_TYPES
10 years ago
reger d54c5d310a add links with image extension not automatically to image links.
10 years ago
reger 851e8f6c8a check jpeg file signature in genericImageParser
10 years ago
reger fb75fea446 use recrawljob w/o sort results by date
10 years ago
reger 43c27aa550 upd to solr/lucene 5.3.1
10 years ago
reger 688f7b2a5c allow/display svg images in image results previews
10 years ago
reger d5330391de remove some unused var allocation in parser
10 years ago
Michael Peter Christen 3d7dd9d3aa follow-up to latest commit: also flush the search cache if all crawls
10 years ago
Michael Peter Christen c737ff235d in case that the include_string contains several entries including
10 years ago
Michael Peter Christen 8e555d79a3 add also 1-character tokens to the token list because that could be also
10 years ago
reger 7c82cd4415 add an end condition to svgParser for wrong content
10 years ago
reger 356d4d1301 remove rdfParser from init (current function identical with genericParser)
10 years ago
reger c647d899e3 add svgParser to parse metadata from svg images
10 years ago
reger bad34804fe optimize parseInt for <img> tag attribute parsing
10 years ago
Michael Peter Christen 6ebc2451a9 Merge pull request #14 from luccioman/master
10 years ago
reger 2f51baff4f check for loading error (includes unsupported formats)
10 years ago
luc 5578886f6f Merge branch 'master' of https://github.com/luccioman/yacy_search_server.git
10 years ago
luc c38d6c1f37 Correction for mantis 535: inurl: parameter doesn't work on URLs with
10 years ago
reger 52e3eb4ce8 harmonize/correct assignment to Ymarkmeta.mime
10 years ago
Michael Peter Christen 87f358058e Fix for index entries which have id's not computed as hash from the url.
10 years ago