Commit Graph

3538 Commits (4eddabee4221f0d8deec55a157b520e1e1c140ce)

Author SHA1 Message Date
reger 9da1712a31 increase http header EXPIRES for css and images in DefaultServlet
9 years ago
reger 6d54eb3d36 skip loading document on crawl start for YMark bookmarks
9 years ago
reger 80e2c82249 fix NPE on empty blog importfile parameter
9 years ago
reger e84d94f8ca fix mime table for ms office / open office documents
9 years ago
reger 45b9bd8403 adjust MultiProtocolURL.protocol detection to handle mailto with "://" in parameters,
9 years ago
reger d5fd031449 fix reading of ippattern config array in URLProxy
9 years ago
reger b7e8358645 make use of header.getContentType where possible (mime is normalized afterwards)
9 years ago
reger 7a8c077838 fix HeaderFramework.mime() to strip charset parameter.
9 years ago
reger b4b6910d60 fix (todo): correct doc.id of remote search result if no match with newly
9 years ago
reger dec3e6ad96 fix: adjust urlstub for mailto links
9 years ago
reger cb83e65f89 drop returning document language "en" if unknown (fix todo)
9 years ago
reger 0c5548a7ff fix (todo) remove redundant holding of email link nameproperty in parser document
9 years ago
reger 71c416f383 show mailto links in ViewFile.html linklist
9 years ago
reger 6b7c10cef8 fix dc:date in mediawikiimporter/document.writexml to use lastmodified
9 years ago
reger 14803d58cd let html scraper accept html5 <link rel="icon"> for favicon links
9 years ago
luc b4cdacee76 Merge branch 'master' of https://github.com/yacy/yacy_search_server
9 years ago
luc ba0a293f5c Corrected another case of
9 years ago
reger 4d2b934487 prevent mailto links getting into parser result document's in/outbound link collection
9 years ago
luc 8c4ab9c76b Added an option to eventually limit size of remote solr documents put to
9 years ago
luc a2c08402af Merge branch 'master' of https://github.com/yacy/yacy_search_server
9 years ago
luc 70595d05d0 Modified MemoryControl.main() test to properly end for better results
9 years ago
sixcooler 1be67d9ab6 CachedSolrConnector was replaced by ConcurrentUpdateSolrConnector years
9 years ago
reger 28b8bc290a fix use of NETWORK_SEARCHVERIFY for rwi verification
9 years ago
reger 020630efd8 remove unused network scanner parameter from queryparameter
9 years ago
luc ad5586f8f6 Merge branch 'master' of https://github.com/yacy/yacy_search_server
9 years ago
luc 8ebefa4233 Fixed MediaWiki import : DCEntry conversion to SolrInputDocument was
9 years ago
luc 7736ee5a42 Updated MediaWimporter main() : display usage in console and stop
9 years ago
reger cdb8f3b10d make current ranking score value avail. to search interface / api
9 years ago
luc 27d11f8671 Fixed isSolrDump function : PushBackInputStream was not unread when
9 years ago
Michael Peter Christen 135a123a77 less logging in new language detection
9 years ago
Michael Peter Christen ef8cd80593 fix for npe
9 years ago
reger 0347bfa71f Apply collection query constraint/modifiert to rwi result stack.
9 years ago
luc 2a67d2ba6f Corrected error management for unsupported image formats, parsing
9 years ago
Michael Peter Christen d6e9834040 Merge branch 'master' of
9 years ago
Michael Peter Christen d82d311995 Merge branch 'master' of https://github.com/luccioman/yacy_search_server
9 years ago
reger b5371ea8c1 read/init crawl queue in a thread
9 years ago
reger 1160b13172 remove unused md5 from ViewFile servlet params
9 years ago
reger e163ea88f6 fix vsdParser (Visio) parser return statement
9 years ago
reger b2c8bc0ae6 remove md5_s from default index fields
9 years ago
luc e40ae0943b - No max dimensions specified : render raw image data when source and
9 years ago
reger 90686a75a2 fix flux factor (additional crawl delay by access count) calculation
9 years ago
luc 4af27289e5 Merge branch 'master' of https://github.com/yacy/yacy_search_server
9 years ago
reger 297fdb60d3 throw exception if crawler hostqueue can't create hostpath directory.
9 years ago
luc 755efac17d Use same max file size when loading all resource bytes or opening stream
9 years ago
luc bc6c79fc12 Corrected scaling function for non RGB images.
9 years ago
luc 1565559df8 Refactoring : extracted write InputStream method.
9 years ago
luc f0478bb14d BMP and ICO image formats support : integrated /haraldk/TwelveMonkeys
9 years ago
luc 07437986e7 Merge branch 'master' of https://github.com/yacy/yacy_search_server
9 years ago
reger 97cc03ef6a start using a template for urlproxy header
9 years ago
luc f01d49c37a Process large or local file images dealing directly with content
9 years ago
luc 3c4c77099d If available, check content length before downloading. Check also
9 years ago
luc 5bbb2e1730 Ensure resource is closed when reading a full file InputStream
9 years ago
luc 6291a57300 Merge branch 'master' of https://github.com/yacy/yacy_search_server
9 years ago
reger 0d3c5b223e have psParser cleanup temp file
9 years ago
reger 7d0d19cb8e avoid File.deleteOnExit() on temp files
9 years ago
luc bfe51001e3 Merge branch 'master' of https://github.com/yacy/yacy_search_server
9 years ago
reger 02e4489a23 set tmpfile.deleteOnExit by default,
9 years ago
reger 2985baaa01 Exclude repetitive protocol part in tokenized url
9 years ago
reger ca3d26a401 harmonize wordsintitle & CollectionSchema.title_words_val calculation,
9 years ago
reger 52a9040ae6 Sort out double keywords (dc_subject) early in parsed documents
9 years ago
luc 49331dc523 Merge branch 'master' of https://github.com/yacy/yacy_search_server
9 years ago
reger 47d70732f6 improve locale translator
9 years ago
sixcooler 646afe9183 do not store subfield *_coordinate + make all num-fields being docvalues
9 years ago
sixcooler 194df613de not using 'location' as defaultfacetfield - since we removed it being
9 years ago
sixcooler d3b9349b6f simplification / speedup of GenerationMemoryStrategy
9 years ago
sixcooler 4a905ec134 fix to not let the AccessTracker-Log grow to much, but have enough data
9 years ago
reger 20e18d79f8 harmonize document title for archive parsers
9 years ago
luc f11b5e8309 Merge branch 'master' of https://github.com/yacy/yacy_search_server
9 years ago
reger 112ae013f4 update bzip and bzip parser process,
9 years ago
reger e76a90837b update zip and tar parser process,
9 years ago
luc 4e673ffc9a Ensure closing of InputStream even when an exception occurs.
9 years ago
luc 10696b53f7 Merge branch 'master' of https://github.com/yacy/yacy_search_server
9 years ago
reger 8532565c7d optimize order of parsers to try
9 years ago
reger 681889ae64 use current tar library for untar files
9 years ago
reger 5d71fc70e3 fix tarParser early exit on looping content
9 years ago
luc bcc2e7cb5b Merge branch 'master' of https://github.com/yacy/yacy_search_server
9 years ago
reger 2fcf6f104c fix bzipParser recognition
9 years ago
luc 745e97a575 Merge branch 'master' of https://github.com/yacy/yacy_search_server
9 years ago
reger a60b1fb6c2 differentiate api call getLocalPort() from getConfigInt()
9 years ago
reger 11f3666660 increase use of pre.defined CATCHALL_QUERY string
9 years ago
reger a58ee49307 Optimize internal imagequery focus on using content_type to select images
9 years ago
luc fc3294382e Updated javadocs for warning on target encoding format potential errors.
9 years ago
luc aa70ff4ff6 Corrected images alpha channel rendering
9 years ago
reger d223cf0ae4 adjust MediaWiki importer geo coordinate calculation
9 years ago
reger 2b775d5be6 fix typo in WikiCode coordinate calculation
9 years ago
reger bbe9df2bb3 fix MediawikiImporter for bz2 dump
9 years ago
reger c6687dd560 fix a system.out to log.fine
9 years ago
reger e53c6bbd51 fix init of peer flags
9 years ago
Michael Peter Christen ac034db8bc Merge branch 'master' of https://github.com/luccioman/yacy_search_server
9 years ago
reger 826f14f37f fix unnececary set null of peer flags, causing reread
9 years ago
luc 5902ce032e Corrected NullPointerException case when ImageIO reader is not found for
9 years ago
reger c6495a5b62 add a log entry on parsing ajax crawling scheme snapshot
9 years ago
reger 9252e36aeb implement ajax crawling scheme for ajax sites which adhere to the proposed use of hash-bangs to provide html content
9 years ago
Michael Peter Christen d1ae999ef9 replaced HashMap with LinkedHashMap to preserve the object order
9 years ago
Michael Peter Christen 7d075a1d76 added log lines
9 years ago
Michael Peter Christen 092dac086e Merge branch 'master' of https://github.com/luccioman/yacy_search_server
9 years ago
reger 7a64bebb86 init Recrawl job chunk size to max crawl loader during job start, to use some system preferences
9 years ago
luc d6522fa4a2 Integrated haraldk/TwelveMonkeys library to first add TIF image format
9 years ago
Michael Peter Christen 9244694e64 Merge branch 'master' of git@github.com:yacy/yacy_search_server.git
9 years ago
Michael Peter Christen 151ccd50a9 fix for image size field values (must be multi-valued)
9 years ago