Commit Graph

1035 Commits (6d78a6d06e7573913328423850903b70e8b23791)

Author SHA1 Message Date
Michael Peter Christen b060ba900d added parsing of contentprop attribute in html tags for
10 years ago
Michael Peter Christen ae02c92fd0 logging fix
10 years ago
Michael Peter Christen 5651713134 better debugging of fq
10 years ago
reger b1ec0644e5 fix NPE in location search on missing/empty PubDate in underlying rss data
10 years ago
reger 839b962c20 correct percent encoding for '%' char
10 years ago
reger 2ef8ffdb60 apply UTF-8 encoding
10 years ago
reger 7120ea42f1 fix for path with char code > 255
10 years ago
reger 1d81bd0687 fix url encoding for path see http://mantis.tokeek.de/view.php?id=559
10 years ago
reger 62087fb8b2 fix MultiProtocolURL mailto protocol detection
10 years ago
reger f94e34058c fix url (path) %-decoding http://mantis.tokeek.de/view.php?id=519
10 years ago
Michael Peter Christen 710a0efa1b generalized time period computations
10 years ago
Michael Peter Christen 535f1ebe3b added a new way of content browsing in search results:
10 years ago
reger 9b0de2de64 introduce getQueryFields to return default query fields (query parameter QF)
10 years ago
reger 8ec1db76ee url unescape add check for inconsistent utf8 multibyte parsing
10 years ago
reger f0a5188e11 replace deprecated HTTPClient setStaleConnectionCheckEnabled with setValidateAfterInactivity()
10 years ago
reger 7b569d2dbe replace deprecated HTTPClient ALLOW_ALL_HOSTNAME_VERIFIER with NoopHostnameVerifier()
10 years ago
reger eda0aeaf26 allow/recognize host in file: protocol crawl target
10 years ago
Michael Peter Christen 8ff76f8682 the cleanup process experienced a 100% CPU load situation and the loop
10 years ago
Michael Peter Christen 6578ff3ddb enhanced suggest function
10 years ago
reger fe6f5a395d fix Umlaut handling in blekko heuristic search term
10 years ago
reger c454ef69c6 add shortMemory check to heuristic search
10 years ago
reger 9e1ec5fec4 refactor: just some more usages of constant for term ":[* TO *]"
10 years ago
Michael Peter Christen b5ac29c9a5 added a html field scraper which reads text from html entities of a
10 years ago
Michael Peter Christen 1cb290170e refactoring of autotagging code (combined same code pieces)
10 years ago
Michael Peter Christen c3b55455fc enhanced initialization speed of vocabularies by using better
10 years ago
Michael Peter Christen de3e373913 using precompiled CommonPattern.TAB for split
10 years ago
Michael Peter Christen a8a2b7a803 persistency for vocabulary facet switch
10 years ago
Michael Peter Christen 69eacdf4eb applying precompiled CommonPattern.COMMA.split to all places where
10 years ago
Michael Peter Christen ac19690d30 refactoring with CommonPattern.COMMA
10 years ago
Michael Peter Christen b5a55c8b3d fix for wkhtmltopdf (custom header does not work)
10 years ago
Michael Peter Christen bee5ee7cce removed some warnings
10 years ago
Michael Peter Christen 783cf6fbc7 the LinkedBlockingQueue is much faster than the ArrayBlockingQueue
10 years ago
Michael Peter Christen 6390454652 fix for vocabulary on/off setting
10 years ago
Michael Peter Christen dc5700148f update to latest code changes from json.org
10 years ago
Michael Peter Christen 7db2888336 fixed font size and print page generation in pdf snapshots
10 years ago
reger 24f68a4eb7 refactor opensearch heuristic
10 years ago
Michael Peter Christen b07afbc115 a test with http://validator.w3.org/feed/#validate_by_input shows that
10 years ago
reger c156548efe add info text to metadata page (htmlresponsewriter) on no documents found
10 years ago
reger 51ec9c1f44 fix "null" title in response writer for documents with multivalued title
10 years ago
Michael Peter Christen cc090bcb01 enhanced initialization of autotagging
10 years ago
Michael Peter Christen a0576ec737 fix for pdf sub-page result preparation
10 years ago
Michael Peter Christen 407cfff010 fix to wkhtmltopdf usage
10 years ago
Michael Peter Christen 5d321d3dc5 fixes to wkhtmltopdf call
10 years ago
Michael Peter Christen d14114697c the miss cache does not seem to work, it sometimes contains urlhashes
10 years ago
Michael Peter Christen 1cfddea578 added (very experimental) Solr response writer for snapshot image
10 years ago
Michael Peter Christen 7287dd764e added url, date, time and page number on pdf snapshot footer
10 years ago
Michael Peter Christen 66b5a56976 Added and integrated new date detection class which can identify date
10 years ago
Michael Peter Christen c3c2b6999b fixes on wkhtmltopdf
10 years ago
Michael Peter Christen 114f0afc1e enable sku as anchor in html response writer
10 years ago
Michael Peter Christen aa80cb1159 enhanced tagging preparation speed which reduces initialization time for
10 years ago