Commit Graph

8971 Commits (687820788d7b8996d1ddaea4b8da966690b6eb30)

Author SHA1 Message Date
luccioman a5771b1f14 Made SNI extension user configurable without the need for server restart
6 years ago
luccioman e90405b6f0 Support parsing audio URLs without file extension
6 years ago
luccioman a8316c79da Allow JS resorting of search results by unauthenticated users
6 years ago
luccioman 0ab2b49c31 Made /yacysearch access rate limitations user configurable
6 years ago
luccioman 5b7e41202a Added Solr GSA writer support for responses from remote instances
6 years ago
luccioman 4d8a948455 Properly close PDF snapshots loaded with pdfbox library
6 years ago
luccioman 74e6d6e984 Added Solr GrepHTML writer support for responses from remote instances
6 years ago
luccioman 5e6501974d Added Solr snapshots writer support for responses from remote instances
6 years ago
luccioman 384c37102c Improve accuracy of total results count on latest pages in Stealth mode
6 years ago
luccioman 5e9a08355a Improved logging for federated search
6 years ago
luccioman 9782a98a9c Added the possibility to customize facets sort type and direction
6 years ago
sgaebel c2398fd890 remove warnings: 'Statement unnecessarily nested within else clause'
6 years ago
sgaebel 811d40a6c4 taking care of closing inputstreams, HTTPClient
6 years ago
sgaebel 8d2e7262d9 Recrawl:
6 years ago
sgaebel 8f58c1dcfa extend the SolrServlet to be usable as remote solr (incl. update)
6 years ago
luccioman 7223a2fdb1 Removed usage of now deprecated Jetty function
6 years ago
luccioman 440d9f2fa0 Exclude peers with empty or disabled RWI from remote RWI search
6 years ago
luccioman 08ea0b0397 Added a configurable timeout to wkhtmltopdf calls for pdf snapshots
6 years ago
luccioman 3fb449b3b6 Properly resolve relative URLs against document URL in html base tags
6 years ago
luccioman 73a6e45524 Extended detection of external tools used for Snapshots generation
6 years ago
luccioman 7dc1f60619 Fixed detection of absolute data folder path on MS Windows
6 years ago
luccioman 595e144797 Trace a message on incomplete proper server finish when killing process
6 years ago
luccioman 9daeea823b Fixed concurrency issue on cache used for circles rendering
6 years ago
Michael Peter Christen c347e7d3f8 Merge branch 'master' of https://github.com/yacy/yacy_search_server.git
6 years ago
Michael Peter Christen 848e9304d9 evil bots may crawl harder
6 years ago
luccioman a997133260 Fixed gzip decompression regression on index transfer APIs
6 years ago
luccioman e85f231bdf Fixed termination of Host browser and link structure Solr query threads
6 years ago
luccioman fcf6b16db4 Added new crawler attribute for finer control over Media Type detection
6 years ago
luccioman a83a56473e Added support for PDF snapshots generation when running on MS Windows
6 years ago
luccioman 8852c97cee Added basic styling for cleaner rendering of missing image snapshots
6 years ago
luccioman 746e0e788d Render a relevant HTTP status code on snapshot image rendering error
6 years ago
luccioman 50b6edfcf5 Updated Solr snapshots writer for a cleaner html head
6 years ago
luccioman f366f43d6b Made snapshots size customizable in Solr snapshots response writer
6 years ago
luccioman 7a62fc0e66 Fixed concurrency issue in custom classloader used for template classes
6 years ago
luccioman 61c337f29a Decode blacklist entries for easier edition of non ascii chars
6 years ago
luccioman ed93221fa1 Improved normalization of blacklist path patterns having non ascii chars
6 years ago
luccioman 2a73b63d9e Use a constant default target file name for seed SCP upload method
6 years ago
luccioman b5eabb626f Removed some dead code
6 years ago
luccioman db7ad76366 Improved support for Java logs file pattern options
6 years ago
luccioman 7adbd1f87d Fixed raw IPV6 addresses snapshots read/write on FAT32 and NTFS fs
6 years ago
luccioman 9b1c87033b Fixed logs folder checking and creation
6 years ago
luccioman c29588dd6a Made possible to provide an absolute data root path for start script
6 years ago
luccioman d03c098b54 Removed deprecated warning comments about imports and Debian installer
6 years ago
luccioman 5b60b4225f Fixed encoding of '+' character on search pages links
6 years ago
luccioman 54fbe166ba Updated pdf cache clear steps consistently with current pdfbox version
6 years ago
luccioman 685122363d Added a parser for XZ compressed archives.
6 years ago
luccioman 4ee14ff3c5 Fixed NullPointerException case on malformed crawl queue folder name
6 years ago
luccioman 21ad9435ec Fixed crawl queue folder naming for IPv6 hosts on MS Windows filesystems
6 years ago
luccioman 8a29551c54 Upgraded the OpenGeoDB dump URL
6 years ago
luccioman 373edf9eac Adjusted yjson Solr writer to support responses from an external Solr
6 years ago
luccioman 87bd17b1cf Simplified a little bit the RSS OpenSearch Solr writer
6 years ago
luccioman dc49ca9c27 Fixed a NPE case on the Solr OpenSearch response writer
6 years ago
luccioman f4267ed247 Made Solr OpenSearch RSS writer compatible with external Solr index
6 years ago
luccioman b1410f593a Fixed stylesheet relative URLs rendering in Solr html writer
6 years ago
luccioman 89c59814da Improved rendering of the Solr api relative url in the html writer
6 years ago
luccioman bf4f320b16 Optionally render the response header when using the Solr html writer
6 years ago
luccioman 313204ae2c Override qf and df Solr params with defaults only when they are not set
6 years ago
luccioman bdafb14336 Removed redundant synchronization lock on network switch function
6 years ago
luccioman d5f44ea216 Removed unnecessary synchronization lock from serverSwitch constructor
6 years ago
luccioman dcad393fe5 Fixed exceeding max size of failreason_s Solr field on large link list
6 years ago
luccioman f467601561 Properly lock solrInstances for reboot and restoration of embedded Solr
6 years ago
luccioman 9630f81306 Fixed small unnecessary lines of code
6 years ago
luccioman 876bcd2f54 Fixed useless comparison between int parameter and Long.MAX_VALUE
6 years ago
luccioman c726154a59 Fixed removal of URLs from the delegatedURL remote crawl stack
6 years ago
luccioman 2bdd71de60 Added server side columns sorting on the Process Scheduler table
6 years ago
luccioman bb51555830 Removed remaining unsafe accesses to SimpleDateFormat instances.
6 years ago
luccioman f895745e1c Removed more unsafe concurrent accesses to SimpleDateFormat instances.
6 years ago
luccioman e97580dfc7 Fixed unsafe concurrent access to generic SimpleDateFormat instances
6 years ago
luccioman 8811700e2e Upgraded Jetty dependency from 9.4.9 to 9.4.11
6 years ago
luccioman d53c33e4ef Fixed potential infinite loop case (does not occur in current code base)
7 years ago
luccioman a15ac8e0ca Made CrawlProfile loading tolerant to malformed json string attribute
7 years ago
luccioman a715bb7876 Fixed rendering of solr mustNoMatch value on CrawlProfileEditor_p.xml
7 years ago
luccioman 0b302c5004 Do not block whole server startup on persisted crawl profile load error
7 years ago
luccioman 4d9aa4ed1e Fixed default crawl profile solr mustnotmatch query from previous commit
7 years ago
luccioman cced94298a Added a new crawler document filter type using Solr syntax
7 years ago
Michael Christen e0dc632020 removed transformer
7 years ago
luccioman 9bc7b6c39d Allow editing of scheduled next execution dates for finer control
7 years ago
luccioman 40e8c7b89b Use the heavy ConcurrentUpdateSolrClient only when necessary
7 years ago
luccioman bd4cfeda3f Add a max acceptable limit to the size of Solr responses on p2p search
7 years ago
luccioman de4ea95687 Consistently allow gzip compression of remote Solr responses
7 years ago
luccioman cea8187161 Reuse expired connections evictors threads provided by apache and solr
7 years ago
luccioman b5dc1f376f Made outgoing pools max total connections user configurable
7 years ago
luccioman 387d646c0e Added gzip compression of responses returned to user-agents accepting it
7 years ago
luccioman a7a4ba3287 Apply remote solr configured timeout on getting connection from pool
7 years ago
luccioman ee6670fb8f Use a common pooled http connection manager for remote solr instances
7 years ago
luccioman d28f9ba0f6 Removed use of deprecated ConcurrentUpdateSolrClient constructor
7 years ago
luccioman 8a749aa5ad Trace level log message for monitoring remote solr response times
7 years ago
luccioman 35826a3091 Added a search page customization setting to display or not favicons
7 years ago
luccioman 0082b5ab2a Added missing default Solr http client connection timeout initialization
7 years ago
luccioman fa4399d5d2 Small perf improvement : initialize threads names early when possible
7 years ago
luccioman 84d82bfdd7 Adjusted suggestions timeout management
7 years ago
luccioman 65854bcb22 Fixed NullPointerException when omitHeader=true on external Solr server
7 years ago
luccioman c4d984cec8 Fixed Solr response header duplication when requesting external Solr
7 years ago
luccioman 124cc24aa3 Properly handle embedded Solr partial results
7 years ago
luccioman 3ce44cf250 Fixed largest snippet get : don't reject ones starting with a space char
7 years ago
luccioman f511e16d50 Prevent duplication of Solr query highlight fields parameters
7 years ago
luccioman e357ade47d Reduced memory footprint of text snippet extraction
7 years ago
luccioman e115e57cc7 Reduced text snippet extraction processing time.
7 years ago
sgaebel 4b79851e12 corrected icons_sizes_sxt to SolrType.string
7 years ago
luccioman 3b89c232db Easier tracking of longest text snippets initializations
7 years ago
luccioman 3c4344cb12 Fixed text snippet max init time statistic rendering
7 years ago
reger a8234b7ea7 Make sure for image resource url enabled index image pixel size fields are filled
7 years ago
luccioman e67df103b5 Removed more remaining uses of deprecated Seed.getIP() function.
7 years ago
luccioman addd18c993 Removed some remaining uses of deprecated Seed.getIP()
7 years ago
luccioman c35d0568b6 Support for preferred https in peers communication on more operations
7 years ago
luccioman e914d17aca Updated call to function deprecated since commons-codec version 1.11
7 years ago
luccioman a3ec7a7a5f Added analysis optional setting to compute statistics on text snippets
7 years ago
luccioman 1889d484de Added Solr HTML writer support for responses from remote instances
7 years ago
luccioman 2af3bf79c7 Improve rendering of remote Solr admin URLs
7 years ago
luccioman bb74de7d59 Removed unnecessary "/admin" suffix from remote Solr instance admin URL
7 years ago
luccioman 0d34034f17 Ensure an embedded Solr is available for Solr dump/restore operations
7 years ago
luccioman d92b191942 Ensure no remote Solr is attached before "Shut Down and Re-Start Solr"
7 years ago
luccioman 26d8ad591c Adjusted Solr select servlet output when using an external Solr only
7 years ago
luccioman 69690c13a0 Optionally allow external Solr server with self-signed certificate
7 years ago
luccioman b882f85900 Fixed NPE case in Solr select servlet on external Solr only setup
7 years ago
luccioman 2fd4d05e2f Added a shared Java constant for setting key server.servlets.called
7 years ago
luccioman ba9cd14516 Removed hard-coded patch for Solr 5.0 on ranking boost function
7 years ago
luccioman fb3032c530 Added a crawl filtering possibility on documents Media Type (MIME)
7 years ago
luccioman e45afedee4 Added support for enclosures (media links) to the RSS loader
7 years ago
luccioman aaefd5219c Reduce log verbosity of RSS loader on feed items with no link
7 years ago
luccioman cf62b571bd Added RSS reader support for `enclosure` feed item sub element.
7 years ago
luccioman e5f5de0fc7 Added some JavaDoc to the RSSMessage class.
7 years ago
luccioman 0d7625ecfb Handle Solr fields restrict and alias in YaCy html and exml writers
7 years ago
luccioman 3da2739bbd Parse and index more common audio metadata text tag fields.
7 years ago
luccioman 846aba00fa Added parsing of URLs eventually present in audio metadata tags
7 years ago
Michael Peter Christen 187075b878 added nav filter
7 years ago
luccioman bcbd0ae1a4 Enabled partial parsing of audio resources.
7 years ago
luccioman fda0189613 Updated audio file extensions with ones recently added to audioTagParser
7 years ago
luccioman 978e2be95b Let a chance for other parsers on audioTagParser error
7 years ago
luccioman 9e5846a26e Small fix on svg parser error message
7 years ago
luccioman 11611dbdcf Reuse existing File copy function to handle audio parser tmp files
7 years ago
luccioman f77f8f40f9 Factored audio parser tag processing
7 years ago
luccioman 9a7a353d0e Removed some unnecessary intermediate list creation on array copy.
7 years ago
luccioman fb6457f5bc Fixed NPE case when on audio resource parsed with null tag
7 years ago
luccioman c3ff50c17a Updated the list of audio file formats supported by the audioTagParser
7 years ago
luccioman 1b90479a76 Added missing vocabulary navigator increment on results from RWI
7 years ago
luccioman 46c9da6428 Allow creation of vocabularies from remote CSV file URLs.
7 years ago
luccioman 17c7a85f18 Make StreamResponse usable in Java try-with-resources statements
7 years ago
luccioman b67742336e Provide user interface messages on vocabulary creation read/write errors
7 years ago
luccioman 3e8dd90211 Use https rather than http in links and queries to openstreetmap.org
7 years ago
luccioman 3a973dbb23 Removed unused import
7 years ago
luccioman e9527cd0e5 Reuse the same Pattern instance when matching multiple key/values
7 years ago
luccioman dbf4c1cd76 Improved blacklist entries editing operations :
7 years ago
reger 87077b8fb6 Adjust and move Language Navigator to be member of the navigator plugin
7 years ago
luccioman eb20589e29 Fixed issue #158 : completed div CSS class ignore in crawl
7 years ago
luccioman 0cdee4e26a Fixed loss of "meanCount" search param when using facets or page buttons
7 years ago
luccioman 117a859879 Do not clear all search modifiers when unselecting one modifier.
7 years ago
luccioman 33593c22e9 Fixed loss of other modifiers on keywords/tags search navigation links
7 years ago
luccioman a9dc0874c0 Remove old query terms from search results suggestions links.
7 years ago
luccioman 9412881230 Added basic support for autotagging microdata annotated item types.
7 years ago
luccioman 5a14d34a7d Refactoring : documented and extracted autotagging processing functions.
7 years ago
luccioman 58b9834729 Added HTML microdata typed items parsing capability.
7 years ago
luccioman 80fb1026d0 Create recrawl requests with the relevant crawl profile.
7 years ago
luccioman 539925a275 Added an utility to generate/update XLIFF master file from lng files.
7 years ago
luccioman fa6d030b0b Moved dbtest to the test source folder.
7 years ago
luccioman 6cd3847d0a Fixed NullPointerException case on Table init with relative file path.
7 years ago
luccioman 28883d8a71 Shutdown daemon threads at the end of dbtest
7 years ago
luccioman 929e0d6eae Replaced improper ByteBuffer.equals() implementation by Arrays.equals()
7 years ago
luccioman 46b5249c20 Removed time condition on HostBalancer initialization in JUnit test.
7 years ago
luccioman 8b572b7337 Commit Solr index before simulating or starting recrawl job.
7 years ago
luccioman 733cacdbb8 Revised the RDFaParser main launcher for minimal proper operation.
7 years ago
luccioman 7baa99f26f Fixed stored URL in web cache when redirection(s) occurs.
7 years ago
luccioman 9ddf92d143 Removed unnecessary reflection usage for workflow tasks.
7 years ago
luccioman 897d3d30cc Added new recrawl job profile to the list of default crawl profiles
7 years ago
luccioman 9624516bf8 Refresh recrawl job profile threshold date like other default profiles
7 years ago
luccioman b712a0671e Added a specific default crawl profile for the recrawl job.
7 years ago
luccioman adf3fa493d Added comments about crawl profiles recrawl cycles
7 years ago
luccioman 3638e16c2e More comprehensive log on rejected recrawls caused by date constraint
7 years ago
luccioman d47afe6fab Use a constant for crawler reject reason prefix with specific processing
7 years ago
luccioman 4e03335625 Added more details to the recrawl job report
7 years ago
luccioman 6425963cee Fixed internal tables exact value match iterator
7 years ago
luccioman 0c9e0b3566 Record recrawl calls to make them schedulable
7 years ago
luccioman 433e241e4f Added a report info box about eventual last terminated recrawl job
7 years ago
luccioman b2af25b14f Added a stop condition to the Recrawl busy thread
7 years ago
luccioman 421728d25a Made possible to customize selection query before launching a recrawl
7 years ago
luccioman 36e9b1c5b3 Fixed SegmentTest test case time-dependent occasional failures
7 years ago
luccioman 8a4ea1c11e Added UI switch to control content domain constraint per search request
7 years ago
reger f8071ac8ae Make TokenizedStringNavigator (used for keyword search facet) active
7 years ago
luccioman e6907fdab3 Added optional search parameter/setting to control content domain filter
7 years ago
luccioman f52217c939 Enable full size images preview for users with extended search rights
7 years ago
luccioman 09c4ee56a7 Added optional https support for remote crawl and profile operations
7 years ago
luccioman 5db1c9155a Do locale-independent case conversion on hosts, schemes, and file exts.
7 years ago
luccioman 1c4803e40a Enable optional https support for /yacy/transferURL API calls.
7 years ago
luccioman c6e1befbca Restored peer URL host name stripping removed from previous commit.
7 years ago
luccioman 17e004599d Started implementing optional https preference for protocol operations
7 years ago
Michael Peter Christen b907819cb4 Merge branch 'master' of https://github.com/yacy/yacy_search_server.git
7 years ago
Michael Peter Christen 25573bd5ab added a crawl filter based on <div> tag class names
7 years ago
luccioman d95b288f19 Removed use of deprecated Jetty IPAccessHandler for client filtering.
7 years ago
reger cc7a93e6b6 remove deprecated jetty continuation class from urlproxyservlet
7 years ago
Michael Peter Christen 607b39b427 Merge branch 'master' of https://github.com/yacy/yacy_search_server.git
7 years ago
Michael Peter Christen 4355de0f3c (more!) evaluation of XRealIP from nginx reverse proxy
7 years ago
luccioman a4494d6e01 Improved support for internationalized domain names on "site:" modifier
7 years ago
luccioman d07006bac4 Do locale-independent case conversion on "filetype:" query modifier.
7 years ago
luccioman 8fbf25d1ed Made "site:" query modifier case insensitive.
7 years ago
luccioman 867388e05b Refactored 'site:' query modifier parsing into a dedicated function.
7 years ago
luccioman c9d80b5b77 Prefer fine URL match over approximate URL mask regex on final filtering
7 years ago
luccioman 0a120787e3 Improved accuracy of URLs search filters : protocol, tld, host, file ext
7 years ago
luccioman d1c7dfd852 Fixed URL parsing with fragment and empty path
7 years ago
luccioman e07ef1b610 Apply tld query modifier on Solr host_s mandatory field.
7 years ago
luccioman 478e92deff Fixed url mask filter generated when protocol modifier is not null
7 years ago
luccioman 29de4a65d7 Refactored url mask filter build from query modifiers
7 years ago
reger d5a75537e4 remove redundant setting of timeout for remoteinstance
7 years ago
luccioman f01aac31fd Made possible to use https for remote search on peers with SSL enabled.
7 years ago
luccioman e2f6427a63 Added a basic JUnit test for the Visio parser (vsdParser)
7 years ago
luccioman 1e9cdaabd4 Do locale neutral case conversion of HTML charset name.
7 years ago
luccioman 7206f1ed71 Do locale neutral case conversions on domain names.
7 years ago
luccioman 398c66f06c Do locale neutral case conversions in MultiProtocolURL
7 years ago
luccioman 9531b83598 Do locale neutral case conversions in Classification
7 years ago
luccioman d22fc0d0a2 Updated lists of known sponsored and country-code TLDs.
7 years ago
luccioman ac209cac2e Updated the generic top-level known domains list.
7 years ago
luccioman 938d8a9731 Added some JavaDoc
7 years ago
luccioman e0eda84c24 Remove old hard-coded holiday dates from DateDetection class.
7 years ago
luccioman cb10daba92 Renamed Chinese & Greek lng files using ISO639-1 codes.
7 years ago
luccioman 46f37e38dc Customized Threads with generic name for easier monitoring.
7 years ago
luccioman 046be566e1 Updated a license header typo.
7 years ago
Apply55gx 3c905a2a5c fix typo
7 years ago
luccioman 8e732d437c Enable HTTP Digest authentication for non admin users.
7 years ago
luccioman d8eaf621cc Fixed blacklist returned location URL on empty parameters
7 years ago
luccioman af198b990b Added an optional login link/status to the search public top nav bar.
7 years ago
luccioman 1de86cf1bf Fixed JPEG snapshot resizing when running on OpenJDK.
7 years ago
luccioman a17a418e78 Fixed NullPointerException cases on snapshot images parsing.
7 years ago
luccioman 285f0d6a39 Consistently encode snapshot image with format requested on the API.
7 years ago
luccioman 34ca73d61b Fixed a NullPointerException case on images encoding errors.
7 years ago
luccioman 7c319c841e Fixed pdf2image conversion with imagemagick on PDFs having transparency
7 years ago
luccioman 6e497241f7 Properly close resources (even on error) on OS and ThreadDump classes.
7 years ago
luccioman fe75f326d8 Fixed ProfilingGraph calculation integer overflows and added test class.
7 years ago
luccioman 5d1ef8fdfc Merge branch 'master' of https://github.com/otteresk/yacy_search_server
7 years ago
luccioman 8303e15419 Reduced number of search navigators refresh requests in JS resort mode
7 years ago
luccioman dbff7b14fc Add a configurable limit to tags initially displayed in search results
7 years ago
Andreas 0c4db9eef0 Merge pull request #3 from yacy/master
7 years ago
reger c31d94664a Update deprecated SolrInputDocument.addField() with boost value
7 years ago
luccioman 7e271f9cf5 Updated travis config : install ghostscript, required for Html2Image
7 years ago
luccioman 32c9dfa768 Added partial bzip2 stream parsing support and bzipParser Junit test
7 years ago
luccioman dd9cb06d25 Fixed RWI distance calculation on multi words search queries.
7 years ago
luccioman 6b11bf3a12 Fixed NullPointerException case on 'Browser' lang selection
7 years ago
reger ae1c675c85 fix array out of bounds in YJsonResponseWriter and OpensearchResponsWriter
7 years ago
otter 73d1d577fd prevent integer overflow in chartDot for nodes with a big index
7 years ago
otter 4e2ccdfcac prevent integer overflow in chartLine
7 years ago
luccioman 27ab733685 Ensure private search features are not lost on Digest auth timeout
7 years ago
reger ba60f65040 Adjust filetype: query modifier parameter to lower case
7 years ago
luccioman 57a33aefb0 Removed unnecessary max counts init on empty search navigators.
7 years ago
luccioman ef8aea7f8d Made the dates navigator max elements number user configurable.
7 years ago
luccioman 9e86d183b8 Disable manual search results resorting when resorting is done with JS
7 years ago
luccioman 66cb9c4ff9 Added Solr filter queries for audio, video and application domains
7 years ago
luccioman 5d3ceb31b7 Improved search navigators counters accuracy and consistency.
7 years ago
luccioman 8e4f31bdc7 Updated internal ISO 639-1 language codes with latest standards.
7 years ago
luccioman a28428047a Fixed count of filtered results from local solr.
7 years ago
Michael Peter Christen 2f71005a93 Merge branch 'master' of https://github.com/yacy/yacy_search_server.git
7 years ago
Michael Peter Christen 2314f8e358 try to fix problem
7 years ago
luccioman 3c9df6e0ce Use local solr filtered results in total search results count.
7 years ago
luccioman a1a0515312 Added a button to manually refresh sorting of p2p search results.
7 years ago
luccioman 4eba88f2ff Removed some unnecessary uses of java.lang.reflect api.
7 years ago
luccioman da3dbf9ea1 Use Javadoc style comments on SearchEvent properties.
7 years ago
luccioman c6ae87168a Added unit tests on the gzip parser.
7 years ago
luccioman 169ffdd1c7 Finer control on max links to parse in the html parser.
7 years ago
luccioman e41d046a9d Improved parsing support for OOXML spreadsheets (.xlsx)
7 years ago
reger 51a4e03c93 Allow to stop currently running warc import (stop button)
7 years ago
luccioman 6cec2cdcb5 Use unredirected robots.txt URL when adding an entry to the table.
7 years ago
luccioman 3f0446f14b Ensure proper synchronous robots entry retrieval on first check.
7 years ago
luccioman b23a563065 Prevent search result failure on incomplete images information.
7 years ago
Michael Peter Christen 30d71c6359 added usage of X-Real-IP http header
7 years ago
Michael Peter Christen f45378c11c Merge branch 'master' of https://github.com/yacy/yacy_search_server.git
7 years ago
Michael Peter Christen 7f395ef937 added image link in search results
7 years ago
luccioman 780173008e Implemented partial stream parsing of tar archives.
7 years ago
luccioman acab6a6def Also handle text content when parsing XML within limits.
7 years ago
reger 2a07799ad1 Correction of d03e2c98ea
7 years ago
reger d03e2c98ea Fix Conjunction.addOperator to do nothing if term is empty
7 years ago
reger b6a41df4f7 Remove deprecated YaCyProxyServlet
7 years ago
luccioman 8a94fef9e0 Prevent unwanted cached bytes duplication on stream parsing.
7 years ago
reger 4979439e87 Skip public post of jre version.
7 years ago
reger e918ec199e Replace deprecated ConcurrentHashSet with recommended Java8
7 years ago
reger fb71994342 Harmonizing use of xml reader / sax parser in XMLBlacklistImporter
7 years ago
reger 275d65fffe Patch last_modified date with internal FirstSeenTime() if no date provided
7 years ago
reger d1b23afed6 Remove obsolete Protocol parameter ttl (time to live)
7 years ago
reger 15d78b1064 Replace deprecated getIP with getIPs in Protocol transferURL() and
7 years ago
reger ed36b47bec Replace one more deprecated peerDeparture in Protocol.transferIndex()
7 years ago
luccioman 0ee8c030c4 Log an error when Solr folder migration fails for some reason.
7 years ago
luccioman 5a646540cc Support parsing gzip files from servers with redundant headers.
7 years ago
luccioman 11a7f923d4 Distinguish response parsing failures from unexpected exceptions.
7 years ago
luccioman eda7b0aeb6 Merge branch 'master' of https://github.com/yacy/yacy_search_server
7 years ago
reger 3005be7349 Clean up unmaintained and unused AugmentParser trail.
7 years ago
luccioman cb4f1358e1 Added gzip parser support for max content bytes limit
7 years ago
luccioman 5216c681a9 Added HTML parser support for maximum content bytes parsing limit
7 years ago
luccioman 4aafebc014 Merge pull request #122 from Scarfmonster/patch-1
7 years ago
luccioman 651fad6da5 Added RSS parser support for maximum content bytes parsing limit
7 years ago
luccioman 452a17a8d5 Finer control on bounded input streams with custom stream implementation
7 years ago
luccioman f8f1959ebb Added parsing within bounds implementation to the generic parser.
7 years ago
luccioman e0f400a0bd Support trying multiple parsers even when streaming on large resources.
7 years ago
luccioman 1e84956721 Support loading local files with a per request specified maximum size.
7 years ago
luccioman f369679d1c Fixed read/copy on input streams reading sometimes less than expected.
7 years ago
luccioman bf55f1d6e5 Started support of partial parsing on large streamed resources.
7 years ago
luccioman 90a7c1affa HTML parser : removed unnecessary remaining recursive processing
7 years ago
reger e6e20dab52 upd to Jetty 9.4.6.v20170531
7 years ago
luccioman dcc56318bb Made remote search max system load limits configurable from UI.
7 years ago
reger ddd13b776d Add keyword constraint to rwi query result filter
7 years ago
luccioman e82eaee4b6 Apply consistent behavior on HTTP resource size exceeding limit.
7 years ago
luccioman 0b75e92ac2 Do not wrap unnecessarily loader IOExceptions in IOExceptions
7 years ago
luccioman 433bdb7c0d Respect maxFileSize limit also when streaming HTTP and when relevant.
7 years ago
luccioman 9b1bb2545e Refactored plain-text URLs detection implementation.
7 years ago
luccioman 8da3174867 Ensure lower case conversion consistency with any default locale.
7 years ago