luccioman
fcf6b16db4
Added new crawler attribute for finer control over Media Type detection
...
New "Media Type detection" section in the advanced crawl start page
allows choosing between:
- not loading URLs with an unknown or unsupported file extension, without
checking the actual Media Type (relying on the Content-Type header for now).
This was the old default behavior: faster, but not really accurate.
- always cross-checking the URL file extension against the actual Media Type.
This allows proper parsing of URLs ending with an apparently odd file
extension, but which actually have a supported Media Type such as
text/html.
Sample URLs with misleading file extensions added as documentation in
the crawl start page.
fixes issue #244
6 years ago
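The two modes described in the commit above can be illustrated with a minimal sketch. This is not the YaCy implementation: the class name, the method names, and the two sets of supported extensions and media types are hypothetical, and treating a URL without any extension as "defer to the Content-Type header" is an assumption made for this example only.

```java
import java.util.Locale;
import java.util.Set;

public class MediaTypeCheck {

    // Hypothetical lists, for illustration only (not YaCy's parser registry).
    static final Set<String> SUPPORTED_EXTENSIONS = Set.of("html", "htm", "txt", "pdf");
    static final Set<String> SUPPORTED_MEDIA_TYPES =
            Set.of("text/html", "text/plain", "application/pdf");

    // Returns the lower-cased extension of the last path segment, or "" if none.
    static String extensionOf(String urlPath) {
        int slash = urlPath.lastIndexOf('/');
        int dot = urlPath.lastIndexOf('.');
        if (dot <= slash || dot == urlPath.length() - 1) {
            return "";
        }
        return urlPath.substring(dot + 1).toLowerCase(Locale.ROOT);
    }

    // Strips parameters such as "; charset=UTF-8" from a Content-Type value.
    static String baseMediaType(String contentType) {
        if (contentType == null) {
            return "";
        }
        int semi = contentType.indexOf(';');
        String base = semi >= 0 ? contentType.substring(0, semi) : contentType;
        return base.trim().toLowerCase(Locale.ROOT);
    }

    // alwaysCrossCheck == false: an unsupported extension is rejected
    // without looking at the media type (the old default behavior).
    // alwaysCrossCheck == true: the media type always decides.
    static boolean shouldParse(String urlPath, String contentType, boolean alwaysCrossCheck) {
        String ext = extensionOf(urlPath);
        if (!alwaysCrossCheck && !ext.isEmpty() && !SUPPORTED_EXTENSIONS.contains(ext)) {
            return false;
        }
        return SUPPORTED_MEDIA_TYPES.contains(baseMediaType(contentType));
    }
}
```

With these assumed lists, a path such as `/download.exe` served as `text/html; charset=UTF-8` is skipped in the first mode and parsed in the second.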
luccioman
88d0ed676c
Render http status instead of null responses on snapshot api errors
6 years ago
luccioman
a83a56473e
Added support for PDF snapshots generation when running on MS Windows
6 years ago
luccioman
18d07538ad
Upgraded Apache Ant from 1.10.1 to 1.10.5 in Docker alpine image flavor
6 years ago
luccioman
053df1f312
Added support for snapshots generation to Docker images
6 years ago
luccioman
92e10d7d1c
Added a crawl start hint message on whether wkhtmltopdf is available
...
As this tool is required to produce PDF snapshots
6 years ago
luccioman
8852c97cee
Added basic styling for cleaner rendering of missing image snapshots
...
For the output of the Solr snapshots writer
6 years ago
luccioman
746e0e788d
Render a relevant HTTP status code on snapshot image rendering error
...
Instead of a null response body which is not very helpful.
6 years ago
luccioman
50b6edfcf5
Updated Solr snapshots writer for a cleaner html head
6 years ago
luccioman
f366f43d6b
Made snapshots size customizable in Solr snapshots response writer
6 years ago
luccioman
7a62fc0e66
Fixed concurrency issue in custom classloader used for template classes
...
As reported in issue #241, the problem is only critical (random but
complete crash of the JVM) when upgrading to JDK11.
6 years ago
luccioman
0eb52f8c72
Added documentation hint about JVM option useful to debug JVM crashes
6 years ago
luccioman
753bda1409
Fixed remaining improper decoding of the '+' character in blacklist entries
...
In the blacklist cleaner and import/export administration pages.
6 years ago
luccioman
61c337f29a
Decode blacklist entries for easier editing of non-ASCII chars
...
Not using the JDK URLDecoder.decode() function, as it strips '+'
characters when they occur after '?' (both characters having regular
expression semantics when used in blacklist path patterns)
6 years ago
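As background for the commit above: the JDK `URLDecoder.decode()` follows form-encoding rules and turns every '+' into a space, which is unwanted when '+' is a regular expression quantifier in a path pattern. One possible workaround is sketched below. It is not necessarily what YaCy does; the class and method names are hypothetical, and the `decode(String, Charset)` overload assumes Java 10 or later.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;

public class BlacklistDecode {

    // Plain JDK decoding, shown for comparison: '+' becomes a space.
    static String plainDecode(String encoded) {
        return URLDecoder.decode(encoded, StandardCharsets.UTF_8);
    }

    // Percent-decodes the input while keeping literal '+' characters.
    // Each '+' is first rewritten to its percent-encoded form "%2B",
    // so the decoder restores it as '+' instead of a space.
    static String decodePreservingPlus(String encoded) {
        String protectedPlus = encoded.replace("+", "%2B");
        return URLDecoder.decode(protectedPlus, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String pattern = "caf%C3%A9.+";
        System.out.println(plainDecode(pattern));
        System.out.println(decodePreservingPlus(pattern));
    }
}
```

Note that `URLDecoder.decode()` throws an `IllegalArgumentException` on a '%' that is not followed by two hexadecimal digits, so a real pattern decoder would need to decide how to treat such input.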
luccioman
ed93221fa1
Improved normalization of blacklist path patterns having non-ASCII chars
...
Normalize blacklist path patterns using percent-encoding, both when editing
patterns in the web interface and when loading them from configuration files.
Fixes issue #237
6 years ago
luccioman
d42f079c2d
Additional modifications for typo fix in Bookmarks.html from PR #240
6 years ago
luccioman
d23578efc3
Merge pull request #240 from ivanhercaz/fixEnglishBookmarksPage
...
Fix English Bookmarks.html
6 years ago
ivanhercaz
8a8208c7e2
typo fix
6 years ago
ivanhercaz
a651358cce
cleaning the file of entries in German already translated to Spanish
6 years ago
ivanhercaz
dc09f240e7
changing all «» to "" to avoid confusion
6 years ago
ivanhercaz
07dae68ab0
ConfigHeuristics_p.html translated
6 years ago
ivanhercaz
102c1cc4a9
ConfigHTCache_p.html translated
6 years ago
ivanhercaz
41684ba559
adding Spanish to the interface language list
6 years ago
ivanhercaz
1714805092
ConfigAccounts_p.html translated
6 years ago
ivanhercaz
91ac9c652a
Collage.html translated
6 years ago
ivanhercaz
2d393e8f07
Bookmarks.html translated
6 years ago
ivanhercaz
1dafc85d33
typo fix in Bookmarks.html
6 years ago
ivanhercaz
275cff0cb7
removing duplicated entry (the one in German) for Translator_p.html
6 years ago
ivanhercaz
39fb80e84a
BlogComments.html translated
6 years ago
ivanhercaz
7f5121a0ec
Translator_p.html translated
6 years ago
ivanhercaz
59ea245e8b
Blog.html translated
6 years ago
ivanhercaz
d221ddcbc8
Blacklist_p.html translated
6 years ago
ivanhercaz
843f0bb48f
BlacklistTest_p.html translated and forgotten string in BlacklistImpExp_p.html
6 years ago
ivanhercaz
729a09d45d
BlacklistImpExp_p.html translated
6 years ago
ivanhercaz
c45324f086
BlacklistCleaner_p.html translated
6 years ago
ivanhercaz
1be4c84ed7
Autocrawl_p.html translated
6 years ago
ivanhercaz
c0f7aa92e4
AccessTracker_p.html translated
6 years ago
ivanhercaz
7aa7ba689c
AccessGrid_p.html translated
6 years ago
luccioman
3d14fb51c5
Removed now unused Java import in addition to modification from PR #239
6 years ago
luccioman
d5ec706604
Merge pull request #239 from otteresk/master
...
Display correct time in Rejected URLs overview
6 years ago
otter
8820d8d7c7
replace current date by FailDate
6 years ago
Andreas
3c65a158e1
Merge pull request #4 from yacy/master
...
Fork update #4
6 years ago
ivanhercaz
4f37c9f0ba
starting the Spanish translation
6 years ago
luccioman
6c3e140083
Upgraded Solr and Lucene dependencies from 6.6.3 to 6.6.5
6 years ago
luccioman
982179a7eb
Upgraded BouncyCastle dependencies from jdk15:1.46 to jdk15on:1.60
6 years ago
luccioman
c409ec089c
Hide password values from visible HTML in the Advanced Config page
...
Fixes issue #228
6 years ago
luccioman
75b9cd53cc
Use accessible labels in the Server Access Settings page
6 years ago
luccioman
4ed055bcdf
Enforced access controls to System settings pages
6 years ago
luccioman
de6820d257
Updated html input field type for seed upload with file method
...
- To meet current browser security rules, which prevent selecting a
full file path with an html input field of type 'file'
- As it does not make sense to select a local file path when the
administered YaCy server is remote (not on the same computer as the
browser)
6 years ago
luccioman
2a73b63d9e
Use a constant default target file name for seed SCP upload method
...
To make seed upload (in the /Settings_p.html?page=seed page) with SCP easier
when the user specifies a remote target directory path.
See report by @vikulin in issue #227
6 years ago