Michael Peter Christen
ff8fe7b6a4
fix for ',' or '.' appearing within a word or number. This will not
tokenize the query into parts around that character to make it possible
to search for numbers or version numbers.
1 year ago
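The rule described in this commit can be illustrated with a minimal sketch. It is not the actual YaCy tokenizer: the class and method names are hypothetical, and the rule is assumed to be "keep ',' or '.' inside a token when a letter or digit stands on both sides".

```java
import java.util.ArrayList;
import java.util.List;

final class InnerPunctuationTokenizer {

    // Hypothetical helper, not taken from the YaCy sources.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            // ',' or '.' counts as part of the token only when it is
            // surrounded by letters or digits on both sides.
            boolean inner = (c == ',' || c == '.')
                    && i > 0 && i + 1 < text.length()
                    && Character.isLetterOrDigit(text.charAt(i - 1))
                    && Character.isLetterOrDigit(text.charAt(i + 1));
            if (Character.isLetterOrDigit(c) || inner) {
                current.append(c);
            } else if (current.length() > 0) {
                // Any other character ends the current token.
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }
}
```

With this sketch, tokenize("solr 8.11.2, released.") yields the tokens solr, 8.11.2 and released: the version number stays in one piece, while the trailing comma and full stop still act as separators.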
Michael Peter Christen
0689f4f0ae
Check if the character is a minus sign and is followed by a letter or a
digit. Treat it as part of the word/number.
1 year ago
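The check from this commit message can be written as a small predicate. This is only a sketch of the stated rule with a hypothetical class and method name, not the code from the commit.

```java
final class MinusSignRule {

    // Hypothetical predicate: true when the character at index i is a minus
    // sign that is directly followed by a letter or a digit, in which case
    // it would be treated as part of the word or number.
    static boolean minusBelongsToToken(String text, int i) {
        return text.charAt(i) == '-'
                && i + 1 < text.length()
                && Character.isLetterOrDigit(text.charAt(i + 1));
    }
}
```

Under this rule the minus in "-5" and in "e-mail" belongs to the token, while the free-standing minus in "a - b" does not.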
Michael Peter Christen
5db97a8928
parser can now separate numbers from words even when they are not
separated by a space, e.g. 4.7Ohm
1 year ago
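One way to get the behaviour described here is a regular expression that matches either a number or a run of letters. The sketch below is an assumption about how such a split could look, with hypothetical names; the YaCy parser may do this differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class NumberWordSplitter {

    // Either a number (digits with optional '.' or ',' groups) or a run of letters.
    private static final Pattern TOKEN =
            Pattern.compile("\\d+(?:[.,]\\d+)*|\\p{L}+");

    // Hypothetical helper, not taken from the YaCy sources.
    static List<String> split(String text) {
        List<String> parts = new ArrayList<>();
        Matcher m = TOKEN.matcher(text);
        while (m.find()) {
            parts.add(m.group());
        }
        return parts;
    }
}
```

For the input 4.7Ohm from the commit message, split returns the two parts 4.7 and Ohm.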
Michael Peter Christen
e3797de7de
enhanced the word tokenizer to recognize numbers in a proper way
1 year ago
Michael Peter Christen
88cd17ea57
migrated solr from 8.9.0 to 8.11.2; activated also migration script. A YaCy index with solr 8.9.0 will automatically be migrated to 8.11.2. This is a preparation step to migrate to 9.0.0 soon.
1 year ago
Michael Peter Christen
0089f234f4
added npe protection
1 year ago
Michael Peter Christen
8285fe715a
tab to spaces for classes supporting the condenser.
This is a preparation step to make changes in condenser and parser more
visible; no functional changes so far.
1 year ago
Michael Peter Christen
195bd2e444
extended the maximum header size to 16k to prevent http error 431
1 year ago
Michael Peter Christen
92dad3ed49
removed 7Zip parser because the old library could not be replaced by a maven repository
1 year ago
Michael Peter Christen
5afcba162b
updated libraries
1 year ago
Michael Christen
a348146d8f
setting connect host to 0.0.0.0
1 year ago
Michael Peter Christen
1c0f50985c
fixed documentation and some details of handling of keywords
2 years ago
Michael Christen
3472bcb4d3
patched a 'java.lang.NoSuchMethodError: com.twelvemonkeys.imageio.util.IIOUtil.lookupProviderByName' problem which occurred only on ARM
2 years ago
Michael Christen
f7b6e98ed7
Merge pull request #562 from thkoch2001/fix-warnings
Fix warnings
2 years ago
Michael Peter Christen
a157d01bb5
increased network image size limit for linuxtage poster
2 years ago
Thomas Koch
6bca836f49
fix 3 javac warnings: redundant cast
see GitHub issue #561 for context
[javac] /home/thk/git/yacy_search_server/source/net/yacy/htroot/ConfigAccounts_p.java:85: warning: [cast] redundant cast to YaCyHttpServer
[javac] final YaCyHttpServer jhttpserver = (YaCyHttpServer)sb.getHttpServer();
[javac] ^
[javac] /home/thk/git/yacy_search_server/source/net/yacy/htroot/ConfigUser_p.java:156: warning: [cast] redundant cast to YaCyHttpServer
[javac] final YaCyHttpServer jhttpserver = (YaCyHttpServer) sb.getHttpServer();
[javac] ^
[javac] /home/thk/git/yacy_search_server/source/net/yacy/htroot/ConfigUser_p.java:167: warning: [cast] redundant cast to YaCyHttpServer
[javac] final YaCyHttpServer jhttpserver = (YaCyHttpServer) sb.getHttpServer();
2 years ago
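The warning arises when an expression is cast to the type it already has. The sketch below reproduces the pattern with hypothetical stand-in classes; it assumes, as the javac message implies, that getHttpServer() is declared to return YaCyHttpServer. The port value is arbitrary.

```java
final class RedundantCastDemo {

    // Hypothetical stand-ins for the YaCy types named in the warnings.
    static final class YaCyHttpServer {
        int port() { return 8090; } // arbitrary value for the example
    }

    static final class Switchboard {
        private final YaCyHttpServer server = new YaCyHttpServer();

        // The declared return type is already YaCyHttpServer.
        YaCyHttpServer getHttpServer() { return server; }
    }

    static int portOf(Switchboard sb) {
        // Before: final YaCyHttpServer jhttpserver = (YaCyHttpServer) sb.getHttpServer();
        // After: the cast is dropped because it changes nothing.
        final YaCyHttpServer jhttpserver = sb.getHttpServer();
        return jhttpserver.port();
    }
}
```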
Michael Christen
9012fe4519
extended error message
2 years ago
Michael Christen
74104ff2d3
fix to timeout
2 years ago
Michael Peter Christen
9fcd8f1bda
added canonical filter
attention: this is on by default!
(it should do the right thing)
2 years ago
Michael Peter Christen
5a52b01c09
front-end integration of tag valency
2 years ago
Michael Peter Christen
7f728bb4b4
crawl profile storage extension for tag valency
2 years ago
Michael Christen
4304e07e6f
crawl profile adaptation to new tag valency attribute
2 years ago
Michael Peter Christen
5acd98f4da
introduction of tag-to-indexing relation TagValency
2 years ago
Michael Peter Christen
ab3ef87abf
fixed exec start command where a path contains spaces
2 years ago
Michael Peter Christen
17eec667fb
better release number representation
2 years ago
Michael Peter Christen
b1199e97f8
enabling new update location release.yacy.net
with new version numbers
2 years ago
Michael Peter Christen
66169d1aad
default build properties to remove a barrier to developing in IDE
environments
2 years ago
Michael Peter Christen
309adb814e
fixed jsonlist import from searchlab.eu using a direct URL
2 years ago
Michael Peter Christen
5ddc794bb9
code cleanup in http client
2 years ago
Michael Peter Christen
62d177bf59
stub for jsonlist index importer web page
2 years ago
Michael Peter Christen
efa0425f00
refactoring: moved jsonlist importer to importer class
2 years ago
Michael Peter Christen
49daa32a88
yacy can now read searchlab export dump files
using the surrogate input process:
- copy the searchlab export file to DATA/SURROGATE/in
- the file is processed automatically and then moved to
DATA/SURROGATE/OUT
2 years ago
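The first step of the procedure above, copying the export file into the input directory, could be scripted as in the following sketch. The class and method names are hypothetical, the YaCy home directory is a parameter the caller has to supply, and the directory names are taken literally from the commit message; they should be checked against the actual installation.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

final class SurrogateDrop {

    // Hypothetical helper: copies an export file into <yacyHome>/DATA/SURROGATE/in.
    // The relative path is an assumption based on the commit message.
    static Path dropIntoSurrogateIn(Path exportFile, Path yacyHome) throws IOException {
        Path inDir = yacyHome.resolve("DATA").resolve("SURROGATE").resolve("in");
        Files.createDirectories(inDir); // no-op if the directory already exists
        Path target = inDir.resolve(exportFile.getFileName());
        return Files.copy(exportFile, target, StandardCopyOption.REPLACE_EXISTING);
    }
}
```

The sketch only places the file; according to the commit message, processing the file and moving it afterwards is done by YaCy itself.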
Michael Peter Christen
6042dd99c6
reduced danger that Tray does not initialize
2 years ago
Michael Christen
61b27217b9
throttle number of DNS requests:
as soon as the number of requests is > 50, there is a forced delay
of (10 * (requests - 50)) milliseconds. That means that once the number
of DNS requests reaches 150, there is a one second delay for each request.
This should prevent a remote DNS server from being flooded with requests
and possibly damaged.
This is also a fix/enhancement for
https://github.com/yacy/yacy_search_server/issues/513
2 years ago
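The delay formula from this commit message is easy to check with a few lines of code. The sketch below only computes the delay; the class, constant and method names are hypothetical, and how YaCy counts requests or applies the delay is not shown.

```java
final class DnsThrottle {

    // Values taken from the commit message.
    static final int THRESHOLD = 50;
    static final long MILLIS_PER_EXCESS_REQUEST = 10L;

    // Forced delay in milliseconds for the given number of DNS requests:
    // 0 up to the threshold, then 10 ms for every request above it.
    static long delayMillis(int requests) {
        return requests > THRESHOLD
                ? MILLIS_PER_EXCESS_REQUEST * (requests - THRESHOLD)
                : 0L;
    }
}
```

delayMillis(50) is 0, delayMillis(51) is 10, and delayMillis(150) is 1000, which matches the one second delay stated in the message.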
Michael Christen
99174282d8
try to shut down in a bit more ordered way
inspired by https://github.com/yacy/yacy_search_server/issues/518
2 years ago
Michael Peter Christen
482f507e65
upgraded solr from 8.8.1 to 8.9.0
should hopefully fix
https://github.com/yacy/yacy_search_server/issues/496
because it includes https://issues.apache.org/jira/browse/SOLR-13034
2 years ago
Michael Peter Christen
d49f937b98
added iso,apk,dmg to extension-deny list
see also https://github.com/yacy/yacy_search_server/issues/510
zip is not on the list because it can be parsed
2 years ago
Michael Peter Christen
761dbdf06d
increases log history length to 10000
implements https://github.com/yacy/yacy_search_server/issues/512
2 years ago
Michael Peter Christen
0970a79bbf
attempt to fix https://github.com/yacy/yacy_search_server/issues/517
2 years ago
Michael Peter Christen
1893661ee4
removed/suppressed more warnings
2 years ago
Michael Christen
51cf17d252
removed warnings
2 years ago
Michael Christen
867f96a32b
removed warnings
2 years ago
Michael Christen
8a06beaf24
removed finalize() methods, deprecated
2 years ago
Michael Peter Christen
60c9986a0e
new release file names with date and git hash
...without reference to 9000ish SVN
2 years ago
Michael Christen
8b37a5dc6f
removed log4j properties because we don't have a log4j any more
2 years ago
Michael Christen
347b676b76
changed system to load build properties
2 years ago
Michael Christen
c36bdbf78d
refactoring
2 years ago
Michael Peter Christen
1e1107c97c
clean-up and new servlet method caching
2 years ago
Michael Peter Christen
adbda4c71b
moved all remaining servlet classes to new location
2 years ago
Michael Peter Christen
33889b4501
moved more servlets to new location
2 years ago
Michael Peter Christen
6d388bb7bf
refactoring - moved htroot/yacy classes
2 years ago
Michael Peter Christen
48fcf3b3b5
alternative servlet method, tested with wiki
may become the future method to store servlets
2 years ago
Michael Peter Christen
d23dea2642
refactoring
2 years ago
Michael Peter Christen
23f1dc3741
addressing/fixing some concurrency issues from
https://github.com/yacy/yacy_search_server/issues/505
2 years ago
Michael Peter Christen
9c1bc533fa
removed hazelcast because it is phoning home, see also:
https://github.com/yacy/yacy_search_server/issues/504
2 years ago
Michael Peter Christen
fc98ca7a9c
removed ContentControl servlet and functionality
This was not used at all (as far as I know) and was blocking a smooth
integration of ivy in the context of an existing JSON parser.
2 years ago
Thomas Koch
3116713672
rm buildDate from build.xml and its usages
The https://reproducible-builds.org project invests a lot of work
to make builds reproducible. This is a security property. It allows
to compare the build of binaries from different builder machines.
If they are identical, it means that either the builds have not
been manipulated or an attacker managed to attack all builder
machines in exactly the same way.
One problem that the reproducible-builds project often sees is
that projects include the build time in their binaries. This
makes builds unreproducible for apparently no reason. The build
date should not be of interest since binaries built on different
dates but from the same source code should not be different.
Thus I decided to remove the build date instead of re-implementing
the functionality without the GitRev task. Anyways the reported
date was not the build date but the date of the last git commit
which is even less informative. The git commit ID would have
information value but should only be relevant for "nightly builds".
2 years ago
Thomas Koch
572558244a
rm unused build properties PKGMANAGER, RESTARTCMD, DESTDIR
PKGMANAGER is always false, thus the java code wrapped in
if statements for this property is dead code and can also
be removed.
The Debian packaging removed in c4659f0fb0
did set the PKGMANAGER property to true. When we do distro
packages again, we can revisit this commit and redo it with
property files instead.
RESTARTCMD is only used inside that dead code.
DESTDIR is never used even in the build.xml
2 years ago
Michael Peter Christen
3d138d3fdd
catch error when initializing hazelcast
should fix https://github.com/yacy/yacy_search_server/issues/468
2 years ago
Burkhard
a6a9828181
Merge pull request #440 from lfuelling/master
Add setting for public facing port
3 years ago
reger24
141e86964e
Fix compile deprecation warning
warning: [removal] AccessControlException in java.security has been deprecated and marked for removal
3 years ago
reger24
a7e93d9328
Add option to add host to default blacklist from search result
- added authorized icon/button to blacklist a host
- host is added to default blacklist
- inspired by https://github.com/yacy/yacy_search_server/issues/213#issuecomment-412485190
3 years ago
reger24
027e284ef9
Enhance visibility of the current blacklist with a different color in the header
in servlet Blacklist_p.html
bugfix for 18dddb74c9
3 years ago
reger24
18dddb74c9
Harmonize loading/reading blacklist
between init and servlet to use the same procedures
-added BlacklistHelper.blacklistToSortedArray to simplify use in servlet
3 years ago
reger24
f28d705cd0
update IndexBrowser_p add to blacklist button
add feedback to user on success
3 years ago
Michael Peter Christen
52fe2ed8ba
Merge branch 'master' of git@github.com:yacy/yacy_search_server.git
3 years ago
Michael Peter Christen
39e7bbac13
removed deprecation warning for new Double()
3 years ago
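The Double constructors have been deprecated since Java 9; the usual replacement is the static factory Double.valueOf (or Double.parseDouble when a primitive is wanted). The sketch below shows the pattern with a hypothetical helper; it is not the code changed in this commit.

```java
final class BoxedDoubleDemo {

    // Hypothetical helper that boxes a parsed value without the deprecated constructor.
    static Double box(String s) {
        // Before (deprecated since Java 9): return new Double(s);
        return Double.valueOf(s);
    }
}
```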
reger24
6a5f0b3684
Servlet IndexBrowser_p add button "Add to blacklist"
allows adding the displayed host to the default blacklist
3 years ago
Lukas Fülling
111cf48642
add missing prop
3 years ago
reger24
f33e0ed7fd
revert commit 17fd1a4616
wrong file selected
3 years ago
unknown
17fd1a4616
delete .idea not needed in distribution
.idea is created locally by IntelliJ IDEA upon import as gradle project to store IDEA specific settings.
No need to include in distribution
3 years ago
Daleth Darko
3ced06c731
Various javadoc fixes
3 years ago
reger24
6a1e259fd0
Fix NPE in Switchboard.getURL https://github.com/yacy/yacy_search_server/issues/441
3 years ago
reger24
eae16287e9
Added epub (ebook) format to existing zipParser
*.epub files are zip files containing xhtml files with content and other artifact files,
which the zipParser can already feed to index
- extension "epub"
- mime "epub+zip"
3 years ago
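Because an epub file is a zip archive, its content files can be enumerated with the standard java.util.zip classes. The following sketch lists the xhtml entries of such an archive; it is an illustration with hypothetical names and does not show how YaCy's zipParser feeds the entries to the index.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

final class EpubEntryLister {

    // Hypothetical helper: returns the names of all .xhtml entries in the archive.
    static List<String> xhtmlEntries(InputStream epub) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(epub)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                // Skip directories and other artifact files (images, metadata, ...).
                if (!entry.isDirectory() && entry.getName().endsWith(".xhtml")) {
                    names.add(entry.getName());
                }
            }
        }
        return names;
    }
}
```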
reger24
3e34f7c596
Import Ant build.xml into Gradle and use old compile of servlets in Gradle
to be able to use/reuse Ant targets where task has not been implemented in Gradle build.
- use the import to include the compile of htroot as first important task
! it is possible that the first build fails on the compile of GitRevTask.jar !
! solution/workaround -> use "ant all" once to compile GitRevTask.jar !
- adjusted build.xml a little
- split compile-core into compile-core and compile-htroot to have a target for htroot comp. only
- set build-path to reuse Gradles build directory
- (fix javadoc failure)
- changed the filtered-copy of yacyBuildProperties.java to ! the build path :-(
as current (copy,delete,exclude) is complicated and not migration worthy,
used simple/straightforward approach (using a yacyBuildProperties.java.template file as copy source)
3 years ago
reger24
398b105781
Prevent YaCy from always starting with an exception message on non-Apple systems
Try to access com.apple.eio.FileManager only on non-Windows systems
3 years ago
Lukas Fülling
e8a00007f6
add setting for public facing port
3 years ago
Michael Peter Christen
d7b17d8935
fixed missing thread name revert after balancer waiting
3 years ago
Michael Peter Christen
bd3f2483a1
replaced url and date retrieval by only url retrieval
This should prevent the search index from being used to check the freshness
of the index entry.
3 years ago
Michael Peter Christen
163ba26d90
replaced check for load time method
instead of loading the solr document, an index only for the last loading
time was created. This prevents that solr has to fetch from its index
while the index is created. Excessive re-loading of documents while
indexing has shown to produce deadlocks, so this should now be
prevented.
3 years ago
Michael Peter Christen
1ead7b85b5
remove compiler warning
"warning: [try] explicit call to close() on an auto-closeable resource"
3 years ago
Michael Peter Christen
59777010dc
Merge branch 'master' of git@github.com:yacy/yacy_search_server.git
3 years ago
Michael Peter Christen
7898815c41
disabling concurrent logging
(maybe temporary)
3 years ago
sgaebel
4bf6954474
uses clientBuilder not HttpClients.custom() to have these inside the
Pool too
3 years ago
sgaebel
cdf901270c
always use HTTPClient by 'try with resources' pattern to free up
resources
3 years ago
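The 'try with resources' pattern mentioned here closes the client automatically when the block is left, even if an exception is thrown. The sketch below uses a hypothetical stand-in class instead of YaCy's HTTPClient and performs no network access; it only demonstrates the closing behaviour.

```java
final class ClosableClientDemo {

    // Hypothetical stand-in for a closable HTTP client.
    static final class Client implements AutoCloseable {
        private boolean closed = false;

        String get(String url) {
            if (closed) {
                throw new IllegalStateException("client is closed");
            }
            return "response for " + url; // placeholder, no real request is made
        }

        boolean isClosed() { return closed; }

        @Override
        public void close() { closed = true; }
    }

    // Kept only so that the example can be inspected after fetch() returns.
    static Client lastClient;

    static String fetch(String url) {
        // close() is called automatically at the end of the try block.
        try (Client client = new Client()) {
            lastClient = client;
            return client.get(url);
        }
    }
}
```

After fetch returns, lastClient.isClosed() is true, so the resources held by the client are released without an explicit call to close().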
sgaebel
69adaa9f55
makes our HTTPClient closable
3 years ago
sgaebel
fc4275f901
handle all references for client, response, request to be able to close
them
3 years ago
sgaebel
e7d3a363f2
refactor to use finish()
3 years ago
sgaebel
4fc876f4a3
revert back to use EntityUtils.consumeQuietly - as it simply closes the
underlying stream
3 years ago
sgaebel
4f0392e93e
refactor use of AuthSchemeProvider
3 years ago
sgaebel
b74f337859
removes double setting of UserAgent
3 years ago
sgaebel
965748fefb
some refactoring using try with resources
3 years ago
Michael Peter Christen
552ab7051b
fix for warc importer
3 years ago
Michael Peter Christen
3c86b7b780
attempt to make a Mac Release using gradle
This is almost working with many workarounds:
- run rm lib/yacycore.jar
- run ./gradlew clean build bundleNative
- run ant clean all
- run again rm lib/yacycore.jar
- run ./fixMacBuild.sh
The build is then inside build/mac/YaCy.app
Right now this works so far but it does not have the correct release
number inside.
The goal is to make this work for Windows releases and to embed the JRE
entirely.
3 years ago
Michael Peter Christen
999c819e3e
Merge branch 'master' of https://github.com/yacy/yacy_search_server.git
3 years ago
Michael Peter Christen
fd770e90e2
spike to identify paths for YaCy within mac application bundles
3 years ago
Michael Peter Christen
d19872fd26
making sure that crawl queues are closed correctly to prevent data loss
3 years ago
sgaebel
90507c0fdc
comments out printing query params to std.out
3 years ago
Michael Peter Christen
be0aebad84
fixes https://github.com/yacy/yacy_search_server/issues/424
3 years ago
Michael Peter Christen
63ad8ce6b2
removed ymarks
had not been used for a long time
3 years ago