Michael Peter Christen
910a496c9f
replaced http links with https
4 months ago
luccioman
93ea366778
Updated license header file name
8 years ago
luccioman
7717a3d43d
Fixed license headers on files created to improve favicon management.
8 years ago
luc
ef83e34b8a
Merge branch 'master' of https://github.com/yacy/yacy_search_server
9 years ago
reger
84c970eaec
move test classes to test/java (subdirectory as in Maven standard subdir layout)
...
because ViewImage*Test.java breaks test run
9 years ago
luc
cfdbc2b487
Improved URLLicence reliability for use by concurrent non-authorized
...
users.
Removed URLLicence generation when unnecessary (authorized users)
9 years ago
luc
571bc55937
Refactoring : use StandardCharsets constants instead of hard-coded
...
charset names.
9 years ago
reger
1af0e9ef74
remove workaround for Solr bug regarding multivalued date fields
...
fixed in 5.4.0
http://issues.apache.org/jira/browse/SOLR-8050
9 years ago
reger
4d2b934487
prevent mailto links getting into parser result document's in/outbound link collection
...
by checking mailto scheme early.
- fix upper case mailto protocol assignment
- add test case for getProtocol
9 years ago
reger
288acceac3
fix test htmlParserTest, charset parameter
...
+ upd maven templating-plugin version
9 years ago
reger
d223cf0ae4
adjust MediaWiki importer geo coordinate calculation
...
- allow lat/long 0.xxx
- south / west assignment
include test class
9 years ago
reger
bad34804fe
optimize parseInt for <img> tag attribute parsing
...
Performs better than using Numberformat.parse or parseInt(substring())
9 years ago
reger
d2cc11ea8f
fix html parser taking <style> content as text.
...
Noticed some result description contain css content from style tag.
Added <style> to tag list to scrape its content not as text
+ test case included
9 years ago
reger
e594130aec
add test case for partial update - to discover effect on YaCy for update of documents with multivalued date fields (like dates_in_content_dts)
...
current result: loss of fields/information in index document, see EmbeddedSolrConnectorTest.testUdate_withMultivaluedDateField()
9 years ago
reger
d5da9e5a38
fix test method (add throw for URIMetadataNode)
9 years ago
reger
4cf875336c
complete TODO: getFileExtension handle dot in query part
...
+ testcase
9 years ago
reger
c37dda8849
fix NPE on MultiProtocolURL on url with parameter value and '='
...
in getAttribute
- added test case for it
10 years ago
reger
71bf95af8a
upd parser calls in test cases
10 years ago
reger
f63fff9008
fix snippet containing number with comma as decimal point http://mantis.tokeek.de/view.php?id=344
...
to keep it as one word (by altering the split regex)
- added snippet test case with number
- regex for word split to match multiple split chars
10 years ago
reger
2ef8ffdb60
apply UTF-8 encoding
...
copied from escape()
10 years ago
reger
7120ea42f1
fix for path with char code > 255
...
(causing index out of bound exception)
+ test case for it
10 years ago
reger
1d81bd0687
fix url encoding for path see http://mantis.tokeek.de/view.php?id=559
...
So far we used same escape procedure for all parts of the url (which includes x-www-form-urlencoded for all url components)
Added capability to use different encoding rules for the different url components (through specific bitset for each component).
(this is inspired by org.apache.http.client and java.net.uri implementation).
- Added test case for http://mantis.tokeek.de/view.php?id=559
10 years ago
reger
f94e34058c
fix url (path) %-decoding http://mantis.tokeek.de/view.php?id=519
...
- add test case for this
10 years ago
reger
16bc267a32
add test case for snippet html encoding check
10 years ago
reger
77851fa53c
fix parser test cases
...
(Vocabulary parameter)
10 years ago
reger
df83fcc4fc
disable optimistic GC assumption in StandardMemoryStrategy
...
After several tests, found that eom is not prevented. The major reason in testing was the assumption that a future GC will free the average of the last 5 GCs.
Disabling this check improved eom exceptions.
Added simplest testcase used for verification
10 years ago
Michael Peter Christen
68c605d637
replace with CommonPattern.SPACE for split
10 years ago
reger
9edc7308aa
update to metadata-extractor-2.7.0.jar
...
add 2 simple JUnit test cases for jpeg and tif parsing
10 years ago
reger
5d67e165d9
remove redundant null check in ResponseHeader.lastModified
...
added a JUnit testcase for ResponseHeader dates (using age()),
adjusted age() to pass all tests
10 years ago
reger
ea633a794c
including small junit test case for WordTokenizer
10 years ago
reger
aa2e15d846
allow url parameter in worktable apicall
...
allow url=wwwl?param=a&param=b (with ?, & encoded)
fix: http://mantis.tokeek.de/view.php?id=100
fix double adding of '&' in MultiProtocolURL.escape()
10 years ago
reger
e88537522d
allow single quote " ' " in query
...
see http://mantis.tokeek.de/view.php?id=379
-add QueryGoal test case for this
10 years ago
reger
e50b2b4d04
fix test case MultiProtocolURL.toString()
...
(only allowed on AnchorURL)
10 years ago
reger
b510b182d8
- update Maven pom
...
- add ppt parser test case
10 years ago
Michael Peter Christen
2de159719b
added an option to set 'obey nofollow' for links with rel="nofollow"
...
attribute in the <a> tag for each crawl. This introduces a lot of
changes because it extends the usage of the AnchorURL Object type which
now also has a different toString method than the underlying
DigestURL.toString. It is therefore not advised to use .toString at all
for urls, just use toNormalform(false) instead.
10 years ago
reger
1f2eba977d
add test case for Records (used in HostBalancer)
...
- simulating seek error (http://mantis.tokeek.de/view.php?id=411 )
11 years ago
reger
e94efd4d7c
update to JUnit 4.11
...
- fix build.xml -> parserTest error on Windows due to javac encoding
11 years ago
reger
3b77e41f1a
adding test for HostQueue crawl stack
...
- simulating problem with zero length stack file (but not fixing it)
- adding test data clean to maven pom
11 years ago
reger
431a5f9c4e
added test case for TextSnippet,
...
removed obsolete/unused parameter and reference to MediaSnippet
11 years ago
reger
7847a93558
fix AbstractParser.singleList not adding null strings
...
- prevents null titles in oo... parser (as detected by ParserTest)
- correct ParserTest dc_description check (dc_description allowed to return 0 length array)
11 years ago
reger
0b6db04e40
fix contentscraper img height/width parsing
...
prevent numberformat exception on common "100px" property
- include in test case
11 years ago
reger
bb8181b2be
fix: resolve url without path but searchpart
...
e.g. http://yacy.net?q=test was resolved as host "yacy.net?q=test" now host="yacy.net" path="/"
fixes http://mantis.tokeek.de/view.php?id=47
added test case for getHost
11 years ago
reger
86f6975edc
exclude html tags in in/outboundlinks_anchortext_txt parsed text
...
- some outboundlinks_anchortext_txt in index contain e.g. <span>text</span> or more tags,
remove all tags for text property (inline img tags are still parsed)
- added test case for above (to htmlParserTest)
- fix solr test case
11 years ago
reger
71649bf22d
add test case htmlParser.parse - getCharset
...
(which fails)
11 years ago
reger
6878c90f99
fix: IPv6 INTRANET_PATTERNS for local ip (see http://bugs.yacy.net/view.php?id=378 )
...
requiring following ":" for fc and fd prefix and made pattern match case insensitive
- add some more ipv6 test cases to MultiProtocolURLTest.java
11 years ago
reger
c8d437b69a
clean up test sources
...
rename to current package names and move to default location
11 years ago
reger
18a56446ce
reorg URL test classes add isLocal test with some IPv6 examples
...
- putting in default location and clean old package names
- add some valid RFC IPv6 sample urls (which don't pass the isLocal test)
11 years ago
reger
10a6346056
clean-up test cases
...
to work with current source
11 years ago
reger
b4fdb8c887
cleanup test directory from Jetty 9 implementation samples
...
- current Jetty implementation advances so that it seems not beneficial to keep the code
as it makes the test unusable, and use of Jetty 9 is not in sight due to its Java 1.7 dependency.
11 years ago
reger
71d2655c02
downgrade to Jetty 8 to assure support of JRE 1.6
...
- introduce a YaCyHttp interface to modularize/separate http server
- adjust the Jetty version specific implementation part (in package net.yacy.http)
- putting the version specific code in classes starting with Jetty8xxxx
- moved existing Jetty9xxx implementation into a test class (to keep the code)
- adjust build to the changed jars
- make use of the introduced YaCyHttpServer interface in related htroot servlets
- adjust other test cases/classes
11 years ago