reger
71bf95af8a
upd parser calls in test cases
10 years ago
reger
77851fa53c
fix parser test cases
...
(Vocabulary parameter)
10 years ago
reger
9edc7308aa
update to metadata-extractor-2.7.0.jar
...
add 2 simple JUnit test cases for jpeg and tif parsing
10 years ago
reger
ea633a794c
including small junit test case for WordTokenizer
10 years ago
reger
aa2e15d846
allow url parameter in worktable apicall
...
allow url=wwwl?param=a&param=b (with ?, & encoded)
fix: http://mantis.tokeek.de/view.php?id=100
fix double adding of '&' in MultiProtocolURL.escape()
10 years ago
reger
b510b182d8
- update Maven pom
...
- add ppt parser test case
10 years ago
reger
e94efd4d7c
update to JUnit 4.11
...
- fix build.xml -> parserTest error on Windows due to javac encoding
11 years ago
reger
7847a93558
fix AbstractParser.singleList not adding null strings
...
- prevents null titles in oo... parser (as detected by ParserTest)
- correct ParserTest dc_description check (dc_description allowed to return 0 length array)
11 years ago
reger
0b6db04e40
fix contentscraper img height/width parsing
...
prevent numberformat exception on common "100px" property
- include in test case
11 years ago
reger
86f6975edc
exclude html tags in in/outboundlinks_anchortext_txt parsed text
...
- some outboundlinks_anchortext_txt in index contain e.g. <span>text</span> or more tags,
remove all tags for text property (inline img tags are still parsed)
- added test case for above (to htmlParserTest)
- fix solr test case
11 years ago
reger
71649bf22d
add test case htmlParser.parse - getCharset
...
(which fails)
11 years ago
reger
c8d437b69a
clean up test sources
...
rename to current package names and move to default location
11 years ago