This library was not prepared for large data because it was missing long
data types for pointers. I had to modify the code-base in a fundamental
way:
- Proof-Reading,
- unclustering,
- refactoring,
- naming adoption to https://wiki.openzim.org/wiki/ZIM_file_format,
- change of Exception handling,
- extension to more attributes as defined in spec (bugfix for mime type
loading)
- bugfix to long parsing (prevented reading of large files)
The code is furthermore very inefficient and requires more attention.
However the format is very useful for YaCy as there are numerous data
sources for ZIM-Files.
the raw format reader files with no integration in YaCy yet, which will
maybe follow as a next step. The zim file format is documented in
https://openzim.org and the reader code was taken from the archived,
non-maintained repository at https://github.com/openzim/zimreader-java