PyLucene 10.0.0
I'm trying to store a long text, compressing it first with zlib:
*doc.add(StoredField("contents", zlib.compress(ftext.encode('utf-8'))))*
The resulting index size is *~83 MB*. When I read its value back using
*c = doc.getBinaryValue("contents")*
it returns None ('NoneType'), and when I use
*c = doc.get("contents")*
it returns a string which cannot be decompressed.
When I wrap the compressed bytes in a JArray instead,
*doc.add(StoredField("contents", JArray('byte')(zlib.compress(ftext.encode('utf-8')))))*
the resulting index size is *~160 MB*, and there is no problem getting its
value back using
*c = doc.getBinaryValue("contents")*
*cc = zlib.decompress(c.bytes.bytes_).decode('utf-8')*
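A minimal sketch of the round trip described above, using only the standard library so it runs without PyLucene. The helper names `pack_text` and `unpack_text` and the sample text are hypothetical, chosen for illustration only; the PyLucene calls are mentioned in comments exactly as they appear in this post.

```python
import zlib


def pack_text(text: str) -> bytes:
    """Encode text as UTF-8 and compress it with zlib."""
    return zlib.compress(text.encode("utf-8"))


def unpack_text(blob: bytes) -> str:
    """Reverse pack_text: decompress, then decode as UTF-8."""
    return zlib.decompress(blob).decode("utf-8")


# Stand-in for the real document text (hypothetical sample data).
ftext = "some long text " * 1000
blob = pack_text(ftext)

# In the working variant from this post, blob is wrapped as
# JArray('byte')(blob) before being passed to StoredField, and the bytes
# read back via getBinaryValue("contents") are passed to zlib.decompress.
restored = unpack_text(blob)
```

If `restored` differs from `ftext`, or `zlib.decompress` raises an error, the bytes were changed somewhere between storing and reading.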
*Question 1:* Why does the index size almost double when using JArray?
*Question 2:* How do you correctly create and store compressed binary data
in a StoredField?
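One way to narrow down Question 1 might be to sum the zlib payload sizes outside Lucene and compare the total with the two index sizes. The sketch below assumes the texts are available as a list of strings; `compressed_total` and `sample_texts` are hypothetical names, not part of PyLucene. If the compressed total for the real data comes out close to 160 MB, that would suggest the JArray index is roughly the expected size and the smaller index is not holding the full compressed bytes. This is an assumption to verify, not a confirmed diagnosis.

```python
import zlib


def compressed_total(texts):
    """Return (raw_bytes, compressed_bytes) summed over all texts."""
    raw = 0
    packed = 0
    for text in texts:
        data = text.encode("utf-8")
        raw += len(data)
        packed += len(zlib.compress(data))
    return raw, packed


# Hypothetical sample data; replace with the texts that go into the index.
sample_texts = [
    "lorem ipsum dolor sit amet " * 200,
    "another document body " * 500,
]
raw_total, packed_total = compressed_total(sample_texts)
print(f"raw: {raw_total} bytes, zlib-compressed: {packed_total} bytes")
```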
I am using PyLucene in my current project. Please advise me if I should
post my questions on the java-user list instead of here.
Prashant