: This setting can only affect the size of the fdt (and fdx) files. I suspect
: you saw differences in the size of other files because it caused Lucene to
: run different merges (because segments had different sizes), and the
: compression that we use for postings/terms worked better, but it could have
: been the other way as well.
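To see which file types actually differ between the two indexes, and whether the number of segments differs at all, something like the following rough sketch may help. It uses only the JDK, not the Lucene API. The class name IndexFileSizes is made up for illustration, and the segment-name parsing assumes the usual naming where per-segment files start with "_" (for example "_3.fdt" or "_3_Lucene50_0.doc"). If segments were written as compound files (.cfs), the per-extension breakdown will not be visible this way.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper, not part of Lucene: summarizes an index directory by file type.
public class IndexFileSizes {

    /** Total bytes per file extension (e.g. "fdt", "doc") in one index directory. */
    static Map<String, Long> bytesByExtension(Path indexDir) throws IOException {
        Map<String, Long> totals = new TreeMap<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(indexDir)) {
            for (Path file : files) {
                if (!Files.isRegularFile(file)) {
                    continue;
                }
                String name = file.getFileName().toString();
                int dot = name.lastIndexOf('.');
                String ext = dot < 0 ? "(no ext)" : name.substring(dot + 1);
                totals.merge(ext, Files.size(file), Long::sum);
            }
        }
        return totals;
    }

    /** Distinct segment names, assuming per-segment files start with "_". */
    static Set<String> segmentNames(Path indexDir) throws IOException {
        Set<String> names = new TreeSet<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(indexDir, "_*")) {
            for (Path file : files) {
                String name = file.getFileName().toString();
                // The segment name ends at the second "_" or the first ".", whichever comes first.
                int end = name.length();
                int underscore = name.indexOf('_', 1);
                int dot = name.indexOf('.');
                if (underscore > 0) {
                    end = Math.min(end, underscore);
                }
                if (dot > 0) {
                    end = Math.min(end, dot);
                }
                names.add(name.substring(0, end));
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        // Pass one or more index directories, e.g. the BEST_SPEED and BEST_COMPRESSION indexes.
        for (String arg : args) {
            Path dir = Paths.get(arg);
            System.out.println(dir + ": " + segmentNames(dir).size() + " segment(s)");
            bytesByExtension(dir).forEach(
                (ext, bytes) -> System.out.printf("  %-8s %,d bytes%n", ext, bytes));
        }
    }
}
```

Running it against both index directories and comparing the segment counts gives a quick hint as to whether the two runs ended up with different merges.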
You can check the number of documents in each segment to verify Adrien's
comments.

If you want to do a true "apples to apples" comparison of just the impact
of stored field compression, choose something like NoMergePolicy or
LogDocMergePolicy for your test to ensure that the number of documents per
segment is not affected by the size (in bytes) of any of the files in those
segments.

: > Hello,
: >
: > I'm experimenting with Lucene 5.2.1 and I see something I cannot find an
: > easy explanation for in the api docs.
: > Depending on whether I pick BEST_COMPRESSION or BEST_SPEED mode for
: > StoredFieldsFormat almost all files become smaller for BEST_COMPRESSION
: > mode. I expected only .fdt files to be smaller but for some reason the
: > following file types also shrink very significantly:
: > .fdx, .doc, .pos. Term dictionary (.tim) also gets smaller though not as
: > significantly. Weirdly enough .tip becomes a little bigger for the best
: > compressions setting.
: > Index contained about 10M small (~300 bytes each) text docs.
: >
: > I guess I could go through the code myself to understand this but may be
: > someone can shed some light on this.
: >
: > Thanks!
: >
: > Anton
: >
:

-Hoss
http://www.lucidworks.com/
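For reference, a minimal sketch of the check suggested in the reply above, assuming the Lucene 5.x API (lucene-core on the classpath). The class name SegmentDocCounts and its two methods are made up for illustration. noMergeConfig builds a writer config with the chosen stored fields mode and merging disabled, and printSegmentDocCounts lists maxDoc for each segment of an existing index. Supply whichever Analyzer your test already uses.

```java
import java.io.IOException;
import java.nio.file.Paths;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.codecs.lucene50.Lucene50Codec;
import org.apache.lucene.codecs.lucene50.Lucene50StoredFieldsFormat.Mode;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.index.NoMergePolicy;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

// Hypothetical helper class for the comparison test; assumes Lucene 5.x.
public class SegmentDocCounts {

    /** Writer config with the given stored fields mode and no merging at all. */
    static IndexWriterConfig noMergeConfig(Analyzer analyzer, Mode mode) {
        IndexWriterConfig config = new IndexWriterConfig(analyzer);
        config.setCodec(new Lucene50Codec(mode));       // Mode.BEST_SPEED or Mode.BEST_COMPRESSION
        config.setMergePolicy(NoMergePolicy.INSTANCE);  // segments stay exactly as flushed
        return config;
    }

    /** Prints maxDoc for every segment of the index at the given path. */
    static void printSegmentDocCounts(String indexPath) throws IOException {
        try (Directory dir = FSDirectory.open(Paths.get(indexPath));
             DirectoryReader reader = DirectoryReader.open(dir)) {
            for (LeafReaderContext leaf : reader.leaves()) {
                System.out.println(leaf.reader() + " maxDoc=" + leaf.reader().maxDoc());
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Pass the paths of the indexes to compare.
        for (String indexPath : args) {
            System.out.println(indexPath);
            printSegmentDocCounts(indexPath);
        }
    }
}
```

If the per-segment counts differ between the BEST_SPEED and BEST_COMPRESSION indexes, the merges were different, and the sizes of the postings and terms files are not directly comparable.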