What version of Solr? In Solr 8.2 there will be a tool to facilitate this kind
of analysis - see SOLR-13512. In the meantime, if you’re on Solr 8.x you should
be able to easily backport this change to your version (7.x should be possible
too, but with more changes).
> On 1 Jul 2019, at 11:23, R
Whoa.
First, it should be pretty easy to figure out what fields are large: just look
at your input documents. The fdt files are really simple; they’re just the
compressed raw data. Numeric fields, for instance, are just character data in
the fdt files. We usually see about a 2:1 compression ratio. There’s
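As a rough illustration of the "look at your input documents" suggestion, here is a minimal Python sketch that totals the UTF-8 size of each field across a set of input documents. It assumes the documents are already available as plain dicts (for example, parsed from the JSON you send to Solr); the function names are made up for this example, and zlib is only a stand-in for whatever compression the codec actually applies, so the ratio it reports is an estimate at best.

```python
import json
import zlib
from collections import Counter


def field_byte_totals(docs):
    """Sum the UTF-8 size of each field's values across all input documents."""
    totals = Counter()
    for doc in docs:
        for name, value in doc.items():
            # Multi-valued fields arrive as lists; treat single values the same way.
            values = value if isinstance(value, list) else [value]
            for v in values:
                totals[name] += len(str(v).encode("utf-8"))
    return totals


def compression_ratio(docs):
    """Rough raw:compressed ratio. zlib is an assumption, not the index codec."""
    raw = json.dumps(list(docs), ensure_ascii=False).encode("utf-8")
    return len(raw) / len(zlib.compress(raw))


if __name__ == "__main__":
    # Hypothetical sample data; replace with your real input documents.
    sample = [
        {"id": "1", "title": "short", "body": "x" * 5000},
        {"id": "2", "title": "also short", "tags": ["a", "bb"]},
    ]
    # Print fields from largest to smallest total size.
    for name, size in field_byte_totals(sample).most_common():
        print(f"{name}\t{size}")
```

Sorting by total size makes an unexpectedly heavy field stand out quickly, which is usually enough to decide where to look next.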
Hi Rob,
The codec records per docid how many bytes each document consumes -- maybe
instrument the codec's sources locally, then open your index and have it
visit stored fields for every doc in the index and gather stats?
Or, to avoid touching Lucene level code, you could make a small tool that
lo
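One way to approximate the "visit every doc and gather stats" idea without touching Lucene-level code is to page through the collection over HTTP and measure what comes back. The sketch below is only that, a sketch: it assumes a reachable Solr `/select` handler, a unique key field named `id`, and that the JSON size of a returned document is a usable proxy for its stored size (it ignores compression and includes any docValues fields Solr returns as stored). The `select_url` value and the helper names are hypothetical.

```python
import heapq
import json
import urllib.parse
import urllib.request


def iter_solr_docs(select_url, rows=500, id_field="id"):
    """Yield every document from a Solr /select handler using cursorMark paging."""
    cursor = "*"
    while True:
        params = urllib.parse.urlencode({
            "q": "*:*",
            "fl": "*",
            "sort": f"{id_field} asc",  # cursorMark requires a sort on the unique key
            "rows": rows,
            "cursorMark": cursor,
            "wt": "json",
        })
        with urllib.request.urlopen(f"{select_url}?{params}") as resp:
            data = json.load(resp)
        yield from data["response"]["docs"]
        next_cursor = data.get("nextCursorMark")
        # An unchanged cursor means the result set is exhausted.
        if next_cursor is None or next_cursor == cursor:
            break
        cursor = next_cursor


def doc_size(doc):
    """Approximate stored size: UTF-8 bytes of the JSON-serialized document."""
    return len(json.dumps(doc, ensure_ascii=False).encode("utf-8"))


def largest_docs(docs, n=10, id_field="id"):
    """Return the n largest documents as (size_in_bytes, id) pairs, biggest first."""
    sized = ((doc_size(d), str(d.get(id_field))) for d in docs)
    return heapq.nlargest(n, sized)
```

Usage would look something like `largest_docs(iter_solr_docs("http://localhost:8983/solr/mycollection/select"))`, where the collection name is a placeholder. Because `heapq.nlargest` consumes the generator lazily, only the current page and the top `n` entries are held in memory.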
Hello,
We are currently trying to investigate an issue where the index size is
disproportionately large for the number of documents. We see that the .fdt
file is more than 10 times the regular size.
Reading the docs, I found that this file contains the stored field data.
I would like to find the docum