I'm late to the party, but +1 on Mike McCandless's comment on RAM.
There have been a number of efforts in the past to move to off-heap. FWIW,
you can use Lucene's FST (a trie-based data structure used in many
different places, including the term index and synonym dictionaries) to
build a large (>>32GB)
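For context, an FST behaves like a minimized trie mapping byte sequences to outputs; the real Lucene classes live in org.apache.lucene.util.fst (FST, and FSTCompiler in recent releases). Below is a dependency-free sketch of the basic lookup model only, as a plain unminimized trie; it illustrates the term-to-value idea but none of the suffix sharing that makes a real FST compact:

```java
import java.util.HashMap;
import java.util.Map;

// Toy trie mapping terms to long values. A real Lucene FST additionally
// minimizes shared suffixes and stores outputs along the arcs, which is
// why it stays compact even for very large term dictionaries.
class TermTrie {
    private static class Node {
        final Map<Character, Node> children = new HashMap<>();
        Long value; // non-null only at the end of a stored term
    }

    private final Node root = new Node();

    void add(String term, long value) {
        Node node = root;
        for (char c : term.toCharArray()) {
            node = node.children.computeIfAbsent(c, k -> new Node());
        }
        node.value = value;
    }

    Long get(String term) {
        Node node = root;
        for (char c : term.toCharArray()) {
            node = node.children.get(c);
            if (node == null) return null; // term not in the dictionary
        }
        return node.value;
    }

    public static void main(String[] args) {
        TermTrie trie = new TermTrie();
        trie.add("cat", 1);
        trie.add("cats", 2);
        trie.add("dog", 3);
        System.out.println(trie.get("cats")); // 2
        System.out.println(trie.get("ca"));   // null (a prefix, not a term)
    }
}
```

Unlike this HashMap-per-node toy, the FST's arc-compressed byte encoding is what lets a multi-GB term dictionary live mostly off-heap behind the OS page cache.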
Hi Vincent,
My 2 cents:
We had a production environment with ~250 GB and ~1M docs with static +
dynamic fields in Solr (AFAIR Lucene 7), on a machine with 4 GB for the
JVM and (AFAIR) a bit more, maybe 6 GB, of OS cache.
At peak times (re-index) we had 10-15k updates/minute and (partiall
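The shape described above (small JVM heap, most RAM left free for the OS page cache, since Lucene memory-maps the index files) is the common Solr recommendation. A sketch of how such a setup might look in bin/solr.in.sh; the 4 GB value mirrors the numbers above, and the total machine size is an assumption:

```shell
# bin/solr.in.sh -- keep the JVM heap deliberately small. Lucene reads
# the index via mmap, so RAM left unallocated serves as the OS page
# cache for the index files.
SOLR_HEAP="4g"
# No setting needed for the remaining RAM (here ~6 GB on an assumed
# ~10-12 GB machine): the OS uses it to cache the hot parts of the index.
```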
Hi Vincent,
Lucene has a hard limit of ~2.1 B documents in a single index; hopefully
you hit the ~50-100 GB limit well before that.
Otherwise it's very application-dependent: how much latency can you
tolerate during searching, how fast are the underlying IO devices at
random and large sequential
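The ~2.1 B figure is Java's Integer.MAX_VALUE; Lucene enforces it via the IndexWriter.MAX_DOCS constant (Integer.MAX_VALUE - 128 in recent versions; worth checking against your release). A dependency-free sketch of what that ceiling implies for shard planning, with the constant hardcoded to avoid a Lucene dependency:

```java
// Sketch: the per-index document ceiling and the minimum shard count it
// implies. Lucene's actual constant is IndexWriter.MAX_DOCS; the value
// below matches recent releases but is hardcoded here.
public class IndexLimits {
    static final long MAX_DOCS = Integer.MAX_VALUE - 128L; // 2,147,483,519

    // Minimum number of shards needed to hold totalDocs documents,
    // considering only the hard document-count limit (in practice the
    // size/latency limits discussed above dominate long before this).
    static long minShards(long totalDocs) {
        return (totalDocs + MAX_DOCS - 1) / MAX_DOCS; // ceiling division
    }

    public static void main(String[] args) {
        System.out.println(MAX_DOCS);                   // 2147483519
        System.out.println(minShards(10_000_000_000L)); // 10 B docs -> 5
    }
}
```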