It would also depend on the query.

For example, collapse keeps a Map of group heads gathered during the query.
A large result set and a high-cardinality group field would result in more
memory usage.
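
To illustrate the point (a simplified sketch, not Solr's actual collapse
code): if one head document is kept per distinct group value, the map grows
with the number of distinct group values among the matching documents. The
function name and the (doc_id, group, score) tuple layout are made up for
this illustration.

```python
def collapse_group_heads(hits):
    """Keep the highest-scoring document for each group value.

    hits: iterable of (doc_id, group_value, score) tuples (hypothetical layout).
    Returns a dict mapping group_value -> (doc_id, score).
    """
    heads = {}
    for doc_id, group, score in hits:
        best = heads.get(group)
        # Replace the current head only if this document scores higher.
        if best is None or score > best[1]:
            heads[group] = (doc_id, score)
    return heads


# Same number of hits, very different map sizes.
low_cardinality = [(i, i % 3, float(i)) for i in range(1000)]   # 3 distinct groups
high_cardinality = [(i, i, 1.0) for i in range(1000)]           # 1000 distinct groups

print(len(collapse_group_heads(low_cardinality)))
print(len(collapse_group_heads(high_cardinality)))
```

With the same 1000 hits, the first map holds 3 entries and the second holds
1000, so memory follows the group cardinality within the result set.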


Joel Bernstein
http://joelsolr.blogspot.com/


On Wed, May 3, 2023 at 3:11 PM Kevin Risden <kris...@apache.org> wrote:

> Here is an example calculation of bytes -> number of entries held in the
> bitset.
>
> (2864256-12-12)/24 = 119343 long objects = 22913856 entries
>
> The above is from a cluster where each query generates a bitset of
> 2864256 bytes (~2.8 MB) on heap. This is for 22 million results in the
> result set. There is some logic that decides whether this is a sparse
> bitset or a fixed bitset; over a certain result size it is always a
> fixed bitset [1]. It grows with the number of documents in the result set
> for the shard.
>
> This is easy to see with a profiler like async-profiler, which shows the
> bitsets being created for each query. I recently looked at this in
> https://issues.apache.org/jira/browse/SOLR-16555 where filtercache bitsets
> were being recreated over and over if there were multiple fq clauses.
> SOLR-16555 drastically reduced heap usage on the cluster I was working on
> (you can see some of the before/after metrics on the PR).
>
> If you have a shard with 200M documents, I think that could be
> ~20MB per bitset per query.
>
> [1]
>
> https://github.com/apache/solr/blame/main/solr/core/src/java/org/apache/solr/search/DocSetUtil.java#L46
>
> PS: with G1 GC, almost all of these big bitsets are humongous allocations
> (due to the G1 region size); I don't know whether that is a problem or not.
> It's something I'd like to look at further, but I haven't had time to
> benchmark it or look at other approaches.
>
> Kevin Risden
>
>
> On Wed, May 3, 2023 at 1:14 PM Vincenzo D'Amore <v.dam...@gmail.com> wrote:
>
> > Hi Markus,
> >
> > thanks for your explanation.
> > What if I submit a query q=*:*&rows=0 and there are 200M documents in
> > the Solr core? Will an array of ScoreDoc objects that big be allocated?
> >
> >
> >
> > On Wed, May 3, 2023 at 5:32 PM Markus Jelsma <markus.jel...@openindex.io> wrote:
> >
> > > Hello Vincenzo,
> > >
> > > Yes. Last time I checked, an array of ScoreDoc objects is created for
> > > each query, sized to the numFound for the local core/replica. This
> > > should be clearly visible in VisualVM. This happens in SolrIndexSearcher.
> > >
> > > Regards,
> > > Markus
> > >
> > > On Wed, May 3, 2023 at 5:20 PM Vincenzo D'Amore <v.dam...@gmail.com> wrote:
> > >
> > > > Hi all,
> > > >
> > > > Just asking if there could be some correlation between the amount of
> > > > memory allocated by a Solr query and the number of *hits* reported in
> > > > the Solr logs.
> > > > I haven't found anything in the Solr documentation.
> > > >
> > > > Do you know if there is some advice for the hits value?
> > > >
> > > > Thanks,
> > > > Vincenzo
> > > >
> > > > --
> > > > Vincenzo D'Amore
> > > >
> > >
> >
> >
> > --
> > Vincenzo D'Amore
> >
>
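
The bitset arithmetic quoted above can be sketched as follows. This assumes
one bit per document packed into 64-bit words, which is how a fixed bitset is
laid out. The 24-byte header figure is taken from the quoted calculation
(12 + 12), and the function names are hypothetical, not Solr or Lucene APIs.

```python
BITS_PER_WORD = 64    # one 64-bit long holds 64 document bits
BYTES_PER_WORD = 8
HEADER_BYTES = 24     # assumption: the 12 + 12 bytes subtracted in the quoted calculation


def fixed_bitset_bytes(max_doc):
    """Approximate payload size in bytes of a bitset with one bit per document."""
    words = (max_doc + BITS_PER_WORD - 1) // BITS_PER_WORD  # round up to whole words
    return words * BYTES_PER_WORD


def bits_in_payload(payload_bytes):
    """Number of document bits that fit in a payload of the given size."""
    return payload_bytes * 8


# The observed 2864256-byte allocation, minus the assumed header overhead.
observed_bits = bits_in_payload(2864256 - HEADER_BYTES)
print(observed_bits)

# Estimated payload for a shard with 200M documents.
print(fixed_bitset_bytes(200_000_000))
```

Under these assumptions the observed allocation corresponds to 22913856 bits,
matching the figure in the quoted message, and a 200M-document shard comes out
at 25000000 bytes of payload per bitset, in the same ballpark as the ~20MB
estimate.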
