Thank you for looking into this.

To provide more context, my schema contains around 800 fields. Of these,
94 (44 predefined fields and 50 dynamic fields) have docValues enabled with
useDocValuesAsStored=true. The results are retrieved using a glob pattern
(fl=id, field_prefix_*), where field_prefix matches most of the fields.
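
For anyone trying to reproduce this, a request of the shape described above could be built roughly as follows. This is only a sketch: the base URL, the collection name `mycollection`, and the `*:*` query are placeholders and are not taken from the actual test setup.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Placeholder: the real host and collection are not given in this thread.
SOLR_BASE = "http://localhost:8983/solr/mycollection"


def build_select_url(query: str, rows: int = 50) -> str:
    """Build a /select URL that returns id plus every field matching the glob."""
    params = {
        "q": query,
        "rows": rows,               # first 50 documents per query
        "fl": "id,field_prefix_*",  # glob that matches most of the schema's fields
    }
    return f"{SOLR_BASE}/select?{urlencode(params)}"


url = build_select_url("*:*")
# Decode the query string again to see the parameters as Solr would receive them.
decoded = parse_qs(urlparse(url).query)
```

Swapping the `fl` value for `id` alone gives the comparison case in which memory usage matched Solr 8.11.1.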

Each query retrieves the first 50 documents, with about 5 to 7 fields
populated in each. Since all of these fields are needed by the application,
excluding them from the Solr response may not be an option. I haven't yet
tried setting stored=true for all fields to see if that helps reduce memory
usage.
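
If narrowing the request turns out to be feasible for at least some queries, the workaround suggested in the quoted message (explicit fields or narrower glob patterns) might look something like the sketch below. The field names and the helper `explicit_fl` are hypothetical, and the sketch assumes the application already holds a list of candidate field names.

```python
from fnmatch import fnmatchcase


def explicit_fl(schema_fields, patterns, always=("id",)):
    """Expand glob patterns against known field names into an explicit fl value."""
    selected = list(always)
    for name in schema_fields:
        if name not in selected and any(fnmatchcase(name, p) for p in patterns):
            selected.append(name)
    return ",".join(selected)


# Hypothetical field names, for illustration only.
known_fields = [
    "id",
    "field_prefix_title_s",
    "field_prefix_price_d",
    "field_prefix_tags_ss",
    "other_notes_t",
]

# A narrower pattern than field_prefix_* selects fewer docValues-backed fields.
fl = explicit_fl(known_fields, ["field_prefix_t*"])
```

The matching happens on the client, so Solr only ever sees an explicit field list and does not have to expand a glob over the whole schema.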

The increased memory usage persists after all queries are processed,
although the overall footprint does reduce by around 1.5 GB once the tests
are complete. I suspect this may be due to the max heap size being set to
16 GB; I haven't re-run the tests with a lower size like 8 or 12 GB
to check if it has similar behaviour.

Below are snapshots of memory usage during the test run (left) vs. 30
minutes after the tests finished:
[image: image.png]
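
To compare heap usage during and after a run numerically rather than from screenshots, something like the following could be applied to responses from Solr's Metrics API (for example `/solr/admin/metrics?group=jvm&prefix=memory.heap`). It is a sketch under assumptions: the response is assumed to nest JVM gauges under `metrics` and `solr.jvm` with a `memory.heap.used` value in bytes, which should be checked against the Solr version in use. The sample payloads are made up to mirror the figures mentioned in this thread and are not measurements.

```python
import json


def heap_used_gb(metrics_response: dict, registry: str = "solr.jvm") -> float:
    """Return used heap in GB from a parsed metrics response (assumed layout)."""
    used_bytes = metrics_response["metrics"][registry]["memory.heap.used"]
    return used_bytes / 1024**3


# Illustrative payloads only: 6 GB during the run, 4.5 GB afterwards.
during = json.loads('{"metrics": {"solr.jvm": {"memory.heap.used": 6442450944}}}')
after = json.loads('{"metrics": {"solr.jvm": {"memory.heap.used": 4831838208}}}')

released_gb = heap_used_gb(during) - heap_used_gb(after)
```

A single reading of used heap also counts garbage that has not been collected yet, so comparing readings taken shortly after a full GC would give a fairer picture of what is actually retained.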

Thank you,
Soham


On Wed, Oct 2, 2024 at 5:24 PM Michael Gibney <mich...@michaelgibney.net>
wrote:

> Thanks for raising this topic; I suspect it does warrant a Jira issue to
> address this, but I'll ask a couple of questions first to make sure I'm not
> missing something:
>
> Do you have a very large number of dynamic fields configured as
> `useDocValuesAsStored=true`, and are you retrieving field values by
> globbing (i.e., `fl=*` or `fl=field_prefix_*`)?
>
> How many docs (`rows=N`) are your test queries returning?
>
> I would suspect that the memory usage persists only over the lifetime of
> the response(s) -- i.e. there should _not_ be a longer-term memory leak (to
> be clear, even this request-scoped memory usage increase is still a
> problem). Are you able to confirm/contradict that this is the case in your
> test?
>
> At a minimum it seems it would be good to provide a way to disable this
> behavior (which there currently is not). In the meantime, there's a
> workaround that should help and may/may not be viable for various use
> cases: if it's possible/appropriate to constrain the number of
> `useDocValuesAsStored` fields requested via globbing (explicitly specifying
> fields, or narrower glob patterns), that should help.
>
> There are inefficiencies inherent in the way the code currently handles
> `useDocValuesAsStored` for large numbers of distinct fields (as can often
> result from liberal use of dynamic fields), in ways that extend beyond the
> DocValuesIteratorCache. I'm thinking further about possible solutions, but
> will be interested to hear any further information you can provide.
>
> Michael
>
> On Wed, Oct 2, 2024 at 3:33 PM soham gandhi <sohamgandh...@gmail.com>
> wrote:
>
> > Hi everyone,
> >
> > I recently upgraded from Solr 8.11.1 to Solr 9.6.0 and noticed an increase
> > in memory usage in my test environment.
> >
> > Test environment -
> >
> >    - A simple setup with 2 million documents in the index (size ~3 GB)
> >    - A test run that executes 100 unique queries every other minute for
> >    an hour
> >    - Metrics captured: query response times, CPU, and memory usage,
> >    benchmarked against a baseline
> >
> > After upgrading to Solr 9.6.0 (with no changes to schema or solrconfig
> > compared to Solr 8.11.1), I observed an increase in memory usage from
> > 4 GB to 6 GB.
> >
> > Memory usage snapshots for Solr 8.11.1 (left) vs Solr 9.6.0
> > [image: image.png]
> > Snapshots from a memory profiler for Solr 8.11.1 (left) vs Solr 9.6.0
> > [image: image.png]
> >
> >    - The increase appears to be linked to the DocValuesIteratorCache
> >    introduced in Solr 9.4.0 as part of SOLR-16989
> >    <https://issues.apache.org/jira/browse/SOLR-16989> (Optimize and
> >    consolidate reuse of DocValues iterators for value retrieval).
> >    - Since my queries are already fast (~100ms), I haven't seen a
> >    considerable benefit from this enhancement.
> >    - I noticed some improvement in query response time by increasing the
> >    page size by a factor of 10, but since users are only interested in
> >    the top 50 results, this isn't a valid use-case.
> >    - If I retrieve only the id field, the memory usage is comparable to
> >    what I observed with Solr 8.11.1.
> >
> > Has anyone else experienced similar behavior after upgrading? Is there
> > any way to disable the DocValuesIteratorCache introduced in Solr 9.4.0?
> > Any suggestions or insights would be appreciated!
> >
> > Thank you
> > Soham
> >
>
