It would be nice to have more clarity regarding the problem you're trying
to solve. A few questions:

1. Why do you need to return so many search results at once? If it's a
typical search use case, could you work with a manageable page of
documents, say 50 or 100 at a time? But I'm guessing this is not a typical
search that you're planning to support. In that case, you have to be
careful with how this functionality is exposed. Like Vincenzo said, you
will run into problems if the endpoint is hit frequently, and you won't
have a good handle on how much memory to allocate to your JVM. (See the
first sketch after this list for one way to page through results instead.)

2. Is Solr your primary source of data? If not, could you retrieve just the
identifiers from Solr and then go to your primary data source for the
additional data? We support an export use case where users can export
thousands of documents at a time. Our strategy is to get only the ids of
the matching docs from Solr and then use an async operation to go back to
our DB and download/export the full DB objects using that same set of
identifiers. On the Solr end we don't have a problem, because we limit the
payload (the fields being returned) to document identifiers only, and even
then we cap it at 20k. (The second sketch below outlines this two-phase
approach.)
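
On the first point, here is a minimal SolrJ sketch of paging through a
large result set with cursorMark rather than pulling everything in one
response. The Solr URL, collection name, sort field, and page size are
placeholders for illustration, not something from your setup:

    import org.apache.solr.client.solrj.SolrClient;
    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.client.solrj.response.QueryResponse;
    import org.apache.solr.common.SolrDocument;
    import org.apache.solr.common.params.CursorMarkParams;

    public class PagedFetch {
        public static void main(String[] args) throws Exception {
            // Hypothetical Solr URL and collection name.
            try (SolrClient solr = new HttpSolrClient.Builder(
                    "http://localhost:8983/solr/mycollection").build()) {
                SolrQuery q = new SolrQuery("*:*");
                q.setRows(100);  // manageable page size per request
                // cursorMark requires a stable sort on the uniqueKey field
                q.setSort(SolrQuery.SortClause.asc("id"));
                String cursor = CursorMarkParams.CURSOR_MARK_START;
                boolean done = false;
                while (!done) {
                    q.set(CursorMarkParams.CURSOR_MARK_PARAM, cursor);
                    QueryResponse rsp = solr.query(q);
                    for (SolrDocument doc : rsp.getResults()) {
                        // process one page at a time instead of holding
                        // one humongous response in the heap
                    }
                    String next = rsp.getNextCursorMark();
                    // getting the same cursor back means no more results
                    done = cursor.equals(next);
                    cursor = next;
                }
            }
        }
    }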
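
On the second point, our ids-only export looks roughly like the sketch
below: limit the Solr payload to the identifier field, cap the row count,
and hand the ids off to an async job that fetches the full records from the
primary DB. The collection name, field name, cap, and the exportFromDb
helper are all assumptions for the sake of the example:

    import java.util.ArrayList;
    import java.util.List;
    import org.apache.solr.client.solrj.SolrClient;
    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.common.SolrDocument;

    public class IdOnlyExport {
        // Hard cap on exported ids, mirroring the 20k limit we use.
        private static final int EXPORT_CAP = 20_000;

        public static void main(String[] args) throws Exception {
            try (SolrClient solr = new HttpSolrClient.Builder(
                    "http://localhost:8983/solr/mycollection").build()) {
                SolrQuery q = new SolrQuery("your query here");
                q.setFields("id");      // payload is only the identifiers
                q.setRows(EXPORT_CAP);
                List<String> ids = new ArrayList<>();
                for (SolrDocument doc : solr.query(q).getResults()) {
                    ids.add((String) doc.getFieldValue("id"));
                }
                exportFromDb(ids);
            }
        }

        private static void exportFromDb(List<String> ids) {
            // Hypothetical async step: batch the ids and pull the full
            // objects from the primary store, e.g. WHERE id IN (...).
        }
    }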



On Thu, Apr 28, 2022 at 2:44 AM Vincenzo D'Amore <v.dam...@gmail.com> wrote:

> Ok, but the OP has to know that doing this often can be a serious issue.
> For example, if you are implementing an endpoint that can be called 10 to
> 100 times per hour, each call will result in a few humongous objects
> allocated in the JVM.
>
