Cursors require keeping session state outside of Solr. With a million queries 
per hour and the middle tier spread across lots of containers, that isn’t 
practical. Stateless searches are the default in Solr for a good reason.

Using start and rows works great. The only issue is that Solr is defenseless 
against deep paging.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Jun 25, 2021, at 8:09 PM, Dwane Hall <dwaneh...@hotmail.com> wrote:
> 
> Ok we lock down the rows and start params and then use cursors (which you 
> don't want to use) for paging in increments of the page size.  It works 
> nicely for us but it sounds like it's not a workable solution for you.
> 
> Thanks,
> 
> Dwane
> From: Walter Underwood <wun...@wunderwood.org>
> Sent: Saturday, 26 June 2021 12:53 PM
> To: users@solr.apache.org
> Subject: Re: Defense against deep paging?
>  
> The start parameter needs to be read from the request. That is how the client 
> gets to the second page of results, by setting start=10 or start=20. The 
> problem is when a bot sneaks through the checks and Solr gets start=3990000. 
> A few of those will use all of heap and take down the server process.
> 
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
> 
> > On Jun 25, 2021, at 6:40 PM, Dwane Hall <dwaneh...@hotmail.com> wrote:
> > 
> > Hey Walter,
> > 
> > Can you set the value for start (0) and rows (your default sensible 
> > response row size) as an invariant in the request handler you're using so 
> > it can't be overridden from a client request? That's how I've defended 
> > against it from Solr's perspective in the past. This can be hard coded in 
> > your request handler in solrconfig.xml or using the parameters 
> > API. I've found it a simple but effective approach and there's an example 
> > here from the docs 
> > (https://solr.apache.org/guide/8_8/requesthandlers-and-searchcomponents-in-solrconfig.html#request-handlers).
> > 
> > Thanks,
> > 
> > Dwane
> > From: Walter Underwood <wun...@wunderwood.org>
> > Sent: Saturday, 26 June 2021 6:39 AM
> > To: users@solr.apache.org
> > Subject: Re: Defense against deep paging?
> >  
> > Thanks, that is exactly the info I wanted! I’ve commented there, even 
> > though it is closed as Won’t Do.
> > 
> > wunder
> > Walter Underwood
> > wun...@wunderwood.org
> > http://observer.wunderwood.org/  (my blog)
> > 
> > > On Jun 25, 2021, at 12:46 PM, Mike Drob <md...@mdrob.com> wrote:
> > > 
> > > This was discussed somewhat in
> > > https://issues.apache.org/jira/browse/SOLR-15252 with no
> > > implementation provided.
> > > 
> > > On Fri, Jun 25, 2021 at 11:52 AM Walter Underwood <wun...@wunderwood.org> wrote:
> > >> 
> > >> I already said that we have a limit in the client code. I’m asking about 
> > >> a limit in Solr.
> > >> 
> > >> wunder
> > >> Walter Underwood
> > >> wun...@wunderwood.org
> > >> http://observer.wunderwood.org/  (my blog)
> > >> 
> > >>> On Jun 25, 2021, at 11:50 AM, Håvard Wahl Kongsgård 
> > >>> <haavard.kongsga...@gmail.com> wrote:
> > >>> 
> > >>> Just create a proxy client between the user and solr. Set if page >= 
> > >>> 500 ….
> > >>> else
> > >>> 
> > >>> Simple stuff
> > >>> 
> > >>> On Fri, Jun 25, 2021 at 19:20, Walter Underwood 
> > >>> <wun...@wunderwood.org> wrote:
> > >>> 
> > >>>> Has anyone implemented protection against deep paging inside Solr? I’m
> > >>>> thinking about something like a max_rows parameter, where if start+rows
> > >>>> was greater than that, it would limit the max result to that number. Or
> > >>>> maybe just return a 400, that would be OK too.
> > >>>> 
> > >>>> I’ve had three or four outages caused by deep paging over the past
> > >>>> dozen years with Solr. We implement a limit in the client code, then
> > >>>> someone forgets to add it to the redesigned client code. A limit in the
> > >>>> request handler would be so much easier.
> > >>>> 
> > >>>> And yes, I know about cursor marks. We don’t want to enable deep
> > >>>> paging, we want to stop it.
> > >>>> 
> > >>>> wunder
> > >>>> Walter Underwood
> > >>>> wun...@wunderwood.org
> > >>>> http://observer.wunderwood.org/  (my blog)
> > >>>> 
> > >>>> --
> > >>> Håvard Wahl Kongsgård
> > >>> Data Scientist
> > >> 
> > 
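
For reference, the start+rows limit asked about at the bottom of this thread
can be sketched as a small guard that runs in the middle tier before a query
is sent to Solr. This is only an illustration under stated assumptions: Solr
itself has no such parameter (SOLR-15252 ended with no implementation), and
the names check_page, Page, DeepPagingError and the 5000 ceiling are all made
up for this example. It shows both behaviors mentioned in the original
question: limit the window, or reject the request so the caller can answer
with a 400.

```python
from dataclasses import dataclass

# Hypothetical ceiling on start + rows. This is NOT a Solr setting.
MAX_WINDOW = 5000


class DeepPagingError(ValueError):
    """Raised when a request pages too deep; a caller could map it to HTTP 400."""


@dataclass(frozen=True)
class Page:
    start: int
    rows: int


def check_page(start: int, rows: int, max_window: int = MAX_WINDOW,
               clamp: bool = False) -> Page:
    """Return a safe (start, rows) pair, or raise if the request is too deep.

    With clamp=False the request is rejected outright.
    With clamp=True the window is trimmed so start + rows <= max_window.
    """
    if start < 0 or rows < 0:
        raise DeepPagingError("start and rows must be non-negative")
    if start + rows <= max_window:
        return Page(start, rows)
    if not clamp:
        raise DeepPagingError(
            f"start+rows={start + rows} exceeds limit {max_window}")
    # Clamp: pull start back inside the window, then trim rows to what is left.
    start = min(start, max_window)
    return Page(start, max_window - start)
```

A normal request such as check_page(20, 10) passes through unchanged. A bot
request such as check_page(3990000, 10) raises DeepPagingError, or with
clamp=True comes back with rows=0 so nothing expensive reaches Solr. Like any
client-side limit, this only helps if every client code path calls it, which
is the weakness described in the thread.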
