The start parameter needs to be read from the request; that is how the client gets to the second page of results, by setting start=10 or start=20. The problem is when a bot sneaks through the checks and Solr gets start=3990000. A few of those requests will use all of the heap and take down the server process.
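If we ever do enforce this inside Solr, the closest hook I can see is a small custom SearchComponent registered ahead of the query component, something like the rough sketch below. Untested, and the class name and the 3000 cap are made up for illustration; this is not anything that ships with Solr.

import java.io.IOException;

import org.apache.solr.common.params.CommonParams;
import org.apache.solr.common.params.ModifiableSolrParams;
import org.apache.solr.common.params.SolrParams;
import org.apache.solr.handler.component.ResponseBuilder;
import org.apache.solr.handler.component.SearchComponent;

// Hypothetical component: caps start + rows before the query runs.
// The name and the 3000 cap are examples, not part of stock Solr.
public class DeepPagingGuardComponent extends SearchComponent {

  private static final int MAX_OFFSET = 3000;

  @Override
  public void prepare(ResponseBuilder rb) throws IOException {
    SolrParams params = rb.req.getParams();
    int start = params.getInt(CommonParams.START, 0);
    int rows = params.getInt(CommonParams.ROWS, 10);
    if (start + rows > MAX_OFFSET) {
      // Clamp the offset instead of letting Solr build a huge priority queue.
      // Throwing a SolrException with a 400 here would be a reasonable policy too.
      ModifiableSolrParams clamped = new ModifiableSolrParams(params);
      clamped.set(CommonParams.START, Math.max(0, MAX_OFFSET - rows));
      rb.req.setParams(clamped);
    }
  }

  @Override
  public void process(ResponseBuilder rb) throws IOException {
    // Nothing to do here; the clamp happens in prepare().
  }

  @Override
  public String getDescription() {
    return "Caps start+rows to defend against deep paging";
  }
}

It would get wired in through solrconfig.xml with a <searchComponent> declaration and a first-components entry on the handler, which is why a built-in max_rows/max_start setting would still be so much nicer.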
wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)

> On Jun 25, 2021, at 6:40 PM, Dwane Hall <dwaneh...@hotmail.com> wrote:
>
> Hey Walter,
>
> Can you set the value for start (0) and rows (your default sensible response row size) as an invariant in the request handler you're using, so it can't be overridden by a client request? That's how I've defended against it from Solr's perspective in the past. It can be hard coded in the request handler in your solrconfig.xml or set using the parameters API. I've found it a simple but effective approach, and there's an example in the docs (https://solr.apache.org/guide/8_8/requesthandlers-and-searchcomponents-in-solrconfig.html#request-handlers).
>
> Thanks,
>
> Dwane
>
> From: Walter Underwood <wun...@wunderwood.org>
> Sent: Saturday, 26 June 2021 6:39 AM
> To: users@solr.apache.org <users@solr.apache.org>
> Subject: Re: Defense against deep paging?
>
> Thanks, that is exactly the info I wanted! I’ve commented there, even though it is closed as Won’t Do.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/ (my blog)
>
> > On Jun 25, 2021, at 12:46 PM, Mike Drob <md...@mdrob.com> wrote:
> >
> > This was discussed somewhat in https://issues.apache.org/jira/browse/SOLR-15252, with no implementation provided.
> >
> > On Fri, Jun 25, 2021 at 11:52 AM Walter Underwood <wun...@wunderwood.org> wrote:
> >>
> >> I already said that we have a limit in the client code. I’m asking about a limit in Solr.
> >>
> >> wunder
> >> Walter Underwood
> >> wun...@wunderwood.org
> >> http://observer.wunderwood.org/ (my blog)
> >>
> >>> On Jun 25, 2021, at 11:50 AM, Håvard Wahl Kongsgård <haavard.kongsga...@gmail.com> wrote:
> >>>
> >>> Just create a proxy client between the user and Solr. Set if page >= 500
> >>> ….
> >>> else
> >>>
> >>> Simple stuff
> >>>
> >>> On Fri, 25 Jun 2021 at 19:20, Walter Underwood <wun...@wunderwood.org> wrote:
> >>>
> >>>> Has anyone implemented protection against deep paging inside Solr? I’m thinking about something like a max_rows parameter, where if start+rows were greater than that, it would cap the result at that number. Or maybe just return a 400; that would be OK too.
> >>>>
> >>>> I’ve had three or four outages caused by deep paging over the past dozen years with Solr. We implement a limit in the client code, then someone forgets to add it to the redesigned client code. A limit in the request handler would be so much easier.
> >>>>
> >>>> And yes, I know about cursor marks. We don’t want to enable deep paging, we want to stop it.
> >>>>
> >>>> wunder
> >>>> Walter Underwood
> >>>> wun...@wunderwood.org
> >>>> http://observer.wunderwood.org/ (my blog)
> >>>>
> >>> --
> >>> Håvard Wahl Kongsgård
> >>> Data Scientist
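P.S. For anyone finding this thread later, the invariants approach Dwane describes would look roughly like this in solrconfig.xml. Handler name and values are placeholders, not copied from his setup:

<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <int name="rows">10</int>
  </lst>
  <!-- invariants cannot be overridden by request parameters -->
  <lst name="invariants">
    <int name="start">0</int>
    <int name="rows">10</int>
  </lst>
</requestHandler>

That does stop deep paging cold, but pinning start to 0 also stops legitimate pagination past the first page, which is why I’d still prefer a configurable cap on start+rows rather than a constant.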