[ 
https://issues.apache.org/jira/browse/LUCENE-6156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14263465#comment-14263465
 ] 

Michael McCandless commented on LUCENE-6156:
--------------------------------------------

Hmm, that stack trace is from Lucene 4.9.x not 4.10.x.  Maybe you saw this with 
Elasticsearch 1.3.x?

Lucene's regexp parsing/building is recursive, meaning it consumes one Java 
stack frame per regexp operation (|, &, etc.).  This hasn't changed recently, 
e.g. it was the same way before LUCENE-5752, so I'm not sure why you saw it 
working with a previous Lucene version (btw, 10.6 is not a valid Lucene 
version... can you re-check which Elasticsearch/Lucene version you saw this 
working on?).
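
To illustrate why a large alternation alone can exhaust the stack, here is a 
toy sketch. It is not Lucene's RegExp code; the class UnionDepthSketch and its 
methods framesUsed and bigUnion are hypothetical names made up for this 
example, and it only models a naive recursive-descent walk over '|' with no 
escaping or other operators.

```java
// Toy model of recursive parsing; this is NOT Lucene's RegExp class.
final class UnionDepthSketch {

    /**
     * Walks an alternation such as "a|b|c" the way a naive recursive-descent
     * parser would: handle the first alternative, then recurse on the rest.
     * Returns the number of alternatives, which equals the number of nested
     * calls (stack frames) live at the deepest point of the recursion.
     */
    static int framesUsed(String regexp, int from) {
        int bar = regexp.indexOf('|', from);
        if (bar < 0) {
            return 1; // last alternative: the recursion bottoms out here
        }
        return 1 + framesUsed(regexp, bar + 1);
    }

    /** Builds "t0|t1|...", a hypothetical stand-in for a large terms regexp. */
    static String bigUnion(int alternatives) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < alternatives; i++) {
            if (i > 0) {
                sb.append('|');
            }
            sb.append('t').append(i);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Small input: one frame per alternative.
        System.out.println(framesUsed(bigUnion(1000), 0)); // prints 1000

        // Large input: whether this overflows depends on the JVM and -Xss.
        try {
            framesUsed(bigUnion(1_000_000), 0);
            System.out.println("no overflow at this stack size");
        } catch (StackOverflowError e) {
            System.out.println("StackOverflowError");
        }
    }
}
```

The depth grows linearly with the number of '|' operators, so the failure 
point depends on the thread stack size rather than on anything wrong in the 
regexp itself.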

Making this code non-recursive is likely not a great option ... it makes the 
code more complex.  For example, isFinite can also hit StackOverflowError, but 
we abandoned making it non-recursive in LUCENE-5659 because it increased code 
complexity.

Maybe we should add a "max regexp parts" limit so you get a more sane exception 
for too-large regexps?  Similar to what we did for determinize in LUCENE-6046 
...
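
A rough sketch of what such a limit could look like follows. Everything here 
is an assumption for illustration: RegexpPartsLimitSketch, checkParts and 
TooManyRegexpPartsException are hypothetical names, not existing Lucene APIs, 
and the scan is deliberately crude (it skips backslash-escaped characters but 
would overcount operators inside character classes or quoted strings).

```java
// Hypothetical pre-parse guard; none of these names exist in Lucene.
final class RegexpPartsLimitSketch {

    /** Hypothetical exception giving a clearer failure than StackOverflowError. */
    static final class TooManyRegexpPartsException extends IllegalArgumentException {
        TooManyRegexpPartsException(int maxParts) {
            super("regexp has more than " + maxParts + " parts");
        }
    }

    /**
     * Counts operands separated by unescaped '|' or '&' and throws as soon as
     * the count exceeds maxParts. Returns the count when it is within the limit.
     */
    static int checkParts(String regexp, int maxParts) {
        int parts = 1;
        for (int i = 0; i < regexp.length(); i++) {
            char c = regexp.charAt(i);
            if (c == '\\') {
                i++; // skip the escaped character
                continue;
            }
            if (c == '|' || c == '&') {
                parts++;
                if (parts > maxParts) {
                    throw new TooManyRegexpPartsException(maxParts);
                }
            }
        }
        return parts;
    }
}
```

A caller would run the check before handing the string to the recursive 
parser, so a too-large regexp fails fast with a descriptive message instead 
of deep inside the recursion.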

> StackOverFlow error during parsing of a request
> -----------------------------------------------
>
>                 Key: LUCENE-6156
>                 URL: https://issues.apache.org/jira/browse/LUCENE-6156
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: core/queryparser
>    Affects Versions: 4.10.2
>         Environment: windows 2008, osx yosemite with java 1.7.0_60
>            Reporter: Aurelien PISU
>            Priority: Critical
>
> While parsing a query sent to Lucene via Elasticsearch 1.4.2, I hit a 
> StackOverflowError in RegExp.
> Here is the stack trace:
> Caused by: java.lang.StackOverflowError
> at java.util.BitSet.<init>(BitSet.java:154)
> at org.apache.lucene.util.automaton.Automaton.<init>(Automaton.java:75)
> at org.apache.lucene.util.automaton.Automata.makeString(Automata.java:273)
> at org.apache.lucene.util.automaton.RegExp.toAutomaton(RegExp.java:518)
> at org.apache.lucene.util.automaton.RegExp.findLeaves(RegExp.java:553)
> at org.apache.lucene.util.automaton.RegExp.findLeaves(RegExp.java:550)
> at org.apache.lucene.util.automaton.RegExp.findLeaves(RegExp.java:551)
> at org.apache.lucene.util.automaton.RegExp.findLeaves(RegExp.java:551)
> Note: the regular expression is quite large and contains only ASCII 
> characters and the '|' character. Let me know if you need the value of the 
> regular expression. This issue was first raised against Elasticsearch on 
> GitHub, which uses Lucene 4.10.2, and was identified there as a Lucene 
> issue.



