[ https://issues.apache.org/jira/browse/HIVE-13196?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15175218#comment-15175218 ]
Gopal V commented on HIVE-13196:
--------------------------------

Wrote a JMH bench, which explains this change - https://github.com/t3rmin4t0r/regexbench

{code}
# Run complete. Total time: 00:00:41

Benchmark                       Mode  Cnt    Score    Error  Units
RegexBench.testGreedyRegexHit   avgt    5  340.991 ±  7.929  ns/op
RegexBench.testGreedyRegexMiss  avgt    5  466.184 ± 21.349  ns/op
RegexBench.testLazyRegexHit     avgt    5   72.456 ± 16.156  ns/op
RegexBench.testLazyRegexMiss    avgt    5  366.955 ± 49.159  ns/op
{code}

> UDFLike: reduce Regex NFA sizes
> -------------------------------
>
>                 Key: HIVE-13196
>                 URL: https://issues.apache.org/jira/browse/HIVE-13196
>             Project: Hive
>          Issue Type: Improvement
>          Components: UDF
>    Affects Versions: 1.3.0, 1.2.1, 2.0.0, 2.1.0
>            Reporter: Gopal V
>            Assignee: Gopal V
>            Priority: Minor
>         Attachments: HIVE-13196.1.patch
>
>
> The NFAs built from complex regexes in UDFLike are extremely complex and
> spend a lot of time doing simple expression matching with no backtracking.
> Prevent NFA -> DFA explosion by using reluctant regex matches instead of
> greedy matches.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
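
For readers unfamiliar with the distinction the issue draws, here is a minimal sketch of a LIKE-to-regex translation that can emit either greedy (.*) or reluctant (.*?) wildcards. The class LikeRegexSketch and the helper likeToRegex are hypothetical names made up for illustration; this is not the code in UDFLike or in HIVE-13196.1.patch, and it ignores LIKE escape characters.

```java
import java.util.regex.Pattern;

public class LikeRegexSketch {

    // Hypothetical helper (not Hive's implementation): translate a SQL LIKE
    // pattern into a Java regex. '%' becomes ".*" (greedy) or ".*?" (reluctant),
    // '_' becomes ".", and every other character is quoted literally.
    // Assumption: no escape-character handling, for brevity.
    static String likeToRegex(String like, boolean reluctant) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < like.length(); i++) {
            char c = like.charAt(i);
            if (c == '%') {
                sb.append(reluctant ? ".*?" : ".*");
            } else if (c == '_') {
                sb.append('.');
            } else {
                sb.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String like = "%hello%world%";
        Pattern greedy = Pattern.compile(likeToRegex(like, false));
        Pattern lazy = Pattern.compile(likeToRegex(like, true));

        String hit = "say hello to the whole world today";
        String miss = "say hello to nobody";

        // With a full-string match, both forms accept and reject the same
        // inputs; only the order in which the matcher tries candidates differs.
        System.out.println(greedy.matcher(hit).matches() + " " + lazy.matcher(hit).matches());
        System.out.println(greedy.matcher(miss).matches() + " " + lazy.matcher(miss).matches());
    }
}
```

Since LIKE semantics require the whole string to match, swapping the quantifier should not change which rows match, only how much work the matcher does to decide. The timings in the comment above come from the linked regexbench project, not from this sketch.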