[ https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Teddy Choi updated HIVE-4642:
-----------------------------
    Attachment: HIVE-4642-1.patch

I wrote draft code. It needs more comments, tests, and refactoring.

I agree that FA generation will be a heavy job, so I didn't implement it. Common phone number patterns are covered with a simple fixed automaton. I will add more simple automata.

There are already hard-coded decisions, and more will come, so I introduced an interface that generalizes these decisions. It may reduce performance a little bit.

----
Class hierarchy:

AbstractFilterStringColLikeStringScalar
+ FilterStringColLikeStringScalar
+ FilterStringColRegExpStringScalar

AbstractFilter...#Checker
+ AbstractFilter...#BeginChecker
+ AbstractFilter...#EndChecker
+ AbstractFilter...#MiddleChecker
+ AbstractFilter...#NoneChecker
+ AbstractFilter...#AnyCharChecker
+ AbstractFilter...#ComplexChecker
+ FilterStringColRegExpStringScalar#PhoneNumberChecker

AbstractFilter...#CheckerFactory
+ Filter...Like...#LikeBeginCheckerFactory
+ Filter...Like...#LikeEndCheckerFactory
+ Filter...Like...#LikeMiddleCheckerFactory
+ Filter...Like...#LikeNoneCheckerFactory
+ Filter...Like...#LikeAnyCharCheckerFactory
+ Filter...Like...#LikeComplexCheckerFactory
+ Filter...RegExp...#RegExpBeginCheckerFactory
+ Filter...RegExp...#RegExpEndCheckerFactory
+ Filter...RegExp...#RegExpMiddleCheckerFactory
+ Filter...RegExp...#RegExpNoneCheckerFactory
+ Filter...RegExp...#RegExpAnyCharCheckerFactory
+ Filter...RegExp...#RegExpComplexCheckerFactory
+ Filter...RegExp...#RegExpPhoneNumberCheckerFactory

> Implement vectorized RLIKE and REGEXP filter expressions
> --------------------------------------------------------
>
>                 Key: HIVE-4642
>                 URL: https://issues.apache.org/jira/browse/HIVE-4642
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Eric Hanson
>            Assignee: Teddy Choi
>         Attachments: HIVE-4642-1.patch
>
>
> See title. I will add more details next week.
> The goal is (a) make this work correctly and (b) optimize it as well as possible, at least for the common cases.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
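
The decision interface described in the comment above (a Checker picked once per pattern by a factory, with a regex fallback for complex patterns) might look roughly like the Java sketch below. It is not taken from HIVE-4642-1.patch. The class name LikeCheckerSketch, the methods createLikeChecker and likeToRegex, and the String-based signatures are hypothetical, chosen only to illustrate the idea. The sketch also assumes that "%" and "_" are the LIKE wildcards and that a backslash escapes the next character.

```java
import java.util.regex.Pattern;

/** Hypothetical sketch; names and String-based signatures are illustrative only. */
final class LikeCheckerSketch {

    /** One matching decision, chosen once per pattern and reused for every row. */
    interface Checker {
        boolean check(String value);
    }

    /** Picks the cheapest Checker for a LIKE pattern and falls back to a regex. */
    static Checker createLikeChecker(String pattern) {
        int n = pattern.length();
        boolean startsWild = n > 0 && pattern.charAt(0) == '%';
        boolean endsWild = n > 1 && pattern.charAt(n - 1) == '%';
        String inner = pattern.substring(startsWild ? 1 : 0, endsWild ? n - 1 : n);

        // Any wildcard or escape left in the middle needs the general path
        // (the "complex" case in the class hierarchy above).
        if (inner.indexOf('%') >= 0 || inner.indexOf('_') >= 0 || inner.indexOf('\\') >= 0) {
            Pattern compiled = Pattern.compile(likeToRegex(pattern), Pattern.DOTALL);
            return value -> compiled.matcher(value).matches();
        }
        if (startsWild && endsWild) {
            return value -> value.contains(inner);   // "%abc%": middle
        }
        if (startsWild) {
            return value -> value.endsWith(inner);   // "%abc": end
        }
        if (endsWild) {
            return value -> value.startsWith(inner); // "abc%": begin
        }
        return value -> value.equals(inner);         // "abc": no wildcard
    }

    /** Translates a LIKE pattern to a java.util.regex pattern string. */
    static String likeToRegex(String pattern) {
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < pattern.length(); i++) {
            char c = pattern.charAt(i);
            if (c == '\\' && i + 1 < pattern.length()) {
                // Escaped character: match it literally.
                regex.append(Pattern.quote(String.valueOf(pattern.charAt(++i))));
            } else if (c == '%') {
                regex.append(".*");
            } else if (c == '_') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return regex.toString();
    }

    public static void main(String[] args) {
        Checker prefix = createLikeChecker("abc%");
        System.out.println(prefix.check("abcdef"));
        System.out.println(createLikeChecker("a_c%").check("abcd"));
    }
}
```

Selecting the checker once, outside the per-row loop, keeps the cost of the extra interface call small compared with evaluating a general regular expression on every value. A vectorized filter would more likely operate on byte ranges than on String objects, so the signatures here are a simplification.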