[ https://issues.apache.org/jira/browse/HIVE-6617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pengcheng Xiong updated HIVE-6617:
----------------------------------
    Attachment: parser.png

Performance comparison between the old parser and the proposed new parser (in this JIRA)

I directly measured the parsing time of a Hive query, defined as the time that ANTLR takes from accepting the query string to outputting the AST. I parsed 19025 queries in 1888 files from the client positive q tests of CliDriver on my laptop (Darwin Kernel Version 13.2.0, Intel(R) Core(TM) i7-4850HQ CPU @ 2.30GHz, primary memory available: 16.00 gigabytes).

I compared 3 cases:
default: the Hive default parser
TRUE: the proposed new parser, configured to support SQL reserved keywords
FALSE: the proposed new parser, configured not to support SQL reserved keywords, for backward compatibility

The CDF is shown in the attached parser.png. Please note that the Y-axis is measured in nanoseconds (10^-9 s).

It seems that
(1) There is no significant difference among the 3 cases, which suggests that the new parser carries no significant performance penalty.
(2) Half of the queries finish parsing within 0.4 ms, and even the slowest query takes less than 2 ms.

> Reduce ambiguity in grammar
> ---------------------------
>
>                 Key: HIVE-6617
>                 URL: https://issues.apache.org/jira/browse/HIVE-6617
>             Project: Hive
>          Issue Type: Task
>            Reporter: Ashutosh Chauhan
>            Assignee: Pengcheng Xiong
>         Attachments: HIVE-6617.01.patch, HIVE-6617.02.patch,
> HIVE-6617.03.patch, HIVE-6617.04.patch, HIVE-6617.05.patch,
> HIVE-6617.06.patch, HIVE-6617.07.patch, HIVE-6617.08.patch,
> HIVE-6617.09.patch, HIVE-6617.10.patch, HIVE-6617.11.patch,
> HIVE-6617.12.patch, HIVE-6617.13.patch, HIVE-6617.14.patch,
> HIVE-6617.15.patch, HIVE-6617.16.patch, HIVE-6617.17.patch,
> HIVE-6617.18.patch, HIVE-6617.19.patch, HIVE-6617.20.patch,
> HIVE-6617.21.patch, parser.png
>
>
> CLEAR LIBRARY CACHE
> As of today, antlr reports 214 warnings. Need to bring down this number,
> ideally to 0.
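
For anyone who wants to repeat the measurement from the comment above, here is a minimal sketch of the approach: time each parse call with System.nanoTime(), sort the samples, and read the median and maximum off the empirical CDF. This is not the harness used for parser.png. The class name ParseTimingSketch, its method names, and the whitespace-splitting stand-in parser are all hypothetical; a real run would pass in a function that calls the Hive parser and would probably add warm-up iterations.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class ParseTimingSketch {

    /** Times each parser call in nanoseconds and returns the samples sorted ascending. */
    public static long[] timeAll(List<String> queries, Function<String, ?> parser) {
        long[] nanos = new long[queries.size()];
        for (int i = 0; i < nanos.length; i++) {
            long start = System.nanoTime();
            parser.apply(queries.get(i)); // query string in, parse result out
            nanos[i] = System.nanoTime() - start;
        }
        Arrays.sort(nanos);
        return nanos;
    }

    /** Nearest-rank percentile of a non-empty ascending-sorted array, with p in (0, 1]. */
    public static long percentile(long[] sorted, double p) {
        int rank = (int) Math.ceil(p * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    /** Empirical CDF: fraction of samples that are less than or equal to x. */
    public static double cdf(long[] sorted, long x) {
        int count = 0;
        while (count < sorted.length && sorted[count] <= x) {
            count++;
        }
        return (double) count / sorted.length;
    }

    public static void main(String[] args) {
        List<String> queries = Arrays.asList("SELECT 1", "SELECT key FROM src", "SHOW TABLES");
        // Hypothetical stand-in for the real parser: splits on whitespace, builds no AST.
        Function<String, String[]> parser = q -> q.split("\\s+");
        long[] sorted = timeAll(queries, parser);
        System.out.println("median (ns): " + percentile(sorted, 0.5));
        System.out.println("max (ns):    " + percentile(sorted, 1.0));
        System.out.println("fraction under 2 ms: " + cdf(sorted, 2_000_000L));
    }
}
```

Running the same query list through each of the 3 parser configurations and comparing the sorted arrays gives the kind of side-by-side CDF described in the comment.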
-- This message was sent by Atlassian JIRA (v6.3.4#6332)