[ https://issues.apache.org/jira/browse/HIVE-16763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16025790#comment-16025790 ]
Carter Shanklin commented on HIVE-16763:
----------------------------------------

If we're considering adding double-quoted identifiers, I have some additional unsolicited opinions here.

Hive has a number of non-standard restrictions on what characters are allowed. For example, Hive won't allow $, :, /, #, or |, even in a delimited (or quoted) identifier. This causes problems for many migration scenarios. Many users also hope to see better Unicode support. Aligning with the SQL standard allows us to solve both of these problems.

Per the SQL:2011 spec:

A regular identifier starts with an <identifier start> and is optionally followed by a sequence of <identifier start> or <identifier extend>. These are composed of allowed types of Unicode characters. Additional details:

1) An <identifier start> is any character in the Unicode General Category classes “Lu”, “Ll”, “Lt”, “Lm”, “Lo”, or “Nl”.
NOTE 94 — The Unicode General Category classes “Lu”, “Ll”, “Lt”, “Lm”, “Lo”, and “Nl” are assigned to Unicode characters that are, respectively, upper-case letters, lower-case letters, title-case letters, modifier letters, other letters, and letter numbers.
2) An <identifier extend> is U+00B7, “Middle Dot”, or any character in the Unicode General Category classes “Mn”, “Mc”, “Nd”, “Pc”, or “Cf”.
NOTE 95 — The Unicode General Category classes “Mn”, “Mc”, “Nd”, “Pc”, and “Cf” are assigned to Unicode characters that are, respectively, nonspacing marks, spacing combining marks, decimal numbers, connector punctuations, and formatting codes.

Based on these definitions, the following are valid regular identifiers and should not require special quoting.

aďƪȸβҵᴟệⰼꜷꮾ𝐩𝖌𝛑
ミムㄪㅉ了泥
C̲̅r̲̅a̲̅y̲̅o̲̅l̲̲a̲̅

If you're skeptical, try any of these in Postgres and you will see that they work without quotes.

Delimited identifiers, per standard, start with a double quote (") and end with a double quote ("). Anything may be placed within the quotes, including whitespace.
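
To make the regular-identifier rule quoted above concrete, here is a minimal Python sketch that checks a name against the listed General Category classes. It is only an illustration, not Hive's or any database's actual lexer: the helper names (`is_identifier_start`, `is_identifier_extend`, `is_regular_identifier`) are hypothetical, reserved words are ignored, and the result for recently added characters depends on the Unicode version bundled with the Python interpreter.

```python
import unicodedata

# General Category classes as listed in the SQL:2011 text quoted above.
START_CATEGORIES = {"Lu", "Ll", "Lt", "Lm", "Lo", "Nl"}
EXTEND_CATEGORIES = {"Mn", "Mc", "Nd", "Pc", "Cf"}
MIDDLE_DOT = "\u00b7"  # U+00B7 is explicitly allowed as <identifier extend>


def is_identifier_start(ch):
    """Hypothetical helper: does ch qualify as <identifier start>?"""
    return unicodedata.category(ch) in START_CATEGORIES


def is_identifier_extend(ch):
    """Hypothetical helper: does ch qualify as <identifier extend>?"""
    return ch == MIDDLE_DOT or unicodedata.category(ch) in EXTEND_CATEGORIES


def is_regular_identifier(name):
    """Hypothetical helper: True if name needs no quoting under the rule above.

    Assumption: reserved-word checks and length limits are out of scope.
    """
    if not name or not is_identifier_start(name[0]):
        return False
    return all(
        is_identifier_start(ch) or is_identifier_extend(ch) for ch in name[1:]
    )
```

Under this sketch, `is_regular_identifier("k y")` is false because the space is in category Zs, and `is_regular_identifier("a$b")` is false because `$` is in category Sc, so both would have to be written as delimited identifiers.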

It seems to be fairly common in the SQL-on-Hadoop space (Presto, SparkSQL, maybe others) to allow both ` and " for quoting. Punctuation is generally not allowed in regular identifiers and must be quoted.

> Support space in quoted column alias
> ------------------------------------
>
>                 Key: HIVE-16763
>                 URL: https://issues.apache.org/jira/browse/HIVE-16763
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Pengcheng Xiong
>            Assignee: Pengcheng Xiong
>
> {code}
> select key as 'k y' from src;
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)