[ https://issues.apache.org/jira/browse/HIVE-21242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16766635#comment-16766635 ]
Gopal V commented on HIVE-21242:
--------------------------------

Java used to use UCS-2; it switched to UTF-16 by default to support supplementary characters.

https://docs.oracle.com/javase/8/docs/technotes/guides/intl/overview.html#textrep

{code}
The primitive data type char in the Java programming language is an unsigned 16-bit integer that can represent a Unicode code point in the range U+0000 to U+FFFF, or the code units of UTF-16.
{code}

> Calcite Planner Logging Indicates UTF-16 Encoding
> -------------------------------------------------
>
> Key: HIVE-21242
> URL: https://issues.apache.org/jira/browse/HIVE-21242
> Project: Hive
> Issue Type: Improvement
> Components: CBO
> Affects Versions: 4.0.0, 3.2.0
> Reporter: BELUGA BEHR
> Priority: Major
>
> I noticed some debug logging from Calcite and it is using UTF-16. I would
> expect UTF-8.
> {code}
> 2019-02-10T19:08:06,393 DEBUG [7db4d3c5-0f88-49db-88fa-ad6428c23784 main]
> parse.CalcitePlanner: Plan after decorrelation:
> HiveSortLimit(offset=[0], fetch=[2])
> HiveProject(_o__c0=[array(3, 2, 1)], _o__c1=[map(1, 2001-01-01, 2, null)],
> _o__c2=[named_struct(_UTF-16LE'c1', 123456, _UTF-16LE'c2', _UTF-16LE'hello',
> _UTF-16LE'c3', array(_UTF-16LE'aa', _UTF-16LE'bb', _UTF-16LE'cc'),
> _UTF-16LE'c4', map(_UTF-16LE'abc', 123, _UTF-16LE'xyz', 456), _UTF-16LE'c5',
> named_struct(_UTF-16LE'c5_1', _UTF-16LE'bye', _UTF-16LE'c5_2', 88))])
> HiveTableScan(table=[[default, src]], table:alias=[src])
> {code}
> I'm not sure if this is a Calcite internal thing which can be configured or
> if this is only an artifact of the way the logging works.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
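
For anyone following the comment about Java's text representation, here is a minimal sketch of what "char is a UTF-16 code unit" means in practice. The class name `Utf16Demo` and its helper methods are hypothetical and only for illustration; they use standard JDK calls and are not part of Hive or Calcite. The sketch does not claim to explain how Calcite picks the `_UTF-16LE` prefix shown in the log.

```java
import java.nio.charset.StandardCharsets;

public class Utf16Demo {

    /** Number of UTF-16 code units, i.e. Java char values, in the string. */
    static int codeUnits(String s) {
        return s.length();
    }

    /** Number of Unicode code points; a surrogate pair counts once. */
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    /** Encoded size in bytes as UTF-16LE (this charset writes no byte-order mark). */
    static int utf16LeBytes(String s) {
        return s.getBytes(StandardCharsets.UTF_16LE).length;
    }

    /** Encoded size in bytes as UTF-8, for comparison. */
    static int utf8Bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // A string inside the Basic Multilingual Plane (U+0000 to U+FFFF).
        String bmp = "hello";
        // A supplementary character (U+1F600), stored as a surrogate pair.
        String supplementary = new String(Character.toChars(0x1F600));

        for (String s : new String[] {bmp, supplementary}) {
            System.out.printf("chars=%d codePoints=%d utf16le=%d utf8=%d%n",
                    codeUnits(s), codePoints(s), utf16LeBytes(s), utf8Bytes(s));
        }
    }
}
```

For `"hello"` the char count and code point count are both 5, and the UTF-16LE encoding takes 10 bytes against 5 for UTF-8. For U+1F600 the string holds 2 chars but only 1 code point, which is the supplementary-character case that UCS-2 could not represent.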