[ https://issues.apache.org/jira/browse/HIVE-19580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16772256#comment-16772256 ]
Darrell Ross edited comment on HIVE-19580 at 2/19/19 7:22 PM:
--------------------------------------------------------------
Amazon appears to just require lower case.
https://issues.apache.org/jira/browse/HIVE-19580

was (Author: eukota): Amazon appears to just require lower case.
https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-emrfs-iam-roles.html

> Hive 2.3.2 with ORC files stored on S3 are case sensitive
> ---------------------------------------------------------
>
> Key: HIVE-19580
> URL: https://issues.apache.org/jira/browse/HIVE-19580
> Project: Hive
> Issue Type: Bug
> Affects Versions: 2.3.2
> Environment: EMR s3:// connector
> Spark 2.3 but also true for lower versions
> Hive 2.3.2
> Reporter: Arthur Baudry
> Priority: Major
> Fix For: 2.3.2
>
> Original file is CSV:
> COL1,COL2
> 1,2
> ORC files are created with Spark 2.3:
> scala> val df = spark.read.option("header","true").csv("/user/hadoop/file")
> scala> df.printSchema
> root
> |-- COL1: string (nullable = true)
> |-- COL2: string (nullable = true)
> scala> df.write.orc("s3://bucket/prefix")
> In Hive:
> hive> CREATE EXTERNAL TABLE test_orc(COL1 STRING, COL2 STRING) STORED AS ORC
> LOCATION 's3://bucket/prefix';
> hive> SELECT * FROM test_orc;
> OK
> NULL NULL
> *Every field is null. However, if the fields are generated using lower case
> in the Spark schema, everything works.*
> I'm raising this bug because we have customers using Hive 2.3.2 to read
> files we generate through Spark, and all of our code base addresses fields
> using upper case, which is incompatible with their Hive instance.
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
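A sketch of the workaround implied by the report: since Hive here matches ORC column names case-sensitively against the table schema, lower-casing the column names before writing the ORC files avoids the NULL rows. The helper below is hypothetical (not from the issue) and is written in plain Python so it runs without Spark; in PySpark the equivalent rename would be `df = df.toDF(*[c.lower() for c in df.columns])`, and in Scala `df.toDF(df.columns.map(_.toLowerCase): _*)`.

```python
def lowercase_columns(columns):
    """Return lower-cased column names for writing case-insensitive ORC files.

    Rejects inputs where lower-casing would collapse two distinct names
    (e.g. COL1 and col1), which would otherwise produce a corrupt schema.
    """
    lowered = [c.lower() for c in columns]
    if len(set(lowered)) != len(lowered):
        raise ValueError("lower-casing would create duplicate column names")
    return lowered


print(lowercase_columns(["COL1", "COL2"]))  # ['col1', 'col2']
```

With the columns renamed this way before `df.write.orc(...)`, the Hive table declared as `(col1 STRING, col2 STRING)` resolves the ORC columns by name and the rows are no longer NULL.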