[ https://issues.apache.org/jira/browse/HIVE-16667?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16015947#comment-16015947 ]
Naveen Gangam commented on HIVE-16667: -------------------------------------- I am able to reproduce the issue now. Nothing has changed since last night except I rebuilt the product from scratch this morning. So the HiveConf changes for it to use DirectSQL might not have picked up last night. {code} hive> CREATE TABLE srcpart (key STRING COMMENT 'default', value STRING COMMENT 'default') PARTITIONED BY (ds STRING, hr STRING) STORED AS TEXTFILE; OK Time taken: 11.637 seconds hive> LOAD DATA LOCAL INPATH "/Users/ngangam/THIRD/apache/hive/data/files/kv1.txt" OVERWRITE INTO TABLE srcpart PARTITION (ds="2008-04-09",hr="11"); Loading data to table default.srcpart partition (ds=2008-04-09, hr=11) OK Time taken: 132.753 seconds hive> select * from srcpart; OK Failed with exception java.io.IOException:java.lang.IllegalArgumentException: Error: type expected at the position 0 of '25200:25201' but '25200' is found. Time taken: 24.4 seconds hive> {code} Let me debug thru this and see if I can come up with a fix. So from ur experience, how do we determine if text is being stored in-line or out-of-line by the database? > PostgreSQL metastore handling of CLOB types for COLUMNS_V2.TYPE_NAME and > other field is incorrect > ------------------------------------------------------------------------------------------------- > > Key: HIVE-16667 > URL: https://issues.apache.org/jira/browse/HIVE-16667 > Project: Hive > Issue Type: Bug > Reporter: Remus Rusanu > Assignee: Naveen Gangam > Attachments: HiveCLIOutput.txt, PostgresDBOutput.txt > > > The CLOB JDO type introduced with HIVE-12274 does not work correctly with > PostgreSQL. The value is written out-of-band and the LOB handle is written,as > an INT, into the table. SELECTs return the INT value, which should had been > read via the {{lo_get}} PG built-in, and then cast into string. > Furthermore, the behavior is different between fields upgraded from earlier > metastore versions (they retain their string storage) vs. values inserted > after the upgrade (inserted as LOB roots). > Teh code in > {{MetasoreDirectSql.getPartitionsFromPartitionIds/extractSqlClob}} expects > the underlying JDO/Datanucleus to map the column to a {{Clob}} but that does > not happen, the value is a Java String containing the int which is the LOB > root saved by PG. > This manifests at runtime with errors like: > {code} > hive> select * from srcpart; > Failed with exception java.io.IOException:java.lang.IllegalArgumentException: > Error: type expected at the position 0 of '24030:24031' but '24030' is found. > {code} > the 24030:24031 should be 'string:string'. > repro: > {code} > CREATE TABLE srcpart (key STRING COMMENT 'default', value STRING COMMENT > 'default') PARTITIONED BY (ds STRING, hr STRING) STORED AS TEXTFILE; > LOAD DATA LOCAL INPATH "${hiveconf:test.data.dir}/kv1.txt" OVERWRITE INTO > TABLE srcpart PARTITION (ds="2008-04-09", hr="11"); > select * from srcpart; > {code} > I did not see the issue being hit by non-partitioned/textfile tables, but > that is just the luck of the path taken by the code. Inspection of my PG > metastore shows all the CLOB fields suffering from this issue. -- This message was sent by Atlassian JIRA (v6.3.15#6346)