[ https://issues.apache.org/jira/browse/HIVE-9481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Eugene Koifman updated HIVE-9481: --------------------------------- Status: Patch Available (was: Open) > allow column list specification in INSERT statement > --------------------------------------------------- > > Key: HIVE-9481 > URL: https://issues.apache.org/jira/browse/HIVE-9481 > Project: Hive > Issue Type: Bug > Components: Parser, Query Processor, SQL > Affects Versions: 0.14.0 > Reporter: Eugene Koifman > Assignee: Eugene Koifman > Attachments: HIVE-9481.2.patch, HIVE-9481.4.patch, HIVE-9481.patch > > > Given a table FOO(a int, b int, c int), ANSI SQL supports insert into > FOO(c,b) select x,y from T. The expectation is that 'x' is written to column > 'c' and 'y' is written column 'b' and 'a' is set to NULL, assuming column 'a' > is NULLABLE. > Hive does not support this. In Hive one has to ensure that the data > producing statement has a schema that matches target table schema. > Since Hive doesn't support DEFAULT value for columns in CREATE TABLE, when > target schema is explicitly provided, missing columns will be set to NULL if > they are NULLABLE, otherwise an error will be raised. > If/when DEFAULT clause is supported, this can be enhanced to set default > value rather than NULL. > Thus, given {noformat} > create table source (a int, b int); > create table target (x int, y int, z int); > create table target2 (x int, y int, z int); > {noformat} > {noformat}insert into target(y,z) select * from source;{noformat} > will mean > {noformat}insert into target select null as x, a, b from source;{noformat} > and > {noformat}insert into target(z,y) select * from source;{noformat} > will meant > {noformat}insert into target select null as x, b, a from source;{noformat} > Also, > {noformat} > from source > insert into target(y,z) select null as x, * > insert into target2(y,z) select null as x, source.*; > {noformat} > and for partitioned tables, given > {noformat} > Given: > CREATE TABLE pageviews (userid VARCHAR(64), link STRING, "from" STRING) > PARTITIONED BY (datestamp STRING) CLUSTERED BY (userid) INTO 256 BUCKETS > STORED AS ORC; > INSERT INTO TABLE pageviews PARTITION (datestamp = '2014-09-23')(userid,link) > > VALUES ('jsmith', 'mail.com'); > {noformat} > And dynamic partitioning > {noformat} > INSERT INTO TABLE pageviews PARTITION (datestamp)(userid,datestamp,link) > VALUES ('jsmith', '2014-09-23', 'mail.com'); > {noformat} > In all cases, the schema specification contains columns of the target table > which are matched by position to the values produced by VALUES clause/SELECT > statement. If the producer side provides values for a dynamic partition > column, the column should be in the specified schema. Static partition > values are part of the partition spec and thus are not produced by the > producer and should not be part of the schema specification. -- This message was sent by Atlassian JIRA (v6.3.4#6332)