[ 
https://issues.apache.org/jira/browse/HIVE-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14358084#comment-14358084
 ] 

Fanhong Li commented on HIVE-5317:
----------------------------------

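INSERT INTO ... VALUES() does not store UTF-8 characters correctly: non-ASCII characters in the inserted string come back garbled on SELECT. Reproduction below (the table definition is at the end of this comment).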



insert into table test_acid partition(pt='pt_2')
 values( 2, '中文_2' , 'city_2' )
 ;

hive> select *
 > from test_acid 
 > ;
 OK
 2 -�_2 city_2 pt_2
 Time taken: 0.237 seconds, Fetched: 1 row(s)
 hive> 
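
One possible explanation, offered only as a hypothesis and not confirmed anywhere in this thread: the garbled value is what you would get if each character of the literal were truncated to its low 8 bits before being written. The sketch below is plain Python, independent of Hive; it uses only the literal from the INSERT above and shows that this truncation reproduces the displayed value. The variable names are illustrative.

```python
# Hypothesis check (not Hive code): keep only the low byte of each
# character's code point, then decode the bytes as UTF-8 the way a
# terminal would, substituting U+FFFD for invalid sequences.
text = "中文_2"  # the literal from the INSERT statement

# U+4E2D -> 0x2D ('-'), U+6587 -> 0x87 (a lone continuation byte, invalid)
low_bytes = bytes(ord(ch) & 0xFF for ch in text)

shown = low_bytes.decode("utf-8", errors="replace")
print(shown)
```

If this is what happens, the damage occurs on the write path while the VALUES literal is processed, so the data stored in the ORC file would already be wrong and not just its display.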

CREATE TABLE test_acid(id INT, 
 name STRING, 
 city STRING) 
 PARTITIONED BY (pt STRING)
 clustered by (id) into 1 buckets
 stored as ORCFILE
 TBLPROPERTIES('transactional'='true')
 ;


> Implement insert, update, and delete in Hive with full ACID support
> -------------------------------------------------------------------
>
>                 Key: HIVE-5317
>                 URL: https://issues.apache.org/jira/browse/HIVE-5317
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: Owen O'Malley
>            Assignee: Owen O'Malley
>             Fix For: 0.14.0
>
>         Attachments: InsertUpdatesinHive.pdf
>
>
> Many customers want to be able to insert, update, and delete rows in Hive 
> tables with full ACID support. The use cases vary, but the forms of query 
> that should be supported are:
> * INSERT INTO tbl SELECT …
> * INSERT INTO tbl VALUES ...
> * UPDATE tbl SET … WHERE …
> * DELETE FROM tbl WHERE …
> * MERGE INTO tbl USING src ON … WHEN MATCHED THEN ... WHEN NOT MATCHED THEN 
> ...
> * SET TRANSACTION LEVEL …
> * BEGIN/END TRANSACTION
> Use Cases
> * Once an hour, a set of inserts and updates (up to 500k rows) for various 
> dimension tables (e.g. customer, inventory, stores) needs to be processed. The 
> dimension tables have primary keys and are typically bucketed and sorted on 
> those keys.
> * Once a day a small set (up to 100k rows) of records needs to be deleted for 
> regulatory compliance.
> * Once an hour a log of transactions is exported from an RDBMS and the fact 
> tables need to be updated (up to 1m rows) to reflect the new data. The 
> transactions are a combination of inserts, updates, and deletes. The table is 
> partitioned and bucketed.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
