[ 
https://issues.apache.org/jira/browse/HIVE-6329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14111827#comment-14111827
 ] 

Larry McCay commented on HIVE-6329:
-----------------------------------

I see, [~navis]. So, the intent of this patch is to provide a hook within the 
SerDe mechanism with enough fidelity to do encryption, but the initial 
implementation provides only a Base64 encoding. That helps me understand the 
patch better, and I think you have accomplished this.

I would be a bit leery of calling the hook and Base64 implementation that we 
are providing in this patch "column level encryption/decryption" - even though 
you are enabling someone to use it for that. What this patch actually 
introduces is column/value encoding/decoding. Base64 is trivially reversible, 
and because the same input always encodes to the same output, encoded columns 
remain joinable across tables, which allows correlations to be made.
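
To make the concern concrete, here is a minimal sketch that decodes the two 
encoded values shown in the issue description below. It uses the JDK's 
java.util.Base64 purely for illustration; the class name Base64Reversibility 
is hypothetical and this is not the code from the patch.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical illustration class; not part of HIVE-6329.
public class Base64Reversibility {

    // Decode a Base64 string back to UTF-8 plaintext. No key is involved.
    static String decode(String encoded) {
        byte[] raw = Base64.getDecoder().decode(encoded);
        return new String(raw, StandardCharsets.UTF_8);
    }

    // Encode UTF-8 plaintext to Base64. The same input always yields the
    // same output, so encoded columns can still be joined on equality.
    static String encode(String plain) {
        byte[] raw = plain.getBytes(StandardCharsets.UTF_8);
        return Base64.getEncoder().encodeToString(raw);
    }

    public static void main(String[] args) {
        // Values taken from the "select * from encode_test" output.
        System.out.println(decode("MDEwLTAwMDAtMDAwMA=="));
        System.out.println(decode("U2VvdWwsIFNlb2Nobw=="));
        // Deterministic: two tables holding the same phone number would
        // store identical encoded strings.
        System.out.println(encode("010-0000-0000").equals(encode("010-0000-0000")));
    }
}
```

Anyone with read access to the stored data can recover the phone and address 
columns this way, which is why I would describe the feature as encoding 
rather than encryption.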

Are we able to frame the use case that is actually represented by this patch as 
a problem that needs solving, or do we need to make this implementation more 
robust in terms of encryption/decryption and all the key management 
requirements that come with doing that properly?

I am just concerned about introducing new interfaces and hooks that need to be 
supported if they are not what we would consider strategic implementation 
choices for a given feature like encryption. Does the SerDe mechanism provide 
everything that we need? It seems like this approach provides little in terms 
of key management and metadata, which are requisite for encryption mechanisms. 
Though I may still be missing the forest for the trees.

What I would like to do is ensure that our customers have a path forward with 
their needs met, while not moving this forward in Apache until we have an 
actual encryption mechanism available.

Does that make sense?

What do you think that will require?

> Support column level encryption/decryption
> ------------------------------------------
>
>                 Key: HIVE-6329
>                 URL: https://issues.apache.org/jira/browse/HIVE-6329
>             Project: Hive
>          Issue Type: New Feature
>          Components: Security, Serializers/Deserializers
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Minor
>         Attachments: HIVE-6329.1.patch.txt, HIVE-6329.10.patch.txt, 
> HIVE-6329.11.patch.txt, HIVE-6329.2.patch.txt, HIVE-6329.3.patch.txt, 
> HIVE-6329.4.patch.txt, HIVE-6329.5.patch.txt, HIVE-6329.6.patch.txt, 
> HIVE-6329.7.patch.txt, HIVE-6329.8.patch.txt, HIVE-6329.9.patch.txt
>
>
> We have recently received some requirements for encryption, but Hive does 
> not support it. Until the full implementation via HIVE-5207 is available, 
> this might be useful for some cases.
> {noformat}
> hive> create table encode_test(id int, name STRING, phone STRING, address 
> STRING) 
>     > ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe' 
>     > WITH SERDEPROPERTIES ('column.encode.columns'='phone,address', 
> 'column.encode.classname'='org.apache.hadoop.hive.serde2.Base64WriteOnly') 
> STORED AS TEXTFILE;
> OK
> Time taken: 0.584 seconds
> hive> insert into table encode_test select 
> 100,'navis','010-0000-0000','Seoul, Seocho' from src tablesample (1 rows);
> ......
> OK
> Time taken: 5.121 seconds
> hive> select * from encode_test;
> OK
> 100   navis     MDEwLTAwMDAtMDAwMA==  U2VvdWwsIFNlb2Nobw==
> Time taken: 0.078 seconds, Fetched: 1 row(s)
> hive> 
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.2#6252)
