The right mechanism is to implement Kerberos authentication. Think of how
you protect data on UNIX: in the same way, users and groups can be created
and file permissions can be set. If you don't protect the data that way, any
user can read any of it, including ORC files (with the ORC command-line tool).
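
As a rough sketch of what that could look like for the staging area in your
steps 1 to 5: the small Python helper below only builds the HDFS shell
commands and prints them, it does not run anything. The path /staging/csv,
the user etl_user and the group etl_group are made-up names for illustration.
I am also assuming that the queries reading the staging files run as that
user (or a member of that group); if your Hive service account is different,
the mode and group would need adjusting.

```python
import shlex


def staging_lockdown_commands(staging_dir, owner, group, mode="700"):
    """Build the HDFS shell commands that create a staging directory
    and restrict it with UNIX-style ownership and permission bits.

    All argument values are supplied by the caller; nothing here is
    specific to any particular cluster.
    """
    return [
        # Create the directory (and any missing parents).
        ["hdfs", "dfs", "-mkdir", "-p", staging_dir],
        # Set owner and group. Changing the owner normally requires
        # HDFS superuser rights.
        ["hdfs", "dfs", "-chown", f"{owner}:{group}", staging_dir],
        # 700 = owner only; 750 would also let the group read and list.
        ["hdfs", "dfs", "-chmod", mode, staging_dir],
    ]


def cleanup_command(staging_dir):
    """Build the command for removing the staged files (step 5).

    -skipTrash deletes immediately instead of moving the files to the
    trash directory, assuming trash is enabled on the cluster.
    """
    return ["hdfs", "dfs", "-rm", "-r", "-skipTrash", staging_dir]


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for cmd in staging_lockdown_commands("/staging/csv", "etl_user", "etl_group"):
        print(shlex.join(cmd))
    print(shlex.join(cleanup_command("/staging/csv")))
```

The commands would be run before step 1, so the CSV files land in a
directory that is already restricted. Note that the permission bits only
mean something once Kerberos is in place; without authentication a user can
simply claim to be someone else.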

On Thursday, March 17, 2016, Mich Talebzadeh <mich.talebza...@gmail.com>
wrote:

> Hi,
>
> What are the best mechanisms for hiding data destined for Hive tables?
>
> Let us assume that we are loading tons of CSV files into Hive.
>
> The way I do it is:
>
> --1 Move .CSV data into HDFS staging area
> --2 Create an external table.
> --3 Create the ORC table if needed
> --4 Insert or append the data from the external table to the Hive ORC
> table
> --5 Remove CSV files from staging area
>
> During steps 1 to 5 (which may take a good while), sensitive data
> residing on HDFS can be exposed. I would be interested to know possible
> solutions to this potential security breach.
>
> Thanks,
>
> Dr Mich Talebzadeh
>
> LinkedIn:
> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>
> http://talebzadehmich.wordpress.com
