Thank you Alan.

-----Original Message-----
From: Alan Gates [mailto:ga...@hortonworks.com] 
Sent: Thursday, July 18, 2013 5:45 PM
To: user@hive.apache.org
Subject: Re: Hive Architecture - Execution on nodes


On Jul 18, 2013, at 1:40 PM, Tzur Turkenitz wrote:

> Hello,
> Just finished reading the Hive-Architecture pdf, and failed to find the
> answers I was hoping for. So here I am, hoping this community will shed
> some light.
> I think I know what the answers will be, but I need that bolted down and
> secured.
>
> We are concerned about how data is transferred between data-nodes and
> Hive, especially when it comes to clusters where there's no SSL between
> nodes.
>
> And this is the use case:
> 1. Table employee is a Hive table, with SerDe
> 2. MapReduce job accesses the table employee, which holds encrypted data
> 3. SerDe decrypts the data
> 4. Post-SerDe output is returned to the MapReduce job and saved to a new
>    Hive table using a new encryption implementation
>
> The flow, as I think it currently is, should be:
> MapReduce job --> Read table metadata --> SerDe creates map-reduce job
> --> distributes across nodes
>
> Which means that data is decrypted on the local nodes and then sent in
> clear-text back to the original map-reduce job to be saved in a new table.
> Is that correct? :(

No.  Data deserialization (which is what a SerDe does, not decryption) is
done as part of reading the data in the MapReduce job.  For the most part,
only query parsing, validation, and planning are done on the client node.

Alan.
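
A toy sketch of the division of work described in the reply above: the
client only plans, and each record is deserialized inside the task that
reads it. This is a simplified model, not Hive code. The names ToySerDe,
MapTask and plan_on_client, the node names, and the comma-delimited record
format are all assumptions made up for illustration.

```python
# Toy model only: ToySerDe, MapTask and plan_on_client are hypothetical
# names for illustration, not Hive or Hadoop APIs.
from dataclasses import dataclass, field


@dataclass
class ToySerDe:
    """Stand-in for a SerDe: turns one stored record into a row of fields."""
    delimiter: str = ","

    def deserialize(self, raw):
        return raw.decode("utf-8").split(self.delimiter)


@dataclass
class MapTask:
    """One task of the job, running on the node named in `node`."""
    node: str
    split: list
    deserialized_on: list = field(default_factory=list)

    def run(self, serde):
        rows = []
        for raw in self.split:
            # Deserialization happens here, while the task reads its split.
            rows.append(serde.deserialize(raw))
            self.deserialized_on.append(self.node)
        return rows


def plan_on_client(query, splits):
    """Client side: validate and plan only; no record is deserialized here."""
    if not query.strip():
        raise ValueError("empty query")
    return [MapTask(node=node, split=records) for node, records in splits.items()]


# Assumed layout: which (made-up) node holds which raw records.
splits = {
    "datanode-1": [b"1,alice"],
    "datanode-2": [b"2,bob", b"3,carol"],
}
tasks = plan_on_client("SELECT * FROM employee", splits)
rows = [row for task in tasks for row in task.run(ToySerDe())]
```

In this model the client never touches a record; every call to
deserialize is made by a task, and each task records the node it ran on.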

