Hive tables can sit directly on top of S3 storage, so you don't really need a
separate export process.
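
For illustration, here is a minimal HiveQL sketch of an update-history table
kept on S3. The table name, columns, and bucket path are hypothetical
placeholders, not anything from your setup, so adjust them to your data:

```sql
-- Hypothetical table, column, and bucket names, for illustration only.
-- EXTERNAL means Hive tracks only the metadata; the files stay in S3,
-- and dropping the table does not delete them.
CREATE EXTERNAL TABLE IF NOT EXISTS update_history (
  entity_id     BIGINT,  -- assumed key of the record that changed
  new_status    STRING,  -- assumed status value after the change
  new_location  STRING,  -- assumed location value after the change
  changed_at    STRING   -- assumed change time, stored as text
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION 's3://example-bucket/hive/update_history/';
```

Under those assumptions, a newly created cluster should only need to run the
same statement again to see the existing data, since the rows live in S3
rather than on the cluster itself.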

thanks,
Shrikanth
On May 15, 2012, at 11:35 AM, Jon Palmer wrote:

> That seems like a very reasonable approach. However, if we use a technology 
> like Amazon Elastic MapReduce, my Hive cluster is (potentially) going to be 
> destroyed and recreated. As a result, I'd really need to export the update 
> history Hive table to some other store (like S3) so that it can be 
> re-imported on the next spin up of the Hive cluster. Do I have that right?
> 
> Jon
> 
> -----Original Message-----
> From: shrikanth shankar [mailto:sshan...@qubole.com] 
> Sent: Tuesday, May 15, 2012 1:14 PM
> To: user@hive.apache.org
> Subject: Re: What's the right data storage/representation?
> 
> I would agree on keeping track of the history of updates in a separate table 
> in Hive (you may not need to maintain it in the application tier). This 
> pattern seems to be the "Slowly Changing Dimension" pattern used in other 
> (more traditional) data warehouses. I suspect the challenge here would be 
> writing an ETL process to maintain the Hive table based on the current status 
> of the application db table.
> 
> Shrikanth
> On May 15, 2012, at 9:41 AM, Owen O'Malley wrote:
> 
>> On Tue, May 15, 2012 at 5:11 AM, Jon Palmer <jpal...@care.com> wrote:
>>> I can see a few potential solutions:
>>> 
>>> 1. Don't solve it. Accept that you have some artifacts in your
>>> reporting data that cannot be recovered from the source data.
>>> 
>>> 2. Create status and location history tables in the application db and
>>> use that during the analytics process.
>>> 
>>> 3. Log the status and location change 'events' to some other log file
>>> and use those logs in the Hive analysis.
>> 
>> I would probably create a Hive table that includes the status and 
>> location updates. One of the advantages of Hive & Hadoop is that it is 
>> easy to store the raw information in bulk and continue to process it.
>> Once you have the information, you will likely find new uses for it.
>> 
>> -- Owen
