RE: using the key from a SequenceFile

Ruben de Vries Thu, 19 Apr 2012 08:49:41 -0700

You're a lifesaver!

From: Dilip Joseph [mailto:dilip.antony.jos...@gmail.com]
Sent: Thursday, April 19, 2012 5:47 PM
To: user@hive.apache.org
Subject: Re: using the key from a SequenceFile

An example input format for using SequenceFile keys in Hive is at 
https://gist.github.com/2421795 .  The code just reverses how the key and value 
are accessed in the standard SequenceFileRecordReader and 
SequenceFileInputFormat that come with Hadoop.

You can use this custom input format by specifying the following when you 
create the table:

STORED AS
    INPUTFORMAT 'com.mycompany.SequenceFileKeyInputFormat'
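
For illustration, a full table definition using this input format might look 
like the sketch below. The jar path, table name, column, and location are 
hypothetical placeholders, and com.mycompany.SequenceFileKeyInputFormat stands 
for the gist's code compiled under your own package. The OUTPUTFORMAT line is 
an assumption: it names Hive's stock SequenceFile output format, on the 
understanding that Hive's DDL pairs INPUTFORMAT with an OUTPUTFORMAT clause.

```sql
-- Make the custom input format class visible to Hive (hypothetical jar path).
ADD JAR /path/to/sequencefile-key-inputformat.jar;

-- Hypothetical table: assumes each SequenceFile key is text holding one field.
CREATE EXTERNAL TABLE keys_from_seqfile (
    key_data STRING
)
STORED AS
    INPUTFORMAT 'com.mycompany.SequenceFileKeyInputFormat'
    OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat'
LOCATION '/path/to/sequencefiles';  -- hypothetical HDFS directory
```

If the keys hold several delimited fields, a ROW FORMAT clause placed before 
STORED AS would be the usual place to describe them.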

Dilip

On Thu, Apr 19, 2012 at 6:09 AM, Owen O'Malley <omal...@apache.org> wrote:
On Thu, Apr 19, 2012 at 3:07 AM, Ruben de Vries <ruben.devr...@hyves.nl> wrote:
> I'm trying to migrate part of our current Hadoop jobs from normal
> MapReduce jobs to Hive.
>
> Previously the data was stored in SequenceFiles with the keys containing
> valuable data!
I think you'll want to define your table using a custom InputFormat
that creates a virtual row based on both the key and value, and then
use the 'STORED AS INPUTFORMAT ...' clause when you create the table.

-- Owen

--
_________________________________________
Dilip Antony Joseph
http://csgrad.blogspot.com
http://www.marydilip.info
