Re: using the key from a SequenceFile

2012-04-19 Thread David Kulp
ust like a text file, only it omits the key >> completely and only uses the value part of it, other than that you won’t >> notice the difference between sequence or plain text file >> >> From: David Kulp [mailto:dk...@fiksu.com] >> Sent: Thursday, April 19, 2012 2:13

RE: using the key from a SequenceFile

2012-04-19 Thread Ruben de Vries
You're a lifesaver! From: Dilip Joseph [mailto:dilip.antony.jos...@gmail.com] Sent: Thursday, April 19, 2012 5:47 PM To: user@hive.apache.org Subject: Re: using the key from a SequenceFile An example input format for using SequenceFile keys in hive is at https://gist.github.com/2421795 .

Re: using the key from a SequenceFile

2012-04-19 Thread Dilip Joseph
An example input format for using SequenceFile keys in Hive is at https://gist.github.com/2421795 . The code just reverses how the key and value are accessed in the standard SequenceFileRecordReader and SequenceFileInputFormat that come with Hadoop. You can use this custom input format by spec

Re: using the key from a SequenceFile

2012-04-19 Thread Owen O'Malley
On Thu, Apr 19, 2012 at 3:07 AM, Ruben de Vries wrote: > I’m trying to migrate a part of our current hadoop jobs from normal > mapreduce jobs to hive, > > Previously the data was stored in sequencefiles with the keys containing > valuable data! I think you'll want to define your table using a cu

Re: using the key from a SequenceFile

2012-04-19 Thread David Kulp
ses the value part of it, other than that you won’t > notice the difference between sequence or plain text file > > From: David Kulp [mailto:dk...@fiksu.com] > Sent: Thursday, April 19, 2012 2:13 PM > To: user@hive.apache.org > Subject: Re: using the key from a SequenceFile >

RE: using the key from a SequenceFile

2012-04-19 Thread Ruben de Vries
PM To: user@hive.apache.org Subject: Re: using the key from a SequenceFile I'm trying to achieve something very similar. I want to write an MR program that writes results in a record-based sequencefile that would be directly readable from hive as though it were created using "STORED

Re: using the key from a SequenceFile

2012-04-19 Thread David Kulp
I'm trying to achieve something very similar. I want to write an MR program that writes results in a record-based sequencefile that would be directly readable from hive as though it were created using "STORED AS SEQUENCEFILE" with, say, BinarySortableSerDe. From this discussion it seems that H

Re: using the key from a SequenceFile

2012-04-19 Thread madhu phatak
zes / deserializes the value part of the > sequencefile :( ? > > *From:* madhu phatak [mailto:phatak@gmail.com] > *Sent:* Thursday, April 19, 2012 12:16 PM > *To:* user@hive.apache.org > *Subject:* Re: using the key from a SequenceFile > > Ser

RE: using the key from a SequenceFile

2012-04-19 Thread Ruben de Vries
AFAIK SerDe only serializes / deserializes the value part of the sequencefile :( ? From: madhu phatak [mailto:phatak@gmail.com] Sent: Thursday, April 19, 2012 12:16 PM To: user@hive.apache.org Subject: Re: using the key from a SequenceFile Serde will allow you to create custom data from your

Re: using the key from a SequenceFile

2012-04-19 Thread madhu phatak
SerDe will allow you to create custom data from your SequenceFile: https://cwiki.apache.org/confluence/display/Hive/SerDe On Thu, Apr 19, 2012 at 3:37 PM, Ruben de Vries wrote: > I’m trying to migrate a part of our current hadoop jobs from normal > mapreduce jobs to hive, > > Previously the d

using the key from a SequenceFile

2012-04-19 Thread Ruben de Vries
I'm trying to migrate a part of our current Hadoop jobs from normal MapReduce jobs to Hive. Previously the data was stored in SequenceFiles with the keys containing valuable data! However, if I load the data into a table I lose that key data (or at least I can't access it with Hive). I want to