Hey again,
I thought it would be easy to combine the key and the value, but I ran into
difficulties. Has anyone written a generic FileInputFormat that prepends the
key to the value?
Anyhow, here is the code I am trying to write.
I have a class that extends SequenceFileInputFormat:
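import java.io.IOException;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.Reporter;
import org.apache.hadoop.mapred.SequenceFileInputFormat;

public class CombinedSequenceFileInputFormat<K extends Writable, V extends Writable>
        extends SequenceFileInputFormat<K, V> {
    @Override
    public org.apache.hadoop.mapred.RecordReader<K, V> getRecordReader(
            org.apache.hadoop.mapred.InputSplit split, JobConf job,
            Reporter reporter) throws IOException {
        // Wrap the stock sequence-file reader so the wrapper can see each key/value pair.
        CombinedSequenceRecordReader<K, V> wrap =
                new CombinedSequenceRecordReader<K, V>(super.getRecordReader(split, job, reporter));
        return wrap;
    }
}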
It returns the wrapped RecordReader. The code of that wrapper is:
import java.io.IOException;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapred.RecordReader;

public class CombinedSequenceRecordReader<K extends Writable, V> implements RecordReader<K, V> {
    // The underlying reader; every call is currently delegated to it unchanged.
    private RecordReader<K, V> proxy;
    // Key object created by createKey(); not yet used when building the value.
    private K currentKey;

    public CombinedSequenceRecordReader(RecordReader<K, V> proxy) {
        this.proxy = proxy;
    }

    public void setProxy(RecordReader<K, V> proxy) {
        this.proxy = proxy;
    }

    public RecordReader<K, V> getProxy() {
        return proxy;
    }

    @Override
    public boolean next(K key, V value) throws IOException {
        return proxy.next(key, value);
    }

    @Override
    public K createKey() {
        currentKey = proxy.createKey();
        return currentKey;
    }

    @Override
    public V createValue() {
        V val = proxy.createValue();
        return val;
    }

    @Override
    public long getPos() throws IOException {
        return proxy.getPos();
    }

    @Override
    public void close() throws IOException {
        proxy.close();
    }

    @Override
    public float getProgress() throws IOException {
        return proxy.getProgress();
    }
}
Now I am trying to extend createValue so that the value it returns also
carries the key. Any suggestions?
-----Original Message-----
From: Edward Capriolo [mailto:[email protected]]
Sent: Sunday, January 16, 2011 10:33 PM
To: [email protected]
Subject: Re: Sequence file- custom serdes - question
2011/1/16 Guy Doulberg <[email protected]>:
> Hey all,
>
> I am new to Hive, but I have a very complex task to perform and I am
> a little stuck. I hope someone here can help.
>
> My team has been storing data in a custom sequence file that has a custom key
> and a custom value. We want to expose a Hive interface to query this data.
> I have been trying to write a custom SerDe that deserializes the sequence
> file into a Hive table.
>
> As long as I only needed fields from the value part of the record, everything
> was all right. But when I needed to extract a field from the key part, I got
> stuck: I realized that in the method deserialize(Writable o), o is an
> instance of the value class, and I don't know how to access the key
> object.
>
> It could be that I am missing something in the configuration, in the Java
> code, or in the table declaration in Hive.
>
>
>
> Thanks,
> Guy
>
>
>
>
>
Hive ignores the key! (I know, crazy, right?) What I have done is
use my InputFormat to combine the key and the value and make the
combined field the value.
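
A minimal sketch of that wrapping idea, in plain Java so it runs without Hadoop on the classpath. Everything here is hypothetical and for illustration only: SimpleReader is a stand-in that only mirrors the shape of the old-API RecordReader (next() fills caller-supplied objects), StringBuilder stands in for a mutable Writable such as Text, and KeyValuePair, KeyPrependingReader and ListReader are made-up names, not anything from Hive or Hadoop. In a real job the pair class would presumably need to be a Writable that the SerDe knows how to unpack.

```java
/** Hypothetical stand-in for the shape of the old-API RecordReader. */
interface SimpleReader<K, V> {
    K createKey();
    V createValue();
    boolean next(K key, V value);
}

/** The combined "value" handed downstream: it carries the key as well. */
final class KeyValuePair<K, V> {
    K key;          // set on every successful next(); null before the first call
    final V value;  // reused value object, filled by the wrapped reader

    KeyValuePair(V value) {
        this.value = value;
    }
}

/** Wraps a reader of (K, V) and exposes (K, KeyValuePair<K, V>) instead. */
final class KeyPrependingReader<K, V> implements SimpleReader<K, KeyValuePair<K, V>> {
    private final SimpleReader<K, V> proxy;

    KeyPrependingReader(SimpleReader<K, V> proxy) {
        this.proxy = proxy;
    }

    public K createKey() {
        return proxy.createKey();
    }

    public KeyValuePair<K, V> createValue() {
        // The combined value owns the wrapped reader's value object.
        return new KeyValuePair<K, V>(proxy.createValue());
    }

    public boolean next(K key, KeyValuePair<K, V> pair) {
        // Let the wrapped reader fill the key and the inner value ...
        boolean hasRecord = proxy.next(key, pair.value);
        if (hasRecord) {
            // ... then attach the key so a consumer that only sees the
            // value (as described above for Hive) can still reach it.
            pair.key = key;
        }
        return hasRecord;
    }
}

/** In-memory reader used only to exercise the wrapper. */
final class ListReader implements SimpleReader<StringBuilder, StringBuilder> {
    private final String[][] records;
    private int index = 0;

    ListReader(String[][] records) {
        this.records = records;
    }

    public StringBuilder createKey() {
        return new StringBuilder();
    }

    public StringBuilder createValue() {
        return new StringBuilder();
    }

    public boolean next(StringBuilder key, StringBuilder value) {
        if (index >= records.length) {
            return false;
        }
        // Overwrite the reused objects in place, as Hadoop readers do.
        key.setLength(0);
        key.append(records[index][0]);
        value.setLength(0);
        value.append(records[index][1]);
        index++;
        return true;
    }
}

class KeyPrependingDemo {
    public static void main(String[] args) {
        KeyPrependingReader<StringBuilder, StringBuilder> reader =
                new KeyPrependingReader<StringBuilder, StringBuilder>(
                        new ListReader(new String[][] {{"k1", "v1"}, {"k2", "v2"}}));
        StringBuilder key = reader.createKey();
        KeyValuePair<StringBuilder, StringBuilder> pair = reader.createValue();
        while (reader.next(key, pair)) {
            System.out.println(pair.key + "|" + pair.value);
        }
    }
}
```

Note that the wrapper changes the value type of the reader (from V to a pair type), so the InputFormat that returns it would have to be declared with the combined type as its value type too. The pair here only holds a reference to the reused key object; whether a copy is needed depends on how long the consumer keeps the value around.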