Steven,

Thanks for your reply! I have written it the way you mentioned, based on an earlier post on this mailing list. I'm concerned about having to encode and decode the string in base64, and I'm wondering how much this will impact my job run time.
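To make the base64 concern concrete, here is a minimal sketch of the round trip, assuming Java 8 or later for `java.util.Base64`. The class name `Base64RoundTrip`, the 3 KiB payload, and the iteration count are hypothetical and chosen only for illustration. The timing loop is crude, so treat its output as a rough indication rather than a benchmark.

```java
import java.util.Arrays;
import java.util.Base64;
import java.util.Random;

public class Base64RoundTrip {
    // Encode raw bytes as a base64 string, as they would be stored in a string column.
    static String encode(byte[] raw) {
        return Base64.getEncoder().encodeToString(raw);
    }

    // Decode the stored string back into the original bytes, as a UDF would on read.
    static byte[] decode(String stored) {
        return Base64.getDecoder().decode(stored);
    }

    public static void main(String[] args) {
        byte[] raw = new byte[3 * 1024]; // hypothetical serialized object
        new Random(42).nextBytes(raw);

        String stored = encode(raw);
        System.out.println("raw bytes:     " + raw.length);
        System.out.println("encoded chars: " + stored.length());
        System.out.println("round trip ok: " + Arrays.equals(raw, decode(stored)));

        // Crude timing of repeated decodes; results vary by JVM and machine.
        int iterations = 100_000; // hypothetical record count
        long start = System.nanoTime();
        long checksum = 0;
        for (int i = 0; i < iterations; i++) {
            checksum += decode(stored).length;
        }
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("decoded " + checksum + " bytes in " + elapsedMs + " ms");
    }
}
```

Apart from CPU time, base64 stores every 3 bytes as 4 characters, so the encoded field is about one third larger than the raw data.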
I have also written a UDF that emits a byte array, stored in a field of type array<tinyint>. When reading this field, the ObjectInspector is a ListObjectInspector with primitiveJavaByte for the list elements. Reading this field in the UDF seems clunky because I have to iterate over the list, copying each byte into a byte array, before I can use it.

Given both approaches, which one do you think has the lower performance overhead?

Thanks,
Luke

On 5/23/11 6:59 PM, "Steven Wong" <sw...@netflix.com> wrote:

>Hive does not support the blob data type. An option is to store your
>binary data encoded as string (such as using base64) and define them in
>Hive as string.
>
>
>-----Original Message-----
>From: Luke Forehand [mailto:luke.foreh...@networkedinsights.com]
>Sent: Monday, May 23, 2011 1:21 PM
>To: user@hive.apache.org
>Subject: hive storing a byte array
>
>Hello,
>
>Can someone please provide an example in Hive, how I can store a
>serialized object in a field? A field type of byte array or binary or
>blob is really what I was looking for, but if something slightly less
>trivial is involved some instruction would be much appreciated. This
>object is used in a custom UDF later on in the processing pipeline.
>
>-Luke
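For comparison, the list-copy step from the array<tinyint> approach can be sketched as below. This is a standalone illustration, not the actual UDF: the class name `ByteListCopy` and the method `toByteArray` are hypothetical, and the input list stands in for what the ListObjectInspector would hand back (I am assuming its elements arrive as boxed `Byte` values).

```java
import java.util.ArrayList;
import java.util.List;

public class ByteListCopy {
    // Copy a list of boxed numeric elements into a primitive byte array.
    // Each element is unboxed individually, which is the per-byte cost in question.
    static byte[] toByteArray(List<?> elements) {
        byte[] out = new byte[elements.size()];
        for (int i = 0; i < out.length; i++) {
            out[i] = ((Number) elements.get(i)).byteValue();
        }
        return out;
    }

    public static void main(String[] args) {
        // Hypothetical 3 KiB field value, built here as a list of boxed bytes.
        List<Byte> field = new ArrayList<>();
        for (int i = 0; i < 3 * 1024; i++) {
            field.add((byte) i);
        }
        byte[] raw = toByteArray(field);
        System.out.println("copied bytes: " + raw.length);
    }
}
```

The copy itself is a single pass over the list, but every byte is held as a separate boxed object, whereas the base64 approach handles one string per field.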