Hi,

I'm capturing data of the form A (1:n) B, which is a fairly standard 
item-subitem pattern.  In a standard DB, I'd have A and B tables with a foreign 
key from B to A.

But since Hive is different -- there's no natural primary key in my data and 
joins seem much more expensive -- I'm considering using an Array of Structs.

So, some questions:
        Does this make sense?  How does it perform?  Say B has an attribute
'num', and I want to compute something like the average of num over all the Bs
[which a separate B table would lend itself to]; is that still easy and efficient?

        Is there an example of how to format the data files for an Array of
Structs column?

Thanks,

Ranjan
