Hi all,
I have a few questions regarding nested columns in Hive.
> How does ORC internally store complex types such as a struct? Are
the nested fields stored as separate columns, or is the whole struct
serialized as one column?
> Is predicate pushdown supported for queries which access nested columns?
In general, is there a significant performance difference between the
following schemas with regard to query execution and storage?
Schema 1:
{
  string a;
  struct b {
    string b1;
    string b2;
  }
}
Schema 2:
{
  string a;
  string b.b1;
  string b.b2;
}
--
Regards,
Abhishek Agarwal