Hi Alan,

Responses are inline below:
On Tue, Feb 18, 2014 at 11:49 AM, Alan Gates <ga...@hortonworks.com> wrote:
> Gunther, is it the case that there is anything extra that needs to be done
> to ship Parquet code with Hive right now? If I read the patch correctly the
> Parquet jars were added to the pom and thus will be shipped as part of
> Hive. As long as it works out of the box when a user says "create table ...
> stored as parquet" why do we care whether the parquet jar is owned by Hive
> or another project?
>
> The concern about feature mismatch in Parquet versus Hive is valid, but I'm
> not sure what to do about it other than assure that there are good error
> messages. Users will often want to use non-Hive based storage formats
> (Parquet, Avro, etc.). This means we need a good way to detect at SQL
> compile time that the underlying storage doesn't support the indicated data
> type and throw a good error.

Agreed, the error messages should absolutely be good. I will ensure this is
the case via https://issues.apache.org/jira/browse/HIVE-6457

> Also, it's important to be clear going forward about what Hive as a project
> is signing up for. If tomorrow someone decides to add a new datatype or
> feature we need to be clear that we expect the contributor to make this
> work for Hive-owned formats (text, RC, sequence, ORC) but not necessarily
> for external formats.

This makes sense to me. I'd just like to add that I have a patch available
to improve the hive-exec uber jar and general query speed:
https://issues.apache.org/jira/browse/HIVE-860

Additionally, I have a patch available to finish the generic STORED AS
functionality: https://issues.apache.org/jira/browse/HIVE-5976

Brock
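P.S. For anyone following along, the user-facing syntax being discussed is the Parquet storage-format clause. A minimal sketch (table and column names here are purely illustrative):

```sql
-- Illustrative only: a Parquet-backed table created directly in HiveQL,
-- with no extra jars or SerDe configuration supplied by the user.
CREATE TABLE web_logs (
  ts  BIGINT,
  url STRING
)
STORED AS PARQUET;
```

The "works out of the box" expectation above is that this statement succeeds as-is because the Parquet jars ship with Hive, and that an unsupported column type for the chosen format produces a clear error at compile time rather than a runtime failure.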