[ https://issues.apache.org/jira/browse/HIVE-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jakob Homan updated HIVE-2171:
------------------------------

    Attachment: HIVE-2171.patch

Patch:
* Adds a comment field to the StructField interface and a reasonable implementation of it to each of the interface's implementations.
* Adds overloaded versions of each of the struct-based ObjectInspector factories to allow the comments to be set.
* Adjusts MetastoreUtils to check whether the field's comment is null. If it is, the previous behavior is maintained; otherwise the comment is used.
* Adds a new unit test for MetastoreUtils. For this, mockito was added as a dependency. Right now it looks like Hive's Ivy conf isn't set up to include only some jars in the package. If this patch goes in, I'll open another JIRA to make sure the mockito and other test-related jars aren't included in jars where they don't need to be.
* Refactors the TestStandardObjectInspectors test to test both with and without comments.

After this patch, a serde that wants to specify comments can do so and have them show up in the table description. For example, a table kst created by an implementation of SerDe, with an example field for each type (the comments are all separate, they're all just boring: this is field BLAH), can now set the field comments:

{noformat}
hive> describe kst;
OK
string1	string	this field is string1
string2	string	this field is string2
int1	int	this field is int1
boolean1	boolean	this field is boolean1
long1	bigint	this field is long1
float1	float	this field is float1
double1	double	this field is double1
inner_record1	struct<int_in_inner_record1:int,string_in_inner_record1:string>	this field is inner_record1
enum1	string	this field is enum1
array1	array<string>	this field is array1
map1	map<string,string>	this field is map1
union1	uniontype<float,boolean,string>	this field is union1
fixed1	array<tinyint>	this field is fixed1
null1	void	this field is null1
unionnullint	int	this field is UnionNullInt
bytes1	array<tinyint>	this field is bytes1
ds	string
Time taken: 0.286 seconds
{noformat}

One thing I noticed is that these field comments on structs should extend to substructures, and with this new patch they do for custom serdes:

{noformat}
hive> describe kst.inner_record1;
OK
int_in_inner_record1	int	this field is int_in_inner_record1
string_in_inner_record1	string	this field is string_in_inner_record1
Time taken: 0.113 seconds
{noformat}

However, this doesn't work correctly with built-in serdes:

{noformat}
hive> create table test_table(a STRUCT<z:string COMMENT 'comment for z',x:int> COMMENT 'comment for a');
OK
Time taken: 2.565 seconds
hive> describe test_table;
OK
a	struct<z:string,x:int>	comment for a
Time taken: 0.139 seconds
hive> describe test_table.a;
OK
z	string	from deserializer
x	int	from deserializer
Time taken: 0.096 seconds
hive> describe test_table.a.z;
OK
z	string	from deserializer
Time taken: 0.089 seconds
hive>
{noformat}

The comment for field z is lost, replaced by the boilerplate text "from deserializer", and can't be retrieved from the CLI. I'll open a JIRA for this.

This is my first Hive patch, so please check to see if I missed anything.

> Allow custom serdes to set field comments
> -----------------------------------------
>
>                 Key: HIVE-2171
>                 URL: https://issues.apache.org/jira/browse/HIVE-2171
>             Project: Hive
>          Issue Type: Improvement
>    Affects Versions: 0.7.0
>            Reporter: Jakob Homan
>            Assignee: Jakob Homan
>             Fix For: 0.7.1
>
>         Attachments: HIVE-2171.patch
>
>
> Currently, while serde implementations can set a field's name, they can't set
> its comment. Comments are instead set in the metastore utils to {{(from
> deserializer)}}. For those serdes that can provide meaningful comments for a
> field, the comments should be propagated to the table description. These
> serde-provided comments could be prepended to "(from deserializer)" if others
> feel that's a meaningful distinction. This change involves updating
> {{StructField}} to support a (possibly null) comment field and then
> propagating this change out to the myriad places {{StructField}} is thrown
> around.
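
The null check described for MetastoreUtils can be illustrated with a minimal, self-contained sketch. This is not the code in the patch and does not use Hive's classes: CommentedField, SimpleField, resolveComment, and describe are hypothetical names chosen for illustration, and the fallback string is simply the "from deserializer" text shown in the CLI output.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; none of these types are part of Hive.
class FieldCommentSketch {
    // Fallback text, copied from the CLI output in this message.
    static final String FROM_DESERIALIZER = "from deserializer";

    // Hypothetical stand-in for a struct field that may carry a comment.
    interface CommentedField {
        String getFieldName();
        String getFieldComment(); // may be null when the serde supplies nothing
    }

    // Simple immutable implementation of the hypothetical interface.
    static final class SimpleField implements CommentedField {
        private final String name;
        private final String comment;

        SimpleField(String name, String comment) {
            this.name = name;
            this.comment = comment;
        }

        @Override
        public String getFieldName() {
            return name;
        }

        @Override
        public String getFieldComment() {
            return comment;
        }
    }

    // Null comment: keep the previous behavior. Otherwise use the comment.
    static String resolveComment(CommentedField field) {
        String comment = field.getFieldComment();
        return comment != null ? comment : FROM_DESERIALIZER;
    }

    // Builds one "name<TAB>comment" row per field.
    static List<String> describe(List<CommentedField> fields) {
        List<String> rows = new ArrayList<>();
        for (CommentedField field : fields) {
            rows.add(field.getFieldName() + "\t" + resolveComment(field));
        }
        return rows;
    }

    public static void main(String[] args) {
        List<CommentedField> fields = new ArrayList<>();
        fields.add(new SimpleField("z", "comment for z"));
        fields.add(new SimpleField("x", null));
        for (String row : describe(fields)) {
            System.out.println(row);
        }
    }
}
```

A field without a comment falls back to the old boilerplate, so serdes that never set comments would see no change in their table descriptions under this scheme.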