[ https://issues.apache.org/jira/browse/HIVE-19256?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16446064#comment-16446064 ]
Ratandeep Ratti commented on HIVE-19256:
----------------------------------------

https://reviews.apache.org/r/66732

> UDF which shapes the input data according to the specified schema
> -----------------------------------------------------------------
>
>                 Key: HIVE-19256
>                 URL: https://issues.apache.org/jira/browse/HIVE-19256
>             Project: Hive
>          Issue Type: New Feature
>            Reporter: Ratandeep Ratti
>            Assignee: Ratandeep Ratti
>            Priority: Major
>         Attachments: HIVE-19256.patch, HIVE-19256_1.patch
>
>
> We use this UDF a lot in our org. It takes an object and a Hive schema and
> makes sure the output object matches the schema completely. In some respects
> it is similar to the {{named_struct}} UDF, which can be used to select
> columns from a struct, but it is more general: it works not only on structs
> but on all Hive data types (except union). The schema can also request
> certain valid type conversions (int -> double, etc.).
> One scenario where this is quite useful is making sure that a Hive view
> created with a specific schema always has columns that match that schema.
> In Hive today, when a view is created, new nested columns from the
> underlying table can leak out of the view, even though the user never wanted
> this behavior. This leaking happens only for nested columns, not for
> top-level columns, so in that regard Hive's behavior is inconsistent.
> Sample usage of the UDF
> {code}
> generic_project(col, "struct<a:array<struct<c:int,d:string>>>") // Returns
> data matching the given schema. Extra columns that are not part of the
> schema are removed.
> generic_project(col, "struct<a:double>") // If the input column is a struct
> whose field a is an int, a is cast to double.
> {code}
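
The projection behavior described in the issue can be illustrated with a small Python sketch. This is not the code from the attached patches. The `project` function, the tuple-based schema encoding, and the choice to fill missing fields with `None` are all assumptions made for illustration, and only structs, arrays, and three primitive types are modelled.

```python
# Hypothetical illustration only: this is NOT the implementation in HIVE-19256.patch.
# A schema is modelled as ("struct", {field: schema}), ("array", schema),
# or a primitive type name ("int", "double", "string").

def project(value, schema):
    """Reshape value so it contains exactly what schema describes."""
    if value is None:
        return None
    if isinstance(schema, str):
        # Primitive: cast to the requested type (e.g. int -> double).
        casts = {"int": int, "double": float, "string": str}
        return casts[schema](value)
    kind, inner = schema
    if kind == "struct":
        # Keep only fields named in the schema; extra fields are dropped.
        # Assumption: fields absent from the input become None.
        return {name: project(value.get(name), sub) for name, sub in inner.items()}
    if kind == "array":
        return [project(item, inner) for item in value]
    raise ValueError(f"unsupported schema kind: {kind}")


# Mirrors struct<a:array<struct<c:int,d:string>>> from the sample usage.
row = {"a": [{"c": 1, "d": "x", "extra": True}], "b": 7}
schema = ("struct", {"a": ("array", ("struct", {"c": "int", "d": "string"}))})
projected = project(row, schema)  # nested "extra" and top-level "b" are removed

# Mirrors struct<a:double>: the int field is cast to a float.
widened = project({"a": 3}, ("struct", {"a": "double"}))
```

Applying the same function when a view is defined is what would keep newly added nested fields in the underlying table from appearing in the view's output.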