[ 
https://issues.apache.org/jira/browse/HIVE-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13559982#comment-13559982
 ] 

Ashutosh Chauhan commented on HIVE-3833:
----------------------------------------

bq. For the above, it is fairly difficult to address. In a follow-up, I can add 
a serde level property, which indicates that the serde can handle different 
datatypes (e.g. lazySimpleSerde) - if all the partitions of the table have 
serdes with this property, then we can use identityConverter. This is kind of 
hacky, and I am not sure if it is useful, since it should not be a common case. 
Usually, the partition schema should match the table schema.

I think this really is a common case. Folks usually change the serde of an 
existing table only when they find a better FileFormat, or sometimes when 
there is a better serde, both of which are rare occurrences. So, I think we 
need to think about optimizing this case. Though I agree the approach you 
suggested is hacky. We need to think of a better approach, probably in a 
follow-up JIRA.
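
To make the quoted proposal concrete, here is a rough sketch of the check it 
describes: use the identity converter only if every partition's serde declares 
that it can handle different datatypes. This is not code from the patch. The 
property name, the PartitionInfo class and both methods are hypothetical 
stand-ins, and converters are modelled as plain java.util.function.Function 
objects rather than Hive's actual converter classes.

```java
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class ConverterChoiceSketch {
    // Hypothetical property name, invented for illustration; not a real Hive serde property.
    static final String HANDLES_ANY_TYPE = "serde.handles.different.datatypes";

    /** Minimal stand-in for the serde metadata stored with one partition. */
    static final class PartitionInfo {
        final String serdeName;
        final Map<String, String> serdeProps;

        PartitionInfo(String serdeName, Map<String, String> serdeProps) {
            this.serdeName = serdeName;
            this.serdeProps = serdeProps;
        }
    }

    /** True only if every partition's serde declares the (hypothetical) property. */
    static boolean canUseIdentity(List<PartitionInfo> partitions) {
        return partitions.stream().allMatch(p ->
            Boolean.parseBoolean(p.serdeProps.getOrDefault(HANDLES_ANY_TYPE, "false")));
    }

    /** Picks the identity converter when allowed, else falls back to the schema converter. */
    static Function<Object, Object> chooseConverter(
            List<PartitionInfo> partitions, Function<Object, Object> schemaConverter) {
        return canUseIdentity(partitions) ? Function.<Object>identity() : schemaConverter;
    }
}
```

A single partition without the property forces the fallback converter for the 
whole table, which is part of why this feels hacky.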

Also thanks for updating the patch. Some more comments on the latest patch are 
on Phabricator. Also, are we going to lose any lazy aspects of deserialization 
here? I guess not, because we are just wiring up OIs. But I want to make sure. 
Can you verify?

                
> object inspectors should be initialized based on partition metadata
> -------------------------------------------------------------------
>
>                 Key: HIVE-3833
>                 URL: https://issues.apache.org/jira/browse/HIVE-3833
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Namit Jain
>            Assignee: Namit Jain
>         Attachments: hive.3833.10.patch, hive.3833.11.patch, 
> hive.3833.12.patch, hive.3833.13.patch, hive.3833.14.patch, 
> hive.3833.16.path, hive.3833.17.patch, hive.3833.18.patch, 
> hive.3833.19.patch, hive.3833.1.patch, hive.3833.20.patch, hive.3833.2.patch, 
> hive.3833.3.patch, hive.3833.4.patch, hive.3833.5.patch, hive.3833.6.patch, 
> hive.3833.7.patch, hive.3833.8.patch, hive.3833.9.patch
>
>
> Currently, different partitions can be picked up for the same input split 
> based on the serdes etc. And, we don't allow changing the schema for 
> LazyColumnarBinarySerDe.
> Instead of that, different partitions should be part of the same split only 
> if the partition schemas exactly match. The operator tree object inspectors 
> should be based on the partition schema. That would give greater flexibility 
> and also help using binary serde with rcfile.
