[ https://issues.apache.org/jira/browse/HIVE-19040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16416600#comment-16416600 ]
Vihang Karajgaonkar commented on HIVE-19040:
--------------------------------------------

Based on my previous discussions with [~alangates], I think the assumption behind this particular API was that only Hive uses it currently, and hence that the right hive jars are in the classpath of HMS. From the point of view of standalone-metastore, though, this API is very flaky. Technically, HMS APIs are backwards compatible, which means older clients should be able to talk to a newer HMS. That assumption is broken in this case.

In my humble opinion, sending an object as a byte array over the wire and deserializing it on the server side is reinventing what thrift does for us. I am not even sure how Hive UDFs are used to create filter strings on MySQL (I will have to look more to understand how the expressionTree is getting generated). What happens if the client sends a UDF which is specific to Hive? Also, is it always true that HS2 and HMS will be at the same version?

The second meta-point to think about is whether the standalone-metastore, when deployed separately, should have hive-exec jars in its classpath. I think what we are indirectly saying now is that if Hive is one of the users of this standalone HMS (which it most certainly always will be), then we should add the "right" version of the hive-exec jars to the metastore's classpath. How does that make the metastore standalone? Aren't we back to square one?

I think the right way ahead may be to deprecate this API and reimplement it without depending on hive-exec. The {{Filter.g}} grammar is already part of HMS, and we should try to build on or enhance it to provide the expressions which can be used to filter out partitions.
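To make the last point concrete, here is a minimal sketch of what a client might do if it sent a plain filter string instead of a serialized expression object. Everything in it is hypothetical: {{PartitionFilterSketch}}, {{Condition}} and {{toFilterString}} are illustration-only names, not Hive or HMS classes, and the output syntax is only assumed to resemble what {{Filter.g}} accepts. Quote escaping is deliberately left out.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; none of these types exist in Hive or the metastore. */
final class PartitionFilterSketch {

    /** One comparison on a partition column, for example ds = "2018-03-27". */
    static final class Condition {
        final String column;
        final String operator;
        final String value;

        Condition(String column, String operator, String value) {
            this.column = column;
            this.operator = operator;
            this.value = value;
        }
    }

    /**
     * Renders the conditions as plain text joined with "and".
     * The result depends only on the text format, not on any UDF class
     * layout, so no hive-exec classes would be needed to read it back.
     */
    static String toFilterString(List<Condition> conditions) {
        StringBuilder sb = new StringBuilder();
        for (Condition c : conditions) {
            // Assumption: escaping rules are out of scope, so reject quotes.
            if (c.value.indexOf('"') >= 0) {
                throw new IllegalArgumentException("quotes are not supported in this sketch");
            }
            if (sb.length() > 0) {
                sb.append(" and ");
            }
            sb.append(c.column).append(' ').append(c.operator).append(' ')
              .append('"').append(c.value).append('"');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Condition> conditions = new ArrayList<>();
        conditions.add(new Condition("ds", "=", "2018-03-27"));
        conditions.add(new Condition("hr", ">=", "10"));
        // prints: ds = "2018-03-27" and hr >= "10"
        System.out.println(toFilterString(conditions));
    }
}
```

The point of the sketch is only that a textual filter is versioned by its grammar rather than by the classes that happen to be on the classpath of HS2 and HMS.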
> get_partitions_by_expr() implementation in HiveMetaStore causes backward incompatibility easily
> ------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-19040
>                 URL: https://issues.apache.org/jira/browse/HIVE-19040
>             Project: Hive
>          Issue Type: Improvement
>          Components: Standalone Metastore
>    Affects Versions: 2.0.0
>            Reporter: Aihua Xu
>            Priority: Major
>
> In the HiveMetaStore implementation of {{public PartitionsByExprResult get_partitions_by_expr(PartitionsByExprRequest req) throws TException}}, an expression is serialized into byte array from the client side and passed through PartitionsByExprRequest. Then HMS will deserialize back into the expression and filter the partitions by it.
> Such partition filtering expression can contain various UDFs. If there are some changes to one of the UDFs between different Hive versions, HS2 on the older version will serialize the expression in old format which won't be able to be deserialized by HMS on the newer version. One example of that is, GenericUDFIn class adds {{transient}} to the field constantInSet which will cause such incompatibility.
> One approach I'm thinking of is, instead of converting the expression object to byte array, we can pass the expression string directly.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)