[ https://issues.apache.org/jira/browse/HIVE-5105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13742894#comment-13742894 ]
Hudson commented on HIVE-5105:
------------------------------

FAILURE: Integrated in Hive-trunk-h0.21 #2274 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2274/])
HIVE-5105 HCatSchema.remove(HCatFieldSchema hcatFieldSchema) does not clean up fieldPositionMap (Eugene Koifman via Sushanth Sowmyan) (khorgath: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1514929)
* /hive/trunk/hcatalog/core/src/main/java/org/apache/hcatalog/data/schema/HCatSchema.java
* /hive/trunk/hcatalog/core/src/test/java/org/apache/hcatalog/data/schema/TestHCatSchema.java

> HCatSchema.remove(HCatFieldSchema hcatFieldSchema) does not clean up fieldPositionMap
> -------------------------------------------------------------------------------------
>
>                 Key: HIVE-5105
>                 URL: https://issues.apache.org/jira/browse/HIVE-5105
>             Project: Hive
>          Issue Type: Bug
>          Components: HCatalog
>    Affects Versions: 0.12.0
>            Reporter: Eugene Koifman
>            Assignee: Eugene Koifman
>             Fix For: 0.12.0
>
>         Attachments: HIVE-5105.patch
>
>
> org.apache.hcatalog.data.schema.HCatSchema.remove(HCatFieldSchema hcatFieldSchema) makes the following call:
>     fieldPositionMap.remove(hcatFieldSchema);
> but fieldPositionMap is of type Map<String, Integer>, so the HCatFieldSchema argument never matches a key and the element is not removed.
>
> Here's a detailed comment from [~sushanth]:
>
> The result is that the name will not be removed from fieldPositionMap. This results in 2 things:
> a) If anyone tries to append a field to an HCatSchema after having removed that field, it shouldn't fail, but it will.
> b) If anyone asks for the position of the removed field by name, it will still give the position.
>
> Now, there is only one place in hcat code where we remove a field, and that is called from HCatOutputFormat.setSchema, where we try to detect if the user specified partition column names in the schema when they shouldn't have, and if they did, we remove them. Normally, people do not specify this, and this check tends to be superfluous.
> Once we do this, we wind up serializing that new object (after performing some validations), and the stale entry does appear to survive serialization (and eventual deserialization), which is very worrying.
>
> However, we are luckily saved by the fact that we do not append that field to it at any time (all appends in hcat code are done on newly initialized HCatSchema objects which have had no removes done on them), and we don't ask for the position of something we do not expect to be there (harder to verify for certain, but this seems to be the case on inspection).
>
> The main part that worries me is that HCatSchema is part of our public interface for HCat, in that M/R programs that use HCat can use it, and thus they might have more interesting usage patterns that hit this bug.
>
> I can't think of any currently open bugs that are caused by this, because of the rarity of the situation, but it is nevertheless something we should fix immediately.
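
For readers who want to reproduce the behaviour described above outside of Hive, here is a minimal sketch. It does not use the real HCatalog classes: Field and Schema are hypothetical stand-ins for HCatFieldSchema and HCatSchema, positions plays the role of fieldPositionMap, and the IllegalStateException thrown on a duplicate append is an assumption about how the failure surfaces. removeFixed shows one possible repair (remove by name, then realign positions); it is not necessarily what the attached HIVE-5105.patch does.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class FieldPositionMapDemo {

    /** Hypothetical stand-in for HCatFieldSchema: only the name matters here. */
    public static final class Field {
        final String name;
        public Field(String name) { this.name = name; }
    }

    /** Hypothetical stand-in for HCatSchema. */
    public static final class Schema {
        private final List<Field> fields = new ArrayList<>();
        // Plays the role of fieldPositionMap: field name -> index in 'fields'.
        private final Map<String, Integer> positions = new HashMap<>();

        public void append(Field f) {
            // Assumption: appending a name that is already mapped is rejected.
            if (positions.containsKey(f.name)) {
                throw new IllegalStateException("duplicate field: " + f.name);
            }
            fields.add(f);
            positions.put(f.name, fields.size() - 1);
        }

        /** Mirrors the reported bug: the map is keyed by String, not by Field. */
        public void removeBuggy(Field f) {
            fields.remove(f);
            // Compiles because Map.remove takes Object, but a Field never
            // equals a String key, so nothing is removed from the map.
            positions.remove(f);
        }

        /** One possible repair: drop the entry by name and realign positions. */
        public void removeFixed(Field f) {
            if (!fields.remove(f)) {
                return;
            }
            positions.clear();
            for (int i = 0; i < fields.size(); i++) {
                positions.put(fields.get(i).name, i);
            }
        }

        public Integer getPosition(String name) {
            return positions.get(name);
        }
    }

    public static void main(String[] args) {
        Schema schema = new Schema();
        Field a = new Field("a");
        schema.append(a);
        schema.append(new Field("b"));

        schema.removeBuggy(a);
        // Symptom (b): the removed field still has a position.
        System.out.println("stale position of a: " + schema.getPosition("a"));
        try {
            // Symptom (a): re-appending the removed field fails.
            schema.append(new Field("a"));
        } catch (IllegalStateException e) {
            System.out.println("append failed: " + e.getMessage());
        }
    }
}
```

The sketch compiles without warnings from javac because Map.remove is declared as remove(Object key), which is why a mistake of this kind is easy to miss in review.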