[ https://issues.apache.org/jira/browse/HIVE-6707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Prasanth J updated HIVE-6707:
-----------------------------
    Attachment: HIVE-6707.1.patch

> LazyMap and LazyBinaryMap is broken
> -----------------------------------
>
>                 Key: HIVE-6707
>                 URL: https://issues.apache.org/jira/browse/HIVE-6707
>             Project: Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>    Affects Versions: 0.5.0, 0.6.0, 0.7.0, 0.8.0, 0.9.0, 0.10.0, 0.11.0, 0.12.0, 0.13.0
>            Reporter: Prasanth J
>            Assignee: Prasanth J
>            Priority: Critical
>              Labels: serde
>             Fix For: 0.13.0, 0.14.0
>
>         Attachments: HIVE-6707.1.patch
>
>
> LazyPrimitive and LazyBinaryPrimitive have overridden hashCode() since
> HIVE-949, but they do not override equals(). As a result, LazyMap and
> LazyBinaryMap can end up holding multiple values for the same key. Both
> LazyMap and LazyBinaryMap use LinkedHashMap, so the expected behaviour is
> a single value per unique key.
> Consider the following code from LazyMap (LazyBinaryMap has the same code
> segment):
> {code}
> LazyPrimitive<?, ?> lazyKey = uncheckedGetKey(i);
> if (lazyKey == null) {
>   continue;
> }
> Object key = lazyKey.getObject();
> if (key != null && !cachedMap.containsKey(key)) {
> {code}
> lazyKey.hashCode() returns the writable object's hash code. The
> containsKey() method of HashMap
> (http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/util/HashMap.java#366)
> first checks whether the hash codes match and, if they do, calls equals()
> to verify that the key already exists. Since LazyPrimitive does not
> override equals(), the call falls back to Object.equals(), which returns
> true only if both references point to exactly the same object
> (this == obj).
> So in the above code segment, even if an equal key already exists,
> containsKey() returns false and the new value is inserted into the same
> hash bucket as a separate entry. The map ends up with more entries than
> it has distinct keys.
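
The behaviour described above can be reproduced outside Hive with a minimal sketch. The classes HashOnlyKey and FixedKey below are hypothetical stand-ins and are not Hive classes: the first overrides only hashCode(), as the report says LazyPrimitive does, and the second also overrides equals(). This only illustrates the hashCode()/equals() contract and is not the content of HIVE-6707.1.patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LazyKeyDemo {

    /** Hypothetical key that overrides hashCode() but not equals(). */
    static class HashOnlyKey {
        final String value;
        HashOnlyKey(String value) { this.value = value; }
        @Override public int hashCode() { return value.hashCode(); }
        // equals() is inherited from Object, so it compares references only.
    }

    /** Hypothetical key whose equals() is consistent with its hashCode(). */
    static class FixedKey {
        final String value;
        FixedKey(String value) { this.value = value; }
        @Override public int hashCode() { return value.hashCode(); }
        @Override public boolean equals(Object o) {
            return o instanceof FixedKey && value.equals(((FixedKey) o).value);
        }
    }

    /** Inserts three logically identical HashOnlyKey instances. */
    public static int brokenSize() {
        Map<Object, Integer> cachedMap = new LinkedHashMap<>();
        for (int i = 0; i < 3; i++) {
            Object key = new HashOnlyKey("k");
            // Same guard as in the quoted snippet: hashes match, equals() fails.
            if (!cachedMap.containsKey(key)) {
                cachedMap.put(key, i);
            }
        }
        return cachedMap.size();
    }

    /** Inserts three logically identical FixedKey instances. */
    public static int fixedSize() {
        Map<Object, Integer> cachedMap = new LinkedHashMap<>();
        for (int i = 0; i < 3; i++) {
            Object key = new FixedKey("k");
            if (!cachedMap.containsKey(key)) {
                cachedMap.put(key, i);
            }
        }
        return cachedMap.size();
    }

    public static void main(String[] args) {
        System.out.println("hashCode() only:       " + brokenSize()); // prints 3
        System.out.println("hashCode() + equals(): " + fixedSize());  // prints 1
    }
}
```

With only hashCode() overridden, every new key instance lands in the same bucket but is never considered equal to the existing ones, so the containsKey() guard does not prevent duplicates.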