[ https://issues.apache.org/jira/browse/HIVE-24297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17221435#comment-17221435 ]
Ádám Szita commented on HIVE-24297:
-----------------------------------

Looking into the issue in more detail with [~asinkovits], we found that the root cause is actually the setting of declaredCacheLength on the new buffer. This causes the new buffer to be inserted into the cache policy too, which should not happen. If a buffer with the same content is already in the cache, the new one should be discarded.

BTW this can also cause CacheContentsTracker to report negative buffer and byte counts: when 2 buffers collide, only one cache event is reported, but both buffers will be evicted in the future, resulting in a negative balance. (This happens without HIVE-23741 too, it just doesn't throw an error.)

If we unset declaredCacheLength on the new buffer, then the unlockBuffer method (called later, after the put, in processCollisions) will immediately deallocate the new, unused buffer, rather than inserting it into the cache policy with top priority, where it would only be evicted much later.

> LLAP buffer collision causes NPE
> --------------------------------
>
>                 Key: HIVE-24297
>                 URL: https://issues.apache.org/jira/browse/HIVE-24297
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Ádám Szita
>            Assignee: Ádám Szita
>            Priority: Major
>
> HIVE-23741 introduced an optimization so that CacheTags are not stored on
> buffer level, but rather on file level, as one cache tag can only relate to
> one file. With this change a buffer->filecache reference was introduced so
> that the buffer's tag can be calculated with an extra indirection, i.e.
> buffer.filecache.tag.
> However, during buffer collision in the putFileData method, we don't set the
> filecache reference of the collided (new) buffer:
> [https://github.com/apache/hive/commit/2e18a7408a8dd49beecad8d66bfe054b7dc474da#diff-d2ccd7cf3042845a0812a5e118f82db49253d82fc86449ffa408903bf434fb6dR309-R311]
> Later this causes an NPE when the new (instantly decRef'ed) buffer is evicted:
> {code:java}
> Caused by: java.lang.NullPointerException
> at java.util.concurrent.ConcurrentSkipListMap.doGet(ConcurrentSkipListMap.java:778)
> at java.util.concurrent.ConcurrentSkipListMap.get(ConcurrentSkipListMap.java:1546)
> at org.apache.hadoop.hive.llap.cache.CacheContentsTracker.getTagState(CacheContentsTracker.java:129)
> at org.apache.hadoop.hive.llap.cache.CacheContentsTracker.getTagState(CacheContentsTracker.java:125)
> at org.apache.hadoop.hive.llap.cache.CacheContentsTracker.reportRemoved(CacheContentsTracker.java:109)
> at org.apache.hadoop.hive.llap.cache.CacheContentsTracker.notifyEvicted(CacheContentsTracker.java:238)
> at org.apache.hadoop.hive.llap.cache.LowLevelLrfuCachePolicy.evictSomeBlocks(LowLevelLrfuCachePolicy.java:276)
> at org.apache.hadoop.hive.llap.cache.CacheContentsTracker.evictSomeBlocks(CacheContentsTracker.java:177)
> at org.apache.hadoop.hive.llap.cache.LowLevelCacheMemoryManager.reserveMemory(LowLevelCacheMemoryManager.java:98)
> at org.apache.hadoop.hive.llap.cache.LowLevelCacheMemoryManager.reserveMemory(LowLevelCacheMemoryManager.java:65)
> at org.apache.hadoop.hive.llap.cache.BuddyAllocator.allocateMultiple(BuddyAllocator.java:323)
> at org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.allocateMultiple(EncodedReaderImpl.java:1302)
> at org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedStream(EncodedReaderImpl.java:930)
> at org.apache.hadoop.hive.ql.io.orc.encoded.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:506)
> ... 16 more
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)