[jira] [Comment Edited] (HIVE-3098) Memory leak from large number of FileSystem instances in FileSystem.CACHE. (Must cache UGIs.)

Ashutosh Chauhan (JIRA) Sat, 21 Jul 2012 11:53:37 -0700

    [ https://issues.apache.org/jira/browse/HIVE-3098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13419908#comment-13419908 ]


Ashutosh Chauhan edited comment on HIVE-3098 at 7/21/12 6:52 PM:
-----------------------------------------------------------------

[~mithun] A few comments on the current patch.
* As Rohini suggested, it's a good idea to extract the UGI caching into its 
own class, since HadoopThrift20SAuthBridge is already getting too big.
* +1 on using Guava for this. The {{CacheBuilder}} class provides the 
appropriate functionality.
* I think configuring the size of the cache with 
CacheBuilder::maximumSize(long size) might be a better idea than time-based 
expiration. This way we bound the cache by its number of entries (and so, 
roughly, by memory) rather than by time. What do you think?
* Also, update HadoopShimsSecure::createRemoteUser() to call 
UGICache.getRemoteUser() instead of the current UGI.createRemoteUser(), so 
that it also benefits from the UGI cache.
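
To make the size-based suggestion concrete, here is a minimal sketch of a 
cache bounded by entry count. It is not the patch: it uses a plain JDK 
LinkedHashMap with least-recently-used eviction as a dependency-free stand-in 
for a Guava cache built with CacheBuilder.maximumSize(long). The class name 
BoundedCacheSketch and its loader parameter are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical size-bounded cache: a JDK-only stand-in for a Guava cache
// configured with CacheBuilder.maximumSize(long). Not the actual patch.
final class BoundedCacheSketch<K, V> {
    private final Map<K, V> entries;
    private final Function<K, V> loader;

    BoundedCacheSketch(final int maximumSize, Function<K, V> loader) {
        this.loader = loader;
        // accessOrder=true keeps entries ordered from least to most recently used.
        this.entries = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                // Evict the least recently used entry once the bound is exceeded.
                return size() > maximumSize;
            }
        };
    }

    // Returns the cached value for the key, loading it on a miss.
    synchronized V get(K key) {
        V value = entries.get(key);
        if (value == null) {
            value = loader.apply(key);
            entries.put(key, value);
        }
        return value;
    }

    synchronized int size() {
        return entries.size();
    }
}
```

Keyed by user name, with a loader that creates the UGI, such a cache hands 
back the same instance for repeated lookups of one user, and the number of 
live entries never exceeds the configured maximum, however long the server 
runs.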
                
> Memory leak from large number of FileSystem instances in FileSystem.CACHE. 
> (Must cache UGIs.)
> ---------------------------------------------------------------------------------------------
>
>                 Key: HIVE-3098
>                 URL: https://issues.apache.org/jira/browse/HIVE-3098
>             Project: Hive
>          Issue Type: Bug
>          Components: Shims
>    Affects Versions: 0.9.0
>         Environment: Running with Hadoop 20.205.0.3+ / 1.0.x with security 
> turned on.
>            Reporter: Mithun Radhakrishnan
>            Assignee: Mithun Radhakrishnan
>         Attachments: Hive_3098.patch
>
>
> The problem manifested while stress-testing HCatalog 0.4.1 (as part of 
> testing the Oracle backend).
> The HCatalog server ran out of memory (-Xmx2048m) in under 24 hours when 
> pounded by 60 threads. The heap dump indicates that 
> hadoop::FileSystem.CACHE held 1,000,000 instances of FileSystem, whose 
> combined retained memory consumed the entire heap.
> It boiled down to hadoop::UserGroupInformation::equals() being implemented 
> such that the "Subject" member is compared by reference ("==") and not by 
> equivalence (".equals()"). As a result, equivalent UGI instances compare as 
> unequal, and each one causes a new FileSystem instance to be created and 
> cached.
> UGI.equals() is implemented this way, incidentally, as a fix for yet another 
> problem (HADOOP-6670), so it is unlikely that the implementation can be 
> modified.
> The solution is to check for UGI equivalence in HCatalog (i.e. in the Hive 
> metastore), using a cache for UGI instances in the shims.
> I have a patch to fix this. I'll upload it shortly. I just ran an overnight 
> test to confirm that the memory leak has been arrested.
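
The mechanism described above can be reproduced in miniature. The sketch 
below is only an illustration under simplifying assumptions: FakeUgi is a 
hypothetical stand-in for UserGroupInformation whose equals() compares the 
wrapped subject by reference, and a plain HashMap stands in for 
FileSystem.CACHE (whose real key holds more than the UGI).

```java
import java.util.HashMap;
import java.util.Map;

final class IdentityKeySketch {
    // Hypothetical stand-in for a UGI: equality follows the identity of the
    // wrapped subject, as the issue describes for UGI.equals().
    static final class FakeUgi {
        final String user;
        final Object subject = new Object();

        FakeUgi(String user) {
            this.user = user;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof FakeUgi && subject == ((FakeUgi) o).subject;
        }

        @Override
        public int hashCode() {
            return System.identityHashCode(subject);
        }
    }

    // Performs `requests` lookups for one user against a map keyed by UGI and
    // returns how many entries the map holds afterwards.
    static int entriesAfter(int requests, boolean reuseUgi) {
        Map<FakeUgi, Object> fsCache = new HashMap<>();      // stand-in for FileSystem.CACHE
        Map<String, FakeUgi> ugiByUser = new HashMap<>();    // stand-in for a UGI cache
        for (int i = 0; i < requests; i++) {
            FakeUgi ugi = reuseUgi
                    ? ugiByUser.computeIfAbsent("alice", FakeUgi::new)
                    : new FakeUgi("alice");
            fsCache.computeIfAbsent(ugi, k -> new Object());
        }
        return fsCache.size();
    }

    public static void main(String[] args) {
        System.out.println(entriesAfter(1000, false)); // prints 1000: one entry per request
        System.out.println(entriesAfter(1000, true));  // prints 1: the cached UGI is reused
    }
}
```

Without UGI reuse, every request adds an entry even though the user is the 
same, which is the growth pattern seen in the heap dump. Reusing one UGI per 
user keeps the map at a single entry.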

