[ 
https://issues.apache.org/jira/browse/KAFKA-2795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14999221#comment-14999221
 ] 

Guozhang Wang commented on KAFKA-2795:
--------------------------------------

+1.

Originally it was written as in the above case, but I changed it for the 
internal usage of addGroup (which is wrong, btw). Since we now only call 
addGroup directly from GroupCoordinator, we can probably remove the private 
addGroup and move the logic back into the public addGroup, as in the above case.
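
As a rough illustration of the shape this could take (a sketch only, not the 
actual Kafka code): the Java class below uses a plain ConcurrentHashMap as a 
stand-in for groupsCache, and its GroupMetadata class, AddGroupSketch class and 
removeGroup helper are hypothetical names introduced for the example. The 
assumption is that addGroup returns whatever the atomic put-if-absent call 
reports instead of doing a second lookup, so a concurrent removal cannot turn 
the result into null.

```java
import java.util.concurrent.ConcurrentHashMap;

public class AddGroupSketch {
    // Hypothetical stand-in for Kafka's GroupMetadata; only the id matters here.
    static final class GroupMetadata {
        final String groupId;

        GroupMetadata(String groupId) {
            this.groupId = groupId;
        }
    }

    // Stand-in for groupsCache (assumption: a concurrent map keyed by group id).
    private final ConcurrentHashMap<String, GroupMetadata> groupsCache =
            new ConcurrentHashMap<>();

    // Returns the group already cached under this id, or the new group if none
    // was cached. There is no separate get(), so the result is never null even
    // if another thread removes the entry right after the put.
    GroupMetadata addGroup(GroupMetadata group) {
        GroupMetadata current = groupsCache.putIfAbsent(group.groupId, group);
        return current != null ? current : group;
    }

    // Hypothetical helper mirroring what a partition migration would do.
    void removeGroup(String groupId) {
        groupsCache.remove(groupId);
    }

    public static void main(String[] args) {
        AddGroupSketch manager = new AddGroupSketch();
        GroupMetadata first = manager.addGroup(new GroupMetadata("g"));
        GroupMetadata second = manager.addGroup(new GroupMetadata("g"));
        // The second call sees the existing entry and returns it.
        System.out.println(first == second); // prints "true"
    }
}
```

The caller can still end up holding a group that was removed from the cache a 
moment later, so this only addresses the null result, not the wider question 
of what to do with a group whose partition has migrated.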

> potential NPE in GroupMetadataManager
> -------------------------------------
>
>                 Key: KAFKA-2795
>                 URL: https://issues.apache.org/jira/browse/KAFKA-2795
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Onur Karaman
>            Assignee: Jason Gustafson
>
> I didn't run the code, but I took a look at GroupMetadataManager.addGroup and 
> it looks like we can get a NullPointerException when a group is somehow 
> removed between the groupsCache.putIfNotExists and groupsCache.get lines and 
> someone tries to use the result of the addGroup. One way this can happen is 
> by interleaving GroupMetadataManager.addGroup and 
> GroupMetadataManager.removeGroupsForPartition.
> Here's the scenario:
> # thread-1 is in the middle of adding a group g which is in the offset topic 
> partition p. thread-1 already hit the groupsCache.putIfNotExists line in 
> GroupMetadataManager.addGroup
> # thread-2 is in the middle of migrating all groups for partition p. thread-2 
> is in GroupMetadataManager.removeGroupsForPartition and called 
> groupsCache.remove("g").
> # thread-1 now executes groupsCache.get("g"), which returns null since it's 
> now gone.
> # thread-1 now goes back to the GroupCoordinator doJoinGroup with a null 
> GroupMetadata and then tries to do a group synchronized {...}, resulting in 
> an NPE.
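
The interleaving described in the scenario can be replayed deterministically 
on a single thread. The Java sketch below is illustrative only: it uses a 
plain ConcurrentHashMap in place of groupsCache and a bare Object in place of 
GroupMetadata, and the AddGroupRaceReplay class and its method names are 
hypothetical. It performs the put, the remove, and the get in the order given 
in the scenario, then shows that synchronizing on the result throws.

```java
import java.util.concurrent.ConcurrentHashMap;

public class AddGroupRaceReplay {
    // Replays the three cache operations in the problematic order and returns
    // what the final lookup (thread-1's get) would see.
    static Object replay() {
        ConcurrentHashMap<String, Object> groupsCache = new ConcurrentHashMap<>();
        Object group = new Object();

        groupsCache.putIfAbsent("g", group); // thread-1: put-if-absent succeeds
        groupsCache.remove("g");             // thread-2: partition migration removes g
        return groupsCache.get("g");         // thread-1: separate lookup finds nothing
    }

    // Returns true if locking on the given reference throws a NullPointerException.
    static boolean lockThrowsNpe(Object lock) {
        try {
            synchronized (lock) {
                return false;
            }
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        Object result = replay();
        System.out.println(result == null);        // prints "true"
        System.out.println(lockThrowsNpe(result)); // prints "true"
    }
}
```

In the real code the two threads would have to interleave at exactly this 
point, so the failure would be intermittent rather than reproducible on 
every run.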



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
