Pavel, I don't have the logs for the client node. It has happened twice in our cluster in the past 45 days and is difficult to reproduce. But the logs show a null pointer exception on the server nodes: first one server node (192.168.1.6) went down, and then the other.
In 12255, it is noted that an assertion error could be seen on the coordinator, but what I see is a null pointer exception. I agree that the race condition described in 12255 looks similar to what the logs I attached show, but it does not explain the null pointer exception.
The race is as follows: A client node (with some configured caches) joins the cluster and sends a SingleMessage to the coordinator during client PME. This SingleMessage contains affinity fetch requests for all cluster caches. While the SingleMessage is in flight, the server nodes finish client PME and also process and finish a cache destroy PME. When a cache is destroyed, the affinity for that cache is cleared. By the time the SingleMessage is delivered, the coordinator no longer has affinity for the requested cache, because the cache has already been destroyed. This leads to an assertion error on the coordinator and unpredictable behavior on the client node.
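To make the interleaving above concrete, here is a minimal single-threaded Java sketch of it. This is not Ignite code: `AffinityRaceSketch`, `Affinity`, `AFFINITY`, `destroyCache`, `handleFetchRequest` and `replayRace` are hypothetical names, and the real coordinator code path is not confirmed here. It also shows one possible way the same race could surface as either failure, under the assumption that the coordinator asserts the affinity is non-null and then dereferences it: Java assertions are disabled unless the JVM is started with `-ea`.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class AffinityRaceSketch {
    /** Hypothetical per-cache affinity state (not an Ignite class). */
    static final class Affinity {
        final int partitions;

        Affinity(int partitions) {
            this.partitions = partitions;
        }
    }

    /** Hypothetical coordinator-side registry: cache name -> affinity. */
    static final Map<String, Affinity> AFFINITY = new ConcurrentHashMap<>();

    /** Cache destroy PME: the affinity for the cache is cleared. */
    static void destroyCache(String cacheName) {
        AFFINITY.remove(cacheName);
    }

    /** Handles one affinity fetch request carried by a late SingleMessage. */
    static int handleFetchRequest(String cacheName) {
        Affinity aff = AFFINITY.get(cacheName);
        // Assumption: the lookup is guarded only by an assert, which is skipped without -ea.
        assert aff != null : "No affinity for cache " + cacheName;
        return aff.partitions; // Throws NullPointerException when aff is null.
    }

    /** Replays the interleaving and reports which failure the coordinator hits. */
    static String replayRace() {
        AFFINITY.put("cacheA", new Affinity(1024));
        // The client's SingleMessage (requesting affinity for cacheA) is now in flight.
        destroyCache("cacheA"); // Servers finish client PME and the cache destroy PME.
        try {
            handleFetchRequest("cacheA"); // The SingleMessage finally reaches the coordinator.
            return "ok";
        } catch (AssertionError | NullPointerException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(replayRace());
    }
}
```

Started with `java -ea`, the sketch reports an AssertionError; started without it, the assert is skipped and the dereference reports a NullPointerException. Whether this is what happened on our server nodes is only a guess until the stack traces are compared against the actual code.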