Igniters,

My team ran into a node failure [1] caused by the use of non-thread-safe
collections.

IgniteTxStateImpl's fields
- activeCacheIds
- txMap
are not thread-safe, but are widely used from different threads without
proper synchronization.

The main question is ... why?

According to my research, there is no guarantee that a tx will be processed
on a single thread.
It may be processed on several (!) threads of the striped pool, and on the
tx recovery thread as well.

The striped pool thread is selected by the message's partition() method,
which is implemented differently for different messages, for example:
- return keys != null && !keys.isEmpty() ? keys.get(0).partition() : -1;
- return U.safeAbs(version().hashCode());
- ...,
so there is no guarantee that a tx is always processed on the same thread
(confirmed by tests).
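
To make the point concrete, here is a minimal sketch of how two messages
of the same tx could end up on different stripes. It is not Ignite code:
the class and method names are hypothetical, the modulo mapping from
partition() to a stripe is an assumption, safeAbs is only a stand-in for
U.safeAbs, and the handling of -1 is not modeled.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration only; names and the modulo mapping are assumptions. */
final class StripeSelectionSketch {
    /** Assumption: a non-negative index is mapped to a stripe by modulo. */
    static int stripeFor(int idx, int stripes) {
        if (idx < 0)
            throw new IllegalArgumentException("negative index is not modeled here");

        return idx % stripes;
    }

    /** Mirrors the first variant quoted above: partition of the first key, or -1. */
    static int partitionByFirstKey(List<Integer> keyPartitions) {
        return keyPartitions != null && !keyPartitions.isEmpty() ? keyPartitions.get(0) : -1;
    }

    /** Mirrors the second variant: a value derived from the version's hash code. */
    static int partitionByVersion(Object version) {
        return safeAbs(version.hashCode());
    }

    /** Stand-in for U.safeAbs: absolute value that never returns a negative number. */
    static int safeAbs(int i) {
        i = Math.abs(i);

        return i < 0 ? 0 : i;
    }

    public static void main(String[] args) {
        int stripes = 8;

        // Two messages that belong to the same (imaginary) tx.
        int byKey = stripeFor(partitionByFirstKey(Arrays.asList(5, 9)), stripes);
        int byVer = stripeFor(partitionByVersion(Integer.valueOf(42)), stripes);

        System.out.println("stripe by first key: " + byKey);
        System.out.println("stripe by version:   " + byVer);
    }
}
```

With 8 stripes the first message maps to stripe 5 and the second to
stripe 2, so under these assumptions the same tx state is touched by two
different threads.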

It seems that we MAY lose data.
For example, some or all keys from txMap may be ignored at commit.
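
A rough sketch of the kind of loss I mean, outside of Ignite. A plain
java.util.HashMap stands in for txMap here (the real field types are not
shown in this mail), and the class and method names are hypothetical.
Several threads put distinct keys into a shared map; with a
non-thread-safe map the final size may be smaller than expected, while a
ConcurrentHashMap keeps every entry.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;

/** Hypothetical illustration only; the maps are stand-ins for txMap. */
final class TxMapRaceSketch {
    /** Fills the map from several threads with distinct keys and returns its final size. */
    static int fill(Map<Integer, Integer> map, int threads, int perThread) {
        CountDownLatch start = new CountDownLatch(1);
        Thread[] workers = new Thread[threads];

        for (int t = 0; t < threads; t++) {
            final int base = t * perThread;

            workers[t] = new Thread(() -> {
                try {
                    start.await(); // Let all workers start writing at the same time.

                    for (int i = 0; i < perThread; i++)
                        map.put(base + i, i);
                }
                catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                catch (RuntimeException e) {
                    // An unsynchronized map may fail in arbitrary ways; this worker just stops.
                }
            });

            workers[t].start();
        }

        start.countDown();

        try {
            for (Thread w : workers)
                w.join();
        }
        catch (InterruptedException e) {
            Thread.currentThread().interrupt();

            throw new IllegalStateException(e);
        }

        return map.size();
    }

    public static void main(String[] args) {
        int threads = 4, perThread = 100_000;

        // Not thread-safe: the reported size varies from run to run and may be too small.
        System.out.println("HashMap size:           " + fill(new HashMap<>(), threads, perThread));

        // Thread-safe: all threads * perThread entries are kept.
        System.out.println("ConcurrentHashMap size: " + fill(new ConcurrentHashMap<>(), threads, perThread));
    }
}
```

This only shows the symptom. I am not claiming that switching to a
concurrent collection is the proper fix for IgniteTxStateImpl.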

If anyone knows why this is not a problem (I mean the lack of
synchronization, not the data loss) or how to fix it properly, please give
me a hint, or correct my conclusions if necessary.

[1] https://issues.apache.org/jira/browse/IGNITE-19445
