Tx processing is supposed to be thread-bound: the tx version is hashed to a partition, which selects the thread; see methods like [1]. If this invariant is broken in some cases, that should be fixed.
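
To make the invariant concrete, here is a minimal sketch of how two partition() strategies for the same tx can map to different stripes. This is not Ignite code: the class, the method names, the stripe count and the sample values are all made up for illustration, and safeAbs is a local helper that only mimics the intent of U.safeAbs.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; none of these names exist in Ignite.
class StripeSelectionSketch {
    /** Local helper: absolute value that never returns a negative number. */
    static int safeAbs(int i) {
        int a = Math.abs(i); // Math.abs(Integer.MIN_VALUE) is still negative
        return a < 0 ? 0 : a;
    }

    /** Strategy A: partition of the first key, or -1 when there are no keys. */
    static int partitionByFirstKey(List<Integer> keyPartitions) {
        return keyPartitions != null && !keyPartitions.isEmpty() ? keyPartitions.get(0) : -1;
    }

    /** Strategy B: partition derived from the hash of the tx version. */
    static int partitionByVersion(int versionHash) {
        return safeAbs(versionHash);
    }

    /**
     * Maps a partition to a stripe index.
     * Assumption: a negative partition means "no affinity", reported here as -1.
     */
    static int stripeFor(int partition, int stripes) {
        return partition < 0 ? -1 : partition % stripes;
    }

    public static void main(String[] args) {
        int stripes = 8;       // assumed pool size
        int versionHash = 42;  // assumed hash of the tx version

        int byKey = stripeFor(partitionByFirstKey(Arrays.asList(3, 7)), stripes);
        int byVersion = stripeFor(partitionByVersion(versionHash), stripes);

        // Same tx, two messages, two strategies: the stripes differ (3 vs 2).
        System.out.println("stripe by first key = " + byKey);
        System.out.println("stripe by version   = " + byVersion);
    }
}
```

If every message of a tx used the version-based strategy, all of them would land on the same stripe; mixing strategies is what breaks the binding in this sketch.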

[1] org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxPrepareRequest#partition

Fri, May 19, 2023 at 15:57, Anton Vinogradov <a...@apache.org>:

> Igniters,
>
> My team was faced with node failure [1] because of non-threadsafe
> collections usage.
>
> IgniteTxStateImpl's fields
> - activeCacheIds
> - txMap
> are not thread safe, but are widely used from different threads without the
> proper sync.
>
> The main question is ... why?
>
> According to the research, we have no guarantee that tx will be processed
> at the single thread.
> It may be processed at the several! threads at the striped pool and at the
> tx recovery thread as well.
>
> Thread at the striped pool will be selected by the message's partition()
> method, which can be calculated like this:
> - return keys != null && !keys.isEmpty() ? keys.get(0).partition() : -1;
> - return U.safeAbs(version().hashCode());
> - ...,
> so, no guarantee it is processed at the same thread (proven by tests).
>
> Seems, we MAY lose the data.
> For example, ignoring some or all keys from txMap at commit.
>
> If anyone knows why this is not a problem (I mean sync lack, not data loss)
> or how to fix this properly, please give me a hint, or correct my
> conclusions if necessary.
>
> [1] https://issues.apache.org/jira/browse/IGNITE-19445
>

--
Best regards,
Alexei Scherbakov