Tx processing is supposed to be thread-bound: the tx version is hashed to a
partition, and that partition selects the thread; see methods like [1].
If this invariant is broken in some cases, that should be fixed.

[1] org.apache.ignite.internal.processors.cache.distributed.dht.GridDhtTxPrepareRequest#partition
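
To make the invariant concrete, here is a minimal sketch (not Ignite code) of
how two messages of the same tx could land on different stripes when their
partition() values are computed differently. The class and method names are
hypothetical, the "partition modulo stripe count" rule is an assumption about
how the striped pool picks a thread, and safeAbs is only a stand-in for
U.safeAbs.

```java
import java.util.List;

/** Hypothetical illustration only; none of these names exist in Ignite. */
final class StripeSketch {
    /** Assumption: the stripe is the partition modulo the stripe count; negative means "no fixed stripe". */
    static int stripeFor(int partition, int stripes) {
        return partition < 0 ? -1 : partition % stripes;
    }

    /** Key-based variant: partition of the first key, or -1 when there are no keys. */
    static int partitionByKeys(List<Integer> keyPartitions) {
        return keyPartitions != null && !keyPartitions.isEmpty() ? keyPartitions.get(0) : -1;
    }

    /** Version-based variant: non-negative hash of the tx version. */
    static int partitionByVersion(Object version) {
        return safeAbs(version.hashCode());
    }

    /** Stand-in for U.safeAbs: Math.abs(Integer.MIN_VALUE) is still negative, so clamp it to 0. */
    static int safeAbs(int i) {
        i = Math.abs(i);
        return i < 0 ? 0 : i;
    }

    public static void main(String[] args) {
        int stripes = 8;
        Object txVersion = 42;                  // stand-in for a tx version object
        List<Integer> keyPartitions = List.of(5); // partition of the first key in a message

        // Same tx, two messages, two different ways of computing partition().
        System.out.println("by version: stripe " + stripeFor(partitionByVersion(txVersion), stripes));
        System.out.println("by keys:    stripe " + stripeFor(partitionByKeys(keyPartitions), stripes));
    }
}
```

If every message of a tx used the version-based variant, all of them would map
to the same stripe; mixing in the key-based variant is what breaks the binding
in this sketch.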

Fri, May 19, 2023 at 15:57, Anton Vinogradov <a...@apache.org>:

> Igniters,
>
> My team ran into a node failure [1] caused by the use of non-thread-safe
> collections.
>
> IgniteTxStateImpl's fields
> - activeCacheIds
> - txMap
> are not thread-safe, yet they are widely accessed from different threads
> without proper synchronization.
>
> The main question is ... why?
>
> According to our research, there is no guarantee that a tx will be processed
> on a single thread.
> It may be processed on several threads of the striped pool, and on the
> tx recovery thread as well.
>
> The striped-pool thread is selected by the message's partition()
> method, which can be computed in different ways, for example:
> - return keys != null && !keys.isEmpty() ? keys.get(0).partition() : -1;
> - return U.safeAbs(version().hashCode());
> - ...,
> so there is no guarantee that a tx is processed on the same thread
> (confirmed by tests).
>
> It seems we MAY lose data,
> for example by ignoring some or all keys from txMap at commit.
>
> If anyone knows why this is not a problem (I mean the lack of
> synchronization, not the data loss) or how to fix it properly, please give
> me a hint, or correct my conclusions if necessary.
>
> [1] https://issues.apache.org/jira/browse/IGNITE-19445
>


-- 

Best regards,
Alexei Scherbakov
