[ https://issues.apache.org/jira/browse/IGNITE-24995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrey Mashenkov updated IGNITE-24995:
--------------------------------------
Description:
*Motivation.*
Currently, the SharedState class stores correlates in the execution context and is used by the CorrelatedNestedLoopJoinNode (CNLJN) execution node.
It seems that CorrelatedNestedLoopJoinNode was designed to batch correlate variables, transferring many rows at a time, but the batching was implemented incorrectly and does not work.
There are a few related issues:
1. The class implements the Serializable interface and can be transferred to another node.
As a result, the messaging system serializes it with DefaultUserObjectMarshaller. Although the SharedState class contains BinaryTuple objects, they are not converted to byte[] during serialization, which is inefficient.
Making the class Externalizable might mitigate the issue.
2. We don't need to put a whole SQL row into a correlate variable; only the required row columns (a projection) are needed, which reduces network pressure.
It is important that all nodes create the same projection for the same correlate.
3. We should fix the SharedState class to make batching possible by allowing multiple rows to be set for the same correlate id.
Most likely, we must keep the correlates' hierarchy order to preserve CNLJN collation. The correlate id number does not give this guarantee when there is more than one correlate.
It may turn out that passing batches for parent correlates is useless, because we can spool only a child batch at a time to preserve the collation.
Thus, SharedState may be split, or its structure changed, to separate the correlates that were received from the parent fragment from the current correlates to be passed to the child fragment.

*Suggestion*
Let's improve the SharedState class structure to support batching by allowing multiple rows for the same correlate, and resolve the ordering issue (if it exists).
Let's resolve the serialization issue by adding a message class for this (or at least by using Externalizable).
Let's avoid transferring whole rows.
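To make items 2 and 3 concrete, here is a minimal sketch of a container that keeps several projected rows per correlate id. It is not the actual SharedState: the class name BatchedCorrelates, its methods, and the use of Object[] rows are all assumptions made for illustration, and insertion order merely stands in for whatever hierarchy order the real fix would need.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch only; not the Ignite SharedState class. */
final class BatchedCorrelates {
    /** Insertion-ordered, so iteration does not depend on the numeric value of the correlate id. */
    private final Map<Integer, List<Object[]>> rowsByCorrelate = new LinkedHashMap<>();

    /** Appends only the required columns of the row to the batch of the given correlate. */
    void add(int correlateId, Object[] row, int[] requiredColumns) {
        Object[] projected = new Object[requiredColumns.length];
        for (int i = 0; i < requiredColumns.length; i++) {
            projected[i] = row[requiredColumns[i]];
        }
        rowsByCorrelate.computeIfAbsent(correlateId, id -> new ArrayList<>()).add(projected);
    }

    /** Returns the rows collected so far for the correlate, in arrival order. */
    List<Object[]> batch(int correlateId) {
        return Collections.unmodifiableList(rowsByCorrelate.getOrDefault(correlateId, List.of()));
    }

    /** Returns correlate ids in the order in which they were first registered. */
    List<Integer> correlateIds() {
        return new ArrayList<>(rowsByCorrelate.keySet());
    }
}
```

For the projection to be usable on the receiving side, every node would have to derive the same requiredColumns for the same correlate, as noted in item 2.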
was:
Currently, the SharedState class stores correlates in the execution context and is used by the CorrelatedNestedLoopJoinNode (CNLJN) execution node.
It seems that CorrelatedNestedLoopJoinNode was designed to batch correlate variables, transferring many rows at a time, but the batching was implemented incorrectly and does not work.
There are a few related issues:
1. The class implements the Serializable interface and can be transferred to another node.
As a result, the messaging system serializes it with DefaultUserObjectMarshaller. Although the SharedState class contains BinaryTuple objects, they are not converted to byte[] during serialization, which is inefficient.
Making the class Externalizable might mitigate the issue.
2. We don't need to put a whole SQL row into a correlate variable; only the required row columns (a projection) are needed, which reduces network pressure.
It is important that all nodes create the same projection for the same correlate.
3. We should fix the SharedState class to make batching possible by allowing multiple rows to be set for the same correlate id.
Most likely, we must keep the correlates' hierarchy order to preserve CNLJN collation. The correlate id number does not give this guarantee when there is more than one correlate.

Let's improve the SharedState class structure, fix or drop the broken batching, and fix the messaging serialization issue.

> Sql. Rework correlates serialization and propagation to another node.
> ---------------------------------------------------------------------
>
> Key: IGNITE-24995
> URL: https://issues.apache.org/jira/browse/IGNITE-24995
> Project: Ignite
> Issue Type: Improvement
> Components: sql
> Affects Versions: 3.0
> Reporter: Andrey Mashenkov
> Priority: Major
> Labels: ignite-3, performance, tech-debt
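As a rough illustration of the Externalizable fallback mentioned in item 1 of the description, the sketch below writes each row as a length-prefixed byte[] instead of relying on default field serialization. The class CorrelateBatchMessage and its methods are hypothetical, the rows are assumed to be already-encoded tuples, and plain Java object streams are used only to demonstrate the round trip; a dedicated message class in the Ignite messaging system would look different.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.Externalizable;
import java.io.IOException;
import java.io.ObjectInput;
import java.io.ObjectInputStream;
import java.io.ObjectOutput;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch only; not an Ignite class. */
final class CorrelateBatchMessage implements Externalizable {
    private int correlateId;
    /** Rows kept as raw bytes (assumed to be already-encoded tuples). */
    private final List<byte[]> rows = new ArrayList<>();

    /** Public no-arg constructor required by Externalizable. */
    public CorrelateBatchMessage() {
    }

    CorrelateBatchMessage(int correlateId) {
        this.correlateId = correlateId;
    }

    void addRow(byte[] encodedRow) {
        rows.add(encodedRow);
    }

    int correlateId() {
        return correlateId;
    }

    List<byte[]> rows() {
        return rows;
    }

    @Override
    public void writeExternal(ObjectOutput out) throws IOException {
        out.writeInt(correlateId);
        out.writeInt(rows.size());
        for (byte[] row : rows) {
            out.writeInt(row.length); // length prefix, then the raw bytes
            out.write(row);
        }
    }

    @Override
    public void readExternal(ObjectInput in) throws IOException {
        correlateId = in.readInt();
        int count = in.readInt();
        rows.clear();
        for (int i = 0; i < count; i++) {
            byte[] row = new byte[in.readInt()];
            in.readFully(row);
            rows.add(row);
        }
    }

    /** Demonstration helper: serializes and deserializes the message in memory. */
    static CorrelateBatchMessage roundTrip(CorrelateBatchMessage msg) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(msg);
            }
            try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (CorrelateBatchMessage) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Carrying a list of rows per correlate in one message is also what would let a batch travel together, which ties this back to item 3.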