[ https://issues.apache.org/jira/browse/IGNITE-22501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Iurii Gerzhedovich updated IGNITE-22501:
----------------------------------------
    Description:
From an architectural perspective, AI3 has many waits for any DDL statement due to the need for strong guarantees. This leads to a significant execution time for a database initialization script containing tens or tons of DDL statements (CREATE TABLE, CREATE INDEX ...).

To improve the performance of such scripts, you can form batches from contiguous sequences of DDL operations and execute each batch as a single catalog command.

In case of any error while applying such a batch, execution should fall back to statement-by-statement execution to provide

The only requirement is that batching must be transparent to the user: no changes in observable behavior must be made.

We need to understand that not all operations can be reordered. For example, CREATE INDEX can't be executed earlier than the corresponding CREATE TABLE, and DROP TABLE can't be reordered with any other operation on the same table.

As the first step, we can go with a very simple optimization, but it should give a good boost for most cases. Let's batch all CREATE TABLE statements from a script while we meet only CREATE TABLE and CREATE INDEX statements, and execute them as the first batch. The second batch will be the CREATE INDEX statements that we met while collecting the first batch. We can repeat the operation whenever we again start to meet CREATE INDEX and CREATE TABLE statements.

The proposed solution has one important consequence, which needs to be reflected in the documentation. Operations can be reordered and any individual operation can finish with an error, so after the user receives an error they can't be sure which part of the script was applied.

  was:
From an architectural perspective, AI3 has many waits for any DDL statement due to the need for strong guarantees. This leads to a significant execution time for a database initialization script containing tens or tons of DDL statements (CREATE TABLE, CREATE INDEX ...).

To improve the performance of such scripts, you can form batches from contiguous sequences of DDL operations and execute each batch as a single catalog command.
Requirements just one -

We need to understand that not all operations can be reordered. For example, CREATE INDEX can't be executed earlier than the corresponding CREATE TABLE, and DROP TABLE can't be reordered with any other operation on the same table.

As the first step, we can go with a very simple optimization, but it should give a good boost for most cases. Let's batch all CREATE TABLE statements from a script while we meet only CREATE TABLE and CREATE INDEX statements, and execute them as the first batch. The second batch will be the CREATE INDEX statements that we met while collecting the first batch. We can repeat the operation whenever we again start to meet CREATE INDEX and CREATE TABLE statements.

The proposed solution has one important consequence, which needs to be reflected in the documentation. Operations can be reordered and any individual operation can finish with an error, so after the user receives an error they can't be sure which part of the script was applied.


> Sql. Batching DDL statement for scripts
> ---------------------------------------
>
>                 Key: IGNITE-22501
>                 URL: https://issues.apache.org/jira/browse/IGNITE-22501
>             Project: Ignite
>          Issue Type: Improvement
>          Components: sql
>            Reporter: Iurii Gerzhedovich
>            Priority: Major
>              Labels: ignite-3
>
> From an architectural perspective, AI3 has many waits for any DDL statement
> due to the need for strong guarantees. This leads to a significant execution
> time for a database initialization script containing tens or tons of DDL
> statements (CREATE TABLE, CREATE INDEX ...).
> To improve the performance of such scripts, you can form batches from
> contiguous sequences of DDL operations and execute each batch as a single
> catalog command.
> In case of any error while applying such a batch, execution should fall back
> to statement-by-statement execution to provide
> The only requirement is that batching must be transparent to the user: no
> changes in observable behavior must be made.
> We need to understand that not all operations can be reordered. For example,
> CREATE INDEX can't be executed earlier than the corresponding CREATE TABLE,
> and DROP TABLE can't be reordered with any other operation on the same table.
> As the first step, we can go with a very simple optimization, but it should
> give a good boost for most cases. Let's batch all CREATE TABLE statements
> from a script while we meet only CREATE TABLE and CREATE INDEX statements,
> and execute them as the first batch. The second batch will be the CREATE
> INDEX statements that we met while collecting the first batch. We can repeat
> the operation whenever we again start to meet CREATE INDEX and CREATE TABLE
> statements.
> The proposed solution has one important consequence, which needs to be
> reflected in the documentation. Operations can be reordered and any
> individual operation can finish with an error, so after the user receives an
> error they can't be sure which part of the script was applied.
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
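
For illustration only, the first-step grouping described in the issue could be sketched roughly as below. This is not Ignite code: `statement_kind` and `plan_batches` are hypothetical names, statements are classified by a naive look at their first two keywords, and the fallback to statement-by-statement execution on error is not modeled.

```python
from typing import List


def statement_kind(sql: str) -> str:
    """Naive classifier (an assumption for this sketch): inspects only the first two keywords."""
    words = sql.strip().upper().split()
    if words[:2] == ["CREATE", "TABLE"]:
        return "table"
    if words[:2] == ["CREATE", "INDEX"]:
        return "index"
    return "other"


def plan_batches(statements: List[str]) -> List[List[str]]:
    """Group a script into batches following the first-step idea from the issue.

    Within a run that contains only CREATE TABLE and CREATE INDEX statements,
    all CREATE TABLEs form one batch and the CREATE INDEXes seen during that
    run form the next batch. Any other statement ends the run and is planned
    on its own, so it is never reordered.
    """
    batches: List[List[str]] = []
    tables: List[str] = []
    indexes: List[str] = []

    def flush() -> None:
        # Emit the tables batch first so every index is created after its table.
        if tables:
            batches.append(list(tables))
        if indexes:
            batches.append(list(indexes))
        tables.clear()
        indexes.clear()

    for sql in statements:
        kind = statement_kind(sql)
        if kind == "table":
            tables.append(sql)
        elif kind == "index":
            indexes.append(sql)
        else:
            flush()
            batches.append([sql])
    flush()
    return batches


script = [
    "CREATE TABLE t1 (id INT PRIMARY KEY, v INT)",
    "CREATE INDEX t1_v ON t1 (v)",
    "CREATE TABLE t2 (id INT PRIMARY KEY)",
    "DROP TABLE t2",
    "CREATE TABLE t3 (id INT PRIMARY KEY)",
]
plan = plan_batches(script)
```

With this sample script, the two leading CREATE TABLE statements end up in one batch, the CREATE INDEX in a second one, and DROP TABLE acts as a barrier that starts a new run. The reordering inside a run is exactly what leads to the documentation caveat in the issue: after an error, the user cannot tell from the script order alone which statements were applied.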