[ https://issues.apache.org/jira/browse/SPARK-50883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Cuong Nguyen updated SPARK-50883:
---------------------------------
    Description: 
The current ALTER TABLE ... ALTER COLUMN syntax only allows altering one column at a time. For a large table with many columns, we need to run a command for every column, which can be slow because the preprocessing and IO cost is incurred repeatedly.

A new syntax that allows specifying multiple columns opens the door to sharing this cost across multiple column changes. We propose this new syntax:
{code:java}
ALTER TABLE table_name ALTER COLUMN {
  { column_identifier | field_name }
  { COMMENT comment |
    { FIRST | AFTER identifier } |
    { SET | DROP } NOT NULL |
    TYPE data_type |
    SET DEFAULT clause |
    DROP DEFAULT }
} [, ...]
{code}
For example:
{code:java}
ALTER TABLE test_table ALTER COLUMN
  a COMMENT "new comment",
  b TYPE BIGINT,
  x.y.z FIRST
{code}
This new syntax is backward compatible with the current syntax. To bound the complexity of the initial support for this syntax, we place the following restrictions:
* Altering the same column multiple times is not allowed.
* Altering a parent and a child column (for nested data types) is not allowed.

  was:
The current ALTER TABLE ... ALTER COLUMN syntax only allows altering one column at a time. For a large table with many columns, we need to run a command for every column, which can be slow because the preprocessing and IO cost is incurred repeatedly.

A new syntax that allows specifying multiple columns opens the door to sharing this cost across multiple column changes. We propose this new syntax:
{code:java}
ALTER TABLE table_name ALTER COLUMN {
  { column_identifier | field_name }
  { COMMENT comment |
    { FIRST | AFTER identifier } |
    { SET | DROP } NOT NULL |
    TYPE data_type |
    SET DEFAULT clause |
    DROP DEFAULT }
}
{code}
For example:
{code:java}
ALTER TABLE test_table ALTER COLUMN
  a COMMENT "new comment",
  b TYPE BIGINT,
  x.y.z FIRST
{code}
This new syntax is backward compatible with the current syntax.
To bound the complexity of the initial support for this syntax, we place the following restrictions:
* Altering the same column multiple times is not allowed.
* Altering a parent and a child column (for nested data types) is not allowed.


> Support altering multiple columns in the same ALTER TABLE command
> -----------------------------------------------------------------
>
>                 Key: SPARK-50883
>                 URL: https://issues.apache.org/jira/browse/SPARK-50883
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 3.5.4
>            Reporter: Cuong Nguyen
>            Priority: Major
>              Labels: pull-request-available
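
The two restrictions listed in the description (no column altered twice, no parent and child altered together) could be checked roughly as in the sketch below. This is only an illustration, not Spark's implementation: the function name `check_alter_column_targets` is hypothetical, column names are split naively on `.` (quoted identifiers are not handled), and names are compared case-insensitively as an assumption.

```python
from typing import Sequence


def check_alter_column_targets(columns: Sequence[str]) -> None:
    """Raise ValueError if the target columns violate either restriction.

    Hypothetical helper for illustration; not part of Spark.
    """
    # Split dotted names such as "x.y.z" into path tuples.
    # Assumption: names are compared case-insensitively.
    paths = [tuple(part.lower() for part in name.split(".")) for name in columns]

    # Restriction 1: the same column may not be altered more than once.
    seen = set()
    for path in paths:
        if path in seen:
            raise ValueError(f"column {'.'.join(path)} is altered more than once")
        seen.add(path)

    # Restriction 2: a parent and one of its nested children may not both
    # be altered. A parent is any strict prefix of another target's path.
    for path in paths:
        for depth in range(1, len(path)):
            if path[:depth] in seen:
                raise ValueError(
                    f"column {'.'.join(path)} and its parent "
                    f"{'.'.join(path[:depth])} are both altered"
                )


# The example from the description passes the check.
check_alter_column_targets(["a", "b", "x.y.z"])
```

With this sketch, a target list such as `["a", "b", "a"]` or `["x.y", "x.y.z"]` raises `ValueError`, while sibling fields such as `["x.y", "x.w"]` are accepted.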