[jira] [Updated] (SPARK-50883) Support altering multiple columns in the same ALTER TABLE command

Cuong Nguyen (Jira) Fri, 17 Jan 2025 12:40:24 -0800


     [ 
https://issues.apache.org/jira/browse/SPARK-50883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Cuong Nguyen updated SPARK-50883:
---------------------------------
    Description: 
The current ALTER TABLE ... ALTER COLUMN syntax allows altering only one column 
at a time. For a large table with many columns, we need to run one command per 
column, which can be slow because every command incurs the preprocessing and IO 
cost again. 

A new syntax that accepts multiple columns opens the door to sharing that cost 
across multiple column changes. We propose the following syntax:
{code:java}
ALTER TABLE table_name ALTER COLUMN {
  { column_identifier | field_name }
  { COMMENT comment |
    { FIRST | AFTER identifier } |
    { SET | DROP } NOT NULL |
    TYPE data_type |
    SET DEFAULT clause |
    DROP DEFAULT }
} [, ...]{code}
For example:
{code:java}
ALTER TABLE test_table ALTER COLUMN
  a COMMENT "new comment",
  b TYPE BIGINT,
  x.y.z FIRST{code}
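For comparison, with the current one-column-per-command syntax the example above takes three separate statements. The Python sketch below only builds those statements as strings and does not talk to Spark; the helper name single_column_statements is hypothetical and exists purely for illustration.

```python
def single_column_statements(table, changes):
    """Build one ALTER TABLE ... ALTER COLUMN statement per (column, action) pair."""
    return [
        f"ALTER TABLE {table} ALTER COLUMN {column} {action}"
        for column, action in changes
    ]


# The same three changes as in the multi-column example.
changes = [
    ("a", 'COMMENT "new comment"'),
    ("b", "TYPE BIGINT"),
    ("x.y.z", "FIRST"),
]

statements = single_column_statements("test_table", changes)
for statement in statements:
    print(statement)
```

Each of these statements is a separate command, so each one pays the per-command cost on its own. The proposed syntax expresses the same changes as a single command.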
This new syntax is backward compatible with the current syntax. To bound the 
complexity of the initial implementation, we place the following 
restrictions:
 * Altering the same column multiple times in one command is not allowed.
 * Altering both a parent column and one of its child columns (for nested data types) in one command is not allowed.
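
The two restrictions above can be checked on the list of column names before any change is applied. The following Python sketch is one possible way to do that and is not the actual Spark implementation. It makes several assumptions: the function name find_conflicts is hypothetical, column names are plain dot-separated paths without quoting, names are compared exactly as written, and "parent" is taken to mean any ancestor of a nested field.

```python
def find_conflicts(columns):
    """Return (kind, first, second) tuples for conflicting column paths.

    kind is "duplicate" when the same path appears twice and
    "parent-child" when one path is a strict prefix of another.
    """
    paths = [tuple(name.split(".")) for name in columns]
    conflicts = []
    for i, first in enumerate(paths):
        for second in paths[i + 1:]:
            if first == second:
                conflicts.append(("duplicate", ".".join(first), ".".join(second)))
                continue
            # Compare the shorter path against the leading parts of the longer one.
            shorter, longer = sorted((first, second), key=len)
            if longer[:len(shorter)] == shorter:
                conflicts.append(
                    ("parent-child", ".".join(shorter), ".".join(longer))
                )
    return conflicts


print(find_conflicts(["a", "b", "x.y.z"]))   # []
print(find_conflicts(["a", "b", "a"]))       # [('duplicate', 'a', 'a')]
print(find_conflicts(["x.y.z", "x.y"]))      # [('parent-child', 'x.y', 'x.y.z')]
```

A command would be rejected whenever the returned list is non-empty. A real implementation would also need to follow Spark's identifier resolution rules, which this sketch ignores.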

 

  was:
Current ALTER TABLE ... ALTER COLUMN syntax only allows altering one column at 
a time. For a large table with many columns, we need to run a command for every 
column, which can be slow since we need to incur the preprocessing and IO cost 
repeatedly. 

A new syntax that allows specifying multiple columns can open door for sharing 
such cost across multiple column changes. We propose this new syntax
{code:java}
ALTER TABLE table_name ALTER COLUMN { 
  { column_identifier | field_name }
  { COMMENT comment |
    { FIRST | AFTER identifier } |
    { SET | DROP } NOT NULL |
    TYPE data_type |
    SET DEFAULT clause |
    DROP DEFAULT }
}{code}
For example:
{code:java}
ALTER TABLE test_table ALTER COLUMN
  a COMMENT "new comment",
  b TYPE BIGINT,
  x.y.z FIRST{code}
This new syntax is backward compatible with the current syntax. To bound the 
complexity of the initial support of this syntax we place the following 
restrictions:
 * Altering the same column multiple times is not allowed
 * Altering a parent and a child column (for nested data type) is not allowed.

 


> Support altering multiple columns in the same ALTER TABLE command
> -----------------------------------------------------------------
>
>                 Key: SPARK-50883
>                 URL: https://issues.apache.org/jira/browse/SPARK-50883
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 3.5.4
>            Reporter: Cuong Nguyen
>            Priority: Major
>              Labels: pull-request-available
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

