[jira] [Updated] (KUDU-3577) Altering a table with per-range hash partitions might make the table unusable

Alexey Serbin (Jira) Thu, 06 Jun 2024 21:03:20 -0700


     [ 
https://issues.apache.org/jira/browse/KUDU-3577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Alexey Serbin updated KUDU-3577:
--------------------------------
    Code Review: https://gerrit.cloudera.org/#/c/21486/

> Altering a table with per-range hash partitions might make the table unusable
> -----------------------------------------------------------------------------
>
>                 Key: KUDU-3577
>                 URL: https://issues.apache.org/jira/browse/KUDU-3577
>             Project: Kudu
>          Issue Type: Bug
>          Components: client, master, tserver
>    Affects Versions: 1.17.0
>            Reporter: Alexey Serbin
>            Priority: Major
>
> For certain tables with per-range hash schemas, dropping a nullable 
> column might make the table unusable.  A workaround exists: add the 
> dropped column back using the {{kudu table add_column}} CLI tool.  For 
> example, for the reproduction scenario below, use the following command to 
> restore access to the table's data:
> {noformat}
> $ kudu table add_column $M test city string
> {noformat}
> The reproduction scenario is the following sequence of {{kudu}} CLI 
> commands.
> Set an environment variable for the Kudu cluster's RPC endpoint:
> {noformat}
> $ export M=<master_RPC_address(es)>
> {noformat}
> Create a table with two range partitions.  It's crucial that the {{city}} 
> column is nullable.
> {noformat}
> $ kudu table create $M '{ "table_name": "test", "schema": { "columns": [ { 
> "column_name": "id", "column_type": "INT64" }, { "column_name": "name", 
> "column_type": "STRING" }, { "column_name": "age", "column_type": "INT32" }, 
> { "column_name": "city", "column_type": "STRING", "is_nullable": true } ], 
> "key_column_names": ["id", "name", "age"] }, "partition": { 
> "hash_partitions": [ {"columns": ["id"], "num_buckets": 4, "seed": 1}, 
> {"columns": ["name"], "num_buckets": 4, "seed": 2} ], "range_partition": { 
> "columns": ["age"], "range_bounds": [ { "lower_bound": {"bound_type": 
> "inclusive", "bound_values": ["30"]}, "upper_bound": {"bound_type": 
> "exclusive", "bound_values": ["60"]} }, { "lower_bound": {"bound_type": 
> "inclusive", "bound_values": ["60"]}, "upper_bound": {"bound_type": 
> "exclusive", "bound_values": ["90"]} } ] } }, "num_replicas": 1 }'
> {noformat}
> Add an extra range partition with custom hash schema:
> {noformat}
> $ kudu table add_range_partition $M test '[90]' '[120]' --hash_schema 
> '{"hash_schema": [ {"columns": ["id"], "num_buckets": 3, "seed": 5}, 
> {"columns": ["name"], "num_buckets": 3, "seed": 6} ]}'
> {noformat}
> Check the updated partitioning info:
> {noformat}
> $ kudu table describe $M test
> TABLE test (
>     id INT64 NOT NULL,
>     name STRING NOT NULL,
>     age INT32 NOT NULL,
>     city STRING NULLABLE,
>     PRIMARY KEY (id, name, age)
> )
> HASH (id) PARTITIONS 4 SEED 1,
> HASH (name) PARTITIONS 4 SEED 2,
> RANGE (age) (
>     PARTITION 30 <= VALUES < 60,
>     PARTITION 60 <= VALUES < 90,
>     PARTITION 90 <= VALUES < 120 HASH(id) PARTITIONS 3 HASH(name) PARTITIONS 3
> )
> OWNER root
> REPLICAS 1
> COMMENT 
> {noformat}
> Drop the {{city}} column:
> {noformat}
> $ kudu table delete_column $M test city
> {noformat}
> Now run {{kudu table describe}} against the table once the 
> {{city}} column is dropped.  It errors out with {{Invalid argument}}:
> {noformat}
> $ kudu table describe $M test
> Invalid argument: Invalid split row type UNKNOWN
> {noformat}
> A similar issue manifests itself when trying to run {{kudu table scan}} 
> against the table:
> {noformat}
> $ kudu table scan $M test
> Invalid argument: Invalid split row type UNKNOWN
> {noformat}
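
The detection and workaround steps above could be combined roughly as follows. This is only a sketch: the function names has_split_row_error and restore_if_broken are hypothetical helpers, not part of the kudu CLI, and it assumes the error text appears on stdout or stderr exactly as shown in the report. The only kudu commands it uses are the ones from the description.

```shell
#!/bin/sh
# Sketch only: the function names below are hypothetical helpers,
# not part of the kudu CLI.

# Return 0 if the given CLI output contains the error from this report.
has_split_row_error() {
  case "$1" in
    *"Invalid split row type UNKNOWN"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Describe the table; if the error appears, re-add the dropped column
# with the workaround command from the report.
# Arguments: master address(es), table name, column name, column type.
restore_if_broken() {
  # Capture stdout and stderr; ignore the exit status of describe,
  # since the decision is based on the error text (an assumption).
  out=$(kudu table describe "$1" "$2" 2>&1) || true
  if has_split_row_error "$out"; then
    kudu table add_column "$1" "$2" "$3" "$4"
  else
    printf '%s\n' "$out"
  fi
}
```

For the reproduction scenario, the call would be restore_if_broken "$M" test city string. If the table is not affected, the function just prints the describe output.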



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
