[ https://issues.apache.org/jira/browse/KUDU-3577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alexey Serbin updated KUDU-3577:
--------------------------------
    Code Review: https://gerrit.cloudera.org/#/c/21486/

> Altering a table with per-range hash partitions might make the table unusable
> -----------------------------------------------------------------------------
>
>                 Key: KUDU-3577
>                 URL: https://issues.apache.org/jira/browse/KUDU-3577
>             Project: Kudu
>          Issue Type: Bug
>          Components: client, master, tserver
>    Affects Versions: 1.17.0
>            Reporter: Alexey Serbin
>            Priority: Major
>
> For certain tables with per-range hash schemas, dropping a nullable column
> might make the table unusable. A workaround exists: add the dropped column
> back using the {{kudu table add_column}} CLI tool. For example, for the
> reproduction scenario below, use the following command to restore access to
> the table's data:
> {noformat}
> $ kudu table add_column $M test city string
> {noformat}
> To reproduce the issue, run the following sequence of {{kudu}} CLI commands.
> Set an environment variable for the Kudu cluster's RPC endpoint:
> {noformat}
> $ export M=<master_RPC_address(es)>
> {noformat}
> Create a table with two range partitions. It's crucial that the {{city}}
> column is nullable.
> {noformat}
> $ kudu table create $M '{ "table_name": "test", "schema": { "columns": [ {
> "column_name": "id", "column_type": "INT64" }, { "column_name": "name",
> "column_type": "STRING" }, { "column_name": "age", "column_type": "INT32" },
> { "column_name": "city", "column_type": "STRING", "is_nullable": true } ],
> "key_column_names": ["id", "name", "age"] }, "partition": {
> "hash_partitions": [ {"columns": ["id"], "num_buckets": 4, "seed": 1},
> {"columns": ["name"], "num_buckets": 4, "seed": 2} ], "range_partition": {
> "columns": ["age"], "range_bounds": [ { "lower_bound": {"bound_type":
> "inclusive", "bound_values": ["30"]}, "upper_bound": {"bound_type":
> "exclusive", "bound_values": ["60"]} }, { "lower_bound": {"bound_type":
> "inclusive", "bound_values": ["60"]}, "upper_bound": {"bound_type":
> "exclusive", "bound_values": ["90"]} } ] } }, "num_replicas": 1 }'
> {noformat}
> Add an extra range partition with a custom hash schema:
> {noformat}
> $ kudu table add_range_partition $M test '[90]' '[120]' --hash_schema
> '{"hash_schema": [ {"columns": ["id"], "num_buckets": 3, "seed": 5},
> {"columns": ["name"], "num_buckets": 3, "seed": 6} ]}'
> {noformat}
> Check the updated partitioning info:
> {noformat}
> $ kudu table describe $M test
> TABLE test (
>     id INT64 NOT NULL,
>     name STRING NOT NULL,
>     age INT32 NOT NULL,
>     city STRING NULLABLE,
>     PRIMARY KEY (id, name, age)
> )
> HASH (id) PARTITIONS 4 SEED 1,
> HASH (name) PARTITIONS 4 SEED 2,
> RANGE (age) (
>     PARTITION 30 <= VALUES < 60,
>     PARTITION 60 <= VALUES < 90,
>     PARTITION 90 <= VALUES < 120 HASH(id) PARTITIONS 3 HASH(name) PARTITIONS 3
> )
> OWNER root
> REPLICAS 1
> COMMENT
> {noformat}
> Drop the {{city}} column:
> {noformat}
> $ kudu table delete_column $M test city
> {noformat}
> Now run {{kudu table describe}} against the table after the {{city}} column
> has been dropped.
> It errors out with {{Invalid argument}}:
> {noformat}
> $ kudu table describe $M test
> Invalid argument: Invalid split row type UNKNOWN
> {noformat}
> A similar issue manifests itself when trying to run {{kudu table scan}}
> against the table:
> {noformat}
> $ kudu table scan $M test
> Invalid argument: Invalid split row type UNKNOWN
> {noformat}
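
The symptom check described in the report can be scripted. The following is a minimal sketch, not part of the original report: the helper names {{has_split_row_error}} and {{check_table}} and the {{KUDU_BIN}} override are hypothetical, and it relies only on the {{kudu table describe}} command and the error text quoted above.

```shell
#!/bin/sh
# Sketch: detect the KUDU-3577 symptom for a given table.
# Assumption: KUDU_BIN is a hypothetical override for the kudu CLI binary;
# it is not an option of the kudu tool itself.
KUDU_BIN="${KUDU_BIN:-kudu}"

# Returns 0 if the given text contains the error message quoted in the report.
has_split_row_error() {
  printf '%s\n' "$1" | grep -q 'Invalid split row type'
}

# Runs `kudu table describe <masters> <table>` and prints "affected" when the
# output contains the error, "ok" otherwise.
check_table() {
  ct_masters="$1"
  ct_table="$2"
  # Capture stdout and stderr; a non-zero exit status is expected on failure.
  ct_output=$("$KUDU_BIN" table describe "$ct_masters" "$ct_table" 2>&1) || true
  if has_split_row_error "$ct_output"; then
    echo "affected"
  else
    echo "ok"
  fi
}
```

For the reproduction scenario, {{check_table "$M" test}} should print {{affected}} once the {{city}} column has been dropped; the workaround from the report ({{kudu table add_column $M test city string}}) can then be applied by hand.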