Alexey Serbin created KUDU-3577:
-----------------------------------

             Summary: Dropping a nullable column from a table with per-range hash partitions makes the table unusable
                 Key: KUDU-3577
                 URL: https://issues.apache.org/jira/browse/KUDU-3577
             Project: Kudu
          Issue Type: Bug
          Components: client, tserver
    Affects Versions: 1.17.0
            Reporter: Alexey Serbin

See the reproduction scenario below, which uses the {{kudu}} CLI tools.

Set an environment variable for the Kudu cluster's RPC endpoint:
{noformat}
$ export M=<master_RPC_address(es)>
{noformat}

Create a table with two range partitions. It's crucial that the {{city}} column is nullable.
{noformat}
$ kudu table create $M '{
  "table_name": "test",
  "schema": {
    "columns": [
      { "column_name": "id", "column_type": "INT64" },
      { "column_name": "name", "column_type": "STRING" },
      { "column_name": "age", "column_type": "INT32" },
      { "column_name": "city", "column_type": "STRING", "is_nullable": true }
    ],
    "key_column_names": ["id", "name", "age"]
  },
  "partition": {
    "hash_partitions": [
      {"columns": ["id"], "num_buckets": 4, "seed": 1},
      {"columns": ["name"], "num_buckets": 4, "seed": 2}
    ],
    "range_partition": {
      "columns": ["age"],
      "range_bounds": [
        {
          "lower_bound": {"bound_type": "inclusive", "bound_values": ["30"]},
          "upper_bound": {"bound_type": "exclusive", "bound_values": ["60"]}
        },
        {
          "lower_bound": {"bound_type": "inclusive", "bound_values": ["60"]},
          "upper_bound": {"bound_type": "exclusive", "bound_values": ["90"]}
        }
      ]
    }
  },
  "num_replicas": 1
}'
{noformat}

Add an extra range partition with a custom hash schema:
{noformat}
$ kudu table add_range_partition $M test '[90]' '[120]' --hash_schema '{"hash_schema": [
  {"columns": ["id"], "num_buckets": 3, "seed": 5},
  {"columns": ["name"], "num_buckets": 3, "seed": 6}
]}'
{noformat}

Check the updated partitioning info:
{noformat}
$ kudu table describe $M test
TABLE test (
    id INT64 NOT NULL,
    name STRING NOT NULL,
    age INT32 NOT NULL,
    city STRING NULLABLE,
    PRIMARY KEY (id, name, age)
)
HASH (id) PARTITIONS 4 SEED 1,
HASH (name) PARTITIONS 4 SEED 2,
RANGE (age) (
    PARTITION 30 <= VALUES < 60,
    PARTITION 60 <= VALUES < 90,
    PARTITION 90 <= VALUES < 120 HASH(id) PARTITIONS 3 HASH(name) PARTITIONS 3
)
OWNER root
REPLICAS 1
COMMENT
{noformat}

Drop the {{city}} column:
{noformat}
$ kudu table delete_column $M test city
{noformat}

Now run {{kudu table describe}} against the table again. With the {{city}} column dropped, it fails with {{Invalid argument}}:
{noformat}
$ kudu table describe $M test
Invalid argument: Invalid split row type UNKNOWN
{noformat}

The same error appears when running {{kudu table scan}} against the table:
{noformat}
$ kudu table scan $M test
Invalid argument: Invalid split row type UNKNOWN
{noformat}
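
To check whether a given table already shows the symptom, the final step of the scenario can be wrapped in a small shell helper. This is only a minimal sketch and not part of the original report: the function names {{table_is_broken}} and {{check_table}} are hypothetical, the {{KUDU}} variable for overriding the binary path is an assumption, and the only {{kudu}} invocation used is {{kudu table describe}} as shown above.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit status 0) if the given CLI output
# contains the error message quoted in this report.
table_is_broken() {
  case "$1" in
    *"Invalid split row type UNKNOWN"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: runs `kudu table describe` for the given master
# address(es) and table name, then reports whether the symptom is present.
# KUDU may point at a specific kudu binary (an assumption of this sketch).
check_table() {
  masters=$1
  table=$2
  # describe may exit non-zero on failure; keep going so that the captured
  # text can be inspected.
  output=$("${KUDU:-kudu}" table describe "$masters" "$table" 2>&1) || true
  if table_is_broken "$output"; then
    echo "AFFECTED: $table"
    return 1
  fi
  echo "OK: $table"
}
```

With {{M}} set as in the first step, {{check_table "$M" test}} prints {{AFFECTED: test}} and returns a non-zero status if the describe output contains the error text from this report, and prints {{OK: test}} otherwise.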