Hi, I would like to propose a solution to this JIRA issue. I looked at the comments and there was some guidance around where in the code we should update to allow for this behaviour. But I believe there are still two questions that remain open:
1. Is this expected behaviour (i.e. users should not be able to create an empty array)? 2. If this is indeed expected behaviour, what should the data type be of the empty array? I did some digging into other query engines / databases in hopes of answering the following two questions - That can be found at the end of this thread. *Q: *Is this expected behaviour (i.e. users should not be able to create an empty array)? *A: *Yes I would say this is expected behaviour and something we should add into the Flink SQL API. *Q: *What should the data type be of the empty array? *A: *This question is a bit harder to answer and I think it would require two steps. *Step 1: Pick a default data type to initialize the empty array.* We can use an "empty data type" such as NULL, VOID. *Step 2: Create or reuse type coercion to make using empty arrays easier.* The above should unblock users from creating empty arrays, but if one would use an empty array in an COALESCE operation. i.e. SELECT COALESCE(int_column, ARRAY[]) I believe they will get a query issue where the type for int_column (INTEGER) and the empty array (NULL, VOID) do not match. Thus a user will need to cast the empty array: i.e. SELECT COALESCE(int_column, CAST(ARRAY[] AS INT)) as such to have the COALESCE query to execute successfully. ----- *Trino* > EXPLAIN SELECT ARRAY[] Fragment 0 [SINGLE] Output layout: [expr] Output partitioning: SINGLE [] Output[columnNames = [_col0]] │ Layout: [expr:array(unknown)] │ Estimates: {rows: 1 (55B), cpu: 0, memory: 0B, network: 0B} │ _col0 := expr └─ Values[] Layout: [expr:array(unknown)] Estimates: {rows: 1 (55B), cpu: 0, memory: 0B, network: 0B} ("$literal$"(from_base64('AwAAAFJMRQAAAAAKAAAAQllURV9BUlJBWQEAAAABgAAAAAA='))) Expected behaviour? *Yes.* Array data type? *Unknown.* *Spark* > sc.sql.sql("SELECT ARRAY[]").explain() DataFrame[array(): array<void>] Expected behaviour? *Yes.* Array data type? *Void.* *BigQuery* > SELECT ARRAY[] Field name Type Mode f0_ INTEGER REPEATED Expected behaviour? *Yes.* Array data type? *Integer.* Best, Eric