Yu Xu created CALCITE-6988: ------------------------------ Summary: Support DuckDB Dialect Key: CALCITE-6988 URL: https://issues.apache.org/jira/browse/CALCITE-6988 Project: Calcite Issue Type: New Feature Components: core Affects Versions: 1.39.0 Reporter: Yu Xu Assignee: Yu Xu Fix For: 1.40.0
DuckDB is a very popular high-performance analytical database system(https://github.com/duckdb/duckdb), many users are using DuckDB in their work but Calcite had not supported with it yet. I had relevant materials about mainly syntax difference between default dialect and DuckDB (from document : https://duckdb.org/docs/stable/ ): *1. NullCollation is LAST mode, Nulls last for DESC and ASC, test as follow:* D select * from t3 order by data asc; ┌───────┬─────────┐ │ id │ data │ │ int32 │ varchar │ ├───────┼─────────┤ │ 1 │ aa │ │ 1 │ bb │ │ 1 │ │ └───────┴─────────┘ D select * from t3 order by data desc; ┌───────┬─────────┐ │ id │ data │ │ int32 │ varchar │ ├───────┼─────────┤ │ 1 │ bb │ │ 1 │ aa │ │ 1 │ │ └───────┴─────────┘ *2. max precision of decimal is 38(we can refer to https://duckdb.org/docs/stable/sql/data_types/numeric#fixed-point-decimals)* *3. IdentifierQuoteString is ", test for it:* test is OK: D SELECT DATETRUNC('second', "d") from t5; ┌────────────────────────┐ │ datetrunc('second', d) │ │ timestamp │ ├────────────────────────┤ │ 2025-01-22 12:21:23 │ │ 2025-01-22 12:21:25 │ └────────────────────────┘ ERROR result: D SELECT DATETRUNC('second', 'd') from t5; Binder Error: Could not choose a best candidate function for the function call "datetrunc(STRING_LITERAL, STRING_LITERAL)". In order to select one, please add explicit type casts. Candidate functions: datetrunc(VARCHAR, DATE) -> TIMESTAMP datetrunc(VARCHAR, INTERVAL) -> INTERVAL datetrunc(VARCHAR, TIMESTAMP WITH TIME ZONE) -> TIMESTAMP WITH TIME ZONE datetrunc(VARCHAR, TIMESTAMP) -> TIMESTAMP LINE 1: SELECT DATETRUNC('second', 'd') from t5; ^ D SELECT DATETRUNC('second', `d`) from t5; Catalog Error: Scalar Function with name `__postfix does not exist! Did you mean "!__postfix"? LINE 1: SELECT DATETRUNC('second', `d`) from t5; *4. support approx_count_distinct function:* D select approx_count_distinct("id") from t3 ; ┌───────────────────────────┐ │ approx_count_distinct(id) │ │ int64 │ ├───────────────────────────┤ │ 1 │ └───────────────────────────┘ *5. Not support character_length/char_length, need use length function to support it:* D select length('aa'); ┌──────────────┐ │ length('aa') │ │ int64 │ ├──────────────┤ │ 2 │ └──────────────┘ D select character_length('aa'); Catalog Error: Scalar Function with name character_length does not exist! Did you mean "array_length"? LINE 1: select character_length('aa'); ^ D select char_length('aa'); Catalog Error: Scalar Function with name char_length does not exist! Did you mean "array_length"? LINE 1: select char_length('aa'); ^ *6. Map construncor method need convert () to {}, test for it:* result is OK: D select MAP \{'a':1, 'b':2}; ┌────────────────────────────────────────────────────────────┐ │ main.map(main.list_value('a', 'b'), main.list_value(1, 2)) │ │ map(varchar, integer) │ ├────────────────────────────────────────────────────────────┤ │ \{a=1, b=2} │ └────────────────────────────────────────────────────────────┘ Error result: D select MAP ('a':1, 'b':2); Parser Error: syntax error at or near ":" LINE 1: select MAP ('a':1, 'b':2); *7. floor date should convert datetrunc function(can refer to: https://duckdb.org/docs/stable/sql/functions/date#date_truncpart-date)* D SELECT DATETRUNC('year', "d") from t5; ┌──────────────────────┐ │ datetrunc('year', d) │ │ date │ ├──────────────────────┤ │ 2025-01-01 │ │ 2025-01-01 │ └──────────────────────┘ D SELECT DATETRUNC('month', "d") from t5; ┌───────────────────────┐ │ datetrunc('month', d) │ │ date │ ├───────────────────────┤ │ 2025-01-01 │ │ 2025-01-01 │ └───────────────────────┘ D SELECT DATETRUNC('day', "d") from t5; ┌─────────────────────┐ │ datetrunc('day', d) │ │ date │ ├─────────────────────┤ │ 2025-01-22 │ │ 2025-01-22 │ └─────────────────────┘ D SELECT DATETRUNC('week', "d") from t5; ┌──────────────────────┐ │ datetrunc('week', d) │ │ date │ ├──────────────────────┤ │ 2025-01-20 │ │ 2025-01-20 │ └──────────────────────┘ D SELECT DATETRUNC('millisecond', "d") from t5; ┌─────────────────────────────┐ │ datetrunc('millisecond', d) │ │ timestamp │ ├─────────────────────────────┤ │ 2025-01-22 12:21:23.123 │ │ 2025-01-22 12:21:25.123 │ └─────────────────────────────┘ D SELECT DATETRUNC('microseconds', "d") from t5; ┌──────────────────────────────┐ │ datetrunc('microseconds', d) │ │ timestamp │ ├──────────────────────────────┤ │ 2025-01-22 12:21:23.123 │ │ 2025-01-22 12:21:25.123223 │ └──────────────────────────────┘ for these differences we should make a new DuckDB dialect. -- This message was sent by Atlassian Jira (v8.20.10#820010)