callicles opened a new issue, #2153:
URL: https://github.com/apache/datafusion-sqlparser-rs/issues/2153

   ## Problem
   
   ClickHouse data types are case-sensitive and require PascalCase (e.g., 
`String`, `Int32`, `Nullable`). However, the `sqlparser-rs` library's `Display` 
implementation for `DataType` converts certain types to uppercase, causing 
`UNKNOWN_TYPE` errors when round-tripping SQL through ClickHouse.
   
   ## Example
   ```rust
   use sqlparser::dialect::ClickHouseDialect;
   use sqlparser::parser::Parser;
   
   let sql = "CREATE TABLE t (col Nullable(String))";
   let dialect = ClickHouseDialect {};
   let ast = Parser::parse_sql(&dialect, sql).unwrap();
   
   // Round-trip: parse and convert back to string
   let regenerated = ast[0].to_string();
   // Result: "CREATE TABLE t (col Nullable(STRING))"
   //                                       ^^^^^^ uppercase!
   ```
   
   When this regenerated SQL is executed against ClickHouse, it fails with:
   
   ```
   Code: 47. DB::Exception: Unknown type STRING. (UNKNOWN_TYPE)
   ```
   
   ## Affected Types
   
   | Type | Current Output | ClickHouse Requires |
   |------|----------------|---------------------|
   | `DataType::Int8` | `INT8` | `Int8` |
   | `DataType::Int64` | `INT64` | `Int64` |
   | `DataType::Float64` | `FLOAT64` | `Float64` |
   | `DataType::String` | `STRING` | `String` |
   | `DataType::Bool` | `BOOL` | `Bool` |
   | `DataType::Date` | `DATE` | `Date` |
   | `DataType::Datetime` | `DATETIME` | `DateTime` |
   
   **Types already correct (PascalCase):**
   - `Int16`, `Int32`, `Int128`, `Int256`
   - `UInt8`, `UInt16`, `UInt32`, `UInt64`, `UInt128`, `UInt256`
   - `Float32`
   - `Nullable`, `LowCardinality`, `Array`, `Map`, `Tuple`, `Nested`
   
   ## Root Cause
   
   The `Display` trait implementation for `DataType` uses uppercase for type 
names (e.g., `write!(f, "STRING")`), which is standard for most SQL dialects 
but incorrect for ClickHouse.
   
   The challenge is that `Display` doesn't have access to dialect context, so 
it can't conditionally format based on the active dialect.
   
   ## Proposed Solution
   
   Add a dialect-aware formatting method to `DataType`:
   
   1. Add a `requires_pascalcase_types()` method to the `Dialect` trait 
(returns `false` by default, `true` for `ClickHouseDialect`)
   2. Add a `to_sql(&dyn Dialect)` method to `DataType` that respects the 
dialect's casing requirements
   3. Keep the existing `Display` implementation unchanged for backwards 
compatibility
   
   ```rust
   // In src/dialect/mod.rs
   pub trait Dialect {
       fn requires_pascalcase_types(&self) -> bool { false }
       // ...
   }
   
   // In src/dialect/clickhouse.rs
   impl Dialect for ClickHouseDialect {
       fn requires_pascalcase_types(&self) -> bool { true }
   }
   
   // In src/ast/data_type.rs
   impl DataType {
       pub fn to_sql(&self, dialect: &dyn Dialect) -> String {
           // Format with PascalCase if the dialect requires it;
           // otherwise defer to the existing `Display` output.
           todo!()
       }
   }
   ```
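   
   To make the three steps above concrete, here is a minimal, self-contained 
   sketch. It does not use the real `sqlparser` types: the `Dialect` trait, the 
   two dialect structs, and the four-variant `DataType` enum below are simplified 
   stand-ins (assumptions for illustration), and the real enum has many more 
   variants, several of which carry arguments. The point it illustrates is that 
   `to_sql` has to recurse into wrapper types such as `Nullable`, otherwise the 
   inner type would still be printed by `Display` in uppercase.
   
   ```rust
   use std::fmt;
   
   /// Toy stand-in for the crate's `Dialect` trait (hypothetical, simplified).
   pub trait Dialect {
       /// Proposed hook: defaults to `false`, so existing dialects are unaffected.
       fn requires_pascalcase_types(&self) -> bool {
           false
       }
   }
   
   pub struct GenericDialect;
   pub struct ClickHouseDialect;
   
   impl Dialect for GenericDialect {}
   
   impl Dialect for ClickHouseDialect {
       fn requires_pascalcase_types(&self) -> bool {
           true
       }
   }
   
   /// Toy stand-in for `DataType`, covering only a few variants.
   pub enum DataType {
       String,
       Int64,
       Datetime,
       Nullable(Box<DataType>),
   }
   
   /// Mirrors the current behavior: uppercase names, no dialect context.
   impl fmt::Display for DataType {
       fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
           match self {
               DataType::String => write!(f, "STRING"),
               DataType::Int64 => write!(f, "INT64"),
               DataType::Datetime => write!(f, "DATETIME"),
               DataType::Nullable(inner) => write!(f, "Nullable({})", inner),
           }
       }
   }
   
   impl DataType {
       /// Dialect-aware formatting; recurses so nested types are fixed as well.
       pub fn to_sql(&self, dialect: &dyn Dialect) -> String {
           if !dialect.requires_pascalcase_types() {
               // Step 3: other dialects keep the unchanged `Display` output.
               return self.to_string();
           }
           match self {
               DataType::String => "String".to_string(),
               DataType::Int64 => "Int64".to_string(),
               DataType::Datetime => "DateTime".to_string(),
               DataType::Nullable(inner) => format!("Nullable({})", inner.to_sql(dialect)),
           }
       }
   }
   ```
   
   With these stand-ins, `DataType::Nullable(Box::new(DataType::String))` formats 
   as `Nullable(STRING)` through `Display` or `to_sql(&GenericDialect)`, and as 
   `Nullable(String)` through `to_sql(&ClickHouseDialect)`. A real implementation 
   would also need the statement-level `Display` impls (e.g. for `CREATE TABLE`) 
   to route column types through the dialect-aware path, which this sketch does 
   not attempt.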
   
   ## Workarounds
   
   Currently, users must post-process the SQL string output to fix casing. See 
[514-labs/moosestack#3152](https://github.com/514-labs/moosestack/pull/3152) 
for an example regex-based workaround.
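   
   For readers who want a feel for what such post-processing involves, here is a 
   rough standard-library-only sketch. It is not the code from the linked PR; the 
   function name `fix_clickhouse_type_casing` and the `FIXES` table are 
   hypothetical, with the mappings taken from the Affected Types table above. It 
   rewrites whole-word matches only and is not SQL-aware, so it would also 
   rewrite a matching word inside a string literal or an identifier named, say, 
   `DATE`.
   
   ```rust
   /// Uppercase spellings from the Affected Types table, paired with the
   /// spelling ClickHouse expects.
   const FIXES: [(&str, &str); 7] = [
       ("INT8", "Int8"),
       ("INT64", "Int64"),
       ("FLOAT64", "Float64"),
       ("STRING", "String"),
       ("BOOL", "Bool"),
       ("DATE", "Date"),
       ("DATETIME", "DateTime"),
   ];
   
   /// Replaces whole-word occurrences of the uppercase type names.
   /// Words are runs of ASCII alphanumerics and `_`, so `DATE_ADDED` is left alone.
   pub fn fix_clickhouse_type_casing(sql: &str) -> String {
       let mut out = String::with_capacity(sql.len());
       let mut word = String::new();
       // A trailing sentinel space flushes the final word; it is removed below.
       for ch in sql.chars().chain(std::iter::once(' ')) {
           if ch.is_ascii_alphanumeric() || ch == '_' {
               word.push(ch);
               continue;
           }
           // A delimiter ends the current word: emit it, fixed if it is in the table.
           match FIXES.iter().find(|(upper, _)| *upper == word) {
               Some((_, pascal)) => out.push_str(pascal),
               None => out.push_str(&word),
           }
           word.clear();
           out.push(ch);
       }
       out.pop(); // drop the sentinel
       out
   }
   ```
   
   Applied to the regenerated statement from the example, 
   `fix_clickhouse_type_casing("CREATE TABLE t (col Nullable(STRING))")` yields 
   `CREATE TABLE t (col Nullable(String))`. The fragility of this kind of string 
   rewriting is the main argument for handling casing inside the library instead.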
   
   ## References
   
   - [ClickHouse Data Types 
Documentation](https://clickhouse.com/docs/en/sql-reference/data-types)
   - Related workaround: 
[514-labs/moosestack#3152](https://github.com/514-labs/moosestack/pull/3152)



