alamb commented on issue #14909:
URL: https://github.com/apache/datafusion/issues/14909#issuecomment-2700708838

   Thank you @zhuqi-lucas 
   
   If you change the example slightly (so the column names are not explicitly listed), then the type is correctly set to `Utf8View`:
   
   ```sql
   > CREATE EXTERNAL TABLE IF NOT EXISTS lineitem STORED AS parquet
   LOCATION '/Users/andrewlamb/Software/datafusion/benchmarks/data/tpch_sf10/lineitem';
   0 row(s) fetched.
   Elapsed 0.010 seconds.
   
   > select arrow_typeof(l_comment) from lineitem limit 1;
   +----------------------------------+
   | arrow_typeof(lineitem.l_comment) |
   +----------------------------------+
   | Utf8View                         |
   +----------------------------------+
   1 row(s) fetched.
   Elapsed 0.015 seconds.
   ```
   
   I think what is happening is that the SQL type `VARCHAR` is mapped to `Utf8` by the SQL planner:
   
   https://github.com/apache/datafusion/blob/c0d53adf8323b840d0adaa62ba868d6acd4ba886/datafusion/sql/src/planner.rs#L561
   
   So when the table is created with an explicit type, the column is set to `Utf8`.
   
   However, when the types are read directly from Parquet, the type is inferred as `Utf8View`.
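   
   For contrast, here is a minimal sketch of the explicit-column case. The table name `lineitem_explicit` and the path are placeholders, and only one column is declared for brevity (the original report may have listed every column). If the explanation above is right, `arrow_typeof` would be expected to report `Utf8` here rather than `Utf8View`:
   
   ```sql
   -- Hypothetical reproduction: the column type is declared explicitly,
   -- so it comes from the SQL planner's VARCHAR mapping instead of
   -- being inferred from the Parquet files.
   CREATE EXTERNAL TABLE IF NOT EXISTS lineitem_explicit (
       l_comment VARCHAR
   )
   STORED AS parquet
   LOCATION '/path/to/tpch_sf10/lineitem';  -- placeholder path
   
   -- Check which Arrow type the planner assigned to the column.
   select arrow_typeof(l_comment) from lineitem_explicit limit 1;
   ```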


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

