alamb commented on issue #14909:
URL: https://github.com/apache/datafusion/issues/14909#issuecomment-2700708838

Thank you @zhuqi-lucas

If you change the example slightly (so the column names are not explicitly listed), then the type is correctly set to `Utf8View`:

```sql
> CREATE EXTERNAL TABLE IF NOT EXISTS lineitem STORED AS parquet LOCATION '/Users/andrewlamb/Software/datafusion/benchmarks/data/tpch_sf10/lineitem';
0 row(s) fetched.
Elapsed 0.010 seconds.

> select arrow_typeof(l_comment) from lineitem limit 1;
+----------------------------------+
| arrow_typeof(lineitem.l_comment) |
+----------------------------------+
| Utf8View                         |
+----------------------------------+
1 row(s) fetched.
Elapsed 0.015 seconds.
```

I think what is happening is that the SQL type `VARCHAR` is mapped to `Utf8`:

https://github.com/apache/datafusion/blob/c0d53adf8323b840d0adaa62ba868d6acd4ba886/datafusion/sql/src/planner.rs#L561

So when the table is created with an explicit type, the column is set to `Utf8`. However, when the types are read directly from Parquet, the type is inferred as `Utf8View`.
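
A quick way to check both halves of this explanation from `datafusion-cli` is sketched below. It is only a sketch, not a verified reproduction: it assumes the `lineitem` table from the example above already exists, and the reported types may differ across DataFusion versions and configuration settings, so no expected output is shown.

```sql
-- Sketch: ask the SQL planner which Arrow type it assigns to VARCHAR,
-- independent of any Parquet file. If the mapping described above holds,
-- this is where the Utf8 type would show up.
SELECT arrow_typeof(CAST('x' AS VARCHAR));

-- List the Arrow type of every column of the table created above,
-- to see which string columns were inferred as Utf8View from Parquet.
DESCRIBE lineitem;
```

Comparing the two results shows whether the difference comes from the `VARCHAR` mapping in the planner or from the schema inferred from the Parquet files.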