cfis commented on issue #1018: URL: https://github.com/apache/datafusion-python/issues/1018#issuecomment-2676440635
Thanks for looking into this @kosiew. Yes, I understand that the duplicate fields come from the combination of hive partition fields and the parquet fields. However, I think this scenario should be supported. Similarly, when you join two tables with the same field names in a database, the database does not throw an error. Instead, it provides both columns in the results. You may have to rename them via a select clause to save them, or specify `table.column` to reference them, but it is a supported scenario. From reading Arrow tickets, Arrow also supports duplicate field names. I'm not sure whether this error comes from Arrow or from the way DataFusion or the Python bindings use Arrow.
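To illustrate the database join analogy, here is a minimal sketch using Python's built-in `sqlite3` module rather than DataFusion. The table names `left_t` and `right_t` and their contents are made up for this example. It only shows how one SQL engine handles the case and says nothing about where the DataFusion error comes from.

```python
import sqlite3

# In-memory database with two hypothetical tables sharing the same column names.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE left_t (id INTEGER, value TEXT);
    CREATE TABLE right_t (id INTEGER, value TEXT);
    INSERT INTO left_t VALUES (1, 'a');
    INSERT INTO right_t VALUES (1, 'b');
    """
)

# SELECT * over the join does not raise; both copies of each column come back.
cur = conn.execute(
    "SELECT * FROM left_t JOIN right_t ON left_t.id = right_t.id"
)
column_names = [d[0] for d in cur.description]
rows = cur.fetchall()

# Qualifying with table.column and renaming via AS removes the ambiguity.
cur = conn.execute(
    """
    SELECT left_t.value AS left_value, right_t.value AS right_value
    FROM left_t JOIN right_t ON left_t.id = right_t.id
    """
)
aliased_names = [d[0] for d in cur.description]
aliased_rows = cur.fetchall()

conn.close()
```

Here `rows` holds a single four-column row, and the aliased query returns the two `value` columns under distinct names, which is the behavior I would expect for the hive partition plus parquet field case as well.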