leobiscassi commented on issue #6142: URL: https://github.com/apache/hudi/issues/6142#issuecomment-1198147081
Hi Hudi community,

I'm experiencing a similar issue. For some tables in my data lake, we get the following error when trying to query them:

[16777224] Query failed (#20220727_185609_00434_4n5pr): The column my_column_name_here of table my_tablename_here is declared as type string, but the Parquet file (s3a://bucket/prefix/befb27ee-ee21-4791-95bb-d8aeb521aff9-0_15-22-5118_20220629223504.parquet) declares the column as type INT32

com.facebook.presto.spi.PrestoException: The column my_column_name_here of table my_tablename_here is declared as type string, but the Parquet file (s3a://bucket/prefix/befb27ee-ee21-4791-95bb-d8aeb521aff9-0_15-22-5118_20220629223504.parquet) declares the column as type INT32

**My environment**

hudi: amzn 0.10.1 / amzn 0.11.0 on EMR
presto: 0.267 / 0.272 on EMR

What I've tried so far to fix it:

- Tested more than one Hudi version (0.10.1 and 0.11.0)
- Copied the jar `hudi-presto-bundle.jar` from EMR to the Presto installation
- Followed [this](https://stackoverflow.com/questions/60183579/presto-fails-with-type-mismatch-errors) Stack Overflow thread and tried setting `hive.parquet.use-column-names=true` in the `hive.properties` file on EMR

None of this worked. Does anyone know how to deal with it, or whether it is a bug in the integration?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
