nsivabalan commented on issue #6194: URL: https://github.com/apache/hudi/issues/6194#issuecomment-1212669526
Sorry, I am a bit confused. If I am not wrong, this is your issue: you somehow ended up with duplicates in your Hudi table. You tried to dedup them with hudi-cli, ran into issues, and posted the stacktrace. I gave you the commands I used and showed that they worked for me, but I could not gauge your response to that.

Note that the repair dedup command does not fix duplicates in the table itself. It dumps the deduped records to a separate location as parquet data. You may need to delete the matching entries from Hudi and load the parquet data again. I understand it's not easy, but that's the only option we have.
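The delete-then-reload step described above could look roughly like the sketch below, assuming the Spark datasource writer is used. The helper names (`hudi_write_options`, `drop_hudi_meta_columns`), the table name, the field names, and the paths are all hypothetical placeholders, not something taken from this issue. Only the `hoodie.*` option keys are standard Hudi write configs. Treat it as an outline to adapt and try on a copy of the table first, not a verified fix.

```python
# Sketch only: builds the write options for the two passes
# (1) delete the duplicated keys, (2) upsert the deduped parquet output.

def hudi_write_options(table_name, record_key, partition_field,
                       precombine_field, operation):
    """Return Hudi Spark datasource options for a single write."""
    if operation not in ("delete", "upsert"):
        raise ValueError("operation must be 'delete' or 'upsert'")
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.operation": operation,
        "hoodie.datasource.write.recordkey.field": record_key,
        "hoodie.datasource.write.partitionpath.field": partition_field,
        "hoodie.datasource.write.precombine.field": precombine_field,
    }


def drop_hudi_meta_columns(columns):
    """Drop Hudi metadata columns (prefixed `_hoodie_`) from a column list."""
    return [c for c in columns if not c.startswith("_hoodie_")]


# Hypothetical usage with an existing SparkSession `spark`
# (all names and paths below are placeholders):
#
#   deduped = spark.read.parquet("/tmp/dedup_output")
#   clean = deduped.select(*drop_hudi_meta_columns(deduped.columns))
#
#   delete_opts = hudi_write_options("my_table", "id", "part", "ts", "delete")
#   clean.write.format("hudi").options(**delete_opts) \
#        .mode("append").save("/path/to/hudi/table")
#
#   upsert_opts = hudi_write_options("my_table", "id", "part", "ts", "upsert")
#   clean.write.format("hudi").options(**upsert_opts) \
#        .mode("append").save("/path/to/hudi/table")
```

The delete pass removes records by key, so every copy of a duplicated key is expected to go away before the deduped rows are written back. Whether that holds for a given table depends on its index and partitioning setup, so it is worth checking the record counts after each pass.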
