Hello everyone,

I am struggling to read Parquet files from S3 with Flink Streaming 1.12.2.

I first had some difficulty simply reading local Parquet files. I finally managed that part, though the solution feels dirty:

- I use the readFile function with the ParquetInputFormat abstract class (which is protected), as I could not find a way to use the public ParquetRowInputFormat.
- The open function in ParquetInputFormat uses org.apache.hadoop.conf.Configuration, and I am not sure which import to add. It seems the flink-parquet library imports the dependency from hadoop-common, but the dependency is marked as provided. The docs only show usage of flink-parquet from Flink SQL, so I am under the impression that this might not work in the streaming case without extra code. I 'solved' this by adding a dependency on hadoop-common. We did something similar to write Parquet data to S3.
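
To make the hadoop-common workaround concrete, the dependency setup described above might look roughly like the following Maven fragment. This is only a sketch under assumptions: the Scala suffix `_2.11` and the `hadoop.version` property are placeholders that are not stated in this message, and a Gradle or sbt build would express the same thing differently.

```xml
<!-- Sketch only: artifact suffix and hadoop.version are assumptions, not taken from the actual build. -->
<dependencies>
  <!-- Parquet input/output formats for Flink 1.12.2 -->
  <dependency>
    <groupId>org.apache.flink</groupId>
    <artifactId>flink-parquet_2.11</artifactId>
    <version>1.12.2</version>
  </dependency>

  <!-- Added explicitly because flink-parquet marks its Hadoop dependency as provided;
       this supplies org.apache.hadoop.conf.Configuration at compile time and runtime. -->
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>${hadoop.version}</version>
  </dependency>
</dependencies>
```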
Now, when trying to run the application to read from S3, I get an exception with root cause:

```
Caused by: org.apache.hadoop.fs.UnsupportedFileSystemException: No FileSystem for scheme "s3"
```

I guess hadoop-common does not know about the flink-s3-fs-hadoop plugin setup, but I have run out of ideas on how to solve this.

I also noticed there were some changes to flink-parquet in Flink 1.14, but I had issues simply reading data with that version (I did not investigate it deeply).

Many thanks for any help.

Alexandre Montecucco