Ziheng Wang created FLINK-27025:
-----------------------------------

             Summary: Cannot read parquet file, after putting the jar in the right place with right permissions
                 Key: FLINK-27025
                 URL: https://issues.apache.org/jira/browse/FLINK-27025
             Project: Flink
          Issue Type: Bug
          Components: API / Python, Table SQL / API
    Affects Versions: 1.14.0
            Reporter: Ziheng Wang


I am using Flink with the SQL API on AWS EMR. Queries on CSV files run without 
problems.

However, when I try to run queries on Parquet files, I get this error:
Caused by: java.io.StreamCorruptedException: unexpected block data

I have put flink-sql-parquet_2.12-1.14.0.jar under /usr/lib/flink/lib on the 
master node of the EMR cluster. Flink does seem to pick it up, because without 
the jar the error is different (it says it can't understand the Parquet 
source). The jar has full 777 permissions and is owned by the same user as all 
the other jars in that directory.
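
The jar placement described above could be double-checked with a small script 
along these lines. This is only a sketch: the helper name describe_jars is 
hypothetical, and it reports nothing beyond file names and permission bits in 
a given directory.

```python
import stat
from pathlib import Path


def describe_jars(lib_dir, name_fragment=""):
    """Return (file name, octal permission string) for each jar in lib_dir.

    name_fragment optionally restricts the result to jars whose file name
    contains that substring. (Hypothetical helper, for illustration only.)
    """
    results = []
    for jar in sorted(Path(lib_dir).glob("*.jar")):
        if name_fragment in jar.name:
            # Keep only the permission bits, e.g. 0o777 -> "777".
            mode = stat.S_IMODE(jar.stat().st_mode)
            results.append((jar.name, format(mode, "03o")))
    return results
```

Calling describe_jars("/usr/lib/flink/lib", "parquet") on the master node 
would list the Parquet jar with its mode. Running the same check on the other 
nodes may also be worthwhile, on the assumption that the jar has to be visible 
wherever the job actually executes.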

I tried passing a folder name as the Parquet source as well as a single Parquet 
file; neither works.
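
For reference, a Parquet source of the kind described above is typically 
declared through the filesystem connector. The sketch below only builds the 
DDL string; the helper name, table name, columns, and path are placeholders 
and not the actual schema or location from this report.

```python
def parquet_table_ddl(table_name, columns, path):
    """Build a CREATE TABLE statement for a filesystem Parquet source.

    columns is a list of (name, sql_type) pairs. All arguments are
    illustrative placeholders, not values from the report.
    """
    cols = ",\n  ".join(f"`{name}` {sql_type}" for name, sql_type in columns)
    return (
        f"CREATE TABLE `{table_name}` (\n"
        f"  {cols}\n"
        f") WITH (\n"
        f"  'connector' = 'filesystem',\n"
        f"  'path' = '{path}',\n"
        f"  'format' = 'parquet'\n"
        f")"
    )
```

With PyFlink, the resulting string could be passed to 
TableEnvironment.execute_sql, with the path pointing either at a directory or 
at a single file.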



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
