Filtering lines in parquet

Avi Levi Wed, 10 Mar 2021 01:50:53 -0800

Hi all,
I am trying to filter lines from Parquet files. The problem is that the
files have different schemas; however, the field that I am using to filter
exists in all of them.
In Spark this is quite straightforward:


val filtered = rawsDF.filter(col("id") =!= "123")
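
To be concrete, the Spark side looks roughly like the sketch below. The
object name, the dropId helper and both paths are placeholders for
illustration only, and mergeSchema is just one way of reading files whose
schemas differ (the output then carries the merged schema, not each file's
original one).

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object FilterParquetSketch {
  // Keep every row whose "id" column differs from the given value.
  // Rows with a null id are dropped too, because null =!= x evaluates to null.
  def dropId(df: DataFrame, id: String): DataFrame =
    df.filter(col("id") =!= id)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("filter-parquet-sketch")
      .master("local[*]")
      .getOrCreate()

    // Hypothetical input directory holding Parquet files with different schemas.
    val rawsDF = spark.read
      .option("mergeSchema", "true")
      .parquet("/tmp/input-parquet")

    // Hypothetical output directory.
    dropId(rawsDF, "123").write.parquet("/tmp/filtered-parquet")

    spark.stop()
  }
}
```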

I tried to do it in Flink by extending ParquetInputFormat, but in this
case I need to supply the schema (MessageType) and implement the convert
method, which I want to avoid since I do not want to convert the line (I
want to write it as is to another Parquet file).

Any ideas?

Cheers
Avi
