Hey Hyukjin,
Sorry that I missed the JIRA ticket. Thanks for bringing this issue up
here, and for your detailed investigation.
From my side, I think this is a bug in Parquet. Parquet was designed to
support schema evolution. When scanning a Parquet file, if a column exists in
the requested schema but missin
When both mergeSchema and predicate filter pushdown are enabled, this fails
because Parquet filters are pushed down regardless of the schema of each
split (or rather, each file).
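
To make the failure mode concrete, here is a toy sketch in plain Python. It
is not Parquet or Spark code: the "files", schemas, and function names are
all hypothetical, and it only assumes that a column missing from a file
should behave as if it were all nulls. The strict variant mimics a filter
pushed down without checking the file's own schema; the tolerant variant
checks first.

```python
# Toy model only: each "file" is a (schema, rows) pair.
# Nothing here is a real Parquet or Spark API.

def strict_filter(file_schema, rows, column, predicate):
    """Apply a pushed-down filter, failing if the file lacks the column."""
    if column not in file_schema:
        raise KeyError(f"column {column!r} not in file schema {file_schema}")
    return [r for r in rows if predicate(r[column])]


def tolerant_filter(file_schema, rows, column, predicate):
    """Treat a column missing from the file as all nulls: no row matches."""
    if column not in file_schema:
        return []
    return [r for r in rows if r[column] is not None and predicate(r[column])]


def scan(files, filter_fn, column, predicate):
    """Scan every file, applying the same filter to each one."""
    out = []
    for schema, rows in files:
        out.extend(filter_fn(schema, rows, column, predicate))
    return out


# Two files written with different schemas; the merged schema is [a, b].
file1 = (["a"], [{"a": 1}, {"a": 2}])
file2 = (["a", "b"], [{"a": 3, "b": 10}, {"a": 4, "b": 20}])
```

Scanning both files with strict_filter on column "b" raises on file1, while
tolerant_filter simply returns the matching rows from file2.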
Dominic Ricard reported this issue
(https://issues.apache.org/jira/browse/SPARK-11103).
Even though this would work okay by setting spark.sq