Hello friends,

I've encountered a bug where Spark silently corrupts data when reading from a Parquet Hive table whose table schema does not match the file schema. I'd like to take a shot at adding some extra validation to the code to handle this corner case, and I was wondering if anyone has suggestions for where to start looking in the Spark code.
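In case it helps, here is roughly the shape of the mismatch I'm hitting, as a sketch for spark-shell with Hive support. The path, table name, and the specific INT-vs-BIGINT mismatch are just illustrative; the general problem is any disagreement between the metastore schema and the Parquet file schema:

```scala
// Illustrative path and table name only.
val path = "/tmp/parquet_mismatch_demo"

// Write Parquet files where `id` is a 32-bit INT.
spark.range(10)
  .selectExpr("CAST(id AS INT) AS id")
  .write.mode("overwrite").parquet(path)

// Declare an external Hive table over the same files, but with `id`
// as BIGINT, so the metastore schema and the file schema disagree.
spark.sql(s"""
  CREATE EXTERNAL TABLE parquet_mismatch_demo (id BIGINT)
  STORED AS PARQUET
  LOCATION '$path'
""")

// On the version I'm running, a read like this can come back with
// wrong or null values instead of failing loudly -- that's the
// silent corruption I'd like to catch with a validation.
spark.table("parquet_mismatch_demo").show()
```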
Cheers,
Andrew