Re: SPARK-22267 issue: Spark SQL incorrectly reads ORC file when column order is different

2017-11-15 Thread Mark Petruska
Hi Dongjoon, Thanks for the info. Unfortunately I did not find any means to fix the issue without forcing CONVERT_METASTORE_ORC or changing the ORC reader implementation. Closing the PR, as it was only used to demonstrate the root cause. Best regards, Mark On Tue, Nov 14, 2017 at 6:58 PM, Dongjo

Re: SPARK-22267 issue: Spark SQL incorrectly reads ORC file when column order is different

2017-11-14 Thread Dongjoon Hyun
Hi, Mark. That is one of the reasons why I left it behind from the previous PR (below) and I'm focusing on the second approach: use OrcFileFormat with convertMetastoreOrc. https://github.com/apache/spark/pull/19470 [SPARK-14387][SPARK-16628][SPARK-18355][SQL] Use Spark schema to read ORC table in

SPARK-22267 issue: Spark SQL incorrectly reads ORC file when column order is different

2017-11-14 Thread Mark Petruska
Hi, I'm very new to Spark development, and would like to get guidance from more experienced members. Sorry, this email will be long as I try to explain the details. I started to investigate the issue SPARK-22267 and added some test cases to highlight
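
The symptom named in the subject line can be illustrated with a small toy model. This is only a sketch and not Spark's or ORC's reader code: the helper names `read_by_position` and `read_by_name`, the column names, and the data are all hypothetical. It also assumes the wrong result comes from pairing table columns with file columns by position rather than by name, which the visible part of the thread does not confirm.

```python
# Toy model only: hypothetical helpers, not Spark or ORC APIs.

# Order in which the columns were written to the (imaginary) file.
file_columns = ["c2", "c1"]
# Each tuple follows file_columns order, so the first row has c2=20, c1=10.
file_rows = [(20, 10), (40, 30)]
# Order in which the table schema declares the same columns.
table_columns = ["c1", "c2"]


def read_by_position(rows, table_cols):
    """Pair the i-th table column with the i-th file value, ignoring names."""
    return [dict(zip(table_cols, row)) for row in rows]


def read_by_name(rows, file_cols, table_cols):
    """Look up each table column by name in the file's own column list."""
    index = {name: i for i, name in enumerate(file_cols)}
    return [{col: row[index[col]] for col in table_cols} for row in rows]


positional = read_by_position(file_rows, table_columns)
by_name = read_by_name(file_rows, file_columns, table_columns)

print("positional:", positional)  # values end up under the wrong column names
print("by name:   ", by_name)     # values match the names they were written with
```

In this model the positional reader silently swaps the two columns whenever the file order and the table order differ, while the name-based reader is unaffected by the order.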