Spark SQL unit test failed when sort-based shuffle is enabled

Shao, Saisai Mon, 11 Aug 2014 18:17:32 -0700

Hi folks,

I ran into several Spark SQL unit test failures when sort-based shuffle is
enabled. It seems Spark SQL uses GenericMutableRow, which leaves every entry in
ExternalSorter's internal buffer referring to the same object. I guess
GenericMutableRow reuses a single mutable object to represent different rows.
This is OK for hash-based shuffle, because each row is written directly to
file, but it fails in sort-based shuffle, because the objects are stored in a
buffer so they can be sorted. I just opened a JIRA ticket for this; details are
in https://issues.apache.org/jira/browse/SPARK-2967.
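
To make the suspected aliasing concrete, here is a minimal sketch in plain
Python. It is not Spark code: MutableRow, row_iterator, write_through and
buffer_then_sort are hypothetical names made up for illustration, and the toy
only assumes that the upstream iterator hands back one reused mutable object
per record, as guessed above.

```python
# Toy model of the aliasing problem; none of these names are Spark APIs.

class MutableRow:
    """Hypothetical stand-in for a reusable mutable row object."""

    def __init__(self, value=None):
        self.value = value

    def copy(self):
        # Return an independent snapshot of the current contents.
        return MutableRow(self.value)


def row_iterator(values):
    """Yield the SAME MutableRow instance for every record, mutated in place."""
    row = MutableRow()
    for v in values:
        row.value = v
        yield row


def write_through(rows):
    """Hash-style path: each row is read out as soon as it arrives."""
    return [r.value for r in rows]


def buffer_then_sort(rows, copy_rows):
    """Sort-style path: rows are held in a buffer, then sorted."""
    buffer = [r.copy() if copy_rows else r for r in rows]
    buffer.sort(key=lambda r: r.value)
    return [r.value for r in buffer]


data = [3, 1, 2]
streamed = write_through(row_iterator(data))                    # [3, 1, 2]
aliased = buffer_then_sort(row_iterator(data), copy_rows=False)  # [2, 2, 2]
copied = buffer_then_sort(row_iterator(data), copy_rows=True)    # [1, 2, 3]
print(streamed, aliased, copied)
```

In this toy, the buffered path without copying ends up with three references
to one object holding the last value, while copying each row before buffering
gives the expected sorted output. Whether a copy is the right fix in Spark
itself is a separate question.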


Any suggestions?

Thanks
Jerry
