I found this discrepancy when writing unit tests for my project. The
expectation was that the returned type would match that of the input data.
Although it's easy to work around, it just felt a bit odd. Is there a better
reason to return ArrayBuffer?
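
For what it's worth, the workaround is simply to compare contents as a Seq
rather than asserting on the concrete type. A rough sketch of what I mean
(table and column names are made up for illustration):

    // Compare collection contents, not the concrete runtime type.
    // Seq equality holds across implementations (ArrayBuffer, List, ...).
    val row = hiveContext.hql("SELECT arr FROM my_table").collect().head
    val result = row(0).asInstanceOf[Seq[Int]]  // ArrayBuffer is a Seq, so this works
    assert(result == Seq(1, 2, 3))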


From: Michael Armbrust <mich...@databricks.com>
Date: Wednesday, August 27, 2014 at 5:21 PM
To: Du Li <l...@yahoo-inc.com>
Cc: "user@spark.apache.org" <user@spark.apache.org>
Subject: Re: SparkSQL returns ArrayBuffer for fields of type Array

Arrays in the JVM are also mutable.  However, you should not be relying on the 
exact type here.  The only promise is that you will get back something of type 
Seq[_].
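
For example, code that consumes such a result might match on Seq[_] and
ignore the implementation entirely (the column name here is hypothetical):

    // Rely only on the documented contract: the value is some Seq[_].
    hiveContext.hql("SELECT arr FROM my_table").collect().foreach { row =>
      row(0) match {
        case xs: Seq[_] => println(xs.mkString(", "))  // any Seq implementation is fine
        case other      => sys.error("unexpected type: " + other.getClass)
      }
    }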


On Wed, Aug 27, 2014 at 4:27 PM, Du Li <l...@yahoo-inc.com> wrote:
Hi, Michael.

I used HiveContext to create a table with a field of type Array. However, in
the hql results, this field was returned as type ArrayBuffer, which is mutable.
Would it make more sense for it to be an Array?
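
To illustrate what I'm seeing, here is a simplified reproduction (the actual
table, column, and data in my test differ):

    // Hypothetical Hive table with an ARRAY<INT> column, queried via hql.
    import org.apache.spark.sql.hive.HiveContext
    val hiveContext = new HiveContext(sc)  // sc: an existing SparkContext
    // Assumes my_table exists and has at least one row.
    val row = hiveContext.hql("SELECT arr FROM my_table").collect().head
    println(row(0).getClass.getName)  // scala.collection.mutable.ArrayBuffer, not Array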

The Spark version in my test is 1.0.2. I haven't tested it with SQLContext or
a newer version of Spark yet.

Thanks,
Du


