Thanks Michael, opened this: https://issues.apache.org/jira/browse/SPARK-4520

On Thu, Nov 20, 2014 at 2:59 PM, Michael Armbrust wrote:
> Can you open a JIRA?

Can you open a JIRA?

On Thu, Nov 20, 2014 at 10:39 AM, Sadhan Sood wrote:
> I am running on master, pulled yesterday I believe but saw the same issue
> with 1.2.0

I am running on master, pulled yesterday I believe but saw the same issue
with 1.2.0.

On Thu, Nov 20, 2014 at 1:37 PM, Michael Armbrust wrote:
> Which version are you running on again?

Which version are you running on again?

On Thu, Nov 20, 2014 at 8:17 AM, Sadhan Sood wrote:
> Also attaching the parquet file if anyone wants to take a further look.

Also attaching the parquet file if anyone wants to take a further look.

On Thu, Nov 20, 2014 at 8:54 AM, Sadhan Sood wrote:
> So, I am seeing this issue with spark sql throwing an exception when
> trying to read selective columns from a thrift parquet file and also when
> caching them …

So, I am seeing this issue with spark sql throwing an exception when trying
to read selective columns from a thrift parquet file and also when caching
them.

On some further digging, I was able to narrow it down to at least one
particular column type: map<…> to be causing this issue.

To reproduce this …
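
A minimal sketch of the failing pattern, assuming Spark 1.2-era APIs; the
table and column names here (thrift_parquet_table, tags) are hypothetical
stand-ins, not the actual names from the report:

import org.apache.spark.sql.hive.HiveContext

val sqlContext = new HiveContext(sc) // sc: the spark-shell SparkContext

// `thrift_parquet_table` is a Hive table backed by the thrift-written
// Parquet files; `tags` stands in for the map-typed column narrowed down
// above. Selecting the map column triggers the exception, while queries
// that skip it succeed.
sqlContext.sql("SELECT tags FROM thrift_parquet_table LIMIT 10").collect()
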
Hi Cheng,

I tried reading the parquet file (on which we were getting the exception)
through parquet-tools and it is able to dump the file, and I can read the
metadata, etc. I also loaded the file through a hive table and can run a
table scan query on it as well. Let me know if I can do more to help re…
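
The same footer check can also be done programmatically; a sketch assuming
the pre-Apache parquet-mr packages (parquet.hadoop.*) bundled with Spark at
the time, with a hypothetical file path:

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import parquet.hadoop.ParquetFileReader

val conf = new Configuration()
val file = new Path("/tmp/bad-file.parquet") // hypothetical path

// Reads the file footer, much as `parquet-tools meta` does; if this
// parses, the schema and row-group metadata are structurally intact.
val footer = ParquetFileReader.readFooter(conf, file)
println(footer.getFileMetaData.getSchema)
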
(Forgot to cc user mail list)

On 11/16/14 4:59 PM, Cheng Lian wrote:

Hey Sadhan,

Thanks for the additional information, this is helpful. Seems that
some Parquet internal contract was broken, but I'm not sure whether
it's caused by Spark SQL or Parquet, or even maybe the Parquet file
itself w…

Hi Cheng,

Thanks for your response. Here is the stack trace from yarn logs: …

Hi Sadhan,

Could you please provide the stack trace of the
ArrayIndexOutOfBoundsException (if any)? The reason why the first
query succeeds is that Spark SQL doesn't bother reading all data from
the table to give COUNT(*). In the second case, however, the whole
table is asked to be cached…
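
The two cases Cheng contrasts, as a sketch (hypothetical table name,
Spark 1.2-era API):

val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)

// Succeeds: COUNT(*) is answered without materializing the column values.
sqlContext.sql("SELECT COUNT(*) FROM parquet_table").collect()

// Fails: caching builds in-memory columnar buffers for every column, so
// the whole table must be scanned and the exception surfaces.
sqlContext.cacheTable("parquet_table")
sqlContext.sql("SELECT COUNT(*) FROM parquet_table").collect() // materializes the cache
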
While testing SparkSQL on a bunch of parquet files (basically used to be a
partition for one of our hive tables), I encountered this error:

import org.apache.spark.sql.SchemaRDD
import org.apache.hadoop.fs.FileSystem
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
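
For context, a rough sketch of how such a partition might be read with the
Spark 1.2-era API; the partition path and table name are hypothetical:

val sqlContext = new org.apache.spark.sql.hive.HiveContext(sc)

// Load the partition directory's Parquet files directly, then query it.
val partition: SchemaRDD = sqlContext.parquetFile("/warehouse/our_table/dt=2014-11-20")
partition.registerTempTable("partition_data")
sqlContext.sql("SELECT * FROM partition_data LIMIT 10").collect()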