: user
Date: 2016/10/25 17:33
Subject:Re: Spark SQL is slower when DataFrame is cache in Memory
Hi Kazuaki,
I print a debug log message right before I call collect, and compare its
timestamp against the job start log (which is available when debug logging
is turned on).
Anyway, I test that in
>
> Best Regards,
> Kazuaki Ishizaki
>
>
>
> From:Chin Wei Low
> To:Kazuaki Ishizaki/Japan/IBM@IBMJP
> Cc:user@spark.apache.org
> Date:2016/10/10 11:33
>
> Subject: Re: Spark SQL is slower when DataFrame is cache in Memory
>
Subject: Re: Spark SQL is slower when DataFrame is cache in Memory
Hi Ishizaki san,
Thanks for the reply.
So, when I pre-cache the dataframe, the cache is being used during the job
execution.
Actually there are 3 events:
1. call res.collect
2. job started
3. job completed
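The three events above can be turned into numbers with a small helper. This is only a minimal sketch: CollectTimeline is a hypothetical name, not a Spark API, and the timestamps in the example are made up. It assumes the first timestamp is taken from the log line printed right before res.collect and the other two from the job start and job completion logs.

```scala
// Minimal sketch. CollectTimeline is a hypothetical helper, not a Spark API.
// It holds one timestamp (epoch milliseconds) for each of the three events.
final case class CollectTimeline(callMs: Long, jobStartMs: Long, jobEndMs: Long) {
  require(callMs <= jobStartMs && jobStartMs <= jobEndMs,
    "timestamps must follow the order: call, job started, job completed")

  // Event 1 -> event 2: time between calling collect and the job being started.
  def beforeJobMs: Long = jobStartMs - callMs

  // Event 2 -> event 3: time the job itself took.
  def jobMs: Long = jobEndMs - jobStartMs

  // Event 1 -> event 3: what the caller of collect experiences.
  def totalMs: Long = jobEndMs - callMs
}

object CollectTimelineExample {
  def main(args: Array[String]): Unit = {
    // Made-up timestamps for illustration only; real values would come from
    // the log line printed before res.collect and from the job start/end logs.
    val t = CollectTimeline(callMs = 1000L, jobStartMs = 1400L, jobEndMs = 2000L)
    println(s"before job: ${t.beforeJobMs} ms, job: ${t.jobMs} ms, total: ${t.totalMs} ms")
  }
}
```

Comparing beforeJobMs for the cached and the non-cached run shows whether the extra time falls before the job starts or inside the job itself.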
I am concerned
;)
> res.explain(true)
> res.collect()
>
> Am I misunderstanding something?
>
> Best Regards,
> Kazuaki Ishizaki
>
>
>
> From:Chin Wei Low
> To:Kazuaki Ishizaki/Japan/IBM@IBMJP
> Cc: user@spark.apache.org
> Date:
Date: 2016/10/07 20:06
Subject: Re: Spark SQL is slower when DataFrame is cache in Memory
Hi Ishizaki san,
So there is a gap between the call to res.collect
and the moment I see this log: spark.SparkContext: Starting job: collect at
:26
What you mean is that, during this time, Spark has already started to
> Best Regards,
> Kazuaki Ishizaki
>
>
>
> From:Chin Wei Low
> To:user@spark.apache.org
> Date: 2016/10/07 13:05
> Subject:Spark SQL is slower when DataFrame is cache in Memory
> --
>
>
>
> Hi,
>
Subject: Spark SQL is slower when DataFrame is cache in Memory
Hi,
I am using Spark 1.6.0. I have a Spark application that creates and caches
(in memory) DataFrames (around 50+, some on a single parquet file and
some on a folder with a few parquet files) with the following code:
val df = sqlContext.read.parquet
df.persist
df.count
I union them to 3 DataFram