Thanks, it works now.

-Simon


On Thu, Jun 5, 2014 at 10:47 AM, Nick Pentreath <nick.pentre...@gmail.com>
wrote:

> Have you set the persistence level of the RDD to MEMORY_ONLY_SER (
> http://spark.apache.org/docs/latest/programming-guide.html#rdd-persistence)?
> If you're calling cache(), the default persistence level is MEMORY_ONLY,
> so spark.rdd.compress will have no impact.
>
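> For instance, assuming an RDD named myRdd (a hypothetical name), a minimal
> sketch of switching to serialized caching might look like this:
>
> import org.apache.spark.storage.StorageLevel
>
> // cache() is shorthand for persist(StorageLevel.MEMORY_ONLY), which keeps
> // deserialized objects in memory, so spark.rdd.compress never kicks in.
> // MEMORY_ONLY_SER stores serialized bytes, which compression can shrink.
> myRdd.persist(StorageLevel.MEMORY_ONLY_SER)
> myRdd.count()  // force evaluation so the partitions actually get cached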
>
> On Thu, Jun 5, 2014 at 4:41 PM, Xu (Simon) Chen <xche...@gmail.com> wrote:
>
>> I have a working set larger than the available memory, so I was hoping to
>> turn on RDD compression in order to store more in memory. Strangely, it
>> made no difference: the number of cached partitions, the fraction cached,
>> and the size in memory all remain the same. Any ideas?
>>
>> I confirmed that RDD compression was off for the first test and on for
>> the second:
>>
>> scala> sc.getConf.getAll foreach println
>> ...
>> (spark.rdd.compress,true)
>> ...
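>>
>> For reference, a minimal sketch of how that flag can be set when the
>> context is built (the app name and master URL here are placeholders):
>>
>> import org.apache.spark.{SparkConf, SparkContext}
>>
>> // spark.rdd.compress must be set on the SparkConf before the
>> // SparkContext is created; it only affects serialized storage levels.
>> val conf = new SparkConf()
>>   .setAppName("MyApp")
>>   .setMaster("local[*]")
>>   .set("spark.rdd.compress", "true")
>> val sc = new SparkContext(conf)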
>>
>> I haven't tried lzo vs. snappy, but my guess is that either one should
>> provide at least some benefit.
>>
>> Thanks.
>> -Simon
>>
>>
>
