I would compare the Spark UI metrics for both cases and look for any
differences (number of partitions, number of spills, etc.).
Why can't you make the REPL consistent with the Zeppelin Spark version?
The RC might have issues...
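
To rule out configuration drift between the two environments, something like
this (a rough sketch, run in both the spark-shell and the Zeppelin
interpreter, then diffed line by line) would show whether the effective
settings actually match:

  // Print the effective Spark configuration, sorted, so the two REPLs'
  // outputs can be compared directly.
  sc.getConf.getAll.sortBy(_._1).foreach { case (k, v) => println(s"$k=$v") }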




On 19 August 2015 at 14:42, Rick Moritz <rah...@gmail.com> wrote:

> No, the setup is one driver with 32g of memory and three executors, each
> with 8g of memory, in both cases. No core count has been specified, so it
> should default to a single core (though I've seen the YARN-owned JVMs
> wrapping the executors take up to 3 cores in top). That is, unless, as I
> suggested, there are different defaults for the two means of job submission
> that come into play in a non-transparent fashion (i.e. not visible in
> SparkConf).
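>
> A quick cross-check, run in both REPLs, would confirm what was actually
> allocated (a rough sketch):
>
>   // How many block managers (driver + executors) have registered, and how
>   // much storage memory each one reports to the driver.
>   sc.getExecutorMemoryStatus.foreach { case (exec, (max, free)) =>
>     println(s"$exec: maxMem=${max >> 20} MB, freeMem=${free >> 20} MB")
>   }
>   // Whether an explicit executor core count is set at all (None = default).
>   println(sc.getConf.getOption("spark.executor.cores"))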
>
> On Wed, Aug 19, 2015 at 1:36 PM, Igor Berman <igor.ber...@gmail.com>
> wrote:
>
>> any differences in number of cores, memory settings for executors?
>>
>>
>> On 19 August 2015 at 09:49, Rick Moritz <rah...@gmail.com> wrote:
>>
>>> Dear list,
>>>
>>> I am observing a very strange difference in behaviour between a Spark
>>> 1.4.0-rc4 REPL (locally compiled with Java 7) and a Spark 1.4.0 Zeppelin
>>> interpreter (compiled with Java 6 and sourced from Maven Central).
>>>
>>> The workflow loads data from Hive, applies a number of transformations
>>> (including quite a lot of shuffle operations) and then presents an enriched
>>> dataset. The code (and the resulting DAGs) is identical in each case.
>>>
>>> The following particularities are noted:
>>> Importing the HiveRDD and caching it yields identical results on both
>>> platforms.
>>> Applying case classes leads to a 2-2.5 MB increase in dataset size per
>>> partition (excepting empty partitions).
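>>>
>>> One way to compare those cached sizes programmatically in both REPLs (a
>>> rough sketch, using the same storage info the UI displays):
>>>
>>>   // Per-RDD cached size and partition count, as on the Storage tab.
>>>   sc.getRDDStorageInfo.foreach { info =>
>>>     println(s"${info.name}: ${info.numCachedPartitions} cached partitions, " +
>>>       s"${info.memSize >> 20} MB in memory")
>>>   }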
>>>
>>> Writing the shuffles, however, shows a much more significant difference:
>>>
>>> Zeppelin:
>>> *Total Time Across All Tasks:* 2.6 min
>>> *Input Size / Records:* 2.4 GB / 7314771
>>> *Shuffle Write:* 673.5 MB / 7314771
>>>
>>> vs
>>>
>>> Spark-shell:
>>> *Total Time Across All Tasks:* 28 min
>>> *Input Size / Records:* 3.6 GB / 7314771
>>> *Shuffle Write:* 9.0 GB / 7314771
>>>
>>> This is one of the early stages, which reads from a cached partition and
>>> then feeds into a join stage. The later stages show similar behaviour,
>>> producing excessive shuffle spills.
>>>
>>> Quite often the excessive shuffle volume will lead to massive shuffle
>>> spills which ultimately kill not only performance, but the actual executors
>>> as well.
>>>
>>> I have examined the Environment tab in the Spark UI and identified no
>>> notable difference besides FAIR (Zeppelin) vs. FIFO (spark-shell)
>>> scheduling mode. I fail to see how this would impact shuffle writes in
>>> such a drastic way, since it should operate at the inter-job level, while
>>> this happens at the inter-stage level.
>>>
>>> I was somewhat suspicious of compression or serialization perhaps playing
>>> a role, but the SparkConf points to those being set to the defaults. Also,
>>> Zeppelin's interpreter adds no relevant additional default parameters.
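>>>
>>> To make that check explicit, something like this (a rough sketch) prints
>>> the shuffle-related settings actually in effect in each REPL, with
>>> "<default>" meaning unset:
>>>
>>>   Seq("spark.serializer", "spark.shuffle.compress",
>>>       "spark.shuffle.spill.compress", "spark.io.compression.codec",
>>>       "spark.shuffle.manager").foreach { k =>
>>>     println(s"$k = ${sc.getConf.getOption(k).getOrElse("<default>")}")
>>>   }
>>>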
>>> I performed a diff between rc4 (which was later released as 1.4.0) and
>>> 1.4.0 and, as expected, there were no differences besides a single class
>>> (remarkably, a shuffle-relevant one:
>>> /org/apache/spark/shuffle/unsafe/UnsafeShuffleExternalSorter.class)
>>> differing in its binary representation, due to being compiled with Java 7
>>> instead of Java 6. The decompiled sources of the two are again identical.
>>>
>>> I may attempt, as a next step, to simply replace that file in the packaged
>>> jar, to ascertain that there is indeed no difference between the two
>>> versions, but I would consider it a major bug if a simple compiler change
>>> led to this kind of issue.
>>>
>>> I am also open to any other ideas, in particular for verifying that the
>>> same compression/serialization is indeed happening, and for ways to
>>> determine what exactly is written into these shuffles -- currently I only
>>> know that the tuples are bigger (or smaller) than they ought to be. The
>>> Zeppelin-obtained results do at least appear to be consistent, thus the
>>> suspicion is that there is an issue with the process launched from
>>> spark-shell. I will also attempt to build a Spark job and spark-submit it
>>> using different Spark binaries to further explore the issue.
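>>>
>>> As a first stab at the tuple-size question, something like the following
>>> (a rough sketch; 'enriched' stands in for whatever RDD feeds the suspect
>>> shuffle) should show how large individual records become under the
>>> serializer actually in use:
>>>
>>>   // Serialize a small sample with the configured serializer and print
>>>   // per-record byte sizes, to compare between the two environments.
>>>   val ser = org.apache.spark.SparkEnv.get.serializer.newInstance()
>>>   enriched.take(20).foreach { rec =>
>>>     println(s"${ser.serialize(rec).limit()} bytes")
>>>   }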
>>>
>>> Best Regards,
>>>
>>> Rick Moritz
>>>
>>> PS: I already tried to send this mail yesterday, but it never made it
>>> onto the list, as far as I can tell -- I apologize should anyone receive
>>> this as a second copy.
>>>
>>>
>>
>
