Guys,
Don't the TaskScheduler and DAGScheduler reside in the SparkContext? So the
debug configs need to be set in the JVM where the SparkContext is running,
i.e. the driver? [1]
But yes, I agree: if you really need to check the actual task execution, you
need to set those configs on the executors as well [2]
[1]
https://jaceklask
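A rough sketch of both settings (the app name and the JDWP ports 5005/5006
are just placeholders): the driver-side option reaches the JVM hosting the
SparkContext and its schedulers, the executor-side option reaches the JVMs
that actually run the tasks.

import org.apache.spark.{SparkConf, SparkContext}

// NOTE: in client mode the driver JVM has already started before SparkConf
// is read, so pass the driver option via spark-submit's
// --driver-java-options instead of setting it here.
val conf = new SparkConf()
  .setAppName("remote-debug-sketch")
  .set("spark.driver.extraJavaOptions",
    "-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5005")
  .set("spark.executor.extraJavaOptions",
    "-agentlib:jdwp=transport=dt_socket,server=y,suspend=n,address=5006")
val sc = new SparkContext(conf)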
Hi Yash,
Yes, AFAIK, that is the expected behavior of the Overwrite mode.
I think you can use the following approaches if you want to perform a job
on each partition.
[1] foreachPartition on a DataFrame:
https://github.com/apache/spark/blob/branch-1.6/sql/core/src/main/scala/org/apache/spark/sql/DataFr
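A minimal sketch of that approach (the target table, its (id, name) schema
and the JDBC URL are illustrative placeholders):

import java.sql.DriverManager
import org.apache.spark.sql.{DataFrame, Row}

// Run a custom job once per partition instead of relying on the writer's
// Overwrite behaviour: one JDBC connection per partition, rows written in
// the order they appear within that partition.
def writePerPartition(df: DataFrame, jdbcUrl: String): Unit = {
  df.foreachPartition { rows: Iterator[Row] =>
    val conn = DriverManager.getConnection(jdbcUrl)
    try {
      val stmt = conn.prepareStatement("INSERT INTO target (id, name) VALUES (?, ?)")
      rows.foreach { row =>
        stmt.setInt(1, row.getInt(0))
        stmt.setString(2, row.getString(1))
        stmt.executeUpdate()
      }
      stmt.close()
    } finally {
      conn.close()
    }
  }
}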
Hi Maciej,
Thank you for your reply.
I have 2 queries.
1. I understand your explanation. But in my experience, when I check the
final RDBMS table, I see that the results follow the expected order, without
any issue. Is this just a coincidence?
2. I was looking into this further. So, say I run
of 4 partitions, is
non-deterministic)? That was why I was surprised to see that the final
results are in the same order.
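One way to check which partition each row comes from, rather than guessing
(a rough sketch, assuming the DataFrame is called df):

import org.apache.spark.sql.Row

// Tag every row with the index of the partition it lives in, then bring the
// result back to the driver. collect() assembles results by partition index,
// so the printed order reflects partition order, not task completion order.
val tagged = df.rdd.mapPartitionsWithIndex { (idx: Int, rows: Iterator[Row]) =>
  rows.map(row => (idx, row))
}
tagged.collect().foreach { case (idx, row) => println(s"partition=$idx row=$row") }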
On Tue, Nov 22, 2016 at 5:24 PM, Maciej Szymkiewicz [via Apache Spark
Developers List] wrote:
> On 11/22/2016 12:11 PM, nirandap wrote:
>
> Hi Maciej,
>