Hi,

My cluster (standalone deployment) consisting of 3 worker nodes was in the
middle of computations when I added one more worker node. I can see that the
new worker is registered with the master and that my job actually gets one
more executor. I have configured the default parallelism as 12, and thus I
see that each of the three nodes holds 4 RDD blocks. I expected that by
adding one more node those partitions would be redistributed across 4 nodes
instead of 3, but I don't see that happening. Even though all partitions are
cached in the memory of those 3 nodes and most probably have their locations
or preferred locations known, I'd still expect the partitions to be
reassigned. Are my expectations incorrect?
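
For reference, here is a minimal sketch of the kind of setup I described
above (the app name, master URL, and data are placeholders, not my actual
job):

  import org.apache.spark.{SparkConf, SparkContext}
  import org.apache.spark.storage.StorageLevel

  object CachedRddExample {
    def main(args: Array[String]): Unit = {
      val conf = new SparkConf()
        .setAppName("cached-rdd-example")          // placeholder app name
        .setMaster("spark://master-host:7077")     // standalone master URL (placeholder host)
        .set("spark.default.parallelism", "12")    // 12 partitions -> 4 per node on 3 workers

      val sc = new SparkContext(conf)

      // An RDD with 12 partitions, cached in memory across the 3 workers.
      // This is the state the job was in when the 4th worker was added.
      val data = sc.parallelize(1 to 1000000, numSlices = 12)
        .map(_ * 2)
        .persist(StorageLevel.MEMORY_ONLY)

      println(data.count()) // triggers computation and caching

      sc.stop()
    }
  }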

I also have HDFS set up on those same 3 worker nodes, but when I added one
more node to the Spark cluster I didn't add one more HDFS node to the Hadoop
cluster. In my mind that shouldn't be the reason, but who knows?

Thanks.
--
Be well!
Jean Morozov
