Not quite sure, but one possibility is that you do not have enough memory
to hold that much data, so the process spends most of its time on garbage
collection. That could be why it works when you set MEMORY_AND_DISK_SER_2,
which keeps blocks serialized and lets them spill to disk.
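
In case it helps to see what that level name encodes, here is a small
plain-Python sketch. It is not the Spark API: `Level` and `parse_level` are
hypothetical names made up for illustration, and the flag meanings are my
reading of how Spark's predefined storage levels are named (base store,
optional SER for serialized, optional trailing replication count).

```python
from collections import namedtuple

# Hypothetical stand-in for illustration only; this is NOT Spark's StorageLevel.
Level = namedtuple("Level", "use_disk use_memory deserialized replication")


def parse_level(name):
    """Decode a storage level name such as MEMORY_AND_DISK_SER_2."""
    parts = name.split("_")

    # A trailing number is the replication factor; without one it is 1.
    replication = 1
    if parts[-1].isdigit():
        replication = int(parts.pop())

    # A trailing SER means blocks are kept as serialized bytes.
    serialized = parts[-1] == "SER"
    if serialized:
        parts.pop()

    base = "_".join(parts)
    use_memory = base in ("MEMORY_ONLY", "MEMORY_AND_DISK")
    use_disk = base in ("DISK_ONLY", "MEMORY_AND_DISK")
    if not (use_memory or use_disk):
        raise ValueError("unrecognized storage level name: " + name)

    # Only in-memory blocks can be held as deserialized objects.
    deserialized = use_memory and not serialized
    return Level(use_disk, use_memory, deserialized, replication)


if __name__ == "__main__":
    for level_name in ("MEMORY_ONLY", "MEMORY_AND_DISK_SER_2"):
        print(level_name, parse_level(level_name))
```

So compared with MEMORY_ONLY, MEMORY_AND_DISK_SER_2 changes three things at
once: it allows disk, it stores serialized bytes, and it keeps two replicas.
Any one of those could be what makes the difference.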
Thanks
Best Regards
On Mon, Feb 9, 2015 at 8:38 PM, Jong Wook K
Replying to my own thread: I realized that this only happens when the
replication level is 1.
Regardless of whether I set memory-only, disk, or deserialized storage, I
had to make the replication level >= 2 for streaming to work properly on YARN.
I still don't get why, because intuitively less