Hi Wayne,

Thanks for the reply. I did raise the max thread limit before posting, based on
your earlier comment on another post suggesting ulimit -n 2048. That seems to
have helped with the out-of-memory issue.

I'm curious whether this is standard procedure for scaling a Spark node's
resources vertically, or just a quick workaround. I would expect the Spark
standalone master to expose these settings in a configuration file.
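
For what it's worth, this is the kind of knob I expected to find (just a
sketch to show what I mean; the master URL and values are made up, though
spark.executor.memory and spark.cores.max are documented properties, while
the ulimit change seems to live at the OS level rather than in Spark config):

    import org.apache.spark.{SparkConf, SparkContext}

    // Sketch: resource settings I'd expect to tune through SparkConf or
    // spark-defaults.conf rather than via ulimit. Values are placeholders.
    val conf = new SparkConf()
      .setMaster("spark://master-host:7077") // hypothetical standalone master URL
      .setAppName("capacity-test")           // hypothetical app name
      .set("spark.executor.memory", "2g")    // per-executor heap size
      .set("spark.cores.max", "4")           // cap on total cores for this app
    val sc = new SparkContext(conf)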

The second item I mentioned is the trickiest, since it only occurs (empty
data!) when I increase the number of worker threads via local[N]. I don't see
a real gain from increasing the number of threads; in fact, performance seems
to degrade, as threads appear to end up waiting for others to finish before
returning processed data.
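
To make sure we're talking about the same setup, here's a minimal sketch of
what I'm running (the socket source, host, and port are placeholders for my
actual input):

    import org.apache.spark.SparkConf
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    // Minimal sketch: with local[N], each receiver occupies one thread,
    // so N has to exceed the number of receivers or no threads are left
    // over to process the received data.
    val conf = new SparkConf().setMaster("local[4]").setAppName("load-run")
    val ssc = new StreamingContext(conf, Seconds(1))
    val lines = ssc.socketTextStream("localhost", 9999) // placeholder source
    lines.count().print()
    ssc.start()
    ssc.awaitTermination()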

As a general statement, would you agree that for small RDDs a high number of
threads can be a problem?
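
My rough reasoning: parallelism is capped by the number of partitions, so
threads beyond that just sit idle, and for tiny data the scheduling overhead
dominates. For example (a sketch with made-up sizes):

    import org.apache.spark.{SparkConf, SparkContext}

    // Sketch: a small RDD with few partitions cannot keep many threads busy.
    val sc = new SparkContext(
      new SparkConf().setMaster("local[8]").setAppName("small-rdd"))
    val small = sc.parallelize(1 to 1000, 2) // 2 partitions => at most 2 concurrent tasks
    val spread = small.repartition(8)        // fills all 8 threads, but the shuffle
                                             // can cost more than it saves on tiny data
    println(spread.map(_ * 2).reduce(_ + _))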

Thanks,
Rod



