Hi there,
I am trying to scale up the data size that my application is handling. This
application is running on a cluster with 16 slave nodes. Each slave node has
60GB memory. It is running in standalone mode. The data is coming from HDFS
that also in same local network.
In order to have an under
Hi David,
It is a great point. It is actually one of the reasons that my program is
slow. I found that the major cause of my program running slow is the huge
garbage collection time. I created too many small objects in the map
procedure which triggers GC mechanism frequently. After I improved my
p