This may also be of help:
http://docs.aws.amazon.com/AmazonS3/latest/dev/request-rate-perf-considerations.html.
Make sure to spread your objects across multiple partitions so that you are
not rate limited by S3.
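As a sketch of that guidance (the hash-prefix scheme described on the linked page; the helper name here is my own), you can prepend a short hash to each key so writes spread across S3's key-range partitions instead of piling onto one lexicographic prefix:

```python
import hashlib

def prefixed_key(key):
    """Prepend a 4-hex-char hash so keys spread across S3 partitions.

    Hypothetical helper illustrating the randomized-prefix advice from
    the S3 performance page above; the prefix is deterministic per key.
    """
    prefix = hashlib.md5(key.encode("utf-8")).hexdigest()[:4]
    return "%s/%s" % (prefix, key)
```

With this, keys like "logs/2014/12/22/part-0000" and "logs/2014/12/22/part-0001" get different leading prefixes rather than all sharing "logs/".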
-Sven
On Mon, Dec 22, 2014 at 10:20 AM, durga katakam wrote:
> Yes . I am reading thousan
You should be able to kill the job using the web UI or via spark-class.
More info can be found in the thread:
http://apache-spark-user-list.1001560.n3.nabble.com/How-to-kill-a-Spark-job-running-in-cluster-mode-td18583.html.
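For standalone cluster mode specifically, the thread above boils down to something like the following (the master URL and driver ID are placeholders; the actual driver ID is listed in the Master web UI):

```shell
# Kill a driver submitted in standalone cluster mode (hypothetical IDs).
./bin/spark-class org.apache.spark.deploy.Client kill \
  spark://master-host:7077 driver-20141223164700-0001
```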
HTH!
On Tue, Dec 23, 2014 at 4:47 PM, durga wrote:
> Hi All ,
>
> It se
Hi All,
It seems the problem is a little more complicated.
The job is hung up reading an S3 file. Even if I kill the Unix process that
started the job, the Spark job is not killed; it is still hung there.
Now the questions are :
How do I find a Spark job by its name?
How do I kill the spark-
http://www.jets3t.org/toolkit/configuration.html
Put the following properties in a file named jets3t.properties and make
sure it is available while your Spark job runs (for example, place it in
~/ and pass it to spark-submit with --files ~/jets3t.properties):
httpclien
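For example (values here are illustrative; the property names come from the JetS3t configuration page linked above), the timeout- and connection-related settings look like:

```
# jets3t.properties -- illustrative values, tune for your workload
httpclient.connection-timeout-ms=60000
httpclient.socket-timeout-ms=60000
httpclient.max-connections=20
s3service.max-thread-count=10
```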
Yes, I am reading thousands of files every hour. Is there any way I can
tell Spark to time out?
Thanks for your help.
-D
On Mon, Dec 22, 2014 at 4:57 AM, Shuai Zheng wrote:
> Is it possible too many connections open to read from s3 from one node? I
> have this issue before because I open a few
Is it possible that too many connections are open reading from S3 on one
node? I had this issue before when I opened a few hundred files on S3 from
one node. It just blocked without an error until timing out later.
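One way to bound the number of simultaneous S3 connections is to read the files in batches rather than all at once. Since textFile accepts a comma-separated list of paths, a minimal sketch (the helper name and batch size are made up) is:

```python
def path_batches(paths, batch_size):
    """Split a long list of S3 paths into comma-joined batches so each
    sc.textFile() call opens only a bounded number of connections.
    Sketch only; tune batch_size to your cluster and file sizes."""
    return [",".join(paths[i:i + batch_size])
            for i in range(0, len(paths), batch_size)]

# Each batch is then read with a separate textFile call, e.g.:
# for batch in path_batches(all_paths, 200):
#     rdd = sc.textFile(batch)
#     ...process rdd...
```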
On Monday, December 22, 2014, durga wrote:
> Hi All,
>
> I am facing a stra