I'm not sure what you mean by a "replicated RDD". If you simply mean an 
RDD, then yes, read the API docs and the paper as Tobias mentioned.
If you are asking specifically about the word "replicated", replication is for 
fault tolerance, and it is probably most often used in the streaming case for 
receiver-created RDDs.
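To make this concrete, here is a minimal sketch (names and the local master setting are my own, not from the thread): persisting an RDD with a `_2` storage level such as `StorageLevel.MEMORY_ONLY_2` stores each partition on two nodes. The replication is transparent, so the RDD supports exactly the same operations as an unreplicated one.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

val sc = new SparkContext(
  new SparkConf().setAppName("replicated-rdd-sketch").setMaster("local[2]"))

// MEMORY_ONLY_2 keeps each cached partition on two executors; this is purely
// for fault tolerance -- no operation is gained or lost by replicating.
val rdd = sc.parallelize(1 to 100).persist(StorageLevel.MEMORY_ONLY_2)

// Ordinary transformations and actions work unchanged on a replicated RDD.
val total = rdd.map(_ * 2).sum()
println(total)

sc.stop()
```

In local mode the second replica cannot actually be placed on another node, so Spark just logs a warning; on a real cluster the duplicate copy lets a lost cached partition be served without recomputation.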

For Spark, an application is your user program. A job is an internal 
scheduling concept: a group of RDD operations triggered by an action. Your 
application might invoke several jobs.
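The application/job distinction above can be seen in a short sketch (a hypothetical example, not from the thread): the whole program below is one application, but each action submitted to the scheduler becomes a separate job.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// This entire program is one Spark application.
val sc = new SparkContext(
  new SparkConf().setAppName("jobs-demo").setMaster("local[2]"))

val data = sc.parallelize(1 to 10)

// Transformations are lazy: map() schedules nothing by itself.
val doubled = data.map(_ * 2)

// Each action submits one job, so this single application runs two jobs.
val count = doubled.count() // job 0
val sum   = doubled.sum()   // job 1

println(s"$count elements, sum $sum")
sc.stop()
```

This is the opposite of the Hadoop MapReduce vocabulary, where a "job" is the whole submitted program; in Spark the application is long-lived and jobs come and go inside it, one per action.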


Best Regards,
Raymond Liu

From: rapelly kartheek [mailto:kartheek.m...@gmail.com] 
Sent: Wednesday, September 03, 2014 5:03 PM
To: user@spark.apache.org
Subject: RDDs

Hi,
Can someone tell me what kind of operations can be performed on a replicated 
RDD? What are the use cases of a replicated RDD?
One basic doubt that has been bothering me for a long time: what is the 
difference between an application and a job in Spark parlance? I am confused 
because of the Hadoop jargon.
Thank you