Thanks for the example. However, my main problem is that what I would like to do is:
Create a Spark app that will sort and partition the initial file into k parts based on a key:

    JavaSparkContext ctx = new JavaSparkContext("spark://dmpour:7077", "BasicFileSplit",
            System.getenv("SPARK_HOME"), JavaSparkContext.jarOfClass(BasicFileSplit.class));
    JavaRDD<String> input = ctx.textFile(args[1], 1);

    // Map each line to a (key, line) pair; Split extracts the key from the line.
    JavaPairRDD<String, String> ones = input.map(new Split());

    // Sort by key, then hash-partition into k partitions.
    JavaPairRDD<String, String> twos = ones.sortByKey().partitionBy(new HashPartitioner(k));

    // Writes k part files.
    twos.values().saveAsTextFile("hdfs://1..../");

Each of the k workers should then run my myjar app. The myjar app will read these k partitioned files, do some calculations, and write new files, which will then be available to manipulate via Shark. How can I assign this action to each of the k workers?

thanks
Dimitri
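P.S. To make the question concrete, here is roughly what I imagine each worker doing, written as a continuation of the snippet above. This is only a sketch: mapPartitions is just my guess at the right hook, and Calc.process is a hypothetical stand-in for whatever my myjar code does per line.

    import java.util.ArrayList;
    import java.util.Iterator;
    import java.util.List;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.function.FlatMapFunction;

    // Run the calculation once per partition, so each worker processes
    // exactly one of the k sorted partitions.
    JavaRDD<String> results = twos.values().mapPartitions(
        new FlatMapFunction<Iterator<String>, String>() {
            public Iterable<String> call(Iterator<String> lines) {
                List<String> out = new ArrayList<String>();
                while (lines.hasNext()) {
                    // Calc.process is hypothetical: my per-line calculation
                    out.add(Calc.process(lines.next()));
                }
                return out;
            }
        });
    results.saveAsTextFile(args[2]); // output path; args[2] is hypothetical

Or, if myjar really has to stay a separate program, would something like twos.values().pipe("java -jar myjar.jar") be the way to go, feeding each partition to the jar over stdin?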