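I am not sure what you are trying to achieve here. You should never use an ArrayList the way you do here, as a global variable that the function passed to foreach modifies (an anti-pattern). That function runs on the executors against a serialized copy of the list, so the list on the driver stays empty. Why don't you use the count function of the DataFrame?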
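To illustrate why the list ends up empty on the driver, here is a minimal plain-Java sketch with no Spark involved. It assumes that a Java serialization round trip is a fair stand-in for what happens to the function before a task runs it. The names ClosureCopyDemo, AddToList and roundTrip are hypothetical and made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;

public class ClosureCopyDemo {

    // Hypothetical stand-in for the VoidFunction in the quoted code:
    // a serializable callback that appends to a captured list.
    static class AddToList implements Serializable {
        private static final long serialVersionUID = 1L;
        final ArrayList<String> target;

        AddToList(ArrayList<String> target) {
            this.target = target;
        }

        void call(String line) {
            target.add(line);
        }
    }

    // Serialize and deserialize an object. This is an assumption used to
    // mimic shipping a function to a worker; the result is a deep copy.
    @SuppressWarnings("unchecked")
    static <T extends Serializable> T roundTrip(T obj) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(bytes);
            out.writeObject(obj);
            out.close();
            ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()));
            try {
                return (T) in.readObject();
            } finally {
                in.close();
            }
        } catch (IOException e) {
            throw new IllegalStateException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns {size of the list inside the shipped copy, size of the original list}.
    static int[] run(int records) {
        ArrayList<String> driverList = new ArrayList<String>();
        AddToList shipped = roundTrip(new AddToList(driverList));
        for (int i = 0; i < records; i++) {
            shipped.call("record-" + i); // mutates the copy only
        }
        return new int[] { shipped.target.size(), driverList.size() };
    }

    public static void main(String[] args) {
        int[] sizes = run(100);
        System.out.println("List size in the copy: " + sizes[0]);
        System.out.println("List size in the original: " + sizes[1]);
    }
}
```

The copy grows to 100 entries while the original stays at 0, which matches the output in the quoted mail. On the Spark side, results have to be returned to the driver explicitly: df.count() gives the number of rows, and collect() on the JavaRDD<String> returns the JSON strings as a List<String> on the driver (only sensible for small data such as 100 records).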
> On 24. Apr 2017, at 19:36, Devender Yadav <devender.ya...@impetus.co.in> wrote:
>
> Hi All,
>
> I am using Spark 1.6.2 and Java 7.
>
> Sample json (total 100 records):
>
> {"name":"dev","salary":10000,"occupation":"engg","address":"noida"}
> {"name":"karthik","salary":20000,"occupation":"engg","address":"noida"}
>
> Useful code:
>
> final List<Map<String,String>> jsonData = new ArrayList<>();
>
> DataFrame df = sqlContext.read().json("file:///home/dev/data-json/emp.json");
> JavaRDD<String> rdd = df.repartition(1).toJSON().toJavaRDD();
>
> rdd.foreach(new VoidFunction<String>() {
>     @Override
>     public void call(String line) {
>         try {
>             jsonData.add(new ObjectMapper().readValue(line, Map.class));
>             System.out.println(Thread.currentThread().getName());
>             System.out.println("List size: "+jsonData.size());
>         } catch (IOException e) {
>             e.printStackTrace();
>         }
>     }
> });
>
> System.out.println(Thread.currentThread().getName());
> System.out.println("List size: "+jsonData.size());
>
> jsonData List is empty in the end.
>
> Output:
>
> Executor task launch worker-1
> List size: 1
> Executor task launch worker-1
> List size: 2
> Executor task launch worker-1
> List size: 3
> .
> .
> .
> Executor task launch worker-1
> List size: 100
>
> main
> List size: 0
>
> Regards,
> Devender