Hi Ajay, would you mind putting together a minimal code snippet that reproduces this issue and pasting it here?
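
Something along these lines would be enough. It is only a rough sketch of what a reproduction might look like, not your actual job: the file paths, the app name, the comma-separated layout with the key in the first field, and the use of "local[2]" as a stand-in for one worker with two cores are all assumptions on my side, so please replace them with whatever matches your setup.

```scala
import org.apache.spark.SparkContext
// Brings join() into scope for RDDs of key/value pairs (required in Spark 0.9.x).
import org.apache.spark.SparkContext._

object JoinRepro {
  def main(args: Array[String]): Unit = {
    // Assumption: "local[2]" approximates a single worker with two cores.
    // Replace it with your real master URL if the problem only shows up there.
    val sc = new SparkContext("local[2]", "JoinRepro")

    // Assumption: each input line is comma-separated and the first field is the join key.
    def keyed(path: String) =
      sc.textFile(path).map { line =>
        val fields = line.split(",", 2)
        (fields(0), line)
      }

    // Placeholder paths, not taken from the original report.
    val left = keyed("/path/to/left.csv")
    val right = keyed("/path/to/right.csv")

    // Repeat the same join on the same input; every run should print the same count.
    for (i <- 1 to 5) {
      println("run " + i + ": " + left.join(right).count())
    }

    sc.stop()
  }
}
```

If the counts printed by the loop differ between runs on the same input, that output together with the snippet would make the problem much easier to track down.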

On Wed, Jun 4, 2014 at 8:32 PM, Ajay Srivastava <[email protected]> wrote:

> Hi,
>
> I am doing join of two RDDs which giving different results ( counting
> number of records ) each time I run this code on same input.
>
> The input files are large enough to be divided in two splits. When the
> program runs on two workers with single core assigned to these, output is
> consistent and looks correct. But when single worker is used with two or
> more than two cores, the result seems to be random. Every time, count of
> joined record is different.
>
> Does this sound like a defect or I need to take care of something while
> using join ? I am using spark-0.9.1.
>
> Regards
> Ajay
>
