I have an RDD with records in the format below:

id,name,age,houseno,childPresent
1,gupta,35,100,None
1,verma,16,100,None
1,ravi,10,100,None
2, Abc,32,200,None
2,xyz,23,200,None

I need to set the childPresent field to Y for every row with a given id if any record with that id has age < 18, and to N otherwise. How can I do that? I want the output below:

1,gupta,35,100,Y
1,verma,16,100,Y -- this record has age less than 18, so childPresent is Y for all rows with id = 1
1,ravi,10,100,Y
2, Abc,32,200,N
2,xyz,23,200,N -- no record with id = 2 has age < 18

Please let me know how I can achieve this using Spark/Scala.

Thanks,
Vikash
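
For reference, here is a minimal sketch of one way this could be approached with the RDD API. It is not necessarily the best or only solution. It assumes the rows have already been parsed into a case class; the names `Person`, `markChildPresent`, and `ChildPresentExample` are hypothetical and chosen only for illustration. The idea is to compute one boolean flag per id with `reduceByKey`, then join that flag back onto the original rows.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object ChildPresentExample {
  // Hypothetical record type mirroring the columns in the question.
  case class Person(id: Int, name: String, age: Int, houseNo: Int, childPresent: String)

  // Sets childPresent to "Y" for every row whose id has at least one
  // record with age < 18, and to "N" for all other rows.
  def markChildPresent(people: RDD[Person]): RDD[Person] = {
    // Step 1: one (id, flag) pair per id; flag is true if any age < 18.
    val hasChildById: RDD[(Int, Boolean)] =
      people.map(p => (p.id, p.age < 18)).reduceByKey(_ || _)

    // Step 2: join the per-id flag back onto every row with that id.
    people
      .keyBy(_.id)
      .join(hasChildById)
      .map { case (_, (p, flag)) => p.copy(childPresent = if (flag) "Y" else "N") }
  }

  def main(args: Array[String]): Unit = {
    // Local master is an assumption for trying this out on one machine.
    val conf = new SparkConf().setAppName("child-present").setMaster("local[*]")
    val sc = new SparkContext(conf)

    val people = sc.parallelize(Seq(
      Person(1, "gupta", 35, 100, "None"),
      Person(1, "verma", 16, 100, "None"),
      Person(1, "ravi", 10, 100, "None"),
      Person(2, "Abc", 32, 200, "None"),
      Person(2, "xyz", 23, 200, "None")
    ))

    // The join does not preserve input order, so sort before printing.
    markChildPresent(people).collect().sortBy(p => (p.id, -p.age)).foreach(println)

    sc.stop()
  }
}
```

The join shuffles the full dataset. If the number of distinct ids is small, collecting the per-id flags with `collectAsMap()` and broadcasting them might avoid that shuffle, though whether it helps depends on the data size.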