Strange duplicates in data when scaling up

2014-10-17 Thread Jacob Maloney
The issue was solved by clearing the HashMap and HashSet at the beginning of the call method. From: Jacob Maloney [mailto:jmalo...@conversantmedia.com] Sent: Thursday, October 16, 2014 5:09 PM To: user@spark.apache.org Subject: Strange duplicates in data when scaling up I have a flatmap function that
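
The original code is not shown in the thread, so the following is only a sketch of one way the described symptom and fix could look. It assumes the HashSet was a field of the function object and therefore survived from one call to the next; that is an inference, not something the poster states. The class and method names (DuplicateKeysSketch, LeakyFunction, FixedFunction, runLeaky, runFixed) are hypothetical, and plain Java stands in for Spark's FlatMapFunction so the example runs without a cluster.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DuplicateKeysSketch {

    // Hypothetical stand-in for a flatMap function whose set is a field.
    // The set is never emptied, so each call re-emits keys from earlier calls.
    static class LeakyFunction {
        private final Set<String> keys = new HashSet<>();

        List<String> call(String record) {
            keys.addAll(Arrays.asList(record.split(",")));
            return new ArrayList<>(keys);
        }
    }

    // Same function with the fix described in the thread:
    // clear the collection at the beginning of call().
    static class FixedFunction {
        private final Set<String> keys = new HashSet<>();

        List<String> call(String record) {
            keys.clear();
            keys.addAll(Arrays.asList(record.split(",")));
            return new ArrayList<>(keys);
        }
    }

    // Applies one shared LeakyFunction instance to every record,
    // concatenating the results the way a flatMap would.
    static List<String> runLeaky(List<String> records) {
        LeakyFunction f = new LeakyFunction();
        List<String> out = new ArrayList<>();
        for (String r : records) {
            out.addAll(f.call(r));
        }
        return out;
    }

    // Same driver, using the fixed function.
    static List<String> runFixed(List<String> records) {
        FixedFunction f = new FixedFunction();
        List<String> out = new ArrayList<>();
        for (String r : records) {
            out.addAll(f.call(r));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList("a,b", "c", "d");
        System.out.println("leaky emits " + runLeaky(records).size() + " keys");
        System.out.println("fixed emits " + runFixed(records).size() + " keys");
    }
}
```

Under this assumption, each individual call still returns a duplicate-free collection, yet the combined output repeats keys, and the repetition grows with the number of records handled by the same function instance. That would be consistent with the problem only appearing at larger scale, though the thread does not confirm the cause.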

Strange duplicates in data when scaling up

2014-10-16 Thread Jacob Maloney
I have a flatmap function that shouldn't possibly emit duplicates, and yet it does. The output of my function is a HashSet, so the function itself cannot output duplicates, and yet I see many copies of keys emitted from it (in one case up to 62). The curious thing is I can't get this to happen unti

Issue with java spark broadcast

2014-10-10 Thread Jacob Maloney
an't I access this map? And what do I have to do to make it accessible? Thanks, Jacob -Original Message- From: user-h...@spark.apache.org [mailto:user-h...@spark.apache.org] Sent: Friday, October 10, 2014 4:02 PM To: Jacob Maloney Subject: FAQ for user@spark.apache.org Hi! Th