I built this recently using the accepted answer on this SO page: http://stackoverflow.com/questions/26741714/how-does-the-pyspark-mappartitions-function-work/26745371
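
For reference, a minimal sketch of the pattern, assuming PySpark's `RDD.mapPartitions`. The names `do_whatever` and `run_on_spark` are hypothetical placeholders, and the Spark call is wrapped in a function that is not executed here, so the generator can be checked locally with a plain iterator:

```python
def do_whatever(item):
    # Hypothetical per-item transformation; stands in for the real work.
    return item * 2


def processor(iterator):
    # Called once per partition with an iterator over that partition's items.
    # One-time setup (e.g. opening a connection) would go here, before the loop.
    for item in iterator:
        yield do_whatever(item)


def run_on_spark(sc):
    # Assumes `sc` is an existing SparkContext; defined but not called here.
    data = sc.parallelize(range(6), 2)  # 6 items split across 2 partitions
    return data.mapPartitions(processor).collect()


# Local check: Spark hands each partition to the function as an iterator,
# so a plain Python iterator exercises the same code path.
local_result = list(processor(iter([1, 2, 3])))
print(local_result)
```

Writing the function as a generator means each partition is processed lazily, one item at a time, rather than being materialized as a list first.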

-sujit

On Sat, May 14, 2016 at 7:00 AM, Mathieu Longtin <[email protected]> wrote:

> From memory:
>
> def processor(iterator):
>     for item in iterator:
>         newitem = do_whatever(item)
>         yield newitem
>
> newdata = data.mapPartitions(processor)
>
> Basically, your function takes an iterator as an argument, and must either
> be a generator or return an iterator.
>
> On Sat, May 14, 2016 at 12:39 AM Abi <[email protected]> wrote:
>
>> On Tue, May 10, 2016 at 2:20 PM, Abi <[email protected]> wrote:
>>
>>> Is there any example of this? I want to see how you write the
>>> iterable example
>>
> --
> Mathieu Longtin
> 1-514-803-8977
