Hi, 

maybe this is a stupid question:

I have a list of files, and I want to use each file as the input to an
ML algorithm. The files are all independent of one another.
How do I distribute the work so that each worker takes a block of files
and runs the algorithm on them one by one?
I hope somebody can point me in the right direction! :)
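
One possible shape for this, as a minimal sketch only: parallelize the list
of file names and let each partition process its block with mapPartitions.
The names run_algorithm, process_block and run_on_cluster are hypothetical,
run_algorithm is a stand-in for the real ML step, and sc is assumed to be an
existing PySpark SparkContext.

```python
def run_algorithm(path):
    # Hypothetical placeholder for the real per-file ML step. A real version
    # would open `path`, run the algorithm, and return its result.
    return (path, "done")


def process_block(paths):
    # Called once per partition. `paths` is an iterator over the file names
    # in that block, and the files are handled one by one.
    for path in paths:
        yield run_algorithm(path)


def run_on_cluster(sc, files, num_blocks):
    # `sc` is assumed to be an existing SparkContext created elsewhere.
    # parallelize() splits the list into `num_blocks` partitions, and
    # mapPartitions() runs process_block once per partition on the workers.
    rdd = sc.parallelize(files, num_blocks)
    return rdd.mapPartitions(process_block).collect()
```

Note that this only ships the file names to the workers, so the files
themselves would have to be readable from every worker, for example on a
shared filesystem.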

Best regards, 
Lydia
