Hi. What is the best way to pass through a large dataset in small, sequential mini-batches?
For example, with N = 1,000,000 data points and a mini-batch size of 10, we would need to do some computation on each of the mini-batches (0..9), (10..19), (20..29), ..., (N-10..N-1). Would RDD.repartition(N/10).mapPartitions() work? Thanks!
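
Concretely, the kind of pass I have in mind looks something like the sketch below. Rather than repartitioning into N/10 = 100,000 tiny partitions, it keeps a modest partition count and walks each partition's iterator in groups of 10 with Iterator.grouped, which is an alternative to the repartition idea above; the per-batch mean, the partition count of 100, and the names are just placeholders for whatever computation is actually needed:

import org.apache.spark.{SparkConf, SparkContext}

object MiniBatchSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("mini-batch-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)

    val batchSize = 10
    // 100 partitions of 10,000 elements each; since 10,000 is divisible
    // by the batch size, no batch straddles a partition boundary.
    val data = sc.parallelize(0 until 1000000, numSlices = 100)

    // Walk each partition sequentially in groups of 10: (0..9), (10..19), ...
    // The per-batch computation (a mean here) is only a placeholder.
    val perBatch = data.mapPartitions { iter =>
      iter.grouped(batchSize).map(batch => batch.sum.toDouble / batch.size)
    }

    println(perBatch.take(5).mkString(", "))
    sc.stop()
  }
}

Would something along these lines be the idiomatic approach, or is there a better way?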