Thanks a lot for the suggestions!

On 18/06/2015 15:02, Himanshu Mehra [via Apache Spark User List] wrote:
> Hi A bellet
>
> You can try RDD.randomSplit(weights), where weights is an array of the
> weights you want to give to the consecutive splits. For example,
> RDD.randomSplit(Array(0.7, 0.3)) returns two RDDs containing roughly 70%
> of the data in one and 30% in the other, with the elements selected
> randomly. RDD.randomSplit(Array(0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1,
> 0.1, 0.1)) returns 10 RDDs of randomly selected elements with equal
> weights.
>   Thank you
>
>
> Himanshu
>
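For reference, here is a minimal Scala sketch of the randomSplit call suggested above. It assumes a local SparkContext; the app name, the 1 to 1000 sample data, and the seed value are illustrative and not from the thread. Note that randomSplit returns an array of RDDs, and that the split sizes only approximate the weights.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object RandomSplitSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: local master and app name, chosen only for a standalone run.
    val conf = new SparkConf().setAppName("random-split-sketch").setMaster("local[*]")
    val sc = new SparkContext(conf)
    try {
      // Assumption: sample data standing in for the real RDD.
      val data = sc.parallelize(1 to 1000)

      // Ten equal weights; randomSplit normalizes weights that do not sum to 1.
      val weights = Array.fill(10)(0.1)

      // Assumption: a fixed seed so the split is reproducible between runs.
      val splits = data.randomSplit(weights, seed = 42L)

      // Each split is its own RDD; sizes are close to, not exactly, 100 each.
      splits.zipWithIndex.foreach { case (rdd, i) =>
        println(s"split $i: ${rdd.count()} elements")
      }
    } finally {
      sc.stop()
    }
  }
}
```

If the goal is to redistribute elements across the partitions of a single RDD rather than to obtain several RDDs, randomSplit may not be the right tool, since each returned RDD keeps the partitioning of its parent.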




--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Best-way-to-randomly-distribute-elements-tp23391p23409.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.