Hi Johan,

DataFrames are built on top of RDDs, so I am not sure the ordering issues are any different there. Maybe you could create a minimal but sufficiently large simulated dataset and an example series of transformations to experiment on.

Best,
-m
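
One way such an experiment might look is sketched below, assuming PySpark and a local session. The helper names (`make_simulated_rows`, `order_preserved`, `run_experiment`), the partition counts, and the choice of transformations are all hypothetical and only for illustration; they are not taken from the thread. Each simulated row carries its original position, so any reordering can be detected after the transformations.

```python
import random


def make_simulated_rows(n, seed=0):
    """Return n (index, value) pairs; the index records the original order."""
    rng = random.Random(seed)
    return [(i, rng.random()) for i in range(n)]


def order_preserved(rows):
    """True if the index column is still strictly increasing."""
    indices = [row[0] for row in rows]
    return all(a < b for a, b in zip(indices, indices[1:]))


def run_experiment(n=100_000, partitions=8):
    # Imported lazily so the helpers above work without Spark installed.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[4]")
        .appName("ordering-check")  # hypothetical app name
        .getOrCreate()
    )
    try:
        rdd = spark.sparkContext.parallelize(make_simulated_rows(n), partitions)
        candidates = {
            # Narrow transformations only (no shuffle).
            "map+filter": rdd.map(lambda r: (r[0], r[1] * 2)).filter(
                lambda r: r[1] > 0.5
            ),
            # repartition forces a shuffle.
            "repartition": rdd.repartition(partitions * 2),
            # Round trip through a DataFrame and back to an RDD.
            "dataframe": spark.createDataFrame(rdd, ["idx", "value"])
            .filter("value > 0.25")
            .rdd.map(lambda r: (r["idx"], r["value"])),
        }
        # Collect each result to the driver and check the index column.
        return {
            name: order_preserved(out.collect()) for name, out in candidates.items()
        }
    finally:
        spark.stop()


if __name__ == "__main__":
    for name, ok in run_experiment().items():
        print(f"{name}: order preserved = {ok}")
```

Observing preserved order in a single local run is not a guarantee; it only helps narrow down which transformations are worth checking against the documentation.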

Mehmet Süzen, MSc, PhD <su...@acm.org>

On 15 September 2017 at 09:44, <johan.grande....@orange.com> wrote:
> Thanks all for your answers. After reading the provided links I am still
> uncertain of the details of what I'd need to do to get my calculations right
> with RDDs. However I discovered DataFrames and Pipelines on the "ML" side of
> the libs and I think they'll be better suited to my needs.
>
> Best,
> Johan Grande

---------------------------------------------------------------------
To unsubscribe e-mail: user-unsubscr...@spark.apache.org