If you want to tie the predictions back to other data, I think the best way is a DataFrame join, provided both DataFrames share an identity column.
Thanks
Yanbo

2016-08-16 20:39 GMT-07:00 ayan guha <guha.a...@gmail.com>:

> Hi
>
> Thank you for your reply. Yes, I can get prediction and original features
> together. My question is how to tie them back to other parts of the data,
> which was not in LP.
>
> For example, I have a bunch of other dimensions which are not part of
> features or label.
>
> Sorry if this is a stupid question.
>
> On Wed, Aug 17, 2016 at 12:57 PM, Yanbo Liang <yblia...@gmail.com> wrote:
>
>> MLlib will keep the original dataset during transformation, it just
>> append new columns to existing DataFrame. That is you can get both
>> prediction value and original features from the output DataFrame of
>> model.transform.
>>
>> Thanks
>> Yanbo
>>
>> 2016-08-16 17:48 GMT-07:00 ayan guha <guha.a...@gmail.com>:
>>
>>> Hi
>>>
>>> I have a dataset as follows:
>>>
>>> DF:
>>> amount:float
>>> date_read:date
>>> meter_number:string
>>>
>>> I am trying to predict future amount based on past 3 weeks consumption
>>> (and a heaps of weather data related to date).
>>>
>>> My Labelpoint looks like
>>>
>>> label (populated from DF.amount)
>>> features (populated from a bunch of other stuff)
>>>
>>> Model.predict output:
>>> label
>>> prediction
>>>
>>> Now, I am trying to put together this prediction value back to meter
>>> number and date_read from original DF?
>>>
>>> One way to assume order of records in DF and Model.predict will be
>>> exactly same and zip two RDDs. But any other (possibly better) solution?
>>>
>>> --
>>> Best Regards,
>>> Ayan Guha
>
> --
> Best Regards,
> Ayan Guha