Thanks Till :) I rewrote my implementation using PredictDataSetOperation.
Regards,
Chiwan Park

> On Jun 29, 2015, at 7:41 PM, Till Rohrmann <till.rohrm...@gmail.com> wrote:
>
> Hi Chiwan,
>
> at the moment the single element PredictOperation only supports
> non-distributed models. This means that it expects the model to be a single
> element DataSet which can be broadcast to the predict mappers.
>
> If you need more flexibility, you can either extend the PredictOperation
> interface or simply use the PredictDataSetOperation, where you have
> full control over what data flow you execute.
>
> Cheers,
> Till
>
> On Mon, Jun 29, 2015 at 12:16 PM, Chiwan Park <chiwanp...@apache.org> wrote:
>
>> Thank you Till.
>>
>> I have another question. Can I use a DataSet object as the Model? In KNN, we
>> need the DataSet given in the fit operation.
>>
>> But when I set the Model generic parameter to DataSet in PredictOperation,
>> the getModel method’s return type is DataSet[DataSet]. I’m confused by
>> this situation.
>>
>> I would really appreciate any advice on this.
>>
>> Regards,
>> Chiwan Park
>>
>>> On Jun 29, 2015, at 4:43 PM, Till Rohrmann <trohrm...@apache.org> wrote:
>>>
>>> Hi Chiwan,
>>>
>>> when you use the single element predict operation, you always have to
>>> implement the `getModel` method. There you have access to the resulting
>>> parameters and even to the instance to which the `PredictOperation`
>>> belongs. Within this `getModel` method you can initialize all the
>>> information you need for the `predict` operation.
>>>
>>> You can take a look at the `StandardScalerTransformOperation` [1] where the
>>> mean and the std are set in the `getModel` method.
>>>
>>> Cheers,
>>> Till
>>>
>>> [1]
>>> https://github.com/apache/flink/blob/master/flink-staging/flink-ml/src/main/scala/org/apache/flink/ml/preprocessing/StandardScaler.scala#L197
>>>
>>> On Sun, Jun 28, 2015 at 1:49 PM, Chiwan Park <chiwanp...@apache.org> wrote:
>>>
>>>> Hi, I’m implementing k-nearest-neighbors classification based on the
>>>> flink-ml structure.
>>>>
>>>> In a recent commit (7a7a2940 [1]), the pipeline was restructured by
>>>> dividing the predict operation into a single-element case and a data-set
>>>> case. In the data-set case, the parameter map is given as a method
>>>> parameter, but in the single-element case there is no method to access
>>>> the parameter map.
>>>>
>>>> But in k-nearest-neighbors classification, we need to know k in the
>>>> predict method to select the top k values.
>>>>
>>>> How can I solve this problem?
>>>>
>>>> Regards,
>>>> Chiwan Park
>>>>
>>>> [1]
>>>> https://github.com/apache/flink/commit/7a7a294033ef99c596e59f670e2e4ae9262f5c5f
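
The `getModel` advice in this thread can be illustrated with a small sketch: whatever `predict` needs, including k, is packed into the model value that `getModel` returns. The code below is only an illustration under stated assumptions. The trait `SingleElementPredict`, the classes `Knn` and `KnnModel`, and the `"k"` parameter key are hypothetical stand-ins, not Flink ML's API, and a plain `Seq` plays the role of a DataSet so that the sketch runs without Flink. Because the whole training set travels inside one model value, this shape only suits training data small enough to be held as a single element.

```scala
// Simplified, hypothetical stand-in for the single-element predict pattern.
// None of these names belong to Flink ML; a plain Seq replaces DataSet.
object KnnSketch {
  type Point = Vector[Double]

  // The model carries everything `predict` needs: k and the training points.
  final case class KnnModel(k: Int, training: Seq[(Point, String)])

  // Hypothetical two-step interface: build the model once, then predict per element.
  trait SingleElementPredict[Instance, Model, Testing, Prediction] {
    def getModel(instance: Instance, params: Map[String, Any]): Model
    def predict(value: Testing, model: Model): Prediction
  }

  // Hypothetical predictor instance holding the data seen during fit.
  final class Knn(val training: Seq[(Point, String)])

  def squaredDistance(a: Point, b: Point): Double =
    a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum

  object KnnPredict extends SingleElementPredict[Knn, KnnModel, Point, String] {
    // Read k from the parameters here, since `predict` never sees them.
    // Falling back to k = 1 is an arbitrary choice for this sketch.
    def getModel(instance: Knn, params: Map[String, Any]): KnnModel = {
      val k = params.get("k").collect { case i: Int => i }.getOrElse(1)
      KnnModel(k, instance.training)
    }

    // Majority vote among the k nearest training points.
    def predict(value: Point, model: KnnModel): String =
      model.training
        .sortBy { case (p, _) => squaredDistance(p, value) }
        .take(model.k)
        .groupBy(_._2)
        .maxBy(_._2.size)
        ._1
  }

  def main(args: Array[String]): Unit = {
    val knn = new Knn(Seq(
      Vector(0.0, 0.0) -> "a",
      Vector(0.1, 0.0) -> "a",
      Vector(5.0, 5.0) -> "b"))
    val model = KnnPredict.getModel(knn, Map("k" -> 3))
    println(KnnPredict.predict(Vector(0.2, 0.1), model))
  }
}
```

When the training set is too large to ship as one value, the data-set variant mentioned in the thread is the better fit, since it leaves the whole data flow under the implementer's control.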