Re: [mllib] Document frequency

2019-01-14 Thread Jatin Puri

Thanks. Created: https://issues.apache.org/jira/browse/SPARK-26616

On Mon, Jan 14, 2019 at 9:19 PM Sean Owen wrote:
> Yes that seems OK to me.
>
> On Mon, Jan 14, 2019 at 9:40 AM Jatin Puri wrote:
> >
> > Thanks for the response. So do I go ahead and create a jira ticket?
> > Can then send a pu

Re: [mllib] Document frequency

2019-01-14 Thread Sean Owen

Yes that seems OK to me.

On Mon, Jan 14, 2019 at 9:40 AM Jatin Puri wrote:
>
> Thanks for the response. So do I go ahead and create a jira ticket?
> Can then send a pull request for the same with the changes.
>
> On Mon, Jan 14, 2019 at 8:18 PM Sean Owen wrote:
>>
>> I think that's reasonable. T

Re: [mllib] Document frequency

2019-01-14 Thread Jatin Puri

Thanks for the response. So do I go ahead and create a jira ticket? Can then send a pull request for the same with the changes.

On Mon, Jan 14, 2019 at 8:18 PM Sean Owen wrote:
> I think that's reasonable. The caller probably has the number of docs
> already but sure, it's one long and is alread

Re: [mllib] Document frequency

2019-01-14 Thread Sean Owen

I think that's reasonable. The caller probably has the number of docs already but sure, it's one long and is already computed. This would have to be added to Pyspark too.

On Mon, Jan 14, 2019 at 7:56 AM Jatin Puri wrote:
>
> Hello.
>
> As part of `org.apache.spark.ml.feature.IDFModel`, I think it
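
For readers following the thread, here is a minimal sketch of the two quantities under discussion: the per-term document frequency and the total number of documents, and how an IDF weight follows from them. It is plain Python, not Spark code. The function names `document_frequencies` and `idf_weights` are hypothetical, and the smoothed formula log((numDocs + 1) / (docFreq + 1)) is assumed here to match the one described in the Spark MLlib documentation.

```python
import math
from collections import Counter


def document_frequencies(docs):
    """Count, for each term, how many documents contain it at least once."""
    df = Counter()
    for doc in docs:
        # set() ensures a term is counted once per document.
        df.update(set(doc))
    return df


def idf_weights(df, num_docs):
    """Smoothed IDF (assumed formula): log((num_docs + 1) / (doc_freq + 1))."""
    return {
        term: math.log((num_docs + 1) / (count + 1))
        for term, count in df.items()
    }


# Toy corpus: each document is a list of tokens.
docs = [["spark", "mllib", "idf"], ["spark", "idf"], ["spark"]]
df = document_frequencies(docs)
idf = idf_weights(df, len(docs))
```

The sketch shows why both values are useful to a caller: the IDF weights alone do not reveal the raw document frequencies unless the number of documents is also known.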