Since building the Map is a fairly expensive operation, I tend to agree it should not happen inside the per-row loop.
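
For reference, here is a rough sketch of the pattern I understand the change to implement, not the exact code in the linked commit: build the word-to-vector Map once on the driver, before the UDF is defined, so each row only does cheap lookups. Names like wordMap, vectorSize, and the column names are mine, and the imports target the Spark 1.5-era API.

    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.sql.DataFrame
    import org.apache.spark.sql.functions.{col, udf}

    def averageWordVectors(dataset: DataFrame,
                           wordMap: Map[String, Array[Float]],
                           vectorSize: Int): DataFrame = {
      // Build the word -> Array[Double] lookup ONCE, outside the UDF, instead of
      // reconstructing it for every sentence.
      val vectors: Map[String, Array[Double]] =
        wordMap.mapValues(_.map(_.toDouble)).map(identity) // force a serializable Map

      val word2Vec = udf { sentence: Seq[String] =>
        if (sentence.isEmpty) {
          Vectors.sparse(vectorSize, Array.empty[Int], Array.empty[Double])
        } else {
          // Sum the vectors of in-vocabulary words, then divide by sentence length.
          val sum = new Array[Double](vectorSize)
          sentence.foreach { word =>
            vectors.get(word).foreach { v =>
              var i = 0
              while (i < vectorSize) { sum(i) += v(i); i += 1 }
            }
          }
          var i = 0
          while (i < vectorSize) { sum(i) /= sentence.size; i += 1 }
          Vectors.dense(sum)
        }
      }
      dataset.withColumn("result", word2Vec(col("sentence")))
    }

The closure still captures the Map and ships it to the executors with the task (or via a broadcast), but the expensive construction happens only once rather than once per row, which matches the speedup being reported.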
On Tue, Nov 10, 2015 at 5:08 AM, Yuming Wang <q79969...@gmail.com> wrote:
> Hi
>
> I found org.apache.spark.ml.feature.Word2Vec.transform() to be very slow.
> I think we should not read the broadcast for every sentence, so I fixed it on my fork:
>
> https://github.com/979969786/spark/commit/a9f894df3671bb8df2f342de1820dab3185598f3
>
> I tested it with 20000 rows. The original version took *5 minutes*,
> and my version took just *22 seconds* on the same data.
>
> If I'm right, I will open a pull request.
>
> Thanks