It's just the average of the word vectors, for all words in the text.

On Fri, Oct 7, 2016 at 9:04 AM kaching <wa...@o2.pl> wrote:
> Hi. How exactly does the MLlib implementation of word2vec convert word
> vectors into one feature vector per row?
>
> TEXT
> [Hi, I, heard, ab..]
> [I, wish, Java, c..]
> [Logistic, regres.]
>
>     | word2vec
>     V
>
> WORD        VECTOR
> heard       [0.14950960874557...|
> are         [-0.1639076173305...|
> neat        [0.13949351012706...|
> classes     [0.03703496977686...|
> I           [-0.0189154129475...|
> regression  [0.15298652648925...|
> Logistic    [-0.1270201653242...|
> Spark       [-0.0535793155431...|
> could       [0.12216471135616...|
> use         [0.08246973901987...|
> Hi          [0.16548289358615...|
> models      [-0.0568316541612...|
> case        [0.11626788973808...|
> about       [-0.1500445008277...|
> Java        [-0.0407485179603...|
> wish        [0.11882393807172...|
>
>     | HOW?
>     V
>
> TEXT                    RESULT
> [Hi, I, heard, ab... ]  [0.01849065460264...|
> [I, wish, Java, c... ]  [0.05958533100783...|
> [Logistic, regres...]   [-0.0110558800399...|
>
> Is there a way to change this default method?
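
To make the averaging concrete, here is a minimal plain-Python sketch of the idea (not Spark's actual code). The function name `average_word_vectors`, the `dim` parameter, and the toy vectors are made up for illustration. Two details are assumptions on my part that you should check against your Spark version: words missing from the vocabulary add nothing to the sum, and the sum is divided by the total number of tokens in the row.

```python
from typing import Dict, List, Sequence


def average_word_vectors(
    tokens: Sequence[str],
    word_vectors: Dict[str, List[float]],
    dim: int,
) -> List[float]:
    """Collapse one row of tokens into a single feature vector by averaging."""
    total = [0.0] * dim
    for token in tokens:
        vec = word_vectors.get(token)
        if vec is None:
            # Assumption: out-of-vocabulary words contribute nothing to the sum.
            continue
        for i, x in enumerate(vec):
            total[i] += x
    if not tokens:
        # An empty row has nothing to average, so return the zero vector.
        return total
    # Assumption: divide by the number of tokens in the row,
    # including any that were not in the vocabulary.
    return [s / len(tokens) for s in total]


# Toy 2-dimensional vectors, invented for this example.
toy_vectors = {"Hi": [1.0, 3.0], "I": [3.0, 5.0]}
row_vector = average_word_vectors(["Hi", "I"], toy_vectors, dim=2)
print(row_vector)
```

If you want a different aggregation (a weighted average, say), one option is to pull the per-word vectors out of the fitted model (`getVectors` on the model returns them) and combine them yourself along these lines, swapping the final division for whatever weighting you need.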