Hi Ted,

Yes, I was considering several possibilities, and this was one of them
(scaling up those dimensions, for example by multiplying them by a
configurable correction factor).

What I really want is to mix two different vectors built from the same
documents, with different lengths and dictionaries (some terms may appear
in both dictionaries). I would then multiply the dimensions of each vector
by a configurable correction factor.
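
To make the mixing step concrete, here is a minimal sketch in plain Python
(not Mahout code). It assumes that a term shared by both dictionaries should
map to a single dimension and that the scaled weights of a shared term are
summed; both are assumptions, and the names merge_dictionaries and
mix_vectors are hypothetical. For brevity the sparse vectors are keyed by
term rather than by index.

```python
def merge_dictionaries(dict_a, dict_b):
    """Build one term -> index map covering both dictionaries.

    Terms present in both dictionaries get a single shared index.
    """
    merged = {}
    for term in list(dict_a) + list(dict_b):
        if term not in merged:
            merged[term] = len(merged)
    return merged


def mix_vectors(vec_a, vec_b, factor_a, factor_b, merged):
    """Scale each sparse vector (term -> weight) by its own factor and
    add the results in the merged index space (index -> weight)."""
    mixed = {}
    for vec, factor in ((vec_a, factor_a), (vec_b, factor_b)):
        for term, weight in vec.items():
            idx = merged[term]
            mixed[idx] = mixed.get(idx, 0.0) + factor * weight
    return mixed


# Two dictionaries for the same document collection; "amp" is shared.
dict_a = {"bass": 0, "amp": 1}
dict_b = {"amp": 0, "headphone": 1}
merged = merge_dictionaries(dict_a, dict_b)

# Two vectors describing the same document.
vec_a = {"bass": 2.0, "amp": 1.0}
vec_b = {"amp": 3.0, "headphone": 1.0}

# Hypothetical correction factors: boost source A, damp source B.
mixed = mix_vectors(vec_a, vec_b, 2.0, 0.5, merged)
```

If shared terms should stay separate instead, prefixing each term with its
source name before merging would keep them in distinct dimensions.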

My question is:
Is it better to apply the correction factors directly to the final mixed
tf-idf sequence file, or to scale each set of tf vectors first, then mix
the vectors and recalculate the final tf-idf? The goal is to minimize
errors or deviations in a subsequent clustering of the final mixed tf-idf
vectors.
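
A small sketch of the two orderings, in plain Python rather than Mahout,
may help frame the question. It assumes a smoothed idf and two possible tf
weightings (linear and square root); the function names and the toy data are
hypothetical. Under these assumptions the two orderings give identical
results when the tf weighting is linear, because idf depends only on
document frequency, and they differ when tf is dampened. I am not certain
which weighting a given Mahout version applies, so that is worth checking.

```python
import math


def idf(df, n_docs):
    # Assumed smoothed idf; any idf that depends only on df behaves the same here.
    return math.log((n_docs + 1) / (df + 1)) + 1.0


def tfidf(tf_docs, tf_weight=lambda tf: tf):
    """Compute tf-idf for a list of sparse tf vectors (term -> tf)."""
    n_docs = len(tf_docs)
    df = {}
    for doc in tf_docs:
        for term in doc:
            df[term] = df.get(term, 0) + 1
    return [
        {t: tf_weight(tf) * idf(df[t], n_docs) for t, tf in doc.items()}
        for doc in tf_docs
    ]


def scale(docs, factors):
    """Multiply the listed dimensions by their correction factor."""
    return [{t: w * factors.get(t, 1.0) for t, w in doc.items()} for doc in docs]


tf_docs = [{"bass": 2.0, "amp": 1.0}, {"amp": 4.0, "headphone": 1.0}]
factors = {"bass": 4.0}  # hypothetical boost for one dimension

# Linear tf: scaling the final tf-idf or scaling tf first is equivalent.
scaled_after = scale(tfidf(tf_docs), factors)
scaled_before = tfidf(scale(tf_docs, factors))

# Square-root tf: scaling tf first only boosts by sqrt(factor).
sqrt_after = scale(tfidf(tf_docs, math.sqrt), factors)
sqrt_before = tfidf(scale(tf_docs, factors), math.sqrt)
```

With a dampened tf, scaling the final tf-idf vectors keeps the boost equal to
the configured factor, which seems the more predictable choice for
clustering.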

Thanks in advance for your help.

One last note:

I am a bass player, and the 701q AKG with the FiiO E12+E09K is a perfect
combination!!


;-)






2015-01-14 20:12 GMT+01:00 Ted Dunning <[email protected]>:

> The easiest way is to scale those dimensions up.
>
>
>
> On Wed, Jan 14, 2015 at 2:41 AM, Miguel Angel Martin junquera <
> [email protected]> wrote:
>
> > hi all,
> >
> >
> > I am clustering several text documents from distinct sources using
> > k-means, and I have already generated the sparse vectors for each
> > document.
> > I want to boost some dimensions in the sparse vectors.
> >
> > What is the best way to do this?
> >
> > Is it a good idea to load the vectors, find the values of the tf or
> > tf-idf dimensions, and boost those values?
> >
> >
> > Thanks in advance and regards
> >
>
