On 12/13/05, Ian Soboroff <[EMAIL PROTECTED]> wrote:
> Paul Libbrecht <[EMAIL PROTECTED]> writes:
>
> > We're also thinking about implementing something similar to LSI within
> > ActiveMath, which is Lucene-powered, so that both formula and text
> > searching would benefit from latent semantic similarity. I've
> > refrained from doing "exactly this", at least because LSI is patented.
> > This might also be a reason why there's no implementation in Lucene's
> > sandbox.
> >
> > Have you looked at other vector-based approaches which are not exactly LSI?
> > Have you looked at InfoMap NLP?
>
> Look for Thomas Hofmann's "probabilistic LSI", and other recent work
> which cites it.

You might also be interested in "Latent Dirichlet Allocation (LDA)" by
David Blei, Andrew Ng, and Michael Jordan. In short, it extends
"probabilistic LSI" into a fully generative model by placing a Dirichlet
prior on each document's topic mixture. I am currently writing some code
to dump Lucene documents into the file format used by Blei's LDA
implementation written in C.
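
For anyone curious what that dump looks like, here is a minimal sketch in
Python rather than the code I am actually writing. It assumes the documents
have already been pulled out of the index as lists of tokens (the Lucene
side is not shown), and the function name to_lda_c is made up for
illustration. As far as I know, lda-c expects one line per document: the
number of unique terms, followed by termid:count pairs with integer term
ids.

```python
from collections import Counter


def to_lda_c(docs):
    """Convert tokenized documents into lda-c style lines.

    docs is an iterable of token lists. This input shape is an assumption:
    extracting tokens from a Lucene index is not shown here.
    Returns (lines, terms), where terms[i] is the term with id i.
    """
    vocab = {}  # term -> integer id, assigned in order of first appearance
    lines = []
    for tokens in docs:
        counts = Counter()
        for tok in tokens:
            # len(vocab) is evaluated before insertion, so new terms get
            # the next unused id.
            term_id = vocab.setdefault(tok, len(vocab))
            counts[term_id] += 1
        pairs = " ".join(f"{tid}:{n}" for tid, n in sorted(counts.items()))
        # Line layout: <number of unique terms> <id>:<count> <id>:<count> ...
        lines.append(f"{len(counts)} {pairs}".rstrip())
    terms = sorted(vocab, key=vocab.get)
    return lines, terms
```

With docs = [["latent", "semantic", "latent"], ["semantic", "index"]] this
gives the lines "2 0:2 1:1" and "2 1:1 2:1". The returned term list would be
written to a separate vocabulary file so the topics can be mapped back to
words afterwards.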


Regards,
Dave.
