Re: graphically representing an index

Andrzej Bialecki Thu, 31 Aug 2006 06:49:44 -0700

SOMMERIA KLEIN Ariel Ext VIACCESS-BU_DRM wrote:

Hi all,
I'm a newbie with Lucene and I'm looking to implement the following:
I want to index posts from a forum, and, rather than proposing a search
on the contents, graphically represent the contents of the index. More
precisely, I would like to have a list of the most popular words, with a
number next to each indicating how often they occur. The icing on the
cake would be to be able to click on such a word and get a subset of the
posts including that word. Can Lucene be used for this? Has anyone
already implemented it? Any links?
I've dug around a bit without any success, but my apologies if this has
already been dealt with.

See http://www.getopt.org/luke for an example of such functionality.
However, I must disappoint you - the most frequent words in a corpus are
quite probably also the most useless words. For English these are: the, a,
to, for, by, in, can, I, ...
So, you will need to eliminate them from the top of the list to get any
useful results.
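
To make the filtering step concrete, here is a minimal sketch in plain
Python. It does not use the Lucene API or show how Luke does it; it only
illustrates the idea of counting terms while skipping stop words, and of
keeping a word-to-posts map so that a click on a word can return the
matching posts. The names STOP_WORDS, tokenize and build_index, the tiny
stop-word list, and the sample posts are all assumptions made up for
this illustration.

```python
import re
from collections import Counter, defaultdict

# Hypothetical, deliberately tiny stop-word list (assumption for this sketch);
# a real list for English would be much longer.
STOP_WORDS = {"the", "a", "to", "for", "by", "in", "can", "i"}


def tokenize(text):
    """Lowercase the text and split it into runs of letters/apostrophes."""
    return re.findall(r"[a-z']+", text.lower())


def build_index(posts):
    """Return (word counts, word -> set of post ids), skipping stop words."""
    counts = Counter()
    postings = defaultdict(set)
    for post_id, text in posts.items():
        for word in tokenize(text):
            if word in STOP_WORDS:
                continue  # drop the frequent-but-useless words
            counts[word] += 1
            postings[word].add(post_id)
    return counts, postings


# Made-up sample posts, keyed by a hypothetical post id.
posts = {
    1: "I can index the forum posts in Lucene",
    2: "The index is built by Lucene for the forum",
    3: "A search in the index",
}

counts, postings = build_index(posts)
top = counts.most_common(3)  # the most popular words with their counts

if __name__ == "__main__":
    for word, n in top:
        print(word, n, sorted(postings[word]))
```

The list in top is what you would display, and postings[word] is the
subset of posts to show when that word is clicked. With a real Lucene
index the counts would come from the index's term statistics rather than
from re-tokenizing the posts.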


--
Best regards,
Andrzej Bialecki     <><
___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com


