Hi Andzej, Thanks for the tip, it does what I want. You are right, though, it's of limited use for helping the user access data. But I'm sure it will come in handy for my own analysis. Best, Ariel
-----Message d'origine----- De : Andrzej Bialecki [mailto:[EMAIL PROTECTED] Envoyé : jeudi 31 août 2006 15:49 À : java-user@lucene.apache.org Objet : Re: graphically representing an index SOMMERIA KLEIN Ariel Ext VIACCESS-BU_DRM wrote: > Hi all, > I'm a newbie with Lucene and I'm looking to implement the following: > I want to index posts from a forum, and, rather than proposing a search > on the contents, graphically represent the contents of the index. More > precisely, I would like to have a list of the most popular words, with a > number next to each indicating how often they occur. > The icing on the cake would be to be able to click on such a word and > get a subset of the posts including that word. > Can Lucene be used for this? Has anyone already implemented it? Any > links? > I've dug around a bit without any success, but my apologies if this has > already been dealt with > > See http://www.getopt.org/luke for an example of such functionality. However, I must disappoint you - the most frequent words in a corpus are quite probably also most useless words. For English these are: the, a, to, for, by, in, can, I, ... So, you will need to eliminate them from the top of the list to get any useful results. -- Best regards, Andrzej Bialecki <>< ___. ___ ___ ___ _ _ __________________________________ [__ || __|__/|__||\/| Information Retrieval, Semantic Web ___|||__|| \| || | Embedded Unix, System Integration http://www.sigram.com Contact: info at sigram dot com --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED] ----------------------------------------- "Privileged/Confidential information may be contained in this e-mail and attachments. This e-mail, including attachments, constitutes non-public information intended to be conveyed only to the designated recipient(s). If you are not an intended recipient, please delete this e-mail, including attachments, and notify us immediately. The unauthorized use, dissemination, distribution or reproduction of this e-mail, including attachments, is prohibited and may be unlawful. In general, the content of this e-mail and attachments does not constitute any form of commitment by VIACCESS SA." ----------------------------------------- --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]