Take a look at Luke (http://www.getopt.org/luke/). I think it does a lot of what you're asking for. It's open source, so you can see how it's done, and there are screenshots at the link above so you can check whether it's actually what you want.
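To make the idea concrete, here is a rough plain-Java sketch of the two things you describe: a ranked list of the most common words, and a lookup from a word to the posts that contain it. It deliberately does not use Lucene; TermStatsSketch and its methods are names I made up for illustration, and the tokenizing is naive. In Lucene itself I'd expect the same information to come from TermEnum (each term plus its docFreq) and TermDocs (the documents for a term), but check the javadoc for your version. Note that this counts the number of posts containing a word, not the total number of occurrences.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Plain-Java illustration only: none of these names are Lucene API.
class TermStatsSketch {
    // term -> ids of the posts containing it (a toy inverted index),
    // kept in alphabetical order by the TreeMap
    private final Map<String, TreeSet<Integer>> postings = new TreeMap<>();

    // Tokenize naively on non-letter characters; a real analyzer does more.
    void addPost(int postId, String text) {
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+")) {
            if (!token.isEmpty()) {
                postings.computeIfAbsent(token, t -> new TreeSet<>()).add(postId);
            }
        }
    }

    // Number of posts containing the term (document frequency).
    int docFreq(String term) {
        TreeSet<Integer> ids = postings.get(term);
        return ids == null ? 0 : ids.size();
    }

    // Ids of the posts containing the term, in ascending order.
    List<Integer> postsContaining(String term) {
        TreeSet<Integer> ids = postings.get(term);
        return ids == null ? new ArrayList<>() : new ArrayList<>(ids);
    }

    // The n terms found in the most posts. The sort is stable, so terms
    // with equal counts stay in alphabetical order.
    List<String> topTerms(int n) {
        List<String> terms = new ArrayList<>(postings.keySet());
        terms.sort((a, b) -> Integer.compare(docFreq(b), docFreq(a)));
        return new ArrayList<>(terms.subList(0, Math.min(n, terms.size())));
    }

    public static void main(String[] args) {
        TermStatsSketch index = new TermStatsSketch();
        index.addPost(1, "Lucene index question");
        index.addPost(2, "Another Lucene question about the index");
        for (String term : index.topTerms(3)) {
            System.out.println(term + " " + index.docFreq(term)
                    + " " + index.postsContaining(term));
        }
    }
}
```

The "click on a word" part is then just postsContaining(word). For a real forum you would want stop-word filtering, otherwise words like "the" will dominate the list.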
You might also want to look at the Term* classes in the API, particularly TermDocs, TermEnum, TermFreqVector, TermPositionVector and TermPositions. I'm quite sure all the information is there; the interesting part will probably be putting it all together efficiently <G>.

Hope this helps,
Erick

On 8/31/06, SOMMERIA KLEIN Ariel Ext VIACCESS-BU_DRM <[EMAIL PROTECTED]> wrote:
Hi all,

I'm a newbie with Lucene and I'm looking to implement the following: I want to index posts from a forum and, rather than offering a search over the contents, graphically represent the contents of the index.

More precisely, I would like to have a list of the most popular words, with a number next to each indicating how often it occurs. The icing on the cake would be to be able to click on such a word and get the subset of posts that include it.

Can Lucene be used for this? Has anyone already implemented it? Any links? I've dug around a bit without any success, but my apologies if this has already been dealt with.