Hi, I have three questions about indexing:
1) I am indexing HTML documents. How can I remove stop words before indexing, so that they are not indexed?
2) I can access the terms in a document, but how can I get the name of the document in which these terms appear?
3) I want to find phrases at the index level, e.g. the frequency of each phrase in the collection and also its frequency in each document. How can I do this in Lucene? Is there any sample code?

Thanks