Hi, 

I have three questions about indexing:

1) I am indexing HTML documents. How can I remove stop
words before indexing? I don't want stop words to end
up in the index.

2) I can access the terms in a document, but how can I
get the name of the document in which these terms
appear?

3) I want to find phrases at the index level, e.g. the
frequency of phrases in the whole collection and also
their frequency in each document. How can I do this in
Lucene? Is there any sample code?

Thanks



 