Re: Get term id from dictionary

2007-10-31 Thread Mark Miller
You can check out the file format of Lucene's term dictionary here: http://lucene.apache.org/java/docs/fileformats.html#Term%20Dictionary That might give you some insight. As far as I can tell, though, Lucene does not keep IDs for terms, just for documents, and then the id is really just an offset

Re: Get term id from dictionary

2007-10-31 Thread Ilias Flaounas
I want to have IDs for the terms (words), not the documents! Also, I need the same ID for a word if it appears in more than one document. Example:
Doc1: The sea is blue
Doc2: Sky is blue
For these two docs the dictionary would be:
[the]->1 [sea]->2 [is]->3 [blue]->4 [sky]->5
So I want to represen
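
The mapping described in this message can be sketched outside Lucene as a plain in-memory dictionary. This is only a minimal illustration, not Lucene's API: the `TermDictionary` class, its `idFor` and `encode` methods, the lowercasing, and the whitespace tokenization are all assumptions made for the example, and a real Lucene Analyzer would likely tokenize differently.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Lucene: gives each distinct term one ID.
class TermDictionary {
    // Insertion-ordered map from normalized term to its assigned ID.
    private final Map<String, Integer> ids = new LinkedHashMap<>();

    // Returns the existing ID for a term, or assigns the next free one (starting at 1).
    int idFor(String term) {
        String key = term.toLowerCase(Locale.ROOT);
        Integer id = ids.get(key);
        if (id == null) {
            id = ids.size() + 1;
            ids.put(key, id);
        }
        return id;
    }

    // Naive whitespace tokenization (an assumption for this sketch).
    List<Integer> encode(String text) {
        List<Integer> out = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                out.add(idFor(token));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        TermDictionary dict = new TermDictionary();
        System.out.println(dict.encode("The sea is blue")); // [1, 2, 3, 4]
        System.out.println(dict.encode("Sky is blue"));     // [5, 3, 4]
    }
}
```

Because the IDs live in a structure the application owns, they stay the same when documents are added or deleted, provided the map itself is persisted alongside the index.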

Re: Get term id from dictionary

2007-10-31 Thread Mark Miller
The id does change. You need to index your own "id" field with the document.

Ilias Flaounas wrote: Dear experts, I need to store and index a string of text into Lucene, and later I want to get the Id of each term inside this string. Is it possible? How can I do that? I want a unique associati

Get term id from dictionary

2007-10-31 Thread Ilias Flaounas
Dear experts, I need to store and index a string of text in Lucene, and later I want to get the ID of each term inside this string. Is it possible? How can I do that? I want a unique association, term (in my case a word) -> ID. I know that if I delete a document, the dictionary changes. Does t