You can check out the file format of Lucene's term dictionary here:
http://lucene.apache.org/java/docs/fileformats.html#Term%20Dictionary
That might give you some insight.
As far as I can tell, though, Lucene does not keep IDs for terms, only for
documents, and even then the ID is really just an offset.
I want to have IDs for the terms (words), not the documents!
Also, I need the same ID for a word if it appears in more than one document.
Example:
Doc1: The sea is blue
Doc2: Sky is blue
For these two docs the dictionary would be [the]->1 [sea]->2 [is]->3
[blue]->4 [sky]->5
So I want to represen
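The mapping in the example above can be sketched outside of Lucene as a plain lookup table. This is only a rough illustration, not something Lucene provides: the class name TermIdDictionary and its methods are made up for this sketch, and it assumes whitespace tokenization and lowercasing, which a real analyzer may not match. The IDs stay stable only as long as you keep (and persist) this map yourself.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of Lucene: assigns each distinct term
// the next free integer ID the first time the term is seen.
class TermIdDictionary {
    // Insertion-ordered map from normalized term to its ID (IDs start at 1).
    private final Map<String, Integer> ids = new LinkedHashMap<>();

    // Returns the existing ID for a term, or assigns a new one.
    public int idFor(String term) {
        String key = term.toLowerCase(Locale.ROOT);
        Integer id = ids.get(key);
        if (id == null) {
            id = ids.size() + 1;
            ids.put(key, id);
        }
        return id;
    }

    // Splits the text on whitespace (an assumption; a real analyzer may
    // tokenize differently) and returns the ID of each token in order.
    public int[] addDocument(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            return new int[0];
        }
        String[] tokens = trimmed.split("\\s+");
        int[] result = new int[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            result[i] = idFor(tokens[i]);
        }
        return result;
    }

    // Number of distinct terms seen so far.
    public int size() {
        return ids.size();
    }

    public static void main(String[] args) {
        TermIdDictionary dict = new TermIdDictionary();
        // Doc1 -> [1, 2, 3, 4]
        System.out.println(Arrays.toString(dict.addDocument("The sea is blue")));
        // Doc2 -> [5, 3, 4]
        System.out.println(Arrays.toString(dict.addDocument("Sky is blue")));
    }
}
```

With the two example documents, "is" and "blue" get the same IDs in both, and "sky" gets the next free ID. Because IDs are never reused or reassigned here, deleting a document does not change the ID of any existing term.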
The id does change. You need to index your own "id" field with the document.
Ilias Flaounas wrote:
Dear experts,
I need to store and index a string of text into Lucene, and later I
want to get the Id of each term inside this string. Is it possible?
How can I do that?
I want a unique association, term (in my case a word) -> ID. I know
that if I delete a document, the dictionary changes. Does t