An IndexReader doesn't see changes in the index unless you close and reopen it, but if a significant amount of time passes between fetching your doc ID and reading its term vector (long enough for the reader to be reopened), that could be a problem.
You can always use TermEnum/TermDocs to find the doc ID associated with a particular field value you have, but I suspect that suffers from the same problem. In fact, *anything* you do between fetching the doc ID and getting its term vector has this problem, and there's no way I know of to get term vectors by your own ID.

What might work is a "sanity check" sort of algorithm. That is, fetch the doc ID, then fetch its term vector, then look at your custom field for that doc ID and see if it matches the original. If not, do it all over again.

But that all seems too complicated to me. Why not just ensure that you use the *same* IndexReader both when you get the original doc ID and when you get its term vector? Even a temporary reference should hold things open long enough to ensure that atomicity.

Best
Erick

On 5/21/07, Walter Ferrara <[EMAIL PROTECTED]> wrote:
I'm interested in getting the term vector of a Lucene doc. The point is, it seems I have to give IndexReader.getTermFreqVector a doc ID, while I would like to know if there is a way to get the term vector by a doc identifier (not the Lucene doc ID, but my own field). I know how to get the Lucene doc ID for the doc I'm interested in, but my concern is about the non-atomicity of getting an ID and passing it to another function. This is because I reload the index from time to time, and I'm worried about a loss of consistency if the new IndexReader remaps doc IDs (after a deletion, for example) and I end up accessing a different doc, just because between "get the ID" and "get the term vector for that ID" the reader could have been reloaded (and the doc IDs changed).

Best,
Walter
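
For anyone reading this thread later, here is a rough sketch of the two options discussed above: pinning one reader for both steps, and the "sanity check" retry loop. It deliberately does not call Lucene. `Snapshot`, `ListSnapshot` and `TermVectorLookup` are hypothetical names invented for this illustration, where a `Snapshot` plays the role of one IndexReader instance and a list position plays the role of the doc ID. Treat it as a model of the logic under those assumptions, not as working Lucene code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical stand-in for one point-in-time view of the index
// (the role a single IndexReader instance plays). Not Lucene API.
interface Snapshot {
    int findDocId(String key);           // doc ID for our own identifier, or -1
    String storedKey(int docId);         // our own identifier stored on that doc, or null
    List<String> termVector(int docId);  // terms of that doc
}

// Toy in-memory snapshot: the list position is the doc ID.
final class ListSnapshot implements Snapshot {
    private final List<String> keys = new ArrayList<>();
    private final List<List<String>> vectors = new ArrayList<>();

    ListSnapshot add(String key, String... terms) {
        keys.add(key);
        vectors.add(Arrays.asList(terms));
        return this;
    }

    public int findDocId(String key) {
        return keys.indexOf(key);
    }

    public String storedKey(int docId) {
        return docId >= 0 && docId < keys.size() ? keys.get(docId) : null;
    }

    public List<String> termVector(int docId) {
        return vectors.get(docId);
    }
}

final class TermVectorLookup {
    // Option 1: hold one snapshot in a local variable for both steps,
    // so the doc ID cannot be remapped in between.
    static List<String> viaPinnedSnapshot(Snapshot pinned, String key) {
        int docId = pinned.findDocId(key);
        return docId < 0 ? null : pinned.termVector(docId);
    }

    // Option 2: the "sanity check" loop, for when the two steps may see
    // different snapshots. Retry whenever the stored key does not match.
    static List<String> viaSanityCheck(Supplier<Snapshot> current, String key, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            int docId = current.get().findDocId(key);
            if (docId < 0) {
                return null; // no such document in the snapshot we looked at
            }
            Snapshot later = current.get(); // may already be a reopened view
            if (key.equals(later.storedKey(docId))) {
                return later.termVector(docId);
            }
        }
        throw new IllegalStateException("index kept changing while looking up " + key);
    }
}
```

The first method is the simpler one and matches the advice in the reply: as long as the caller keeps its own reference to the reader for the duration of the lookup, a reload elsewhere cannot change what that reference sees. The retry loop only earns its extra complexity if you truly cannot hold on to a single reader.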