On 5 Feb 2009, at 09:30, Amin Mohammed-Coleman wrote:

Is there a separate part in the Lucene document where the tokenised strings
are stored, so that Lucene knows where to look?


Yes.

Stored fields are metadata bound to a document, for instance the primary key of the object the Lucene document represents. Note that I call this metadata: it is not the data Lucene looks at when searching.

In order to collect the list of documents matching a query, Lucene navigates an inverted index of string tokens. Usually each word in a string is made into a token, but there are many other tokenisation strategies. (A token is known as a term when associated with a specific field name.)
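
To make the idea concrete, here is a minimal sketch of an inverted index in plain Java. It is not Lucene's implementation or API: the class name ToyInvertedIndex, its methods, and the "field:token" key format are all made up for illustration, and the tokeniser simply lowercases and splits on non-word characters.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical toy class, not part of Lucene.
class ToyInvertedIndex {
    // term ("field:token") -> sorted ids of the documents containing it
    private final Map<String, SortedSet<Integer>> postings = new HashMap<>();

    // Tokenise the text (one token per word) and record docId under each term.
    void add(int docId, String field, String text) {
        for (String token : text.toLowerCase().split("\\W+")) {
            if (token.isEmpty()) continue;
            postings.computeIfAbsent(field + ":" + token, k -> new TreeSet<>()).add(docId);
        }
    }

    // Searching is a lookup by term; no document text is scanned.
    List<Integer> search(String field, String token) {
        SortedSet<Integer> hits = postings.get(field + ":" + token.toLowerCase());
        return hits == null ? new ArrayList<>() : new ArrayList<>(hits);
    }

    public static void main(String[] args) {
        ToyInvertedIndex index = new ToyInvertedIndex();
        index.add(1, "body", "Lucene navigates an inverted index");
        index.add(2, "body", "Stored fields are metadata");
        index.add(3, "body", "An index of string tokens");
        System.out.println(index.search("body", "index")); // prints [1, 3]
    }
}
```

Note that the original strings are not kept anywhere in this structure. That is the role stored fields play: they are a separate place to keep whatever you need to get back from a hit.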

You might want to take a look at this:
http://en.wikipedia.org/wiki/Inverted_index

There is a third way Lucene stores data, the so-called term vector view. This is a per-document cache of the terms in that document. It exists because the inverted index is keyed by term rather than by document, so extracting the terms of a single document from it is very expensive.
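
A rough sketch of why that cache helps, again in plain Java rather than Lucene's own API. The class ToyTermVectors and its methods are hypothetical names used only for this illustration. It keeps both a toy inverted index and a per-document term-to-frequency map, and shows the two ways of answering "which terms are in document N?".

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical toy class, not part of Lucene.
class ToyTermVectors {
    // Inverted index: token -> ids of the documents containing it
    private final Map<String, Set<Integer>> postings = new HashMap<>();
    // Term vectors: document id -> (token -> frequency in that document)
    private final Map<Integer, Map<String, Integer>> termVectors = new HashMap<>();

    void add(int docId, String text) {
        Map<String, Integer> vector = termVectors.computeIfAbsent(docId, k -> new TreeMap<>());
        for (String token : text.toLowerCase().split("\\W+")) {
            if (token.isEmpty()) continue;
            postings.computeIfAbsent(token, k -> new TreeSet<>()).add(docId);
            vector.merge(token, 1, Integer::sum);
        }
    }

    // Slow path: every term in the whole index has to be inspected.
    Set<String> termsViaInvertedIndex(int docId) {
        Set<String> terms = new TreeSet<>();
        for (Map.Entry<String, Set<Integer>> entry : postings.entrySet()) {
            if (entry.getValue().contains(docId)) terms.add(entry.getKey());
        }
        return terms;
    }

    // Fast path: a single lookup by document id.
    Map<String, Integer> termVector(int docId) {
        return termVectors.getOrDefault(docId, new TreeMap<>());
    }

    public static void main(String[] args) {
        ToyTermVectors index = new ToyTermVectors();
        index.add(1, "to be or not to be");
        index.add(2, "to index or not");
        System.out.println(index.termVector(1));            // prints {be=2, not=1, or=1, to=2}
        System.out.println(index.termsViaInvertedIndex(1)); // prints [be, not, or, to]
    }
}
```

Both calls return the same set of terms, but the first costs one map lookup while the second grows with the number of distinct terms in the whole index.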




     karl





