> Why go through all this effort when it's easy to make your > own unique ID? > Add a new field to each document "myuniqueid" and fill it in > yourself. It'll > never change then.
I am sorry I did not mention in my post that I am aware of this solution but that it cannot be used for my purposes. I need the stable ID during filtering and afterwards for counting for faceted browsing. My tests show, and from the documentation/mailing lists I conclude, that retrieving a stable ID from a field during filtering and for each hit in a query result is too expensive. After your post I put some more thought into storing a stable ID in a field. I figured I could read all the stable IDs once and create a map from Lucene IDs to stable IDs. But this takes too long (> 600 ms on my laptop) for a small set of documents (< 150,000). Another problem is that I have to do millions of additional lookups during filtering and counting. > Of course, I may misunderstand your problem space enough that this is > useless. If so, please tell us the problem you're trying to > solve and maybe > wiser heads than mine will have better suggestions.... Here is a description of our problem. We want to build a repository that can handle a number of documents that is in the low millions (we are designing the repository for 10 million documents intially). Almost all navigation through this repository will be faceted. For this we need to be able to filter based on the facet values selected by the user, and we have to count how many documents in the search result have a particular facet value for multiple (estimation: 25-40) facet values. The documents in the repository are constantly changed and we want the faceted navigation to be updated in near real-time: if the user refreshes a search page after making changes to a document, the changes should be visible. I estimate we have about 250-500 ms, the time it takes to go to another page and refresh it, to update the index(es). My idea is to use Lucene for regular searching, and use a custom index for filtering based on facets and for counting the number of matches for facet values. For this to work, (reasonably) stable IDs are needed so updating the facet value index is simply changing values in doing a number of arrays. I am willing to sacrifice search performance for stable IDs if it gains performance in faceted filtering and counting. Johan --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]