This strategy can also be nicely abstracted from your main app. Whilst I haven't yet implemented it, my plan is to create a template style structure which tells me which fields are in lucene, and which are externalized. This way I don't bother storing data in lucene that it stored elsewhere, but I also don't bother storing data elsewhere that is available in lucene.
Of course I always have the source data from which I can do a total re-index if necessary. The end result is an abstraction of a "document" which (by using a template) will automatically retrieve data from the index or from external storage based on the meta information in the template. This means the app that uses the index no longer cares where the data is, and it can re-index whenever it likes with confidence that everything will be handled appropriately. Unfortunately you may have transactional problems with this approach. That is, if you write to lucene and in the same logical transaction you write to external storage you may have atomicity problems if one of these actions fails. But you can't have everything! As long as you can "fix" the index if it gets out of sync you should be ok. On 8/11/06, Karel Tejnora <[EMAIL PROTECTED]> wrote:
Jason is right. I think, even Im not expert on lucene too, your newly added document cann't recreate terms for field with analyzer, because field text in empty. There is very hairy solution - hack a IndexReader, FieldInfosWriter and use addIndexes. Lucene is "only" a fulltext search library, not a datastore. I end up with the same design as he suggested - At beginning I choose to store fields in lucene index, because lucene has no limit on fields length like DB2. The Jason's strategy is very useful in all cases - *before adding document* to lucene index obtain an unique id from sequence (e. g. db) add document with this synthetic id (e.g. field id and store it within index) and after add, save document *with id* (if it has not been saved yet) to xml or another storage. Well, I haven't stored syn. ids with indexed documents and now I'm about to reassign them. --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]