: There are many other ways to approach this and my recommendation is : just the simplest one based on the description of your needs.
I'd like to add one thing to Erik's excellent advice, something many people (especially people use to dealing with rgorously structured data) tend to overlook: Documents in a Lucene index can be heterogenous. You can have some documents in your index with fields A, B, and C; and you can have other documents in your index with field X, Y and Z. And in some cases you can have a search result of docs from that first set of documents, in other cases you can have a search result with docs from the second set, or you can return a mixed bag of both -- it all depends on how you structure your query, and which fields you search on. Consider a simplified example of your orriginal question: What if you had products with specs, and reviews of produduts. You could have one document per review, indexing all the text of the review, and the product Ids of the products mentioned in it. You can also have one document per product, indexing all the spec data. If a user does a search for "dell" you can return results containing a mix of products and reviews that contain the word dell. By making the product Id field stored and indexed in all documents, you can even provide links next to reviews to see all the products mentioned in the review, or next to a product to see all reviews that mention that product. There's no need to limit yourself and say "based on my data, I will make one document per X in my data model." you can make one doc per X, and one doc per Y, and one doc per Z -- all depending on the desired behavior at search time. -Hoss --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]