Sachin, As the merging of the results is the issue, I'll assume that you don't have clear user requirements for that.
The simplest way out of that is to allow the users to search the B's first, and once they have determined which B's they'd like to use, use those B's to limit the results in of user searches in A. That would normally be done by a filtering on B, much like RangeFilter. Caching that filter allows for quick repeated searches in A. Is that what the users want? For each normalization a filter can be used to search across it. One feature of filters is that the original score is lost. Would you have user requirements related to this? As the texts of A and B are the problem for reindexing, you may want to index these separately: one index for Aid+Atext, and one for Bid+Btext. That leaves the A-B 1-n association: one more index for Aid+Bids. In this last one you could also put a small text field of A. Denormalizing the Btext into Aid+Bids as Aid+Bids+Btexts can make it difficult for the users to explicitly select the B's. OTOH it makes it easy to implicitly select the B's. What do the users want? Each id field will have to be indexed to allow filtering, and stored to allow retrieval for filtering in another index. Retrieving stored fields is normally a performance bottleneck, so a FieldCache might be handy. Regards, Paul Elschot On Thursday 10 January 2008 12:58:44 sachin wrote: > Here are more details about my issue. > > I have two tables in database. A row in table 1 can have multiple rows > associated with it in table 2. It is a one to many mapping. > Let's say a row in table 1 is A and it has multiple rows B1, B2 and B3 > associated with it in table 2. I need to search on both A and B types > and the result should have A and all the Bs associated with it. Also for > your information, A and Bs are long text in database. > > I could have two approaches for indexing/searching > > First approach is to create the index in denormalized form. In this case > document would be like A, B1, B2, B3. The issue with this approach is > that any modification to any row would require me to re-index the > document again and fetch A and all Bs again from database. This is a > heavy process. > > The other approach is to index A, B1, B2 and B3 in different documents > and after search merge the results. This makes my re-indexing lighter > but I need to put extra logic to merge the results. For this type of > index I would require self join kind of query from lucene. Query can be > written by using boolean query but merging of two type of documents is a > issue. If I go by this approach for indexing, what is the best way to > fetch the results? > > I hope I have made myself clear. > > Thanks > Sachin > > > > On Tue, 2008-01-08 at 20:13 +0530, Developer Developer wrote: > > Provide more details please. > > > > Can you not use boolean query and filters if need be ? > > > > > > > > On Jan 8, 2008 7:23 AM, sachin <[EMAIL PROTECTED]> wrote: > > > > I need to write lucene query something similar to SQL self > > joins. > > > > My current implementation is very primitive. I fire first > > query, get the > > results, based on the result of first query I fire second > > query and then > > merge the results from both the queries. The whole processing > > is very > > expensive. Doing this is very easy with SQL query as we need > > to just > > write self join query and database do the rest for you. > > > > What is the best way of implementing the above functionality > > in lucene? > > > > Regards > > Sachin > > > > > > > > --------------------------------------------------------------------- > > To unsubscribe, e-mail: > > [EMAIL PROTECTED] > > For additional commands, e-mail: > > [EMAIL PROTECTED] > > > > > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [EMAIL PROTECTED] > For additional commands, e-mail: [EMAIL PROTECTED] > > > --------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]