No, I don't think your solution is performant. If each Lucene Document corresponds to a row in 'A join B', then the index explodes: the index size increases drastically.
Why not create two indexes, A and B, instead? Search A first, and then use the information from the A documents you get back to search B. That seems more performant to me than indexing all the 'A join B' documents.
Any comments?

Jelda

> -----Original Message-----
> From: Satuluri, Venu_Madhav [mailto:[EMAIL PROTECTED]
> Sent: Thursday, April 13, 2006 6:15 PM
> To: java-user@lucene.apache.org
> Subject: RE: Lucene Seaches VS. Relational database Queries
>
> I think you are asking if we can retain 1:n relationships in lucene.
>
> Ok, I'll go out on a limb and give my solution. Say you have
> a table A and table B with B having multiple rows associated
> to each row in A.
> Also your documents are centered around A, i.e. all your
> queries return some row(s) of A, not B, but you should be
> able to query on fields in B.
>
> In such a case, you need to have multiple documents for each row in A.
> To be more specific, if a row in A has 5 corresponding rows
> in B, then there must be 5 Documents in lucene index
> corresponding to A. In other words, each lucene Document
> corresponds to a row in 'A join B'.
>
> I am not sure of this scheme. If there are more tables, then
> this quickly explodes the no. of documents. We'll have as
> many documents as will be there in {A join B join C join D..
> }. Plus, we'll need to remove Documents which correspond
> logically to the same row in A from the Hits.
>
> Is there a better way to do this? Or I don't make sense?
>
> -----Original Message-----
> From: Ananth T. Sarathy [mailto:[EMAIL PROTECTED]
> Sent: Thursday, April 13, 2006 9:04 PM
> To: java-user@lucene.apache.org
> Subject: Re: Lucene Seaches VS. Relational database Queries
>
> Ok,
> Some of the stuff makes some sense. I was a little loopy from lack of
> sleep and some of these solutions don't really cover my concerns....
>
> Let's take this movie example.
> If each member of a production Crew can have
> multiple titles that come from a lookup table of Distinct Jobs
>
> Titles
> Assistant Producer
> Producer
> Executive Producer
> Director
> Director Trainee
> Stunt Director
>
> In the Database there would be an Association Table linking each Crew
> member to the titles they had
>
> Crew_Titles
> Crew_ID   Title
> 1         Producer
> 1
>
> On 4/12/06, Nadav Har'El <[EMAIL PROTECTED]> wrote:
> >
> > Chris Hostetter <[EMAIL PROTECTED]> wrote on 12/04/2006 01:41:37 AM:
> > > : them in one field). One of the problems I see would be with values that
> > > : overlap (Example, name where one name is Jason Bateman, and one is Jason
> > > : Bateman Black, and it would be hard to replicate the Discrete Search for
> > >
> > > the way field values are "analyzed" is extremely configurable -- down to
> > > the individual field level. Which means that while you can have an actor
> > > field where you can do loose text searching for "bateman" and get back
> > > movies starring "Jason Bateman" and "Jason Bateman Black" (and even "Guido
> > > Batemans" if you use stemming) you can also have another field using a
> > > KeywordAnalyzer such that a record with the values "Jason Bateman" and
> > > "Jack Black" will only be matched if the user searches for "Jason Bateman"
> > > or "Jack Black" ... searching for "Bateman Jack" or "Black Jason" will not
> > > work.
> >
> > Another possible trick is to have one field, but mark its end with special
> > tokens, say "^" and "$", so that "Jason Bateman" gets indexed as four
> > tokens:
> >     ^ Jason Bateman $
> > Then, if you want to search for the name Jason Bateman and that name only,
> > just search for the phrase "^ Jason Bateman $" - and only this entry will
> > match.
> > (you can also continue to search this field normally)
> >
> > If you'll think about this, you'll notice that you don't actually need
> > the beginning-of-field marker ("^") because it's easy to recognize the
> > beginning of a field because the position there is 0. Unfortunately,
> > I don't know how to match position 0 using the standard QueryParser,
> > but you can do it with the SpanFirstQuery: for example if we index
> > Jason Bateman as the three tokens
> >     Jason Bateman $
> > then we can search for it using something like
> >     SpanQuery[] terms = {
> >         new SpanTermQuery(new Term("actor", "Jason")),
> >         new SpanTermQuery(new Term("actor", "Bateman")),
> >         new SpanTermQuery(new Term("actor", "$")) };
> >     new SpanFirstQuery(new SpanNearQuery(terms, 0, true), 3);
> > (or something like that... I didn't test this)
> >
> > --
> > Nadav Har'El
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [EMAIL PROTECTED]
> > For additional commands, e-mail: [EMAIL PROTECTED]
>
> --
> Ananth T Sarathy
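
For illustration only, the two-index idea proposed at the top of this message could be sketched roughly as below. This is plain Java with in-memory maps standing in for the two indexes; it makes no Lucene calls, and the class name, method names, and sample crew names are all hypothetical. It also assumes the lookup starts from B (the job titles) and then resolves the matching IDs against A (the crew members), since the queries in this thread filter on fields of B but return rows of A.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Plain-Java stand-in for two separate indexes; this is not Lucene API.
final class TwoIndexLookup {
    // "Index A": crew ID -> crew member name (one entry per row of A).
    private final Map<Integer, String> crewById = new HashMap<>();
    // "Index B": job title -> IDs of the crew members holding that title.
    private final Map<String, Set<Integer>> crewIdsByTitle = new HashMap<>();

    // Adds one row of A together with its associated rows of B.
    void addCrew(int crewId, String name, String... titles) {
        crewById.put(crewId, name);
        for (String title : titles) {
            crewIdsByTitle.computeIfAbsent(title, t -> new TreeSet<>()).add(crewId);
        }
    }

    // Step 1: look the title up in index B and collect the distinct crew IDs.
    // Step 2: resolve each ID against index A.
    // IDs are kept in a set, so each crew member is returned at most once.
    List<String> crewWithTitle(String title) {
        List<String> names = new ArrayList<>();
        Set<Integer> ids = crewIdsByTitle.get(title);
        if (ids == null) {
            return names; // no crew member holds this title
        }
        for (int crewId : ids) {
            names.add(crewById.get(crewId));
        }
        return names;
    }

    public static void main(String[] args) {
        TwoIndexLookup index = new TwoIndexLookup();
        // Hypothetical sample data; the names are invented for illustration.
        index.addCrew(1, "Alice", "Producer", "Director");
        index.addCrew(2, "Bob", "Producer");
        System.out.println(index.crewWithTitle("Producer")); // prints [Alice, Bob]
        System.out.println(index.crewWithTitle("Director")); // prints [Alice]
    }
}
```

With this layout the number of stored entries grows with the rows of A plus the rows of B rather than with the rows of 'A join B', at the cost of a second lookup per query. Whether that trade-off pays off in a real Lucene setup would need to be measured.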