Hi! I don't think the easy solution will work for me, because I'll have more than two fields in a group - perhaps 6 - 10.
However using span queries looks very promising. I'll investigate that. I see setPositionIncrement() only on the Token object. Is there a way to set this when adding a field to the document, so that the first token get its position pushed away. I would prefer not to modify my analyzer if possible. Thank you. tjk :) On Wed, Jan 13, 2010 at 3:52 PM, Erick Erickson <erickerick...@gmail.com>wrote: > Ooooh, isn't that easier. You just prompted me to think > that you don't even have to do that, just index the pairs as single > tokens (KeywordAnalyzer? but watch out for no case folding)... > > On Wed, Jan 13, 2010 at 4:30 PM, Digy <digyd...@gmail.com> wrote: > > > How about using languages as fieldnames? > > Doc1(Ra): > > Java:5 > > C:2 > > PHP:3 > > > > Doc2(Rb) > > Java:2 > > C:5 > > VB:1 > > > > Query:Java:5 AND C:2 > > > > DIGY > > > > -----Original Message----- > > From: TJ Kolev [mailto:tjko...@gmail.com] > > Sent: Wednesday, January 13, 2010 11:00 PM > > To: java-user@lucene.apache.org > > Subject: Problem: Indexing and searching repeating groups of fields > > > > Greetings, > > > > Let's assume I have to index and search "resume" documents. Two fields > are > > defined: Language and Years. The fields are associated together in a > group > > called Experience. A resume document may have 0 or more Experience > groups: > > > > Ra{ E1(Java,5); E2(C,2); E3(PHP,3);} > > Rb{ E1(Java,2); E2(C,5); E3(VB,1);} > > > > How do I index such documents, and how do I search, so I can formulate a > > query like this "Resumes which have (Java,5) and (C,2)" and get back Ra. > I > > know I can index multiple fields of the same name, and do "(Language:Java > > AND Years:5) AND (Language:C AND Years:2)", but in addition to Ra that > > would > > also return Rb, which I don't want. The problem here is that the > "grouping" > > is lost. I can create fields with compound names (E1Language, E1Years, > > E2Language, E2Years, etc), but that helps me none, as I don't know which > > group to search. I'd also like to query for "(Language:Java AND Years:5) > OR > > (Language:C AND Years:2)" > > > > This is a simplified example. Real documents may have 30 - 40 groups, > each > > one with several fields. Putting all the fields in a group in one index > > field won't work as the numeric/date ones should be available for range > > searchers. > > > > So far the way I see it is to do my own post processing on the results. > The > > issue is that text fields will need to be untokenized, or otherwise it > > would > > be difficult to work on the result, and determine what matches. > > > > Thank you. > > tjk :) > > > > > > --------------------------------------------------------------------- > > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > > For additional commands, e-mail: java-user-h...@lucene.apache.org > > > > >