Payload Matching Query
Hi guys,

I am trying to figure out whether there is a query that matches on token payloads, and JUST on token payloads. I don't think such a query exists, so I thought about implementing one, but I don't know where to start.

I've been trying to achieve this by extending CustomScoreQuery, with MatchAllDocsQuery as the subQuery and a CustomScoreProvider that computes the score from term payloads, but I can't get it working :-/

Could you give me any hints on how to do this?

Thanks,
Michal Samek
Re: Payload Matching Query
Hi Adrien,

thanks for your reply. If payloads cannot be used for searching, is there a workaround that achieves similar functionality?

What I'd like to accomplish is to find documents with contents such as
"W. A. Mozart[artist] was born in Salzburg[city]"
just by specifying the payloads [artist] [city].

Thanks,
Michal

2013/6/20 Adrien Grand
> Hi Michal,
>
> Although payloads can be used at query time to customize scoring, they
> can't be used for searching. Lucene only allows searching on terms.
>
> --
> Adrien
Re: Payload Matching Query
Well, with this solution you won't be able to search for nearby occurrences of payloads, as you can with SpanNearQuery :-/ I just need to store some searchable data with terms, not with documents.

But why not implement a totally new Query? I'm very new to Lucene, so I've got no idea what that involves, how indices are structured, or how searching is implemented... Would it be possible?

Michal

2013/6/20 Brendan Grainger
> Any reason not to have separate artist and city fields? So you would search
> for:
>
> artist:(W. A. Mozart) city:Salzburg
>
> HTH
> Brendan
>
> --
> Brendan Grainger
> www.kuripai.com
Re: Payload Matching Query
Eventually, I chose yet another solution. I treat those "payloads" as synonyms: in my TokenFilter, for every token that carries a "payload", I inject a new term containing that "payload", with its PositionIncrementAttribute set to zero.

It solves nearly all my issues =)

Thanks, everyone, for your assistance!

Michal

2013/6/20 Shai Erera
> There are several ways to implement it:
>
> 1. A Query, as you mentioned. You'd need to implement a Scorer which
> traverses the posting list where the payload exists. The methods you
> should implement are nextDoc() and advance(). You'll also need to
> traverse DocsAndPositionsEnum.
>
> 2. A Filter. That's somewhat easier than a Query, I think. In principle it
> works the same way as a Scorer. One benefit is that you can cache filters,
> so if it's common to search for a certain artist, you can cache the
> documents this artist belongs to.
>
> 3. A Collector which filters documents before they are sent to whatever
> other collector aggregates the documents, e.g. TopScoreDocCollector.
> Again, you'll need to traverse the posting list of the term that holds
> the payload, only this time you'll need to implement collect() and
> setNextReader().
>
> Each has pros and cons. They all share the same con, though: the payload
> doesn't help to drive the query. That is, unlike what inverted indexes are
> meant for, this approach cannot quickly tell which documents have
> artist:foo; rather, you need to traverse all documents until you find one.
>
> I would perhaps give another thought to what's been proposed before:
> adding artist and city fields to each document. Perhaps if you tell us
> more about what you're trying to achieve in the end, we can help you
> structure your index better.
>
> Shai
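The token-injection idea in Michal's message can be illustrated with a small plain-Java model. This is only a sketch, not Lucene code: the names AnnotationInjector, Annotated, Token and inject are hypothetical, the __name__ marker format is borrowed from the s:"__artist__ __city__" query quoted elsewhere in this thread, and it assumes one annotation per token. In a real analyzer the same logic would sit inside a TokenFilter and set PositionIncrementAttribute to 0 on the injected term.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Plain-Java model of injecting an annotation term at the same position as its token. */
final class AnnotationInjector {

    /** Input token with an optional annotation (null when the token has none). */
    static final class Annotated {
        final String term;
        final String annotation;

        Annotated(String term, String annotation) {
            this.term = term;
            this.annotation = annotation;
        }
    }

    /** Output token as an index would see it: term text plus position increment. */
    static final class Token {
        final String term;
        final int positionIncrement;

        Token(String term, int positionIncrement) {
            this.term = term;
            this.positionIncrement = positionIncrement;
        }

        @Override
        public String toString() {
            return term + "(+" + positionIncrement + ")";
        }
    }

    /** Emits every input term, followed by a stacked marker term when it is annotated. */
    static List<Token> inject(List<Annotated> input) {
        List<Token> out = new ArrayList<>();
        for (Annotated a : input) {
            out.add(new Token(a.term, 1));
            if (a.annotation != null) {
                // Increment 0 places the marker on the same position as the term,
                // which is how synonyms are represented in a token stream.
                out.add(new Token("__" + a.annotation + "__", 0));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Annotated> input = Arrays.asList(
                new Annotated("w", null),
                new Annotated("a", null),
                new Annotated("mozart", "artist"),
                new Annotated("was", null),
                new Annotated("born", null),
                new Annotated("in", null),
                new Annotated("salzburg", "city"));
        System.out.println(inject(input));
    }
}
```

Because the markers become ordinary terms, they land in the inverted index and can drive a query directly, which avoids the "traverse all documents" drawback Shai describes for the payload-based options.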
Re: Payload Matching Query
@Sujit: That's exactly what I'm talking about; I just might not have put it clearly enough =)

@Uwe: Thanks for the inspiration, I'll check it out!

Thanks, everybody!

Michal

2013/6/21 Uwe Schindler
> You may also be interested in this talk @ BerlinBuzzwords2013:
> http://intrafind.de/tl_files/documents/INTRAFIND_BerlinBuzzwords2013_The-Typed-Index.pdf
>
> Unfortunately the slides are not available.
>
> Uwe
>
> On Friday, June 21, 2013 5:14 PM, Sujit Pal wrote:
> > Hi Michal,
> >
> > Instead of putting the annotations in payloads, why not put them in as
> > "synonyms", i.e. at the same spot as the original string (see
> > SynonymFilter in the LIA book). So your string would look like this
> > (to the index):
> >
> > W. A. Mozart was born in Salzburg
> > __artist__               __city__
> >
> > so you can query as s:"__artist__ __city__"~slop
> >
> > -sujit
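To make the s:"__artist__ __city__"~slop idea concrete, here is a small plain-Java sketch of how stacked marker terms end up at positions and how a proximity check over those positions could work. It is a simplified model under stated assumptions, not Lucene's implementation: MarkerProximity, positions and inOrderWithin are hypothetical names, the token stream is hard-coded, and only in-order matches are considered (Lucene's sloppy phrase matching also allows reordering at extra cost).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Simplified model of a two-term proximity check over token positions. */
final class MarkerProximity {

    /** Builds a term -> positions map from parallel arrays of terms and position increments. */
    static Map<String, List<Integer>> positions(String[] terms, int[] increments) {
        Map<String, List<Integer>> index = new HashMap<>();
        int position = -1; // the first token (increment 1) lands on position 0
        for (int i = 0; i < terms.length; i++) {
            position += increments[i];
            index.computeIfAbsent(terms[i], k -> new ArrayList<>()).add(position);
        }
        return index;
    }

    /** True if some occurrence of second follows first with at most slop positions in between. */
    static boolean inOrderWithin(Map<String, List<Integer>> index,
                                 String first, String second, int slop) {
        List<Integer> firstPositions = index.getOrDefault(first, Collections.<Integer>emptyList());
        List<Integer> secondPositions = index.getOrDefault(second, Collections.<Integer>emptyList());
        for (int p : firstPositions) {
            for (int q : secondPositions) {
                int gap = q - p - 1; // 0 means the terms are adjacent
                if (gap >= 0 && gap <= slop) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Markers carry increment 0, so they share a position with the preceding term.
        String[] terms = {"w", "a", "mozart", "__artist__", "was", "born", "in", "salzburg", "__city__"};
        int[] increments = {1, 1, 1, 0, 1, 1, 1, 1, 0};
        Map<String, List<Integer>> index = positions(terms, increments);

        System.out.println(inOrderWithin(index, "__artist__", "__city__", 3)); // true
        System.out.println(inOrderWithin(index, "__artist__", "__city__", 2)); // false
    }
}
```

In this model __artist__ sits at position 2 and __city__ at position 6, so three positions lie between them and a slop of at least 3 is needed for the pair to match.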