>> For an example, in the phrase "A man saw a elephant" "saw" has
>> annotations as follows (we also say that its position in index is 1234):
>>
>> {lemma: see, pos: verb, tense: past},
>> {lemma: saw, pos: noun, number: singular}
>>
>> I think, it would be more effective to insert parse index in each
>> attribute's posting list entry as a payload and use it at the
>> intersection stage. E.g., we have a posting list for 'pos = Verb' like
>> ...|...|1.1234|...|..., and a posting list for 'number = Singular':
>> ...|...|2.1234|...|... While processing a query like
>> 'pos = Verb AND number = singular' at all stages of posting list
>> processing 'x.1234' will be accepted until the intersection stage at
>> which they will be rejected because of non-corresponding parse indexes.

We're working on something very similar. Are there really posting lists
like this (e.g., separate lists for pos=Verb and number=Singular) for
things stored in Payloads? I think an earlier discussion said this kind of
posting list is not available, and I couldn't find anything like it in the
documentation about the index format. If there are, this would be really
efficient.
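
To make the quoted proposal concrete, here is a minimal sketch in plain Java. It does not use the Lucene API and says nothing about whether Lucene actually exposes per-attribute posting lists like this. The class and method names (ParseAwareIntersection, Posting, intersect) and the extra entry at position 17 are hypothetical, made up for illustration. Each entry carries a token position plus a parse index, and the intersection keeps an entry only when both match, so 1.1234 and 2.1234 share a position but are still rejected.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ParseAwareIntersection {

    /** One posting-list entry: a token position plus the parse index carried as its payload. */
    static final class Posting implements Comparable<Posting> {
        final int position;
        final int parse;

        Posting(int position, int parse) {
            this.position = position;
            this.parse = parse;
        }

        /** Orders entries by position first, then by parse index. */
        @Override
        public int compareTo(Posting other) {
            int byPosition = Integer.compare(position, other.position);
            return byPosition != 0 ? byPosition : Integer.compare(parse, other.parse);
        }

        /** Renders the entry in the "parse.position" notation used in the quoted mail. */
        @Override
        public String toString() {
            return parse + "." + position;
        }
    }

    /** Two-pointer intersection of two lists that are both sorted by (position, parse). */
    static List<Posting> intersect(List<Posting> a, List<Posting> b) {
        List<Posting> hits = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < a.size() && j < b.size()) {
            int cmp = a.get(i).compareTo(b.get(j));
            if (cmp == 0) {
                // Same position AND same parse index: the attributes belong to one parse.
                hits.add(a.get(i));
                i++;
                j++;
            } else if (cmp < 0) {
                i++;
            } else {
                j++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // "saw" at position 1234: parse 1 is the verb reading, parse 2 the noun reading.
        List<Posting> posVerb = Arrays.asList(new Posting(17, 1), new Posting(1234, 1));
        List<Posting> posNoun = Arrays.asList(new Posting(1234, 2));
        List<Posting> numberSingular = Arrays.asList(new Posting(1234, 2));

        // pos = Verb AND number = Singular: same position, different parse, so no hit.
        System.out.println(intersect(posVerb, numberSingular)); // prints []
        // pos = Noun AND number = Singular: both attributes come from parse 2.
        System.out.println(intersect(posNoun, numberSingular)); // prints [2.1234]
    }
}
```

The sketch assumes each list is sorted by position and then by parse index, so a single merge pass is enough and no per-token payload decoding happens outside the intersection step.
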

> You might be able to insert your parses as payloads on a term and then
> implement a scorer extension (override computePayloadFactor) to handle
> your join cases for a given word. You may also need to extend
> PayloadQuery or PayloadTermQuery. Note, I don't know how well this will
> perform.

We've done it this way before, storing a slightly different set of
information in the Payload. I thought, though, that making use of a Payload
requires you to iterate through all the tokens, whether in the Analyzer
(i.e., in a TokenFilter) or in the Similarity (in an overridden
scorePayload() function). If I'm right, then filtering at intersection time
might not be quite as efficient as you're describing, but it's definitely a
reasonable way to do it.

stephen

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org