I read through http://searchhub.org/2009/07/18/the-spanquery/, which 
provided a good overview of how to construct fairly complex span queries. 
I was particularly interested in the ability to nest span queries, and I'm 
trying to apply this to a field that contains some structure (shown 
below). I have a couple of other fields with deeper nesting, but this 
should give the general idea.

authors
  author [one or more]
    first name
    last name

Prior to indexing the content with Lucene, I added some 'markers' around the 
various bits I might want to search.  For example, 'bauthor' marks the 
beginning of an author, 'eauthor' marks the end of an author, and 'sauthor' 
is a separator between individual authors (used as the exclude clause of a 
SpanNot query).  I do the same for 'first name' and 'last name'.
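
To make the marker scheme concrete, here is a minimal sketch of how such a 
field value could be assembled before indexing.  The class and method names 
(AuthScopeBuilder, wrap, build) are hypothetical and only illustrate the 
idea; the token order (last name, then first name) mirrors the sample 
documents further down, and names are assumed to be lowercased already.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not part of Lucene): wraps each author's names in
// begin/end/separator marker tokens before the text is indexed.
final class AuthScopeBuilder {

    // Wraps one value as "b<tag> value e<tag> s<tag>".
    static String wrap(String tag, String value) {
        return "b" + tag + " " + value + " e" + tag + " s" + tag;
    }

    // Each author is a {lastName, firstName} pair.
    static String build(List<String[]> authors) {
        StringBuilder sb = new StringBuilder();
        for (String[] a : authors) {
            String names = wrap("lname", a[0]) + " " + wrap("fname", a[1]);
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(wrap("author", names));
        }
        return wrap("authors", sb.toString());
    }

    public static void main(String[] args) {
        // Prints the marker-delimited value for two authors.
        System.out.println(build(Arrays.asList(
                new String[] {"mcbeath", "darin"},
                new String[] {"fulford", "darin"})));
    }
}
```

Apart from extra whitespace, the value built in main corresponds to the 
'authscope' content of the second sample document.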

My constructed query (as interpreted by Lucene) is included below.  It was 
extracted from the 'parsed string' returned for the query when I set 
debug=true.  Within a given 'authscope' field, I want to match only when the 
first name 'darin' and the last name 'fulford' occur within the same 
'author'.

spanNot(
    spanNear(
        [authscope:bauthor, 
        spanNear(
            [spanNot(
                spanNear(
                    [authscope:bfname, 
                    authscope:darin, 
                    authscope:efname], 
                    2147483647, true), 
                authscope:sfname, 0, 0), 
             spanNot(
                spanNear(
                    [authscope:blname, 
                    authscope:fulford, 
                    authscope:elname], 
                    2147483647, true), 
                authscope:slname, 0, 0)], 
             2147483647, false), 
         authscope:eauthor], 
         2147483647, true), 
     authscope:sauthor, 0, 0)

I have loaded the following two documents into my index.

[
  {"id":"1", "authscope":" bauthors  bauthor blname mcbeath elname slname  
bfname  darin efname sfname  eauthor sauthor  bauthor blname  fulford elname 
slname  bfname  darby efname sfname  eauthor sauthor  bauthor blname  mcbeath 
elname slname  bfname  darby efname sfname  eauthor sauthor  eauthors sauthors 
"},
  {"id":"2", "authscope":" bauthors  bauthor blname  mcbeath elname slname  
bfname  darin efname sfname  eauthor sauthor  bauthor blname  fulford elname 
slname  bfname  darin efname sfname  eauthor sauthor  eauthors sauthors "}
]

What I can't figure out is why the above query matches both documents.  
It should match only the document with id:2.
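
To pin down the result I expect, the intended rule can be stated without 
Lucene at all: a document matches only if a single bauthor..eauthor block 
contains both names.  The sketch below is a plain-Java reference check, not 
a span query; the class and method names (AuthorScopeCheck, hasRun, matches) 
are hypothetical, and it assumes each name is a single token sitting 
directly between its begin and end markers.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical reference check (plain Java, no Lucene): reports whether any
// single bauthor..eauthor block holds both the first and the last name.
final class AuthorScopeCheck {

    // True if tokens[from, to) contain the adjacent run: begin, value, end.
    static boolean hasRun(List<String> tokens, int from, int to,
                          String begin, String value, String end) {
        for (int i = from; i + 2 < to; i++) {
            if (tokens.get(i).equals(begin)
                    && tokens.get(i + 1).equals(value)
                    && tokens.get(i + 2).equals(end)) {
                return true;
            }
        }
        return false;
    }

    static boolean matches(String authscope, String first, String last) {
        List<String> t = Arrays.asList(authscope.trim().split("\\s+"));
        int start = -1; // index of the currently open bauthor, or -1
        for (int i = 0; i < t.size(); i++) {
            if (t.get(i).equals("bauthor")) {
                start = i;
            } else if (t.get(i).equals("eauthor") && start >= 0) {
                // Both names must fall inside this one author block.
                if (hasRun(t, start, i, "bfname", first, "efname")
                        && hasRun(t, start, i, "blname", last, "elname")) {
                    return true;
                }
                start = -1;
            }
        }
        return false;
    }
}
```

Calling matches(authscope, "darin", "fulford") on the two 'authscope' values 
above returns false for id:1 and true for id:2, which is the behaviour I am 
trying to get from the span query.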


Any insights would be appreciated.  I'm using Lucene 4.7.2.

Darin.
