Here is an example of the end position and positionLength coming out of SGF.

query: natural forest

WT      text     start  end  positionLength  type  position
        natural  0      7    1               word  1
        forest   8      14   1               word  2
...

SPF     text     start  end  positionLength  type     position
        natural  0      7    1               word     1
 natural forest  0      14   2               shingle  2
        forest   8      14   1               word     3

SGF      text     start  end  positionLength  type     position
         natural  0      7    1               word     1
       naturwald  0      14   1               SYNONYM  2
 forêt naturelle  0      14   1               SYNONYM  2
natürlicher wald  0      14   1               SYNONYM  2
  natural forest  0      14   1               shingle  2
          forest  8      14   1               word     3

SPF        text     start  end  positionLength  type     position
           natural  0      7    1               word     1
         naturwald  0      9    1               SYNONYM  2
 "forêt naturelle"  0      17   2               SYNONYM  2
"natürlicher wald"  0      18   2               SYNONYM  2
  "natural forest"  0      16   2               shingle  2
            forest  8      14   1               word     3


SGF (SynonymGraphFilter) gives every SYNONYM token the same end position and
the same positionLength, including the multi-word synonyms.
I suppose that is not correct?
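
To make the concern concrete, here is a minimal sketch that re-reads the SGF
table above as graph arcs. It relies only on the documented meaning of
positionLength (the number of positions a token spans), so a token at
position p covers positions p to p + positionLength. The names Token, arcs
and multiword_with_length_one are hypothetical helpers made up for this
illustration and are not part of Lucene. Whether a flagged token is actually
wrong depends on what the filter is expected to emit, so treat the result as
a list of candidates to inspect rather than as confirmed bugs.

```python
from typing import List, NamedTuple, Tuple


class Token(NamedTuple):
    """One row of the analysis table (hypothetical helper type)."""
    text: str
    start: int
    end: int
    position_length: int
    type: str
    position: int


# Rows copied from the SGF table in this mail.
SGF_TOKENS = [
    Token("natural", 0, 7, 1, "word", 1),
    Token("naturwald", 0, 14, 1, "SYNONYM", 2),
    Token("forêt naturelle", 0, 14, 1, "SYNONYM", 2),
    Token("natürlicher wald", 0, 14, 1, "SYNONYM", 2),
    Token("natural forest", 0, 14, 1, "shingle", 2),
    Token("forest", 8, 14, 1, "word", 3),
]


def arcs(tokens: List[Token]) -> List[Tuple[str, int, int]]:
    """Return (text, from_position, to_position) for every token."""
    return [(t.text, t.position, t.position + t.position_length) for t in tokens]


def multiword_with_length_one(tokens: List[Token]) -> List[str]:
    """List tokens whose text has several words but positionLength == 1."""
    return [t.text for t in tokens if " " in t.text and t.position_length == 1]


if __name__ == "__main__":
    for text, src, dst in arcs(SGF_TOKENS):
        print(f"{src} -> {dst}  {text}")
    print("candidates:", multiword_with_length_one(SGF_TOKENS))
```

For the SGF rows this flags "forêt naturelle", "natürlicher wald" and
"natural forest", which are the tokens in question.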

Regards
Bernd


On 09.02.2017 at 18:39, Michael McCandless wrote:
> On Thu, Feb 9, 2017 at 2:40 AM, Bernd Fehling
> <bernd.fehl...@uni-bielefeld.de> wrote:
>> I tried SynonymGraphFilter with my setup and it works right away.
>> It paid off that I did some modifications on my filters while
>> testing 6.3 with my setup.
> 
> Good!
> 
>> I only replaced SynonymFilter with SynonymGraphFilter and did not
>> use FlattenGraphFilter, pretty simple. So I can confirm that, up
>> to this point, SynonymGraphFilter is a full replacement for
>> SynonymFilter. At least for search-time synonym handling.
>>
>> But this also means there is still some work with the attributes, right?
>> Position looks good, type and start are no problem anyway, but
>> the end position is still wrong and the positionLength for multi-word
>> synonyms.
> 
> Can you give an example or make a small test case?
> PositionLengthAttribute is supposed to be correct coming out of
> SynonymGraphFilter.
> 
>> One thing I noticed was that the originating token which "produces"
>> synonyms comes out last from SynonymGraphFilter, after the
>> "produced" synonyms.
>> I will have a look inside with debugger but I guess this is due
>> to output buffering of SynonymGraphFilter?
> 
> Yeah they do come out in a different order, which token filters are
> allowed to do in general for all tokens leaving from the same position
> ...
> 
> Mike McCandless
> 
> http://blog.mikemccandless.com
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org
> 

-- 
*************************************************************
Bernd Fehling                    Bielefeld University Library
Dipl.-Inform. (FH)                LibTec - Library Technology
Universitätsstr. 25                  and Knowledge Management
33615 Bielefeld
Tel. +49 521 106-4060       bernd.fehling(at)uni-bielefeld.de

BASE - Bielefeld Academic Search Engine - www.base-search.net
*************************************************************

