Re: Multi-valued field and numTerms

2015-01-15 Thread Michael Sokolov
On 1/15/15 4:34 AM, rama44ster wrote: Hi, I am using Lucene to index documents that have a multi-valued text field named ‘city’. Each document might have multiple values for this field, such as la, los angeles, etc. Assuming document d1 contains city = la ; city = los angeles document d2 contains cit

Re: Multi-valued field and numTerms

2015-01-15 Thread Michael McCandless
Normally Lucene will count your d1 as having length=2. However, if "la" was added as a synonym for "los angeles", such that it "overlaps" its position, then the default similarity discounts that and will count it as length=1. But for that to work, the position of the 2nd token must be the same as
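To illustrate the counting rule described in this reply, here is a minimal sketch in plain Python. It does not call Lucene; the function `field_length` and the `(term, position_increment)` tuples are hypothetical names used only for illustration. It assumes each city value is indexed as a single token, and that a token "overlaps" the previous one when its position increment is 0.

```python
def field_length(tokens, discount_overlaps=True):
    """Count the tokens of one field.

    tokens: list of (term, position_increment) pairs (hypothetical model).
    A position increment of 0 means the token sits at the same position
    as the previous token, i.e. it overlaps it.
    """
    total = 0
    overlaps = 0
    for _term, pos_inc in tokens:
        total += 1
        if pos_inc == 0:
            overlaps += 1
    # With discounting enabled, overlapping tokens do not add to the length.
    return total - overlaps if discount_overlaps else total


# d1 with two separately added values: each token advances the position.
d1_separate = [("la", 1), ("los angeles", 1)]

# d1 where "la" is injected as a synonym at the same position.
d1_synonym = [("los angeles", 1), ("la", 0)]

print(field_length(d1_separate))                          # 2
print(field_length(d1_synonym))                           # 1
print(field_length(d1_synonym, discount_overlaps=False))  # 2
```

In this model the two-value document only gets length 1 when the second token has a position increment of 0; adding the values as separate field instances leaves the length at 2.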

Multi-valued field and numTerms

2015-01-15 Thread rama44ster
Hi, I am using Lucene to index documents that have a multi-valued text field named ‘city’. Each document might have multiple values for this field, such as la, los angeles, etc. Assuming document d1 contains city = la ; city = los angeles document d2 contains city = la mirada document d3 contains city