Re: Scoring in Lucene 6.6.0, 7.7.2, 8.1

2019-06-25 Thread Adrien Grand
You can use IndexSearcher#explain to see how scores are computed. On Wed, Jun 26, 2019 at 12:48 AM wrote: > Hi, I really want to know why the scoring works this way: search String is either MAINO or MAINS: MAIN appears as the 276th entry in the results. > NEW HAMPSHIRE in results: ci

Scoring in Lucene 6.6.0, 7.7.2, 8.1

2019-06-25 Thread baris . kazar
Hi, I really want to know why the scoring works this way: when the search string is either MAINO or MAINS, MAIN appears as the 276th entry in the results. NEW HAMPSHIRE in results: city="NASHUA" municipality="HILLSBOROUGH" region="NEW HAMPSHIRE" country="UNITED STATES" in the 0th result NEW HAMPSHI

Re: StanardFilter Question : https://issues.apache.org/jira/browse/LUCENE-8356

2019-06-25 Thread Trejkaz
Yeah, that code looks right to me. The factory we use for keeping backwards compatibility is entirely ours. I think CustomAnalyzer is a similar-looking API to what we have but we made ours much earlier and it supports analysis stuff all the way back to Lucene 3 which we migrated all the way to whe

Re: StanardFilter Question : https://issues.apache.org/jira/browse/LUCENE-8356

2019-06-25 Thread baris . kazar
Corrected a typo below in the new code. Best regards On 6/25/19 5:01 PM, baris.ka...@oracle.com wrote: Hi, do you mean there is a backward-compatibility factory in Lucene for these kinds of cases? I think it can be fixed like this. In other words, is the following first line redundant t

Re: StanardFilter Question : https://issues.apache.org/jira/browse/LUCENE-8356

2019-06-25 Thread baris . kazar
Hi, do you mean there is a backward-compatibility factory in Lucene for these kinds of cases? I think it can be fixed like this. In other words, is the following first line redundant then? TokenStream filter = new StandardFilter(tokenizer); -> redundant (tokenizer is actually a StandardT

Re: FuzzyQuery- why is it ignored?

2019-06-25 Thread baris . kazar
I tested this on Lucene 7.7.2 and got the same answer: MAINS cannot find MAIN, but all other consonant combos at the end can be found. I am now confident that this is a bug in Lucene. Best regards PS. Lucene 8.1 has drastic changes, such as StandardFilter being removed in one of the packages and
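
For readers following this thread, a minimal sketch of the edit-distance arithmetic behind the question may help. It is not Lucene code and not the poster's code: the class name EditDistance and the method levenshtein are hypothetical, and it uses plain Levenshtein distance, whereas FuzzyQuery by default also counts a transposition as a single edit. It only shows that MAINS and MAIN are one edit apart, which is within FuzzyQuery's default maximum of 2 edits.

```java
// Hypothetical helper, not part of Lucene: plain Levenshtein distance
// computed with two rolling rows of the dynamic-programming table.
final class EditDistance {

    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];

        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }

        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // delete all i characters of a's prefix
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insert = curr[j - 1] + 1;
                int delete = prev[j] + 1;
                int substitute = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insert, delete), substitute);
            }
            // Swap rows so that prev holds the row just computed.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        // Search terms from the thread compared against the indexed term MAIN.
        System.out.println("MAINS vs MAIN: " + levenshtein("MAINS", "MAIN"));
        System.out.println("MAINO vs MAIN: " + levenshtein("MAINO", "MAIN"));
    }
}
```

Since the distance is 1 in both cases, edit distance alone would not explain a missing match. One thing worth checking, as an assumption rather than a diagnosis of this thread, is FuzzyQuery's maxExpansions parameter (default 50), which limits how many matching terms the query is rewritten to.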

Re: Index Optimization

2019-06-25 Thread Erick Erickson
Optimize is rarely useful. It can give some performance gains, but is quite an expensive operation. Pre Solr 7.5, optimizing had some behaviors that weren’t obvious, see: https://lucidworks.com/2017/10/13/segment-merging-deleted-documents-optimize-may-bad/ Post 7.5, the behavior has changed. I

Index Optimization

2019-06-25 Thread Eduardo Costa Lopes
Hello folks, I have some Lucene indexes in my project; most of them are created once and updated infrequently, about once a week or once a month. The index sizes are about 20GB, and as more inserts are done the indexes grow, so I'd like to know what the best index optimization strategy or e