Re: StartsWith on DrillDown?

2016-11-17 Thread Michael McCandless
The idea w/ drill down is you are running a "base query" (what the user actually searched for, originally) and then, if the user has clicked to drill down on any facet labels, you are also adding drill-down queries. You pass the "base query" to the DrillDownQuery constructor. And, normally, to ad

Re: Faceting : what are the limitations of Taxonomy (Separate index and hierarchical facets) and SortedSetDocValuesFacetField ( flat facets and no sidecar index) ?

2016-11-17 Thread Chitra R
Okay. I agree with you: Taxonomy maintains and supports hierarchical facets during indexing. By hierarchical, I take it to mean that we might index the field Publish date: 2010/10/15 as Publish date: 2010, Publish date: 2010/10 and Publish date: 2010/10/15; their facet ordinals are maintained in sidecar
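
The expansion described in this message can be sketched in a few lines. This is only an illustration in plain Python, not Lucene's API; the helper name `hierarchical_paths` and the "/" separator are assumptions made for the example.

```python
def hierarchical_paths(dim, value, sep="/"):
    """Expand a value like '2010/10/15' into every ancestor path of the dimension.

    Hypothetical helper for illustration; it only mimics the idea of indexing
    one label per level of the hierarchy.
    """
    parts = value.split(sep)
    # One entry per prefix of the path: first 1 part, then 2 parts, and so on.
    return [(dim, sep.join(parts[:i])) for i in range(1, len(parts) + 1)]


paths = hierarchical_paths("Publish date", "2010/10/15")
for dim, path in paths:
    print(f"{dim}: {path}")
```

Each tuple corresponds to one label that would receive its own ordinal in a hierarchical scheme.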

Re: StartsWith on DrillDown?

2016-11-17 Thread Matt Hicks
I understand this, and that's how I'm using it now, but my situation is that in my application I want to offer the ability to auto-complete tags that have results based on the current query. This is why I'm looking for a "StartsWith" filter on the tags. Certainly I could get back all of the tags
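
A minimal sketch of the fallback mentioned here, fetching all tag counts for the current query and filtering them by prefix on the client side. It is plain Python, not a Lucene call; the function name `tags_starting_with`, the `limit` parameter, and the sample counts are all hypothetical.

```python
def tags_starting_with(tag_counts, prefix, limit=10):
    """Return (tag, count) pairs whose tag starts with prefix, most frequent first.

    tag_counts is assumed to be a mapping of facet label -> hit count for the
    current query; tags with zero hits are dropped.
    """
    p = prefix.lower()
    matches = [
        (tag, count)
        for tag, count in tag_counts.items()
        if count > 0 and tag.lower().startswith(p)
    ]
    # Sort by descending count, then alphabetically for a stable order.
    matches.sort(key=lambda tc: (-tc[1], tc[0]))
    return matches[:limit]


counts = {"lucene": 12, "luke": 3, "solr": 7, "lumen": 0}
suggestions = tags_starting_with(counts, "lu")
print(suggestions)
```

The cost of this approach is that every tag count has to be retrieved before filtering, which is the overhead a server-side "StartsWith" filter would avoid.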

enhancement for SynonymFilter

2016-11-17 Thread Bernd Fehling
Currently I'm tackling a problem with SynonymFilter while going from 4.10.4 to 6.3.0. For a special solution I need to know if a word (or multiword) is producing synonyms in SynonymFilter. Therefore I suggest the enhancement of "hasSynonyms" for SynonymFilter. A workaround would be to buffer all

ASCIIFoldingFilter

2016-11-17 Thread Julian Motz
Hello all, we're currently discussing the usage of the ASCIIFoldingFilter class in our diacritics project. This project will be about

Multi-field IDF

2016-11-17 Thread Nicolás Lichtmaier
IDF measures the selectivity of a term. But the calculation is per-field. That can be bad for very short fields (like titles). One example of this problem: if I don't delete stop words, then "or", "and", etc. should receive low IDF values; however, "or" is, perhaps, not so common in titles.
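
The effect described in this message can be reproduced on a toy corpus. The sketch below is plain Python with a textbook-style smoothed formula, ln((N + 1) / (df + 1)) + 1, which is an assumption and not necessarily what any Lucene similarity computes; the documents and the `idf` helper are invented for illustration.

```python
import math


def idf(term, docs, fields):
    """Smoothed IDF of term, counting a document if any listed field contains it."""
    n = len(docs)
    df = sum(
        1
        for doc in docs
        if any(term in doc[field].lower().split() for field in fields)
    )
    return math.log((n + 1) / (df + 1)) + 1


# Hypothetical corpus: "or" is rare in titles but present in every body.
docs = [
    {"title": "Trick or treat", "body": "candy or costumes and more"},
    {"title": "Lucene scoring", "body": "tf and idf or bm25"},
    {"title": "Facet basics", "body": "taxonomy or doc values"},
    {"title": "Synonym filters", "body": "single or multi word synonyms"},
]

title_idf = idf("or", docs, ["title"])          # document frequency 1 of 4
all_idf = idf("or", docs, ["title", "body"])    # document frequency 4 of 4
print(f"title only: {title_idf:.3f}, title + body: {all_idf:.3f}")
```

Computed over titles alone, "or" looks selective; computed over title and body together, it drops to the minimum value, which is the behaviour the poster is asking for.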

Re: enhancement for SynonymFilter

2016-11-17 Thread Michael McCandless
Hmm, are you saying SynonymFilter in 4.10.4 has this capability but 6.3.0 lost it? So if you have a synonym "wow that's funny" -> "wtf", you want the token for "wow" to state that it has a synonym? Using the PositionLengthAttribute you should be able to reconstruct this, because when you see "wtf

Re: Multi-field IDF

2016-11-17 Thread Ahmet Arslan
Hi Nicholas, IDF, among others, is a measure of term specificity. If 'or' is not so common in titles, then it has some discrimination power in that domain. I think it's OK for 'or' to get a high IDF value in this case. Ahmet On Thursday, November 17, 2016 9:09 PM, Nicolás Lichtmaier wrote: IDF

Re: Multi-field IDF

2016-11-17 Thread Nicolás Lichtmaier
That depends on what you want. In this case I want to use a discrimination power based on all the body text, not just the titles. Otherwise, terms that are really not that relevant end up with very high values! On 17/11/16 at 18:25, Ahmet Arslan wrote: Hi Nicholas, IDF, among others,

Re: Multi-field IDF

2016-11-17 Thread Will Martin
Are you familiar with pivoted normalized document length practice or theory? Or Croft's recent work on relevance algorithms accounting for structured field presence? On 11/17/2016 5:20 PM, Nicolás Lichtmaier wrote: That depends on what you want. In this case I want to use a discrimination po

Re: Faceting : what are the limitations of Taxonomy (Separate index and hierarchical facets) and SortedSetDocValuesFacetField ( flat facets and no sidecar index) ?

2016-11-17 Thread Chitra R
Case 1: In taxonomy, each indexed document's facet labels are examined and their ordinals and mappings are computed, then stored in the sidecar index at index time. Case 2: In doc values, these ordinals are computed at search time, so there will be a time and memory trade-off bet