Re: Faceting Queries NON-Taxonomy-based

2023-11-21 Thread Greg Miller
ing a lot about it by debugging and fighting through it. > Thank you, and also thank you, Stefan for your responses and for your > help. Stefan, I will review your demo as well. > > Thank you, > > Tony > > > -Original Message- > From: Greg Miller > Sent:

Re: Faceting Queries NON-Taxonomy-based

2023-11-16 Thread Greg Miller
Hi Tony- There are indeed a few different ways faceting can be implemented, which can be confusing. Can you share a little more about what you're looking to do with faceting? It sounds like maybe you want to facet on a docvalues field you already have in your index? If that's the use-case, you mig

Re: When to use StringField and when to use FacetField for categorization?

2023-10-23 Thread Greg Miller
Hey Michael- You've gotten a lot of great information here already. I'll point you to one more implementation as well: StringValueFacetCounts. This implementation lets you do faceting over arbitrary "string-like" doc value fields (SORTED and SORTED_SET). So if you already have a field of this type

Fwd: TAC supporting Berlin Buzzwords

2023-03-27 Thread Greg Miller
[forwarding to dev@ and java-users@] For anyone interested, please see the following note from Gavin on ASF Travel Assistance for Berlin Buzzwords. Cheers, -Greg -- Forwarded message - From: Gavin McDonald Date: Fri, Mar 24, 2023 at 2:57 AM Subject: TAC supporting Berlin Buzzwor

Re: Computing multiple different aggregations over a match-set in one pass

2023-03-06 Thread Greg Miller
lps > with the > use-case we were discussing. I left a comment with some other ideas that > I'd > like to explore. > > Thank you for coding this up, > Stefan > > On Sun, 5 Mar 2023 at 19:33, Greg Miller wrote: > > > > Hi Stefan- > > > > I cobb

Re: Computing multiple different aggregations over a match-set in one pass

2023-03-05 Thread Greg Miller
defining the expression and making a > single > faceting call. Has anyone worked on something similar? > > Best, > Stefan > > On Thu, 23 Feb 2023 at 16:53, Greg Miller wrote: > > > > Thanks for the detailed benchmarking Stefan! I think you've demonstrated >

Re: Computing multiple different aggregations over a match-set in one pass

2023-02-23 Thread Greg Miller
3317 | > > With 10 aggregations, we're saving a second or more. That is significant > for my > use-case. > > I'd like to know if the test and results seem reasonable. If so, maybe > we can think > about providing this functionality. > > Thanks, >

Re: Computing multiple different aggregations over a match-set in one pass

2023-02-17 Thread Greg Miller
facet field using data from a `popularity` > DocValue. > > > > > > In the demo, we compute `sum(_score * sqrt(popularity))`, but what if > we > > > want several other different aggregations with respect to the same > facet > > > field? Maybe we want `max(popular

Re: Computing multiple different aggregations over a match-set in one pass

2023-02-15 Thread Greg Miller
c924028063/lucene/demo/src/java/org/apache/lucene/demo/facet/ExpressionAggregationFacetsExample.java#L91 > > On Mon, 13 Feb 2023 at 22:46, Greg Miller wrote: > > > > Hi Stefan- > > > > That helps, thanks. I'm a bit confused about where you're concerned

Re: Computing multiple different aggregations over a match-set in one pass

2023-02-13 Thread Greg Miller
“Washington”, which is present in Mark Twain’s writing, and those of other > American authors, and in sci-fi novels too. > > Does that make the example clearer? > > > Stefan > > > On Sat, 11 Feb 2023 at 00:16, Greg Miller wrote: > > > > Hi Stefan- > >

Re: Computing multiple different aggregations over a match-set in one pass

2023-02-10 Thread Greg Miller
Hi Stefan- Can you clarify your example a little bit? It sounds like you want to facet over three different match sets (one constrained by "Mark Twain" as the author, one constrained by "American authors" and one constrained by the "sci-fi" genre). Is that correct? Cheers, -Greg On Fri, Feb 10,

Re: Getting all values for a specific dimension for SortedSetDocValues per document

2022-07-01 Thread Greg Miller
oes let you avoid building a global ordinal map and doing map lookups within the tight loop. Cheers, -Greg On Fri, Jul 1, 2022 at 2:35 AM Harald Braumann wrote: > > Hi! > > On 01.07.22 00:46, Greg Miller wrote: > > Have you considered taxonomy faceting for your use-case?

Re: Getting all values for a specific dimension for SortedSetDocValues per document

2022-06-30 Thread Greg Miller
Hi Harry- Have you considered taxonomy faceting for your use-case? Because the taxonomy structure is maintained in a separate index, it's (relatively) trivial to iterate all direct child ordinals of a given dimension. The cost of mapping to a global ordinal space is done when the index is merged.

Re: RangeFacetsCount Question

2022-04-26 Thread Greg Miller
I wonder if the idea is that fastMatchQuery provides additional filtering to only documents that might match one of the ranges being faceted on. As a (somewhat contrived) example, what if you were searching over items in an ecommerce catalog that all contain an indexed numeric "price" attribute, an

Re: FacetsCollector ScoreMode

2022-03-22 Thread Greg Miller
+1 to align this to the needs of keepScores. Good find! On Mon, Mar 21, 2022 at 10:00 AM Adrien Grand wrote: > > +1 to adjusting the ScoreMode based on keepScores. > > On Mon, Mar 21, 2022 at 5:47 PM Mike Drob wrote: > > > > Hey all, > > > > I was looking into some performance issues and was a l
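
A minimal sketch of the idea agreed on in this thread, assuming the collector only needs scores when keepScores is set. The enum below is a self-contained stand-in that mirrors two of the value names in Lucene's ScoreMode; the class and method names are hypothetical and this is not the actual FacetsCollector code.

```java
public class FacetsScoreModeSketch {

    // Stand-in for org.apache.lucene.search.ScoreMode (assumption: only the
    // two values relevant to this discussion are modeled here).
    public enum ScoreMode {
        COMPLETE(true),
        COMPLETE_NO_SCORES(false);

        private final boolean needsScores;

        ScoreMode(boolean needsScores) {
            this.needsScores = needsScores;
        }

        public boolean needsScores() {
            return needsScores;
        }
    }

    // Hypothetical helper: request scores only when the caller keeps them,
    // so the searcher can skip scoring work otherwise.
    public static ScoreMode scoreModeFor(boolean keepScores) {
        return keepScores ? ScoreMode.COMPLETE : ScoreMode.COMPLETE_NO_SCORES;
    }

    public static void main(String[] args) {
        System.out.println("keepScores=true  -> " + scoreModeFor(true));
        System.out.println("keepScores=false -> " + scoreModeFor(false));
    }
}
```

Tying the mode to keepScores means a facet-only collection that never reads scores does not force the query to compute them.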

Re: luceneutil

2022-01-07 Thread Greg Miller
My understanding is that, 1) there isn't any specific relationship between the iterations, and 2) the final output is a summary over all iterations. The idea is that randomness might affect results on any particular iteration, but by running multiple times (20 I think?) and then aggregating the sta
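
As a rough illustration of summarizing over independent iterations, the sketch below reduces per-iteration throughput numbers to a mean and a sample standard deviation. Which statistics luceneutil actually reports is not stated in the snippet above, so the choice of mean and standard deviation, the class name, and the sample values are all assumptions.

```java
public class IterationSummarySketch {

    // Arithmetic mean of the per-iteration measurements.
    public static double mean(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    // Sample standard deviation (divides by n - 1); needs at least two values.
    public static double sampleStdDev(double[] values) {
        if (values.length < 2) {
            throw new IllegalArgumentException("need at least two iterations");
        }
        double m = mean(values);
        double sumSquares = 0.0;
        for (double v : values) {
            double d = v - m;
            sumSquares += d * d;
        }
        return Math.sqrt(sumSquares / (values.length - 1));
    }

    public static void main(String[] args) {
        // Hypothetical per-iteration QPS values, not real benchmark output.
        double[] qpsPerIteration = {10.0, 12.0, 14.0};
        System.out.println("mean   = " + mean(qpsPerIteration));
        System.out.println("stddev = " + sampleStdDev(qpsPerIteration));
    }
}
```

Because each iteration is treated as an independent sample, noise in any single run moves the summary only slightly.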

Re: Taxonomy vs SSDVFF for faceted search

2021-04-30 Thread Greg Miller
e implemented something similar, or have any thoughts or > ideas about that? > > -- > Regards, > Alex > > > On Thu, Apr 29, 2021 at 6:08 AM Greg Miller wrote: > > > Hi Alex- > > > > Amazon's product search engine is built on top of Lucene, which is a

Re: Taxonomy vs SSDVFF for faceted search

2021-04-29 Thread Greg Miller
Hi Alex- Amazon's product search engine is built on top of Lucene, which is a fairly large-scale application (w.r.t. index size, traffic, and use-case complexity). We have found taxonomy-based faceting to work well for us generally, and haven't needed to do much to optimize beyond what's alrea