Hi everyone,
I ended up using the idea of doing multiple aggregations in one go and it was
a nice improvement. Maybe we can reconsider introducing this? I've opened an
issue [1] and published a PR [2] based on the code I had previously shared,
with some extra tests and a few improvements.
Stefan
Hi Stefan-
I opened https://github.com/apache/lucene/issues/12190 where we can discuss
this further. Thanks for raising the idea!
Cheers,
-Greg
On Mon, Mar 6, 2023 at 7:21 AM Stefan Vodita wrote:
> Hi Greg,
>
> The PR looks great. I think it's a useful feature to have and it helps with the
>
Hi Greg,
The PR looks great. I think it's a useful feature to have and it helps with the
use-case we were discussing. I left a comment with some other ideas that I'd
like to explore.
Thank you for coding this up,
Stefan
On Sun, 5 Mar 2023 at 19:33, Greg Miller wrote:
>
> Hi Stefan-
>
> I cobble
Hi Stefan-
I cobbled together a draft PR that I _think_ is what you're looking for so
we can have something to talk about. Please let me know if this misses the
mark, or is what you had in mind. If so, we could open an issue to propose
the idea of adding something like this. I'm not totally convin
Hi everyone,
Greg and I discussed a bit offline. His assessment was right - I’m not looking
to compute multiple values per ordinal as an end in itself. That is only a means
to compute a single value which depends on other facet results. This section from
the previous email explains it really well:
Thanks for the detailed benchmarking Stefan! I think you've demonstrated
here that looping over the collected hits multiple times does in fact add
meaningful overhead. That's interesting to see!
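To make the comparison concrete, here is a minimal sketch of the two strategies in plain Java. It does not use the Lucene facets API: the parallel `ordinals`/`values` arrays are a hypothetical stand-in for the collected hits, and the class and method names are made up for illustration. It only shows that both strategies produce the same per-ordinal results, with the one-pass version visiting each hit once.

```java
import java.util.Arrays;

// Hypothetical illustration only; not the Lucene facets implementation.
final class OnePassAggregation {

    /** Two loops over the hits: the first computes sums, the second computes maxima. */
    static float[][] twoPasses(int[] ordinals, float[] values, int ordinalCount) {
        float[] sums = new float[ordinalCount];
        float[] maxes = new float[ordinalCount];
        Arrays.fill(maxes, Float.NEGATIVE_INFINITY);
        for (int i = 0; i < ordinals.length; i++) {
            sums[ordinals[i]] += values[i];
        }
        for (int i = 0; i < ordinals.length; i++) {
            maxes[ordinals[i]] = Math.max(maxes[ordinals[i]], values[i]);
        }
        return new float[][] {sums, maxes};
    }

    /** One loop over the hits: both aggregations are updated for each hit. */
    static float[][] onePass(int[] ordinals, float[] values, int ordinalCount) {
        float[] sums = new float[ordinalCount];
        float[] maxes = new float[ordinalCount];
        Arrays.fill(maxes, Float.NEGATIVE_INFINITY);
        for (int i = 0; i < ordinals.length; i++) {
            int ord = ordinals[i];
            sums[ord] += values[i];
            maxes[ord] = Math.max(maxes[ord], values[i]);
        }
        return new float[][] {sums, maxes};
    }
}
```

How much the second loop costs in practice depends on what each iteration has to do (for example, re-reading per-document data), which this sketch does not model.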
As for whether or not to add functionality to the facets module that
supports this, I'm not convinced a
After benchmarking my implementation against the existing one, I think there is
some meaningful overhead. I built a small driver [1] that runs the two solutions over
a geo data [2] index (thank you Greg for providing the indexing code!).
The table below lists faceting times in milliseconds. I’ve n
Thanks for the follow up Stefan. If you find significant overhead
associated with the multiple iterations, please keep challenging the
current approach and suggest improvements. It's always good to revisit
these things!
Cheers,
-Greg
On Thu, Feb 16, 2023 at 1:32 PM Stefan Vodita wrote:
> Hi Gre
Hi Greg,
To better understand how much work gets duplicated, I went ahead
and modified FloatTaxonomyFacets as an example [1]. It doesn't look
too pretty, but it illustrates how I think multiple aggregations in one
iteration could work.
Overall, you're right, there's not as much wasted work as I h
Hi Stefan-
> In that case, iterating twice duplicates most of the work, correct?
I'm not sure I'd agree that it duplicates "most" of the work. This is an
association faceting example, which is a little bit of a special case in
some ways. But, to your question, there is duplicated work here of
re-
Hi Greg,
I see now where my example didn’t give enough info. In my mind, `Genre /
Author nationality / Author name` is stored in one hierarchical facet field.
The data we’re aggregating over, like publish date or price, are stored in
DocValues.
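As a rough illustration of that data model, the sketch below rolls a per-document value (say, price) up to every level of a `Genre / Author nationality / Author name` path in a single pass. It is plain Java rather than the Lucene taxonomy API: the class name, the `String[][]` path representation, and the sample data in the usage note are all assumptions made for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; not the Lucene taxonomy facets implementation.
final class HierarchicalRollup {

    /**
     * Adds each document's value to every prefix of its category path,
     * visiting each document once.
     */
    static Map<String, Double> sumByPrefix(String[][] paths, double[] values) {
        Map<String, Double> sums = new LinkedHashMap<>();
        for (int doc = 0; doc < paths.length; doc++) {
            StringBuilder prefix = new StringBuilder();
            for (String component : paths[doc]) {
                if (prefix.length() > 0) {
                    prefix.append('/');
                }
                prefix.append(component);
                // Accumulate this document's value at the current level of the hierarchy.
                sums.merge(prefix.toString(), values[doc], Double::sum);
            }
        }
        return sums;
    }
}
```

For instance, with made-up paths `Fiction/American/Mark Twain`, `Fiction/American/Author B`, and `Fiction/British/Author C`, one call yields totals for the genre, for each nationality, and for each author.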
The demo package shows something similar [1], where
Hi Stefan-
That helps, thanks. I'm a bit confused about where you're concerned with
iterating over the match set multiple times. Is this a situation where the
ordinals you want to facet over are stored in different index fields, so
you have to create multiple Facets instances (one per field) to co
Hi Greg,
I’m assuming we have one match-set which was not constrained by any
of the categories we want to aggregate over, so it may have books by
Mark Twain, books by American authors, and sci-fi books.
Maybe we can imagine we obtained it by searching for a keyword, say
“Washington”, which is pre
Hi Stefan-
Can you clarify your example a little bit? It sounds like you want to facet
over three different match sets (one constrained by "Mark Twain" as the
author, one constrained by "American authors" and one constrained by the
"sci-fi" genre). Is that correct?
Cheers,
-Greg
On Fri, Feb 10,