We've been using grouping on our queries for years, and I'm looking at switching
to collapsing because larger result sets can take a long time. I did some
experimenting, and while collapsing is indeed faster for large result sets,
it's slower for small ones.
Here is some data showing how many results each keyword returned (always capped
at 2500) and how many milliseconds each query took with grouping (GROUP) versus
collapsing (COLL).
For example, "hamilton" returned 2500 hits, and the old grouping query took
1027ms, while collapsing took only 234ms. "incorporate", on the other hand,
returned only 23 hits, yet the runtime increased with collapsing (34ms vs. 107ms).
SIZE GROUP COLL KEYWORD
2500 1027 234 hamilton
294 124 130 achievements
23 34 107 incorporate
696 256 200 pastor
2319 854 179 marilyn
301 123 145 domain
2500 977 247 sciences
129 70 162 faced
1446 534 205 danger
2500 1008 185 thinking
40 35 105 pointing
922 338 229 turkey
64 42 159 kuwait
192 89 189 jail
645 234 146 transactions
3 22 138 casio
2500 906 200 process
616 228 155 discussion
879 323 160 leave
175 81 150 dependence
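For what it's worth, the crossover looks clean in the data above. A quick sketch
(plain Python, just tabulating the rows from the table) shows collapsing wins on
every keyword with 616+ hits and loses on every keyword with 301 or fewer:

```python
# (hits, grouping ms, collapsing ms, keyword) -- rows copied from the table above.
ROWS = [
    (2500, 1027, 234, "hamilton"),
    (294, 124, 130, "achievements"),
    (23, 34, 107, "incorporate"),
    (696, 256, 200, "pastor"),
    (2319, 854, 179, "marilyn"),
    (301, 123, 145, "domain"),
    (2500, 977, 247, "sciences"),
    (129, 70, 162, "faced"),
    (1446, 534, 205, "danger"),
    (2500, 1008, 185, "thinking"),
    (40, 35, 105, "pointing"),
    (922, 338, 229, "turkey"),
    (64, 42, 159, "kuwait"),
    (192, 89, 189, "jail"),
    (645, 234, 146, "transactions"),
    (3, 22, 138, "casio"),
    (2500, 906, 200, "process"),
    (616, 228, 155, "discussion"),
    (879, 323, 160, "leave"),
    (175, 81, 150, "dependence"),
]

# Split the result-set sizes by which approach was faster.
collapse_wins = [size for size, group_ms, coll_ms, _ in ROWS if coll_ms < group_ms]
group_wins = [size for size, group_ms, coll_ms, _ in ROWS if coll_ms >= group_ms]
print(min(collapse_wins), max(group_wins))  # -> 616 301
```

So in this sample the break-even point sits somewhere between roughly 300 and
600 hits.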
Why would collapsing be slower than grouping for the small result set cases?
Also, yes, we're using faceting, and yes, I have docValues on every relevant
field.
# Grouping args
group.field=tf1flrid
group.format=simple
group.limit=1
group.main=true
group.sort=gsort_durable asc, is_live desc, is_english desc, reportyear desc,
flrid asc
group=true
# Collapse args
fq={!collapse field=tf1flrid sort="gsort_durable asc, is_live desc, is_english
desc, reportyear desc, flrid asc"}
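For completeness, here's a minimal sketch of how the two equivalent request
query strings get built from the params above (Python stdlib only; the endpoint
and core name in the comment are placeholders, not our real setup):

```python
from urllib.parse import urlencode

# Shared dedup/tiebreak sort, taken from the params above.
SORT = "gsort_durable asc, is_live desc, is_english desc, reportyear desc, flrid asc"

def grouping_params(q):
    # Current approach: result grouping, one doc kept per tf1flrid.
    return urlencode({
        "q": q,
        "group": "true",
        "group.field": "tf1flrid",
        "group.format": "simple",
        "group.limit": "1",
        "group.main": "true",
        "group.sort": SORT,
    })

def collapse_params(q):
    # Candidate approach: the same dedup expressed as a collapse post filter.
    return urlencode({
        "q": q,
        "fq": f'{{!collapse field=tf1flrid sort="{SORT}"}}',
    })

# Either query string would be appended to something like
# http://localhost:8983/solr/<core>/select?  (host/core are site-specific)
print(grouping_params("hamilton"))
print(collapse_params("hamilton"))
```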
Thanks,
Andy