Hello Solr users :)

Right now it seems that if I want to roll up on two different fields with streaming expressions, I need to make two separate requests. This is too slow for our use case, because we need to do joins before sorting and rolling up, so every extra request means redoing the joins.
Since we are actually looking for approximate facets (top N, not necessarily exact counts), the best solution we could come up with was to implement a new stream decorator that runs an algorithm like Count-min sketch[1] on the tuples provided by the stream function it wraps. This would have two big wins for us:

1) it would compute the facet without needing to sort on the facet field, so we'd potentially save lots of memory
2) because sorting isn't needed, we could do multiple facets in one go

That said, I have two (broad) questions:

A) Is there a better way of doing this? Let's reduce the problem to streaming aggregations, where the assumption is that we have multiple collections whose data needs to be joined, and we then facet on fields from all collections. Maybe there's a better algorithm, something out of the box or closer to what is offered out of the box?

B) Whatever the best way is, could we do it in a way that can be contributed back to Solr? Any hints on how to do that? Just another decorator?

Thanks and best regards,
Radu
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/

[1] https://en.wikipedia.org/wiki/Count%E2%80%93min_sketch
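
P.S. To make the idea a bit more concrete, here is a rough sketch of the counting part, assuming tuples can be treated as plain field-to-value maps. It is not tied to any Solr API: the class name `CountMinSketch`, the helper `sketchFields` and the hash mixing are all made up for illustration, and a real decorator would read from the wrapped stream instead of an `Iterable`.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper (not part of Solr): approximate value counts without sorting. */
class CountMinSketch {
    private final int depth;        // number of hash rows
    private final int width;        // counters per row
    private final long[][] counts;

    CountMinSketch(int depth, int width) {
        if (depth <= 0 || width <= 0) {
            throw new IllegalArgumentException("depth and width must be positive");
        }
        this.depth = depth;
        this.width = width;
        this.counts = new long[depth][width];
    }

    // Row-specific bucket: mix the value's hash with a per-row seed.
    // The mixing constants are an arbitrary choice for this sketch.
    private int bucket(String value, int row) {
        long h = value.hashCode() + (row + 1) * 0x9E3779B97F4A7C15L;
        h ^= h >>> 33;
        h *= 0xFF51AFD7ED558CCDL;
        h ^= h >>> 33;
        return (int) Math.floorMod(h, (long) width);
    }

    /** Record one occurrence of the value in every row. */
    void add(String value) {
        for (int row = 0; row < depth; row++) {
            counts[row][bucket(value, row)]++;
        }
    }

    /** Estimated count: the minimum over all rows. It never undercounts. */
    long estimate(String value) {
        long min = Long.MAX_VALUE;
        for (int row = 0; row < depth; row++) {
            min = Math.min(min, counts[row][bucket(value, row)]);
        }
        return min;
    }

    /** One pass over unsorted tuples, one sketch per requested facet field. */
    static Map<String, CountMinSketch> sketchFields(
            Iterable<Map<String, String>> tuples, String... fields) {
        Map<String, CountMinSketch> sketches = new HashMap<>();
        for (String field : fields) {
            sketches.put(field, new CountMinSketch(4, 1024));
        }
        for (Map<String, String> tuple : tuples) {
            for (String field : fields) {
                String value = tuple.get(field);
                if (value != null) {
                    sketches.get(field).add(value);
                }
            }
        }
        return sketches;
    }
}
```

Memory here is fixed at depth x width counters per field, regardless of how many distinct values show up. Getting an actual top N out of this would still need a small candidate set (for example a bounded heap) kept alongside each sketch, which I left out.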
