[ https://issues.apache.org/jira/browse/SOLR-8965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Cassandra Targett updated SOLR-8965:
------------------------------------
    Component/s: streaming expressions

> Add Path reduce operation to aggregate paths in a session
> ---------------------------------------------------------
>
>                 Key: SOLR-8965
>                 URL: https://issues.apache.org/jira/browse/SOLR-8965
>             Project: Solr
>          Issue Type: New Feature
>          Components: streaming expressions
>            Reporter: Joel Bernstein
>            Priority: Major
>
> Session aggregation can be hard to do at scale. MapReduce, of course, makes
> this easy. Now that we have MapReduce, it would be good to add some session
> aggregations to the base library.
> The Path reduce operation can be used with the *reduce* function to
> concatenate the path taken in a session into a single field. These path
> records can then be added to another SolrCloud collection using the update
> stream. Once they have been consolidated in another collection, aggregations
> can be run on the paths using the RollupStream.
> A HashRollupStream could also be developed to aggregate the paths as they are
> reduced. The HashRollupStream would keep all the paths in a hash map during
> the aggregation, so it would not require the paths to be received in order.
> Sample syntax:
> {code}
> reduce(search(logs, q="*:*", sort="sessionId, timestamp", ...),
>        by="sessionId",
>        path(field="pageId"))
> {code}
> This would work great in parallel by partitioning on the sessionId.
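
The idea in the issue description can be illustrated outside Solr with a minimal Python sketch. It is not the proposed Java implementation and does not use the streaming expression API. It only mimics the two steps: a reduce over records already sorted by sessionId and timestamp that joins each session's pageId values into one path, and a hash-map rollup that counts identical paths without needing them in order. The function names `reduce_paths` and `hash_rollup`, the output field name `path`, the `/` separator, and the sample records are all assumptions made for illustration.

```python
from collections import Counter
from itertools import groupby
from operator import itemgetter


def reduce_paths(records, by="sessionId", field="pageId", sep="/"):
    """Yield one record per session with its page IDs joined into a path.

    Assumes `records` is already sorted by (sessionId, timestamp), which is
    what the sort in the sample syntax is meant to guarantee. The "path"
    field name and the separator are hypothetical choices.
    """
    for session_id, group in groupby(records, key=itemgetter(by)):
        yield {by: session_id, "path": sep.join(str(r[field]) for r in group)}


def hash_rollup(path_records, over="path"):
    """Count identical paths in a hash map; input order does not matter."""
    return Counter(r[over] for r in path_records)


# Hypothetical log records, pre-sorted by sessionId and then timestamp.
logs = [
    {"sessionId": "s1", "timestamp": 1, "pageId": "home"},
    {"sessionId": "s1", "timestamp": 2, "pageId": "search"},
    {"sessionId": "s1", "timestamp": 3, "pageId": "cart"},
    {"sessionId": "s2", "timestamp": 1, "pageId": "home"},
    {"sessionId": "s2", "timestamp": 2, "pageId": "search"},
    {"sessionId": "s2", "timestamp": 3, "pageId": "cart"},
    {"sessionId": "s3", "timestamp": 1, "pageId": "home"},
]

paths = list(reduce_paths(logs))   # one path record per session
counts = hash_rollup(paths)        # number of sessions per distinct path
```

Because each session's path depends only on that session's records, the same reduce could run independently on each partition when the input is partitioned on sessionId, which matches the parallelism note in the description.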