I don't have a textbook example to point you to, but you should be able to handle the problem using any of:
a) a UDF
b) an external TRANSFORM script in a language of your choosing
c) Hive Windowing and Analytics functions (LEAD/LAG, OVER, etc.): https://cwiki.apache.org/confluence/display/Hive/LanguageManual+WindowingAndAnalytics
All depending on the version of Hive you are using as well as your programming language preferences.

From: JB Rawlings [mailto:jrawli...@societyconsulting.com]
Sent: Friday, February 05, 2016 1:53 PM
To: user@hive.apache.org
Subject: RE: Sessionize using Hive

Ryan,

Can you perhaps point me to example(s) of how this is done in Hive?

Thanks,

J. B. Rawlings
Senior Consultant
C: 425.233.1315
www.societyconsulting.com

From: Ryan Harris [mailto:ryan.har...@zionsbancorp.com]
Sent: Monday, February 1, 2016 6:19 PM
To: user@hive.apache.org
Subject: RE: Sessionize using Hive

It can be done in Hive... whether or not it is the "best choice" depends on whether or not you have any other reason for your data to be in Hive.

If you are wondering whether Hive is the best tool for accomplishing this one task... it would probably be easier to do in Pig.

From: JB Rawlings [mailto:jrawli...@societyconsulting.com]
Sent: Monday, February 01, 2016 7:11 PM
To: user@hive.apache.org
Subject: Sessionize using Hive

We are considering whether Hive is the best choice for "sessionizing" a set of data given the following parameters:

* Input data set: A series of records with userID, startTimestamp, EndTimestamp, recordType, etc.
* Output data set: Same records (no aggregation) with an added SessionId based on the time difference between the endTime of the previous record and the startTime of the current record, plus satisfying other criteria of the type current.recordType = previousRecordType. As long as a series of records meet the criteria for sessionization they would all have the same SessionId appended to each record.

Briefly, based on my analysis it appears that this problem would be better suited to MapReduce using Java, but I would be interested in hearing from those with more experience in this area.

J. B. Rawlings

________________________________
THIS ELECTRONIC MESSAGE, INCLUDING ANY ACCOMPANYING DOCUMENTS, IS CONFIDENTIAL and may contain information that is privileged and exempt from disclosure under applicable law. If you are neither the intended recipient nor responsible for delivering the message to the intended recipient, please note that any dissemination, distribution, copying or the taking of any action in reliance upon the message is strictly prohibited. If you have received this communication in error, please notify the sender immediately. Thank you.
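
For reference, option (b) above (an external TRANSFORM script) could look roughly like the following. This is a minimal sketch and was not posted in the thread. It assumes that Hive streams tab-separated rows of userID, start timestamp, end timestamp and recordType to the script, that the rows have already been grouped and ordered by user and start time (for example with DISTRIBUTE BY and SORT BY in the feeding query), and that timestamps are integer epoch seconds. The 30-minute gap, the function names and the "user-N" session id format are all hypothetical choices made for illustration.

```python
import sys

# Hypothetical threshold: the thread never states what gap ends a session.
MAX_GAP_SECONDS = 30 * 60


def sessionize(rows, max_gap=MAX_GAP_SECONDS):
    """Yield each input row with a session id appended (no aggregation).

    rows: iterable of (user_id, start_ts, end_ts, record_type) tuples,
    assumed sorted by user_id and then by start_ts.
    """
    prev_user = prev_end = prev_type = None
    session_no = 0
    for user_id, start_ts, end_ts, record_type in rows:
        if user_id != prev_user:
            # First record seen for this user: start numbering again.
            session_no = 1
        elif start_ts - prev_end > max_gap or record_type != prev_type:
            # Gap since the previous record's end is too long,
            # or the record type changed: open a new session.
            session_no += 1
        yield (user_id, start_ts, end_ts, record_type,
               f"{user_id}-{session_no}")
        prev_user, prev_end, prev_type = user_id, end_ts, record_type


def parse(line):
    """Split one tab-separated input line into typed fields."""
    user_id, start_ts, end_ts, record_type = line.rstrip("\n").split("\t")[:4]
    return user_id, int(start_ts), int(end_ts), record_type


def main(stdin=None, stdout=None):
    """Read rows from stdin and write them back with the session id added."""
    stdin = sys.stdin if stdin is None else stdin
    stdout = sys.stdout if stdout is None else stdout
    rows = (parse(line) for line in stdin if line.strip())
    for row in sessionize(rows):
        stdout.write("\t".join(str(field) for field in row) + "\n")
```

To run it from Hive, the file would end with a call to `main()` and be referenced in the USING clause of a TRANSFORM query. The same previous-row comparison is what LAG would provide in option (c), so the choice between the two mostly comes down to the Hive version available and whether the session rules are easier to express in SQL or in a script.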