The article already describes a first use case for gap constraints. Another application is customer shopping-sequence analysis, where you want to bound the time between two purchases for them to count as part of the same relevant sequence.
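To make the shopping use case concrete, here is a minimal sketch in plain Python (not Spark code, and not how PrefixSpan is implemented in MLlib). It checks whether a pattern occurs in one customer's purchase history with at most a given gap between consecutive matched purchases. The function name `occurs_within_gap`, the `(timestamp, item)` layout and the sample data are assumptions made up for illustration.

```python
from datetime import datetime, timedelta


def occurs_within_gap(purchases, pattern, max_gap):
    """Return True if `pattern` occurs in `purchases` as a subsequence whose
    consecutive matched purchases are at most `max_gap` apart.

    purchases: list of (timestamp, item) tuples, sorted by timestamp.
    pattern:   list of items, in the order they must be bought.
    """
    if not pattern:
        return True
    # Indices at which a match of the pattern prefix seen so far can end.
    ends = [i for i, (_, item) in enumerate(purchases) if item == pattern[0]]
    for wanted in pattern[1:]:
        # Keep every later purchase of `wanted` that is close enough to at
        # least one valid end of the previous prefix (no greedy choice).
        ends = [
            j
            for j, (t_j, item) in enumerate(purchases)
            if item == wanted
            and any(i < j and t_j - purchases[i][0] <= max_gap for i in ends)
        ]
        if not ends:
            return False
    return True


# Hypothetical purchase history of a single customer.
history = [
    (datetime(2015, 1, 1), "phone"),
    (datetime(2015, 1, 3), "case"),
    (datetime(2015, 3, 1), "charger"),
]
week = timedelta(days=7)
phone_then_case = occurs_within_gap(history, ["phone", "case"], week)
phone_then_charger = occurs_within_gap(history, ["phone", "charger"], week)
```

With a one-week gap, "phone then case" counts as an occurrence but "phone then charger" does not, which is the kind of filtering I have in mind. Keeping all valid end positions instead of only the first match matters, because an early match can violate the gap while a later one satisfies it.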
Additional question regarding the code: what is the point of using ReversedPrefix in LocalPrefixSpan? The prefix is used neither to find the frequent items of a projected database nor to compute a new projected database, so it looks like items are appended in reverse order only to be reversed again when the prefix is converted to a sequence.

2015-08-25 12:15 GMT+08:00 Feynman Liang <fli...@databricks.com>:

> CCing the mailing list again.
>
> It's currently not on the radar. Do you have a use case for it? I can
> bring it up during 1.6 roadmap planning tomorrow.
>
> On Mon, Aug 24, 2015 at 8:28 PM, alexis GILLAIN <ila...@hotmail.com>
> wrote:
>
>> Hi,
>>
>> I just realized the article I mentioned is cited in the jira and not in
>> the code so I guess you didn't use this result.
>>
>> Do you plan to implement sequence with timestamp and gap constraint as in
>> :
>>
>> https://people.mpi-inf.mpg.de/~rgemulla/publications/miliaraki13mg-fsm.pdf
>>
>> 2015-08-25 7:06 GMT+08:00 Feynman Liang <fli...@databricks.com>:
>>
>>> Hi Alexis,
>>>
>>> Unfortunately, both of the papers you referenced appear to be
>>> translations and are quite difficult to understand. We followed
>>> http://doi.org/10.1109/ICDE.2001.914830 when implementing PrefixSpan.
>>> Perhaps you can find the relevant lines in there so I can elaborate further?
>>>
>>> Feynman
>>>
>>> On Thu, Aug 20, 2015 at 9:07 AM, alexis GILLAIN <ila...@hotmail.com>
>>> wrote:
>>>
>>>> I want to use prefixspan so I had a look at the code and the cited
>>>> paper : "Distributed PrefixSpan Algorithm Based on MapReduce".
>>>>
>>>> There is a result in the paper I didn't really understand and I
>>>> couldn't find where it is used in the code.
>>>>
>>>> Suppose a sequence database S = {1,2...n}, a sequence <a...> is a
>>>> length-(L-1) (2≤L≤n) sequential pattern, in projected databases which is a
>>>> prefix of a length-(L-1) sequential pattern <a...a>, when the support count
>>>> of <a> is not less than min_support, it is equal to obtaining a length-L
>>>> sequential pattern < a ... a > from projected databases that obtaining a
>>>> length-L sequential pattern < a ... a > from a sequence database S.
>>>>
>>>> According to the paper It's supposed to add a pruning step in the
>>>> reduce function but I couldn't find where.
>>>>
>>>> This result seems to come from a previous paper : "Wang Linlin, Fan
>>>> Jun. Improved Algorithm for Sequential Pattern Mining Based on PrefixSpan
>>>> [J]. Computer Engineering, 2009, 35(23): 56-61" but it didn't help me to
>>>> understand it and how it can improve the algorithm.
>>>>
>>>
>>>
>>
>