Hello all,

We have recently implemented two frequent itemset (pattern) mining
algorithms for MapReduce. They are much faster than the available PFP
implementation (see the paper). We would like to contribute these
implementations to Mahout and maintain them as needed. You can find the
code at the link below.

I know that PFP was removed because of a lack of developer interest.
As a research group, we are willing to provide new frequent pattern mining
algorithms and maintain existing ones.

Could you please help me with the steps that I have to go through? I have a
couple of questions in mind, but any comments/suggestions are very welcome:

 - Shall I start a JIRA issue?
 - Shall I use the old fpm package for the algorithms?
 - We use a file for configuration because it makes runs easier to
reproduce. Is this OK with Mahout guidelines?
 - Is there a guideline for the trade-off between code quality and
performance? For example, we preferred functional/integration tests to unit
tests because making some classes unit-testable adds too much class
creation overhead.

Link for the code and paper: http://adrem.ua.ac.be/bigfim

Thank you in advance for your help.

Cheers!
--
M. Emin Akşehirli
