It looks really cool; I think I will try it out.

Cheers,
Zhuoluo (Clark) Yang

2013/10/5 Makoto YUI <yuin...@gmail.com>

> Hi Edward,
>
> Thank you for your interest.
>
> The Hivemall project has no plans for a dedicated mailing list. I will
> answer any further questions/comments on Twitter or through GitHub
> issues (with a question label).
>
> BTW, I just added a CTR (Click-Through-Rate) prediction example that
> uses the dataset provided by a commercial search engine provider for
> KDDCup 2012 track 2.
> https://github.com/myui/hivemall/wiki/KDDCup-2012-track-2-CTR-prediction-dataset
>
> I guess many of you are working on ad CTR/CVR prediction. This example
> might be of some help in understanding how to do it entirely within Hive.
>
> Thanks,
> Makoto @myui
>
>
> (2013/10/04 23:02), Edward Capriolo wrote:
>
>> Looks cool, I'm already starting to play with it.
>>
>> On Friday, October 4, 2013, Makoto Yui <yuin...@gmail.com> wrote:
>> > Hi Dean,
>> >
>> > Thank you for your interest in Hivemall.
>> >
>> > Twitter's paper actually influenced me in developing Hivemall, and I
>> > initially implemented such functionality as Pig UDFs.
>> >
>> > Though my Pig ML library is not released, you can find a similar
>> > attempt for Pig at
>> > https://github.com/y-tag/java-pig-MyUDFs
>> >
>> > Thanks,
>> > Makoto
>> >
>> > 2013/10/3 Dean Wampler <deanwamp...@gmail.com>:
>> >> This is great news! I know that Twitter has done something similar
>> >> with UDFs for Pig, as described in this paper:
>> >>
>> >> http://www.umiacs.umd.edu/~jimmylin/publications/Lin_Kolcz_SIGMOD2012.pdf
>> >>
>> >> I'm glad to see the same thing start with Hive.
>> >>
>> >> Dean
>> >>
>> >>
>> >> On Wed, Oct 2, 2013 at 10:21 AM, Makoto YUI <yuin...@gmail.com> wrote:
>> >>>
>> >>> Hello all,
>> >>>
>> >>> My employer, AIST, has given the thumbs up to open source our
>> >>> machine learning library, named Hivemall.
>> >>>
>> >>> Hivemall is a scalable machine learning library running on
>> >>> Hive/Hadoop, licensed under the LGPL 2.1.
>> >>>
>> >>> https://github.com/myui/hivemall
>> >>>
>> >>> Hivemall provides machine learning functionality as well as feature
>> >>> engineering functions through Hive UDFs/UDAFs/UDTFs. It is designed
>> >>> to scale with the number of training instances as well as the
>> >>> number of training features.
>> >>>
>> >>> Hivemall is very easy to use, as every machine learning step is done
>> >>> within HiveQL.
>> >>>
>> >>> -- Installation is just as follows:
>> >>> add jar /tmp/hivemall.jar;
>> >>> source /tmp/define-all.hive;
>> >>>
>> >>> -- Logistic regression is performed by a query.
>> >>> SELECT
>> >>>   feature,
>> >>>   avg(weight) as weight
>> >>> FROM
>> >>>   (SELECT logress(features,label) as (feature,weight)
>> >>>    FROM training_features) t
>> >>> GROUP BY feature;
>> >>>
>> >>> You can find detailed examples on our wiki pages.
>> >>> https://github.com/myui/hivemall/wiki/_pages
>> >>>
>> >>> Though we consider Hivemall to be much easier to use and more
>> >>> scalable than Mahout for classification/regression tasks, please
>> >>> check it for yourself. If you have a Hive environment, you can
>> >>> evaluate Hivemall within 5 minutes or so.
>> >>>
>> >>> Hope you enjoy the release! Feedback (and pull requests) are always
>> >>> welcome.
>> >>>
>> >>> Thank you,
>> >>> Makoto
>> >>
>> >> --
>> >> Dean Wampler, Ph.D.
>> >> @deanwampler
>> >> http://polyglotprogramming.com
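
For readers wondering what the logistic regression query in the original announcement computes: the outer avg(weight) ... GROUP BY feature step appears to average the per-feature weights emitted by each parallel training task. Below is a rough plain-Python sketch of that idea. It does not use Hive or Hivemall; train_shard is a hypothetical stand-in for whatever logress does inside one task (here assumed to be simple SGD on binary features), and the shard data and feature names are made up for illustration.

```python
import math
from collections import defaultdict


def train_shard(rows, eta=0.1, epochs=20):
    """Hypothetical stand-in for one training task (NOT Hivemall's code).

    rows: list of (features, label), where features is a list of binary
    feature names and label is 0 or 1. Runs plain SGD for logistic
    regression and returns (feature, weight) pairs, like a UDTF emitting rows.
    """
    w = defaultdict(float)
    for _ in range(epochs):
        for features, label in rows:
            z = sum(w[f] for f in features)
            p = 1.0 / (1.0 + math.exp(-z))
            gradient = label - p
            for f in features:
                w[f] += eta * gradient
    return list(w.items())


def average_weights(emitted):
    """Mirror of: SELECT feature, avg(weight) ... GROUP BY feature."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for feature, weight in emitted:
        sums[feature] += weight
        counts[feature] += 1
    return {f: sums[f] / counts[f] for f in sums}


def predict(model, features):
    """Score one example with the averaged model (sigmoid of summed weights)."""
    z = sum(model.get(f, 0.0) for f in features)
    return 1.0 / (1.0 + math.exp(-z))


# Two made-up shards, as if the training table were split across two tasks.
shards = [
    [(["ad:1", "pos:top"], 1), (["ad:2", "pos:bottom"], 0)],
    [(["ad:1", "pos:bottom"], 1), (["ad:2", "pos:top"], 0)],
]
emitted = [pair for shard in shards for pair in train_shard(shard)]
model = average_weights(emitted)
```

In this toy data the ad identifier decides the label and the position does not, so averaging across the two shards should roughly cancel the position weights while keeping the ad weights. Whether real Hivemall models behave this way depends on details of logress that this sketch only assumes.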