Thank you for humoring my questions. I do not know the mind of the DataFu community. Your observations are quite clear; I have no further concerns.
-n

On Friday, November 21, 2014, Makoto Yui <yuin...@gmail.com> wrote:

> Hi Nick,
>
> Thank you for the comments.
>
> (2014/11/22 3:42), Nick Dimiduk wrote:
>
>> I would also encourage you to consider joining forces with DataFu,
>> rather than "competing". I think there's a real appetite a wholistic
>> toolbox of patterns and implementations that can span these projects.
>> From my understanding, there's nothing about DataFu that's unique to
>> Pig, they just need the work done to abstract away the Pig bits and
>> implement the Hive interfaces.
>>
>
> My current understanding of DataFu is that it is UDF collections for
> Apache Pig. Though Hive interface is not yet supported in DataFu, is the
> direction (to extend DataFu for Hive) a consensus in DataFu community?
>
> My concern is that merging Hivemall codebase to DataFu makes the building
> and packing process of DataFu complex and the target/objective of the
> project unclear.
>
> I do not think that Hivemall competes with DataFu because
> 1) There are users who prefer Pig and Hive respectively, and
> 2) Pig/DataFu is useful for what HiveQL is unsuited (e.g., complex feature
> engineering steps). After preprocessing using DataFu, Hivemall can be
> applied for classification/regression in a scalable way in Hive.
>
>> Is there anything about Hivemall that's unique to Hive, that wouldn't be
>> applicable to Pig as well?
>>
>
> The techniques used in Hivemall (e.g., training data amplification that
> emulates iterative training and machine learning algorithms as
> table-generating functions) could be appreciable to Apache Pig.
>
> However, I am not a heavy user of Pig and porting Hivemall to Pig requires
> a bunch of works. So, I am currently considering to stick with HiveQL
> interfaces (Hive, HCatalog, and Tez for the software stack of Hivemall) in
> developing Hivemall because SQL-like interface is friendly to a broader
> range of developers.
>
> Thanks,
> Makoto
>
> --
> *******************************************
> Makoto YUI <m....@aist.go.jp>
> Information Technology Research Institute, AIST.
> https://staff.aist.go.jp/m.yui/index_e.html
> *******************************************
>