Hi Sanjay, Lefty,

Thanks for the help, but none of the responses above directly answers my question (probably I am not asking clearly enough :-( ).
Below are two different structures for a UDAF (aggregation function). My question is which one is the preferred/right approach:

http://pastebin.com/QCgd4Hxc : This version is based on what I could understand from the API docs about the UDAF class.

http://pastebin.com/Uctamtek : This version is based on the book "Hadoop: The Definitive Guide". Notice that the function names are different from the first one.

I hope this clarifies my question.

Thanks
Ritesh

On Wed, Aug 7, 2013 at 5:34 PM, Lefty Leverenz <leftylever...@gmail.com> wrote:

> Sounds like the wikidoc needs some work. I'm open to suggestions. If
> Sanjay's simple UDF helps, I could put it in the wiki along with any advice
> you think would help.
>
> Does anyone else have use cases to contribute?
>
> -- Lefty
>
>
> On Mon, Aug 5, 2013 at 2:45 PM, Sanjay Subramanian <
> sanjay.subraman...@wizecommerce.com> wrote:
>
>> Hi Ritesh
>>
>> To help you get started, I am writing a simple HelloWorld-ish UDF that
>> might help. If it doesn't, please ask for more clarifications...
>>
>> Good Luck
>> Thanks
>>
>> sanjay
>>
>>
>> ********************************************************************************
>> *ToUpperCase.java*
>>
>> package com.sanjaysubramanian.utils.hive.udf;
>>
>> import org.apache.commons.logging.Log;
>> import org.apache.commons.logging.LogFactory;
>> import org.apache.hadoop.hive.ql.exec.UDF;
>>
>> // Simple Hive UDF: returns the upper-cased input, or null for null input.
>> public final class ToUpperCase extends UDF {
>>
>>     protected final Log logger = LogFactory.getLog(ToUpperCase.class);
>>
>>     // Hive calls evaluate() once per row.
>>     public String evaluate(final String inputString) {
>>         if (inputString != null) {
>>             return inputString.toUpperCase();
>>         }
>>         else {
>>             return inputString;
>>         }
>>     }
>> }
>>
>> ********************************************************************************
>>
>> *Usage in a Hive script*
>>
>> hive -e "
>>
>> create temporary function toupper as
>> 'com.sanjaysubramanian.utils.hive.udf.ToUpperCase';
>> SELECT
>>     first_name,
>>     toupper(first_name)
>> FROM
>>     company_names
>> "
>>
>>
>> ***********************************************************************************
>>
>> From: Ritesh Agrawal <ragra...@netflix.com>
>> Reply-To: "user@hive.apache.org" <user@hive.apache.org>
>> Date: Monday, August 5, 2013 9:41 AM
>> To: "user@hive.apache.org" <user@hive.apache.org>
>> Subject: Re: Hive UDAF extending UDAF class: iterate or evaluate method
>>
>> Hi Lefty,
>>
>> I used the wiki you sent to write my first version of the UDAF. However, I
>> found it to be utterly complex, especially for storing partial results, as I
>> am not very familiar with the Hive API. Then I found another example of a
>> UDAF in the book "Hadoop: The Definitive Guide", and it had much simpler
>> code but used different methods. Instead of iterate it used an evaluate
>> method, and so I am getting confused.
>>
>> Ritesh
>>
>>
>> On Sun, Aug 4, 2013 at 2:18 PM, Lefty Leverenz
>> <leftylever...@gmail.com> wrote:
>>
>>> You might find this wikidoc useful:
>>> GenericUDAFCaseStudy<https://cwiki.apache.org/confluence/display/Hive/GenericUDAFCaseStudy>.
>>>
>>>
>>> The O'Reilly book "Programming Hive" also has a section called
>>> "User-Defined Aggregate Functions" in chapter 13 (Functions), pages 172 to
>>> 176.
>>>
>>> -- Lefty
>>>
>>>
>>> On Sun, Aug 4, 2013 at 7:12 AM, Ritesh Agrawal <ragra...@netflix.com> wrote:
>>>
>>>> Hi all,
>>>>
>>>> I am trying to write a UDAF. I found an example that shows how to
>>>> implement a UDAF in the book "Hadoop: The Definitive Guide". However, I
>>>> am a little confused. In the book, the author extends the UDAF class and
>>>> implements the init, iterate, terminatePartial, merge and terminate
>>>> functions. However, looking at the Hive docs (
>>>> http://hive.apache.org/docs/r0.11.0/api/org/apache/hadoop/hive/ql/exec/UDAF.html),
>>>> it seems I need to implement the init, aggregate, evaluatePartial,
>>>> aggregatePartial and evaluate functions. Please let me know which are the
>>>> right functions to implement.
>>>>
>>>> Ritesh
>>>>
>>>
>>>
>>>
>>> --
>>> Lefty
>>>
>>
>
>
>
> --
> Lefty
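
For reference, the five method names the thread attributes to the book (init, iterate, terminatePartial, merge, terminate) can be sketched as a plain Java class. This is only an illustration of how partial results might flow from map side to reduce side, not a statement about which of the two pastebin versions Hive accepts. It deliberately has no Hive imports so it runs with a plain JDK. The names MaxUdafSketch, MaxEvaluator and simulate are hypothetical, and using Integer instead of a Writable type is an assumption made for brevity. In a real UDAF the outer class would extend org.apache.hadoop.hive.ql.exec.UDAF and the nested class would implement org.apache.hadoop.hive.ql.exec.UDAFEvaluator; check the method names against the Hive version you are running.

```java
// Standalone sketch: no Hive imports, so it compiles and runs with a plain JDK.
// Assumption: in real Hive code the outer class extends UDAF and the nested
// class implements UDAFEvaluator; both are omitted here on purpose.
final class MaxUdafSketch {

    /** Hypothetical evaluator that tracks the maximum Integer seen so far. */
    static final class MaxEvaluator {
        private Integer result;

        /** Reset state before the evaluator is (re)used. */
        public void init() {
            result = null;
        }

        /** Map side: fold one input value into the running state. */
        public boolean iterate(Integer value) {
            if (value != null && (result == null || value > result)) {
                result = value;
            }
            return true;
        }

        /** Map side: hand the partial result over to the reduce side. */
        public Integer terminatePartial() {
            return result;
        }

        /** Reduce side: fold in a partial result from another evaluator. */
        public boolean merge(Integer other) {
            return iterate(other);
        }

        /** Reduce side: final aggregate, or null if no input was seen. */
        public Integer terminate() {
            return result;
        }
    }

    /** Simulates two map tasks feeding their partial results to one reduce task. */
    static Integer simulate(int[] firstSplit, int[] secondSplit) {
        MaxEvaluator mapperA = new MaxEvaluator();
        mapperA.init();
        for (int v : firstSplit) {
            mapperA.iterate(v);
        }

        MaxEvaluator mapperB = new MaxEvaluator();
        mapperB.init();
        for (int v : secondSplit) {
            mapperB.iterate(v);
        }

        MaxEvaluator reducer = new MaxEvaluator();
        reducer.init();
        reducer.merge(mapperA.terminatePartial());
        reducer.merge(mapperB.terminatePartial());
        return reducer.terminate();
    }

    public static void main(String[] args) {
        // Splits {3, 9, 4} and {7, 1}: the overall maximum is 9.
        System.out.println(simulate(new int[] {3, 9, 4}, new int[] {7, 1}));
    }
}
```

In this sketch merge simply delegates to iterate because, for a maximum, a partial result has the same type and meaning as a single input value. An aggregate such as an average would need a separate partial-result type that carries both a sum and a count.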