Re: Better way to do UDF's for Hive

2015-10-01 Thread Edward Capriolo
You can define them in Groovy from inside the CLI... https://gist.github.com/mwinkle/ac9dbb152a1e10e06c16 On Thu, Oct 1, 2015 at 12:57 PM, Ryan Harris wrote: > If you want to use python... > > The python script should expect tab-separated input on stdin and it should > return tab-separated deli

RE: Better way to do UDF's for Hive

2015-10-01 Thread Ryan Harris
If you want to use Python... The Python script should expect tab-separated input on stdin and should return tab-separated columns as output... add file mypython.py; SELECT TRANSFORM (tbl.id, tbl.name, tbl.city) USING 'python mypython.py' AS (id, name, city, state) FROM my_db.my_
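
For illustration, here is a minimal sketch of what a script like mypython.py could look like for the TRANSFORM query above. The thread does not show the script itself, so everything here is an assumption: the CITY_TO_STATE table and its entries are hypothetical placeholders, the input column order (id, name, city) is taken from the query, and unknown cities are emitted as \N, which Hive's default text serialization reads back as NULL.

```python
import sys

# Hypothetical lookup table; the real city-to-state mapping is not in the thread.
CITY_TO_STATE = {"Austin": "TX", "Denver": "CO"}

# Assumption: Hive's default text format treats the two characters \N as NULL.
HIVE_NULL = "\\N"


def transform_line(line):
    """Turn one tab-separated (id, name, city) row into (id, name, city, state)."""
    fields = line.rstrip("\n").split("\t")
    # The script depends on column position: id, name, city, in that order.
    row_id, name, city = fields[0], fields[1], fields[2]
    state = CITY_TO_STATE.get(city, HIVE_NULL)
    return "\t".join([row_id, name, city, state])


def main(stdin=sys.stdin, stdout=sys.stdout):
    # Hive streams one row per line on stdin and reads one row per line on stdout.
    for line in stdin:
        stdout.write(transform_line(line) + "\n")


if __name__ == "__main__":
    main()
```

The script can be tried locally before running it in Hive by piping a tab-separated line into it, for example with printf and a pipe to python mypython.py.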

Re: Better way to do UDF's for Hive

2015-10-01 Thread Dmitry Tolpeko
For a single string input, a Java UDF can be easier to write: accept a string parameter, look it up in a hash map, and return the result. With Python you have to use the TRANSFORM clause and handle all columns, so it will be hard to reuse your Python script, as the code may depend on the column position. One other po

Re: Better way to do UDF's for Hive

2015-10-01 Thread Elliot West
Perhaps a macro? CREATE TEMPORARY MACRO state_from_city (city string) " + /* HQL column logic */ ...; On 1 October 2015 at 14:11, Daniel Lopes wrote: > Hi, > > I'd like to know a good way to do a UDF for a single field, like > > SELECT > tbl.id AS id, > tbl.name AS name, > tbl.city A