If you want to use Python... The Python script should expect tab-separated input on stdin and should write tab-separated columns to stdout, one output row per line.
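For illustration, here is a minimal sketch of what such a mypython.py could look like. The CITY_TO_STATE table and the transform_line helper are hypothetical names made up for this example, and the sketch assumes each input row has exactly three columns (id, name, city). It also relies on Hive passing NULL through TRANSFORM as the literal string \N, so unknown cities are written back that way.

```python
import sys

# Hypothetical city -> state lookup, for illustration only; replace with real data.
CITY_TO_STATE = {
    "Campinas": "SP",
    "Curitiba": "PR",
    "Rio de Janeiro": "RJ",
}

# Hive's TRANSFORM represents NULL as the two-character string \N.
HIVE_NULL = "\\N"


def transform_line(line):
    """Turn one 'id<TAB>name<TAB>city' row into 'id<TAB>name<TAB>city<TAB>state'."""
    # Assumption: exactly three tab-separated input columns.
    id_, name, city = line.rstrip("\n").split("\t")
    state = CITY_TO_STATE.get(city, HIVE_NULL)
    return "\t".join([id_, name, city, state])


def main(stdin=sys.stdin, stdout=sys.stdout):
    # Hive streams rows on stdin and reads result rows from stdout.
    for line in stdin:
        stdout.write(transform_line(line) + "\n")


if __name__ == "__main__":
    main()
```

You can try it outside Hive by piping a tab-separated file into it (python mypython.py < sample.tsv) before running it through TRANSFORM.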
add file mypython.py;
SELECT TRANSFORM (tbl.id, tbl.name, tbl.city)
  USING 'python mypython.py'
  AS (id, name, city, state)
FROM my_db.my_table tbl;

From: Daniel Lopes [mailto:dan...@bankfacil.com.br]
Sent: Thursday, October 01, 2015 7:12 AM
To: user@hive.apache.org
Subject: Better way to do UDF's for Hive

Hi,

I'd like to know a good way to write a UDF for a single field, like

SELECT tbl.id AS id, tbl.name AS name, tbl.city AS city, state_from_city(tbl.city) AS state
FROM my_db.my_table tbl;

Native Java? Python over Hadoop Streaming?

I prefer Python, but I don't know a good way to do it.

Thanks,
Daniel Lopes