Re: UDF reflect

2014-04-03 Thread John Meagher
It's probably not as pretty as the new built-in version, but this allows scripted UDFs in any javax.script language: https://github.com/livingsocial/HiveSwarm/blob/master/src/main/java/com/livingsocial/hive/udf/ScriptedUDF.java On Thu, Apr 3, 2014 at 4:35 PM, Andy Srine wrote: > Thanks Edward. Bu

Re: Help with simple UDF error..

2014-02-13 Thread John Meagher
Try changing ArrayList to just List as the argument to the evaluate function. On Thu, Feb 13, 2014 at 12:15 PM, Manish wrote: > Hello: > While running the hive query : > > hive> select "Actor", count (*) from metrics where > array_contains(type_tuple, "Actor") and checkAttributes (attribute_tuple

Re: Can data be passed to the final mode init call in a UDAF?

2014-02-12 Thread John Meagher
4:26 PM, John Meagher wrote: > I'm working on a UDAF that takes in a constant string that defines > what the final output of the UDAF will be. In the mode=PARTIAL1 call > to the init function all the parameters are available and the constant > can be read so the output ObjectInsp

Re: FUNCTION HIVE to DAYS OF WEEK

2014-02-11 Thread John Meagher
c.Operator.initialize(Operator.java:377) > at > org.apache.hadoop.hive.ql.exec.MapOperator.initializeOp(MapOperator.java:425) > at > org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:377) > at > org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(Exec

Re: FUNCTION HIVE to DAYS OF WEEK

2014-02-10 Thread John Meagher
Here's one implementation of it: https://github.com/livingsocial/HiveSwarm#dayofweekdate. The code for it is pretty straightforward: https://github.com/livingsocial/HiveSwarm/blob/master/src/main/java/com/livingsocial/hive/udf/DayOfWeek.java On Mon, Feb 10, 2014 at 4:38 PM, Stephen Sprague wro

Can data be passed to the final mode init call in a UDAF?

2014-02-10 Thread John Meagher
I'm working on a UDAF that takes in a constant string that defines what the final output of the UDAF will be. In the mode=PARTIAL1 call to the init function all the parameters are available and the constant can be read so the output ObjectInspector can be built. I haven't found a way to pass this

Re: Formatting hive queries

2014-01-22 Thread John Meagher
I use vim and https://github.com/vim-scripts/SQLUtilities to do it. It's not Hive-specific; any SQL formatting tool will work. On Tue, Jan 21, 2014 at 11:23 PM, pandees waran wrote: > Hi, > > I would like to come up with a code which automatically formats your hql > files. > Because, formatting

Re: Problem with min function in HiveQL

2013-08-29 Thread John Meagher
Conditions on aggregate functions need to go in a HAVING clause instead of the WHERE clause. WHERE is applied before aggregation; HAVING is applied after aggregation. select ... from ... where some row-level filter group by ... having some aggregate-level filter On Thu, Aug 29, 2013 at 2:49 PM, Ja
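
The WHERE/HAVING split can be tried out on a toy table. This is only a sketch: it uses SQLite through Python's sqlite3 module as a stand-in for Hive, and the table name, column names, and values are made up for illustration. The ordering of the two filters is standard SQL, so the same query shape should carry over to HiveQL.

```python
import sqlite3

# Hypothetical table and data, for illustration only (not from the thread).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [("a", 5), ("a", 12), ("b", 20), ("b", 30), ("c", -1), ("c", 8)],
)

query = """
    SELECT sensor, MIN(value) AS min_value
    FROM readings
    WHERE value >= 0          -- row-level filter, applied before grouping
    GROUP BY sensor
    HAVING MIN(value) > 6     -- aggregate-level filter, applied after grouping
    ORDER BY sensor
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Putting the MIN(value) condition in the WHERE clause instead would be rejected, since the aggregate does not exist yet when rows are being filtered.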

Re: Java Courses for Scripters/Big Data Geeks

2013-07-17 Thread John Meagher
The Data Science course on Coursera has a pretty good overview of MapReduce, Hive, and Pig without going into the Java side of things: https://www.coursera.org/course/datasci. It's not in-depth, but it is enough to get started. On Wed, Jul 17, 2013 at 3:52 PM, John Omernik wrote: > Hey all - >

Re: Hive - max rows limit (int limit = 2^31). need Help (looks like a bug)

2013-05-29 Thread John Meagher
What is the data type of the p1 column? I've used Hive with partitions containing far more than 2 billion rows without having any problems like this. On Wed, May 29, 2013 at 2:41 PM, Gabi Kazav wrote: > Hi, > > > > We are working on hive DB with our Hadoop cluster. > > We now facing an issue about j

Re: How to do a calculation among 2 rows?

2013-05-15 Thread John Meagher
There's a new release candidate version of Hive that includes windowing functions that address this. RC1 is available at http://people.apache.org/~hashutosh/hive-0.11.0-rc1/. Also see https://issues.apache.org/jira/browse/HIVE-896 Without 0.11 the approach from Paul will work, but will be slow.

Re: Hive Group By Limitations

2013-05-06 Thread John Meagher
"Not quite sure but I think each group by will give another M/R job." It will be done in a single M/R job no matter how many fields are in the GROUP BY clause. On Mon, May 6, 2013 at 2:07 PM, Peter Chu wrote: > In Hive, I cannot perform a SELECT GROUP BY on fields not in the GROUP BY > clause. >

Re: Using Reflect: A thread for ideas

2013-02-19 Thread John Meagher
Another option for this functionality would be to use the Java scripting API. The basic structure of the call would be... select script( scriptLanguage, scriptToRun, args... ) I haven't seen that in Hive, but something similar is available for Pig. Documentation for that is available on http://

Re: Help in hive query

2012-10-30 Thread John Meagher
The WHERE part in the approvals can be moved up to be an IF in the SELECT... SELECT client_id,receive_dd,receive_hh, receive_hh+1, COUNT(1) AS transaction_count, SUM( IF ( response=00, 1, 0) ) AS approval_count, SUM( IF ( response=00, 1, 0)
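
A small sketch of the same idea, moving a per-row condition out of WHERE and into the aggregate so one pass produces both counts. It runs on SQLite via Python's sqlite3 as a stand-in for Hive; the table name and data are invented, and CASE WHEN is used where Hive would accept IF(cond, 1, 0).

```python
import sqlite3

# Hypothetical transactions table; '00' is treated as the approval code.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (client_id INTEGER, response TEXT)")
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?)",
    [(1, "00"), (1, "05"), (1, "00"), (2, "51")],
)

query = """
    SELECT client_id,
           COUNT(1) AS transaction_count,
           -- counts only approved rows; all rows still reach COUNT(1)
           SUM(CASE WHEN response = '00' THEN 1 ELSE 0 END) AS approval_count
    FROM transactions
    GROUP BY client_id
    ORDER BY client_id
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Because no rows are filtered out up front, the total and the approval count come from the same scan and no self-join is needed.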

Re: Hive directory permissions

2012-08-16 Thread John Meagher
Creating the /user/hive/warehouse folder is a one-time setup step that can be done as the hdfs user. With g+w permissions any user can then create and read the tables. On Thu, Aug 16, 2012 at 9:57 AM, Connell, Chuck wrote: > I have no doubt that works, but surely a Hive user should not need su

Re: Converting rows into dynamic columns in Hive

2012-08-09 Thread John Meagher
"On ", i, " of ", nrow(data), "\n") } rowidx <- which(idrows == data[i, 1]) colidx <- which(idcols == data[i, 2]) count <- data[i,3] output[rowidx, colidx] <- count } write.csv(output, file=outputmatrixfile, quote=F) }

Re: Converting rows into dynamic columns in Hive

2012-08-08 Thread John Meagher
I don't think having dynamic columns is possible in Hive. I've always had Hive output a structure like your query's output and then used R to convert it into a dynamic column structure. On Wed, Aug 8, 2012 at 3:56 PM, wrote: > Thanks Ashish, that gives an idea. > > But I am not sure about the outer
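
The post-processing step can be sketched as a plain pivot from (row id, column id, count) triples into a wide table. The message describes doing this in R; the version below is Python instead, and the function name, the input layout, and the zero fill for missing cells are assumptions made for illustration.

```python
def pivot(triples):
    """Turn (row_id, col_id, count) triples into a wide table.

    Returns the sorted column ids and a dict mapping each row id to a
    list of counts, one per column, with 0 where no triple was present.
    """
    triples = list(triples)
    row_ids = sorted({r for r, _, _ in triples})
    col_ids = sorted({c for _, c, _ in triples})
    col_pos = {c: i for i, c in enumerate(col_ids)}

    table = {r: [0] * len(col_ids) for r in row_ids}
    for r, c, count in triples:
        table[r][col_pos[c]] = count
    return col_ids, table


# Hypothetical long-format output, as a Hive GROUP BY query might produce.
long_rows = [("u1", "click", 3), ("u1", "view", 7), ("u2", "view", 2)]
columns, wide = pivot(long_rows)
print(columns)
print(wide)
```

The set of columns is discovered from the data at run time, which is the part a fixed Hive SELECT list cannot express.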

Re: Hive NOT IN clause

2012-07-06 Thread John Meagher
This syntax works: where NOT (me.EMAIL_ADDR_ID IN ('23')) On Fri, Jul 6, 2012 at 2:44 PM, abhiTowson cal wrote: > hi all, > Does hive support NOT IN clause,if not what is the alternative for > that.My query is posted below > > SELECT > me.EADDR_TXT as EMAIL_ADDR_TXT, > '' as SRC_CUST_ID, > 'ACX
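
For anyone who wants to try the NOT (... IN ...) form outside a cluster, here is a minimal sketch on SQLite via Python's sqlite3, used as a stand-in for Hive. The table name and rows are made up; only the column names and the predicate come from the thread.

```python
import sqlite3

# Hypothetical table; "me" is used as the alias, as in the original query.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emails (EMAIL_ADDR_ID TEXT, EADDR_TXT TEXT)")
conn.executemany(
    "INSERT INTO emails VALUES (?, ?)",
    [("22", "a@example.com"), ("23", "b@example.com"), ("24", "c@example.com")],
)

query = """
    SELECT me.EADDR_TXT
    FROM emails me
    WHERE NOT (me.EMAIL_ADDR_ID IN ('23'))   -- negate the whole IN predicate
    ORDER BY me.EADDR_TXT
"""
rows = conn.execute(query).fetchall()
print(rows)
```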