It's probably not as pretty as the new built-in version, but this
allows scripted UDFs in any javax.script language:
https://github.com/livingsocial/HiveSwarm/blob/master/src/main/java/com/livingsocial/hive/udf/ScriptedUDF.java
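The general shape of a scripted-UDF call can be sketched with plain javax.script, independent of Hive — a minimal sketch, where the engine name "js" and the one-line script are only illustrative, and the engine may be absent on a given JDK:

```java
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

public class ScriptEval {
    // Evaluate a snippet in any registered javax.script language.
    // Returns null if no engine by that name is installed.
    static Object eval(String language, String script) throws ScriptException {
        ScriptEngine engine = new ScriptEngineManager().getEngineByName(language);
        return engine == null ? null : engine.eval(script);
    }

    public static void main(String[] args) throws ScriptException {
        // "js" resolves to Nashorn or GraalJS on JDKs that bundle one;
        // on other JDKs this simply returns null.
        Object result = eval("js", "1 + 2");
        System.out.println(result);
    }
}
```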
On Thu, Apr 3, 2014 at 4:35 PM, Andy Srine wrote:
> Thanks Edward. Bu
Try changing ArrayList to just List as the argument to the evaluate function.
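The underlying issue is plain Java typing rather than anything Hive-specific: Hive hands the UDF whatever List implementation it likes, so a method declared against the concrete ArrayList type won't match. A minimal sketch of the distinction, with no Hive classes involved:

```java
import java.util.Arrays;
import java.util.List;

public class EvaluateSignature {
    // Declared against the List interface: accepts an ArrayList, an
    // Arrays.asList wrapper, or any other implementation the caller supplies.
    static int evaluate(List<String> values) {
        return values.size();
    }

    public static void main(String[] args) {
        // Arrays.asList returns a List that is NOT a java.util.ArrayList,
        // so an evaluate(ArrayList<String>) signature could not accept it.
        List<String> input = Arrays.asList("a", "b", "c");
        System.out.println(evaluate(input)); // 3
    }
}
```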
On Thu, Feb 13, 2014 at 12:15 PM, Manish wrote:
> Hello:
> While running the hive query :
>
> hive> select "Actor", count (*) from metrics where
> array_contains(type_tuple, "Actor") and checkAttributes (attribute_tuple
4:26 PM, John Meagher wrote:
> I'm working on a UDAF that takes in a constant string that defines
> what the final output of the UDAF will be. In the mode=PARTIAL1 call
> to the init function all the parameters are available and the constant
> can be read so the output ObjectInsp
c.Operator.initialize(Operator.java:377)
> at
> org.apache.hadoop.hive.ql.exec.MapOperator.initializeOp(MapOperator.java:425)
> at
> org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:377)
> at
> org.apache.hadoop.hive.ql.exec.mr.ExecMapper.configure(Exec
Here's one implementation of it:
https://github.com/livingsocial/HiveSwarm#dayofweekdate.
The code for it is pretty straightforward:
https://github.com/livingsocial/HiveSwarm/blob/master/src/main/java/com/livingsocial/hive/udf/DayOfWeek.java
On Mon, Feb 10, 2014 at 4:38 PM, Stephen Sprague wro
I'm working on a UDAF that takes in a constant string that defines
what the final output of the UDAF will be. In the mode=PARTIAL1 call
to the init function all the parameters are available and the constant
can be read so the output ObjectInspector can be built. I haven't
found a way to pass this
I use vim and https://github.com/vim-scripts/SQLUtilities to do it.
It's not Hive-specific. Any SQL formatting tool will work.
On Tue, Jan 21, 2014 at 11:23 PM, pandees waran wrote:
> Hi,
>
> I would like to come up with code that automatically formats hql
> files.
> Because, formatting
Aggregate functions need to go in a HAVING clause instead of the WHERE
clause. WHERE clauses are applied prior to aggregation, HAVING is
applied post aggregation.
select ...
from ...
where some row level filter
group by ...
having some aggregate level filter
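The same two-stage filtering can be sketched outside SQL. Assuming a toy list of (category, value) rows, the pre-aggregation filter plays the role of WHERE and the post-aggregation filter the role of HAVING — a plain-Java analogy, not Hive code:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class HavingAnalogy {
    record Row(String category, int value) {}

    static Map<String, Long> query(List<Row> rows) {
        return rows.stream()
                .filter(r -> r.value() > 0)          // WHERE: row-level, before grouping
                .collect(Collectors.groupingBy(Row::category, Collectors.counting()))
                .entrySet().stream()
                .filter(e -> e.getValue() >= 2)      // HAVING: applied to the aggregates
                .collect(Collectors.toMap(Map.Entry::getKey, Map.Entry::getValue));
    }

    public static void main(String[] args) {
        List<Row> rows = List.of(new Row("a", 1), new Row("a", 5),
                                 new Row("b", 3), new Row("b", -1));
        // "b" keeps only one positive row, so the HAVING-style filter drops it.
        System.out.println(query(rows)); // {a=2}
    }
}
```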
On Thu, Aug 29, 2013 at 2:49 PM, Ja
The Data Science course on Coursera has a pretty good overview of map
reduce, Hive, and Pig without going into the Java side of things.
https://www.coursera.org/course/datasci. It's not in depth, but it is
enough to get started.
On Wed, Jul 17, 2013 at 3:52 PM, John Omernik wrote:
> Hey all -
>
What is the data type of the p1 column? I've used Hive with
partitions containing far above 2 billion rows without having any
problems like this.
On Wed, May 29, 2013 at 2:41 PM, Gabi Kazav wrote:
> Hi,
>
>
>
> We are working on hive DB with our Hadoop cluster.
>
> We are now facing an issue about j
There's a new release candidate version of Hive that includes
windowing functions that address this. RC1 is available at
http://people.apache.org/~hashutosh/hive-0.11.0-rc1/. Also see
https://issues.apache.org/jira/browse/HIVE-896
Without 0.11 the approach from Paul will work, but will be slow.
"Not quite sure but I think each group by will give another M/R job."
It will be done in a single M/R job no matter how many fields are in
the GROUP BY clause.
On Mon, May 6, 2013 at 2:07 PM, Peter Chu wrote:
> In Hive, I cannot perform a SELECT GROUP BY on fields not in the GROUP BY
> clause.
>
Another option for this functionality would be to use the Java scripting
API. The basic structure of the call would be...
select script( scriptLanguage, scriptToRun, args... )
I haven't seen that in Hive, but something similar is available for Pig.
Documentation for that is available on
http://
The WHERE condition for approvals can be moved up into an IF in the SELECT...
SELECT client_id,receive_dd,receive_hh,
receive_hh+1,
COUNT(1) AS transaction_count,
SUM( IF ( response=00, 1, 0) ) AS approval_count,
SUM( IF ( response=00, 1, 0)
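The SUM(IF(...)) idiom is just a conditional count: add 1 for each row matching the condition and 0 otherwise, so the sum equals the number of matches. In plain Java terms (the response codes here are illustrative):

```java
import java.util.List;

public class ApprovalCount {
    // Equivalent of SUM(IF(response = 00, 1, 0)): the running sum
    // increments only on rows where the condition holds.
    static long approvalCount(List<Integer> responses) {
        long sum = 0;
        for (int response : responses) {
            sum += (response == 0) ? 1 : 0;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(approvalCount(List.of(0, 5, 0, 12))); // 2
    }
}
```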
Creating the /user/hive/warehouse folder is a one-time setup step that can
be done as the hdfs user. With g+w permissions any user can then create
and read the tables.
On Thu, Aug 16, 2012 at 9:57 AM, Connell, Chuck wrote:
> I have no doubt that works, but surely a Hive user should not need su
"On ", i, " of ", nrow(data), "\n")
}
rowidx <- which(idrows == data[i, 1])
colidx <- which(idcols == data[i, 2])
count <- data[i,3]
output[rowidx, colidx] <- count
}
write.csv(output, file=outputmatrixfile, quote=F)
}
I don't think dynamic columns are possible in Hive. I've always had
Hive output a structure like your query output and then used R to
convert it into a dynamic column structure.
On Wed, Aug 8, 2012 at 3:56 PM, wrote:
> Thanks Ashish, that gives an idea.
>
> But I am not sure about the outer
This syntax works:
where NOT (me.EMAIL_ADDR_ID IN ('23'))
On Fri, Jul 6, 2012 at 2:44 PM, abhiTowson cal wrote:
> hi all,
> Does Hive support the NOT IN clause? If not, what is the alternative
> for it? My query is posted below
>
> SELECT
> me.EADDR_TXT as EMAIL_ADDR_TXT,
> '' as SRC_CUST_ID,
> 'ACX