Maximum Number of Hive Partitions = 256?

2011-05-03 Thread Time Less
I created a partitioned table, partitioned daily. If I query the earlier partitions, everything works. The later ones fail with error: hive> select substr(user_name,1,1),count(*) from u_s_h_b where dtpartition='2010-10-24' group by substr(user_name,1,1) ; Total MapReduce jobs = 1 Launching Job 1 o

Re: Maximum Number of Hive Partitions = 256?

2011-05-04 Thread Time Less
> I am sure the issue has something to do with an empty string passed to the > substr function. We can rule out the substr() function. I get the same stack trace with any query like: hive> select from ushb where dtpartition='2010-10-25' limit 10; But this query succeeds: hive> select * from u

Re: Maximum Number of Hive Partitions = 256?

2011-05-04 Thread Time Less
> It turns out that 2010-10-24 is 257 days from the very first partition in > my dataset (2010-01-09): > > | date_sub('2010-10-24',interval 257 day) | > +-+ > | 2010-02-09 | > I just noticed 257 days is FEBRUARY 9th, not JANUARY
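
The date arithmetic in this message can be double-checked outside the database. The following is a minimal sketch in Python that uses only the two dates quoted in the thread; the variable names are illustrative and not taken from the original posts.

```python
from datetime import date, timedelta

# Dates quoted in the thread.
first_partition = date(2010, 1, 9)       # very first partition in the dataset
failing_partition = date(2010, 10, 24)   # first partition whose queries fail

# Same calculation as date_sub('2010-10-24', interval 257 day).
minus_257 = failing_partition - timedelta(days=257)

# Actual distance in days between the first and the failing partition.
days_between = (failing_partition - first_partition).days

print(minus_257)      # 2010-02-09
print(days_between)   # 288
```

Subtracting 257 days lands on February 9th, as the poster noticed, and the two dates are actually 288 days apart, so the failure point does not appear to line up with a 256-partition boundary.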

Re: Maximum Number of Hive Partitions = 256?

2011-05-04 Thread Time Less
> This is definitely a curious problem. > It's data corruption. The file is tab-separated, so I created a quick Perl pipe to print out the number of tabs on a given line: -bash-3.2$ hadoop fs -cat /user/hive/warehouse/ushb/2010-10-25/data-2010-10-25 | perl -pe 's/[^\t\n]//g' | perl -pe 's/\t/-/g'
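
The tab-counting check described in this message can also be sketched in Python. This is an illustration of the same idea, not the poster's Perl pipeline; the function names and the sample rows are hypothetical.

```python
from collections import Counter


def tab_count_histogram(lines):
    """Map 'number of tabs on a line' to 'how many lines have that count'."""
    return Counter(line.rstrip("\n").count("\t") for line in lines)


def suspicious_lines(lines, expected_tabs):
    """Return (line_number, tab_count) for every line whose tab count differs
    from expected_tabs. Line numbers start at 1."""
    return [
        (number, line.count("\t"))
        for number, line in enumerate(lines, start=1)
        if line.count("\t") != expected_tabs
    ]


# Hypothetical sample: two well-formed rows and one row missing a field.
sample = ["alice\t10\t2010-10-25\n", "bob\t7\n", "carol\t3\t2010-10-25\n"]

for tabs, count in sorted(tab_count_histogram(sample).items()):
    print(f"{tabs} tabs: {count} line(s)")

print(suspicious_lines(sample, expected_tabs=2))
```

In a healthy tab-separated file the histogram should contain a single entry. To check a real file, the same functions could be fed `sys.stdin` while the file is streamed in through a pipe.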

Re: HiveQL scripts and input arguments

2011-05-04 Thread Time Less
> Just wondering if there is native support for input arguments on Hive > scripts. > eg. $ bin/hive -f script.q > Any documentation I could reference to look into this further? > A workaround: cat script.q | sed -e 's/arg1/arg1val/' | bin/hive -- Tim Ellis Riot Games
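
The substitution workaround above can be sketched in Python as well. This is a minimal illustration, assuming the script uses plain-text placeholders such as `arg1`; the function name, the table `t` and the column `dt` are hypothetical. Unlike the `sed` command in the message, which replaces only the first match on each line, this version replaces every occurrence.

```python
def render_script(script_text, substitutions):
    """Replace every occurrence of each placeholder with its value."""
    for placeholder, value in substitutions.items():
        script_text = script_text.replace(placeholder, value)
    return script_text


# Hypothetical script body containing the placeholder 'arg1'.
template = "select * from t where dt = 'arg1';"

rendered = render_script(template, {"arg1": "2011-05-04"})
print(rendered)  # select * from t where dt = '2011-05-04';
```

The rendered text would then be piped to the Hive CLI on standard input, as in the original one-liner. Plain text replacement is blunt: a placeholder such as `arg1` also matches inside `arg10`, so distinctive placeholder names are safer.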

Re: What's official site for howl ?

2011-05-04 Thread Time Less
Official site is pretty lean. Is the idea to do away with that MySQL metastore? On Wed, May 4, 2011 at 7:18 PM, Jeff Zhang wrote: > Thanks for your guys' quick reply, really appreciate that > > > > > On Thu, May 5, 2011 at 10:13 AM, Alan Gates wrote: > >> http://incubator.apache.org/hcatalog/ >

Bizarro Hive (Hadoop?) Error

2011-05-06 Thread Time Less
My cluster went corrupt-mode. I wiped it and deleted the Hive metastore and started over. In the process, I did a "yum upgrade" which probably took me from CDH3b4 to CDH3u0. Now every time I submit a Hive query of complexity requiring a map/reduce job*, I get this error: 2011-05-06 18:39:14,533 Sta

Re: Extract and Load to Hadoop via Pipes

2011-05-26 Thread Time Less
I had trouble with Sqoop, so here's what I do (Perl): $cmd = qq#echo "select * from $tableName where $dateColumn >= '$dayStart 00:00:00' and $dateColumn < '$dayEnd 00:00:00'" \\ | mysql -h $dwIP --quick -B --skip-column-names --user=$USER --password=$PASS $databaseName \\ | ssh hdfs\@$

Re: Join optimization in Hive

2011-05-27 Thread Time Less
I would also like to know this! I haven't tried this sort of query yet, but I know I will run this exact formula of query very often. On Fri, May 27, 2011 at 2:32 PM, Shantian Purkad wrote: > We have a query which looks something like this > > select /*+ MAPJOIN(a) */ * from tableA a > left oute

Re: wiki has moved!

2011-06-27 Thread Time Less
> We need your help (or at least tolerance) to deal with some of the > imperfections in the migration process: > > https://cwiki.apache.org/confluence/display/Hive/AboutThisWiki > > If you are already an editor on the old wiki, or if you would like to help with > fixing/editing now, contact me for writ

Re: Bizarro Hive (Hadoop?) Error

2011-06-27 Thread Time Less
log url)' FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask Again, trying to go to "http://hadooptest14:50060/tasklog?taskid=attempt_201106271658_0002_m_08_0&all=true" returns that argument attemptid error. What am I doing wrong here? I appear to k

Re: Bizarro Hive (Hadoop?) Error

2011-06-28 Thread Time Less
> You are hitting this bug - https://issues.apache.org/jira/browse/HIVE-1579 > I consistently hit this bug for one of the Hive queries. > > > Sumanth > > > > On Mon, Jun 27, 2011 at 5:08 PM, Time Less wrote: > >> Today I'm getting this error again. A Google search b

One Schema Per Partition? (Multiple schemas per table?)

2011-08-22 Thread Time Less
I found a set of slides from Facebook online about Hive that claims you can have a schema per partition in the table. This is exciting to us because we have a table like so: id int name string level int date string And it's broken up into partitions by date. However, on a particular dat

Re: One Schema Per Partition? (Multiple schemas per table?)

2011-08-29 Thread Time Less
> c) alter table tbl replace columns (id: int, level: int, name_id: int) > d) -- add more partitions. > > If you do select * from tbl, then this should work. You need not to rewrite > any of your data. Can you provide more info about what output you were > expecting and what you

Re: Bizarro Hive (Hadoop?) Error

2011-08-29 Thread Time Less
, Jun 28, 2011 at 10:40 AM, Time Less wrote: > I'm having a hard time interpreting the JIRA - it seems to be saying that > Hive is passing an incorrect "mapred.child.java.opts=XmxNNNM" parameter, > missing the - ... Is that correct? I could dig into the Hive source co

Re: One Schema Per Partition? (Multiple schemas per table?)

2011-10-06 Thread Time Less
> I figured that out by reading both the code and the manual. I don't think > it's explicitly documented anywhere, so it will be great if you document > this. This page looks like the right place where this piece of information can live. > Thanks for the help in making Hive better. > > Ashutosh

Odd Behaviour with get_json() (or perhaps with explode(array))

2012-01-19 Thread Time Less
We are running this query: select name, sum_id from ( select name, players, array(player1, player2, player3, player4, player5, player6, player7, player8) arr from ( select name, get_json_object(roster_json, '$.memberList.playerId') players, get_json_object(roster_json, '$.memberList.playerId\[0]')

Re: Odd Behaviour with get_json() (or perhaps with explode(array))

2012-01-26 Thread Time Less
r one. We think this is probably a bug, but we're really not sure. Does anyone have any feedback? Does it look like a bug or an error on the part of our BI team? On Thu, Jan 19, 2012 at 1:27 PM, Time Less wrote: > We are running this query: > > select name, sum_id > from ( > sel

UDF ;; FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.FunctionTask

2012-02-02 Thread Time Less
I'm having a heck of a time the past couple days. Google suggests others have had this same error without resolution since mid-2010. Maybe someone here can shed some light on this? *See the package and method name: *[hdfs@laxhadoop1-012 15:23:14 ~/Tim] :) head LeoRank.java package com.riot.hive.ud

Re: UDF ;; FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.FunctionTask

2012-02-02 Thread Time Less
nstructions for this seem to lead people astray; hopefully this post will help others in the future. -- Tim Ellis Data Architect, Riot Games On Thu, Feb 2, 2012 at 5:08 PM, Edward Capriolo wrote: > Maybe you are being affected by this? > > https://issues.apache.org/jira/browse/HIVE-2635

Building a Large Properly-Configured HBase Cluster -- In a Day

2012-10-26 Thread Time Less
Greetings. I have an announcement that is of great interest to the HBase/Hive communities. tl;dr :: You can build a large, realistically-configured HBase cluster in a few hours (from nothing) using Chef or Ansible (Puppet in the works). It also builds large high-availability MySQL clusters as well, a

Question re: Concurrency

2014-10-13 Thread Time Less
I am looking at this page: https://cwiki.apache.org/confluence/display/Hive/Locking I have two questions. First, what is the shared/exclusive locking behaviour of "alter table T1 partition(P1) concatenate"? It seems likely it should be the same as for touch partition. And see this quote (emphasis

ALTER TABLE T1 PARTITION(P1) CONCATENATE bug?

2014-10-13 Thread Time Less
Has anyone seen anything like this? Google searches turned up nothing, so I thought I'd ask here, then file a JIRA if no-one thinks I'm doing it wrong. If I ALTER a particular table with three partitions once, it works. Second time it works, too, but reports it is moving a directory to the Trash t

Re: ALTER TABLE T1 PARTITION(P1) CONCATENATE bug?

2014-10-14 Thread Time Less
2014 at 10:37 PM, Time Less wrote: > Has anyone seen anything like this? Google searches turned up nothing, so > I thought I'd ask here, then file a JIRA if no-one thinks I'm doing it > wrong. > > If I ALTER a particular table with three partitions once, it works. Second