I created a partitioned table, partitioned daily. If I query the earlier
partitions, everything works. The later ones fail with error:
hive> select substr(user_name,1,1),count(*) from u_s_h_b where
dtpartition='2010-10-24' group by substr(user_name,1,1) ;
Total MapReduce jobs = 1
Launching Job 1 out of 1
> I am sure the issue has something to do with an empty string passed to the
> substr function.
We can rule out the substr() function. I get the same stack trace with any
query like:
hive> select * from ushb where dtpartition='2010-10-25' limit 10;
But this query succeeds:
hive> select * from u
> It turns out that 2010-10-24 is 257 days from the very first partition in
> my dataset (2010-01-09):
>
> +------------------------------------------+
> | date_sub('2010-10-24', interval 257 day) |
> +------------------------------------------+
> | 2010-02-09                               |
> +------------------------------------------+
>
I just noticed 257 days back is FEBRUARY 9th, not JANUARY 9th.
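A quick sanity check of that arithmetic, as a sketch assuming GNU coreutils `date` is available:

```shell
# 257 days before 2010-10-24 (GNU date relative-date syntax)
date -d "2010-10-24 257 days ago" +%F   # prints 2010-02-09
```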
> This is definitely a curious problem.
>
It's data corruption. The file is tab-separated, so I created a quick Perl
pipe to print out the number of tabs on a given line:
-bash-3.2$ hadoop fs -cat
/user/hive/warehouse/ushb/2010-10-25/data-2010-10-25 | perl -pe
's/[^\t\n]//g' | perl -pe 's/\t/-/g'
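An equivalent one-liner that prints a numeric tab count per line instead of dashes (a sketch using awk, shown here with inline sample input rather than the real HDFS file):

```shell
# gsub() returns the number of substitutions made on $0,
# i.e. the number of tabs on each line
printf 'a\tb\tc\nd\te\n' | awk '{ print gsub(/\t/, "") }'
```

Lines whose count differs from the expected column count are the corrupt ones.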
> Just wondering if there is native support for input arguments on Hive
> scripts.
> e.g. $ bin/hive -f script.q
> Any documentation I could reference to look into this further?
>
A workaround:
cat script.q | sed -e 's/arg1/arg1val/' | bin/hive
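Spelled out with a throwaway query file (the `arg1` placeholder name comes from the workaround above; the `/tmp/script.q` path and the query itself are just illustrative):

```shell
# Write a script containing a placeholder, substitute a value, then pipe to hive
printf "select * from ushb where dtpartition='arg1' limit 10;\n" > /tmp/script.q
sed -e "s/arg1/2010-10-25/" /tmp/script.q   # pipe this output into bin/hive
```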
--
Tim Ellis
Riot Games
Official site is pretty lean. Is the idea to do away with that MySQL
metastore?
On Wed, May 4, 2011 at 7:18 PM, Jeff Zhang wrote:
> Thanks for your quick replies, really appreciate it
>
> On Thu, May 5, 2011 at 10:13 AM, Alan Gates wrote:
>
>> http://incubator.apache.org/hcatalog/
>
My cluster went corrupt-mode. I wiped it and deleted the Hive metastore and
started over. In the process, I did a "yum upgrade" which probably took me
from CDH3b4 to CDH3u0. Now every time I submit a Hive query complex enough to
require a map/reduce job*, I get this error:
2011-05-06 18:39:14,533 Sta
I had trouble with Sqoop, so here's what I do (Perl):
$cmd = qq#echo "select * from $tableName where $dateColumn >= '$dayStart
00:00:00' and $dateColumn < '$dayEnd 00:00:00'" \\
| mysql -h $dwIP --quick -B --skip-column-names --user=$USER
--password=$PASS $databaseName \\
| ssh hdfs\@$
I would also like to know this! I haven't tried this sort of query yet, but
I know I will run this exact form of query very often.
On Fri, May 27, 2011 at 2:32 PM, Shantian Purkad
wrote:
> We have a query which looks something like this
>
> select /*+ MAPJOIN(a) */ * from tableA a
> left oute
> We need your help (or at least tolerance) to deal with some of the
> imperfections in the migration process:
>
> https://cwiki.apache.org/confluence/display/Hive/AboutThisWiki
>
> If you are already an editor on the old wiki, or if you would like to help with
> fixing/editing now, contact me for writ
log url)'
FAILED: Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.MapRedTask
Again, trying to go to
http://hadooptest14:50060/tasklog?taskid=attempt_201106271658_0002_m_08_0&all=true
returns that argument attemptid error.
What am I doing wrong here? I appear to k
> You are hitting this bug - https://issues.apache.org/jira/browse/HIVE-1579
> I consistently hit this bug for one of the Hive queries.
>
>
> Sumanth
>
>
>
> On Mon, Jun 27, 2011 at 5:08 PM, Time Less wrote:
>
>> Today I'm getting this error again. A Google search b
I found a set of slides from Facebook online about Hive that claims you can
have a schema per partition in the table, this is exciting to us, because we
have a table like so:
id int
name string
level int
date string
And it's broken up into partitions by date. However, on a particular dat
> c) alter table tbl replace columns (id int, level int, name_id int)
> d) -- add more partitions.
>
> If you do select * from tbl, then this should work. You need not rewrite
> any of your data. Can you provide more info about what output you were
> expecting and what you
, Jun 28, 2011 at 10:40 AM, Time Less wrote:
> I'm having a hard time interpreting the JIRA - it seems to be saying that
> Hive is passing an incorrect "mapred.child.java.opts=XmxNNNM" parameter,
> missing the - ... Is that correct? I could dig into the Hive source co
> I figured that out by reading both the code and the manual. I don't think
> it's explicitly documented anywhere, so it would be great if you document
> this. This page looks like the right place where this piece of information can live.
> Thanks for the help in making Hive better.
>
> Ashutosh
We are running this query:
select name, sum_id
from (
select name, players,
array(player1, player2, player3, player4, player5, player6, player7,
player8) arr
from (
select name,
get_json_object(roster_json, '$.memberList.playerId') players,
get_json_object(roster_json, '$.memberList.playerId\[0]')
r one.
We think this is probably a bug, but we're really not sure. Does anyone
have any feedback? Does it look like a bug or an error on the part of our
BI team?
On Thu, Jan 19, 2012 at 1:27 PM, Time Less wrote:
> We are running this query:
>
> select name, sum_id
> from (
> sel
I'm having a heck of a time the past couple days. Google suggests others
have had this same error without resolution since mid-2010. Maybe someone
here can shed some light on this?
See the package and method name:
[hdfs@laxhadoop1-012 15:23:14 ~/Tim] :) head LeoRank.java
package com.riot.hive.ud
nstructions for this seem
to lead people astray, hopefully this post will help others in the future.
--
Tim Ellis
Data Architect, Riot Games
On Thu, Feb 2, 2012 at 5:08 PM, Edward Capriolo wrote:
> Maybe you are being affected by this?
>
> https://issues.apache.org/jira/browse/HIVE-2635
>
Greetings. I have an announcement that is of great interest to the HBase/Hive
communities.
tl;dr :: You can build a large, realistically-configured HBase cluster in a
few hours (from nothing) using Chef or Ansible (Puppet in the works). It
builds large high-availability MySQL clusters as well, a
I am looking at this page:
https://cwiki.apache.org/confluence/display/Hive/Locking
I have two questions. First, what is the shared/exclusive locking behaviour
of "alter table T1 partition(P1) concatenate"? It seems likely it should be
the same as for touch partition.
And see this quote (emphasis
Has anyone seen anything like this? Google searches turned up nothing, so I
thought I'd ask here, then file a JIRA if no-one thinks I'm doing it wrong.
If I ALTER a particular table with three partitions once, it works. Second
time it works, too, but reports it is moving a directory to the Trash t
2014 at 10:37 PM, Time Less wrote:
> Has anyone seen anything like this? Google searches turned up nothing, so
> I thought I'd ask here, then file a JIRA if no-one thinks I'm doing it
> wrong.
>
> If I ALTER a particular table with three partitions once, it works. Second
>