What is the data volume? And what is the meaning of the data?
From what I can see, you have a 'pack' per day. If that's true, a map join
could be used, because you should not have that many pack creations (but I am
not sure how to enforce that).
So filtering could happen right after. You woul
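For illustration, the map-join-then-filter idea might look like this (the
"events" table and its "eventtime" column are assumptions for the sketch; only
table1 and packcreatetime come from the question quoted below):
SELECT /*+ MAPJOIN(p) */ e.id, p.packid
FROM events e
JOIN table1 p ON (e.id = p.id)          -- equi-join only; Hive cannot put the
WHERE e.eventtime <= p.packcreatetime;  -- inequality in ON, so filter after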
If you don't specify a join condition, Hive performs a cross join.
What was added in Hive 0.10.0 is just clarifying grammar.
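For example, with two hypothetical tables t1 and t2, both of these produce a
cross join; only the second form needs the 0.10.0 grammar:
SELECT * FROM t1 JOIN t2;        -- no ON clause: Hive performs a cross join
SELECT * FROM t1 CROSS JOIN t2;  -- explicit spelling added in Hive 0.10.0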
2012/8/17 Himanish Kushary
We are on Hive 0.8; I think CROSS JOIN is available only since 0.10.0.
Do we have any other options?
On Thu, Aug 16, 2012 at 2:28 PM, Ablimit Aji wrote:
You can do a CROSS JOIN, then filter with the original inequality join
condition.
This would generate a lot of redundant tuples and may not work if you have
large amounts of data.
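A sketch of that approach against the question quoted below (the second
table's name and columns, and the exact inequality, are my assumptions):
SELECT t1.id, t1.packid, t2.eventtime
FROM table1 t1 JOIN table2 t2             -- no ON clause: cross join on 0.8
WHERE t1.id = t2.id
  AND t2.eventtime >= t1.packcreatetime;  -- inequality applied as a filter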
On Thu, Aug 16, 2012 at 2:07 PM, Himanish Kushary wrote:
Hi,
We have two tables in the following structure:
Table1:
| id  | packcreatetime      | packid |
----------------------------------------
| 505 | 2012-07-16 11:51:12 | 111024 |
| 505 | 2012-07-18 11:52:13 | 111025 |
| 505
Thanks Jan,
I was looking for the first one, summing the values from two columns into one
number.
I did it as sum(col1) + sum(col2), but your solution is more elegant ☺
Regards,
Richin
From: ext Jan Dolinár [mailto:dolik@gmail.com]
Sent: Thursday, August 16, 2012 12:07 PM
To: user@hive.apache.org
Hi Richin,
Do you mean summing the values from two columns into one number, or
calculating the sum of each column as two separate sums in one query? Both are
possible: the first can be done simply as SUM(col1 + col2), the second can
be accomplished with two sums: SUM(col1), SUM(col2). Does that answer your
question?
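A minimal sketch of both variants (my_table and the column names are just
placeholders):
SELECT SUM(col1 + col2) FROM my_table;      -- one combined number
SELECT SUM(col1), SUM(col2) FROM my_table;  -- two sums in a single query
One caveat worth checking: SUM(col1 + col2) skips any row where either column
is NULL, while SUM(col1) + SUM(col2) does not, so the two can differ on
nullable columns.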
Hello All,
I am copying data from a local drive to HDFS and creating an external
table in Hive. But for some reason the data is not copied, and the create
table script gives a "file/folder doesn't exist" error.
Note: there is also a "log4j" error; I don't know the exact reason for
this error. In
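For comparison, the two steps usually look something like this (paths, table
name, and columns are hypothetical):
-- First copy the local file into HDFS, e.g.:
--   hadoop fs -put /local/path/data.csv /user/me/mydata/
CREATE EXTERNAL TABLE my_table (
  id INT,
  name STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION '/user/me/mydata';
If the copy never happened, a script pointing at that path can fail with the
kind of "doesn't exist" error described above.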
You could do it using a Pivot table in MS Excel. It's under the Insert tab,
first option on the left.
Richin
-----Original Message-----
From: Jain Richin (Nokia-LC/Boston)
Sent: Thursday, August 09, 2012 4:16 PM
To: user@hive.apache.org
Subject: RE: Converting rows into dynamic colums in Hive
Th
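If the pivot has to happen in Hive itself rather than Excel, a common sketch
(hypothetical table and columns; the output columns must be known up front,
so truly dynamic columns still need a tool like Excel) is conditional
aggregation:
SELECT id,
       MAX(CASE WHEN attr = 'color' THEN val END) AS color,
       MAX(CASE WHEN attr = 'size'  THEN val END) AS size
FROM my_rows
GROUP BY id;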
Hello,
Is there a way to aggregate multiple columns in Hive?
I can do it in two separate queries, but is there something similar to
sum(col1, col2)?
Thanks,
Richin
Creating the /user/hive/warehouse folder is a one-time setup step that can
be done as the hdfs user. With g+w permissions any user can then create
and read the tables.
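Concretely, that one-time setup might look like this (a hedged sketch, run
once by an admin; the directories match the Hive manual quoted further down,
and are assumed not to exist yet):
sudo -u hdfs hadoop fs -mkdir /tmp /user/hive/warehouse
sudo -u hdfs hadoop fs -chmod g+w /tmp /user/hive/warehouse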
On Thu, Aug 16, 2012 at 9:57 AM, Connell, Chuck wrote:
I have no doubt that works, but surely a Hive user should not need sudo
privileges! I am also looking for best practices, since we have run into the
same issue.
From: Himanish Kushary [mailto:himan...@gmail.com]
Sent: Thursday, August 16, 2012 9:51 AM
To: user@hive.apache.org
Subject: Re: Hive direc
We usually start the shell through sudo, otherwise we get a "Permission denied"
error while creating Hive tables.
But this is a good point; any suggestions/best practices from the user
community?
Thanks
On Thu, Aug 16, 2012 at 9:37 AM, Connell, Chuck wrote:
I have run into similar problems. Thanks for the suggestions. One concern...
Isn't hdfs a highly privileged user within the Hadoop cluster? So do we really
want it to be standard practice for all Hive users to su to hdfs?
Chuck Connell
Nuance R&D Data Team
Burlington, MA
From: Himanish Kushary
Hi,
To address this issue, for now I have changed all the fields in the
external tables to the STRING datatype. The joins on the external tables are
working fine now. I will try to change the datatypes while transforming to
Hive managed tables and re-execute the joins on the new tables.
Any other suggesti
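A hedged sketch of that follow-up step (the table names and the BIGINT target
type are assumptions):
-- Move rows from the all-STRING external table into a managed table,
-- converting the datatypes explicitly along the way:
INSERT OVERWRITE TABLE managed_table
SELECT CAST(id AS BIGINT), packcreatetime, packid
FROM external_table;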
Hi Sean,
From the Hive language manual: "Moreover, we strongly advise users to
create the HDFS directories /tmp and /user/hive/warehouse
(aka hive.metastore.warehouse.dir) and set them chmod g+w before tables are
created in Hive."
My warehouse directory has the following permissions:
Name
Ty
Thanks for your suggestion, Bejoy.
I am using Hive 0.7.1, so I can't use your first solution.
The second one is a good idea, but I get a large number of files in
staging, which clutters my HDFS. Each file ranges from 40 KB to at most
4.4 MB, though my block size is 64 MB.
This is one of the reaso
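As an aside, one way to tame the small-file problem on that vintage of Hive
is to let it merge job outputs; these settings exist in 0.7, though the size
value here is only illustrative:
SET hive.merge.mapfiles=true;            -- merge small outputs of map-only jobs
SET hive.merge.mapredfiles=true;         -- merge small outputs of MR jobs
SET hive.merge.size.per.task=256000000;  -- rough target size for merged files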
Hi Anand,
You don't necessarily need to go for UNION ALL for your requirement.
Use INSERT INTO instead, which has less overhead. It is supported from Hive
0.8.
INSERT INTO main_table SELECT * FROM stage_table;
Or an even better approach if you are just copying whole data from one table to