Thanks Leo,
Does the smaller table go into the mapjoin hint? Actually, when I ran a test
query with the bigger table in the hint, it performed better.
On Thu, Jan 20, 2011 at 12:40 PM, Leo Alekseyev wrote:
> You can only specify one table, and make sure to include its name,
> i.e. /*+ mapjoin(t
You can only specify one table, and make sure to include its name,
i.e. /*+ mapjoin(t2)*/. For more info see
http://wiki.apache.org/hadoop/Hive/JoinOptimization and
http://www.slideshare.net/aiolos127/join-optimization-in-hive.
Also, you are using a relatively old version of Hive, but I'll let
m
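Leo's point can be sketched like this (table and column names are illustrative, not from the thread):

```sql
-- The smaller table (t2, aliased b) is the one named in the hint;
-- it gets loaded into memory on each mapper, avoiding a reduce-side join.
select /*+ mapjoin(b) */ a.id, b.val
from t1 a
join t2 b on (a.id = b.id);
```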
Hi,
How do I use the mapjoin hint in a query?
Say I have two tables, t1 and t2, where t2 is the smaller table. Do I specify
t2 in the mapjoin hint?
select /*+ mapjoin(b) */ * from t1 a join t2 b on (a.id = b.id)
If I am joining two smaller tables, can I specify two clauses in the
mapjoin? /*+mapjoi
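For the two-small-tables follow-up, the hint generally accepts a comma-separated list of tables — a sketch worth verifying on your Hive version, since hint handling changed across releases:

```sql
-- Both b and c are held in memory on the mappers.
select /*+ mapjoin(b, c) */ a.id
from t1 a
join t2 b on (a.id = b.id)
join t3 c on (a.id = c.id);
```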
Thanks again.
I think I figured out the bug (not sure if it's a bug or a known limitation
when creating a third-level join). We need another table c to re-create my
scenario.
table_a
create table table_a(a_id bigint, common_id bigint, int_a int, int_b int,
int_c int, int_d int,
Yup. Thanks for your help J-D, much appreciated!
On Wed, Jan 19, 2011 at 4:41 PM, Jean-Daniel Cryans wrote:
> Have a looksee here http://wiki.apache.org/hadoop/Hive/FAQ
>
> J-D
>
> On Wed, Jan 19, 2011 at 4:38 PM, vipul sharma
> wrote:
> > Thanks!
> >
> > Now I am hitting mysql bug of max key len
Have a looksee here http://wiki.apache.org/hadoop/Hive/FAQ
J-D
On Wed, Jan 19, 2011 at 4:38 PM, vipul sharma wrote:
> Thanks!
>
> Now I am hitting MySQL's max key length limit: Specified key was too long;
> max key length is 767 bytes
>
> 11/01/19 16:34:47 ERROR DataNucleus.Datastore: Error throw
Thanks!
Now I am hitting MySQL's max key length limit: Specified key was too long;
max key length is 767 bytes
11/01/19 16:34:47 ERROR DataNucleus.Datastore: Error thrown executing CREATE
TABLE `SD_PARAMS`
(
`SD_ID` BIGINT NOT NULL,
`PARAM_KEY` VARCHAR(256) BINARY NOT NULL,
`PARAM_VALU
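A common workaround for this MySQL 767-byte index limit (not from this thread; verify before running against your metastore) is to give the metastore database a single-byte character set, since utf8 counts 3 bytes per character when sizing index keys:

```sql
-- Run in MySQL before (re-)initializing the metastore; database name is illustrative.
create database metastore default character set latin1;
-- or, for an existing metastore database:
alter database metastore character set latin1;
```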
Try setting this in your hive-site:
<property>
  <name>datanucleus.transactionIsolation</name>
  <value>repeatable-read</value>
</property>
<property>
  <name>datanucleus.valuegeneration.transactionIsolation</name>
  <value>repeatable-read</value>
</property>
J-D
On Wed, Jan 19, 2011 at 4:05 PM, vipul sharma wrote:
> Hi,
>
> we had been running cloudera distribution of hadoop. We installed
Hi,
We had been running the Cloudera distribution of Hadoop. We installed Hive
following this document
https://wiki.cloudera.com/display/DOC/Hive+Installation. hive-site.xml was
later modified to store the metastore in MySQL, very similar to the config in
this blog
http://blog.milford.io/2010/06/install
I didn't do the test you suggested, but with the sequence file case:
- the size of what should have been compressed was bigger than the uncompressed version
- it didn't have a .deflate suffix
- in contrast to the text file case, where I got 10x compression or so,
Cheers,
-Ajo
On Wed, Jan 19, 2011 at 11:30
Here's a simple check -- look inside one of your sequence files:
hadoop fs -cat /your/seq/file | head
If it is compressed, the header will contain the compression codec's name and
the data will look like gibberish. Otherwise, it is not compressed.
-Original Message-
From: Ajo Fod [mailto:aj
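The same check can be done programmatically. This is a minimal sketch assuming you have fetched a local copy of the file (e.g. via `hadoop fs -get`); it relies on the fact that a SequenceFile starts with the magic bytes `SEQ` and, when compression is on, carries the codec class name as a plain string in its header. The codec list below is illustrative, not exhaustive.

```python
def seqfile_codec(path):
    """Return the codec short name, 'uncompressed', or None if not a SequenceFile."""
    with open(path, "rb") as f:
        header = f.read(1024)  # the header fits comfortably in the first 1 KB
    if not header.startswith(b"SEQ"):
        return None  # missing magic bytes: not a SequenceFile at all
    # When compression is enabled, the codec class name (e.g.
    # org.apache.hadoop.io.compress.DefaultCodec) appears verbatim in the header.
    for codec in (b"DefaultCodec", b"GzipCodec", b"BZip2Codec",
                  b"SnappyCodec", b"LzoCodec"):
        if codec in header:
            return codec.decode()
    return "uncompressed"
```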
EXPLAIN select t1.some_string, t2.some_string, sum(t1.total_count), sum(t2.total_count)
from table_a t1
join table_b t2 on t1.part_col = t2.part_col and t1.common_id = t2.common_id
where t1.part_col >= 'mypart' and t2.part_col >= 'mypart'
group by t1.some_string, t2.some_string;
OK
ABSTRACT SYNTAX
Thanks Appan for verifying. I will do some more tests on my side too and let
you know the results.
I tried a different version of the query where I joined two sub-queries for
the same partitions, and the data comes out correct.
I will see if I can post the real-world example to the list, be
Viral,
I tried the queries below (similar to yours) and I get the expected results
when I do the join. I ran my queries after building Hive from the latest
source, on Hadoop 0.20+.
create table table_a(a_id bigint, common_id bigint, some_string
string,total_count bigint) partitioned by
On Wed, Jan 19, 2011 at 12:00 PM, Ajo Fod wrote:
> The wiki probably needs to be fixed:
> For 32 buckets, I need to set the following flags.
>
>>>set hive.merge.mapfiles = false;
>>>set mapred.map.tasks=32;
>
> ... the set mapred.reduce.tasks ... is irrelevant.
>
> The query mechanism should ide
Ah! ok.
Thanks.
-Ajo.
On Wed, Jan 19, 2011 at 9:03 AM, Ping Zhu wrote:
> I think only Hive 0.7 or later accepts the syntax "drop table if exists".
> http://wiki.apache.org/hadoop/Hive/LanguageManual/DDL#Drop_Table
>
>
> On Wed, Jan 19, 2011 at 8:54 AM, Ajo Fod wrote:
>>
>> I don't think this works.
I think only Hive 0.7 or later accepts the syntax "drop table if exists".
http://wiki.apache.org/hadoop/Hive/LanguageManual/DDL#Drop_Table
On Wed, Jan 19, 2011 at 8:54 AM, Ajo Fod wrote:
> I don't think this works.
> >> drop table if exists ;
> ... it seems to fail on the if exists part.
>
> Is anyo
The wiki probably needs to be fixed:
For 32 buckets, I need to set the following flags.
>>set hive.merge.mapfiles = false;
>>set mapred.map.tasks=32;
... the set mapred.reduce.tasks ... is irrelevant.
The query mechanism should ideally set this automatically!
Cheers,
-Ajo
On Wed, Jan 19, 2
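Ajo's settings in context — a sketch of a bucketed load reflecting what the thread reports for Hive 0.5 (table and column names are illustrative; `clustered by` declares the bucket count, while the settings below are what Ajo found actually controls the number of files written):

```sql
create table events_b (id bigint, name string)
clustered by (id) into 32 buckets;

set hive.merge.mapfiles = false;
set mapred.map.tasks = 32;

insert overwrite table events_b
select id, name from events;
```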
I don't think this works.
>> drop table if exists ;
... it seems to fail on the if exists part.
Is anyone's experience different? ... I'm using CDH3 ... Hive 0.5.0.
-Ajo
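For anyone stuck on a pre-0.7 build, a plain DROP TABLE is the usual fallback; in my experience it does not error out on a missing table, but that is worth verifying on your exact build:

```sql
-- Hive 0.7+:
drop table if exists my_table;
-- Hive 0.5 (no IF EXISTS support); a bare drop generally does not fail
-- on a missing table:
drop table my_table;
```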
On Wed, Jan 19, 2011 at 10:46 AM, Ajo Fod wrote:
> I've 2 questions:
> 1) how to raise the number of reducers?
> 2) why are there only 2 bucket files per partition even though I
> specified 32 buckets?
>
>
> I've set the following and don't see an increase in the number of reducers.
>>>set hive.ex
I've 2 questions:
1) How do I raise the number of reducers?
2) Why are there only 2 bucket files per partition even though I
specified 32 buckets?
I've set the following and don't see an increase in the number of reducers.
>>set hive.exec.reducers.max=32;
>>set mapred.reduce.tasks=32;
Could this b
On Wed, Jan 19, 2011 at 2:37 AM, Guy Doulberg wrote:
> Hey All again,
>
>
>
> I bet I am not the first one to ask this question, but I could not find an
> answer anywhere.
>
>
>
> I am using the following temporary function:
>
> CREATE TEMPORARY FUNCTION jeval AS 'org.apache.hadoop.hive.ql.udf.UDF
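For reference, registering and using a custom UDF generally looks like this (the jar path and class name are illustrative, not Guy's actual ones, which are truncated above):

```sql
add jar /path/to/my-udfs.jar;
create temporary function jeval as 'com.example.hive.udf.JEval';
select jeval(some_col) from some_table;
```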