Full outer join has a problem

2010-10-29 Thread jaydeep vishwakarma

Hi,

Full outer join is not working as I would expect in Hive 0.4.1. I
tried a few queries with a full outer join. I expected Hive to first
filter the rows by the WHERE clause and then join the tables. But
every time it reads all the rows for the join and only applies the
filter afterwards, so every time it uses the full number of maps for
the join. Is that a known issue?
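
A minimal sketch of the pattern in question, assuming two hypothetical tables a and b partitioned by a string column ds (none of these names come from the report above). In Hive the WHERE clause is evaluated after the join, and for a full outer join a WHERE predicate generally cannot be pushed below the join, because either side may be null-padded. Filtering inside subqueries lets each side read only the wanted rows before the join:

```sql
-- Hypothetical tables: a(id, val, ds) and b(id, val, ds), both partitioned by ds.

-- Form 1: filter in WHERE. The join runs first, so (as reported above) both
-- tables are scanned in full and the filter is applied to the joined rows.
SELECT a.id, a.val, b.val
FROM a FULL OUTER JOIN b ON (a.id = b.id)
WHERE a.ds = '2010-10-29' AND b.ds = '2010-10-29';

-- Form 2: filter each input in a subquery, then join the reduced inputs.
SELECT x.id, x.val, y.val
FROM (SELECT id, val FROM a WHERE ds = '2010-10-29') x
FULL OUTER JOIN
     (SELECT id, val FROM b WHERE ds = '2010-10-29') y
ON (x.id = y.id);
```

The two forms are not equivalent. Form 1 discards unmatched rows, because a NULL ds on the padded side fails the filter, while Form 2 keeps them. Running EXPLAIN on each form should show where the filter sits relative to the join.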

Regards,
Jaydeep


The information contained in this communication is intended solely for the use 
of the individual or entity to whom it is addressed and others authorized to 
receive it. It may contain confidential or legally privileged information. If 
you are not the intended recipient you are hereby notified that any disclosure, 
copying, distribution or taking any action in reliance on the contents of this 
information is strictly prohibited and may be unlawful. If you have received 
this communication in error, please notify us immediately by responding to this 
email and then delete it from your system. The firm is neither liable for the 
proper and complete transmission of the information contained in this 
communication nor for any delay in its receipt.


Problem fetching data

2010-12-17 Thread jaydeep vishwakarma

Hi,


I am running some queries on Hive using the "insert overwrite local"
syntax. When I query a small data set, the data is stored on my local
system. When I query a larger data set, the data is not stored
locally, but I can see it in the Hive scratch directory. Any
idea what could be the problem?
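
For reference, a sketch of the statement form being discussed. The table name, partition value, and output paths are hypothetical and only illustrate the syntax. The second statement writes to an HDFS directory instead of the local file system; it is a possible way to check whether the query itself completes on the larger data set, not a confirmed fix:

```sql
-- Hypothetical table and paths, for illustration only.

-- Write the query result to a directory on the local file system.
INSERT OVERWRITE LOCAL DIRECTORY '/tmp/hive_out'
SELECT t.* FROM some_table t WHERE t.rq_dt = '2010-12-16-00';

-- Write the same result to an HDFS directory instead; the files can then be
-- copied out of HDFS as a separate step.
INSERT OVERWRITE DIRECTORY '/tmp/hive_out_hdfs'
SELECT t.* FROM some_table t WHERE t.rq_dt = '2010-12-16-00';
```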

Regards,
Jaydeep



Issue with map join

2010-12-23 Thread jaydeep vishwakarma

Hi,

I am trying to run some MAPJOIN queries. When I place a single
table in the MAPJOIN hint it works fine, but when I run the same query
with two tables in the MAPJOIN hint it gives an error. Can anyone tell
me what could be the problem? Here is the error log that I am getting
from the job tracker.


java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row 
{"_col0":"5","_col87":"2010-12-16-00","_col89":"China","_col91":"2010-12-15-20"}
   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:171)
   at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
   at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
   at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
   at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row 
{"_col0":"5","_col87":"2010-12-16-00","_col89":"China","_col91":"2010-12-15-20"}
   at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:417)
   at org.apache.hadoop.hive.ql.exec.ExecMapper.map(ExecMapper.java:153)
   ... 4 more
Caused by: java.lang.NullPointerException
   at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:177)
   at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:457)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:697)
   at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
   at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:457)
   at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:697)
   at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:400)
   ... 5 more

Regards,
Jaydeep



Re: Issue with map join

2010-12-24 Thread jaydeep vishwakarma

Hi Namit,

Here is the query that I am trying to run:

select /*+ MAPJOIN(region, carrier) */
  region.name, carrier.country, count(column_1)
from foo
join carrier on (foo.column_10 = carrier.id)
join region on (carrier.country = region.country)
where foo.rq_dt = '2010-12-16-00'
  AND carrier.rq_dt = '2010-12-15-20'
  AND region.rq_dt = '2010-12-15-20'
group by region.name, carrier.country;


Here are the schemas of the tables. The first two tables contain only about
600 and 200 rows in each partition, but the foo table has about 30 million
rows in each partition, and it also uses a custom SerDe.

create table carrier
(
id int,
country STRING,
name STRING
)PARTITIONED BY(rq_dt STRING);

create table region
(
id int,
country STRING,
name STRING
)PARTITIONED BY(rq_dt STRING);





create table foo
(
column_1 STRING,
column_2 STRING,
column_3 STRING,
column_4 STRING,
column_5 STRING,
column_6 STRING,
column_7 STRING,
column_8 STRING,
column_9 STRING,
column_10 INT,
column_11 INT,
column_12 STRING,
column_13 INT,
column_14 STRING,
column_15 BIGINT,
column_16 BIGINT,
column_17 BIGINT,
column_18 BIGINT,
column_19 BIGINT,
column_20 BIGINT,
column_21 BIGINT,
column_22 BIGINT,
column_23 BIGINT,
column_24 BIGINT,
column_25 BIGINT,
column_26 STRING,
column_27 STRING,
column_28 STRING,
column_29 STRING,
column_30 STRING,
column_31 STRING,
column_32 STRING,
column_33 STRING,
column_34 STRING,
column_35 INT,
column_36 STRING,
column_37 STRING,
column_38 STRING,
column_39 STRING,
column_40 STRING,
column_41 STRING,
column_42 STRING,
column_43 STRING,
column_44 STRING,
column_45 STRING,
column_46 STRING,
column_47 STRING,
column_48 STRING,
column_49 STRING,
column_50 STRING,
column_51 STRING,
column_52 STRING,
column_53 STRING,
column_54 STRING,
column_55 STRING,
column_56 STRING,
column_57 INT,
column_58 STRING,
column_59 STRING,
column_60 STRING,
column_61 STRING,
column_62 STRING,
column_63 STRING,
column_64 STRING,
column_65 STRING,
column_66 STRING,
column_67 STRING,
column_68 STRING,
column_69 STRING,
column_70 STRING,
column_71 STRING,
column_72 STRING,
column_73 STRING,
column_74 STRING,
column_75 STRING,
column_76 STRING,
column_77 STRING,
column_78 STRING,
column_79 STRING,
column_80 STRING,
column_81 STRING,
column_82 STRING,
column_83 STRING,
column_84 STRING,
column_85 STRING,
column_86 STRING,
column_87 STRING
)
PARTITIONED BY(rq_dt STRING)
ROW FORMAT SERDE 'com.inmobi.dw.datastore.hive.MySerDe'
STORED AS  SEQUENCEFILE;

Thanks,
Jaydeep

On Friday 24 December 2010 12:23 AM, Namit Jain wrote:

Can you send the exact query along with the schema of the tables?


On 12/23/10 1:48 AM, "jaydeep vishwakarma"
wrote:



Re: Issue with map join

2010-12-31 Thread jaydeep vishwakarma

I checked the map join behaviour. When I use more than one table in a map join
without a SerDe it works perfectly fine, but it does not work with a SerDe for
more than one table. I checked the code and found that the class
MapJoinOperator has a joinKeys variable, which is null when I use a SerDe
with mapjoin. Is there any way to tell mapjoin to use the SerDe? Or has
mapjoin not been implemented for more than one table with a SerDe?
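
For comparison, a sketch of two statements that may help isolate the problem: the plan Hive builds for the two-table hint, and the same query with the hint removed. Column names follow the CREATE TABLE statements posted earlier in this thread (column_1, column_10, region.name). That the un-hinted form runs as a regular reduce-side join, and that it is fast enough at 30 million rows per partition, are assumptions that have not been tested here:

```sql
-- Column names follow the CREATE TABLE statements in this thread;
-- adjust them if the real schema differs.

-- 1. Inspect the plan for the two-table MAPJOIN hint. EXPLAIN only plans
--    the query, so it should not hit the runtime NullPointerException.
EXPLAIN
SELECT /*+ MAPJOIN(region, carrier) */
  region.name, carrier.country, count(foo.column_1)
FROM foo
JOIN carrier ON (foo.column_10 = carrier.id)
JOIN region ON (carrier.country = region.country)
WHERE foo.rq_dt = '2010-12-16-00'
  AND carrier.rq_dt = '2010-12-15-20'
  AND region.rq_dt = '2010-12-15-20'
GROUP BY region.name, carrier.country;

-- 2. The same query without the hint (assumed to fall back to a
--    reduce-side join), as a baseline for the expected result.
SELECT
  region.name, carrier.country, count(foo.column_1)
FROM foo
JOIN carrier ON (foo.column_10 = carrier.id)
JOIN region ON (carrier.country = region.country)
WHERE foo.rq_dt = '2010-12-16-00'
  AND carrier.rq_dt = '2010-12-15-20'
  AND region.rq_dt = '2010-12-15-20'
GROUP BY region.name, carrier.country;
```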

Regards,
Jaydeep

On Thursday 23 December 2010 03:18 PM, jaydeep vishwakarma wrote:
