log4j format logs in Hive table

2011-12-05 Thread sangeetha k
Hi, I am new to Hive. I am using a Flume agent to collect log4j logs and send them to HDFS. Now I want to load the log4j-format logs from HDFS into Hive tables. Each of the attributes in the log statements, such as timestamp, level, classname etc., should be loaded into separate columns in the Hive tables.
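
To illustrate the kind of split being asked for, here is a minimal Python sketch that breaks one log4j line into timestamp, level, classname and message fields with a regular expression. The log layout (timestamp, level, logger name, " - ", message) is an assumption, since the original message does not show the actual log4j pattern, and the names LOG4J_LINE and parse_log4j_line are hypothetical. A comparable pattern could probably be adapted for a regex-based approach on the Hive side, but that is not shown here.

```python
import re

# Assumed layout (not taken from the original message):
#   2011-12-05 10:15:30,123 INFO com.example.Foo - some message
LOG4J_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"(?P<classname>\S+)\s+-\s+"
    r"(?P<message>.*)$"
)


def parse_log4j_line(line):
    """Return a dict with one entry per column, or None if the line does not match."""
    match = LOG4J_LINE.match(line)
    return match.groupdict() if match else None


sample = "2011-12-05 10:15:30,123 INFO com.example.Foo - Started job 42"
columns = parse_log4j_line(sample)
```

Each key of the returned dict corresponds to one target column. Lines that do not fit the assumed layout (for example stack-trace continuation lines) return None and would need separate handling.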

Re: Hive Reducers hanging - interesting problem - skew ?

2011-12-05 Thread Mark Grover
jS, Check out if this helps: http://search-hadoop.com/m/l1usr1MAHX32&subj=Re+Severely+hit+by+curse+of+last+reducer+ Mark Grover, Business Intelligence Analyst OANDA Corporation www: oanda.com www: fxtrade.com e: mgro...@oanda.com "Best Trading Platform" - World Finance's Forex Awards 2009.

Hive Reducers hanging - interesting problem - skew ?

2011-12-05 Thread john smith
Hi list, I am trying to run a join query on my 10-node cluster. My query looks as follows: select * from A JOIN B on (A.a = B.b). Size of A = 15 million rows; size of B = 1 million rows. The problem is that A.a and B.b have around 25-30 distinct values per column, which implies that they have high selecti
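
A rough back-of-the-envelope sketch in Python of why a join like this can be heavy, using the figures quoted in the message. It assumes the rows are spread evenly over 25 distinct key values and that every key appears in both tables, which the message does not state. The function names and the reducer count of 100 are hypothetical.

```python
def join_output_rows(rows_a, rows_b, distinct_keys):
    """Estimate equi-join output size, assuming keys are uniformly distributed."""
    per_key_a = rows_a // distinct_keys  # rows of A sharing one key value
    per_key_b = rows_b // distinct_keys  # rows of B sharing one key value
    # Every A row with a given key pairs with every B row with the same key.
    return distinct_keys * per_key_a * per_key_b


def busy_reducers(distinct_keys, num_reducers):
    """All rows with the same join key go to the same reducer, so at most
    one reducer per distinct key can receive any data."""
    return min(distinct_keys, num_reducers)


estimated_rows = join_output_rows(15_000_000, 1_000_000, 25)
active = busy_reducers(25, 100)
```

Under these assumptions each key value pairs 600,000 rows of A with 40,000 rows of B, so a handful of reducers would each have to produce a very large number of output rows while any remaining reducers sit idle.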

Re: hive case and group-by statement

2011-12-05 Thread Peter Hanlon
or select A, CASE WHEN B IN(1,2) THEN 'Type A' ELSE 'Type B' END AS B, C from table_a group by A, CASE WHEN B IN(1,2) THEN 'Type A' ELSE 'Type B' END, C. Using a column alias defined in the select clause is not valid in the group by. On 4 December 2011 09:21, Mapred Learn wrote: > Hi, > I