Re: Setting job diagnostics to REDUCE capability required - error in hive

2014-11-07 Thread Ja Sam
I found the problem. I had a different configuration in yarn-site.xml on the namenode than in the same file on the datanodes. I still don't know why, but this is easy to fix. On Fri, Nov 7, 2014 at 3:41 PM, Ja Sam wrote: > I don't use any scheduler. Anyway this error happens when we try to run
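
A mismatch like the one described above can be spotted by comparing the property sets of the two yarn-site.xml copies. The following is only a minimal sketch: the property names are standard YARN settings, but the values and the two inline XML samples are made up for illustration, and the thread does not say which property actually differed.

```python
import xml.etree.ElementTree as ET


def load_props(xml_text):
    """Parse Hadoop-style <configuration> XML into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def diff_props(a, b):
    """Return {name: (value_in_a, value_in_b)} for every property that differs."""
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(set(a) | set(b))
        if a.get(key) != b.get(key)
    }


# Hypothetical file contents; in practice, read each node's yarn-site.xml.
NAMENODE_XML = """<configuration>
  <property><name>yarn.scheduler.maximum-allocation-mb</name><value>8192</value></property>
  <property><name>yarn.nodemanager.resource.memory-mb</name><value>8192</value></property>
</configuration>"""

DATANODE_XML = """<configuration>
  <property><name>yarn.scheduler.maximum-allocation-mb</name><value>2048</value></property>
  <property><name>yarn.nodemanager.resource.memory-mb</name><value>8192</value></property>
</configuration>"""

differences = diff_props(load_props(NAMENODE_XML), load_props(DATANODE_XML))
for name, (on_namenode, on_datanode) in differences.items():
    print(f"{name}: namenode={on_namenode} datanode={on_datanode}")
```

A property missing from one file shows up with None on that side, so absent settings are reported as well as conflicting ones.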

Re: Setting job diagnostics to REDUCE capability required - error in hive

2014-11-07 Thread Ja Sam
ails please ? setting config parameters optimally in > yarn/mr configs might help you but please do so wisely as it may imbalance > other things if not implemented thoughtfully. > > regards > Devopam > > On Fri, Nov 7, 2014 at 7:56 PM, Ja Sam wrote: > >> I have a simple

Setting job diagnostics to REDUCE capability required - error in hive

2014-11-07 Thread Ja Sam
I have a simple query with grouping. Something similar to below: SELECT col1, col2, col3, min(date), count(*) FROM tblX WHERE partitionDate="20141107" GROUP BY col1, col2, col3; When I run this query through WebHCat everything works fine. But when I try to run it from hive shell I have

Optimize hive external tables with serde

2014-10-21 Thread Ja Sam
*Part 1: my environment* I have the following files uploaded to Hadoop: 1. They are plain text 2. Each line contains JSON like: {code:[int], customerId:[string], data:{[something more here]}} 1. code values are numbers from 1 to 3000, 2. customerId values total up to 4 million, daily up to 0.5 mil