In the meantime, why don't you break up your single query into a series of queries (using CTAS semantics to create intermediate tables)?
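Something like this rough sketch (the tmp_* table names here are made up, and I'm only grabbing a handful of columns from your posted script to show the shape of it):

-- 1) materialize the filtered base rows, with none of the custom UDFs yet
CREATE TABLE tmp_prod_impr AS
SELECT header_id,
       header_date,
       header_date_donotquery,
       seller_sellerid,
       seller_cpc
FROM product_impressions_hive_only
WHERE header_date = '${DATE_STR}'
  AND cached_recordid IS NOT NULL;

-- 2) materialize the join to the hidden-seller table on its own
CREATE TABLE tmp_prod_impr_joined AS
SELECT h.header_id,
       h.seller_sellerid,
       sh.seller_id,
       sh.tag_id
FROM tmp_prod_impr h
LEFT OUTER JOIN prodimpr_seller_hidden sh
  ON h.header_id = sh.header_id;

-- 3) only now do the grouping, then layer your UDFs back on one at a time
SELECT header_id,
       collect_set(seller_sellerid)
FROM tmp_prod_impr_joined
GROUP BY header_id;

If one of those steps blows up, you know exactly which piece to look at.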
The idea is to narrow the problem down to a minimal size that _isolates the problem_. What you have there is too complex to expect someone to troubleshoot for you. Try to minimize the failure case: take out your UDFs. Does it work then, or does it still fail? Strip it down to the bare necessities!

On Fri, May 17, 2013 at 10:56 AM, Sanjay Subramanian <sanjay.subraman...@wizecommerce.com> wrote:

> I am using Hive 0.9.0+155 that is bundled in Cloudera Manager version 4.1.2
> Still getting the errors listed below :-(
> Any clues will be cool !!!
> Thanks
> sanjay
>
> From: Sanjay Subramanian <sanjay.subraman...@wizecommerce.com>
> Date: Thursday, May 16, 2013 9:42 PM
> To: "user@hive.apache.org" <user@hive.apache.org>
> Subject: Re: need help with an error - script used to work and now it does not :-(
>
> :-( Still facing problems in large datasets
> Were u able to solve this Edward?
> Thanks
> sanjay
>
> From: Sanjay Subramanian <sanjay.subraman...@wizecommerce.com>
> Reply-To: "user@hive.apache.org" <user@hive.apache.org>
> Date: Thursday, May 16, 2013 8:25 PM
> To: "user@hive.apache.org" <user@hive.apache.org>
> Subject: Re: need help with an error - script used to work and now it does not :-(
>
> Thanks Edward…I just checked all instances of guava jars…except those in red, all seem to be the same version
>
> /usr/lib/hadoop/client/guava-11.0.2.jar
> /usr/lib/hadoop/client-0.20/guava-11.0.2.jar
> /usr/lib/hadoop/lib/guava-11.0.2.jar
> /usr/lib/hadoop-httpfs/webapps/webhdfs/WEB-INF/lib/guava-11.0.2.jar
> /usr/lib/hadoop-hdfs/lib/guava-11.0.2.jar
> /usr/lib/oozie/libtools/guava-11.0.2.jar
> /usr/lib/hive/lib/guava-11.0.2.jar
> /usr/lib/hadoop-0.20-mapreduce/lib/guava-11.0.2.jar
> /usr/lib/hbase/lib/guava-11.0.2.jar
> /usr/lib/flume-ng/lib/guava-11.0.2.jar
> /usr/share/cmf/lib/cdh3/guava-r09-jarjar.jar
> /usr/share/cmf/lib/guava-12.0.1.jar
>
> But I made a small change in my query (I just removed the text marked in blue) that seemed to solve it, at least for the test data set that I had….Now I need to run it in production for a day's worth of data
>
> Will keep u guys posted
>
> ------------------------------------------------------------------------------------------------------------
> SELECT
>     h.header_date_donotquery as date_,
>     h.header_id as impression_id,
>     h.header_searchsessionid as search_session_id,
>     h.cached_visitid as visit_id,
>     split(h.server_name_donotquery,'[\.]')[0] as server,
>     h.cached_ip ip,
>     h.header_adnodeid ad_nodes,
> ------------------------------------------------------------------------------------------------------------
>
> Thanks
> sanjay
>
> From: Edward Capriolo <edlinuxg...@gmail.com>
> Reply-To: "user@hive.apache.org" <user@hive.apache.org>
> Date: Thursday, May 16, 2013 7:51 PM
> To: "user@hive.apache.org" <user@hive.apache.org>
> Subject: Re: need help with an error - script used to work and now it does not :-(
>
> Ironically I just got a misleading error like this today. What happened was I upgraded to hive 0.10. One of my programs was linked to guava 15 but hive provides guava 09 on the classpath, confusing things. I also had a similar issue with mismatched slf4j and commons-logging.
>
> On Thu, May 16, 2013 at 10:34 PM, Sanjay Subramanian <sanjay.subraman...@wizecommerce.com> wrote:
>
>> 2013-05-16 18:57:21,094 FATAL [IPC Server handler 19 on 40222] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1368666339740_0135_m_000104_1 - exited : java.lang.RuntimeException: Error in configuring object
>>     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
>>     at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:72)
>>     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:130)
>>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:395)
>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:334)
>>     at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:152)
>>     at java.security.AccessController.doPrivileged(Native Method)
>>     at javax.security.auth.Subject.doAs(Subject.java:396)
>>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1332)
>>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:147)
>> Caused by: java.lang.reflect.InvocationTargetException
>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:103)
>>     ... 9 more
>> Caused by: java.lang.RuntimeException: Error in configuring object
>>     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:106)
>>     at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:72)
>>     at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:130)
>>     at org.apache.hadoop.mapred.MapRunner.configure(MapRunner.java:38)
>>     ... 14 more
>> Caused by: java.lang.reflect.InvocationTargetException
>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>     at org.apache.hadoop.util.ReflectionUtils.setJobConf(ReflectionUtils.java:103)
>>     ... 17 more
>> Caused by: java.lang.RuntimeException: Map operator initialization failed
>>     at org.apache.hadoop.hive.ql.exec.ExecMapper.configure(ExecMapper.java:121)
>>     ... 22 more
>> Caused by: java.lang.RuntimeException: cannot find field header_date from [org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@2add5681, org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@295a4523, org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@6571120a, org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@6257828d, org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@5f3c296b, org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@66c360a5, org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@24fe2558, org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@2945c761, org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector$MyField@2424c672]
>>     at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.getStandardStructFieldRef(ObjectInspectorUtils.java:345)
>>     at org.apache.hadoop.hive.serde2.objectinspector.UnionStructObjectInspector.getStructFieldRef(UnionStructObjectInspector.java:100)
>>     at org.apache.hadoop.hive.ql.exec.ExprNodeColumnEvaluator.initialize(ExprNodeColumnEvaluator.java:57)
>>     at org.apache.hadoop.hive.ql.exec.Operator.initEvaluators(Operator.java:896)
>>     at org.apache.hadoop.hive.ql.exec.Operator.initEvaluatorsAndReturnStruct(Operator.java:922)
>>     at org.apache.hadoop.hive.ql.exec.SelectOperator.initializeOp(SelectOperator.java:60)
>>     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:357)
>>     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:433)
>>     at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:389)
>>     at org.apache.hadoop.hive.ql.exec.FilterOperator.initializeOp(FilterOperator.java:78)
>>     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:357)
>>     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:433)
>>     at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:389)
>>     at org.apache.hadoop.hive.ql.exec.TableScanOperator.initializeOp(TableScanOperator.java:166)
>>     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:357)
>>     at org.apache.hadoop.hive.ql.exec.MapOperator.initializeOp(MapOperator.java:427)
>>     at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:357)
>>     at org.apache.hadoop.hive.ql.exec.ExecMapper.configure(ExecMapper.java:98)
>>     ... 22 more
>>
>> MY SCRIPT is given below
>> =====================
>> hive -hiveconf hive.root.logger=INFO,console -hiveconf mapred.job.priority=VERY_HIGH -e "
>> SET hive.exec.compress.output=true;
>> SET mapred.reduce.tasks=16;
>> SET mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;
>> add jar ${JAR_NAME_AND_PATH};
>> create temporary function collect as 'com.wizecommerce.utils.hive.udf.GenericUDAFCollect';
>> create temporary function isnextagip as 'com.wizecommerce.utils.hive.udf.IsNextagIP';
>> create temporary function isfrombot as 'com.wizecommerce.utils.hive.udf.IsFromBot';
>> create temporary function processblankkeyword as 'com.wizecommerce.utils.hive.udf.ProcessBlankKeyword';
>> create temporary function getSellersProdImpr as 'com.wizecommerce.utils.hive.udf.GetSellersWithValidSellerIdsProdImpr';
>> create temporary function getProgramCode as 'com.wizecommerce.utils.hive.udf.GetProgramCodeFromSellerClickContext';
>> INSERT OVERWRITE DIRECTORY '/user/beeswax/warehouse/${HIVE_OUTPUT_TBL}/${DATE_STR}'
>> SELECT
>>     h.header_date_donotquery as date_,
>>     h.header_id as impression_id,
>>     h.header_searchsessionid as search_session_id,
>>     h.cached_visitid as visit_id,
>>     split(h.server_name_donotquery,'[\.]')[0] as server,
>>     h.cached_ip ip,
>>     h.header_adnodeid ad_nodes,
>>     if(concat_ws(',', getSellersProdImpr(collect_set(concat_ws('|',
>>             if(h.seller_sellerid is null, 'null', cast(h.seller_sellerid as STRING)),
>>             if(h.seller_tagid is null, 'null', cast(h.seller_tagid as STRING)),
>>             cast(IF(h.seller_subtotal IS NULL, -1, h.seller_subtotal) as STRING),
>>             cast(IF(h.seller_pricetier IS NULL, -1, h.seller_pricetier) as STRING),
>>             cast(IF(h.seller_pricerank IS NULL, -1, h.seller_pricerank) as STRING),
>>             cast(IF(h.seller_cpc IS NULL, -1, h.seller_cpc) as STRING),
>>             h.program_code_notnull)))) = '',
>>        NULL,
>>        concat_ws(',', getSellersProdImpr(collect_set(concat_ws('|',
>>             if(h.seller_sellerid is null, 'null', cast(h.seller_sellerid as STRING)),
>>             if(h.seller_tagid is null, 'null', cast(h.seller_tagid as STRING)),
>>             cast(IF(h.seller_subtotal IS NULL, -1, h.seller_subtotal) as STRING),
>>             cast(IF(h.seller_pricetier IS NULL, -1, h.seller_pricetier) as STRING),
>>             cast(IF(h.seller_pricerank IS NULL, -1, h.seller_pricerank) as STRING),
>>             cast(IF(h.seller_cpc IS NULL, -1, h.seller_cpc) as STRING),
>>             h.program_code_notnull))))) as visible_sellers,
>>     if(concat_ws(',', getSellersProdImpr(collect_set(concat_ws('|',
>>             if(sh.seller_id is null, 'null', cast(sh.seller_id as STRING)),
>>             if(sh.tag_id is null, 'null', cast(sh.tag_id as STRING)),
>>             '-1.0',
>>             cast(IF(sh.price_tier IS NULL, -1, sh.price_tier) as STRING),
>>             '-1',
>>             cast(IF(sh.price_tier IS NULL, -1.0, sh.price_tier*1.0) as STRING),
>>             h.program_code_null)))) = '',
>>        NULL,
>>        concat_ws(',', getSellersProdImpr(collect_set(concat_ws('|',
>>             if(sh.seller_id is null, 'null', cast(sh.seller_id as STRING)),
>>             if(sh.tag_id is null, 'null', cast(sh.tag_id as STRING)),
>>             '-1.0',
>>             cast(IF(sh.price_tier IS NULL, -1, sh.price_tier) as STRING),
>>             '-1',
>>             cast(IF(sh.price_tier IS NULL, -1.0, sh.price_tier*1.0) as STRING),
>>             h.program_code_null))))) as invisible_sellers
>> FROM
>>     (SELECT
>>         header_id,
>>         header_date,
>>         header_date_donotquery,
>>         header_searchsessionid,
>>         cached_visitid,
>>         cached_ip,
>>         header_adnodeid,
>>         server_name_donotquery,
>>         seller_sellerid,
>>         seller_tagid,
>>         cast(regexp_replace(seller_subtotal,',','.') as DOUBLE) as seller_subtotal,
>>         seller_pricetier,
>>         seller_pricerank,
>>         CAST(CAST(seller_cpc as INT) as DOUBLE) as seller_cpc,
>>         cast(getProgramCode('${THISHOST}', '${REST_API_SERVER_NAME}', seller_clickcontext) as STRING) as program_code_notnull,
>>         cast(getProgramCode('${THISHOST}', '${REST_API_SERVER_NAME}', '') as STRING) as program_code_null
>>     FROM
>>         product_impressions_hive_only
>>     WHERE
>>         header_date='${DATE_STR}'
>>         AND cached_recordid IS NOT NULL
>>         AND isnextagip(cached_ip) = FALSE
>>         AND isfrombot(cached_visitid) = FALSE
>>         AND header_skipsellerloggingflag = 0
>>     ) h
>> LEFT OUTER JOIN
>>     (SELECT
>>         *
>>     FROM
>>         prodimpr_seller_hidden
>>     WHERE
>>         date_seller = '${DATE_STR}'
>>     ) sh
>> ON
>>     h.header_id = sh.header_id
>>     AND sh.date_seller = h.header_date
>> GROUP BY
>>     h.header_date_donotquery,
>>     h.header_id,
>>     h.header_searchsessionid,
>>     h.cached_visitid,
>>     h.server_name_donotquery,
>>     h.cached_ip,
>>     h.header_adnodeid
>> ;
>> "