Greetings Mich, I'm sending you a partial listing of my hive.log. Problem areas are in bold type.
from=org.apache.hadoop.hive.ql.exec.Utilities>
2016-05-20 07:51:19,500 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(121)) - <PERFLOG method=serializePlan from=org.apache.hadoop.hive.ql.exec.Utilities>
2016-05-20 07:51:19,501 INFO [main]: exec.Utilities (Utilities.java:serializePlan(937)) - Serializing ReduceWork via kryo
2016-05-20 07:51:19,517 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(148)) - </PERFLOG method=serializePlan start=1463745079500 end=1463745079517 duration=17 from=org.apache.hadoop.hive.ql.exec.Utilities>
*2016-05-20 07:51:19,525 ERROR [main]: mr.ExecDriver (ExecDriver.java:execute(400)) - yarn*
2016-05-20 07:51:19,540 INFO [main]: client.RMProxy (RMProxy.java:createRMProxy(98)) - Connecting to ResourceManager at /0.0.0.0:8032
2016-05-20 07:51:19,563 INFO [main]: client.RMProxy (RMProxy.java:createRMProxy(98)) - Connecting to ResourceManager at /0.0.0.0:8032
2016-05-20 07:51:19,565 INFO [main]: exec.Utilities (Utilities.java:getBaseWork(389)) - PLAN PATH = hdfs://localhost:8025/tmp/hive/jmill383/6284d80e-85a9-4c93-b8a7-aefd02cda333/hive_2016-05-20_07-51-19_320_3672167560240760416-1/-mr-10004/4efac57c-18ed-4df3-8fa7-8b000b609919/map.xml
2016-05-20 07:51:19,566 INFO [main]: exec.Utilities (Utilities.java:getBaseWork(389)) - PLAN PATH = hdfs://localhost:8025/tmp/hive/jmill383/6284d80e-85a9-4c93-b8a7-aefd02cda333/hive_2016-05-20_07-51-19_320_3672167560240760416-1/-mr-10004/4efac57c-18ed-4df3-8fa7-8b000b609919/reduce.xml
2016-05-20 07:51:19,578 WARN [main]: mapreduce.JobSubmitter (JobSubmitter.java:copyAndConfigureFiles(153)) - Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
2016-05-20 07:51:19,698 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(121)) - <PERFLOG method=getSplits from=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat>
2016-05-20 07:51:19,698 INFO [main]: exec.Utilities (Utilities.java:getBaseWork(389)) - PLAN PATH = hdfs://localhost:8025/tmp/hive/jmill383/6284d80e-85a9-4c93-b8a7-aefd02cda333/hive_2016-05-20_07-51-19_320_3672167560240760416-1/-mr-10004/4efac57c-18ed-4df3-8fa7-8b000b609919/map.xml
2016-05-20 07:51:19,698 INFO [main]: io.CombineHiveInputFormat (CombineHiveInputFormat.java:getSplits(517)) - Total number of paths: 1, launching 1 threads to check non-combinable ones.
2016-05-20 07:51:19,702 INFO [main]: io.CombineHiveInputFormat (CombineHiveInputFormat.java:getCombineSplits(439)) - CombineHiveInputSplit creating pool for hdfs://localhost:8025/user/hive/warehouse/commoncrawl18; using filter path hdfs://localhost:8025/user/hive/warehouse/commoncrawl18
2016-05-20 07:51:19,707 INFO [main]: input.FileInputFormat (FileInputFormat.java:listStatus(281)) - Total input paths to process : 5
2016-05-20 07:51:19,709 INFO [main]: input.CombineFileInputFormat (CombineFileInputFormat.java:createSplits(413)) - DEBUG: Terminated node allocation with : CompletedNodes: 1, size left: 0
2016-05-20 07:51:19,709 INFO [main]: io.CombineHiveInputFormat (CombineHiveInputFormat.java:getCombineSplits(494)) - number of splits 1
2016-05-20 07:51:19,709 INFO [main]: io.CombineHiveInputFormat (CombineHiveInputFormat.java:getSplits(587)) - Number of all splits 1
2016-05-20 07:51:19,710 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(148)) - </PERFLOG method=getSplits start=1463745079698 end=1463745079709 duration=11 from=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat>
2016-05-20 07:51:19,783 INFO [main]: mapreduce.JobSubmitter (JobSubmitter.java:submitJobInternal(494)) - number of splits:1
2016-05-20 07:51:19,834 INFO [main]: mapreduce.JobSubmitter (JobSubmitter.java:printTokens(583)) - Submitting tokens for job: job_1463594979064_0006
2016-05-20 07:51:19,853 INFO [main]: impl.YarnClientImpl (YarnClientImpl.java:submitApplication(251)) - Submitted application application_1463594979064_0006
2016-05-20 07:51:19,857 INFO [main]: mapreduce.Job (Job.java:submit(1300)) - The url to track the job: http://starchild:8088/proxy/application_1463594979064_0006/
2016-05-20 07:51:19,858 INFO [main]: exec.Task (SessionState.java:printInfo(948)) - Starting Job = job_1463594979064_0006, Tracking URL = http://starchild:8088/proxy/application_1463594979064_0006/
2016-05-20 07:51:19,858 INFO [main]: exec.Task (SessionState.java:printInfo(948)) - Kill Command = /opt/hadoop/bin/hadoop job -kill job_1463594979064_0006
2016-05-20 07:51:22,893 INFO [main]: exec.Task (SessionState.java:printInfo(948)) - Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0
2016-05-20 07:51:22,908 WARN [main]: mapreduce.Counters (AbstractCounters.java:getGroup(234)) - Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
2016-05-20 07:51:22,908 INFO [main]: exec.Task (SessionState.java:printInfo(948)) - 2016-05-20 07:51:22,908 Stage-1 map = 0%, reduce = 0%
2016-05-20 07:51:22,910 WARN [main]: mapreduce.Counters (AbstractCounters.java:getGroup(234)) - Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
*2016-05-20 07:51:22,911 ERROR [main]: exec.Task (SessionState.java:printError(957)) - Ended Job = job_1463594979064_0006 with errors*
*2016-05-20 07:51:22,912 ERROR [Thread-50]: exec.Task (SessionState.java:printError(957)) - Error during job, obtaining debugging information...*
2016-05-20 07:51:22,912 INFO [Thread-50]: Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1049)) - mapred.job.tracker is deprecated. Instead, use mapreduce.jobtracker.address
*2016-05-20 07:51:22,912 ERROR [Thread-50]: exec.Task (SessionState.java:printError(957)) - Job Tracking URL: http://starchild:8088/cluster/app/application_1463594979064_0006*
2016-05-20 07:51:22,936 INFO [main]: impl.YarnClientImpl (YarnClientImpl.java:killApplication(364)) - Killed application application_1463594979064_0006
*2016-05-20 07:51:22,943 ERROR [main]: ql.Driver (SessionState.java:printError(957)) - FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask*
2016-05-20 07:51:22,944 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(148)) - </PERFLOG method=Driver.execute start=1463745079436 end=1463745082944 duration=3508 from=org.apache.hadoop.hive.ql.Driver>
2016-05-20 07:51:22,944 INFO [main]: ql.Driver (SessionState.java:printInfo(948)) - MapReduce Jobs Launched:
2016-05-20 07:51:22,944 WARN [main]: mapreduce.Counters (AbstractCounters.java:getGroup(234)) - Group FileSystemCounters is deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead
2016-05-20 07:51:22,944 INFO [main]: ql.Driver (SessionState.java:printInfo(948)) - Stage-Stage-1: HDFS Read: 0 HDFS Write: 0 FAIL
2016-05-20 07:51:22,944 INFO [main]: ql.Driver (SessionState.java:printInfo(948)) - Total MapReduce CPU Time Spent: 0 msec
2016-05-20 07:51:22,944 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogBegin(121)) - <PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver>
2016-05-20 07:51:22,944 INFO [main]: log.PerfLogger (PerfLogger.java:PerfLogEnd(148)) - </PERFLOG method=releaseLocks start=1463745082944 end=1463745082944 duration=0 from=org.apache.hadoop.hive.ql.Driver>
2016-05-20 07:51:22,944 INFO [main]: exec.ListSinkOperator (Operator.java:close(612)) - 7 finished. closing...
2016-05-20 07:51:22,944 INFO [main]: exec.ListSinkOperator (Operator.java:close(634)) - 7 Close done

On Fri, May 20, 2016 at 7:50 AM, JOHN MILLER <jmill...@gmail.com> wrote:
> Greetings. Attached are the results of the select count(1) from the table.
>
> The dataset (table) has 18 columns and 3340 rows.
>
> hive> select count(1) from commoncrawl18;
> Query ID = jmill383_20160520074710_3b5ee662-2ead-4d89-9123-df9b2cf6e2d7
> Total jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks determined at compile time: 1
> In order to change the average load for a reducer (in bytes):
>   set hive.exec.reducers.bytes.per.reducer=<number>
> In order to limit the maximum number of reducers:
>   set hive.exec.reducers.max=<number>
> In order to set a constant number of reducers:
>   set mapreduce.job.reduces=<number>
> Starting Job = job_1463594979064_0005, Tracking URL = http://starchild:8088/proxy/application_1463594979064_0005/
> Kill Command = /opt/hadoop/bin/hadoop job -kill job_1463594979064_0005
> Hadoop job information for Stage-1: number of mappers: 0; number of reducers: 0
> 2016-05-20 07:47:15,936 Stage-1 map = 0%, reduce = 0%
> Ended Job = job_1463594979064_0005 with errors
> Error during job, obtaining debugging information...
> Job Tracking URL: http://starchild:8088/cluster/app/application_1463594979064_0005
> FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
> MapReduce Jobs Launched:
> Stage-Stage-1: HDFS Read: 0 HDFS Write: 0 FAIL
> Total MapReduce CPU Time Spent: 0 msec
> hive>
>
> On Thu, May 19, 2016 at 8:56 AM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>> Hi John,
>>
>> stderr does not say much:
>>
>> Exception in thread "main" java.lang.IncompatibleClassChangeError: Implementing class
>>     at java.lang.ClassLoader.defineClass1(Native Method)
>>     at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
>>     at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>>     at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
>>     at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>>     at java.security.AccessController.doPrivileged(Native Method)
>>     at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>>     at java.lang.Class.getDeclaredMethods0(Native Method)
>>     at java.lang.Class.privateGetDeclaredMethods(Class.java:2615)
>>     at java.lang.Class.getMethod0(Class.java:2856)
>>     at java.lang.Class.getMethod(Class.java:1668)
>>     at sun.launcher.LauncherHelper.getMainMethod(LauncherHelper.java:494)
>>     at sun.launcher.LauncherHelper.checkAndLoadMain(LauncherHelper.java:486)
>>
>> However, it sounds like you may have an issue with YARN container memory.
>>
>> How big is the underlying table? Also, can you just do a plain select count(1) from <table> itself (no distinct etc.) and see if it works?
>>
>> HTH
>>
>> Dr Mich Talebzadeh
>>
>> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>
>> http://talebzadehmich.wordpress.com
>>
>> On 18 May 2016 at 19:46, JOHN MILLER <jmill...@gmail.com> wrote:
>>> Mich
>>>
>>> Attaching the Hadoop logs.
>>>
>>> John M
>>>
>>> On Wed, May 18, 2016 at 1:48 PM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>>> Hi John,
>>>>
>>>> Can you please start a new thread for your problem so we can deal with it separately?
>>>>
>>>> Thanks
>>>>
>>>> Dr Mich Talebzadeh
>>>>
>>>> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>>>
>>>> http://talebzadehmich.wordpress.com
>>>>
>>>> On 18 May 2016 at 15:11, JOHN MILLER <jmill...@gmail.com> wrote:
>>>>> Greetings Mich
>>>>>
>>>>> I have an issue with running MapReduce in Hive. I am getting a
>>>>> FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
>>>>> error while attempting to execute SELECT DISTINCT(fieldname) FROM x or SELECT COUNT(*) FROM x; trying to run cascading-hive gives me the same problem as well.
>>>>>
>>>>> Please advise if you have come across this type of problem or have any ideas on how to resolve it.
>>>>>
>>>>> On Wed, May 18, 2016 at 9:53 AM, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
>>>>>> Hi Kuldeep,
>>>>>>
>>>>>> Have you installed Hive on any of these nodes?
>>>>>>
>>>>>> Hive is basically an API. You will also need to install Sqoop if you are going to import data from other RDBMSs like Oracle, Sybase, etc.
>>>>>>
>>>>>> Hive has a very small footprint, so my suggestion is to install it on all your boxes, with permissions granted to the Hadoop user, say hduser.
>>>>>>
>>>>>> Hive requires a metastore in a database of your choice. The default is Derby, which I don't use; try a reasonable database. Ours is on Oracle.
>>>>>>
>>>>>> Then, in $HIVE_HOME/conf/hive-site.xml, you can set up the info about Hadoop and your metastore etc. You also need to set up environment variables for both Hadoop and Hive in your startup script, such as .profile or .kshrc.
>>>>>>
>>>>>> Have a look anyway.
>>>>>>
>>>>>> HTH
>>>>>>
>>>>>> Dr Mich Talebzadeh
>>>>>>
>>>>>> LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>>>>>
>>>>>> http://talebzadehmich.wordpress.com
>>>>>>
>>>>>> On 18 May 2016 at 13:49, Kuldeep Chitrakar <kuldeep.chitra...@synechron.com> wrote:
>>>>>>> I have a very basic question regarding Hadoop & Hive setup. I have 7 machines, say M1, M2, M3, M4, M5, M6, M7.
>>>>>>>
>>>>>>> Hadoop cluster setup:
>>>>>>>
>>>>>>> Namenode: M1
>>>>>>> Secondary Namenode: M2
>>>>>>> Datanodes: M3, M4, M5
>>>>>>>
>>>>>>> Now the question is: where do I need to install Hive?
>>>>>>>
>>>>>>> 1. Should I install Hiveserver on M6?
>>>>>>>    a. If yes, does that machine need the core Hadoop JARs installed?
>>>>>>>    b. How does this Hive server know where the Hadoop cluster is? What configuration needs to be done?
>>>>>>>    c. How can we restrict this machine to be only a Hive server and not a datanode of the Hadoop cluster?
>>>>>>>
>>>>>>> 2. Where do we install the Hive CLI?
>>>>>>>    a.
If I want to use M7 as the Hive CLI, what needs to be installed on this machine?
>>>>>>>
>>>>>>>    b. Any required configurations?
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Kuldeep
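For threads like this, the four ERROR entries that matter are easy to lose among the INFO noise in hive.log. A minimal sketch of a helper that filters them out; the helper name and the default log path are my own assumptions (the Hive CLI typically writes to /tmp/$USER/hive.log, but that depends on hive-log4j configuration), not anything stated in the thread:

```shell
#!/bin/sh
# Sketch: print only the ERROR entries from a Hive log whose lines
# follow the "<timestamp> LEVEL [thread]: ..." layout shown above.
# Default path is an assumption about the usual Hive CLI log location.
hive_errors() {
  # -n keeps line numbers so each entry is easy to locate in the full log
  grep -n 'ERROR \[' "${1:-/tmp/$USER/hive.log}"
}

# Example:
#   hive_errors /tmp/jmill383/hive.log
```

The driver-side log rarely explains why the containers died; if log aggregation is enabled, the container-side stderr for the failed attempt can be pulled with `yarn logs -applicationId application_1463594979064_0006`.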