Finally got it! Here's the full story: I started out running Hive on a single-node Hadoop cluster, and our metastore was on a MySQL instance. At that time, the namenode URI was localhost:54310.
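For reference, the default filesystem on that single-node setup would have been configured along these lines in core-site.xml (only the localhost:54310 URI is from the story above; the property layout is a sketch of the standard Hadoop form):

```xml
<!-- core-site.xml on the original single-node cluster (sketch) -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://localhost:54310</value>
</property>
```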
Then it was decided to add more nodes to the cluster, and I modified all the *-site.xml files wherever relevant. But Hive was picking up the URIs not from the *-site files, but from the metastore tables in MySQL (I didn't know this). Upon changing the DB_LOCATION_URI column and the LOCATION column in the tables DBS and SDS, respectively, to point to the latest namenode URI, I was back in business.

Thanks.

On Tue, May 24, 2011 at 2:49 PM, MIS <misapa...@gmail.com> wrote:

> Thanks for the suggestions.
> I had tried specifying only the IPs instead of hostnames, but now I have
> modified the /etc/hosts file as suggested below, appropriately. Still no
> success.
>
> If I use IPs instead of hostnames I get the error below in the Hive CLI:
>
> 2011-05-24 14:42:53,485 ERROR ql.Driver
> (SessionState.java:printError(343)) - FAILED: Hive Internal Error:
> java.lang.RuntimeException(Error while making MR scratch directory - check
> filesystem config (null))
> java.lang.RuntimeException: Error while making MR scratch directory - check
> filesystem config (null)
>     at org.apache.hadoop.hive.ql.Context.getMRScratchDir(Context.java:196)
>     at org.apache.hadoop.hive.ql.Context.getMRTmpFileURI(Context.java:247)
>     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getMetaData(SemanticAnalyzer.java:900)
>     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:6594)
>     at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:238)
>     at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:340)
>     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:736)
>     at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:164)
>     at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:241)
>     at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:456)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> Caused by: java.lang.IllegalArgumentException: Wrong FS:
> hdfs://192.168.0.18:9000/tmp/hive-hadoop/hive_2011-05-24_14-42-53_287_7078843136333133329,
> expected: hdfs://<myHostName>:9000
>     at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:310)
>     at org.apache.hadoop.hdfs.DistributedFileSystem.checkPath(DistributedFileSystem.java:99)
>     at org.apache.hadoop.fs.FileSystem.makeQualified(FileSystem.java:222)
>     at org.apache.hadoop.hdfs.DistributedFileSystem.makeQualified(DistributedFileSystem.java:116)
>     at org.apache.hadoop.hive.ql.Context.getScratchDir(Context.java:146)
>     at org.apache.hadoop.hive.ql.Context.getMRScratchDir(Context.java:190)
>     ... 14 more
>
> Thanks.
>
> On Tue, May 24, 2011 at 2:01 PM, Eric Djatsa <djatsa...@gmail.com> wrote:
>
>> Hi, I had similar problems when I was setting up my Hadoop cluster. The
>> datanodes were trying to access localhost instead of my namenode.
>> To fix this issue I modified the /etc/hosts file on all my nodes (namenode
>> + datanodes) so that the first line contains the binding
>> <IP address --> hostname>.
>>
>> For example, on my namenode I have:
>> /etc/hosts:
>> 192.168.0.1 mynamenode.mydomain.com mynamenode
>> 127.0.0.1 localhost
>>
>> On my datanodes I have:
>> 192.168.0.X mydatanodeX.mydomain.com mydatanodeX
>> 127.0.0.1 localhost
>>
>> If this doesn't work in the first place, try specifying the namenode in
>> core-site.xml on all datanodes with its IP address rather than its
>> hostname.
>>
>> Hope this helps!
>> Regards,
>> Eric
>>
>> 2011/5/24 MIS <misapa...@gmail.com>
>>
>>> I have the configuration consistent across both the client and server
>>> sides. I have checked the Hadoop logs on both the nodes.
>>> On both the nodes, in the tasktracker logs, every task attempt is
>>> directed towards hdfs://localhost:54310/user/hive/warehouse and not
>>> towards hdfs://<myHostName>:54310/user/hive/warehouse.
>>>
>>> Further, I have given the absolute path for the property
>>> hive.metastore.warehouse.dir as
>>> hdfs://<myHostName>:54310/user/hive/warehouse in the file hive-site.xml.
>>>
>>> Also, if I change the port number for fs.default.name across all the
>>> locations, the change is visible, but the hostname still comes up as
>>> localhost.
>>>
>>> As mentioned earlier, if I give the server running the namenode an alias
>>> of localhost in the /etc/hosts file on all the nodes, everything works
>>> fine. But obviously I can't go ahead with this.
>>>
>>> Thanks.
>>>
>>> On Tue, May 24, 2011 at 1:50 AM, Ning Zhang <nzh...@fb.com> wrote:
>>>
>>>> AFAIK, fs.default.name should be set in both the client-side and
>>>> server-side .xml files, and the two should be consistent (the URI
>>>> scheme, the hostname, and the port number). The server-side config
>>>> (also called fs.default.name) is read by the namenode, and the
>>>> client-side one is read by any HDFS client (Hive is one of them).
>>>>
>>>> For example, the setting we have is:
>>>>
>>>> server-side core-site-custom.xml:
>>>>
>>>> <property>
>>>>   <name>fs.default.name</name>
>>>>   <value>hdfs://hostname:9000</value>
>>>>   <description>The name of the default file system. A URI whose
>>>>   scheme and authority determine the FileSystem implementation. The
>>>>   uri's scheme determines the config property (fs.SCHEME.impl) naming
>>>>   the FileSystem implementation class. The uri's authority is used to
>>>>   determine the host, port, etc. for a filesystem.</description>
>>>> </property>
>>>>
>>>> client-side core-site.xml:
>>>>
>>>> <property>
>>>>   <name>fs.default.name</name>
>>>>   <value>hdfs://hostname:9000</value>
>>>>   <description>The name of the default file system. A URI whose
>>>>   scheme and authority determine the FileSystem implementation. The
>>>>   uri's scheme determines the config property (fs.SCHEME.impl) naming
>>>>   the FileSystem implementation class. The uri's authority is used to
>>>>   determine the host, port, etc. for a filesystem.</description>
>>>> </property>
>>>>
>>>> From the stack trace it seems Hive is trying to connect to port 54310;
>>>> you should check whether that is correct in your server-side HDFS
>>>> config.
>>>>
>>>> On May 23, 2011, at 4:00 AM, MIS wrote:
>>>>
>>>> I have already tried your suggestion. I have mentioned the same in my
>>>> mail. I have also given the required permissions for the directory
>>>> (hive.metastore.warehouse.dir).
>>>>
>>>> If you look closely at the stack trace, the port number that I
>>>> specified in the config files for the namenode and jobtracker is
>>>> reflected, but not the hostname. I have also gone through the code base
>>>> to verify the issue, but nothing fishy there.
>>>> The stand-alone Hadoop cluster is working fine, but when I try to run a
>>>> simple query, a select to fetch a few rows, Hive throws up the
>>>> exception.
>>>>
>>>> I was able to get this to work with a few hacks, though, like adding
>>>> localhost as an alias in the /etc/hosts file for the server running the
>>>> namenode. But I can't go ahead with this solution, as it'll break other
>>>> things.
>>>>
>>>> Thanks.
>>>>
>>>> On Mon, May 23, 2011 at 4:14 PM, jinhang du <dujinh...@gmail.com> wrote:
>>>>
>>>>> Set the following properties in hive-site.xml:
>>>>> fs.default.name = hdfs://<your namenode of hadoop>
>>>>> mapred.job.tracker = <your job tracker:port>
>>>>> hive.metastore.warehouse.dir = <hdfs path>
>>>>> Make sure you have permission to write into this directory
>>>>> (hive.metastore.warehouse.dir).
>>>>> Try it.
>>>>>
>>>>> 2011/5/23 MIS <misapa...@gmail.com>
>>>>>
>>>>>> I'm getting into an issue when trying to run Hive over the Hadoop
>>>>>> cluster.
>>>>>>
>>>>>> The Hadoop cluster is working fine in a stand-alone manner.
>>>>>> I'm using the hadoop 0.20.2 and hive 0.7.0 versions.
>>>>>>
>>>>>> The problem is that Hive is not considering the fs.default.name
>>>>>> property that I am setting in core-site.xml, or the
>>>>>> mapred.job.tracker property in mapred-site.xml.
>>>>>> It always assumes the namenode can be accessed at localhost (refer
>>>>>> to the stack trace below).
>>>>>> So I have specified these properties in the hive-site.xml file as
>>>>>> well. I tried marking them as final in hive-site.xml, but didn't get
>>>>>> the intended result.
>>>>>> Further, I set the above properties through the command line as well.
>>>>>> Again, no success.
>>>>>>
>>>>>> I looked at the Hive code for the 0.7.0 branch to debug the issue, to
>>>>>> see if it is getting the fs.default.name property from the file
>>>>>> hive-site.xml, which it does through a clone of the JobConf. So no
>>>>>> issues there.
>>>>>>
>>>>>> Further, in hive-site.xml, if I mark any of the properties as final,
>>>>>> then Hive gives me a WARNING log, as below:
>>>>>>
>>>>>> WARN conf.Configuration (Configuration.java:loadResource(1154)) -
>>>>>> file:/usr/local/hive-0.7.0/conf/hive-site.xml:a attempt to override
>>>>>> final parameter: hive.metastore.warehouse.dir; Ignoring.
>>>>>>
>>>>>> From the above message I can assume that it has already read the
>>>>>> property (I don't know from where, or it may be trying to read the
>>>>>> property multiple times), but I have explicitly specified the hive
>>>>>> conf folder in hive-env.sh.
>>>>>>
>>>>>> Below is the stack trace I'm getting in the log file:
>>>>>>
>>>>>> 2011-05-23 15:11:00,793 ERROR CliDriver
>>>>>> (SessionState.java:printError(343)) - Failed with exception
>>>>>> java.io.IOException: java.net.ConnectException: Call to
>>>>>> localhost/127.0.0.1:54310 failed on connection exception:
>>>>>> java.net.ConnectException: Connection refused
>>>>>> java.io.IOException: java.net.ConnectException: Call to
>>>>>> localhost/127.0.0.1:54310 failed on connection exception:
>>>>>> java.net.ConnectException: Connection refused
>>>>>>     at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:341)
>>>>>>     at org.apache.hadoop.hive.ql.exec.FetchTask.fetch(FetchTask.java:133)
>>>>>>     at org.apache.hadoop.hive.ql.Driver.getResults(Driver.java:1114)
>>>>>>     at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:187)
>>>>>>     at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:241)
>>>>>>     at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:456)
>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>>>>>>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>>>>>>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>>>>>>     at java.lang.reflect.Method.invoke(Method.java:597)
>>>>>>     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>>>>>> Caused by: java.net.ConnectException: Call to
>>>>>> localhost/127.0.0.1:54310 failed on connection exception:
>>>>>> java.net.ConnectException: Connection refused
>>>>>>     at org.apache.hadoop.ipc.Client.wrapException(Client.java:767)
>>>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:743)
>>>>>>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:220)
>>>>>>     at $Proxy4.getProtocolVersion(Unknown Source)
>>>>>>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:359)
>>>>>>     at org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:106)
>>>>>>     at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:207)
>>>>>>     at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:170)
>>>>>>     at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:82)
>>>>>>     at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1378)
>>>>>>     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
>>>>>>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1390)
>>>>>>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:196)
>>>>>>     at org.apache.hadoop.fs.Path.getFileSystem(Path.java:175)
>>>>>>     at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextPath(FetchOperator.java:241)
>>>>>>     at org.apache.hadoop.hive.ql.exec.FetchOperator.getRecordReader(FetchOperator.java:259)
>>>>>>     at org.apache.hadoop.hive.ql.exec.FetchOperator.getNextRow(FetchOperator.java:320)
>>>>>>     ... 10 more
>>>>>> Caused by: java.net.ConnectException: Connection refused
>>>>>>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>>>>>>     at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
>>>>>>     at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
>>>>>>     at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:404)
>>>>>>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:304)
>>>>>>     at org.apache.hadoop.ipc.Client$Connection.access$1700(Client.java:176)
>>>>>>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:860)
>>>>>>     at org.apache.hadoop.ipc.Client.call(Client.java:720)
>>>>>>     ... 25 more
>>>>>>
>>>>>> Has anybody encountered similar issues earlier? Any thoughts towards
>>>>>> resolving the above issue would be helpful.
>>>>>>
>>>>>> Thanks.
>>>>>
>>>>> --
>>>>> dujinhang
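For anyone landing on this thread later: the fix described at the top amounts to rewriting the namenode authority (host:port) that is baked into the metastore's DBS.DB_LOCATION_URI and SDS.LOCATION columns. A minimal sketch of that rewrite follows; the column and table names are from this thread, but the hostnames and the SQL in the comments are illustrative assumptions (and you should back up the metastore database before touching it):

```python
from urllib.parse import urlsplit, urlunsplit

def replace_authority(location: str, new_authority: str) -> str:
    """Swap the host:port part of an HDFS URI, keeping scheme and path."""
    parts = urlsplit(location)
    return urlunsplit((parts.scheme, new_authority, parts.path,
                       parts.query, parts.fragment))

# In MySQL, the equivalent change against the metastore tables would be
# something like (hostnames here are made up):
#   UPDATE DBS SET DB_LOCATION_URI =
#       REPLACE(DB_LOCATION_URI, 'localhost:54310', 'mynamenode:54310');
#   UPDATE SDS SET LOCATION =
#       REPLACE(LOCATION, 'localhost:54310', 'mynamenode:54310');
old = "hdfs://localhost:54310/user/hive/warehouse"
print(replace_authority(old, "mynamenode:54310"))
# prints hdfs://mynamenode:54310/user/hive/warehouse
```

This also explains the "Wrong FS" error earlier in the thread: Hadoop compares the scheme and authority of each path against the configured default filesystem, so stale localhost URIs in the metastore fail the check even when the *-site.xml files are correct.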