[ 
https://issues.apache.org/jira/browse/HIVE-4403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chu Tong updated HIVE-4403:
---------------------------

    Status: Patch Available  (was: Open)

The warnings reported in this bug appear because the Hive client tries to 
override some Hadoop parameters that are marked final in mapred-default.xml. 
In more detail, the Hive configuration inherits the Hadoop configuration, and 
Hive currently overrides Hadoop parameters as follows:

1) Create a default Hadoop configuration (this contains all default Hadoop 
parameters, including the ones defined as final).
2) Overlay the parameters Hive wants to override on the configuration created 
in 1).
3) Overlay the configuration generated in 2) on the default Hadoop parameters 
Hive inherits (these again contain all default Hadoop parameters, including 
the ones defined as final).

Since the configuration generated in 2) contains all the default Hadoop 
parameters, including the final ones, step 3) attempts to set those final 
parameters again, and the warning is logged.

The fix is to create an empty Hadoop configuration in 1) instead of a default 
one, so that 2) overlays the Hive parameters on this empty configuration. In 
3), the configuration from 2) still overrides any default Hadoop parameters 
Hive wants to override, but no warning is logged because 2) no longer contains 
the default Hadoop parameters.
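
As a rough illustration of the three steps, here is a toy Python model. It is 
not Hadoop's Configuration class and not the code in the patch: ToyConf, 
build_job_conf, the property values, and the choice of mapreduce.job.reduces 
as the Hive override are all made up for the sketch. It also assumes that the 
overlay warns whenever a resource mentions a final key, even with an unchanged 
value.

```python
class ToyConf:
    """Toy stand-in for a layered configuration (hypothetical, not Hadoop's API)."""

    def __init__(self, defaults, final_keys):
        self.props = dict(defaults)
        self.final_keys = set(final_keys)
        self.warnings = []

    def overlay(self, resource):
        """Apply every entry of `resource`, skipping keys marked final."""
        for key, value in resource.items():
            if key in self.final_keys:
                # Assumption: a warning is logged even if the value is unchanged.
                self.warnings.append(
                    f"an attempt to override final parameter: {key};  Ignoring.")
                continue
            self.props[key] = value


# Illustrative values only; the two final keys are the ones from the bug report.
HADOOP_DEFAULTS = {
    "mapreduce.job.end-notification.max.attempts": "5",
    "mapreduce.job.end-notification.max.retry.interval": "5000",
    "mapreduce.job.reduces": "1",
}
FINAL_KEYS = {
    "mapreduce.job.end-notification.max.attempts",
    "mapreduce.job.end-notification.max.retry.interval",
}
HIVE_OVERRIDES = {"mapreduce.job.reduces": "4"}  # hypothetical Hive override


def build_job_conf(start_from_defaults):
    # Steps 1) and 2): create the base, then overlay Hive's overrides on it.
    base = dict(HADOOP_DEFAULTS) if start_from_defaults else {}
    base.update(HIVE_OVERRIDES)
    # Step 3): overlay that result on the inherited Hadoop defaults.
    inherited = ToyConf(HADOOP_DEFAULTS, FINAL_KEYS)
    inherited.overlay(base)
    return inherited


old = build_job_conf(start_from_defaults=True)   # current behaviour
new = build_job_conf(start_from_defaults=False)  # empty base, as proposed
print(len(old.warnings), len(new.warnings))
```

In this model both variants end up with the same effective properties; only 
the variant that starts from the defaults records warnings for the final keys.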

I have tested this through different code paths, including:

1) keep everything as default
2) define overriding parameters in hive-site.xml
3) define overriding parameters in the Hive client shell

and all these cases work well.
                
> Running Hive queries on Yarn (MR2) gives warnings related to overriding final 
> parameters
> ----------------------------------------------------------------------------------------
>
>                 Key: HIVE-4403
>                 URL: https://issues.apache.org/jira/browse/HIVE-4403
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.10.0
>            Reporter: Mark Grover
>         Attachments: HIVE-4403.patch
>
>
> While working on BIGTOP-885, I saw that Hive was giving a bunch of warnings 
> related to overriding final parameters in job.conf. This was on a pseudo 
> distributed cluster. FWIW, I didn't see this happen on a fully-distributed 
> cluster. Perhaps, Hive's job.conf is overriding some final parameters it 
> shouldn't.
> Here is what the warnings looked like:
> {code}
> 2013-04-19 14:20:32,304 WARN  [main] conf.Configuration 
> (Configuration.java:loadProperty(2032)) - 
> file:/tmp/root/hive_2013-04-19_14-20-30_159_5701876916688815815/-local-10002/jobconf.xml:an
>  attempt to override final parameter: 
> mapreduce.job.end-notification.max.retry.interval;  Ignoring.
> 2013-04-19 14:20:32,367 WARN  [main] conf.Configuration 
> (Configuration.java:loadProperty(2032)) - 
> file:/tmp/root/hive_2013-04-19_14-20-30_159_5701876916688815815/-local-10002/jobconf.xml:an
>  attempt to override final parameter: 
> mapreduce.job.end-notification.max.attempts;  Ignoring.
> {code}
> To reproduce, run a query like:
> {code}
> CREATE TABLE u_data (
>   userid INT,
>   movieid INT,
>   rating INT,
>   unixtime STRING)
> ROW FORMAT DELIMITED
> FIELDS TERMINATED BY '\t'
> STORED AS TEXTFILE;
> {code}
> Load some data into u_data, here is some sample data:
> https://github.com/apache/bigtop/blob/master/bigtop-tests/test-artifacts/hive/src/main/resources/seed_data_files/ml-data/u.data
> Run a simple query on that data (on YARN/MR2)
> {code}
> INSERT OVERWRITE DIRECTORY '/tmp/count'
> SELECT COUNT(1) FROM u_data
> {code}
