Problem with org.apache.hadoop.conf.Configuration.REGISTRY
----------------------------------------------------------

Key: HADOOP-7004
URL: https://issues.apache.org/jira/browse/HADOOP-7004
Project: Hadoop Common
Issue Type: Bug
Environment: hadoop 0.20.2, hbase 0.20.6
Reporter: Henning Blohm
Priority: Minor

When reusing a Configuration to which a resource was added via addResource(InputStream), a reload of the configuration will fail because the stream has already been read.

The reload is triggered when addDefaultResource is called. That method uses the static REGISTRY WeakHashMap to reach all reachable Configuration instances and reset their properties.

The method addDefaultResource is called by e.g. ConfigUtil in org.apache.hadoop.mapreduce.util (hadoop trunk) or JobConf (hadoop 0.20.2).

The problem has been observed in Hadoop 0.20.2, but the code in trunk has essentially the same structure.

There are a few problems here:

1. You cannot safely use addResource(InputStream) if Configuration objects are to be re-used (you can, however, use addResource(URL) instead).

2. Modifying the state of Configuration instances at some later point in time, as a side effect of some class initialization in some completely unrelated thread, leads to unpredictable behavior (properties change under the hood).

3. Configuration instances keep context classloaders to find resources. After redeployment these may not be "valid" anymore. As long as a Configuration instance has not been garbage collected, addDefaultResource will still invoke reloadConfiguration on it. While that is harmless today (it only resets members), this looks like a ticking time bomb.

Suggestion: Define all default resources in Configuration once. Do not hold on to other Configuration instances and do not modify their state as a side effect of some other activity.

See also: http://osdir.com/ml/general/2010-10/msg25893.html
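
To illustrate problem 1, here is a minimal, self-contained sketch of the pattern described above. It does not use Hadoop at all: MiniConfiguration, fromStream, fromReopenable and RegistryReloadSketch are hypothetical names made up for this example, and java.util.Properties stands in for the real resource parsing. It only models a static weak registry whose addDefaultResource resets every live instance, and it is single-threaded for simplicity. In this sketch the value silently disappears after the reload; how the real Configuration class fails is not modeled here.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.Map;
import java.util.Properties;
import java.util.WeakHashMap;
import java.util.function.Supplier;

// Hypothetical stand-in for the pattern in this report; NOT Hadoop's Configuration.
class MiniConfiguration {
    // Static weak registry of all live instances (assumed analogue of REGISTRY).
    private static final Map<MiniConfiguration, Boolean> REGISTRY =
            Collections.synchronizedMap(new WeakHashMap<>());

    private final Supplier<InputStream> opener; // how to obtain the resource stream
    private Properties properties;              // null means "must be (re)loaded"

    private MiniConfiguration(Supplier<InputStream> opener) {
        this.opener = opener;
        REGISTRY.put(this, Boolean.TRUE);
    }

    // One-shot resource: every load gets the SAME stream object back.
    static MiniConfiguration fromStream(InputStream in) {
        return new MiniConfiguration(() -> in);
    }

    // Re-openable resource (comparable to a URL): every load gets a fresh stream.
    static MiniConfiguration fromReopenable(Supplier<InputStream> opener) {
        return new MiniConfiguration(opener);
    }

    // Resets every registered instance as a side effect; the name is unused here.
    static void addDefaultResource(String name) {
        synchronized (REGISTRY) {
            for (MiniConfiguration conf : REGISTRY.keySet()) {
                conf.properties = null; // forces a re-read on the next get()
            }
        }
    }

    String get(String key) {
        if (properties == null) {
            Properties loaded = new Properties();
            try {
                loaded.load(opener.get()); // an exhausted stream yields nothing
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
            properties = loaded;
        }
        return properties.getProperty(key);
    }
}

class RegistryReloadSketch {
    // Returns { value before reload, one-shot value after, re-openable value after }.
    static String[] demo() {
        byte[] data = "a=1\n".getBytes(StandardCharsets.ISO_8859_1);
        MiniConfiguration oneShot =
                MiniConfiguration.fromStream(new ByteArrayInputStream(data));
        MiniConfiguration reopenable =
                MiniConfiguration.fromReopenable(() -> new ByteArrayInputStream(data));

        String before = oneShot.get("a"); // first load consumes the stream
        reopenable.get("a");

        // Some unrelated code registers a default resource ...
        MiniConfiguration.addDefaultResource("some-default.xml");

        // ... and both instances have to reload on the next access.
        return new String[] { before, oneShot.get("a"), reopenable.get("a") };
    }

    public static void main(String[] args) {
        String[] r = demo();
        System.out.println("before reload:             " + r[0]);
        System.out.println("one-shot after reload:     " + r[1]);
        System.out.println("re-openable after reload:  " + r[2]);
    }
}
```

The instance backed by a re-openable source survives the reset, which is the same reason addResource(URL) is mentioned as a workaround in problem 1.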