[
https://issues.apache.org/jira/browse/HBASE-16234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15397196#comment-15397196
]
Yi Liang commented on HBASE-16234:
----------------------------------
public static List<HRegionInfo> replicaRegionsNotRecordedInMeta(
    Set<HRegionInfo> regionsRecordedInMeta, MasterServices master) throws IOException {
  List<HRegionInfo> regionsNotRecordedInMeta = new ArrayList<HRegionInfo>();
  for (HRegionInfo hri : regionsRecordedInMeta) {
    TableName table = hri.getTable();
    // look at the HTD for the replica count. That's the source of truth
    int desiredRegionReplication = 1;
    try {
      HTableDescriptor htd = master.getTableDescriptors().get(table);
      if (htd == null) {
        LOG.warn("master can not get TableDescriptor from table '" + table + "'");
      } else {
        desiredRegionReplication = htd.getRegionReplication();
      }
    } catch (IOException e) {
      LOG.warn("Couldn't get the replication attribute of the table " + table
          + " due to " + e.getMessage());
    }
    for (int i = 0; i < desiredRegionReplication; i++) {
      HRegionInfo replica = RegionReplicaUtil.getRegionInfoForReplica(hri, i);
      if (regionsRecordedInMeta.contains(replica)) continue;
      regionsNotRecordedInMeta.add(replica);
    }
  }
  return regionsNotRecordedInMeta;
}
OK, I see your point. In my code I initialize desiredRegionReplication to 1, so if
htd == null, the inner for loop runs only for i = 0, and the call
HRegionInfo replica = RegionReplicaUtil.getRegionInfoForReplica(hri, i);
returns only the primary replica. Since every region should have a primary
replica, I wanted to at least check whether the primary replica should go into the
regionsNotRecordedInMeta list.
But I think your solution is also good: if the descriptor is broken, we can just
ignore this region and continue.
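To make the two options concrete, here is a rough, self-contained sketch, not the actual patch. Region, forReplica, the replication map and the skipOnMissing flag are hypothetical stand-ins I made up for HRegionInfo, RegionReplicaUtil.getRegionInfoForReplica, the table descriptors and the choice between the two behaviors; a table with no entry in the map models a null descriptor.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

class ReplicaNullHandlingSketch {

  /** Hypothetical stand-in for HRegionInfo: table, region name and replica id. */
  static final class Region {
    final String table;
    final String name;
    final int replicaId;

    Region(String table, String name, int replicaId) {
      this.table = table;
      this.name = name;
      this.replicaId = replicaId;
    }

    /** Hypothetical stand-in for RegionReplicaUtil.getRegionInfoForReplica(hri, i). */
    Region forReplica(int id) {
      return new Region(table, name, id);
    }

    @Override
    public boolean equals(Object o) {
      if (!(o instanceof Region)) return false;
      Region r = (Region) o;
      return table.equals(r.table) && name.equals(r.name) && replicaId == r.replicaId;
    }

    @Override
    public int hashCode() {
      return Objects.hash(table, name, replicaId);
    }
  }

  /**
   * A missing entry in replication models a null table descriptor.
   * skipOnMissing == false: fall back to a replica count of 1 (check the primary only).
   * skipOnMissing == true: ignore the region and continue.
   */
  static List<Region> notRecordedInMeta(
      Set<Region> inMeta, Map<String, Integer> replication, boolean skipOnMissing) {
    List<Region> missing = new ArrayList<>();
    for (Region r : inMeta) {
      Integer count = replication.get(r.table);
      if (count == null && skipOnMissing) {
        continue; // descriptor unavailable: skip this region
      }
      int desired = (count == null) ? 1 : count; // descriptor unavailable: primary only
      for (int i = 0; i < desired; i++) {
        Region replica = r.forReplica(i);
        if (!inMeta.contains(replica)) {
          missing.add(replica);
        }
      }
    }
    return missing;
  }

  public static void main(String[] args) {
    Map<String, Integer> replication = new HashMap<>();
    replication.put("t1", 3); // "t2" deliberately has no descriptor

    Set<Region> inMeta = new LinkedHashSet<>();
    inMeta.add(new Region("t1", "r1", 0));
    inMeta.add(new Region("t2", "r2", 1)); // replica recorded, primary not recorded

    System.out.println(notRecordedInMeta(inMeta, replication, false).size()); // prints 3
    System.out.println(notRecordedInMeta(inMeta, replication, true).size());  // prints 2
  }
}
```

In this simplified model the two choices give the same result whenever the primary of the region is already in the set, because replica 0 is then found and skipped. They only differ when a non-primary replica of a table without a descriptor is recorded but its primary is not: the default of 1 reports the missing primary, while skipping reports nothing for that region.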
Which one do you think is better?
Maybe we can also discuss this with [~jerryhe].
Thanks :)
> Expect and handle nulls when assigning replicas
> -----------------------------------------------
>
> Key: HBASE-16234
> URL: https://issues.apache.org/jira/browse/HBASE-16234
> Project: HBase
> Issue Type: Bug
> Components: Region Assignment
> Affects Versions: 2.0.0
> Reporter: Harsh J
> Assignee: Yi Liang
> Attachments: HBASE-16234-V1.patch, HBASE-16234-V1.patch,
> HBASE-16234-V2.patch, HBASE-16234-V3.patch
>
>
> Observed this on a cluster:
> {code}
> FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.
> java.lang.NullPointerException
> at org.apache.hadoop.hbase.master.AssignmentManager.replicaRegionsNotRecordedInMeta(AssignmentManager.java:2799)
> at org.apache.hadoop.hbase.master.AssignmentManager.assignAllUserRegions(AssignmentManager.java:2778)
> at org.apache.hadoop.hbase.master.AssignmentManager.processDeadServersAndRegionsInTransition(AssignmentManager.java:638)
> at org.apache.hadoop.hbase.master.AssignmentManager.joinCluster(AssignmentManager.java:485)
> at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:723)
> at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:169)
> at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1481)
> at java.lang.Thread.run(Thread.java:745)
> {code}
> It looks like {{FSTableDescriptors#get(…)}} can be expected to return null in
> some cases, but {{AssignmentManager.replicaRegionsNotRecordedInMeta(…)}} does
> not currently have any handling for such a possibility.