Duo Xu created HADOOP-11685:
-------------------------------

             Summary: StorageException complaining " no lease ID" during HBase distributed log splitting
                 Key: HADOOP-11685
                 URL: https://issues.apache.org/jira/browse/HADOOP-11685
             Project: Hadoop Common
          Issue Type: Bug
          Components: tools
            Reporter: Duo Xu
            Assignee: Duo Xu

This is similar to HADOOP-11523, but in a different place. During HBase distributed log splitting, multiple threads access the same folder, "recovered.edits". However, many places in our WASB code did not acquire a lease and simply passed null to Azure storage, which caused this issue.

{code}
2015-02-26 03:21:28,871 WARN org.apache.hadoop.hbase.regionserver.SplitLogWorker: log splitting of WALs/workernode4.hbaseproddm2001.g6.internal.cloudapp.net,60020,1422071058425-splitting/workernode4.hbaseproddm2001.g6.internal.cloudapp.net%2C60020%2C1422071058425.1424914216773 failed, returning error
java.io.IOException: org.apache.hadoop.fs.azure.AzureException: java.io.IOException
    at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.checkForErrors(HLogSplitter.java:633)
    at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.access$000(HLogSplitter.java:121)
    at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$OutputSink.finishWriting(HLogSplitter.java:964)
    at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.finishWritingAndClose(HLogSplitter.java:1019)
    at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.splitLogFile(HLogSplitter.java:359)
    at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.splitLogFile(HLogSplitter.java:223)
    at org.apache.hadoop.hbase.regionserver.SplitLogWorker$1.exec(SplitLogWorker.java:142)
    at org.apache.hadoop.hbase.regionserver.handler.HLogSplitterHandler.process(HLogSplitterHandler.java:79)
    at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:128)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.hadoop.fs.azure.AzureException: java.io.IOException
    at org.apache.hadoop.fs.azurenative.AzureNativeFileSystemStore.storeEmptyFolder(AzureNativeFileSystemStore.java:1477)
    at org.apache.hadoop.fs.azurenative.NativeAzureFileSystem.mkdirs(NativeAzureFileSystem.java:1862)
    at org.apache.hadoop.fs.azurenative.NativeAzureFileSystem.mkdirs(NativeAzureFileSystem.java:1812)
    at org.apache.hadoop.fs.FileSystem.mkdirs(FileSystem.java:1815)
    at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter.getRegionSplitEditsPath(HLogSplitter.java:502)
    at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.createWAP(HLogSplitter.java:1211)
    at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.getWriterAndPath(HLogSplitter.java:1200)
    at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$LogRecoveredEditsOutputSink.append(HLogSplitter.java:1243)
    at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.writeBuffer(HLogSplitter.java:851)
    at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.doRun(HLogSplitter.java:843)
    at org.apache.hadoop.hbase.regionserver.wal.HLogSplitter$WriterThread.run(HLogSplitter.java:813)
Caused by: java.io.IOException
    at com.microsoft.windowsazure.storage.core.Utility.initIOException(Utility.java:493)
    at com.microsoft.windowsazure.storage.blob.BlobOutputStream.close(BlobOutputStream.java:282)
    at org.apache.hadoop.fs.azurenative.AzureNativeFileSystemStore.storeEmptyFolder(AzureNativeFileSystemStore.java:1472)
    ... 10 more
Caused by: com.microsoft.windowsazure.storage.StorageException: There is currently a lease on the blob and no lease ID was specified in the request.
    at com.microsoft.windowsazure.storage.StorageException.translateException(StorageException.java:163)
    at com.microsoft.windowsazure.storage.core.StorageRequest.materializeException(StorageRequest.java:306)
    at com.microsoft.windowsazure.storage.core.ExecutionEngine.executeWithRetry(ExecutionEngine.java:229)
    at com.microsoft.windowsazure.storage.blob.CloudBlockBlob.commitBlockList(CloudBlockBlob.java:248)
    at com.microsoft.windowsazure.storage.blob.BlobOutputStream.commit(BlobOutputStream.java:319)
    at com.microsoft.windowsazure.storage.blob.BlobOutputStream.close(BlobOutputStream.java:279)
    ... 11 more
{code}

The fix is simple: acquire a lease before the mkdir operation. However, this might hurt performance slightly, so I will run a perf test locally before submitting the patch.
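
For illustration only, here is a minimal Java sketch of the failure mode described in this report and of the acquire-lease-before-write pattern proposed as the fix. It is a toy in-memory model, not the WASB code or the Azure storage SDK: `LeaseSketch`, `FakeBlob`, `LeaseException`, and both `storeEmptyFolder...` helpers are hypothetical names invented for this example, and the lease rules are simplified to "a write must carry the active lease ID".

```java
// Toy in-memory model of blob lease semantics. All names are hypothetical;
// this is NOT the Azure storage SDK and NOT the actual WASB implementation.
final class LeaseSketch {

    /** Thrown when a lease rule is violated in this toy model. */
    static final class LeaseException extends RuntimeException {
        LeaseException(String message) {
            super(message);
        }
    }

    /** Stand-in for a folder marker blob such as "recovered.edits". */
    static final class FakeBlob {
        private String activeLeaseId; // null means no lease is currently held
        private int writes;

        synchronized String acquireLease() {
            if (activeLeaseId != null) {
                throw new LeaseException("A lease is already held on the blob.");
            }
            activeLeaseId = java.util.UUID.randomUUID().toString();
            return activeLeaseId;
        }

        synchronized void releaseLease(String leaseId) {
            if (activeLeaseId != null && activeLeaseId.equals(leaseId)) {
                activeLeaseId = null;
            }
        }

        /** Rejects a write while a lease is active unless the matching lease ID is supplied. */
        synchronized void write(String leaseId) {
            if (activeLeaseId != null && leaseId == null) {
                throw new LeaseException("There is currently a lease on the blob "
                        + "and no lease ID was specified in the request.");
            }
            if (activeLeaseId != null && !activeLeaseId.equals(leaseId)) {
                throw new LeaseException("The supplied lease ID does not match.");
            }
            writes++;
        }

        synchronized int writeCount() {
            return writes;
        }
    }

    /** Problem pattern: write with a null lease ID; fails if another thread holds a lease. */
    static boolean storeEmptyFolderWithoutLease(FakeBlob blob) {
        try {
            blob.write(null);
            return true;
        } catch (LeaseException e) {
            return false;
        }
    }

    /** Proposed pattern: acquire a lease, write with its ID, and always release it. */
    static boolean storeEmptyFolderWithLease(FakeBlob blob) {
        String leaseId = blob.acquireLease();
        try {
            blob.write(leaseId);
            return true;
        } finally {
            blob.releaseLease(leaseId);
        }
    }

    public static void main(String[] args) {
        FakeBlob folder = new FakeBlob();
        String heldByOtherThread = folder.acquireLease(); // simulate a concurrent writer
        System.out.println(storeEmptyFolderWithoutLease(folder)); // prints "false"
        folder.releaseLease(heldByOtherThread);
        System.out.println(storeEmptyFolderWithLease(folder)); // prints "true"
    }
}
```

In this sketch `storeEmptyFolderWithLease` throws if another caller still holds the lease. A real fix would presumably need to wait or retry in that case, which is one place the performance cost mentioned above could come from.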