Apache Hadoop qbt Report: branch-2.10+JDK7 on Linux/x86_64

2024-04-17 Thread Apache Jenkins Server
For more details, see 
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/1365/

No changes




-1 overall


The following subsystems voted -1:
asflicense hadolint mvnsite pathlen unit


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck whitespace


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

Failed junit tests :

   hadoop.ipc.TestIPC
   hadoop.fs.TestFileUtil
   hadoop.contrib.bkjournal.TestBookKeeperHACheckpoints
   hadoop.hdfs.TestLeaseRecovery2
   hadoop.hdfs.server.blockmanagement.TestReplicationPolicyWithUpgradeDomain
   hadoop.hdfs.server.datanode.TestDirectoryScanner
   hadoop.hdfs.TestFileLengthOnClusterRestart
   hadoop.hdfs.TestDFSInotifyEventInputStream
   hadoop.hdfs.server.namenode.snapshot.TestSnapshotBlocksMap
   hadoop.hdfs.qjournal.server.TestJournalNodeRespectsBindHostKeys
   hadoop.fs.viewfs.TestViewFileSystemHdfs
   hadoop.hdfs.server.balancer.TestBalancerWithHANameNodes
   hadoop.hdfs.server.federation.router.TestRouterQuota
   hadoop.hdfs.server.federation.router.TestRouterNamenodeHeartbeat
   hadoop.hdfs.server.federation.resolver.order.TestLocalResolver
   hadoop.hdfs.server.federation.resolver.TestMultipleDestinationResolver
   hadoop.contrib.bkjournal.TestBookKeeperHACheckpoints
   hadoop.mapreduce.v2.app.TestRuntimeEstimators
   hadoop.mapreduce.lib.input.TestLineRecordReader
   hadoop.mapred.TestLineRecordReader
   hadoop.mapreduce.jobhistory.TestHistoryViewerPrinter
   hadoop.resourceestimator.service.TestResourceEstimatorService
   hadoop.resourceestimator.solver.impl.TestLpSolver
   hadoop.yarn.sls.TestSLSRunner
   hadoop.yarn.server.nodemanager.containermanager.linux.resources.TestNumaResourceAllocator
   hadoop.yarn.server.nodemanager.containermanager.linux.resources.TestNumaResourceHandlerImpl
   hadoop.yarn.server.resourcemanager.TestClientRMService
   hadoop.yarn.server.resourcemanager.monitor.invariants.TestMetricsInvariantChecker

   cc:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/1365/artifact/out/diff-compile-cc-root.txt
  [4.0K]

   javac:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/1365/artifact/out/diff-compile-javac-root.txt
  [488K]

   checkstyle:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/1365/artifact/out/diff-checkstyle-root.txt
  [14M]

   hadolint:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/1365/artifact/out/diff-patch-hadolint.txt
  [4.0K]

   mvnsite:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/1365/artifact/out/patch-mvnsite-root.txt
  [572K]

   pathlen:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/1365/artifact/out/pathlen.txt
  [12K]

   pylint:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/1365/artifact/out/diff-patch-pylint.txt
  [20K]

   shellcheck:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/1365/artifact/out/diff-patch-shellcheck.txt
  [72K]

   whitespace:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/1365/artifact/out/whitespace-eol.txt
  [12M]
   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/1365/artifact/out/whitespace-tabs.txt
  [1.3M]

   javadoc:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/1365/artifact/out/patch-javadoc-root.txt
  [36K]

   unit:

   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/1365/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt
  [224K]
   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/1365/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
  [1.8M]
   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/1365/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-rbf.txt
  [36K]
   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/1365/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs_src_contrib_bkjournal.txt
  [16K]
   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/1365/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-app.txt
  [44K]
   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/1365/artifact/out/patch-unit-hadoop-mapreduce-project_hadoop-mapreduce-client_hadoop-mapreduce-client-core.txt
  [104K]
   
https://ci-hadoop.apache.org/job/hadoop-qbt-branch-2.10-java7-linux-x86_64/1365/artifact/o

[jira] [Created] (HDFS-17473) [FGL] Make quota related operations thread-safe

2024-04-17 Thread ZanderXu (Jira)
ZanderXu created HDFS-17473:
---

 Summary: [FGL] Make quota related operations thread-safe 
 Key: HDFS-17473
 URL: https://issues.apache.org/jira/browse/HDFS-17473
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: ZanderXu
Assignee: ZanderXu


Concurrent operations on the directory tree may cause quota updates and 
verification to not be thread-safe.

For example:
 # Suppose there is a directory _/a/b_ and a quota is set on inodes _a_ and _b_.
 # There are some directories and files under _/a/b_, such as _/a/b/c/d1_ and 
_/a/b/d/f1.txt_.
 # Suppose there is a create operation under _/a/b/c/d1_ and an addBlock 
operation on _/a/b/d/f1.txt_.
 # These two operations can be handled concurrently by the namenode.
 # They will update the quota on inode _a_ concurrently, since both operations 
hold only the read lock of inodes _a_ and _b_.
 # So we should make quota-related operations thread-safe.

 

There are two solutions to make quota-related operations thread-safe.

Solution one: Hold the write lock of the first iNode with a quota set when 
resolving the path.
 * Directly hold the write lock of iNode _a_ so that all operations involving 
subtree _/a_ can be handled safely.
 * Due to the lower concurrency, the maximum improvement cannot be achieved.
 * But the implementation is simple and straightforward.

Solution two: Lock all QuotaFeatures during quota verification or update 
(sketched below).
 * Still hold the read locks of iNodes _a_ and _b_.
 * Lock all QuotaFeatures involved in the operation when validating or 
updating quotas.
 * The maximum improvement can be achieved.
 * But the implementation is a little more complex:
 ** Add a lock to each QuotaFeature.
 ** Acquire the locks of all involved QuotaFeatures.
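
Below is a rough sketch of solution two, assuming one lock per QuotaFeature. 
All class, field, and method names here are hypothetical, not the actual HDFS 
internals:

    import java.util.List;
    import java.util.concurrent.locks.ReentrantLock;

    // Hypothetical per-feature lock; the real QuotaFeature has no such lock yet.
    class LockedQuotaFeature {
        private final ReentrantLock lock = new ReentrantLock();
        private long nsUsage;  // namespace usage
        private long ssUsage;  // storagespace usage

        void lock()   { lock.lock(); }
        void unlock() { lock.unlock(); }
        void add(long nsDelta, long ssDelta) { nsUsage += nsDelta; ssUsage += ssDelta; }
    }

    class QuotaUpdater {
        // 'features' are the QuotaFeatures on the resolved path, root to leaf.
        static void updateQuotas(List<LockedQuotaFeature> features,
                                 long nsDelta, long ssDelta) {
            // Locking in a fixed (root-to-leaf) order keeps concurrent
            // operations from deadlocking on each other's feature locks.
            for (LockedQuotaFeature f : features) {
                f.lock();
            }
            try {
                for (LockedQuotaFeature f : features) {
                    f.add(nsDelta, ssDelta);  // verification would go here too
                }
            } finally {
                for (int i = features.size() - 1; i >= 0; i--) {
                    features.get(i).unlock();
                }
            }
        }
    }

With this, two operations under /a/b/c/d1 and /a/b/d/f1.txt can proceed under 
the read locks of iNodes a and b, yet their updates to a's QuotaFeature are 
serialized by the feature lock.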



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDFS-17475) Add a command to check if files are readable

2024-04-17 Thread Felix N (Jira)
Felix N created HDFS-17475:
--

 Summary: Add a command to check if files are readable
 Key: HDFS-17475
 URL: https://issues.apache.org/jira/browse/HDFS-17475
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs
Reporter: Felix N
Assignee: Felix N
 Fix For: 3.5.0


Sometimes a job can fail because of a single unreadable file down the line, 
whether due to missing replicas, dead DNs, or other reasons. This command 
should allow users to check whether files are readable by checking block 
metadata on the DNs, without executing the full read pipelines of the files.
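
A rough client-side approximation of the idea, using only public FileSystem 
metadata calls (the proposed command itself does not exist yet, and the 
readability criterion below is an assumption):

    import java.io.IOException;
    import org.apache.hadoop.fs.BlockLocation;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ReadableCheck {
        // A file is treated as "probably readable" if every block reports at
        // least one live, non-corrupt location; no read pipeline is executed.
        public static boolean probablyReadable(FileSystem fs, Path path)
                throws IOException {
            FileStatus status = fs.getFileStatus(path);
            for (BlockLocation block :
                    fs.getFileBlockLocations(status, 0, status.getLen())) {
                if (block.isCorrupt() || block.getHosts().length == 0) {
                    return false;  // a block with no usable replica fails the check
                }
            }
            return true;
        }
    }

Note this only inspects block locations known to the NameNode; the proposed 
command would presumably go further and verify replica metadata on the DNs.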



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDFS-17476) fix: False positive "Observer Node is too far behind" due to long overflow.

2024-04-17 Thread Jian Zhang (Jira)
Jian Zhang created HDFS-17476:
-

 Summary: fix: False positive "Observer Node is too far behind" due 
to long overflow.
 Key: HDFS-17476
 URL: https://issues.apache.org/jira/browse/HDFS-17476
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Jian Zhang
Assignee: Jian Zhang


In the code GlobalStateIdContext#receiveRequestState(), if clientStateId is a 
small negative number, clientStateId - serverStateId may, due to long 
overflow, be greater than

(ESTIMATED_TRANSACTIONS_PER_SECOND
                  * TimeUnit.MILLISECONDS.toSeconds(clientWaitTime)
                  * ESTIMATED_SERVER_TIME_MULTIPLIER),

resulting in false positives that the Observer Node is too far behind.
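
A minimal demonstration of the wrap-around (the state-id values here are made 
up, not from a real cluster):

    public class OverflowDemo {
        public static void main(String[] args) {
            // A negative clientStateId minus a very large serverStateId
            // underflows long arithmetic and wraps to a huge positive value,
            // so the "too far behind" threshold check trips when it should not.
            long clientStateId = -10L;                  // small negative client state id
            long serverStateId = Long.MAX_VALUE - 4;    // hypothetical huge server state id
            long diff = clientStateId - serverStateId;  // mathematically about -2^63 - 5
            System.out.println(diff);                   // prints 9223372036854775803 (positive!)
        }
    }

Using Math.subtractExact(clientStateId, serverStateId), or comparing the two 
ids without subtracting, would avoid the silent wrap-around.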

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86_64

2024-04-17 Thread Apache Jenkins Server
For more details, see 
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java8-linux-x86_64/1562/

No changes




-1 overall


The following subsystems voted -1:
blanks hadolint pathlen spotbugs unit xml


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

XML :

   Parsing Error(s): 
   
hadoop-common-project/hadoop-common/src/test/resources/xml/external-dtd.xml 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-excerpt.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags2.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-sample-output.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/fair-scheduler-invalid.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/yarn-site-with-invalid-allocation-file-ref.xml
 

spotbugs :

   module:hadoop-hdfs-project/hadoop-hdfs-httpfs 
   Redundant nullcheck of xAttrs, which is known to be non-null in 
org.apache.hadoop.fs.http.client.HttpFSFileSystem.getXAttr(Path, String) 
Redundant null check at HttpFSFileSystem.java:is known to be non-null in 
org.apache.hadoop.fs.http.client.HttpFSFileSystem.getXAttr(Path, String) 
Redundant null check at HttpFSFileSystem.java:[line 1373] 

spotbugs :

   module:hadoop-yarn-project/hadoop-yarn 
   org.apache.hadoop.yarn.service.ServiceScheduler$1.load(ConfigFile) may 
return null, but is declared @Nonnull At ServiceScheduler.java:is declared 
@Nonnull At ServiceScheduler.java:[line 555] 

spotbugs :

   module:hadoop-hdfs-project/hadoop-hdfs-rbf 
   Redundant nullcheck of dns, which is known to be non-null in 
org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getCachedDatanodeReport(HdfsConstants$DatanodeReportType)
 Redundant null check at RouterRpcServer.java:is known to be non-null in 
org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getCachedDatanodeReport(HdfsConstants$DatanodeReportType)
 Redundant null check at RouterRpcServer.java:[line 1093] 

spotbugs :

   module:hadoop-hdfs-project 
   Redundant nullcheck of xAttrs, which is known to be non-null in 
org.apache.hadoop.fs.http.client.HttpFSFileSystem.getXAttr(Path, String) 
Redundant null check at HttpFSFileSystem.java:is known to be non-null in 
org.apache.hadoop.fs.http.client.HttpFSFileSystem.getXAttr(Path, String) 
Redundant null check at HttpFSFileSystem.java:[line 1373] 
   Redundant nullcheck of dns, which is known to be non-null in 
org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getCachedDatanodeReport(HdfsConstants$DatanodeReportType)
 Redundant null check at RouterRpcServer.java:is known to be non-null in 
org.apache.hadoop.hdfs.server.federation.router.RouterRpcServer.getCachedDatanodeReport(HdfsConstants$DatanodeReportType)
 Redundant null check at RouterRpcServer.java:[line 1093] 

spotbugs :

   module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications 
   org.apache.hadoop.yarn.service.ServiceScheduler$1.load(ConfigFile) may 
return null, but is declared @Nonnull At ServiceScheduler.java:is declared 
@Nonnull At ServiceScheduler.java:[line 555] 

spotbugs :

   
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services
 
   org.apache.hadoop.yarn.service.ServiceScheduler$1.load(ConfigFile) may 
return null, but is declared @Nonnull At ServiceScheduler.java:is declared 
@Nonnull At ServiceScheduler.java:[line 555] 

spotbugs :

   
module:hadoop-yarn-project/hadoop-yarn/hadoop-yarn-applications/hadoop-yarn-services/hadoop-yarn-services-core
 
   org.apache.hadoop.yarn.service.ServiceScheduler$1.load(ConfigFile) may 
return null, but is declared @Nonnull At ServiceScheduler.java:is declared 
@Nonnull At ServiceScheduler.java:[line 555] 

spotbugs :

   module:hadoop-yarn-project 
   org.apache.hadoop.yarn.service.ServiceScheduler$1.load(ConfigFile) may 
return null, but is declared @Nonnull At ServiceScheduler.java:is declared 
@Nonnull At ServiceScheduler.java:[line 555] 

spotbugs :

   module:root 
   Redundant nullcheck of xAttrs, which is known to be non-null in 
org.apache.hadoop.fs.http.client.HttpFSFileSystem.getXAttr(Path, String) 
Redundant null check at HttpFSFileSystem.java:is known to be non-null in 
org.apache.hadoop.fs.http.client.HttpF

[jira] [Created] (HDFS-17477) IncrementalBlockReport race condition additional edge cases

2024-04-17 Thread Danny Becker (Jira)
Danny Becker created HDFS-17477:
---

 Summary: IncrementalBlockReport race condition additional edge 
cases
 Key: HDFS-17477
 URL: https://issues.apache.org/jira/browse/HDFS-17477
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: auto-failover, ha, namenode
Affects Versions: 3.3.6, 3.3.4, 3.3.5
Reporter: Danny Becker


HDFS-17453 fixes a race condition between IncrementalBlockReports (IBR) and the 
Edit Log Tailer which can cause the Standby NameNode (SNN) to incorrectly mark 
blocks as corrupt when it transitions to Active. There are a few edge cases 
that HDFS-17453 does not cover.

For example (b1gs1 denotes block b1 at generation stamp gs1):
1. SNN1 loads the edits for b1gs1 and b1gs2.
2. DN1 reports b1gs1 to SNN1, so it gets queued for later processing.
3. DN1 reports b1gs2 to SNN1 so it gets added to the blocks map.
4. SNN1 transitions to Active (ANN1).
5. ANN1 processes the pending DN message queue and marks DN1->b1gs1 as corrupt 
because it was still in the queue.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDFS-17478) FSPermissionChecker to avoid obtaining a new AccessControlEnforcer instance before each authz call

2024-04-17 Thread Madhan Neethiraj (Jira)
Madhan Neethiraj created HDFS-17478:
---

 Summary: FSPermissionChecker to avoid obtaining a new 
AccessControlEnforcer instance before each authz call
 Key: HDFS-17478
 URL: https://issues.apache.org/jira/browse/HDFS-17478
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: namenode
Reporter: Madhan Neethiraj


An instance of AccessControlEnforcer is obtained from the registered 
INodeAttributeProvider before every call made to the authorizer. This can be 
avoided by initializing the AccessControlEnforcer instance during construction 
of FSPermissionChecker and using it in every subsequent call to the 
authorizer. This will eliminate unnecessary overhead in the highly 
performance-sensitive authz code path.
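
A minimal sketch of the proposed change; the simplified constructor and field 
names below are assumptions, not the actual FSPermissionChecker code:

    import org.apache.hadoop.hdfs.server.namenode.INodeAttributeProvider;

    // Before: each authz call re-fetched the enforcer, roughly
    //   provider.getExternalAccessControlEnforcer(dflt).checkPermission(...);
    // After: resolve the enforcer once, at construction time.
    class FSPermissionChecker {
        private final INodeAttributeProvider.AccessControlEnforcer enforcer;

        FSPermissionChecker(INodeAttributeProvider provider,
                INodeAttributeProvider.AccessControlEnforcer defaultEnforcer) {
            this.enforcer = (provider != null)
                    ? provider.getExternalAccessControlEnforcer(defaultEnforcer)
                    : defaultEnforcer;
        }

        void checkPermission(/* inode, access, ugi, ... */) {
            // Every subsequent authz call reuses the cached instance:
            // enforcer.checkPermission(...);
        }
    }

This assumes the provider's enforcer is safe to cache for the lifetime of an 
FSPermissionChecker instance; a provider that swaps enforcers dynamically 
would need to opt out.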



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



Apache Hadoop qbt Report: trunk+JDK11 on Linux/x86_64

2024-04-17 Thread Apache Jenkins Server
For more details, see 
https://ci-hadoop.apache.org/job/hadoop-qbt-trunk-java11-linux-x86_64/662/

[Apr 15, 2024, 10:35:53 AM] (github) HDFS-17383: Datanode current block token 
should come from active NameNode in HA mode (#6562). Contributed by lei w.
[Apr 15, 2024, 4:28:05 PM] (github) HDFS-17465. RBF: Use 
ProportionRouterRpcFairnessPolicyController get 'java.lang.Error: Maximum 
permit count exceeded' (#6727)




-1 overall


The following subsystems voted -1:
blanks hadolint mvnsite pathlen spotbugs unit xml


The following subsystems voted -1 but
were configured to be filtered/ignored:
cc checkstyle javac javadoc pylint shellcheck


The following subsystems are considered long running:
(runtime bigger than 1h  0m  0s)
unit


Specific tests:

XML :

   Parsing Error(s): 
   
hadoop-common-project/hadoop-common/src/test/resources/xml/external-dtd.xml 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-excerpt.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-output-missing-tags2.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/resources/nvidia-smi-sample-output.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/fair-scheduler-invalid.xml
 
   
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/resources/yarn-site-with-invalid-allocation-file-ref.xml
 

spotbugs :

   module:hadoop-hdfs-project/hadoop-hdfs 
   Redundant nullcheck of oldLock, which is known to be non-null in 
org.apache.hadoop.hdfs.server.datanode.DataStorage.isPreUpgradableLayout(Storage$StorageDirectory)
 Redundant null check at DataStorage.java:is known to be non-null in 
org.apache.hadoop.hdfs.server.datanode.DataStorage.isPreUpgradableLayout(Storage$StorageDirectory)
 Redundant null check at DataStorage.java:[line 695] 
   Redundant nullcheck of metaChannel, which is known to be non-null in 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.MappableBlockLoader.verifyChecksum(long,
 FileInputStream, FileChannel, String) Redundant null check at 
MappableBlockLoader.java:is known to be non-null in 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.MappableBlockLoader.verifyChecksum(long,
 FileInputStream, FileChannel, String) Redundant null check at 
MappableBlockLoader.java:[line 138] 
   Redundant nullcheck of blockChannel, which is known to be non-null in 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.MemoryMappableBlockLoader.load(long,
 FileInputStream, FileInputStream, String, ExtendedBlockId) Redundant null 
check at MemoryMappableBlockLoader.java:is known to be non-null in 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.MemoryMappableBlockLoader.load(long,
 FileInputStream, FileInputStream, String, ExtendedBlockId) Redundant null 
check at MemoryMappableBlockLoader.java:[line 75] 
   Redundant nullcheck of blockChannel, which is known to be non-null in 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.NativePmemMappableBlockLoader.load(long,
 FileInputStream, FileInputStream, String, ExtendedBlockId) Redundant null 
check at NativePmemMappableBlockLoader.java:is known to be non-null in 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.NativePmemMappableBlockLoader.load(long,
 FileInputStream, FileInputStream, String, ExtendedBlockId) Redundant null 
check at NativePmemMappableBlockLoader.java:[line 85] 
   Redundant nullcheck of metaChannel, which is known to be non-null in 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.NativePmemMappableBlockLoader.verifyChecksumAndMapBlock(NativeIO$POSIX$PmemMappedRegion,
 long, FileInputStream, FileChannel, String) Redundant null check at 
NativePmemMappableBlockLoader.java:is known to be non-null in 
org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.NativePmemMappableBlockLoader.verifyChecksumAndMapBlock(NativeIO$POSIX$PmemMappedRegion,
 long, FileInputStream, FileChannel, String) Redundant null check at 
NativePmemMappableBlockLoader.java:[line 130] 
   
org.apache.hadoop.hdfs.server.namenode.top.window.RollingWindowManager$UserCounts
 doesn't override java.util.ArrayList.equals(Object) At 
RollingWindowManager.java:At RollingWindowManager.java:[line 1] 

spotbugs :

   module:hadoop-hdfs-project/hadoop-hdfs-httpfs 
   Redundant nullcheck of xAttrs, which is known to be non-null in 
org.apache.hadoop.fs.http.client.HttpFSFileSystem.getXAttr(Path, String) 
Redundant null check at HttpFSFileSystem.java:is known to be non-null in 
org.apache.hadoop.fs.http.client.HttpFSFileSystem.getXAttr(Path, String) 
Redundant null ch

[jira] [Resolved] (HDFS-17472) [FGL] gcDeletedSnapshot and getDelegationToken support FGL

2024-04-17 Thread Hui Fei (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-17472?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hui Fei resolved HDFS-17472.

Resolution: Fixed

> [FGL] gcDeletedSnapshot and getDelegationToken support FGL
> --
>
> Key: HDFS-17472
> URL: https://issues.apache.org/jira/browse/HDFS-17472
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Reporter: ZanderXu
>Assignee: ZanderXu
>Priority: Major
>  Labels: pull-request-available
>
> [FGL] gcDeletedSnapshot and getDelegationToken support FGL



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDFS-17479) [FGL] Snapshot related operations still use global lock

2024-04-17 Thread ZanderXu (Jira)
ZanderXu created HDFS-17479:
---

 Summary: [FGL] Snapshot related operations still use global lock
 Key: HDFS-17479
 URL: https://issues.apache.org/jira/browse/HDFS-17479
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: ZanderXu


The snapshot feature is very useful in certain scenarios, but as far as I 
know, very few companies use it in production environments. The 
implementation is complex, and it is difficult to support FGL with only minor 
modifications.

So we can still use the Global lock to make snapshot-related operations 
thread-safe.

 

Snapshots have several access modes; let's analyze them and find a way to 
still use the global lock.

 

[image attachment: image-2024-04-18-10-31-34-458.png]

The above picture shows a simple case: we can access the iNode foo through the 
following paths:
 # /abc/foo
 # /abc/.snapshot/s1/foo

If we want to delete the iNode foo, we need to lock /abc and /abc/.snapshot/s1 
(the DirectoryWithSnapshotFeature on iNode abc).

If we want to change the permission of the iNode foo, we need to lock /abc/foo 
and /abc/.snapshot/s1/foo (the DirectoryWithSnapshotFeature on the iNode foo).

 

For this case, we can directly acquire the global lock when resolving the IIPs 
for the input path if there is an iNode that has DirectorySnapshottableFeature.

[image attachment: image-2024-04-18-10-48-08-773.png]

After /abc/foo is renamed to /xyz/bar, the access paths change, as the above 
picture shows. We can access bar through the following paths:
 # /abc/.snapshot/s1/bar
 # /xyz/bar

For /abc/.snapshot/s1/bar, since the iNode abc has the 
DirectorySnapshottableFeature, we can identify it and acquire the global lock.

For /xyz/bar, we can identify it through the Reference flag, since the iNode 
bar is a DstReference node.

 

So we can use DirectorySnapshottableFeature and Reference to determine whether 
we need to acquire the global lock when resolving the IIPs for the input path, 
as the sketch below shows.
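
A rough sketch of that check; the array-based signature is an assumption 
rather than the actual FSDirectory/INodesInPath API:

    // Fall back to the global lock if any iNode on the resolved path is
    // snapshottable or is a rename reference (e.g. a DstReference node).
    static boolean needsGlobalLock(INode[] inodesOnPath) {
        for (INode inode : inodesOnPath) {
            if (inode == null) {
                continue;  // path components that do not exist yet
            }
            if (inode.isDirectory() && inode.asDirectory().isSnapshottable()) {
                return true;  // DirectorySnapshottableFeature is present
            }
            if (inode.isReference()) {
                return true;  // covers /xyz/bar after the rename
            }
        }
        return false;
    }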

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDFS-17480) [FGL] GetListing RPC supports fine-grained locking

2024-04-17 Thread ZanderXu (Jira)
ZanderXu created HDFS-17480:
---

 Summary: [FGL] GetListing RPC supports fine-grained locking
 Key: HDFS-17480
 URL: https://issues.apache.org/jira/browse/HDFS-17480
 Project: Hadoop HDFS
  Issue Type: Sub-task
Reporter: ZanderXu
Assignee: ZanderXu


GetListing is an RPC very commonly used by end-users, so we should consider 
how GetListing can support FGL.

For example, there is a directory /a/b/c that contains some children, such as 
d1, d2, d3, f1, f2, f3.

Normally, we would hold the write lock of iNode c when listing /a/b/c, to make 
sure that no other threads are updating the children of iNode c. But if the 
listing path is /, the entire directory tree would be locked, which would have 
a great impact.

 

There are two solutions to fix this problem:

Solution 1:
 * Hold the read lock of iNode c.
 * Loop through all children.
 ** Hold the read lock of each child and return its file status.

The result may contain some stale file statuses, because already-visited 
children may be updated by another thread before the getListing result is 
returned to the client.

 

Solution 2:
 * Hold the write locks of both the parent and the current node when updating 
the current node.
 ** For example, hold the write locks of iNodes c and d1 when updating d1.
 * Hold the read lock of iNode c while listing.
 * Loop through all children.

This solution increases the lock scope, since the parent's write lock is 
usually not required for an update.

 

I prefer the first solution, since the namenode always returns results in 
batches anyway, and changes may already occur between batches. A rough sketch 
of it follows.
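
A minimal sketch of solution 1, with hypothetical lock and child accessors 
(readLockOf, childrenOf, toFileStatus do not exist in the current code; they 
stand in for whatever per-iNode locking FGL introduces):

    // Snapshot each child's status under its own read lock while holding
    // only the parent's read lock; the batch may be stale once returned.
    List<HdfsFileStatus> listChildren(INodeDirectory parent) {
        List<HdfsFileStatus> result = new ArrayList<>();
        readLockOf(parent).lock();                     // read lock of iNode c
        try {
            for (INode child : childrenOf(parent)) {   // assumed accessor
                readLockOf(child).lock();              // per-child read lock
                try {
                    result.add(toFileStatus(child));   // assumed conversion
                } finally {
                    readLockOf(child).unlock();
                }
            }
        } finally {
            readLockOf(parent).unlock();
        }
        return result;
    }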



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org