[jira] [Created] (HDFS-15122) Log all balancer related parameters at Balancer startup

2020-01-14 Thread Istvan Fajth (Jira)
Istvan Fajth created HDFS-15122:
---

 Summary: Log all balancer related parameters at Balancer startup
 Key: HDFS-15122
 URL: https://issues.apache.org/jira/browse/HDFS-15122
 Project: Hadoop HDFS
  Issue Type: Improvement
Reporter: Istvan Fajth


Currently the Balancer logs only the parameters it is invoked with.
It would be good to emit all Balancer-related configuration values into its 
log at startup; that would make it easier to see everything at once when 
investigating any balancing problem.

The maximum balancing bandwidth is one of the configuration values currently 
missing from the log, but the other related parameters should be added as 
well, all at once, in this effort.
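As a sketch of the idea (the key list and helper names below are illustrative; the real set would come from DFSConfigKeys and the values from Hadoop's Configuration object), the Balancer could emit one startup line covering every related key and its effective value:

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hedged sketch: log every balancer-related key and its effective value once
// at startup, so one log line shows the full picture.
public class BalancerStartupLog {
  // Illustrative key list; the real set would come from DFSConfigKeys.
  static final List<String> BALANCER_KEYS = Arrays.asList(
      "dfs.datanode.balance.bandwidthPerSec",
      "dfs.datanode.balance.max.concurrent.moves",
      "dfs.balancer.max-size-to-move");

  // Stand-in for Hadoop's Configuration#get(key, defaultValue).
  static String effectiveValue(Map<String, String> conf, String key) {
    return conf.getOrDefault(key, "<default>");
  }

  // Builds the single startup line the Jira asks for.
  public static String formatStartupLine(Map<String, String> conf) {
    StringBuilder sb = new StringBuilder("Balancer parameters:");
    for (String key : BALANCER_KEYS) {
      sb.append(' ').append(key).append('=').append(effectiveValue(conf, key));
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    Map<String, String> conf = new LinkedHashMap<>();
    conf.put("dfs.datanode.balance.bandwidthPerSec", "104857600");
    System.out.println(formatStartupLine(conf));
  }
}
```

Logging defaults explicitly (rather than omitting unset keys) is what makes the line useful during troubleshooting: the operator sees the effective value, not just what was overridden.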



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Reopened] (HDFS-13339) Volume reference can't be released and may lead to deadlock when DataXceiver does a check volume

2020-01-14 Thread Jim Brennan (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-13339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jim Brennan reopened HDFS-13339:


Re-opening issue so I can put up a patch for branch-2.10.

> Volume reference can't be released and may lead to deadlock when DataXceiver 
> does a check volume
> 
>
> Key: HDFS-13339
> URL: https://issues.apache.org/jira/browse/HDFS-13339
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
> Environment: os: Linux 2.6.32-358.el6.x86_64
> hadoop version: hadoop-3.2.0-SNAPSHOT
> unit: mvn test -Pnative 
> -Dtest=TestDataNodeVolumeFailureReporting#testVolFailureStatsPreservedOnNNRestart
>Reporter: liaoyuxiangqin
>Assignee: Zsolt Venczel
>Priority: Critical
>  Labels: DataNode, volumes
> Fix For: 3.2.0, 3.1.1, 3.0.4
>
> Attachments: HDFS-13339.001.patch, HDFS-13339.002.patch, 
> HDFS-13339.003.patch, HDFS-13339.004.patch
>
>
> When I execute the unit test
>  TestDataNodeVolumeFailureReporting#testVolFailureStatsPreservedOnNNRestart, 
> the process blocks on waitReplication. Details follow:
> [INFO] ---
>  [INFO] T E S T S
>  [INFO] ---
>  [INFO] Running 
> org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting
>  [ERROR] Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 
> 307.492 s <<< FAILURE! - in 
> org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting
>  [ERROR] 
> testVolFailureStatsPreservedOnNNRestart(org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting)
>  Time elapsed: 307.206 s <<< ERROR!
>  java.util.concurrent.TimeoutException: Timed out waiting for /test1 to reach 
> 2 replicas
>  at org.apache.hadoop.hdfs.DFSTestUtil.waitReplication(DFSTestUtil.java:800)
>  at 
> org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting.testVolFailureStatsPreservedOnNNRestart(TestDataNodeVolumeFailureReporting.java:283)
>  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>  at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>  at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498)
>  at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47)
>  at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>  at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44)
>  at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>  at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>  at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>  at 
> org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:74)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDFS-15123) Remove unnecessary null check in FoldedTreeSet

2020-01-14 Thread Ravuri Sushma sree (Jira)
Ravuri Sushma sree created HDFS-15123:
-

 Summary: Remove unnecessary null check in FoldedTreeSet
 Key: HDFS-15123
 URL: https://issues.apache.org/jira/browse/HDFS-15123
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Ravuri Sushma sree
Assignee: Ravuri Sushma sree
 Fix For: 3.1.1


{code:java}
private void deleteNode(final Node<E> node) {
  if (node.right == null) {
    if (node.left != null) {
      attachToParent(node, node.left);
    } else {
      attachNullToParent(node);
    }
  } else if (node.left == null) {
    attachToParent(node, node.right);
  } else {
    // node.left != null && node.right != null
    // node.next should replace node in tree
    // node.next != null guaranteed since node.left != null
    // node.next.left == null since node.next.prev is node
    // node.next.right may be null or non-null
    Node<E> toMoveUp = node.next;
    if (toMoveUp.right == null) {
      attachNullToParent(toMoveUp);
    } else {
      attachToParent(toMoveUp, toMoveUp.right);
    }
    toMoveUp.left = node.left;
    if (toMoveUp.left != null) {
      toMoveUp.left.parent = toMoveUp;
    }
    toMoveUp.right = node.right;
    if (toMoveUp.right != null) {
      toMoveUp.right.parent = toMoveUp;
    }
    attachToParentNoBalance(node, toMoveUp);
    toMoveUp.color = node.color;
  }
}
{code}

The *if (toMoveUp.left != null)* and *if (toMoveUp.right != null)* null checks 
are unnecessary: the final else branch is only reached when both `node.left` 
and `node.right` are non-null, so the values assigned to `toMoveUp.left` and 
`toMoveUp.right` can never be null there.
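A minimal, self-contained model (not the real FoldedTreeSet; the Node class and relink method below are illustrative) showing why the two checks are redundant: control reaches the relink step only when both children are non-null, so the parent reassignments are always safe.

```java
// Minimal model (not the real FoldedTreeSet) of the final else branch of
// deleteNode: control only reaches it when both node.left and node.right
// are non-null, so the two null checks around the parent reassignments can
// never be false there.
public class RedundantCheckDemo {
  static class Node {
    Node left, right, parent;
  }

  // The relink step with the redundant checks removed; callers must ensure
  // node.left != null && node.right != null, as deleteNode's branch does.
  static void relink(Node node, Node toMoveUp) {
    toMoveUp.left = node.left;
    toMoveUp.left.parent = toMoveUp;   // safe: node.left != null here
    toMoveUp.right = node.right;
    toMoveUp.right.parent = toMoveUp;  // safe: node.right != null here
  }

  public static void main(String[] args) {
    Node node = new Node();
    node.left = new Node();
    node.right = new Node();
    Node toMoveUp = new Node();
    relink(node, toMoveUp);
    System.out.println(toMoveUp.left.parent == toMoveUp
        && toMoveUp.right.parent == toMoveUp);
  }
}
```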



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Reopened] (HDFS-14476) lock too long when fix inconsistent blocks between disk and in-memory

2020-01-14 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14476?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang reopened HDFS-14476:


> lock too long when fix inconsistent blocks between disk and in-memory
> -
>
> Key: HDFS-14476
> URL: https://issues.apache.org/jira/browse/HDFS-14476
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Affects Versions: 2.6.0, 2.7.0, 3.0.3
>Reporter: Sean Chow
>Assignee: Sean Chow
>Priority: Major
> Fix For: 3.3.0
>
> Attachments: HDFS-14476-branch-2.01.patch, 
> HDFS-14476-branch-2.02.patch, HDFS-14476.00.patch, HDFS-14476.002.patch, 
> HDFS-14476.01.patch, HDFS-14476.branch-3.2.001.patch, 
> datanode-with-patch-14476.png
>
>
> When the DirectoryScanner finds differences between on-disk and in-memory 
> blocks, it runs {{checkAndUpdate}} to fix them. However, 
> {{FsDatasetImpl.checkAndUpdate}} is a synchronized call.
> Each of my datanodes has about 6 million blocks, and every 6-hour scan finds 
> about 25000 abnormal blocks to fix. That leads to the FsDatasetImpl lock 
> being held for a long time.
> Assuming every block needs 10ms to fix (because of SAS disk latency), 25000 
> blocks take about 250 seconds to finish. That means all reads and writes on 
> that datanode are blocked for over 4 minutes.
>  
> {code:java}
> 2019-05-06 08:06:51,704 INFO 
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner: BlockPool 
> BP-1644920766-10.223.143.220-1450099987967 Total blocks: 6850197, missing 
> metadata files:23574, missing block files:23574, missing blocks in 
> memory:47625, mismatched blocks:0
> ...
> 2019-05-06 08:16:41,625 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: 
> Took 588402ms to process 1 commands from NN
> {code}
> Processing commands from the NN takes a long time because threads are 
> blocked, and the namenode will see a long lastContact time for this datanode.
> This likely affects all HDFS versions.
> *how to fix:*
> Just as invalidate commands from the namenode are processed with a 1000 
> batch size, fixing these abnormal blocks should be batched too, sleeping 2 
> seconds between batches to allow normal block reads and writes.
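The proposed batching could look roughly like this hedged sketch (BATCH_SIZE, fixOne, and datasetLock are illustrative stand-ins, not the actual FsDatasetImpl code): the lock is taken per batch of 1000 and released in between, with a 2-second sleep so other threads can acquire it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Hedged sketch of the proposed fix, not the actual FsDatasetImpl code:
// reconcile differences in batches, holding the dataset lock only per batch
// and sleeping between batches so normal reads/writes can get the lock.
public class BatchedFixSketch {
  static final int BATCH_SIZE = 1000;           // same size as invalidates
  static final long SLEEP_BETWEEN_BATCHES_MS = 2000;

  final ReentrantLock datasetLock = new ReentrantLock();
  int processed;                                // for demonstration only

  void fixOne(String blockDiff) {
    processed++;  // placeholder for the per-block checkAndUpdate work
  }

  void checkAndUpdateInBatches(List<String> diffs) throws InterruptedException {
    for (int start = 0; start < diffs.size(); start += BATCH_SIZE) {
      int end = Math.min(start + BATCH_SIZE, diffs.size());
      datasetLock.lock();                       // held for one batch only
      try {
        for (String diff : diffs.subList(start, end)) {
          fixOne(diff);
        }
      } finally {
        datasetLock.unlock();
      }
      if (end < diffs.size()) {
        Thread.sleep(SLEEP_BETWEEN_BATCHES_MS); // let other threads in
      }
    }
  }

  public static void main(String[] args) throws InterruptedException {
    BatchedFixSketch sketch = new BatchedFixSketch();
    List<String> diffs = new ArrayList<>();
    for (int i = 0; i < 500; i++) {
      diffs.add("blk_" + i);
    }
    sketch.checkAndUpdateInBatches(diffs);
    System.out.println("processed " + sketch.processed + " diffs");
  }
}
```

The trade-off is longer total scan-fix time in exchange for bounded lock hold time per batch, which keeps the datanode responsive to the NameNode and to clients.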



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



[jira] [Created] (HDFS-15124) Crashing bugs in NameNode when using a valid configuration for `dfs.namenode.audit.loggers`

2020-01-14 Thread Ctest (Jira)
Ctest created HDFS-15124:


 Summary: Crashing bugs in NameNode when using a valid 
configuration for `dfs.namenode.audit.loggers`
 Key: HDFS-15124
 URL: https://issues.apache.org/jira/browse/HDFS-15124
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: namenode
Affects Versions: 2.10.0
Reporter: Ctest


I am using Hadoop-2.10.0.

 

The configuration parameter `dfs.namenode.audit.loggers` allows `default` 
(which is the default value) and 
`org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger`.

When I use `org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger`, the 
NameNode fails to start because of an `InstantiationException` thrown from 
`org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initAuditLoggers`.

The root cause is that during NameNode initialization, `initAuditLoggers` is 
called and tries to invoke the default constructor of 
`org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger`, which doesn't 
have one. Thus the `InstantiationException` is thrown.

 

 

 

*Symptom*

$ ./start-dfs.sh

{code:java}
2019-12-18 14:05:20,670 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
java.lang.RuntimeException: java.lang.InstantiationException: org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initAuditLoggers(FSNamesystem.java:1024)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:858)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:677)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:674)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:736)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:961)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:940)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1714)
	at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1782)
Caused by: java.lang.InstantiationException: org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger
	at java.lang.Class.newInstance(Class.java:427)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initAuditLoggers(FSNamesystem.java:1017)
	... 8 more
Caused by: java.lang.NoSuchMethodException: org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger.<init>()
	at java.lang.Class.getConstructor0(Class.java:3082)
	at java.lang.Class.newInstance(Class.java:412)
	... 9 more
{code}

 

 

 

 

*Detailed Root Cause*

There is no default constructor in 
`org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger`:

{code:java}
/**
 * An {@link AuditLogger} that sends logged data directly to the metrics
 * systems. It is used when the top service is used directly by the name node
 */
@InterfaceAudience.Private
public class TopAuditLogger implements AuditLogger {
  public static final Logger LOG =
      LoggerFactory.getLogger(TopAuditLogger.class);

  private final TopMetrics topMetrics;

  public TopAuditLogger(TopMetrics topMetrics) {
    Preconditions.checkNotNull(topMetrics, "Cannot init with a null " +
        "TopMetrics");
    this.topMetrics = topMetrics;
  }

  @Override
  public void initialize(Configuration conf) {
  }
{code}

As long as the configuration parameter `dfs.namenode.audit.loggers` is set to 
`org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger`, `initAuditLoggers` 
will try to call its default constructor to make a new instance:

 

{code:java}
private List<AuditLogger> initAuditLoggers(Configuration conf) {
  // Initialize the custom access loggers if configured.
  Collection<String> alClasses =
      conf.getTrimmedStringCollection(DFS_NAMENODE_AUDIT_LOGGERS_KEY);
  List<AuditLogger> auditLoggers = Lists.newArrayList();
  if (alClasses != null && !alClasses.isEmpty()) {
    for (String className : alClasses) {
      try {
        AuditLogger logger;
        if (DFS_NAMENODE_DEFAULT_AUDIT_LOGGER_NAME.equals(className)) {
          logger = new DefaultAuditLogger();
        } else {
          logger = (AuditLogger) Class.forName(className).newInstance();
        }
        logger.initialize(conf);
        auditLoggers.add(logger);
      } catch (RuntimeException re) {
        throw re;
      } catch (Exception e) {
        throw new RuntimeException(e);
      }
    }
  }
{code}

`initAuditLoggers` tries to call the class's default constructor to make a new 
instance in:

{code:java}
logger = (AuditLogger) Class.forName(className).newInstance();
{code}

This differs from the `default` setting, which instantiates 
`DefaultAuditLogger` directly; that class does have a default constructor, so 
the default configuration works.
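A self-contained reproduction of the failure mode (illustrative code, not HDFS's): `Class.newInstance()` requires a no-arg constructor, so any class whose only constructor takes arguments fails with the same `InstantiationException` seen in the log.

```java
// Hedged, standalone reproduction (not HDFS code): NoDefaultCtor plays the
// role of TopAuditLogger, whose only constructor takes an argument.
public class NewInstanceDemo {
  public static class NoDefaultCtor {
    public NoDefaultCtor(String required) { }
  }

  // Mirrors initAuditLoggers' reflective instantiation.
  public static Object instantiate(String className) throws Exception {
    return Class.forName(className).newInstance();
  }

  public static void main(String[] args) throws Exception {
    instantiate("java.lang.StringBuilder");  // has a no-arg ctor: works
    try {
      instantiate(NoDefaultCtor.class.getName());
      System.out.println("unexpectedly succeeded");
    } catch (InstantiationException e) {
      // Same exception FSNamesystem wraps in a RuntimeException.
      System.out.println("InstantiationException: " + e.getMessage());
    }
  }
}
```

Note that `Class.newInstance()` is deprecated since Java 9 in favor of `getDeclaredConstructor().newInstance()`, which reports the missing constructor as a `NoSuchMethodException` instead.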

 

 

 

*How To Reproduce*

The version of Hadoop: 2.10.0

Set the value of the configuration parameter `dfs.namenode.audit.loggers` to 
`org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger`

Re: [DISCUSS] Hadoop 2019 Release Planning

2020-01-14 Thread Wei-Chiu Chuang
I'm curious about the fate of branch-2.8 and branch-2.9.

2.9.2 is over a year old (released on 11/19/2018)
2.8.5 is over a year old too (released on 9/10/2018)

It appears to me that most 2.x development is focused on stabilizing 2.10,
and I wonder whether people are still on 2.8/2.9.

Downstream projects like HBase would be keen to know whether to drop support
for 2.8/2.9.

On Thu, Jan 9, 2020 at 9:37 AM Steve Loughran 
wrote:

> Well volunteered! I will help with the testing
>
> On Mon, Jan 6, 2020 at 10:08 AM Gabor Bota  .invalid>
> wrote:
>
> > I'm interested in doing a release of hadoop.
> > The version we need an RM for is 3.1.3, right? What's the target date for
> that?
> >
> > Thanks,
> > Gabor
> >
> > On Mon, Jan 6, 2020 at 8:31 AM Akira Ajisaka 
> wrote:
> >
> > > Thank you Wangda.
> > >
> > > Now it's 2020. Let's release Hadoop 3.3.0.
> > > I created a wiki page for tracking blocker/critical issues for 3.3.0
> and
> > > I'll check the issues in the list.
> > > https://cwiki.apache.org/confluence/display/HADOOP/Hadoop+3.3+Release
> > > If you find blocker/critical issues in trunk, please set the target
> > > version to 3.3.0 for tracking.
> > >
> > > > We still need RM for 3.3.0 and 3.1.3.
> > > I can work as a release manager for 3.3.0. Is there anyone who wants
> > > to be an RM?
> > >
> > > Thanks and regards,
> > > Akira
> > >
> > > On Fri, Aug 16, 2019 at 9:28 PM zhankun tang 
> > > wrote:
> > >
> > > > Thanks Wangda for bringing this up!
> > > >
> > > > I ran the submarine 0.2.0 release before with a lot of help from
> folks
> > > > especially Sunil. :D
> > > > And this time I would like to help to release the 3.1.4. Thanks!
> > > >
> > > > BR,
> > > > Zhankun
> > > >
Hui Fei wrote on Fri, Aug 16, 2019 at 7:19 PM:
> > > >
> > > > > Hi Wangda,
> > > > > Thanks for bringing this up!
> > > > > Looking forward to seeing HDFS 3.x widely used, but RollingUpgrade
> > > > > is a problem.
> > > > > I hope committers watch and review these issues. Thanks!
> > > > > https://issues.apache.org/jira/browse/HDFS-13596
> > > > > https://issues.apache.org/jira/browse/HDFS-14396
> > > > >
> > > > > Wangda Tan wrote on Sat, Aug 10, 2019 at 10:59 AM:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > Hope this email finds you well
> > > > > >
> > > > > > I want to hear your thoughts about what the release plan for 2019
> > > > > > should be.
> > > > > >
> > > > > > In 2018, we released:
> > > > > > - 1 maintenance release of 2.6
> > > > > > - 3 maintenance releases of 2.7
> > > > > > - 3 maintenance releases of 2.8
> > > > > > - 3 releases of 2.9
> > > > > > - 4 releases of 3.0
> > > > > > - 2 releases of 3.1
> > > > > >
> > > > > > Total 16 releases in 2018.
> > > > > >
> > > > > > In 2019, by far we only have two releases:
> > > > > > - 1 maintenance release of 3.1
> > > > > > - 1 minor release of 3.2.
> > > > > >
> > > > > > However, the community put a lot of effort into stabilizing
> > > > > > features of various release branches.
> > > > > > There're:
> > > > > > - 217 fixed patches in 3.1.3 [1]
> > > > > > - 388 fixed patches in 3.2.1 [2]
> > > > > > - 1172 fixed patches in 3.3.0 [3] (OMG!)
> > > > > >
> > > > > > I think it is time to do maintenance releases of 3.1/3.2 and do a
> > > > > > minor release for 3.3.0.
> > > > > >
> > > > > > In addition, I saw community discussion about doing a 2.8.6
> > > > > > release for security fixes.
> > > > > >
> > > > > > Any other releases? I think there are release plans for Ozone as
> > > > > > well. And please add your thoughts.
> > > > > >
> > > > > > Volunteers welcome! If you are interested in running a release as
> > > > > > Release Manager (or co-Release Manager), please respond to this
> > > > > > email thread so we can coordinate.
> > > > > >
> > > > > > Thanks,
> > > > > > Wangda Tan
> > > > > >
> > > > > > [1] project in (YARN, HADOOP, MAPREDUCE, HDFS) AND resolution =
> > > > > > Fixed AND fixVersion = 3.1.3
> > > > > > [2] project in (YARN, HADOOP, MAPREDUCE, HDFS) AND resolution =
> > > > > > Fixed AND fixVersion = 3.2.1
> > > > > > [3] project in (YARN, HADOOP, MAPREDUCE, HDFS) AND resolution =
> > > > > > Fixed AND fixVersion = 3.3.0
> > > > > >
> > > > >
> > > >
> > >
> >
>


[jira] [Resolved] (HDFS-15070) Crashing bugs in NameNode when using a valid configuration for `dfs.namenode.audit.loggers`

2020-01-14 Thread Xudong Sun (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-15070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xudong Sun resolved HDFS-15070.
---
Resolution: Duplicate

> Crashing bugs in NameNode when using a valid configuration for 
> `dfs.namenode.audit.loggers`
> ---
>
> Key: HDFS-15070
> URL: https://issues.apache.org/jira/browse/HDFS-15070
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode
>Affects Versions: 2.10.0
>Reporter: Xudong Sun
>Priority: Critical
>
> I am using Hadoop-2.10.0.
> The configuration parameter `dfs.namenode.audit.loggers` allows `default` 
> (which is the default value) and 
> `org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger`.
> When I use `org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger`, 
> namenode will not be started successfully because of an 
> `InstantiationException` thrown from 
> `org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initAuditLoggers`. 
> The root cause is that during NameNode initialization, `initAuditLoggers` 
> is called and tries to invoke the default constructor of 
> `org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger`, which doesn't 
> have one. Thus the `InstantiationException` is thrown.
>  
> *Symptom*
> *$ ./start-dfs.sh*
>  
> {code:java}
> 2019-12-18 14:05:20,670 ERROR org.apache.hadoop.hdfs.server.namenode.FSNamesystem: FSNamesystem initialization failed.
> java.lang.RuntimeException: java.lang.InstantiationException: org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initAuditLoggers(FSNamesystem.java:1024)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.<init>(FSNamesystem.java:858)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:677)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:674)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:736)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:961)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:940)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1714)
> 	at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1782)
> Caused by: java.lang.InstantiationException: org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger
> 	at java.lang.Class.newInstance(Class.java:427)
> 	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.initAuditLoggers(FSNamesystem.java:1017)
> 	... 8 more
> Caused by: java.lang.NoSuchMethodException: org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger.<init>()
> 	at java.lang.Class.getConstructor0(Class.java:3082)
> 	at java.lang.Class.newInstance(Class.java:412)
> 	... 9 more
> {code}
>  
>  
> *Detailed Root Cause*
> There is no default constructor in 
> `org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger`:
> {code:java}
> /** 
>  * An {@link AuditLogger} that sends logged data directly to the metrics 
>  * systems. It is used when the top service is used directly by the name node 
>  */ 
> @InterfaceAudience.Private 
> public class TopAuditLogger implements AuditLogger { 
>   public static final Logger LOG = 
> LoggerFactory.getLogger(TopAuditLogger.class); 
>   private final TopMetrics topMetrics; 
>   public TopAuditLogger(TopMetrics topMetrics) {
> Preconditions.checkNotNull(topMetrics, "Cannot init with a null " + 
> "TopMetrics");
> this.topMetrics = topMetrics; 
>   }
>   @Override
>   public void initialize(Configuration conf) { 
>   }{code}
> As long as the configuration parameter `dfs.namenode.audit.loggers` is set to 
> `org.apache.hadoop.hdfs.server.namenode.top.TopAuditLogger`, 
> `initAuditLoggers` will try to call its default constructor to make a new 
> instance:
> {code:java}
> private List<AuditLogger> initAuditLoggers(Configuration conf) {
>   // Initialize the custom access loggers if configured.
>   Collection<String> alClasses =
>   conf.getTrimmedStringCollection(DFS_NAMENODE_AUDIT_LOGGERS_KEY);
>   List<AuditLogger> auditLoggers = Lists.newArrayList();
>   if (alClasses != null && !alClasses.isEmpty()) {
> for (String className : alClasses) {
>   try {
> AuditLogger logger;
> if (DFS_NAMENODE_DEFAULT_AUDIT_LOGGER_NAME.equals(className)) {
>   logger = new DefaultAuditLogger();
> } else {
>   logger = (AuditLogger) Class.forName(className).newInstance();
> }
> logger.initialize(conf);
> auditLoggers.add(logger);
>   } catch (RuntimeException re) {
> throw re;
>   } catch (Exception e) {
> throw new RuntimeException(e);
>   }
> }
>   }{code}
> `initAuditLoggers` tries to call t

[jira] [Resolved] (HDFS-14126) DataNode DirectoryScanner holding global lock for too long

2020-01-14 Thread Wei-Chiu Chuang (Jira)


 [ 
https://issues.apache.org/jira/browse/HDFS-14126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Wei-Chiu Chuang resolved HDFS-14126.

Resolution: Duplicate

I believe HDFS-14476 solves the same issue and so I'll mark this as a duplicate.

> DataNode DirectoryScanner holding global lock for too long
> --
>
> Key: HDFS-14126
> URL: https://issues.apache.org/jira/browse/HDFS-14126
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Wei-Chiu Chuang
>Priority: Major
>
> I've got a Hadoop 3 based cluster set up, and this DN has just 434 thousand 
> blocks.
> And yet, DirectoryScanner holds the fsdataset lock for 2.7 seconds:
> {quote}
> 2018-12-03 21:33:09,130 INFO 
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner: BlockPool 
> BP-4588049-10.17.XXX-XX-281857726 Total blocks: 434401, missing metadata 
> fi
> les:0, missing block files:0, missing blocks in memory:0, mismatched blocks:0
> 2018-12-03 21:33:09,131 WARN 
> org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetImpl: Lock 
> held time above threshold: lock identifier: org.apache.hadoop.hdfs.serve
> r.datanode.fsdataset.impl.FsDatasetImpl lockHeldTimeMs=2710 ms. Suppressed 0 
> lock warnings. The stack trace is: 
> java.lang.Thread.getStackTrace(Thread.java:1559)
> org.apache.hadoop.util.StringUtils.getStackTrace(StringUtils.java:1032)
> org.apache.hadoop.util.InstrumentedLock.logWarning(InstrumentedLock.java:148)
> org.apache.hadoop.util.InstrumentedLock.check(InstrumentedLock.java:186)
> org.apache.hadoop.util.InstrumentedLock.unlock(InstrumentedLock.java:133)
> org.apache.hadoop.util.AutoCloseableLock.release(AutoCloseableLock.java:84)
> org.apache.hadoop.util.AutoCloseableLock.close(AutoCloseableLock.java:96)
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.scan(DirectoryScanner.java:473)
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile(DirectoryScanner.java:373)
> org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.run(DirectoryScanner.java:318)
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> java.lang.Thread.run(Thread.java:748)
> {quote}
> Log messages like this repeat every several hours (6, to be exact). I am not 
> sure if this is a performance regression, or just the fact that the lock 
> information is printed in Hadoop 3. [~vagarychen] or [~templedf] do you know?
> There's no log in DN to indicate any sort of JVM GC going on. Plus, the DN's 
> heap size is set to several GB.
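The "Lock held time above threshold" warning comes from Hadoop's InstrumentedLock, which times each hold and warns past a threshold. A hedged sketch of that idea (names like HoldTimeLock and warnThresholdMs are illustrative, not Hadoop's actual API):

```java
import java.util.concurrent.locks.ReentrantLock;

// Hedged sketch of the instrumented-lock idea behind the warning above:
// time every hold of the lock and log a warning when the hold exceeds a
// threshold. Names here are illustrative, not Hadoop's.
public class HoldTimeLock implements AutoCloseable {
  private final ReentrantLock lock = new ReentrantLock();
  private final long warnThresholdMs;
  private long acquiredAtNanos;
  long lastHoldMs = -1;  // exposed so callers can inspect the last hold time

  public HoldTimeLock(long warnThresholdMs) {
    this.warnThresholdMs = warnThresholdMs;
  }

  public HoldTimeLock acquire() {
    lock.lock();
    acquiredAtNanos = System.nanoTime();
    return this;
  }

  @Override
  public void close() {
    lastHoldMs = (System.nanoTime() - acquiredAtNanos) / 1_000_000;
    lock.unlock();
    if (lastHoldMs > warnThresholdMs) {
      System.err.println("Lock held time above threshold: " + lastHoldMs
          + " ms");
    }
  }

  public static void main(String[] args) throws InterruptedException {
    HoldTimeLock dataset = new HoldTimeLock(100);
    try (HoldTimeLock held = dataset.acquire()) {
      Thread.sleep(150);  // simulate a slow DirectoryScanner reconcile
    }
  }
}
```

The try-with-resources pattern mirrors Hadoop's AutoCloseableLock seen in the stack trace: the hold time is measured exactly once per acquire/release pair, so the warning reflects a single critical section.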



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org



Apache Hadoop qbt Report: trunk+JDK8 on Linux/x86

2020-01-14 Thread Apache Jenkins Server
For more details, see 
https://builds.apache.org/job/hadoop-qbt-trunk-java8-linux-x86/1381/

[Jan 13, 2020 5:10:29 AM] (github) HADOOP-16797. Add Dockerfile for ARM builds. 
Contributed by Vinayakumar
[Jan 13, 2020 3:50:07 PM] (snemeth) YARN-9989. Typo in CapacityScheduler 
documentation: Runtime
[Jan 13, 2020 4:15:09 PM] (snemeth) YARN-9868. Validate %primary_group queue in 
CS queue manager.
[Jan 13, 2020 4:23:00 PM] (snemeth) YARN-9912. Capacity scheduler: support 
u:user2:%secondary_group queue
[Jan 13, 2020 6:48:53 PM] (weichiu) HDFS-15097. Purge log in KMS and HttpFS. 
Contributed by Doris Gu.
[Jan 14, 2020 10:00:08 AM] (snemeth) YARN-10028. Integrate the new abstract log 
servlet to the JobHistory
[Jan 14, 2020 11:26:03 AM] (snemeth) YARN-9788. Queue Management API does not 
support parallel updates.
[Jan 15, 2020 1:28:37 AM] (dazhou) HADOOP-16005. NativeAzureFileSystem does not 
support setXAttr.

-
To unsubscribe, e-mail: hdfs-dev-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-h...@hadoop.apache.org

Re: [DISCUSS] Removal of old hamlet package in 3.3+ or 3.4+

2020-01-14 Thread Akira Ajisaka
Hi folks,

Now I have a strong reason to remove the old hamlet package.

There are 1000+ javadoc warnings in the package, and they flood the output of
the precommit javadoc module.
That's why new warnings and errors are ignored, and sometimes they cause
build errors.

I'm investigating in https://issues.apache.org/jira/browse/HADOOP-16802 why
the precommit job ignores new javadoc warnings/errors. I'd like to remove
the package to make that investigation easier.

Regards,
Akira

On Thu, Feb 14, 2019 at 3:18 PM Akira Ajisaka  wrote:

> Thanks Masatake for your comments.
> Added the other Hadoop mailing lists to Cc.
>
> > I'm +1 on making an incompatible change if this blocks other Java
> > migration issues, though I don't see a strong reason to hurry, judging by
> > the patch for YARN-9279.
> Agreed.
>
> -Akira
>
> On Sun, Feb 10, 2019 at 2:00 AM Masatake Iwasaki
>  wrote:
> >
> > Thanks for working on this, Akira.
> >
> >  > The only usage I can see is Apache Slider, however, the
> >  > functionalities of Apache Slider have been merged into YARN.
> >
> > Do we have mailing lists other than yarn-dev to reach downstream
> > developers?
> > It would be better to be confident that the old hamlet package of Hadoop 3
> > is used nowhere.
> >
> > I'm +1 on making an incompatible change if this blocks other Java
> > migration issues,
> > though I don't see a strong reason to hurry, judging by the patch for
> > YARN-9279.
> >
> > Masatake Iwasaki
> >
> > On 2/3/19 18:10, Akira Ajisaka wrote:
> > > Filed https://issues.apache.org/jira/browse/YARN-9279 to remove the
> > > old hamlet package.
> > >
> > > -Akira
> > >
> > > On Mon, Jan 21, 2019 at 13:08, Akira Ajisaka wrote:
> > >> Hi folks,
> > >>
> > >> I'd like to remove the deprecated hamlet package to reduce the
> maintenance cost.
> > >>
> > >> The old hamlet package has one character '_' and it is banned in Java
> > >> 9+, so HADOOP-11875 deprecated this package and created a profile in
> > >> pom.xml not to compile the package when the Java version is 9+. After
> > >> the deprecation, we still have to maintenance the profile (see
> > >> YARN-8123 and HADOOP-16046).
> > >>
> > >> The only usage I can see is Apache Slider; however, the
> > >> functionality of Apache Slider has been merged into YARN. Therefore
> > >> I think no one is using Slider with Hadoop 3.1+, and we can
> > >> remove the package in 3.3+.
> > >>
> > >> Any thoughts?
> > >>
> > >> Regards,
> > >> Akira
> > > -
> > > To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
> > > For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
> > >
> >
> >
> > -
> > To unsubscribe, e-mail: yarn-dev-unsubscr...@hadoop.apache.org
> > For additional commands, e-mail: yarn-dev-h...@hadoop.apache.org
> >
>