Some of those tests stress conditions that take a lot of resources to 
reproduce. Have you tried running the failing tests individually, so that they 
are not competing for resources? Do they always fail, or are the failures 
transient?
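
One way to tell "always fails" from "transient" is to repeat the isolated run a few times and count the failures. The following is only a rough sketch: `repeat_run` is a hypothetical helper name, not part of Accumulo or Maven, and the run count of 5 in the usage comment is arbitrary.

```shell
#!/usr/bin/env bash
# repeat_run is a hypothetical helper (not part of Accumulo or Maven).
# It runs the given command N times, keeps one log file per run in a
# temporary directory, and reports how many runs exited non-zero.
# Usage: repeat_run <runs> <command> [args...]
repeat_run() {
  local runs=$1
  shift
  local failures=0
  local log_dir
  log_dir=$(mktemp -d)
  local i
  for ((i = 1; i <= runs; i++)); do
    # Capture stdout and stderr of each run separately for later comparison.
    if ! "$@" >"$log_dir/run-$i.log" 2>&1; then
      failures=$((failures + 1))
    fi
  done
  echo "$failures of $runs runs failed"
  echo "logs in $log_dir"
}

# Example, using the Maven command quoted in the original message:
# repeat_run 5 mvn verify -Dit.test=ConcurrentDeleteTableIT -o -rf :accumulo-test
```

If every run fails, that points away from simple resource contention; a mix of passes and failures would suggest a timing-sensitive test.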

-----Original Message-----
From: Mark Jens <mark.r.j...@gmail.com> 
Sent: Tuesday, November 30, 2021 4:05 AM
To: dev@accumulo.apache.org
Subject: Consistent IT tests failures on Linux ARM64

Hello Accumulo community,

At my job we are considering using Linux ARM64 servers, and I've been tasked 
with testing Accumulo.

I am facing timeout-related issues with several IT tests:


[ERROR] org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete  Time elapsed: 420.122 s  <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 420 seconds
at java.base@11.0.11/jdk.internal.misc.Unsafe.park(Native Method)
at java.base@11.0.11/java.util.concurrent.locks.LockSupport.park(LockSupport.java:194)
at java.base@11.0.11/java.util.concurrent.FutureTask.awaitDone(FutureTask.java:447)
at java.base@11.0.11/java.util.concurrent.FutureTask.get(FutureTask.java:190)
at app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete(ConcurrentDeleteTableIT.java:213)
at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base@11.0.11/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base@11.0.11/java.lang.reflect.Method.invoke(Method.java:566)
at app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
at app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
at java.base@11.0.11/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)

[ERROR] org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete  Time elapsed: 420.122 s  <<< ERROR!
java.lang.Exception: Appears to be stuck in thread Time-limited test-SendThread(localhost:44251)
at java.base@11.0.11/sun.nio.ch.EPoll.wait(Native Method)
at java.base@11.0.11/sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
at java.base@11.0.11/sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
at java.base@11.0.11/sun.nio.ch.SelectorImpl.select(SelectorImpl.java:136)
at app//org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:347)
at app//org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)

[ERROR] org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps  Time elapsed: 420.011 s  <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 420 seconds
at java.base@11.0.11/java.lang.Thread.sleep(Native Method)
at app//org.apache.accumulo.fate.zookeeper.ZooCache$ZooRunnable.retry(ZooCache.java:299)
at app//org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:442)
at app//org.apache.accumulo.fate.zookeeper.ZooCache.get(ZooCache.java:372)
at app//org.apache.accumulo.core.clientImpl.ClientContext.verifyInstanceId(ClientContext.java:467)
at app//org.apache.accumulo.core.clientImpl.ClientContext.getInstanceID(ClientContext.java:446)
at app//org.apache.accumulo.core.clientImpl.ClientContext.getManagerLocations(ClientContext.java:405)
at app//org.apache.accumulo.core.clientImpl.ManagerClient.getConnection(ManagerClient.java:59)
at app//org.apache.accumulo.core.clientImpl.ManagerClient.getConnectionWithRetry(ManagerClient.java:49)
at app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.beginFateOperation(TableOperationsImpl.java:260)
at app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:369)
at app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doFateOperation(TableOperationsImpl.java:359)
at app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.doTableFateOperation(TableOperationsImpl.java:1670)
at app//org.apache.accumulo.core.clientImpl.TableOperationsImpl.create(TableOperationsImpl.java:248)
at app//org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps(ConcurrentDeleteTableIT.java:76)
at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base@11.0.11/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base@11.0.11/java.lang.reflect.Method.invoke(Method.java:566)
at app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
at app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
at java.base@11.0.11/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)

[INFO] Running org.apache.accumulo.test.functional.ScannerContextIT
[INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
102.909 s - in org.apache.accumulo.test.functional.ScannerContextIT
[INFO] Running org.apache.accumulo.test.functional.KerberosRenewalIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
504.472 s - in org.apache.accumulo.test.functional.KerberosRenewalIT
[INFO] Running org.apache.accumulo.test.functional.BatchWriterFlushIT
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
62.132 s - in org.apache.accumulo.test.functional.BatchWriterFlushIT
[INFO] Running org.apache.accumulo.test.functional.BinaryIT
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
65.034 s - in org.apache.accumulo.test.functional.BinaryIT
[INFO] Running org.apache.accumulo.test.functional.PermissionsIT
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
59.25 s - in org.apache.accumulo.test.functional.PermissionsIT
[INFO] Running org.apache.accumulo.test.functional.ZookeeperRestartIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
37.37 s - in org.apache.accumulo.test.functional.ZookeeperRestartIT
[INFO] Running org.apache.accumulo.test.functional.CreateManyScannersIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
23.046 s - in org.apache.accumulo.test.functional.CreateManyScannersIT
[INFO] Running org.apache.accumulo.test.functional.CreateInitialSplitsIT
[INFO] Tests run: 7, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
255.108 s - in org.apache.accumulo.test.functional.CreateInitialSplitsIT
[INFO] Running org.apache.accumulo.test.functional.MonitorSslIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
25.304 s - in org.apache.accumulo.test.functional.MonitorSslIT
[INFO] Running org.apache.accumulo.test.functional.RestartStressIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
78.359 s - in org.apache.accumulo.test.functional.RestartStressIT
[INFO] Running org.apache.accumulo.test.functional.BulkSplitOptimizationIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
59.289 s - in org.apache.accumulo.test.functional.BulkSplitOptimizationIT
[INFO] Running org.apache.accumulo.test.functional.BulkNewIT
[INFO] Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
63.696 s - in org.apache.accumulo.test.functional.BulkNewIT
[INFO] Running org.apache.accumulo.test.functional.BloomFilterIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
135.298 s - in org.apache.accumulo.test.functional.BloomFilterIT
[INFO] Running org.apache.accumulo.test.functional.BulkIT
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
122.959 s - in org.apache.accumulo.test.functional.BulkIT
[INFO] Running org.apache.accumulo.test.functional.BinaryStressIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
38.626 s - in org.apache.accumulo.test.functional.BinaryStressIT
[INFO] Running org.apache.accumulo.test.functional.ClassLoaderIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
45.61 s - in org.apache.accumulo.test.functional.ClassLoaderIT
[INFO] Running org.apache.accumulo.test.functional.LogicalTimeIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
116.819 s - in org.apache.accumulo.test.functional.LogicalTimeIT
[INFO] Running org.apache.accumulo.test.functional.SplitRecoveryIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
25.421 s - in org.apache.accumulo.test.functional.SplitRecoveryIT
[INFO] Running org.apache.accumulo.test.functional.BigRootTabletIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
96.86 s - in org.apache.accumulo.test.functional.BigRootTabletIT
[INFO] Running org.apache.accumulo.test.functional.GarbageCollectorIT
[INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
238.409 s - in org.apache.accumulo.test.functional.GarbageCollectorIT
[INFO] Running
org.apache.accumulo.test.functional.BalanceInPresenceOfOfflineTableIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
219.253 s - in
org.apache.accumulo.test.functional.BalanceInPresenceOfOfflineTableIT
[INFO] Running org.apache.accumulo.test.functional.VisibilityIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
38.015 s - in org.apache.accumulo.test.functional.VisibilityIT
[INFO] Running org.apache.accumulo.test.functional.SslWithClientAuthIT
[INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
489.863 s - in org.apache.accumulo.test.functional.SslWithClientAuthIT
[INFO] Running org.apache.accumulo.test.functional.SummaryIT
[INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
111.552 s - in org.apache.accumulo.test.functional.SummaryIT
[INFO] Running org.apache.accumulo.test.functional.MaxOpenIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
30.061 s - in org.apache.accumulo.test.functional.MaxOpenIT
[INFO] Running org.apache.accumulo.test.functional.ManagerFailoverIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
47.089 s - in org.apache.accumulo.test.functional.ManagerFailoverIT
[INFO] Running org.apache.accumulo.test.functional.DeleteRowsIT
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
229.586 s - in org.apache.accumulo.test.functional.DeleteRowsIT
[INFO] Running org.apache.accumulo.test.functional.BackupManagerIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
22.943 s - in org.apache.accumulo.test.functional.BackupManagerIT
[INFO] Running org.apache.accumulo.test.functional.TabletMetadataIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
46.728 s - in org.apache.accumulo.test.functional.TabletMetadataIT
[INFO] Running org.apache.accumulo.test.functional.LateLastContactIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
46.648 s - in org.apache.accumulo.test.functional.LateLastContactIT
[INFO] Running org.apache.accumulo.test.functional.SimpleBalancerFairnessIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
71.934 s - in org.apache.accumulo.test.functional.SimpleBalancerFairnessIT
[INFO] Running org.apache.accumulo.test.functional.HalfDeadTServerIT
[ERROR] Tests run: 3, Failures: 0, Errors: 2, Skipped: 0, Time elapsed:
307.904 s <<< FAILURE! - in
org.apache.accumulo.test.functional.HalfDeadTServerIT
[ERROR] org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover  Time elapsed: 240.011 s  <<< ERROR!
org.junit.runners.model.TestTimedOutException: test timed out after 240 seconds
at java.base@11.0.11/java.lang.Object.wait(Native Method)
at java.base@11.0.11/java.lang.Object.wait(Object.java:328)
at java.base@11.0.11/java.lang.ProcessImpl.waitFor(ProcessImpl.java:495)
at app//org.apache.accumulo.test.functional.HalfDeadTServerIT.test(HalfDeadTServerIT.java:217)
at app//org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover(HalfDeadTServerIT.java:142)
at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at java.base@11.0.11/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at java.base@11.0.11/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.base@11.0.11/java.lang.reflect.Method.invoke(Method.java:566)
at app//org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at app//org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at app//org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at app//org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at app//org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at app//org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
at app//org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
at java.base@11.0.11/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base@11.0.11/java.lang.Thread.run(Thread.java:829)

[ERROR] org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover  Time elapsed: 240.012 s  <<< ERROR!
java.lang.Exception: Appears to be stuck in thread Time-limited test-SendThread(localhost:39285)
at java.base@11.0.11/sun.nio.ch.EPoll.wait(Native Method)
at java.base@11.0.11/sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:120)
at java.base@11.0.11/sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:124)
at java.base@11.0.11/sun.nio.ch.SelectorImpl.select(SelectorImpl.java:136)
at app//org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:347)
at app//org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)

[INFO] Running org.apache.accumulo.test.functional.MetadataIT
[INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
97.987 s - in org.apache.accumulo.test.functional.MetadataIT
[INFO] Running org.apache.accumulo.test.functional.ScanSessionTimeOutIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
43.91 s - in org.apache.accumulo.test.functional.ScanSessionTimeOutIT
[INFO] Running org.apache.accumulo.test.functional.ZooCacheIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
33.986 s - in org.apache.accumulo.test.functional.ZooCacheIT
[INFO] Running org.apache.accumulo.test.functional.DeleteRowsSplitIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
113.928 s - in org.apache.accumulo.test.functional.DeleteRowsSplitIT
[INFO] Running org.apache.accumulo.test.ScanFlushWithTimeIT
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
36.854 s - in org.apache.accumulo.test.ScanFlushWithTimeIT
[INFO] Running org.apache.accumulo.test.AuditMessageIT
[INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed:
165.169 s - in org.apache.accumulo.test.AuditMessageIT
[INFO] Running
org.apache.accumulo.test.gc.replication.CloseWriteAheadLogReferencesIT
[WARNING] Tests run: 1, Failures: 0, Errors: 0, Skipped: 1, Time elapsed:
0.039 s - in
org.apache.accumulo.test.gc.replication.CloseWriteAheadLogReferencesIT
[INFO]
[INFO] Results:
[INFO]
[ERROR] Errors:
[ERROR]
org.apache.accumulo.test.compaction.ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction
[ERROR]   Run 1:
ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction:178 » 
TestTimedOut
[ERROR]   Run 2:
ExternalCompaction_3_IT.testCoordinatorRestartsDuringCompaction »  Appears to 
...
[INFO]
[ERROR]   ConcurrentDeleteTableIT.testConcurrentDeleteTablesOps:76 »
TestTimedOut test t...
[ERROR]
org.apache.accumulo.test.functional.ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete
[ERROR]   Run 1:
ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete:213 » TestTimedOut 
tes...
[ERROR]   Run 2: ConcurrentDeleteTableIT.testConcurrentFateOpsWithDelete »
 Appears to be stuck...
[INFO]
[ERROR] org.apache.accumulo.test.functional.HalfDeadTServerIT.testRecover
[ERROR]   Run 1:
HalfDeadTServerIT.testRecover:142->test:217->Object.wait:328->Object.wait:-2
» TestTimedOut
[ERROR]   Run 2: HalfDeadTServerIT.testRecover »  Appears to be stuck in
thread Time-limited te...
[INFO]
[ERROR] org.apache.accumulo.test.functional.SslIT.adminStop
[ERROR]   Run 1: SslIT.adminStop:68->Object.wait:328->Object.wait:-2 »
TestTimedOut test timed ...
[ERROR]   Run 2: SslIT.adminStop »  Appears to be stuck in thread
Time-limited test-SendThread(...

These tests fail consistently at every build attempt!

The tests fail even when executed separately, e.g.:
mvn verify -Dit.test=ConcurrentDeleteTableIT -o -rf :accumulo-test


I am using the current 'main' branch of Accumulo.
JDK 11.0.11
Maven: 3.8.2
OS: Ubuntu 20.04.3 ARM64

Is there anything that could be done to fix these problems?
For example, some config settings?

P.S. At https://github.com/apache/accumulo/issues/1884 I read that Linux
ARM64 is a supported platform since the JVM supports it.

Thanks!

Mark
