I have also bumped into flaky Druid failures, although with different symptoms
(Wait time exhausted and we have [1] out of [1] segments not loaded yet).

All of the Druid failures in this case occur during CREATE TABLE, with the
following error stack (not sure why this is intermittent):

org.apache.hadoop.security.ShellBasedUnixGroupsMapping$PartialGroupNameException: The user name 'hive_test_user' is not found. id: hive_test_user: no such user
id: hive_test_user: no such user

at org.apache.hadoop.security.ShellBasedUnixGroupsMapping.resolvePartialGroupNames(ShellBasedUnixGroupsMapping.java:294) ~[hadoop-common-3.1.0.jar:?]
at org.apache.hadoop.security.ShellBasedUnixGroupsMapping.getUnixGroups(ShellBasedUnixGroupsMapping.java:207) [hadoop-common-3.1.0.jar:?]
at org.apache.hadoop.security.ShellBasedUnixGroupsMapping.getGroups(ShellBasedUnixGroupsMapping.java:97) [hadoop-common-3.1.0.jar:?]
at org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback.getGroups(JniBasedUnixGroupsMappingWithFallback.java:51) [hadoop-common-3.1.0.jar:?]
at org.apache.hadoop.security.Groups$GroupCacheLoader.fetchGroupList(Groups.java:384) [hadoop-common-3.1.0.jar:?]
at org.apache.hadoop.security.Groups$GroupCacheLoader.load(Groups.java:319) [hadoop-common-3.1.0.jar:?]
at org.apache.hadoop.security.Groups$GroupCacheLoader.load(Groups.java:269) [hadoop-common-3.1.0.jar:?]
at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3542) [guava-19.0.jar:?]
at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2323) [guava-19.0.jar:?]
at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2286) [guava-19.0.jar:?]
at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2201) [guava-19.0.jar:?]
at com.google.common.cache.LocalCache.get(LocalCache.java:3953) [guava-19.0.jar:?]
at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3957) [guava-19.0.jar:?]
at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4875) [guava-19.0.jar:?]
at org.apache.hadoop.security.Groups.getGroups(Groups.java:227) [hadoop-common-3.1.0.jar:?]
at org.apache.hadoop.security.UserGroupInformation.getGroups(UserGroupInformation.java:1540) [hadoop-common-3.1.0.jar:?]
at org.apache.hadoop.hive.ql.exec.tez.TezTask.execute(TezTask.java:164) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:205) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:97) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:2689) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:2341) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:2012) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1712) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1706) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:157) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.reexec.ReExecDriver.run(ReExecDriver.java:218) [hive-exec-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.hadoop.hive.cli.CliDriver.processLocalCmd(CliDriver.java:239) [hive-cli-4.0.0-SNAPSHOT.jar:?]
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:188) [hive-cli-4.0.0-SNAPSHOT.jar:?]
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:402) [hive-cli-4.0.0-SNAPSHOT.jar:?]
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:335) [hive-cli-4.0.0-SNAPSHOT.jar:?]
at org.apache.hadoop.hive.ql.QTestUtil.executeClientInternal(QTestUtil.java:1340) [hive-it-util-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.hadoop.hive.ql.QTestUtil.executeClient(QTestUtil.java:1314) [hive-it-util-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.hadoop.hive.cli.control.CoreCliDriver.runTest(CoreCliDriver.java:171) [hive-it-util-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.hadoop.hive.cli.control.CliAdapter.runTest(CliAdapter.java:104) [hive-it-util-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.apache.hadoop.hive.cli.TestMiniDruidCliDriver.testCliDriver(TestMiniDruidCliDriver.java:59) [test-classes/:?]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_102]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_102]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_102]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_102]
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:47) [junit-4.11.jar:?]
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) [junit-4.11.jar:?]
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:44) [junit-4.11.jar:?]
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) [junit-4.11.jar:?]
at org.apache.hadoop.hive.cli.control.CliAdapter$2$1.evaluate(CliAdapter.java:92) [hive-it-util-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.junit.rules.RunRules.evaluate(RunRules.java:20) [junit-4.11.jar:?]
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:271) [junit-4.11.jar:?]
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:70) [junit-4.11.jar:?]
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:50) [junit-4.11.jar:?]
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) [junit-4.11.jar:?]
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) [junit-4.11.jar:?]
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) [junit-4.11.jar:?]
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) [junit-4.11.jar:?]
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) [junit-4.11.jar:?]
at org.junit.runners.ParentRunner.run(ParentRunner.java:309) [junit-4.11.jar:?]
at org.junit.runners.Suite.runChild(Suite.java:127) [junit-4.11.jar:?]
at org.junit.runners.Suite.runChild(Suite.java:26) [junit-4.11.jar:?]
at org.junit.runners.ParentRunner$3.run(ParentRunner.java:238) [junit-4.11.jar:?]
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:63) [junit-4.11.jar:?]
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:236) [junit-4.11.jar:?]
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:53) [junit-4.11.jar:?]
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:229) [junit-4.11.jar:?]
at org.apache.hadoop.hive.cli.control.CliAdapter$1$1.evaluate(CliAdapter.java:73) [hive-it-util-4.0.0-SNAPSHOT.jar:4.0.0-SNAPSHOT]
at org.junit.rules.RunRules.evaluate(RunRules.java:20) [junit-4.11.jar:?]
at org.junit.runners.ParentRunner.run(ParentRunner.java:309) [junit-4.11.jar:?]
at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:365) [surefire-junit4-2.21.0.jar:2.21.0]
at org.apache.maven.surefire.junit4.JUnit4Provider.executeWithRerun(JUnit4Provider.java:273) [surefire-junit4-2.21.0.jar:2.21.0]
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:238) [surefire-junit4-2.21.0.jar:2.21.0]
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:159) [surefire-junit4-2.21.0.jar:2.21.0]
at org.apache.maven.surefire.booter.ForkedBooter.invokeProviderInSameClassLoader(ForkedBooter.java:379) [surefire-booter-2.21.0.jar:2.21.0]
at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:340) [surefire-booter-2.21.0.jar:2.21.0]
at org.apache.maven.surefire.booter.ForkedBooter.execute(ForkedBooter.java:125) [surefire-booter-2.21.0.jar:2.21.0]
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:413) [surefire-booter-2.21.0.jar:2.21.0]
2018-06-27T22:24:51,363  INFO [d00e737b-0dde-4230-ae29-d20498bf8332 main] ql.Context: New scratch dir is hdfs://localhost:37593/home/hiveptest


Vineet

On Jun 27, 2018, at 11:47 PM, Deepak Jaiswal <djais...@hortonworks.com> wrote:

Ptests have become really unstable.

The Druid tests are failing randomly:
https://builds.apache.org/job/PreCommit-HIVE-Build/12203/testReport

Should we disable them?

Deepak

On 6/27/18, 10:13 AM, "Deepak Jaiswal" <djais...@hortonworks.com> wrote:

   Hi All,

   It seems we are going back to instability in Hive QA runs. In the past few
days I saw many runs where the failures were completely independent. When those
tests are run locally, they don’t fail, which makes them harder to catch.

   On one hand, I think requiring a green run to commit makes sense; on the
other hand, development is unnecessarily blocked. Putting the randomly
failing tests in the disabled list is also not a good idea, as it brings down
the code coverage.
   Any suggestions?

   Regards,
   Deepak


