Thanks for following up here with the solution and the steps for accessing the logs!
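
For anyone finding this thread in the archive later: below is a rough sketch of how a per-package test log like the one described in the quoted message could be produced. It is not the code from the merged pull request. It assumes Gradle's TestOutputListener API, and the header layout simply mimics the sample log quoted below; the directory and file naming follow the <ROOT>/build/testlogs/<PACKAGE_NAME>.log pattern from the same message.

```groovy
// build.gradle (sketch, not the merged change): write each project's
// test stdout/stderr into <ROOT>/build/testlogs/<project name>.log
import org.gradle.api.tasks.testing.TestDescriptor
import org.gradle.api.tasks.testing.TestOutputEvent
import org.gradle.api.tasks.testing.TestOutputListener

test {
  def logDir = new File(rootDir, 'build/testlogs')
  def logFile = new File(logDir, "${project.name}.log")

  doFirst {
    logDir.mkdirs()
    logFile.delete()  // start from an empty log on every run
  }

  addTestOutputListener(new TestOutputListener() {
    // Remembers which test produced the previous chunk of output
    def lastDescriptor

    @Override
    void onOutput(TestDescriptor descriptor, TestOutputEvent event) {
      // Write a header whenever the output switches to a different test
      if (lastDescriptor != descriptor) {
        logFile << "--------\n- Test log for: ${descriptor}\n--------\n"
        lastDescriptor = descriptor
      }
      // destination is StdOut or StdErr; the message is appended as-is
      logFile << "${event.destination} ${event.message}"
    }
  })
}
```

Keeping the output in a file rather than on the console leaves the CI log readable, which was one of the concerns raised in the original proposal.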
On Mon, 23 Nov 2020 at 08:59, Peter Vary <pv...@cloudera.com.invalid> wrote:
> Hi Team,
>
> Ryan pushed my changes. Thanks for the review and the merge!
>
> The final solution was to create a log file for every package, containing the StdErr / StdOut of the tests. These are stored in the <ROOT>/build/testlogs/<PACKAGE_NAME>.log file.
> Like <ROOT>/build/testlogs/iceberg-parquet.log:
>
> --------
> - Test log for: Test testRowGroupSizeConfigurableWithWriter(org.apache.iceberg.parquet.TestParquet)
> --------
> StdErr log4j:WARN No appenders could be found for logger (org.apache.hadoop.util.NativeCodeLoader).
> StdErr log4j:WARN Please initialize the log4j system properly.
> StdErr log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
> --------
> - Test log for: Test testListProjection(org.apache.iceberg.avro.TestParquetReadProjection)
> --------
> StdErr [Test worker] INFO org.apache.parquet.hadoop.InternalParquetRecordReader - RecordReader initialized will read a total of 1 records.
> StdErr [Test worker] INFO org.apache.parquet.hadoop.InternalParquetRecordReader - at row 0. reading next block
> StdErr [Test worker] INFO org.apache.parquet.hadoop.InternalParquetRecordReader - block read in memory in 1 ms. row count = 1
>
> If there is a failure in the CI run, these logs are archived. "By default, GitHub stores build logs and artifacts for 90 days", and they are accessible through "Artifacts/test logs" in the top right corner of the failed run.
> See:
>
> This could help us investigate flaky failures. That said, if you find flaky tests among the Hive/Tez-related tests, please notify me, Laszlo Pinter or Marton Bod.
>
> Thanks,
> Peter
>
>
> On Nov 19, 2020, at 16:02, Peter Vary <pv...@cloudera.com> wrote:
>
> Created the pull request for it:
> https://github.com/apache/iceberg/pull/1789
>
> You can turn it on for manual builds with:
>
> *export CI=true*
>
>
> Any reviewers would be welcome!
> Thanks,
> Peter
>
> On Nov 18, 2020, at 10:11, Mass Dosage <massdos...@gmail.com> wrote:
>
> I can definitely see how having more detailed logs could be useful, so I like what you're suggesting. I guess another option could be to make this configurable, so you can pass in an argument to turn on "showStandardStreams": by default it's false, but while you're debugging this issue it would be turned on?
>
> On Wed, 18 Nov 2020 at 09:03, Peter Vary <pv...@cloudera.com.invalid> wrote:
>
>> Hi Team,
>>
>> Recently I have been trying to reproduce the following CI failure, without success:
>>
>> *org.apache.iceberg.mr.hive.TestHiveIcebergStorageHandlerWithCustomCatalog > testScanTable[fileFormat=PARQUET, engine=tez] FAILED
>> java.lang.IllegalArgumentException: Failed to execute Hive query 'SELECT * FROM default.customers ORDER BY customer_id DESC': Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask
>> Caused by: org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask*
>>
>> Since I could not reproduce the case, and the error message in the CI logs is not really helpful, I cannot fix this flaky test for now. :(
>>
>> After Marton Bod's changes for adding logs for tests (https://github.com/apache/iceberg/pull/1712), we could have more info about the failures in the test logs (*build/test-results/test/binary/output.bin*), but I am not sure if that is retained and accessible after a CI run.
>>
>> I would like to propose adding the following to the build.gradle for the CI runs:
>>
>> *test {
>>   testLogging {
>>     if ("true".equalsIgnoreCase(System.getenv('CI'))) {
>>       events "failed", "passed"
>>       testLogging.showStandardStreams = true
>>     } else {
>>       events "failed"
>>     }
>>     exceptionFormat "full"
>>   }
>> }*
>>
>> This would add the logs printed during the tests to the standard output for the CI runs. An example can be seen here (https://github.com/pvary/iceberg/runs/1405960983); I only enabled standard streams for the Hive-related tests in this patch to see the results.
>>
>> Pros:
>>
>> - Easily accessible log information for the failed runs
>>
>> Cons:
>>
>> - Harder to read CI logs
>> - Possible cost associated with retaining the logs
>>
>>
>> I think having more logs would be great, but I am not sure who pays the bill, whether bigger logs could cause any problems, or whether the CI is able to handle the increased amount of data.
>>
>> Any thoughts, comments, ideas?
>>
>> Thanks,
>> Peter
>>
>
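
The configurable switch suggested earlier in the thread (off by default, turned on while debugging) could look roughly like the sketch below. The property name showTestOutput is made up for illustration and is not part of the Iceberg build; the rest relies on Gradle's standard testLogging options and project properties.

```groovy
// build.gradle (sketch): stream test output to the console only on request.
// 'showTestOutput' is a hypothetical property name chosen for this example.
def streamTestOutput =
    (project.findProperty('showTestOutput') ?: 'false').toString().toBoolean()

test {
  testLogging {
    events "failed"
    exceptionFormat "full"
    // false unless -PshowTestOutput=true is passed on the command line
    showStandardStreams = streamTestOutput
  }
}
```

With this in place, running ./gradlew test -PshowTestOutput=true would print the standard streams, while a plain ./gradlew test keeps the console output short.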