Thanks for following up here with the solution and the steps for accessing
the logs!

On Mon, 23 Nov 2020 at 08:59, Peter Vary <pv...@cloudera.com.invalid> wrote:

> Hi Team,
>
> Ryan pushed my changes. Thanks for the review and the merge!
>
> The final solution was to create a log file for every package which will
> contain the StdErr / StdOut of the tests. These will be stored in the
> <ROOT>/build/testlogs/<PACKAGE_NAME>.log file.
> Like <ROOT>/build/testlogs/iceberg-parquet.log:
>
> --------
> - Test log for: Test 
> testRowGroupSizeConfigurableWithWriter(org.apache.iceberg.parquet.TestParquet)
> --------
> StdErr log4j:WARN No appenders could be found for logger 
> (org.apache.hadoop.util.NativeCodeLoader).
> StdErr log4j:WARN Please initialize the log4j system properly.
> StdErr log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig 
> for more info.
> --------
> - Test log for: Test 
> testListProjection(org.apache.iceberg.avro.TestParquetReadProjection)
> --------
> StdErr [Test worker] INFO 
> org.apache.parquet.hadoop.InternalParquetRecordReader - RecordReader 
> initialized will read a total of 1 records.
> StdErr [Test worker] INFO 
> org.apache.parquet.hadoop.InternalParquetRecordReader - at row 0. reading 
> next block
> StdErr [Test worker] INFO 
> org.apache.parquet.hadoop.InternalParquetRecordReader - block read in memory 
> in 1 ms. row count = 1
>
> If there is a failure in the CI run then these logs are achieved. "By
> default, GitHub stores build logs and artifacts for 90 days", and they are 
> accessible
> through the "Artifacts/test logs" on the top right corner of the failed run.
> See:
>
> This could help us investigating flaky failures. That said, if you find
> flaky tests for the Hive/Tez related tests, please notify me, Laszlo Pinter
> or Marton Bod.
>
> Thanks,
> Peter
>
>
> On Nov 19, 2020, at 16:02, Peter Vary <pv...@cloudera.com> wrote:
>
> Created the pull request for it:
> https://github.com/apache/iceberg/pull/1789
>
> You can turn it on for manual builds by
>
> *export CI=true*
>
>
> Any reviewers would be welcome!
> Thanks,
> Peter
>
> On Nov 18, 2020, at 10:11, Mass Dosage <massdos...@gmail.com> wrote:
>
> I can definitely see how having more detailed logs could be useful so I
> like what you're suggesting. I guess another option could be to make this
> configurable so you can pass in an argument to turn on the
> "showStandardStreams", by default it's false but while you're debugging
> this issue it would be turned on?
>
> On Wed, 18 Nov 2020 at 09:03, Peter Vary <pv...@cloudera.com.invalid>
> wrote:
>
>> Hi Team,
>>
>> Recently I have been working on trying to reproduce the following CI
>> failure without success:
>>
>>
>>
>>
>>
>> *org.apache.iceberg.mr.hive.TestHiveIcebergStorageHandlerWithCustomCatalog
>> > testScanTable[fileFormat=PARQUET, engine=tez] FAILED
>> java.lang.IllegalArgumentException: Failed to execute Hive query 'SELECT *
>> FROM default.customers ORDER BY customer_id DESC': Error while processing
>> statement: FAILED: Execution Error, return code 1 from
>> org.apache.hadoop.hive.ql.exec.tez.TezTask        Caused by:
>> org.apache.hive.service.cli.HiveSQLException: Error while processing
>> statement: FAILED: Execution Error, return code 1 from
>> org.apache.hadoop.hive.ql.exec.tez.TezTask*
>>
>>
>> Since I was unsuccessful reproing the case, and the provided error
>> message in CI logs are not really helpful this means I can not fix this
>> flaky test for now. :(
>>
>> After Marton Bods changes for adding logs for tests (
>> https://github.com/apache/iceberg/pull/1712), we could have more info
>> about the failures in the test logs (
>> *build/test-results/test/binary/output.bin*), but I am not sure if that
>> is retained and accessible after a CI run.
>>
>> I would like to propose adding the following to the build.gradle for the
>> CI runs:
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> *test {  testLogging {
>>   if ("true".equalsIgnoreCase(System.getenv('CI'))) {
>> events "failed", "passed"+      testLogging.showStandardStreams = true
>>   } else {      events "failed"    }    exceptionFormat "full"  }}*
>>
>>
>> This would add the logs printed during the tests to the standard output
>> for the CI runs. Example can be seen here (
>> https://github.com/pvary/iceberg/runs/1405960983) - only enabled
>> standard streams for the hive related tests in this patch to see the
>> results.
>>
>> Pros:
>>
>>    - Easily accessible log information for the failed runs
>>
>> Cons:
>>
>>    - Harder to read CI logs
>>    - Possible cost associated with retaining the logs
>>
>>
>> I think having more logs would be great, but I am not sure who pays the
>> bill and whether having bigger logs could cause any problem and whether the
>> CI is able to handle the increased amount of data.
>>
>> Any thoughts, comments, ideas?
>>
>> Thanks,
>> Peter
>>
>
>
>

Reply via email to