Github user JoshRosen commented on the pull request:
https://github.com/apache/spark/pull/3564#issuecomment-67785676
Hmm, looks like we're only down to a few test failures and they tend to be
the same failures across runs (plus some known flaky tests). Once we finish
the streaming test flakiness PR, I'm hoping that I'll be able to help pick off
those final remaining tests so that we can try this out.
BTW, I'm really impressed by how fast the Scala tests are running: ~500
seconds for the tests, plus some extra time for the build (a couple of minutes
here and there), so it's looking like ~15-20 minute PR builds could be possible.
One source of time that isn't included here is the PySpark tests.
There are a few opportunities for parallelism there, such as using `xargs` or
`parallel` to run the PySpark suites in parallel. In addition, the build
currently tests against multiple Python versions (`python2.6` and `pypy`) and
we might add more soon (`python3`), so we can either add extra parallelism
across Python versions or just split the multi-version testing into a separate
Jenkins job, like what we do with Hadoop versions. For the PySpark tests, it
would be great to see if we can set up test XML reporting, too.
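To make the idea concrete, here's a rough sketch of both points in Python: run each suite in its own interpreter with a bounded worker pool (the same effect as `xargs -P`), then write a minimal JUnit-style XML file that Jenkins can pick up. The module names and report filename here are purely illustrative, not the actual Spark test layout.

```python
import subprocess
import sys
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor

# Hypothetical suite list for illustration; the real set would come from
# the project's test runner script.
SUITES = ["pyspark.tests", "pyspark.sql.tests", "pyspark.streaming.tests"]

def run_suite(module):
    """Run one test module in a separate interpreter (like `xargs -n 1 -P`)."""
    proc = subprocess.run([sys.executable, "-m", module], capture_output=True)
    return module, proc.returncode

# Run up to four suites concurrently; each worker just waits on a subprocess.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(run_suite, SUITES))

# Emit a minimal JUnit-style XML report so Jenkins can chart the results.
testsuite = ET.Element("testsuite", name="pyspark",
                       tests=str(len(results)),
                       failures=str(sum(1 for _, rc in results if rc != 0)))
for module, rc in results:
    case = ET.SubElement(testsuite, "testcase", classname="pyspark", name=module)
    if rc != 0:
        ET.SubElement(case, "failure", message="exit code %d" % rc)
ET.ElementTree(testsuite).write("pyspark-report.xml",
                               encoding="utf-8", xml_declaration=True)
```

In practice a dedicated reporting library or per-suite XML output would give richer detail (timings, stack traces), but even a flat pass/fail report like this is enough for Jenkins to flag which suite broke.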