Github user JoshRosen commented on the pull request:

    https://github.com/apache/spark/pull/3564#issuecomment-67785676
  
    Hmm, looks like we're only down to a few test failures and they tend to be 
the same failures across runs (plus some known flaky tests).  Once we finish 
the streaming test flakiness PR, I'm hoping that I'll be able to help pick off 
those final remaining tests so that we can try this out.
    
    BTW, really impressed by how fast the Scala tests are running.  ~500 
seconds for the tests, plus some extra build time (a couple of minutes here 
and there), so ~15-20 minute PR builds look possible.
    
    One source of time that's not being included here is the PySpark tests.  
There are a few opportunities for parallelism there, such as using `xargs` or 
`parallel` to run the PySpark suites in parallel.  In addition, the build 
currently tests against multiple Python versions (`python2.6` and `pypy`) and 
we might add more soon (`python3`), so we can either add extra parallelism 
across Python versions or just split the multi-version testing into a separate 
Jenkins job, like what we do with Hadoop versions.  For the PySpark tests, it 
would be great to see if we can set up test XML reporting, too.
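
    As a rough sketch of the `xargs` idea (the suite names and worker count 
below are placeholders, not the actual PySpark module layout):

```shell
# Fan a list of test suites out to parallel workers with xargs.
# -I{} runs one command per input line; -P 4 caps concurrency at four jobs.
# The echo stands in for a real runner such as `python -m <module>`.
printf '%s\n' suite_a suite_b suite_c \
  | xargs -P 4 -I{} sh -c 'echo "running {}"'
```

    For the XML reporting, one option would be running the suites under a 
runner that can emit JUnit-style XML (e.g. pytest's `--junitxml` flag), 
which Jenkins can already ingest for the Scala tests.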
