This vote fails. Please test RC5.

On Jun 21, 2017 6:50 AM, "Nick Pentreath" <nick.pentre...@gmail.com> wrote:

> Thanks, I added the details of my environment to the JIRA (for what it's
> worth now, as the issue is identified)
>
> On Wed, 14 Jun 2017 at 11:28 Hyukjin Kwon <gurwls...@gmail.com> wrote:
>
>> Actually, I opened https://issues.apache.org/jira/browse/SPARK-21093.
>>
>> 2017-06-14 17:08 GMT+09:00 Hyukjin Kwon <gurwls...@gmail.com>:
>>
>>> For a shorter reproducer ...
>>>
>>> df <- createDataFrame(list(list(1L, 1, "1", 0.1)), c("a", "b", "c", "d"))
>>> collect(gapply(df, "a", function(key, x) { x }, schema(df)))
>>>
>>> And running the below multiple times (5~7):
>>>
>>> collect(gapply(df, "a", function(key, x) { x }, schema(df)))
>>>
>>> occasionally seems to throw an error.
>>>
>>> I will leave this here and probably add more information if a JIRA is
>>> opened. This does not look like a regression anyway.
>>>
>>> 2017-06-14 16:22 GMT+09:00 Hyukjin Kwon <gurwls...@gmail.com>:
>>>
>>>> Per https://github.com/apache/spark/tree/v2.1.1,
>>>>
>>>> 1. CentOS 7.2.1511 / R 3.3.3 - this test hangs.
>>>>
>>>> I messed it up a bit while downgrading R to 3.3.3 (it was an actual
>>>> machine, not a VM), so it took me a while to retry this.
>>>> I rebuilt this and at least checked that the R version is 3.3.3. I
>>>> hope this one could be double-checked.
>>>>
>>>> Here is a self-contained reproducer:
>>>>
>>>> irisDF <- suppressWarnings(createDataFrame(iris))
>>>> schema <- structType(structField("Sepal_Length", "double"),
>>>>                      structField("Avg", "double"))
>>>> df4 <- gapply(
>>>>   cols = "Sepal_Length",
>>>>   irisDF,
>>>>   function(key, x) {
>>>>     y <- data.frame(key, mean(x$Sepal_Width), stringsAsFactors = FALSE)
>>>>   },
>>>>   schema)
>>>> collect(df4)
>>>>
>>>> 2017-06-14 16:07 GMT+09:00 Felix Cheung <felixcheun...@hotmail.com>:
>>>>
>>>>> Thanks!
>>>>> Will try to set up RHEL/CentOS to test it out
>>>>>
>>>>> _____________________________
>>>>> From: Nick Pentreath <nick.pentre...@gmail.com>
>>>>> Sent: Tuesday, June 13, 2017 11:38 PM
>>>>> Subject: Re: [VOTE] Apache Spark 2.2.0 (RC4)
>>>>> To: Felix Cheung <felixcheun...@hotmail.com>, Hyukjin Kwon <gurwls...@gmail.com>, dev <dev@spark.apache.org>
>>>>> Cc: Sean Owen <so...@cloudera.com>
>>>>>
>>>>> Hi, yeah, sorry for the slow response - I was on RHEL and OpenJDK but will
>>>>> have to report back later with the versions as I am AFK.
>>>>>
>>>>> Not totally sure about the R version, but again I will get back to you ASAP.
>>>>>
>>>>> On Wed, 14 Jun 2017 at 05:09, Felix Cheung <felixcheun...@hotmail.com>
>>>>> wrote:
>>>>>
>>>>>> Thanks
>>>>>> This was with an external package and unrelated
>>>>>>
>>>>>> >> macOS Sierra 10.12.3 / R 3.2.3 - passed with a warning (
>>>>>> https://gist.github.com/HyukjinKwon/85cbcfb245825852df20ed6a9ecfd845)
>>>>>>
>>>>>> As for CentOS - would it be possible to test against R older than
>>>>>> 3.4.0? This is the same error reported by Nick below.
>>>>>>
>>>>>> _____________________________
>>>>>> From: Hyukjin Kwon <gurwls...@gmail.com>
>>>>>> Sent: Tuesday, June 13, 2017 8:02 PM
>>>>>> Subject: Re: [VOTE] Apache Spark 2.2.0 (RC4)
>>>>>> To: dev <dev@spark.apache.org>
>>>>>> Cc: Sean Owen <so...@cloudera.com>, Nick Pentreath <nick.pentre...@gmail.com>, Felix Cheung <felixcheun...@hotmail.com>
>>>>>>
>>>>>> For the test failure on R, I checked:
>>>>>>
>>>>>> Per https://github.com/apache/spark/tree/v2.2.0-rc4,
>>>>>>
>>>>>> 1. Windows Server 2012 R2 / R 3.3.1 - passed
>>>>>>    (https://ci.appveyor.com/project/spark-test/spark/build/755-r-test-v2.2.0-rc4)
>>>>>> 2. macOS Sierra 10.12.3 / R 3.4.0 - passed
>>>>>> 3. macOS Sierra 10.12.3 / R 3.2.3 - passed with a warning
>>>>>>    (https://gist.github.com/HyukjinKwon/85cbcfb245825852df20ed6a9ecfd845)
>>>>>> 4. CentOS 7.2.1511 / R 3.4.0 - reproduced
>>>>>>    (https://gist.github.com/HyukjinKwon/2a736b9f80318618cc147ac2bb1a987d)
>>>>>>
>>>>>> Per https://github.com/apache/spark/tree/v2.1.1,
>>>>>>
>>>>>> 1. CentOS 7.2.1511 / R 3.4.0 - reproduced
>>>>>>    (https://gist.github.com/HyukjinKwon/6064b0d10bab8fc1dc6212452d83b301)
>>>>>>
>>>>>> This appears to fail only on CentOS 7.2.1511 / R 3.4.0, given my
>>>>>> tests and observations.
>>>>>>
>>>>>> It also fails in Spark 2.1.1, so it does not sound like a regression,
>>>>>> although it is a bug that should be fixed (whether in Spark or R).
>>>>>>
>>>>>> 2017-06-14 8:28 GMT+09:00 Xiao Li <gatorsm...@gmail.com>:
>>>>>>
>>>>>>> -1
>>>>>>>
>>>>>>> Spark 2.2 is unable to read the partitioned table created by Spark
>>>>>>> 2.1 or earlier.
>>>>>>>
>>>>>>> Opened a JIRA https://issues.apache.org/jira/browse/SPARK-21085
>>>>>>>
>>>>>>> Will fix it soon.
>>>>>>>
>>>>>>> Thanks,
>>>>>>>
>>>>>>> Xiao Li
>>>>>>>
>>>>>>> 2017-06-13 9:39 GMT-07:00 Joseph Bradley <jos...@databricks.com>:
>>>>>>>
>>>>>>>> Re: the QA JIRAs:
>>>>>>>> Thanks for discussing them. I still feel they are very helpful; I
>>>>>>>> particularly notice not having to spend a solid 2-3 weeks of time QAing
>>>>>>>> (unlike in earlier Spark releases). One other point not mentioned above: I
>>>>>>>> think they serve as a very helpful reminder/training for the community for
>>>>>>>> rigor in development. Since we instituted QA JIRAs, contributors have been
>>>>>>>> a lot better about adding in docs early, rather than waiting until the end
>>>>>>>> of the cycle (though I know this is drawing conclusions from correlations).
>>>>>>>>
>>>>>>>> I would vote in favor of the RC...but I'll wait to see about the
>>>>>>>> reported failures.
>>>>>>>>
>>>>>>>> On Fri, Jun 9, 2017 at 3:30 PM, Sean Owen <so...@cloudera.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Different errors than in
>>>>>>>>> https://issues.apache.org/jira/browse/SPARK-20520 but that's also
>>>>>>>>> reporting R test failures.
>>>>>>>>>
>>>>>>>>> I went back and tried to run the R tests and they passed, at least
>>>>>>>>> on Ubuntu 17 / R 3.3.
>>>>>>>>>
>>>>>>>>> On Fri, Jun 9, 2017 at 9:12 AM Nick Pentreath <nick.pentre...@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> All Scala, Python tests pass. ML QA and doc issues are resolved
>>>>>>>>>> (as well as R it seems).
>>>>>>>>>>
>>>>>>>>>> However, I'm seeing the following test failure on R consistently:
>>>>>>>>>> https://gist.github.com/MLnick/5f26152f97ae8473f807c6895817cf72
>>>>>>>>>>
>>>>>>>>>> On Thu, 8 Jun 2017 at 08:48 Denny Lee <denny.g....@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> +1 non-binding
>>>>>>>>>>>
>>>>>>>>>>> Tested on macOS Sierra, Ubuntu 16.04
>>>>>>>>>>> test suite includes various test cases including Spark SQL, ML,
>>>>>>>>>>> GraphFrames, Structured Streaming
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Jun 7, 2017 at 9:40 PM vaquar khan <vaquar.k...@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> +1 non-binding
>>>>>>>>>>>>
>>>>>>>>>>>> Regards,
>>>>>>>>>>>> vaquar khan
>>>>>>>>>>>>
>>>>>>>>>>>> On Jun 7, 2017 4:32 PM, "Ricardo Almeida" <ricardo.alme...@actnowib.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> +1 (non-binding)
>>>>>>>>>>>>
>>>>>>>>>>>> Built and tested with -Phadoop-2.7 -Dhadoop.version=2.7.3
>>>>>>>>>>>> -Pyarn -Phive -Phive-thriftserver -Pscala-2.11 on
>>>>>>>>>>>>
>>>>>>>>>>>> - Ubuntu 17.04, Java 8 (OpenJDK 1.8.0_111)
>>>>>>>>>>>> - macOS 10.12.5 Java 8 (build 1.8.0_131)
>>>>>>>>>>>>
>>>>>>>>>>>> On 5 June 2017 at 21:14, Michael Armbrust <mich...@databricks.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Please vote on releasing the following candidate as Apache
>>>>>>>>>>>>> Spark version 2.2.0. The vote is open until Thurs, June 8th,
>>>>>>>>>>>>> 2017 at 12:00 PST and passes if a majority of at least 3 +1 PMC
>>>>>>>>>>>>> votes are cast.
>>>>>>>>>>>>>
>>>>>>>>>>>>> [ ] +1 Release this package as Apache Spark 2.2.0
>>>>>>>>>>>>> [ ] -1 Do not release this package because ...
>>>>>>>>>>>>>
>>>>>>>>>>>>> To learn more about Apache Spark, please see
>>>>>>>>>>>>> http://spark.apache.org/
>>>>>>>>>>>>>
>>>>>>>>>>>>> The tag to be voted on is v2.2.0-rc4
>>>>>>>>>>>>> <https://github.com/apache/spark/tree/v2.2.0-rc4>
>>>>>>>>>>>>> (377cfa8ac7ff7a8a6a6d273182e18ea7dc25ce7e)
>>>>>>>>>>>>>
>>>>>>>>>>>>> List of JIRA tickets resolved can be found with this filter
>>>>>>>>>>>>> <https://issues.apache.org/jira/browse/SPARK-20134?jql=project%20%3D%20SPARK%20AND%20fixVersion%20%3D%202.2.0>.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The release files, including signatures, digests, etc. can be
>>>>>>>>>>>>> found at:
>>>>>>>>>>>>> http://home.apache.org/~pwendell/spark-releases/spark-2.2.0-rc4-bin/
>>>>>>>>>>>>>
>>>>>>>>>>>>> Release artifacts are signed with the following key:
>>>>>>>>>>>>> https://people.apache.org/keys/committer/pwendell.asc
>>>>>>>>>>>>>
>>>>>>>>>>>>> The staging repository for this release can be found at:
>>>>>>>>>>>>> https://repository.apache.org/content/repositories/orgapachespark-1241/
>>>>>>>>>>>>>
>>>>>>>>>>>>> The documentation corresponding to this release can be found
>>>>>>>>>>>>> at:
>>>>>>>>>>>>> http://people.apache.org/~pwendell/spark-releases/spark-2.2.0-rc4-docs/
>>>>>>>>>>>>>
>>>>>>>>>>>>> *FAQ*
>>>>>>>>>>>>>
>>>>>>>>>>>>> *How can I help test this release?*
>>>>>>>>>>>>>
>>>>>>>>>>>>> If you are a Spark user, you can help us test this release by
>>>>>>>>>>>>> taking an existing Spark workload and running on this release
>>>>>>>>>>>>> candidate, then reporting any regressions.
>>>>>>>>>>>>>
>>>>>>>>>>>>> *What should happen to JIRA tickets still targeting 2.2.0?*
>>>>>>>>>>>>>
>>>>>>>>>>>>> Committers should look at those and triage. Extremely
>>>>>>>>>>>>> important bug fixes, documentation, and API tweaks that impact
>>>>>>>>>>>>> compatibility should be worked on immediately. Everything else
>>>>>>>>>>>>> please retarget to 2.3.0 or 2.2.1.
>>>>>>>>>>>>>
>>>>>>>>>>>>> *But my bug isn't fixed!??!*
>>>>>>>>>>>>>
>>>>>>>>>>>>> In order to make timely releases, we will typically not hold
>>>>>>>>>>>>> the release unless the bug in question is a regression from 2.1.1.
>>>>>>>>
>>>>>>>> --
>>>>>>>> Joseph Bradley
>>>>>>>> Software Engineer - Machine Learning
>>>>>>>> Databricks, Inc.
>>>>>>>> <http://databricks.com/>