Re: The Spark email setting should be updated

2023-04-19 Thread Jonathan Kelly
respond to the list, I had to click "Reply All", move the list to the To field and remove everybody else. Is this the same issue you are talking about, Jia? ~ Jonathan Kelly On Wed, Apr 19, 2023 at 3:29 PM Rui Wang wrote: > I am replying now and the default address is dev@spa

Re: [VOTE] Release Apache Spark 3.4.0 (RC2)

2023-03-03 Thread Jonathan Kelly
Small correction: I found a mention of it on https://github.com/apache/spark/pull/39807 from a month ago. On Fri, Mar 3, 2023 at 9:44 AM Jonathan Kelly wrote: > So did I... :-( However, there had been no new JIRA issue or PR that had > mentioned this test case specifically, until

Re: [VOTE] Release Apache Spark 3.4.0 (RC2)

2023-03-03 Thread Jonathan Kelly
that. > > On Fri, Mar 3, 2023 at 12:35 AM Jonathan Kelly > wrote: > >> I see that one too but have not investigated it myself. In the RC1 >> thread, it was mentioned that this occurs when running the tests via Maven >> but not via SBT. Does the test class path get set up
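
For reference, a hedged sketch of running a single suite under Maven versus SBT, which is how the classpath difference above can be compared; the module and suite names below are placeholders, not the failing suite itself:

    # Run a single suite with Maven (module/suite names are placeholders)
    ./build/mvn -pl core -Dtest=none \
      -DwildcardSuites=org.apache.spark.scheduler.DAGSchedulerSuite test

    # Run the same suite with SBT
    ./build/sbt "core/testOnly org.apache.spark.scheduler.DAGSchedulerSuite"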

Re: [VOTE] Release Apache Spark 3.4.0 (RC2)

2023-03-02 Thread Jonathan Kelly
org.apache.spark.sql.Dataset.withResult(Dataset.scala:2747) > at org.apache.spark.sql.Dataset.collect(Dataset.scala:2425) > at > org.apache.spark.sql.ClientE2ETestSuite.$anonfun$new$8(ClientE2ETestSuite.scala:85) > at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) > > On Thu, Mar

Re: [VOTE] Release Apache Spark 3.4.0 (RC2)

2023-03-02 Thread Jonathan Kelly
help was to delete sql/hive-thriftserver/target between building Spark and running the tests. This helps in my builds where the issue only occurs during the testing phase and not during the initial build phase, but of course it doesn't help in my builds where the issue occurs during that first build
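
A minimal sketch of that workaround, assuming a Maven build from the Spark source root (the profile flags are illustrative):

    # Build first, then clear the stale hive-thriftserver output before testing
    ./build/mvn -Phive -Phive-thriftserver -DskipTests clean package
    rm -rf sql/hive-thriftserver/target
    ./build/mvn -Phive -Phive-thriftserver test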

Re: [VOTE] Release Apache Spark 3.4.0 (RC1)

2023-02-22 Thread Jonathan Kelly
import Any, Callable, Iterator, List, Mapping, Protocol, TYPE_CHECKING, Tuple, Union ImportError: cannot import name 'Protocol' from 'typing' (/usr/lib64/python3.7/typing.py) Had test failures in pyspark.ml.tests.test_functions with python3; see logs. I know we should move on to a new
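
typing.Protocol only exists as of Python 3.8, which is why the import fails on the Python 3.7 interpreter above. A hedged local check/workaround sketch (the typing_extensions backport is an assumption here, not something the release requires):

    # Confirm the interpreter and whether typing.Protocol is available
    python3 --version
    python3 -c "from typing import Protocol" || echo "typing.Protocol unavailable"

    # Backport sketch: typing_extensions provides Protocol on older interpreters
    python3 -m pip install typing_extensions
    python3 -c "from typing_extensions import Protocol; print('Protocol OK')"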

Re: [VOTE] Release Spark 3.3.1 (RC2)

2022-10-11 Thread Jonathan Kelly
ready for 3.3.1 release here. > > https://github.com/apache/spark/pull/38196/files > > BTW, thank you for asking the question, Jonathan. > > Dongjoon. > > > On Tue, Oct 11, 2022 at 12:06 PM Jonathan Kelly > wrote: > >> Yep, makes sense. Thanks for the quick response!

Re: [VOTE] Release Spark 3.3.1 (RC2)

2022-10-11 Thread Jonathan Kelly
sent to just roll another RC if > there's any objection or -1. We could formally re-check the votes, as I > think the +1s would agree, but think we've defaulted into accepting a > 'veto' if there are otherwise no objections. > > On Tue, Oct 11, 2022 at 2:01 PM Jonathan

Re: [VOTE] Release Spark 3.3.1 (RC2)

2022-10-11 Thread Jonathan Kelly
+1 and minimum of 3 +1 votes) were met? I don't personally mind either way if the vote is considered passed or failed (and I see you've already cut the v3.3.1-rc3 tag but haven't started the new vote yet), but I just wanted to ask for clarification on the requirements. Thank you,

Re: [VOTE] Release Apache Spark 2.0.0 (RC5)

2016-07-20 Thread Jonathan Kelly
+1 (non-binding) On Wed, Jul 20, 2016 at 2:48 PM Michael Allman wrote: > I've run some tests with some real and some synthetic parquet data with > nested columns with and without the hive metastore on our Spark 1.5, 1.6 > and 2.0 versions. I haven't seen any unexpected performance surprises, > e

Re: [VOTE] Release Apache Spark 2.0.0 (RC4)

2016-07-19 Thread Jonathan Kelly
The docs link from Reynold's initial email is apparently no longer valid. He posted an updated link a little later in this same thread. http://people.apache.org/~pwendell/spark-releases/spark-2.0.0-rc4-docs-updated/ On Tue, Jul 19, 2016 at 3:19 PM Holden Karau wrote: > -1 : The docs don't seem

Re: [VOTE] Release Apache Spark 2.0.0 (RC2)

2016-07-14 Thread Jonathan Kelly
I see that all blockers targeted for 2.0.0 have either been resolved or downgraded. Do you have an ETA for the next RC? Thanks, Jonathan On Mon, Jul 11, 2016 at 4:33 AM Sean Owen wrote: > Yeah there were already other blockers when the RC was released. This > one was already noted in this threa

Re: Anyone knows the hive repo for spark-2.0?

2016-07-07 Thread Jonathan Kelly
I'm not sure, but I think it's https://github.com/JoshRosen/hive/tree/release-1.2.1-spark2. It would be really nice though to have this whole process better documented and more "official" than just building from somebody's personal fork of Hive. Or is there some way that the Spark community could

Re: [VOTE] Release Apache Spark 1.6.2 (RC2)

2016-06-22 Thread Jonathan Kelly
+1 On Wed, Jun 22, 2016 at 10:41 AM Tim Hunter wrote: > +1 This release passes all tests on the graphframes and tensorframes > packages. > > On Wed, Jun 22, 2016 at 7:19 AM, Cody Koeninger > wrote: > >> If we're considering backporting changes for the 0.8 kafka >> integration, I am sure there a

Re: Spark 2.0 on YARN - Files in config archive not ending up on executor classpath

2016-06-20 Thread Jonathan Kelly
g it, in case anyone else has > time to look at it before I do. > > On Mon, Jun 20, 2016 at 1:20 PM, Jonathan Kelly > wrote: > > Thanks for the confirmation! Shall I cut a JIRA issue? > > > > On Mon, Jun 20, 2016 at 10:42 AM Marcelo Vanzin > wrote: > >> > &

Re: Spark 2.0 on YARN - Files in config archive not ending up on executor classpath

2016-06-20 Thread Jonathan Kelly
6 at 7:04 AM, Jonathan Kelly > wrote: > > Does anybody have any thoughts on this? > > > > On Fri, Jun 17, 2016 at 6:36 PM Jonathan Kelly > > wrote: > >> > >> I'm trying to debug a problem in Spark 2.0.0-SNAPSHOT (commit > >> bdf5fe4143e5a

Re: Spark 2.0 on YARN - Files in config archive not ending up on executor classpath

2016-06-20 Thread Jonathan Kelly
Does anybody have any thoughts on this? On Fri, Jun 17, 2016 at 6:36 PM Jonathan Kelly wrote: > I'm trying to debug a problem in Spark 2.0.0-SNAPSHOT > (commit bdf5fe4143e5a1a393d97d0030e76d35791ee248) where Spark's > log4j.properties is not getting picked up in the exec

Spark 2.0 on YARN - Files in config archive not ending up on executor classpath

2016-06-17 Thread Jonathan Kelly
I'm trying to debug a problem in Spark 2.0.0-SNAPSHOT (commit bdf5fe4143e5a1a393d97d0030e76d35791ee248) where Spark's log4j.properties is not getting picked up in the executor classpath (and driver classpath for yarn-cluster mode), so Hadoop's log4j.properties file is taking precedence in the YARN
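
A hedged workaround sketch (it sidesteps rather than fixes the config-archive issue): ship a log4j.properties explicitly and point both JVMs at it. The paths, class, and jar below are placeholders:

    # Ship log4j.properties and force the driver/executor JVMs to load it
    spark-submit \
      --master yarn --deploy-mode cluster \
      --files /etc/spark/conf/log4j.properties \
      --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=log4j.properties" \
      --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j.properties" \
      --class com.example.MyApp myapp.jar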

Re: [VOTE] Release Apache Spark 1.6.2 (RC1)

2016-06-17 Thread Jonathan Kelly
+1 (non-binding) On Thu, Jun 16, 2016 at 9:49 PM Reynold Xin wrote: > Please vote on releasing the following candidate as Apache Spark version > 1.6.2! > > The vote is open until Sunday, June 19, 2016 at 22:00 PDT and passes if a > majority of at least 3 +1 PMC votes are cast. > > [ ] +1 Release

Re: Spark 2.0.0-preview artifacts still not available in Maven

2016-06-01 Thread Jonathan Kelly
I think what Reynold probably means is that previews are releases for which a vote *passed*. ~ Jonathan On Wed, Jun 1, 2016 at 1:53 PM Marcelo Vanzin wrote: > So are RCs, aren't they? > > Personally I'm fine with not releasing to maven central. Any extra > effort needed by regular users to use

Re: Unable to access Resource Manager /Name Node on port 9026 / 9101 on a Spark EMR Cluster

2016-04-15 Thread Jonathan Kelly
Ever since emr-4.x, the service ports have been synced as much as possible with open source, so the YARN ResourceManager UI is on port 8088, and the NameNode UI is on port 50070. See http://docs.aws.amazon.com/ElasticMapReduce/latest/ReleaseGuide/emr-release-differences.html#d0e23719 for more information
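
A quick sketch for checking those UIs from (or via a tunnel to) the master node; the hostname is a placeholder:

    # YARN ResourceManager UI (stock port since emr-4.x)
    curl -s http://MASTER_NODE_DNS:8088/cluster | head
    # HDFS NameNode UI
    curl -s http://MASTER_NODE_DNS:50070/ | head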

Re: some joins stopped working with spark 2.0.0 SNAPSHOT

2016-02-27 Thread Jonathan Kelly
If you want to find what commit caused it, try out the "git bisect" command. On Sat, Feb 27, 2016 at 11:06 AM Koert Kuipers wrote: > https://issues.apache.org/jira/browse/SPARK-13531 > > On Sat, Feb 27, 2016 at 3:49 AM, Reynold Xin wrote: > >> Can you file a JIRA ticket? >> >> >> On Friday, Febr
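
A minimal git bisect sketch; the known-good ref and test script are placeholders:

    git bisect start
    git bisect bad HEAD            # current snapshot shows the regression
    git bisect good v1.6.0         # placeholder: last ref known to work
    # At each checkout, build/test and report "git bisect good" or "git bisect bad",
    # or automate it with a script that exits non-zero on failure:
    git bisect run ./repro-test.sh
    git bisect reset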

Re: spark on yarn wastes one box (or 1 GB on each box) for am container

2016-02-09 Thread Jonathan Kelly
AM,InstanceGroupType=TASK,InstanceType=m3.xlarge > > > How can I specify yarn label AM for that box? > > > > On Tue, Feb 9, 2016 at 12:16 PM, Jonathan Kelly > wrote: > >> Interesting, I was not aware of spark.yarn.am.nodeLabelExpression. >> >> We do use YA

Re: spark on yarn wastes one box (or 1 GB on each box) for am container

2016-02-09 Thread Jonathan Kelly
make sure we set spark.yarn.am.nodeLabelExpression appropriately in the next EMR release. ~ Jonathan On Tue, Feb 9, 2016 at 1:30 PM Marcelo Vanzin wrote: > On Tue, Feb 9, 2016 at 12:16 PM, Jonathan Kelly > wrote: > > And we do set yarn.app.mapreduce.am.labels=CORE > > That sou
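
For illustration, a hedged sketch of pinning the YARN AM to labeled nodes via that property; the CORE label follows the EMR convention mentioned above, and the app class/jar are placeholders (spark.yarn.am.* settings apply in client mode):

    # Keep the client-mode AM on nodes carrying the CORE label
    spark-submit \
      --master yarn --deploy-mode client \
      --conf spark.yarn.am.nodeLabelExpression=CORE \
      --class com.example.MyApp myapp.jar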

Re: spark on yarn wastes one box (or 1 GB on each box) for am container

2016-02-09 Thread Jonathan Kelly
> a > >> >> little bit of a corner case. There's not a good answer if all your > >> >> nodes are the same size. > >> >> > >> >> I think you can let YARN over-commit RAM though, and allocate more > >> >> memory than it act

Re: spark on yarn wastes one box (or 1 GB on each box) for am container

2016-02-09 Thread Jonathan Kelly
Praveen, You mean cluster mode, right? That would still in a sense cause one box to be "wasted", but at least it would be used a bit more to its full potential, especially if you set spark.driver.memory to higher than its 1g default. Also, cluster mode is not an option for some applications, such
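
For illustration, a hedged sketch of the cluster-mode alternative being discussed, with the driver memory raised above its 1g default (app class, jar, and memory size are placeholders):

    # Cluster mode: the driver runs inside the YARN AM container,
    # so give it more than the 1g default if that box is otherwise idle
    spark-submit \
      --master yarn --deploy-mode cluster \
      --driver-memory 4g \
      --class com.example.MyApp myapp.jar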

Re: spark on yarn wastes one box (or 1 GB on each box) for am container

2016-02-09 Thread Jonathan Kelly
beneficial to let them all > >> think they have an extra GB, and let one node running the AM > >> technically be overcommitted, a state which won't hurt at all unless > >> you're really really tight on memory, in which case something might > >> get killed

Re: spark on yarn wastes one box (or 1 GB on each box) for am container

2016-02-08 Thread Jonathan Kelly
Alex, That's a very good question that I've been trying to answer myself recently too. Since you've mentioned before that you're using EMR, I assume you're asking this because you've noticed this behavior on emr-4.3.0. In this release, we made some changes to the maximizeResourceAllocation (which
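
For reference, a hedged sketch of enabling the EMR-specific maximizeResourceAllocation option at cluster creation via a configuration classification; required arguments such as instance groups are elided:

    # Sketch only: --instance-groups, roles, etc. omitted
    aws emr create-cluster \
      --release-label emr-4.3.0 \
      --applications Name=Spark \
      --configurations '[{"Classification":"spark","Properties":{"maximizeResourceAllocation":"true"}}]'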

Re: A proposal for Spark 2.0

2015-11-11 Thread Jonathan Kelly
If Scala 2.12 will require Java 8 and we want to enable cross-compiling Spark against Scala 2.11 and 2.12, couldn't we just make Java 8 a requirement if you want to use Scala 2.12? On Wed, Nov 11, 2015 at 9:29 AM, Koert Kuipers wrote: > i would drop scala 2.10, but definitely keep java 7 > > cro

Re: [ANNOUNCE] Announcing Spark 1.5.0

2015-09-11 Thread Jonathan Kelly
I just clicked the http://search.maven.org/#search%7Cga%7C1%7Cg%3A%22org.apache.spark%22 link provided above by Ryan, and I see 1.5.0. Was this just fixed within the past hour, or is some caching causing some people not to see it? On Fri, Sep 11, 2015 at 10:24 AM, Reynold Xin wrote: > It is alre