Re: [VOTE] Release Apache Spark 2.3.3 (RC2)

2019-02-07 Thread Takeshi Yamamuro
Thanks, all. Yeah, I also think we don't need to block the release. > Jungtaek Thanks! That is very helpful! If you find something, please let me know. Best, Takeshi On Fri, Feb 8, 2019 at 1:10 AM Dongjoon Hyun wrote: > +1 for 2.3.3 RC2. > > Thank you, Takeshi. > > And, +1 for 2.3.4 as 2.3.x E

Re: [DISCUSS] Change default executor log URLs for YARN

2019-02-07 Thread Jungtaek Lim
The new URL shows all local logs, including stdout and stderr, as a list. The change would help when end users modify their log4j configuration to write additional log files, as well as GC logs. Currently Spark only shows two static files (stdout, stderr) as individual links so easier to see the con

Re: [VOTE] Release Apache Spark 2.3.3 (RC2)

2019-02-07 Thread John Zhuge
+1 On Thu, Feb 7, 2019 at 8:10 AM Dongjoon Hyun wrote: > +1 for 2.3.3 RC2. > > Thank you, Takeshi. > > And, +1 for 2.3.4 as 2.3.x EOL release. > > Cheers, > Dongjoon. > > On Thu, Feb 7, 2019 at 6:48 AM Sean Owen wrote: > >> It wouldn't be wasted effort, as there is probably going to be a 2.3.4

Re: [DISCUSS] Change default executor log URLs for YARN

2019-02-07 Thread Ryan Blue
Jungtaek, What is shown at the new URL and how would this improve usability? On Thu, Feb 7, 2019 at 12:45 AM Jungtaek Lim wrote: > Hi devs, > > Based on the suggestion Tom Graves gave me in SPARK-26792 > , I'd like to hear > voices on changing

Re: [VOTE] Release Apache Spark 2.3.3 (RC2)

2019-02-07 Thread Dongjoon Hyun
+1 for 2.3.3 RC2. Thank you, Takeshi. And, +1 for 2.3.4 as 2.3.x EOL release. Cheers, Dongjoon. On Thu, Feb 7, 2019 at 6:48 AM Sean Owen wrote: > It wouldn't be wasted effort, as there is probably going to be a 2.3.4 > release before 2.3.x is EOL. At least, having reliable tests on > Jenkins

Re: [VOTE] Release Apache Spark 2.3.3 (RC2)

2019-02-07 Thread Sean Owen
It wouldn't be wasted effort, as there is probably going to be a 2.3.4 release before 2.3.x is EOL. At least, having reliable tests on Jenkins helps not miss problems with backports to 2.3.x. I seem to recall something was changed in 2.4.x to help this but it either didn't work or didn't apply to 2.3.x

Re: Jenkins commands?

2019-02-07 Thread Tom Graves
Thanks, that is exactly what I was looking for. Tom On Wednesday, February 6, 2019, 10:50:14 PM CST, shane knapp wrote: the PRB executes the following scripts: ./dev/run-tests-jenkins ./build/sbt unsafe/test SBT QA tests: ./dev/run-tests maven QA tests: ZINC_PORT=$(python -S -c "import r

TaskMemoryManager

2019-02-07 Thread Jack Kolokasis
Hello all, I am trying to profile Spark to see which functions are called during the execution of an application. Based on my results, I see that in the SVM benchmark, TaskMemoryManager is called to allocate extra memory using HeapMemoryAllocator. In addition with Linear Regression applic

[Spark SQL]Support ANY/SOME subquery

2019-02-07 Thread Mingcong Han
Hi, I'm new here and trying to contribute to Catalyst based on my experience. SparkSQL supports three kinds of subquery in Filter: InSubquery, Exists, ScalarSubquery. But "ANY(SOME) subquery" and "ALL subquery" are also supported by most DBs. Therefore I'm writing this email in order to discuss how
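
For readers unfamiliar with quantified comparisons, here is a minimal sketch of how "ANY (SOME)" and "ALL" subquery predicates are usually defined in standard SQL, including three-valued logic for NULL. It is plain Python, not Catalyst code: the helper names `sql_any` and `sql_all` are hypothetical, `None` stands in for SQL NULL/UNKNOWN, and nothing here reflects how the proposal in this thread would actually be implemented in Spark.

```python
import operator
from typing import Any, Callable, Iterable, Optional


def sql_any(value: Any, op: Callable[[Any, Any], bool],
            rows: Iterable[Any]) -> Optional[bool]:
    """value <op> ANY (rows): True if the comparison holds for some row."""
    saw_unknown = False
    for row in rows:
        if value is None or row is None:
            saw_unknown = True      # comparison with NULL is UNKNOWN
        elif op(value, row):
            return True             # one match is enough
    # No match: UNKNOWN if any comparison was UNKNOWN, otherwise False.
    return None if saw_unknown else False


def sql_all(value: Any, op: Callable[[Any, Any], bool],
            rows: Iterable[Any]) -> Optional[bool]:
    """value <op> ALL (rows): True if the comparison holds for every row."""
    saw_unknown = False
    for row in rows:
        if value is None or row is None:
            saw_unknown = True      # comparison with NULL is UNKNOWN
        elif not op(value, row):
            return False            # one counterexample is enough
    # No counterexample: UNKNOWN if any comparison was UNKNOWN, otherwise True.
    return None if saw_unknown else True


if __name__ == "__main__":
    print(sql_any(3, operator.gt, [1, 5]))     # 3 > ANY (1, 5)
    print(sql_all(3, operator.gt, [1, 5]))     # 3 > ALL (1, 5)
    print(sql_all(3, operator.gt, []))         # ALL over an empty subquery
    print(sql_any(3, operator.gt, [5, None]))  # no match, one NULL row
```

Under these definitions `= ANY` behaves like `IN` and `<> ALL` behaves like `NOT IN`, which is presumably why the existing InSubquery handling is a natural starting point for the discussion.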

[DISCUSS] Change default executor log URLs for YARN

2019-02-07 Thread Jungtaek Lim
Hi devs, Based on the suggestion Tom Graves gave me in SPARK-26792, I'd like to hear voices on changing default executor log URLs for YARN, specifically removing "stdout" and "stderr" and providing a link which shows log file"s". For example, instead

Re: Array indexing functions

2019-02-07 Thread Petar Zečević
Hi, as far as I know these are not standard functions. Writing UDFs is easy, but only in Java and Scala are they as efficient as built-in functions. When using Python, data movement/conversion to/from Arrow is still necessary, and that makes a difference in performance. That was the motiva