RDD object Out of scope.

2019-05-20 Thread Nasrulla Khan Haris
Hi Spark developers, can someone point out the code where RDD objects go out of scope? I found the ContextCleaner code, in which only persisted RDDs are cleaned up at regular intervals if

Re: Resolving all JIRAs affecting EOL releases

2019-05-20 Thread Hyukjin Kwon
I took action on those JIRAs. The JIRAs that had not been updated in the last year and had an affected version of an EOL release are now: - Resolved with 'Incomplete' status - Labeled 'bulk-closed'. Thanks guys. On Tue, May 21, 2019 at 8:35 AM, shane knapp wrote: > alright, i found 3 jiras th

Re: Resolving all JIRAs affecting EOL releases

2019-05-20 Thread shane knapp
alright, i found 3 jiras that i was able to close: 1. SPARK-19612 2. SPARK-22996 3. SPARK-22766

Re: Hadoop version(s) compatible with spark-2.4.3-bin-without-hadoop-scala-2.12

2019-05-20 Thread Koert Kuipers
it's somewhat weird because avro-mapred-1.8.2-hadoop2.jar is included in the hadoop-provided distro, but avro-1.8.2.jar is not. i tried to fix it but i am not too familiar with the pom file. regarding jline you only run into this if you use spark-shell (and it isn't always reproducible it seems). se

Re: Hadoop version(s) compatible with spark-2.4.3-bin-without-hadoop-scala-2.12

2019-05-20 Thread Sean Owen
Re: 1), I think we tried to fix that on the build side and it requires flags that not all tar versions (i.e. OS X) have. But that's tangential. I think the Avro + Parquet dependency situation is generally problematic -- see JIRA for some details. But yes I'm not surprised if Spark has a different

Re: Hadoop version(s) compatible with spark-2.4.3-bin-without-hadoop-scala-2.12

2019-05-20 Thread Koert Kuipers
we run it without issues on hadoop 2.6 - 2.8 off the top of my head. we however do some post-processing on the tarball: 1) we fix the ownership of the files inside the tar.gz file (should be uid/gid 0/0, otherwise untarring by root can lead to ownership by unknown user). 2) add avro-1.8.2.jar and jline
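
[Editor's illustration, not part of the thread.] The ownership fix described in step 1 above could be sketched roughly as follows. This is a minimal sketch that assumes the goal is simply to rewrite every archive entry to uid/gid 0/0; the function name reset_ownership and the file paths are hypothetical, and the poster's actual post-processing is not shown in the message.

```python
import tarfile


def reset_ownership(src_path, dst_path):
    """Copy a .tar.gz archive, setting every entry's owner to uid/gid 0/0.

    Hypothetical helper for illustration; not taken from the thread.
    """
    with tarfile.open(src_path, "r:gz") as src, tarfile.open(dst_path, "w:gz") as dst:
        for member in src:
            # Overwrite the numeric and symbolic owner recorded in the header.
            member.uid = 0
            member.gid = 0
            member.uname = "root"
            member.gname = "root"
            # Only regular files carry a data stream; directories and links do not.
            fileobj = src.extractfile(member) if member.isfile() else None
            dst.addfile(member, fileobj)
```

A call such as reset_ownership("spark-in.tar.gz", "spark-out.tar.gz") (both paths hypothetical) would write a new archive and leave the original untouched.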

Hadoop version(s) compatible with spark-2.4.3-bin-without-hadoop-scala-2.12

2019-05-20 Thread Michael Heuer
Hello, Which Hadoop version or versions are compatible with Spark 2.4.3 and Scala 2.12? The binary distribution spark-2.4.3-bin-without-hadoop-scala-2.12.tgz is missing avro-1.8.2.jar, so when attempting to run with Hadoop 2.7.7 there are classpath conflicts at runtime, as Hadoop 2.7.7 includes

Re: What's the root cause of not supporting multiple aggregations in structured streaming?

2019-05-20 Thread Arun Mahadevan
Here's the proposal for supporting it in "append" mode - https://github.com/apache/spark/pull/23576. You could see if it addresses your requirement and post your feedback in the PR. For "update" mode it's going to be much harder to support this without first adding support for "retractions", otherwis

Re: What's the root cause of not supporting multiple aggregations in structured streaming?

2019-05-20 Thread Gabor Somogyi
There is a PR for this, but it is not yet merged. On Mon, May 20, 2019 at 10:13 AM 张万新 wrote: > Hi there, > > I'd like to know what's the root reason why multiple aggregations on > streaming dataframe is not allowed since it's a very useful feature, and > flink has supported it for a long time. > > Thank

What's the root cause of not supporting multiple aggregations in structured streaming?

2019-05-20 Thread 张万新
Hi there, I'd like to know the root reason why multiple aggregations on a streaming DataFrame are not allowed, since it's a very useful feature and Flink has supported it for a long time. Thanks.