Please vote on releasing the following candidate as Apache Spark version
3.0.2.
The vote is open until February 19th 9AM (PST) and passes if a majority +1
PMC votes are cast, with a minimum of 3 +1 votes.
[ ] +1 Release this package as Apache Spark 3.0.2
[ ] -1 Do not release this package because
Hi,
Thanks to Ryan and Wenchen for leading this.
I’d like to add my two cents here. In production environments, the function
catalog might be used by multiple systems, such as Spark, Presto, and Hive. Is
it possible that this function catalog is designed as a unified function
catalog in
Thanks for the input, Hyukjin!
I have kept to my own policy across all the discussions I have raised: I
provide a hypothetical example close to the actual one and avoid pointing
at it directly. The main purpose of the discussion is to ensure our
policy / consensus makes sense, no more. I c
I remember I raised a similar issue a long time ago in the dev mailing
list. I agree that setting no assignee makes sense in most cases,
and also think we share similar thoughts about the assignee on
umbrella JIRAs, followup tasks, the case when it's clear with a design doc,
etc.
It makes me
Thanks for the positive feedback, everyone. It sounds like there is a clear
path forward for calling functions. Even without a prototype, the `invoke`
plans show that Wenchen's suggested optimization can be done, and
incorporating it as an optional extension to this proposal solves many of
the unknowns
Hi,
There is a good reason why the decision about caching is left to the user:
Spark does not know the future of the DataFrames and RDDs.
Think about how your program runs (you are still the one running the program):
at any moment there is an exact point where the execution is, and when Spark
reaches an action
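The point above is that Spark builds transformations lazily and only executes the plan when an action is reached, so only the user knows whether an intermediate result will be needed again. Here is a minimal pure-Python analogy (a sketch, not the Spark API — class and method names are invented for illustration) of lazy evaluation with a user-declared cache:

```python
# Hypothetical mini "lazy dataset": the engine only sees the plan, never the
# future of the program, so it cannot decide on its own whether an
# intermediate result is worth keeping — the user must say so explicitly.

class LazyData:
    def __init__(self, compute):
        self._compute = compute       # deferred plan, like an RDD lineage
        self._cached = None
        self._cache_enabled = False
        self.compute_count = 0        # how many times this plan actually ran

    def map(self, f):
        # Transformation: builds a new plan, executes nothing yet.
        return LazyData(lambda: [f(x) for x in self._collect()])

    def cache(self):
        # The *user* declares that this result will be reused.
        self._cache_enabled = True
        return self

    def _collect(self):
        if self._cache_enabled and self._cached is not None:
            return self._cached
        self.compute_count += 1
        result = self._compute()
        if self._cache_enabled:
            self._cached = result
        return result

    def collect(self):
        # Action: only here does execution actually happen.
        return self._collect()


base = LazyData(lambda: [1, 2, 3]).cache()
doubled = base.map(lambda x: x * 2)
tripled = base.map(lambda x: x * 3)

doubled.collect()   # first action: base's plan runs once and is cached
tripled.collect()   # second action: base's cached result is reused
```

Without the `cache()` call, `base` would be recomputed for every downstream action, which mirrors why Spark leaves `cache()`/`persist()` as an explicit user decision rather than guessing.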