Hi Team,
Before developing a new data source for Spark 3.x, I wanted to ask: should it be done using DataSource V2 or DataSource V1 (via Relation), or is there some other plan?
When I tried to build a data source using V2 for Spark 3.0, I could not
find the associated classes, and they seem to be m
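For anyone hitting the same wall: in Spark 3.0 the DSv2 interfaces moved from `org.apache.spark.sql.sources.v2` (the 2.4 location) into the `org.apache.spark.sql.connector.*` packages, which is why the old classes can't be found. A minimal read-side skeleton under the 3.0 API might look like the sketch below — class names and the stubbed scan are illustrative, not a working source:

```scala
import java.util
import org.apache.spark.sql.connector.catalog.{SupportsRead, Table, TableCapability, TableProvider}
import org.apache.spark.sql.connector.expressions.Transform
import org.apache.spark.sql.connector.read.ScanBuilder
import org.apache.spark.sql.types.StructType
import org.apache.spark.sql.util.CaseInsensitiveStringMap

// Entry point: looked up via the fully qualified name passed to
// spark.read.format(...), or registered through META-INF/services.
class ExampleSource extends TableProvider {
  override def inferSchema(options: CaseInsensitiveStringMap): StructType =
    new StructType().add("value", "string")

  override def getTable(schema: StructType,
                        partitioning: Array[Transform],
                        properties: util.Map[String, String]): Table =
    new ExampleTable(schema)
}

class ExampleTable(tableSchema: StructType) extends Table with SupportsRead {
  override def name(): String = "example"
  override def schema(): StructType = tableSchema
  override def capabilities(): util.Set[TableCapability] =
    util.EnumSet.of(TableCapability.BATCH_READ)
  override def newScanBuilder(options: CaseInsensitiveStringMap): ScanBuilder =
    ??? // return a ScanBuilder whose Scan/Batch supply a PartitionReaderFactory
}
```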
I'll admit, I didn't know you could deploy multiple workers per
machine. I agree, I don't see the use case for it; multiple executors,
yes, of course. And I guess you could imagine multiple distinct Spark
clusters each running a worker on one machine. I don't have an informed
opinion on it, therefore, but agree
Hi all,
Based on my experience, there is no scenario that necessarily requires
deploying multiple Workers on the same node with the Standalone backend. A
Worker should claim all the resources reserved for Spark on the host where it is
launched; it can then allocate those resources to one or more executors
lau
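For reference, the standalone launch scripts do support multiple Workers per node via `SPARK_WORKER_INSTANCES` in `conf/spark-env.sh`, typically paired with per-Worker core and memory caps so each instance does not claim the whole host. The values below are illustrative:

```shell
# conf/spark-env.sh -- illustrative values, not a recommendation
SPARK_WORKER_INSTANCES=2   # start two Workers on this node
SPARK_WORKER_CORES=8       # cores each Worker may hand out to executors
SPARK_WORKER_MEMORY=16g    # memory each Worker may hand out to executors
```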
On Fri, Feb 28, 2020 at 12:03 PM Holden Karau wrote:
>> 1. Could you estimate how many revert commits are required in
>> `branch-3.0` for the new rubric?
A fair question about what actual change this implies for 3.0. So far it
seems like some targeted, quite reasonable reverts. I don't think
anyon
On Fri, Feb 28, 2020 at 9:48 AM Dongjoon Hyun
wrote:
> Hi, Matei and Michael.
>
> I'm also a big supporter of policy-based project management.
>
> Before going further,
>
> 1. Could you estimate how many revert commits are required in
> `branch-3.0` for the new rubric?
> 2. Are you going to
Hi, Matei and Michael.
I'm also a big supporter of policy-based project management.
Before going further,
1. Could you estimate how many revert commits are required in
`branch-3.0` for the new rubric?
2. Are you going to revert all removed test cases for the deprecated
ones?
3. Does it
No, I couldn't see that button; it looks like the process of syncing in gitbox
hadn't finished for my accounts. I finished that and it's working now.
Thanks,
Tom
On Friday, February 28, 2020, 09:39:12 AM CST, Dongjoon Hyun
wrote:
Hi, Thomas.
If you log in with a GitHub account registered Ap
Hi, Thomas.
If you log in with a GitHub account registered as an Apache project member, that
will be enough.
On some Apache Spark PRs, can you see the 'Squash and merge' button?
Bests,
Dongjoon
On Fri, Feb 28, 2020 at 07:15 Thomas graves wrote:
> Does anyone know how the GitHub action permissions are
Does anyone know how the GitHub action permissions are setup?
I see a lot of random failures and want to be able to rerun them, but
I don't seem to have a "rerun" button like some folks do.
Thanks,
Tom
Hi,
I understand that we forbid specifying "principal" and "proxy user" at the
same time, because the current logic would just stage the keytab, and the
proxy user could then use it to gain full access, circumventing any
security.
But we have a use case for Livy where a different semantic would be g
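For context, this is the combination that spark-submit rejects today: keytab-based login together with impersonation. The flags below are real spark-submit options; the principal, keytab path, and user names are illustrative:

```shell
# Currently disallowed: --principal/--keytab combined with --proxy-user,
# since the staged keytab would let the proxied user assume the full identity.
spark-submit \
  --principal livy/server@EXAMPLE.COM \
  --keytab /etc/security/keytabs/livy.keytab \
  --proxy-user alice \
  --class com.example.App app.jar
```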
Why do you have two watermarks? Once you apply the watermark to a column
(i.e., "time"), it can be used in all later operations as long as the
column is preserved. So the above code should be equivalent to
df.withWatermark("time", "window size").dropDuplicates("id").groupBy(window("time", "window siz
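Spelled out, the single-watermark version of that pipeline would read roughly as below; the "10 minutes" delay/window and the column names are placeholders, not values from the thread. The point is that one event-time watermark on "time" both bounds how late data may arrive for deduplication and lets the downstream windows finalize:

```scala
import org.apache.spark.sql.functions.{col, window}

// One watermark on "time" governs both operators downstream.
val result = df
  .withWatermark("time", "10 minutes")        // single event-time watermark
  .dropDuplicates("id")                       // dedup; the watermark bounds late arrivals
  .groupBy(window(col("time"), "10 minutes")) // windows closed by the same watermark
  .count()
```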