Re: [VOTE] SPIP: Stored Procedures API for Catalogs

2024-05-13 Thread Mich Talebzadeh
+0

For reasons I outlined in the discussion thread

https://lists.apache.org/thread/7r04pz544c9qs3gc8q2nyj3fpzfnv8oo

Mich Talebzadeh,
Technologist | Architect | Data Engineer  | Generative AI | FinCrime
London
United Kingdom

https://en.everybodywiki.com/Mich_Talebzadeh

*Disclaimer:* The information provided is correct to the best of my
knowledge but of course cannot be guaranteed. It is essential to note
that, as with any advice, quote "one test result is worth one-thousand
expert opinions" (Wernher von Braun).


On Mon, 13 May 2024 at 08:24, Wenchen Fan wrote:

> +1
>
> On Mon, May 13, 2024 at 10:30 AM Kent Yao wrote:
>
>> +1
>>
>> On Mon, May 13, 2024 at 08:39, Dongjoon Hyun wrote:
>> >
>> > +1
>> >
>> > On Sun, May 12, 2024 at 3:50 PM huaxin gao wrote:
>> >>
>> >> +1
>> >>
>> >> On Sat, May 11, 2024 at 4:35 PM L. C. Hsieh wrote:
>> >>>
>> >>> +1
>> >>>
>> >>> On Sat, May 11, 2024 at 3:11 PM Chao Sun wrote:
>> >>> >
>> >>> > +1
>> >>> >
>> >>> > On Sat, May 11, 2024 at 2:10 PM L. C. Hsieh wrote:
>> >>> >>
>> >>> >> Hi all,
>> >>> >>
>> >>> >> I’d like to start a vote for SPIP: Stored Procedures API for Catalogs.
>> >>> >>
>> >>> >> Please also refer to:
>> >>> >>
>> >>> >> - Discussion thread: https://lists.apache.org/thread/7r04pz544c9qs3gc8q2nyj3fpzfnv8oo
>> >>> >> - JIRA ticket: https://issues.apache.org/jira/browse/SPARK-44167
>> >>> >> - SPIP doc: https://docs.google.com/document/d/1rDcggNl9YNcBECsfgPcoOecHXYZOu29QYFrloo2lPBg/
>> >>> >>
>> >>> >>
>> >>> >> Please vote on the SPIP for the next 72 hours:
>> >>> >>
>> >>> >> [ ] +1: Accept the proposal as an official SPIP
>> >>> >> [ ] +0
>> >>> >> [ ] -1: I don’t think this is a good idea because …
>> >>> >>
>> >>> >>
>> >>> >> Thank you!
>> >>> >>
>> >>> >> Liang-Chi Hsieh
>> >>> >>
>> >>> >>
>> >>> >> ---------------------------------------------------------------------
>> >>> >> To unsubscribe e-mail: dev-unsubscr...@spark.apache.org
>> >>> >>


Re: [DISCUSS] Spark - How to improve our release processes

2024-05-13 Thread Wenchen Fan
Hi Nicholas,

Thanks for your help! I'm definitely interested in participating in this
unification work. Let me know how I can help.

Wenchen

On Mon, May 13, 2024 at 1:41 PM Nicholas Chammas wrote:

> Re: unification
>
> We also have a long-standing problem with how we manage Python
> dependencies, something I’ve tried (unsuccessfully) to fix in the past.
>
> Consider, for example, in how many separate places the numpy dependency
> is installed:
>
> 1. https://github.com/apache/spark/blob/9a2818820f11f9bdcc042f4ab80850918911c68c/.github/workflows/build_and_test.yml#L277
> 2. https://github.com/apache/spark/blob/9a2818820f11f9bdcc042f4ab80850918911c68c/.github/workflows/build_and_test.yml#L733
> 3. https://github.com/apache/spark/blob/9a2818820f11f9bdcc042f4ab80850918911c68c/.github/workflows/build_and_test.yml#L853
> 4. https://github.com/apache/spark/blob/9a2818820f11f9bdcc042f4ab80850918911c68c/.github/workflows/build_and_test.yml#L871
> 5. https://github.com/apache/spark/blob/8094535973f19e9f0543535a97254e8ebffc1b23/.github/workflows/build_python_connect35.yml#L70
> 6. https://github.com/apache/spark/blob/553e1b85c42a60c082d33f7b9df53b0495893286/.github/workflows/maven_test.yml#L181
> 7. https://github.com/apache/spark/blob/6e5d1db9058de62a45f35d3f41e028a72f688b70/dev/requirements.txt#L5
> 8. https://github.com/apache/spark/blob/678aeb7ef7086bd962df7ac6d1c5f39151a0515b/dev/run-pip-tests#L90
> 9. https://github.com/apache/spark/blob/678aeb7ef7086bd962df7ac6d1c5f39151a0515b/dev/run-pip-tests#L99
> 10. https://github.com/apache/spark/blob/9a2818820f11f9bdcc042f4ab80850918911c68c/dev/create-release/spark-rm/Dockerfile#L40
> 11. https://github.com/apache/spark/blob/9a42610d5ad8ae0ded92fb68c7617861cfe975e1/dev/infra/Dockerfile#L89
> 12. https://github.com/apache/spark/blob/9a42610d5ad8ae0ded92fb68c7617861cfe975e1/dev/infra/Dockerfile#L92
>
> None of those installations reference a unified version requirement, so
> naturally they are inconsistent across all these different lines. Some say
> `>=1.21`, others say `>=1.20.0`, and still others say `==1.20.3`. In
> several cases there is no version requirement specified at all.
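
The kind of drift described here (`>=1.21`, `>=1.20.0`, `==1.20.3`, or no
pin at all) could be surfaced with a small audit script. What follows is
only a rough sketch, not anything that exists in the Spark repo: the
function name `find_numpy_specs`, the `<unpinned>` label, and the regex are
all made up for illustration, and the regex only handles simple
`numpy<op><version>` forms.

```python
import re
from collections import defaultdict

# Hypothetical pattern: the bare word "numpy", optionally followed by a simple
# version specifier such as ">=1.21" or "==1.20.3". Names like "numpy-stubs"
# or "mynumpy" are deliberately not matched.
NUMPY_SPEC = re.compile(
    r"(?<![\w-])numpy(?![\w-])\s*((?:==|>=|<=|~=|!=|>|<)\s*[\w.*]+)?"
)


def find_numpy_specs(files):
    """Group numpy requirements by version specifier.

    `files` maps a path to that file's text. The result maps each specifier
    (or "<unpinned>" when none is given) to a list of (path, line_number).
    """
    found = defaultdict(list)
    for path, text in files.items():
        for line_no, line in enumerate(text.splitlines(), start=1):
            for match in NUMPY_SPEC.finditer(line):
                spec = match.group(1)
                # Normalize "numpy >= 1.21" and "numpy>=1.21" to the same key.
                key = spec.replace(" ", "") if spec else "<unpinned>"
                found[key].append((path, line_no))
    return dict(found)
```

Fed the text of the workflow files, Dockerfiles, and `dev/requirements.txt`
(for example, read with `pathlib.Path.read_text`), a result with more than
one key would mean the pins disagree somewhere.
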
>
> I’m interested in trying again to fix this problem, but it needs to be in
> collaboration with a committer since I cannot fully test the release
> scripts. (This testing gap is what doomed my last attempt at fixing this
> problem.)
>
> Nick
>
>
> On May 13, 2024, at 12:18 AM, Wenchen Fan wrote:
>
> After finishing the 4.0.0-preview1 RC1, I have more experience with this
> topic now.
>
> In fact, the main jobs of the release process, building packages and
> documents, are already tested in GitHub Actions jobs. However, the way we
> test them differs from what the release scripts do.
>
> 1. The execution environment is different:
> The release scripts define the execution environment with this Dockerfile:
> https://github.com/apache/spark/blob/master/dev/create-release/spark-rm/Dockerfile
> However, Github Action jobs use a different Dockerfile:
> https://github.com/apache/spark/blob/master/dev/infra/Dockerfile
> We should figure out a way to unify it. The docker image for the release
> process needs to set up more things so it may not be viable to use a single
> Dockerfile for both.
>
> 2. The execution code is different. Take building documents as an example:
> The release scripts:
> https://github.com/apache/spark/blob/master/dev/create-release/release-build.sh#L404-L411
> The Github Action job:
> https://github.com/apache/spark/blob/master/.github/workflows/build_and_test.yml#L883-L895
> I don't know which one is more correct, but we should definitely unify
> them.
>
> It's better if we can run the release scripts as Github Action jobs, but I
> think it's more important to do the unification now.
>
> Thanks,
> Wenchen
>
>
> On Fri, May 10, 2024 at 12:34 AM Hussein Awala wrote:
>
>> Hello,
>>
>> I can answer some of your questions that are common to other Apache projects.
>>
>> > Who currently has permissions for Github actions? Is there a specific
>> > owner for that today or a different volunteer each time?
>>
>> The Apache organization owns Github Actions, and committers (contributors
>> with write permissions) can retrigger/cancel a Github Actions workflow, but
>> Github Actions runners are managed by the Apache infra team.
>>
>> > What are the current limits of GitHub Actions, who set them - and what
>> > is the process to change those (if possible at all, but I presume not all
>> > Apache projects have the same limits)?
>>
>> For limits, I don't think there is any significant limit, especially
>> since the Apache organization has 900 donated runners used by its projects,
>> and there is an initiative from the Infra team to add self-hosted runners
>> running on Kubernetes (document).
>>
>> > Where should the artifacts be stored

Re: [VOTE] SPIP: Stored Procedures API for Catalogs

2024-05-13 Thread Ryan Blue
+1


-- 
Ryan Blue
Tabular


Re: [VOTE] SPIP: Stored Procedures API for Catalogs

2024-05-13 Thread Anton Okolnychyi
+1



Re: [VOTE] SPIP: Stored Procedures API for Catalogs

2024-05-13 Thread Zhou Jiang
+1 (non-binding)


-- 
*Zhou JIANG*


Re: [VOTE] SPIP: Stored Procedures API for Catalogs

2024-05-13 Thread Gengliang Wang
+1


Re: [VOTE] SPIP: Stored Procedures API for Catalogs

2024-05-13 Thread Wenchen Fan
+1


Re: [VOTE] SPIP: Stored Procedures API for Catalogs

2024-05-13 Thread Xiao Li
+1
