Re: Vote on Dynamic resource allocation for structured streaming [SPARK-24815]

2024-02-22 Thread Mich Talebzadeh
I can see it was closed. Was it because of inactivity?


Mich Talebzadeh,
Dad | Technologist | Solutions Architect | Engineer
London
United Kingdom


https://en.everybodywiki.com/Mich_Talebzadeh


*Disclaimer:* The information provided is correct to the best of my
knowledge but of course cannot be guaranteed. It is essential to note
that, as with any advice, "one test result is worth one-thousand
expert opinions" (Wernher von Braun).


On Thu, 22 Feb 2024 at 06:58, Pavan Kotikalapudi wrote:

> Hi Spark PMC members,
>
> I think we have a few upvotes for this effort here, and more people are
> showing interest (see the PR comments).
>
> Is anyone interested in mentoring and reviewing this effort?
>
> Also, can the repository admin/owner re-open the PR? (I guess only people
> with admin access to the repository can do that.)
>
> Thank you,
>
> Pavan
>
> On Tue, Feb 20, 2024 at 2:08 PM Krystal Mitchell wrote:
>
>> +1
>>
>> On 2024/01/17 17:49:32 Pavan Kotikalapudi wrote:
>> > Thanks for proposing and voting for the feature, Mich.
>> >
>> > Adding some references to the thread:
>> >
>> >    - Jira ticket - SPARK-24815
>> >    - Design Doc
>> >      <https://docs.google.com/document/d/1_YmfCsQQb9XhRdKh0ijbc-j8JKGtGBxYsk_30NVSTWo/edit?usp=sharing>
>> >    - discussion thread
>> >    - PR with initial implementation -
>> >      https://github.com/apache/spark/pull/42352
>> >
>> > Please vote with:
>> >
>> > [ ] +1: Accept the proposal and start with the development.
>> > [ ] +0
>> > [ ] -1: I don’t think this is a good idea because …
>> >
>> > Thank you,
>> >
>> > Pavan
>> >
>> > On Wed, Jan 17, 2024 at 9:52 PM Mich Talebzadeh wrote:
>> >
>> > >
>> > > +1 for me (non binding)
>> > >
>> > > *Disclaimer:* Use it at your own risk. Any and all responsibility for any
>> > > loss, damage or destruction of data or any other property which may arise
>> > > from relying on this email's technical content is explicitly disclaimed.
>> > > The author will in no case be liable for any monetary damages arising from
>> > > such loss, damage or destruction.
>> > >
>> > >
>> > >
>> >
>>
>


Re: Vote on Dynamic resource allocation for structured streaming [SPARK-24815]

2024-02-22 Thread Pavan Kotikalapudi
Yes. The PR was closed due to inactivity by GitHub Actions.

The message also says:

> If you'd like to revive this PR, please reopen it and ask a committer to
> remove the Stale tag!

On Thu, Feb 22, 2024 at 1:09 AM Mich Talebzadeh wrote:

> I can see it was closed. Was it because of inactivity?


Re: Vote on Dynamic resource allocation for structured streaming [SPARK-24815]

2024-02-22 Thread Mich Talebzadeh
Hi Pavan,

Do you have a list of votes for this feature by any chance? Does it pass
the required condition as approved?

HTH

On Thu, 22 Feb 2024 at 10:04, Pavan Kotikalapudi wrote:

> Yes. The PR was closed due to inactivity by GitHub Actions.
>
> The message also says:
>
> > If you'd like to revive this PR, please reopen it and ask a committer to
> > remove the Stale tag!

Re: Generating config docs automatically

2024-02-22 Thread Nicholas Chammas
Thank you, Holden!

Yes, having everything live in the ConfigEntry is attractive.

The main reason I proposed an alternative where the groups are defined in YAML 
is that if the config groups are defined in ConfigEntry, then altering the 
groupings – which is relevant only to the display of config documentation – 
requires rebuilding Spark. This feels a bit off to me in terms of design.

For example, on the SQL performance tuning page there is some narrative
documentation about caching, plus a table of relevant configs. If I want an
additional config to show up in this table, I need to add it to the config
group that backs the table.

With the ConfigEntry approach in #44755, that means editing the appropriate
ConfigEntry and rebuilding Spark before I can regenerate the config table.

val SOME_CONFIG = buildConf("spark.sql.someCachingRelatedConfig")
  .doc("some documentation")
  .version("2.1.0")
  .withDocumentationGroup("sql-tuning-caching-data")  // assign group to the config

With the YAML approach in #44756, that means editing the config group defined
in the YAML file and regenerating the config table. No Spark rebuild required.

sql-tuning-caching-data:
  - spark.sql.inMemoryColumnarStorage.compressed
  - spark.sql.inMemoryColumnarStorage.batchSize
  - spark.sql.someCachingRelatedConfig  # add config to the group

In both cases the config names, descriptions, defaults, etc. will be pulled
from the ConfigEntry when building the HTML tables.
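
To make the table-building step concrete, here is a minimal Python sketch of
how a config group could be joined with per-config metadata to produce an HTML
table. It is an illustration only, not the code in #44755 or #44756. The names
CONFIG_GROUPS, CONFIG_ENTRIES and render_group_table are hypothetical, the
group mapping is written as a dict rather than parsed from a YAML file, and the
metadata values are placeholders rather than Spark's real defaults or docs.

```python
from html import escape

# Hypothetical stand-in for the parsed YAML file: group name -> config names.
CONFIG_GROUPS = {
    "sql-tuning-caching-data": [
        "spark.sql.inMemoryColumnarStorage.compressed",
        "spark.sql.inMemoryColumnarStorage.batchSize",
    ],
}

# Hypothetical stand-in for metadata pulled from the ConfigEntry instances.
# All values below are placeholders, not Spark's actual defaults or docs.
CONFIG_ENTRIES = {
    "spark.sql.inMemoryColumnarStorage.compressed": {
        "default": "placeholder default",
        "doc": "placeholder description",
        "version": "x.y.z",
    },
    "spark.sql.inMemoryColumnarStorage.batchSize": {
        "default": "placeholder default",
        "doc": "placeholder description",
        "version": "x.y.z",
    },
}


def render_group_table(group, groups=CONFIG_GROUPS, entries=CONFIG_ENTRIES):
    """Render one config group as an HTML table string.

    The group only lists config names; every displayed field comes from
    the per-config metadata, so the group file never duplicates docs.
    """
    header = (
        "<tr><th>Property Name</th><th>Default</th>"
        "<th>Meaning</th><th>Since Version</th></tr>"
    )
    rows = [header]
    for name in groups[group]:
        # A KeyError here flags a group that names a config with no metadata.
        entry = entries[name]
        rows.append(
            "<tr><td><code>{}</code></td><td>{}</td><td>{}</td><td>{}</td></tr>".format(
                escape(name),
                escape(entry["default"]),
                escape(entry["doc"]),
                escape(entry["version"]),
            )
        )
    return "<table>\n" + "\n".join(rows) + "\n</table>"
```

Calling render_group_table("sql-tuning-caching-data") returns the table markup
for that group. In this sketch, adding a config to the table means adding one
name to the group list; the description and default stay in a single place.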

I prefer the latter approach but I’m open to whatever committers are more 
comfortable with. If you prefer the former, then I’ll focus on that and ping 
you for reviews accordingly!


> On Feb 21, 2024, at 11:43 AM, Holden Karau wrote:
> 
> I think this is a good idea. I like having everything in one source of truth 
> rather than two (so option 1 sounds like a good idea); but that’s just my 
> opinion. I'd be happy to help with reviews though.
> 
> On Wed, Feb 21, 2024 at 6:37 AM Nicholas Chammas wrote:
>> I know config documentation is not the most exciting thing. If there is 
>> anything I can do to make this as easy as possible for a committer to 
>> shepherd, I’m all ears!
>> 
>> 
>>> On Feb 14, 2024, at 8:53 PM, Nicholas Chammas wrote:
>>> 
>>> I’m interested in automating our config documentation and need input from a 
>>> committer who is interested in shepherding this work.
>>> 
>>> We have around 60 tables of configs across our documentation. Here’s a 
>>> typical example.
>>> 
>>> These tables span several thousand lines of manually maintained HTML, which 
>>> poses a few problems:
>>> - The documentation for a given config is sometimes out of sync across the 
>>>   HTML table and its source `ConfigEntry`.
>>> - Internal configs that are not supposed to be documented publicly sometimes 
>>>   are.
>>> - Many config names and defaults are extremely long, posing formatting 
>>>   problems.
>>> 
>>> Contributors waste time dealing with these issues in a losing battle to 
>>> keep everything up-to-date and consistent.
>>> 
>>> I’d like to solve all these problems by generating HTML tables 
>>> automatically from the `ConfigEntry` instances where the configs are 
>>> defined.
>>> 
>>> I’ve proposed two alternative solutions:
>>> 1. #44755: Enhance `ConfigEntry` so a config can be associated with one or 
>>>    more groups, and use that new metadata to generate the tables we need.
>>> 2. #44756: Add a standalone YAML file where we define config groups, and use 
>>>    that to generate the tables we need.
>>> 
>>> If you’re a committer and are interested in this problem, please chime in 
>>> on whatever approach appeals to you. If you think this is a bad idea, I’m 
>>> also eager to hear your feedback.
>>> 
>>> Nick
>>> 
> 
> 



Vote on Dynamic resource allocation for structured streaming [SPARK-24815]

2024-02-22 Thread Sona Torosyan
+1


Re: Vote on Dynamic resource allocation for structured streaming [SPARK-24815]

2024-02-22 Thread Pavan Kotikalapudi
Hi Mich,

We have five +1s so far:

   - Mich Talebzadeh
   - Adam Hobbs
   - Pavan Kotikalapudi
   - Krystal Mitchell
   - Sona Torosyan
   - (a few more in the GitHub PR)

+0: None

-1: None

> Does it pass the required condition as approved?

I'm not sure about that; nothing about the minimum required was mentioned in
the past emails.

I would ask Spark PMC members, or anyone else who has done this in the past,
to help us understand the process better.

Thank you,

Pavan

On Thu, Feb 22, 2024 at 3:20 AM Mich Talebzadeh wrote:

> Hi Pavan,
>
> Do you have a list of votes for this feature by any chance? Does it pass
> the required condition as approved?
>
> HTH

Re: Vote on Dynamic resource allocation for structured streaming [SPARK-24815]

2024-02-22 Thread Mich Talebzadeh
Hi,

Please check this doc:

Spark Project Improvement Proposals (SPIP) | Apache Spark

and specifically the extract below:

Discussing an SPIP

All discussion of an SPIP should take place in a public forum, preferably
the discussion attached to the Jira. Any discussions that happen offline
should be made available online for the public via meeting notes
summarizing the discussions. (done)

During this discussion, one or more shepherds should be identified among
PMC members. (outstanding)

Once the discussion settles, the shepherd(s) should call for a vote on the
SPIP moving forward on the dev@ list. The vote should be open for at least
72 hours and follows the typical Apache vote process and passes upon
consensus (at least 3 +1 votes from PMC members and no -1 votes from PMC
members). dev@ should be notified of the vote result.

If there does not exist at least one PMC member that is committed to
shepherding the change within a month, the SPIP is rejected.

If a committer does not think a SPIP aligns with long-term project goals,
or is not practical at the point of proposal, the committer should -1 the
SPIP explicitly and give technical justifications.

OK, a shepherd from among the PMC members is required. Maybe Jungtaek Lee can
kindly help with the process.
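
As a rough illustration of the passing condition in the extract above (open for
at least 72 hours, at least 3 +1 votes from PMC members, no -1 votes from PMC
members), here is a small Python sketch. It only encodes the rule as quoted; the
Vote type, the spip_vote_passes function and the voter names are hypothetical,
and it says nothing about who in this thread is a PMC member.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Thresholds taken from the SPIP extract quoted above.
MIN_PMC_PLUS_ONES = 3
MIN_VOTE_WINDOW = timedelta(hours=72)


@dataclass(frozen=True)
class Vote:
    voter: str           # hypothetical identifier, for illustration only
    value: int           # +1, 0 or -1
    is_pmc_member: bool  # only PMC votes count toward the passing condition


def spip_vote_passes(votes, opened_at, closed_at):
    """Return True if the vote meets the quoted passing condition."""
    # The vote must have stayed open for at least 72 hours.
    if closed_at - opened_at < MIN_VOTE_WINDOW:
        return False
    # Votes from non-PMC members are ignored for the passing condition.
    pmc_votes = [v for v in votes if v.is_pmc_member]
    plus_ones = sum(1 for v in pmc_votes if v.value == 1)
    minus_ones = sum(1 for v in pmc_votes if v.value == -1)
    return plus_ones >= MIN_PMC_PLUS_ONES and minus_ones == 0


if __name__ == "__main__":
    opened = datetime(2024, 1, 1)
    closed = opened + timedelta(hours=72)
    non_binding = [Vote(f"contributor-{i}", 1, False) for i in range(5)]
    print(spip_vote_passes(non_binding, opened, closed))
```

Under this reading, five +1 votes that all come from non-PMC voters do not pass
on their own, which is why a PMC shepherd who calls the vote matters.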

cheers

On Thu, 22 Feb 2024 at 21:52, Pavan Kotikalapudi wrote:

> Hi Mich,
>
> We have five +1s so far:
>
>    - Mich Talebzadeh
>    - Adam Hobbs
>    - Pavan Kotikalapudi
>    - Krystal Mitchell
>    - Sona Torosyan
>    - (a few more in the GitHub PR)
>
> +0: None
>
> -1: None
>
> > Does it pass the required condition as approved?
>
> I'm not sure about that; nothing about the minimum required was mentioned in
> the past emails.
>
> I would ask Spark PMC members, or anyone else who has done this in the past,
> to help us understand the process better.
>
> Thank you,
>
> Pavan