+1, I have no issues with the practicality and value of this feature itself.
I've left some comments concerning ongoing maintenance and
compatibility-related matters, which we can continue to discuss.

Jungtaek Lim <kabhwan.opensou...@gmail.com> wrote on Tue, Oct 17, 2023 at 05:23:

> Thanks Bartosz and Anish for your support!
>
> I'll wait a couple more days to see whether we hear more voices on
> this. We could probably start a VOTE thread if there are no
> objections.
>
> On Tue, Oct 17, 2023 at 5:48 AM Anish Shrigondekar <
> anish.shrigonde...@databricks.com> wrote:
>
>> Hi Jungtaek,
>>
>> Thanks for putting this together. +1 from me and looks good overall.
>> Posted some minor comments/questions to the doc.
>>
>> Thanks,
>> Anish
>>
>> On Mon, Oct 16, 2023 at 11:25 AM Bartosz Konieczny <
>> bartkoniec...@gmail.com> wrote:
>>
>>> Thank you, Jungtaek, for your answers! It's clear now.
>>>
>>> +1 from me. It seems like a prerequisite for further ops-related
>>> improvements to state store management. I'm thinking especially of
>>> state rebalancing, which could rely on this read+write state store API. I
>>> don't mean dynamic state rebalancing, which could probably be
>>> implemented with lower latency directly in the stateful API. Instead, I'm
>>> thinking of an offline job that rebalances the state, after which the
>>> stateful pipeline is restarted with the changed number of shuffle partitions.
>>>
>>> Best,
>>> Bartosz.
>>>
>>> On Mon, Oct 16, 2023 at 6:19 PM Jungtaek Lim <
>>> kabhwan.opensou...@gmail.com> wrote:
>>>
>>>> bump for better reach
>>>>
>>>> On Thu, Oct 12, 2023 at 4:26 PM Jungtaek Lim <
>>>> kabhwan.opensou...@gmail.com> wrote:
>>>>
>>>>> Sorry, please use this link instead for SPIP doc:
>>>>> https://docs.google.com/document/d/1_iVf_CIu2RZd3yWWF6KoRNlBiz5NbSIK0yThqG0EvPY/edit?usp=sharing
>>>>>
>>>>>
>>>>> On Thu, Oct 12, 2023 at 3:58 PM Jungtaek Lim <
>>>>> kabhwan.opensou...@gmail.com> wrote:
>>>>>
>>>>>> Hi dev,
>>>>>>
>>>>>> I'd like to start a discussion on "State Data Source - Reader".
>>>>>>
>>>>>> This proposal aims to introduce a new data source "statestore" which
>>>>>> enables reading the state rows from an existing checkpoint via an offline
>>>>>> (batch) query. This will enable users to 1) create unit tests against a
>>>>>> stateful query verifying the state value (especially
>>>>>> flatMapGroupsWithState), 2) gather more context on the status when an
>>>>>> incident occurs, especially for incorrect output.
>>>>>>
>>>>>> *SPIP*:
>>>>>> https://docs.google.com/document/d/1HjEupRv8TRFeULtJuxRq_tEG1Wq-9UNu-ctGgCYRke0/edit?usp=sharing
>>>>>> *JIRA*: https://issues.apache.org/jira/browse/SPARK-45511
>>>>>>
>>>>>> Looking forward to your feedback!
>>>>>>
>>>>>> Thanks,
>>>>>> Jungtaek Lim (HeartSaVioR)
>>>>>>
>>>>>> ps. The scope of the project is narrowed to the reader in this SPIP,
>>>>>> since the writer requires us to consider more cases. We are planning on it.
>>>>>>
>>>>>
>>>
>>> --
>>> Bartosz Konieczny
>>> freelance data engineer
>>> https://www.waitingforcode.com
>>> https://github.com/bartosz25/
>>> https://twitter.com/waitingforcode
>>>
>>>
