Hi Gordon,
The modify max parallelism API looks good to me. Thank you and Seth for the
great work on it.
Cheers,
Jark
On Tue, 25 Jun 2019 at 16:01, Tzu-Li (Gordon) Tai wrote:
> Hi Jark,
>
> Thanks for the reminder. I've updated the FLIP name in confluence to match
> the new name "State Processor API".
Hi Jark,
Thanks for the reminder. I've updated the FLIP name in confluence to match
the new name "State Processor API".
Concerning an API for changing max parallelism:
That is actually in the works and has been considered, and would look
something like -
```
ExistingSavepoint savepoint = Savepoin
```
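For reference, a fuller sketch of how that flow might look. This is pseudocode based on the API shapes discussed in this thread; `Savepoint.load` and `write` follow the FLIP draft, while `changeMaxParallelism` is an assumed name for the operation under discussion, not a finalized method:

```
// Load an existing savepoint (proposed State Processor API; treat as pseudocode)
ExistingSavepoint savepoint =
    Savepoint.load(env, "hdfs://path/to/old-savepoint", stateBackend);

// Hypothetical: derive a copy with a different max parallelism and
// write it out as a new savepoint
savepoint
    .changeMaxParallelism(256)              // assumed method name
    .write("hdfs://path/to/new-savepoint");
```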
Thanks for the awesome FLIP.
I think it will be very useful in state migration scenarios. We are also
looking for a state reuse solution for SQL jobs, and I think this feature
will help a lot.
Looking forward to having it in the near future.
Regarding the naming, I'm +1 for "State Processing API".
On Wed, Jun 5, 2019 at 6:39 AM Xiaowei Jiang wrote:
> Hi Gordon & Seth, this looks like a very useful feature for analyze and
> manage states.
> I agree that using DataSet is probably the most practical choice right
> now. But in the longer adding the TableAPI support for this will be nice.
Hi Gordon & Seth, this looks like a very useful feature for analyzing and
managing states.
I agree that using DataSet is probably the most practical choice right now. But
in the longer term, adding Table API support for this will be nice.
When analyzing the savepoint, I assume that the state backend r
+1 I think it is a very valuable new addition, and we should try not to get
stuck on trying to design the perfect solution for everything.
> On 4. Jun 2019, at 13:24, Tzu-Li (Gordon) Tai wrote:
>
> +1 to renaming it as State Processing API and adding it under the
> flink-libraries module.
+1 to renaming it as State Processing API and adding it under the
flink-libraries module.
I also think we can start with the development of the feature. From the
feedback so far, it seems like we're in a good spot to add in at least the
initial version of this API, hopefully making it ready for 1.9.
It seems like a recurring piece of feedback was a different name. I’d like to
propose moving the functionality to the libraries module and naming this the
State Processing API.
Seth
> On May 31, 2019, at 3:47 PM, Seth Wiesman wrote:
>
> The SavepointOutputFormat only writes out the savepoint metadata file and
> should be mostly ignored.
The SavepointOutputFormat only writes out the savepoint metadata file and
should be mostly ignored.
The actual state is written out by stream operators and tied into the flink
runtime[1, 2, 3].
This is the most important part and the piece that I don’t think can be
reasonably extracted.
Seth
Hi Seth,
yes, that helped! :-)
What I was looking for is essentially
`org.apache.flink.connectors.savepoint.output.SavepointOutputFormat`. It
would be great if this could be written in a way that would enable
reusing it in a different engine (as I mentioned, Apache Spark). There
seem to be s
@Jan Gotcha,
So in reusing components, it explicitly is not a writer. This is not a savepoint
output format in the way we have a Parquet output format. The reason for the
Transform API is to hide the underlying details; it does not simply append an
output writer to the end of a dataset. This get
> I think it’s best to keep this initial implementation focused and add those
> changes if there is adoption and interest in the community.
I agree. I didn’t mean to hold the implementation/acceptance of this until
someone solves the SQL story :)
Piotrek
> On 31 May 2019, at 13:18, Seth Wiesman wrote:
Hi Seth,
that sounds reasonable. What I was asking for was not to reverse
engineer the binary format, but to make the savepoint write API a little
more reusable, so that it could be wrapped into some other technology. I
don't know the details enough to propose a solution, but it seems to me,
that
And I can definitely imagine a “savepoint catalog” at some point in the future.
Seth
> On May 31, 2019, at 4:39 AM, Tzu-Li (Gordon) Tai wrote:
>
> @Piotr
> Yes, we're aiming this for the 1.9 release. This was also mentioned in the
> recent 1.9 feature discussion thread [1].
>
> [1]
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Features-for-Apache-Flink-1-9-0-td28701.html
@Konstantin agreed, that was a large impetus for this feature. Also I am
happy to change the name to something that better describes the feature set.
@Jan
Savepoints depend heavily on a number of Flink internal components that may
change between versions: state backend internals, type seria
@Piotr
I definitely would like to see this have SQL integrations at some point. The
reason for holding off is that doing so would require changes to the savepoint
format; it is not currently possible to discover states and schemas without
state descriptors in a robust way.
I think it’s best to keep this initial implementation focused and add those
changes if there is adoption and interest in the community.
Hi,
this is an awesome and really useful feature. If I might ask for one thing
to consider - would it be possible to make the Savepoint manipulation
API (at least writing the Savepoint) less dependent on other parts of
Flink internals (e.g. `KeyedStateBootstrapFunction`) and provide
something m
@Piotr
Yes, we're aiming this for the 1.9 release. This was also mentioned in the
recent 1.9 feature discussion thread [1].
[1]
http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Features-for-Apache-Flink-1-9-0-td28701.html
On Fri, May 31, 2019 at 4:34 PM Piotr Nowojski wrote:
I was long awaiting this feature! I can not help much with reviewing, but big
+1 from my side :)
One thing that would be great for analyzing the state, and possibly for smaller
modifications, would be to hook this in with Flink SQL :) Especially if it
could be done in a way that would work out of the box.
Hi Seth,
big +1, happy to see this moving forward :) I have seen plenty of users
who refrained from using managed state for some of their data/use cases due to
the lack of something like this. I am not sure about the name "Savepoint
Connector", but for a different reason. While it is technically a
"co
This is an awesome feature.
> The name "Savepoint Connector" might indeed be not that good, as it doesn't
> point out the fact that with the current design, all kinds of snapshots
> (savepoint / full or incremental checkpoints) can be read.
@Gordon can you add the above clarification to the FLIP page?
Hi,
@Gordon @Seth
Thanks a lot for your inputs! In general, I agree with you. The metadata
querying feature is a nice-to-have but not a must-have, and it’s reasonable to
make it a follow-up since it requires some extra work.
Best,
Paul Lam
> On 30 May 2019, at 19:22, Seth Wiesman wrote:
>
> @Paul
@Paul
I agree with Gordon that those are useful features. The only thing I’d like to
add is that I don’t believe listing operator ids will be useful to most users;
they want to see UIDs, which would also require changes to the Savepoint
metadata file. I think that would be a good follow up but
+1 from my side.
I think it will be a good feature.
Best
--
Louis
Email:xu_soft39211...@163.com
> On 30 May 2019, at 15:57, Tzu-Li (Gordon) Tai wrote:
>
> The name "Savepoint Connector" might indeed be not that good, as it doesn't
> point out the fact that with the current design, all kinds of snapshots
> (savepoint / full or incremental checkpoints) can be read.
The name "Savepoint Connector" might indeed be not that good, as it doesn't
point out the fact that with the current design, all kinds of snapshots
(savepoint / full or incremental checkpoints) can be read.
@Paul
That would be a very valid requirement. Querying the list of existing
operator ids sh
Hi Seth,
Sorry for the confusion. I mean currently we need to know the operator id,
state name and the state type (e.g. ListState, MapState) beforehand to get the
states. Is it possible that we can perform a scan to get all existing operator
ids or state names in the savepoint? It would be good to
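For context, the read path proposed in the FLIP looks roughly like the following pseudocode sketch, which illustrates why the operator id, state name, and state type all have to be known up front (class and method names follow the FLIP draft and are not a finalized API; `KeyedTotal` is a hypothetical user type):

```
// Sketch of reading keyed state from a savepoint (proposed API)
ExistingSavepoint savepoint =
    Savepoint.load(env, "hdfs://path/to/savepoint", stateBackend);

// The operator id ("my-uid") must be known to address the operator,
// and the reader function must declare the state name and type:
DataSet<KeyedTotal> totals = savepoint.readKeyedState("my-uid", new ReaderFunction());

class ReaderFunction extends KeyedStateReaderFunction<Integer, KeyedTotal> {
    ValueState<Integer> totalState;

    @Override
    public void open(Configuration parameters) {
        // Without this descriptor (state name + type), the state
        // cannot currently be discovered from the savepoint alone
        totalState = getRuntimeContext().getState(
            new ValueStateDescriptor<>("total", Types.INT));
    }

    @Override
    public void readKey(Integer key, Context ctx, Collector<KeyedTotal> out)
            throws Exception {
        out.collect(new KeyedTotal(key, totalState.value()));
    }
}
```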
Hi Paul,
I’m not following, could you provide an example of the kind of operation you're
describing?
Seth
> On May 29, 2019, at 7:37 PM, Paul Lam wrote:
>
> Hi Seth,
>
> +1 from my side.
>
> I was wondering if we can add a reader method to provide a full view of the
> states instead of the state of a specific operator? It would be helpful when
> there is some unrestored states of a previously removed operator in the
> savepoint.
Hi Seth,
Big +1 from my side. I like this idea. IMO, it’s better to choose another FLIP
name instead of ‘connector’, which is a little confusing.
> On 30 May 2019, at 10:37, Paul Lam wrote:
>
> Hi Seth,
>
> +1 from my side.
>
> I was wondering if we can add a reader method to provide a full view of the
> states instead of the state of a specific operator?
Hi Seth,
+1 from my side.
I was wondering if we can add a reader method to provide a full view of the
states instead of the state of a specific operator? It would be helpful when
there is some unrestored states of a previously removed operator in the
savepoint.
Best,
Paul Lam
> On 30 May 2019
Hi Seth,
Glad to see this FLIP, big +1 for this feature!
Best,
Vino
On Thu, 30 May 2019 at 7:14 AM, Seth Wiesman wrote:
> Hey Everyone!
>
> Gordon and I have been discussing adding a savepoint connector to flink
> for reading, writing and modifying savepoints.
>
> This is useful for:
>
> * Analyzing state for interesting patterns
> * Troubleshooting or auditing jobs by checking for discrepancies in state
> * Bootstrapping state for new jobs
Hey Everyone!
Gordon and I have been discussing adding a savepoint connector to flink for
reading, writing, and modifying savepoints.
This is useful for:
* Analyzing state for interesting patterns
* Troubleshooting or auditing jobs by checking for discrepancies in state
* Bootstrapping state for new jobs
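As a concrete illustration of the bootstrapping use case, a pseudocode sketch of the write path as proposed (`Savepoint.create`, `withOperator`, and `OperatorTransformation.bootstrapWith` follow the FLIP draft; `Account` and `AccountBootstrapper` are hypothetical user types):

```
// Sketch: bootstrap initial keyed state for a new job (proposed API)
DataSet<Account> accounts = env.fromCollection(initialAccounts);

BootstrapTransformation<Account> transformation = OperatorTransformation
    .bootstrapWith(accounts)
    .keyBy(acc -> acc.id)
    .transform(new AccountBootstrapper());   // a KeyedStateBootstrapFunction

Savepoint
    .create(stateBackend, 128)               // 128 = assumed max parallelism
    .withOperator("accounts-uid", transformation)
    .write("hdfs://path/to/new-savepoint");
```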