Re: partial savepoints/combining savepoints

Claudia Wegmann Mon, 01 Aug 2016 07:41:35 -0700

Hi Till,

thanks for the quick reply. Too bad, I thought I was on the right track with 
savepoints here.


Some follow-up questions:


1.)    Can I transfer the state and the position in the Kafka topic manually 
for one stream? In other words: is this information easily accessible?


2.)    In any case I would need to stop the running job, change the topology 
(e.g. the number of streams in the program) and resume processing. Can you 
estimate the time overhead of stopping and restarting a Flink job?


3.)    I’m aware of the upcoming feature for scaling in and out, but I don’t 
quite see how it will help me with different services.
I thought of each service having its own Flink instance/cluster. I would submit 
the service as one job, containing all the necessary streams and computations, 
to its dedicated Flink cluster. Is this a bad architecture?
Would it be better to have one big Flink cluster and submit one big job that 
contains all the streams? (As far as I have learned, submitting multiple jobs 
to one Flink instance is not recommended.)
To be honest, I don’t yet fully understand Flink’s different deployment options 
or how to combine them with a microservice architecture in which each service 
is packaged as a JAR file that I want to be able to simply deploy. I thought of 
having the service contain Flink, start the JobManager and some TaskManagers 
itself, and then deploy itself as the Flink job with a dedicated entry point. 
Is this a good idea? Is it even possible?

Thanks in advance,
Claudia

From: Till Rohrmann [mailto:[email protected]]
Sent: Monday, 1 August 2016 16:21
To: [email protected]
Subject: Re: partial savepoints/combining savepoints

Hi Claudia,

unfortunately neither taking partial savepoints nor combining multiple 
savepoints into one savepoint is currently supported by Flink.

However, we're currently working on dynamic scaling, which will allow you to 
adjust the parallelism of your Flink job. This helps you to scale in/out 
depending on the workload of your job. Note, though, that you would only be 
able to scale within a single Flink job and not across Flink jobs.

Cheers,
Till

On Mon, Aug 1, 2016 at 9:49 PM, Claudia Wegmann 
<[email protected]<mailto:[email protected]>> wrote:
Hey everyone,

I’ve got some questions regarding savepoints in Flink. I have the following 
situation:

There is a microservice that reads data from Kafka topics, creates Flink 
streams from this data and does different computations/pattern matching 
workloads. If the overall workload for this service becomes too big, I want to 
start a new instance of this service and share the work between the running 
services. To accomplish that, I thought about using Flink’s savepoint mechanism. 
But there are some open questions:


1.)    Can I combine two or more savepoints in one program?
Think of two services already running. Now I’m starting up a third service. The 
new one would get savepoints from the already running services. It would then 
continue the computation of some streams, while the other services would stop 
computing the streams now handled by the new service. So, is it possible to 
combine two or more savepoints in one program?

2.)    Another approach I could think of for introducing a new service would 
be to take a savepoint of just the streams that change service. Can I take a 
savepoint of only a part of the running job?

Thanks for your comments and best wishes,
Claudia
