Hello Robert,
Thanks for the info. That makes sense. I will take savepoints and cancel my jobs
on 1.10, upgrade to 1.11, and restore the jobs from the savepoints.
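Roughly, the sequence I have in mind (a sketch only; the savepoint directory, job id and jar name below are placeholders, not my actual values):

  # on the 1.10 cluster: trigger a savepoint and stop each job
  flink stop -p s3://<bucket>/savepoints <jobId>

  # after upgrading the cluster to 1.11: resume each job from its savepoint
  flink run -s s3://<bucket>/savepoints/savepoint-<...> my-job.jar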
Thanks and regards,
Averell
Hey Averell,
to clarify: You should be able to migrate using a savepoint from 1.10 to
1.11. Restoring from the state stored in Zookeeper (for HA) with a newer
Flink version won't work.
On Mon, Oct 26, 2020 at 5:05 PM Robert Metzger wrote:
> Hey Averell,
>
> you should be able to migrate savepoints from Flink 1.10 to 1.11.
Hey Averell,
you should be able to migrate savepoints from Flink 1.10 to 1.11.
Is there a simple way for me to reproduce this issue locally? This seems to
be a rare, but probably valid issue. Are you using any special operators?
(like the new source API?)
Best,
Robert
On Wed, Oct 21, 2020 at 11
Hello Roman,
Thanks for the answer.
I already have high-availability.storageDir configured to point to an S3
location. Our service is not critical enough to justify the extra cost, so we
are using the single-master EMR setup. I understand that we won't get YARN HA
in that case, but what I expect here is
Hello Averell,
I don't think ZK data is stored on a master node. And Flink JM data is
usually stored on a DFS, according to "high-availability.storageDir" [1].
In either case, for Flink to be HA, YARN should also be HA, and I think
this is not the case with a single master node. Please consider
mu
Hello Roman,
Thanks for your time.
I'm using EMR 5.30.1 (Flink 1.10.0) with 1 master node.
/yarn.application-attempts/ is not set (does that mean unlimited?), while
/yarn.resourcemanager.am.max-attempts/ is 4.
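For clarity, this is where I understand those two settings live (the values below are purely illustrative, not my real configuration):

  In flink-conf.yaml (Flink side):
    yarn.application-attempts: 4

  In yarn-site.xml (YARN/Hadoop side):
    <property>
      <name>yarn.resourcemanager.am.max-attempts</name>
      <value>4</value>
    </property>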
In saying "EMR cluster crashed) I meant the cluster is lost. Some scenarios
which cou
Hi,
Can you explain what "EMR cluster crashed" means in the 2nd scenario?
Can you also share:
- yarn.application-attempts in Flink
- yarn.resourcemanager.am.max-attempts in Yarn
- number of EMR master nodes (1 or 3)
- EMR version?
Regards,
Roman
On Mon, Oct 19, 2020 at 8:22 AM Averell wrote:
Hi,
I'm trying to enable HA for my Flink jobs running on AWS EMR.
Following [1], I created a common Flink YARN session and submitted all my
jobs to it. These 4 config params were added:
/high-availability = zookeeper
high-availability.storageDir =
high-availability.zookeeper.pa
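For completeness, a typical ZooKeeper HA block in flink-conf.yaml looks roughly like the sketch below (the quorum hosts, bucket and paths are placeholders, not my actual values):

  high-availability: zookeeper
  high-availability.storageDir: s3://<bucket>/flink/ha/
  high-availability.zookeeper.quorum: <zk-host1>:2181,<zk-host2>:2181
  high-availability.zookeeper.path.root: /flink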