java.lang.SecurityException: setContextClassLoader

2024-10-23 Thread Tony Chen
"Data Types & Serialization" for details of the effect on performance and schema evolution. I tried to create a security policy to allow setContextClassLoader, but that didn't work. Any ideas on how to fix this would be greatly appreciated. Thanks, -- <http://www.robinhood.com/>

Re: Increasing maximum number of FlinkDeployments that the Operator can handle

2023-11-08 Thread Tony Chen
Currently, 16GB of heap size is allocated to the flink-kubernetes-operator container by setting *jvmArgs.operator*, and this didn't help either. On Wed, Nov 8, 2023 at 5:56 PM Tony Chen wrote: > Hi Flink Community, > > This is a follow-up on a previous email thread (see emai

Increasing maximum number of FlinkDeployments that the Operator can handle

2023-11-08 Thread Tony Chen
ed in this message >> https://lists.apache.org/thread/0odcc9pvlpz1x9y2nop9dlmcnp9v1696 >> I tried changing versions and allocated resources, as well as the number >> of reconcile threads, but nothing helped >> >> -- >> *From:* Tony Chen

Re: flink-kubernetes-operator cannot handle SPECCHANGE for 100+ FlinkDeployments concurrently

2023-11-02 Thread Tony Chen
GRADING > | > {"type":"org.apache.flink.kubernetes.operator.exception.ReconciliationException","message":"org.apache.flink.client.deployment.ClusterDeploymentException: > The Flink cluster already > exists.","additionalMetadata":{},"thro

Fwd: flink-kubernetes-operator cannot handle SPECCHANGE for 100+ FlinkDeployments concurrently

2023-11-01 Thread Tony Chen
message":"org.apache.flink.client.deployment.ClusterDeploymentException: The Flink cluster already exists.","additionalMetadata":{},"throwableList":[{"type":"org.apache.flink.client.deployment.ClusterDeploymentException","message":"The Flink clust

Re: Issue with flink-kubernetes-operator not updating execution.savepoint.path after savepoint deletion

2023-10-21 Thread Tony Chen
scope.jm: >>> flink.jobmanager.metric >>> metrics.scope.jm.job: >>> flink.jobmanager.job..metric >>> metrics.scope.operator: >>> flink.taskmanager.job..operator..metric >>> metrics.scope.task: >>> flink.taskmanager.job..task..metr

Re: Flink Operator 1.6 causes JobManagerDeploymentStatus: MISSING

2023-10-18 Thread Tony Chen
I did see another email thread that mentions instructions on getting the image from this link: https://github.com/apache/flink-kubernetes-operator/pkgs/container/flink-kubernetes-operator/127962962?tag=3f0dc2e On Wed, Oct 18, 2023 at 6:25 PM Tony Chen wrote: > We're using the Helm

Re: Flink Operator 1.6 causes JobManagerDeploymentStatus: MISSING

2023-10-18 Thread Tony Chen
8, 2023 at 2:55 PM Gyula Fóra wrote: > Hi! > Not sure if it’s the same but could you try picking up the fix from the > release branch and confirming that it solves the problem? > > If it does we may consider a quick bug fix release. > > Cheers > Gyula > > On Wed, 18 Oct 2

Re: Flink kubernets operator delete HA metadata after resuming from suspend

2023-10-18 Thread Tony Chen
from >>>>> HA metadata >>>>> 2023-09-11 06:02:07,999 o.a.f.k.o.c.FlinkDeploymentController >>>>> [ERROR][rec-job/rec-job] Flink recovery failed >>>>> 2023-09-11 06:02:08,012 o.a.f.k.o.l.AuditUtils [INFO >>>>> ][rec-jo

Flink Operator 1.6 causes JobManagerDeploymentStatus: MISSING

2023-10-18 Thread Tony Chen
hat could cause the JobManagerDeploymentStatus to be MISSING? Thanks, Tony -- <http://www.robinhood.com/> Tony Chen Software Engineer Menlo Park, CA Don't copy, share, or use this email without permission. If you received it by accident, please let us know and then delete it right away.

Re: observedGeneration field in FlinkDeployment

2023-10-09 Thread Tony Chen
I think that a FLIP JIRA should be created to add an `observedGeneration` field to the spec. When I look at other kubernetes APIs, I see the `observedGeneration` field in many of them: https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.26/ On Mon, Sep 11, 2023 at 11:51 AM Tony Chen

Re: Rolling back a bad deployment of FlinkDeployment on kubernetes

2023-10-06 Thread Tony Chen
des also don't) > > I hope this helps to understand the problem. > The solution in these cases is to manually recover the job from the last > checkpoint/savepoint. > > Cheers, > Gyula > > > On Thu, Oct 5, 2023 at 7:56 PM Tony Chen wrote: > >> I tried this out wi

Re: Rolling back a bad deployment of FlinkDeployment on kubernetes

2023-10-05 Thread Tony Chen
HA metadata Besides setting kubernetes.operator.deployment.rollback.enabled: true, is there anything else that I need to configure? On Thu, Oct 5, 2023 at 10:35 AM Tony Chen wrote: > I just saw this experimental feature in the documentation: > https://nightlies.apache.org/flink/flink-kuber

Re: Rolling back a bad deployment of FlinkDeployment on kubernetes

2023-10-05 Thread Tony Chen
2023 at 3:25 PM Tony Chen wrote: > Hi Flink Community, > > I am currently running Apache flink-kubernetes-operator on our kubernetes > clusters, and I have Flink applications that are deployed using the > FlinkDeployment Custom Resources (CR). I am trying to automate the process >

Rolling back a bad deployment of FlinkDeployment on kubernetes

2023-10-04 Thread Tony Chen
d like to avoid manual restores if possible. Is it possible to recover by just changing the FlinkDeployment spec? Thanks, Tony

Re: observedGeneration field in FlinkDeployment

2023-09-11 Thread Tony Chen
munity, >>> >>> I noticed that there is no status.observedGeneration field in the >>> FlinkDeployment spec. The closest field to this is >>> status.reconciliationStatus.lastReconciledSpec. Are there plans to add the >>> observedGeneration field in the spe

observedGeneration field in FlinkDeployment

2023-09-08 Thread Tony Chen
in the FlinkDeployment spec. Thanks, Tony

Re: Enable RocksDB in FlinkDeployment with flink-kubernetes-operator

2023-08-30 Thread Tony Chen
ackend as well. > > You can simply set this in your config like before :) > > Cheers > Gyula > > On Wed, 30 Aug 2023 at 19:22, Tony Chen wrote: > >> Hi Flink Community, >> >> Does the flink-kubernetes-operator support RocksDB as the state backend >> f

Enable RocksDB in FlinkDeployment with flink-kubernetes-operator

2023-08-30 Thread Tony Chen
recommendations on how we can decrease the size of these states? Thanks, Tony

Re: Questions on Restarting a Flink Application from a savepoint or checkpoint

2023-07-21 Thread Tony Chen
e I think it’s worth talking about this > delete/recreate requirement because it sounds a bit strange in the > Kubernetes world. We specifically designed the operator in a way so that > you wouldn’t have to do this if you want the latest state and so far this > is the first I hear this

Re: Questions on Restarting a Flink Application from a savepoint or checkpoint

2023-07-19 Thread Tony Chen
ng the job for the while or > just use a regular upgrade to fit your needs. > > Cheers > Gyula > > On Wed, 19 Jul 2023 at 22:19, Tony Chen wrote: > >> Hi Gyula, >> >> Thank you for responding so quickly. I went through the page you sent me >> a bit more, a

Re: Questions on Restarting a Flink Application from a savepoint or checkpoint

2023-07-19 Thread Tony Chen
oint triggering can help you keep backups for failure > recovery but they should not be executed as part of your upgrade flow > because the operator already does this for you. > > Cheers, > Gyula > > On Wed, Jul 19, 2023 at 8:20 PM Tony Chen > wrote: > >> Hi F

Questions on Restarting a Flink Application from a savepoint or checkpoint

2023-07-19 Thread Tony Chen
Hi Flink Community, My name is Tony Chen, and I am a software engineer at Robinhood. I have some questions on restarting a Flink Application from a savepoint or checkpoint. We currently store our checkpoints and savepoints in S3, and we would like to use the Apache Flink Kubernetes Operator to