[jira] [Resolved] (FLINK-35525) HDFS delegation token fetched by custom DelegationTokenProvider is not passed to Yarn AM
[ https://issues.apache.org/jira/browse/FLINK-35525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Somogyi resolved FLINK-35525.
-----------------------------------
    Fix Version/s: 1.20.0
       Resolution: Fixed

74b100b on master

> HDFS delegation token fetched by custom DelegationTokenProvider is not passed to Yarn AM
>
> Key: FLINK-35525
> URL: https://issues.apache.org/jira/browse/FLINK-35525
> Project: Flink
> Issue Type: Bug
> Components: Deployment / YARN
> Affects Versions: 1.19.0, 1.18.1
> Reporter: Zhen Wang
> Assignee: Zhen Wang
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.20.0
>
> I tried running Flink with a Hadoop proxy user by disabling HadoopModuleFactory and the Flink built-in token providers, and implementing a custom token provider. However, only the HDFS token obtained by the hadoopfs provider was added in YarnClusterDescriptor, which resulted in a Yarn AM submission failure.
> Discussion: https://github.com/apache/flink/pull/22009#issuecomment-2132676114
[jira] [Closed] (FLINK-35525) HDFS delegation token fetched by custom DelegationTokenProvider is not passed to Yarn AM
[ https://issues.apache.org/jira/browse/FLINK-35525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Somogyi closed FLINK-35525.
---------------------------------

> HDFS delegation token fetched by custom DelegationTokenProvider is not passed to Yarn AM
>
> Key: FLINK-35525
> URL: https://issues.apache.org/jira/browse/FLINK-35525
> Project: Flink
> Issue Type: Bug
> Components: Deployment / YARN
> Affects Versions: 1.19.0, 1.18.1
> Reporter: Zhen Wang
> Assignee: Zhen Wang
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.20.0
>
> I tried running Flink with a Hadoop proxy user by disabling HadoopModuleFactory and the Flink built-in token providers, and implementing a custom token provider. However, only the HDFS token obtained by the hadoopfs provider was added in YarnClusterDescriptor, which resulted in a Yarn AM submission failure.
> Discussion: https://github.com/apache/flink/pull/22009#issuecomment-2132676114
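For context on the SPI involved here: Flink's pluggable token framework discovers DelegationTokenProvider implementations through the standard Java ServiceLoader mechanism. Below is a minimal sketch of such a custom provider, assuming the org.apache.flink.core.security.token.DelegationTokenProvider interface shape of Flink 1.17+; the class name and the fetchAndSerializeTokens() helper are illustrative only and not part of Flink.

{code:java}
package com.example.security;

import org.apache.flink.configuration.Configuration;
import org.apache.flink.core.security.token.DelegationTokenProvider;

import java.util.Optional;

/**
 * Illustrative custom provider. To be picked up it must be registered in
 * META-INF/services/org.apache.flink.core.security.token.DelegationTokenProvider.
 */
public class CustomHdfsTokenProvider implements DelegationTokenProvider {

    @Override
    public String serviceName() {
        // Providers can be toggled via security.delegation.token.provider.<serviceName>.enabled
        return "custom-hdfs";
    }

    @Override
    public void init(Configuration configuration) {
        // Read provider-specific settings from the Flink configuration here.
    }

    @Override
    public boolean delegationTokensRequired() {
        return true;
    }

    @Override
    public ObtainedDelegationTokens obtainDelegationTokens() throws Exception {
        // Hypothetical helper: obtain HDFS tokens (e.g. as a proxy user) and
        // serialize the resulting Hadoop Credentials into a byte array.
        byte[] serializedCredentials = fetchAndSerializeTokens();
        return new ObtainedDelegationTokens(serializedCredentials, Optional.empty());
    }

    private byte[] fetchAndSerializeTokens() {
        return new byte[0]; // placeholder for the actual token fetching logic
    }
}
{code}

The bug tracked in FLINK-35525 was that tokens produced by such a provider were not forwarded to the Yarn AM at submission time; only the tokens from the built-in hadoopfs provider were.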
[jira] [Resolved] (FLINK-35371) Allow the keystore and truststore type to be configured for SSL
[ https://issues.apache.org/jira/browse/FLINK-35371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Somogyi resolved FLINK-35371.
-----------------------------------
    Fix Version/s: 1.20.0
       Resolution: Fixed

0919ff2 on master

> Allow the keystore and truststore type to be configured for SSL
>
> Key: FLINK-35371
> URL: https://issues.apache.org/jira/browse/FLINK-35371
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Network
> Affects Versions: 1.19.0
> Reporter: Ammar Master
> Assignee: Ammar Master
> Priority: Minor
> Labels: SSL, pull-request-available
> Fix For: 1.20.0
>
> Flink always creates a keystore and truststore using the [default type|https://github.com/apache/flink/blob/b87ead743dca161cdae8a1fef761954d206b81fb/flink-runtime/src/main/java/org/apache/flink/runtime/net/SSLUtils.java#L236] defined in the JDK, which in most cases is JKS.
> {code}
> KeyStore trustStore = KeyStore.getInstance(KeyStore.getDefaultType());
> {code}
> We should add configuration options to set the type explicitly to support other custom formats, and match the options already provided by [Spark|https://spark.apache.org/docs/latest/security.html#:~:text=the%20key%20store.-,%24%7Bns%7D.keyStoreType,-JKS] and [Kafka|https://kafka.apache.org/documentation/#:~:text=per%2Dbroker-,ssl.keystore.type,-The%20file%20format]. The default would continue to be specified by the JDK.
>
> The SSLContext for the REST API can read the configuration option directly, and we need to add extra logic to the [CustomSSLEngineProvider|https://github.com/apache/flink/blob/master/flink-rpc/flink-rpc-akka/src/main/java/org/apache/flink/runtime/rpc/pekko/CustomSSLEngineProvider.java] for Pekko.
[jira] [Closed] (FLINK-35371) Allow the keystore and truststore type to be configured for SSL
[ https://issues.apache.org/jira/browse/FLINK-35371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Somogyi closed FLINK-35371.
---------------------------------

> Allow the keystore and truststore type to be configured for SSL
>
> Key: FLINK-35371
> URL: https://issues.apache.org/jira/browse/FLINK-35371
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / Network
> Affects Versions: 1.19.0
> Reporter: Ammar Master
> Assignee: Ammar Master
> Priority: Minor
> Labels: SSL, pull-request-available
> Fix For: 1.20.0
>
> Flink always creates a keystore and truststore using the [default type|https://github.com/apache/flink/blob/b87ead743dca161cdae8a1fef761954d206b81fb/flink-runtime/src/main/java/org/apache/flink/runtime/net/SSLUtils.java#L236] defined in the JDK, which in most cases is JKS.
> {code}
> KeyStore trustStore = KeyStore.getInstance(KeyStore.getDefaultType());
> {code}
> We should add configuration options to set the type explicitly to support other custom formats, and match the options already provided by [Spark|https://spark.apache.org/docs/latest/security.html#:~:text=the%20key%20store.-,%24%7Bns%7D.keyStoreType,-JKS] and [Kafka|https://kafka.apache.org/documentation/#:~:text=per%2Dbroker-,ssl.keystore.type,-The%20file%20format]. The default would continue to be specified by the JDK.
>
> The SSLContext for the REST API can read the configuration option directly, and we need to add extra logic to the [CustomSSLEngineProvider|https://github.com/apache/flink/blob/master/flink-rpc/flink-rpc-akka/src/main/java/org/apache/flink/runtime/rpc/pekko/CustomSSLEngineProvider.java] for Pekko.
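For illustration, the change boils down to replacing the hard-coded KeyStore.getDefaultType() call with a configured value. A minimal sketch using only JDK APIs follows; the option plumbing and the method name are assumptions for the sketch, not Flink's actual configuration keys or code.

{code:java}
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.KeyStore;

public final class KeyStoreLoadingSketch {

    /**
     * Falls back to the JDK default type (traditionally JKS, PKCS12 since JDK 9)
     * only when no explicit type such as "PKCS12" or "BCFKS" is configured.
     */
    static KeyStore loadTrustStore(String configuredType, String path, char[] password)
            throws Exception {
        String type = configuredType != null ? configuredType : KeyStore.getDefaultType();
        KeyStore trustStore = KeyStore.getInstance(type);
        try (InputStream in = Files.newInputStream(Paths.get(path))) {
            trustStore.load(in, password);
        }
        return trustStore;
    }
}
{code}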
[jira] [Resolved] (FLINK-35625) FLIP-464: Merge "flink run" and "flink run-application"
[ https://issues.apache.org/jira/browse/FLINK-35625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Somogyi resolved FLINK-35625.
-----------------------------------
    Fix Version/s: 2.0.0
                   (was: 1.20.0)
       Resolution: Fixed

e56b54d on master

> FLIP-464: Merge "flink run" and "flink run-application"
>
> Key: FLINK-35625
> URL: https://issues.apache.org/jira/browse/FLINK-35625
> Project: Flink
> Issue Type: Improvement
> Components: Client / Job Submission, Command Line Client
> Reporter: Ferenc Csaky
> Priority: Major
> Labels: pull-request-available
> Fix For: 2.0.0
>
> Ticket to track [FLIP-464|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=311626179].
[jira] [Closed] (FLINK-35625) FLIP-464: Merge "flink run" and "flink run-application"
[ https://issues.apache.org/jira/browse/FLINK-35625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Somogyi closed FLINK-35625.
---------------------------------

> FLIP-464: Merge "flink run" and "flink run-application"
>
> Key: FLINK-35625
> URL: https://issues.apache.org/jira/browse/FLINK-35625
> Project: Flink
> Issue Type: Improvement
> Components: Client / Job Submission, Command Line Client
> Reporter: Ferenc Csaky
> Priority: Major
> Labels: pull-request-available
> Fix For: 2.0.0
>
> Ticket to track [FLIP-464|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=311626179].
[jira] [Assigned] (FLINK-35625) FLIP-464: Merge "flink run" and "flink run-application"
[ https://issues.apache.org/jira/browse/FLINK-35625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Somogyi reassigned FLINK-35625:
-------------------------------------
    Assignee: Ferenc Csaky

> FLIP-464: Merge "flink run" and "flink run-application"
>
> Key: FLINK-35625
> URL: https://issues.apache.org/jira/browse/FLINK-35625
> Project: Flink
> Issue Type: Improvement
> Components: Client / Job Submission, Command Line Client
> Reporter: Ferenc Csaky
> Assignee: Ferenc Csaky
> Priority: Major
> Labels: pull-request-available
> Fix For: 2.0.0
>
> Ticket to track [FLIP-464|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=311626179].
[jira] [Resolved] (FLINK-34267) Python connector test fails when running on MacBook with m1 processor
[ https://issues.apache.org/jira/browse/FLINK-34267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Somogyi resolved FLINK-34267.
-----------------------------------
    Resolution: Fixed

[{{e6e1426}}|https://github.com/apache/flink-connector-shared-utils/commit/e6e14268b8316352031b25f4b67ed64dc142b683] on ci_utils

> Python connector test fails when running on MacBook with m1 processor
>
> Key: FLINK-34267
> URL: https://issues.apache.org/jira/browse/FLINK-34267
> Project: Flink
> Issue Type: Bug
> Components: API / Python, Build System / CI, Connectors / Common
> Environment: m1 MacBook Pro, macOS 14.2.1
> Reporter: Aleksandr Pilipenko
> Assignee: Aleksandr Pilipenko
> Priority: Major
> Labels: pull-request-available
>
> An attempt to execute lint-python.sh on an m1 MacBook fails while trying to install the miniconda environment:
> {code}
> =installing environment=
> installing wget...
> install wget... [SUCCESS]
> installing miniconda...
> download miniconda...
> download miniconda... [SUCCESS]
> installing conda...
> tail: illegal offset -- +018838: Invalid argument
> tail: illegal offset -- +018838: Invalid argument
> /Users/apilipenko/Dev/flink-connector-aws/flink-python/dev/download/miniconda.sh: line 353: /Users/apilipenko/Dev/flink-connector-aws/flink-python/dev/.conda/preconda.tar.bz2: No such file or directory
> upgrade pip...
> ./dev/lint-python.sh: line 215: /Users/apilipenko/Dev/flink-connector-aws/flink-python/dev/.conda/bin/python: No such file or directory
> upgrade pip... [SUCCESS]
> install conda ... [SUCCESS]
> install miniconda... [SUCCESS]
> installing python environment...
> installing python3.7...
> ./dev/lint-python.sh: line 247: /Users/apilipenko/Dev/flink-connector-aws/flink-python/dev/.conda/bin/conda: No such file or directory
> conda install 3.7 retrying 1/3
> ./dev/lint-python.sh: line 254: /Users/apilipenko/Dev/flink-connector-aws/flink-python/dev/.conda/bin/conda: No such file or directory
> conda install 3.7 retrying 2/3
> ./dev/lint-python.sh: line 254: /Users/apilipenko/Dev/flink-connector-aws/flink-python/dev/.conda/bin/conda: No such file or directory
> conda install 3.7 retrying 3/3
> ./dev/lint-python.sh: line 254: /Users/apilipenko/Dev/flink-connector-aws/flink-python/dev/.conda/bin/conda: No such file or directory
> conda install 3.7 failed after retrying 3 times. You can retry to execute the script again.
> {code}
[jira] [Assigned] (FLINK-34267) Python connector test fails when running on MacBook with m1 processor
[ https://issues.apache.org/jira/browse/FLINK-34267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Somogyi reassigned FLINK-34267:
-------------------------------------
    Assignee: Aleksandr Pilipenko

> Python connector test fails when running on MacBook with m1 processor
>
> Key: FLINK-34267
> URL: https://issues.apache.org/jira/browse/FLINK-34267
> Project: Flink
> Issue Type: Bug
> Components: API / Python, Build System / CI, Connectors / Common
> Environment: m1 MacBook Pro, macOS 14.2.1
> Reporter: Aleksandr Pilipenko
> Assignee: Aleksandr Pilipenko
> Priority: Major
> Labels: pull-request-available
>
> An attempt to execute lint-python.sh on an m1 MacBook fails while trying to install the miniconda environment:
> {code}
> =installing environment=
> installing wget...
> install wget... [SUCCESS]
> installing miniconda...
> download miniconda...
> download miniconda... [SUCCESS]
> installing conda...
> tail: illegal offset -- +018838: Invalid argument
> tail: illegal offset -- +018838: Invalid argument
> /Users/apilipenko/Dev/flink-connector-aws/flink-python/dev/download/miniconda.sh: line 353: /Users/apilipenko/Dev/flink-connector-aws/flink-python/dev/.conda/preconda.tar.bz2: No such file or directory
> upgrade pip...
> ./dev/lint-python.sh: line 215: /Users/apilipenko/Dev/flink-connector-aws/flink-python/dev/.conda/bin/python: No such file or directory
> upgrade pip... [SUCCESS]
> install conda ... [SUCCESS]
> install miniconda... [SUCCESS]
> installing python environment...
> installing python3.7...
> ./dev/lint-python.sh: line 247: /Users/apilipenko/Dev/flink-connector-aws/flink-python/dev/.conda/bin/conda: No such file or directory
> conda install 3.7 retrying 1/3
> ./dev/lint-python.sh: line 254: /Users/apilipenko/Dev/flink-connector-aws/flink-python/dev/.conda/bin/conda: No such file or directory
> conda install 3.7 retrying 2/3
> ./dev/lint-python.sh: line 254: /Users/apilipenko/Dev/flink-connector-aws/flink-python/dev/.conda/bin/conda: No such file or directory
> conda install 3.7 retrying 3/3
> ./dev/lint-python.sh: line 254: /Users/apilipenko/Dev/flink-connector-aws/flink-python/dev/.conda/bin/conda: No such file or directory
> conda install 3.7 failed after retrying 3 times. You can retry to execute the script again.
> {code}
[jira] [Closed] (FLINK-34267) Python connector test fails when running on MacBook with m1 processor
[ https://issues.apache.org/jira/browse/FLINK-34267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Somogyi closed FLINK-34267.
---------------------------------

> Python connector test fails when running on MacBook with m1 processor
>
> Key: FLINK-34267
> URL: https://issues.apache.org/jira/browse/FLINK-34267
> Project: Flink
> Issue Type: Bug
> Components: API / Python, Build System / CI, Connectors / Common
> Environment: m1 MacBook Pro, macOS 14.2.1
> Reporter: Aleksandr Pilipenko
> Assignee: Aleksandr Pilipenko
> Priority: Major
> Labels: pull-request-available
>
> An attempt to execute lint-python.sh on an m1 MacBook fails while trying to install the miniconda environment:
> {code}
> =installing environment=
> installing wget...
> install wget... [SUCCESS]
> installing miniconda...
> download miniconda...
> download miniconda... [SUCCESS]
> installing conda...
> tail: illegal offset -- +018838: Invalid argument
> tail: illegal offset -- +018838: Invalid argument
> /Users/apilipenko/Dev/flink-connector-aws/flink-python/dev/download/miniconda.sh: line 353: /Users/apilipenko/Dev/flink-connector-aws/flink-python/dev/.conda/preconda.tar.bz2: No such file or directory
> upgrade pip...
> ./dev/lint-python.sh: line 215: /Users/apilipenko/Dev/flink-connector-aws/flink-python/dev/.conda/bin/python: No such file or directory
> upgrade pip... [SUCCESS]
> install conda ... [SUCCESS]
> install miniconda... [SUCCESS]
> installing python environment...
> installing python3.7...
> ./dev/lint-python.sh: line 247: /Users/apilipenko/Dev/flink-connector-aws/flink-python/dev/.conda/bin/conda: No such file or directory
> conda install 3.7 retrying 1/3
> ./dev/lint-python.sh: line 254: /Users/apilipenko/Dev/flink-connector-aws/flink-python/dev/.conda/bin/conda: No such file or directory
> conda install 3.7 retrying 2/3
> ./dev/lint-python.sh: line 254: /Users/apilipenko/Dev/flink-connector-aws/flink-python/dev/.conda/bin/conda: No such file or directory
> conda install 3.7 retrying 3/3
> ./dev/lint-python.sh: line 254: /Users/apilipenko/Dev/flink-connector-aws/flink-python/dev/.conda/bin/conda: No such file or directory
> conda install 3.7 failed after retrying 3 times. You can retry to execute the script again.
> {code}
[jira] [Assigned] (FLINK-20090) Expose SlotId / SlotSharingGroup in Rest API
[ https://issues.apache.org/jira/browse/FLINK-20090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Somogyi reassigned FLINK-20090:
-------------------------------------
    Assignee: Gabor Somogyi

> Expose SlotId / SlotSharingGroup in Rest API
>
> Key: FLINK-20090
> URL: https://issues.apache.org/jira/browse/FLINK-20090
> Project: Flink
> Issue Type: New Feature
> Components: Runtime / REST
> Reporter: Maximilian Michels
> Assignee: Gabor Somogyi
> Priority: Not a Priority
>
> There is no information on slot sharing exposed via the Rest API which would be useful to monitor how tasks are assigned to task slots.
> We could include the SlotId in {{SubtaskExecutionAttemptDetailsInfo}} and provide a list of slots in {{TaskManagersInfo}}.
[jira] [Commented] (FLINK-20090) Expose SlotId / SlotSharingGroup in Rest API
[ https://issues.apache.org/jira/browse/FLINK-20090?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17820034#comment-17820034 ]

Gabor Somogyi commented on FLINK-20090:
---------------------------------------

I'm working on this and planning to open a PR at the beginning of next week.

> Expose SlotId / SlotSharingGroup in Rest API
>
> Key: FLINK-20090
> URL: https://issues.apache.org/jira/browse/FLINK-20090
> Project: Flink
> Issue Type: New Feature
> Components: Runtime / REST
> Reporter: Maximilian Michels
> Assignee: Gabor Somogyi
> Priority: Not a Priority
>
> There is no information on slot sharing exposed via the Rest API which would be useful to monitor how tasks are assigned to task slots.
> We could include the SlotId in {{SubtaskExecutionAttemptDetailsInfo}} and provide a list of slots in {{TaskManagersInfo}}.
[jira] [Resolved] (FLINK-20090) Expose SlotId / SlotSharingGroup in Rest API
[ https://issues.apache.org/jira/browse/FLINK-20090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Somogyi resolved FLINK-20090.
-----------------------------------
    Resolution: Fixed

[{{34a7734}}|https://github.com/apache/flink/commit/34a7734c489b080d34ff2194a29d3c1d25d3ab45] on master

> Expose SlotId / SlotSharingGroup in Rest API
>
> Key: FLINK-20090
> URL: https://issues.apache.org/jira/browse/FLINK-20090
> Project: Flink
> Issue Type: New Feature
> Components: Runtime / REST
> Reporter: Maximilian Michels
> Assignee: Gabor Somogyi
> Priority: Not a Priority
> Labels: pull-request-available
>
> There is no information on slot sharing exposed via the Rest API which would be useful to monitor how tasks are assigned to task slots.
> We could include the SlotId in {{SubtaskExecutionAttemptDetailsInfo}} and provide a list of slots in {{TaskManagersInfo}}.
[jira] [Updated] (FLINK-20090) Expose SlotId / SlotSharingGroup in Rest API
[ https://issues.apache.org/jira/browse/FLINK-20090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Somogyi updated FLINK-20090:
----------------------------------
    Fix Version/s: 1.20.0

> Expose SlotId / SlotSharingGroup in Rest API
>
> Key: FLINK-20090
> URL: https://issues.apache.org/jira/browse/FLINK-20090
> Project: Flink
> Issue Type: New Feature
> Components: Runtime / REST
> Reporter: Maximilian Michels
> Assignee: Gabor Somogyi
> Priority: Not a Priority
> Labels: pull-request-available
> Fix For: 1.20.0
>
> There is no information on slot sharing exposed via the Rest API which would be useful to monitor how tasks are assigned to task slots.
> We could include the SlotId in {{SubtaskExecutionAttemptDetailsInfo}} and provide a list of slots in {{TaskManagersInfo}}.
[jira] [Closed] (FLINK-20090) Expose SlotId / SlotSharingGroup in Rest API
[ https://issues.apache.org/jira/browse/FLINK-20090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Somogyi closed FLINK-20090.
---------------------------------

> Expose SlotId / SlotSharingGroup in Rest API
>
> Key: FLINK-20090
> URL: https://issues.apache.org/jira/browse/FLINK-20090
> Project: Flink
> Issue Type: New Feature
> Components: Runtime / REST
> Reporter: Maximilian Michels
> Assignee: Gabor Somogyi
> Priority: Not a Priority
> Labels: pull-request-available
>
> There is no information on slot sharing exposed via the Rest API which would be useful to monitor how tasks are assigned to task slots.
> We could include the SlotId in {{SubtaskExecutionAttemptDetailsInfo}} and provide a list of slots in {{TaskManagersInfo}}.
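As a rough sketch of what exposing a slot identifier in a REST payload involves: the real change touches SubtaskExecutionAttemptDetailsInfo and TaskManagersInfo as the ticket says, but the stand-in class and the "slot-id" field name below are hypothetical, chosen only to show the Jackson wiring such a field needs.

{code:java}
import com.fasterxml.jackson.annotation.JsonCreator;
import com.fasterxml.jackson.annotation.JsonProperty;

/** Stand-in REST response fragment carrying a slot identifier. */
public class SubtaskDetailsSketch {

    @JsonProperty("slot-id")
    private final String slotId;

    @JsonCreator
    public SubtaskDetailsSketch(@JsonProperty("slot-id") String slotId) {
        this.slotId = slotId;
    }

    public String getSlotId() {
        return slotId;
    }
}
{code}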
[jira] [Created] (FLINK-34574) Add CPU and memory size autoscaler quota
Gabor Somogyi created FLINK-34574:
----------------------------------

Summary: Add CPU and memory size autoscaler quota
Key: FLINK-34574
URL: https://issues.apache.org/jira/browse/FLINK-34574
Project: Flink
Issue Type: New Feature
Components: Autoscaler
Reporter: Gabor Somogyi
[jira] [Assigned] (FLINK-34574) Add CPU and memory size autoscaler quota
[ https://issues.apache.org/jira/browse/FLINK-34574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Somogyi reassigned FLINK-34574:
-------------------------------------
    Assignee: Gabor Somogyi

> Add CPU and memory size autoscaler quota
>
> Key: FLINK-34574
> URL: https://issues.apache.org/jira/browse/FLINK-34574
> Project: Flink
> Issue Type: New Feature
> Components: Autoscaler
> Reporter: Gabor Somogyi
> Assignee: Gabor Somogyi
> Priority: Major
[jira] [Assigned] (FLINK-33515) PythonDriver needs to stream Python process output to the log instead of collecting it in memory
[ https://issues.apache.org/jira/browse/FLINK-33515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Somogyi reassigned FLINK-33515:
-------------------------------------
    Assignee: Gabor Somogyi

> PythonDriver needs to stream Python process output to the log instead of collecting it in memory
>
> Key: FLINK-33515
> URL: https://issues.apache.org/jira/browse/FLINK-33515
> Project: Flink
> Issue Type: Bug
> Components: API / Python
> Affects Versions: 1.19.0
> Reporter: Gabor Somogyi
> Assignee: Gabor Somogyi
> Priority: Major
>
> PythonDriver currently collects the Python process output in a StringBuilder instead of streaming it. This can cause an OOM when the Python process generates a huge amount of output.
[jira] [Created] (FLINK-33515) PythonDriver needs to stream Python process output to the log instead of collecting it in memory
Gabor Somogyi created FLINK-33515:
----------------------------------

Summary: PythonDriver needs to stream Python process output to the log instead of collecting it in memory
Key: FLINK-33515
URL: https://issues.apache.org/jira/browse/FLINK-33515
Project: Flink
Issue Type: Bug
Components: API / Python
Affects Versions: 1.19.0
Reporter: Gabor Somogyi

PythonDriver currently collects the Python process output in a StringBuilder instead of streaming it. This can cause an OOM when the Python process generates a huge amount of output.
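A minimal sketch of the fix idea, using only JDK APIs (this is not PythonDriver's actual code): pump each line of the child process output to the log as it arrives, instead of accumulating everything in an unbounded StringBuilder.

{code:java}
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

public final class ProcessOutputPump {

    /** Starts a daemon thread that forwards the process output line by line. */
    static Thread streamOutput(Process process) {
        Thread pump = new Thread(() -> {
            try (BufferedReader reader = new BufferedReader(
                    new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    // A real implementation would hand this to a Logger; memory use
                    // stays bounded no matter how much output the process generates.
                    System.out.println(line);
                }
            } catch (IOException e) {
                // The child process terminated; nothing left to pump.
            }
        });
        pump.setDaemon(true);
        pump.start();
        return pump;
    }
}
{code}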
[jira] [Commented] (FLINK-33513) Metastore delegation-token can be cached?
[ https://issues.apache.org/jira/browse/FLINK-33513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17784739#comment-17784739 ]

Gabor Somogyi commented on FLINK-33513:
---------------------------------------

If that hurts, the solution is not caching but adding a token provider for the metastore, like HiveServer2DelegationTokenProvider.

> Metastore delegation-token can be cached?
>
> Key: FLINK-33513
> URL: https://issues.apache.org/jira/browse/FLINK-33513
> Project: Flink
> Issue Type: Improvement
> Components: Connectors / Hive
> Reporter: katty he
> Priority: Major
>
> Currently, getDelegationToken will be called every time the metastore is asked. How about building a cache? We cache the token the first time, then we can just get the token from the cache.
[jira] [Resolved] (FLINK-33515) PythonDriver needs to stream Python process output to the log instead of collecting it in memory
[ https://issues.apache.org/jira/browse/FLINK-33515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Somogyi resolved FLINK-33515.
-----------------------------------
    Fix Version/s: 1.19.0
       Resolution: Fixed

caa324a on master

> PythonDriver needs to stream Python process output to the log instead of collecting it in memory
>
> Key: FLINK-33515
> URL: https://issues.apache.org/jira/browse/FLINK-33515
> Project: Flink
> Issue Type: Bug
> Components: API / Python
> Affects Versions: 1.19.0
> Reporter: Gabor Somogyi
> Assignee: Gabor Somogyi
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.19.0
>
> PythonDriver currently collects the Python process output in a StringBuilder instead of streaming it. This can cause an OOM when the Python process generates a huge amount of output.
[jira] [Closed] (FLINK-33515) PythonDriver needs to stream Python process output to the log instead of collecting it in memory
[ https://issues.apache.org/jira/browse/FLINK-33515?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Somogyi closed FLINK-33515.
---------------------------------

> PythonDriver needs to stream Python process output to the log instead of collecting it in memory
>
> Key: FLINK-33515
> URL: https://issues.apache.org/jira/browse/FLINK-33515
> Project: Flink
> Issue Type: Bug
> Components: API / Python
> Affects Versions: 1.19.0
> Reporter: Gabor Somogyi
> Assignee: Gabor Somogyi
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.19.0
>
> PythonDriver currently collects the Python process output in a StringBuilder instead of streaming it. This can cause an OOM when the Python process generates a huge amount of output.
[jira] [Commented] (FLINK-33531) Nightly Python fails with NPE at metadataHandlerProvider on AZP
[ https://issues.apache.org/jira/browse/FLINK-33531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17785515#comment-17785515 ]

Gabor Somogyi commented on FLINK-33531:
---------------------------------------

Since I've added Python 3.11 support lately, I've double-checked my part. This was added on the 15th of October:
{code:java}
commit 2da9a9639216b8c48850ee714065f090a80dcd65
Author: Gabor Somogyi
Date:   Sun Oct 15 09:31:08 2023 +0200

    [FLINK-33030][python] Add python 3.11 support

    Also bump grpcio-tools version
...
{code}
It seems the latest green nightly ran on the 30th of October, so that's not the cause: https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54166&view=logs&j=9cada3cb-c1d3-5621-16da-0f718fb86602

I've gone back in time starting from the mentioned 30th of October and double-checked the master nightlies, and they seem to have been stable. No idea what happened, but after that point it became unstable.

> Nightly Python fails with NPE at metadataHandlerProvider on AZP
>
> Key: FLINK-33531
> URL: https://issues.apache.org/jira/browse/FLINK-33531
> Project: Flink
> Issue Type: Bug
> Components: API / Python
> Affects Versions: 1.19.0
> Reporter: Sergey Nuyanzin
> Priority: Blocker
> Labels: test-stability
>
> It seems that starting 02.11.2023 every master nightly fails with this (that's why it is a blocker), for instance: https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=54512&view=logs&j=9cada3cb-c1d3-5621-16da-0f718fb86602&t=c67e71ed-6451-5d26-8920-5a8cf9651901
> {noformat}
> 2023-11-12T02:10:24.5082784Z Nov 12 02:10:24     if is_error(answer)[0]:
> 2023-11-12T02:10:24.5083620Z Nov 12 02:10:24         if len(answer) > 1:
> 2023-11-12T02:10:24.5084326Z Nov 12 02:10:24             type = answer[1]
> 2023-11-12T02:10:24.5085164Z Nov 12 02:10:24             value = OUTPUT_CONVERTER[type](answer[2:], gateway_client)
> 2023-11-12T02:10:24.5086061Z Nov 12 02:10:24             if answer[1] == REFERENCE_TYPE:
> 2023-11-12T02:10:24.5086850Z Nov 12 02:10:24                 raise Py4JJavaError(
> 2023-11-12T02:10:24.5087677Z Nov 12 02:10:24                     "An error occurred while calling {0}{1}{2}.\n".
> 2023-11-12T02:10:24.5088538Z Nov 12 02:10:24                     format(target_id, ".", name), value)
> 2023-11-12T02:10:24.5089551Z Nov 12 02:10:24 E       py4j.protocol.Py4JJavaError: An error occurred while calling o3371.executeInsert.
> 2023-11-12T02:10:24.5090832Z Nov 12 02:10:24 E       : java.lang.NullPointerException: metadataHandlerProvider
> 2023-11-12T02:10:24.5091832Z Nov 12 02:10:24 E       	at java.util.Objects.requireNonNull(Objects.java:228)
> 2023-11-12T02:10:24.5093399Z Nov 12 02:10:24 E       	at org.apache.calcite.rel.metadata.RelMetadataQueryBase.getMetadataHandlerProvider(RelMetadataQueryBase.java:122)
> 2023-11-12T02:10:24.5094480Z Nov 12 02:10:24 E       	at org.apache.calcite.rel.metadata.RelMetadataQueryBase.revise(RelMetadataQueryBase.java:118)
> 2023-11-12T02:10:24.5095365Z Nov 12 02:10:24 E       	at org.apache.calcite.rel.metadata.RelMetadataQuery.getPulledUpPredicates(RelMetadataQuery.java:844)
> 2023-11-12T02:10:24.5096306Z Nov 12 02:10:24 E       	at org.apache.calcite.rel.rules.ReduceExpressionsRule$ProjectReduceExpressionsRule.onMatch(ReduceExpressionsRule.java:307)
> 2023-11-12T02:10:24.5097238Z Nov 12 02:10:24 E       	at org.apache.calcite.plan.AbstractRelOptPlanner.fireRule(AbstractRelOptPlanner.java:337)
> 2023-11-12T02:10:24.5098014Z Nov 12 02:10:24 E       	at org.apache.calcite.plan.hep.HepPlanner.applyRule(HepPlanner.java:556)
> 2023-11-12T02:10:24.5098753Z Nov 12 02:10:24 E       	at org.apache.calcite.plan.hep.HepPlanner.applyRules(HepPlanner.java:420)
> 2023-11-12T02:10:24.5099517Z Nov 12 02:10:24 E       	at org.apache.calcite.plan.hep.HepPlanner.executeRuleInstance(HepPlanner.java:243)
> 2023-11-12T02:10:24.5100373Z Nov 12 02:10:24 E       	at org.apache.calcite.plan.hep.HepInstruction$RuleInstance$State.execute(HepInstruction.java:178)
> 2023-11-12T02:10:24.5101313Z Nov 12 02:10:24 E       	at org.apache.calcite.plan.hep.HepPlanner.lambda$executeProgram$0(HepPlanner.java:211)
> 2023-11-12T02:10:24.5102410Z Nov 12 02:10:24 E       	at org.apache.flink.calcite.shaded.com.google.common.collect.ImmutableList.forEach(ImmutableList.java:422)
> 2023-11-12T02:10:24.5103343Z Nov 12 02:10:24 E       	at org.apache.calcite.plan.hep.HepPlanner.executeProgram(HepPlanner.java:210)
> 2023-11-12T02:10:24.51041
> {noformat}
[jira] [Resolved] (FLINK-33268) Flink REST API response parsing throws exception on new fields
[ https://issues.apache.org/jira/browse/FLINK-33268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Somogyi resolved FLINK-33268.
-----------------------------------
    Resolution: Fixed

[{{19cb9de}}|https://github.com/apache/flink/commit/19cb9de5c54b9535be15ca850f5e1ebd2e21c244] on master

> Flink REST API response parsing throws exception on new fields
>
> Key: FLINK-33268
> URL: https://issues.apache.org/jira/browse/FLINK-33268
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / REST
> Affects Versions: 1.19.0
> Reporter: Gabor Somogyi
> Assignee: Gabor Somogyi
> Priority: Major
> Labels: pull-request-available
>
> At the moment Flink does not ignore unknown fields when parsing REST responses. An example of such a class is JobDetailsInfo, but this applies to all others. It would be good to add this support to increase compatibility.
> The real-life use case is when the Flink k8s operator wants to handle 2 jobs with 2 different Flink versions, where the newer version has added a new field to a REST response. In such a case the operator gets an exception when, for example, it tries to poll the job details with the additional field.
[jira] [Closed] (FLINK-33268) Flink REST API response parsing throws exception on new fields
[ https://issues.apache.org/jira/browse/FLINK-33268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Somogyi closed FLINK-33268.
---------------------------------

> Flink REST API response parsing throws exception on new fields
>
> Key: FLINK-33268
> URL: https://issues.apache.org/jira/browse/FLINK-33268
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / REST
> Affects Versions: 1.19.0
> Reporter: Gabor Somogyi
> Assignee: Gabor Somogyi
> Priority: Major
> Labels: pull-request-available
>
> At the moment Flink does not ignore unknown fields when parsing REST responses. An example of such a class is JobDetailsInfo, but this applies to all others. It would be good to add this support to increase compatibility.
> The real-life use case is when the Flink k8s operator wants to handle 2 jobs with 2 different Flink versions, where the newer version has added a new field to a REST response. In such a case the operator gets an exception when, for example, it tries to poll the job details with the additional field.
[jira] [Updated] (FLINK-33268) Flink REST API response parsing throws exception on new fields
[ https://issues.apache.org/jira/browse/FLINK-33268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Somogyi updated FLINK-33268:
----------------------------------
    Fix Version/s: 1.19.0

> Flink REST API response parsing throws exception on new fields
>
> Key: FLINK-33268
> URL: https://issues.apache.org/jira/browse/FLINK-33268
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / REST
> Affects Versions: 1.19.0
> Reporter: Gabor Somogyi
> Assignee: Gabor Somogyi
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.19.0
>
> At the moment Flink does not ignore unknown fields when parsing REST responses. An example of such a class is JobDetailsInfo, but this applies to all others. It would be good to add this support to increase compatibility.
> The real-life use case is when the Flink k8s operator wants to handle 2 jobs with 2 different Flink versions, where the newer version has added a new field to a REST response. In such a case the operator gets an exception when, for example, it tries to poll the job details with the additional field.
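The usual Jackson knobs for this kind of forward compatibility are shown below. Whether the actual patch used the mapper-wide setting or the per-class annotation is not visible from this thread, so treat this as a sketch of the technique rather than the committed change; the JobDetailsLike class is a made-up stand-in.

{code:java}
import com.fasterxml.jackson.annotation.JsonIgnoreProperties;
import com.fasterxml.jackson.databind.DeserializationFeature;
import com.fasterxml.jackson.databind.ObjectMapper;

public final class LenientRestParsing {

    // Mapper-wide: unknown JSON fields no longer abort deserialization.
    static final ObjectMapper MAPPER = new ObjectMapper()
            .disable(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES);

    // Per-class alternative: this type tolerates fields added by newer Flink versions.
    @JsonIgnoreProperties(ignoreUnknown = true)
    static class JobDetailsLike {
        public String jid;
        public String name;
    }
}
{code}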
[jira] [Commented] (FLINK-30310) Re-enable e2e test error check
[ https://issues.apache.org/jira/browse/FLINK-30310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17809448#comment-17809448 ]

Gabor Somogyi commented on FLINK-30310:
---------------------------------------

There are too many false positives because of negative test cases, and it's not realistic to be so strict that error-typed messages can't appear in the operator log, so I'm closing this as Won't Do and removing this dead code part.

> Re-enable e2e test error check
>
> Key: FLINK-30310
> URL: https://issues.apache.org/jira/browse/FLINK-30310
> Project: Flink
> Issue Type: Bug
> Components: Kubernetes Operator
> Reporter: Gabor Somogyi
> Assignee: Gabor Somogyi
> Priority: Major
>
> In FLINK-30307 the e2e test error check was turned off temporarily. We must re-enable it after release.
[jira] [Resolved] (FLINK-30310) Re-enable e2e test error check
[ https://issues.apache.org/jira/browse/FLINK-30310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Somogyi resolved FLINK-30310.
-----------------------------------
    Resolution: Won't Fix

> Re-enable e2e test error check
>
> Key: FLINK-30310
> URL: https://issues.apache.org/jira/browse/FLINK-30310
> Project: Flink
> Issue Type: Bug
> Components: Kubernetes Operator
> Reporter: Gabor Somogyi
> Assignee: Gabor Somogyi
> Priority: Major
>
> In FLINK-30307 the e2e test error check was turned off temporarily. We must re-enable it after release.
[jira] [Closed] (FLINK-30310) Re-enable e2e test error check
[ https://issues.apache.org/jira/browse/FLINK-30310?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Somogyi closed FLINK-30310.
---------------------------------

> Re-enable e2e test error check
>
> Key: FLINK-30310
> URL: https://issues.apache.org/jira/browse/FLINK-30310
> Project: Flink
> Issue Type: Bug
> Components: Kubernetes Operator
> Reporter: Gabor Somogyi
> Assignee: Gabor Somogyi
> Priority: Major
>
> In FLINK-30307 the e2e test error check was turned off temporarily. We must re-enable it after release.
[jira] [Resolved] (FLINK-30147) Evaluate operator error log whitelist entry: Failed to submit a listener notification task
[ https://issues.apache.org/jira/browse/FLINK-30147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Somogyi resolved FLINK-30147.
-----------------------------------
    Resolution: Won't Fix

See comment in FLINK-30310.

> Evaluate operator error log whitelist entry: Failed to submit a listener notification task
>
> Key: FLINK-30147
> URL: https://issues.apache.org/jira/browse/FLINK-30147
> Project: Flink
> Issue Type: Sub-task
> Components: Kubernetes Operator
> Reporter: Gabor Somogyi
> Priority: Major
[jira] [Closed] (FLINK-30148) Evaluate operator error log whitelist entry: Failed to submit job to session cluster
[ https://issues.apache.org/jira/browse/FLINK-30148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Somogyi closed FLINK-30148.
---------------------------------

> Evaluate operator error log whitelist entry: Failed to submit job to session cluster
>
> Key: FLINK-30148
> URL: https://issues.apache.org/jira/browse/FLINK-30148
> Project: Flink
> Issue Type: Sub-task
> Components: Kubernetes Operator
> Reporter: Gabor Somogyi
> Priority: Major
[jira] [Resolved] (FLINK-30148) Evaluate operator error log whitelist entry: Failed to submit job to session cluster
[ https://issues.apache.org/jira/browse/FLINK-30148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Somogyi resolved FLINK-30148.
-----------------------------------
    Resolution: Won't Fix

See comment in FLINK-30310.

> Evaluate operator error log whitelist entry: Failed to submit job to session cluster
>
> Key: FLINK-30148
> URL: https://issues.apache.org/jira/browse/FLINK-30148
> Project: Flink
> Issue Type: Sub-task
> Components: Kubernetes Operator
> Reporter: Gabor Somogyi
> Priority: Major
[jira] [Closed] (FLINK-30147) Evaluate operator error log whitelist entry: Failed to submit a listener notification task
[ https://issues.apache.org/jira/browse/FLINK-30147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Somogyi closed FLINK-30147.
---------------------------------

> Evaluate operator error log whitelist entry: Failed to submit a listener notification task
>
> Key: FLINK-30147
> URL: https://issues.apache.org/jira/browse/FLINK-30147
> Project: Flink
> Issue Type: Sub-task
> Components: Kubernetes Operator
> Reporter: Gabor Somogyi
> Priority: Major
[jira] [Closed] (FLINK-30149) Evaluate operator error log whitelist entry: Error during event processing
[ https://issues.apache.org/jira/browse/FLINK-30149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Somogyi closed FLINK-30149.
---------------------------------

> Evaluate operator error log whitelist entry: Error during event processing
>
> Key: FLINK-30149
> URL: https://issues.apache.org/jira/browse/FLINK-30149
> Project: Flink
> Issue Type: Sub-task
> Components: Kubernetes Operator
> Reporter: Gabor Somogyi
> Priority: Major
[jira] [Resolved] (FLINK-30149) Evaluate operator error log whitelist entry: Error during event processing
[ https://issues.apache.org/jira/browse/FLINK-30149?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Somogyi resolved FLINK-30149.
-----------------------------------
    Resolution: Won't Fix

See comment in FLINK-30310.

> Evaluate operator error log whitelist entry: Error during event processing
>
> Key: FLINK-30149
> URL: https://issues.apache.org/jira/browse/FLINK-30149
> Project: Flink
> Issue Type: Sub-task
> Components: Kubernetes Operator
> Reporter: Gabor Somogyi
> Priority: Major
[jira] [Closed] (FLINK-30117) Evaluate operator error log whitelist entries added in FLINK-29475
[ https://issues.apache.org/jira/browse/FLINK-30117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Somogyi closed FLINK-30117.
---------------------------------

> Evaluate operator error log whitelist entries added in FLINK-29475
>
> Key: FLINK-30117
> URL: https://issues.apache.org/jira/browse/FLINK-30117
> Project: Flink
> Issue Type: Improvement
> Components: Kubernetes Operator
> Affects Versions: 1.17.0
> Reporter: Gabor Somogyi
> Priority: Major
[jira] [Resolved] (FLINK-30311) CI error: Back-off pulling image "flink:1.14"
[ https://issues.apache.org/jira/browse/FLINK-30311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Somogyi resolved FLINK-30311.
-----------------------------------
    Resolution: Won't Fix

See comment in FLINK-30310.

> CI error: Back-off pulling image "flink:1.14"
>
> Key: FLINK-30311
> URL: https://issues.apache.org/jira/browse/FLINK-30311
> Project: Flink
> Issue Type: Sub-task
> Components: Kubernetes Operator
> Reporter: Peter Vary
> Priority: Major
>
> CI failed with: {{Flink Deployment failed 2022-12-06T08:45:03.0244383Z org.apache.flink.kubernetes.operator.exception.DeploymentFailedException: Back-off pulling image "flink:1.14"}}
> We should find the root cause of this issue and try to mitigate it.
> https://github.com/apache/flink-kubernetes-operator/actions/runs/3627824632/jobs/6118131271
> {code:java}
> 2022-12-06T08:45:03.0243558Z 2022-12-06 08:41:44,716 o.a.f.k.o.c.FlinkDeploymentController [ERROR][default/flink-example-statemachine] Flink Deployment failed
> 2022-12-06T08:45:03.0244383Z org.apache.flink.kubernetes.operator.exception.DeploymentFailedException: Back-off pulling image "flink:1.14"
> 2022-12-06T08:45:03.0245385Z 	at org.apache.flink.kubernetes.operator.observer.deployment.AbstractFlinkDeploymentObserver.checkContainerBackoff(AbstractFlinkDeploymentObserver.java:194)
> 2022-12-06T08:45:03.0246604Z 	at org.apache.flink.kubernetes.operator.observer.deployment.AbstractFlinkDeploymentObserver.observeJmDeployment(AbstractFlinkDeploymentObserver.java:150)
> 2022-12-06T08:45:03.0247780Z 	at org.apache.flink.kubernetes.operator.observer.deployment.AbstractFlinkDeploymentObserver.observeInternal(AbstractFlinkDeploymentObserver.java:84)
> 2022-12-06T08:45:03.0248934Z 	at org.apache.flink.kubernetes.operator.observer.deployment.AbstractFlinkDeploymentObserver.observeInternal(AbstractFlinkDeploymentObserver.java:55)
> 2022-12-06T08:45:03.0249941Z 	at org.apache.flink.kubernetes.operator.observer.AbstractFlinkResourceObserver.observe(AbstractFlinkResourceObserver.java:56)
> 2022-12-06T08:45:03.0250844Z 	at org.apache.flink.kubernetes.operator.observer.AbstractFlinkResourceObserver.observe(AbstractFlinkResourceObserver.java:32)
> 2022-12-06T08:45:03.0252038Z 	at org.apache.flink.kubernetes.operator.controller.FlinkDeploymentController.reconcile(FlinkDeploymentController.java:113)
> 2022-12-06T08:45:03.0252936Z 	at org.apache.flink.kubernetes.operator.controller.FlinkDeploymentController.reconcile(FlinkDeploymentController.java:54)
> 2022-12-06T08:45:03.0253850Z 	at io.javaoperatorsdk.operator.processing.Controller$1.execute(Controller.java:136)
> 2022-12-06T08:45:03.0254412Z 	at io.javaoperatorsdk.operator.processing.Controller$1.execute(Controller.java:94)
> 2022-12-06T08:45:03.0255322Z 	at org.apache.flink.kubernetes.operator.metrics.OperatorJosdkMetrics.timeControllerExecution(OperatorJosdkMetrics.java:80)
> 2022-12-06T08:45:03.0256081Z 	at io.javaoperatorsdk.operator.processing.Controller.reconcile(Controller.java:93)
> 2022-12-06T08:45:03.0256872Z 	at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.reconcileExecution(ReconciliationDispatcher.java:130)
> 2022-12-06T08:45:03.0257804Z 	at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.handleReconcile(ReconciliationDispatcher.java:110)
> 2022-12-06T08:45:03.0258720Z 	at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.handleDispatch(ReconciliationDispatcher.java:81)
> 2022-12-06T08:45:03.0259635Z 	at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.handleExecution(ReconciliationDispatcher.java:54)
> 2022-12-06T08:45:03.0260448Z 	at io.javaoperatorsdk.operator.processing.event.EventProcessor$ReconcilerExecutor.run(EventProcessor.java:406)
> 2022-12-06T08:45:03.0261070Z 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
> 2022-12-06T08:45:03.0261595Z 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
> 2022-12-06T08:45:03.0262005Z 	at java.base/java.lang.Thread.run(Unknown Source)
> {code}
[jira] [Closed] (FLINK-30283) Evaluate operator error log entry: Error while patching status
[ https://issues.apache.org/jira/browse/FLINK-30283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Somogyi closed FLINK-30283.
---------------------------------

> Evaluate operator error log entry: Error while patching status
>
> Key: FLINK-30283
> URL: https://issues.apache.org/jira/browse/FLINK-30283
> Project: Flink
> Issue Type: Sub-task
> Reporter: Gabor Somogyi
> Priority: Major
[jira] [Resolved] (FLINK-30283) Evaluate operator error log entry: Error while patching status
[ https://issues.apache.org/jira/browse/FLINK-30283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Somogyi resolved FLINK-30283.
-----------------------------------
    Resolution: Won't Fix

See comment in FLINK-30310.

> Evaluate operator error log entry: Error while patching status
>
> Key: FLINK-30283
> URL: https://issues.apache.org/jira/browse/FLINK-30283
> Project: Flink
> Issue Type: Sub-task
> Reporter: Gabor Somogyi
> Priority: Major
[jira] [Closed] (FLINK-30311) CI error: Back-off pulling image "flink:1.14"
[ https://issues.apache.org/jira/browse/FLINK-30311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Somogyi closed FLINK-30311.
---------------------------------

> CI error: Back-off pulling image "flink:1.14"
>
> Key: FLINK-30311
> URL: https://issues.apache.org/jira/browse/FLINK-30311
> Project: Flink
> Issue Type: Sub-task
> Components: Kubernetes Operator
> Reporter: Peter Vary
> Priority: Major
>
> CI failed with: {{Flink Deployment failed 2022-12-06T08:45:03.0244383Z org.apache.flink.kubernetes.operator.exception.DeploymentFailedException: Back-off pulling image "flink:1.14"}}
> We should find the root cause of this issue and try to mitigate it.
> https://github.com/apache/flink-kubernetes-operator/actions/runs/3627824632/jobs/6118131271
> {code:java}
> 2022-12-06T08:45:03.0243558Z 2022-12-06 08:41:44,716 o.a.f.k.o.c.FlinkDeploymentController [ERROR][default/flink-example-statemachine] Flink Deployment failed
> 2022-12-06T08:45:03.0244383Z org.apache.flink.kubernetes.operator.exception.DeploymentFailedException: Back-off pulling image "flink:1.14"
> 2022-12-06T08:45:03.0245385Z 	at org.apache.flink.kubernetes.operator.observer.deployment.AbstractFlinkDeploymentObserver.checkContainerBackoff(AbstractFlinkDeploymentObserver.java:194)
> 2022-12-06T08:45:03.0246604Z 	at org.apache.flink.kubernetes.operator.observer.deployment.AbstractFlinkDeploymentObserver.observeJmDeployment(AbstractFlinkDeploymentObserver.java:150)
> 2022-12-06T08:45:03.0247780Z 	at org.apache.flink.kubernetes.operator.observer.deployment.AbstractFlinkDeploymentObserver.observeInternal(AbstractFlinkDeploymentObserver.java:84)
> 2022-12-06T08:45:03.0248934Z 	at org.apache.flink.kubernetes.operator.observer.deployment.AbstractFlinkDeploymentObserver.observeInternal(AbstractFlinkDeploymentObserver.java:55)
> 2022-12-06T08:45:03.0249941Z 	at org.apache.flink.kubernetes.operator.observer.AbstractFlinkResourceObserver.observe(AbstractFlinkResourceObserver.java:56)
> 2022-12-06T08:45:03.0250844Z 	at org.apache.flink.kubernetes.operator.observer.AbstractFlinkResourceObserver.observe(AbstractFlinkResourceObserver.java:32)
> 2022-12-06T08:45:03.0252038Z 	at org.apache.flink.kubernetes.operator.controller.FlinkDeploymentController.reconcile(FlinkDeploymentController.java:113)
> 2022-12-06T08:45:03.0252936Z 	at org.apache.flink.kubernetes.operator.controller.FlinkDeploymentController.reconcile(FlinkDeploymentController.java:54)
> 2022-12-06T08:45:03.0253850Z 	at io.javaoperatorsdk.operator.processing.Controller$1.execute(Controller.java:136)
> 2022-12-06T08:45:03.0254412Z 	at io.javaoperatorsdk.operator.processing.Controller$1.execute(Controller.java:94)
> 2022-12-06T08:45:03.0255322Z 	at org.apache.flink.kubernetes.operator.metrics.OperatorJosdkMetrics.timeControllerExecution(OperatorJosdkMetrics.java:80)
> 2022-12-06T08:45:03.0256081Z 	at io.javaoperatorsdk.operator.processing.Controller.reconcile(Controller.java:93)
> 2022-12-06T08:45:03.0256872Z 	at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.reconcileExecution(ReconciliationDispatcher.java:130)
> 2022-12-06T08:45:03.0257804Z 	at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.handleReconcile(ReconciliationDispatcher.java:110)
> 2022-12-06T08:45:03.0258720Z 	at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.handleDispatch(ReconciliationDispatcher.java:81)
> 2022-12-06T08:45:03.0259635Z 	at io.javaoperatorsdk.operator.processing.event.ReconciliationDispatcher.handleExecution(ReconciliationDispatcher.java:54)
> 2022-12-06T08:45:03.0260448Z 	at io.javaoperatorsdk.operator.processing.event.EventProcessor$ReconcilerExecutor.run(EventProcessor.java:406)
> 2022-12-06T08:45:03.0261070Z 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
> 2022-12-06T08:45:03.0261595Z 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
> 2022-12-06T08:45:03.0262005Z 	at java.base/java.lang.Thread.run(Unknown Source)
> {code}
[jira] [Resolved] (FLINK-30117) Evaluate operator error log whitelist entries added in FLINK-29475
[ https://issues.apache.org/jira/browse/FLINK-30117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Somogyi resolved FLINK-30117.
-----------------------------------
    Resolution: Won't Fix

See comment in FLINK-30310.

> Evaluate operator error log whitelist entries added in FLINK-29475
>
> Key: FLINK-30117
> URL: https://issues.apache.org/jira/browse/FLINK-30117
> Project: Flink
> Issue Type: Improvement
> Components: Kubernetes Operator
> Affects Versions: 1.17.0
> Reporter: Gabor Somogyi
> Priority: Major
[jira] [Created] (FLINK-34198) Remove e2e test operator log error check
Gabor Somogyi created FLINK-34198:
----------------------------------

Summary: Remove e2e test operator log error check
Key: FLINK-34198
URL: https://issues.apache.org/jira/browse/FLINK-34198
Project: Flink
Issue Type: Improvement
Components: Kubernetes Operator
Affects Versions: 1.8.4
Reporter: Gabor Somogyi

There are too many false positives because of negative test cases, and it's not realistic to be so strict that error-typed messages can't appear in the operator log.
[jira] [Assigned] (FLINK-34198) Remove e2e test operator log error check
[ https://issues.apache.org/jira/browse/FLINK-34198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Somogyi reassigned FLINK-34198:
-------------------------------------
    Assignee: Gabor Somogyi

> Remove e2e test operator log error check
>
> Key: FLINK-34198
> URL: https://issues.apache.org/jira/browse/FLINK-34198
> Project: Flink
> Issue Type: Improvement
> Components: Kubernetes Operator
> Affects Versions: 1.8.4
> Reporter: Gabor Somogyi
> Assignee: Gabor Somogyi
> Priority: Major
>
> There are too many false positives because of negative test cases, and it's not realistic to be so strict that error-typed messages can't appear in the operator log.
[jira] [Resolved] (FLINK-34198) Remove e2e test operator log error check
[ https://issues.apache.org/jira/browse/FLINK-34198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Somogyi resolved FLINK-34198.
-----------------------------------
    Fix Version/s: 1.8.0
       Resolution: Fixed

[{{31d01f2}}|https://github.com/apache/flink-kubernetes-operator/commit/31d01f246d8a344b560aab1653b7aba561baea26] on main

> Remove e2e test operator log error check
>
> Key: FLINK-34198
> URL: https://issues.apache.org/jira/browse/FLINK-34198
> Project: Flink
> Issue Type: Improvement
> Components: Kubernetes Operator
> Affects Versions: 1.8.4
> Reporter: Gabor Somogyi
> Assignee: Gabor Somogyi
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.8.0
>
> There are too many false positives because of negative test cases, and it's not realistic to be so strict that error-typed messages can't appear in the operator log.
[jira] [Updated] (FLINK-34198) Remove e2e test operator log error check
[ https://issues.apache.org/jira/browse/FLINK-34198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Somogyi updated FLINK-34198:
----------------------------------
    Affects Version/s: 1.8.0
                       (was: 1.8.4)

> Remove e2e test operator log error check
>
> Key: FLINK-34198
> URL: https://issues.apache.org/jira/browse/FLINK-34198
> Project: Flink
> Issue Type: Improvement
> Components: Kubernetes Operator
> Affects Versions: 1.8.0
> Reporter: Gabor Somogyi
> Assignee: Gabor Somogyi
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.8.0
>
> There are too many false positives because of negative test cases, and it's not realistic to be so strict that error-typed messages can't appear in the operator log.
[jira] [Closed] (FLINK-34198) Remove e2e test operator log error check
[ https://issues.apache.org/jira/browse/FLINK-34198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Somogyi closed FLINK-34198.
---------------------------------

> Remove e2e test operator log error check
>
> Key: FLINK-34198
> URL: https://issues.apache.org/jira/browse/FLINK-34198
> Project: Flink
> Issue Type: Improvement
> Components: Kubernetes Operator
> Affects Versions: 1.8.0
> Reporter: Gabor Somogyi
> Assignee: Gabor Somogyi
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.8.0
>
> There are too many false positives because of negative test cases, and it's not realistic to be so strict that error-typed messages can't appear in the operator log.
[jira] [Updated] (FLINK-33268) Flink REST API response parsing should support backward compatible changes like new fields
[ https://issues.apache.org/jira/browse/FLINK-33268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Somogyi updated FLINK-33268:
----------------------------------
    Description:
At the moment Flink does not ignore unknown fields when parsing REST responses. An example of such a class is JobDetailsInfo, but this applies to all others. It would be good to add this support to increase compatibility.

The real-life use case is when the operator wants to handle 2 jobs with 2 different Flink versions, where the newer version has added a new field to a REST response. In such a case the operator gets an exception when it tries to poll the job details with the additional field.

  was:
At the moment Flink does not ignore unknown fields when parsing REST responses. An example of such a class is JobDetailsInfo, but this applies to all others. It would be good to add this support to increase compatibility.

> Flink REST API response parsing should support backward compatible changes like new fields
>
> Key: FLINK-33268
> URL: https://issues.apache.org/jira/browse/FLINK-33268
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / REST
> Affects Versions: 1.19.0
> Reporter: Gabor Somogyi
> Priority: Major
>
> At the moment Flink does not ignore unknown fields when parsing REST responses. An example of such a class is JobDetailsInfo, but this applies to all others. It would be good to add this support to increase compatibility.
> The real-life use case is when the operator wants to handle 2 jobs with 2 different Flink versions, where the newer version has added a new field to a REST response. In such a case the operator gets an exception when it tries to poll the job details with the additional field.
[jira] [Updated] (FLINK-33268) Flink REST API response parsing should support backward compatible changes like new fields
[ https://issues.apache.org/jira/browse/FLINK-33268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabor Somogyi updated FLINK-33268:
----------------------------------
    Description:
At the moment Flink does not ignore unknown fields when parsing REST responses. An example of such a class is JobDetailsInfo, but this applies to all others. It would be good to add this support to increase compatibility.

The real-life use case is when the Flink k8s operator wants to handle 2 jobs with 2 different Flink versions, where the newer version has added a new field to a REST response. In such a case the operator gets an exception when it tries to poll the job details with the additional field.

  was:
At the moment Flink does not ignore unknown fields when parsing REST responses. An example of such a class is JobDetailsInfo, but this applies to all others. It would be good to add this support to increase compatibility.

The real-life use case is when the operator wants to handle 2 jobs with 2 different Flink versions, where the newer version has added a new field to a REST response. In such a case the operator gets an exception when it tries to poll the job details with the additional field.

> Flink REST API response parsing should support backward compatible changes like new fields
>
> Key: FLINK-33268
> URL: https://issues.apache.org/jira/browse/FLINK-33268
> Project: Flink
> Issue Type: Improvement
> Components: Runtime / REST
> Affects Versions: 1.19.0
> Reporter: Gabor Somogyi
> Priority: Major
>
> At the moment Flink does not ignore unknown fields when parsing REST responses. An example of such a class is JobDetailsInfo, but this applies to all others. It would be good to add this support to increase compatibility.
> The real-life use case is when the Flink k8s operator wants to handle 2 jobs with 2 different Flink versions, where the newer version has added a new field to a REST response. In such a case the operator gets an exception when it tries to poll the job details with the additional field.
[jira] [Updated] (FLINK-33268) Flink REST API response parsing should support backward compatible changes like new fields
[ https://issues.apache.org/jira/browse/FLINK-33268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi updated FLINK-33268: -- Description: At the moment Flink does not ignore unknown fields when parsing REST responses. An example of such a class is JobDetailsInfo, but this applies to all others. It would be good to add this support to increase compatibility. The real-life use-case is when the Flink k8s operator wants to handle 2 jobs with 2 different Flink versions where the newer version has added a new field to any REST response. In such a case the operator gets an exception when, for example, it tries to poll the job details with the additional field. was: At the moment Flink does not ignore unknown fields when parsing REST responses. An example of such a class is JobDetailsInfo, but this applies to all others. It would be good to add this support to increase compatibility. The real-life use-case is when the Flink k8s operator wants to handle 2 jobs with 2 different Flink versions where the newer version has added a new field to any REST response. In such a case the operator gets an exception when it tries to poll the job details with the additional field. > Flink REST API response parsing should support backward compatible changes > like new fields > -- > > Key: FLINK-33268 > URL: https://issues.apache.org/jira/browse/FLINK-33268 > Project: Flink > Issue Type: Improvement > Components: Runtime / REST >Affects Versions: 1.19.0 >Reporter: Gabor Somogyi >Priority: Major > > At the moment Flink does not ignore unknown fields when parsing REST > responses. An example of such a class is JobDetailsInfo, but this applies to > all others. It would be good to add this support to increase compatibility. > The real-life use-case is when the Flink k8s operator wants to handle 2 jobs > with 2 different Flink versions where the newer version has added a new field > to any REST response. In such a case the operator gets an exception when, for > example, it tries to poll the job details with the additional field. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-33268) Flink REST API response parsing throws exception on new fields
[ https://issues.apache.org/jira/browse/FLINK-33268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi updated FLINK-33268: -- Summary: Flink REST API response parsing throws exception on new fields (was: Flink REST API response parsing should support backward compatible changes like new fields) > Flink REST API response parsing throws exception on new fields > -- > > Key: FLINK-33268 > URL: https://issues.apache.org/jira/browse/FLINK-33268 > Project: Flink > Issue Type: Improvement > Components: Runtime / REST >Affects Versions: 1.19.0 >Reporter: Gabor Somogyi >Priority: Major > > At the moment Flink does not ignore unknown fields when parsing REST > responses. An example of such a class is JobDetailsInfo, but this applies to > all others. It would be good to add this support to increase compatibility. > The real-life use-case is when the Flink k8s operator wants to handle 2 jobs > with 2 different Flink versions where the newer version has added a new field > to any REST response. In such a case the operator gets an exception when, for > example, it tries to poll the job details with the additional field. -- This message was sent by Atlassian Jira (v8.20.10#820010)
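The change being tracked here amounts to relaxing Jackson's strict deserialization on the client side. A minimal sketch of the idea, using plain Jackson (Flink ships a shaded copy) and a hypothetical JobSummary class as a stand-in for generated response types like JobDetailsInfo:
{code:java}
import com.fasterxml.jackson.databind.DeserializationFeature;
import com.fasterxml.jackson.databind.ObjectMapper;

public class LenientRestParsing {

    // Hypothetical stand-in for response classes like JobDetailsInfo
    public static class JobSummary {
        public String jid;
        public String name;
    }

    public static void main(String[] args) throws Exception {
        ObjectMapper mapper = new ObjectMapper()
                // Skip fields added by newer server versions instead of throwing
                .disable(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES);

        // Response from a newer Flink version that added an extra field
        String json = "{\"jid\":\"a1\",\"name\":\"my-job\",\"slot-sharing-group\":\"default\"}";

        JobSummary summary = mapper.readValue(json, JobSummary.class);
        System.out.println(summary.name); // prints: my-job
    }
}
{code}
With the strict default, the readValue call above would fail with an UnrecognizedPropertyException on the extra field.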
[jira] [Commented] (FLINK-33556) Test infrastructure for externalized python code
[ https://issues.apache.org/jira/browse/FLINK-33556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17794087#comment-17794087 ] Gabor Somogyi commented on FLINK-33556: --- e4f3898 on master > Test infrastructure for externalized python code > > > Key: FLINK-33556 > URL: https://issues.apache.org/jira/browse/FLINK-33556 > Project: Flink > Issue Type: Sub-task > Components: API / Python, Connectors / Common >Affects Versions: 1.18.0 >Reporter: Márton Balassi >Assignee: Peter Vary >Priority: Major > Labels: pull-request-available > Fix For: 1.19.0 > > > We need to establish the reusable parts of the Python infrastructure as part > of the shared connector utils so that they can be easily reused. Ideally we > would create a GitHub workflow similar to > https://github.com/apache/flink-connector-shared-utils/blob/ci_utils/.github/workflows/ci.yml. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (FLINK-33556) Test infrastructure for externalized python code
[ https://issues.apache.org/jira/browse/FLINK-33556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi resolved FLINK-33556. --- Resolution: Fixed > Test infrastructure for externalized python code > > > Key: FLINK-33556 > URL: https://issues.apache.org/jira/browse/FLINK-33556 > Project: Flink > Issue Type: Sub-task > Components: API / Python, Connectors / Common >Affects Versions: 1.18.0 >Reporter: Márton Balassi >Assignee: Peter Vary >Priority: Major > Labels: pull-request-available > Fix For: 1.19.0 > > > We need to establish the reusable parts of the Python infrastructure as part > of the shared connector utils so that they can be easily reused. Ideally we > would create a GitHub workflow similar to > https://github.com/apache/flink-connector-shared-utils/blob/ci_utils/.github/workflows/ci.yml. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-33556) Test infrastructure for externalized python code
[ https://issues.apache.org/jira/browse/FLINK-33556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17794102#comment-17794102 ] Gabor Somogyi commented on FLINK-33556: --- 7691962 on ci_utils > Test infrastructure for externalized python code > > > Key: FLINK-33556 > URL: https://issues.apache.org/jira/browse/FLINK-33556 > Project: Flink > Issue Type: Sub-task > Components: API / Python, Connectors / Common >Affects Versions: 1.18.0 >Reporter: Márton Balassi >Assignee: Peter Vary >Priority: Major > Labels: pull-request-available > Fix For: 1.19.0 > > > We need to establish the reusable parts of the Python infrastructure as part > of the shared connector utils so that they can be easily reused. Ideally we > would create a GitHub workflow similar to > https://github.com/apache/flink-connector-shared-utils/blob/ci_utils/.github/workflows/ci.yml. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-33556) Test infrastructure for externalized python code
[ https://issues.apache.org/jira/browse/FLINK-33556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi closed FLINK-33556. - > Test infrastructure for externalized python code > > > Key: FLINK-33556 > URL: https://issues.apache.org/jira/browse/FLINK-33556 > Project: Flink > Issue Type: Sub-task > Components: API / Python, Connectors / Common >Affects Versions: 1.18.0 >Reporter: Márton Balassi >Assignee: Peter Vary >Priority: Major > Labels: pull-request-available > Fix For: 1.19.0 > > > We need to establish the reusable parts of the Python infrastructure as part > of the shared connector utils so that they can be easily reused. Ideally we > would create a GitHub workflow similar to > https://github.com/apache/flink-connector-shared-utils/blob/ci_utils/.github/workflows/ci.yml. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-33559) Externalize Kafka Python connector code
[ https://issues.apache.org/jira/browse/FLINK-33559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17795239#comment-17795239 ] Gabor Somogyi commented on FLINK-33559: --- c38a040 on main > Externalize Kafka Python connector code > --- > > Key: FLINK-33559 > URL: https://issues.apache.org/jira/browse/FLINK-33559 > Project: Flink > Issue Type: Sub-task > Components: API / Python, Connectors / Kafka >Affects Versions: 1.18.0 >Reporter: Márton Balassi >Assignee: Peter Vary >Priority: Major > Labels: pull-request-available > Fix For: 1.19.0 > > > See description of parent ticket for context. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-33559) Externalize Kafka Python connector code
[ https://issues.apache.org/jira/browse/FLINK-33559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi closed FLINK-33559. - > Externalize Kafka Python connector code > --- > > Key: FLINK-33559 > URL: https://issues.apache.org/jira/browse/FLINK-33559 > Project: Flink > Issue Type: Sub-task > Components: API / Python, Connectors / Kafka >Affects Versions: 1.18.0 >Reporter: Márton Balassi >Assignee: Peter Vary >Priority: Major > Labels: pull-request-available > Fix For: 1.19.0 > > > See description of parent ticket for context. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (FLINK-33559) Externalize Kafka Python connector code
[ https://issues.apache.org/jira/browse/FLINK-33559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi resolved FLINK-33559. --- Resolution: Fixed > Externalize Kafka Python connector code > --- > > Key: FLINK-33559 > URL: https://issues.apache.org/jira/browse/FLINK-33559 > Project: Flink > Issue Type: Sub-task > Components: API / Python, Connectors / Kafka >Affects Versions: 1.18.0 >Reporter: Márton Balassi >Assignee: Peter Vary >Priority: Major > Labels: pull-request-available > Fix For: 1.19.0 > > > See description of parent ticket for context. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-33268) Flink REST API response parsing throws exception on new fields
[ https://issues.apache.org/jira/browse/FLINK-33268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi reassigned FLINK-33268: - Assignee: Gabor Somogyi > Flink REST API response parsing throws exception on new fields > -- > > Key: FLINK-33268 > URL: https://issues.apache.org/jira/browse/FLINK-33268 > Project: Flink > Issue Type: Improvement > Components: Runtime / REST >Affects Versions: 1.19.0 >Reporter: Gabor Somogyi >Assignee: Gabor Somogyi >Priority: Major > Labels: pull-request-available > > At the moment Flink does not ignore unknown fields when parsing REST > responses. An example of such a class is JobDetailsInfo, but this applies to > all others. It would be good to add this support to increase compatibility. > The real-life use-case is when the Flink k8s operator wants to handle 2 jobs > with 2 different Flink versions where the newer version has added a new field > to any REST response. In such a case the operator gets an exception when, for > example, it tries to poll the job details with the additional field. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-22059) add a new option in rocksdb statebackend to enable job threads setting
[ https://issues.apache.org/jira/browse/FLINK-22059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi reassigned FLINK-22059: - Assignee: zhongyou.lee (was: xiaogang zhou) > add a new option in rocksdb statebackend to enable job threads setting > -- > > Key: FLINK-22059 > URL: https://issues.apache.org/jira/browse/FLINK-22059 > Project: Flink > Issue Type: Improvement > Components: Runtime / State Backends >Affects Versions: 1.12.2 >Reporter: xiaogang zhou >Assignee: zhongyou.lee >Priority: Minor > Labels: auto-deprioritized-major, auto-unassigned, stale-assigned > Fix For: 2.0.0 > > > As discussed in FLINK-21688, we are now using the setIncreaseParallelism > function to set the number of RocksDB's working threads. > > Can we enable another setting key to set RocksDB's max background jobs, > which will allow a larger number of flush threads? -- This message was sent by Atlassian Jira (v8.20.10#820010)
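Until a dedicated option exists, roughly this effect can already be achieved with a custom options factory. A minimal sketch, assuming the RocksDBOptionsFactory interface of the 1.12-era RocksDB state backend; the thread count of 8 is purely illustrative:
{code:java}
import java.util.Collection;

import org.apache.flink.contrib.streaming.state.RocksDBOptionsFactory;
import org.rocksdb.ColumnFamilyOptions;
import org.rocksdb.DBOptions;

// Sets RocksDB's max background jobs directly instead of going through
// setIncreaseParallelism; the value 8 is an illustrative choice only.
public class BackgroundJobsOptionsFactory implements RocksDBOptionsFactory {

    @Override
    public DBOptions createDBOptions(
            DBOptions currentOptions, Collection<AutoCloseable> handlesToClose) {
        return currentOptions.setMaxBackgroundJobs(8);
    }

    @Override
    public ColumnFamilyOptions createColumnOptions(
            ColumnFamilyOptions currentOptions, Collection<AutoCloseable> handlesToClose) {
        // Column family options left unchanged in this sketch
        return currentOptions;
    }
}
{code}
The factory would then be registered on the RocksDB state backend via its setRocksDBOptions(...) hook.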
[jira] [Created] (FLINK-35969) Remove deprecated dataset based API from State Processor API
Gabor Somogyi created FLINK-35969: - Summary: Remove deprecated dataset based API from State Processor API Key: FLINK-35969 URL: https://issues.apache.org/jira/browse/FLINK-35969 Project: Flink Issue Type: Improvement Components: API / State Processor Affects Versions: 2.0.0 Reporter: Gabor Somogyi -- This message was sent by Atlassian Jira (v8.20.10#820010)
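For reference, after the removal the DataStream-based reader is the way to inspect savepoints. A minimal sketch against the 1.19-era SavepointReader API; the savepoint path, operator uid and state name below are placeholders:
{code:java}
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.runtime.state.hashmap.HashMapStateBackend;
import org.apache.flink.state.api.SavepointReader;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ReadSavepointWithDataStreamApi {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // DataStream-based reader that replaces the removed DataSet-based Savepoint API
        SavepointReader savepoint =
                SavepointReader.read(env, "file:///tmp/savepoint-1234", new HashMapStateBackend());

        // "my-uid" and "my-list-state" are placeholders for real operator state
        DataStream<Long> listState = savepoint.readListState("my-uid", "my-list-state", Types.LONG);

        listState.print();
        env.execute("read-savepoint");
    }
}
{code}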
[jira] [Assigned] (FLINK-35969) Remove deprecated dataset based API from State Processor API
[ https://issues.apache.org/jira/browse/FLINK-35969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi reassigned FLINK-35969: - Assignee: Gabor Somogyi > Remove deprecated dataset based API from State Processor API > > > Key: FLINK-35969 > URL: https://issues.apache.org/jira/browse/FLINK-35969 > Project: Flink > Issue Type: Improvement > Components: API / State Processor >Affects Versions: 2.0.0 >Reporter: Gabor Somogyi >Assignee: Gabor Somogyi >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-36001) Store operator name and UID in state metadata
Gabor Somogyi created FLINK-36001: - Summary: Store operator name and UID in state metadata Key: FLINK-36001 URL: https://issues.apache.org/jira/browse/FLINK-36001 Project: Flink Issue Type: Improvement Components: Runtime / State Backends Affects Versions: 2.0.0 Reporter: Gabor Somogyi -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-36001) Store operator name and UID in state metadata
[ https://issues.apache.org/jira/browse/FLINK-36001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi reassigned FLINK-36001: - Assignee: Gabor Somogyi > Store operator name and UID in state metadata > - > > Key: FLINK-36001 > URL: https://issues.apache.org/jira/browse/FLINK-36001 > Project: Flink > Issue Type: Improvement > Components: Runtime / State Backends >Affects Versions: 2.0.0 >Reporter: Gabor Somogyi >Assignee: Gabor Somogyi >Priority: Major > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (FLINK-35969) Remove deprecated dataset based API from State Processor API
[ https://issues.apache.org/jira/browse/FLINK-35969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi resolved FLINK-35969. --- Fix Version/s: 2.0.0 Resolution: Fixed > Remove deprecated dataset based API from State Processor API > > > Key: FLINK-35969 > URL: https://issues.apache.org/jira/browse/FLINK-35969 > Project: Flink > Issue Type: Improvement > Components: API / State Processor >Affects Versions: 2.0.0 >Reporter: Gabor Somogyi >Assignee: Gabor Somogyi >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-35969) Remove deprecated dataset based API from State Processor API
[ https://issues.apache.org/jira/browse/FLINK-35969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17871663#comment-17871663 ] Gabor Somogyi commented on FLINK-35969: --- 2c48175 on master > Remove deprecated dataset based API from State Processor API > > > Key: FLINK-35969 > URL: https://issues.apache.org/jira/browse/FLINK-35969 > Project: Flink > Issue Type: Improvement > Components: API / State Processor >Affects Versions: 2.0.0 >Reporter: Gabor Somogyi >Assignee: Gabor Somogyi >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-35969) Remove deprecated dataset based API from State Processor API
[ https://issues.apache.org/jira/browse/FLINK-35969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi closed FLINK-35969. - > Remove deprecated dataset based API from State Processor API > > > Key: FLINK-35969 > URL: https://issues.apache.org/jira/browse/FLINK-35969 > Project: Flink > Issue Type: Improvement > Components: API / State Processor >Affects Versions: 2.0.0 >Reporter: Gabor Somogyi >Assignee: Gabor Somogyi >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-36058) OperatorTestHarness is always using checkpointId=0 for state recovery
[ https://issues.apache.org/jira/browse/FLINK-36058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17873727#comment-17873727 ] Gabor Somogyi commented on FLINK-36058: --- If I understand correctly, this has no Flink code impact, just tests, right? > OperatorTestHarness is always using checkpointId=0 for state recovery > - > > Key: FLINK-36058 > URL: https://issues.apache.org/jira/browse/FLINK-36058 > Project: Flink > Issue Type: Improvement > Components: Test Infrastructure, Tests >Reporter: Rodrigo Meneses >Priority: Major > > OperatorTestHarness's last completed checkpoint for recovery is always reset > to 0. > This looks like a known limitation: > [https://github.com/apache/flink/blob/master/flink-streaming-java/src/test/java/org/apache/flink/streaming/runtime/operators/sink/CommitterOperatorTestBase.java#L223] > > By fixing this issue, users will be able to use a checkpointId different from > zero in their state recovery unit tests. > -- This message was sent by Atlassian Jira (v8.20.10#820010)
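A hypothetical illustration of the limitation with a OneInputStreamOperatorTestHarness: the snapshot is taken at checkpoint 42, but per the ticket the harness reports 0 as the last completed checkpoint after restore. The StreamMap identity operator is just a convenient placeholder:
{code:java}
import org.apache.flink.api.common.functions.MapFunction;
import org.apache.flink.runtime.checkpoint.OperatorSubtaskState;
import org.apache.flink.streaming.api.operators.StreamMap;
import org.apache.flink.streaming.util.OneInputStreamOperatorTestHarness;

public class RestoreCheckpointIdDemo {
    public static void main(String[] args) throws Exception {
        OneInputStreamOperatorTestHarness<String, String> harness =
                new OneInputStreamOperatorTestHarness<>(
                        new StreamMap<>((MapFunction<String, String>) v -> v));
        harness.setup();
        harness.open();

        // Snapshot taken at checkpoint id 42
        OperatorSubtaskState snapshot = harness.snapshot(42L, 1_000L);
        harness.close();

        OneInputStreamOperatorTestHarness<String, String> restored =
                new OneInputStreamOperatorTestHarness<>(
                        new StreamMap<>((MapFunction<String, String>) v -> v));
        // State is restored, but the harness resets the restored checkpoint
        // id to 0 instead of 42 - the limitation described in the ticket
        restored.initializeState(snapshot);
        restored.open();
        restored.close();
    }
}
{code}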
[jira] [Created] (FLINK-36124) S3RecoverableFsDataOutputStream.sync closes the stream and further write operation throw exception
Gabor Somogyi created FLINK-36124: - Summary: S3RecoverableFsDataOutputStream.sync closes the stream and further write operation throw exception Key: FLINK-36124 URL: https://issues.apache.org/jira/browse/FLINK-36124 Project: Flink Issue Type: Bug Components: Connectors / AWS Affects Versions: 2.0.0 Reporter: Gabor Somogyi This behaviour was introduced in FLINK-28513. Rationale for why I think it's a bug: * The `sync` method is defined in `FSDataOutputStream` with the following contract: {code:java} /** * Flushes the data all the way to the persistent non-volatile storage (for example disks). The * method behaves similar to the fsync function, forcing all data to be persistent on the * devices. * * @throws IOException Thrown if an I/O error occurs */ {code} * After a `sync` method call, the user of the writer instance is expected to be able to call further `write` methods * What actually happens is that the next write blows up with the following exception: {code:java} java.io.IOException: Stream closed. at org.apache.flink.core.fs.RefCountedFileWithStream.requireOpened(RefCountedFileWithStream.java:72) at org.apache.flink.core.fs.RefCountedFileWithStream.write(RefCountedFileWithStream.java:52) at org.apache.flink.core.fs.RefCountedBufferingFileStream.flush(RefCountedBufferingFileStream.java:104) at org.apache.flink.core.fs.RefCountedBufferingFileStream.write(RefCountedBufferingFileStream.java:87) at org.apache.flink.fs.s3.common.writer.S3RecoverableFsDataOutputStream.write(S3RecoverableFsDataOutputStream.java:112) at java.base/java.io.OutputStream.write(OutputStream.java:122) {code} * This can be easily tested with `S3RecoverableFsDataOutputStreamTest.testSync`. Please remove the `expected = Exception.class` from the beginning of the test. * The following line in the test verifies nothing because it is never called: https://github.com/apache/flink/blob/56c81995d3b34ed9066b6771755407b93438f5ab/flink-filesystems/flink-s3-fs-base/src/test/java/org/apache/flink/fs/s3/common/writer/S3RecoverableFsDataOutputStreamTest.java#L264 -- This message was sent by Atlassian Jira (v8.20.10#820010)
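A minimal sketch of the reported usage pattern, assuming an S3 filesystem plugin backed by the recoverable writer is on the classpath; the bucket and key are placeholders:
{code:java}
import org.apache.flink.core.fs.FSDataOutputStream;
import org.apache.flink.core.fs.FileSystem;
import org.apache.flink.core.fs.FileSystem.WriteMode;
import org.apache.flink.core.fs.Path;

public class SyncThenWrite {
    public static void main(String[] args) throws Exception {
        // Placeholder bucket/key; needs an S3 filesystem plugin at runtime
        Path path = new Path("s3://my-bucket/sync-then-write");
        FileSystem fs = path.getFileSystem();
        try (FSDataOutputStream out = fs.create(path, WriteMode.OVERWRITE)) {
            out.write(new byte[] {1, 2, 3});
            // Per the FSDataOutputStream contract this should only persist data...
            out.sync();
            // ...but with S3RecoverableFsDataOutputStream the next write fails
            // with "java.io.IOException: Stream closed."
            out.write(new byte[] {4, 5, 6});
        }
    }
}
{code}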
[jira] [Assigned] (FLINK-36124) S3RecoverableFsDataOutputStream.sync closes the stream and further write operation throw exception
[ https://issues.apache.org/jira/browse/FLINK-36124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi reassigned FLINK-36124: - Assignee: Gabor Somogyi > S3RecoverableFsDataOutputStream.sync closes the stream and further write > operation throw exception > -- > > Key: FLINK-36124 > URL: https://issues.apache.org/jira/browse/FLINK-36124 > Project: Flink > Issue Type: Bug > Components: Connectors / AWS >Affects Versions: 2.0.0 >Reporter: Gabor Somogyi >Assignee: Gabor Somogyi >Priority: Blocker > > This behaviour was introduced in FLINK-28513. > Rationale for why I think it's a bug: > * The `sync` method is defined in `FSDataOutputStream` with the following > contract: > {code:java} > /** > * Flushes the data all the way to the persistent non-volatile storage > (for example disks). The > * method behaves similar to the fsync function, forcing all data > to be persistent on the > * devices. > * > * @throws IOException Thrown if an I/O error occurs > */ > {code} > * After a `sync` method call, the user of the writer instance is expected to > be able to call further `write` methods > * What actually happens is that the next write blows up with the > following exception: > {code:java} > java.io.IOException: Stream closed. > at > org.apache.flink.core.fs.RefCountedFileWithStream.requireOpened(RefCountedFileWithStream.java:72) > at > org.apache.flink.core.fs.RefCountedFileWithStream.write(RefCountedFileWithStream.java:52) > at > org.apache.flink.core.fs.RefCountedBufferingFileStream.flush(RefCountedBufferingFileStream.java:104) > at > org.apache.flink.core.fs.RefCountedBufferingFileStream.write(RefCountedBufferingFileStream.java:87) > at > org.apache.flink.fs.s3.common.writer.S3RecoverableFsDataOutputStream.write(S3RecoverableFsDataOutputStream.java:112) > at java.base/java.io.OutputStream.write(OutputStream.java:122) > {code} > * This can be easily tested with > `S3RecoverableFsDataOutputStreamTest.testSync`. Please remove the `expected = > Exception.class` from the beginning of the test. > * The following line in the test verifies nothing because it is never > called: > https://github.com/apache/flink/blob/56c81995d3b34ed9066b6771755407b93438f5ab/flink-filesystems/flink-s3-fs-base/src/test/java/org/apache/flink/fs/s3/common/writer/S3RecoverableFsDataOutputStreamTest.java#L264 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-36124) S3RecoverableFsDataOutputStream.sync closes the stream and further write operation throw exception
[ https://issues.apache.org/jira/browse/FLINK-36124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi reassigned FLINK-36124: - Assignee: (was: Gabor Somogyi) > S3RecoverableFsDataOutputStream.sync closes the stream and further write > operation throw exception > -- > > Key: FLINK-36124 > URL: https://issues.apache.org/jira/browse/FLINK-36124 > Project: Flink > Issue Type: Bug > Components: Connectors / AWS >Affects Versions: 2.0.0 >Reporter: Gabor Somogyi >Priority: Blocker > > This behaviour was introduced in FLINK-28513. > Rationale for why I think it's a bug: > * The `sync` method is defined in `FSDataOutputStream` with the following > contract: > {code:java} > /** > * Flushes the data all the way to the persistent non-volatile storage > (for example disks). The > * method behaves similar to the fsync function, forcing all data > to be persistent on the > * devices. > * > * @throws IOException Thrown if an I/O error occurs > */ > {code} > * After a `sync` method call, the user of the writer instance is expected to > be able to call further `write` methods > * What actually happens is that the next write blows up with the > following exception: > {code:java} > java.io.IOException: Stream closed. > at > org.apache.flink.core.fs.RefCountedFileWithStream.requireOpened(RefCountedFileWithStream.java:72) > at > org.apache.flink.core.fs.RefCountedFileWithStream.write(RefCountedFileWithStream.java:52) > at > org.apache.flink.core.fs.RefCountedBufferingFileStream.flush(RefCountedBufferingFileStream.java:104) > at > org.apache.flink.core.fs.RefCountedBufferingFileStream.write(RefCountedBufferingFileStream.java:87) > at > org.apache.flink.fs.s3.common.writer.S3RecoverableFsDataOutputStream.write(S3RecoverableFsDataOutputStream.java:112) > at java.base/java.io.OutputStream.write(OutputStream.java:122) > {code} > * This can be easily tested with > `S3RecoverableFsDataOutputStreamTest.testSync`. Please remove the `expected = > Exception.class` from the beginning of the test. > * The following line in the test verifies nothing because it is never > called: > https://github.com/apache/flink/blob/56c81995d3b34ed9066b6771755407b93438f5ab/flink-filesystems/flink-s3-fs-base/src/test/java/org/apache/flink/fs/s3/common/writer/S3RecoverableFsDataOutputStreamTest.java#L264 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-36124) S3RecoverableFsDataOutputStream.sync closes the stream and further write operation throw exception
[ https://issues.apache.org/jira/browse/FLINK-36124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi reassigned FLINK-36124: - Assignee: Gabor Somogyi > S3RecoverableFsDataOutputStream.sync closes the stream and further write > operation throw exception > -- > > Key: FLINK-36124 > URL: https://issues.apache.org/jira/browse/FLINK-36124 > Project: Flink > Issue Type: Bug > Components: Connectors / AWS >Affects Versions: 2.0.0 >Reporter: Gabor Somogyi >Assignee: Gabor Somogyi >Priority: Blocker > > This behaviour was introduced in FLINK-28513. > Rationale for why I think it's a bug: > * The `sync` method is defined in `FSDataOutputStream` with the following > contract: > {code:java} > /** > * Flushes the data all the way to the persistent non-volatile storage > (for example disks). The > * method behaves similar to the fsync function, forcing all data > to be persistent on the > * devices. > * > * @throws IOException Thrown if an I/O error occurs > */ > {code} > * After a `sync` method call, the user of the writer instance is expected to > be able to call further `write` methods > * What actually happens is that the next write blows up with the > following exception: > {code:java} > java.io.IOException: Stream closed. > at > org.apache.flink.core.fs.RefCountedFileWithStream.requireOpened(RefCountedFileWithStream.java:72) > at > org.apache.flink.core.fs.RefCountedFileWithStream.write(RefCountedFileWithStream.java:52) > at > org.apache.flink.core.fs.RefCountedBufferingFileStream.flush(RefCountedBufferingFileStream.java:104) > at > org.apache.flink.core.fs.RefCountedBufferingFileStream.write(RefCountedBufferingFileStream.java:87) > at > org.apache.flink.fs.s3.common.writer.S3RecoverableFsDataOutputStream.write(S3RecoverableFsDataOutputStream.java:112) > at java.base/java.io.OutputStream.write(OutputStream.java:122) > {code} > * This can be easily tested with > `S3RecoverableFsDataOutputStreamTest.testSync`. Please remove the `expected = > Exception.class` from the beginning of the test. > * The following line in the test verifies nothing because it is never > called: > https://github.com/apache/flink/blob/56c81995d3b34ed9066b6771755407b93438f5ab/flink-filesystems/flink-s3-fs-base/src/test/java/org/apache/flink/fs/s3/common/writer/S3RecoverableFsDataOutputStreamTest.java#L264 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-36124) S3RecoverableFsDataOutputStream.sync closes the stream and further write operations throw exception
[ https://issues.apache.org/jira/browse/FLINK-36124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi updated FLINK-36124: -- Summary: S3RecoverableFsDataOutputStream.sync closes the stream and further write operations throw exception (was: S3RecoverableFsDataOutputStream.sync closes the stream and further write operation throw exception) > S3RecoverableFsDataOutputStream.sync closes the stream and further write > operations throw exception > --- > > Key: FLINK-36124 > URL: https://issues.apache.org/jira/browse/FLINK-36124 > Project: Flink > Issue Type: Bug > Components: Connectors / AWS >Affects Versions: 2.0.0 >Reporter: Gabor Somogyi >Assignee: Gabor Somogyi >Priority: Blocker > > This behaviour was introduced in FLINK-28513. > Rationale for why I think it's a bug: > * The `sync` method is defined in `FSDataOutputStream` with the following > contract: > {code:java} > /** > * Flushes the data all the way to the persistent non-volatile storage > (for example disks). The > * method behaves similar to the fsync function, forcing all data > to be persistent on the > * devices. > * > * @throws IOException Thrown if an I/O error occurs > */ > {code} > * After a `sync` method call, the user of the writer instance is expected to > be able to call further `write` methods > * What actually happens is that the next write blows up with the > following exception: > {code:java} > java.io.IOException: Stream closed. > at > org.apache.flink.core.fs.RefCountedFileWithStream.requireOpened(RefCountedFileWithStream.java:72) > at > org.apache.flink.core.fs.RefCountedFileWithStream.write(RefCountedFileWithStream.java:52) > at > org.apache.flink.core.fs.RefCountedBufferingFileStream.flush(RefCountedBufferingFileStream.java:104) > at > org.apache.flink.core.fs.RefCountedBufferingFileStream.write(RefCountedBufferingFileStream.java:87) > at > org.apache.flink.fs.s3.common.writer.S3RecoverableFsDataOutputStream.write(S3RecoverableFsDataOutputStream.java:112) > at java.base/java.io.OutputStream.write(OutputStream.java:122) > {code} > * This can be easily tested with > `S3RecoverableFsDataOutputStreamTest.testSync`. Please remove the `expected = > Exception.class` from the beginning of the test. > * The following line in the test verifies nothing because it is never > called: > https://github.com/apache/flink/blob/56c81995d3b34ed9066b6771755407b93438f5ab/flink-filesystems/flink-s3-fs-base/src/test/java/org/apache/flink/fs/s3/common/writer/S3RecoverableFsDataOutputStreamTest.java#L264 -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-36140) Log a warning when pods are terminated by kubernetes
[ https://issues.apache.org/jira/browse/FLINK-36140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi reassigned FLINK-36140: - Assignee: Clara Xiong > Log a warning when pods are terminated by kubernetes > > > Key: FLINK-36140 > URL: https://issues.apache.org/jira/browse/FLINK-36140 > Project: Flink > Issue Type: Improvement > Components: Deployment / Kubernetes >Affects Versions: 1.19.1 >Reporter: Clara Xiong >Assignee: Clara Xiong >Priority: Major > Labels: pull-request-available > > Scheduled maintenance or buggy nodes on Kubernetes can result in random pod > termination and eventually a series of job restarts due to a rolling restart > of the Kubernetes cluster nodes. The larger the job, the higher the chance it > is affected. The jobs should be able to auto-recover from these issues, but > these restarts can cause unwanted turbulence in a large-scale pipeline. > In this case, it is very difficult to identify what is causing the restarts > without knowing about the issue at the Kubernetes layer and the keyword to > search for, because it is logged at INFO level. > We need to log this at a higher level. If changing it from INFO to ERROR > breaks monitoring, we should at least log it as a warning. -- This message was sent by Atlassian Jira (v8.20.10#820010)
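The requested change is essentially a log-level bump. An illustrative SLF4J sketch; the class and message are placeholders, not the actual Flink Kubernetes resource manager code:
{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

// Illustrative only - placeholder class, not Flink's real driver code
public class PodTerminationLogging {

    private static final Logger LOG = LoggerFactory.getLogger(PodTerminationLogging.class);

    void onPodDeleted(String podName) {
        // WARN instead of INFO so externally triggered terminations stand out
        // when searching through logs
        LOG.warn("Pod {} was deleted by Kubernetes.", podName);
    }

    public static void main(String[] args) {
        new PodTerminationLogging().onPodDeleted("taskmanager-1-1");
    }
}
{code}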
[jira] [Resolved] (FLINK-36140) Log a warning when pods are terminated by kubernetes
[ https://issues.apache.org/jira/browse/FLINK-36140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi resolved FLINK-36140. --- Fix Version/s: 2.0.0 Resolution: Fixed 9bcd8f4 on master > Log a warning when pods are terminated by kubernetes > > > Key: FLINK-36140 > URL: https://issues.apache.org/jira/browse/FLINK-36140 > Project: Flink > Issue Type: Improvement > Components: Deployment / Kubernetes >Affects Versions: 1.19.1 >Reporter: Clara Xiong >Assignee: Clara Xiong >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 > > > Scheduled maintenance or buggy nodes on Kubernetes can result in random pod > termination and eventually a series of job restarts due to a rolling restart > of the Kubernetes cluster nodes. The larger the job, the higher the chance it > is affected. The jobs should be able to auto-recover from these issues, but > these restarts can cause unwanted turbulence in a large-scale pipeline. > In this case, it is very difficult to identify what is causing the restarts > without knowing about the issue at the Kubernetes layer and the keyword to > search for, because it is logged at INFO level. > We need to log this at a higher level. If changing it from INFO to ERROR > breaks monitoring, we should at least log it as a warning. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-36140) Log a warning when pods are terminated by kubernetes
[ https://issues.apache.org/jira/browse/FLINK-36140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi closed FLINK-36140. - > Log a warning when pods are terminated by kubernetes > > > Key: FLINK-36140 > URL: https://issues.apache.org/jira/browse/FLINK-36140 > Project: Flink > Issue Type: Improvement > Components: Deployment / Kubernetes >Affects Versions: 1.19.1 >Reporter: Clara Xiong >Assignee: Clara Xiong >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 > > > Scheduled maintenance or buggy nodes on Kubernetes can result in random pod > termination and eventually a series of job restarts due to a rolling restart > of the Kubernetes cluster nodes. The larger the job, the higher the chance it > is affected. The jobs should be able to auto-recover from these issues, but > these restarts can cause unwanted turbulence in a large-scale pipeline. > In this case, it is very difficult to identify what is causing the restarts > without knowing about the issue at the Kubernetes layer and the keyword to > search for, because it is logged at INFO level. > We need to log this at a higher level. If changing it from INFO to ERROR > breaks monitoring, we should at least log it as a warning. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-36173) Invalid link in checkpoint documentation
Gabor Somogyi created FLINK-36173: - Summary: Invalid link in checkpoint documentation Key: FLINK-36173 URL: https://issues.apache.org/jira/browse/FLINK-36173 Project: Flink Issue Type: Bug Reporter: Gabor Somogyi In some places we still have the "checkpointing-with-parts-of-the-graph-finished-beta" link instead of "checkpointing-with-parts-of-the-graph-finished". -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-36173) Invalid link in checkpoint documentation
[ https://issues.apache.org/jira/browse/FLINK-36173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi reassigned FLINK-36173: - Assignee: Gabor Somogyi > Invalid link in checkpoint documentation > > > Key: FLINK-36173 > URL: https://issues.apache.org/jira/browse/FLINK-36173 > Project: Flink > Issue Type: Bug >Reporter: Gabor Somogyi >Assignee: Gabor Somogyi >Priority: Minor > > In some places we still have the > "checkpointing-with-parts-of-the-graph-finished-beta" link instead of > "checkpointing-with-parts-of-the-graph-finished". -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (FLINK-36173) Invalid link in checkpoint documentation
[ https://issues.apache.org/jira/browse/FLINK-36173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi resolved FLINK-36173. --- Resolution: Fixed [{{26ce997}}|https://github.com/apache/flink/commit/26ce997052b5b3a4f3e06e4e489d285d3ae2b618] on master [{{55e4fca}}|https://github.com/apache/flink/commit/55e4fca0b4de14c51aa05c862e1961ea00d6b536] on release-1.20 > Invalid link in checkpoint documentation > > > Key: FLINK-36173 > URL: https://issues.apache.org/jira/browse/FLINK-36173 > Project: Flink > Issue Type: Bug >Reporter: Gabor Somogyi >Assignee: Gabor Somogyi >Priority: Minor > Labels: pull-request-available > > In some places we still have the > "checkpointing-with-parts-of-the-graph-finished-beta" link instead of > "checkpointing-with-parts-of-the-graph-finished". -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-36173) Invalid link in checkpoint documentation
[ https://issues.apache.org/jira/browse/FLINK-36173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi updated FLINK-36173: -- Fix Version/s: 2.0.0 1.20.1 > Invalid link in checkpoint documentation > > > Key: FLINK-36173 > URL: https://issues.apache.org/jira/browse/FLINK-36173 > Project: Flink > Issue Type: Bug >Reporter: Gabor Somogyi >Assignee: Gabor Somogyi >Priority: Minor > Labels: pull-request-available > Fix For: 2.0.0, 1.20.1 > > > In some places we still have the > "checkpointing-with-parts-of-the-graph-finished-beta" link instead of > "checkpointing-with-parts-of-the-graph-finished". -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-36001) Store operator name and UID in state metadata
[ https://issues.apache.org/jira/browse/FLINK-36001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi updated FLINK-36001: -- Fix Version/s: 2.0.0 > Store operator name and UID in state metadata > - > > Key: FLINK-36001 > URL: https://issues.apache.org/jira/browse/FLINK-36001 > Project: Flink > Issue Type: Improvement > Components: Runtime / State Backends >Affects Versions: 2.0.0 >Reporter: Gabor Somogyi >Assignee: Gabor Somogyi >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (FLINK-36001) Store operator name and UID in state metadata
[ https://issues.apache.org/jira/browse/FLINK-36001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi resolved FLINK-36001. --- Release Note: 02110ce on master Resolution: Fixed > Store operator name and UID in state metadata > - > > Key: FLINK-36001 > URL: https://issues.apache.org/jira/browse/FLINK-36001 > Project: Flink > Issue Type: Improvement > Components: Runtime / State Backends >Affects Versions: 2.0.0 >Reporter: Gabor Somogyi >Assignee: Gabor Somogyi >Priority: Major > Labels: pull-request-available > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-36001) Store operator name and UID in state metadata
[ https://issues.apache.org/jira/browse/FLINK-36001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi closed FLINK-36001. - > Store operator name and UID in state metadata > - > > Key: FLINK-36001 > URL: https://issues.apache.org/jira/browse/FLINK-36001 > Project: Flink > Issue Type: Improvement > Components: Runtime / State Backends >Affects Versions: 2.0.0 >Reporter: Gabor Somogyi >Assignee: Gabor Somogyi >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Reopened] (FLINK-36001) Store operator name and UID in state metadata
[ https://issues.apache.org/jira/browse/FLINK-36001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi reopened FLINK-36001: --- > Store operator name and UID in state metadata > - > > Key: FLINK-36001 > URL: https://issues.apache.org/jira/browse/FLINK-36001 > Project: Flink > Issue Type: Improvement > Components: Runtime / State Backends >Affects Versions: 2.0.0 >Reporter: Gabor Somogyi >Assignee: Gabor Somogyi >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-36001) Store operator name and UID in state metadata
[ https://issues.apache.org/jira/browse/FLINK-36001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi updated FLINK-36001: -- Release Note: (was: 02110ce on master) > Store operator name and UID in state metadata > - > > Key: FLINK-36001 > URL: https://issues.apache.org/jira/browse/FLINK-36001 > Project: Flink > Issue Type: Improvement > Components: Runtime / State Backends >Affects Versions: 2.0.0 >Reporter: Gabor Somogyi >Assignee: Gabor Somogyi >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-36001) Store operator name and UID in state metadata
[ https://issues.apache.org/jira/browse/FLINK-36001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi closed FLINK-36001. - Resolution: Fixed > Store operator name and UID in state metadata > - > > Key: FLINK-36001 > URL: https://issues.apache.org/jira/browse/FLINK-36001 > Project: Flink > Issue Type: Improvement > Components: Runtime / State Backends >Affects Versions: 2.0.0 >Reporter: Gabor Somogyi >Assignee: Gabor Somogyi >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-36001) Store operator name and UID in state metadata
[ https://issues.apache.org/jira/browse/FLINK-36001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17881700#comment-17881700 ] Gabor Somogyi commented on FLINK-36001: --- 02110ce on master > Store operator name and UID in state metadata > - > > Key: FLINK-36001 > URL: https://issues.apache.org/jira/browse/FLINK-36001 > Project: Flink > Issue Type: Improvement > Components: Runtime / State Backends >Affects Versions: 2.0.0 >Reporter: Gabor Somogyi >Assignee: Gabor Somogyi >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-36001) Store operator name and UID in state metadata
[ https://issues.apache.org/jira/browse/FLINK-36001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17881701#comment-17881701 ] Gabor Somogyi commented on FLINK-36001: --- 60beed9 on master > Store operator name and UID in state metadata > - > > Key: FLINK-36001 > URL: https://issues.apache.org/jira/browse/FLINK-36001 > Project: Flink > Issue Type: Improvement > Components: Runtime / State Backends >Affects Versions: 2.0.0 >Reporter: Gabor Somogyi >Assignee: Gabor Somogyi >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 > > -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-33268) Flink REST API response parsing throws exception on new fields
[ https://issues.apache.org/jira/browse/FLINK-33268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17843657#comment-17843657 ] Gabor Somogyi commented on FLINK-33268: --- > I assume that there can still be a problem, when a newer version client sends > requests to an old version server with additional fields in RequestBody that > server does not recognize. That's correct. This change does not solve the complete set of combinations, it just makes the client side more flexible. The main use-case we wanted to fix is the client usage in the operator code. In short, the operator uses a client, and when it received a new field like the slot sharing group information, which was added lately, it was blowing up. To overcome this we needed to copy some things from Flink code, which is ugly and in the mid-to-long term must be removed, for example: https://github.com/apache/flink-kubernetes-operator/blob/e73363f3486ed9e1df5cc05c9d0baec7c8c3a37f/flink-autoscaler/src/main/java/org/apache/flink/runtime/rest/messages/job/JobDetailsInfo.java#L295 > Flink REST API response parsing throws exception on new fields > -- > > Key: FLINK-33268 > URL: https://issues.apache.org/jira/browse/FLINK-33268 > Project: Flink > Issue Type: Improvement > Components: Runtime / REST >Affects Versions: 1.19.0 >Reporter: Gabor Somogyi >Assignee: Gabor Somogyi >Priority: Major > Labels: pull-request-available > Fix For: 1.19.0 > > > At the moment Flink does not ignore unknown fields when parsing REST > responses. An example of such a class is JobDetailsInfo, but this applies to > all others. It would be good to add this support to increase compatibility. > The real-life use-case is when the Flink k8s operator wants to handle 2 jobs > with 2 different Flink versions where the newer version has added a new field > to any REST response. In such a case the operator gets an exception when, for > example, it tries to poll the job details with the additional field. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-35302) Flink REST server throws exception on unknown fields in RequestBody
[ https://issues.apache.org/jira/browse/FLINK-35302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi closed FLINK-35302. - > Flink REST server throws exception on unknown fields in RequestBody > --- > > Key: FLINK-35302 > URL: https://issues.apache.org/jira/browse/FLINK-35302 > Project: Flink > Issue Type: Improvement > Components: Runtime / REST >Affects Versions: 1.19.0 >Reporter: Juntao Hu >Assignee: Juntao Hu >Priority: Major > Labels: pull-request-available > Fix For: 1.19.1 > > > As > [FLIP-401|https://cwiki.apache.org/confluence/display/FLINK/FLIP-401%3A+REST+API+JSON+response+deserialization+unknown+field+tolerance] > and FLINK-33268 mentioned, when an old-version REST client receives a response > from a new-version REST server with a strict JSON mapper, the client will > throw exceptions on newly added fields, which is not convenient for > situations where a centralized client deals with REST servers of different > versions (e.g. the k8s operator). > But this incompatibility can also happen at the server side, when a new-version > REST client sends requests to an old-version REST server with additional > fields. Making the server tolerant of unknown fields can save clients from > backward-compatibility code. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (FLINK-35302) Flink REST server throws exception on unknown fields in RequestBody
[ https://issues.apache.org/jira/browse/FLINK-35302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi resolved FLINK-35302. --- Resolution: Fixed 36b1d2a on master > Flink REST server throws exception on unknown fields in RequestBody > --- > > Key: FLINK-35302 > URL: https://issues.apache.org/jira/browse/FLINK-35302 > Project: Flink > Issue Type: Improvement > Components: Runtime / REST >Affects Versions: 1.19.0 >Reporter: Juntao Hu >Assignee: Juntao Hu >Priority: Major > Labels: pull-request-available > Fix For: 1.19.1 > > > As > [FLIP-401|https://cwiki.apache.org/confluence/display/FLINK/FLIP-401%3A+REST+API+JSON+response+deserialization+unknown+field+tolerance] > and FLINK-33268 mentioned, when an old-version REST client receives a response > from a new-version REST server with a strict JSON mapper, the client will > throw exceptions on newly added fields, which is not convenient for > situations where a centralized client deals with REST servers of different > versions (e.g. the k8s operator). > But this incompatibility can also happen at the server side, when a new-version > REST client sends requests to an old-version REST server with additional > fields. Making the server tolerant of unknown fields can save clients from > backward-compatibility code. -- This message was sent by Atlassian Jira (v8.20.10#820010)
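On the server side, the same tolerance can also be expressed per request type. A minimal sketch with plain Jackson; StopJobRequest and its fields are hypothetical, not Flink's actual RequestBody classes:
{code:java}
import com.fasterxml.jackson.annotation.JsonIgnoreProperties;
import com.fasterxml.jackson.databind.ObjectMapper;

public class LenientRequestBody {

    // Hypothetical request body; unknown fields from newer clients are skipped
    @JsonIgnoreProperties(ignoreUnknown = true)
    public static class StopJobRequest {
        public boolean drain;
    }

    public static void main(String[] args) throws Exception {
        // "savepointPath" is a field this (older) server version does not know
        String json = "{\"drain\":true,\"savepointPath\":\"s3://bucket/sp\"}";
        StopJobRequest request = new ObjectMapper().readValue(json, StopJobRequest.class);
        System.out.println(request.drain); // prints: true
    }
}
{code}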
[jira] [Commented] (FLINK-35371) Allow the keystore and truststore type to configured for SSL
[ https://issues.apache.org/jira/browse/FLINK-35371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17846935#comment-17846935 ] Gabor Somogyi commented on FLINK-35371: --- That makes sense. Started to have a look... > Allow the keystore and truststore type to configured for SSL > > > Key: FLINK-35371 > URL: https://issues.apache.org/jira/browse/FLINK-35371 > Project: Flink > Issue Type: Improvement > Components: Runtime / Network >Reporter: Ammar Master >Priority: Minor > Labels: SSL > > Flink always creates a keystore and truststore using the [default > type|https://github.com/apache/flink/blob/b87ead743dca161cdae8a1fef761954d206b81fb/flink-runtime/src/main/java/org/apache/flink/runtime/net/SSLUtils.java#L236] > defined in the JDK, which in most cases is JKS. > {code} > KeyStore trustStore = KeyStore.getInstance(KeyStore.getDefaultType()); > {code} > We should add other configuration options to set the type explicitly to > support other custom formats, and match the options already provided by other > applications such as > [Spark|https://spark.apache.org/docs/latest/security.html#:~:text=the%20key%20store.-,%24%7Bns%7D.keyStoreType,-JKS] > and > [Kafka|https://kafka.apache.org/documentation/#:~:text=per%2Dbroker-,ssl.keystore.type,-The%20file%20format]. > The default would continue to be specified by the JDK. > > The SSLContext for the REST API can read the configuration option directly, > and we need to add extra logic to the > [CustomSSLEngineProvider|https://github.com/apache/flink/blob/master/flink-rpc/flink-rpc-akka/src/main/java/org/apache/flink/runtime/rpc/pekko/CustomSSLEngineProvider.java] > for Pekko. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-35371) Allow the keystore and truststore type to configured for SSL
[ https://issues.apache.org/jira/browse/FLINK-35371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi reassigned FLINK-35371: - Assignee: Gabor Somogyi > Allow the keystore and truststore type to configured for SSL > > > Key: FLINK-35371 > URL: https://issues.apache.org/jira/browse/FLINK-35371 > Project: Flink > Issue Type: Improvement > Components: Runtime / Network >Reporter: Ammar Master >Assignee: Gabor Somogyi >Priority: Minor > Labels: SSL > > Flink always creates a keystore and truststore using the [default > type|https://github.com/apache/flink/blob/b87ead743dca161cdae8a1fef761954d206b81fb/flink-runtime/src/main/java/org/apache/flink/runtime/net/SSLUtils.java#L236] > defined in the JDK, which in most cases is JKS. > {code} > KeyStore trustStore = KeyStore.getInstance(KeyStore.getDefaultType()); > {code} > We should add other configuration options to set the type explicitly to > support other custom formats, and match the options already provided by other > applications such as > [Spark|https://spark.apache.org/docs/latest/security.html#:~:text=the%20key%20store.-,%24%7Bns%7D.keyStoreType,-JKS] > and > [Kafka|https://kafka.apache.org/documentation/#:~:text=per%2Dbroker-,ssl.keystore.type,-The%20file%20format]. > The default would continue to be specified by the JDK. > > The SSLContext for the REST API can read the configuration option directly, > and we need to add extra logic to the > [CustomSSLEngineProvider|https://github.com/apache/flink/blob/master/flink-rpc/flink-rpc-akka/src/main/java/org/apache/flink/runtime/rpc/pekko/CustomSSLEngineProvider.java] > for Pekko. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (FLINK-35371) Allow the keystore and truststore type to configured for SSL
[ https://issues.apache.org/jira/browse/FLINK-35371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi updated FLINK-35371: -- Affects Version/s: 1.19.0 > Allow the keystore and truststore type to configured for SSL > > > Key: FLINK-35371 > URL: https://issues.apache.org/jira/browse/FLINK-35371 > Project: Flink > Issue Type: Improvement > Components: Runtime / Network >Affects Versions: 1.19.0 >Reporter: Ammar Master >Assignee: Gabor Somogyi >Priority: Minor > Labels: SSL > > Flink always creates a keystore and truststore using the [default > type|https://github.com/apache/flink/blob/b87ead743dca161cdae8a1fef761954d206b81fb/flink-runtime/src/main/java/org/apache/flink/runtime/net/SSLUtils.java#L236] > defined in the JDK, which in most cases is JKS. > {code} > KeyStore trustStore = KeyStore.getInstance(KeyStore.getDefaultType()); > {code} > We should add other configuration options to set the type explicitly to > support other custom formats, and match the options already provided by other > applications such as > [Spark|https://spark.apache.org/docs/latest/security.html#:~:text=the%20key%20store.-,%24%7Bns%7D.keyStoreType,-JKS] > and > [Kafka|https://kafka.apache.org/documentation/#:~:text=per%2Dbroker-,ssl.keystore.type,-The%20file%20format]. > The default would continue to be specified by the JDK. > > The SSLContext for the REST API can read the configuration option directly, > and we need to add extra logic to the > [CustomSSLEngineProvider|https://github.com/apache/flink/blob/master/flink-rpc/flink-rpc-akka/src/main/java/org/apache/flink/runtime/rpc/pekko/CustomSSLEngineProvider.java] > for Pekko. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-35371) Allow the keystore and truststore type to configured for SSL
[ https://issues.apache.org/jira/browse/FLINK-35371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi reassigned FLINK-35371: - Assignee: Ammar Master (was: Gabor Somogyi) > Allow the keystore and truststore type to configured for SSL > > > Key: FLINK-35371 > URL: https://issues.apache.org/jira/browse/FLINK-35371 > Project: Flink > Issue Type: Improvement > Components: Runtime / Network >Affects Versions: 1.19.0 >Reporter: Ammar Master >Assignee: Ammar Master >Priority: Minor > Labels: SSL > > Flink always creates a keystore and truststore using the [default > type|https://github.com/apache/flink/blob/b87ead743dca161cdae8a1fef761954d206b81fb/flink-runtime/src/main/java/org/apache/flink/runtime/net/SSLUtils.java#L236] > defined in the JDK, which in most cases is JKS. > {code} > KeyStore trustStore = KeyStore.getInstance(KeyStore.getDefaultType()); > {code} > We should add other configuration options to set the type explicitly to > support other custom formats, and match the options already provided by other > applications such as > [Spark|https://spark.apache.org/docs/latest/security.html#:~:text=the%20key%20store.-,%24%7Bns%7D.keyStoreType,-JKS] > and > [Kafka|https://kafka.apache.org/documentation/#:~:text=per%2Dbroker-,ssl.keystore.type,-The%20file%20format]. > The default would continue to be specified by the JDK. > > The SSLContext for the REST API can read the configuration option directly, > and we need to add extra logic to the > [CustomSSLEngineProvider|https://github.com/apache/flink/blob/master/flink-rpc/flink-rpc-akka/src/main/java/org/apache/flink/runtime/rpc/pekko/CustomSSLEngineProvider.java] > for Pekko. -- This message was sent by Atlassian Jira (v8.20.10#820010)
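For illustration, on the JDK side the proposed option boils down to passing an explicit type instead of KeyStore.getDefaultType(); the type, path and password below are placeholders:
{code:java}
import java.io.FileInputStream;
import java.security.KeyStore;

public class ConfigurableKeystoreType {
    public static void main(String[] args) throws Exception {
        // What the proposed configuration option would feed in instead of
        // KeyStore.getDefaultType(); type, path and password are placeholders
        String trustStoreType = "PKCS12";
        KeyStore trustStore = KeyStore.getInstance(trustStoreType);
        try (FileInputStream in = new FileInputStream("/path/to/truststore.p12")) {
            trustStore.load(in, "changeit".toCharArray());
        }
        System.out.println("Loaded " + trustStore.size() + " entries");
    }
}
{code}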
[jira] [Commented] (FLINK-35192) Kubernetes operator oom
[ https://issues.apache.org/jira/browse/FLINK-35192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17848060#comment-17848060 ] Gabor Somogyi commented on FLINK-35192: --- 8b789ee on main > Kubernetes operator oom > --- > > Key: FLINK-35192 > URL: https://issues.apache.org/jira/browse/FLINK-35192 > Project: Flink > Issue Type: Bug > Components: Kubernetes Operator >Affects Versions: kubernetes-operator-1.6.1 > Environment: jdk: openjdk11 > operator version: 1.6.1 >Reporter: chenyuzhi >Priority: Major > Labels: pull-request-available > Fix For: kubernetes-operator-1.9.0 > > Attachments: image-2024-04-22-15-47-49-455.png, > image-2024-04-22-15-52-51-600.png, image-2024-04-22-15-58-23-269.png, > image-2024-04-22-15-58-42-850.png, image-2024-04-30-16-47-07-289.png, > image-2024-04-30-17-11-24-974.png, image-2024-04-30-20-38-25-195.png, > image-2024-04-30-20-39-05-109.png, image-2024-04-30-20-39-34-396.png, > image-2024-04-30-20-41-51-660.png, image-2024-04-30-20-43-20-125.png, > screenshot-1.png, screenshot-2.png, screenshot-3.png, screenshot-4.png > > > The Kubernetes operator Docker process was killed by the kernel due to out of > memory (at 2024.04.03 18:16) > !image-2024-04-22-15-47-49-455.png! > Metrics: > the pod memory (RSS) has been increasing slowly over the past 7 days: > !screenshot-1.png! > However, the JVM memory metrics of the operator show no obvious anomaly: > !image-2024-04-22-15-58-23-269.png! > !image-2024-04-22-15-58-42-850.png! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (FLINK-35192) Kubernetes operator oom
[ https://issues.apache.org/jira/browse/FLINK-35192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17848066#comment-17848066 ] Gabor Somogyi commented on FLINK-35192: --- [~stupid_pig] I've read through the whole conversation here. Do I understand correctly that this Jira can be resolved? > Kubernetes operator oom > --- > > Key: FLINK-35192 > URL: https://issues.apache.org/jira/browse/FLINK-35192 > Project: Flink > Issue Type: Bug > Components: Kubernetes Operator >Affects Versions: kubernetes-operator-1.6.1 > Environment: jdk: openjdk11 > operator version: 1.6.1 >Reporter: chenyuzhi >Priority: Major > Labels: pull-request-available > Fix For: kubernetes-operator-1.9.0 > > Attachments: image-2024-04-22-15-47-49-455.png, > image-2024-04-22-15-52-51-600.png, image-2024-04-22-15-58-23-269.png, > image-2024-04-22-15-58-42-850.png, image-2024-04-30-16-47-07-289.png, > image-2024-04-30-17-11-24-974.png, image-2024-04-30-20-38-25-195.png, > image-2024-04-30-20-39-05-109.png, image-2024-04-30-20-39-34-396.png, > image-2024-04-30-20-41-51-660.png, image-2024-04-30-20-43-20-125.png, > screenshot-1.png, screenshot-2.png, screenshot-3.png, screenshot-4.png > > > The Kubernetes operator Docker process was killed by the kernel due to out of > memory (at 2024.04.03 18:16) > !image-2024-04-22-15-47-49-455.png! > Metrics: > the pod memory (RSS) has been increasing slowly over the past 7 days: > !screenshot-1.png! > However, the JVM memory metrics of the operator show no obvious anomaly: > !image-2024-04-22-15-58-23-269.png! > !image-2024-04-22-15-58-42-850.png! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (FLINK-35192) Kubernetes operator OOM
[ https://issues.apache.org/jira/browse/FLINK-35192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi closed FLINK-35192. - > Kubernetes operator OOM > --- > > Key: FLINK-35192 > URL: https://issues.apache.org/jira/browse/FLINK-35192 > Project: Flink > Issue Type: Bug > Components: Kubernetes Operator >Affects Versions: kubernetes-operator-1.6.1 > Environment: jdk: openjdk11 > operator version: 1.6.1 >Reporter: chenyuzhi >Priority: Major > Labels: pull-request-available > Fix For: kubernetes-operator-1.9.0 > > Attachments: image-2024-04-22-15-47-49-455.png, image-2024-04-22-15-52-51-600.png, image-2024-04-22-15-58-23-269.png, image-2024-04-22-15-58-42-850.png, image-2024-04-30-16-47-07-289.png, image-2024-04-30-17-11-24-974.png, image-2024-04-30-20-38-25-195.png, image-2024-04-30-20-39-05-109.png, image-2024-04-30-20-39-34-396.png, image-2024-04-30-20-41-51-660.png, image-2024-04-30-20-43-20-125.png, screenshot-1.png, screenshot-2.png, screenshot-3.png, screenshot-4.png > > > The Kubernetes operator Docker process was killed by the kernel due to out of memory (at 2024-04-03 18:16): > !image-2024-04-22-15-47-49-455.png! > Metrics: > the pod memory (RSS) has been increasing slowly over the past 7 days: > !screenshot-1.png! > However, the JVM memory metrics of the operator show no obvious anomaly: > !image-2024-04-22-15-58-23-269.png! > !image-2024-04-22-15-58-42-850.png! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Resolved] (FLINK-35192) Kubernetes operator OOM
[ https://issues.apache.org/jira/browse/FLINK-35192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi resolved FLINK-35192. --- Resolution: Fixed > Kubernetes operator OOM > --- > > Key: FLINK-35192 > URL: https://issues.apache.org/jira/browse/FLINK-35192 > Project: Flink > Issue Type: Bug > Components: Kubernetes Operator >Affects Versions: kubernetes-operator-1.6.1 > Environment: jdk: openjdk11 > operator version: 1.6.1 >Reporter: chenyuzhi >Priority: Major > Labels: pull-request-available > Fix For: kubernetes-operator-1.9.0 > > Attachments: image-2024-04-22-15-47-49-455.png, image-2024-04-22-15-52-51-600.png, image-2024-04-22-15-58-23-269.png, image-2024-04-22-15-58-42-850.png, image-2024-04-30-16-47-07-289.png, image-2024-04-30-17-11-24-974.png, image-2024-04-30-20-38-25-195.png, image-2024-04-30-20-39-05-109.png, image-2024-04-30-20-39-34-396.png, image-2024-04-30-20-41-51-660.png, image-2024-04-30-20-43-20-125.png, screenshot-1.png, screenshot-2.png, screenshot-3.png, screenshot-4.png > > > The Kubernetes operator Docker process was killed by the kernel due to out of memory (at 2024-04-03 18:16): > !image-2024-04-22-15-47-49-455.png! > Metrics: > the pod memory (RSS) has been increasing slowly over the past 7 days: > !screenshot-1.png! > However, the JVM memory metrics of the operator show no obvious anomaly: > !image-2024-04-22-15-58-23-269.png! > !image-2024-04-22-15-58-42-850.png! -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (FLINK-35525) HDFS delegation token fetched by custom DelegationTokenProvider is not passed to Yarn AM
[ https://issues.apache.org/jira/browse/FLINK-35525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabor Somogyi reassigned FLINK-35525: - Assignee: Zhen Wang > HDFS delegation token fetched by custom DelegationTokenProvider is not passed to Yarn AM > - > > Key: FLINK-35525 > URL: https://issues.apache.org/jira/browse/FLINK-35525 > Project: Flink > Issue Type: Bug > Components: Deployment / YARN >Affects Versions: 1.19.0, 1.18.1 >Reporter: Zhen Wang >Assignee: Zhen Wang >Priority: Major > Labels: pull-request-available > > I tried running Flink with a Hadoop proxy user by disabling HadoopModuleFactory and the Flink built-in token providers, and implementing a custom token provider. However, only the HDFS token obtained by the hadoopfs provider was added in YarnClusterDescriptor, which resulted in a Yarn AM submission failure. > Discussion: https://github.com/apache/flink/pull/22009#issuecomment-2132676114 -- This message was sent by Atlassian Jira (v8.20.10#820010)
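For readers unfamiliar with the SPI involved, the sketch below shows roughly what such a custom provider looks like. It assumes the DelegationTokenProvider interface as shipped in Flink 1.18/1.19 (serviceName/init/delegationTokensRequired/obtainDelegationTokens, discovered via ServiceLoader); the class name, service name, and token-fetching logic are hypothetical placeholders, not the reporter's actual code.

{code}
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.util.Optional;

import org.apache.flink.configuration.Configuration;
import org.apache.flink.core.security.token.DelegationTokenProvider;
import org.apache.hadoop.security.Credentials;

// Registered via META-INF/services/org.apache.flink.core.security.token.DelegationTokenProvider
public class ProxyUserHdfsTokenProvider implements DelegationTokenProvider {

    @Override
    public String serviceName() {
        return "proxy-hdfs"; // hypothetical provider name
    }

    @Override
    public void init(Configuration configuration) {
        // Read proxy-user settings from the Flink configuration here.
    }

    @Override
    public boolean delegationTokensRequired() {
        return true;
    }

    @Override
    public ObtainedDelegationTokens obtainDelegationTokens() throws Exception {
        Credentials credentials = new Credentials();
        // ... fetch HDFS delegation tokens as the proxy user and add them
        // to `credentials`; elided because it is deployment specific ...
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        credentials.write(new DataOutputStream(baos));
        // No expiry advertised in this sketch.
        return new ObtainedDelegationTokens(baos.toByteArray(), Optional.empty());
    }
}
{code}

The bug tracked here is that tokens returned by such providers were not forwarded to the Yarn AM; only the token fetched by the built-in hadoopfs provider was.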
[jira] [Commented] (FLINK-35371) Allow the keystore and truststore type to be configured for SSL
[ https://issues.apache.org/jira/browse/FLINK-35371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17852689#comment-17852689 ] Gabor Somogyi commented on FLINK-35371: --- [~ammarm] any news on this? > Allow the keystore and truststore type to be configured for SSL > > > Key: FLINK-35371 > URL: https://issues.apache.org/jira/browse/FLINK-35371 > Project: Flink > Issue Type: Improvement > Components: Runtime / Network >Affects Versions: 1.19.0 >Reporter: Ammar Master >Assignee: Ammar Master >Priority: Minor > Labels: SSL > > Flink always creates a keystore and truststore using the [default type|https://github.com/apache/flink/blob/b87ead743dca161cdae8a1fef761954d206b81fb/flink-runtime/src/main/java/org/apache/flink/runtime/net/SSLUtils.java#L236] defined in the JDK, which in most cases is JKS. > {code} > KeyStore trustStore = KeyStore.getInstance(KeyStore.getDefaultType()); > {code} > We should add configuration options to set the type explicitly, both to support other custom formats and to match the options already provided by [Spark|https://spark.apache.org/docs/latest/security.html#:~:text=the%20key%20store.-,%24%7Bns%7D.keyStoreType,-JKS] and [Kafka|https://kafka.apache.org/documentation/#:~:text=per%2Dbroker-,ssl.keystore.type,-The%20file%20format]. The default would continue to be specified by the JDK. > > The SSLContext for the REST API can read the configuration option directly, and we need to add extra logic to the [CustomSSLEngineProvider|https://github.com/apache/flink/blob/master/flink-rpc/flink-rpc-akka/src/main/java/org/apache/flink/runtime/rpc/pekko/CustomSSLEngineProvider.java] for Pekko. -- This message was sent by Atlassian Jira (v8.20.10#820010)