[jira] [Commented] (FLINK-15571) Create a Redis Streams Connector for Flink

2024-10-07 Thread Vineeth Naroju (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-15571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887320#comment-17887320
 ] 

Vineeth Naroju commented on FLINK-15571:


Is this closed?

> Create a Redis Streams Connector for Flink
> --
>
> Key: FLINK-15571
> URL: https://issues.apache.org/jira/browse/FLINK-15571
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Redis Streams
>Reporter: Tugdual Grall
>Assignee: ZhuoYu Chen
>Priority: Minor
>  Labels: pull-request-available, stale-assigned
>
> Redis has a "log data structure" called Redis Streams; it would be nice to 
> integrate Redis Streams with Apache Flink as:
>  * Source
>  * Sink
> See Redis Streams introduction: [https://redis.io/topics/streams-intro]
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (FLINK-36404) PrometheusSinkWriteException thrown by the response callback may not cause job to fail

2024-10-07 Thread Hong Liang Teoh (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-36404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hong Liang Teoh resolved FLINK-36404.
-
Resolution: Fixed

 merged commit 
[{{67da7df}}|https://github.com/apache/flink-connector-prometheus/commit/67da7dfd715fd6c7cf9fda38d5a8c942e1c9fa38]
 into   apache:main

> PrometheusSinkWriteException thrown by the response callback may not cause 
> job to fail
> --
>
> Key: FLINK-36404
> URL: https://issues.apache.org/jira/browse/FLINK-36404
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Prometheus
>Affects Versions: prometheus-connector-1.0.0
>Reporter: Lorenzo Nicora
>Priority: Critical
> Fix For: prometheus-connector-1.0.0
>
>
> *Issue*
> {{PrometheusSinkWriteException}} thrown by {{HttpResponseCallback}} does not 
> cause the httpclient IOReactor to fail; it is actually swallowed, which 
> prevents the job from failing.
> Also, related: exceptions from the IOReactor eventually cause the response 
> callback {{failed}} to be called. Allowing the user to set 
> DISCARD_AND_CONTINUE on generic exceptions thrown by the client may hide 
> rethrown exceptions. Also, there is really no benefit in not failing on a 
> generic unhandled exception from the client.
> *Solution*
> 1. Intercept {{PrometheusSinkWriteException}} up the httpclient stack, adding 
> to the client an {{IOSessionListener}} that can rethrow those exceptions, 
> causing the reactor to actually fail, and consequently the operator as well.
> 2. Remove the ability to configure the error handling behaviour on generic 
> exceptions thrown by the httpclient. The job should always fail.
> 3. When the httpclient IOReactor fails, a long chain of exceptions is logged. 
> To keep the actual root cause evident, the response callback should log at 
> ERROR level when the exception happens.
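The swallowing behaviour described in point 1 can be illustrated with a simplified stand-alone model (plain Java; the class and interface names below are hypothetical stand-ins for illustration, not the actual httpclient or connector types): an exception thrown inside an async callback is caught by the reactor loop and only escalates if a session listener rethrows it.

```java
import java.util.ArrayList;
import java.util.List;

public class CallbackEscalation {
    /** Simplified stand-in for PrometheusSinkWriteException. */
    static class SinkWriteException extends RuntimeException {
        SinkWriteException(String msg) { super(msg); }
    }

    /** Simplified stand-in for an IOSessionListener's exception hook. */
    interface SessionListener { void exception(Exception ex); }

    /** Runs callbacks the way an I/O reactor loop would: exceptions are caught,
     *  passed to the listener, and only escalate if the listener rethrows. */
    static Exception runReactor(List<Runnable> callbacks, SessionListener listener) {
        for (Runnable cb : callbacks) {
            try {
                cb.run();
            } catch (Exception ex) {
                try {
                    listener.exception(ex);   // listener may rethrow to fail the reactor
                } catch (Exception fatal) {
                    return fatal;             // reactor fails, failing the operator
                }
            }
        }
        return null;                          // reactor survived: exception swallowed
    }

    public static void main(String[] args) {
        List<Runnable> callbacks = new ArrayList<>();
        callbacks.add(() -> { throw new SinkWriteException("write rejected"); });

        // Without escalation the exception is swallowed and the job keeps running.
        Exception swallowed = runReactor(callbacks, ex -> { /* log only */ });
        System.out.println(swallowed == null);  // true

        // With a rethrowing listener the reactor fails, failing the operator.
        Exception fatal = runReactor(callbacks, ex -> {
            if (ex instanceof SinkWriteException) throw (SinkWriteException) ex;
        });
        System.out.println(fatal instanceof SinkWriteException);  // true
    }
}
```

The fix in the ticket corresponds to the second listener: rethrow the sink exception instead of only logging it.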





[jira] [Assigned] (FLINK-36404) PrometheusSinkWriteException thrown by the response callback may not cause job to fail

2024-10-07 Thread Hong Liang Teoh (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-36404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hong Liang Teoh reassigned FLINK-36404:
---

Assignee: Lorenzo Nicora

> PrometheusSinkWriteException thrown by the response callback may not cause 
> job to fail
> --
>
> Key: FLINK-36404
> URL: https://issues.apache.org/jira/browse/FLINK-36404
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Prometheus
>Affects Versions: prometheus-connector-1.0.0
>Reporter: Lorenzo Nicora
>Assignee: Lorenzo Nicora
>Priority: Critical
> Fix For: prometheus-connector-1.0.0
>
>
> *Issue*
> {{PrometheusSinkWriteException}} thrown by {{HttpResponseCallback}} does not 
> cause the httpclient IOReactor to fail; it is actually swallowed, which 
> prevents the job from failing.
> Also, related: exceptions from the IOReactor eventually cause the response 
> callback {{failed}} to be called. Allowing the user to set 
> DISCARD_AND_CONTINUE on generic exceptions thrown by the client may hide 
> rethrown exceptions. Also, there is really no benefit in not failing on a 
> generic unhandled exception from the client.
> *Solution*
> 1. Intercept {{PrometheusSinkWriteException}} up the httpclient stack, adding 
> to the client an {{IOSessionListener}} that can rethrow those exceptions, 
> causing the reactor to actually fail, and consequently the operator as well.
> 2. Remove the ability to configure the error handling behaviour on generic 
> exceptions thrown by the httpclient. The job should always fail.
> 3. When the httpclient IOReactor fails, a long chain of exceptions is logged. 
> To keep the actual root cause evident, the response callback should log at 
> ERROR level when the exception happens.





[jira] [Resolved] (FLINK-36427) Additional Integration Test cases for KDS source

2024-10-07 Thread Hong Liang Teoh (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-36427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hong Liang Teoh resolved FLINK-36427.
-
Resolution: Fixed

 merged commit 
[{{91381cd}}|https://github.com/apache/flink-connector-aws/commit/91381cd304c8a98d6dbbdea26ab37e2218c58ca0]
 into   apache:main

> Additional Integration Test cases for KDS source
> 
>
> Key: FLINK-36427
> URL: https://issues.apache.org/jira/browse/FLINK-36427
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Kinesis
>Reporter: Elphas Toringepi
>Assignee: Elphas Toringepi
>Priority: Major
>
> Add Integration Test cases for multiple shard and resharding
>  * Reading from multiple shard KDS stream
>  * Reading from resharded KDS stream
>  * Mirror StopWithSavepoint test cases from old KDS source





[jira] [Commented] (FLINK-36421) Missing fsync in FsCheckpointStreamFactory

2024-10-07 Thread Marc Aurel Fritz (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-36421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887306#comment-17887306
 ] 

Marc Aurel Fritz commented on FLINK-36421:
--

[~gaborgsomogyi] Are you using the hashmap or the rocksdb state backend? This 
should only affect the hashmap state backend.

Currently we don't have an example to easily reproduce it locally. It's 
happening on our edge devices on a manufacturing shop floor where some devices 
can often experience power loss, e.g. due to a machine operator cutting power 
to the machine. The bug seems to be quite hard to catch as it's a timing issue: 
the power loss has to happen after the checkpoint and before the OS has flushed 
the data to disk asynchronously. With a multitude of devices that do a (rather 
small) checkpoint every minute and have unstable power, the bug occurs 
regularly, though.
It hasn't happened again since introducing the "sync()" call, as the checkpoint 
is now only completed when its files are actually persisted on disk, as 
confirmed by "strace".

Verifying the behavior works great with "strace". Sadly I don't know of a good 
way to easily reproduce this without cutting power to the system. The simplest 
way I can think of would be to start an example Flink job with checkpointing in 
a VM and kill the whole VM directly after a large enough checkpoint has 
happened. This should trigger the bug, but isn't too practical.
Do you know of a better way?
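For reference, the fix under discussion boils down to forcing the file descriptor to disk before closing the stream. A minimal stand-alone sketch in plain Java (an illustration of the technique only, not Flink's actual FsCheckpointStreamFactory code):

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class DurableWrite {
    /** Writes bytes and forces them to the storage device before closing. */
    static void writeDurably(Path file, byte[] data) throws IOException {
        try (FileOutputStream out = new FileOutputStream(file.toFile())) {
            out.write(data);
            // Force the OS page cache to the device; close() alone gives no
            // durability guarantee, so a power loss could leave a 0-byte file.
            out.getFD().sync();
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("chk-", ".state");
        writeDurably(tmp, "state-bytes".getBytes(StandardCharsets.UTF_8));
        System.out.println(Files.size(tmp)); // 11
        Files.delete(tmp);
    }
}
```

With the sync in place, a power cut after the write can no longer leave the checkpoint file empty while the metadata already references it.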

> Missing fsync in FsCheckpointStreamFactory
> --
>
> Key: FLINK-36421
> URL: https://issues.apache.org/jira/browse/FLINK-36421
> Project: Flink
>  Issue Type: Bug
>  Components: FileSystems, Runtime / Checkpointing
>Affects Versions: 1.17.0, 1.18.0, 1.19.0, 1.20.0
>Reporter: Marc Aurel Fritz
>Priority: Critical
> Attachments: fsync-fs-stream-factory.diff
>
>
> With Flink 1.20 we observed another checkpoint corruption bug. This is 
> similar to FLINK-35217, but affects only files written by the taskmanager 
> (the ones with random names as described 
> [here|https://github.com/Planet-X/flink/blob/0d6e25a9738d9d4ee94de94e1437f92611b50758/flink-runtime/src/main/java/org/apache/flink/runtime/state/filesystem/FsCheckpointStreamFactory.java#L47]).
> After system crash the files written by the taskmanager may be corrupted 
> (file size of 0 bytes) if the changes in the file-system cache haven't been 
> written to disk. The "_metadata" file written by the jobmanager is always 
> fine because it's properly fsynced.
> Investigation revealed that "fsync" is missing, this time in 
> "FsCheckpointStreamFactory". In this case the "OutputStream" is closed 
> without calling "fsync", thus the file is not durably persisted on disk 
> before the checkpoint is completed. (As previously established in 
> FLINK-35217, calling "fsync" is necessary as simply closing the stream does 
> not have any guarantees on persistence.)
> "strace" on the taskmanager's process confirms this behavior:
>  # The checkpoint chk-1217's directory is created at "mkdir"
>  # The checkpoint chk-1217's non-inline state is written by the taskmanager 
> at "openat", filename is "0507881e-8877-40b0-82d6-3d7dead64ccc". Note that 
> there's no "fsync" before "close".
>  # The checkpoint chk-1217 is finished, its "_metadata" is written and synced 
> properly
>  # The old checkpoint chk-1216 is deleted at "unlink"
> The new checkpoint chk-1217 now references a not-synced file that can get 
> corrupted on e.g. power loss. This means there is no working checkpoint left 
> as the old checkpoint was deleted.
> For durable persistence an "fsync" call is missing before "close" in step 2.
> Full "strace" log:
> {code:java}
> [pid 947250] 08:22:58 
> stat("/opt/flink/statestore/d0346e6c26416765a6038fa482b29502/chk-1217", 
> 0x7f68414c5b50) = -1 ENOENT (No such file or directory)
> [pid 947250] 08:22:58 
> stat("/opt/flink/statestore/d0346e6c26416765a6038fa482b29502/chk-1217", 
> 0x7f68414c5b50) = -1 ENOENT (No such file or directory)
> [pid 947250] 08:22:58 
> stat("/opt/flink/statestore/d0346e6c26416765a6038fa482b29502", 
> {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
> [pid 947250] 08:22:58 
> mkdir("/opt/flink/statestore/d0346e6c26416765a6038fa482b29502/chk-1217", 
> 0777) = 0
> [pid 1303248] 08:22:59 
> stat("/opt/flink/statestore/d0346e6c26416765a6038fa482b29502/chk-1217/0507881e-8877-40b0-82d6-3d7dead64ccc",
>  0x7f56f08d5610) = -1 ENOENT (No such file or directory)
> [pid 1303248] 08:22:59 
> stat("/opt/flink/statestore/d0346e6c26416765a6038fa482b29502/chk-1217", 
> {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
> [pid 1303248] 08:22:59 openat(AT_FDCWD, 
> "/opt/flink/statestore/d0346e6c26416765a6038fa482b29502/chk-1217/0507881e-8877-40b0-82d6-3d7dead64ccc",
>  O_WRONLY|O_CREAT|O_TRUNC, 0666) = 199
> [pid 1303248] 08:22:59

[jira] [Updated] (FLINK-36404) PrometheusSinkWriteException thrown by the response callback may not cause job to fail

2024-10-07 Thread Hong Liang Teoh (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-36404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hong Liang Teoh updated FLINK-36404:

Fix Version/s: prometheus-connector-1.0.0

> PrometheusSinkWriteException thrown by the response callback may not cause 
> job to fail
> --
>
> Key: FLINK-36404
> URL: https://issues.apache.org/jira/browse/FLINK-36404
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Prometheus
>Affects Versions: prometheus-connector-1.0.0
>Reporter: Lorenzo Nicora
>Priority: Critical
> Fix For: prometheus-connector-1.0.0
>
>
> *Issue*
> {{PrometheusSinkWriteException}} thrown by {{HttpResponseCallback}} does not 
> cause the httpclient IOReactor to fail; it is actually swallowed, which 
> prevents the job from failing.
> Also, related: exceptions from the IOReactor eventually cause the response 
> callback {{failed}} to be called. Allowing the user to set 
> DISCARD_AND_CONTINUE on generic exceptions thrown by the client may hide 
> rethrown exceptions. Also, there is really no benefit in not failing on a 
> generic unhandled exception from the client.
> *Solution*
> 1. Intercept {{PrometheusSinkWriteException}} up the httpclient stack, adding 
> to the client an {{IOSessionListener}} that can rethrow those exceptions, 
> causing the reactor to actually fail, and consequently the operator as well.
> 2. Remove the ability to configure the error handling behaviour on generic 
> exceptions thrown by the httpclient. The job should always fail.
> 3. When the httpclient IOReactor fails, a long chain of exceptions is logged. 
> To keep the actual root cause evident, the response callback should log at 
> ERROR level when the exception happens.





[jira] [Updated] (FLINK-36404) PrometheusSinkWriteException thrown by the response callback may not cause job to fail

2024-10-07 Thread Hong Liang Teoh (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-36404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hong Liang Teoh updated FLINK-36404:

Affects Version/s: prometheus-connector-1.0.0

> PrometheusSinkWriteException thrown by the response callback may not cause 
> job to fail
> --
>
> Key: FLINK-36404
> URL: https://issues.apache.org/jira/browse/FLINK-36404
> Project: Flink
>  Issue Type: Sub-task
>  Components: Connectors / Prometheus
>Affects Versions: prometheus-connector-1.0.0
>Reporter: Lorenzo Nicora
>Priority: Critical
>
> *Issue*
> {{PrometheusSinkWriteException}} thrown by {{HttpResponseCallback}} does not 
> cause the httpclient IOReactor to fail; it is actually swallowed, which 
> prevents the job from failing.
> Also, related: exceptions from the IOReactor eventually cause the response 
> callback {{failed}} to be called. Allowing the user to set 
> DISCARD_AND_CONTINUE on generic exceptions thrown by the client may hide 
> rethrown exceptions. Also, there is really no benefit in not failing on a 
> generic unhandled exception from the client.
> *Solution*
> 1. Intercept {{PrometheusSinkWriteException}} up the httpclient stack, adding 
> to the client an {{IOSessionListener}} that can rethrow those exceptions, 
> causing the reactor to actually fail, and consequently the operator as well.
> 2. Remove the ability to configure the error handling behaviour on generic 
> exceptions thrown by the httpclient. The job should always fail.
> 3. When the httpclient IOReactor fails, a long chain of exceptions is logged. 
> To keep the actual root cause evident, the response callback should log at 
> ERROR level when the exception happens.





[jira] [Commented] (FLINK-35777) Observe session jobs during cleanup

2024-10-07 Thread Gyula Fora (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887344#comment-17887344
 ] 

Gyula Fora commented on FLINK-35777:


yes [~mateczagany] , thanks!

> Observe session jobs during cleanup
> ---
>
> Key: FLINK-35777
> URL: https://issues.apache.org/jira/browse/FLINK-35777
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: Gyula Fora
>Priority: Major
> Fix For: kubernetes-operator-1.10.0
>
>
> In the same way we do for FlinkDeployments, session jobs should also be 
> observed in the cleanup logic so that the reconciler works correctly





[jira] [Closed] (FLINK-35777) Observe session jobs during cleanup

2024-10-07 Thread Gyula Fora (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora closed FLINK-35777.
--
Resolution: Fixed

fixed in https://issues.apache.org/jira/browse/FLINK-35414

> Observe session jobs during cleanup
> ---
>
> Key: FLINK-35777
> URL: https://issues.apache.org/jira/browse/FLINK-35777
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: Gyula Fora
>Priority: Major
> Fix For: kubernetes-operator-1.10.0
>
>
> In the same way we do for FlinkDeployments, session jobs should also be 
> observed in the cleanup logic so that the reconciler works correctly





[jira] [Assigned] (FLINK-35777) Observe session jobs during cleanup

2024-10-07 Thread Gyula Fora (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora reassigned FLINK-35777:
--

Assignee: Gyula Fora

> Observe session jobs during cleanup
> ---
>
> Key: FLINK-35777
> URL: https://issues.apache.org/jira/browse/FLINK-35777
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: Gyula Fora
>Assignee: Gyula Fora
>Priority: Major
> Fix For: kubernetes-operator-1.10.0
>
>
> In the same way we do for FlinkDeployments, session jobs should also be 
> observed in the cleanup logic so that the reconciler works correctly





[jira] [Commented] (FLINK-36421) Missing fsync in FsCheckpointStreamFactory

2024-10-07 Thread Gabor Somogyi (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-36421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887315#comment-17887315
 ] 

Gabor Somogyi commented on FLINK-36421:
---

As we're dealing with super huge states which don't fit into memory, we're not 
affected. I have no better idea for testing it in a relatively reliable way. I 
think the best we can do is to rely on your unstable power supply environment 
and test it there. Please run the jobs for several weeks and double-check 
whether it's stable enough with the mentioned change. We can evaluate the 
situation once we see stable operation.

> Missing fsync in FsCheckpointStreamFactory
> --
>
> Key: FLINK-36421
> URL: https://issues.apache.org/jira/browse/FLINK-36421
> Project: Flink
>  Issue Type: Bug
>  Components: FileSystems, Runtime / Checkpointing
>Affects Versions: 1.17.0, 1.18.0, 1.19.0, 1.20.0
>Reporter: Marc Aurel Fritz
>Priority: Critical
> Attachments: fsync-fs-stream-factory.diff
>
>
> With Flink 1.20 we observed another checkpoint corruption bug. This is 
> similar to FLINK-35217, but affects only files written by the taskmanager 
> (the ones with random names as described 
> [here|https://github.com/Planet-X/flink/blob/0d6e25a9738d9d4ee94de94e1437f92611b50758/flink-runtime/src/main/java/org/apache/flink/runtime/state/filesystem/FsCheckpointStreamFactory.java#L47]).
> After system crash the files written by the taskmanager may be corrupted 
> (file size of 0 bytes) if the changes in the file-system cache haven't been 
> written to disk. The "_metadata" file written by the jobmanager is always 
> fine because it's properly fsynced.
> Investigation revealed that "fsync" is missing, this time in 
> "FsCheckpointStreamFactory". In this case the "OutputStream" is closed 
> without calling "fsync", thus the file is not durably persisted on disk 
> before the checkpoint is completed. (As previously established in 
> FLINK-35217, calling "fsync" is necessary as simply closing the stream does 
> not have any guarantees on persistence.)
> "strace" on the taskmanager's process confirms this behavior:
>  # The checkpoint chk-1217's directory is created at "mkdir"
>  # The checkpoint chk-1217's non-inline state is written by the taskmanager 
> at "openat", filename is "0507881e-8877-40b0-82d6-3d7dead64ccc". Note that 
> there's no "fsync" before "close".
>  # The checkpoint chk-1217 is finished, its "_metadata" is written and synced 
> properly
>  # The old checkpoint chk-1216 is deleted at "unlink"
> The new checkpoint chk-1217 now references a not-synced file that can get 
> corrupted on e.g. power loss. This means there is no working checkpoint left 
> as the old checkpoint was deleted.
> For durable persistence an "fsync" call is missing before "close" in step 2.
> Full "strace" log:
> {code:java}
> [pid 947250] 08:22:58 
> stat("/opt/flink/statestore/d0346e6c26416765a6038fa482b29502/chk-1217", 
> 0x7f68414c5b50) = -1 ENOENT (No such file or directory)
> [pid 947250] 08:22:58 
> stat("/opt/flink/statestore/d0346e6c26416765a6038fa482b29502/chk-1217", 
> 0x7f68414c5b50) = -1 ENOENT (No such file or directory)
> [pid 947250] 08:22:58 
> stat("/opt/flink/statestore/d0346e6c26416765a6038fa482b29502", 
> {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
> [pid 947250] 08:22:58 
> mkdir("/opt/flink/statestore/d0346e6c26416765a6038fa482b29502/chk-1217", 
> 0777) = 0
> [pid 1303248] 08:22:59 
> stat("/opt/flink/statestore/d0346e6c26416765a6038fa482b29502/chk-1217/0507881e-8877-40b0-82d6-3d7dead64ccc",
>  0x7f56f08d5610) = -1 ENOENT (No such file or directory)
> [pid 1303248] 08:22:59 
> stat("/opt/flink/statestore/d0346e6c26416765a6038fa482b29502/chk-1217", 
> {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
> [pid 1303248] 08:22:59 openat(AT_FDCWD, 
> "/opt/flink/statestore/d0346e6c26416765a6038fa482b29502/chk-1217/0507881e-8877-40b0-82d6-3d7dead64ccc",
>  O_WRONLY|O_CREAT|O_TRUNC, 0666) = 199
> [pid 1303248] 08:22:59 fstat(199, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
> [pid 1303248] 08:22:59 close(199)   = 0
> [pid 947310] 08:22:59 
> stat("/opt/flink/statestore/d0346e6c26416765a6038fa482b29502/chk-1217/_metadata",
>  0x7f683fb378b0) = -1 ENOENT (No such file or directory)
> [pid 947310] 08:22:59 
> stat("/opt/flink/statestore/d0346e6c26416765a6038fa482b29502/chk-1217/_metadata",
>  0x7f683fb37730) = -1 ENOENT (No such file or directory)
> [pid 947310] 08:22:59 
> stat("/opt/flink/statestore/d0346e6c26416765a6038fa482b29502/chk-1217/._metadata.inprogress.e87ac62b-5f75-472b-ad21-9579a872b0d0",
>  0x7f683fb37730) = -1 ENOENT (No such file or directory)
> [pid 947310] 08:22:59 
> stat("/opt/flink/statestore/d0346e6c26416765a6038fa482b29502/chk-1217", 
> {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
> [pid 947310] 08:22:59 
> stat("/opt/flink/state

[jira] [Closed] (FLINK-36392) Varying the Java version in the Operator CI has no effect

2024-10-07 Thread Gyula Fora (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-36392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora closed FLINK-36392.
--
Fix Version/s: kubernetes-operator-1.10.0
 Assignee: Sam Barker
   Resolution: Fixed

merged to main 70eb2b6d4acf053411e6330caacf9e62d387bedc

> Varying the Java version in the Operator CI has no effect
> -
>
> Key: FLINK-36392
> URL: https://issues.apache.org/jira/browse/FLINK-36392
> Project: Flink
>  Issue Type: Bug
>  Components: Kubernetes Operator
>Reporter: Sam Barker
>Assignee: Sam Barker
>Priority: Major
> Fix For: kubernetes-operator-1.10.0
>
>
> [FLINK-33471|https://issues.apache.org/jira/browse/FLINK-33471] & 
> [FLINK-33359|https://issues.apache.org/jira/browse/FLINK-33359] both modified 
> the Kubernetes Operator end-to-end test suite to support a variety of Java 
> versions in GitHub CI. However, as far as I can tell, this well-intentioned 
> move hasn't actually had the desired effect, as the JVM that actually runs 
> the Maven build of the operator is executed within the context of a Docker 
> container based on 
> [temurin-11|https://github.com/apache/flink-kubernetes-operator/blob/d946f3f9f3a7f12098cd82db2545de7c89e220ff/Dockerfile#L19].
>  
> Therefore all the tests are actually executed by a Java operator built and 
> running under JDK/JRE 11. 
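One way to make the CI's Java matrix take effect would be to parameterize the image's base. A hypothetical sketch (the build-arg name and image tag are assumptions for illustration, not the actual Dockerfile):

```dockerfile
# Hypothetical: accept the Java version from the CI matrix instead of
# hard-coding a temurin-11 base image.
ARG JAVA_VERSION=11
FROM eclipse-temurin:${JAVA_VERSION}-jdk
```

CI would then pass `--build-arg JAVA_VERSION=17` (or similar) per matrix entry so the operator under test is actually built and run on the intended JDK.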





[jira] [Commented] (FLINK-35777) Observe session jobs during cleanup

2024-10-07 Thread Mate Czagany (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887324#comment-17887324
 ] 

Mate Czagany commented on FLINK-35777:
--

I think this has been implemented in 
https://issues.apache.org/jira/browse/FLINK-35414

> Observe session jobs during cleanup
> ---
>
> Key: FLINK-35777
> URL: https://issues.apache.org/jira/browse/FLINK-35777
> Project: Flink
>  Issue Type: Improvement
>  Components: Kubernetes Operator
>Reporter: Gyula Fora
>Priority: Major
> Fix For: kubernetes-operator-1.10.0
>
>
> In the same way we do for FlinkDeployments, session jobs should also be 
> observed in the cleanup logic so that the reconciler works correctly





[jira] [Assigned] (FLINK-36427) Additional Integration Test cases for KDS source

2024-10-07 Thread Hong Liang Teoh (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-36427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hong Liang Teoh reassigned FLINK-36427:
---

Assignee: Elphas Toringepi

> Additional Integration Test cases for KDS source
> 
>
> Key: FLINK-36427
> URL: https://issues.apache.org/jira/browse/FLINK-36427
> Project: Flink
>  Issue Type: Improvement
>  Components: Connectors / Kinesis
>Reporter: Elphas Toringepi
>Assignee: Elphas Toringepi
>Priority: Major
>
> Add Integration Test cases for multiple shard and resharding
>  * Reading from multiple shard KDS stream
>  * Reading from resharded KDS stream
>  * Mirror StopWithSavepoint test cases from old KDS source





[jira] [Created] (FLINK-36432) FLIP-438: Amazon SQS Source Connector

2024-10-07 Thread Saurabh (Jira)
Saurabh created FLINK-36432:
---

 Summary: FLIP-438: Amazon SQS Source Connector
 Key: FLINK-36432
 URL: https://issues.apache.org/jira/browse/FLINK-36432
 Project: Flink
  Issue Type: New Feature
  Components: Connectors / AWS
Reporter: Saurabh


This is an umbrella task for FLIP-477. 
FLIP-477: 
[https://cwiki.apache.org/confluence/display/FLINK/FLIP-477+Amazon+SQS+Source+Connector|https://cwiki.apache.org/confluence/display/FLINK/FLIP-438%3A+Amazon+SQS+Sink+Connector]





[jira] (FLINK-35751) Migrate SplitPythonConditionFromCorrelateRule

2024-10-07 Thread Jacky Lau (Jira)


[ https://issues.apache.org/jira/browse/FLINK-35751 ]


Jacky Lau deleted comment on FLINK-35751:
---

was (Author: jackylau):
pr is here [https://github.com/apache/flink/pull/25014,] will you help review 
this [~snuyanzin] ?

> Migrate SplitPythonConditionFromCorrelateRule
> -
>
> Key: FLINK-35751
> URL: https://issues.apache.org/jira/browse/FLINK-35751
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Planner
>Affects Versions: 2.0.0
>Reporter: Jacky Lau
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.0.0
>
>






[jira] [Commented] (FLINK-35751) Migrate SplitPythonConditionFromCorrelateRule

2024-10-07 Thread Jacky Lau (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-35751?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887464#comment-17887464
 ] 

Jacky Lau commented on FLINK-35751:
---

Hi [~snuyanzin], will you help review this?

> Migrate SplitPythonConditionFromCorrelateRule
> -
>
> Key: FLINK-35751
> URL: https://issues.apache.org/jira/browse/FLINK-35751
> Project: Flink
>  Issue Type: Sub-task
>  Components: Table SQL / Planner
>Affects Versions: 2.0.0
>Reporter: Jacky Lau
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.0.0
>
>






[jira] [Assigned] (FLINK-32766) Support kafka catalog

2024-10-07 Thread Leonard Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32766?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leonard Xu reassigned FLINK-32766:
--

Assignee: Hongshun Wang

> Support kafka catalog
> -
>
> Key: FLINK-32766
> URL: https://issues.apache.org/jira/browse/FLINK-32766
> Project: Flink
>  Issue Type: New Feature
>Reporter: melin
>Assignee: Hongshun Wang
>Priority: Major
>
> Support kafka catalog
>  





[jira] [Assigned] (FLINK-35393) Support flink kafka catalog

2024-10-07 Thread Leonard Xu (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leonard Xu reassigned FLINK-35393:
--

Assignee: Hrushikesh Tilak

> Support flink kafka catalog
> ---
>
> Key: FLINK-35393
> URL: https://issues.apache.org/jira/browse/FLINK-35393
> Project: Flink
>  Issue Type: New Feature
>  Components: Connectors / Kafka
>Reporter: melin
>Assignee: Hrushikesh Tilak
>Priority: Major
>
> Support for flink kafka catalog





[jira] [Closed] (FLINK-36332) Allow the Operator http client to be customised

2024-10-07 Thread Gyula Fora (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-36332?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gyula Fora closed FLINK-36332.
--
Fix Version/s: kubernetes-operator-1.10.0
 Assignee: Sam Barker
   Resolution: Fixed

merged to main 4f87bc2210387061c3c0391e5b7c8bbecb9c7fcd

> Allow the Operator http client to be customised
> ---
>
> Key: FLINK-36332
> URL: https://issues.apache.org/jira/browse/FLINK-36332
> Project: Flink
>  Issue Type: Improvement
>Reporter: Sam Barker
>Assignee: Sam Barker
>Priority: Minor
>  Labels: pull-request-available
> Fix For: kubernetes-operator-1.10.0
>
>
> We are looking to produce a build of the Flink Kubernetes operator; however, 
> for internal policy reasons we need to exclude the Kotlin dependencies. 
> Kotlin is a transitive dependency of OkHttp, and now that 
> [FLINK-36031|https://issues.apache.org/jira/browse/FLINK-36031] has been 
> merged, OkHttp is entirely optional (but a sensible default). The Fabric8 
> project explicitly supports supplying alternative http clients (see 
> [what-artifacts-should-my-project-depend-on|https://github.com/fabric8io/kubernetes-client/blob/main/doc/FAQ.md#what-artifacts-should-my-project-depend-on])
>  and the common pattern, as demonstrated by the 
> [java-operator-sdk|https://github.com/operator-framework/java-operator-sdk/blob/24494cb6342a5c75dff9a6962156ff488ad0c818/pom.xml#L44],
>  is to define a property with the name of the client implementation. 
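Following that pattern, swapping the client becomes a dependency choice at build time. A hypothetical Maven fragment (artifact names follow the Fabric8 convention described in the linked FAQ; the property name is an assumption, so verify against the actual POM):

```xml
<!-- Hypothetical: select the Fabric8 HTTP client implementation via a
     property, defaulting to the JDK client to avoid OkHttp's Kotlin
     dependency. -->
<properties>
  <fabric8.httpclient.impl>jdk</fabric8.httpclient.impl>
</properties>

<dependencies>
  <dependency>
    <groupId>io.fabric8</groupId>
    <artifactId>kubernetes-httpclient-${fabric8.httpclient.impl}</artifactId>
    <version>${fabric8.version}</version>
  </dependency>
</dependencies>
```

Builds that are fine with OkHttp would simply override the property, keeping a single POM for both policies.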





[jira] [Commented] (FLINK-36421) Missing fsync in FsCheckpointStreamFactory

2024-10-07 Thread Zakelly Lan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-36421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887479#comment-17887479
 ] 

Zakelly Lan commented on FLINK-36421:
-

[~gaborgsomogyi] Thanks for the heads up~

[~planet9] I think this also affects the RocksDB state backend, since it also 
uses the {{FsCheckpointStreamFactory}}. A {{sync()}} before {{close()}} would be 
fine. Would you please fix this?

> Missing fsync in FsCheckpointStreamFactory
> --
>
> Key: FLINK-36421
> URL: https://issues.apache.org/jira/browse/FLINK-36421
> Project: Flink
>  Issue Type: Bug
>  Components: FileSystems, Runtime / Checkpointing
>Affects Versions: 1.17.0, 1.18.0, 1.19.0, 1.20.0
>Reporter: Marc Aurel Fritz
>Priority: Critical
> Attachments: fsync-fs-stream-factory.diff
>
>
> With Flink 1.20 we observed another checkpoint corruption bug. This is 
> similar to FLINK-35217, but affects only files written by the taskmanager 
> (the ones with random names as described 
> [here|https://github.com/Planet-X/flink/blob/0d6e25a9738d9d4ee94de94e1437f92611b50758/flink-runtime/src/main/java/org/apache/flink/runtime/state/filesystem/FsCheckpointStreamFactory.java#L47]).
> After a system crash, the files written by the taskmanager may be corrupted 
> (a file size of 0 bytes) if the changes in the file-system cache haven't been 
> written to disk. The "_metadata" file written by the jobmanager is always 
> fine because it is properly fsynced.
> Investigation revealed that "fsync" is missing, this time in 
> "FsCheckpointStreamFactory". In this case the "OutputStream" is closed 
> without calling "fsync", thus the file is not durably persisted on disk 
> before the checkpoint is completed. (As previously established in 
> FLINK-35217, calling "fsync" is necessary as simply closing the stream does 
> not have any guarantees on persistence.)
> "strace" on the taskmanager's process confirms this behavior:
>  # The checkpoint chk-1217's directory is created at "mkdir"
>  # The checkpoint chk-1217's non-inline state is written by the taskmanager 
> at "openat", filename is "0507881e-8877-40b0-82d6-3d7dead64ccc". Note that 
> there's no "fsync" before "close".
>  # The checkpoint chk-1217 is finished, its "_metadata" is written and synced 
> properly
>  # The old checkpoint chk-1216 is deleted at "unlink"
> The new checkpoint chk-1217 now references a not-synced file that can get 
> corrupted on e.g. power loss. This means there is no working checkpoint left 
> as the old checkpoint was deleted.
> For durable persistence an "fsync" call is missing before "close" in step 2.
> Full "strace" log:
> {code:java}
> [pid 947250] 08:22:58 
> stat("/opt/flink/statestore/d0346e6c26416765a6038fa482b29502/chk-1217", 
> 0x7f68414c5b50) = -1 ENOENT (No such file or directory)
> [pid 947250] 08:22:58 
> stat("/opt/flink/statestore/d0346e6c26416765a6038fa482b29502/chk-1217", 
> 0x7f68414c5b50) = -1 ENOENT (No such file or directory)
> [pid 947250] 08:22:58 
> stat("/opt/flink/statestore/d0346e6c26416765a6038fa482b29502", 
> {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
> [pid 947250] 08:22:58 
> mkdir("/opt/flink/statestore/d0346e6c26416765a6038fa482b29502/chk-1217", 
> 0777) = 0
> [pid 1303248] 08:22:59 
> stat("/opt/flink/statestore/d0346e6c26416765a6038fa482b29502/chk-1217/0507881e-8877-40b0-82d6-3d7dead64ccc",
>  0x7f56f08d5610) = -1 ENOENT (No such file or directory)
> [pid 1303248] 08:22:59 
> stat("/opt/flink/statestore/d0346e6c26416765a6038fa482b29502/chk-1217", 
> {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
> [pid 1303248] 08:22:59 openat(AT_FDCWD, 
> "/opt/flink/statestore/d0346e6c26416765a6038fa482b29502/chk-1217/0507881e-8877-40b0-82d6-3d7dead64ccc",
>  O_WRONLY|O_CREAT|O_TRUNC, 0666) = 199
> [pid 1303248] 08:22:59 fstat(199, {st_mode=S_IFREG|0644, st_size=0, ...}) = 0
> [pid 1303248] 08:22:59 close(199)   = 0
> [pid 947310] 08:22:59 
> stat("/opt/flink/statestore/d0346e6c26416765a6038fa482b29502/chk-1217/_metadata",
>  0x7f683fb378b0) = -1 ENOENT (No such file or directory)
> [pid 947310] 08:22:59 
> stat("/opt/flink/statestore/d0346e6c26416765a6038fa482b29502/chk-1217/_metadata",
>  0x7f683fb37730) = -1 ENOENT (No such file or directory)
> [pid 947310] 08:22:59 
> stat("/opt/flink/statestore/d0346e6c26416765a6038fa482b29502/chk-1217/._metadata.inprogress.e87ac62b-5f75-472b-ad21-9579a872b0d0",
>  0x7f683fb37730) = -1 ENOENT (No such file or directory)
> [pid 947310] 08:22:59 
> stat("/opt/flink/statestore/d0346e6c26416765a6038fa482b29502/chk-1217", 
> {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
> [pid 947310] 08:22:59 
> stat("/opt/flink/statestore/d0346e6c26416765a6038fa482b29502/chk-1217", 
> {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
> [pid 947310] 08:22:59 openat(AT_FDCWD, 
> "/opt/flink/statestore/d0346e6c26416765a6038fa482b29502/

[jira] [Updated] (FLINK-13926) `ProcessingTimeSessionWindows` and `EventTimeSessionWindows` should be generic

2024-10-07 Thread Weijie Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weijie Guo updated FLINK-13926:
---
Fix Version/s: 3.0.0
   (was: 2.0.0)

> `ProcessingTimeSessionWindows` and `EventTimeSessionWindows` should be 
> generic 
> ---
>
> Key: FLINK-13926
> URL: https://issues.apache.org/jira/browse/FLINK-13926
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / DataStream
>Affects Versions: 1.9.0
>Reporter: zhihao zhang
>Priority: Major
>  Labels: pull-request-available, windows
> Fix For: 3.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> `ProcessingTimeSessionWindows` and `EventTimeSessionWindows` should be 
> generic just like `DynamicEventTimeSessionWindows` and 
> `DynamicProcessingTimeSessionWindows`.
> now:
>  
> {code:java}
> public class ProcessingTimeSessionWindows extends 
> MergingWindowAssigner<Object, TimeWindow> {}
> {code}
> proposal:
>  
> {code:java}
> public class ProcessingTimeSessionWindows<T> extends 
> MergingWindowAssigner<T, TimeWindow> {}
> {code}
> If this ticket is ok to go, I would like to take it.
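The proposed change can be illustrated with simplified stand-ins for Flink's classes ({{TimeWindow}} and {{MergingWindowAssigner}} below are minimal mock-ups, not the real API): making the assigner generic in the element type {{T}} matches the shape of {{DynamicProcessingTimeSessionWindows<T>}}.

```java
import java.util.Collections;
import java.util.List;

public class GenericAssignerSketch {
    // Simplified stand-in for Flink's TimeWindow.
    static class TimeWindow {
        final long start, end;
        TimeWindow(long start, long end) { this.start = start; this.end = end; }
    }

    // Before the change, the element type was fixed to Object; with the
    // proposal, the assigner is generic in the element type T.
    abstract static class MergingWindowAssigner<T, W> {
        abstract List<W> assignWindows(T element, long timestamp);
    }

    static class ProcessingTimeSessionWindows<T> extends MergingWindowAssigner<T, TimeWindow> {
        private final long gap;
        ProcessingTimeSessionWindows(long gap) { this.gap = gap; }

        @Override
        List<TimeWindow> assignWindows(T element, long timestamp) {
            // A session window starts at the element's timestamp and stays
            // open for `gap` milliseconds.
            return Collections.singletonList(new TimeWindow(timestamp, timestamp + gap));
        }
    }

    public static void main(String[] args) {
        // The element type is now tracked statically instead of being Object.
        ProcessingTimeSessionWindows<String> assigner = new ProcessingTimeSessionWindows<>(30L);
        TimeWindow w = assigner.assignWindows("event", 100L).get(0);
        System.out.println(w.start + "-" + w.end); // prints 100-130
    }
}
```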



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-36433) non-keyed fullWindowPartition#reduce may throw NPE

2024-10-07 Thread xuhuang (Jira)
xuhuang created FLINK-36433:
---

 Summary: non-keyed fullWindowPartition#reduce may throw NPE
 Key: FLINK-36433
 URL: https://issues.apache.org/jira/browse/FLINK-36433
 Project: Flink
  Issue Type: Bug
  Components: API / DataStream
Reporter: xuhuang






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-36261) Revisit all breaking changes before Flink 2.0 Preview

2024-10-07 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-36261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song reassigned FLINK-36261:


Assignee: Xuannan Su

> Revisit all breaking changes before Flink 2.0 Preview
> -
>
> Key: FLINK-36261
> URL: https://issues.apache.org/jira/browse/FLINK-36261
> Project: Flink
>  Issue Type: Technical Debt
>Reporter: Xintong Song
>Assignee: Xuannan Su
>Priority: Blocker
> Fix For: 2.0-preview
>
>
> Japicmp for @Deprecated APIs is disabled in FLINK-36207.
> We need to check whether there's any unexpected breaking changes right before 
> the Flink 2.0 Preview release, by re-enable the checking locally.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (FLINK-36360) Prepare release process and Scripts for the preview release

2024-10-07 Thread Weijie Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-36360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weijie Guo resolved FLINK-36360.

Resolution: Done

> Prepare release process and Scripts for the preview release
> ---
>
> Key: FLINK-36360
> URL: https://issues.apache.org/jira/browse/FLINK-36360
> Project: Flink
>  Issue Type: New Feature
>  Components: Release System
>Reporter: Xintong Song
>Assignee: Xintong Song
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 2.0-preview
>
>
> Flink Repo
> || ||Branch||Version||Tag (if any)||
> |Regular|master|2.0-SNAPSHOT| |
> | |release-1.20|1.20-SNAPSHOT| |
> | |release-1.20-rc1|1.20.0|release-1.20.0|
> |Preview|master|2.0-SNAPSHOT| |
> | |2.0-preview1-rc1|2.0-preview1|release-2.0-preview1|
>  
> Docs
> || ||Doc Version||Pointing Branch||Notes||
> |Regular|1.20.X|release-1.20| |
> |Preview|2.0-previewX|2.0-preview1-rc1 (branch of the most recent preview & 
> rc)|Should be removed once 2.0.0 is out|
>  
> Docker
> ||Heading 1||Version||Branch||Notes||
> |Regular|1.20.X|dev-1.20| |
> |Preview|2.0-previewX|dev-2.0|2.0.x should use the same branch|



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-36360) Prepare release process and Scripts for the preview release

2024-10-07 Thread Weijie Guo (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-36360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887490#comment-17887490
 ] 

Weijie Guo commented on FLINK-36360:


apache/flink(master 2.0) via 2f6882796cd7971eb2f69736f385d4ce1535f96f.
flink-docker(dev-master) via 1f35ca3b357fd0b5f332e8b07dad5b853e55a3f7.

> Prepare release process and Scripts for the preview release
> ---
>
> Key: FLINK-36360
> URL: https://issues.apache.org/jira/browse/FLINK-36360
> Project: Flink
>  Issue Type: New Feature
>  Components: Release System
>Reporter: Xintong Song
>Assignee: Xintong Song
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: 2.0-preview
>
>
> Flink Repo
> || ||Branch||Version||Tag (if any)||
> |Regular|master|2.0-SNAPSHOT| |
> | |release-1.20|1.20-SNAPSHOT| |
> | |release-1.20-rc1|1.20.0|release-1.20.0|
> |Preview|master|2.0-SNAPSHOT| |
> | |2.0-preview1-rc1|2.0-preview1|release-2.0-preview1|
>  
> Docs
> || ||Doc Version||Pointing Branch||Notes||
> |Regular|1.20.X|release-1.20| |
> |Preview|2.0-previewX|2.0-preview1-rc1 (branch of the most recent preview & 
> rc)|Should be removed once 2.0.0 is out|
>  
> Docker
> ||Heading 1||Version||Branch||Notes||
> |Regular|1.20.X|dev-1.20| |
> |Preview|2.0-previewX|dev-2.0|2.0.x should use the same branch|



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-36261) Revisit all breaking changes before Flink 2.0 Preview

2024-10-07 Thread Xuannan Su (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-36261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xuannan Su reassigned FLINK-36261:
--

Assignee: Xuannan Su  (was: Xuannan Su)

> Revisit all breaking changes before Flink 2.0 Preview
> -
>
> Key: FLINK-36261
> URL: https://issues.apache.org/jira/browse/FLINK-36261
> Project: Flink
>  Issue Type: Technical Debt
>Reporter: Xintong Song
>Assignee: Xuannan Su
>Priority: Blocker
> Fix For: 2.0-preview
>
>
> Japicmp for @Deprecated APIs is disabled in FLINK-36207.
> We need to check whether there's any unexpected breaking changes right before 
> the Flink 2.0 Preview release, by re-enable the checking locally.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-5875) Use TypeComparator.hash() instead of Object.hashCode() for keying in DataStream API

2024-10-07 Thread Weijie Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weijie Guo updated FLINK-5875:
--
Parent: (was: FLINK-3957)
Issue Type: Technical Debt  (was: Sub-task)

> Use TypeComparator.hash() instead of Object.hashCode() for keying in 
> DataStream API
> ---
>
> Key: FLINK-5875
> URL: https://issues.apache.org/jira/browse/FLINK-5875
> Project: Flink
>  Issue Type: Technical Debt
>  Components: API / DataStream
>Reporter: Robert Metzger
>Priority: Major
> Fix For: 3.0.0
>
>
> See FLINK-5874 for details.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (FLINK-4675) Remove Parameter from WindowAssigner.getDefaultTrigger()

2024-10-07 Thread Weijie Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-4675?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weijie Guo resolved FLINK-4675.
---
Resolution: Done

already done via FLINK-36355.

> Remove Parameter from WindowAssigner.getDefaultTrigger()
> 
>
> Key: FLINK-4675
> URL: https://issues.apache.org/jira/browse/FLINK-4675
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / DataStream
>Reporter: Aljoscha Krettek
>Assignee: xuhuang
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.0-preview
>
>
> For legacy reasons the method has {{StreamExecutionEnvironment}} as a 
> parameter. This is not needed anymore.
> [~StephanEwen] do you think we should break this now? {{WindowAssigner}} is 
> {{PublicEvolving}} but I wanted to play it conservative for now.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (FLINK-36434) Revise threading model of (KafkaPartition)SplitReader

2024-10-07 Thread Arvid Heise (Jira)
Arvid Heise created FLINK-36434:
---

 Summary: Revise threading model of (KafkaPartition)SplitReader
 Key: FLINK-36434
 URL: https://issues.apache.org/jira/browse/FLINK-36434
 Project: Flink
  Issue Type: Bug
  Components: Connectors / Kafka
Affects Versions: kafka-3.2.0
Reporter: Arvid Heise


The KafkaPartitionSplitReader is created in the source thread, where it 
initializes the consumer. However, it later accesses the consumer almost 
exclusively through the fetcher thread. Since the consumer is not thread-safe, 
this threading model looks broken.

However, I'd argue that the overall SplitReader design is suboptimal, as the 
same issue is probably happening in other connectors. I'd probably first create 
the fetch task and create the split reader within the fetch task.

If left as-is, we can't upgrade the Kafka client anymore, because we receive 
sporadic failures such as:


{code:java}
 Caused by: org.apache.kafka.common.requests.CorrelationIdMismatchException: 
Correlation id for response (1179651) does not match request (0), request 
header: RequestHeader(apiKey=API_VERSIONS, apiVersion=3, 
clientId=kafka-source-external-context-6092797646400842179-3, correlationId=0, 
headerVersion=2)
at 
org.apache.kafka.common.requests.AbstractResponse.parseResponse(AbstractResponse.java:106)
at 
org.apache.kafka.clients.NetworkClient.parseResponse(NetworkClient.java:740)
at 
org.apache.kafka.clients.NetworkClient.handleCompletedReceives(NetworkClient.java:913)
at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:580)
at 
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:280)
at 
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:251)
at 
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:242)
at 
org.apache.kafka.clients.consumer.KafkaConsumer.position(KafkaConsumer.java:1728)
at 
org.apache.kafka.clients.consumer.KafkaConsumer.position(KafkaConsumer.java:1686)
at 
org.apache.flink.connector.kafka.source.reader.KafkaPartitionSplitReader.lambda$removeEmptySplits$5(KafkaPartitionSplitReader.java:375)
at 
org.apache.flink.connector.kafka.source.reader.KafkaPartitionSplitReader.retryOnWakeup(KafkaPartitionSplitReader.java:481)
at 
org.apache.flink.connector.kafka.source.reader.KafkaPartitionSplitReader.removeEmptySplits(KafkaPartitionSplitReader.java:374)
at 
org.apache.flink.connector.kafka.source.reader.KafkaPartitionSplitReader.handleSplitsChanges(KafkaPartitionSplitReader.java:224)
at 
org.apache.flink.connector.base.source.reader.fetcher.AddSplitsTask.run(AddSplitsTask.java:51)
at 
org.apache.flink.connector.base.source.reader.fetcher.SplitFetcher.runOnce(SplitFetcher.java:165)
... 6 more
{code}
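The confinement fix described in the ticket — create the client inside the fetch task — can be illustrated with a stand-in for a non-thread-safe client ({{NotThreadSafeClient}} below is hypothetical, not Kafka's API): construction and all later access happen on the same single-threaded executor.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ConfinedReader {
    // Stand-in for a non-thread-safe client such as KafkaConsumer: it
    // remembers which thread created it and rejects use from any other.
    static class NotThreadSafeClient {
        private final Thread owner = Thread.currentThread();

        int poll() {
            if (Thread.currentThread() != owner) {
                throw new IllegalStateException("accessed from foreign thread");
            }
            return 1;
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService fetcher = Executors.newSingleThreadExecutor();
        // Create the client inside the fetcher thread (inside the fetch task)
        // instead of in the source thread, so construction and every later
        // access are confined to one thread.
        int polled = fetcher.submit(() -> {
            NotThreadSafeClient consumer = new NotThreadSafeClient();
            return consumer.poll();
        }).get();
        System.out.println(polled); // prints 1
        fetcher.shutdown();
        fetcher.awaitTermination(5, TimeUnit.SECONDS);
    }
}
```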



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-6375) Fix LongValue hashCode

2024-10-07 Thread Weijie Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-6375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weijie Guo updated FLINK-6375:
--
Parent: (was: FLINK-3957)
Issue Type: Technical Debt  (was: Sub-task)

> Fix LongValue hashCode
> --
>
> Key: FLINK-6375
> URL: https://issues.apache.org/jira/browse/FLINK-6375
> Project: Flink
>  Issue Type: Technical Debt
>  Components: API / Type Serialization System
>Affects Versions: 2.0.0
>Reporter: Greg Hogan
>Priority: Minor
>  Labels: auto-unassigned
> Fix For: 3.0.0
>
>
> Match {{LongValue.hashCode}} to {{Long.hashCode}} (and the other numeric 
> types) by simply adding the high and low words rather than shifting the hash 
> by adding 43.
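For reference, the JDK's {{Long.hashCode}} combines the high and low 32-bit words (via XOR). A minimal sketch of that reference behavior, which the ticket proposes {{LongValue.hashCode}} should match:

```java
public class LongHashSketch {
    // JDK reference behavior: Long.hashCode(v) == (int)(v ^ (v >>> 32)),
    // i.e. it mixes the high and low 32-bit words of the value.
    static int jdkStyleHash(long value) {
        return (int) (value ^ (value >>> 32));
    }

    public static void main(String[] args) {
        long v = 0x0000_0001_0000_0002L; // high word = 1, low word = 2
        System.out.println(jdkStyleHash(v) == Long.hashCode(v)); // prints true
        System.out.println(jdkStyleHash(v)); // prints 3 (1 ^ 2)
    }
}
```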



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-13926) `ProcessingTimeSessionWindows` and `EventTimeSessionWindows` should be generic

2024-10-07 Thread Weijie Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-13926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weijie Guo updated FLINK-13926:
---
Parent: (was: FLINK-3957)
Issue Type: Technical Debt  (was: Sub-task)

> `ProcessingTimeSessionWindows` and `EventTimeSessionWindows` should be 
> generic 
> ---
>
> Key: FLINK-13926
> URL: https://issues.apache.org/jira/browse/FLINK-13926
> Project: Flink
>  Issue Type: Technical Debt
>  Components: API / DataStream
>Affects Versions: 1.9.0
>Reporter: zhihao zhang
>Priority: Major
>  Labels: pull-request-available, windows
> Fix For: 3.0.0
>
>  Time Spent: 20m
>  Remaining Estimate: 0h
>
> `ProcessingTimeSessionWindows` and `EventTimeSessionWindows` should be 
> generic just like `DynamicEventTimeSessionWindows` and 
> `DynamicProcessingTimeSessionWindows`.
> now:
>  
> {code:java}
> public class ProcessingTimeSessionWindows extends 
> MergingWindowAssigner<Object, TimeWindow> {}
> {code}
> proposal:
>  
> {code:java}
> public class ProcessingTimeSessionWindows<T> extends 
> MergingWindowAssigner<T, TimeWindow> {}
> {code}
> If this ticket is ok to go, I would like to take it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-32383) 2.0 Breaking configuration changes

2024-10-07 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-32383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song closed FLINK-32383.

Resolution: Fixed

> 2.0 Breaking configuration changes
> --
>
> Key: FLINK-32383
> URL: https://issues.apache.org/jira/browse/FLINK-32383
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Runtime / Configuration
>Affects Versions: 2.0-preview
>Reporter: Zhu Zhu
>Assignee: Junrui Li
>Priority: Blocker
>  Labels: 2.0-related
> Fix For: 2.0-preview
>
>
> Umbrella issue for all breaking changes to Flink configuration.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (FLINK-6375) Fix LongValue hashCode

2024-10-07 Thread Weijie Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-6375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weijie Guo updated FLINK-6375:
--
Fix Version/s: 3.0.0
   (was: 2.0.0)

> Fix LongValue hashCode
> --
>
> Key: FLINK-6375
> URL: https://issues.apache.org/jira/browse/FLINK-6375
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / Type Serialization System
>Affects Versions: 2.0.0
>Reporter: Greg Hogan
>Priority: Minor
>  Labels: auto-unassigned
> Fix For: 3.0.0
>
>
> Match {{LongValue.hashCode}} to {{Long.hashCode}} (and the other numeric 
> types) by simply adding the high and low words rather than shifting the hash 
> by adding 43.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (FLINK-4503) Remove with method from CoGroupedStream and JoinedStream, and change apply method return type

2024-10-07 Thread Weijie Guo (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-4503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Weijie Guo resolved FLINK-4503.
---
Resolution: Fixed

Already done in FLINK-36355.

> Remove with method from CoGroupedStream and JoinedStream, and change apply 
> method return type
> -
>
> Key: FLINK-4503
> URL: https://issues.apache.org/jira/browse/FLINK-4503
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / DataStream
>Reporter: Jark Wu
>Assignee: xuhuang
>Priority: Major
> Fix For: 2.0-preview
>
>
> We introduced (and immediately deprecated) the with(...) method in 
> FLINK-4271. It was a temporary workaround for setting parallelism after the 
> co-group and join operators without breaking binary compatibility. The 
> with(...) method only differs in its return type and calls apply(...), 
> casting the returned value.
> So we need to remove the {{with(...)}} method in Flink 2.0 and change the 
> apply method's return type.
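The return-type change can be sketched with simplified stand-in classes ({{DataStream}}, {{SingleOutputStreamOperator}}, and {{WindowedStream}} below are minimal mock-ups, not Flink's real API): once {{apply(...)}} itself returns the operator type, the deprecated {{with(...)}} wrapper becomes unnecessary.

```java
public class WithRemovalSketch {
    // Simplified stand-ins illustrating why with(...) existed: it only
    // narrowed the return type of apply(...) so parallelism could be set.
    static class DataStream<T> { }

    static class SingleOutputStreamOperator<T> extends DataStream<T> {
        SingleOutputStreamOperator<T> setParallelism(int p) { return this; }
    }

    static class WindowedStream<T> {
        // Flink 2.0 shape: apply(...) itself returns the operator type,
        // so callers can chain setParallelism without a with(...) wrapper.
        SingleOutputStreamOperator<T> apply() {
            return new SingleOutputStreamOperator<>();
        }
    }

    public static void main(String[] args) {
        SingleOutputStreamOperator<String> op =
                new WindowedStream<String>().apply().setParallelism(4);
        System.out.println(op != null); // prints true
    }
}
```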



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-14068) Use Java's Duration instead of Flink's Time

2024-10-07 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-14068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song closed FLINK-14068.

Resolution: Done

> Use Java's Duration instead of Flink's Time
> ---
>
> Key: FLINK-14068
> URL: https://issues.apache.org/jira/browse/FLINK-14068
> Project: Flink
>  Issue Type: Sub-task
>  Components: API / DataStream, Runtime / Configuration, Runtime / 
> Coordination
>Reporter: Zili Chen
>Assignee: Matthias Pohl
>Priority: Major
>  Labels: 2.0-related, pull-request-available
> Fix For: 2.0-preview
>
>
> As discussion in mailing list 
> [here|https://lists.apache.org/x/thread.html/90ad2f1d7856cfe5bdc8f7dd678c626be96eeaeeb736e98f31660039@%3Cdev.flink.apache.org%3E]
>  the community reaches a consensus that we will use Java's Duration for 
> representing "time interval" instead of use Flink's Time for it.
> Specifically, Flink has two {{Time}} classes, which are
> {{org.apache.flink.api.common.time.Time}}
> {{org.apache.flink.streaming.api.windowing.time.Time}}
> the latter has already been deprecated and superseded by the former. Now we 
> want to also deprecate the former and drop it in 2.0.0 (we don't drop it 
> right away because it is part of the {{@Public}} interfaces).
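A minimal sketch of the migration direction, assuming an API that previously accepted Flink's {{Time}} now takes {{java.time.Duration}} (the method name below is hypothetical, for illustration only):

```java
import java.time.Duration;

public class DurationMigration {
    // Hypothetical signature: previously this would have taken Flink's
    // org.apache.flink.api.common.time.Time; it now takes java.time.Duration.
    static long timeoutMillis(Duration timeout) {
        return timeout.toMillis();
    }

    public static void main(String[] args) {
        // What used to be written as Time.seconds(30) becomes:
        Duration timeout = Duration.ofSeconds(30);
        System.out.println(timeoutMillis(timeout)); // prints 30000
    }
}
```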



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33747) Remove Sink V1 API in 2.0

2024-10-07 Thread LvYanquan (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33747?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17887485#comment-17887485
 ] 

LvYanquan commented on FLINK-33747:
---

Removed together in the associated PR of this Jira 
https://issues.apache.org/jira/browse/FLINK-36245.

> Remove Sink V1 API in 2.0
> -
>
> Key: FLINK-33747
> URL: https://issues.apache.org/jira/browse/FLINK-33747
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Connectors / Common
>Reporter: Weijie Guo
>Priority: Major
>  Labels: 2.0-related
> Fix For: 2.0-preview
>
>
> We have already marked the Sink V1 related APIs as `@Deprecated` in 1.15. 
> They can be removed in Flink 2.0.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (FLINK-35411) Optimize buffer triggering of async state requests

2024-10-07 Thread Zakelly Lan (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-35411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zakelly Lan resolved FLINK-35411.
-
Fix Version/s: 2.0-preview
   Resolution: Fixed

> Optimize buffer triggering of async state requests
> --
>
> Key: FLINK-35411
> URL: https://issues.apache.org/jira/browse/FLINK-35411
> Project: Flink
>  Issue Type: Sub-task
>  Components: Runtime / State Backends, Runtime / Task
>Reporter: Zakelly Lan
>Assignee: Zakelly Lan
>Priority: Major
>  Labels: pull-request-available
> Fix For: 2.0-preview
>
>
> Currently, during the draining of async state requests, the task thread 
> performs {{Thread.sleep}} to avoid CPU overhead while polling mails. This 
> can be optimized with wait & notify.
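The optimization can be sketched with intrinsic wait/notify (a simplified stand-in for the actual mailbox draining logic, not Flink's implementation): the draining thread blocks on a monitor and is woken by the completion callback, instead of sleep-polling.

```java
import java.util.ArrayDeque;
import java.util.Queue;

public class DrainWithNotify {
    private final Queue<Integer> pending = new ArrayDeque<>();
    private final Object lock = new Object();

    void submit(int requestId) {
        synchronized (lock) {
            pending.add(requestId);
        }
    }

    // Completion callback: wakes the draining thread instead of letting it
    // rediscover the state change on its next Thread.sleep cycle.
    void complete(int requestId) {
        synchronized (lock) {
            pending.remove(requestId);
            lock.notifyAll();
        }
    }

    // Blocks until all in-flight requests are done, without Thread.sleep.
    void drain() throws InterruptedException {
        synchronized (lock) {
            while (!pending.isEmpty()) {
                lock.wait();
            }
        }
    }

    public static void main(String[] args) throws Exception {
        DrainWithNotify d = new DrainWithNotify();
        d.submit(1);
        Thread callback = new Thread(() -> d.complete(1));
        callback.start();
        d.drain(); // returns as soon as the callback has notified
        callback.join();
        System.out.println("drained");
    }
}
```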



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Assigned] (FLINK-33747) Remove Sink V1 API in 2.0

2024-10-07 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song reassigned FLINK-33747:


Assignee: LvYanquan

> Remove Sink V1 API in 2.0
> -
>
> Key: FLINK-33747
> URL: https://issues.apache.org/jira/browse/FLINK-33747
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Connectors / Common
>Reporter: Weijie Guo
>Assignee: LvYanquan
>Priority: Major
>  Labels: 2.0-related
> Fix For: 2.0-preview
>
>
> We have already marked the Sink V1 related APIs as `@Deprecated` in 1.15. 
> They can be removed in Flink 2.0.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Closed] (FLINK-33747) Remove Sink V1 API in 2.0

2024-10-07 Thread Xintong Song (Jira)


 [ 
https://issues.apache.org/jira/browse/FLINK-33747?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Xintong Song closed FLINK-33747.

Resolution: Done

> Remove Sink V1 API in 2.0
> -
>
> Key: FLINK-33747
> URL: https://issues.apache.org/jira/browse/FLINK-33747
> Project: Flink
>  Issue Type: Technical Debt
>  Components: Connectors / Common
>Reporter: Weijie Guo
>Assignee: LvYanquan
>Priority: Major
>  Labels: 2.0-related
> Fix For: 2.0-preview
>
>
> We have already marked the Sink V1 related APIs as `@Deprecated` in 1.15. 
> They can be removed in Flink 2.0.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)