[jira] [Created] (FLINK-32197) FLIP 246: Multi Cluster Kafka Source
Mason Chen created FLINK-32197:
--
Summary: FLIP 246: Multi Cluster Kafka Source
Key: FLINK-32197
URL: https://issues.apache.org/jira/browse/FLINK-32197
Project: Flink
Issue Type: New Feature
Components: Connectors / Kafka
Reporter: Mason Chen

This is for introducing a new connector that extends the current KafkaSource to read from multiple Kafka clusters, which can change dynamically. For more details, please refer to [FLIP 246|https://cwiki.apache.org/confluence/display/FLINK/FLIP-246%3A+Multi+Cluster+Kafka+Source].

--
This message was sent by Atlassian Jira (v8.20.10#820010)
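As a rough illustration of the API proposed in FLIP-246, a minimal sketch of building such a source; the class and builder method names (DynamicKafkaSource, setKafkaMetadataService, setStreamIds) come from the proposal and should be treated as assumptions until an implementation is merged:

```java
import java.util.Properties;
import java.util.Set;

import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.connector.kafka.source.reader.deserializer.KafkaRecordDeserializationSchema;
import org.apache.kafka.common.serialization.StringDeserializer;

// Sketch based on FLIP-246; names are from the proposal, not a released API.
DynamicKafkaSource<String> source =
        DynamicKafkaSource.<String>builder()
                // resolves logical "streams" to physical clusters/topics and can
                // pick up cluster changes at runtime (hypothetical service impl)
                .setKafkaMetadataService(myMetadataService)
                .setStreamIds(Set.of("orders-stream"))
                .setDeserializer(
                        KafkaRecordDeserializationSchema.valueOnly(StringDeserializer.class))
                .setProperties(new Properties())
                .build();

env.fromSource(source, WatermarkStrategy.noWatermarks(), "dynamic-kafka-source");
```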
[jira] [Created] (FLINK-32416) Initial DynamicKafkaSource Implementation
Mason Chen created FLINK-32416:
--
Summary: Initial DynamicKafkaSource Implementation
Key: FLINK-32416
URL: https://issues.apache.org/jira/browse/FLINK-32416
Project: Flink
Issue Type: Sub-task
Reporter: Mason Chen

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-32417) DynamicKafkaSource User Documentation
Mason Chen created FLINK-32417:
--
Summary: DynamicKafkaSource User Documentation
Key: FLINK-32417
URL: https://issues.apache.org/jira/browse/FLINK-32417
Project: Flink
Issue Type: Sub-task
Reporter: Mason Chen

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-32449) Refactor state machine examples to remove Kafka dependency
Mason Chen created FLINK-32449:
--
Summary: Refactor state machine examples to remove Kafka dependency
Key: FLINK-32449
URL: https://issues.apache.org/jira/browse/FLINK-32449
Project: Flink
Issue Type: Technical Debt
Components: Documentation
Affects Versions: 1.18.0
Reporter: Mason Chen

Since the Kafka connector has been externalized, we should remove dependencies on the external Kafka connector. In this case, we should replace the KafkaSource with an example-specific generator source and delete the KafkaEventsGeneratorJob; one possible replacement is sketched below.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
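As a sketch of one possible replacement, the datagen connector's DataGeneratorSource could drive the example; `Event` and `eventsGenerator` refer to the state machine example's own classes, and the exact wiring here is an assumption rather than the merged change:

```java
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.connector.source.util.ratelimit.RateLimiterStrategy;
import org.apache.flink.connector.datagen.source.DataGeneratorSource;
import org.apache.flink.connector.datagen.source.GeneratorFunction;

// Sketch: generate state machine events in-process instead of reading Kafka.
GeneratorFunction<Long, Event> generator = index -> eventsGenerator.next();

DataGeneratorSource<Event> source =
        new DataGeneratorSource<>(
                generator,
                Long.MAX_VALUE,                      // effectively unbounded
                RateLimiterStrategy.perSecond(100),  // keep the example's pace modest
                TypeInformation.of(Event.class));

DataStream<Event> events =
        env.fromSource(source, WatermarkStrategy.noWatermarks(), "Events Generator");
```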
[jira] [Created] (FLINK-32451) Refactor Confluent Schema Registry E2E Tests to remove Kafka connector dependency
Mason Chen created FLINK-32451:
--
Summary: Refactor Confluent Schema Registry E2E Tests to remove Kafka connector dependency
Key: FLINK-32451
URL: https://issues.apache.org/jira/browse/FLINK-32451
Project: Flink
Issue Type: Technical Debt
Components: Formats (JSON, Avro, Parquet, ORC, SequenceFile), Tests
Affects Versions: 1.18.0
Reporter: Mason Chen

Since the Kafka connector has been externalized, we should remove dependencies on the external Kafka connector. We can use a different connector to test the Confluent Schema Registry format, since the format is connector agnostic.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-32452) Refactor SQL Client E2E Test to Remove Kafka SQL Connector Dependency
Mason Chen created FLINK-32452:
--
Summary: Refactor SQL Client E2E Test to Remove Kafka SQL Connector Dependency
Key: FLINK-32452
URL: https://issues.apache.org/jira/browse/FLINK-32452
Project: Flink
Issue Type: Technical Debt
Components: Table SQL / Client, Tests
Affects Versions: 1.18.0
Reporter: Mason Chen

Since the Kafka connector has been externalized, we should remove dependencies on the external Kafka connector. The SQL Client E2E test can exercise the client with a different connector.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-32507) Document KafkaSink SinkWriterMetricGroup metrics
Mason Chen created FLINK-32507:
--
Summary: Document KafkaSink SinkWriterMetricGroup metrics
Key: FLINK-32507
URL: https://issues.apache.org/jira/browse/FLINK-32507
Project: Flink
Issue Type: Technical Debt
Components: Connectors / Kafka
Reporter: Mason Chen

The SinkWriterMetricGroup metrics that KafkaSink implements are not documented.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-32892) Upgrade kafka-clients dependency to 3.4.x
Mason Chen created FLINK-32892:
--
Summary: Upgrade kafka-clients dependency to 3.4.x
Key: FLINK-32892
URL: https://issues.apache.org/jira/browse/FLINK-32892
Project: Flink
Issue Type: Improvement
Components: Connectors / Kafka
Affects Versions: 1.17.1
Reporter: Mason Chen

3.4.x includes an implementation of KIP-830. [https://cwiki.apache.org/confluence/display/KAFKA/KIP-830%3A+Allow+disabling+JMX+Reporter]

Many people on the mailing list have complained about confusing warning logs about MBean conflicts. It should be safe to disable this JMX reporter, since Flink has its own metric system and the internal kafka-clients JMX reporter should not be used; a sketch follows below.

This affects Kafka connector release 3.1 and below (for some reason I cannot enter 3.1 in the affects version/s box).

--
This message was sent by Atlassian Jira (v8.20.10#820010)
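For illustration, a minimal sketch of what disabling the reporter could look like once the dependency is on 3.4.x; `auto.include.jmx.reporter` is the property that KIP-830 introduces, while whether the connector should set it by default is an open question:

```java
import java.util.Properties;

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;

// Sketch: turn off the kafka-clients JMX reporter (KIP-830, kafka-clients 3.4+).
// Flink's own metric system continues to report connector metrics as before.
Properties props = new Properties();
props.setProperty("auto.include.jmx.reporter", "false");

KafkaSource<String> source =
        KafkaSource.<String>builder()
                .setBootstrapServers("localhost:9092")
                .setTopics("my-topic")
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .setProperties(props)
                .build();
```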
[jira] [Created] (FLINK-32893) Make client.id configurable from KafkaSource
Mason Chen created FLINK-32893:
--
Summary: Make client.id configurable from KafkaSource
Key: FLINK-32893
URL: https://issues.apache.org/jira/browse/FLINK-32893
Project: Flink
Issue Type: Improvement
Affects Versions: 1.17.1
Reporter: Mason Chen

The client.id is not strictly configurable from the KafkaSource, because the source appends a configurable prefix plus subtask information to avoid the MBean conflict warnings internal to the Kafka client. However, various users reported that they use this client.id for Kafka quotas and need full control over it to enforce quotas properly (see the sketch below).

Affects Kafka connector 3.1 and below.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
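A sketch of today's closest workaround and why it falls short for quota enforcement; the exact shape of the generated suffix is an implementation detail and is shown only for illustration:

```java
import java.util.Properties;

// Sketch: only a prefix is configurable today; the source still appends
// subtask information, so the final client.id that Kafka quota enforcement
// sees (roughly "quota-team-a-<subtask>-...") is not fully user-controlled.
Properties props = new Properties();
props.setProperty("client.id.prefix", "quota-team-a");
```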
[jira] [Created] (FLINK-33575) FLIP-394: Add Metrics for Connector Agnostic Autoscaling
Mason Chen created FLINK-33575:
--
Summary: FLIP-394: Add Metrics for Connector Agnostic Autoscaling
Key: FLINK-33575
URL: https://issues.apache.org/jira/browse/FLINK-33575
Project: Flink
Issue Type: Improvement
Components: Connectors / Common, Kubernetes Operator, Runtime / Metrics
Reporter: Mason Chen

https://cwiki.apache.org/confluence/display/FLINK/FLIP-394%3A+Add+Metrics+for+Connector+Agnostic+Autoscaling

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-34064) Expose JobManagerOperatorMetrics via REST API
Mason Chen created FLINK-34064:
--
Summary: Expose JobManagerOperatorMetrics via REST API
Key: FLINK-34064
URL: https://issues.apache.org/jira/browse/FLINK-34064
Project: Flink
Issue Type: Improvement
Reporter: Mason Chen

Add a REST API to fetch coordinator metrics. [https://cwiki.apache.org/confluence/display/FLINK/FLIP-417%3A+Expose+JobManagerOperatorMetrics+via+REST+API]

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30409) Support reopening closed metric groups
Mason Chen created FLINK-30409:
--
Summary: Support reopening closed metric groups
Key: FLINK-30409
URL: https://issues.apache.org/jira/browse/FLINK-30409
Project: Flink
Issue Type: Improvement
Components: Runtime / Metrics
Affects Versions: 1.17.0
Reporter: Mason Chen

Currently, metricGroup.close() will unregister metrics and the underlying metric groups. If the metric group is created again via addGroup(), it will silently fail to create metrics, since the metric group is still in a closed state (see the sketch below).

We need to close metric groups and then reopen them because some metrics may reference stale objects that are no longer relevant, and we need to re-create the metric/metric group to point to the new references. For example, we may close `KafkaSourceReader` to remove a topic partition from assignment and then recreate `KafkaSourceReader` with a different set of topic partitions. The metrics should reflect that.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
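A minimal sketch of the silent failure; note that close() lives on Flink's internal AbstractMetricGroup, so this is illustrative usage rather than public API:

```java
import org.apache.flink.metrics.Counter;
import org.apache.flink.metrics.MetricGroup;

// First registration works as expected.
MetricGroup kafkaGroup = operatorMetricGroup.addGroup("kafkaReader");
Counter c1 = kafkaGroup.counter("recordsConsumed");

// ... the reader is torn down and the group is closed internally ...

// Re-creating the group returns it in closed state, so this counter is
// silently never registered with any reporter.
MetricGroup reopened = operatorMetricGroup.addGroup("kafkaReader");
Counter c2 = reopened.counter("recordsConsumed");
```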
[jira] [Created] (FLINK-30833) MetricListener supports closing metric groups
Mason Chen created FLINK-30833:
--
Summary: MetricListener supports closing metric groups
Key: FLINK-30833
URL: https://issues.apache.org/jira/browse/FLINK-30833
Project: Flink
Issue Type: Improvement
Components: Runtime / Metrics, Tests
Affects Versions: 1.16.1
Reporter: Mason Chen

The internal TestingMetricRegistry does not register an action for when a metric group is closed. Registering one would allow the MetricListener to be used to test logic that closes metric groups. [https://github.com/apache/flink/blob/44dbb8ec84792540095b826616d0f21b745aa995/flink-test-utils-parent/flink-test-utils/src/main/java/org/apache/flink/metrics/testutils/MetricListener.java#L53]

This change doesn't require exposing any public API and is transparent.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30859) Remove flink-connector-kafka from master branch
Mason Chen created FLINK-30859:
--
Summary: Remove flink-connector-kafka from master branch
Key: FLINK-30859
URL: https://issues.apache.org/jira/browse/FLINK-30859
Project: Flink
Issue Type: Technical Debt
Components: Connectors / Kafka
Affects Versions: 1.17.0
Reporter: Mason Chen

Remove flink-connector-kafka from the master branch, since the repo has now been externalized. There are some 1.17-dependent commits (e.g. split-level watermark alignment) that need to be transferred over.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30932) Enabling producer metrics for KafkaSink is not documented
Mason Chen created FLINK-30932:
--
Summary: Enabling producer metrics for KafkaSink is not documented
Key: FLINK-30932
URL: https://issues.apache.org/jira/browse/FLINK-30932
Project: Flink
Issue Type: Improvement
Components: Connectors / Kafka
Affects Versions: 1.17.0
Reporter: Mason Chen
Fix For: 1.17.0

Users can enable producer metrics by setting `register.producer.metrics` to true. We should expose this as a ConfigOption so it is picked up by Flink's automated documentation process; a sketch follows below.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
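A sketch of what the proposed ConfigOption could look like, using Flink's ConfigOptions API; the key mirrors the existing undocumented property, while the default and description text are assumptions:

```java
import org.apache.flink.configuration.ConfigOption;
import org.apache.flink.configuration.ConfigOptions;

// Sketch of the proposed option; once defined this way, Flink's documentation
// generator can pick it up automatically.
public static final ConfigOption<Boolean> REGISTER_PRODUCER_METRICS =
        ConfigOptions.key("register.producer.metrics")
                .booleanType()
                .defaultValue(true) // assumed to mirror current behavior
                .withDescription(
                        "Whether to register the Kafka producer's internal "
                                + "metrics in Flink's metric group.");
```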
[jira] [Created] (FLINK-31305) KafkaWriter doesn't wait for errors for in-flight records before completing flush
Mason Chen created FLINK-31305:
--
Summary: KafkaWriter doesn't wait for errors for in-flight records before completing flush
Key: FLINK-31305
URL: https://issues.apache.org/jira/browse/FLINK-31305
Project: Flink
Issue Type: Improvement
Components: Connectors / Kafka
Affects Versions: 1.16.1, 1.17.0
Reporter: Mason Chen
Fix For: 1.17.0

The KafkaWriter's flush needs to wait for all in-flight records to be sent successfully. This can be achieved by tracking requests and completing them from the callback registered in the producer#send() logic (see the sketch below). Without this, there is potential for data loss, since the checkpoint may not accurately reflect that all records have been sent successfully, violating at-least-once semantics.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
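A simplified sketch of the tracking idea described above, not the actual KafkaWriter code: count each in-flight send, and only let flush() finish once every callback has fired and any asynchronous error has been surfaced:

```java
import java.util.concurrent.atomic.AtomicLong;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

// Illustrative writer, assuming a byte[]/byte[] producer.
class TrackingWriter {
    private final KafkaProducer<byte[], byte[]> producer;
    private final AtomicLong inFlight = new AtomicLong();
    private volatile Exception asyncError;

    TrackingWriter(KafkaProducer<byte[], byte[]> producer) {
        this.producer = producer;
    }

    void write(ProducerRecord<byte[], byte[]> record) {
        inFlight.incrementAndGet();
        producer.send(record, (metadata, exception) -> {
            if (exception != null) {
                asyncError = exception;
            }
            inFlight.decrementAndGet();
        });
    }

    void flush() throws Exception {
        producer.flush(); // pushes out buffered records
        while (inFlight.get() > 0) {
            Thread.sleep(10); // simplistic wait; real code would use the mailbox/futures
        }
        if (asyncError != null) {
            throw asyncError; // fail the checkpoint instead of silently losing data
        }
    }
}
```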
[jira] [Created] (FLINK-31747) Externalize debezium from flink-json
Mason Chen created FLINK-31747:
--
Summary: Externalize debezium from flink-json
Key: FLINK-31747
URL: https://issues.apache.org/jira/browse/FLINK-31747
Project: Flink
Issue Type: Improvement
Components: Connectors / Kafka, Formats (JSON, Avro, Parquet, ORC, SequenceFile)
Affects Versions: 1.18.0
Reporter: Mason Chen

The Debezium code in flink-json should move to the external Kafka repo. However, we need to ensure backward compatibility for dependencies referencing `flink-json`.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-34127) Kafka connector repo runs a duplicate of `IntegrationTests` framework tests
Mason Chen created FLINK-34127:
--
Summary: Kafka connector repo runs a duplicate of `IntegrationTests` framework tests
Key: FLINK-34127
URL: https://issues.apache.org/jira/browse/FLINK-34127
Project: Flink
Issue Type: Improvement
Components: Build System / CI, Connectors / Kafka
Affects Versions: kafka-3.0.2
Reporter: Mason Chen

I found this behavior while troubleshooting CI flakiness. These integration tests make heavy use of the CI since they require Kafka, Zookeeper, and Docker containers. We can further stabilize CI by not running this set of tests redundantly.

`grep -E ".*testIdleReader\[TestEnvironment.*" 14_Compile\ and\ test.txt` returns:

```
2024-01-17T00:51:05.2943150Z Test org.apache.flink.connector.kafka.dynamic.source.DynamicKafkaSourceITTest.IntegrationTests.testIdleReader[TestEnvironment: [MiniCluster], ExternalContext: [org.apache.flink.connector.kafka.testutils.DynamicKafkaSourceExternalContext@43e9a8a2], Semantic: [EXACTLY_ONCE]] is running.
2024-01-17T00:51:07.6922535Z Test org.apache.flink.connector.kafka.dynamic.source.DynamicKafkaSourceITTest.IntegrationTests.testIdleReader[TestEnvironment: [MiniCluster], ExternalContext: [org.apache.flink.connector.kafka.testutils.DynamicKafkaSourceExternalContext@43e9a8a2], Semantic: [EXACTLY_ONCE]] successfully run.
2024-01-17T00:56:27.1326332Z Test org.apache.flink.connector.kafka.dynamic.source.DynamicKafkaSourceITTest.IntegrationTests.testIdleReader[TestEnvironment: [MiniCluster], ExternalContext: [org.apache.flink.connector.kafka.testutils.DynamicKafkaSourceExternalContext@2db4a84a], Semantic: [EXACTLY_ONCE]] is running.
2024-01-17T00:56:28.4000830Z Test org.apache.flink.connector.kafka.dynamic.source.DynamicKafkaSourceITTest.IntegrationTests.testIdleReader[TestEnvironment: [MiniCluster], ExternalContext: [org.apache.flink.connector.kafka.testutils.DynamicKafkaSourceExternalContext@2db4a84a], Semantic: [EXACTLY_ONCE]] successfully run.
2024-01-17T00:56:58.7830792Z Test org.apache.flink.connector.kafka.source.KafkaSourceITCase.IntegrationTests.testIdleReader[TestEnvironment: [MiniCluster], ExternalContext: [KafkaSource-PARTITION], Semantic: [EXACTLY_ONCE]] is running.
2024-01-17T00:56:59.0544092Z Test org.apache.flink.connector.kafka.source.KafkaSourceITCase.IntegrationTests.testIdleReader[TestEnvironment: [MiniCluster], ExternalContext: [KafkaSource-PARTITION], Semantic: [EXACTLY_ONCE]] successfully run.
2024-01-17T00:56:59.3910987Z Test org.apache.flink.connector.kafka.source.KafkaSourceITCase.IntegrationTests.testIdleReader[TestEnvironment: [MiniCluster], ExternalContext: [KafkaSource-TOPIC], Semantic: [EXACTLY_ONCE]] is running.
2024-01-17T00:56:59.6025298Z Test org.apache.flink.connector.kafka.source.KafkaSourceITCase.IntegrationTests.testIdleReader[TestEnvironment: [MiniCluster], ExternalContext: [KafkaSource-TOPIC], Semantic: [EXACTLY_ONCE]] successfully run.
2024-01-17T00:57:37.8378640Z Test org.apache.flink.connector.kafka.source.KafkaSourceITCase.IntegrationTests.testIdleReader[TestEnvironment: [MiniCluster], ExternalContext: [KafkaSource-PARTITION], Semantic: [EXACTLY_ONCE]] is running.
2024-01-17T00:57:38.0144732Z Test org.apache.flink.connector.kafka.source.KafkaSourceITCase.IntegrationTests.testIdleReader[TestEnvironment: [MiniCluster], ExternalContext: [KafkaSource-PARTITION], Semantic: [EXACTLY_ONCE]] successfully run.
2024-01-17T00:57:38.2004796Z Test org.apache.flink.connector.kafka.source.KafkaSourceITCase.IntegrationTests.testIdleReader[TestEnvironment: [MiniCluster], ExternalContext: [KafkaSource-TOPIC], Semantic: [EXACTLY_ONCE]] is running.
2024-01-17T00:57:38.4072815Z Test org.apache.flink.connector.kafka.source.KafkaSourceITCase.IntegrationTests.testIdleReader[TestEnvironment: [MiniCluster], ExternalContext: [KafkaSource-TOPIC], Semantic: [EXACTLY_ONCE]] successfully run.
2024-01-17T01:06:11.2933375Z Test org.apache.flink.tests.util.kafka.KafkaSourceE2ECase.testIdleReader[TestEnvironment: [FlinkContainers], ExternalContext: [KafkaSource-PARTITION], Semantic: [EXACTLY_ONCE]] is running.
2024-01-17T01:06:12.1790031Z Test org.apache.flink.tests.util.kafka.KafkaSourceE2ECase.testIdleReader[TestEnvironment: [FlinkContainers], ExternalContext: [KafkaSource-PARTITION], Semantic: [EXACTLY_ONCE]] successfully run.
2024-01-17T01:06:12.5703927Z Test org.apache.flink.tests.util.kafka.KafkaSourceE2ECase.testIdleReader[TestEnvironment: [FlinkContainers], ExternalContext: [KafkaSource-TOPIC], Semantic: [EXACTLY_ONCE]] is running.
2024-01-17T01:06:13.3369574Z Test org.apache.flink.tests.util.kafka.KafkaSourceE2ECase.testIdleReader[TestEnvironment: [FlinkContainers], ExternalContext: [KafkaSource-TOPIC], Semantic: [EXACTLY_ONCE]] successfully run.
```

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-34327) Use maven wrapper in operator build
Mason Chen created FLINK-34327:
--
Summary: Use maven wrapper in operator build
Key: FLINK-34327
URL: https://issues.apache.org/jira/browse/FLINK-34327
Project: Flink
Issue Type: Improvement
Components: Kubernetes Operator
Reporter: Mason Chen

Contributors need to switch between Maven versions at times, and mvnw can make this easy. For reference, the build was failing with Maven 3.2 but passed when I switched manually to Maven 3.9.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-34444) Add new endpoint handler to Flink
Mason Chen created FLINK-34444:
--
Summary: Add new endpoint handler to Flink
Key: FLINK-34444
URL: https://issues.apache.org/jira/browse/FLINK-34444
Project: Flink
Issue Type: Sub-task
Reporter: Mason Chen

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-34445) Integrate new endpoint with Flink UI metrics section
Mason Chen created FLINK-34445:
--
Summary: Integrate new endpoint with Flink UI metrics section
Key: FLINK-34445
URL: https://issues.apache.org/jira/browse/FLINK-34445
Project: Flink
Issue Type: Sub-task
Reporter: Mason Chen

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35122) Implement Watermark Alignment for DynamicKafkaSource
Mason Chen created FLINK-35122:
--
Summary: Implement Watermark Alignment for DynamicKafkaSource
Key: FLINK-35122
URL: https://issues.apache.org/jira/browse/FLINK-35122
Project: Flink
Issue Type: Improvement
Affects Versions: kafka-3.1.0
Reporter: Mason Chen
Assignee: Mason Chen

Implement Watermark Alignment for DynamicKafkaSource.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-35247) Upgrade spotless apply to `2.41.1` in flink-connector-parent to work with Java 21
Mason Chen created FLINK-35247:
--
Summary: Upgrade spotless apply to `2.41.1` in flink-connector-parent to work with Java 21
Key: FLINK-35247
URL: https://issues.apache.org/jira/browse/FLINK-35247
Project: Flink
Issue Type: Improvement
Components: Build System / CI, Connectors / Common
Affects Versions: connector-parent-1.1.0
Reporter: Mason Chen

The Spotless version configured in flink-connector-parent does not work with Java 21.

Tested here: [https://github.com/apache/flink-connector-kafka/pull/98]

This is already fixed upstream in Spotless: https://github.com/diffplug/spotless/pull/1920

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-27195) KafkaSourceReader offsetsToCommit should be mutable
Mason Chen created FLINK-27195:
--
Summary: KafkaSourceReader offsetsToCommit should be mutable
Key: FLINK-27195
URL: https://issues.apache.org/jira/browse/FLINK-27195
Project: Flink
Issue Type: Bug
Components: Connectors / Kafka
Affects Versions: 1.14.4, 1.13.6
Reporter: Mason Chen

In the KafkaSourceReader, offsetsToCommit should be mutable. Currently, when no splits are assigned, the offsets to commit are initialized as an immutable empty map. However, it is possible to have splits assigned after a checkpoint is taken (e.g. newly discovered topic partitions). Therefore, the offsets to commit should be mutable (see the sketch below).

--
This message was sent by Atlassian Jira (v8.20.1#820001)
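A sketch of the fix idea with illustrative (not actual) field names: always hand out a mutable per-checkpoint map, so offsets for splits assigned after the checkpoint can still be recorded:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

// Illustrative field, keyed by checkpoint id.
private final Map<Long, Map<TopicPartition, OffsetAndMetadata>> offsetsToCommit =
        new ConcurrentHashMap<>();

Map<TopicPartition, OffsetAndMetadata> offsetsForCheckpoint(long checkpointId) {
    // Always a mutable map: previously an immutable Collections.emptyMap() was
    // stored when no splits were assigned, so a later put() for a newly
    // discovered partition would throw UnsupportedOperationException.
    return offsetsToCommit.computeIfAbsent(checkpointId, id -> new HashMap<>());
}
```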
[jira] [Created] (FLINK-27479) HybridSource refreshes availability future
Mason Chen created FLINK-27479:
--
Summary: HybridSource refreshes availability future
Key: FLINK-27479
URL: https://issues.apache.org/jira/browse/FLINK-27479
Project: Flink
Issue Type: Improvement
Components: Connectors / Common
Affects Versions: 1.14.4
Reporter: Mason Chen
Fix For: 1.15.1

HybridSourceReader needs to refresh the availability future according to the underlying reader. It currently maintains its own future and completes it after the sub-reader's availability future is complete. However, the implementation does not refresh the future again until the reader receives a switch event. This can cause a tight loop with the Flink runtime repeatedly invoking pollNext() and high CPU utilization. To solve this, we can reuse the MultipleFuturesAvailabilityHelper to manage the lifecycle of the availability future.

--
This message was sent by Atlassian Jira (v8.20.7#820007)
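A rough sketch of the intent, under the simplifying assumption that availability can be delegated directly to the current sub-reader; the actual fix described in the ticket reuses Flink's MultipleFuturesAvailabilityHelper to manage the future's lifecycle:

```java
import java.util.concurrent.CompletableFuture;

// Sketch: re-derive availability from the active sub-reader on every call,
// instead of caching a future that is completed once and then returned
// forever (which makes the runtime spin on pollNext()).
@Override
public CompletableFuture<Void> isAvailable() {
    return currentReader.isAvailable();
}
```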
[jira] [Created] (FLINK-25697) Support HTTPS for Prometheus Push Gateway
Mason Chen created FLINK-25697:
--
Summary: Support HTTPS for Prometheus Push Gateway
Key: FLINK-25697
URL: https://issues.apache.org/jira/browse/FLINK-25697
Project: Flink
Issue Type: Improvement
Components: Runtime / Metrics
Affects Versions: 1.14.3, 1.13.5
Reporter: Mason Chen

Currently, the Prometheus PushGateway reporter only supports HTTP endpoints. I am proposing a configuration, similar to the InfluxDB reporter (https://issues.apache.org/jira/browse/FLINK-12336), that would allow users to configure an HTTPS endpoint.

--
This message was sent by Atlassian Jira (v8.20.1#820001)
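Expressed as reporter configuration, the proposal could look roughly like the sketch below; the exact key for an HTTPS endpoint (a URL-style option versus a separate scheme flag) is an assumption, since the ticket only proposes the capability:

```java
import org.apache.flink.configuration.Configuration;

// Sketch of the proposed reporter setup; the hostUrl-style key is assumed.
Configuration conf = new Configuration();
conf.setString(
        "metrics.reporter.promgateway.class",
        "org.apache.flink.metrics.prometheus.PrometheusPushGatewayReporter");
conf.setString(
        "metrics.reporter.promgateway.hostUrl",
        "https://pushgateway.example.com:9091"); // proposed: HTTPS scheme support
```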
[jira] [Created] (FLINK-23519) Aggregate State Backend Latency by State Level
Mason Chen created FLINK-23519:
--
Summary: Aggregate State Backend Latency by State Level
Key: FLINK-23519
URL: https://issues.apache.org/jira/browse/FLINK-23519
Project: Flink
Issue Type: Improvement
Components: Runtime / Metrics, Runtime / State Backends
Affects Versions: 1.13.0
Reporter: Mason Chen

To make metrics aggregation easier, there should be a config option analogous to RocksDB's `state.backend.rocksdb.metrics.column-family-as-variable`, which enables aggregation across column families ([https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/deployment/config/#state-backend-rocksdb-metrics-column-family-as-variable]). In the case of state backend latency, the variable exposed would be the state level instead of the column family. This makes it easier to aggregate across the various state levels that are reported.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-24622) Unified sources do not close scheduled threads from SplitEnumeratorContext#callAsync()
Mason Chen created FLINK-24622:
--
Summary: Unified sources do not close scheduled threads from SplitEnumeratorContext#callAsync()
Key: FLINK-24622
URL: https://issues.apache.org/jira/browse/FLINK-24622
Project: Flink
Issue Type: Bug
Components: Connectors / Common
Affects Versions: 1.13.2
Reporter: Mason Chen

From the user mailing list:

I was wondering how to cancel a task that is enqueued by the callAsync() method, the one that takes in a time interval. For example, the KafkaSource uses this for topic partition discovery. It would be straightforward if the API returned the underlying future so that a process can cancel it. For Kafka, the enumerator shutdown seems to be unclean, since it only closes the admin client and the Kafka consumer but not the topic partition discovery task. Furthermore, exceptions from that task will cause job failure, which can happen if the task is still running after the admin client is closed. How can we address this?

This seems to be a bug with the current KafkaSource and also the unified sources in general. Can you open a bug ticket in Jira? I think the enumerator should first take care of joining all the async threads before closing the enumerator.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
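For context, a sketch of how an enumerator schedules periodic discovery; the callAsync signature matches SplitEnumeratorContext, while the surrounding method and field names are illustrative:

```java
// Inside a split enumerator: schedule periodic topic partition discovery.
context.callAsync(
        this::discoverTopicPartitions,   // runs on an internal worker pool
        this::handlePartitionDiscovery,  // (result, throwable) callback
        0L,
        partitionDiscoveryIntervalMs);

// No handle or future is returned, so close() cannot cancel or join the
// periodic task; it can still fire (and fail) after the admin client and
// consumer have already been closed.
```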
[jira] [Created] (FLINK-24660) Allow setting KafkaSubscriber in KafkaSourceBuilder
Mason Chen created FLINK-24660:
--
Summary: Allow setting KafkaSubscriber in KafkaSourceBuilder
Key: FLINK-24660
URL: https://issues.apache.org/jira/browse/FLINK-24660
Project: Flink
Issue Type: Improvement
Components: Connectors / Kafka
Affects Versions: 1.13.3, 1.14.0
Reporter: Mason Chen

Some users may have a different mechanism for subscribing to the set of topics/partitions. The builder could allow custom user implementations of KafkaSubscriber (see the sketch below).

--
This message was sent by Atlassian Jira (v8.3.4#803005)
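A sketch of the proposed builder hook; `setKafkaSubscriber` is an assumed method name, and `MyCustomSubscriber` stands in for a user implementation of the KafkaSubscriber interface:

```java
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.source.KafkaSource;

// Sketch: plug in a custom subscriber instead of the built-in
// topic/pattern/partition subscription modes.
KafkaSource<String> source =
        KafkaSource.<String>builder()
                .setBootstrapServers("localhost:9092")
                .setKafkaSubscriber(new MyCustomSubscriber()) // proposed API
                .setValueOnlyDeserializer(new SimpleStringSchema())
                .build();
```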
[jira] [Created] (FLINK-24805) Metrics documentation missing numBytes metrics reported by tasks
Mason Chen created FLINK-24805:
--
Summary: Metrics documentation missing numBytes metrics reported by tasks
Key: FLINK-24805
URL: https://issues.apache.org/jira/browse/FLINK-24805
Project: Flink
Issue Type: Improvement
Components: Documentation, Runtime / Metrics
Affects Versions: 1.13.3, 1.14.0
Reporter: Mason Chen

The metrics documentation is missing descriptions for:
1. numBytesIn/Out
2. numBytes[In/Out]PerSecond

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (FLINK-28722) Hybrid Source should use .equals() for Integer comparison
Mason Chen created FLINK-28722:
--
Summary: Hybrid Source should use .equals() for Integer comparison
Key: FLINK-28722
URL: https://issues.apache.org/jira/browse/FLINK-28722
Project: Flink
Issue Type: Improvement
Components: Connectors / Common
Affects Versions: 1.15.1
Reporter: Mason Chen
Fix For: 1.16.0, 1.15.2

HybridSource should use .equals() for Integer comparison when filtering out the underlying sources. Using reference comparison causes the HybridSource to stop working when it hits the 128th source (anything past 127 sources fails), because `==` on boxed Integers only succeeds for values in the JVM's integer cache of -128 to 127 (illustrated below).

https://github.com/apache/flink/blob/release-1.14.3-rc1/flink-connectors/flink-connector-base/src/main/java/org/apache/flink/connector/base/source/hybrid/HybridSourceSplitEnumerator.java#L358

A user reported this issue here: https://lists.apache.org/thread/7h2rblsdt7rjf85q9mhfht77bghtbswh

--
This message was sent by Atlassian Jira (v8.20.10#820010)
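A self-contained demonstration of the underlying Java behavior (a standard fact about boxed Integer caching, not Flink-specific code):

```java
public class IntegerCacheDemo {
    public static void main(String[] args) {
        Integer a = 127, b = 127;
        System.out.println(a == b);      // true: both come from the integer cache

        Integer c = 128, d = 128;
        System.out.println(c == d);      // false: two distinct boxed objects
        System.out.println(c.equals(d)); // true, which is why .equals() is the fix
    }
}
```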
[jira] [Created] (FLINK-29515) Document KafkaSource behavior with deleted topics
Mason Chen created FLINK-29515:
--
Summary: Document KafkaSource behavior with deleted topics
Key: FLINK-29515
URL: https://issues.apache.org/jira/browse/FLINK-29515
Project: Flink
Issue Type: Improvement
Components: Connectors / Kafka, Documentation
Affects Versions: 1.17.0
Reporter: Mason Chen

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-29856) Triggering savepoint does not trigger source operator checkpoint
Mason Chen created FLINK-29856:
--
Summary: Triggering savepoint does not trigger source operator checkpoint
Key: FLINK-29856
URL: https://issues.apache.org/jira/browse/FLINK-29856
Project: Flink
Issue Type: Improvement
Components: Runtime / Checkpointing
Affects Versions: 1.16.0
Reporter: Mason Chen

When I trigger a savepoint with the Flink K8s operator, I verified that two sources (KafkaSource and MultiClusterKafkaSource) do not invoke snapshotState or notifyCheckpointComplete. This is easily reproducible in a simple pipeline (e.g. KafkaSource -> print). However, when a checkpoint occurs via the configured interval, I do see the sources checkpointing properly, with the expected logs in the output.

--
This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (FLINK-30200) TimestampOffsetsInitializer allows configurable OffsetResetStrategy
Mason Chen created FLINK-30200:
--
Summary: TimestampOffsetsInitializer allows configurable OffsetResetStrategy
Key: FLINK-30200
URL: https://issues.apache.org/jira/browse/FLINK-30200
Project: Flink
Issue Type: Improvement
Components: Connectors / Kafka
Affects Versions: 1.16.0
Reporter: Mason Chen

Currently, the TimestampOffsetsInitializer defaults to `LATEST`, and it would be beneficial to allow the user to customize this to `EARLIEST` or `NONE` (see the sketch below). This was brought up in PR review: https://github.com/apache/flink/pull/20370#discussion_r991928769

--
This message was sent by Atlassian Jira (v8.20.10#820010)
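A sketch contrasting the current factory with the proposed flexibility; the two-argument overload shown is the proposal under discussion, not existing API at the time of this ticket:

```java
import org.apache.flink.connector.kafka.source.enumerator.initializer.OffsetsInitializer;
import org.apache.kafka.clients.consumer.OffsetResetStrategy;

// Today: timestamp initializer with the hard-coded LATEST reset strategy.
OffsetsInitializer current = OffsetsInitializer.timestamp(1668000000000L);

// Proposed: let the caller choose the reset strategy (assumed overload).
OffsetsInitializer proposed =
        OffsetsInitializer.timestamp(1668000000000L, OffsetResetStrategy.EARLIEST);
```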