[jira] [Created] (KAFKA-17992) Remove `getUnderlying` and `isKRaftTest` from ClusterInstance

2024-11-12 Thread Chia-Ping Tsai (Jira)
Chia-Ping Tsai created KAFKA-17992:
--

 Summary: Remove `getUnderlying` and `isKRaftTest` from 
ClusterInstance
 Key: KAFKA-17992
 URL: https://issues.apache.org/jira/browse/KAFKA-17992
 Project: Kafka
  Issue Type: Improvement
Reporter: Chia-Ping Tsai
Assignee: Chia-Ping Tsai


They are unused.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-17991) Timed calls to future.get in DefaultStatePersister and test improvements

2024-11-12 Thread Sushant Mahajan (Jira)
Sushant Mahajan created KAFKA-17991:
---

 Summary: Timed calls to future.get in DefaultStatePersister and 
test improvements
 Key: KAFKA-17991
 URL: https://issues.apache.org/jira/browse/KAFKA-17991
 Project: Kafka
  Issue Type: Sub-task
Reporter: Sushant Mahajan








[jira] [Resolved] (KAFKA-17314) Fix the typo: `maxlifeTimeMs`

2024-11-12 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai resolved KAFKA-17314.

Fix Version/s: 4.0.0
   Resolution: Fixed

> Fix the typo: `maxlifeTimeMs`
> -
>
> Key: KAFKA-17314
> URL: https://issues.apache.org/jira/browse/KAFKA-17314
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Chia-Ping Tsai
>Assignee: 黃竣陽
>Priority: Minor
>  Labels: kip
> Fix For: 4.0.0
>
>
> The typo is in our public APIs [0], so this change requires a KIP ...
> [0] 
> https://github.com/apache/kafka/blob/49fc14f6116a697550339a8804177bd9290d15db/clients/src/main/java/org/apache/kafka/clients/admin/CreateDelegationTokenOptions.java#L56
>  
>  
>  





Re: [DISCUSS] KIP-1104: Allow Foreign Key Extraction from Both Key and Value in KTable Joins

2024-11-12 Thread Bill Bejeck
Hi Peter,

It's important that we don't break compatibility.
We faced a similar situation in KIP-149 when we provided access to the key
in mapping and joining. I think it's worth
exploring Lucas's suggestion of using a `BiFunction`.
But if for some reason it won't work (I don't see why it wouldn't) we'll
need to maintain compatibility and go with renaming the new method.

Thanks,
Bill

On Tue, Nov 12, 2024 at 8:06 AM Lucas Brutschy
 wrote:

> Hi,
>
> 1. I don't think we can/should break backwards compatibility.
> 2. Have you considered using `BiFunction<K, V, KO> foreignKeyExtractor`?
> This should work without renaming the method.
> 3. I don't see the benefit of deprecating it. I agree, we wouldn't add
> both overloads normally, but the value-only overload is useful /
> correct as it is, I wouldn't break users' code just to "clean" things
> very slightly.
>
> Cheers,
> Lucas
>
> On Mon, Nov 11, 2024 at 6:43 PM Chu Cheng Li  wrote:
> >
> > Hi all,
> >
> > I'm working on enhancing the KTable foreign key join API to allow
> > extracting foreign keys from both key and value. However, I've
> encountered
> > a Java type erasure issue that affects our API design choices.
> >
> > Current Situation:
> > - We want to allow foreign key extraction from both key and value
> > - We'd like to maintain backward compatibility with existing value-only
> > extractors
> > - Initial attempt was to use method overloading
> >
> > The Issue:
> > Due to Java type erasure, we can't have both:
> > - Function<V, KO> foreignKeyExtractor
> > - Function<KeyValue<K, V>, KO> foreignKeyExtractor
> > in method overloads as they become identical at runtime.
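A minimal sketch of the clash and the BiFunction workaround discussed above. The class and method names below are illustrative stand-ins, not the actual Kafka Streams API:

```java
import java.util.function.BiFunction;

public class ErasureDemo {
    // These two overloads cannot coexist: after type erasure both signatures
    // become join(Function), so the compiler rejects the pair.
    //   <KO> void join(Function<V, KO> foreignKeyExtractor)
    //   <KO> void join(Function<KeyValue<K, V>, KO> foreignKeyExtractor)
    //
    // A BiFunction avoids the collision because its raw type differs from
    // Function, so the erased signatures no longer clash.
    static <K, V, KO> KO extract(BiFunction<K, V, KO> extractor, K key, V value) {
        return extractor.apply(key, value);
    }

    public static void main(String[] args) {
        String fk = extract((k, v) -> k + ":" + v, "user-1", "order-42");
        System.out.println(fk); // user-1:order-42
    }
}
```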
> >
> > Options:
> > 1. Breaking Change Approach
> >- Replace value-only extractor with key-value extractor
> >- Clean API but breaks backward compatibility
> >
> > 2. Compatibility Approach
> >- Keep both capabilities through different method names (e.g.,
> > join/joinWithKey)
> >- Maintains compatibility but less elegant API
> >
> > Questions for Discussion:
> > 1. Should we prioritize API elegance or backward compatibility?
> > 2. If we choose compatibility, which naming convention should we use for
> > the new methods?
> > 3. Should we consider deprecating the value-only extractor in a future
> > release?
> >
> > Looking forward to your thoughts.
> >
> > Best regards,
> > Peter
> >
> > On Thu, Oct 31, 2024 at 1:08 PM Chu Cheng Li 
> wrote:
> >
> > > Hi Everyone,
> > > I would like to start a discussion on KIP-1104:
> > >
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1103%3A+Additional+metrics+for+cooperative+consumption
> > > 
> > >
> > > This KIP allows foreign key extraction from both key and value in KTable
> > > joins. Before this KIP, users could only extract the foreign key from the
> > > record's value; this KIP provides more flexibility.
> > >
> > > Regards,
> > > Peter Lee
> > >
> > >
> > >
>


Re: [DISCUSS] KIP-1050: Consistent error handling for Transactions

2024-11-12 Thread Kaushik Raina
Thanks Lianet for review

LM1 & LM2:
We will extend parent classes only to maintain the hierarchy

For TopicAuthorizationException and GroupAuthorizationException, we will
extend parent class AuthorizationException. So new hierarchy will be
"AuthorizationException < InvalidConfigurationException < ApiException"

For UnknownTopicOrPartitionException and NotLeaderOrFollowerException, we
will extend parent class InvalidMetadataException. So new hierarchy will be
"InvalidMetadataException < RefreshRetriableException < RetriableException
< ApiException"
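To make the compatibility concern behind these hierarchy changes concrete, here is a self-contained sketch. The classes below are stand-ins, not the real org.apache.kafka.common.errors types: application code keyed on instanceof keeps working only while subclasses stay under the expected parent.

```java
// Stand-in exception classes mirroring the discussed hierarchy;
// illustrative only, not Kafka's actual exception types.
class ApiException extends RuntimeException {}
class AuthorizationException extends ApiException {}
class TopicAuthorizationException extends AuthorizationException {}

public class HierarchyDemo {
    // A typical application-side handler that branches on instanceof.
    static String classify(ApiException e) {
        if (e instanceof AuthorizationException) {
            return "authorization"; // matches only while the parent is unchanged
        }
        return "other";
    }

    public static void main(String[] args) {
        // Re-parenting TopicAuthorizationException away from
        // AuthorizationException would silently change this to "other".
        System.out.println(classify(new TopicAuthorizationException()));
    }
}
```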


LM3:
CommitFailedException will not be extended because its parent class
KafkaException is expected to be treated as application recoverable as
mentioned in KIP.


I have added a `comment` section in the KIP-1050 table to include LM1 & LM2
& LM3 details.


LM4: StaleMemberEpochException is thrown as IllegalGenerationException in
Txn:
https://github.com/apache/kafka/blob/trunk/group-coordinator/src/main/java/org/apache/kafka/coordinator/group/OffsetMetadataManager.java#L367-L379
so it need not be part of the KIP.

On Thu, Nov 7, 2024 at 1:05 AM Lianet M.  wrote:

> Hello, sorry for the late reply, took another pass after the latest
> updates. Most of the proposed groupings and handling seem good and very
> helpful indeed from the producer perspective. I just have some more
> comments related to the changes in the exception hierarchy, that are not
> scoped to the producer:
>
>
> LM1 - Regarding TopicAuthorizationException and
> GroupAuthorizationException. We’re proposing to change their parent class
> to InvalidConfigurationException, which extends ApiException (vs their
> current parent AuthorizationException). This would be a breaking change for
> apps expecting/handling instanceOf AuthorizationException (and an
> unexpected one I would say, given that TopicAuthorizationException and
> GroupAuthorizationException are indeed “authorization errors”).
>
>
> LM2 - Similarly, regarding UnknownTopicOrPartitionException and
> NotLeaderOrFollowerException: we’re proposing to change parent class to
> RefreshRetriableException (which extends Retriable), but that would mean
> that these exceptions wouldn’t be instanceOf InvalidMetadataException
> anymore (as they are today). This would be a breaking change for client
> apps dealing with instanceOf InvalidMetadata right?
>
>
> LM3 - Regarding CommitFailedException, I’m simply concerned that the move
> may not be conceptually right? We’re proposing to change its parent class
> to ApplicationRecoverableException, which would make CommitFailedException
> land under the ApiException umbrella. ApiException is intended for errors
> that are part of the protocol, and this CommitFailedException is not (it’s
> just an exception generated on the client side based on different
> protocol-level errors). Not that it will have an impact on existing apps
> given that it would still be a KafkaException as it is today, but wanted to
> point this out.
>
>
> LM4 - Under ApplicationRecoverableException, I wonder if we should also
> consider StaleMemberEpochException (just for consistency, alongside
> FencedInstaceIdException, UnknownMemberIdException,...)
>
>
> Thanks!
>
>
> Lianet
>
> On Thu, Oct 24, 2024 at 6:25 AM Kaushik Raina  >
> wrote:
>
> > Thanks Artem for valuable comments
> >
> > I have incorporated them into the updated KIP.
> >
> >
> > On Sat, Oct 19, 2024 at 3:31 AM Artem Livshits
> >  wrote:
> >
> > > Hi Kaushik,
> > >
> > > Thank you for the KIP!  I think it'll make writing transactional
> > > application easier and less error prone.  I have a couple comments:
> > >
> > > AL1.  The KIP proposes changing the semantics
> > > of UnknownProducerIdException.  Currently, this error is never returned
> > by
> > > the broker so we cannot validate whether changing semantics is going to
> > be
> > > compatible with a future use (if we ever return it again) and the
> future
> > > broker wouldn't know which semantics the client supports.  I think for
> > now
> > > we could just add a comment in the code that this error is never
> returned
> > > by the broker and if we ever add it to the broker we'd need to make
> sure
> > > it's categorized properly and bump the API version so that the broker
> > knows
> > > if the client supports required semantics.
> > >
> > > AL2. The exception classification looks good to me from the producer
> > > perspective: we have retriable, abortable, etc. categories.  The
> > retriable
> > > errors can be retried within producer safely without losing
> idempotence,
> > > the messages would be retried with the same sequence numbers and we
> won't
> > > have duplicates.  However, if a retriable error (e.g. TimeoutException)
> > is
> > > thrown to the application, it's not safe to retry the produce, because
> > the
> > > messages will get new sequence numbers and if the original message in
> > fact
> > > succeeded, the new messages would become duplicates, losing the exactly
> > > once semantics.  Thus, if a retria

[jira] [Resolved] (KAFKA-17962) test_pause_and_resume_sink fails with "Failed to consume messages after resuming sink connector" with CONSUMER group protocol

2024-11-12 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True resolved KAFKA-17962.
---
Resolution: Cannot Reproduce

> test_pause_and_resume_sink fails with "Failed to consume messages after 
> resuming sink connector" with CONSUMER group protocol
> -
>
> Key: KAFKA-17962
> URL: https://issues.apache.org/jira/browse/KAFKA-17962
> Project: Kafka
>  Issue Type: Test
>  Components: clients, consumer, system tests
>Affects Versions: 3.9.0
>Reporter: Kirk True
>Assignee: Kirk True
>Priority: Major
>  Labels: kip-848-client-support
> Fix For: 4.0.0
>
>
> The following failure is consistently seen when running 
> {{{}test_pause_and_resume_sink.connect_protocol{}}}:
> {noformat}
> TimeoutError('Failed to consume messages after resuming sink connector')
> Traceback (most recent call last):
>   File 
> "/home/semaphore/kafka-overlay/kafka/venv/lib/python3.8/site-packages/ducktape/tests/runner_client.py",
>  line 351, in _do_run
> data = self.run_test()
>   File 
> "/home/semaphore/kafka-overlay/kafka/venv/lib/python3.8/site-packages/ducktape/tests/runner_client.py",
>  line 411, in run_test
> return self.test_context.function(self.test)
>   File 
> "/home/semaphore/kafka-overlay/kafka/venv/lib/python3.8/site-packages/ducktape/mark/_mark.py",
>  line 438, in wrapper
> return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
>   File 
> "/home/semaphore/kafka-overlay/kafka/tests/kafkatest/tests/connect/connect_distributed_test.py",
>  line 415, in test_pause_and_resume_sink
> wait_until(lambda: len(self.sink.received_messages()) > num_messages, 
> timeout_sec=30,
>   File 
> "/home/semaphore/kafka-overlay/kafka/venv/lib/python3.8/site-packages/ducktape/utils/util.py",
>  line 58, in wait_until
> raise TimeoutError(err_msg() if callable(err_msg) else err_msg) from 
> last_exception
> ducktape.errors.TimeoutError: Failed to consume messages after resuming sink 
> connector
> {noformat}
> Parameters on fail:
>  * connect_protocol=compatible
>  * metadata_quorum=ISOLATED_KRAFT
>  * use_new_coordinator=True
>  * group_protocol=consumer





[jira] [Resolved] (KAFKA-17681) Fix unstable consumer_test.py#test_fencing_static_consumer

2024-11-12 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True resolved KAFKA-17681.
---
Resolution: Cannot Reproduce

> Fix unstable consumer_test.py#test_fencing_static_consumer
> --
>
> Key: KAFKA-17681
> URL: https://issues.apache.org/jira/browse/KAFKA-17681
> Project: Kafka
>  Issue Type: Bug
>  Components: clients, consumer, system tests
>Reporter: Chia-Ping Tsai
>Assignee: Chia-Ping Tsai
>Priority: Major
>  Labels: flaky-test, kip-848-client-support
> Fix For: 4.0.0
>
>
> {code:java}
> AssertionError('Static consumers attempt to join with instance id in use 
> should not cause a rebalance.')
> Traceback (most recent call last):
>   File 
> "/usr/local/lib/python3.9/dist-packages/ducktape/tests/runner_client.py", 
> line 186, in _do_run
> data = self.run_test()
>   File 
> "/usr/local/lib/python3.9/dist-packages/ducktape/tests/runner_client.py", 
> line 246, in run_test
> return self.test_context.function(self.test)
>   File "/usr/local/lib/python3.9/dist-packages/ducktape/mark/_mark.py", line 
> 433, in wrapper
> return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
>   File "/opt/kafka-dev/tests/kafkatest/tests/client/consumer_test.py", line 
> 359, in test_fencing_static_consumer
> assert num_rebalances == consumer.num_rebalances(), "Static consumers 
> attempt to join with instance id in use should not cause a rebalance. before: 
> " + str(num_rebalances) + " after: " + str(consumer.num_rebalances())
> AssertionError: Static consumers attempt to join with instance id in use 
> should not cause a rebalance.
> {code}





[jira] [Created] (KAFKA-18003) add test to make sure `Admin#deleteRecords` can handle the corrupted records

2024-11-12 Thread Chia-Ping Tsai (Jira)
Chia-Ping Tsai created KAFKA-18003:
--

 Summary: add test to make sure `Admin#deleteRecords` can handle 
the corrupted records
 Key: KAFKA-18003
 URL: https://issues.apache.org/jira/browse/KAFKA-18003
 Project: Kafka
  Issue Type: Sub-task
Reporter: Chia-Ping Tsai
Assignee: Chia-Ping Tsai


1. produce some records to create one inactive segment
2. manually corrupt the data in the inactive segment file
3. verify that consuming all records fails due to the corrupted record
4. call `Admin#deleteRecords` to skip the corrupted record
5. verify that consuming all records now succeeds





[jira] [Created] (KAFKA-17999) Fix flaky DynamicConnectionQuotaTest testDynamicConnectionQuota

2024-11-12 Thread David Arthur (Jira)
David Arthur created KAFKA-17999:


 Summary: Fix flaky DynamicConnectionQuotaTest 
testDynamicConnectionQuota
 Key: KAFKA-17999
 URL: https://issues.apache.org/jira/browse/KAFKA-17999
 Project: Kafka
  Issue Type: Test
Reporter: David Arthur


https://ge.apache.org/scans/tests?search.names=CI%20workflow%2CGit%20repository&search.relativeStartTime=P28D&search.rootProjectNames=kafka&search.tags=github%2Ctrunk&search.tasks=test&search.timeZoneId=America%2FNew_York&search.values=CI%2Chttps:%2F%2Fgithub.com%2Fapache%2Fkafka&tests.container=kafka.network.DynamicConnectionQuotaTest&tests.test=testDynamicConnectionQuota(String)%5B1%5D





[jira] [Created] (KAFKA-18000) Fix flaky ReplicaManager#testSuccessfulBuildRemoteLogAuxStateMetrics

2024-11-12 Thread David Arthur (Jira)
David Arthur created KAFKA-18000:


 Summary: Fix flaky 
ReplicaManager#testSuccessfulBuildRemoteLogAuxStateMetrics
 Key: KAFKA-18000
 URL: https://issues.apache.org/jira/browse/KAFKA-18000
 Project: Kafka
  Issue Type: Test
Reporter: David Arthur


https://ge.apache.org/scans/tests?search.names=CI%20workflow%2CGit%20repository&search.relativeStartTime=P28D&search.rootProjectNames=kafka&search.tags=github%2Ctrunk&search.tasks=test&search.timeZoneId=America%2FNew_York&search.values=CI%2Chttps:%2F%2Fgithub.com%2Fapache%2Fkafka&tests.container=kafka.server.ReplicaManagerTest&tests.test=testSuccessfulBuildRemoteLogAuxStateMetrics()





Re: [DISCUSS] KIP-1106: Add duration based offset reset option for consumer clients

2024-11-12 Thread Matthias J. Sax

Thanks for updating the KIP.



I am happy to see that we seem to align to use a single config only :)

Obviously, I need to bikeshed on the format: `BACK:<duration>` does not
read well IMHO, and I think `auto.offset.reset="BACK_BY:<duration>"`
would read much better. Andrew's suggestion of `BY_DURATION:<duration>`
might even be better (but I would be ok with either one).


(I would not use `DURATION:<duration>` personally -- similar to
`BACK:<duration>`, it does not read well from my POV).




About KS related changes. Overall LGTM. What is the reason for having

  AutoOffsetReset.LATEST
  AutoOffsetReset.EARLIEST

as `public` variables? Seems we don't need them, but that they are 
rather an internal implementation detail?


In general, I would recommend to omit everything `private` on the KIP 
(including method implementations), as it's not user facing, but only 
have method signatures in the KIP.


What I believe we need to add is a `protected` constructor (for the 
internal sub-class):


   protected AutoOffsetReset(AutoOffsetReset autoOffsetReset);




-Matthias

On 11/12/24 6:09 AM, Apoorv Mittal wrote:

Thanks Manikumar for explaining, sounds good to me.

Regards,
Apoorv Mittal


On Tue, Nov 12, 2024 at 1:47 PM Andrew Schofield <
andrew_schofield_j...@outlook.com> wrote:


Hi,
Looks good now. Just one suggestion.

AS8: Instead of "back:30D", I wonder whether the word 'duration' ought to
be
used to be consistent with kafka-consumer-groups.sh. So,
"by-duration:P3D" or "duration:P3D" might be better.

The overall idea of merging the configs into one config is fine in the
current
text of the KIP.

Thanks,
Andrew


From: Manikumar 
Sent: 12 November 2024 13:30
To: dev@kafka.apache.org 
Subject: Re: [DISCUSS] KIP-1106: Add duration based offset reset option
for consumer clients

Hi Apoorv,

AM7: AutoOffsetReset.java is for Kafka Streams API. I am not proposing any
public Interface/class for Kafka Consumer.
As mentioned in the KIP, even though OffsetResetStrategy is a public class,
it's not used in any public APIs. I think new internal classes should be
sufficient.

AM8: Fixed

AM9: This class is for kafka streams API

AM10: I was using the EARLIEST_TIMESTAMP, LATEST_TIMESTAMP constant values
from ListOffsetsRequest:
https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/common/requests/ListOffsetsRequest.java#L41

But yes, we can use Optional in the AutoOffsetReset class. Updated the KIP.
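A minimal sketch of the Optional-based shape suggested in AM10. The class and method names below are illustrative, not the KIP's final API: EARLIEST/LATEST carry no duration, so they are modelled with Optional.empty() rather than -1L/-2L sentinel timestamps.

```java
import java.time.Duration;
import java.util.Optional;

// Illustrative sketch only; not the actual Kafka Streams AutoOffsetReset class.
public final class AutoOffsetResetSketch {
    private final String strategy;
    private final Optional<Duration> duration;

    private AutoOffsetResetSketch(String strategy, Optional<Duration> duration) {
        this.strategy = strategy;
        this.duration = duration;
    }

    // earliest/latest need no duration, so no sentinel long is required.
    public static AutoOffsetResetSketch earliest() {
        return new AutoOffsetResetSketch("earliest", Optional.empty());
    }

    public static AutoOffsetResetSketch latest() {
        return new AutoOffsetResetSketch("latest", Optional.empty());
    }

    public static AutoOffsetResetSketch byDuration(Duration duration) {
        return new AutoOffsetResetSketch("by_duration", Optional.of(duration));
    }

    public String strategy() { return strategy; }

    public Optional<Duration> duration() { return duration; }
}
```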


Thanks.



On Tue, Nov 12, 2024 at 6:15 PM Apoorv Mittal 
wrote:


Hi,
I read the changes for single configuration and deprecated
OffsetResetStrategy.java.

AM7: Question: The KIP says that the previously supported values were
earliest/latest/none and a new back:<duration> config would be added. We have
no definition of "none" in the newly introduced AutoOffsetReset.java class,
hence I am assuming that if "none" is specified as a config option then
that config will be ignored, correct? Or are we deprecating the usage of
"none" altogether?

AM8: Minor: The new class AutoOffsetReset under interfaces mentions the
name as OffsetResetStrategy.java. This requires correction.

AM9: Is the package name correct for AutoOffsetReset as
org.apache.kafka.streams, shouldn't it be under clients package?

AM10: What does -1L and -2L mean as long for latest and earliest in
AutoOffsetReset.java? Is it just a long placeholder which will never be
used elsewhere for latest and earliest? If yes then does it make sense to
keep long as Optional, and use Optional.empty() for latest and earliest?

Regards,
Apoorv Mittal


On Tue, Nov 12, 2024 at 12:06 PM Manikumar 
wrote:


Thanks Ismael and Lianet for the reviews.

Based on suggestions, I have updated the KIP in favour of having a
single config (auto.offset.reset).
I have also adopted Lianet's suggestion on naming.

auto.offset.reset=back:P3D -> reset back 3 days
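To make the semantics above concrete, here is a hedged sketch of how such a config value could be resolved: subtract the ISO-8601 duration from the current time and use the result for a timestamp-based offset lookup. The helper below is illustrative, not the actual implementation.

```java
import java.time.Duration;
import java.time.Instant;

public class DurationResetSketch {
    // Resolve e.g. "back:P3D" to a reset timestamp: parse the ISO-8601
    // duration after the colon and subtract it from "now".
    static long resetTimestampMs(String configValue, Instant now) {
        String iso = configValue.substring(configValue.indexOf(':') + 1); // "P3D"
        return now.minus(Duration.parse(iso)).toEpochMilli();
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2024-11-12T00:00:00Z");
        long ts = resetTimestampMs("back:P3D", now);
        System.out.println(Instant.ofEpochMilli(ts)); // 2024-11-09T00:00:00Z
    }
}
```

The resulting timestamp would then feed the same code path used today for timestamp-based offset lookups (a ListOffsets request).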


Let me know if there are any concerns.


Thanks,


On Sat, Nov 9, 2024 at 10:06 PM Lianet M.  wrote:


Hi all. Thanks Manikumar for the nice improvement, useful indeed.

I also lean towards having a single config given that it's all about the
reset policy; it seems all the same "what" (auto reset policy) and we are
just extending it with a new behaviour. Regarding the naming (agree on
duration being confusing), what about something to show that it's simply
about how far back to reset:

auto.offset.reset=EARLIEST
auto.offset.reset=BACK:P3D -> reset back 3 days (combined with ISO8601 it
seems really easy to read/understand from the config definition itself)

Something like that would definitely break consistency with the command
line tool argument "by_duration", but if it seems clearer we should
consider the tradeoff and not penalize the consumer API/configs.

Thanks!

Lianet


On Sat, Nov 9, 2024 at 2:47 AM Ismael Juma 

wrote:



Thanks for the KIP, this is useful. A comment below.

On Thu, Nov 7, 2024 at 4:51 PM Matthias J. Sax 

wrote:



I am personally not convinced that adding a new config
`auto.offset.res

[jira] [Reopened] (KAFKA-16949) System test test_dynamic_logging in connect_distributed_test is failing

2024-11-12 Thread Kirk True (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True reopened KAFKA-16949:
---

> System test test_dynamic_logging in connect_distributed_test is failing
> ---
>
> Key: KAFKA-16949
> URL: https://issues.apache.org/jira/browse/KAFKA-16949
> Project: Kafka
>  Issue Type: Bug
>  Components: connect
>Reporter: Sagar Rao
>Assignee: Sagar Rao
>Priority: Major
> Fix For: 3.9.0
>
>
> Noticed that the system test `test_dynamic_logging` in 
> `connect_distributed_test` is failing with the following error:
>  
> {code:java}
> [INFO  - 2024-05-08 21:11:06,638 - runner_client - log - lineno:310]: 
> RunnerClient: 
> kafkatest.tests.connect.connect_distributed_test.ConnectDistributedTest.test_dynamic_logging:
>  FAIL: AssertionError()
> Traceback (most recent call last):
>   File 
> "/home/jenkins/workspace/system-test-kafka-branch-builder/kafka/venv/lib/python3.7/site-packages/ducktape/tests/runner_client.py",
>  line 184, in _do_run
> data = self.run_test()
>   File 
> "/home/jenkins/workspace/system-test-kafka-branch-builder/kafka/venv/lib/python3.7/site-packages/ducktape/tests/runner_client.py",
>  line 262, in run_test
> return self.test_context.function(self.test)
>   File 
> "/home/jenkins/workspace/system-test-kafka-branch-builder/kafka/tests/kafkatest/tests/connect/connect_distributed_test.py",
>  line 500, in test_dynamic_logging
> assert self._loggers_are_set(new_level, request_time, namespace, 
> workers=[worker])
> AssertionError {code}
>  





Re: [DISCUSS] KIP-1104: Allow Foreign Key Extraction from Both Key and Value in KTable Joins

2024-11-12 Thread Matthias J. Sax

I can just second what Lucas and Bill said already.

1. We cannot break compatibility
2. BiFunction sounds like a good alternative
3. I would personally deprecate the existing method, but don't feel
strongly about it.



-Matthias


On 11/12/24 8:33 AM, Bill Bejeck wrote:

Hi Peter,

It's important that we don't break compatibility.
We faced a similar situation in KIP-149 when we provided access to the key
in mapping and joining. I think it's worth
exploring Lucas's suggestion of using a `BiFunction`.
But if for some reason it won't work (I don't see why it wouldn't) we'll
need to maintain compatibility and go with renaming the new method.

Thanks,
Bill

On Tue, Nov 12, 2024 at 8:06 AM Lucas Brutschy
 wrote:


Hi,

1. I don't think we can/should break backwards compatibility.
2. Have you considered using `BiFunction<K, V, KO> foreignKeyExtractor`?
This should work without renaming the method.
3. I don't see the benefit of deprecating it. I agree, we wouldn't add
both overloads normally, but the value-only overload is useful /
correct as it is, I wouldn't break users' code just to "clean" things
very slightly.

Cheers,
Lucas

On Mon, Nov 11, 2024 at 6:43 PM Chu Cheng Li  wrote:


Hi all,

I'm working on enhancing the KTable foreign key join API to allow
extracting foreign keys from both key and value. However, I've

encountered

a Java type erasure issue that affects our API design choices.

Current Situation:
- We want to allow foreign key extraction from both key and value
- We'd like to maintain backward compatibility with existing value-only
extractors
- Initial attempt was to use method overloading

The Issue:
Due to Java type erasure, we can't have both:
- Function<V, KO> foreignKeyExtractor
- Function<KeyValue<K, V>, KO> foreignKeyExtractor
in method overloads as they become identical at runtime.

Options:
1. Breaking Change Approach
- Replace value-only extractor with key-value extractor
- Clean API but breaks backward compatibility

2. Compatibility Approach
- Keep both capabilities through different method names (e.g.,
join/joinWithKey)
- Maintains compatibility but less elegant API

Questions for Discussion:
1. Should we prioritize API elegance or backward compatibility?
2. If we choose compatibility, which naming convention should we use for
the new methods?
3. Should we consider deprecating the value-only extractor in a future
release?

Looking forward to your thoughts.

Best regards,
Peter

On Thu, Oct 31, 2024 at 1:08 PM Chu Cheng Li 

wrote:



Hi Everyone,
I would like to start a discussion on KIP-1104:



https://cwiki.apache.org/confluence/display/KAFKA/KIP-1103%3A+Additional+metrics+for+cooperative+consumption



This KIP allows foreign key extraction from both key and value in KTable
joins. Before this KIP, users could only extract the foreign key from the
record's value; this KIP provides more flexibility.

Regards,
Peter Lee









[jira] [Resolved] (KAFKA-17978) StreamsUpgradeTest#test_rolling_upgrade_with_2_bounces system tests fail

2024-11-12 Thread PoAn Yang (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

PoAn Yang resolved KAFKA-17978.
---
Resolution: Fixed

> StreamsUpgradeTest#test_rolling_upgrade_with_2_bounces system tests fail
> 
>
> Key: KAFKA-17978
> URL: https://issues.apache.org/jira/browse/KAFKA-17978
> Project: Kafka
>  Issue Type: Test
>Reporter: PoAn Yang
>Assignee: Nicholas Telford
>Priority: Major
>
> Run `TC_PATHS="tests/kafkatest/tests/streams/streams_upgrade_test.py" 
> /bin/bash tests/docker/run_tests.sh` on trunk branch. The versions which can 
> support fk_joins can't pass `test_rolling_upgrade_with_2_bounces`.
>  
> {noformat}
> [INFO:2024-11-09 22:24:00,601]: Triggering test 10 of 19...
> [INFO:2024-11-09 22:24:00,611]: RunnerClient: Loading test {'directory': 
> '/opt/kafka-dev/tests/kafkatest/tests/streams', 'file_name': 
> 'streams_upgrade_test.py', 'cls_name': 'StreamsUpgradeTest', 'method_name': 
> 'test_rolling_upgrade_with_2_bounces', 'injected_args': {'from_version': 
> '3.4.1', 'metadata_quorum': 'COMBINED_KRAFT'}}
> [INFO:2024-11-09 22:24:00,619]: RunnerClient: 
> kafkatest.tests.streams.streams_upgrade_test.StreamsUpgradeTest.test_rolling_upgrade_with_2_bounces.from_version=3.4.1.metadata_quorum=COMBINED_KRAFT:
>  on run 1/1
> [INFO:2024-11-09 22:24:00,621]: RunnerClient: 
> kafkatest.tests.streams.streams_upgrade_test.StreamsUpgradeTest.test_rolling_upgrade_with_2_bounces.from_version=3.4.1.metadata_quorum=COMBINED_KRAFT:
>  Setting up...
> [INFO:2024-11-09 22:24:00,623]: RunnerClient: 
> kafkatest.tests.streams.streams_upgrade_test.StreamsUpgradeTest.test_rolling_upgrade_with_2_bounces.from_version=3.4.1.metadata_quorum=COMBINED_KRAFT:
>  Running...
> [INFO:2024-11-09 22:26:26,343]: RunnerClient: 
> kafkatest.tests.streams.streams_upgrade_test.StreamsUpgradeTest.test_rolling_upgrade_with_2_bounces.from_version=3.4.1.metadata_quorum=COMBINED_KRAFT:
>  Tearing down...
> [INFO:2024-11-09 22:27:47,017]: RunnerClient: 
> kafkatest.tests.streams.streams_upgrade_test.StreamsUpgradeTest.test_rolling_upgrade_with_2_bounces.from_version=3.4.1.metadata_quorum=COMBINED_KRAFT:
>  FAIL: TimeoutError("Never saw output 'processed [0-9]* records from 
> topic=data' on ducker@ducker07")
> Traceback (most recent call last):
>   File 
> "/usr/local/lib/python3.7/dist-packages/ducktape/tests/runner_client.py", 
> line 351, in _do_run
> data = self.run_test()
>   File 
> "/usr/local/lib/python3.7/dist-packages/ducktape/tests/runner_client.py", 
> line 411, in run_test
> return self.test_context.function(self.test)
>   File "/usr/local/lib/python3.7/dist-packages/ducktape/mark/_mark.py", line 
> 438, in wrapper
> return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
>   File 
> "/opt/kafka-dev/tests/kafkatest/tests/streams/streams_upgrade_test.py", line 
> 137, in test_rolling_upgrade_with_2_bounces
> self.do_stop_start_bounce(p, from_version[:-2], to_version, counter, 
> extra_properties)
>   File 
> "/opt/kafka-dev/tests/kafkatest/tests/streams/streams_upgrade_test.py", line 
> 402, in do_stop_start_bounce
> err_msg="Never saw output '%s' on " % self.processed_data_msg + 
> str(node.account))
>   File 
> "/usr/local/lib/python3.7/dist-packages/ducktape/cluster/remoteaccount.py", 
> line 754, in wait_until
> allow_fail=True) == 0, **kwargs)
>   File "/usr/local/lib/python3.7/dist-packages/ducktape/utils/util.py", line 
> 58, in wait_until
> raise TimeoutError(err_msg() if callable(err_msg) else err_msg) from 
> last_exception
> ducktape.errors.TimeoutError: Never saw output 'processed [0-9]* records from 
> topic=data' on ducker@ducker07
> [WARNING:2024-11-09 22:27:47,017]: RunnerClient: 
> kafkatest.tests.streams.streams_upgrade_test.StreamsUpgradeTest.test_rolling_upgrade_with_2_bounces.from_version=3.4.1.metadata_quorum=COMBINED_KRAFT:
>  Test requested 6 nodes, used only 5
> [INFO:2024-11-09 22:27:47,017]: RunnerClient: 
> kafkatest.tests.streams.streams_upgrade_test.StreamsUpgradeTest.test_rolling_upgrade_with_2_bounces.from_version=3.4.1.metadata_quorum=COMBINED_KRAFT:
>  Data: None
> [INFO:2024-11-09 22:27:47,124]: 
> ~
> [INFO:2024-11-09 22:27:47,125]: Triggering test 11 of 19...
> [INFO:2024-11-09 22:27:47,134]: RunnerClient: Loading test {'directory': 
> '/opt/kafka-dev/tests/kafkatest/tests/streams', 'file_name': 
> 'streams_upgrade_test.py', 'cls_name': 'StreamsUpgradeTest', 'method_name': 
> 'test_rolling_upgrade_with_2_bounces', 'injected_args': {'from_version': 
> '3.5.2', 'metadata_quorum': 'COMBINED_KRAFT'}}
> [INFO:2024-11-09 22:27:47,142]: RunnerClient: 
> kafkatest.tests.streams.streams_upgrade_test.StreamsUpgradeTest.test_rolling_upgrade_with_2_bounces.from_version=

[jira] [Created] (KAFKA-18004) Use 3.8 to run zk service

2024-11-12 Thread Chia-Ping Tsai (Jira)
Chia-Ping Tsai created KAFKA-18004:
--

 Summary: Use 3.8 to run zk service
 Key: KAFKA-18004
 URL: https://issues.apache.org/jira/browse/KAFKA-18004
 Project: Kafka
  Issue Type: Sub-task
Reporter: Chia-Ping Tsai
Assignee: Chia-Ping Tsai


We plan to remove all ZooKeeper-related code in version 4.0. However, some old 
brokers in the end-to-end tests still require ZooKeeper service, so we need to 
run the ZooKeeper service using the 3.x release instead of the dev branch.

Since version 3.9 is not available in the 
https://s3-us-west-2.amazonaws.com/kafka-packages repo, we can use version 3.8 
for now.

https://github.com/apache/kafka/blob/trunk/tests/kafkatest/services/zookeeper.py#L47



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-18002) Upgrade connect_distributed_test.py's test_exactly_once_source to support different group.protocol values

2024-11-12 Thread Kirk True (Jira)
Kirk True created KAFKA-18002:
-

 Summary: Upgrade connect_distributed_test.py's 
test_exactly_once_source to support different group.protocol values
 Key: KAFKA-18002
 URL: https://issues.apache.org/jira/browse/KAFKA-18002
 Project: Kafka
  Issue Type: Bug
  Components: clients, connect, consumer, system tests
Affects Versions: 3.9.0
Reporter: Kirk True
 Fix For: 4.0.0


The Connect system test for {{test_exactly_once_source}} needs to be updated to 
allow testing against different {{group.protocol}} values. The Connector that's 
started is not supplied with the group.protocol value from the test parameter, 
so it defaults to {{ConsumerConfig.DEFAULT_GROUP_PROTOCOL}}.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-18001) KafkaNetworkChannel missing UpdateRaftVoterRequestData logic

2024-11-12 Thread Alyssa Huang (Jira)
Alyssa Huang created KAFKA-18001:


 Summary: KafkaNetworkChannel missing UpdateRaftVoterRequestData 
logic
 Key: KAFKA-18001
 URL: https://issues.apache.org/jira/browse/KAFKA-18001
 Project: Kafka
  Issue Type: Bug
Affects Versions: 3.9.0, 3.9.1
Reporter: Alyssa Huang


buildRequest needs an if case for UpdateRaftVoterRequestData



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-16589) Consider removing `ClusterInstance#createAdminClient` since callers are not sure whether they need to call close

2024-11-12 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai resolved KAFKA-16589.

Resolution: Won't Fix

KAFKA-17922 refactor all helpers so we don't need to address this now.

> Consider removing `ClusterInstance#createAdminClient` since callers are not 
> sure whether they need to call close
> 
>
> Key: KAFKA-16589
> URL: https://issues.apache.org/jira/browse/KAFKA-16589
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Chia-Ping Tsai
>Assignee: PoAn Yang
>Priority: Minor
>
> Sometimes we close the admin created by `createAdminClient`, and sometimes we 
> don't. That is not a true problem since the `ClusterInstance` will call 
> `close` when stopping.
> However, that causes a lot of inconsistent code, and in fact it does not save 
> much time since creating an Admin is not hard work. We can get 
> `bootstrapServers` and `bootstrapControllers` from `ClusterInstance` easily.
>  
> {code:java}
> // before
> try (Admin admin = cluster.createAdminClient()) { }
> // after v0
> try (Admin admin = Admin.create(Collections.singletonMap(
> CommonClientConfigs.BOOTSTRAP_SERVERS_CONFIG, 
> cluster.bootstrapServers()))) { }
> {code}
> Personally, the `after` version is not verbose, but we can have alternatives: 
> `Map<String, Object> clientConfigs`.
>  
> {code:java}
> // after v1
> try (Admin admin = Admin.create(cluster.clientConfigs())) {}{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-17922) add helper to ClusterInstance to create client component

2024-11-12 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17922?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai resolved KAFKA-17922.

Fix Version/s: 4.0.0
   Resolution: Fixed

> add helper to ClusterInstance to create client component 
> -
>
> Key: KAFKA-17922
> URL: https://issues.apache.org/jira/browse/KAFKA-17922
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Chia-Ping Tsai
>Assignee: 黃竣陽
>Priority: Major
> Fix For: 4.0.0
>
>
> `ClusterInstance` can offer many meaningful default settings to build client 
> components easily. Additionally, it can simplify the configs once it supports 
> running with security enabled.
> The following helpers should be included
> 1. ClusterInstance#producer(Map<> overrides, serializer, serializer)
> 2. ClusterInstance#producer(serializer, serializer)
> 3. ClusterInstance#consumer(Map<> overrides, deserializer, deserializer)
> 4. ClusterInstance#consumer(deserializer, deserializer)
> 5. ClusterInstance#admin(Map)
> 6. ClusterInstance#admin()



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
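The helper list in KAFKA-17922 above boils down to a defaults-plus-overrides merge of client configs. A minimal, self-contained sketch of that pattern using plain maps — the class name, method name, and the single `bootstrap.services` default shown here are illustrative of the idea, not the actual `ClusterInstance` API:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the defaults-plus-overrides pattern implied by
// helpers such as ClusterInstance#admin(Map overrides). Not Kafka's API.
public class ClientConfigSketch {

    // Start from meaningful defaults, then let caller-supplied overrides win.
    public static Map<String, Object> clientConfigs(String bootstrapServers,
                                                    Map<String, Object> overrides) {
        Map<String, Object> configs = new HashMap<>();
        configs.put("bootstrap.servers", bootstrapServers);
        configs.putAll(overrides);
        return configs;
    }

    public static void main(String[] args) {
        Map<String, Object> cfg = clientConfigs("localhost:9092", Map.of("acks", "all"));
        assert cfg.get("bootstrap.servers").equals("localhost:9092");
        assert cfg.get("acks").equals("all");
        System.out.println("ok");
    }
}
```

The point of such helpers is exactly this merge order: test defaults first, caller overrides last, so a test can still replace any default it needs to.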


Re: [DISCUSS] KIP-1106: Add duration based offset reset option for consumer clients

2024-11-12 Thread Manikumar
Hi Matthias.

Thanks for the review

1. looks like the majority of us are leaning towards BY_DURATION naming. I
have updated the same in the KIP.

2. Thanks. Updated the KIP to remove private/internal implementation.


Thanks,

On Wed, Nov 13, 2024 at 6:28 AM Matthias J. Sax  wrote:

> Thanks for updating the KIP.
>
>
>
> I am happy to see that we seem to align to use a single config only :)
>
> Obviously, I need to bikeshed on the format: `BACK:<duration>` does not
> read well IMHO, and I think `auto.offset.reset="BACK_BY:<duration>"`
> would read much better. Andrew's suggestion of `BY_DURATION:<duration>`
> might even be better (but I would be ok with either one).
>
> (I would not use `DURATION:<duration>` personally -- similar to
> `BACK:<duration>` it does not read well from my POV).
>
>
>
> About KS related changes. Overall LGTM. What is the reason for having
>
>AutoOffsetReset.LATEST
>AutoOffsetReset.EARLIEST
>
> as `public` variables? Seems we don't need them, but that they are
> rather an internal implementation detail?
>
> In general, I would recommend to omit everything `private` on the KIP
> (including method implementations), as it's not user facing, but only
> have method signatures in the KIP.
>
> What I believe we need to add is a `protected` constructor (for the
> internal sub-class):
>
> protected AutoOffsetReset(AutoOffsetReset autoOffsetReset);
>
>
>
>
> -Matthias
>
> On 11/12/24 6:09 AM, Apoorv Mittal wrote:
> > Thanks Manikumar for explaining, sounds good to me.
> >
> > Regards,
> > Apoorv Mittal
> >
> >
> > On Tue, Nov 12, 2024 at 1:47 PM Andrew Schofield <
> > andrew_schofield_j...@outlook.com> wrote:
> >
> >> Hi,
> >> Looks good now. Just one suggestion.
> >>
> >> AS8: Instead of "back:30D", I wonder whether the word 'duration' ought
> to
> >> be
> >> used to be consistent with kafka-consumer-groups.sh. So,
> >> "by-duration:P3D" or "duration:P3D" might be better.
> >>
> >> The overall idea of merging the configs into one config is fine in the
> >> current
> >> text of the KIP.
> >>
> >> Thanks,
> >> Andrew
> >>
> >> 
> >> From: Manikumar 
> >> Sent: 12 November 2024 13:30
> >> To: dev@kafka.apache.org 
> >> Subject: Re: [DISCUSS] KIP-1106: Add duration based offset reset option
> >> for consumer clients
> >>
> >> Hi Apoorv,
> >>
> >> AM7: AutoOffsetReset.java is for Kafka Streams API. I am not proposing
> any
> >> public Interface/class for Kafka Consumer.
> >> As mentioned in the KIP, even though OffsetResetStrategy is a public
> class,
> >> it's not used in any public APIs. I think new internal classes should be
> >> sufficient.
> >>
> >> AM8: Fixed
> >>
> >> AM9: This class is for kafka streams API
> >>
> >> AM10: I was using the EARLIEST_TIMESTAMP, LATEST_TIMESTAMP constant
> values
> >> from ListOffsetsRequest
> >> <
> >>
> https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/common/requests/ListOffsetsRequest.java#L41
> >>>
> >>
> >> But yes, we can use Optional<Long> in AutoOffsetReset class. Updated the KIP.
> >>
> >>
> >> Thanks.
> >>
> >>
> >>
> >> On Tue, Nov 12, 2024 at 6:15 PM Apoorv Mittal
> >> wrote:
> >>
> >>> Hi,
> >>> I read the changes for single configuration and deprecated
> >>> OffsetResetStrategy.java.
> >>>
> >>> AM7: Question: The KIP says that previously supported values were
> >>> earliest/latest/none and a new back:<duration> config would be added. We
> >> have
> >>> no definition of "none" in the newly introduced AutoOffsetReset.java
> >> class
> >>> hence I am assuming that if "none" is specified as a config option then
> >>> that config will be ignored, correct? Or are we deprecating the usage
> of
> >>> "none" altogether?
> >>>
> >>> AM8: Minor: The new class AutoOffsetReset under interfaces mentions the
> >>> name as OffsetResetStrategy.java. This requires correction.
> >>>
> >>> AM9: Is the package name correct for AutoOffsetReset as
> >>> org.apache.kafka.streams, shouldn't it be under clients package?
> >>>
> >>> AM10: What does -1L and -2L mean as long for latest and earliest in
> >>> AutoOffsetReset.java? Is it just a long placeholder which will never be
> >>> used elsewhere for latest and earliest? If yes then does it make sense
> to
> >>> keep long as Optional<Long>, and use Optional.empty() for latest and
> earliest?
> >>>
> >>> Regards,
> >>> Apoorv Mittal
> >>>
> >>>
> >>> On Tue, Nov 12, 2024 at 12:06 PM Manikumar 
> >>> wrote:
> >>>
>  Thanks Ismael and Lianet for the reviews.
> 
>  Based on suggestions, I have updated the KIP to in favour of having a
>  single config (auto.offset.reset).
>  I have also adopted the Lianet's suggestion on naming.
> 
>  auto.offset.reset=back:P3D -> reset back 3 days
> 
> 
>  Let me know if there are any concerns.
> 
> 
>  Thanks,
> 
> 
>  On Sat, Nov 9, 2024 at 10:06 PM Lianet M.  wrote:
> >
> > Hi all. Thanks Manikumar for the nice improvement, useful indeed.
> >
> > I also lean towards having a single config given t
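The semantics discussed in this thread — resetting to (now − duration) — can be sketched with standard `java.time` parsing. A hypothetical, self-contained illustration; the class name, method name, and the `by_duration:` prefix are assumptions for the example, not the KIP's final API:

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical sketch: resolve an auto.offset.reset value such as
// "by_duration:P3D" to a reset timestamp relative to a given "now".
public class OffsetResetSketch {

    public static long resetTimestamp(String value, Instant now) {
        String prefix = "by_duration:";
        if (value.startsWith(prefix)) {
            // ISO-8601 duration, e.g. P3D = 3 days, PT1H = 1 hour
            Duration d = Duration.parse(value.substring(prefix.length()));
            return now.minus(d).toEpochMilli();
        }
        throw new IllegalArgumentException("unsupported reset value: " + value);
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2024-11-12T00:00:00Z");
        assert resetTimestamp("by_duration:P3D", now)
                == Instant.parse("2024-11-09T00:00:00Z").toEpochMilli();
        System.out.println("ok");
    }
}
```

Using the ISO-8601 format also explains the examples in the thread: `P3D` reads as "three days back from now", and the consumer would then list offsets for that timestamp.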

Re: [VOTE] KIP-1091: Improved Kafka Streams operator metrics

2024-11-12 Thread Bill Bejeck
All,

Quick update on KIP-1091.  In an offline discussion, it was brought up that
the existing JMX metric for the client state is named "state", so we'll
update the name for the JMX thread state metric to "state" as well.

Thanks,
Bill

On Mon, Nov 11, 2024 at 12:59 PM Bill Bejeck  wrote:

> Hi All,
>
> The vote is now closed.
> KIP-1091 has been accepted with 3 binding votes (Lucas, Matthias, and
> Sophie) and 1 non-binding vote (Apoorv).
>
> Thanks to everyone for participating.
>
> -Bill
>
> On Wed, Nov 6, 2024 at 11:40 PM Sophie Blee-Goldman 
> wrote:
>
>> +1 (binding)
>>
>> thanks Bill
>>
>> On Wed, Nov 6, 2024 at 4:44 PM Matthias J. Sax  wrote:
>>
>> > +1 (binding)
>> >
>> > On 11/6/24 3:39 AM, Apoorv Mittal wrote:
>> > > Hi Bill,
>> > > Thanks for the KIP.
>> > >
>> > > +1 (non-binding)
>> > >
>> > > Regards,
>> > > Apoorv Mittal
>> > >
>> > >
>> > > On Wed, Nov 6, 2024 at 10:55 AM Lucas Brutschy
>> > >  wrote:
>> > >
>> > >> Hey Bill,
>> > >>
>> > >> thanks for the KIP!
>> > >>
>> > >> +1 (binding)
>> > >>
>> > >> Cheers,
>> > >> Lucas
>> > >>
>> > >> On Wed, Nov 6, 2024 at 2:00 AM Bill Bejeck 
>> wrote:
>> > >>>
>> > >>> Hi All,
>> > >>>
>> > >>> I'd like to call for a vote on KIP-1091
>> > >>> <
>> > >>
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1091%3A+Improved+Kafka+Streams+operator+metrics
>> > >>>
>> > >>>   (discussion thread
>> > >>> )
>> > >>>
>> > >>> Thanks,
>> > >>>
>> > >>> Bill
>> > >>
>> > >
>> >
>>
>


[jira] [Created] (KAFKA-17997) Remove deprecated config log.message.timestamp.difference.max.ms

2024-11-12 Thread Divij Vaidya (Jira)
Divij Vaidya created KAFKA-17997:


 Summary: Remove deprecated config 
log.message.timestamp.difference.max.ms
 Key: KAFKA-17997
 URL: https://issues.apache.org/jira/browse/KAFKA-17997
 Project: Kafka
  Issue Type: Improvement
Reporter: Divij Vaidya
 Fix For: 4.0.0


_[log.message.timestamp.difference.max.ms|https://kafka.apache.org/documentation.html#brokerconfigs_log.message.timestamp.difference.max.ms]_
 was deprecated as part of KIP-937 [1]. We need to remove it in 4.0.

The exit criteria for this Jira should be:
1. Remove the configuration from the code and associated tests.
2. Update documentation at 
[https://github.com/apache/kafka/blob/trunk/docs/upgrade.html#L25] to add the 
removed configuration

[1] 
[https://cwiki.apache.org/confluence/display/KAFKA/KIP-937%3A+Improve+Message+Timestamp+Validation]
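For context, KIP-937 replaced the single difference bound with separate before/after bounds. A sketch of the corresponding server.properties migration — the values shown are illustrative, only the config names come from KIP-937:

```properties
# Removed in 4.0 (deprecated by KIP-937):
# log.message.timestamp.difference.max.ms=600000

# KIP-937 replacements: bound how far a record timestamp may lie
# before or after the broker's current time, independently.
log.message.timestamp.before.max.ms=600000
log.message.timestamp.after.max.ms=600000
```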
 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-17998) Fix flaky OffloadAndTxnConsumeFromLeaderTest

2024-11-12 Thread David Arthur (Jira)
David Arthur created KAFKA-17998:


 Summary: Fix flaky OffloadAndTxnConsumeFromLeaderTest
 Key: KAFKA-17998
 URL: https://issues.apache.org/jira/browse/KAFKA-17998
 Project: Kafka
  Issue Type: Test
  Components: unit tests
Reporter: David Arthur


https://ge.apache.org/scans/tests?search.names=CI%20workflow%2CGit%20repository&search.relativeStartTime=P28D&search.rootProjectNames=kafka&search.tags=github%2Ctrunk&search.tasks=test&search.timeZoneId=America%2FNew_York&search.values=CI%2Chttps:%2F%2Fgithub.com%2Fapache%2Fkafka&tests.container=org.apache.kafka.tiered.storage.integration.OffloadAndTxnConsumeFromLeaderTest



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [ANNOUNCE] Apache Kafka 3.9.0

2024-11-12 Thread Chia-Ping Tsai
hi Colin

3.9.1 is nonexistent in the https://s3-us-west-2.amazonaws.com/kafka-packages 

Could you please check this? I'd like to add version 3.9.1 to the E2E tests.

Best,
Chia-Ping

On 2024/11/07 23:11:51 Colin McCabe wrote:
> The Apache Kafka community is pleased to announce the release for Apache 
> Kafka 3.9.0
> 
> - This is a major release, the final one in the 3.x line. (There may of 
> course be other minor releases in this line, such as 3.9.1.)
> - Tiered storage will be considered production-ready in this release.
> - This will be the final major release to feature the deprecated ZooKeeper 
> mode.
> 
> This release includes the following KIPs:
> - KIP-853: Support dynamically changing KRaft controller membership
> - KIP-1057: Add remote log metadata flag to the dump log tool
> - KIP-1049: Add config log.summary.interval.ms to Kafka Streams
> - KIP-1040: Improve handling of nullable values in InsertField, ExtractField, 
> and other transformations
> - KIP-1031: Control offset translation in MirrorSourceConnector
> - KIP-1033: Add Kafka Streams exception handler for exceptions occurring 
> during processing
> - KIP-1017: Health check endpoint for Kafka Connect
> - KIP-1025: Optionally URL-encode clientID and clientSecret in authorization 
> header
> - KIP-1005: Expose EarliestLocalOffset and TieredOffset
> - KIP-950: Tiered Storage Disablement
> - KIP-956: Tiered Storage Quotas
> 
> All of the changes in this release can be found in the release notes:
> https://www.apache.org/dist/kafka/3.9.0/RELEASE_NOTES.html
>
> An overview of the release can be found in our announcement blog post:
> https://kafka.apache.org/blog#apache_kafka_390_release_announcement
> 
> You can download the source and binary release from:
> https://kafka.apache.org/downloads#3.9.0
> 
> ---
> 
> 
> Apache Kafka is a distributed streaming platform with four core APIs:
> 
> 
> ** The Producer API allows an application to publish a stream of records to
> one or more Kafka topics.
> 
> ** The Consumer API allows an application to subscribe to one or more
> topics and process the stream of records produced to them.
> 
> ** The Streams API allows an application to act as a stream processor,
> consuming an input stream from one or more topics and producing an
> output stream to one or more output topics, effectively transforming the
> input streams to output streams.
> 
> ** The Connector API allows building and running reusable producers or
> consumers that connect Kafka topics to existing applications or data
> systems. For example, a connector to a relational database might
> capture every change to a table.
> 
> 
> With these APIs, Kafka can be used for two broad classes of application:
> 
> ** Building real-time streaming data pipelines that reliably get data
> between systems or applications.
> 
> ** Building real-time streaming applications that transform or react
> to the streams of data.
> 
> 
> Apache Kafka is in use at large and small companies worldwide, including
> Capital One, Goldman Sachs, ING, LinkedIn, Netflix, Pinterest, Rabobank,
> Target, The New York Times, Uber, Yelp, and Zalando, among others.
> 
> A big thank you for the following 133 contributors to this release!
> (Please report an unintended omission)
> 
> Abhijeet Kumar, abhi-ksolves, Abhinav Dixit, Adrian Preston, Alieh Saeedi,
> Alyssa Huang, Anatoly Popov, Andras Katona, Andrew Schofield, Andy Wilkinson,
> Anna Sophie Blee-Goldman, Antoine Pourchet, Apoorv Mittal, Arnav Dadarya,
> Arnout Engelen, Arpit Goyal, Arun Mathew, A. Sophie Blee-Goldman, Ayoub Omari,
> bachmanity1, Bill Bejeck, brenden20, Bruno Cadonna, Chia Chuan Yu, Chia-Ping
> Tsai, ChickenchickenLove, Chirag Wadhwa, Chris Egerton, Christo Lolov,
> Ming-Yen Chung, Colin P. McCabe, Cy, David Arthur, David Jacot,
> demo...@csie.io, dengziming, Dimitar Dimitrov, Dmitry Werner, Dongnuo Lyu,
> dujian0068, Edoardo Comar, Farbod Ahmadian, Federico Valeri, Fiore Mario
> Vitale, Florin Ak ermann, Francois Visconte, GANESH SADANALA, Gantigmaa
> Selenge, Gaurav Narula, gongxuanzhang, Greg Harris, Do Gyeongwon, Harry
> Fallows, Hongten, Ian McDonald, Igor Soarez, Ismael Juma, Ivan Yurchenko,
> Jakub Scholz, Jason Gustafson, Jeff Kim, Jim Galasyn, Jin yong Choi, Johnny
> Hsu, José Armando García Sancio, Josep Prat, Jun Rao, Justine Olshan, Kamal
> Chandraprakash, Ken Huang, Kevin Wu, Kirk True, Kondrat Bertalan, Krishna
> Agarwal, KrishVora01, Kuan-Po (Cooper) Tseng, Kuan-Po Tseng, Lee Dongjin,
> Lianet Magrans , Logan Zhu, Loïc GREFFIER, Lucas Brutschy, Luke Chen, Maciej
> Moscicki, Manikumar Reddy, Mason Chen, Matthias J. Sax, Max Riedel, Mickael
> Maison, Murali Basani

Re: [ANNOUNCE] Apache Kafka 3.9.0

2024-11-12 Thread Josep Prat
Hi Chia-Ping,

I guess you mean 3.9.0 right?

On Tue, Nov 12, 2024 at 12:16 PM Chia-Ping Tsai  wrote:

> hi Colin
>
> 3.9.1 is nonexistent in the
> https://s3-us-west-2.amazonaws.com/kafka-packages
>
> Could you please check this? I'd like to add version 3.9.1 to the E2E
> tests.
>
> Best,
> Chia-Ping
>

Re: [ANNOUNCE] Apache Kafka 3.9.0

2024-11-12 Thread Chia-Ping Tsai
hi Josep

> I guess you mean 3.9.0 right?

Yes, sorry for my fat-fingering :(

Josep Prat  於 2024年11月12日 週二 下午7:17寫道:

> Hi Chia-Ping,
>
> I guess you mean 3.9.0 right?
>
> On Tue, Nov 12, 2024 at 12:16 PM Chia-Ping Tsai 
> wrote:
>
> > hi Colin
> >
> > 3.9.1 is nonexistent in the
> > https://s3-us-west-2.amazonaws.com/kafka-packages
> >
> > Could you please check this? I'd like to add version 3.9.1 to the E2E
> > tests.
> >
> > Best,
> > Chia-Ping
> >

Re: [DISCUSS] Require KIPs to include "How to teach this section"

2024-11-12 Thread Anton Agestam
Hi Matthias,

Thanks for your input and for pointing that out.

It at least is missing from this wiki page, and so is a link to the KIP
template:
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=50859233#KafkaImprovementProposals-WhatshouldbeincludedinaKIP
?

Perhaps there should be a bullet point added there and/or the link to the
template?

BR,
Anton

On Sat, 2 Nov 2024 at 03:11, Matthias J. Sax wrote:

> The KIP already has a section "Documentation Plan"
>
>
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=50859709#KIPTemplate-DocumentationPlan
>
> If it's not used properly, committers reviewing (and voting) KIPs must
> be reminded to pay more attention and emphasis on good documentation,
> and also make sure the PRs contains corresponding doc updates...
>
> Overall, I agree and support to pay attention to good docs. I just don't
> see that we need to change anything in the "official process" -- we just
> need to change our behavior and stick to the existing process?
>
>
> -Matthias
>
> On 11/1/24 2:50 PM, Colin McCabe wrote:
> > On Fri, Nov 1, 2024, at 07:08, Claude Warren, Jr wrote:
> >> I like this idea.  I'm not sure what the section should be called but It
> >> should spell out what changes from a customer (I don't like the term
> user,
> >> drug dealers have users -- we should have customers) point of view and
> from
> >> a developer point of view.
> >
> > Hi Claude,
> >
> > Sorry to nitpick, but I think "user" really is the correct term. Apache
> Kafka is an open source project. We do not have "customers" because we're
> not a business. Indeed, users should feel free to contribute to Kafka and
> not view it as a vendor / customer relationship.
> >
> > Of course, if you want a vendor / customer relationship, there are lots
> of companies in the ecosystem that can provide that. But the open source
> project should be a collaboration. I think most of us wear both "user" and
> "developer" hats, and that's a good thing.
> >
> >>
> >> I can see cases where the change is not visible to the customer but
> impacts
> >> how developers interact with the system.  I can also see cases where the
> >> change is almost 100% customer focused with little change to the
> developers
> >> perception.
> >>
> >> Whatever changes are noted in the section should be accounted for in
> the PR
> >> before it is accepted.  Keeping in mind that as the change evolves
> through
> >> testing to documentation requirements may change too.
> >>
> >> I think we need more focus on documenting how to use and configure
> kafka in
> >> various environments, but I also perceive that we do not have the
> people to
> >> do that, so let's at least collect the information in some reasonable
> form.
> >>
> >
> > Many changes involve multiple PRs. I think documentation is no different
> than any other aspect of a feature or change. It can be done incrementally
> if necessary. Hopefully having the section in the KIP would help remind us
> to do it, though! And motivate discussion about what kind of documentation
> would be best.
> >
> > best,
> > Colin
> >
> >> Claude
> >>
> >> On Thu, Oct 31, 2024 at 12:21 PM Anton Agestam
> >>  wrote:
> >>
> >>> Thanks for your response here Colin,
> >>>
> >>>   > Perhaps there should be a "documentation" section in the KIP
> template?
> >>>
> >>> I think that would do the trick. The nice idea behind formulating the
> >>> section as "How to teach this?", is that it leaves it to the KIP
> author how
> >>> to answer it. In most cases I would expect the section to be filled in
> like
> >>> "We will update documentation section X and Y", but there might be
> cases
> >>> where the answer is different (. I'm not going to die on this hill, and
> >>> would be very happy with an added "Documentation" section if that
> option
> >>> has more traction 👍
> >>>
>  the KIPs themselves are part of the documentation
> >>>
> >>> I understand this is how things currently work for many parts of Kafka
> >>> documentation, but it's an idea I want to question and I am proposing
> to
> >>> work to phase this out over time. KIPs are by definition documenting a
> >>> proposed change. It is necessary to make assumptions about the current
> >>> state of the Kafka code base at the time of writing, and those
> assumptions
> >>> do not necessarily hold true at any arbitrary time later, when the KIP
> is
> >>> being read. And I think this is what you are saying too, that for
> instance
> >>> an implemented KIP that touches the protocol should also result in
> changes
> >>> to the documentation.
> >>>
> >>> tl;dr; It's useful to also be able to read KIPs in hind-sight, but it
> >>> shouldn't be required to do in-brain materialization of a series of
> KIPs to
> >>> understand what the expected state of some feature currently is.
> >>>
> >>> What I am hoping with the proposed change to the KIP template is that
> there
> >>> will be less chance for documentation to be an after-thought for new
> >>> changes, and

Re: [DISCUSS] Require KIPs to include "How to teach this section"

2024-11-12 Thread Anton Agestam
Oh sorry for the noise all, I missed a couple subsequent responses that
already addressed this.

Den tis 12 nov. 2024 kl 16:05 skrev Anton Agestam :

> Hi Matthias,
>
> Thanks for your input and for pointing that out.
>
> It at least is missing from this wiki page, and so is a link to the KIP
> template:
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=50859233#KafkaImprovementProposals-WhatshouldbeincludedinaKIP
> ?
>
> Perhaps there should be a bullet point added there and/or the link to the
> template?
>
> BR,
> Anton
>
> Den lör 2 nov. 2024 03:11Matthias J. Sax  skrev:
>
>> The KIP already has a section "Documentation Plan"
>>
>>
>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=50859709#KIPTemplate-DocumentationPlan
>>
>> If it's not used properly, committers reviewing (and voting) KIPs must
>> be reminded to pay more attention and emphasis on good documentation,
>> and also make sure the PRs contains corresponding doc updates...
>>
>> Overall, I agree and support to pay attention to good docs. I just don't
>> see that we need to change anything in the "official process" -- we just
>> need to change our behavior and stick to the exiting process?
>>
>>
>> -Matthias
>>
>> On 11/1/24 2:50 PM, Colin McCabe wrote:
>> > On Fri, Nov 1, 2024, at 07:08, Claude Warren, Jr wrote:
>> >> I like this idea.  I'm not sure what the section should be called but
>> It
>> >> should spell out what changes from a customer (I don't like the term
>> user,
>> >> drug dealers have users -- we should have customers) point of view and
>> from
>> >> a developer point of view.
>> >
>> > Hi Claude,
>> >
>> > Sorry to nitpick, but I think "user" really is the correct term. Apache
>> Kafka is an open source project. We do not have "customers" because we're
>> not a business. Indeed, users should feel free to contribute to Kafka and
>> not view it as a vendor / customer relationship.
>> >
>> > Of course, if you want a vendor / customer relationship, there are lots
>> of companies in the ecosystem that can provide that. But the open source
>> project should be a collaboration. I think most of us wear both "user" and
>> "developer" hats, and that's a good thing.
>> >
>> >>
>> >> I can see cases where the change is not visible to the customer but
>> impacts
>> >> how developers interact with the system.  I can also see cases where
>> the
>> >> change is almost 100% customer focused with little change to the
>> developers
>> >> perception.
>> >>
>> >> Whatever changes are noted in the section should be accounted for in
>> the PR
>> >> before it is accepted.  Keeping in mind that as the change evolves
>> through
>> >> testing to documentation requirements may change too.
>> >>
>> >> I think we need more focus on documenting how to use and configure
>> kafka in
>> >> various environments, but I also perceive that we do not have the
>> people to
>> >> do that, so let's at least collect the information in some reasonable
>> form.
>> >>
>> >
>> > Many changes involve multiple PRs. I think documentation is no
>> different than any other aspect of a feature or change. It can be done
>> incrementally if necessary. Hopefully having the section in the KIP would
>> help remind us to do it, though! And motivate discussion about what kind of
>> documentation would be best.
>> >
>> > best,
>> > Colin
>> >
>> >> Claude
>> >>
>> >> On Thu, Oct 31, 2024 at 12:21 PM Anton Agestam
>> >>  wrote:
>> >>
>> >>> Thanks for your response here Colin,
>> >>>
>> >>>   > Perhaps there should be a "documentation" section in the KIP
>> template?
>> >>>
>> >>> I think that would do the trick. The nice idea behind formulating the
>> >>> section as "How to teach this?", is that it leaves it to the KIP
>> author how
>> >>> to answer it. In most cases I would expect the section to be filled
>> in like
>> >>> "We will update documentation section X and Y", but there might be
>> cases
>> >>> where the answer is different (. I'm not going to die on this hill,
>> and
>> >>> would be very happy with an added "Documentation" section if that
>> option
>> >>> has more traction 👍
>> >>>
>>  the KIPs themselves are part of the documentation
>> >>>
>> >>> I understand this is how things currently work for many parts of Kafka
>> >>> documentation, but it's an idea I want to question and I am proposing
>> to
>> >>> work to phase this out over time. KIPs are by definition documenting a
>> >>> proposed change. It is necessary to make assumptions about the current
>> >>> state of the Kafka code base at the time of writing, and those
>> assumptions
>> >>> do not necessarily hold true at any arbitrary time later, when the
>> KIP is
>> >>> being read. And I think this is what you are saying too, that for
>> instance
>> >>> an implemented KIP that touches the protocol should also result in
>> changes
>> >>> to the documentation.
>> >>>
>> >>> tl;dr; It's useful to also be able to read KIPs in hind-sight, but it
>> >>> shouldn't be required to do in-brain

Re: [DISCUSS] KIP-1104: Allow Foreign Key Extraction from Both Key and Value in KTable Joins

2024-11-12 Thread Lucas Brutschy
Hi,

1. I don't think we can/should break backwards compatibility.
2. Have you considered using `BiFunction<K, V, KO> foreignKeyExtractor`?
This should work without renaming the method.
3. I don't see the benefit of deprecating it. I agree, we wouldn't add
both overloads normally, but the value-only overload is useful /
correct as it is; I wouldn't break users' code just to "clean" things
very slightly.

Cheers,
Lucas

On Mon, Nov 11, 2024 at 6:43 PM Chu Cheng Li  wrote:
>
> Hi all,
>
> I'm working on enhancing the KTable foreign key join API to allow
> extracting foreign keys from both key and value. However, I've encountered
> a Java type erasure issue that affects our API design choices.
>
> Current Situation:
> - We want to allow foreign key extraction from both key and value
> - We'd like to maintain backward compatibility with existing value-only
> extractors
> - Initial attempt was to use method overloading
>
> The Issue:
> Due to Java type erasure, we can't have both:
> - Function<V, KO> foreignKeyExtractor
> - Function<KeyValue<K, V>, KO> foreignKeyExtractor
> in method overloads as they become identical at runtime.
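For readers unfamiliar with the erasure clash, here is a minimal sketch. The `extractForeignKey` method and its names are illustrative only, not the actual KTable API; the commented-out overloads show the pair that javac rejects.

```java
import java.util.function.BiFunction;
import java.util.function.Function;

class ErasureDemo {
    // These two overloads do NOT compile together: after type erasure both
    // signatures become join(Function), so javac rejects them as duplicates.
    //
    // static <V, KO> void join(Function<V, KO> extractor) { }
    // static <K, V, KO> void join(Function<KeyValue<K, V>, KO> extractor) { }

    // A BiFunction-based signature sidesteps the clash, because BiFunction
    // and Function erase to different raw types.
    static <K, V, KO> KO extractForeignKey(K key, V value,
                                           BiFunction<K, V, KO> extractor) {
        return extractor.apply(key, value);
    }

    public static void main(String[] args) {
        // Hypothetical foreign key derived from both key and value.
        String fk = extractForeignKey(42, "order", (k, v) -> v + "-" + k);
        System.out.println(fk); // order-42
    }
}
```

This is why Lucas's BiFunction suggestion avoids both the erasure problem and the need for a new method name.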
>
> Options:
> 1. Breaking Change Approach
>- Replace value-only extractor with key-value extractor
>- Clean API but breaks backward compatibility
>
> 2. Compatibility Approach
>- Keep both capabilities through different method names (e.g.,
> join/joinWithKey)
>- Maintains compatibility but less elegant API
>
> Questions for Discussion:
> 1. Should we prioritize API elegance or backward compatibility?
> 2. If we choose compatibility, which naming convention should we use for
> the new methods?
> 3. Should we consider deprecating the value-only extractor in a future
> release?
>
> Looking forward to your thoughts.
>
> Best regards,
> Peter
>
> On Thu, Oct 31, 2024 at 1:08 PM Chu Cheng Li  wrote:
>
> > Hi Everyone,
> > I would like to start a discussion on KIP-1104:
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1103%3A+Additional+metrics+for+cooperative+consumption
> > 
> >
> > This KIP allow foreign key extraction from both key and value in KTable
> > Joins, before this KIP user can only extract foreign from record's value,
> > this KIP provides more flexibility on it.
> >
> > Regards,
> > Peter Lee
> >
> >
> >


[jira] [Created] (KAFKA-17996) kafka-metadata-quorum.sh add-controller cause the new added controller to crash with java.lang.IllegalArgumentException

2024-11-12 Thread Omnia Ibrahim (Jira)
Omnia Ibrahim created KAFKA-17996:
-

 Summary: kafka-metadata-quorum.sh add-controller cause the new 
added controller to crash with java.lang.IllegalArgumentException
 Key: KAFKA-17996
 URL: https://issues.apache.org/jira/browse/KAFKA-17996
 Project: Kafka
  Issue Type: Bug
  Components: controller
Affects Versions: 3.9.0
Reporter: Omnia Ibrahim
Assignee: Omnia Ibrahim
 Fix For: 3.9.1


`kafka-metadata-quorum.sh --bootstrap-server 127.0.0.1:9092 --command-config 
controller.properties add-controller` updates the metadata successfully, and if I 
run `kafka-metadata-quorum.sh --bootstrap-server 127.0.0.1:9092 describe 
--status` I can see the new controller get promoted from observer to voter.
However, the newly added controller crashes immediately once we run 
`add-controller`, with a `java.lang.IllegalArgumentException`:

```
2024-11-12 14:38:21 java.lang.IllegalArgumentException: Unexpected type for 
requestData: UpdateRaftVoterRequestData(clusterId='3zaK3YKRQtm3deDECgnj3w', 
currentLeaderEpoch=1, voterId=204, voterDirectoryId=VnSWz2WLHLz6xiY9lA2Z9g, 
listeners=[Listener(name='CONTROLLER', host='controller-4', port=9093)], 
kRaftVersionFeature=KRaftVersionFeature(minSupportedVersion=0, 
maxSupportedVersion=1))
2024-11-12 14:38:21     at 
org.apache.kafka.raft.KafkaNetworkChannel.buildRequest(KafkaNetworkChannel.java:194)
2024-11-12 14:38:21     at 
org.apache.kafka.raft.KafkaNetworkChannel.send(KafkaNetworkChannel.java:119)
2024-11-12 14:38:21     at 
org.apache.kafka.raft.KafkaRaftClient.maybeSendRequest(KafkaRaftClient.java:2664)
2024-11-12 14:38:21     at 
org.apache.kafka.raft.KafkaRaftClient.maybeSendUpdateVoterRequest(KafkaRaftClient.java:3116)
2024-11-12 14:38:21     at 
org.apache.kafka.raft.KafkaRaftClient.pollFollowerAsVoter(KafkaRaftClient.java:3042)
2024-11-12 14:38:21     at 
org.apache.kafka.raft.KafkaRaftClient.pollFollower(KafkaRaftClient.java:3022)
2024-11-12 14:38:21     at 
org.apache.kafka.raft.KafkaRaftClient.pollCurrentState(KafkaRaftClient.java:3157)
2024-11-12 14:38:21     at 
org.apache.kafka.raft.KafkaRaftClient.poll(KafkaRaftClient.java:3299)
2024-11-12 14:38:21     at 
org.apache.kafka.raft.KafkaRaftClientDriver.doWork(KafkaRaftClientDriver.java:64)
2024-11-12 14:38:21     at 
org.apache.kafka.server.util.ShutdownableThread.run(ShutdownableThread.java:136)
```
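The stack trace points at a request type that the channel's request-building code does not recognize. A minimal sketch of that failure pattern (simplified and illustrative, not the actual `KafkaNetworkChannel` code):

```java
class BuildRequestDemo {
    enum RequestType { VOTE, FETCH, UPDATE_RAFT_VOTER }

    // Sketch of the pattern behind the crash: a mapping that enumerates the
    // supported request types and throws for anything else. If a newer type
    // (here UPDATE_RAFT_VOTER) is missing from the switch, sending it hits
    // the IllegalArgumentException path seen in the stack trace above.
    static String buildRequest(RequestType type) {
        switch (type) {
            case VOTE:
                return "VoteRequest";
            case FETCH:
                return "FetchRequest";
            default:
                throw new IllegalArgumentException(
                        "Unexpected type for requestData: " + type);
        }
    }

    public static void main(String[] args) {
        System.out.println(buildRequest(RequestType.VOTE)); // VoteRequest
        try {
            buildRequest(RequestType.UPDATE_RAFT_VOTER);
        } catch (IllegalArgumentException e) {
            System.out.println("crash: " + e.getMessage());
        }
    }
}
```

Under this reading, the fix would be adding the missing request type to the mapping.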



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-1106: Add duration based offset reset option for consumer clients

2024-11-12 Thread Manikumar
Thanks Ismael and Lianet for the reviews.

Based on suggestions, I have updated the KIP in favour of having a
single config (auto.offset.reset).
I have also adopted the Lianet's suggestion on naming.

auto.offset.reset=back:P3D -> reset back 3 days
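As a sketch of what such a value would mean in practice, here is how a `back:<ISO8601 duration>` string could be turned into a reset timestamp. The parsing below is illustrative only, assuming the syntax under discussion, not the final implementation:

```java
import java.time.Duration;
import java.time.Instant;

class BackResetDemo {
    // Compute the reset position for a "back:<ISO8601>" value: strip the
    // prefix, parse the ISO8601 duration, subtract it from "now".
    static long resetTimestampMs(String configValue, Instant now) {
        String iso = configValue.substring(configValue.indexOf(':') + 1); // "P3D"
        Duration back = Duration.parse(iso); // ISO8601: P3D = 3 days
        return now.minus(back).toEpochMilli();
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2024-11-12T00:00:00Z");
        long ts = resetTimestampMs("back:P3D", now);
        // Three days earlier:
        System.out.println(Instant.ofEpochMilli(ts)); // 2024-11-09T00:00:00Z
    }
}
```

`java.time.Duration.parse` already accepts the ISO8601 forms discussed here (P3D, P23DT23H, etc.), so no custom parser would be needed.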


Let me know if there are any concerns.


Thanks,


On Sat, Nov 9, 2024 at 10:06 PM Lianet M.  wrote:
>
> Hi all. Thanks Manikumar for the nice improvement, useful indeed.
>
> I also lean towards having a single config given that it's all about the
> reset policy, seems all the same "what" (auto reset policy) and we are
> just extending the with a new behaviour. Regarding the naming (agree on
> duration being confusing), what about something to show that it's simply
> about how far back to reset:
>
> auto.offset.reset=EARLIEST
> auto.offset.reset=BACK:P3D -> reset back 3 days (combined with ISO8601
> seems really easy to read/understand from the config definition itself)
>
> Something like that would definitely break consistency with the command
> line tool argument "by_duration", but if it seems clearer we should
> consider the tradeoff and not penalize the consumer API/configs.
>
> Thanks!
>
> Lianet
>
>
> On Sat, Nov 9, 2024 at 2:47 AM Ismael Juma  wrote:
>
> > Thanks for the KIP, this is useful. A comment below.
> >
> > On Thu, Nov 7, 2024 at 4:51 PM Matthias J. Sax  wrote:
> >
> > > I am personally not convinced that adding a new config
> > > `auto.offset.reset.by.duration` is the best way. Kafka in general has
> > > way too many configs and trying to avoid adding more configs seems to be
> > > desirable?  -- It seems this might be a point of contention, and if the
> > > majority of people wants this new config so be it. I just wanted to
> > > stress my concerns about it.
> >
> >
> > I agree that we don't need a new config. We can simply use a value for the
> > existing config. I think a prefix followed by the relevant ISO8601 string
> > would be clear enough. For example, "by-duration:P23DT23H" or something
> > along those lines. I do find the "by-duration" description a bit confusing
> > for what we're doing here (i.e. current time - duration) although there is
> > precedent in the reset offsets tool.
> >
> > Ismael
> >


[jira] [Created] (KAFKA-17993) reassign partition tool stuck with uncaught exception: 'value' field is too long to be serialized

2024-11-12 Thread Edoardo Comar (Jira)
Edoardo Comar created KAFKA-17993:
-

 Summary: reassign partition tool stuck with uncaught exception: 
'value' field is too long to be serialized
 Key: KAFKA-17993
 URL: https://issues.apache.org/jira/browse/KAFKA-17993
 Project: Kafka
  Issue Type: Bug
  Components: clients
Reporter: Edoardo Comar


Running the reassignment script for about 5800 partitions, with both throttle 
options being set, the tool remained stuck with this exception

{{ERROR Uncaught exception in thread 'kafka-admin-client-thread | 
reassign-partitions-tool': (org.apache.kafka.common.utils.KafkaThread)}}
{{java.lang.RuntimeException: 'value' field is too long to be serialized}}
{{    at 
org.apache.kafka.common.message.IncrementalAlterConfigsRequestData$AlterableConfig.addSize(IncrementalAlterConfigsRequestData.java:776)}}
{{    at 
org.apache.kafka.common.message.IncrementalAlterConfigsRequestData$AlterConfigsResource.addSize(IncrementalAlterConfigsRequestData.java:463)}}
{{    at 
org.apache.kafka.common.message.IncrementalAlterConfigsRequestData.addSize(IncrementalAlterConfigsRequestData.java:187)}}
{{    at 
org.apache.kafka.common.protocol.SendBuilder.buildSend(SendBuilder.java:218)}}
{{    at 
org.apache.kafka.common.protocol.SendBuilder.buildRequestSend(SendBuilder.java:187)}}
{{    at 
org.apache.kafka.common.requests.AbstractRequest.toSend(AbstractRequest.java:108)}}
{{    at org.apache.kafka.clients.NetworkClient.doSend(NetworkClient.java:535)}}
{{    at org.apache.kafka.clients.NetworkClient.doSend(NetworkClient.java:511)}}
{{    at org.apache.kafka.clients.NetworkClient.send(NetworkClient.java:471)}}
{{    at 
org.apache.kafka.clients.admin.KafkaAdminClient$AdminClientRunnable.sendEligibleCalls(KafkaAdminClient.java:1156)}}
{{    at 
org.apache.kafka.clients.admin.KafkaAdminClient$AdminClientRunnable.processRequests(KafkaAdminClient.java:1369)}}
{{    at 
org.apache.kafka.clients.admin.KafkaAdminClient$AdminClientRunnable.run(KafkaAdminClient.java:1312)}}
{{    at java.base/java.lang.Thread.run(Unknown Source)}}

 

The same json file previously passed the --verify step
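One plausible explanation (an assumption, not confirmed from the report): the replication throttle configs hold a comma-separated `partition:brokerId` list, so the config value grows linearly with the partition count, and Kafka protocol strings carry a two-byte length, so any value longer than Short.MAX_VALUE (32767) bytes cannot be serialized. A sketch of the arithmetic with the entry format as an illustrative assumption:

```java
class ThrottleValueSize {
    // Build an illustrative throttled-replicas value of the form
    // "0:101,0:102,1:101,1:102,..." and return its length. The exact entry
    // format is an assumption; the point is linear growth with partitions.
    static int throttledReplicasValueLength(int partitions) {
        StringBuilder value = new StringBuilder();
        for (int p = 0; p < partitions; p++) {
            if (p > 0) value.append(',');
            value.append(p).append(":101,").append(p).append(":102");
        }
        return value.length();
    }

    public static void main(String[] args) {
        // With ~5800 partitions the value far exceeds the two-byte string
        // length limit (Short.MAX_VALUE = 32767) of the wire protocol.
        int len = throttledReplicasValueLength(5800);
        System.out.println(len > Short.MAX_VALUE); // true
    }
}
```

This would also explain why `--verify` passed: the JSON file itself is fine, and only the serialized IncrementalAlterConfigs request overflows.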



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-1106: Add duration based offset reset option for consumer clients

2024-11-12 Thread Apoorv Mittal
Thanks Manikumar for explaining, sounds good to me.

Regards,
Apoorv Mittal


On Tue, Nov 12, 2024 at 1:47 PM Andrew Schofield <
andrew_schofield_j...@outlook.com> wrote:

> Hi,
> Looks good now. Just one suggestion.
>
> AS8: Instead of "back:30D", I wonder whether the word 'duration' ought to
> be
> used to be consistent with kafka-consumer-groups.sh. So,
> "by-duration:P3D" or "duration:P3D" might be better.
>
> The overall idea of merging the configs into one config is fine in the
> current
> text of the KIP.
>
> Thanks,
> Andrew
>
> 
> From: Manikumar 
> Sent: 12 November 2024 13:30
> To: dev@kafka.apache.org 
> Subject: Re: [DISCUSS] KIP-1106: Add duration based offset reset option
> for consumer clients
>
> Hi Apoorv,
>
> AM7: AutoOffsetReset.java is for Kafka Streams API. I am not proposing any
> public Interface/class for Kafka Consumer.
> As mentioned in the KIP, even though OffsetResetStrategy is a public class,
> it's not used in any public APIs. I think new internal classes should be
> sufficient.
>
> AM8: Fixed
>
> AM9: This class is for kafka streams API
>
> AM10: I was using the EARLIEST_TIMESTAMP, LATEST_TIMESTAMP constant values
> from ListOffsetsRequest
> <https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/common/requests/ListOffsetsRequest.java#L41>
>
> But yes, we can use Optional in AutoOffsetReset class. Updated the KIP.
>
>
> Thanks.
>
>
>
> On Tue, Nov 12, 2024 at 6:15 PM Apoorv Mittal 
> wrote:
>
> > Hi,
> > I read the changes for single configuration and deprecated
> > OffsetResetStrategy.java.
> >
> > AM7: Question: The KIP says that previous supported values were
> > earliest/latest/none and new back: config would be added. We
> have
> > no definition of "none" in the newly introduced AutoOffsetReset.java
> class
> > hence I am assuming that if "none" is specified as a config option then
> > that config will be ignored, correct? Or are we deprecating the usage of
> > "none" altogether?
> >
> > AM8: Minor: The new class AutoOffsetReset under interfaces mentions the
> > name as OffsetResetStrategy.java. This requires correction.
> >
> > AM9: Is the package name correct for AutoOffsetReset as
> > org.apache.kafka.streams, shouldn't it be under clients package?
> >
> > AM10: What does -1L and -2L mean as long for latest and earliest in
> > AutoOffsetReset.java? Is it just a long placeholder which will never be
> > used elsewhere for latest and earliest? If yes then does it make sense to
> > keep long as Optional, and use Optional.empty() for latest and earliest?
> >
> > Regards,
> > Apoorv Mittal
> >
> >
> > On Tue, Nov 12, 2024 at 12:06 PM Manikumar 
> > wrote:
> >
> > > Thanks Ismael and Lianet for the reviews.
> > >
> > > Based on suggestions, I have updated the KIP to in favour of having a
> > > single config (auto.offset.reset).
> > > I have also adopted the Lianet's suggestion on naming.
> > >
> > > auto.offset.reset=back:P3D -> reset back 3 days
> > >
> > >
> > > Let me know if there are any concerns.
> > >
> > >
> > > Thanks,
> > >
> > >
> > > On Sat, Nov 9, 2024 at 10:06 PM Lianet M.  wrote:
> > > >
> > > > Hi all. Thanks Manikumar for the nice improvement, useful indeed.
> > > >
> > > > I also lean towards having a single config given that it's all about
> > the
> > > > reset policy, seems all the same "what" (auto reset policy) and we
> are
> > > > just extending the with a new behaviour. Regarding the naming (agree
> on
> > > > duration being confusing), what about something to show that it's
> > simply
> > > > about how far back to reset:
> > > >
> > > > auto.offset.reset=EARLIEST
> > > > auto.offset.reset=BACK:P3D -> reset back 3 days (combined with
> ISO8601
> > > > seems really easy to read/understand from the config definition
> itself)
> > > >
> > > > Something like that would definitely break consistency with the
> command
> > > > line tool argument "by_duration", but if it seems clearer we should
> > > > consider the tradeoff and not penalize the consumer API/configs.
> > > >
> > > > Thanks!
> > > >
> > > > Lianet
> > > >
> > > >
> > > > On Sat, Nov 9, 2024 at 2:47 AM Ismael Juma 
> wrote:
> > > >
> > > > > Thanks for the KIP, this is useful. A comment below.
> > > > >
> > > > > On Thu, Nov 7, 2024 at 4:51 PM Matthias J. Sax 
> > > wrote:
> > > > >
> > > > > > I am personally not convinced that adding a new config
> > > > > > `auto.offset.reset.by.duration` is the best way. Kafka in
> general
> > > has
> > > > > > way too many configs and trying to avoid adding more configs
> seems
> > > to be
> > > > > > desirable?  -- It seems this might be a point of contention, and
> if
> > > the
> > > > > > majority of people wants this new config so be it. I just wanted
> to
> > > > > > stress my concerns about it.
> > > > >
> > > > >
> > > > > I agree that we don't need a new config. We can simply use a value
> > for
> > > the
> > > > > existing config. I think a prefix followed by the relevant IS

[jira] [Created] (KAFKA-17995) Large value for `retention.ms` could prevent remote data cleanup in Tiered Storage

2024-11-12 Thread Divij Vaidya (Jira)
Divij Vaidya created KAFKA-17995:


 Summary: Large value for `retention.ms` could prevent remote data 
cleanup in Tiered Storage
 Key: KAFKA-17995
 URL: https://issues.apache.org/jira/browse/KAFKA-17995
 Project: Kafka
  Issue Type: Bug
  Components: Tiered-Storage
Affects Versions: 3.8.1
Reporter: Divij Vaidya


If a user has configured "retention.ms" to a value greater than the current epoch 
(in milliseconds), then cleanupUntilMs becomes negative at this line of code, because 
cleanupUntilMs is calculated as current epoch - retention.ms [1].

This leads to cleaner failures at 
[https://github.com/apache/kafka/blob/5a5239770ff3565233e5cbecf11446e76339f8fe/core/src/main/java/kafka/log/remote/RemoteLogManager.java#L2218]
 and hence, all cleaning stops.
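A sketch of the reported arithmetic (not the actual RemoteLogManager code), together with one possible guard. The clamping fix shown is illustrative, not the actual patch:

```java
class RetentionCleanupDemo {
    // The reported computation: cleanupUntilMs = now - retention.ms.
    // With retention.ms larger than the current epoch millis the result
    // goes negative, and downstream validation rejects it, halting cleanup.
    static long cleanupUntilMs(long nowMs, long retentionMs) {
        return nowMs - retentionMs;
    }

    // One possible guard (illustrative only): clamp at zero, i.e. "never
    // clean anything" rather than fail.
    static long clampedCleanupUntilMs(long nowMs, long retentionMs) {
        return Math.max(0L, nowMs - retentionMs);
    }

    public static void main(String[] args) {
        long nowMs = 1_731_400_000_000L;           // ~Nov 2024
        long hugeRetentionMs = 9_000_000_000_000L; // > current epoch millis
        System.out.println(cleanupUntilMs(nowMs, hugeRetentionMs) < 0);   // true
        System.out.println(clampedCleanupUntilMs(nowMs, hugeRetentionMs)); // 0
    }
}
```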



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [ANNOUNCE] Apache Kafka 3.9.0

2024-11-12 Thread Josep Prat
For a second I thought I missed a complete version :D

On Tue, Nov 12, 2024 at 12:20 PM Chia-Ping Tsai  wrote:

> hi Josep
>
> > I guess you mean 3.9.0 right?
>
> Yes, sorry for my fat-fingering :(
>
> Josep Prat  於 2024年11月12日 週二 下午7:17寫道:
>
> > Hi Chia-Ping,
> >
> > I guess you mean 3.9.0 right?
> >
> > On Tue, Nov 12, 2024 at 12:16 PM Chia-Ping Tsai 
> > wrote:
> >
> > > hi Colin
> > >
> > > 3.9.1 is nonexistent in the
> > > https://s3-us-west-2.amazonaws.com/kafka-packages
> > >
> > > Could you please check this? I'd like to add version 3.9.1 to the E2E
> > > tests.
> > >
> > > Best,
> > > Chia-Ping
> > >
> > > On 2024/11/07 23:11:51 Colin McCabe wrote:
> > > > The Apache Kafka community is pleased to announce the release for
> > Apache
> > > Kafka 3.9.0
> > > >
> > > > - This is a major release, the final one in the 3.x line. (There may
> of
> > > course be other minor releases in this line, such as 3.9.1.)
> > > > - Tiered storage will be considered production-ready in this release.
> > > > - This will be the final major release to feature the deprecated
> > > ZooKeeper mode.
> > > >
> > > > This release includes the following KIPs:
> > > > - KIP-853: Support dynamically changing KRaft controller membership
> > > > - KIP-1057: Add remote log metadata flag to the dump log tool
> > > > - KIP-1049: Add config log.summary.interval.ms to Kafka Streams
> > > > - KIP-1040: Improve handling of nullable values in InsertField,
> > > ExtractField, and other transformations
> > > > - KIP-1031: Control offset translation in MirrorSourceConnector
> > > > - KIP-1033: Add Kafka Streams exception handler for exceptions
> > occurring
> > > during processing
> > > > - KIP-1017: Health check endpoint for Kafka Connect
> > > > - KIP-1025: Optionally URL-encode clientID and clientSecret in
> > > authorization header
> > > > - KIP-1005: Expose EarliestLocalOffset and TieredOffset
> > > > - KIP-950: Tiered Storage Disablement
> > > > - KIP-956: Tiered Storage Quotas
> > > >
> > > > All of the changes in this release can be found in the release notes:
> > > > https://www.apache.org/dist/kafka/3.9.0/RELEASE_NOTES.html
> > >
> > > >
> > >
> > >
> > > > An overview of the release can be found in our announcement blog
> post:
> > > > https://kafka.apache.org/blog#apache_kafka_390_release_announcement
> > > >
> > > > You can download the source and binary release from:
> > > > https://kafka.apache.org/downloads#3.9.0
> > > >
> > > >
> > >
> >
> ---
> > > >
> > > >
> > > > Apache Kafka is a distributed streaming platform with four core APIs:
> > > >
> > > >
> > > > ** The Producer API allows an application to publish a stream of
> > records
> > > to
> > > > one or more Kafka topics.
> > > >
> > > > ** The Consumer API allows an application to subscribe to one or more
> > > > topics and process the stream of records produced to them.
> > > >
> > > > ** The Streams API allows an application to act as a stream
> processor,
> > > > consuming an input stream from one or more topics and producing an
> > > > output stream to one or more output topics, effectively transforming
> > the
> > > > input streams to output streams.
> > > >
> > > > ** The Connector API allows building and running reusable producers
> or
> > > > consumers that connect Kafka topics to existing applications or data
> > > > systems. For example, a connector to a relational database might
> > > > capture every change to a table.
> > > >
> > > >
> > > > With these APIs, Kafka can be used for two broad classes of
> > application:
> > > >
> > > > ** Building real-time streaming data pipelines that reliably get data
> > > > between systems or applications.
> > > >
> > > > ** Building real-time streaming applications that transform or react
> > > > to the streams of data.
> > > >
> > > >
> > > > Apache Kafka is in use at large and small companies worldwide,
> > including
> > > > Capital One, Goldman Sachs, ING, LinkedIn, Netflix, Pinterest,
> > Rabobank,
> > > > Target, The New York Times, Uber, Yelp, and Zalando, among others.
> > > >
> > > > A big thank you for the following 133 contributors to this release!
> > > > (Please report an unintended omission)
> > > >
> > > > Abhijeet Kumar, abhi-ksolves, Abhinav Dixit, Adrian Preston, Alieh
> > > Saeedi,
> > > > Alyssa Huang, Anatoly Popov, Andras Katona, Andrew Schofield, Andy
> > > Wilkinson,
> > > > Anna Sophie Blee-Goldman, Antoine Pourchet, Apoorv Mittal, Arnav
> > Dadarya,
> > > > Arnout Engelen, Arpit Goyal, Arun Mathew, A. Sophie Blee-Goldman,
> Ayoub
> > > Omari,
> > > > bachmanity1, Bill Bejeck, brenden20, Bruno Cadonna, Chia Chuan Yu,
> > > Chia-Ping
> > > > Tsai, ChickenchickenLove, Chirag Wadhwa, Chris Egerton, Christo
> Lolov,
> > > > Ming-Yen Chung, Colin P. McCabe, Cy, David Arthur, David Jacot,
> > > > demo...@csie.io, dengziming, Dimitar Dimitrov, Dmitry Werner,
> Dongnuo
> > > Lyu,
> > > > dujian0068, Edoardo Com

Re: [DISCUSS] KIP-1106: Add duration based offset reset option for consumer clients

2024-11-12 Thread Apoorv Mittal
Hi,
I read the changes for single configuration and deprecated
OffsetResetStrategy.java.

AM7: Question: The KIP says that previous supported values were
earliest/latest/none and a new back:<duration> config value would be added. We have
no definition of "none" in the newly introduced AutoOffsetReset.java class
hence I am assuming that if "none" is specified as a config option then
that config will be ignored, correct? Or are we deprecating the usage of
"none" altogether?

AM8: Minor: The new class AutoOffsetReset under interfaces mentions the
name as OffsetResetStrategy.java. This requires correction.

AM9: Is the package name correct for AutoOffsetReset as
org.apache.kafka.streams, shouldn't it be under clients package?

AM10: What does -1L and -2L mean as long for latest and earliest in
AutoOffsetReset.java? Is it just a long placeholder which will never be
used elsewhere for latest and earliest? If yes then does it make sense to
keep long as Optional, and use Optional.empty() for latest and earliest?

Regards,
Apoorv Mittal


On Tue, Nov 12, 2024 at 12:06 PM Manikumar 
wrote:

> Thanks Ismael and Lianet for the reviews.
>
> Based on suggestions, I have updated the KIP to in favour of having a
> single config (auto.offset.reset).
> I have also adopted the Lianet's suggestion on naming.
>
> auto.offset.reset=back:P3D -> reset back 3 days
>
>
> Let me know if there are any concerns.
>
>
> Thanks,
>
>
> On Sat, Nov 9, 2024 at 10:06 PM Lianet M.  wrote:
> >
> > Hi all. Thanks Manikumar for the nice improvement, useful indeed.
> >
> > I also lean towards having a single config given that it's all about the
> > reset policy, seems all the same "what" (auto reset policy) and we are
> > just extending the with a new behaviour. Regarding the naming (agree on
> > duration being confusing), what about something to show that it's simply
> > about how far back to reset:
> >
> > auto.offset.reset=EARLIEST
> > auto.offset.reset=BACK:P3D -> reset back 3 days (combined with ISO8601
> > seems really easy to read/understand from the config definition itself)
> >
> > Something like that would definitely break consistency with the command
> > line tool argument "by_duration", but if it seems clearer we should
> > consider the tradeoff and not penalize the consumer API/configs.
> >
> > Thanks!
> >
> > Lianet
> >
> >
> > On Sat, Nov 9, 2024 at 2:47 AM Ismael Juma  wrote:
> >
> > > Thanks for the KIP, this is useful. A comment below.
> > >
> > > On Thu, Nov 7, 2024 at 4:51 PM Matthias J. Sax 
> wrote:
> > >
> > > > I am personally not convinced that adding a new config
> > > > `auto.offset.reset.by.duration` is the best way. Kafka in general
> has
> > > > way too many configs and trying to avoid adding more configs seems
> to be
> > > > desirable?  -- It seems this might be a point of contention, and if
> the
> > > > majority of people wants this new config so be it. I just wanted to
> > > > stress my concerns about it.
> > >
> > >
> > > I agree that we don't need a new config. We can simply use a value for
> the
> > > existing config. I think a prefix followed by the relevant ISO8601
> string
> > > would be clear enough. For example, "by-duration:P23DT23H" or something
> > > along those lines. I do find the "by-duration" description a bit
> confusing
> > > for what we're doing here (i.e. current time - duration) although
> there is
> > > precedent in the reset offsets tool.
> > >
> > > Ismael
> > >
>


[DISCUSS] KIP-1109: Unifying Kafka Consumer Topic Metrics

2024-11-12 Thread Apoorv Mittal
Hi All,
I would like to start a discussion on KIP-1109:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-1109%3A+Unifying+Kafka+Consumer+Topic+Metrics

This KIP streamlines topic and topic-partition metrics for the Kafka Consumer,
emitting the user-defined topic name (as the kafka-producer already does).

Regards,
Apoorv Mittal


[jira] [Created] (KAFKA-17994) Runtime exceptions are not handled when deserializing kafka stream record

2024-11-12 Thread Ilya (Jira)
Ilya created KAFKA-17994:


 Summary: Runtime exceptions are not handled when deserializing 
kafka stream record
 Key: KAFKA-17994
 URL: https://issues.apache.org/jira/browse/KAFKA-17994
 Project: Kafka
  Issue Type: Bug
  Components: streams
Affects Versions: 3.9.0
Reporter: Ilya


When we got a PR to upgrade kafka clients 3.8.1 -> 3.9.0, we saw some failing 
tests. They related to using a DeserializationExceptionHandler with the 'log 
and continue' strategy; however, on the newest version the stream just crashed 
when Jackson tried to deserialize a faulty json, and this handler was not 
invoked.

In this 
[KIP-1033|https://cwiki.apache.org/confluence/display/KAFKA/KIP-1033%3A+Add+Kafka+Streams+exception+handler+for+exceptions+occurring+during+processing],
 specifically in this PR 
[https://github.com/apache/kafka/pull/16745/files#diff-77791b213bb41d1df63a23860f1faf4394dfbd7d6c4ed9cd021950d82c31c24f]
 a change was introduced to catch only the RuntimeException type and handle it in 
the handler. However, the Jackson exceptions are checked: they inherit from 
Exception (via IOException), not RuntimeException.

So with this change all the Jackson exceptions (or any checked exceptions) 
would not be passed to the DeserializationExceptionHandler like it was before.
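A minimal sketch of the behavioral difference described in the report (simplified, not the actual Kafka Streams code): Jackson's JsonProcessingException extends IOException, a checked exception, so a handler path that catches only RuntimeException never sees it.

```java
import java.io.IOException;

class CatchDemo {
    interface ThrowingDeserializer {
        String deserialize(byte[] bytes) throws Exception;
    }

    // Sketch of the 3.9.0 pattern: only RuntimeException reaches the
    // DeserializationExceptionHandler path; checked exceptions fall through.
    static String handle(ThrowingDeserializer d, byte[] bytes) {
        try {
            return d.deserialize(bytes);
        } catch (RuntimeException e) {
            return "handled"; // DeserializationExceptionHandler invoked
        } catch (Exception e) {
            return "crashed"; // checked exceptions (e.g. Jackson's) escape
        }
    }

    public static void main(String[] args) {
        // Simulate a Jackson parse failure (checked IOException subtype).
        System.out.println(handle(b -> { throw new IOException("bad json"); },
                new byte[0])); // crashed
        // Simulate an unchecked failure.
        System.out.println(handle(b -> { throw new IllegalStateException(); },
                new byte[0])); // handled
    }
}
```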

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-1106: Add duration based offset reset option for consumer clients

2024-11-12 Thread Manikumar
Hi Apoorv,

AM7: AutoOffsetReset.java is for Kafka Streams API. I am not proposing any
public Interface/class for Kafka Consumer.
As mentioned in the KIP, even though OffsetResetStrategy is a public class,
it's not used in any public APIs. I think new internal classes should be
sufficient.

AM8: Fixed

AM9: This class is for kafka streams API

AM10: I was using the EARLIEST_TIMESTAMP, LATEST_TIMESTAMP constant values
from ListOffsetsRequest


But yes, we can use Optional in AutoOffsetReset class. Updated the KIP.


Thanks.



On Tue, Nov 12, 2024 at 6:15 PM Apoorv Mittal 
wrote:

> Hi,
> I read the changes for single configuration and deprecated
> OffsetResetStrategy.java.
>
> AM7: Question: The KIP says that previous supported values were
> earliest/latest/none and new back: config would be added. We have
> no definition of "none" in the newly introduced AutoOffsetReset.java class
> hence I am assuming that if "none" is specified as a config option then
> that config will be ignored, correct? Or are we deprecating the usage of
> "none" altogether?
>
> AM8: Minor: The new class AutoOffsetReset under interfaces mentions the
> name as OffsetResetStrategy.java. This requires correction.
>
> AM9: Is the package name correct for AutoOffsetReset as
> org.apache.kafka.streams, shouldn't it be under clients package?
>
> AM10: What does -1L and -2L mean as long for latest and earliest in
> AutoOffsetReset.java? Is it just a long placeholder which will never be
> used elsewhere for latest and earliest? If yes then does it make sense to
> keep long as Optional, and use Optional.empty() for latest and earliest?
>
> Regards,
> Apoorv Mittal
>
>
> On Tue, Nov 12, 2024 at 12:06 PM Manikumar 
> wrote:
>
> > Thanks Ismael and Lianet for the reviews.
> >
> > Based on suggestions, I have updated the KIP in favour of having a
> > single config (auto.offset.reset).
> > I have also adopted Lianet's suggestion on naming.
> >
> > auto.offset.reset=back:P3D -> reset back 3 days
> >
> >
> > Let me know if there are any concerns.
> >
> >
> > Thanks,
> >
> >
> > On Sat, Nov 9, 2024 at 10:06 PM Lianet M.  wrote:
> > >
> > > Hi all. Thanks Manikumar for the nice improvement, useful indeed.
> > >
> > > I also lean towards having a single config given that it's all about
> the
> > > reset policy, seems all the same "what" (auto reset policy) and we are
> > > just extending the with a new behaviour. Regarding the naming (agree on
> > > duration being confusing), what about something to show that it's
> simply
> > > about how far back to reset:
> > >
> > > auto.offset.reset=EARLIEST
> > > auto.offset.reset=BACK:P3D -> reset back 3 days (combined with ISO8601
> > > seems really easy to read/understand from the config definition itself)
> > >
> > > Something like that would definitely break consistency with the command
> > > line tool argument "by_duration", but if it seems clearer we should
> > > consider the tradeoff and not penalize the consumer API/configs.
> > >
> > > Thanks!
> > >
> > > Lianet
> > >
> > >
> > > On Sat, Nov 9, 2024 at 2:47 AM Ismael Juma  wrote:
> > >
> > > > Thanks for the KIP, this is useful. A comment below.
> > > >
> > > > On Thu, Nov 7, 2024 at 4:51 PM Matthias J. Sax 
> > wrote:
> > > >
> > > > > I am personally not convinced that adding a new config
> > > > > `auto.offset.reset.by.duration` is the best way. Kafka in general
> > has
> > > > > way too many configs and trying to avoid adding more configs seems
> > to be
> > > > > desirable?  -- It seems this might be a point of contention, and if
> > the
> > > > > majority of people wants this new config so be it. I just wanted to
> > > > > stress my concerns about it.
> > > >
> > > >
> > > > I agree that we don't need a new config. We can simply use a value
> for
> > the
> > > > existing config. I think a prefix followed by the relevant ISO8601
> > string
> > > > would be clear enough. For example, "by-duration:P23DT23H" or
> something
> > > > along those lines. I do find the "by-duration" description a bit
> > confusing
> > > > for what we're doing here (i.e. current time - duration) although
> > there is
> > > > precedent in the reset offsets tool.
> > > >
> > > > Ismael
> > > >
> >
>
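
The "prefix followed by an ISO8601 string" idea discussed above (reset point = current time - duration) can be sketched as below. This is a minimal illustration of the proposed config semantics, not the actual Kafka implementation; the class and method names are hypothetical, and the "by-duration:" prefix follows Ismael's example:

```java
import java.time.Duration;
import java.time.Instant;

public class DurationReset {
    // Interpret a config value like "by-duration:P3D": parse the ISO-8601
    // duration after the prefix and subtract it from the current time.
    static Instant resetTimestamp(String configValue, Instant now) {
        String prefix = "by-duration:";
        if (!configValue.startsWith(prefix)) {
            throw new IllegalArgumentException("unsupported reset value: " + configValue);
        }
        Duration back = Duration.parse(configValue.substring(prefix.length()));
        return now.minus(back);  // reset point = current time - duration
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2024-11-12T00:00:00Z");
        System.out.println(resetTimestamp("by-duration:P3D", now));       // 2024-11-09T00:00:00Z
        System.out.println(resetTimestamp("by-duration:P23DT23H", now));  // 2024-10-19T01:00:00Z
    }
}
```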


Re: [DISCUSS] KIP-1106: Add duration based offset reset option for consumer clients

2024-11-12 Thread Andrew Schofield
Hi,
Looks good now. Just one suggestion.

AS8: Instead of "back:30D", I wonder whether the word 'duration' ought to be
used to be consistent with kafka-consumer-groups.sh. So,
"by-duration:P3D" or "duration:P3D" might be better.

The overall idea of merging the configs into one config is fine in the current
text of the KIP.

Thanks,
Andrew


From: Manikumar 
Sent: 12 November 2024 13:30
To: dev@kafka.apache.org 
Subject: Re: [DISCUSS] KIP-1106: Add duration based offset reset option for 
consumer clients



[jira] [Resolved] (KAFKA-17744) Improve the State Updater logs when restoring state

2024-11-12 Thread Bruno Cadonna (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bruno Cadonna resolved KAFKA-17744.
---
Resolution: Fixed

> Improve the State Updater logs when restoring state
> ---
>
> Key: KAFKA-17744
> URL: https://issues.apache.org/jira/browse/KAFKA-17744
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Sebastien Viale
>Assignee: Sebastien Viale
>Priority: Minor
>
> The logs for Kafka Streams local state restoration incorrectly refer to the 
> {{StreamThread}} instead of the {{StateUpdater}} thread, which is responsible 
> for decoupling the restoration process. The restore consumer group also 
> references {{StreamThread}} instead of {{{}StateUpdater{}}}, which should be 
> corrected for clarity.
> *Current logs:* 
> ...
> stream-thread [***-StreamThread-1] Restoration in progress for 2 partitions. 
> \{stateful_app-count-store-name-changelog-1: position=2698468, end=5716195, 
> totalRestored=2698468} \{stateful_app-count-store-name-changelog-0: 
> position=2655839, end=5743384, totalRestored=2655839}
> stream-thread [***-StreamThread-1] Restoration in progress for 2 partitions. 
> \{stateful_app-count-store-name-changelog-1: position=5415824, end=5716195, 
> totalRestored=5415824} \{stateful_app-count-store-name-changelog-0: 
> position=5412953, end=5743384, totalRestored=5412953}
> stream-thread [***-StreamThread-1] Finished restoring changelog 
> stateful_app-count-store-name-changelog-1 to store count-store-name with a 
> total number of 5716195 records
> stream-thread [***-StreamThread-1] Restoration in progress for 1 partitions. 
> \{stateful_app-count-store-name-changelog-1: position=1000, end=5716195, 
> totalRestored=1000}
> stream-thread [***-StreamThread-1] Finished restoring changelog 
> stateful_app-count-store-name-changelog-0 to store count-store-name with a 
> total number of 5743384 records
> ...
> [Consumer clientId=***-StreamThread-1-restore-consumer, groupId=null] 
> Assigned to partition(s): stateful_app-count-store-name-changelog-1, 
> stateful_app-count-store-name-changelog-0
> ...
> [Consumer clientId=***-StreamThread-1-restore-consumer, groupId=null] 
> Resetting offset for partition stateful_app-count-store-name-changelog-0 to 
> position FetchPosition{offset=0, offsetEpoch=Optional.empty, 
> currentLeader=LeaderAndEpoch{leader=Optional[localhost:19092 (id: 1 rack: 
> null)], epoch=2}}.
>  
> *Expected logs:*
> ...
> ...
> state-updater [***-StateUpdater-1] Restoration in progress for 2 partitions. 
> \{stateful_app-count-store-name-changelog-1: position=2698468, end=5716195, 
> totalRestored=2698468} \{stateful_app-count-store-name-changelog-0: 
> position=2655839, end=5743384, totalRestored=2655839}
> state-updater [***-StateUpdater-1] Restoration in progress for 2 partitions. 
> \{stateful_app-count-store-name-changelog-1: position=5415824, end=5716195, 
> totalRestored=5415824} \{stateful_app-count-store-name-changelog-0: 
> position=5412953, end=5743384, totalRestored=5412953}
> state-updater [***-StateUpdater-1] Finished restoring changelog 
> stateful_app-count-store-name-changelog-1 to store count-store-name with a 
> total number of 5716195 records
> state-updater [***-StateUpdater-1] Restoration in progress for 1 partitions. 
> \{stateful_app-count-store-name-changelog-1: position=1000, end=5716195, 
> totalRestored=1000}
> state-updater [***-StateUpdater-1] Finished restoring changelog 
> stateful_app-count-store-name-changelog-0 to store count-store-name with a 
> total number of 5743384 records
> ...
> [Consumer clientId=***-StateUpdater-1-restore-consumer, groupId=null] 
> Assigned to partition(s): stateful_app-count-store-name-changelog-1, 
> stateful_app-count-store-name-changelog-0
> ...
> [Consumer clientId=***-StateUpdater-1-restore-consumer, groupId=null] 
> Resetting offset for partition stateful_app-count-store-name-changelog-0 to 
> position FetchPosition{offset=0, offsetEpoch=Optional.empty, 
> currentLeader=LeaderAndEpoch{leader=Optional[localhost:19092 (id: 1 rack: 
> null)], epoch=2}}.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)