[jira] [Resolved] (KAFKA-14331) Upgrade to Scala 2.13.10

2022-10-25 Thread Viktor Somogyi-Vass (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14331?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Viktor Somogyi-Vass resolved KAFKA-14331.
-
Resolution: Duplicate

> Upgrade to Scala 2.13.10
> 
>
> Key: KAFKA-14331
> URL: https://issues.apache.org/jira/browse/KAFKA-14331
> Project: Kafka
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 3.4.0
>Reporter: Viktor Somogyi-Vass
>Priority: Major
>
> There are some CVEs in Scala 2.13.8, so we should upgrade to the latest.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #1318

2022-10-25 Thread Apache Jenkins Server
See 




Re: [DISCUSS] (continued) KIP-848: The Next Generation of the Consumer Rebalance Protocol

2022-10-25 Thread David Jacot
Hi Magnus,

Thanks for your interesting perspective. Your points are totally valid
so I will revert this change to the previous version. I have also
added the reasoning in the rejected alternatives section.

Best,
David

On Mon, Oct 24, 2022 at 4:49 PM Magnus Edenhill  wrote:
>
> Hi, one minor comment on the latest update:
>
>
> On Mon, 24 Oct 2022 at 16:26, David Jacot wrote:
>
> > * Jason pointed out that the member id handling is a tad weird. The
> > group coordinator generates the member id and then trusts the member
> > when it rejoins the group. This also implies that the client could
> > directly generate its member id and the group coordinator will accept
> > it. It seems better to directly let the client generate id instead of
> > relying on the group coordinator. I have updated the KIP in this
> > direction. Note that the new APIs still use a string for the member id
> > in order to remain consistent with the existing APIs.
> >
>
> We had a similar discussion for id generation in KIP-714 and I'd advise
> against client-side id generation for a couple of reasons:
>  - it is much more likely for the client side prng to be poorly seeded, or
> incorrectly implemented, than the server side.
>This risks two different consumer instances generating the same id.
>  - it adds an extra dependency on the client, a uuid library/module, which
> brings with it the usual plethora
>of linking conflicts, package availability issues, etc.
>  - as for trusting the authenticity of the id; with server-side generation
> we at least have a (future) possibility for verifying the id, would it ever
> become an issue.
>
>
> Regards,
> Magnus
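
A minimal sketch of the two id-generation alternatives weighed above, for
readers following along; the surrounding class and method names are
hypothetical, and only Kafka's Uuid and the JDK's UUID are real APIs:

import org.apache.kafka.common.Uuid;

public class MemberIdSketch {

    // Server-side generation, as recommended above: the coordinator assigns
    // the id on first join from one well-seeded source, so two consumer
    // instances can never collide.
    static String serverAssignedMemberId() {
        return Uuid.randomUuid().toString();
    }

    // Client-side alternative argued against above: correctness now depends
    // on every client runtime shipping a correctly seeded UUID implementation.
    static String clientGeneratedMemberId() {
        return java.util.UUID.randomUUID().toString();
    }
}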


Re: [DISCUSS] (continued) KIP-848: The Next Generation of the Consumer Rebalance Protocol

2022-10-25 Thread David Jacot
Hi Jun,

90.1 You're right. That's a miss on my side.
90.2 That makes sense. Relying on
ConsumerGroupTargetAssignmentMetadataValue.AssignmentEpoch should be
enough here.

91. When a client side assignor is used, the assignment is computed
asynchronously. While it is computed for the group at epoch X, the
group may have already advanced to epoch X+1 due to another event
(e.g. new member joined). In this case, we have chosen to install the
assignment computed for epoch X and to trigger a new assignment
computation right away. So if the group topology changes rapidly, the
assignment may lag a bit behind but it will eventually converge. For
context, the alternative would have been to cancel the assignment and
to trigger a new one immediately but this approach is prone to a live
lock. For instance, if a member joins and leaves all the time, there
is a chance that a new assignment is never computed. I have extended
that phrase to be more explicit.
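
To make the install-then-recompute behavior concrete, here is a rough
sketch; all names are invented for illustration and nothing below is taken
from the KIP or the codebase:

class AssignmentLoopSketch {
    int groupEpoch;             // bumped on every group change (e.g. a member joins)
    int targetAssignmentEpoch;  // epoch the installed assignment was computed for

    void onClientAssignmentComputed(int computedForEpoch, Object assignment) {
        // Install the result even if the group has already advanced past the
        // epoch it was computed for: a slightly stale assignment is still usable.
        targetAssignmentEpoch = computedForEpoch;
        persistTargetAssignment(assignment);

        // Trigger a fresh computation instead of cancelling the stale one up
        // front: cancelling can live-lock if members join and leave
        // continuously, whereas install-and-recompute always converges.
        if (groupEpoch > computedForEpoch) {
            startAssignmentComputation(groupEpoch);
        }
    }

    void persistTargetAssignment(Object assignment) { /* write assignment records */ }
    void startAssignmentComputation(int epoch) { /* invoke the client-side assignor */ }
}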

Thanks for the comments.

Best,
David

On Mon, Oct 24, 2022 at 7:28 PM Jun Rao  wrote:
>
> Hi, David,
>
> Thanks for the updated KIP. A few more comments.
>
> 90. ConsumerGroupTargetAssignmentMemberValue:
> 90.1 Do we need to include MemberId here given that it's in the key already?
> 90.2 Since there is no new record if the new member assignment is the same,
> it seems that AssignmentEpoch doesn't always reflect the correct assignment
> epoch? If so, do we still need this field? Could we just depend
> on ConsumerGroupTargetAssignmentMetadataValue.AssignmentEpoch?
>
> 91. "The AssignmentEpoch corresponds to the group epoch used to compute the
> assignment. It is not necessarily the last one." Could you explain what "It
> is not necessarily the last one." means?
>
> Thanks,
>
> Jun
>
>
> On Mon, Oct 24, 2022 at 7:49 AM Magnus Edenhill  wrote:
>
> > Hi, one minor comment on the latest update:
> >
> >
> > On Mon, 24 Oct 2022 at 16:26, David Jacot wrote:
> >
> > > * Jason pointed out that the member id handling is a tad weird. The
> > > group coordinator generates the member id and then trusts the member
> > > when it rejoins the group. This also implies that the client could
> > > directly generate its member id and the group coordinator will accept
> > > it. It seems better to directly let the client generate id instead of
> > > relying on the group coordinator. I have updated the KIP in this
> > > direction. Note that the new APIs still use a string for the member id
> > > in order to remain consistent with the existing APIs.
> > >
> >
> > We had a similar discussion for id generation in KIP-714 and I'd advise
> > against client-side id generation for a couple of reasons:
> >  - it is much more likely for the client side prng to be poorly seeded, or
> > incorrectly implemented, than the server side.
> >This risks two different consumer instances generating the same id.
> >  - it adds an extra dependency on the client, a uuid library/module, which
> > brings with it the usual plethora
> >of linking conflicts, package availability issues, etc.
> >  - as for trusting the authenticity of the id; with server-side generation
> > we at least have a (future) possibility for verifying the id, would it ever
> > become an issue.
> >
> >
> > Regards,
> > Magnus
> >


Re: [VOTE] KIP-848: The Next Generation of the Consumer Rebalance Protocol

2022-10-25 Thread David Jacot
Hi all,

The vote has been open for a while. I plan to close it on Friday if
there are no further comments in the discussion thread.

Best,
David

On Wed, Oct 19, 2022 at 6:10 PM Jun Rao  wrote:
>
> Hi, David,
>
> Thanks for the KIP. +1
>
> Jun
>
> On Wed, Oct 19, 2022 at 2:21 AM Magnus Edenhill  wrote:
>
> > Great work on the KIP, David.
> >
> > +1 (nonbinding)
> >
> > On Fri, 14 Oct 2022 at 11:50, Luke Chen wrote:
> >
> > > Hi David,
> > >
> > > I made a final pass and LGTM now.
> > > +1 from me.
> > >
> > > Luke
> > >
> > > On Wed, Oct 5, 2022 at 12:32 AM Guozhang Wang 
> > wrote:
> > >
> > > > Hello David,
> > > >
> > > > I've made my final pass on the doc and I think it looks good now. +1.
> > > >
> > > >
> > > > Guozhang
> > > >
> > > > On Wed, Sep 14, 2022 at 1:37 PM Guozhang Wang 
> > > wrote:
> > > >
> > > > > Thanks David,
> > > > >
> > > > > There are a few minor comments pending in the discussion thread, and
> > > one
> > > > > is about whether we should merge PreparePartitionAssignment with HB.
> > > But
> > > > I
> > > > > think the KIP itself is in pretty good shape now. Thanks!
> > > > >
> > > > >
> > > > > Guozhang
> > > > >
> > > > > On Fri, Sep 9, 2022 at 1:32 AM David Jacot
> >  > > >
> > > > > wrote:
> > > > >
> > > > >> Hi all,
> > > > >>
> > > > >> Thank you all for the very positive discussion about KIP-848. It
> > looks
> > > > >> like folks are very positive about it overall.
> > > > >>
> > > > >> I would like to start a vote on KIP-848, which introduces a brand
> > new
> > > > >> consumer rebalance protocol.
> > > > >>
> > > > >> The KIP is here: https://cwiki.apache.org/confluence/x/HhD1D.
> > > > >>
> > > > >> Best,
> > > > >> David
> > > > >>
> > > > >
> > > > >
> > > > > --
> > > > > -- Guozhang
> > > > >
> > > >
> > > >
> > > > --
> > > > -- Guozhang
> > > >
> > >
> >


Request permissions to contribute to kafka

2022-10-25 Thread Wang Jie
Hi team,

I would like to request contributor access to the project.

Jira-ID: jackwangcs
Wiki-ID: jackwangcs

Best,
Jie


Re: Request permissions to contribute to kafka

2022-10-25 Thread Chris Egerton
Hi Jie,

You should be good to go now.

Cheers,

Chris

On Tue, Oct 25, 2022 at 11:04 AM Wang Jie  wrote:

> Hi team,
>
> I would like to request contributor access to the project.
>
> Jira-ID: jackwangcs
> Wiki-ID: jackwangcs
>
> Best,
> Jie
>


Build failed in Jenkins: Kafka » Kafka Branch Builder » trunk #1319

2022-10-25 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 508556 lines...]
[2022-10-25T15:45:20.556Z] Gradle Test Run :streams:integrationTest > Gradle 
Test Executor 165 > TableTableJoinIntegrationTest [caching enabled = false]: 
testOuter, testLeft, testInnerInner, testInnerOuter, testInnerLeft, 
testOuterInner and testOuterOuter all STARTED and PASSED
[2022-10-25T15:46:05.105Z] Gradle Test Run :streams:integrationTest > Gradle 
Test Executor 165 > TaskAssignorIntegrationTest > 
shouldProperlyConfigureTheAssignor STARTED and PASSED
[2022-10-25T15:46:05.105Z] Gradle Test Run :streams:integrationTest > Gradle 
Test Executor 165 > TaskMetadataIntegrationTest > 
shouldReportCorrectEndOffsetInformation STARTED
[log truncated]

Re: [DISCUSS] KIP-878: Autoscaling for Statically Partitioned Streams

2022-10-25 Thread Bill Bejeck
Hi Sophie,

Thanks for the KIP! I think this is a worthwhile feature to add.  I have
two main questions about how this new feature will work.


   1. You mention that for stateless applications auto-scaling is a stickier
   situation.  But I was thinking that the auto-scaling would actually benefit
   stateless applications the most, let me explain my thinking.  Let's say you
   have a stateless Kafka Streams application with one input topic and 2
   partitions, meaning you're limited to at most 2 stream threads.  In order
   to increase the throughput, you increase the number of partitions of the
   source topic to 4, so you can run 4 stream threads.  In this case would the
   auto-scaling feature automatically increase the number of tasks from 2 to
   4?  Since the application is stateless, say using a filter then a map for
   example, the partition for the record doesn't matter, so it seems that
   stateless applications would stand to gain a great deal.
   2. For stateful applications I can see the immediate benefit from
   autoscaling and static partitioning.   But again going with a partition
   expansion for increased throughput example, what would be the mitigation
   strategy for a stateful application that eventually wants to take advantage
   of the increased number of partitions? Otherwise keeping all keys on their
   original partition means you could end up with "key skew" due to not
   allowing keys to distribute out to the new partitions.

One last comment, the KIP states "only the key, rather than the key and
value, are passed in to the partitioner", but the interface has it taking a
key and a value as parameters.  Based on your comments earlier in this
thread I was thinking that the text needs to be updated.

Thanks,
Bill

On Fri, Oct 21, 2022 at 12:21 PM Lucas Brutschy
 wrote:

> Hi all,
>
> thanks, Sophie, this makes sense. I suppose then the way to help the user
> not apply this in the wrong setting is having good documentation and one
> or two examples of good use cases.
>
> I think Colt's time-based partitioning is a good example of how to use
> this. It actually doesn't have to be time, the same will work with any
> monotonically increasing identifier. I.e. the new partitions will only get
> records for users with a "large" user ID greater than some user ID
> threshold hardcoded in the static partitioner. At least in this restricted
> use-case, lookups by user ID would still be possible.
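
A sketch of this threshold scheme, written against the released
StreamPartitioner interface since KIP-878's StaticStreamPartitioner was not
yet available; the original partition count and the id threshold are made-up
example values:

import org.apache.kafka.streams.processor.StreamPartitioner;

public class UserIdThresholdPartitioner implements StreamPartitioner<Long, String> {
    // User ids are assigned monotonically, so any id at or above this
    // (hypothetical) threshold was created after the partition expansion.
    private static final long NEW_PARTITION_ID_THRESHOLD = 1_000_000L;

    @Override
    public Integer partition(String topic, Long userId, String value, int numPartitions) {
        final int originalCount = 4; // partition count the topic started with
        if (userId < NEW_PARTITION_ID_THRESHOLD || numPartitions <= originalCount) {
            // Old users stay exactly where they always were: the mapping is static.
            return (int) (userId % originalCount);
        }
        // Only new users can land on the added partitions, so no existing
        // key ever moves and lookups by user id remain possible.
        return originalCount + (int) (userId % (numPartitions - originalCount));
    }
}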
>
> Cheers,
> Lucas
>
> On Fri, Oct 21, 2022 at 5:37 PM Colt McNealy  wrote:
>
> > Sophie,
> >
> > Regarding item "3" (my last paragraph from the previous email), perhaps I
> > should give a more general example now that I've had more time to clarify
> > my thoughts:
> >
> > In some stateful applications, certain keys have to be findable without
> any
> > information about when the relevant data was created. For example, if I'm
> > running a word-count app and I want to use Interactive Queries to find
> the
> > count for "foo", I would need to know whether "foo" first arrived before
> or
> > after time T before I could find the correct partition to look up the
> data.
> > In this case, I don't think static partitioning is possible. Is this
> > use-case a non-goal of the KIP, or am I missing something?
> >
> > Colt McNealy
> > *Founder, LittleHorse.io*
> >
> >
> > On Thu, Oct 20, 2022 at 6:37 PM Sophie Blee-Goldman
> >  wrote:
> >
> > > Thanks for the responses guys! I'll get the easy stuff out of the way
> > > first:
> > >
> > > 1) Fixed the KIP so that StaticStreamPartitioner extends
> > StreamPartitioner
> > > 2) I totally agree with you Colt, the record value might have valuable
> > (no
> > > pun) information
> > > in it that is needed to compute the partition without breaking the
> static
> > > constraint. As in my
> > > own example earlier, maybe the userId is a field in the value and not
> the
> > > key itself. Actually
> > > it was that exact thought that made me do a U-turn on this but I forgot
> > to
> > > update the thread
> > > 3) Colt, I'm not sure I follow what you're trying to say in that last
> > > paragraph, can you expand?
> > > 4) Lucas, it's a good question as to what kind of guard-rails we could
> > put
> > > up to enforce or even
> > > detect a violation of static partitioning. Most likely Streams would
> need
> > > to track every key to
> > > partition mapping in an internal state store, but we have no guarantee
> > the
> > > key space is bounded
> > > and the store wouldn't grow out of control. Mostly however I imagine
> > users
> > > would be frustrated
> > > to find out there's a secret, extra state store taking up space when
> you
> > > enable autoscaling, and
> > > it's not even to provide functionality but just to make sure users
> aren't
> > > doing something wrong.
> > >
> > > I wish I had a better idea, but sadly I think the only practical
> solution
> > > here is to try and make this
> > > condition as clear and obvious and easy to understand as possible,
> > perhaps
> 

Re: [DISCUSS] (continued) KIP-848: The Next Generation of the Consumer Rebalance Protocol

2022-10-25 Thread Jun Rao
Hi, David,

Thanks for the reply.

The KIP mentioned downgrade support in a future KIP. So, with this KIP,
once the new records have been generated on the coordinator, there is no
path to downgrade the broker, is that correct?

Thanks,

Jun

On Tue, Oct 25, 2022 at 7:10 AM David Jacot 
wrote:

> Hi Jun,
>
> 90.1 You're right. That's a miss on my side.
> 90.2 That makes sense. Relying on
> ConsumerGroupTargetAssignmentMetadataValue.AssignmentEpoch should be
> enough here.
>
> 91. When a client side assignor is used, the assignment is computed
> asynchronously. While it is computed for the group at epoch X, the
> group may have already advanced to epoch X+1 due to another event
> (e.g. new member joined). In this case, we have chosen to install the
> assignment computed for epoch X and to trigger a new assignment
> computation right away. So if the group topology changes rapidly, the
> assignment may lag a bit behind but it will eventually converge. For
> context, the alternative would have been to cancel the assignment and
> to trigger a new one immediately but this approach is prone to a live
> lock. For instance, if a member joins and leaves all the time, there
> is a chance that a new assignment is never computed. I have extended
> that phrase to be more explicit.
>
> Thanks for the comments.
>
> Best,
> David
>
> On Mon, Oct 24, 2022 at 7:28 PM Jun Rao  wrote:
> >
> > Hi, David,
> >
> > Thanks for the updated KIP. A few more comments.
> >
> > 90. ConsumerGroupTargetAssignmentMemberValue:
> > 90.1 Do we need to include MemberId here given that it's in the key
> already?
> > 90.2 Since there is no new record if the new member assignment is the
> same,
> > it seems that AssignmentEpoch doesn't always reflect the correct
> assignment
> > epoch? If so, do we still need this field? Could we just depend
> > on ConsumerGroupTargetAssignmentMetadataValue.AssignmentEpoch?
> >
> > 91. "The AssignmentEpoch corresponds to the group epoch used to compute
> the
> > assignment. It is not necessarily the last one." Could you explain what
> "It
> > is not necessarily the last one." means?
> >
> > Thanks,
> >
> > Jun
> >
> >
> > On Mon, Oct 24, 2022 at 7:49 AM Magnus Edenhill 
> wrote:
> >
> > > Hi, one minor comment on the latest update:
> > >
> > >
> > > On Mon, 24 Oct 2022 at 16:26, David Jacot wrote:
> > >
> > > > * Jason pointed out that the member id handling is a tad weird. The
> > > > group coordinator generates the member id and then trusts the member
> > > > when it rejoins the group. This also implies that the client could
> > > > directly generate its member id and the group coordinator will accept
> > > > it. It seems better to directly let the client generate id instead of
> > > > relying on the group coordinator. I have updated the KIP in this
> > > > direction. Note that the new APIs still use a string for the member
> id
> > > > in order to remain consistent with the existing APIs.
> > > >
> > >
> > > We had a similar discussion for id generation in KIP-714 and I'd advise
> > > against client-side id generation for a couple of reasons:
> > >  - it is much more likely for the client side prng to be poorly
> seeded, or
> > > incorrectly implemented, than the server side.
> > >This risks two different consumer instances generating the same id.
> > >  - it adds an extra dependency on the client, a uuid library/module,
> which
> > > brings with it the usual plethora
> > >of linking conflicts, package availability issues, etc.
> > >  - as for trusting the authenticity of the id; with server-side
> generation
> > > we at least have a (future) possibility for verifying the id, would it
> ever
> > > become an issue.
> > >
> > >
> > > Regards,
> > > Magnus
> > >
>


Re: [DISCUSS] KIP-858: Handle JBOD broker disk failure in KRaft

2022-10-25 Thread Igor Soarez
Hello,

There’s now a proposal to address ZK to KRaft migration — KIP-866 — but 
currently it doesn't address JBOD so I've decided to update this proposal to 
address that migration scenario.

So given that:

- When migrating from a ZK cluster running JBOD to KRaft, brokers registering 
in KRaft mode will need to be able to register all configured log directories.
- As part of the migration, the mapping of partition to log directory will have 
to be learnt by the active controller and persisted into the cluster metadata.
- It isn’t safe to allow leadership from replicas without this mapping, because 
if the hosting log directory fails there will be no failover mechanism.

I have updated the proposal to reflect that:

- Multiple log directories may be indicated in the first broker registration 
referencing log directory UUIDs. We no longer require a single log directory to 
start with.
- The controller must never assign leadership to a replica in a broker 
registered with multiple log directories, unless the partition to log directory 
assignment is already in the cluster metadata (see the sketch after this list).
- The broker should not be unfenced until all of its partition to log directory 
mappings are persisted into cluster metadata.
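
Expressed as pseudocode, the leadership rule in the second point might look
as follows; every name here is invented for illustration, and none of it is
from the actual KRaft controller:

class JbodLeadershipSketch {
    java.util.Map<Integer, Integer> logDirCountByBroker = new java.util.HashMap<>();
    java.util.Set<String> dirAssignmentsInMetadata = new java.util.HashSet<>();

    boolean eligibleForLeadership(int brokerId, String topicPartition) {
        // A broker with a single log directory is always safe: the replica can
        // only live in one place, so failover needs no directory mapping.
        if (logDirCountByBroker.getOrDefault(brokerId, 1) == 1) {
            return true;
        }
        // A multi-directory (JBOD) broker may lead only once the partition's
        // directory assignment is in cluster metadata; otherwise a directory
        // failure would leave the controller with no failover signal.
        return dirAssignmentsInMetadata.contains(brokerId + ":" + topicPartition);
    }
}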

I've also added details as to how the ZK to KRaft migration can work in a 
cluster that is already operating with JBOD. 

Please have a look and share your thoughts.

--
Igor




Re: [DISCUSS] (continued) KIP-848: The Next Generation of the Consumer Rebalance Protocol

2022-10-25 Thread David Jacot
Hi Jun,

That’s right. I propose to tackle this in a separate KIP because the
downgrade needs some thinking.

I would like to point out that the current group coordinator does not
support downgrades either, so we don’t make it worse. I am personally in
favor of doing this KIP before we release the new protocol. Supporting
downgrades is important.

We could of course extend the current KIP for this as well but I think that
the current KIP is large and complicated enough.

Does this sound reasonable?

Best,
David

On Tue, 25 Oct 2022 at 19:41, Jun Rao wrote:

> Hi, David,
>
> Thanks for the reply.
>
> The KIP mentioned downgrade support in a future KIP. So, with this KIP,
> once the new records have been generated on the coordinator, there is no
> path to downgrade the broker, is that correct?
>
> Thanks,
>
> Jun
>
> On Tue, Oct 25, 2022 at 7:10 AM David Jacot 
> wrote:
>
> > Hi Jun,
> >
> > 90.1 You're right. That's a miss on my side.
> > 90.2 That makes sense. Relying on
> > ConsumerGroupTargetAssignmentMetadataValue.AssignmentEpoch should be
> > enough here.
> >
> > 91. When a client side assignor is used, the assignment is computed
> > asynchronously. While it is computed for the group at epoch X, the
> > group may have already advanced to epoch X+1 due to another event
> > (e.g. new member joined). In this case, we have chosen to install the
> > assignment computed for epoch X and to trigger a new assignment
> > computation right away. So if the group topology changes rapidly, the
> > assignment may lag a bit behind but it will eventually converge. For
> > context, the alternative would have been to cancel the assignment and
> > to trigger a new one immediately but this approach is prone to a live
> > lock. For instance, if a member joins and leaves all the time, there
> > is a chance that a new assignment is never computed. I have extended
> > that phrase to be more explicit.
> >
> > Thanks for the comments.
> >
> > Best,
> > David
> >
> > On Mon, Oct 24, 2022 at 7:28 PM Jun Rao 
> wrote:
> > >
> > > Hi, David,
> > >
> > > Thanks for the updated KIP. A few more comments.
> > >
> > > 90. ConsumerGroupTargetAssignmentMemberValue:
> > > 90.1 Do we need to include MemberId here given that it's in the key
> > already?
> > > 90.2 Since there is no new record if the new member assignment is the
> > same,
> > > it seems that AssignmentEpoch doesn't always reflect the correct
> > assignment
> > > epoch? If so, do we still need this field? Could we just depend
> > > on ConsumerGroupTargetAssignmentMetadataValue.AssignmentEpoch?
> > >
> > > 91. "The AssignmentEpoch corresponds to the group epoch used to compute
> > the
> > > assignment. It is not necessarily the last one." Could you explain what
> > "It
> > > is not necessarily the last one." means?
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > >
> > > On Mon, Oct 24, 2022 at 7:49 AM Magnus Edenhill 
> > wrote:
> > >
> > > > Hi, one minor comment on the latest update:
> > > >
> > > >
> > > > On Mon, 24 Oct 2022 at 16:26, David Jacot wrote:
> > > >
> > > > > * Jason pointed out that the member id handling is a tad weird. The
> > > > > group coordinator generates the member id and then trusts the
> member
> > > > > when it rejoins the group. This also implies that the client could
> > > > > directly generate its member id and the group coordinator will
> accept
> > > > > it. It seems better to directly let the client generate id instead
> of
> > > > > relying on the group coordinator. I have updated the KIP in this
> > > > > direction. Note that the new APIs still use a string for the member
> > id
> > > > > in order to remain consistent with the existing APIs.
> > > > >
> > > >
> > > > We had a similar discussion for id generation in KIP-714 and I'd
> advise
> > > > against client-side id generation for a couple of reasons:
> > > >  - it is much more likely for the client side prng to be poorly
> > seeded, or
> > > > incorrectly implemented, than the server side.
> > > >This risks two different consumer instances generating the same
> id.
> > > >  - it adds an extra dependency on the client, a uuid library/module,
> > which
> > > > brings with it the usual plethora
> > > >of linking conflicts, package availability issues, etc.
> > > >  - as for trusting the authenticity of the id; with server-side
> > generation
> > > > we at least have a (future) possibility for verifying the id, would
> it
> > ever
> > > > become an issue.
> > > >
> > > >
> > > > Regards,
> > > > Magnus
> > > >
> >
>


Re: [VOTE] KIP-869: Improve Streams State Restoration Visibility

2022-10-25 Thread Walker Carlson
+1 non binding

Thanks for the kip!

On Thu, Oct 20, 2022 at 10:25 PM John Roesler  wrote:

> Thanks for the KIP, Guozhang!
>
> I'm +1 (binding)
>
> -John
>
> On Wed, Oct 12, 2022, at 16:36, Nick Telford wrote:
> > Can't wait!
> > +1 (non-binding)
> >
> > On Wed, 12 Oct 2022, 18:02 Guozhang Wang, 
> > wrote:
> >
> >> Hello all,
> >>
> >> I'd like to start a vote for the following KIP, aiming to improve Kafka
> >> Stream's restoration visibility via new metrics and callback methods:
> >>
> >>
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-869%3A+Improve+Streams+State+Restoration+Visibility
> >>
> >>
> >> Thanks!
> >> -- Guozhang
> >>
>


Re: [DISCUSS] KIP-878: Autoscaling for Statically Partitioned Streams

2022-10-25 Thread Walker Carlson
Hey Sophie,

Thanks for the KIP. I think this could be useful for a lot of cases. I also
think that this could cause a lot of confusion.

Just to make sure we are doing our best to prevent people from
misusing this feature, I wanted to clarify a couple of things.
1) There will be only an interface and no "default" implementation that a
user can plug in for the static partitioner. My concern is that, when it comes
to testing, we want to make sure that we do not make our testing
implementation available to users.
2) If a user wanted to use auto-scaling for a stateless application, it
should be as easy as implementing the StaticStreamPartitioner. Their
implementation could even just wrap the default partitioner if they wanted,
right?  I can't think of any way we could detect and then warn them about
the output topic not being partitioned by keys if that were to happen, can
you?
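
A sketch of point 2, assuming the stateless case: the partitioner below
simply reuses murmur2 hashing over whatever partitions currently exist,
which is safe when no state ties a key to its original partition. The class
name is illustrative; Utils and StreamPartitioner are real Kafka APIs.

import org.apache.kafka.common.serialization.Serializer;
import org.apache.kafka.common.utils.Utils;
import org.apache.kafka.streams.processor.StreamPartitioner;

public class StatelessScalingPartitioner<K, V> implements StreamPartitioner<K, V> {
    private final Serializer<K> keySerializer;

    public StatelessScalingPartitioner(Serializer<K> keySerializer) {
        this.keySerializer = keySerializer;
    }

    @Override
    public Integer partition(String topic, K key, V value, int numPartitions) {
        // Same hash the default partitioning uses; newly added partitions
        // start receiving records as soon as numPartitions grows.
        byte[] keyBytes = keySerializer.serialize(topic, key);
        return Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
    }
}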

Overall this looks good to me!

Walker

On Tue, Oct 25, 2022 at 12:27 PM Bill Bejeck  wrote:

> Hi Sophie,
>
> Thanks for the KIP! I think this is a worthwhile feature to add.  I have
> two main questions about how this new feature will work.
>
>
>1. You mention that for stateless applications auto-scaling is a stickier
>situation.  But I was thinking that the auto-scaling would actually
> benefit
>stateless applications the most, let me explain my thinking.  Let's say
> you
>have a stateless Kafka Streams application with one input topic and 2
>partitions, meaning you're limited to at most 2 stream threads.  In
> order
>to increase the throughput, you increase the number of partitions of the
>source topic to 4, so you can run 4 stream threads.  In this case would the
>auto-scaling feature automatically increase the number of tasks from 2
> to
>4?  Since the application is stateless, say using a filter then a map
> for
>example, the partition for the record doesn't matter, so it seems that
>stateless applications would stand to gain a great deal.
>2. For stateful applications I can see the immediate benefit from
>autoscaling and static partitioning.   But again going with a partition
>expansion for increased throughput example, what would be the mitigation
>strategy for a stateful application that eventually wants to take
> advantage
>of the increased number of partitions? Otherwise keeping all keys on
> their
>original partition means you could end up with "key skew" due to not
>allowing keys to distribute out to the new partitions.
>
> One last comment, the KIP states "only the key, rather than the key and
> value, are passed in to the partitioner", but the interface has it taking a
> key and a value as parameters.  Based on your comments earlier in this
> thread I was thinking that the text needs to be updated.
>
> Thanks,
> Bill
>
> On Fri, Oct 21, 2022 at 12:21 PM Lucas Brutschy
>  wrote:
>
> > Hi all,
> >
> > thanks, Sophie, this makes sense. I suppose then the way to help the user
> > not apply this in the wrong setting is having good documentation and
> one
> > or two examples of good use cases.
> >
> > I think Colt's time-based partitioning is a good example of how to use
> > this. It actually doesn't have to be time, the same will work with any
> > monotonically increasing identifier. I.e. the new partitions will only
> get
> > records for users with a "large" user ID greater than some user ID
> > threshold hardcoded in the static partitioner. At least in this
> restricted
> > use-case, lookups by user ID would still be possible.
> >
> > Cheers,
> > Lucas
> >
> > On Fri, Oct 21, 2022 at 5:37 PM Colt McNealy 
> wrote:
> >
> > > Sophie,
> > >
> > > Regarding item "3" (my last paragraph from the previous email),
> perhaps I
> > > should give a more general example now that I've had more time to
> clarify
> > > my thoughts:
> > >
> > > In some stateful applications, certain keys have to be findable without
> > any
> > > information about when the relevant data was created. For example, if
> I'm
> > > running a word-count app and I want to use Interactive Queries to find
> > the
> > > count for "foo", I would need to know whether "foo" first arrived
> before
> > or
> > > after time T before I could find the correct partition to look up the
> > data.
> > > In this case, I don't think static partitioning is possible. Is this
> > > use-case a non-goal of the KIP, or am I missing something?
> > >
> > > Colt McNealy
> > > *Founder, LittleHorse.io*
> > >
> > >
> > > On Thu, Oct 20, 2022 at 6:37 PM Sophie Blee-Goldman
> > >  wrote:
> > >
> > > > Thanks for the responses guys! I'll get the easy stuff out of the way
> > > > first:
> > > >
> > > > 1) Fixed the KIP so that StaticStreamPartitioner extends
> > > StreamPartitioner
> > > > 2) I totally agree with you Colt, the record value might have
> valuable
> > > (no
> > > > pun) information
> > > > in it that is needed to compute the partition without breaking the
> > static
> > > > constraint. As in my
> > > > own example earlier, maybe the userId is a field in the value and not
> > > > the key itself.

Re: [DISCUSS] (continued) KIP-848: The Next Generation of the Consumer Rebalance Protocol

2022-10-25 Thread Jun Rao
Hi, David,

Thanks for the explanation. Sounds good to me.

Jun

On Tue, Oct 25, 2022 at 12:19 PM David Jacot  wrote:

> Hi Jun,
>
> That’s right. I propose to tackle this in a separate KIP because the
> downgrade needs some thinking.
>
> I would like to point out that the current group coordinator does not
> support downgrades either, so we don’t make it worse. I am personally in
> favor of doing this KIP before we release the new protocol. Supporting
> downgrades is important.
>
> We could of course extend the current KIP for this as well but I think that
> the current KIP is large and complicated enough.
>
> Does this sound reasonable?
>
> Best,
> David
>
> On Tue, 25 Oct 2022 at 19:41, Jun Rao wrote:
>
> > Hi, David,
> >
> > Thanks for the reply.
> >
> > The KIP mentioned downgrade support in a future KIP. So, with this KIP,
> > once the new records have been generated on the coordinator, there is no
> > path to downgrade the broker, is that correct?
> >
> > Thanks,
> >
> > Jun
> >
> > On Tue, Oct 25, 2022 at 7:10 AM David Jacot  >
> > wrote:
> >
> > > Hi Jun,
> > >
> > > 90.1 You're right. That's a miss on my side.
> > > 90.2 That makes sense. Relying on
> > > ConsumerGroupTargetAssignmentMetadataValue.AssignmentEpoch should be
> > > enough here.
> > >
> > > 91. When a client side assignor is used, the assignment is computed
> > > asynchronously. While it is computed for the group at epoch X, the
> > > group may have already advanced to epoch X+1 due to another event
> > > (e.g. new member joined). In this case, we have chosen to install the
> > > assignment computed for epoch X and to trigger a new assignment
> > > computation right away. So if the group topology changes rapidly, the
> > > assignment may lag a bit behind but it will eventually converge. For
> > > context, the alternative would have been to cancel the assignment and
> > > to trigger a new one immediately but this approach is prone to a live
> > > lock. For instance, if a member joins and leaves all the time, there
> > > is a chance that a new assignment is never computed. I have extended
> > > that phrase to be more explicit.
> > >
> > > Thanks for the comments.
> > >
> > > Best,
> > > David
> > >
> > > On Mon, Oct 24, 2022 at 7:28 PM Jun Rao 
> > wrote:
> > > >
> > > > Hi, David,
> > > >
> > > > Thanks for the updated KIP. A few more comments.
> > > >
> > > > 90. ConsumerGroupTargetAssignmentMemberValue:
> > > > 90.1 Do we need to include MemberId here given that it's in the key
> > > already?
> > > > 90.2 Since there is no new record if the new member assignment is the
> > > same,
> > > > it seems that AssignmentEpoch doesn't always reflect the correct
> > > assignment
> > > > epoch? If so, do we still need this field? Could we just depend
> > > > on ConsumerGroupTargetAssignmentMetadataValue.AssignmentEpoch?
> > > >
> > > > 91. "The AssignmentEpoch corresponds to the group epoch used to
> compute
> > > the
> > > > assignment. It is not necessarily the last one." Could you explain
> what
> > > "It
> > > > is not necessarily the last one." means?
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > >
> > > > On Mon, Oct 24, 2022 at 7:49 AM Magnus Edenhill 
> > > wrote:
> > > >
> > > > > Hi, one minor comment on the latest update:
> > > > >
> > > > >
> > > > > On Mon, 24 Oct 2022 at 16:26, David Jacot wrote:
> > > > >
> > > > > > * Jason pointed out that the member id handling is a tad weird.
> The
> > > > > > group coordinator generates the member id and then trusts the
> > member
> > > > > > when it rejoins the group. This also implies that the client
> could
> > > > > > directly generate its member id and the group coordinator will
> > accept
> > > > > > it. It seems better to directly let the client generate id
> instead
> > of
> > > > > > relying on the group coordinator. I have updated the KIP in this
> > > > > > direction. Note that the new APIs still use a string for the
> member
> > > id
> > > > > > in order to remain consistent with the existing APIs.
> > > > > >
> > > > >
> > > > > We had a similar discussion for id generation in KIP-714 and I'd
> > advise
> > > > > against client-side id generation for a couple of reasons:
> > > > >  - it is much more likely for the client side prng to be poorly
> > > seeded, or
> > > > > incorrectly implemented, than the server side.
> > > > >This risks two different consumer instances generating the same
> > id.
> > > > >  - it adds an extra dependency on the client, a uuid
> library/module,
> > > which
> > > > > brings with it the usual plethora
> > > > >of linking conflicts, package availability issues, etc.
> > > > >  - as for trusting the authenticity of the id; with server-side
> > > generation
> > > > > we at least have a (future) possibility for verifying the id, would
> > it
> > > ever
> > > > > become an issue.
> > > > >
> > > > >
> > > > > Regards,
> > > > > Magnus
> > > > >
> > >
> >
>


Re: [DISCUSS] KIP-874: TopicRoundRobinAssignor

2022-10-25 Thread Sophie Blee-Goldman
Thanks for clarifying, Mathieu. I hadn't seen your PR yet and assumed from
your
earlier comments that this was still a WIP.

I do tend to agree with David on this, though; I was also struggling to
understand the
use case/motivation and felt it was quite specific.
in his PR
comment, we've tried to limit the assignors we include to more common and
simple
use cases that are easy for people to understand and choose between.

Of course since there is now a PR for it, if any others do have a similar
use case and
would find it helpful, they can plug in a custom assignor based on your
patch. If we do
see this happening often enough that people start requesting it be added to
the OOTB
assignors, then it would make sense to revisit this PR. As it is, I suspect
just putting it
out there for others to use via the pluggable interface may be helpful
enough.

Either way, thanks for the contribution!

-Sophie

On Mon, Oct 24, 2022 at 11:19 PM David Jacot  wrote:

> Hi Mathieu,
>
> Thanks for the effort that you have put in creating this KIP. I just read
> it again and I am still confused by the use cases and the motivation. I
> suppose that this works for your data model but it does not seem to be a
> general pattern.
>
> Overall, I stick to the comment that I made in the PR. I still feel like
> that this is something too use case specific to make it into Apache Kafka.
>
> Best,
> David
>
> On Tue, 25 Oct 2022 at 07:25, Mathieu Amblard wrote:
>
> > Hi Sophie,
> >
> > Thank you for your message.
> >
> > The TopicRoundRobinAssignor has already been implemented (because the
> > existing ones do not address our use cases correctly); it implements the
> > interface you mentioned, and I have made a PR to the Kafka repo where
> you
> > can see the source code. It was in a comment of that PR that someone from
> > the core team proposes to create a KIP.
> >
> > I have also fully unit tested this assignor (see the PR). Moreover,
> around
> > 20 microservices have been using it in production for 6 months now at the
> > company I am working for. So I think it has been proven, and that's why I
> > have made this proposal. I would never have dared to suggest something I
> > had never tested or that met no real need.
> >
> > Finally, this assignor suits well our Kafka architecture and use cases
> > contrary to the existing ones. For reminder, we are using Kafka as a
> > messaging system (no streaming) and we can have only one type of message
> > per topic. Maybe it is too specific, maybe not, that's also why I have
> made
> > this KIP, to challenge it, to see if someone has the same needs and if
> this
> > assignor can help others. If not, that's not a problem, I will simply
> keep
> > it in our source code.
> >
> > Regards,
> > Mathieu
> >
> >
> > Le mar. 25 oct. 2022 à 01:51, Sophie Blee-Goldman
> >  a écrit :
> >
> > > Hey Mathieu,
> > >
> > > Apologies if you already know this, but the partition assignor
> interface
> > is
> > > fully pluggable.
> > > That means you can plug in a custom PartitionAssignor implementation,
> > > without
> > > having to go through a KIP or commit any code to the AK repo.
> > >
> > > I suspect some of the current unknowns and solutions will become clear
> > when
> > > you start
> > > to actually write this assignor and test it out in your environment.
> Then
> > > you can play around
> > > with what works and what doesn't work, and come back to the KIP if
> > desired
> > > with a stronger
> > > argument for why it's needed. Or you can just take your assignor and
> > > publish it in a public
> > > git repo for anyone who might have a similar use case as you.
> > >
> > > Just my two cents, I'd recommend in this case you start with the
> > > implementation before
> > > worrying about donating your assignor back to the main repo with a KIP.
> > IF
> > > you do want to,
> > > it would then be much easier to convince people when they can see your
> > > assignor logic
> > > for themselves, and you'll be able to answer any questions.
> > >
> > > Best,
> > > Sophie
> > >
> > > On Fri, Oct 21, 2022 at 2:21 AM Mathieu Amblard <mathieu.ambl...@gmail.com> wrote:
> > >
> > > > Hello everybody,
> > > >
> > > > Just to let you know that I have added a chapter about having
> multiple
> > > > containers (multiple pods for Kubernetes) running the same
> application
> > :
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-874%3A+TopicRoundRobinAssignor#KIP874:TopicRoundRobinAssignor-Howdoesitworkifwehavemultiplecontainersrunningthesameapplication
> > > > ?
> > > >
> > > > Regards,
> > > > Mathieu
> > > >
> > > > On Tue, 11 Oct 2022 at 20:00, Mathieu Amblard <mathieu.ambl...@gmail.com> wrote:
> > > >
> > > > > Hi Hector,
> > > > >
> > > > >
> > > > >
> > > > > First, thank you for your questions !
> > > > >
> > > > >
> > > > >
> > > > > *If the goal is to do the partition assignments at a topic level,
> > > > wouldn

[jira] [Resolved] (KAFKA-14023) MirrorCheckpointTask.syncGroupOffset does not have to check if translated offset from upstream is smaller than the current consumer offset

2022-10-25 Thread Greg Harris (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Harris resolved KAFKA-14023.
-
Resolution: Won't Do

> MirrorCheckpointTask.syncGroupOffset  does not have to check if translated 
> offset from upstream is smaller than the current consumer offset
> ---
>
> Key: KAFKA-14023
> URL: https://issues.apache.org/jira/browse/KAFKA-14023
> Project: Kafka
>  Issue Type: Improvement
>  Components: KafkaConnect, mirrormaker
>Affects Versions: 3.2.0
>Reporter: Justinwins
>Assignee: Justinwins
>Priority: Minor
>
> In  MirrorCheckpointTask.syncGroupOffset () , there  is  a dedicated check , 
> as described :
> (line 285)
>  
> {code:java}
> // if translated offset from upstream is smaller than the current consumer offset
> // in the target, skip updating the offset for that partition
> long latestDownstreamOffset = targetConsumerOffset.get(topicPartition).offset();
> if (latestDownstreamOffset >= convertedOffset.offset()) {
>     log.trace("latestDownstreamOffset {} is larger than or equal to convertedUpstreamOffset {} for "
>         + "TopicPartition {}", latestDownstreamOffset, convertedOffset.offset(), topicPartition);
>     continue;
> }
> offsetToSync.put(topicPartition, convertedOffset);
> {code}
>  
> I think there is no need to check whether the translated offset from upstream is 
> smaller than the current consumer offset, as downstream offsets should keep up 
> with upstream offsets. Let's say we reset the offset upstream; it is expected 
> that downstream offsets are synced accordingly, too.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-14337) bugs kafka 3.3.1 create topic after delete topic

2022-10-25 Thread thanhnd96 (Jira)
thanhnd96 created KAFKA-14337:
-

 Summary: bugs kafka 3.3.1 create topic after delete topic
 Key: KAFKA-14337
 URL: https://issues.apache.org/jira/browse/KAFKA-14337
 Project: Kafka
  Issue Type: Bug
  Components: admin
Affects Versions: 3.3.1
 Environment: Staging
Reporter: thanhnd96
 Fix For: 3.3.1


Hi admin,

My issue is that after I create topics such as topic.AAA and Topic.AAA.01 and 
then delete one of the two topics, I can no longer recreate the deleted topic.

However, if I create topic test123, then delete it, I can recreate it fine.

This is the log from when I tried to recreate topic.AAA:

WARN [Controller 1] createTopics: failed with unknown server exception 
NoSuchElementException at epoch 14 in 193 us.  Renouncing leadership and 
reverting to the last committed offset 28. 
(org.apache.kafka.controller.QuorumController)
java.util.NoSuchElementException
        at 
org.apache.kafka.timeline.SnapshottableHashTable$CurrentIterator.next(SnapshottableHashTable.java:167)
        at 
org.apache.kafka.timeline.SnapshottableHashTable$CurrentIterator.next(SnapshottableHashTable.java:139)
        at 
org.apache.kafka.timeline.TimelineHashSet$ValueIterator.next(TimelineHashSet.java:120)
        at 
org.apache.kafka.controller.ReplicationControlManager.validateNewTopicNames(ReplicationControlManager.java:799)
        at 
org.apache.kafka.controller.ReplicationControlManager.createTopics(ReplicationControlManager.java:567)
        at 
org.apache.kafka.controller.QuorumController.lambda$createTopics$7(QuorumController.java:1832)
        at 
org.apache.kafka.controller.QuorumController$ControllerWriteEvent.run(QuorumController.java:767)
        at 
org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:121)
        at 
org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:200)
        at 
org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:173)
        at java.base/java.lang.Thread.run(Thread.java:829)

ERROR [Controller 1] processBrokerHeartbeat: unable to start processing because 
of NotControllerException. (org.apache.kafka.controller.QuorumController)
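
For reference, a minimal reproduction sketch of the steps above, assuming a 
Kafka 3.3.1 KRaft broker reachable at localhost:9092 (the address and class 
name are assumptions; the admin calls are standard AdminClient API):

{code:java}
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class Kafka14337Repro {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (Admin admin = Admin.create(props)) {
            // Both names contain '.', which takes the collision-character
            // validation path in ReplicationControlManager.
            admin.createTopics(List.of(new NewTopic("topic.AAA", 1, (short) 1))).all().get();
            admin.createTopics(List.of(new NewTopic("Topic.AAA.01", 1, (short) 1))).all().get();
            admin.deleteTopics(List.of("topic.AAA")).all().get();
            // Per the report, this recreate fails: the controller throws
            // NoSuchElementException in validateNewTopicNames and renounces
            // leadership.
            admin.createTopics(List.of(new NewTopic("topic.AAA", 1, (short) 1))).all().get();
        }
    }
}
{code}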



--
This message was sent by Atlassian Jira
(v8.20.10#820010)