Re: [DISCUSS] KIP-892: Transactional Semantics for StateStores

2023-09-13 Thread Bruno Cadonna

Hi Nick,

Thanks for the updates and sorry for the delay on my side!


1.
Making the default implementation for flush() a no-op sounds good to me.
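For illustration, a minimal sketch of what such a default could look like on 
the StateStore interface (the deprecation marker and exact shape are 
assumptions for illustration, not the final KIP wording):

{code:java}
// Sketch only: a no-op default for flush() so existing Processors that still
// call it keep working, while transactional stores commit with the Task.
public interface StateStore {

    @Deprecated
    default void flush() {
        // no-op: writes are persisted when the store's transaction commits,
        // not by an explicit flush
    }
}
{code}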


2.
I think what was bugging me here is that a third-party state store needs 
to implement the state store interface. That means they need to 
implement a wrapper around the actual state store as we do for RocksDB 
with RocksDBStore. So, a third-party state store can always estimate the 
uncommitted bytes, if it wants, because the wrapper can record the added 
bytes.
One case I can think of where returning -1 makes sense is when Streams 
does not need to estimate the size of the write batch and trigger 
extraordinary commits, because the third-party state store takes care of 
memory itself. But in that case the method could also just return 0. Even 
that case would be better solved with a method that returns whether the 
state store manages the memory used for uncommitted bytes itself or not.
That said, I am fine with keeping the -1 return value; I was just 
wondering when and if it will be used.


Regarding returning 0 for transactional state stores when the batch is 
empty, I was just wondering because you explicitly stated


"or {@code 0} if this StateStore does not support transactions."

So it seemed to me returning 0 could only happen for non-transactional 
state stores.
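
To make the options concrete, here is a rough sketch of the method being 
discussed (the name approximateNumUncommittedBytes and the exact contract 
are assumptions for illustration only):

{code:java}
// Sketch of the uncommitted-bytes estimate and its special return values.
public interface TransactionalStateStore {

    /**
     * @return an estimate of the bytes buffered in the current, uncommitted
     *         transaction; 0 if nothing is currently buffered (e.g. the batch
     *         is empty); -1 if this store does not provide an estimate, e.g.
     *         because it manages the memory for uncommitted writes itself.
     */
    default long approximateNumUncommittedBytes() {
        return 0L;
    }
}
{code}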



3.

a) What do you think about moving the isolation level to IQ (v1 and v2)?
In the end, this is the only component that really needs to specify the 
isolation level. It is similar to the Kafka consumer, which can choose the 
isolation level with which it reads the input topic.
For IQv1 the isolation level should go into StoreQueryParameters. For 
IQv2, I would add it to the Query interface.
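
A rough usage sketch of that idea follows; withIsolationLevel(...) does not 
exist on either API today, it is purely an assumption to show where the 
option could live:

{code:java}
import org.apache.kafka.common.IsolationLevel;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StoreQueryParameters;
import org.apache.kafka.streams.query.KeyQuery;
import org.apache.kafka.streams.query.StateQueryRequest;
import org.apache.kafka.streams.state.QueryableStoreTypes;
import org.apache.kafka.streams.state.ReadOnlyKeyValueStore;

public class QueryTimeIsolationSketch {

    // IQv1: the isolation level would travel with StoreQueryParameters.
    static Long getV1(KafkaStreams streams, String key) {
        ReadOnlyKeyValueStore<String, Long> store = streams.store(
                StoreQueryParameters
                        .fromNameAndType("counts", QueryableStoreTypes.<String, Long>keyValueStore())
                        .withIsolationLevel(IsolationLevel.READ_COMMITTED)); // hypothetical method
        return store.get(key);
    }

    // IQv2: the isolation level would be carried by the request/query.
    static StateQueryRequest<Long> requestV2(String key) {
        return StateQueryRequest
                .inStore("counts")
                .withQuery(KeyQuery.<String, Long>withKey(key))
                .withIsolationLevel(IsolationLevel.READ_COMMITTED); // hypothetical method
    }
}
{code}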


b) Point a) raises the question of what should happen during at-least-once 
processing when the state store does not use transactions. In the past, John 
proposed to also use transactions on state stores for at-least-once. I like 
that idea, because it avoids aggregating the same records over and over 
again in the case of a failure. We had a case in the past where a Streams 
application in at-least-once mode was failing continuously, for reasons I do 
not remember, before committing the offsets. After each failover, the app 
aggregated the same records again and again. Of course, the aggregate 
increased to very wrong values just because of the failovers. With 
transactions on the state stores we could have avoided this. The app would 
have output the same aggregate multiple times (i.e., after each failover), 
but at least the value of the aggregate would not depend on the number of 
failovers. Outputting the same aggregate multiple times would be incorrect 
under exactly-once, but it is OK for at-least-once.
Whether it makes sense to add a config to turn transactions on state stores 
on and off under at-least-once, or to just always use transactions, is a 
question we should also discuss in this KIP. It depends a bit on the 
performance trade-off. Maybe, to be safe, I would add a config.
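
Purely as an illustration of what such a safety valve could look like in a 
Streams configuration (the property name below is invented for this sketch 
and is not part of the KIP):

{code:java}
import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;

public class AlosTransactionalConfigSketch {

    static Properties config() {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.AT_LEAST_ONCE);
        // Hypothetical switch: the name "statestore.transactional.enabled" is
        // invented here purely to illustrate an opt-in/opt-out under ALOS.
        props.put("statestore.transactional.enabled", "true");
        return props;
    }
}
{code}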



4.
Your points are all valid. I tend to say we should keep the metrics around 
flush() until we remove flush() completely from the interface. Calls to 
flush() might still exist, since existing processors might still call 
flush() explicitly, as you mentioned in 1). For sure, we need to document in 
the upgrade notes how the metrics change due to the transactions.



5.
I see. Then you should describe how the .position files are handled in 
a dedicated section of the KIP or incorporate the description in the 
"Atomic Checkpointing" section instead of only mentioning it in the 
"Compatibility, Deprecation, and Migration Plan".



6.
Describing upgrading and downgrading in the KIP is a good idea. 
Regarding downgrading, I think you could also catch the exception and do 
what is needed to downgrade, e.g., drop the column family. See here for 
an example:


https://github.com/apache/kafka/blob/63fee01366e6ce98b9dfafd279a45d40b80e282d/streams/src/main/java/org/apache/kafka/streams/state/internals/RocksDBTimestampedStore.java#L75

It is a bit brittle, but it works.
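
For what it's worth, below is a self-contained sketch of the "catch and 
repair" idea using plain RocksJava. It is not the KIP's mechanism; the 
column family name "offsets" is an assumption for illustration:

{code:java}
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import org.rocksdb.ColumnFamilyDescriptor;
import org.rocksdb.ColumnFamilyHandle;
import org.rocksdb.DBOptions;
import org.rocksdb.Options;
import org.rocksdb.RocksDB;
import org.rocksdb.RocksDBException;

public class DropUnknownColumnFamilySketch {

    // Opens the store with whatever column families exist on disk and drops
    // the one the downgraded code no longer understands.
    static void dropOffsetsColumnFamily(String dbPath) throws RocksDBException {
        RocksDB.loadLibrary();
        byte[] offsetsCf = "offsets".getBytes(StandardCharsets.UTF_8); // assumed name

        try (Options options = new Options()) {
            List<byte[]> names = RocksDB.listColumnFamilies(options, dbPath);
            List<ColumnFamilyDescriptor> descriptors = new ArrayList<>();
            for (byte[] name : names) {
                descriptors.add(new ColumnFamilyDescriptor(name));
            }
            List<ColumnFamilyHandle> handles = new ArrayList<>();
            try (DBOptions dbOptions = new DBOptions();
                 RocksDB db = RocksDB.open(dbOptions, dbPath, descriptors, handles)) {
                for (int i = 0; i < names.size(); i++) {
                    if (Arrays.equals(names.get(i), offsetsCf)) {
                        db.dropColumnFamily(handles.get(i)); // remove the extra family
                    }
                }
                handles.forEach(ColumnFamilyHandle::close);
            }
        }
    }
}
{code}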


Best,
Bruno


On 8/24/23 12:18 PM, Nick Telford wrote:

Hi Bruno,

Thanks for taking the time to review the KIP. I'm back from leave now and
intend to move this forwards as quickly as I can.

Addressing your points:

1.
Because flush() is part of the StateStore API, it's exposed to custom
Processors, which might be making calls to flush(). This was actually the
case in a few integration tests.
To maintain as much compatibility as possible, I'd prefer not to make this
an UnsupportedOperationException, as it will cause previously working
Processors to start throwing exceptions at runtime.
I agree that it doesn't make sense for it to proxy commit(), though, as
that would cause it to violate the "StateStores commit only when the Task
commits" rule.
Instead, I think we should make this a no-op. That way, existing user
Processors will continue to work as-before, without violation of store
c

Re: Apache Kafka 3.6.0 release

2023-09-13 Thread Federico Valeri
Hi Satish, this is a small documentation fix about ZK to KRaft
migration, that we would like to backport to 3.5 and 3.6 branches. Are
you ok with that?

https://github.com/apache/kafka/pull/14366

On Wed, Sep 13, 2023 at 3:13 AM Satish Duggana  wrote:
>
> Thanks David for the quick resolution.
>
> ~Satish.
>
> On Tue, 12 Sept 2023 at 22:51, David Arthur
>  wrote:
> >
> > Satish,
> >
> > KAFKA-15450 is merged to 3.6 (as well as trunk, 3.5, and 3.4)
> >
> > Thanks!
> > David
> >
> > On Tue, Sep 12, 2023 at 11:44 AM Ismael Juma  wrote:
> >
> > > Justine,
> > >
> > > Probably best to have the conversation in the JIRA ticket vs the release
> > > thread. Generally, we want to only include low risk bug fixes that are
> > > fully compatible in patch releases.
> > >
> > > Ismael
> > >
> > > On Tue, Sep 12, 2023 at 7:16 AM Justine Olshan
> > > 
> > > wrote:
> > >
> > > > Thanks Satish. I understand.
> > > > Just curious, is this something that could be added to 3.6.1? It would 
> > > > be
> > > > nice to say that hanging transactions are fully covered in a 3.6 
> > > > release.
> > > > I'm not as familiar with the rules around minor releases, but adding it
> > > > there would give more time to ensure stability.
> > > >
> > > > Thanks,
> > > > Justine
> > > >
> > > > On Tue, Sep 12, 2023 at 5:49 AM Satish Duggana  > > >
> > > > wrote:
> > > >
> > > > > Hi Justine,
> > > > > We can skip this change into 3.6 now as it is not a blocker or
> > > > > regression and it involves changes to the API implementation. Let us
> > > > > plan to add the gap in the release notes as you mentioned.
> > > > >
> > > > > Thanks,
> > > > > Satish.
> > > > >
> > > > > On Tue, 12 Sept 2023 at 04:44, Justine Olshan
> > > > >  wrote:
> > > > > >
> > > > > > Hey Satish,
> > > > > >
> > > > > > We just discovered a gap in KIP-890 part 1. We currently don't 
> > > > > > verify
> > > > on
> > > > > > txn offset commits, so it is still possible to have hanging
> > > > transactions
> > > > > on
> > > > > > the consumer offsets partitions.
> > > > > > I've opened a jira to wire the verification in that request.
> > > > > > https://issues.apache.org/jira/browse/KAFKA-15449
> > > > > >
> > > > > > This also isn't a regression, but it would be nice to have part 1
> > > fully
> > > > > > complete. I have opened a PR with the fix:
> > > > > > https://github.com/apache/kafka/pull/14370.
> > > > > >
> > > > > > I understand if there are concerns about last minute changes to this
> > > > API
> > > > > > and we can hold off if that makes the most sense.
> > > > > > If we take that route, I think we should still keep verification for
> > > > the
> > > > > > data partitions since it still provides full protection there and
> > > > > improves
> > > > > > the transactions experience. We will need to call out the gap in the
> > > > > > release notes for consumer offsets partitions
> > > > > >
> > > > > > Let me know what you think.
> > > > > > Justine
> > > > > >
> > > > > >
> > > > > > On Mon, Sep 11, 2023 at 12:29 PM David Arthur
> > > > > >  wrote:
> > > > > >
> > > > > > > Another (small) ZK migration issue was identified. This one isn't 
> > > > > > > a
> > > > > > > regression (it has existed since 3.4), but I think it's reasonable
> > > to
> > > > > > > include. It's a small configuration check that could potentially
> > > save
> > > > > end
> > > > > > > users from some headaches down the line.
> > > > > > >
> > > > > > > https://issues.apache.org/jira/browse/KAFKA-15450
> > > > > > > https://github.com/apache/kafka/pull/14367
> > > > > > >
> > > > > > > I think we can get this one committed to trunk today.
> > > > > > >
> > > > > > > -David
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Sun, Sep 10, 2023 at 7:50 PM Ismael Juma 
> > > > wrote:
> > > > > > >
> > > > > > > > Hi Satish,
> > > > > > > >
> > > > > > > > That sounds great. I think we should aim to only allow blockers
> > > > > > > > (regressions, impactful security issues, etc.) on the 3.6 branch
> > > > > until
> > > > > > > > 3.6.0 is out.
> > > > > > > >
> > > > > > > > Ismael
> > > > > > > >
> > > > > > > >
> > > > > > > > On Sat, Sep 9, 2023, 12:20 AM Satish Duggana <
> > > > > satish.dugg...@gmail.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi Ismael,
> > > > > > > > > It looks like we will publish RC0 by 14th Sep.
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Satish.
> > > > > > > > >
> > > > > > > > > On Fri, 8 Sept 2023 at 19:23, Ismael Juma 
> > > > > > > > > 
> > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > Hi Satish,
> > > > > > > > > >
> > > > > > > > > > Do you have a sense of when we'll publish RC0?
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Ismael
> > > > > > > > > >
> > > > > > > > > > On Fri, Sep 8, 2023 at 6:27 AM David Arthur
> > > > > > > > > >  wrote:
> > > > > > > > > >
> > > > > > > > > > > Quick update on my two blockers: KAFKA-15435 is merged to
> > > > > trunk and
> > > > > > > > > > > cher

[jira] [Resolved] (KAFKA-14502) Implement LeaveGroup API

2023-09-13 Thread David Jacot (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14502?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Jacot resolved KAFKA-14502.
-
Fix Version/s: 3.7.0
   Resolution: Fixed

> Implement LeaveGroup API
> 
>
> Key: KAFKA-14502
> URL: https://issues.apache.org/jira/browse/KAFKA-14502
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: David Jacot
>Assignee: Jeff Kim
>Priority: Major
> Fix For: 3.7.0
>
>
> Implement LeaveGroup API in the new Group Coordinator.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-15460) Add group type filter to ListGroups API

2023-09-13 Thread David Jacot (Jira)
David Jacot created KAFKA-15460:
---

 Summary: Add group type filter to ListGroups API
 Key: KAFKA-15460
 URL: https://issues.apache.org/jira/browse/KAFKA-15460
 Project: Kafka
  Issue Type: Sub-task
Reporter: David Jacot
Assignee: HaiyuanZhao






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-15461) Add integration test for the ListGroup API

2023-09-13 Thread David Jacot (Jira)
David Jacot created KAFKA-15461:
---

 Summary: Add integration test for the ListGroup API
 Key: KAFKA-15461
 URL: https://issues.apache.org/jira/browse/KAFKA-15461
 Project: Kafka
  Issue Type: Sub-task
Reporter: David Jacot
Assignee: HaiyuanZhao






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-15462) Add group type filter to the admin client

2023-09-13 Thread David Jacot (Jira)
David Jacot created KAFKA-15462:
---

 Summary: Add group type filter to the admin client
 Key: KAFKA-15462
 URL: https://issues.apache.org/jira/browse/KAFKA-15462
 Project: Kafka
  Issue Type: Sub-task
Reporter: David Jacot
Assignee: HaiyuanZhao






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-892: Transactional Semantics for StateStores

2023-09-13 Thread Nick Telford
Hi Bruno,

Thanks for getting back to me!

2.
The fact that implementations can always track estimated memory usage in
the wrapper is a good point. I can remove -1 as an option, and I'll clarify
the JavaDoc that 0 is not just for non-transactional stores, which is
currently misleading.

6.
The problem with catching the exception in the downgrade process is that it
would require new code in the Kafka version being downgraded to. Since
users could conceivably downgrade to almost *any* older version of Kafka
Streams, I'm not sure how we could add that code?
The only way I can think of doing it would be to provide a dedicated
downgrade tool, that goes through every local store and removes the
offsets column families. But that seems like an unnecessary amount of extra
code to maintain just to handle a somewhat niche situation, when the
alternative (automatically wipe and restore stores) should be acceptable.

1, 4, 5: Agreed. I'll make the changes you've requested.

3a.
I agree that IsolationLevel makes more sense at query-time, and I actually
initially attempted to place the IsolationLevel at query-time, but I ran
into some problems:
- The key issue is that, under ALOS we're not staging writes in
transactions, so can't perform writes at the READ_COMMITTED isolation
level. However, this may be addressed if we decide to *always* use
transactions as discussed under 3b.
- IQv1 and IQv2 have quite different implementations. I remember having
some difficulty understanding the IQv1 internals, which made it difficult
to determine what needed to be changed. However, I *think* this can be
addressed for both implementations by wrapping the RocksDBStore in an
IsolationLevel-dependent wrapper, that overrides read methods (get, etc.)
to either read directly from the database or from the ongoing transaction.
But IQv1 might still be difficult.
- If IsolationLevel becomes a query constraint, then all other StateStores
will need to respect it, including the in-memory stores. This would require
us to adapt in-memory stores to stage their writes so they can be isolated
from READ_COMMITTED queries. It would also become an important
consideration for third-party stores on upgrade, as without changes, they
would not support READ_COMMITTED queries correctly.

Ultimately, I may need some help making the necessary change to IQv1 to
support this, but I don't think it's fundamentally impossible, if we want
to pursue this route.
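
To make the wrapper idea above a little more concrete, here is a minimal, 
self-contained sketch (the KeyValueReader interface is an assumption for 
illustration, not Streams' internal RocksDBStore API):

{code:java}
import java.util.Map;
import org.apache.kafka.common.IsolationLevel;

public class IsolationAwareReader {

    /** Simplified stand-in for a store's committed read path (an assumption, not a real Streams type). */
    interface KeyValueReader {
        String get(String key);
    }

    private final KeyValueReader committedView;          // reads served from the database only
    private final Map<String, String> uncommittedBuffer; // stand-in for the ongoing WriteBatch
    private final IsolationLevel isolationLevel;

    IsolationAwareReader(KeyValueReader committedView,
                         Map<String, String> uncommittedBuffer,
                         IsolationLevel isolationLevel) {
        this.committedView = committedView;
        this.uncommittedBuffer = uncommittedBuffer;
        this.isolationLevel = isolationLevel;
    }

    String get(String key) {
        if (isolationLevel == IsolationLevel.READ_UNCOMMITTED && uncommittedBuffer.containsKey(key)) {
            // READ_UNCOMMITTED sees writes staged in the current transaction first
            return uncommittedBuffer.get(key);
        }
        // READ_COMMITTED (or no staged write for this key) falls through to the database
        return committedView.get(key);
    }
}
{code}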

3b.
The main reason I chose to keep ALOS un-transactional was to minimize
behavioural change for most users (I believe most Streams users use the
default configuration, which is ALOS). That said, it's clear that if ALOS
also used transactional stores, the only change in behaviour would be that
it would become *more correct*, which could be considered a "bug fix" by
users, rather than a change they need to handle.

I believe that performance using transactions (aka. RocksDB WriteBatches)
should actually be *better* than the un-batched write-path that is
currently used[1]. The only "performance" consideration will be the
increased memory usage that transactions require. Given the mitigations for
this memory that we have in place, I would expect that this is not a
problem for most users.

If we're happy to do so, we can make ALOS also use transactions.

Regards,
Nick

Link 1:
https://github.com/adamretter/rocksjava-write-methods-benchmark#results

On Wed, 13 Sept 2023 at 09:41, Bruno Cadonna  wrote:

> Hi Nick,
>
> Thanks for the updates and sorry for the delay on my side!
>
>
> 1.
> Making the default implementation for flush() a no-op sounds good to me.
>
>
> 2.
> I think what was bugging me here is that a third-party state store needs
> to implement the state store interface. That means they need to
> implement a wrapper around the actual state store as we do for RocksDB
> with RocksDBStore. So, a third-party state store can always estimate the
> uncommitted bytes, if it wants, because the wrapper can record the added
> bytes.
> One case I can think of where returning -1 makes sense is when Streams
> does not need to estimate the size of the write batch and trigger
> extraordinary commits, because the third-party state store takes care of
> memory. But in that case the method could also just return 0. Even that
> case would be better solved with a method that returns whether the state
> store manages itself the memory used for uncommitted bytes or not.
> Said that, I am fine with keeping the -1 return value, I was just
> wondering when and if it will be used.
>
> Regarding returning 0 for transactional state stores when the batch is
> empty, I was just wondering because you explicitly stated
>
> "or {@code 0} if this StateStore does not support transactions."
>
> So it seemed to me returning 0 could only happen for non-transactional
> state stores.
>
>
> 3.
>
> a) What do you think if we move the isolation level to IQ (v1 and v2)?
> In the end this is the only component that really needs to specify the
> isolation 

Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #2199

2023-09-13 Thread Apache Jenkins Server
See 




Re: [DISCUSS] KIP-892: Transactional Semantics for StateStores

2023-09-13 Thread Nick Telford
Bruno,

Thinking about 3a: in addition to adding the IsolationLevel to
StoreQueryParameters and Query, what about also adding a method like
"ReadOnlyKeyValueStore view(IsolationLevel isolationLevel)" to ReadOnlyKeyValueStore?

This would enable us to easily select/switch between IsolationLevels, even
if the StateStore has many layers of wrappers (as is the case at the point
IQv1 deals with the store). Would this be acceptable, or do you have
another approach in mind?
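
For clarity, a sketch of what that addition could look like (this is just the 
proposal above rendered as code, not an existing Kafka Streams API; the real 
interface has more methods than shown here):

{code:java}
import org.apache.kafka.common.IsolationLevel;

public interface ReadOnlyKeyValueStore<K, V> {

    V get(K key);

    /**
     * Returns a view of this store that serves reads at the given isolation level.
     * Defaulting to {@code this} would keep existing wrappers and third-party
     * stores source-compatible; stores that cannot honour READ_COMMITTED could
     * throw instead.
     */
    default ReadOnlyKeyValueStore<K, V> view(IsolationLevel isolationLevel) {
        return this;
    }
}
{code}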

Regards,
Nick

On Wed, 13 Sept 2023 at 10:57, Nick Telford  wrote:

> Hi Bruno,
>
> Thanks for getting back to me!
>
> 2.
> The fact that implementations can always track estimated memory usage in
> the wrapper is a good point. I can remove -1 as an option, and I'll clarify
> the JavaDoc that 0 is not just for non-transactional stores, which is
> currently misleading.
>
> 6.
> The problem with catching the exception in the downgrade process is that
> would require new code in the Kafka version being downgraded to. Since
> users could conceivably downgrade to almost *any* older version of Kafka
> Streams, I'm not sure how we could add that code?
> The only way I can think of doing it would be to provide a dedicated
> downgrade tool, that goes through every local store and removes the
> offsets column families. But that seems like an unnecessary amount of extra
> code to maintain just to handle a somewhat niche situation, when the
> alternative (automatically wipe and restore stores) should be acceptable.
>
> 1, 4, 5: Agreed. I'll make the changes you've requested.
>
> 3a.
> I agree that IsolationLevel makes more sense at query-time, and I actually
> initially attempted to place the IsolationLevel at query-time, but I ran
> into some problems:
> - The key issue is that, under ALOS we're not staging writes in
> transactions, so can't perform writes at the READ_COMMITTED isolation
> level. However, this may be addressed if we decide to *always* use
> transactions as discussed under 3b.
> - IQv1 and IQv2 have quite different implementations. I remember having
> some difficulty understanding the IQv1 internals, which made it difficult
> to determine what needed to be changed. However, I *think* this can be
> addressed for both implementations by wrapping the RocksDBStore in an
> IsolationLevel-dependent wrapper, that overrides read methods (get, etc.)
> to either read directly from the database or from the ongoing transaction.
> But IQv1 might still be difficult.
> - If IsolationLevel becomes a query constraint, then all other StateStores
> will need to respect it, including the in-memory stores. This would require
> us to adapt in-memory stores to stage their writes so they can be isolated
> from READ_COMMITTTED queries. It would also become an important
> consideration for third-party stores on upgrade, as without changes, they
> would not support READ_COMMITTED queries correctly.
>
> Ultimately, I may need some help making the necessary change to IQv1 to
> support this, but I don't think it's fundamentally impossible, if we want
> to pursue this route.
>
> 3b.
> The main reason I chose to keep ALOS un-transactional was to minimize
> behavioural change for most users (I believe most Streams users use the
> default configuration, which is ALOS). That said, it's clear that if ALOS
> also used transactional stores, the only change in behaviour would be that
> it would become *more correct*, which could be considered a "bug fix" by
> users, rather than a change they need to handle.
>
> I believe that performance using transactions (aka. RocksDB WriteBatches)
> should actually be *better* than the un-batched write-path that is
> currently used[1]. The only "performance" consideration will be the
> increased memory usage that transactions require. Given the mitigations for
> this memory that we have in place, I would expect that this is not a
> problem for most users.
>
> If we're happy to do so, we can make ALOS also use transactions.
>
> Regards,
> Nick
>
> Link 1:
> https://github.com/adamretter/rocksjava-write-methods-benchmark#results
>
> On Wed, 13 Sept 2023 at 09:41, Bruno Cadonna  wrote:
>
>> Hi Nick,
>>
>> Thanks for the updates and sorry for the delay on my side!
>>
>>
>> 1.
>> Making the default implementation for flush() a no-op sounds good to me.
>>
>>
>> 2.
>> I think what was bugging me here is that a third-party state store needs
>> to implement the state store interface. That means they need to
>> implement a wrapper around the actual state store as we do for RocksDB
>> with RocksDBStore. So, a third-party state store can always estimate the
>> uncommitted bytes, if it wants, because the wrapper can record the added
>> bytes.
>> One case I can think of where returning -1 makes sense is when Streams
>> does not need to estimate the size of the write batch and trigger
>> extraordinary commits, because the third-party state store takes care of
>> memory. But in that case the method could also just return 0. Even that
>> case would be bett

[jira] [Created] (KAFKA-15463) StreamsException: Accessing from an unknown node

2023-09-13 Thread Yevgeny (Jira)
Yevgeny created KAFKA-15463:
---

 Summary:  StreamsException: Accessing from an unknown node
 Key: KAFKA-15463
 URL: https://issues.apache.org/jira/browse/KAFKA-15463
 Project: Kafka
  Issue Type: Bug
  Components: streams
Affects Versions: 3.2.1
Reporter: Yevgeny


The application was working fine for some time, then started to get the 
exception below.
 
This is a Spring Boot application that runs in Kubernetes as a stateful pod.
 
 
{code:java}
Exception in thread "-ddf9819f-d6c7-46ce-930e-cd923e1b3c2c-StreamThread-1" org.apache.kafka.streams.errors.StreamsException: Accessing from an unknown node
    at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.getStateStore(ProcessorContextImpl.java:162)
    at myclass1.java:28)
    at myclass2.java:48)
    at java.base/java.util.stream.MatchOps$1MatchSink.accept(MatchOps.java:90)
    at java.base/java.util.ArrayList$ArrayListSpliterator.tryAdvance(ArrayList.java:1602)
    at java.base/java.util.stream.ReferencePipeline.forEachWithCancel(ReferencePipeline.java:129)
    at java.base/java.util.stream.AbstractPipeline.copyIntoWithCancel(AbstractPipeline.java:527)
    at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:513)
    at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
    at java.base/java.util.stream.MatchOps$MatchOp.evaluateSequential(MatchOps.java:230)
    at java.base/java.util.stream.MatchOps$MatchOp.evaluateSequential(MatchOps.java:196)
    at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.base/java.util.stream.ReferencePipeline.allMatch(ReferencePipeline.java:637)
    at myclass3.java:48)
    at org.apache.kafka.streams.kstream.internals.TransformerSupplierAdapter$1.transform(TransformerSupplierAdapter.java:49)
    at org.apache.kafka.streams.kstream.internals.TransformerSupplierAdapter$1.transform(TransformerSupplierAdapter.java:38)
    at org.apache.kafka.streams.kstream.internals.KStreamFlatTransform$KStreamFlatTransformProcessor.process(KStreamFlatTransform.java:66)
    at org.apache.kafka.streams.processor.internals.ProcessorNode.process(ProcessorNode.java:146)
    at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forwardInternal(ProcessorContextImpl.java:275)
    at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:254)
    at org.apache.kafka.streams.processor.internals.ProcessorContextImpl.forward(ProcessorContextImpl.java:213)
    at org.apache.kafka.streams.processor.internals.SourceNode.process(SourceNode.java:84)
    at org.apache.kafka.streams.processor.internals.StreamTask.lambda$doProcess$1(StreamTask.java:780)
    at org.apache.kafka.streams.processor.internals.metrics.StreamsMetricsImpl.maybeMeasureLatency(StreamsMetricsImpl.java:809)
    at org.apache.kafka.streams.processor.internals.StreamTask.doProcess(StreamTask.java:780)
    at org.apache.kafka.streams.processor.internals.StreamTask.process(StreamTask.java:711)
    at org.apache.kafka.streams.processor.internals.TaskExecutor.processTask(TaskExecutor.java:100)
    at org.apache.kafka.streams.processor.internals.TaskExecutor.process(TaskExecutor.java:81)
    at org.apache.kafka.streams.processor.internals.TaskManager.process(TaskManager.java:1177)
    at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:769)
    at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:589)
    at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:551)
{code}
 
stream-thread [-ddf9819f-d6c7-46ce-930e-cd923e1b3c2c-StreamThread-1] State transition from PENDING_SHUTDOWN to DEAD
 
 
The Transformer is a prototype bean; the supplier supplies a new instance of 
the Transformer:
 
 
{code:java}
@Override
public Transformer> get() {
    return ctx.getBean(MyTransformer.class);
}
{code}
 
 
The only way to recover is to delete all topics used by Kafka Streams; even 
if the application is restarted, the same exception is thrown.

If messages in the internal 'store-changelog' topics are deleted or their 
offsets are manipulated, can that cause the issue?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [DISCUSS] KIP-892: Transactional Semantics for StateStores

2023-09-13 Thread Bruno Cadonna

Hi Nick,

6.
Of course, you are right! My bad!
Wiping out the state in the downgrading case is fine.


3a.
Focus on the public-facing changes for the KIP. We will manage to get 
the internals right. State stores that do not support READ_COMMITTED 
should throw an error stating that they do not support it. No need to 
adapt all state stores immediately.
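
A minimal sketch of that behaviour, with the method it would live in being an 
assumption for illustration:

{code:java}
import org.apache.kafka.common.IsolationLevel;

public class ReadCommittedGuardSketch {

    // Hypothetical guard a non-transactional store could apply when a query
    // requests an isolation level it cannot honour.
    static void checkIsolationLevel(IsolationLevel requested) {
        if (requested == IsolationLevel.READ_COMMITTED) {
            throw new UnsupportedOperationException(
                    "This state store does not support READ_COMMITTED reads");
        }
    }
}
{code}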


3b.
I am in favor of using transactions also for ALOS.


Best,
Bruno

On 9/13/23 11:57 AM, Nick Telford wrote:

Hi Bruno,

Thanks for getting back to me!

2.
The fact that implementations can always track estimated memory usage in
the wrapper is a good point. I can remove -1 as an option, and I'll clarify
the JavaDoc that 0 is not just for non-transactional stores, which is
currently misleading.

6.
The problem with catching the exception in the downgrade process is that
would require new code in the Kafka version being downgraded to. Since
users could conceivably downgrade to almost *any* older version of Kafka
Streams, I'm not sure how we could add that code?
The only way I can think of doing it would be to provide a dedicated
downgrade tool, that goes through every local store and removes the
offsets column families. But that seems like an unnecessary amount of extra
code to maintain just to handle a somewhat niche situation, when the
alternative (automatically wipe and restore stores) should be acceptable.

1, 4, 5: Agreed. I'll make the changes you've requested.

3a.
I agree that IsolationLevel makes more sense at query-time, and I actually
initially attempted to place the IsolationLevel at query-time, but I ran
into some problems:
- The key issue is that, under ALOS we're not staging writes in
transactions, so can't perform writes at the READ_COMMITTED isolation
level. However, this may be addressed if we decide to *always* use
transactions as discussed under 3b.
- IQv1 and IQv2 have quite different implementations. I remember having
some difficulty understanding the IQv1 internals, which made it difficult
to determine what needed to be changed. However, I *think* this can be
addressed for both implementations by wrapping the RocksDBStore in an
IsolationLevel-dependent wrapper, that overrides read methods (get, etc.)
to either read directly from the database or from the ongoing transaction.
But IQv1 might still be difficult.
- If IsolationLevel becomes a query constraint, then all other StateStores
will need to respect it, including the in-memory stores. This would require
us to adapt in-memory stores to stage their writes so they can be isolated
from READ_COMMITTTED queries. It would also become an important
consideration for third-party stores on upgrade, as without changes, they
would not support READ_COMMITTED queries correctly.

Ultimately, I may need some help making the necessary change to IQv1 to
support this, but I don't think it's fundamentally impossible, if we want
to pursue this route.

3b.
The main reason I chose to keep ALOS un-transactional was to minimize
behavioural change for most users (I believe most Streams users use the
default configuration, which is ALOS). That said, it's clear that if ALOS
also used transactional stores, the only change in behaviour would be that
it would become *more correct*, which could be considered a "bug fix" by
users, rather than a change they need to handle.

I believe that performance using transactions (aka. RocksDB WriteBatches)
should actually be *better* than the un-batched write-path that is
currently used[1]. The only "performance" consideration will be the
increased memory usage that transactions require. Given the mitigations for
this memory that we have in place, I would expect that this is not a
problem for most users.

If we're happy to do so, we can make ALOS also use transactions.

Regards,
Nick

Link 1:
https://github.com/adamretter/rocksjava-write-methods-benchmark#results

On Wed, 13 Sept 2023 at 09:41, Bruno Cadonna  wrote:


Hi Nick,

Thanks for the updates and sorry for the delay on my side!


1.
Making the default implementation for flush() a no-op sounds good to me.


2.
I think what was bugging me here is that a third-party state store needs
to implement the state store interface. That means they need to
implement a wrapper around the actual state store as we do for RocksDB
with RocksDBStore. So, a third-party state store can always estimate the
uncommitted bytes, if it wants, because the wrapper can record the added
bytes.
One case I can think of where returning -1 makes sense is when Streams
does not need to estimate the size of the write batch and trigger
extraordinary commits, because the third-party state store takes care of
memory. But in that case the method could also just return 0. Even that
case would be better solved with a method that returns whether the state
store manages itself the memory used for uncommitted bytes or not.
Said that, I am fine with keeping the -1 return value, I was just
wondering when and if it will be used.

Regardi

Build failed in Jenkins: Kafka » Kafka Branch Builder » 3.5 #73

2023-09-13 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 283096 lines...]

> Task :streams:integrationTest

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
VersionedKeyValueStoreIntegrationTest > 
shouldManualUpgradeFromNonVersionedTimestampedToVersioned PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
HandlingSourceTopicDeletionIntegrationTest > 
shouldThrowErrorAfterSourceTopicDeleted STARTED

> Task :core:integrationTest

Gradle Test Run :core:integrationTest > Gradle Test Executor 185 > 
FetchFromFollowerIntegrationTest > testRackAwareRangeAssignor() PASSED

2025 tests completed, 1 failed, 4 skipped
There were failing tests. See the report at: 
file:///home/jenkins/jenkins-agent/workspace/Kafka_kafka_3.5/core/build/reports/tests/integrationTest/index.html

> Task :streams:integrationTest

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
HandlingSourceTopicDeletionIntegrationTest > 
shouldThrowErrorAfterSourceTopicDeleted PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
StreamsAssignmentScaleTest > testHighAvailabilityTaskAssignorLargeNumConsumers 
STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
StreamsAssignmentScaleTest > testHighAvailabilityTaskAssignorLargeNumConsumers 
PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
StreamsAssignmentScaleTest > 
testHighAvailabilityTaskAssignorLargePartitionCount STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
StreamsAssignmentScaleTest > 
testHighAvailabilityTaskAssignorLargePartitionCount PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
StreamsAssignmentScaleTest > 
testHighAvailabilityTaskAssignorManyThreadsPerClient STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
StreamsAssignmentScaleTest > 
testHighAvailabilityTaskAssignorManyThreadsPerClient PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
StreamsAssignmentScaleTest > testStickyTaskAssignorManyStandbys STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
StreamsAssignmentScaleTest > testStickyTaskAssignorManyStandbys PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
StreamsAssignmentScaleTest > testStickyTaskAssignorManyThreadsPerClient STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
StreamsAssignmentScaleTest > testStickyTaskAssignorManyThreadsPerClient PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
StreamsAssignmentScaleTest > testFallbackPriorTaskAssignorManyThreadsPerClient 
STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
StreamsAssignmentScaleTest > testFallbackPriorTaskAssignorManyThreadsPerClient 
PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
StreamsAssignmentScaleTest > testFallbackPriorTaskAssignorLargePartitionCount 
STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
StreamsAssignmentScaleTest > testFallbackPriorTaskAssignorLargePartitionCount 
PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
StreamsAssignmentScaleTest > testStickyTaskAssignorLargePartitionCount STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
StreamsAssignmentScaleTest > testStickyTaskAssignorLargePartitionCount PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
StreamsAssignmentScaleTest > testFallbackPriorTaskAssignorManyStandbys STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
StreamsAssignmentScaleTest > testFallbackPriorTaskAssignorManyStandbys PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
StreamsAssignmentScaleTest > testHighAvailabilityTaskAssignorManyStandbys 
STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
StreamsAssignmentScaleTest > testHighAvailabilityTaskAssignorManyStandbys PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
StreamsAssignmentScaleTest > testFallbackPriorTaskAssignorLargeNumConsumers 
STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
StreamsAssignmentScaleTest > testFallbackPriorTaskAssignorLargeNumConsumers 
PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
StreamsAssignmentScaleTest > testStickyTaskAssignorLargeNumConsumers STARTED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
StreamsAssignmentScaleTest > testStickyTaskAssignorLargeNumConsumers PASSED

Gradle Test Run :streams:integrationTest > Gradle Test Executor 181 > 
EmitOnChangeIntegrationTest > shouldEmitSameRecordAfterFailover() STARTED

Gradle Test Run :strea

Re: Re: Re: Re: [DISCUSS] KIP-971 Expose replication-offset-lag MirrorMaker2 metric

2023-09-13 Thread Viktor Somogyi-Vass
Elkhan, do you think making yours similar would make sense?

On Wed, Sep 6, 2023 at 4:12 AM hudeqi <16120...@bjtu.edu.cn> wrote:

> Hey, Viktor.
> As far as my implementation is concerned, the default setting is 30s, but
> I added it to `MirrorConnectorConfig`, which can be adjusted freely
> according to the load of the source cluster and the number of tasks.
>
> best,
> hudeqi
>
> "Viktor Somogyi-Vass"  >写道:
> > Hey Elkhan and hudeqi,
> >
> > I'm reading your debate around the implementation. I also think a
> > scheduled task would be better in overall accuracy and performance
> > (compared to calling endOffsets with every poll).
> > Hudeqi, do you have any experience of what works best for you in terms of
> > time intervals? I would think refreshing the metric every 5-10sec would
> be
> > overall good and sufficient for the users (as short intervals can be
> quite
> > noisy anyways).
> >
> > Best,
> > Viktor
> >
> > On Mon, Sep 4, 2023 at 11:41 AM hudeqi <16120...@bjtu.edu.cn> wrote:
> >
> > > My approach is to create another thread to regularly request and update
> > > the end offset of each partition for the `keySet` in the collection
> > > `lastReplicatedSourceOffsets` mentioned by your kip (if there is no
> update
> > > for a long time, it will be removed from
> `lastReplicatedSourceOffsets`).
> > > Obviously, such processing makes the calculation of the partition
> offset
> > > lag less real-time and accurate.
> > > But this also meets our needs, because we need the partition offset
> lag to
> > > analyze the replication performance of the task and which task may have
> > > performance problems; and if you monitor the overall offset lag of the
> > > topic, then using the
> > > "kafka_consumer_consumer_fetch_manager_metrics_records_lag" metric
> will be
> > > more real-time and accurate.
> > > This is my suggestion. I hope to be able to throw bricks and start
> jade,
> > > we can come up with a better solution.
> > >
> > > best,
> > > hudeqi
> > >
> > > "Elxan Eminov" 写道:
> > > > @huqedi replying to your comment on the PR (
> > > > https://github.com/apache/kafka/pull/14077#discussion_r1314592488),
> > > quote:
> > > >
> > > > "I guess we have a disagreement about lag? My understanding of lag
> is:
> > > the
> > > > real LEO of the source cluster partition minus the LEO that has been
> > > > written to the target cluster. It seems that your definition of lag
> is:
> > > the
> > > > lag between the mirror task getting data from consumption and
> writing it
> > > to
> > > > the target cluster?"
> > > >
> > > > Yes, this is the case. I've missed the fact that the consumer itself
> > > might
> > > > be lagging behind the actual data in the partition.
> > > > I believe your definition of the lag is more precise, but:
> > > > Implementing it this way will come at the cost of an extra
> listOffsets
> > > > request, introducing the overhead that you mentioned in your initial
> > > > comment.
> > > >
> > > > If you have enough insights about this, what would you say is the
> chances
> > > > of the task consumer lagging behind the LEO of the partition?
> > > > Are they big enough to justify the extra call to listOffsets?
> > > > @Viktor,  any thoughts?
> > > >
> > > > Thanks,
> > > > Elkhan
> > > >
> > > > On Mon, 4 Sept 2023 at 09:36, Elxan Eminov 
> > > wrote:
> > > >
> > > > > I already have the PR for this so if it will make it easier to
> discuss,
> > > > > feel free to take a look:
> https://github.com/apache/kafka/pull/14077
> > > > >
> > > > > On Mon, 4 Sept 2023 at 09:17, hudeqi <16120...@bjtu.edu.cn> wrote:
> > > > >
> > > > >> But does the offset of the last `ConsumerRecord` obtained in poll
> not
> > > > >> only represent the offset of this record in the source cluster? It
> > > seems
> > > > >> that it cannot represent the LEO of the source cluster for this
> > > partition.
> > > > >> I understand that the offset lag introduced here should be the
> LEO of
> > > the
> > > > >> source cluster minus the offset of the last record to be polled?
> > > > >>
> > > > >> best,
> > > > >> hudeqi
> > > > >>
> > > > >>
> > > > >> > -Original Message-
> > > > >> > From: "Elxan Eminov" 
> > > > >> > Sent: 2023-09-04 14:52:08 (Monday)
> > > > >> > To: dev@kafka.apache.org
> > > > >> > Cc:
> > > > >> > Subject: Re: [DISCUSS] KIP-971 Expose replication-offset-lag
> > > MirrorMaker2
> > > > >> metric
> > > > >> >
> > > > >> 
> > > > >
> > > > >
> > >
>
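
For reference, a self-contained sketch of the lag definition being debated 
above (source-partition end offset, fetched periodically, minus the last 
offset MirrorMaker 2 replicated). The map name mirrors the 
lastReplicatedSourceOffsets structure mentioned in the discussion; the 
scheduling and method names are assumptions, not MirrorMaker 2's actual 
implementation:

{code:java}
import java.time.Duration;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import org.apache.kafka.clients.consumer.Consumer;
import org.apache.kafka.common.TopicPartition;

public class ReplicationLagSketch {

    private final Map<TopicPartition, Long> lastReplicatedSourceOffsets = new ConcurrentHashMap<>();

    // Called from the replication path whenever a record is written to the target cluster.
    void recordReplicated(TopicPartition sourcePartition, long sourceOffset) {
        lastReplicatedSourceOffsets.put(sourcePartition, sourceOffset);
    }

    // Intended to run on a scheduled thread (e.g. every 5-10s) rather than on every
    // poll, so the extra endOffsets/listOffsets round trip is amortised.
    Map<TopicPartition, Long> computeLag(Consumer<byte[], byte[]> sourceConsumer) {
        Map<TopicPartition, Long> lag = new HashMap<>();
        Map<TopicPartition, Long> endOffsets =
                sourceConsumer.endOffsets(lastReplicatedSourceOffsets.keySet(), Duration.ofSeconds(10));
        // The end offset (LEO) is the offset of the next record to be written, so the
        // number of not-yet-replicated records is LEO - 1 - lastReplicatedOffset, floored at 0.
        endOffsets.forEach((tp, leo) ->
                lag.put(tp, Math.max(0L, leo - 1 - lastReplicatedSourceOffsets.get(tp))));
        return lag;
    }
}
{code}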


Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #2200

2023-09-13 Thread Apache Jenkins Server
See 




[jira] [Resolved] (KAFKA-15115) Implement resetPositions functionality in OffsetsRequestManager

2023-09-13 Thread Jun Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao resolved KAFKA-15115.
-
Fix Version/s: 3.7.0
   Resolution: Fixed

merged the PR to trunk

> Implement resetPositions functionality in OffsetsRequestManager
> ---
>
> Key: KAFKA-15115
> URL: https://issues.apache.org/jira/browse/KAFKA-15115
> Project: Kafka
>  Issue Type: Task
>  Components: clients, consumer
>Reporter: Lianet Magrans
>Assignee: Lianet Magrans
>Priority: Major
>  Labels: consumer-threading-refactor
> Fix For: 3.7.0
>
>
> Introduce support for resetting positions in the new OffsetsRequestManager. 
> This task will include a new event for the resetPositions calls performed 
> from the new consumer, and the logic for handling such events in the 
> OffsetRequestManager.
> The reset positions implementation will keep the same behaviour as the one in 
> the old consumer, but adapted to the new threading model. So it is based in a 
> RESET_POSITIONS events that is submitted to the background thread, and then 
> processed by the ApplicationEventProcessor. The processing itself is done by 
> the OffsetRequestManager given that this will require a LIST_OFFSETS request 
> for the partitions awaiting reset.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-15163) Implement validatePositions functionality for new KafkaConsumer

2023-09-13 Thread Jun Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao resolved KAFKA-15163.
-
Fix Version/s: 3.7.0
   Resolution: Fixed

This is covered in https://github.com/apache/kafka/pull/14346.

> Implement validatePositions functionality for new KafkaConsumer
> ---
>
> Key: KAFKA-15163
> URL: https://issues.apache.org/jira/browse/KAFKA-15163
> Project: Kafka
>  Issue Type: Task
>  Components: clients, consumer
>Reporter: Lianet Magrans
>Assignee: Lianet Magrans
>Priority: Major
>  Labels: consumer-threading-refactor
> Fix For: 3.7.0
>
>
> Introduce support for validating positions in the new OffsetsRequestManager. 
> This task will include a new event for the validatePositions calls performed 
> from the new consumer, and the logic for handling such events in the 
> OffsetRequestManager.
> The validate positions implementation will keep the same behaviour as the one 
> in the old consumer, but adapted to the new threading model. So it is based 
> in a VALIDATE_POSITIONS events that is submitted to the background thread, 
> and the processed by the ApplicationEventProcessor. The processing itself is 
> done by the OffsetRequestManager given that this will require an 
> OFFSET_FOR_LEADER_EPOCH request. This task will introduce support for such 
> requests in the OffsetRequestManager, responsible for offset-related requests.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: [VOTE] KIP-858: Handle JBOD broker disk failure in KRaft

2023-09-13 Thread Igor Soarez
Hi Ron,

Thanks for drilling down on this. I think the KIP isn't really clear here,
and the metadata caching section you quoted needs clarification.

The "hosting broker's latest registration" refers to the previous,
not the current registration. The registrations are only compared by
the controller, when handling the broker registration request.

Suppose broker b1 hosts two partitions, t-1 and t-2, in two
directories, d1 and d2. The broker is registered, and the
metadata correlates the replicas to their respective directories.
i.e. OnlineLogDirs=[d1,d2] and OfflineLogDirs=false

The broker is then reconfigured to remove t-2 from log.dirs, and at startup,
the registration request shows OnlineLogDirs=[d1] and OfflineLogDirs=false.
The previous registration will only be replaced after a new successful
registration, regardless of how quickly or how often b1 restarts.
The controller compares the previous registration, and notices
that one of the directories has been removed.
So for any replica hosted in the broker that is assigned to that
missing log directory, a logical metadata update takes place
that assigns it to Uuid.OfflineDir, so Assignment.Directory
is updated for t-2. This value indicates that the replica
is offline — I have updated the section you quoted to address this.

Once the broker catches up with metadata, it will select the only
configured log directory — d1 — for any partitions assigned to
Uuid.OfflineDir, and update the assignment.

Best,

--
Igor





Re: [DISCUSS] KIP-892: Transactional Semantics for StateStores

2023-09-13 Thread Nick Telford
Hi Bruno,

I've updated the KIP based on our conversation. The only things I've not
yet done are:

1. Using transactions under ALOS and EOS.
2. Making IsolationLevel a query-time constraint, rather than linking it to
the processing.guarantee.

There's a wrinkle that makes this a challenge: Interactive Queries that
open an Iterator, when using transactions and READ_UNCOMMITTED.
The problem is that under READ_UNCOMMITTED, queries need to be able to read
records from the currently uncommitted transaction buffer (WriteBatch).
This includes for Iterators, which should iterate both the transaction
buffer and underlying database (using WriteBatch#iteratorWithBase()).

The issue is that when the StreamThread commits, it writes the current
WriteBatch to RocksDB *and then clears the WriteBatch*. Clearing the
WriteBatch while an Interactive Query holds an open Iterator on it will
invalidate the Iterator. Worse, it turns out that Iterators over a
WriteBatch become invalidated not just when the WriteBatch is cleared, but
also when the Iterators' current key receives a new write.
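
To make the mechanics concrete, here is a self-contained RocksJava sketch of 
the READ_UNCOMMITTED read path via WriteBatchWithIndex#newIteratorWithBase, 
with the commit step that would invalidate any still-open iterator (the path 
and keys are illustrative only, and this is not Streams' actual store code):

{code:java}
import java.nio.charset.StandardCharsets;
import org.rocksdb.Options;
import org.rocksdb.RocksDB;
import org.rocksdb.RocksDBException;
import org.rocksdb.RocksIterator;
import org.rocksdb.WriteBatchWithIndex;
import org.rocksdb.WriteOptions;

public class UncommittedIteratorSketch {

    public static void main(String[] args) throws RocksDBException {
        RocksDB.loadLibrary();
        try (Options options = new Options().setCreateIfMissing(true);
             RocksDB db = RocksDB.open(options, "/tmp/kip892-sketch");
             WriteBatchWithIndex batch = new WriteBatchWithIndex(true);
             WriteOptions writeOptions = new WriteOptions()) {

            db.put(bytes("committed-key"), bytes("committed-value"));
            batch.put(bytes("uncommitted-key"), bytes("uncommitted-value"));

            // READ_UNCOMMITTED view: merge the uncommitted batch with the database.
            try (RocksIterator base = db.newIterator();
                 RocksIterator merged = batch.newIteratorWithBase(base)) {
                for (merged.seekToFirst(); merged.isValid(); merged.next()) {
                    System.out.println(new String(merged.key(), StandardCharsets.UTF_8));
                }
            }

            // "Commit": flush the batch to the database, then clear it. Any iterator
            // still open over the batch at this point becomes invalid -- the wrinkle
            // described above.
            db.write(writeOptions, batch);
            batch.clear();
        }
    }

    private static byte[] bytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }
}
{code}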

Now that I'm writing this, I remember that this is the major reason that I
switched the original design from having a query-time IsolationLevel to
having the IsolationLevel linked to the transactionality of the stores
themselves.

It *might* be possible to resolve this, by having a "chain" of
WriteBatches, with the StreamThread switching to a new WriteBatch whenever
a new Interactive Query attempts to read from the database, but that could
cause some performance problems/memory pressure when subjected to a high
Interactive Query load. It would also reduce the efficiency of WriteBatches
on-commit, as we'd have to write N WriteBatches, where N is the number of
Interactive Queries since the last commit.

I realise this is getting into the weeds of the implementation, and you'd
rather we focus on the API for now, but I think it's important to consider
how to implement the desired API, in case we come up with an API that
cannot be implemented efficiently, or even at all!

Thoughts?
--
Nick

On Wed, 13 Sept 2023 at 13:03, Bruno Cadonna  wrote:

> Hi Nick,
>
> 6.
> Of course, you are right! My bad!
> Wiping out the state in the downgrading case is fine.
>
>
> 3a.
> Focus on the public facing changes for the KIP. We will manage to get
> the internals right. Regarding state stores that do not support
> READ_COMMITTED, they should throw an error stating that they do not
> support READ_COMMITTED. No need to adapt all state stores immediately.
>
> 3b.
> I am in favor of using transactions also for ALOS.
>
>
> Best,
> Bruno
>
> On 9/13/23 11:57 AM, Nick Telford wrote:
> > Hi Bruno,
> >
> > Thanks for getting back to me!
> >
> > 2.
> > The fact that implementations can always track estimated memory usage in
> > the wrapper is a good point. I can remove -1 as an option, and I'll
> clarify
> > the JavaDoc that 0 is not just for non-transactional stores, which is
> > currently misleading.
> >
> > 6.
> > The problem with catching the exception in the downgrade process is that
> > would require new code in the Kafka version being downgraded to. Since
> > users could conceivably downgrade to almost *any* older version of Kafka
> > Streams, I'm not sure how we could add that code?
> > The only way I can think of doing it would be to provide a dedicated
> > downgrade tool, that goes through every local store and removes the
> > offsets column families. But that seems like an unnecessary amount of
> extra
> > code to maintain just to handle a somewhat niche situation, when the
> > alternative (automatically wipe and restore stores) should be acceptable.
> >
> > 1, 4, 5: Agreed. I'll make the changes you've requested.
> >
> > 3a.
> > I agree that IsolationLevel makes more sense at query-time, and I
> actually
> > initially attempted to place the IsolationLevel at query-time, but I ran
> > into some problems:
> > - The key issue is that, under ALOS we're not staging writes in
> > transactions, so can't perform writes at the READ_COMMITTED isolation
> > level. However, this may be addressed if we decide to *always* use
> > transactions as discussed under 3b.
> > - IQv1 and IQv2 have quite different implementations. I remember having
> > some difficulty understanding the IQv1 internals, which made it difficult
> > to determine what needed to be changed. However, I *think* this can be
> > addressed for both implementations by wrapping the RocksDBStore in an
> > IsolationLevel-dependent wrapper, that overrides read methods (get, etc.)
> > to either read directly from the database or from the ongoing
> transaction.
> > But IQv1 might still be difficult.
> > - If IsolationLevel becomes a query constraint, then all other
> StateStores
> > will need to respect it, including the in-memory stores. This would
> require
> > us to adapt in-memory stores to stage their writes so they can be
> isolated
> > from READ_COMMITTTED queries. It would also become an important
> > considera

Re: Apache Kafka 3.6.0 release

2023-09-13 Thread Justine Olshan
Hey Satish -- yes, you are correct. KAFKA-15459 only affects 3.6.
PR should be finalized soon.

Thanks,
Justine

On Wed, Sep 13, 2023 at 1:41 AM Federico Valeri 
wrote:

> Hi Satish, this is a small documentation fix about ZK to KRaft
> migration, that we would like to backport to 3.5 and 3.6 branches. Are
> you ok with that?
>
> https://github.com/apache/kafka/pull/14366
>
> On Wed, Sep 13, 2023 at 3:13 AM Satish Duggana 
> wrote:
> >
> > Thanks David for the quick resolution.
> >
> > ~Satish.
> >
> > On Tue, 12 Sept 2023 at 22:51, David Arthur
> >  wrote:
> > >
> > > Satish,
> > >
> > > KAFKA-15450 is merged to 3.6 (as well as trunk, 3.5, and 3.4)
> > >
> > > Thanks!
> > > David
> > >
> > > On Tue, Sep 12, 2023 at 11:44 AM Ismael Juma 
> wrote:
> > >
> > > > Justine,
> > > >
> > > > Probably best to have the conversation in the JIRA ticket vs the
> release
> > > > thread. Generally, we want to only include low risk bug fixes that
> are
> > > > fully compatible in patch releases.
> > > >
> > > > Ismael
> > > >
> > > > On Tue, Sep 12, 2023 at 7:16 AM Justine Olshan
> > > > 
> > > > wrote:
> > > >
> > > > > Thanks Satish. I understand.
> > > > > Just curious, is this something that could be added to 3.6.1? It
> would be
> > > > > nice to say that hanging transactions are fully covered in a 3.6
> release.
> > > > > I'm not as familiar with the rules around minor releases, but
> adding it
> > > > > there would give more time to ensure stability.
> > > > >
> > > > > Thanks,
> > > > > Justine
> > > > >
> > > > > On Tue, Sep 12, 2023 at 5:49 AM Satish Duggana <
> satish.dugg...@gmail.com
> > > > >
> > > > > wrote:
> > > > >
> > > > > > Hi Justine,
> > > > > > We can skip this change into 3.6 now as it is not a blocker or
> > > > > > regression and it involves changes to the API implementation.
> Let us
> > > > > > plan to add the gap in the release notes as you mentioned.
> > > > > >
> > > > > > Thanks,
> > > > > > Satish.
> > > > > >
> > > > > > On Tue, 12 Sept 2023 at 04:44, Justine Olshan
> > > > > >  wrote:
> > > > > > >
> > > > > > > Hey Satish,
> > > > > > >
> > > > > > > We just discovered a gap in KIP-890 part 1. We currently don't
> verify
> > > > > on
> > > > > > > txn offset commits, so it is still possible to have hanging
> > > > > transactions
> > > > > > on
> > > > > > > the consumer offsets partitions.
> > > > > > > I've opened a jira to wire the verification in that request.
> > > > > > > https://issues.apache.org/jira/browse/KAFKA-15449
> > > > > > >
> > > > > > > This also isn't a regression, but it would be nice to have
> part 1
> > > > fully
> > > > > > > complete. I have opened a PR with the fix:
> > > > > > > https://github.com/apache/kafka/pull/14370.
> > > > > > >
> > > > > > > I understand if there are concerns about last minute changes
> to this
> > > > > API
> > > > > > > and we can hold off if that makes the most sense.
> > > > > > > If we take that route, I think we should still keep
> verification for
> > > > > the
> > > > > > > data partitions since it still provides full protection there
> and
> > > > > > improves
> > > > > > > the transactions experience. We will need to call out the gap
> in the
> > > > > > > release notes for consumer offsets partitions
> > > > > > >
> > > > > > > Let me know what you think.
> > > > > > > Justine
> > > > > > >
> > > > > > >
> > > > > > > On Mon, Sep 11, 2023 at 12:29 PM David Arthur
> > > > > > >  wrote:
> > > > > > >
> > > > > > > > Another (small) ZK migration issue was identified. This one
> isn't a
> > > > > > > > regression (it has existed since 3.4), but I think it's
> reasonable
> > > > to
> > > > > > > > include. It's a small configuration check that could
> potentially
> > > > save
> > > > > > end
> > > > > > > > users from some headaches down the line.
> > > > > > > >
> > > > > > > > https://issues.apache.org/jira/browse/KAFKA-15450
> > > > > > > > https://github.com/apache/kafka/pull/14367
> > > > > > > >
> > > > > > > > I think we can get this one committed to trunk today.
> > > > > > > >
> > > > > > > > -David
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Sun, Sep 10, 2023 at 7:50 PM Ismael Juma <
> m...@ismaeljuma.com>
> > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi Satish,
> > > > > > > > >
> > > > > > > > > That sounds great. I think we should aim to only allow
> blockers
> > > > > > > > > (regressions, impactful security issues, etc.) on the 3.6
> branch
> > > > > > until
> > > > > > > > > 3.6.0 is out.
> > > > > > > > >
> > > > > > > > > Ismael
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Sat, Sep 9, 2023, 12:20 AM Satish Duggana <
> > > > > > satish.dugg...@gmail.com>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi Ismael,
> > > > > > > > > > It looks like we will publish RC0 by 14th Sep.
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Satish.
> > > > > > > > > >
> > > > > > > > > > On Fri, 8 Sept 2023 at 19:23, Ismael Juma <
> m...@ismaeljuma.com>
>

Re: [DISCUSS] KIP-892: Transactional Semantics for StateStores

2023-09-13 Thread Nick Telford
Addendum:

I think we would also face the same problem with the approach John outlined
earlier (using the record cache as a transaction buffer and flushing it
straight to SST files). This is because the record cache (the ThreadCache
class) is not thread-safe, so every commit would invalidate open IQ
Iterators in the same way that RocksDB WriteBatches do.
--
Nick

On Wed, 13 Sept 2023 at 16:58, Nick Telford  wrote:

> Hi Bruno,
>
> I've updated the KIP based on our conversation. The only things I've not
> yet done are:
>
> 1. Using transactions under ALOS and EOS.
> 2. Making IsolationLevel a query-time constraint, rather than linking it
> to the processing.guarantee.
>
> There's a wrinkle that makes this a challenge: Interactive Queries that
> open an Iterator, when using transactions and READ_UNCOMMITTED.
> The problem is that under READ_UNCOMMITTED, queries need to be able to
> read records from the currently uncommitted transaction buffer
> (WriteBatch). This includes for Iterators, which should iterate both the
> transaction buffer and underlying database (using
> WriteBatch#iteratorWithBase()).
>
> The issue is that when the StreamThread commits, it writes the current
> WriteBatch to RocksDB *and then clears the WriteBatch*. Clearing the
> WriteBatch while an Interactive Query holds an open Iterator on it will
> invalidate the Iterator. Worse, it turns out that Iterators over a
> WriteBatch become invalidated not just when the WriteBatch is cleared, but
> also when the Iterators' current key receives a new write.
>
> Now that I'm writing this, I remember that this is the major reason that I
> switched the original design from having a query-time IsolationLevel to
> having the IsolationLevel linked to the transactionality of the stores
> themselves.
>
> It *might* be possible to resolve this, by having a "chain" of
> WriteBatches, with the StreamThread switching to a new WriteBatch whenever
> a new Interactive Query attempts to read from the database, but that could
> cause some performance problems/memory pressure when subjected to a high
> Interactive Query load. It would also reduce the efficiency of WriteBatches
> on-commit, as we'd have to write N WriteBatches, where N is the number of
> Interactive Queries since the last commit.
>
> I realise this is getting into the weeds of the implementation, and you'd
> rather we focus on the API for now, but I think it's important to consider
> how to implement the desired API, in case we come up with an API that
> cannot be implemented efficiently, or even at all!
>
> Thoughts?
> --
> Nick
>
> On Wed, 13 Sept 2023 at 13:03, Bruno Cadonna  wrote:
>
>> Hi Nick,
>>
>> 6.
>> Of course, you are right! My bad!
>> Wiping out the state in the downgrading case is fine.
>>
>>
>> 3a.
>> Focus on the public facing changes for the KIP. We will manage to get
>> the internals right. Regarding state stores that do not support
>> READ_COMMITTED, they should throw an error stating that they do not
>> support READ_COMMITTED. No need to adapt all state stores immediately.
>>
>> 3b.
>> I am in favor of using transactions also for ALOS.
>>
>>
>> Best,
>> Bruno
>>
>> On 9/13/23 11:57 AM, Nick Telford wrote:
>> > Hi Bruno,
>> >
>> > Thanks for getting back to me!
>> >
>> > 2.
>> > The fact that implementations can always track estimated memory usage in
>> > the wrapper is a good point. I can remove -1 as an option, and I'll
>> clarify
>> > the JavaDoc that 0 is not just for non-transactional stores, which is
>> > currently misleading.
>> >
>> > 6.
>> > The problem with catching the exception in the downgrade process is that it
>> > would require new code in the Kafka version being downgraded to. Since
>> > users could conceivably downgrade to almost *any* older version of Kafka
>> > Streams, I'm not sure how we could add that code?
>> > The only way I can think of doing it would be to provide a dedicated
>> > downgrade tool, that goes through every local store and removes the
>> > offsets column families. But that seems like an unnecessary amount of
>> extra
>> > code to maintain just to handle a somewhat niche situation, when the
>> > alternative (automatically wipe and restore stores) should be
>> acceptable.
>> >
>> > 1, 4, 5: Agreed. I'll make the changes you've requested.
>> >
>> > 3a.
>> > I agree that IsolationLevel makes more sense at query-time, and I
>> actually
>> > initially attempted to place the IsolationLevel at query-time, but I ran
>> > into some problems:
>> > - The key issue is that, under ALOS we're not staging writes in
>> > transactions, so can't perform writes at the READ_COMMITTED isolation
>> > level. However, this may be addressed if we decide to *always* use
>> > transactions as discussed under 3b.
>> > - IQv1 and IQv2 have quite different implementations. I remember having
>> > some difficulty understanding the IQv1 internals, which made it
>> difficult
>> > to determine what needed to be changed. However, I *think* this can be
>> > addressed for both impl

Re: [VOTE] KIP-714: Client metrics and observability

2023-09-13 Thread Philip Nee
Hey Andrew -

Thank you for taking the time to reply to my questions. I'm just adding
some notes to this discussion.

1. epoch: It can be helpful to know the delta between the client-side and the
actual leader epoch. It helps explain why a commit sometimes fails or the
client stops making progress.
2. Client connection: If the client selects the "wrong" connection to push
out the data, I assume the request would time out, which should lead to
disconnecting from the node and reselecting another node as you mentioned,
via the least loaded node.

Cheers,
P


On Tue, Sep 12, 2023 at 10:40 AM Andrew Schofield <
andrew_schofield_j...@outlook.com> wrote:

> Hi Philip,
> Thanks for your vote and interest in the KIP.
>
> KIP-714 does not introduce any new client metrics, and that’s intentional.
> It does
> tell how all of the client metrics can have their names transformed
> into
> equivalent "telemetry metric names”, and then potentially used in metrics
> subscriptions.
>
> I am interested in the idea of client’s leader epoch in this context, but
> I don’t have
> an immediate plan for how best to do this, and it would take another KIP
> to enhance
> existing metrics or introduce some new ones. Those would then naturally be
> applicable to the metrics push introduced in KIP-714.
>
> In a similar vein, there are no existing client metrics specifically for
> auto-commit.
> We could add them to Kafka, but I really think this is just an example of
> asynchronous
> commit in which the application has decided not to specify when the commit
> should
> begin.
>
> It is possible to increase the cadence of pushing by modifying the
> interval.ms
> configuration property of the CLIENT_METRICS resource.
>
> There is an “assigned-partitions” metric for each consumer, but not one for
> active partitions. We could add one, again as a follow-on KIP.
>
> I take your point about holding on to a connection in a channel which might
> experience congestion. Do you have a suggestion for how to improve on this?
> For example, the client does have the concept of a least-loaded node. Maybe
> this is something we should investigate in the implementation and decide
> on the
> best approach. In general, I think sticking with the same node for
> consecutive
> pushes is best, but if you choose the “wrong” node to start with, it’s not
> ideal.
>
> Thanks,
> Andrew
>
> > On 8 Sep 2023, at 19:29, Philip Nee  wrote:
> >
> > Hey Andrew -
> >
> > +1 but I don't have a binding vote!
> >
> > It took me a while to go through the KIP. Here are some of my notes
> during
> > the reading:
> >
> > *Metrics*
> > - Should we care about the client's leader epoch? There is a case where
> the
> > user recreates the topic, but the consumer thinks it is still the same
> > topic and therefore, attempts to start from an offset that doesn't exist.
> > KIP-848 addresses this issue, but I can still see some potential benefits
> > from knowing the client's epoch information.
> > - I assume poll idle is similar to poll interval: I needed to read the
> > description a few times.
> > - I don't have a clear use case in mind for the commit latency, but I do
> > think sometimes people lack clarity about how much progress was tracked
> by
> > the auto-commit.  Would tracking auto-commit-related metrics be useful? I
> > was thinking: the last offset committed or the actual cadence in ms.
> > - Are there cases when we need to increase the cadence of telemetry data
> > push? i.e. variable interval.
> > - Thanks for implementing the randomized initial metric push; I think it
> is
> > really important.
> > - Is there a potential use case for tracking the number of active
> > partitions? The consumer can pause partitions via API, during revocation,
> > or during offset reset for the stream.
> >
> > *Connections*:
> > - The KIP stated that it will keep the same connection until the
> connection
> > is disconnected. I wonder if that could potentially cause congestion if
> it
> > is already a busy channel, which leads to connection timeout and
> > subsequently disconnection.
> >
> > Thanks,
> > P
> >
> > On Fri, Sep 8, 2023 at 4:15 AM Andrew Schofield <
> > andrew_schofield_j...@outlook.com> wrote:
> >
> >> Bumping the voting thread for KIP-714.
> >>
> >> So far, we have:
> >> Non-binding +2 (Milind and Kirk), non-binding -1 (Ryanne)
> >>
> >> Thanks,
> >> Andrew
> >>
> >>> On 4 Aug 2023, at 09:45, Andrew Schofield 
> >> wrote:
> >>>
> >>> Hi,
> >>> After almost 2 1/2 years in the making, I would like to call a vote for
> >> KIP-714 (
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-714%3A+Client+metrics+and+observability
> >> ).
> >>>
> >>> This KIP aims to improve monitoring and troubleshooting of client
> >> performance by enabling clients to push metrics to brokers.
> >>>
> >>> I’d like to thank everyone that participated in the discussion,
> >> especially the librdkafka team since one of the aims of the KIP is to
> >> enable any client to participate, not just the Apache

Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #2201

2023-09-13 Thread Apache Jenkins Server
See 




Re: [DISCUSS] KIP-966: Eligible Leader Replicas

2023-09-13 Thread Colin McCabe
On Tue, Sep 12, 2023, at 17:21, Calvin Liu wrote:
> Hi Colin
> Thanks for the comments!
>

Hi Calvin,

Thanks again for the KIP.

One meta-comment: it's usually better to just do a diff on a message spec file 
or java file if you're including changes to it in the KIP. This is easier to 
read than looking for "new fields begin" etc. in the text, and gracefully 
handles the case where existing fields were changed.

> Rewrite the Additional High Watermark advancement requirement
> There was feedback on this section that some readers may not be familiar
> with HWM and Ack=0,1,all requests. This can help them understand the
> proposal. I will rewrite this part for more readability.
>

To be clear, I wasn't suggesting dropping either section. I agree that they add 
useful background. I was just suggesting that we should discuss the "acks" 
setting AFTER discussing the new high watermark advancement conditions. We also 
should discuss acks=0. While it isn't conceptually much different than acks=1 
here, its omission from this section is confusing.

> Unclean recovery
>
> The plan is to replace the unclean.leader.election.enable with
> unclean.recovery.strategy. If the Unclean Recovery is enabled then it deals
> with the three options in the unclean.recovery.strategy.
>
>
> Let’s refine the Unclean Recovery. We have already taken a lot of
> suggestions and I hope to enhance the durability of Kafka to the next level
> with this KIP.

I am OK with doing the unclean leader recovery improvements in this KIP. 
However, I think we need to really work on the configuration settings.

Configuration overrides are often quite messy. For example, in the case where we
have log.roll.hours and log.roll.ms, the user has to remember which one
takes precedence, and it is not obvious. So, rather than creating a new 
configuration, why not add additional values to 
"unclean.leader.election.enable"? I think this will be simpler for people to 
understand, and simpler in the code as well.

What if we continued to use "unclean.leader.election.enable" but extended it so 
that it took a string? Then the string could have these values:

never
never automatically do an unclean leader election under any conditions

false / default
only do an unclean leader election if there may be possible data loss

true / always
always do an unclean leader election if we can't immediately elect a leader

It's a bit awkward that false maps to default rather than to never. But this 
awkwardness exists if we use two different configuration keys as well. The 
reason for the awkwardness is that we simply don't want most of the people 
currently setting unclean.leader.election.enable=false to get the "never" 
behavior. We have to bite that bullet. Better to be clear and explicit than 
hide it.

Another thing that's a bit awkward is having two different ways to do unclean 
leader election specified in the KIP. You describe two methods: the simple 
"choose the last leader" method, and the "unclean recovery manager" method. I 
understand why you did it this way -- "choose the last leader" is simple, and 
will help us deliver an implementation quickly, while the URM is preferable in 
the long term. My suggestion here is to separate the decision of HOW to do 
unclean leader election from the decision of WHEN to do it.

So in other words, have "unclean.leader.election.enable" specify when we do 
unclean leader election, and have a new configuration like 
"unclean.recovery.manager.enable" to determine if we use the URM. Presumably 
the URM will take some time to get fully stable, so this can default to false 
for a while, and we can flip the default to true when we feel ready.
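
To make that concrete, here is a sketch of what the split might look like in
server.properties. The key names and values are taken from this discussion and
are proposals only, not finalized configuration:

# WHEN to do an unclean leader election: the existing key, extended to accept
# a string ("never", "false"/"default", "true"/"always")
unclean.leader.election.enable=false

# HOW to do it: opt in to the Unclean Recovery Manager once it is stable
unclean.recovery.manager.enable=false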

The URM is somewhat under-described here. I think we need a few configurations
for it. For example, we need a configuration to specify how long it should
wait for a broker to respond to its RPCs before moving on. We also need to 
understand how the URM interacts with unclean.leader.election.enable=always. I 
assume that with "always" we will just unconditionally use the URM rather than 
choosing randomly. But this should be spelled out in the KIP.

>
> DescribeTopicRequest
>
>1.
>Yes, the plan is to replace the MetadataRequest with the
>DescribeTopicRequest for the admin clients. Will check the details.

Sounds good. But as I said, you need to specify how AdminClient interacts with 
the new request. This will involve adding some fields to TopicDescription.java. 
And you need to specify the changes to the kafka-topics.sh command line tool. 
Otherwise we cannot use the tool to see the new information.

The new requests, DescribeTopicRequest and GetReplicaLogInfoRequest, need to 
have limits placed on them so that their size can't be infinite. We don't want 
to propagate the current problems of MetadataRequest, where clients can request 
massive responses that can mess up the JVM when handled.

Adding limits is simple for GetReplicaLogInfoRequest -- we can just

Jenkins build is unstable: Kafka » Kafka Branch Builder » 3.6 #39

2023-09-13 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #2202

2023-09-13 Thread Apache Jenkins Server
See 




[jira] [Created] (KAFKA-15464) Allow dynamic reloading of certificates with different DN / SANs

2023-09-13 Thread Jakub Scholz (Jira)
Jakub Scholz created KAFKA-15464:


 Summary: Allow dynamic reloading of certificates with different DN 
/ SANs
 Key: KAFKA-15464
 URL: https://issues.apache.org/jira/browse/KAFKA-15464
 Project: Kafka
  Issue Type: Improvement
Reporter: Jakub Scholz


Kafka currently doesn't allow dynamic reloading of keystores when the new key 
has a different DN or removes some of the SANs. While it might help to prevent 
users from breaking their cluster, in some cases it would be great to be able 
to bypass this validation when desired.

More details are in the [KIP-978: Allow dynamic reloading of certificates with 
different DN / 
SANs|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=263429128]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[DISCUSS] KIP-978: Allow dynamic reloading of certificates with different DN / SANs

2023-09-13 Thread Jakub Scholz
Hi all,

I would like to start the discussion about KIP-978: Allow dynamic
reloading of certificates with different DN / SANs.
It proposes adding an option to disable the current validation of the DN
and SANs when dynamically changing the keystore. Please have a look and let
me know your thoughts ...

Thanks & Regards
Jakub


[jira] [Created] (KAFKA-15465) MM2 not working when its internal topics are pre-created on a cluster that disallows topic creation

2023-09-13 Thread Ahmed HIBOT (Jira)
Ahmed HIBOT created KAFKA-15465:
---

 Summary: MM2 not working when its internal topics are pre-created 
on a cluster that disallows topic creation
 Key: KAFKA-15465
 URL: https://issues.apache.org/jira/browse/KAFKA-15465
 Project: Kafka
  Issue Type: Bug
  Components: mirrormaker
Affects Versions: 3.4.1
Reporter: Ahmed HIBOT






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-15459) Convert coordinator retriable errors to a known producer response error.

2023-09-13 Thread Justine Olshan (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-15459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Justine Olshan resolved KAFKA-15459.

Resolution: Fixed

> Convert coordinator retriable errors to a known producer response error.
> 
>
> Key: KAFKA-15459
> URL: https://issues.apache.org/jira/browse/KAFKA-15459
> Project: Kafka
>  Issue Type: Sub-task
>Affects Versions: 3.6.0
>Reporter: Justine Olshan
>Assignee: Justine Olshan
>Priority: Blocker
> Fix For: 3.6.0
>
>
> KIP-890 Part 1 tries to address hanging transactions on old clients. Thus, 
> the produce version can not be bumped and no new errors can be added. 
> Currently we use the java client's notion of retriable and abortable errors 
> -- retriable errors are defined as such by extending the retriable error 
> class, fatal errors are defined explicitly, and abortable errors are the 
> remaining. However, many other clients treat non specified errors as fatal 
> and that means many retriable errors kill the application. This is not ideal.
> While reviewing [https://github.com/apache/kafka/pull/14370] I added some of 
> the documentation for the returned errors in the produce response as well.
> There were concerns about the new errors:
>  * {@link Errors#COORDINATOR_LOAD_IN_PROGRESS}
>  * {@link Errors#COORDINATOR_NOT_AVAILABLE}
>  * {@link Errors#INVALID_TXN_STATE}
>  * {@link Errors#INVALID_PRODUCER_ID_MAPPING}
>  * {@link Errors#CONCURRENT_TRANSACTIONS}
> The coordinator load, not available, and concurrent transactions errors 
> should be retriable.
> The invalid txn state and pid mapping errors should be abortable.
> This is how older java clients handle the errors, but it is unclear how other 
> clients handle them. It seems that rdkafka (for example) treats the abortable 
> errors as fatal instead. The coordinator errors are retriable but not the 
> concurrent transactions error. Generally anything not specified otherwise is 
> fatal.
> It seems acceptable for the abortable errors to be fatal on some clients 
> since the error is likely on a zombie producer or in a state that may be 
> harder to recover from. However, for the retriable errors, we can return 
> NOT_ENOUGH_REPLICAS which is a known retriable response. We can use the 
> produce api's response string to specify the real cause of the error for 
> debugging. 
> There were trade-offs between making the older clients work and for clarity 
> in errors. This seems to be the best compromise.
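
For illustration only, a sketch of the mapping described above. The Errors
constants are real, but the helper class, its name, and its use are
hypothetical and are not the actual patch:

import org.apache.kafka.common.protocol.Errors;

public final class OldClientErrorMappingSketch {

    // Hypothetical helper: down-convert errors that old produce versions do
    // not know about into a response those clients already handle.
    static Errors mapForOldProduceVersions(Errors error) {
        switch (error) {
            // Retriable coordinator-side errors: return a known retriable
            // error so non-Java clients retry instead of failing fatally.
            case COORDINATOR_LOAD_IN_PROGRESS:
            case COORDINATOR_NOT_AVAILABLE:
            case CONCURRENT_TRANSACTIONS:
                return Errors.NOT_ENOUGH_REPLICAS;
            // Abortable errors (INVALID_TXN_STATE, INVALID_PRODUCER_ID_MAPPING)
            // and everything else are passed through unchanged.
            default:
                return error;
        }
    }

    public static void main(String[] args) {
        // Prints NOT_ENOUGH_REPLICAS
        System.out.println(mapForOldProduceVersions(Errors.COORDINATOR_NOT_AVAILABLE));
    }
}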



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-15466) Add KIP-919 support to kafka-features.sh, kafka-metadata-quorum.sh, kafka-cluster.sh

2023-09-13 Thread Colin McCabe (Jira)
Colin McCabe created KAFKA-15466:


 Summary: Add KIP-919 support to kafka-features.sh, 
kafka-metadata-quorum.sh, kafka-cluster.sh
 Key: KAFKA-15466
 URL: https://issues.apache.org/jira/browse/KAFKA-15466
 Project: Kafka
  Issue Type: Improvement
Reporter: Colin McCabe






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Jenkins build is still unstable: Kafka » Kafka Branch Builder » 3.6 #40

2023-09-13 Thread Apache Jenkins Server
See 




Re: [VOTE] KIP-714: Client metrics and observability

2023-09-13 Thread Jason Gustafson
Hey Andrew,

+1 on the KIP. For many users of Kafka, it may not be fully understood how
much of a challenge client monitoring is. With tens of clients in a
cluster, it is already difficult to coordinate metrics collection. When
there are thousands of clients, and when the cluster operator has no
control over them, it is essentially impossible. For the fat clients that
we have, the lack of useful telemetry is a huge operational gap.
Consistency between clients has also been a major challenge. I think the
effort toward standardization in this KIP will have some positive impact
even in deployments which have effective client-side monitoring. Overall, I
think this proposal will provide a lot of value across the board.

Best,
Jason

On Wed, Sep 13, 2023 at 9:50 AM Philip Nee  wrote:

> Hey Andrew -
>
> Thank you for taking the time to reply to my questions. I'm just adding
> some notes to this discussion.
>
> 1. epoch: It can be helpful to know the delta between the client-side and the
> actual leader epoch. It helps explain why a commit sometimes fails or the
> client stops making progress.
> 2. Client connection: If the client selects the "wrong" connection to push
> out the data, I assume the request would time out, which should lead to
> disconnecting from the node and reselecting another node as you mentioned,
> via the least loaded node.
>
> Cheers,
> P
>
>
> On Tue, Sep 12, 2023 at 10:40 AM Andrew Schofield <
> andrew_schofield_j...@outlook.com> wrote:
>
> > Hi Philip,
> > Thanks for your vote and interest in the KIP.
> >
> > KIP-714 does not introduce any new client metrics, and that’s
> intentional.
> > It does
> > tell how all of the client metrics can have their names transformed
> > into
> > equivalent "telemetry metric names”, and then potentially used in metrics
> > subscriptions.
> >
> > I am interested in the idea of client’s leader epoch in this context, but
> > I don’t have
> > an immediate plan for how best to do this, and it would take another KIP
> > to enhance
> > existing metrics or introduce some new ones. Those would then naturally
> be
> > applicable to the metrics push introduced in KIP-714.
> >
> > In a similar vein, there are no existing client metrics specifically for
> > auto-commit.
> > We could add them to Kafka, but I really think this is just an example of
> > asynchronous
> > commit in which the application has decided not to specify when the
> commit
> > should
> > begin.
> >
> > It is possible to increase the cadence of pushing by modifying the
> > interval.ms
> > configuration property of the CLIENT_METRICS resource.
> >
> > There is an “assigned-partitions” metric for each consumer, but not one
> for
> > active partitions. We could add one, again as a follow-on KIP.
> >
> > I take your point about holding on to a connection in a channel which
> might
> > experience congestion. Do you have a suggestion for how to improve on
> this?
> > For example, the client does have the concept of a least-loaded node.
> Maybe
> > this is something we should investigate in the implementation and decide
> > on the
> > best approach. In general, I think sticking with the same node for
> > consecutive
> > pushes is best, but if you choose the “wrong” node to start with, it’s
> not
> > ideal.
> >
> > Thanks,
> > Andrew
> >
> > > On 8 Sep 2023, at 19:29, Philip Nee  wrote:
> > >
> > > Hey Andrew -
> > >
> > > +1 but I don't have a binding vote!
> > >
> > > It took me a while to go through the KIP. Here are some of my notes
> > during
> > > the reading:
> > >
> > > *Metrics*
> > > - Should we care about the client's leader epoch? There is a case where
> > the
> > > user recreates the topic, but the consumer thinks it is still the same
> > > topic and therefore, attempts to start from an offset that doesn't
> exist.
> > > KIP-848 addresses this issue, but I can still see some potential
> benefits
> > > from knowing the client's epoch information.
> > > - I assume poll idle is similar to poll interval: I needed to read the
> > > description a few times.
> > > - I don't have a clear use case in mind for the commit latency, but I
> do
> > > think sometimes people lack clarity about how much progress was tracked
> > by
> > > the auto-commit.  Would tracking auto-commit-related metrics be
> useful? I
> > > was thinking: the last offset committed or the actual cadence in ms.
> > > - Are there cases when we need to increase the cadence of telemetry
> data
> > > push? i.e. variable interval.
> > > - Thanks for implementing the randomized initial metric push; I think
> it
> > is
> > > really important.
> > > - Is there a potential use case for tracking the number of active
> > > partitions? The consumer can pause partitions via API, during
> revocation,
> > > or during offset reset for the stream.
> > >
> > > *Connections*:
> > > - The KIP stated that it will keep the same connection until the
> > connection
> > > is disconnected. I wonder if that could potentially cause congestion if
> > it
> > > is al

Re: Apache Kafka 3.6.0 release

2023-09-13 Thread Luke Chen
Hi Satish,

Since this PR:
https://github.com/apache/kafka/pull/14366 only changes the doc, I've
backported to 3.6 branch. FYI.

Thanks.
Luke

On Thu, Sep 14, 2023 at 12:15 AM Justine Olshan
 wrote:

> Hey Satish -- yes, you are correct. KAFKA-15459 only affects 3.6.
> PR should be finalized soon.
>
> Thanks,
> Justine
>
> On Wed, Sep 13, 2023 at 1:41 AM Federico Valeri 
> wrote:
>
> > Hi Satish, this is a small documentation fix about ZK to KRaft
> > migration, that we would like to backport to 3.5 and 3.6 branches. Are
> > you ok with that?
> >
> > https://github.com/apache/kafka/pull/14366
> >
> > On Wed, Sep 13, 2023 at 3:13 AM Satish Duggana  >
> > wrote:
> > >
> > > Thanks David for the quick resolution.
> > >
> > > ~Satish.
> > >
> > > On Tue, 12 Sept 2023 at 22:51, David Arthur
> > >  wrote:
> > > >
> > > > Satish,
> > > >
> > > > KAFKA-15450 is merged to 3.6 (as well as trunk, 3.5, and 3.4)
> > > >
> > > > Thanks!
> > > > David
> > > >
> > > > On Tue, Sep 12, 2023 at 11:44 AM Ismael Juma 
> > wrote:
> > > >
> > > > > Justine,
> > > > >
> > > > > Probably best to have the conversation in the JIRA ticket vs the
> > release
> > > > > thread. Generally, we want to only include low risk bug fixes that
> > are
> > > > > fully compatible in patch releases.
> > > > >
> > > > > Ismael
> > > > >
> > > > > On Tue, Sep 12, 2023 at 7:16 AM Justine Olshan
> > > > > 
> > > > > wrote:
> > > > >
> > > > > > Thanks Satish. I understand.
> > > > > > Just curious, is this something that could be added to 3.6.1? It
> > would be
> > > > > > nice to say that hanging transactions are fully covered in a 3.6
> > release.
> > > > > > I'm not as familiar with the rules around minor releases, but
> > adding it
> > > > > > there would give more time to ensure stability.
> > > > > >
> > > > > > Thanks,
> > > > > > Justine
> > > > > >
> > > > > > On Tue, Sep 12, 2023 at 5:49 AM Satish Duggana <
> > satish.dugg...@gmail.com
> > > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Hi Justine,
> > > > > > > We can skip this change into 3.6 now as it is not a blocker or
> > > > > > > regression and it involves changes to the API implementation.
> > Let us
> > > > > > > plan to add the gap in the release notes as you mentioned.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Satish.
> > > > > > >
> > > > > > > On Tue, 12 Sept 2023 at 04:44, Justine Olshan
> > > > > > >  wrote:
> > > > > > > >
> > > > > > > > Hey Satish,
> > > > > > > >
> > > > > > > > We just discovered a gap in KIP-890 part 1. We currently
> don't
> > verify
> > > > > > on
> > > > > > > > txn offset commits, so it is still possible to have hanging
> > > > > > transactions
> > > > > > > on
> > > > > > > > the consumer offsets partitions.
> > > > > > > > I've opened a jira to wire the verification in that request.
> > > > > > > > https://issues.apache.org/jira/browse/KAFKA-15449
> > > > > > > >
> > > > > > > > This also isn't a regression, but it would be nice to have
> > part 1
> > > > > fully
> > > > > > > > complete. I have opened a PR with the fix:
> > > > > > > > https://github.com/apache/kafka/pull/14370.
> > > > > > > >
> > > > > > > > I understand if there are concerns about last minute changes
> > to this
> > > > > > API
> > > > > > > > and we can hold off if that makes the most sense.
> > > > > > > > If we take that route, I think we should still keep
> > verification for
> > > > > > the
> > > > > > > > data partitions since it still provides full protection there
> > and
> > > > > > > improves
> > > > > > > > the transactions experience. We will need to call out the gap
> > in the
> > > > > > > > release notes for consumer offsets partitions
> > > > > > > >
> > > > > > > > Let me know what you think.
> > > > > > > > Justine
> > > > > > > >
> > > > > > > >
> > > > > > > > On Mon, Sep 11, 2023 at 12:29 PM David Arthur
> > > > > > > >  wrote:
> > > > > > > >
> > > > > > > > > Another (small) ZK migration issue was identified. This one
> > isn't a
> > > > > > > > > regression (it has existed since 3.4), but I think it's
> > reasonable
> > > > > to
> > > > > > > > > include. It's a small configuration check that could
> > potentially
> > > > > save
> > > > > > > end
> > > > > > > > > users from some headaches down the line.
> > > > > > > > >
> > > > > > > > > https://issues.apache.org/jira/browse/KAFKA-15450
> > > > > > > > > https://github.com/apache/kafka/pull/14367
> > > > > > > > >
> > > > > > > > > I think we can get this one committed to trunk today.
> > > > > > > > >
> > > > > > > > > -David
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Sun, Sep 10, 2023 at 7:50 PM Ismael Juma <
> > m...@ismaeljuma.com>
> > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi Satish,
> > > > > > > > > >
> > > > > > > > > > That sounds great. I think we should aim to only allow
> > blockers
> > > > > > > > > > (regressions, impactful security issues, etc.) on the 3.6
> > branch
> > > > > > > until
> > > > > > > > > > 3.6.0 is out.
> > > > 

Jenkins build is still unstable: Kafka » Kafka Branch Builder » 3.6 #41

2023-09-13 Thread Apache Jenkins Server
See 




Re: Apache Kafka 3.6.0 release

2023-09-13 Thread Satish Duggana
Thanks Luke for the update.

~Satish.

On Thu, 14 Sept 2023 at 07:29, Luke Chen  wrote:
>
> Hi Satish,
>
> Since this PR:
> https://github.com/apache/kafka/pull/14366 only changes the doc, I've
> backported to 3.6 branch. FYI.
>
> Thanks.
> Luke
>
> On Thu, Sep 14, 2023 at 12:15 AM Justine Olshan
>  wrote:
>
> > Hey Satish -- yes, you are correct. KAFKA-15459 only affects 3.6.
> > PR should be finalized soon.
> >
> > Thanks,
> > Justine
> >
> > On Wed, Sep 13, 2023 at 1:41 AM Federico Valeri 
> > wrote:
> >
> > > Hi Satish, this is a small documentation fix about ZK to KRaft
> > > migration, that we would like to backport to 3.5 and 3.6 branches. Are
> > > you ok with that?
> > >
> > > https://github.com/apache/kafka/pull/14366
> > >
> > > On Wed, Sep 13, 2023 at 3:13 AM Satish Duggana  > >
> > > wrote:
> > > >
> > > > Thanks David for the quick resolution.
> > > >
> > > > ~Satish.
> > > >
> > > > On Tue, 12 Sept 2023 at 22:51, David Arthur
> > > >  wrote:
> > > > >
> > > > > Satish,
> > > > >
> > > > > KAFKA-15450 is merged to 3.6 (as well as trunk, 3.5, and 3.4)
> > > > >
> > > > > Thanks!
> > > > > David
> > > > >
> > > > > On Tue, Sep 12, 2023 at 11:44 AM Ismael Juma 
> > > wrote:
> > > > >
> > > > > > Justine,
> > > > > >
> > > > > > Probably best to have the conversation in the JIRA ticket vs the
> > > release
> > > > > > thread. Generally, we want to only include low risk bug fixes that
> > > are
> > > > > > fully compatible in patch releases.
> > > > > >
> > > > > > Ismael
> > > > > >
> > > > > > On Tue, Sep 12, 2023 at 7:16 AM Justine Olshan
> > > > > > 
> > > > > > wrote:
> > > > > >
> > > > > > > Thanks Satish. I understand.
> > > > > > > Just curious, is this something that could be added to 3.6.1? It
> > > would be
> > > > > > > nice to say that hanging transactions are fully covered in a 3.6
> > > release.
> > > > > > > I'm not as familiar with the rules around minor releases, but
> > > adding it
> > > > > > > there would give more time to ensure stability.
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Justine
> > > > > > >
> > > > > > > On Tue, Sep 12, 2023 at 5:49 AM Satish Duggana <
> > > satish.dugg...@gmail.com
> > > > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi Justine,
> > > > > > > > We can skip this change into 3.6 now as it is not a blocker or
> > > > > > > > regression and it involves changes to the API implementation.
> > > Let us
> > > > > > > > plan to add the gap in the release notes as you mentioned.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Satish.
> > > > > > > >
> > > > > > > > On Tue, 12 Sept 2023 at 04:44, Justine Olshan
> > > > > > > >  wrote:
> > > > > > > > >
> > > > > > > > > Hey Satish,
> > > > > > > > >
> > > > > > > > > We just discovered a gap in KIP-890 part 1. We currently
> > don't
> > > verify
> > > > > > > on
> > > > > > > > > txn offset commits, so it is still possible to have hanging
> > > > > > > transactions
> > > > > > > > on
> > > > > > > > > the consumer offsets partitions.
> > > > > > > > > I've opened a jira to wire the verification in that request.
> > > > > > > > > https://issues.apache.org/jira/browse/KAFKA-15449
> > > > > > > > >
> > > > > > > > > This also isn't a regression, but it would be nice to have
> > > part 1
> > > > > > fully
> > > > > > > > > complete. I have opened a PR with the fix:
> > > > > > > > > https://github.com/apache/kafka/pull/14370.
> > > > > > > > >
> > > > > > > > > I understand if there are concerns about last minute changes
> > > to this
> > > > > > > API
> > > > > > > > > and we can hold off if that makes the most sense.
> > > > > > > > > If we take that route, I think we should still keep
> > > verification for
> > > > > > > the
> > > > > > > > > data partitions since it still provides full protection there
> > > and
> > > > > > > > improves
> > > > > > > > > the transactions experience. We will need to call out the gap
> > > in the
> > > > > > > > > release notes for consumer offsets partitions
> > > > > > > > >
> > > > > > > > > Let me know what you think.
> > > > > > > > > Justine
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Mon, Sep 11, 2023 at 12:29 PM David Arthur
> > > > > > > > >  wrote:
> > > > > > > > >
> > > > > > > > > > Another (small) ZK migration issue was identified. This one
> > > isn't a
> > > > > > > > > > regression (it has existed since 3.4), but I think it's
> > > reasonable
> > > > > > to
> > > > > > > > > > include. It's a small configuration check that could
> > > potentially
> > > > > > save
> > > > > > > > end
> > > > > > > > > > users from some headaches down the line.
> > > > > > > > > >
> > > > > > > > > > https://issues.apache.org/jira/browse/KAFKA-15450
> > > > > > > > > > https://github.com/apache/kafka/pull/14367
> > > > > > > > > >
> > > > > > > > > > I think we can get this one committed to trunk today.
> > > > > > > > > >
> > > > > > > > > > -David
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Sun, 

Re: [VOTE] KIP-942: Add Power(ppc64le) support

2023-09-13 Thread Divij Vaidya
I don’t have a strong opinion on this. I am fine with running these tests
nightly instead of on every merge to trunk.

On Wed 13. Sep 2023 at 08:13, Vaibhav Nazare 
wrote:

> Hi Colin,
> I do agree with you on running a nightly job.
>
> Any thoughts on this Mickael and Divij so I can update the KIP accordingly?
>
> Thanks
> Vaibhav Nazare
>
>
> -Original Message-
> From: Colin McCabe 
> Sent: Wednesday, September 13, 2023 3:48 AM
> To: dev@kafka.apache.org
> Subject: [EXTERNAL] Re: [VOTE] KIP-942: Add Power(ppc64le) support
>
> I just disagree with the idea of running it for every commit. If we can't
> compromise on this condition then I just have to cast a -1. It's tough
> enough to get stuff committed since our tests take quite a long time.  The
> fact that we only manage 5-6 commits a day is a bad thing, and not
> something we should make even worse. It leads directly to things like big
> PR backlogs, which we've discussed here many times.
>
> best,
> Colin
>
>
> On Mon, Sep 11, 2023, at 06:50, Vaibhav Nazare wrote:
> > Hi Colin
> >
> > Can we continue the voting process now?
> >
> > Thanks
> > VaibhavNazare
> >
> > -Original Message-
> > From: Divij Vaidya 
> > Sent: Monday, August 28, 2023 1:43 PM
> > To: dev@kafka.apache.org
> > Subject: [EXTERNAL] Re: [VOTE] KIP-942: Add Power(ppc64le) support
> >
> > Hey Colin
> >
> > I suggested running tests on every merge to trunk because on
> > average we have 5-6 commits merged per day in the discuss thread
> > https://lists.apache.org/thread/4mfq46fc7nnsr96odqxxhcxyv24d8zn0  .
> > Running this test suite 5 times won't be a burden to the CI
> > infrastructure. The advantage we get is that, unlike nightly builds
> > which have a chance of being ignored, branch builds are actively
> > monitored by folks in the community. Hence, we will be able to add
> > this new suite without adding a new routine in the maintenance.
> >
> > --
> > Divij Vaidya
> >
> > On Fri, Aug 25, 2023 at 6:49 PM Colin McCabe  wrote:
> >>
> >> Thank you for continuing to work on this.
> >>
> >> One comment. When we discussed this in the DISCUSS thread, we all
> wanted to run it nightly in branch builder (or possibly weekly). But
> looking at the KIP, it doesn't seem to have been updated with the results
> of these discussions.
> >>
> >> best,
> >> Colin
> >>
> >>
> >> On Mon, Aug 21, 2023, at 01:37, Mickael Maison wrote:
> >> > +1 (binding)
> >> > Thanks for the KIP!
> >> >
> >> > Mickael
> >> >
> >> > On Mon, Aug 14, 2023 at 1:40 PM Divij Vaidya 
> wrote:
> >> >>
> >> >> +1 (binding)
> >> >>
> >> >> --
> >> >> Divij Vaidya
> >> >>
> >> >>
> >> >> On Wed, Jul 26, 2023 at 9:04 AM Vaibhav Nazare
> >> >>  wrote:
> >> >> >
> >> >> > I'd like to call a vote on KIP-942
>


Re: [VOTE] KIP-942: Add Power(ppc64le) support

2023-09-13 Thread Boudjelda Mohamed Said
+1 (binding)


On Wed 26 Jul 2023 at 09:04, Vaibhav Nazare 
wrote:

> I'd like to call a vote on KIP-942
>