[jira] [Created] (KAFKA-17753) Update protobuf and commons-io dependencies

2024-10-09 Thread Colin McCabe (Jira)
Colin McCabe created KAFKA-17753:


 Summary: Update protobuf and commons-io dependencies
 Key: KAFKA-17753
 URL: https://issues.apache.org/jira/browse/KAFKA-17753
 Project: Kafka
  Issue Type: Bug
Affects Versions: 3.9.0
Reporter: Colin McCabe
Assignee: Colin McCabe
 Fix For: 3.9.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: New release branch 3.9

2024-10-09 Thread Colin McCabe
Hi Josep,

I marked this as a blocker. We will take a look. I also found some dependencies 
that need to be updated due to CVEs. See github.com/apache/kafka/pull/17436

best,
Colin

On Wed, Oct 9, 2024, at 05:52, Josep Prat wrote:
> Hi Colin,
>
> I want to raise a bug we encountered while testing 3.9 RC0. You can find
> the report here: https://issues.apache.org/jira/browse/KAFKA-17751
> Its current severity is "high", but IMO it might even be a blocker. What do
> you think?
>
> Best,
>
> On Fri, Oct 4, 2024 at 5:18 PM José Armando García Sancio
>  wrote:
>
>> Thanks Colin.
>>
>> KAFKA-16927 has been merged to trunk and the 3.9 branch.
>>
>> --
>> -José
>>
>
>
> -- 
> *Josep Prat*
> Open Source Engineering Director, *Aiven*
> josep.p...@aiven.io   |   +491715557497
> aiven.io
> *Aiven Deutschland GmbH*
> Alexanderufer 3-7, 10117 Berlin
> Geschäftsführer: Oskari Saarenmaa, Hannu Valtonen,
> Anna Richardson, Kenneth Chen
> Amtsgericht Charlottenburg, HRB 209739 B


[Need Help] Planning To Contribute To Kafka Open Source

2024-10-09 Thread Anshul Goyal
Hi Team,

I am a Senior Engineer with almost 8 years of experience in the Java tech
stack. I want to contribute to the development of Kafka.

Before getting started, I would like to share my approach with you and get
your input on it.

I plan to pick one smaller end-to-end workflow (e.g. topic creation),
try to understand it well, and then start contributing to the JIRA items
specific to that workflow.

Please suggest which area I could dive into, i.e. I would like to select an
area where future development is planned and JIRA items are available to
be picked up. It would also be very helpful if you could share some
product documents, discussion sessions, etc.

Not sure if I will get a response to this email.
Still trying my best.


Thanks In Advance
Anshul Goyal


Re: [DISCUSS] Apache Kafka 3.8.1 release

2024-10-09 Thread Matthias J. Sax

Hi,

we recently found a bug that we would like to get fixed with the 3.8.1 
release: https://issues.apache.org/jira/browse/KAFKA-17731


There is already a PR for it. I marked the Jira as a blocker for 3.8.1 
for now, for tracking purposes. Can we consider it for the release?



-Matthias

On 9/30/24 1:41 AM, Josep Prat wrote:

Hi there,
I'll attempt to cut the first RC for 3.8.1 this Wednesday. If you have any
bug fix that you'd like to backport to the 3.8 branch and you'd need more
time, please let me know.

Best,

On Tue, Sep 24, 2024 at 1:15 PM Josep Prat  wrote:


Hi folks!
As promised, here you have the release plan for 3.8.1:
https://cwiki.apache.org/confluence/display/KAFKA/Release+Plan+3.8.1

I'm aiming to have a release candidate by the first week of October unless
somebody else finds a blocker before then.

Best,

On Fri, Sep 20, 2024 at 11:57 AM Chia-Ping Tsai  wrote:

+1

Luke Chen  wrote on Friday, September 20, 2024 at 4:22 PM:

Thanks Josep!
+1

Luke

On Fri, Sep 20, 2024 at 3:36 PM Josep Prat  wrote:

Hey folks,

I'd like to volunteer to be the release manager for a bug fix release of
the 3.8 series. This will be the first bug fix release and will be version
3.8.1.

If no one has any objections, I will send out a release plan on 2024/09/23
that includes a list of all of the fixes we are targeting for 3.8.1 along
with a timeline (aiming probably for a release happening at the beginning
of October).

Thanks!

--
*Josep Prat*
Open Source Engineering Director, *Aiven*
josep.p...@aiven.io   |   +491715557497
aiven.io
*Aiven Deutschland GmbH*
Alexanderufer 3-7, 10117 Berlin
Geschäftsführer: Oskari Saarenmaa, Hannu Valtonen,
Anna Richardson, Kenneth Chen
Amtsgericht Charlottenburg, HRB 209739 B








--
*Josep Prat*
Open Source Engineering Director, *Aiven*
josep.p...@aiven.io   |   +491715557497
aiven.io
*Aiven Deutschland GmbH*
Alexanderufer 3-7, 10117 Berlin
Geschäftsführer: Oskari Saarenmaa, Hannu Valtonen,
Anna Richardson, Kenneth Chen
Amtsgericht Charlottenburg, HRB 209739 B






Re: New release branch 3.9

2024-10-09 Thread Josep Prat
Thanks Colin,

I reviewed the PR you shared.

Best,

On Wed, Oct 9, 2024 at 6:26 PM Colin McCabe  wrote:

> Hi Josep,
>
> I marked this as a blocker. We will take a look. I also found some
> dependencies that need to be updated due to CVEs. See
> github.com/apache/kafka/pull/17436
>
> best,
> Colin
>
> On Wed, Oct 9, 2024, at 05:52, Josep Prat wrote:
> > Hi Colin,
> >
> > I want to raise a bug we encountered while testing 3.9 RC0. You can find
> > the report here: https://issues.apache.org/jira/browse/KAFKA-17751
> > Its current severity is "high", but IMO it might even be a blocker. What
> do
> > you think?
> >
> > Best,
> >
> > On Fri, Oct 4, 2024 at 5:18 PM José Armando García Sancio
> >  wrote:
> >
> >> Thanks Colin.
> >>
> >> KAFKA-16927 has been merged to trunk and the 3.9 branch.
> >>
> >> --
> >> -José
> >>
> >
> >
> > --
> > *Josep Prat*
> > Open Source Engineering Director, *Aiven*
> > josep.p...@aiven.io   |   +491715557497
> > aiven.io
> > *Aiven Deutschland GmbH*
> > Alexanderufer 3-7, 10117 Berlin
> > Geschäftsführer: Oskari Saarenmaa, Hannu Valtonen,
> > Anna Richardson, Kenneth Chen
> > Amtsgericht Charlottenburg, HRB 209739 B
>


-- 
*Josep Prat*
Open Source Engineering Director, *Aiven*
josep.p...@aiven.io   |   +491715557497
aiven.io
*Aiven Deutschland GmbH*
Alexanderufer 3-7, 10117 Berlin
Geschäftsführer: Oskari Saarenmaa, Hannu Valtonen,
Anna Richardson, Kenneth Chen
Amtsgericht Charlottenburg, HRB 209739 B


[jira] [Resolved] (KAFKA-17703) Move DelayedActionsQueue outside DelayedShareFetch

2024-10-09 Thread Jun Rao (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao resolved KAFKA-17703.
-
Fix Version/s: 4.0.0
   Resolution: Fixed

merged the PR to trunk.

> Move DelayedActionsQueue outside DelayedShareFetch
> --
>
> Key: KAFKA-17703
> URL: https://issues.apache.org/jira/browse/KAFKA-17703
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Abhinav Dixit
>Assignee: Abhinav Dixit
>Priority: Major
> Fix For: 4.0.0
>
>
> In reference to comments 
> [https://github.com/apache/kafka/pull/17177#issuecomment-2392017658] and 
> [https://github.com/apache/kafka/pull/17177#issuecomment-2392108397], we 
> need to do the following:
> 1. Move ActionQueue outside DelayedShareFetch class to SharePartitionManager 
> where SharePartitionManager has the ability to add a delayed action to the 
> ActionQueue.
> 2. Add TopicPartition as a key for delayed share fetch along with 
> SharePartition (that is already present right now)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-17287) Add integration test for KafkaShareConsumer

2024-10-09 Thread Shivsundar R (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Shivsundar R resolved KAFKA-17287.
--
Resolution: Fixed

After several clean and green runs, the PR was merged.

> Add integration test for KafkaShareConsumer
> ---
>
> Key: KAFKA-17287
> URL: https://issues.apache.org/jira/browse/KAFKA-17287
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Andrew Schofield
>Assignee: Shivsundar R
>Priority: Major
>
> Add an integration test suite for testing KafkaShareConsumer.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: New release branch 3.9

2024-10-09 Thread José Armando García Sancio
Thanks Josep. I also agree that we should try to include this fix in
3.9.0. It is a regression that would cause CPU resources to be
wasted.

On Wed, Oct 9, 2024 at 8:53 AM Josep Prat  wrote:
>
> Hi Colin,
>
> I want to raise a bug we encountered while testing 3.9 RC0. You can find
> the report here: https://issues.apache.org/jira/browse/KAFKA-17751
> Its current severity is "high", but IMO it might even be a blocker. What do
> you think?
>
> Best,
>
> On Fri, Oct 4, 2024 at 5:18 PM José Armando García Sancio
>  wrote:
>
> > Thanks Colin.
> >
> > KAFKA-16927 has been merged to trunk and the 3.9 branch.
> >
> > --
> > -José
> >
>
>
> --
> *Josep Prat*
> Open Source Engineering Director, *Aiven*
> josep.p...@aiven.io   |   +491715557497
> aiven.io
> *Aiven Deutschland GmbH*
> Alexanderufer 3-7, 10117 Berlin
> Geschäftsführer: Oskari Saarenmaa, Hannu Valtonen,
> Anna Richardson, Kenneth Chen
> Amtsgericht Charlottenburg, HRB 209739 B



-- 
-José


[jira] [Resolved] (KAFKA-1207) Launch Kafka from within Apache Mesos

2024-10-09 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-1207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-1207.
---
Resolution: Won't Do

Apache Mesos has been retired for a few years already, closing this issue.

> Launch Kafka from within Apache Mesos
> -
>
> Key: KAFKA-1207
> URL: https://issues.apache.org/jira/browse/KAFKA-1207
> Project: Kafka
>  Issue Type: Bug
>  Components: packaging
>Reporter: Joe Stein
>Priority: Major
>  Labels: mesos
> Attachments: KAFKA-1207.patch, KAFKA-1207_2014-01-19_00:04:58.patch, 
> KAFKA-1207_2014-01-19_00:48:49.patch
>
>
> There are a few components to this.
> 1) The Framework: This is going to be responsible for starting up and 
> managing the failover of brokers within the mesos cluster. This will have 
> to get some Kafka-focused parameters for launching new replica brokers, 
> moving topics and partitions around based on what is happening in the grid 
> through time.
> 2) The Scheduler: This is what is going to ask for resources for Kafka 
> brokers (new ones, replacement ones, commissioned ones) and other operations 
> such as stopping tasks (decommissioning brokers). I think this should also 
> expose a user interface (or at least a REST API) for producers and consumers 
> so we can have producers and consumers run inside of the mesos cluster if 
> folks want (just add the jar).
> 3) The Executor: This is the task launcher. It launches tasks and kills them 
> off.
> 4) Sharing data between Scheduler and Executor: I looked at a few 
> implementations of this. I like parts of the Storm implementation but think 
> using the environment variable 
> ExecutorInfo.CommandInfo.Environment.Variables[] is the best shot. We can 
> have a command line bin/kafka-mesos-scheduler-start.sh that would build the 
> contrib project if not already built and support conf/server.properties to 
> start.
> The Framework and operating Scheduler would run on an administrative node. 
> I am probably going to hook Apache Curator into it so it can do its own 
> failover to another follower. Running more than 2 should be sufficient as 
> long as it can bring back its state (e.g. from zk). I think we can add this 
> in after everything is working.
> Additional detail can be found on the Wiki page 
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=38570672



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-8524) Zookeeper Acl Sensitive Path Extension

2024-10-09 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8524?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-8524.
---
Resolution: Won't Do

We are now removing ZooKeeper support so closing this issue.

> Zookeeper Acl Sensitive Path Extension
> --
>
> Key: KAFKA-8524
> URL: https://issues.apache.org/jira/browse/KAFKA-8524
> Project: Kafka
>  Issue Type: Bug
>  Components: zkclient
>Affects Versions: 1.1.0, 2.2.1
>Reporter: sebastien diaz
>Priority: Major
>  Labels: path, zkcli, zookeeper
>
> There is too much readable config in ZooKeeper, such as /brokers, /controller, 
> /kafka-acl, etc.
> As ZooKeeper can be accessed by other projects/users, the security should be 
> extended to ZooKeeper ACLs properly.
> We should have the possibility to set these paths by configuration and not 
> (as it is today) in the code.
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-14234) /admin/delete_topics is not in the list of zookeeper watchers

2024-10-09 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-14234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-14234.

Resolution: Cannot Reproduce

We are now removing ZooKeeper support so closing this issue.

> /admin/delete_topics is not in the list of zookeeper watchers
> -
>
> Key: KAFKA-14234
> URL: https://issues.apache.org/jira/browse/KAFKA-14234
> Project: Kafka
>  Issue Type: Bug
>  Components: controller
>Affects Versions: 3.2.1
>Reporter: Yan Xue
>Priority: Minor
>
> I deployed the Kafka cluster on Kubernetes and am trying to figure out how 
> topic deletion works. I know the Kafka controller has the topic deletion manager 
> which watches for node changes in ZooKeeper. Whenever a topic is deleted, 
> the manager is triggered. I expected to see that the {{/admin/delete_topics}} path 
> is in the watcher list. However, I didn't find it. Sample output:
> root@kafka-broker-2:/opt/kafka# echo wchc | nc ZOOKEEPER_IP 2181
> 0x20010021139
>     /admin/preferred_replica_election
>     /brokers/ids/0
>     /brokers/ids/1
>     /brokers/ids/2
>     /brokers/topics/__consumer_offsets
>     /brokers/ids/3
>     /brokers/ids/4
>     /controller
>     /admin/reassign_partitions
>     /brokers/topics/test-test
>     /feature
> 0x200100211390001
>     /controller
>     /feature
> 0x1631f9
>     /controller
>     /feature
>  
> Even though I can delete the topic, I am confused about the output.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-13774) AclAuthorizer should handle it a bit more gracefully if zookeeper.connect is null

2024-10-09 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13774?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-13774.

Resolution: Won't Fix

We are now removing ZooKeeper support so closing this issue.

> AclAuthorizer should handle it a bit more gracefully if zookeeper.connect is 
> null
> -
>
> Key: KAFKA-13774
> URL: https://issues.apache.org/jira/browse/KAFKA-13774
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Colin McCabe
>Priority: Minor
>  Labels: kip-500
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-17747) Trigger rebalance on rack topology changes

2024-10-09 Thread David Jacot (Jira)
David Jacot created KAFKA-17747:
---

 Summary: Trigger rebalance on rack topology changes
 Key: KAFKA-17747
 URL: https://issues.apache.org/jira/browse/KAFKA-17747
 Project: Kafka
  Issue Type: Sub-task
Reporter: David Jacot


At the moment, we trigger a rebalance of the consumer group only when the 
number of partitions of a topic has changed (e.g. increased the number of 
partitions). We tried to extend this mechanism to also take racks into 
consideration (see [this|https://github.com/apache/kafka/pull/17233]) but it 
turned out to be too expensive from a memory and CPU perspective. It was also 
bad because we ended up duplicating much of the information already present in 
the Metadata image. We should design a better way to do this, and it may require 
a KIP depending on the solution.

I have two high level ideas in mind:
 # One way would be to include a new epoch in the topic metadata stored in the 
controller. This new epoch could be incremented whenever the topology of the 
topic has changed (e.g. adding partitions, reassignment, etc.). Then we could 
store the epoch in the group coordinator to detect changes and rebalance the 
group. The downside of this approach is that it couples the group coordinator to 
the controller.
 # Another way would be to come up with a way to compute a hash of the current 
topology of the topic(s). The digest would then be stored in the group 
coordinator and used to detect changes (see the sketch below). The downside of 
this is that it requires re-computing the hash to determine whether there is a 
change or not. Option 1) would be a bit more efficient because the controller 
knows when the epoch must be bumped.

We should explore those ideas and possibly other ones.
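
To make idea 2 concrete, here is a minimal, hypothetical Java sketch of what 
hashing a topic's "topology" (partition count plus the racks of each partition's 
replicas) could look like. The class, method and topic names are illustrative 
assumptions only and do not reflect any actual Kafka code.

{code}
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.TreeMap;

public class TopicTopologyHash {

    // partitionId -> racks of the replicas hosting that partition.
    static long topologyHash(String topic, Map<Integer, List<String>> partitionRacks) {
        // TreeMap gives a deterministic iteration order, so the same topology
        // always produces the same digest regardless of map ordering.
        long hash = topic.hashCode();
        for (Map.Entry<Integer, List<String>> e : new TreeMap<>(partitionRacks).entrySet()) {
            List<String> racks = e.getValue().stream().sorted().toList();
            hash = 31 * hash + Objects.hash(e.getKey(), racks);
        }
        return hash;
    }

    public static void main(String[] args) {
        Map<Integer, List<String>> before = Map.of(
                0, List.of("rack-a", "rack-b"),
                1, List.of("rack-b", "rack-c"));
        Map<Integer, List<String>> after = Map.of(
                0, List.of("rack-a", "rack-c"),   // one replica moved to another rack
                1, List.of("rack-b", "rack-c"));
        // A differing digest is what would signal the group coordinator to rebalance.
        System.out.println(topologyHash("orders", before) != topologyHash("orders", after));
    }
}
{code}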

 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-17745) Move static request validation from GroupMetadataManager to GroupCoordinatorService

2024-10-09 Thread David Jacot (Jira)
David Jacot created KAFKA-17745:
---

 Summary: Move static request validation from GroupMetadataManager 
to GroupCoordinatorService
 Key: KAFKA-17745
 URL: https://issues.apache.org/jira/browse/KAFKA-17745
 Project: Kafka
  Issue Type: Sub-task
Reporter: David Jacot


At the moment, we validate the request in the GroupMetadataManager. It means 
that the validation runs within the group coordinator threads. In case of rogue 
requests, it would unnecessarily use resources from those threads. It would be 
better to do the static validation in the GroupCoordinatorService while still 
running in the request handler threads. I haven't looked into all the details 
but it seems feasible.
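
As a rough illustration of the intent (not the actual GroupCoordinatorService 
code), the hypothetical Java sketch below runs the cheap static checks 
synchronously on the caller's thread and fails fast, so only requests that pass 
validation are handed off to the asynchronous coordinator runtime. All names are 
assumptions.

{code}
import java.util.concurrent.CompletableFuture;

public class GroupServiceSketch {

    // Minimal stand-in for a heartbeat request.
    record HeartbeatRequest(String groupId, String memberId) {}

    CompletableFuture<String> heartbeat(HeartbeatRequest request) {
        // Static validation needs no coordinator state, so it can run here,
        // on the request handler thread, rejecting rogue requests early.
        if (request.groupId() == null || request.groupId().isEmpty()) {
            return CompletableFuture.failedFuture(
                    new IllegalArgumentException("groupId must be non-empty"));
        }
        if (request.memberId() == null) {
            return CompletableFuture.failedFuture(
                    new IllegalArgumentException("memberId must be provided"));
        }
        // Only valid requests are scheduled onto the (simulated) coordinator threads.
        return CompletableFuture.supplyAsync(() -> "handled group " + request.groupId());
    }
}
{code}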



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-7349) Long Disk Writes cause Zookeeper Disconnects

2024-10-09 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-7349.
---
Resolution: Won't Fix

We are now removing ZooKeeper support so closing this issue.

> Long Disk Writes cause Zookeeper Disconnects
> 
>
> Key: KAFKA-7349
> URL: https://issues.apache.org/jira/browse/KAFKA-7349
> Project: Kafka
>  Issue Type: Bug
>  Components: zkclient
>Affects Versions: 0.11.0.1
>Reporter: Adam Kafka
>Priority: Minor
> Attachments: SpikeInWriteTime.png
>
>
> We run our Kafka cluster on a cloud service provider. As a consequence, we 
> notice a large tail latency write time that is out of our control. Some 
> writes take on the order of seconds. We have noticed that often these long 
> write times are correlated with subsequent Zookeeper disconnects from the 
> brokers. It appears that during the long write time, the Zookeeper heartbeat 
> thread does not get scheduled CPU time, resulting in a long gap of heartbeats 
> sent. After the write, the ZK thread does get scheduled CPU time, but it 
> detects that it has not received a heartbeat from Zookeeper in a while, so it 
> drops its connection then rejoins the cluster.
> Note that the timeout reported is inconsistent with the timeout as set by the 
> configuration ({{zookeeper.session.timeout.ms}} = default of 6 seconds). We 
> have seen a range of values reported here, including 5950ms (less than 
> threshold), 12032ms (double the threshold), 25999ms (much larger than the 
> threshold).
> We noticed that during a service degradation of the storage service of our 
> cloud provider, these Zookeeper disconnects increased drastically in 
> frequency. 
> We are hoping there is a way to decouple these components. Do you agree with 
> our diagnosis that the ZK disconnects are occurring due to thread contention 
> caused by long disk writes? Perhaps the ZK thread could be scheduled at a 
> higher priority? Do you have any suggestions for how to avoid the ZK 
> disconnects?
> Here is an example of one of these events:
> Logs on the Broker:
> {code}
> [2018-08-25 04:10:19,695] DEBUG Got ping response for sessionid: 
> 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn)
> [2018-08-25 04:10:21,697] DEBUG Got ping response for sessionid: 
> 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn)
> [2018-08-25 04:10:23,700] DEBUG Got ping response for sessionid: 
> 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn)
> [2018-08-25 04:10:25,701] DEBUG Got ping response for sessionid: 
> 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn)
> [2018-08-25 04:10:27,702] DEBUG Got ping response for sessionid: 
> 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn)
> [2018-08-25 04:10:29,704] DEBUG Got ping response for sessionid: 
> 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn)
> [2018-08-25 04:10:31,707] DEBUG Got ping response for sessionid: 
> 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn)
> [2018-08-25 04:10:33,709] DEBUG Got ping response for sessionid: 
> 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn)
> [2018-08-25 04:10:35,712] DEBUG Got ping response for sessionid: 
> 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn)
> [2018-08-25 04:10:37,714] DEBUG Got ping response for sessionid: 
> 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn)
> [2018-08-25 04:10:39,716] DEBUG Got ping response for sessionid: 
> 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn)
> [2018-08-25 04:10:41,719] DEBUG Got ping response for sessionid: 
> 0x36202ab4337002c after 1ms (org.apache.zookeeper.ClientCnxn)
> ...
> [2018-08-25 04:10:53,752] WARN Client session timed out, have not heard from 
> server in 12032ms for sessionid 0x36202ab4337002c 
> (org.apache.zookeeper.ClientCnxn)
> [2018-08-25 04:10:53,754] INFO Client session timed out, have not heard from 
> server in 12032ms for sessionid 0x36202ab4337002c, closing socket connection 
> and attempting reconnect (org.apache.zookeeper.ClientCnxn)
> [2018-08-25 04:10:53,920] INFO zookeeper state changed (Disconnected) 
> (org.I0Itec.zkclient.ZkClient)
> [2018-08-25 04:10:53,920] INFO Waiting for keeper state SyncConnected 
> (org.I0Itec.zkclient.ZkClient)
> ...
> {code}
> GC logs during the same time (demonstrating this is not just a long GC):
> {code}
> 2018-08-25T04:10:36.434+: 35150.779: [GC (Allocation Failure)  
> 3074119K->2529089K(6223360K), 0.0137342 secs]
> 2018-08-25T04:10:37.367+: 35151.713: [GC (Allocation Failure)  
> 3074433K->2528524K(6223360K), 0.0127938 secs]
> 2018-08-25T04:10:38.274+: 35152.620: [GC (Allocation Failure)  
> 3073868K->2528357K(6223360K), 0.0131040 secs]
> 2018-08-25T04:10:39.220+: 35153.566: [GC (Allocation Failure)

[jira] [Resolved] (KAFKA-6062) Reduce topic partition count in kafka version 0.10.0.0

2024-10-09 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-6062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-6062.
---
Resolution: Cannot Reproduce

> Reduce topic partition count in kafka version 0.10.0.0
> --
>
> Key: KAFKA-6062
> URL: https://issues.apache.org/jira/browse/KAFKA-6062
> Project: Kafka
>  Issue Type: Task
>Reporter: Balu
>Priority: Major
>
> Using a kafka, zookeeper, schema repository cluster. The current partition count is 
> 10 and we have to make it 3. Can we do it?
> Appreciate the steps, even if data loss is involved.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-1155) Kafka server can miss zookeeper watches during long zkclient callbacks

2024-10-09 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-1155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-1155.
---
Resolution: Won't Fix

We are now removing ZooKeeper support so closing this issue.

> Kafka server can miss zookeeper watches during long zkclient callbacks
> --
>
> Key: KAFKA-1155
> URL: https://issues.apache.org/jira/browse/KAFKA-1155
> Project: Kafka
>  Issue Type: Bug
>  Components: controller
>Affects Versions: 0.8.0, 0.8.1, 0.8.2.0
>Reporter: Neha Narkhede
>Assignee: Neha Narkhede
>Priority: Critical
>  Labels: newbie++
>
> On getting a zookeeper watch, zkclient invokes the blocking user callback and 
> only re-registers the watch after the callback returns. This leaves a 
> possibly large window of time when Kafka has not registered for watches on 
> the desired zookeeper paths and hence can miss important state changes (on 
> the controller). In any case, it is worth noting that even though zookeeper 
> has a read-and-set-watch API, there can always be a window of time between 
> the watch being fired, the callback and the read-and-set-watch API call. Due 
> to the zkclient wrapper, it is difficult to handle this properly in the Kafka 
> code unless we directly use the zookeeper client. One way of getting around 
> this issue is to use timestamps on the paths and when a watch fires, check if 
> the timestamp in zk is different from the one in the callback handler.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-795) Improvements to PreferredReplicaLeaderElection tool

2024-10-09 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-795?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-795.
--
Resolution: Won't Do

We are now removing ZooKeeper support so closing this issue.

> Improvements to PreferredReplicaLeaderElection tool
> ---
>
> Key: KAFKA-795
> URL: https://issues.apache.org/jira/browse/KAFKA-795
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.8.0
>Reporter: Swapnil Ghike
>Assignee: Swapnil Ghike
>Priority: Major
>
> We can make some improvements to the PreferredReplicaLeaderElection tool:
> 1. Terminate the tool if a controller is not up and running. Currently we can 
> run the tool without having any broker running, which is kind of confusing. 
> 2. Should we delete /admin zookeeper path in PreferredReplicaLeaderElection 
> (and ReassignPartition) tool at the end? Otherwise the next run of the tool 
> complains that a replica election is already in progress. 
> 3. If there is an error, we can see it in controller.log. Should the tool also 
> throw an error?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-4418) Broker Leadership Election Fails If Missing ZK Path Raises Exception

2024-10-09 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-4418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-4418.
---
Resolution: Won't Fix

We are now removing ZooKeeper support so closing this issue.

> Broker Leadership Election Fails If Missing ZK Path Raises Exception
> 
>
> Key: KAFKA-4418
> URL: https://issues.apache.org/jira/browse/KAFKA-4418
> Project: Kafka
>  Issue Type: Bug
>  Components: zkclient
>Affects Versions: 0.9.0.1, 0.10.0.0, 0.10.0.1
>Reporter: Michael Pedersen
>Priority: Major
>  Labels: reliability
>
> Our Kafka cluster went down because a single node went down *and* a path in 
> Zookeeper was missing for one topic (/brokers/topics//partitions). 
> When this occurred, leadership election could not run, and produced a stack 
> trace that looked like this:
> Failed to start preferred replica election
> org.I0Itec.zkclient.exception.ZkNoNodeException: 
> org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = 
> NoNode for /brokers/topics/warandpeace/partitions
>   at org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:47)
>   at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:995)
>   at org.I0Itec.zkclient.ZkClient.getChildren(ZkClient.java:675)
>   at org.I0Itec.zkclient.ZkClient.getChildren(ZkClient.java:671)
>   at kafka.utils.ZkUtils.getChildren(ZkUtils.scala:537)
>   at 
> kafka.utils.ZkUtils$$anonfun$getAllPartitions$1.apply(ZkUtils.scala:817)
>   at 
> kafka.utils.ZkUtils$$anonfun$getAllPartitions$1.apply(ZkUtils.scala:816)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>   at 
> scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244)
>   at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>   at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>   at scala.collection.IterableLike$class.foreach(IterableLike.scala:72)
>   at scala.collection.AbstractIterable.foreach(Iterable.scala:54)
>   at scala.collection.TraversableLike$class.map(TraversableLike.scala:244)
>   at scala.collection.AbstractTraversable.map(Traversable.scala:105)
>   at kafka.utils.ZkUtils.getAllPartitions(ZkUtils.scala:816)
>   at 
> kafka.admin.PreferredReplicaLeaderElectionCommand$.main(PreferredReplicaLeaderElectionCommand.scala:64)
>   at 
> kafka.admin.PreferredReplicaLeaderElectionCommand.main(PreferredReplicaLeaderElectionCommand.scala)
> Caused by: org.apache.zookeeper.KeeperException$NoNodeException: 
> KeeperErrorCode = NoNode for /brokers/topics/warandpeace/partitions
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:111)
>   at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>   at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1472)
>   at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1500)
>   at org.I0Itec.zkclient.ZkConnection.getChildren(ZkConnection.java:114)
>   at org.I0Itec.zkclient.ZkClient$4.call(ZkClient.java:678)
>   at org.I0Itec.zkclient.ZkClient$4.call(ZkClient.java:675)
>   at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:985)
>   ... 16 more
> I have checked through the code a bit, and have found a quick place to 
> introduce a fix that would seem to allow the leadership election to continue. 
> Specifically, the function at 
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/utils/ZkUtils.scala#L633
>  does not handle possible exceptions. Wrapping a try/catch block here would 
> work, but could introduce a number of other problems:
> * If the code is used elsewhere, the exception might be needed at a higher 
> level to prevent something else.
> * Unless the exception is logged/reported somehow, no one will know this 
> problem exists, which makes debugging other problems harder.
> I'm sure there are other issues I'm not aware of, but those two come to mind 
> quickly. What would be the best route for getting this resolved quickly?
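
For illustration only, here is a hedged Java sketch of the kind of defensive 
handling described above, written against the plain ZooKeeper client rather than 
the old ZkUtils/zkclient wrapper; the class and method names are assumptions. A 
missing partitions znode is logged and treated as an empty partition list so the 
election can continue for the remaining topics instead of aborting.

{code}
import java.util.Collections;
import java.util.List;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooKeeper;

final class SafePartitionLookup {
    static List<String> partitionsForTopic(ZooKeeper zk, String topic)
            throws KeeperException, InterruptedException {
        String path = "/brokers/topics/" + topic + "/partitions";
        try {
            return zk.getChildren(path, false);
        } catch (KeeperException.NoNodeException e) {
            // Surface the problem instead of swallowing it silently, but do not
            // abort the whole preferred replica election because of one topic.
            System.err.println("Missing znode " + path + ", skipping topic " + topic);
            return Collections.emptyList();
        }
    }
}
{code}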



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-3685) Auto-generate ZooKeeper data structure wiki

2024-10-09 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-3685?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-3685.
---
Resolution: Won't Do

We are now removing ZooKeeper support so closing this issue.

> Auto-generate ZooKeeper data structure wiki
> ---
>
> Key: KAFKA-3685
> URL: https://issues.apache.org/jira/browse/KAFKA-3685
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Vahid Hashemian
>Assignee: Vahid Hashemian
>Priority: Minor
>
> The ZooKeeper data structure wiki page is located at 
> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+data+structures+in+Zookeeper.
>  This should be auto-generated and versioned according to various releases. A 
> similar auto-generate has been previously done for protocol. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-17748) Remove scala-java8-compat

2024-10-09 Thread TengYao Chi (Jira)
TengYao Chi created KAFKA-17748:
---

 Summary: Remove scala-java8-compat 
 Key: KAFKA-17748
 URL: https://issues.apache.org/jira/browse/KAFKA-17748
 Project: Kafka
  Issue Type: Sub-task
Reporter: TengYao Chi
Assignee: TengYao Chi
 Fix For: 4.0.0


We should remove the `scala-java8-compat` lib after dropping Java 8 and ZooKeeper support.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-3288) Update ZK dependency to 3.5.1 when it is marked as stable

2024-10-09 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-3288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-3288.
---
Resolution: Won't Do

We are now removing ZooKeeper support so closing this issue.

> Update ZK dependency to 3.5.1 when it is marked as stable
> -
>
> Key: KAFKA-3288
> URL: https://issues.apache.org/jira/browse/KAFKA-3288
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Ashish Singh
>Assignee: Ashish Singh
>Priority: Major
>
> When a stable version of ZK 3.5.1+ is released, update Kafka's ZK dependency.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-1918) System test for ZooKeeper quorum failure scenarios

2024-10-09 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-1918?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-1918.
---
Resolution: Won't Do

We are now removing ZooKeeper support so closing this issue.

> System test for ZooKeeper quorum failure scenarios
> --
>
> Key: KAFKA-1918
> URL: https://issues.apache.org/jira/browse/KAFKA-1918
> Project: Kafka
>  Issue Type: Test
>  Components: system tests
>Reporter: Omid Aladini
>Priority: Major
>
> Following up on the [conversation on the mailing 
> list|http://mail-archives.apache.org/mod_mbox/kafka-users/201502.mbox/%3CCAHwHRrX3SAWDUGF5LjU4rrMUsqv%3DtJcyjX7OENeL5C_V5o3tCw%40mail.gmail.com%3E],
>  the FAQ writes:
> {quote}
> Once the Zookeeper quorum is down, brokers could result in a bad state and 
> could not normally serve client requests, etc. Although when Zookeeper quorum 
> recovers, the Kafka brokers should be able to resume to normal state 
> automatically, _there are still a few +corner cases+ where they cannot and a 
> hard kill-and-recovery is required to bring it back to normal_. Hence it is 
> recommended to closely monitor your zookeeper cluster and provision it so 
> that it is performant.
> {quote}
> As ZK quorum failures are inevitable (due to rolling upgrades of ZK, leader 
> hardware failure, etc), it would be great to identify the corner cases (if 
> they still exist) and fix them if necessary.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-2762) Add a log appender for zookeeper in the log4j.properties file

2024-10-09 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-2762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-2762.
---
Resolution: Won't Do

We are now removing ZooKeeper support so closing this issue.

> Add a log appender for zookeeper in the log4j.properties file
> -
>
> Key: KAFKA-2762
> URL: https://issues.apache.org/jira/browse/KAFKA-2762
> Project: Kafka
>  Issue Type: Task
>  Components: log
>Reporter: Raju Bairishetti
>Assignee: Jay Kreps
>Priority: Major
>
> The log4j.properties file is present under the config directory. Right now, we are 
> using this log4j file from the daemon scripts, e.g. the kafka-server-start.sh and 
> zookeeper-server-start.sh scripts. I am not seeing any log appender for the zookeeper 
> daemon in the log4j properties.
>  *IMO, we should add a log appender for zookeeper in the log4j properties 
> file to redirect zookeeper logs to a different file.*
> Zookeeper logs will be printed on the console if we use the existing log4j 
> properties file. I agree that users can still update the log4j file, but it would 
> be nice to keep the appender so that every user does not need to modify the 
> log4j file.
> I have recently started using kafka. Could anyone please add me as a 
> contributor to this project?
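
For illustration, a rough sketch of what such an appender could look like in 
config/log4j.properties, using the log4j 1.x syntax Kafka used at the time. The 
file name, log directory property and conversion pattern are assumptions, not 
the project's actual configuration.

{code}
# Hypothetical dedicated appender for ZooKeeper logs.
log4j.appender.zookeeperAppender=org.apache.log4j.DailyRollingFileAppender
log4j.appender.zookeeperAppender.DatePattern='.'yyyy-MM-dd-HH
log4j.appender.zookeeperAppender.File=${kafka.logs.dir}/zookeeper.log
log4j.appender.zookeeperAppender.layout=org.apache.log4j.PatternLayout
log4j.appender.zookeeperAppender.layout.ConversionPattern=[%d] %p %m (%c)%n

# Route ZooKeeper logs to the dedicated file instead of the console.
log4j.logger.org.apache.zookeeper=INFO, zookeeperAppender
log4j.additivity.org.apache.zookeeper=false
{code}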



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-3287) Add over-wire encryption support between KAFKA and ZK

2024-10-09 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-3287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-3287.
---
Resolution: Duplicate

TLS between Kafka and ZooKeeper has been supported for many years, closing.

> Add over-wire encryption support between KAFKA and ZK
> -
>
> Key: KAFKA-3287
> URL: https://issues.apache.org/jira/browse/KAFKA-3287
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Ashish Singh
>Assignee: Ashish Singh
>Priority: Major
>
> ZOOKEEPER-2125 added support for SSL. After Kafka upgrades ZK's dependency to 
> 3.5.1+ or 3.6.0+, SSL support between kafka broker and zk can be added.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-5885) NPE in ZKClient

2024-10-09 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-5885?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-5885.
---
Resolution: Won't Fix

We are now removing ZooKeeper support so closing this issue.

> NPE in ZKClient
> ---
>
> Key: KAFKA-5885
> URL: https://issues.apache.org/jira/browse/KAFKA-5885
> Project: Kafka
>  Issue Type: Bug
>  Components: zkclient
>Affects Versions: 0.10.2.1
>Reporter: Dustin Cote
>Priority: Major
>
> A null znode for a topic (how this happened isn't totally clear, but that's not 
> the focus of this issue) can currently cause controller leader election to 
> fail. When looking at the broker logging, you can see there is a 
> NullPointerException emanating from the ZkClient:
> {code}
> [2017-09-11 00:00:21,441] ERROR Error while electing or becoming leader on 
> broker 1010674 (kafka.server.ZookeeperLeaderElector)
> kafka.common.KafkaException: Can't parse json string: null
> at kafka.utils.Json$.liftedTree1$1(Json.scala:40)
> at kafka.utils.Json$.parseFull(Json.scala:36)
> at 
> kafka.utils.ZkUtils$$anonfun$getReplicaAssignmentForTopics$1.apply(ZkUtils.scala:704)
> at 
> kafka.utils.ZkUtils$$anonfun$getReplicaAssignmentForTopics$1.apply(ZkUtils.scala:700)
> at 
> scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
> at 
> kafka.utils.ZkUtils.getReplicaAssignmentForTopics(ZkUtils.scala:700)
> at 
> kafka.controller.KafkaController.initializeControllerContext(KafkaController.scala:742)
> at 
> kafka.controller.KafkaController.onControllerFailover(KafkaController.scala:333)
> at 
> kafka.controller.KafkaController$$anonfun$1.apply$mcV$sp(KafkaController.scala:160)
> at 
> kafka.server.ZookeeperLeaderElector.elect(ZookeeperLeaderElector.scala:85)
> at 
> kafka.server.ZookeeperLeaderElector$LeaderChangeListener$$anonfun$handleDataDeleted$1.apply$mcZ$sp(ZookeeperLeaderElector.scala:154)
> at 
> kafka.server.ZookeeperLeaderElector$LeaderChangeListener$$anonfun$handleDataDeleted$1.apply(ZookeeperLeaderElector.scala:154)
> at 
> kafka.server.ZookeeperLeaderElector$LeaderChangeListener$$anonfun$handleDataDeleted$1.apply(ZookeeperLeaderElector.scala:154)
> at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:213)
> at 
> kafka.server.ZookeeperLeaderElector$LeaderChangeListener.handleDataDeleted(ZookeeperLeaderElector.scala:153)
> at org.I0Itec.zkclient.ZkClient$9.run(ZkClient.java:825)
> at org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:72)
> Caused by: java.lang.NullPointerException
> {code}
> Regardless of how a null topic znode ended up in ZooKeeper, we can probably 
> handle this better, at least by printing the path up to the problematic znode 
> in the log. The way this particular problem was resolved was by using the 
> ``kafka-topics`` command and seeing it persistently fail trying to read a 
> particular topic with this same message. Then deleting the null znode allowed 
> the leader election to complete.
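
As a hedged illustration of the "print the problematic path" suggestion (not the 
fix that actually went into Kafka), the small Java sketch below reports the full 
znode path when a topic znode holds no data, instead of letting JSON parsing 
fail with a bare NullPointerException. Names are assumptions.

{code}
import java.nio.charset.StandardCharsets;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooKeeper;

final class TopicZnodeCheck {
    static String readTopicAssignmentJson(ZooKeeper zk, String topic)
            throws KeeperException, InterruptedException {
        String path = "/brokers/topics/" + topic;
        byte[] data = zk.getData(path, false, null);
        if (data == null || data.length == 0) {
            // Fail with the exact path so the operator knows which znode to inspect.
            throw new IllegalStateException("Empty data for znode " + path
                    + "; consider deleting the corrupt znode and retrying");
        }
        return new String(data, StandardCharsets.UTF_8);
    }
}
{code}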



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-7122) Data is lost when ZooKeeper times out

2024-10-09 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-7122.
---
Resolution: Won't Fix

We are now removing ZooKeeper support so closing this issue.

> Data is lost when ZooKeeper times out
> -
>
> Key: KAFKA-7122
> URL: https://issues.apache.org/jira/browse/KAFKA-7122
> Project: Kafka
>  Issue Type: Bug
>  Components: core, replication
>Affects Versions: 0.11.0.2
>Reporter: Nick Lipple
>Priority: Blocker
>
> Noticed that a kafka cluster will lose data when a leader for a partition has 
> their zookeeper connection timeout.
> Sequence of events:
>  # Say broker A leads a partition followed by brokers B and C
>  # A ZK node has a network issue, happens to be the node used by broker A. 
> Lets say this happens at offset X
>  # Kafka Controller immediately selects broker C as the new partition leader
>  # Broker A does not timeout from zookeeper for another 4 seconds. Broker A 
> still thinks it is the leader, presumably accepting producer writes.
>  # Broker A detects the ZK timeout and leaves the ISR.
>  # Broker A reconnects to ZK, rejoins cluster as follower for partition
>  # Broker A truncates log to some offset Y such that Y > X. Broker A proceeds 
> to catch up normally and becomes an ISR
>  # ISRs for partition are now in an inconsistent state:
>  ## Broker C has all offsets X through Y plus everything after
>  ## Broker B has all offsets X through Y plus everything after
>  ## Broker A has offsets up to X and after Y. Everything between X and Y *IS 
> MISSING*
>  # Within 5 minutes, the controller triggers a preferred replica election making 
> Broker A the new leader for the partition (this is the default behavior)
> All consumers after step 9 will not receive any messages for offsets between 
> X and Y.
>  
> The root problem here seems to be broker A truncates to offset Y when 
> rejoining the cluster. It should be truncating further back to offset X to 
> prevent data loss
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-6584) Session expiration concurrent with ZooKeeper leadership failover may lead to broker registration failure

2024-10-09 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-6584?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-6584.
---
Resolution: Won't Fix

We are now removing ZooKeeper support so closing this issue.

> Session expiration concurrent with ZooKeeper leadership failover may lead to 
> broker registration failure
> 
>
> Key: KAFKA-6584
> URL: https://issues.apache.org/jira/browse/KAFKA-6584
> Project: Kafka
>  Issue Type: Bug
>  Components: zkclient
>Affects Versions: 1.0.0
>Reporter: Chris Thunes
>Priority: Major
>
> It seems that an edge case exists which can lead to sessions "un-expiring" 
> during a ZooKeeper leadership failover. Additional details can be found in 
> ZOOKEEPER-2985.
> This leads to a NODEXISTS error when attempting to re-create the ephemeral 
> brokers/ids/\{id} node in ZkUtils.registerBrokerInZk. We experienced this 
> issue on each node within a 3-node Kafka cluster running 1.0.0. All three 
> nodes continued running (producers and consumers appeared unaffected), but 
> none of the nodes were considered online and partition leadership could not 
> be re-assigned.
> I took a quick look at trunk and I believe the issue is still present, but 
> has moved into KafkaZkClient.checkedEphemeralCreate which will [raise an 
> error|https://github.com/apache/kafka/blob/90e0bbe/core/src/main/scala/kafka/zk/KafkaZkClient.scala#L1512]
>  when it finds that the broker/ids/\{id} node exists, but belongs to the old 
> (believed expired) session.
>  
> *NOTE:* KAFKA-7165 introduced a workaround to cope with the case described 
> here. We decided to keep this issue open to track the ZOOKEEPER-2985 status.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: New release branch 3.9

2024-10-09 Thread Josep Prat
Hi Colin,

I want to raise a bug we encountered while testing 3.9 RC0. You can find
the report here: https://issues.apache.org/jira/browse/KAFKA-17751
Its current severity is "high", but IMO it might even be a blocker. What do
you think?

Best,

On Fri, Oct 4, 2024 at 5:18 PM José Armando García Sancio
 wrote:

> Thanks Colin.
>
> KAFKA-16927 has been merged to trunk and the 3.9 branch.
>
> --
> -José
>


-- 
*Josep Prat*
Open Source Engineering Director, *Aiven*
josep.p...@aiven.io   |   +491715557497
aiven.io
*Aiven Deutschland GmbH*
Alexanderufer 3-7, 10117 Berlin
Geschäftsführer: Oskari Saarenmaa, Hannu Valtonen,
Anna Richardson, Kenneth Chen
Amtsgericht Charlottenburg, HRB 209739 B


[jira] [Resolved] (KAFKA-9147) zookeeper service not running

2024-10-09 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-9147.
---
Resolution: Invalid

This is not a Kafka issue, closing.

> zookeeper service not running 
> --
>
> Key: KAFKA-9147
> URL: https://issues.apache.org/jira/browse/KAFKA-9147
> Project: Kafka
>  Issue Type: Test
>Affects Versions: 2.3.0
> Environment: Ubuntu
>Reporter: parimal
>Priority: Major
>
> I was able to start the zookeeper service on standalone Ubuntu using the command
>  
> root@N-5CG73531RZ:/# /usr/local/zookeeper/bin/zkServer.sh start
> /usr/bin/java
> ZooKeeper JMX enabled by default
> Using config: /usr/local/zookeeper/bin/../conf/zoo.cfg
> Starting zookeeper ... STARTED
>  
> However, when I do ps -ef I don't see any zookeeper service running 
>  
> root@N-5CG73531RZ:/# ps -ef
> UID PID PPID C STIME TTY TIME CMD
> root 1 0 0 Nov04 ? 00:00:00 /init
> root 5 1 0 Nov04 tty1 00:00:00 /init
> pgarg00 6 5 0 Nov04 tty1 00:00:00 -bash
> root 2861 6 0 Nov04 tty1 00:00:00 sudo -i
> root 2862 2861 0 Nov04 tty1 00:00:03 -bash
> root 5347 1 0 18:24 ? 00:00:00 /usr/sbin/sshd
> root 5367 1 0 18:25 ? 00:00:00 /usr/sbin/inetd
> root 8950 2862 0 19:15 tty1 00:00:00 ps -ef
>  
> Also, when I do telnet, the connection is refused 
> root@N-5CG73531RZ:/# telnet localhost 2181
> Trying 127.0.0.1...
> telnet: Unable to connect to remote host: Connection refused
>  
> Can you please help me?
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-7546) Java implementation for Authorizer

2024-10-09 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-7546.
---
Resolution: Not A Problem

The org.apache.kafka.server.authorizer.Authorizer java interface exists to 
implement custom authorizers that don't depend on ZooKeeper, closing.

> Java implementation for Authorizer
> --
>
> Key: KAFKA-7546
> URL: https://issues.apache.org/jira/browse/KAFKA-7546
> Project: Kafka
>  Issue Type: Improvement
>  Components: security
>Reporter: Pradeep Bansal
>Priority: Major
> Attachments: AuthorizerImpl.PNG
>
>
> I am using kafka with authentication and authorization. I wanted to plug in my 
> own implementation of Authorizer which doesn't use zookeeper but instead has 
> permission mappings in a SQL database. Is it possible to write Authorizer code 
> in Java?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-7721) Connection to zookeeper refused

2024-10-09 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-7721.
---
Resolution: Cannot Reproduce

We are now removing ZooKeeper support so closing this issue.

> Connection to zookeeper refused
> ---
>
> Key: KAFKA-7721
> URL: https://issues.apache.org/jira/browse/KAFKA-7721
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 2.1.0
> Environment: Dockerized containers - kubernetes 1.9
>Reporter: Mohammad Etemad
>Priority: Major
>
> Kafka throws an exception when trying to connect to zookeeper. This happens when 
> the zookeeper connection is lost and recovered. Kafka seems to be stuck in a loop 
> where it cannot renew the connection. Here are the logs:
> 2018-12-11 13:52:52,905] INFO Opening socket connection to server 
> zookeeper-0.zookeeper.logging.svc.cluster.local/10.38.128.12:2181. Will not 
> attempt to authenticate using SASL (unknown error) 
> (org.apache.zookeeper.ClientCnxn)
> [2018-12-11 13:52:52,906] WARN Session 0x1001443ad77000f for server null, 
> unexpected error, closing socket connection and attempting reconnect 
> (org.apache.zookeeper.ClientCnxn)
> java.net.ConnectException: Connection refused
>  at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>  at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
>  at 
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
>  at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141)
> On the zookeeper side it can be seen that kafka connection is established. 
> Here are the logs:
> 2018-12-11 13:53:44,969 [myid:] - INFO 
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@215] - 
> Accepted socket connection from /10.38.128.8:46066
> 2018-12-11 13:53:44,976 [myid:] - INFO 
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@938] - Client 
> attempting to establish new session at /10.38.128.8:46066
> 2018-12-11 13:53:45,005 [myid:] - INFO [SyncThread:0:ZooKeeperServer@683] - 
> Established session 0x10060ff12a58dc0 with negotiated timeout 3 for 
> client /10.38.128.8:46066
> 2018-12-11 13:53:45,071 [myid:] - INFO [ProcessThread(sid:0 
> cport:2181)::PrepRequestProcessor@487] - Processed session termination for 
> sessionid: 0x10060ff12a58dc0
> 2018-12-11 13:53:45,077 [myid:] - INFO 
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1040] - Closed 
> socket connection for client /10.38.128.8:46066 which had sessionid 
> 0x10060ff12a58dc0
> 2018-12-11 13:53:47,119 [myid:] - INFO 
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@215] - 
> Accepted socket connection from /10.36.0.8:48798
> 2018-12-11 13:53:47,124 [myid:] - INFO 
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:ZooKeeperServer@938] - Client 
> attempting to establish new session at /10.36.0.8:48798
> 2018-12-11 13:53:47,134 [myid:] - INFO [SyncThread:0:ZooKeeperServer@683] - 
> Established session 0x10060ff12a58dc1 with negotiated timeout 3 for 
> client /10.36.0.8:48798
> 2018-12-11 13:53:47,582 [myid:] - INFO [ProcessThread(sid:0 
> cport:2181)::PrepRequestProcessor@487] - Processed session termination for 
> sessionid: 0x10060ff12a58dc1
> 2018-12-11 13:53:47,592 [myid:] - INFO 
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1040] - Closed 
> socket connection for client /10.36.0.8:48798 which had sessionid 
> 0x10060ff12a58dc1



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-8708) Zookeeper Session expired either before or while waiting for connection

2024-10-09 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-8708.
---
Resolution: Won't Fix

We are now removing ZooKeeper support so closing this issue.

> Zookeeper Session expired either before or while waiting for connection
> ---
>
> Key: KAFKA-8708
> URL: https://issues.apache.org/jira/browse/KAFKA-8708
> Project: Kafka
>  Issue Type: Bug
>  Components: zkclient
>Affects Versions: 2.0.1
>Reporter: Chethan Bheemaiah
>Priority: Major
>
> Recently we encountered an issue in one of our kafka clusters. One of the 
> nodes went down and was not joining the kafka cluster on restart. We 
> observed Session expired error messages in server.log.
> Below is one message
> ERROR kafka.common.ZkNodeChangeNotificationListener: Error while processing 
> notification change for path = /config/changes
> kafka.zookeeper.ZooKeeperClientExpiredException: Session expired either 
> before or while waiting for connection
> at 
> kafka.zookeeper.ZooKeeperClient$$anonfun$kafka$zookeeper$ZooKeeperClient$$waitUntilConnected$1.apply$mcV$sp(ZooKeeperClient.scala:238)
> at 
> kafka.zookeeper.ZooKeeperClient$$anonfun$kafka$zookeeper$ZooKeeperClient$$waitUntilConnected$1.apply(ZooKeeperClient.scala:226)
> at 
> kafka.zookeeper.ZooKeeperClient$$anonfun$kafka$zookeeper$ZooKeeperClient$$waitUntilConnected$1.apply(ZooKeeperClient.scala:226)
> at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:251)
> at 
> kafka.zookeeper.ZooKeeperClient.kafka$zookeeper$ZooKeeperClient$$waitUntilConnected(ZooKeeperClient.scala:226)
> at 
> kafka.zookeeper.ZooKeeperClient$$anonfun$waitUntilConnected$1.apply$mcV$sp(ZooKeeperClient.scala:220)
> at 
> kafka.zookeeper.ZooKeeperClient$$anonfun$waitUntilConnected$1.apply(ZooKeeperClient.scala:220)
> at 
> kafka.zookeeper.ZooKeeperClient$$anonfun$waitUntilConnected$1.apply(ZooKeeperClient.scala:220)
> at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:251)
> at 
> kafka.zookeeper.ZooKeeperClient.waitUntilConnected(ZooKeeperClient.scala:219)
> at 
> kafka.zk.KafkaZkClient.retryRequestsUntilConnected(KafkaZkClient.scala:1510)
> at 
> kafka.zk.KafkaZkClient.kafka$zk$KafkaZkClient$$retryRequestUntilConnected(KafkaZkClient.scala:1486)
> at kafka.zk.KafkaZkClient.getChildren(KafkaZkClient.scala:585)
> at 
> kafka.common.ZkNodeChangeNotificationListener.kafka$common$ZkNodeChangeNotificationListener$$processNotifications(ZkNodeChangeNotificationListener.scala:82)
> at 
> kafka.common.ZkNodeChangeNotificationListener$ChangeNotification.process(ZkNodeChangeNotificationListener.scala:119)
> at 
> kafka.common.ZkNodeChangeNotificationListener$ChangeEventProcessThread.doWork(ZkNodeChangeNotificationListener.scala:145)
> at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:82)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-8935) Please update zookeeper in the next release

2024-10-09 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8935?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-8935.
---
Resolution: Invalid

Kafka currently uses ZooKeeper 3.8.4, closing.

> Please update zookeeper in the next release
> ---
>
> Key: KAFKA-8935
> URL: https://issues.apache.org/jira/browse/KAFKA-8935
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Agostino Sarubbo
>Priority: Major
>
> Please update zookeeper in the next release. At the moment, 2.3.0 ships 
> zookeeper-3.4.14, which does not support SSL. Thanks



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-8707) Zookeeper Session expired either before or while waiting for connection

2024-10-09 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8707?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-8707.
---
Resolution: Duplicate

> Zookeeper Session expired either before or while waiting for connection
> ---
>
> Key: KAFKA-8707
> URL: https://issues.apache.org/jira/browse/KAFKA-8707
> Project: Kafka
>  Issue Type: Bug
>  Components: zkclient
>Affects Versions: 2.0.1
>Reporter: Chethan Bheemaiah
>Priority: Major
>
> Recently we encountered an issue in one of our kafka clusters. One of the 
> nodes went down and was not joining the kafka cluster on restart. We 
> observed Session expired error messages in server.log.
> Below is one message
> ERROR kafka.common.ZkNodeChangeNotificationListener: Error while processing 
> notification change for path = /config/changes
> kafka.zookeeper.ZooKeeperClientExpiredException: Session expired either 
> before or while waiting for connection
> at 
> kafka.zookeeper.ZooKeeperClient$$anonfun$kafka$zookeeper$ZooKeeperClient$$waitUntilConnected$1.apply$mcV$sp(ZooKeeperClient.scala:238)
> at 
> kafka.zookeeper.ZooKeeperClient$$anonfun$kafka$zookeeper$ZooKeeperClient$$waitUntilConnected$1.apply(ZooKeeperClient.scala:226)
> at 
> kafka.zookeeper.ZooKeeperClient$$anonfun$kafka$zookeeper$ZooKeeperClient$$waitUntilConnected$1.apply(ZooKeeperClient.scala:226)
> at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:251)
> at 
> kafka.zookeeper.ZooKeeperClient.kafka$zookeeper$ZooKeeperClient$$waitUntilConnected(ZooKeeperClient.scala:226)
> at 
> kafka.zookeeper.ZooKeeperClient$$anonfun$waitUntilConnected$1.apply$mcV$sp(ZooKeeperClient.scala:220)
> at 
> kafka.zookeeper.ZooKeeperClient$$anonfun$waitUntilConnected$1.apply(ZooKeeperClient.scala:220)
> at 
> kafka.zookeeper.ZooKeeperClient$$anonfun$waitUntilConnected$1.apply(ZooKeeperClient.scala:220)
> at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:251)
> at 
> kafka.zookeeper.ZooKeeperClient.waitUntilConnected(ZooKeeperClient.scala:219)
> at 
> kafka.zk.KafkaZkClient.retryRequestsUntilConnected(KafkaZkClient.scala:1510)
> at 
> kafka.zk.KafkaZkClient.kafka$zk$KafkaZkClient$$retryRequestUntilConnected(KafkaZkClient.scala:1486)
> at kafka.zk.KafkaZkClient.getChildren(KafkaZkClient.scala:585)
> at 
> kafka.common.ZkNodeChangeNotificationListener.kafka$common$ZkNodeChangeNotificationListener$$processNotifications(ZkNodeChangeNotificationListener.scala:82)
> at 
> kafka.common.ZkNodeChangeNotificationListener$ChangeNotification.process(ZkNodeChangeNotificationListener.scala:119)
> at 
> kafka.common.ZkNodeChangeNotificationListener$ChangeEventProcessThread.doWork(ZkNodeChangeNotificationListener.scala:145)
> at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:82)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-2404) Delete config znode when config values are empty

2024-10-09 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-2404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-2404.
---
Resolution: Won't Do

We are now removing ZooKeeper support so closing this issue.

> Delete config znode when config values are empty
> 
>
> Key: KAFKA-2404
> URL: https://issues.apache.org/jira/browse/KAFKA-2404
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Aditya Auradkar
>Assignee: Aditya Auradkar
>Priority: Major
>
> Jun's comment from KAFKA-2205:
> "Currently, if I add client config and then remove it, the clientid still 
> shows up during describe, but with empty config values. We probably should 
> delete the path when there is no overwritten values. Could you do that in a 
> follow up patch?
> bin/kafka-configs.sh --zookeeper localhost:2181 --entity-type client 
> --describe 
> Configs for client:client1 are"



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-2204) Dynamic Configuration via ZK

2024-10-09 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-2204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-2204.
---
Resolution: Won't Do

We are now removing ZooKeeper support so closing this issue.

> Dynamic Configuration via ZK
> 
>
> Key: KAFKA-2204
> URL: https://issues.apache.org/jira/browse/KAFKA-2204
> Project: Kafka
>  Issue Type: New Feature
>Reporter: Aditya Auradkar
>Assignee: Aditya Auradkar
>Priority: Major
>
> Parent ticket to track all jiras for dynamic configuration via zookeeper.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-7629) Mirror maker goes into infinite loop

2024-10-09 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-7629.
---
Resolution: Won't Fix

The original MirrorMaker tool has been removed from Kafka. You should now use 
the Connect based MirrorMaker: 
https://kafka.apache.org/documentation/#georeplication

> Mirror maker goes into infinite loop
> 
>
> Key: KAFKA-7629
> URL: https://issues.apache.org/jira/browse/KAFKA-7629
> Project: Kafka
>  Issue Type: Bug
>  Components: mirrormaker
>Affects Versions: 2.0.0
> Environment: local
>Reporter: Darshan Mehta
>Priority: Major
>
> *Setup:*
> I have 2 Kafka containers running the Spotify Kafka image 
> [https://hub.docker.com/r/spotify/kafka]
> Config:
> Image 1:
>  * host: kafka1
>  * zk port : 2181
>  * broker port : 9092
> Image 2:
>  * host: kafka2
>  * zk port : 1181
>  * broker port : 8092
> Producer Properties for Mirror maker: 
> {code:java}
> bootstrap.servers=kafka2:8092
> {code}
> Consumer Properties for Mirror maker: 
> {code:java}
> bootstrap.servers=kafka1:9092
> group.id=test-consumer-group
> exclude.internal.topics=true
> {code}
>  
> *Steps to replicate :*
>  # Start mirror maker with following command : 
> {code:java}
> $KAFKA_INSTALLATION_DIR/bin/kafka-mirror-maker.sh --producer.config 
>  --consumer.config  
> --num.streams 1 --whitelist topic-1
> {code}
>  # Start local kafka console consumer to listen to topic-1 for kafka2:8092 
> {code:java}
> $KAFKA_INSTALLATION_DIR/bin/kafka-console-consumer.sh --bootstrap-server 
> kafka2:8092 --topic topic-1
> {code}
>  # Produce an event to kafka1:9092 - topic-1  -> It will be printed on the 
> console in Step 2
>  # Stop mirror maker with ctrl+C (started in step 1)
>  # Restart mirror maker with same command
>  # Produce an event onto the same topic (i.e. repeat step 3)
>  # Both source and destination will be flooded with the same messages until 
> mirror maker is stopped
> Surprisingly, the source kafka also gets flooded with the same message. I believe 
> that, when restarted, the mirror maker is unable to read its state?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-10726) How to detect heartbeat failure between broker/zookeeper leader

2024-10-09 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-10726.

Resolution: Information Provided

We are now removing ZooKeeper support so closing this issue.

> How to detect heartbeat failure between broker/zookeeper leader
> ---
>
> Key: KAFKA-10726
> URL: https://issues.apache.org/jira/browse/KAFKA-10726
> Project: Kafka
>  Issue Type: Bug
>  Components: controller, logging
>Affects Versions: 2.1.1
>Reporter: Keiichiro Wakasa
>Priority: Critical
>
> Hello experts,
> I'm not sure this is the proper place to ask, but I'd appreciate it if you could 
> help us with the following question...
>  
> We've continuously suffered from broker exclusion caused by heartbeat timeout 
> between broker and zookeeper leader.
> This issue can be easily detected by checking ephemeral nodes via zkcli.sh 
> but we'd like to detect this with logs like server.log/controller.log since 
> we have an existing system to forward these logs to our system. 
> Looking at server.log/controller.log, we couldn't find any logs that 
> indicate the heartbeat timeout. Are there any other logs to check for 
> heartbeat health?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-12783) Remove the deprecated ZK-based partition reassignment API

2024-10-09 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-12783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-12783.

Resolution: Duplicate

> Remove the deprecated ZK-based partition reassignment API
> -
>
> Key: KAFKA-12783
> URL: https://issues.apache.org/jira/browse/KAFKA-12783
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 3.0.0
>Reporter: Colin McCabe
>Assignee: Colin McCabe
>Priority: Major
>
> ZK-based reassignment has been deprecated since AK 2.5. It's time to remove it 
> since the next major release (3.0) is coming up.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-2448) BrokerChangeListener missed broker id path ephemeral node deletion event.

2024-10-09 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-2448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-2448.
---
Resolution: Won't Fix

We are now removing ZooKeeper support so closing this issue.

> BrokerChangeListener missed broker id path ephemeral node deletion event.
> -
>
> Key: KAFKA-2448
> URL: https://issues.apache.org/jira/browse/KAFKA-2448
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.9.0.0
>Reporter: Jiangjie Qin
>Assignee: Flavio Paiva Junqueira
>Priority: Major
>
> When a broker gets bounced, ideally the sequence should be like this:
> 1.1. Broker shutdown resources.
> 1.2. Broker close zkClient (this will cause the ephemeral node of 
> /brokers/ids/BROKER_ID to be deleted)
> 1.3. Broker restart and load the log segment
> 1.4. Broker create ephemeral node /brokers/ids/BROKER_ID
> The broker-side logs are:
> {noformat}
> ...
> 2015/08/17 22:42:37.663 INFO [SocketServer] [Thread-1] [kafka-server] [] 
> [Socket Server on Broker 1140], Shutting down
> 2015/08/17 22:42:37.735 INFO [SocketServer] [Thread-1] [kafka-server] [] 
> [Socket Server on Broker 1140], Shutdown completed
> ...
> 2015/08/17 22:42:53.898 INFO [ZooKeeper] [Thread-1] [kafka-server] [] 
> Session: 0x14d43fd905f68d7 closed
> 2015/08/17 22:42:53.898 INFO [ClientCnxn] [main-EventThread] [kafka-server] 
> [] EventThread shut down
> 2015/08/17 22:42:53.898 INFO [KafkaServer] [Thread-1] [kafka-server] [] 
> [Kafka Server 1140], shut down completed
> ...
> 2015/08/17 22:43:03.306 INFO [ClientCnxn] 
> [main-SendThread(zk-ei1-kafkatest.stg.linkedin.com:12913)] [kafka-server] [] 
> Session establishment complete on server zk-ei1-kafkatest.stg.linkedin
> .com/172.20.73.211:12913, sessionid = 0x24d43fd93d96821, negotiated timeout = 
> 12000
> 2015/08/17 22:43:03.306 INFO [ZkClient] [main-EventThread] [kafka-server] [] 
> zookeeper state changed (SyncConnected)
> ...
> {noformat}
> On the controller side, the sequence should be:
> 2.1. Controlled shutdown the broker
> 2.2. BrokerChangeListener fired for /brokers/ids child change because 
> ephemeral node is deleted in step 1.2
> 2.3. BrokerChangeListener fired again for /brokers/ids child change because 
> the ephemeral node is created in 1.4
> The issue I saw was that, on the controller side, the broker change listener only 
> fired once, after step 1.4. So the controller did not see any broker change.
> {noformat}
> 2015/08/17 22:41:46.189 [KafkaController] [Controller 1507]: Shutting down 
> broker 1140
> ...
> 2015/08/17 22:42:38.031 [RequestSendThread] 
> [Controller-1507-to-broker-1140-send-thread], Controller 1507 epoch 799 fails 
> to send request Name: StopReplicaRequest; Version: 0; CorrelationId: 5334; 
> ClientId: ; DeletePartitions: false; ControllerId: 1507; ControllerEpoch: 
> 799; Partitions: [seas-decisionboard-searcher-service_call,1] to broker 1140 
> : (EndPoint(eat1-app1140.corp.linkedin.com,10251,PLAINTEXT)). Reconnecting to 
> broker.
> java.nio.channels.ClosedChannelException
> at kafka.network.BlockingChannel.send(BlockingChannel.scala:110)
> at 
> kafka.controller.RequestSendThread.liftedTree1$1(ControllerChannelManager.scala:132)
> at 
> kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:131)
> at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60)
> 2015/08/17 22:42:38.031 [RequestSendThread] 
> [Controller-1507-to-broker-1140-send-thread], Controller 1507 connected to 
> 1140 : (EndPoint(eat1-app1140.corp.linkedin.com,10251,PLAINTEXT)) for sending 
> state change requests
> 2015/08/17 22:42:38.332 [RequestSendThread] 
> [Controller-1507-to-broker-1140-send-thread], Controller 1507 epoch 799 fails 
> to send request Name: StopReplicaRequest; Version: 0; CorrelationId: 5334; 
> ClientId: ; DeletePartitions: false; ControllerId: 1507; ControllerEpoch: 
> 799; Partitions: [seas-decisionboard-searcher-service_call,1] to broker 1140 
> : (EndPoint(eat1-app1140.corp.linkedin.com,10251,PLAINTEXT)). Reconnecting to 
> broker.
> java.nio.channels.ClosedChannelException
> at kafka.network.BlockingChannel.send(BlockingChannel.scala:110)
> at 
> kafka.controller.RequestSendThread.liftedTree1$1(ControllerChannelManager.scala:132)
> at 
> kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:131)
> at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:60)
> 
> 2015/08/17 22:43:09.035 [ReplicaStateMachine$BrokerChangeListener] 
> [BrokerChangeListener on Controller 1507]: Broker change listener fired for 
> path /brokers/ids with children 
> 1140,1282,1579,871,1556,872,1511,873,874,852,1575,875,1574,1530,854,857,858,859,1493,1272,880,1547,1568,1500,1521,863,864,865,867,1507
> 2015/08/17 22:43:09.082 

[jira] [Resolved] (KAFKA-1407) Broker can not return to ISR because of BadVersionException

2024-10-09 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-1407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-1407.
---
Resolution: Won't Fix

We are now removing ZooKeeper support so closing this issue.

> Broker can not return to ISR because of BadVersionException
> ---
>
> Key: KAFKA-1407
> URL: https://issues.apache.org/jira/browse/KAFKA-1407
> Project: Kafka
>  Issue Type: Bug
>  Components: controller
>Affects Versions: 0.8.1, 2.4.1
>Reporter: Dmitry Bugaychenko
>Assignee: Neha Narkhede
>Priority: Critical
>
> Each morning we found a broker out of ISR and stuck, with a log full of messages:
> {code}
> INFO   | jvm 1| 2014/04/21 08:36:21 | [2014-04-21 09:36:21,907] ERROR 
> Conditional update of path /brokers/topics/topic2/partitions/1/state with 
> data 
> {"controller_epoch":46,"leader":2,"version":1,"leader_epoch":38,"isr":[2]} 
> and expected version 53 failed due to 
> org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = 
> BadVersion for /brokers/topics/topic2/partitions/1/state 
> (kafka.utils.ZkUtils$)
> INFO   | jvm 1| 2014/04/21 08:36:21 | [2014-04-21 09:36:21,907] INFO 
> Partition [topic2,1] on broker 2: Cached zkVersion [53] not equal to that in 
> zookeeper, skip updating ISR (kafka.cluster.Partition)
> {code}
> It seems that it cannot recover after a short network breakdown and the only 
> way to bring it back is to restart it using kill -9.
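
For readers finding this via search: the log above comes from the broker's
conditional ISR update against ZooKeeper. As a minimal sketch of the pattern
involved (illustration only, using a plain org.apache.zookeeper.ZooKeeper
client; the path, data and cached version are placeholders), a versioned
setData fails with BadVersionException whenever the caller's cached znode
version is stale:

{code:java}
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.ZooKeeper;
import org.apache.zookeeper.data.Stat;

public class ConditionalUpdateSketch {
    // Attempt a versioned update; on BadVersionException the cached version was stale,
    // so re-read the current Stat and let the caller decide whether to retry or give up.
    public static boolean tryUpdate(ZooKeeper zk, String path, byte[] data, int cachedVersion)
            throws KeeperException, InterruptedException {
        try {
            zk.setData(path, data, cachedVersion);
            return true;
        } catch (KeeperException.BadVersionException e) {
            Stat current = zk.exists(path, false);
            System.out.println("cached version " + cachedVersion + " is stale, actual: "
                    + (current == null ? "node gone" : current.getVersion()));
            return false;
        }
    }
}
{code}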



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-1599) Change preferred replica election admin command to handle large clusters

2024-10-09 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-1599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-1599.
---
Resolution: Won't Do

We are now removing ZooKeeper support so closing this issue.

> Change preferred replica election admin command to handle large clusters
> 
>
> Key: KAFKA-1599
> URL: https://issues.apache.org/jira/browse/KAFKA-1599
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 0.8.2.0
>Reporter: Todd Palino
>Assignee: Abhishek Nigam
>Priority: Major
>  Labels: newbie++
>
> We ran into a problem with a cluster that has 70k partitions where we could 
> not trigger a preferred replica election for all topics and partitions using 
> the admin tool. Upon investigation, it was determined that this was because 
> the JSON object that was being written to the admin znode to tell the 
> controller to start the election was 1.8 MB in size. As the default Zookeeper 
> data size limit is 1MB, and it is non-trivial to change, we should come up 
> with a better way to represent the list of topics and partitions for this 
> admin command.
> I have several thoughts on this so far:
> 1) Trigger the command for all topics and partitions with a JSON object that 
> does not include an explicit list of them (i.e. a flag that says "all 
> partitions")
> 2) Use a more compact JSON representation. Currently, the JSON contains a 
> 'partitions' key which holds a list of dictionaries that each have a 'topic' 
> and 'partition' key, and there must be one list item for each partition. This 
> results in a lot of repetition of key names that is unneeded. Changing this 
> to a format like this would be much more compact:
> {'topics': {'topicName1': [0, 1, 2, 3], 'topicName2': [0,1]}, 'version': 1}
> 3) Use a representation other than JSON. Strings are inefficient. A binary 
> format would be the most compact. This does put a greater burden on tools and 
> scripts that do not use the inbuilt libraries, but it is not too high.
> 4) Use a representation that involves multiple znodes. A structured tree in 
> the admin command would probably provide the most complete solution. However, 
> we would need to make sure to not exceed the data size limit with a wide tree 
> (the list of children for any single znode cannot exceed the ZK data size of 
> 1MB)
> Obviously, there could be a combination of #1 with a change in the 
> representation, which would likely be appropriate as well.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-17749) Throttle metrics have changed name

2024-10-09 Thread Mickael Maison (Jira)
Mickael Maison created KAFKA-17749:
--

 Summary: Throttle metrics have changed name
 Key: KAFKA-17749
 URL: https://issues.apache.org/jira/browse/KAFKA-17749
 Project: Kafka
  Issue Type: Bug
Affects Versions: 3.8.0, 3.9.0
Reporter: Mickael Maison
Assignee: Mickael Maison


In 
https://github.com/apache/kafka/commit/e4e1116156d44d5e7a52ad8fb51a57d5e5755710,
 we moved the Throttler class to the storage module but this made the metrics 
emitted from this class change name.

Since 3.8 the metrics are named 
org.apache.kafka.storage.internals.utils.Throttler. Previously they were called 
kafka.utils.Throttler.
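
As a rough way to confirm which name a given broker actually emits, here is a
minimal sketch (illustration only; it assumes it runs against the MBean server
of the JVM that registers the metrics, and it does not assert the exact
ObjectName layout):

{code:java}
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class FindThrottlerMetrics {
    public static void main(String[] args) {
        MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
        // Scan every registered MBean and print the ones whose name mentions Throttler,
        // making the old (kafka.utils.*) vs new (org.apache.kafka.storage.*) naming visible.
        for (ObjectName name : mbs.queryNames(null, null)) {
            if (name.toString().contains("Throttler")) {
                System.out.println(name);
            }
        }
    }
}
{code}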



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-7090) Zookeeper client setting in server-properties

2024-10-09 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7090?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-7090.
---
Resolution: Won't Do

We are now removing ZooKeeper support so closing this issue.

> Zookeeper client setting in server-properties
> -
>
> Key: KAFKA-7090
> URL: https://issues.apache.org/jira/browse/KAFKA-7090
> Project: Kafka
>  Issue Type: New Feature
>  Components: config, documentation
>Reporter: Christian Tramnitz
>Priority: Minor
>
> There are several Zookeeper client settings that may be used to connect to ZK.
> Currently, it seems only very few zookeeper.* settings are supported in 
> Kafka's server.properties file. Wouldn't it make sense to support all 
> zookeeper client settings there or where would that need to go?
> I.e. for using Zookeeper 3.5 with TLS enabled, the following properties are 
> required:
> zookeeper.clientCnxnSocket
> zookeeper.client.secure
> zookeeper.ssl.keyStore.location
> zookeeper.ssl.keyStore.password
> zookeeper.ssl.trustStore.location
> zookeeper.ssl.trustStore.password
> It's obviously possible to pass them through "-D", but especially for the 
> keystore password, I'd be more comfortable with this sitting in the 
> properties file than being visible in the process list...



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-6940) Kafka Cluster and Zookeeper ensemble configuration with SASL authentication

2024-10-09 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-6940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-6940.
---
Resolution: Information Provided

Securing ZooKeeper is covered in this section in the docs: 
https://kafka.apache.org/documentation/#zk_authz

> Kafka Cluster and Zookeeper ensemble configuration with SASL authentication
> ---
>
> Key: KAFKA-6940
> URL: https://issues.apache.org/jira/browse/KAFKA-6940
> Project: Kafka
>  Issue Type: Task
>  Components: core, security, zkclient
>Affects Versions: 0.11.0.2
> Environment: PRE Production
>Reporter: Shashank Jain
>Priority: Blocker
>  Labels: security, test
>   Original Estimate: 96h
>  Remaining Estimate: 96h
>
> Hi All, 
>  
>  
> I have a working Kafka Cluster and Zookeeper Ensemble, but after 
> integrating SASL authentication I am facing the below exception, 
>  
>  
> Zookeeper:- 
>  
>  
> 2018-05-23 07:39:59,476 [myid:1] - INFO  [ProcessThread(sid:1 cport:-1):: ] - 
> Got user-level KeeperException when processing sessionid:0x301cae0b3480002 
> type:delete cxid:0x48 zxid:0x2004e txntype:-1 reqpath:n/a Error 
> Path:/admin/preferred_replica_election Error:KeeperErrorCode = NoNode for 
> /admin/preferred_replica_election
> 2018-05-23 07:40:39,240 [myid:1] - INFO  [ProcessThread(sid:1 
> cport:-1)::PrepRequestProcessor@653] - Got user-level KeeperException when 
> processing sessionid:0x200b4f13c190006 type:create cxid:0x20 zxid:0x20052 
> txntype:-1 reqpath:n/a Error Path:/brokers Error:KeeperErrorCode = NodeExists 
> for /brokers
> 2018-05-23 07:40:39,240 [myid:1] - INFO  [ProcessThread(sid:1 
> cport:-1)::PrepRequestProcessor@653] - Got user-level KeeperException when 
> processing sessionid:0x200b4f13c190006 type:create cxid:0x21 zxid:0x20053 
> txntype:-1 reqpath:n/a Error Path:/brokers/ids Error:KeeperErrorCode = 
> NodeExists for /brokers/ids
> 2018-05-23 07:41:00,864 [myid:1] - INFO  [ProcessThread(sid:1 
> cport:-1)::PrepRequestProcessor@653] - Got user-level KeeperException when 
> processing sessionid:0x301cae0b3480004 type:create cxid:0x20 zxid:0x20058 
> txntype:-1 reqpath:n/a Error Path:/brokers Error:KeeperErrorCode = NodeExists 
> for /brokers
> 2018-05-23 07:41:00,864 [myid:1] - INFO  [ProcessThread(sid:1 
> cport:-1)::PrepRequestProcessor@653] - Got user-level KeeperException when 
> processing sessionid:0x301cae0b3480004 type:create cxid:0x21 zxid:0x20059 
> txntype:-1 reqpath:n/a Error Path:/brokers/ids Error:KeeperErrorCode = 
> NodeExists for /brokers/ids
> 2018-05-23 07:41:28,456 [myid:1] - INFO  [ProcessThread(sid:1 
> cport:-1)::PrepRequestProcessor@487] - Processed session termination for 
> sessionid: 0x200b4f13c190002
> 2018-05-23 07:41:29,563 [myid:1] - INFO  [ProcessThread(sid:1 
> cport:-1)::PrepRequestProcessor@487] - Processed session termination for 
> sessionid: 0x301cae0b3480002
> 2018-05-23 07:41:29,569 [myid:1] - INFO  [ProcessThread(sid:1 
> cport:-1)::PrepRequestProcessor@653] - Got user-level KeeperException when 
> processing sessionid:0x200b4f13c190006 type:create cxid:0x2d zxid:0x2005f 
> txntype:-1 reqpath:n/a Error Path:/controller Error:KeeperErrorCode = 
> NodeExists for /controller
> 2018-05-23 07:41:29,679 [myid:1] - INFO  [ProcessThread(sid:1 
> cport:-1)::PrepRequestProcessor@653] - Got user-level KeeperException when 
> processing sessionid:0x301cae0b3480004 type:delete cxid:0x4e zxid:0x20061 
> txntype:-1 reqpath:n/a Error Path:/admin/preferred_replica_election 
> Error:KeeperErrorCode = NoNode for /admin/preferred_replica_election
>  
>  
> Kafka:- 
>  
> [2018-05-23 09:06:31,969] ERROR [ReplicaFetcherThread-0-1]: Error for 
> partition [23MAY,0] to broker 
> 1:org.apache.kafka.common.errors.UnknownTopicOrPartitionException: This 
> server does not host this topic-partition. (kafka.server.ReplicaFetcherThread)
>  
>  
>  
> ERROR [ReplicaFetcherThread-0-2]: Current offset 142474 for partition 
> [23MAY,1] out of range; reset offset to 142478 
> (kafka.server.ReplicaFetcherThread)
>  
>  
> ERROR [ReplicaFetcherThread-0-2]: Error for partition [23MAY,2] to broker 
> 2:org.apache.kafka.common.errors.NotLeaderForPartitionException: This server 
> is not the leader for that topic-partition. 
> (kafka.server.ReplicaFetcherThread)
>  
>  
>  
> Below are my configuration:- 
>  
>  
> Zookeeper:- 
>  
>  java.env
> SERVER_JVMFLAGS="-Djava.security.auth.login.config=/usr/local/zookeeper/conf/ZK_jaas.conf"
>  
>  
> ZK_jaas.conf
> Server
>  
> { org.apache.zookeeper.server.auth.DigestLoginModule required
>   username="admin"
>   password="admin-secret"
>   user_admin="admin-secret";
>  };
>  
> QuorumServer {
>        org.apache.zookeeper.server.auth.DigestLoginModule required
>        user_test="test";

[jira] [Created] (KAFKA-17750) Extend kafka-consumer-groups command line tool to support new consumer group

2024-10-09 Thread David Jacot (Jira)
David Jacot created KAFKA-17750:
---

 Summary: Extend kafka-consumer-groups command line tool to support 
new consumer group
 Key: KAFKA-17750
 URL: https://issues.apache.org/jira/browse/KAFKA-17750
 Project: Kafka
  Issue Type: Improvement
  Components: tools
Reporter: David Jacot


At the moment, the `kafka-consumer-groups` command line tool displays the same 
information for consumer groups regardless of whether they are classic groups 
using the embedded consumer protocol or new consumer groups.

The new consumer groups have more state available to troubleshoot issues. For 
instance, each member has an epoch, an assignment, a target assignment, etc. 
You can refer to 
[KIP-848|https://cwiki.apache.org/confluence/display/KAFKA/KIP-848%3A+The+Next+Generation+of+the+Consumer+Rebalance+Protocol]
 for the details. It would be useful to display those too in order to give 
administrators a detailed view of the state of the groups.
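
For context, a minimal sketch of how per-group and per-member details are
already reachable through the public Admin API (the group name and bootstrap
address below are placeholders; accessors for the new epoch and
target-assignment fields are deliberately omitted, since exposing them is
exactly the work tracked here):

{code:java}
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.ConsumerGroupDescription;
import org.apache.kafka.clients.admin.MemberDescription;

public class DescribeGroupSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (Admin admin = Admin.create(props)) {
            ConsumerGroupDescription group =
                admin.describeConsumerGroups(List.of("my-group")).all().get().get("my-group");
            System.out.println("state: " + group.state());
            for (MemberDescription member : group.members()) {
                // Per-member details the tool can already show; the new protocol adds
                // member epochs and target assignments on top of this.
                System.out.println(member.consumerId() + " -> "
                        + member.assignment().topicPartitions());
            }
        }
    }
}
{code}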



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-17626) Move common fetch related classes from storage to server-common

2024-10-09 Thread Apoorv Mittal (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Apoorv Mittal resolved KAFKA-17626.
---
Resolution: Fixed

> Move common fetch related classes from storage to server-common
> ---
>
> Key: KAFKA-17626
> URL: https://issues.apache.org/jira/browse/KAFKA-17626
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Apoorv Mittal
>Assignee: Apoorv Mittal
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-17752) Controller crashes when removed if it is an initial controller

2024-10-09 Thread Juha Mynttinen (Jira)
Juha Mynttinen created KAFKA-17752:
--

 Summary: Controller crashes when removed if it is an initial 
controller
 Key: KAFKA-17752
 URL: https://issues.apache.org/jira/browse/KAFKA-17752
 Project: Kafka
  Issue Type: Bug
Affects Versions: 3.9.0
Reporter: Juha Mynttinen


Hey, 

Tested using 3.9.0 RC0.

It seems that "kafka-metadata-quorum.sh remove-controller" causes the removed 
controller to crash if it is one of the controllers specified using 
"--initial-controllers "

Steps to reproduce:

Clean up and setup the environment

rm -rf /tmp/controllers && \
mkdir -p /tmp/controllers/c1 && \
mkdir -p /tmp/controllers/c2 && \
mkdir -p /tmp/controllers/c3

export KAFKA_HOME=

Format the controllers

$KAFKA_HOME/bin/kafka-storage.sh format --cluster-id 
----0001 --initial-controllers 
1001@localhost:10001:AAEAAA,1002@localhost:10002:AAEAAA,1003@localhost:10003:AAEAAA
 --config c1.properties
$KAFKA_HOME/bin/kafka-storage.sh format --cluster-id 
----0001 --initial-controllers 
1001@localhost:10001:AAEAAA,1002@localhost:10002:AAEAAA,1003@localhost:10003:AAEAAA
 --config c2.properties
$KAFKA_HOME/bin/kafka-storage.sh format --cluster-id 
----0001 --initial-controllers 
1001@localhost:10001:AAEAAA,1002@localhost:10002:AAEAAA,1003@localhost:10003:AAEAAA
 --config c3.properties

Start the controllers, in separate terminals

$KAFKA_HOME/bin/kafka-run-class.sh -name kafkaService kafka.Kafka c1.properties
$KAFKA_HOME/bin/kafka-run-class.sh -name kafkaService kafka.Kafka c2.properties
$KAFKA_HOME/bin/kafka-run-class.sh -name kafkaService kafka.Kafka c3.properties

Remove a controller:

$KAFKA_HOME/bin/kafka-metadata-quorum.sh --bootstrap-controller 
localhost:10001,localhost:10002,localhost:10003,localhost:10004 
remove-controller --controller-id 1001 --controller-directory-id 
AAEAAA

The process crashes with the following error:

[2024-10-09 15:19:15,574] ERROR Encountered fatal fault: exception while 
renouncing leadership 
(org.apache.kafka.server.fault.ProcessTerminatingFaultHandler)
java.lang.RuntimeException: Unable to reset to last stable offset 55. No 
in-memory snapshot found for this offset.
        at 
org.apache.kafka.controller.OffsetControlManager.deactivate(OffsetControlManager.java:268)
        at 
org.apache.kafka.controller.QuorumController.renounce(QuorumController.java:1281)
        at 
org.apache.kafka.controller.QuorumController.handleEventException(QuorumController.java:552)
        at 
org.apache.kafka.controller.QuorumController.access$800(QuorumController.java:180)
        at 
org.apache.kafka.controller.QuorumController$ControllerWriteEvent.complete(QuorumController.java:885)
        at 
org.apache.kafka.controller.QuorumController$ControllerWriteEvent.handleException(QuorumController.java:875)
        at 
org.apache.kafka.queue.KafkaEventQueue$EventContext.completeWithException(KafkaEventQueue.java:153)
        at 
org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:142)
        at 
org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:215)
        at 
org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:186)
        at java.base/java.lang.Thread.run(Thread.java:840)

If the process that died is restarted, it joins the cluster and becomes an 
observer, as expected.

The crash doesn't happen in a slightly different case (exact steps missing), but 
the idea is this:
1. Create a 3-controller cluster as above
2. Format and start a 4th controller.
3. Add the 4th controller as a voter.
4. Remove the 4th controller to make it an observer. It becomes an observer as 
expected.

Because this case works, I'm guessing the crash is somehow related to the 
controller being one of the initial controllers.

I didn't dig deeper on why the crash occurs.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-6602) Support Kafka to save credentials in Java Key Store on Zookeeper node

2024-10-09 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-6602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-6602.
---
Resolution: Won't Do

We are now removing ZooKeeper support so closing this issue.

> Support Kafka to save credentials in Java Key Store on Zookeeper node
> -
>
> Key: KAFKA-6602
> URL: https://issues.apache.org/jira/browse/KAFKA-6602
> Project: Kafka
>  Issue Type: New Feature
>  Components: security
>Reporter: Chen He
>Assignee: Chen He
>Priority: Major
>
> Kafka connect needs to talk to multifarious distributed systems. However, 
> each system has its own authentication mechanism. How we manage these 
> credentials becomes a common problem. 
> Here are my thoughts:
>  # We may need to save it in java key store;
>  # We may need to put this key store in a distributed system (topic or 
> zookeeper);
>  # Key store password may be configured in Kafka configuration;
> I have implemented a feature that allows storing a Java key store in a ZooKeeper 
> node. If the Kafka community likes this idea, I am happy to contribute.
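
For the record, the mechanics of the idea are small. A minimal sketch, assuming
the serialized key store bytes have already been read back from a znode
(password handling and the znode read itself are left to the caller):

{code:java}
import java.io.ByteArrayInputStream;
import java.security.KeyStore;

public class KeyStoreFromZnode {
    // Rebuild a JKS key store from bytes that were stored in (and fetched from) ZooKeeper.
    public static KeyStore load(byte[] znodeBytes, char[] storePassword) throws Exception {
        KeyStore keyStore = KeyStore.getInstance("JKS");
        keyStore.load(new ByteArrayInputStream(znodeBytes), storePassword);
        return keyStore;
    }
}
{code}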



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-7898) ERROR Caught unexpected throwable (org.apache.zookeeper.ClientCnxn)

2024-10-09 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-7898.
---
Resolution: Won't Fix

We are now removing ZooKeeper support so closing this issue.

> ERROR Caught unexpected throwable (org.apache.zookeeper.ClientCnxn)
> ---
>
> Key: KAFKA-7898
> URL: https://issues.apache.org/jira/browse/KAFKA-7898
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Gabriel Lukacs
>Priority: Major
>
> We observed a NullPointerException on one of our brokers in a 3-broker cluster 
> environment. If I list the processes and open ports, it seems that the faulty 
> broker is running, but kafka-connect (which we also use) periodically 
> restarts due to the fact that it cannot connect to the kafka cluster (configured 
> in ssl & plaintext mode too). Is it a bug in kafka/zookeeper?
>  
> [2019-02-05 14:28:11,359] WARN Client session timed out, have not heard from 
> server in 4141ms for sessionid 0x310166e 
> (org.apache.zookeeper.ClientCnxn)
> [2019-02-05 14:28:12,525] ERROR Caught unexpected throwable 
> (org.apache.zookeeper.ClientCnxn)
> java.lang.NullPointerException
>  at 
> kafka.zookeeper.ZooKeeperClient$$anon$8.processResult(ZooKeeperClient.scala:217)
>  at 
> org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:633)
>  at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:508)
> [2019-02-05 14:28:12,526] ERROR Caught unexpected throwable 
> (org.apache.zookeeper.ClientCnxn)
> [2019-02-05 14:28:22,701] WARN Client session timed out, have not heard from 
> server in 4004ms for sessionid 0x310166e 
> (org.apache.zookeeper.ClientCnxn)
> [2019-02-05 14:28:28,670] WARN Client session timed out, have not heard from 
> server in 4049ms for sessionid 0x310166e 
> (org.apache.zookeeper.ClientCnxn)
> [2019-02-05 15:05:20,601] WARN [GroupCoordinator 1]: Failed to write empty 
> metadata for group 
> encodable-emvTokenAccess-delta-encoder-group-emvIssuerAccess-v2-2-0: The 
> group is rebalancing, so a rejoin is needed. 
> (kafka.coordinator.group.GroupCoordinator)
> kafka 7381 1 0 14:22 ? 00:00:19 java -Xmx512M -Xms512M -server -XX:+UseG1GC 
> -XX:MaxGCPauseMillis=20 -XX:InitiatingHeapOccupancyPercent=35 
> -XX:+ExplicitGCInvokesConcurrent -Djava.awt.headless=true 
> -Xloggc:/opt/kafka/bin/../logs/zookeeper-gc.log -verbose:gc 
> -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps 
> -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=10 -XX:GCLogFileSize=100M 
> -Dcom.sun.management.jmxremote 
> -Dcom.sun.management.jmxremote.authenticate=false 
> -Dcom.sun.management.jmxremote.ssl=false 
> -Dkafka.logs.dir=/opt/kafka/bin/../logs 
> -Dlog4j.configuration=file:/opt/kafka/config/zoo-log4j.properties -cp 
> /opt/kafka/bin/../libs/activation-1.1.1.jar:/opt/kafka/bin/../libs/aopalliance-repackaged-2.5.0-b42.jar:/opt/kafka/bin/../libs/argparse4j-0.7.0.jar:/opt/kafka/bin/../libs/audience-annotations-0.5.0.jar:/opt/kafka/bin/../libs/commons-lang3-3.5.jar:/opt/kafka/bin/../libs/compileScala.mapping:/opt/kafka/bin/../libs/compileScala.mapping.asc:/opt/kafka/bin/../libs/connect-api-2.1.0.jar:/opt/kafka/bin/../libs/connect-basic-auth-extension-2.1.0.jar:/opt/kafka/bin/../libs/connect-file-2.1.0.jar:/opt/kafka/bin/../libs/connect-json-2.1.0.jar:/opt/kafka/bin/../libs/connect-runtime-2.1.0.jar:/opt/kafka/bin/../libs/connect-transforms-2.1.0.jar:/opt/kafka/bin/../libs/guava-20.0.jar:/opt/kafka/bin/../libs/hk2-api-2.5.0-b42.jar:/opt/kafka/bin/../libs/hk2-locator-2.5.0-b42.jar:/opt/kafka/bin/../libs/hk2-utils-2.5.0-b42.jar:/opt/kafka/bin/../libs/jackson-annotations-2.9.7.jar:/opt/kafka/bin/../libs/jackson-core-2.9.7.jar:/opt/kafka/bin/../libs/jackson-databind-2.9.7.jar:/opt/kafka/bin/../libs/jackson-jaxrs-base-2.9.7.jar:/opt/kafka/bin/../libs/jackson-jaxrs-json-provider-2.9.7.jar:/opt/kafka/bin/../libs/jackson-module-jaxb-annotations-2.9.7.jar:/opt/kafka/bin/../libs/javassist-3.22.0-CR2.jar:/opt/kafka/bin/../libs/javax.annotation-api-1.2.jar:/opt/kafka/bin/../libs/javax.inject-1.jar:/opt/kafka/bin/../libs/javax.inject-2.5.0-b42.jar:/opt/kafka/bin/../libs/javax.servlet-api-3.1.0.jar:/opt/kafka/bin/../libs/javax.ws.rs-api-2.1.1.jar:/opt/kafka/bin/../libs/javax.ws.rs-api-2.1.jar:/opt/kafka/bin/../libs/jaxb-api-2.3.0.jar:/opt/kafka/bin/../libs/jersey-client-2.27.jar:/opt/kafka/bin/../libs/jersey-common-2.27.jar:/opt/kafka/bin/../libs/jersey-container-servlet-2.27.jar:/opt/kafka/bin/../libs/jersey-container-servlet-core-2.27.jar:/opt/kafka/bin/../libs/jersey-hk2-2.27.jar:/opt/kafka/bin/../libs/jersey-media-jaxb-2.27.jar:/opt/kafka/bin/../libs/jersey-server-2.27.jar:/opt/kafka/bin/../libs/jetty-client-9.4.12.v20180830.jar:/opt/kafka/bin/../libs/jetty-continuation-9.4.12.v20180830.jar:/opt/kafka/

[jira] [Resolved] (KAFKA-7754) zookeeper-security-migration.sh sets the root ZNode as world-readable

2024-10-09 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7754?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-7754.
---
Resolution: Won't Fix

We are now removing ZooKeeper support so closing this issue.

> zookeeper-security-migration.sh sets the root ZNode as world-readable
> -
>
> Key: KAFKA-7754
> URL: https://issues.apache.org/jira/browse/KAFKA-7754
> Project: Kafka
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.0.0
>Reporter: Badai Aqrandista
>Priority: Minor
>
> If I start broker with {{zookeeper.set.acl=true}} from the first time I start 
> the broker, the root ZNode is not set to be world-readable to allow other 
> application to share the Zookeeper ensemble with chroot.
> But if I run {{zookeeper-security-migration.sh}} with {{--zookeeper.acl 
> secure}}, the root ZNode becomes world-readable. Is this correct?
>  
> {noformat}
> root@localhost:/# zookeeper-shell localhost:2181
> Connecting to localhost:2181
> Welcome to ZooKeeper!
> JLine support is enabled
> [zk: localhost:2181(CONNECTING) 0] 
> WATCHER::
> WatchedEvent state:SyncConnected type:None path:null
> WATCHER::
> WatchedEvent state:SaslAuthenticated type:None path:null
> [zk: localhost:2181(CONNECTED) 0] getAcl /
> 'world,'anyone
> : cdrwa
> [zk: localhost:2181(CONNECTED) 1] getAcl /brokers
> 'world,'anyone
> : r
> 'sasl,'kafkabroker
> : cdrwa
> [zk: localhost:2181(CONNECTED) 2] quit
> Quitting...
> root@localhost:/# zookeeper-security-migration --zookeeper.acl secure 
> --zookeeper.connect localhost:2181
> root@localhost:/# zookeeper-shell localhost:2181
> Connecting to localhost:2181
> Welcome to ZooKeeper!
> JLine support is enabled
> [zk: localhost:2181(CONNECTING) 0] 
> WATCHER::
> WatchedEvent state:SyncConnected type:None path:null
> WATCHER::
> WatchedEvent state:SaslAuthenticated type:None path:null
> [zk: localhost:2181(CONNECTED) 0] getAcl /
> 'world,'anyone
> : r
> 'sasl,'kafkabroker
> : cdrwa
> [zk: localhost:2181(CONNECTED) 1] getAcl /brokers
> 'world,'anyone
> : r
> 'sasl,'kafkabroker
> : cdrwa
> [zk: localhost:2181(CONNECTED) 2] 
> {noformat}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-7710) Poor Zookeeper ACL management with Kerberos

2024-10-09 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-7710.
---
Resolution: Won't Fix

We are now removing ZooKeeper support so closing this issue.

> Poor Zookeeper ACL management with Kerberos
> ---
>
> Key: KAFKA-7710
> URL: https://issues.apache.org/jira/browse/KAFKA-7710
> Project: Kafka
>  Issue Type: Bug
>Reporter: Mr Kafka
>Priority: Major
>
> I have seen many organizations run many Kafka clusters. The simplest scenario 
> is you may have a *kafka.dev.example.com* cluster and a 
> *kafka.prod.example.com* cluster. A more extreme example is teams within 
> an organization running their own individual clusters and wanting isolation.
> When you enable Zookeeper ACLs in Kafka the ACL looks to be set to the 
> principal (SPN) that is used to authenticate against Zookeeper.
> For example I have brokers:
>  * *01.kafka.dev.example.com*
> * *02.kafka.dev.example.com*
> * *03.kafka.dev.example.com*
> On *01.kafka.dev.example.com* I run the security-migration tool as below:
> {code:java}
> KAFKA_OPTS="-Djava.security.auth.login.config=/etc/kafka/kafka_server_jaas.conf
>  -Dzookeeper.sasl.clientconfig=ZkClient" zookeeper-security-migration 
> --zookeeper.acl=secure --zookeeper.connect=a01.zookeeper.dev.example.com:2181
> {code}
> I end up with ACL's in Zookeeper as below:
> {code:java}
> # [zk: localhost:2181(CONNECTED) 2] getAcl /cluster
> # 'sasl,'kafka/01.kafka.dev.example.com@EXAMPLE
> # : cdrwa
> {code}
> This ACL means no other broker in the cluster can access the znode in 
> Zookeeper except broker 01.
> To resolve the issue you need to set the below properties in Zookeeper's 
> config:
> {code:java}
> kerberos.removeHostFromPrincipal = true
> kerberos.removeRealmFromPrincipal = true
> {code}
> Now when Kafka set ACL's they are stored as:
> {code:java}
> # [zk: localhost:2181(CONNECTED) 2] getAcl /cluster
> # 'sasl,'kafka
> #: cdrwa
> {code}
> This now means any broker in the cluster can access the ZK node. This means if 
> I have a dev Kafka broker it can write to a "prod.zookeeper.example.com" 
> zookeeper host, as when it auths based on an SPN 
> "kafka/01.kafka.dev.example.com" the host is dropped and we auth against the 
> service principal kafka.
> If your organization is flexible you may be able to create different Kerberos 
> Realms per cluster and use:
> {code:java}
> kerberos.removeHostFromPrincipal = true
> kerberos.removeRealmFromPrincipal = false{code}
> That means acl's will be in the format "kafka/REALM" which means only brokers 
> in the same realm can connect. The difficulty here is getting your average large 
> organization's security team to be willing to create ad-hoc realms.
> *Proposal*
> Kafka should support setting ACLs for all known brokers in the cluster, i.e. ACLs on a 
> Znode would have
> {code:java}
> kafka/01.kafka.dev.example.com@EXAMPLE
> kafka/02.kafka.dev.example.com@EXAMPLE
> kafka/03.kafka.dev.example.com@EXAMPLE{code}
> With this, though, some kind of support will need to be added so that if a new 
> broker joins the cluster the host ACL gets added to existing ZNodes.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-6344) 0.8.2 clients will store invalid configuration in ZK for Kafka 1.0 brokers

2024-10-09 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-6344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-6344.
---
Resolution: Fixed

We are now removing ZooKeeper support so closing this issue.

> 0.8.2 clients will store invalid configuration in ZK for Kafka 1.0 brokers
> --
>
> Key: KAFKA-6344
> URL: https://issues.apache.org/jira/browse/KAFKA-6344
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 1.0.0
>Reporter: Vincent Bernat
>Priority: Major
>
> Hello,
> When using a Kafka 0.8.2 Scala client, the "changeTopicConfig" method from 
> AdminUtils will write the topic name to /config/changes/config_change_X. 
> Since 0.9, it is expected to have a JSON string and brokers will bail out if 
> it is not the case with a java.lang.IllegalArgumentException with message 
> "Config change notification has an unexpected value. The format 
> is:{\"version\" : 1, \"entity_type\":\"topics/clients\", \"entity_name\" : 
> \"topic_name/client_id\"} or {\"version\" : 2, 
> \"entity_path\":\"entity_type/entity_name\"}. Received: \"dns\"". Moreover, 
> the broker will shutdown after this error.
> As 1.0 brokers are expected to accept 0.8.x clients, either highlight in the 
> documentation this doesn't apply to AdminUtils or accept this "version 0" 
> format.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-2952) Add ducktape test for secure->unsecure ZK migration

2024-10-09 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-2952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-2952.
---
Resolution: Won't Do

We are now removing ZooKeeper support so closing this issue.

> Add ducktape test for secure->unsecure ZK migration 
> 
>
> Key: KAFKA-2952
> URL: https://issues.apache.org/jira/browse/KAFKA-2952
> Project: Kafka
>  Issue Type: Test
>Affects Versions: 0.9.0.0
>Reporter: Flavio Paiva Junqueira
>Assignee: Flavio Paiva Junqueira
>Priority: Major
>
> We have test cases for the unsecure -> secure path, but not the other way 
> around, We should add it.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-10041) Kafka upgrade from 1.1 to 2.4/2.5/trunk fails due to failure in ZooKeeper

2024-10-09 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-10041.

Resolution: Information Provided

We are now removing ZooKeeper support so closing this issue.

> Kafka upgrade from 1.1 to 2.4/2.5/trunk fails due to failure in 
> ZooKeeper
> ---
>
> Key: KAFKA-10041
> URL: https://issues.apache.org/jira/browse/KAFKA-10041
> Project: Kafka
>  Issue Type: Bug
>  Components: zkclient
>Affects Versions: 2.4.0, 2.5.0, 2.6.0
>Reporter: Zhuqi Jin
>Priority: Major
>
> When we tested upgrading Kafka from 1.1 to 2.4/2.5, the upgraded node failed 
> to start due to a known zookeeper failure - ZOOKEEPER-3056.
> The error message is shown below:
>  
> {code:java}
> [2020-05-24 23:45:17,638] ERROR Unexpected exception, exiting abnormally 
> (org.apache.zookeeper.server.ZooKeeperServerMain)
> java.io.IOException: No snapshot found, but there are log entries. Something 
> is broken!
>  at 
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.restore(FileTxnSnapLog.java:240)
>  at org.apache.zookeeper.server.ZKDatabase.loadDataBase(ZKDatabase.java:240)
>  at 
> org.apache.zookeeper.server.ZooKeeperServer.loadData(ZooKeeperServer.java:290)
>  at 
> org.apache.zookeeper.server.ZooKeeperServer.startdata(ZooKeeperServer.java:450)
>  at 
> org.apache.zookeeper.server.NIOServerCnxnFactory.startup(NIOServerCnxnFactory.java:764)
>  at 
> org.apache.zookeeper.server.ServerCnxnFactory.startup(ServerCnxnFactory.java:98)
>  at 
> org.apache.zookeeper.server.ZooKeeperServerMain.runFromConfig(ZooKeeperServerMain.java:144)
>  at 
> org.apache.zookeeper.server.ZooKeeperServerMain.initializeAndRun(ZooKeeperServerMain.java:106)
>  at 
> org.apache.zookeeper.server.ZooKeeperServerMain.main(ZooKeeperServerMain.java:64)
>  at 
> org.apache.zookeeper.server.quorum.QuorumPeerMain.initializeAndRun(QuorumPeerMain.java:128)
>  at 
> org.apache.zookeeper.server.quorum.QuorumPeerMain.main(QuorumPeerMain.java:82)
> {code}
>  
> {code:java}
> [2020-05-24 23:45:25,142] ERROR Fatal error during KafkaServer startup. 
> Prepare to shutdown (kafka.server.KafkaServer)
> kafka.zookeeper.ZooKeeperClientTimeoutException: Timed out waiting for 
> connection while in state: CONNECTING
>  at 
> kafka.zookeeper.ZooKeeperClient.$anonfun$waitUntilConnected$3(ZooKeeperClient.scala:259)
>  at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
>  at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:253)
>  at 
> kafka.zookeeper.ZooKeeperClient.waitUntilConnected(ZooKeeperClient.scala:255)
>  at kafka.zookeeper.ZooKeeperClient.(ZooKeeperClient.scala:113)
>  at kafka.zk.KafkaZkClient$.apply(KafkaZkClient.scala:1858)
>  at kafka.server.KafkaServer.createZkClient$1(KafkaServer.scala:375)
>  at kafka.server.KafkaServer.initZkClient(KafkaServer.scala:399)
>  at kafka.server.KafkaServer.startup(KafkaServer.scala:207)
>  at kafka.server.KafkaServerStartable.startup(KafkaServerStartable.scala:44)
>  at kafka.Kafka$.main(Kafka.scala:84)
>  at kafka.Kafka.main(Kafka.scala){code}
> It can be reproduced through the following steps:
> 1. Start a single-node kafka 1.1. 
> 2. Create a topic and use kafka-producer-perf-test.sh to produce several 
> messages.
> {code:java}
> bin/kafka-topics.sh --create --bootstrap-server localhost:9092 
> --replication-factor 1 --partitions 1 --topic test 
> bin/kafka-producer-perf-test.sh --topic test --num-records 500 --record-size 
> 300 --throughput 100 --producer-props bootstrap.servers=localhost:9092{code}
> 3. Upgrade the node to 2.4/2.5 with the same configuration. The new-version 
> node failed to start because of ZooKeeper.
> Kafka 1.1 is using dependant-libs-2.11.12/zookeeper-3.4.10.jar, and Kafka 
> 2.4/2.5/trunk(5302efb2d1b7a69bcd3173a13b2d08a2666979ed) are using 
> zookeeper-3.5.8.jar
> The bug is fixed in zookeeper-3.6.0, should we upgrade the dependency of 
> Kafka 2.4/2.5/trunk to use zookeeper-3.6.0.jar?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-8188) Zookeeper Connection Issue Take Down the Whole Kafka Cluster

2024-10-09 Thread Mickael Maison (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8188?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mickael Maison resolved KAFKA-8188.
---
Resolution: Won't Fix

We are now removing ZooKeeper support so closing this issue.

> Zookeeper Connection Issue Take Down the Whole Kafka Cluster
> 
>
> Key: KAFKA-8188
> URL: https://issues.apache.org/jira/browse/KAFKA-8188
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.1.1
>Reporter: Candice Wan
>Priority: Critical
> Attachments: thread_dump.log
>
>
> We recently upgraded to 2.1.1 and saw below zookeeper connection issues which 
> took down the whole cluster. We've got 3 nodes in the cluster, 2 of which 
> threw below exceptions at the same second.
> 2019-04-03 08:25:19.603 [main-SendThread(host2:36100)] WARN 
> org.apache.zookeeper.ClientCnxn - Unable to reconnect to ZooKeeper service, 
> session 0x10071ff9baf0001 has expired
>  2019-04-03 08:25:19.603 [main-SendThread(host2:36100)] INFO 
> org.apache.zookeeper.ClientCnxn - Unable to reconnect to ZooKeeper service, 
> session 0x10071ff9baf0001 has expired, closing socket connection
>  2019-04-03 08:25:19.605 [main-EventThread] INFO 
> org.apache.zookeeper.ClientCnxn - EventThread shut down for session: 
> 0x10071ff9baf0001
>  2019-04-03 08:25:19.605 [zk-session-expiry-handler0] INFO 
> kafka.zookeeper.ZooKeeperClient - [ZooKeeperClient] Session expired.
>  2019-04-03 08:25:19.609 [zk-session-expiry-handler0] INFO 
> kafka.zookeeper.ZooKeeperClient - [ZooKeeperClient] Initializing a new 
> session to host1:36100,host2:36100,host3:36100.
>  2019-04-03 08:25:19.610 [zk-session-expiry-handler0] INFO 
> org.apache.zookeeper.ZooKeeper - Initiating client connection, 
> connectString=host1:36100,host2:36100,host3:36100 sessionTimeout=6000 
> watcher=kafka.zookeeper.ZooKeeperClient$ZooKeeperClientWatcher$@12f8b1d8
>  2019-04-03 08:25:19.610 [zk-session-expiry-handler0] INFO 
> o.apache.zookeeper.ClientCnxnSocket - jute.maxbuffer value is 4194304 Bytes
>  2019-04-03 08:25:19.611 [zk-session-expiry-handler0-SendThread(host1:36100)] 
> WARN org.apache.zookeeper.ClientCnxn - SASL configuration failed: 
> javax.security.auth.login.LoginException: No JAAS configuration section named 
> 'Client' was found in specified JAAS configuration file: 
> 'file:/app0/common/config/ldap-auth.config'. Will continue connection to 
> Zookeeper server without SASL authentication, if Zookeeper server allows it.
>  2019-04-03 08:25:19.611 [zk-session-expiry-handler0-SendThread(host1:36100)] 
> INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server 
> host1/169.30.47.206:36100
>  2019-04-03 08:25:19.611 [zk-session-expiry-handler0-EventThread] ERROR 
> kafka.zookeeper.ZooKeeperClient - [ZooKeeperClient] Auth failed.
>  2019-04-03 08:25:19.611 [zk-session-expiry-handler0-SendThread(host1:36100)] 
> INFO org.apache.zookeeper.ClientCnxn - Socket connection established, 
> initiating session, client: /169.20.222.18:56876, server: 
> host1/169.30.47.206:36100
>  2019-04-03 08:25:19.612 [controller-event-thread] INFO 
> k.controller.PartitionStateMachine - [PartitionStateMachine controllerId=3] 
> Stopped partition state machine
>  2019-04-03 08:25:19.613 [controller-event-thread] INFO 
> kafka.controller.ReplicaStateMachine - [ReplicaStateMachine controllerId=3] 
> Stopped replica state machine
>  2019-04-03 08:25:19.614 [controller-event-thread] INFO 
> kafka.controller.KafkaController - [Controller id=3] Resigned
>  2019-04-03 08:25:19.615 [controller-event-thread] INFO 
> kafka.zk.KafkaZkClient - Creating /brokers/ids/3 (is it secure? false)
>  2019-04-03 08:25:19.628 [zk-session-expiry-handler0-SendThread(host1:36100)] 
> INFO org.apache.zookeeper.ClientCnxn - Session establishment complete on 
> server host1/169.30.47.206:36100, sessionid = 0x1007f4d2b81, negotiated 
> timeout = 6000
>  2019-04-03 08:25:19.631 [/config/changes-event-process-thread] INFO 
> k.c.ZkNodeChangeNotificationListener - Processing notification(s) to 
> /config/changes
>  2019-04-03 08:25:19.637 [controller-event-thread] ERROR 
> k.zk.KafkaZkClient$CheckedEphemeral - Error while creating ephemeral at 
> /brokers/ids/3, node already exists and owner '72182936680464385' does not 
> match current session '72197563457011712'
>  2019-04-03 08:25:19.637 [controller-event-thread] INFO 
> kafka.zk.KafkaZkClient - Result of znode creation at /brokers/ids/3 is: 
> NODEEXISTS
>  2019-04-03 08:25:19.644 [controller-event-thread] ERROR 
> k.c.ControllerEventManager$ControllerEventThread - [ControllerEventThread 
> controllerId=3] Error processing event RegisterBrokerAndReelect
>  org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = 
> NodeExists
>  at org.apache.zookeeper.KeeperException.creat

[jira] [Created] (KAFKA-17746) Replace JavaConverters with CollectionConverters

2024-10-09 Thread Mickael Maison (Jira)
Mickael Maison created KAFKA-17746:
--

 Summary: Replace JavaConverters with CollectionConverters
 Key: KAFKA-17746
 URL: https://issues.apache.org/jira/browse/KAFKA-17746
 Project: Kafka
  Issue Type: Sub-task
Reporter: Mickael Maison


scala.collection.JavaConverters is deprecated, since Scala 2.13 we can use 
scala.jdk.CollectionConverters instead



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-17751) Controller high CPU when formatted with --initial-controllers

2024-10-09 Thread Juha Mynttinen (Jira)
Juha Mynttinen created KAFKA-17751:
--

 Summary: Controller high CPU when formatted with 
--initial-controllers 
 Key: KAFKA-17751
 URL: https://issues.apache.org/jira/browse/KAFKA-17751
 Project: Kafka
  Issue Type: Bug
Affects Versions: 3.9.0
Reporter: Juha Mynttinen
 Attachments: Screenshot 2024-10-09 at 9.15.06.png, c1.properties, 
c2.properties, c3.properties, c4.properties

Hey,

I'm using 3.9.0 RC0.

I noticed that formatting a simple three-node controller cluster with 
--initial-controllers and starting the controllers leads to a situation where 
the non-leader voters consume a lot of CPU.

Here are the steps to reproduce. The needed configuration files are attached.



Clean up and setup the environment.

rm -rf /tmp/controllers && \
mkdir -p /tmp/controllers/c1 && \
mkdir -p /tmp/controllers/c2 && \
mkdir -p /tmp/controllers/c3 && \
mkdir -p /tmp/controllers/c4

export KAFKA_HOME=

Format the controllers

$KAFKA_HOME/bin/kafka-storage.sh format --cluster-id 
----0001 --initial-controllers 
1001@localhost:10001:AAEAAA,1002@localhost:10002:AAEAAA,1003@localhost:10003:AAEAAA
 --config c1.properties
$KAFKA_HOME/bin/kafka-storage.sh format --cluster-id 
----0001 --initial-controllers 
1001@localhost:10001:AAEAAA,1002@localhost:10002:AAEAAA,1003@localhost:10003:AAEAAA
 --config c2.properties
$KAFKA_HOME/bin/kafka-storage.sh format --cluster-id 
----0001 --initial-controllers 
1001@localhost:10001:AAEAAA,1002@localhost:10002:AAEAAA,1003@localhost:10003:AAEAAA
 --config c3.properties

Start the controllers, in separate terminals

$KAFKA_HOME/bin/kafka-run-class.sh -name kafkaService kafka.Kafka c1.properties
$KAFKA_HOME/bin/kafka-run-class.sh -name kafkaService kafka.Kafka c2.properties
$KAFKA_HOME/bin/kafka-run-class.sh -name kafkaService kafka.Kafka c3.properties

Observe that two of the controllers have CPU usage at 100%. If you check which PID 
is which, you can see that it's the two non-leader voter processes that have 
elevated CPU. The CPU usage of the leader is fine.

I did some profiling in a slightly different environment. The screenshot is 
attached.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-17702) KIP 919 support describeMetadataQuorum for controller

2024-10-09 Thread TaiJuWu (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17702?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

TaiJuWu resolved KAFKA-17702.
-
Resolution: Invalid

> KIP 919 support describeMetadataQuorum for controller
> -
>
> Key: KAFKA-17702
> URL: https://issues.apache.org/jira/browse/KAFKA-17702
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: TaiJuWu
>Assignee: TaiJuWu
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-17737) E2E tests need to drop Kafka versions prior to 1.0.0

2024-10-09 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17737?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai resolved KAFKA-17737.

Fix Version/s: 4.0.0
   Resolution: Fixed

> E2E tests need to drop Kafka versions prior to 1.0.0
> 
>
> Key: KAFKA-17737
> URL: https://issues.apache.org/jira/browse/KAFKA-17737
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Chia-Ping Tsai
>Assignee: Jhen Yung Hsu
>Priority: Major
> Fix For: 4.0.0
>
>
> There are three reasons for removing older Kafka versions from the 
> compatibility tests:
> 1. Kafka versions prior to 1.0.0 can't run under JDK 11 (KAFKA-5076).
> 2. The old protocol will be removed by KAFKA-16096.
> 3. The base image needs to be upgraded to JDK 11, as JDK 8 will be dropped 
> (KAFKA-12894)."
> The following E2E tests need to drop Kafka versions prior to 1.0.0.
> 1. tests/kafkatest/tests/streams/streams_broker_compatibility_test.py 
> 2. tests/kafkatest/tests/client/client_compatibility_produce_consume_test.py 
> 3. tests/kafkatest/tests/client/client_compatibility_features_test.py 
> 4. 
> tests/kafkatest/tests/connect/connect_distributed_test.py::ConnectDistributedTest.test_broker_compatibility
> Please note that this JIRA only removes broker-related tests. The old client 
> code can still run under JDK 11.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Jenkins build is still unstable: Kafka » Kafka PowerPC Daily » test-powerpc #82

2024-10-09 Thread Apache Jenkins Server
See 




[jira] [Created] (KAFKA-17757) cleanup code base under JDK 11

2024-10-09 Thread Chia-Ping Tsai (Jira)
Chia-Ping Tsai created KAFKA-17757:
--

 Summary: cleanup code base under JDK 11
 Key: KAFKA-17757
 URL: https://issues.apache.org/jira/browse/KAFKA-17757
 Project: Kafka
  Issue Type: Improvement
Reporter: Chia-Ping Tsai
Assignee: Chia-Ping Tsai


We can leverage Java 11 features instead of relying on many small utility 
functions.
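
A rough sketch of the kind of cleanup this enables; the "helper" names mentioned 
in the comments are hypothetical, not actual Kafka utilities:

{code:java}
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class Java11CleanupSketch {
    public static void main(String[] args) {
        // Java 9: immutable collection factories instead of hand-rolled builders
        List<String> names = List.of("broker-1", "", "  ", "broker-2");

        // Java 11: String.isBlank + Predicate.not instead of a small "isNotBlank"-style helper
        List<String> nonBlank = names.stream()
                .filter(Predicate.not(String::isBlank))
                .collect(Collectors.toList());

        // Java 11: Collection.toArray(IntFunction) instead of the toArray(new String[0]) idiom
        String[] asArray = nonBlank.toArray(String[]::new);

        // Java 10: 'var' for locals whose type is obvious
        var joined = String.join(",", asArray);
        System.out.println(joined); // broker-1,broker-2
    }
}
{code}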



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-17758) Remove Utils.mkMap and Utils.mkEntry

2024-10-09 Thread Chia-Ping Tsai (Jira)
Chia-Ping Tsai created KAFKA-17758:
--

 Summary: Remove Utils.mkMap and Utils.mkEntry
 Key: KAFKA-17758
 URL: https://issues.apache.org/jira/browse/KAFKA-17758
 Project: Kafka
  Issue Type: Sub-task
Reporter: Chia-Ping Tsai
Assignee: Chia-Ping Tsai


They can be replaced by Map.entry and Map.ofEntries
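
A minimal sketch of the intended replacement, assuming the helpers keep their 
current home in org.apache.kafka.common.utils.Utils:

{code:java}
import java.util.Map;

import static org.apache.kafka.common.utils.Utils.mkEntry;
import static org.apache.kafka.common.utils.Utils.mkMap;

public class MkMapMigrationSketch {
    public static void main(String[] args) {
        // Before: Kafka's own helpers
        Map<String, Integer> before = mkMap(mkEntry("a", 1), mkEntry("b", 2));

        // After: standard-library factories available since Java 9
        Map<String, Integer> after = Map.ofEntries(Map.entry("a", 1), Map.entry("b", 2));

        System.out.println(before.equals(after)); // true
    }
}
{code}

One caveat: Map.ofEntries returns an immutable map and rejects null keys and 
values, so call sites that mutate the result or rely on nulls will need a 
different replacement.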



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-17739) Clean up build.gradle to adopt the minimum Java version as 11.

2024-10-09 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai resolved KAFKA-17739.

Fix Version/s: 4.0.0
   Resolution: Fixed

> Clean up build.gradle to adopt the minimum Java version as 11.
> --
>
> Key: KAFKA-17739
> URL: https://issues.apache.org/jira/browse/KAFKA-17739
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Chia-Ping Tsai
>Assignee: TengYao Chi
>Priority: Minor
> Fix For: 4.0.0
>
>
> 1. minJavaVersion=11
> 2. remove sourceCompatibility and targetCompatibility
> 3. cleanup all compatibility checks



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-16727) Add dynamic group configuration for record lock duration

2024-10-09 Thread Chirag Wadhwa (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chirag Wadhwa resolved KAFKA-16727.
---
Resolution: Fixed

> Add dynamic group configuration for record lock duration
> 
>
> Key: KAFKA-16727
> URL: https://issues.apache.org/jira/browse/KAFKA-16727
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Andrew Schofield
>Assignee: Chirag Wadhwa
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-17532) Restructure Share Group Configs

2024-10-09 Thread Chirag Wadhwa (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chirag Wadhwa resolved KAFKA-17532.
---
Resolution: Fixed

> Restructure Share Group Configs
> ---
>
> Key: KAFKA-17532
> URL: https://issues.apache.org/jira/browse/KAFKA-17532
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Chirag Wadhwa
>Assignee: Chirag Wadhwa
>Priority: Major
>
> Move ShareGroupConfig to share module



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-17756) Add dynamic share group configurations group.share.heartbeat.interval.ms and group.share.session.timeout.ms

2024-10-09 Thread Chirag Wadhwa (Jira)
Chirag Wadhwa created KAFKA-17756:
-

 Summary: Add dynamic share group configurations 
group.share.heartbeat.interval.ms and group.share.session.timeout.ms
 Key: KAFKA-17756
 URL: https://issues.apache.org/jira/browse/KAFKA-17756
 Project: Kafka
  Issue Type: Sub-task
Reporter: Chirag Wadhwa
Assignee: Chirag Wadhwa






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-17738) upgrade base image from jdk8 to jdk11

2024-10-09 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17738?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai resolved KAFKA-17738.

Fix Version/s: 4.0.0
   Resolution: Fixed

> upgrade base image from jdk8 to jdk11
> -
>
> Key: KAFKA-17738
> URL: https://issues.apache.org/jira/browse/KAFKA-17738
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Chia-Ping Tsai
>Assignee: Jhen Yung Hsu
>Priority: Major
> Fix For: 4.0.0
>
>
> as title



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-17387) Remove broker-list in VerifiableConsumer

2024-10-09 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai resolved KAFKA-17387.

Fix Version/s: 4.0.0
   Resolution: Fixed

> Remove broker-list in VerifiableConsumer
> 
>
> Key: KAFKA-17387
> URL: https://issues.apache.org/jira/browse/KAFKA-17387
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: PoAn Yang
>Assignee: PoAn Yang
>Priority: Minor
> Fix For: 4.0.0
>
>
> The 
> [broker-list|https://github.com/apache/kafka/blob/944c1353a925858ea9bd9024a713cd7301f55133/tools/src/main/java/org/apache/kafka/tools/VerifiableConsumer.java#L522-L528]
>  is a deprecated option in VerifiableConsumer. We can consider removing it in 
> 4.0.
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-499+-+Unify+connection+name+flag+for+command+line+tool



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (KAFKA-17384) Remove deprecated options in tools

2024-10-09 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-17384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai resolved KAFKA-17384.

Fix Version/s: 4.0.0
   Resolution: Fixed

> Remove deprecated options in tools
> --
>
> Key: KAFKA-17384
> URL: https://issues.apache.org/jira/browse/KAFKA-17384
> Project: Kafka
>  Issue Type: Improvement
>Reporter: PoAn Yang
>Assignee: PoAn Yang
>Priority: Minor
> Fix For: 4.0.0
>
>
> There are deprecated options in the following tools. We can consider removing 
> them in 4.0.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Jenkins build is still unstable: Kafka » Kafka Branch Builder » 3.9 #85

2024-10-09 Thread Apache Jenkins Server
See 




[jira] [Created] (KAFKA-17755) AbstractPartitionAssignor can not enable Rack Aware Assignment

2024-10-09 Thread Jerry Cai (Jira)
Jerry Cai created KAFKA-17755:
-

 Summary: AbstractPartitionAssignor can not enable Rack Aware 
Assignment 
 Key: KAFKA-17755
 URL: https://issues.apache.org/jira/browse/KAFKA-17755
 Project: Kafka
  Issue Type: Bug
  Components: consumer
Affects Versions: 3.8.0
Reporter: Jerry Cai


During local testing and debugging, I noticed that the logic below is incorrect; 
it needs to change

from   !racksPerPartition.values().stream().allMatch(partitionRacks::equals)
to  racksPerPartition.values().stream().allMatch(partitionRacks::equals)

Current logic:
{code:java}
protected boolean useRackAwareAssignment(Set<String> consumerRacks,
                                         Set<String> partitionRacks,
                                         Map<TopicPartition, Set<String>> racksPerPartition) {
    if (consumerRacks.isEmpty() || Collections.disjoint(consumerRacks, partitionRacks))
        return false;
    else if (preferRackAwareLogic)
        return true;
    else {
        return !racksPerPartition.values().stream().allMatch(partitionRacks::equals);
    }
}
{code}

Expected logic:
{code:java}
protected boolean useRackAwareAssignment(Set<String> consumerRacks,
                                         Set<String> partitionRacks,
                                         Map<TopicPartition, Set<String>> racksPerPartition) {
    if (consumerRacks.isEmpty() || Collections.disjoint(consumerRacks, partitionRacks))
        return false;
    else if (preferRackAwareLogic)
        return true;
    else {
        return racksPerPartition.values().stream().allMatch(partitionRacks::equals);
    }
}
{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-17754) A delayed EndTxn message can cause aborted read, lost writes, atomicity violation

2024-10-09 Thread Kyle Kingsbury (Jira)
Kyle Kingsbury created KAFKA-17754:
--

 Summary: A delayed EndTxn message can cause aborted read, lost 
writes, atomicity violation
 Key: KAFKA-17754
 URL: https://issues.apache.org/jira/browse/KAFKA-17754
 Project: Kafka
  Issue Type: Bug
  Components: clients, producer , protocol
Affects Versions: 3.8.0
Reporter: Kyle Kingsbury
 Attachments: g1a-trace.svg

In short: I believe that both an internal retry mechanism and guidance from the 
example code and client API docs are independently capable of causing committed 
transactions to actually abort, aborted transactions to actually commit, and 
transactions to be split into multiple parts with different fates. A delayed 
EndTxn message could arrive seconds later and decide the fate of an unrelated 
transaction.

Consider the attached Lamport diagram, reconstructed from node logs and packet 
captures in a recent Jepsen test. In it, a single process, using a single 
producer and consumer, executes a series of transactions which all commit or 
abort cleanly. Process 76 selected the unique transactional ID `jt1234` on 
initialization.

From packet captures and debug logs, we see `jt1234` used producer ID `233`, 
submitted all four operations, then sent an EndTxn message with `committed = 
false`, which denotes a transaction abort. However, fifteen separate calls to 
`poll` observed this transaction's write of `424` to key `5`---an obvious case 
of aborted read (G1a). Even stranger, *no* poller observed the other writes 
from this transaction: key `17` apparently never received values `926` or `927`. 
Why?

Close inspection of the packet capture, combined with Bufstream's logs, allowed 
us to reconstruct what happened. Process 76 began a transaction which sent 
`1018` to key `15`. It sent an `EndTxn` message to commit that transaction to 
node `n3`. However, it did not receive a prompt response. The client then 
quietly sent a *second* commit message to `n4`, which returned successfully; 
the test harness's call to `commitTransaction` completed successfully. The 
process then performed and intentionally aborted a second transaction; this
completed OK. So far, so good.

Then process 76 began our problematic transaction. It sent `424` to key `5`, 
and added new partitions to the transaction. Just after accepting record `424`, 
node `n3` received the delayed commit message from two transactions previously. 
It committed the current transaction, effectively chopping it in half. The 
first half (record `424`) was committed and visible to pollers. The second 
half, sending `926` and `927` to key `17`, implicitly began a second 
transaction, which was aborted by the client.

This suggests a fundamental problem in the Kafka transaction protocol. The 
protocol is [intentionally designed](https://kafka.apache.org/protocol) to 
allow clients to submit requests over multiple TCP connections and to 
distribute them across multiple nodes. There is no sequence number to order 
requests from the same client. There is no concept of a transaction number. 
When a server receives a commit (or abort) message, it has no way to know what 
transaction the client intended to commit. It simply commits or aborts whatever
transaction happens to be in progress.

This means transactions which appeared to commit could actually abort, and vice 
versa: we observed both aborted reads and lost writes. It also means 
transactions could get chopped into smaller pieces: one could lose some, but 
not all, of a transaction's effects.

What does it take to get this behavior? First, an `EndTxn` message must be 
delayed---for instance due to network latency, packet loss, a slow computer, 
garbage collection, etc. Second, while that `EndTxn` arrow is hovering in the 
air, the client needs to move on to perform a second transaction using the same 
producer ID and epoch. There are several ways this could happen.

First, users could explicitly retry committing or aborting a transaction. The 
docs say they can, and the client won't stop them.

Second, the official Kafka Java client docs repeatedly instruct users to call 
`abortTransaction` if an error occurs during 
`commitTransaction`.[^abort-exceptions] The provided example code leads 
directly to this behavior: if `commitTransaction` times out, it calls 
`abortTransaction`, and voilà: the client can move on to later operations. The 
only exceptions in the docs are `ProducerFencedException`, 
`OutOfOrderSequenceException`, and `AuthorizationException`, none of which 
apply here.
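
For concreteness, here is a minimal sketch of the pattern the docs and example 
code suggest; the configs, topic, and record are placeholders, not the test 
harness's actual code:

```java
import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.KafkaException;
import org.apache.kafka.common.errors.AuthorizationException;
import org.apache.kafka.common.errors.OutOfOrderSequenceException;
import org.apache.kafka.common.errors.ProducerFencedException;

public class CommitThenAbortSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder
        props.put("transactional.id", "jt1234");          // placeholder
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.initTransactions();
            producer.beginTransaction();
            producer.send(new ProducerRecord<>("my-topic", "5", "424"));
            try {
                producer.commitTransaction(); // may time out while the EndTxn is still in flight
            } catch (ProducerFencedException | OutOfOrderSequenceException | AuthorizationException e) {
                throw e; // the docs treat these as fatal; close the producer and stop
            } catch (KafkaException e) {
                producer.abortTransaction();  // the documented fallback; the client then moves on,
                                              // even though the earlier commit may still land later
            }
        }
    }
}
```

Nothing in this sketch ties the `abortTransaction` call to the specific commit 
that timed out, which is exactly the gap described above.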

I've tried to avoid this problem by ensuring that transactions either commit 
once, or abort once, never both. Sadly, this doesn't work. Indeed, process 76 
in this test run *never* tried to abort a transaction after calling 
commit---and even though it only calls `commitTransaction` once, it sent *two* 
commit messages to two diff