Jenkins build is unstable: Kafka » Kafka Branch Builder » 3.7 #82

2024-01-31 Thread Apache Jenkins Server




Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #2606

2024-01-31 Thread Apache Jenkins Server




Jenkins build is still unstable: Kafka » Kafka Branch Builder » 3.6 #141

2024-01-31 Thread Apache Jenkins Server




[jira] [Created] (KAFKA-16212) Cache partitions by TopicIdPartition instead of TopicPartition

2024-01-31 Thread Gaurav Narula (Jira)
Gaurav Narula created KAFKA-16212:
-

 Summary: Cache partitions by TopicIdPartition instead of 
TopicPartition
 Key: KAFKA-16212
 URL: https://issues.apache.org/jira/browse/KAFKA-16212
 Project: Kafka
  Issue Type: Improvement
Affects Versions: 3.7.0
Reporter: Gaurav Narula


From the discussion in [PR 
15263|https://github.com/apache/kafka/pull/15263#discussion_r1471075201], it 
would be better to cache {{allPartitions}} by {{TopicIdPartition}} instead of 
{{TopicPartition}} to avoid ambiguity.
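To illustrate the ambiguity, here is a minimal sketch in plain Java. The nested record is a hypothetical stand-in for Kafka's real TopicIdPartition class: two partitions that share a topic name (e.g. after a topic is deleted and recreated) would collide under a TopicPartition key, but stay distinct once the topic id is part of the key.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

public class PartitionCacheSketch {
    // Hypothetical stand-in for Kafka's TopicIdPartition: topic id + name + partition.
    record TopicIdPartition(UUID topicId, String topic, int partition) {}

    public static void main(String[] args) {
        Map<TopicIdPartition, String> allPartitions = new HashMap<>();
        UUID oldId = UUID.randomUUID();  // topic "orders" before deletion
        UUID newId = UUID.randomUUID();  // "orders" recreated with a fresh id
        allPartitions.put(new TopicIdPartition(oldId, "orders", 0), "stale-replica");
        allPartitions.put(new TopicIdPartition(newId, "orders", 0), "current-replica");
        // Keyed by a plain TopicPartition ("orders", 0), these two entries would
        // collide; the topic id in the key keeps them distinct.
        System.out.println(allPartitions.size()); // prints 2
    }
}
```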



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #2607

2024-01-31 Thread Apache Jenkins Server




[jira] [Created] (KAFKA-16213) KRaft should honor voterId outside of VotedState

2024-01-31 Thread Jira
José Armando García Sancio created KAFKA-16213:
--

 Summary: KRaft should honor voterId outside of VotedState
 Key: KAFKA-16213
 URL: https://issues.apache.org/jira/browse/KAFKA-16213
 Project: Kafka
  Issue Type: Improvement
  Components: kraft
Reporter: José Armando García Sancio
Assignee: José Armando García Sancio


The current implementation of KRaft only stores the id of the replica it voted 
for while it is in the VotedState. When it transitions to other states like 
Follower, Leader, Resigned, etc., it doesn't continue to remember and persist 
the id of the replica it voted for.
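A minimal sketch of the idea, with invented names (Phase, State, and transitionToFollower are illustrative only, not Kafka's actual QuorumState API): the voted-for id is carried into the next state instead of being dropped on transition.

```java
import java.util.Optional;

public class QuorumStateSketch {
    // Illustrative only; not Kafka's real quorum-state machinery.
    enum Phase { UNATTACHED, VOTED, FOLLOWER, LEADER, RESIGNED }

    record State(int epoch, Phase phase, Optional<Integer> votedId) {
        // The fix the ticket suggests: carry votedId into the new state
        // rather than resetting it when leaving VotedState.
        State transitionToFollower(int leaderId) {
            return new State(epoch, Phase.FOLLOWER, votedId);
        }
    }

    public static void main(String[] args) {
        State voted = new State(5, Phase.VOTED, Optional.of(2));
        State follower = voted.transitionToFollower(1);
        System.out.println(follower.votedId().orElse(-1)); // prints 2
    }
}
```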





RE: DISCUSS KIP-984 Add pluggable compression interface to Kafka

2024-01-31 Thread Diop, Assane
Hi Divij, 
Thank you for your response!
  
Although compression is not a new problem, it has continued to be an important 
research topic.
The integration and testing of new compression algorithms into Kafka currently 
requires significant code changes and rebuilding of the distribution package 
for Kafka. 
This KIP will allow any compression algorithm to be seamlessly integrated 
into Kafka by writing a plugin that binds into the wrapForInput and 
wrapForOutput methods in Kafka.
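As a rough sketch of what such a plugin could look like (the interface below is hypothetical; Kafka's internal wrapForInput/wrapForOutput have different, richer signatures), here is a gzip-backed implementation wrapping plain Java streams:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical plugin SPI; the method names echo Kafka's internal
// wrapForInput/wrapForOutput, but the signatures are simplified here.
interface CompressionPlugin {
    OutputStream wrapForOutput(OutputStream out) throws IOException;
    InputStream wrapForInput(InputStream in) throws IOException;
}

public class GzipPluginSketch implements CompressionPlugin {
    @Override
    public OutputStream wrapForOutput(OutputStream out) throws IOException {
        return new GZIPOutputStream(out);
    }

    @Override
    public InputStream wrapForInput(InputStream in) throws IOException {
        return new GZIPInputStream(in);
    }

    public static void main(String[] args) throws IOException {
        CompressionPlugin plugin = new GzipPluginSketch();
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (OutputStream out = plugin.wrapForOutput(buf)) {
            out.write("hello kafka".getBytes(StandardCharsets.UTF_8));
        }
        try (InputStream in = plugin.wrapForInput(new ByteArrayInputStream(buf.toByteArray()))) {
            System.out.println(new String(in.readAllBytes(), StandardCharsets.UTF_8)); // prints "hello kafka"
        }
    }
}
```

A Brotli or hardware-accelerated plugin would implement the same two wrap methods, which is the flexibility the KIP is after.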

As you mentioned, Kafka currently supports zstd, snappy, gzip and lz4. However, 
other open-source compression projects like the Brotli algorithm are also 
gaining traction. For example, the HTTP servers Apache and nginx offer Brotli 
compression as an option. With a pluggable interface, any Kafka developer could 
integrate and test Brotli with Kafka simply by writing a plugin. This same 
motivation can be applied to any other compression algorithm including hardware 
accelerated compression. There are hardware companies including Intel and AMD 
that are working on accelerating compression. 

This KIP would certainly complement the current 
https://issues.apache.org/jira/browse/KAFKA-7632 by adding even more 
flexibility for the users. 
A plugin could be tailored to arbitrary datasets in response to a user's 
specific resource requirements. 
 
For reference, other open-source projects have already started or implemented 
this type of plugin technology, such as: 
1. Cassandra, which has implemented the same concept of pluggable 
interface. 
2. OpenSearch is also working on enabling the same type of plugin 
framework.
 
With respect to message recompression, the plugin interface would handle this 
use case on the broker side similar to the current recompression process. 
 
Assane  

-Original Message-
From: Divij Vaidya  
Sent: Friday, December 22, 2023 2:27 AM
To: dev@kafka.apache.org
Subject: Re: DISCUSS KIP-984 Add pluggable compression interface to Kafka

Thank you for writing the KIP Assane.

In general, exposing a "pluggable" interface is not a decision made lightly 
because it limits our ability to remove / change that interface in future.
Any future changes to the interface will have to remain compatible with 
existing plugins which limits the flexibility of changes we can make inside 
Kafka. Hence, we need a strong motivation for adding a pluggable interface.

1\ May I ask the motivation for this KIP? Are the current compression codecs 
(zstd, gzip, lz4, snappy) not sufficient for your use case? Would providing fine 
grained compression options as proposed in
https://issues.apache.org/jira/browse/KAFKA-7632 and 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-390%3A+Support+Compression+Level
address your use case?
2\ "This option impacts the following processes" -> This should also include 
the decompression and compression that occur during message version 
transformation, i.e. when a client sends a message in V1 and the broker expects 
V2, we convert the message and recompress it.

--
Divij Vaidya



On Mon, Dec 18, 2023 at 7:22 PM Diop, Assane  wrote:

> I would like to bring some attention to this KIP. We have added an 
> interface to the compression code that allows anyone to build their own 
> compression plugin and integrate it easily back into Kafka.
>
> Assane
>
> -Original Message-
> From: Diop, Assane 
> Sent: Monday, October 2, 2023 9:27 AM
> To: dev@kafka.apache.org
> Subject: DISCUSS KIP-984 Add pluggable compression interface to Kafka
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-984%3A+Add+plugg
> able+compression+interface+to+Kafka
>


Re: [DISCUSS] KIP-1018: Introduce max remote fetch timeout config

2024-01-31 Thread Jorge Esteban Quilcate Otoya
Hi Kamal,

Thanks for this KIP! It should help to solve one of the main issues with
tiered storage at the moment, which is dealing with individual consumer
configurations to avoid flooding logs with interrupted exceptions.

One of the topics discussed in [1][2] was the semantics of
`fetch.max.wait.ms` and how it's affected by remote storage. Should we
consider within this KIP updating the `fetch.max.wait.ms` docs to clarify
that it only applies to local storage?

Otherwise, LGTM -- looking forward to seeing this KIP adopted.

[1] https://issues.apache.org/jira/browse/KAFKA-15776
[2] https://github.com/apache/kafka/pull/14778#issuecomment-1820588080

On Tue, 30 Jan 2024 at 01:01, Kamal Chandraprakash <
kamal.chandraprak...@gmail.com> wrote:

> Hi all,
>
> I have opened a KIP-1018
> <
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1018%3A+Introduce+max+remote+fetch+timeout+config+for+DelayedRemoteFetch+requests
> >
> to introduce dynamic max-remote-fetch-timeout broker config to give more
> control to the operator.
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1018%3A+Introduce+max+remote+fetch+timeout+config+for+DelayedRemoteFetch+requests
>
> Let me know if you have any feedback or suggestions.
>
> --
> Kamal
>


Re: [DISCUSS] KIP-974 Docker Image for GraalVM based Native Kafka Broker

2024-01-31 Thread Justine Olshan
Hey Krishna,

Can we include the perf results between the distroless/alpine/ubuntu images
in the KIP?

I also noticed

   - Alpine employs the apk package manager, which, being relatively less
   popular, may pose challenges in the future. There's a potential risk that
   certain libraries we might need could lack support from apk


Is this a concern we would have with the other images?

Thanks,
Justine

On Tue, Dec 12, 2023 at 9:34 AM Krishna Agarwal <
krishna0608agar...@gmail.com> wrote:

> Hi Ismael,
> Would you happen to have any remaining concerns regarding the selection of
> the base Docker image?
> Alternatively, do you have any additional suggestions or insights?
>
> Regards,
> Krishna
>
>
> On Fri, Nov 24, 2023 at 1:16 AM Krishna Agarwal <
> krishna0608agar...@gmail.com> wrote:
>
> > Hi Ismael,
> >
> > In my pursuit of a lightweight base image, I initially considered Alpine
> > and Distroless
> >
> >1. The next best option I explored is the Ubuntu Docker image(
> >https://hub.docker.com/_/ubuntu/tags) which is a more complete image.
> >It has a size of 70MB compared to the 15MB of the Alpine image
> >(post-installation of glibc and bash), resulting in a difference of
> 55MB.
> >2. To assess performance, I executed produce/consume performance
> >scripts on the Kafka native Docker image using both Alpine and
> Ubuntu, and
> >the results indicated comparable performance between the two.
> >
> > I wanted to check if there's any other image you'd like me to assess for
> > consideration. Your input would be greatly appreciated.
> >
> > Regards,
> > Krishna
> >
> > On Thu, Nov 23, 2023 at 2:31 AM Ismael Juma  wrote:
> >
> >> Hi Krishna,
> >>
> >> I am still finding it difficult to evaluate this choice. A couple of
> >> things
> >> would help:
> >>
> >> 1. How much smaller is the alpine image compared to the best
> alternative?
> >> 2. Is there any performance impact of going with Alpine?
> >>
> >> Ismael
> >>
> >>
> >> On Wed, Nov 22, 2023, 8:42 AM Krishna Agarwal <
> >> krishna0608agar...@gmail.com>
> >> wrote:
> >>
> >> > Hi Ismael,
> >> > Thanks for the feedback.
> >> >
> >> > The alpine image does present a few drawbacks, such as the use of musl
> >> libc
> >> > instead of glibc, the absence of bash, and reliance on the less
> popular
> >> > package manager "apk". Considering the advantage of a smaller image
> size
> >> > and installing the missing packages(glibc and bash), I have proposed
> the
> >> > alpine image as the base image. Let me know if you have any
> suggestions.
> >> > I have added a detailed section for the same in the KIP.
> >> >
> >> > Regards,
> >> > Krishna
> >> >
> >> > On Wed, Nov 22, 2023 at 8:08 PM Ismael Juma 
> wrote:
> >> >
> >> > > Hi,
> >> > >
> >> > > One question I have is regarding the choice to use alpine - it would
> >> be
> >> > > good to clarify if there are downsides (the upside was explained -
> >> images
> >> > > are smaller).
> >> > >
> >> > > Ismael
> >> > >
> >> > > On Fri, Sep 8, 2023, 12:17 AM Krishna Agarwal <
> >> > > krishna0608agar...@gmail.com>
> >> > > wrote:
> >> > >
> >> > > > Hi,
> >> > > > I want to submit a KIP to deliver an experimental Apache Kafka
> >> docker
> >> > > > image.
> >> > > > The proposed docker image can launch brokers with sub-second
> startup
> >> > time
> >> > > > and minimal memory footprint by leveraging a GraalVM based native
> >> Kafka
> >> > > > binary.
> >> > > >
> >> > > > KIP-974: Docker Image for GraalVM based Native Kafka Broker
> >> > > > <
> >> > > >
> >> > >
> >> >
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-974%3A+Docker+Image+for+GraalVM+based+Native+Kafka+Broker
> >> > > > >
> >> > > >
> >> > > > Regards,
> >> > > > Krishna
> >> > > >
> >> > >
> >> >
> >>
> >
>


Re: Kafka-Streams-Scala for Scala 3

2024-01-31 Thread Matthias J. Sax
Thanks for raising this. The `kafka-streams-scala` module seems to be an 
important feature for Kafka Streams and I am generally in favor of your 
proposal to add Scala 3 support. However, I am personally no Scala 
person and it sounds like quite some overhead.


If you are willing to drive and own this initiative, I'm happy to support you 
to the extent I can.


About the concrete proposal: my understanding is that :core will move 
off Scala long-term (not 100% sure what the timeline is, but new modules 
are written in Java only). Thus, down the road the compatibility issue 
would go away naturally, but it's unclear when.


Thus, if we can test kafka-streams-scala_3 with core_2.13, it seems we 
could add support for Scala 3 now, taking the risk that it might break in 
the future if the migration off Scala in core is not fast enough.


For proposal (2), I don't think that it would be easily possible for 
unit/integration tests. We could fall back to system tests, but they 
would be much more heavyweight of course.


Might be good to hear from others. We might actually also want to do a 
KIP for this?



-Matthias

On 1/20/24 10:34 AM, Matthias Berndt wrote:

Hey there,

I'd like to discuss a Scala 3 port of the kafka-streams-scala library.
Currently, the build system is set up such that kafka-streams-scala
and core (i. e. kafka itself) are compiled with the same Scala
compiler versions. This is not an optimal situation because it means
that a Scala 3 release of kafka-streams-scala cannot happen
independently of kafka itself. I think this should be changed

The production codebase of kafka-streams-scala actually compiles just
fine on Scala 3.3.1 with two lines of trivial syntax changes. The
problem is with the tests. These use the `EmbeddedKafkaCluster` class,
which means that kafka is pulled into the classpath, potentially
leading to binary compatibility issues.
I can see several approaches to fixing this:

1. Run the kafka-streams-scala tests using the compatible version of
:core if one is available. Currently, this means that everything can
be tested (test kafka-streams-scala_2.12 using core_2.12,
kafka-streams-scala_2.13 using core_2.13 and kafka-streams-scala_3
using core_2.13, as these should be compatible), but when a new
scala-library version is released that is no longer compatible with
2.13, we won't be able to test that.
2. Rewrite the tests to run without EmbeddedKafkaCluster, instead
running the test cluster in a separate JVM or perhaps even a
container.

I'd be willing to get my hands dirty working on this, but before I
start I'd like to get some feedback from the Kafka team regarding the
approaches outlined above.

All the best
Matthias Berndt


ZK vs KRaft benchmarking - latency differences?

2024-01-31 Thread Brebner, Paul
Hi all,

We’ve previously done some benchmarking of Kafka ZooKeeper vs KRaft and found 
no difference in throughput (which we believed is also what theory predicted, 
as ZK/KRaft are only involved in Kafka meta-data operations, not data 
workloads).

BUT – latest tests reveal improved producer and consumer latency for KRaft 
compared to ZooKeeper.  So I just wanted to check whether KRaft is actually 
involved in any aspect of write/read workloads? For example, some documentation 
(possibly old) suggests that consumer offsets are stored in meta-data?  In 
which case this could explain the better KRaft latencies. But if not, then I’m 
curious to understand the difference (and whether it’s documented anywhere?)

Also if anyone has noticed the same regarding latency in benchmarks.

Regards, Paul Brebner


[jira] [Created] (KAFKA-16214) No user info when SASL authentication failure

2024-01-31 Thread Luke Chen (Jira)
Luke Chen created KAFKA-16214:
-

 Summary: No user info when SASL authentication failure
 Key: KAFKA-16214
 URL: https://issues.apache.org/jira/browse/KAFKA-16214
 Project: Kafka
  Issue Type: Bug
Affects Versions: 3.6.0
Reporter: Luke Chen
Assignee: Luke Chen


When client authentication fails, the server logs only the client IP address. 
The IP address sometimes cannot represent a specific user, especially 
if there is a proxy between client and server. Ex:


{code:java}
INFO [SocketServer listenerType=ZK_BROKER, nodeId=0] Failed authentication with 
/127.0.0.1 (channelId=127.0.0.1:9093-127.0.0.1:53223-5) (Authentication failed: 
Invalid username or password) (org.apache.kafka.common.network.Selector)
{code}


If there are many failed authentication logs appearing on the server, it'd be 
better to quickly identify who is triggering them. Adding the client info to 
the log is a good start. 
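For illustration only, a sketch of what an enriched failure line could look like. The user= field name and the formatter below are assumptions for this example, not an actual Kafka change; the rest of the shape mirrors the existing Selector log line.

```java
public class AuthLogSketch {
    // Hypothetical formatter: same shape as the existing Selector log line,
    // with an extra user=... field (the field name is an assumption).
    static String failureMessage(String ip, String channelId, String user, String reason) {
        return String.format("Failed authentication with /%s (channelId=%s) (user=%s) (%s)",
                ip, channelId, user, reason);
    }

    public static void main(String[] args) {
        System.out.println(failureMessage("127.0.0.1",
                "127.0.0.1:9093-127.0.0.1:53223-5", "alice",
                "Authentication failed: Invalid username or password"));
    }
}
```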


