Re: [DISCUSS] KIP-832 Allow creating a producer/consumer using a producer/consumer config

2022-05-12 Thread Bruno Cadonna

Hi Francois,

Modifying this KIP or starting a new one is your call.

I guess the idea of the builder might cause some more discussions. So if 
you want to get a working version released as soon as possible, I would 
opt to get the current KIP implemented and create a new one for the 
builder. If time is not an issue I would modify the current KIP.


But as I said: It's your call.

Best,
Bruno

On 11.05.22 11:01, François Rosière wrote:

To be clear, there is no problem for me to update the current KIP with the
builder approach.
It's not a lot of work in terms of code.
So, up to you. Let me know and I will do what is needed to go in one or the
other direction...
Thanks again for the feedback.

On Wed, 11 May 2022 at 10:52, François Rosière wrote:


Hello,

Builder is clearly the way to go for future releases of Kafka.

If we align streams, we would have 3 builders

new ConsumerBuilder()
.withKeyDeserializer()
.withValueDeserializer()
.withInterceptors()
.withMetricsReporter()
.build();

new ProducerBuilder()
.withKeySerializer()
.withValueSerializer()
.withInterceptors()
.withPartitioner()
.withMetricsReporter()
.build();

new KafkaStreamsBuilder()
.withProducerInterceptors()
.withConsumerInterceptors()
.withTime()
.withKafkaClientSupplier()
.withMetricsReporter()
.build();

The builder property would always override the configuration "instances".
There may be other methods to add to the builders...
The map, properties or config could be given to the constructor of the
builder instead of the build method...
In the end, we may keep only one single constructor in the Producer,
Consumer and KafkaStreams objects.

@Chris, @Bruno, thank you for your replies and proposals. Do you want me to
create another KIP explaining the builder approach, or do you prefer to do
it?

Kr,

F.


On Wed, 11 May 2022 at 09:46, Bruno Cadonna wrote:


Hi Francois and Chris,

I find the idea of the builder interesting.

I think we should go ahead with the current KIP as it is to allow
Francois to fix his issue soon. If one of you or both want to push
forward the builder idea, feel free to create a new KIP and discuss it
with the community.

Regarding Francois' questions:

3. If the existing constructors should be removed, they need to be
marked as deprecated and removed in one of the next major releases.

5. Yes, I think Streams should be aligned.

Both questions should be discussed in the context of a new KIP about the
builder idea.

Best,
Bruno

On 11.05.22 04:24, Chris Egerton wrote:

Hi Francois,

Thanks for your thoughts. I think it's worth noting that in regards to item
2, it's possible to explicitly declare the type parameters for a builder
without capturing it in a variable; for example:

KafkaProducer<String, Integer> p = new Builder<String, Integer>(...)
  .withKeySerializer(new StringSerializer())
  .withValueSerializer(new IntegerSerializer())
  .build();
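
A self-contained sketch of the same point, runnable without Kafka on the classpath. `SketchBuilder`, `SketchSerializer` and `SketchProducer` are hypothetical stand-ins defined in the snippet, not Kafka classes; the only thing shown is that writing the type arguments on the constructor call lets the whole chain stay inline.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-ins for illustration only; these are NOT Kafka classes.
interface SketchSerializer<T> {
    byte[] serialize(T value);
}

final class SketchProducer<K, V> {
    private final SketchSerializer<K> keySerializer;
    private final SketchSerializer<V> valueSerializer;

    SketchProducer(SketchSerializer<K> keySerializer, SketchSerializer<V> valueSerializer) {
        this.keySerializer = keySerializer;
        this.valueSerializer = valueSerializer;
    }

    // Total number of serialized bytes for one key/value pair.
    int serializedSize(K key, V value) {
        return keySerializer.serialize(key).length + valueSerializer.serialize(value).length;
    }
}

final class SketchBuilder<K, V> {
    private SketchSerializer<K> keySerializer;
    private SketchSerializer<V> valueSerializer;

    SketchBuilder<K, V> withKeySerializer(SketchSerializer<K> s) {
        this.keySerializer = s;
        return this;
    }

    SketchBuilder<K, V> withValueSerializer(SketchSerializer<V> s) {
        this.valueSerializer = s;
        return this;
    }

    SketchProducer<K, V> build() {
        if (keySerializer == null || valueSerializer == null) {
            throw new IllegalStateException("both serializers must be set");
        }
        return new SketchProducer<>(keySerializer, valueSerializer);
    }
}

final class BuilderTypeArgsSketch {
    static SketchProducer<String, Integer> inline() {
        // Explicit type arguments on the constructor call: no intermediate
        // builder variable is needed for K and V to be known in the chain.
        return new SketchBuilder<String, Integer>()
            .withKeySerializer(s -> s.getBytes(StandardCharsets.UTF_8))
            .withValueSerializer(i -> ByteBuffer.allocate(Integer.BYTES).putInt(i).array())
            .build();
    }

    public static void main(String[] args) {
        // 3 key bytes + 4 value bytes
        System.out.println(inline().serializedSize("key", 42));
    }
}
```

With the diamond form (`new SketchBuilder<>()`) at the head of the chain, the compiler has no target type to infer `K` and `V` from, which is the concern raised in item 2.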

That aside, given the three binding votes already cast on the vote thread,
it's probably too late to be worth changing direction at this point. Thanks
for entertaining the proposal, and congratulations on your KIP!

Cheers,

Chris

On Tue, May 10, 2022 at 5:33 PM François Rosière <francois.rosi...@gmail.com> wrote:


Hello Chris,

Thanks for the feedback. Builders are definitely the pattern to apply when
an object needs to be built using different arguments/combinations.

Here are my thoughts about the proposal:

1. The builder should only expose methods that are meaningful for users, such
as the interceptors, the serializer/deserializer, partitioner, etc. A method
like the configured instances is internal and should not be exposed if we
don't want to expose the config itself. Using this internal method is the
only way to solve the issue if the config is exposed.

2. As the key and value types are not given, a variable will need to be
created for the builder before it is used. Otherwise, there is no way to
infer the types correctly. This breaks the inline, DSL-style usage a bit.

3. What about the existing constructors? Would they need to stay to keep
compatibility with existing code, or could they be removed in favor of the
builder?

4. Having access to the config also gives a way to fine-tune other
aspects, such as the logging related to the config: log unused properties,
skip some properties, etc.

5. What about Streams? Shouldn't it be aligned?

So, to summarise, the KIP was a best-effort solution to support already
configured instances related to both the producer and the consumer.
The builder will work; it's just a matter of deciding the best approach...
For me, both solutions are fine. I just need a way to inject already
configured dependencies into the producers and consumers.

Once we reach a conclusion, I will open a PR on GitHub.

Kr,

F.

On Tue, 10 May 2022 at 15:01, Chris Egerton wrote:


Hi Francois,

Thanks for the KIP! I sympathize with the issue you're facing and with
John's reluctance to let perfect be the 

[GitHub] [kafka-site] gty92 opened a new pull request, #409: Add AllegroGraph to powerdy-by

2022-05-12 Thread GitBox


gty92 opened a new pull request, #409:
URL: https://github.com/apache/kafka-site/pull/409

   Hello Kafka team,
   
   please consider merging this small PR where a new entry is added to 
`powered-by.html`. Thank you!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



Re: [VOTE] 3.1.1 RC1

2022-05-12 Thread Tom Bentley
Hi,

Not having heard from any of the existing voters, I will go ahead and
complete the release process.

Thanks Chris, I will do that and also open JIRAs for any other flaky tests
which don't already have them.

Dongjoon, thanks for your patience and sorry for this having been a longer
than usual release process.

Kind regards,

Tom

On Thu, 12 May 2022 at 06:28, Dongjoon Hyun  wrote:

> Thank you for sharing your assessment, Tom and Chris.
>
> The Apache Spark community had a discussion thread in March.
>
> https://lists.apache.org/thread/yq6f5gv7gtdbk1ynwpd9hnc547bgz03m
> "[DISCUSS] Migration guide on upgrading Kafka to 3.1 in Spark 3.3"
>
> As a person who initiated that upgrade via SPARK-36837, I need to answer
> it before Apache Spark 3.3 RC2. In the worst case, Apache Spark community
> may decide to wait for next Kafka versions instead of upgrading.
>
> Thank you again to everyone who helped with this release!
>
> Bests,
> Dongjoon.
>
> On 2022/05/12 01:39:16 Chris Egerton wrote:
> > It's worth noting that one of the most frequent offenders
> > (RebalanceSourceConnectorsIntegrationTest.testDeleteConnector), which
> > failed in five of the nine runs from 111 through 119, is actually disabled
> > on 3.2 and trunk as it's caused by a known and not-yet-addressed bug in
> > rebalancing logic. Since it's not a regression we can safely ignore those
> > failures, but if we plan on putting out another 3.1 release we should
> > probably disable that test on 3.1 as well.
> >
> > On Wed, May 11, 2022 at 3:32 AM Tom Bentley  wrote:
> >
> > > Hi Dongjoon,
> > >
> > > I've been trying to get a green build of the 3.1 branch from Jenkins (
> > > https://ci-builds.apache.org/job/Kafka/job/kafka/job/3.1/) since my
> > > original email promised to provide this. Individually each stage between
> > > builds 111 and 119 has passed in at least one run, but there have been no
> > > runs where all stages have passed.
> > >
> > > However, as you note, the vote has the necessary 3 binding votes, so
> > > assuming the voters don't wish to change their votes I can proceed with the
> > > release. If this is not the case please respond on the thread. If I hear
> > > nothing in the next 24 hours I will assume the voters are OK with this and
> > > proceed with the rest of the release process.
> > >
> > > Apologies for the delay,
> > >
> > > Kind regards,
> > >
> > > Tom
> > >
> > >
> > > > On Wed, 11 May 2022 at 08:15, Dongjoon Hyun wrote:
> > >
> > > > Hi, Tom.
> > > >
> > > > Could you conclude this vote as a release manager? :)
> > > >
> > > > Dongjoon.
> > > >
> > > > On 2022/05/06 13:31:15 Michal Tóth wrote:
> > > > > Hello
> > > > >
> > > > > > I have executed some produce/consume system tests which all passed.
> > > > > > Also everything passed from https://github.com/tombentley/kafka-verify-rc
> > > > > > - checking signatures, checksums, with gradle unit & integration tests, etc.
> > > > >
> > > > > Good from me (non-binding).
> > > > >
> > > > >
> > > > >
> > > > > > On Fri, 6 May 2022 at 14:30, David Jacot wrote:
> > > > >
> > > > > > Thanks for running the release, Tom.
> > > > > >
> > > > > > I performed the following validations:
> > > > > > * Verified all checksums and signatures.
> > > > > > * Built from source and ran unit tests.
> > > > > > * Ran the first quickstart steps for both ZK and KRaft.
> > > > > > * Spotchecked the Javadocs.
> > > > > >
> > > > > > I noticed the same issues as others on the website. I checked
> > > > > > the doc in git and it looks good.
> > > > > >
> > > > > > +1 (binding)
> > > > > >
> > > > > > Best,
> > > > > > David
> > > > > >
> > > > > > > On Thu, May 5, 2022 at 7:52 PM Dongjoon Hyun <dongj...@apache.org> wrote:
> > > > > > >
> > > > > > > +1 (non-binding)
> > > > > > >
> > > > > > > RC1 was tested with Apache Spark tests
> > > > > > >
> > > > > > > - https://github.com/apache/spark/pull/36135
> > > > > > >
> > > > > > > Thanks,
> > > > > > > Dongjoon.
> > > > > > >
> > > > > > > On 2022/05/05 03:25:25 Luke Chen wrote:
> > > > > > > > Hi Tom,
> > > > > > > >
> > > > > > > > I did:
> > > > > > > > 1. check the signature and checksums
> > > > > > > > > 2. ran quick start with java17 + scala2.12
> > > > > > > > 3. browse java docs/documentations
> > > > > > > >
> > > > > > > > +1 (non-binding)
> > > > > > > >
> > > > > > > > Thanks for running the release.
> > > > > > > >
> > > > > > > > Luke
> > > > > > > >
> > > > > > > > > On Thu, May 5, 2022 at 12:43 AM Bill Bejeck <bbej...@apache.org> wrote:
> > > > > > > >
> > > > > > > > > Hi Tom,
> > > > > > > > >
> > > > > > > > > Thanks for running the release!
> > > > > > > > >
> > > > > > > > > I did the following checks:
> > > > > > > > >
> > > > > > > > > >    1. Validated all the checksums and signatures
> > > > > > > > > >    2. Built from source and ran the unit tests (Java 11)
> > > > > > > > > >    3. Ran the quickstart and the Kafka Streams demo (Java 11)
> > > > > > > > >4. Did a quick scan of t

[RESULTS] [VOTE] Release Kafka version 3.1.1

2022-05-12 Thread Tom Bentley
This vote passes with 7 +1 votes (3 binding) and no 0 or -1 votes.

+1 votes
PMC Members:
* Mickael Maison
* Bill Bejeck
* David Jacot

Committers:
* Luke Chen
* Dongjoon Hyun

Community:
* Jakub Scholz
* Michal Tóth

0 votes
* No votes

-1 votes
* No votes

Vote thread:
https://lists.apache.org/thread/jrs4ws0m69hn22lq2kq56cqg53glqgbm

I'll continue with the release process and the release announcement will
follow in the next few days.

Tom


Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #925

2022-05-12 Thread Apache Jenkins Server
See 




[jira] [Created] (KAFKA-13895) Fix javadocs build with JDK < 12

2022-05-12 Thread Tom Bentley (Jira)
Tom Bentley created KAFKA-13895:
---

 Summary: Fix javadocs build with JDK < 12
 Key: KAFKA-13895
 URL: https://issues.apache.org/jira/browse/KAFKA-13895
 Project: Kafka
  Issue Type: Task
  Components: docs
Reporter: Tom Bentley


While doing the "Website update process" in the 3.1.1 release I found that I'd 
broken the javadoc search functionality due to having built the Javadocs with 
Java 11. Java < 12 has [a bug|https://bugs.openjdk.java.net/browse/JDK-8215291] 
that means the javadoc search functionality adds /undefined/ to the URL path 
(even though links between pages otherwise work).

We could fix the build.gradle to use {{-no-module-directories}} when running 
with javadoc < v12, but that will then break the links to the JDK classes 
javadocs from the Kafka javadoc, [as described 
here|https://github.com/spring-projects/spring-security/issues/10944].

Alternatively we could change the release process docs to require building with 
Java 17. While this would fix the problem for the Javadocs published on the 
website, anyone building the javadocs for themselves would still be affected.




--
This message was sent by Atlassian Jira
(v8.20.7#820007)


Re: [kafka-clients] Re: [VOTE] 3.2.0 RC1

2022-05-12 Thread Bruno Cadonna

Hi all,

Please review the blog post for the Apache Kafka 3.2.0 release:

https://blogs.apache.org/preview/kafka/?previewEntry=what-s-new-in-apache8

I will accept comments until Monday, May 16th EOD PT.

Best,
Bruno


On 06.05.22 14:52, 'David Jacot' via kafka-clients wrote:

Thanks for running the release, Bruno.

I performed the following validations:
* Verified all checksums and signatures.
* Built from source and ran unit tests.
* Ran the first quickstart steps for both ZK and KRaft.
* Spotchecked the doc and the Javadocs.

+1 (binding)

Best,
David

On Thu, May 5, 2022 at 10:36 AM Jakub Scholz  wrote:


+1 (non-binding).

I used the Scala 2.13 binaries and the staged Maven artifacts and ran
various tests with them. Thanks for doing the release.

Jakub

On Tue, May 3, 2022 at 4:07 PM Bruno Cadonna  wrote:


Hello Kafka users, developers and client-developers,

This is the second candidate for release of Apache Kafka 3.2.0.

* log4j 1.x is replaced with reload4j (KAFKA-9366)
* StandardAuthorizer for KRaft (KIP-801)
* Send a hint to the partition leader to recover the partition (KIP-704)
* Top-level error code field in DescribeLogDirsResponse (KIP-784)
* kafka-console-producer writes headers and null values (KIP-798 and
KIP-810)
* JoinGroupRequest and LeaveGroupRequest have a reason attached (KIP-800)
* Static membership protocol lets the leader skip assignment (KIP-814)
* Rack-aware standby task assignment in Kafka Streams (KIP-708)
* Interactive Query v2 (KIP-796, KIP-805, and KIP-806)
* Connect APIs list all connector plugins and retrieve their
configuration (KIP-769)
* TimestampConverter SMT supports different unix time precisions (KIP-808)
* Connect source tasks handle producer exceptions (KIP-779)


Release notes for the 3.2.0 release:
https://home.apache.org/~cadonna/kafka-3.2.0-rc1/RELEASE_NOTES.html

*** Please download, test and vote by Tuesday, May 10th, 9am PDT

Kafka's KEYS file containing PGP keys we use to sign the release:
https://kafka.apache.org/KEYS

* Release artifacts to be voted upon (source and binary):
https://home.apache.org/~cadonna/kafka-3.2.0-rc1/

* Maven artifacts to be voted upon:
https://repository.apache.org/content/groups/staging/org/apache/kafka/

* Javadoc:
https://home.apache.org/~cadonna/kafka-3.2.0-rc1/javadoc/

* Tag to be voted upon (off 3.2 branch) is the 3.2.0 tag:
https://github.com/apache/kafka/releases/tag/3.2.0-rc1

* Documentation:
https://kafka.apache.org/32/documentation.html

* Protocol:
https://kafka.apache.org/32/protocol.html

* Successful Jenkins builds for the 3.2 branch:
Unit/integration tests: I'll share a link once the builds complete
System tests:
https://jenkins.confluent.io/job/system-test-kafka/job/3.2/30/


Thanks,
Bruno





[GitHub] [kafka-site] qingwei91 opened a new pull request, #410: KAFKA-13882 Docker to preview docs locally

2022-05-12 Thread GitBox


qingwei91 opened a new pull request, #410:
URL: https://github.com/apache/kafka-site/pull/410

   I've added a Dockerfile and handy script to run documentation locally with 1 
command.
   
   It should reduce friction for docs contribution.
   





[GitHub] [kafka-site] qingwei91 commented on pull request #410: KAFKA-13882 Docker to preview docs locally

2022-05-12 Thread GitBox


qingwei91 commented on PR #410:
URL: https://github.com/apache/kafka-site/pull/410#issuecomment-1124867030

   I don't think we can do the same on https://github.com/apache/kafka/ itself, 
please correct me if I'm wrong





Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #926

2022-05-12 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: Kafka » Kafka Branch Builder » 3.1 #123

2022-05-12 Thread Apache Jenkins Server
See 




[jira] [Created] (KAFKA-13896) Support unsafe downgrades in KRaft

2022-05-12 Thread David Arthur (Jira)
David Arthur created KAFKA-13896:


 Summary: Support unsafe downgrades in KRaft
 Key: KAFKA-13896
 URL: https://issues.apache.org/jira/browse/KAFKA-13896
 Project: Kafka
  Issue Type: Sub-task
  Components: controller, kraft
Reporter: David Arthur


In order to support the "unsafe" downgrade specified in KIP-778 we need to be 
able to generate a downgraded snapshot on the broker and controller. This 
snapshot will write out metadata records at the record level that matches the 
target metadata.version. Records that were added after the target 
metadata.version will be omitted from the snapshot.

This also means we need a way to correlate record versions with 
metadata.version. 
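
As a rough illustration of that correlation, here is a minimal sketch. It assumes metadata.version can be modelled as an increasing integer level and that each record type knows the level at which each of its record versions was introduced. The `RecordType` class and the names `RecordA` and `RecordB` are hypothetical and do not correspond to Kafka's actual metadata records or to any API proposed in KIP-778.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class DowngradeSnapshotSketch {
    // Hypothetical record type: index = record version, value = the
    // metadata.version level at which that record version was introduced.
    // The levels are assumed to be non-decreasing.
    static final class RecordType {
        final String name;
        final int[] introducedAt;

        RecordType(String name, int... introducedAt) {
            this.name = name;
            this.introducedAt = introducedAt;
        }
    }

    // Highest record version usable at the target level, or -1 if the
    // record type did not exist yet at that level.
    static int recordVersionFor(RecordType type, int targetMetadataVersion) {
        int best = -1;
        for (int v = 0; v < type.introducedAt.length; v++) {
            if (type.introducedAt[v] <= targetMetadataVersion) {
                best = v;
            }
        }
        return best;
    }

    // Describes what a downgraded snapshot would contain: each surviving
    // record type at the record version matching the target level; record
    // types added after the target level are omitted.
    static List<String> downgradedSnapshot(List<RecordType> types, int targetMetadataVersion) {
        List<String> out = new ArrayList<>();
        for (RecordType type : types) {
            int version = recordVersionFor(type, targetMetadataVersion);
            if (version >= 0) {
                out.add(type.name + ":v" + version);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<RecordType> types = Arrays.asList(
            new RecordType("RecordA", 1, 3), // v0 at level 1, v1 at level 3
            new RecordType("RecordB", 4));   // v0 at level 4
        System.out.println(downgradedSnapshot(types, 2)); // RecordB is omitted
    }
}
```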



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #927

2022-05-12 Thread Apache Jenkins Server
See 




[GitHub] [kafka-site] mimaison commented on a diff in pull request #410: KAFKA-13882 Docker to preview docs locally

2022-05-12 Thread GitBox


mimaison commented on code in PR #410:
URL: https://github.com/apache/kafka-site/pull/410#discussion_r871522164


##
README.md:
##
@@ -0,0 +1,11 @@
+# How to preview the documentation changes locally?
+
+The documentation can hosted on a local webserver via httpd.
+
+You can run it with the following command, note that it requires docker:
+
+```shell
+sh start-preview.sh
+```
+
+Then you can open localhost:8080 on your browser and browse the documentation

Review Comment:
   Can we make a link for localhost:8080?
   Also let's add a full stop at the end of the sentence



##
README.md:
##
@@ -0,0 +1,11 @@
+# How to preview the documentation changes locally?

Review Comment:
   We can drop `the` here



##
start-preview.sh:
##
@@ -0,0 +1,9 @@
+#!/usr/bin/env bash
+
+set -euxo pipefail
+
+docker build -t kafka-site-preview .
+
+docker run -dit --rm --name mypreview -p 8080:80 -v 
"$PWD":/usr/local/apache2/htdocs/ kafka-site-preview
+
+echo "You can stop the preview server by running `docker stop mypreview"

Review Comment:
   Can we close the backtick and also add a new line?



##
README.md:
##
@@ -0,0 +1,11 @@
+# How to preview the documentation changes locally?
+
+The documentation can hosted on a local webserver via httpd.

Review Comment:
   `The documentation can hosted` -> `The documentation can be hosted`



##
README.md:
##
@@ -0,0 +1,11 @@
+# How to preview the documentation changes locally?
+
+The documentation can hosted on a local webserver via httpd.
+
+You can run it with the following command, note that it requires docker:
+
+```shell
+sh start-preview.sh

Review Comment:
   Can we make the script executable and call `./start-preview.sh`?






[GitHub] [kafka-site] qingwei91 commented on a diff in pull request #410: KAFKA-13882 Docker to preview docs locally

2022-05-12 Thread GitBox


qingwei91 commented on code in PR #410:
URL: https://github.com/apache/kafka-site/pull/410#discussion_r871576678


##
README.md:
##
@@ -0,0 +1,11 @@
+# How to preview the documentation changes locally?
+
+The documentation can hosted on a local webserver via httpd.
+
+You can run it with the following command, note that it requires docker:
+
+```shell
+sh start-preview.sh

Review Comment:
   Hi, does this mean users have to run chmod +x themselves?
   






[GitHub] [kafka-site] mimaison commented on a diff in pull request #410: KAFKA-13882 Docker to preview docs locally

2022-05-12 Thread GitBox


mimaison commented on code in PR #410:
URL: https://github.com/apache/kafka-site/pull/410#discussion_r871588483


##
README.md:
##
@@ -0,0 +1,11 @@
+# How to preview the documentation changes locally?
+
+The documentation can hosted on a local webserver via httpd.
+
+You can run it with the following command, note that it requires docker:
+
+```shell
+sh start-preview.sh

Review Comment:
   We don't want users to have to do it, make the file runnable (by running 
`chmod +x start-preview.sh`) as part of this PR






[GitHub] [kafka-site] mimaison commented on a diff in pull request #410: KAFKA-13882 Docker to preview docs locally

2022-05-12 Thread GitBox


mimaison commented on code in PR #410:
URL: https://github.com/apache/kafka-site/pull/410#discussion_r871590415


##
start-preview.sh:
##
@@ -0,0 +1,9 @@
+#!/usr/bin/env bash
+
+set -euxo pipefail
+
+docker build -t kafka-site-preview .
+
+docker run -dit --rm --name mypreview -p 8080:80 -v 
"$PWD":/usr/local/apache2/htdocs/ kafka-site-preview
+
+echo "You can stop the preview server by running `docker stop mypreview"

Review Comment:
   Thanks for the update. Can we add a new line at the end of the file?






Re: [DISCUSS] KIP-787 - MM2 Interface to manage Kafka resources

2022-05-12 Thread Omnia Ibrahim
I updated the KIP to reflect the options we have been discussing since Oct
2021 for people who didn't read the discussion thread.
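
One of those options, raised by Colin in the quoted message below, is to implement the existing `Admin` interface and let the methods MM2 does not use throw. A minimal sketch of that idea follows. It uses `java.lang.reflect.Proxy` as one possible way to avoid writing a stub for every unused method, which is a choice made for this sketch and not something proposed in the thread. `MiniAdmin` and `ResourceManagerAdmin` are hypothetical stand-ins, not `org.apache.kafka.clients.admin.Admin` or any real implementation of it.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical, tiny stand-in for an admin interface;
// NOT org.apache.kafka.clients.admin.Admin.
interface MiniAdmin {
    String createTopic(String name);
    String deleteTopic(String name);
}

// Hypothetical delegate, e.g. a client of an in-house resource management system.
final class ResourceManagerAdmin implements MiniAdmin {
    @Override
    public String createTopic(String name) {
        return "requested " + name;
    }

    @Override
    public String deleteTopic(String name) {
        return "deleted " + name;
    }
}

final class RestrictedAdminSketch {
    // Forwards only the allow-listed methods to the delegate; every other
    // interface method throws UnsupportedOperationException.
    static MiniAdmin restrict(MiniAdmin delegate, Set<String> allowed) {
        return (MiniAdmin) Proxy.newProxyInstance(
            MiniAdmin.class.getClassLoader(),
            new Class<?>[] {MiniAdmin.class},
            (proxy, method, args) -> {
                // toString/hashCode/equals are always forwarded.
                boolean objectMethod = method.getDeclaringClass() == Object.class;
                if (!objectMethod && !allowed.contains(method.getName())) {
                    throw new UnsupportedOperationException(method.getName() + " is not implemented");
                }
                try {
                    return method.invoke(delegate, args);
                } catch (InvocationTargetException e) {
                    throw e.getCause(); // surface the delegate's own exception
                }
            });
    }

    static String tryDelete(MiniAdmin admin) {
        try {
            return admin.deleteTopic("t");
        } catch (UnsupportedOperationException e) {
            return "rejected: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        MiniAdmin admin = restrict(new ResourceManagerAdmin(),
            new HashSet<>(Arrays.asList("createTopic")));
        System.out.println(admin.createTopic("heartbeats"));
        System.out.println(tryDelete(admin));
    }
}
```

The allow-list also doubles as a single place that documents which admin operations are actually in use, which touches on the second disadvantage listed in the quoted reply.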

On Wed, May 11, 2022 at 11:07 PM Omnia Ibrahim wrote:

> Hi Colin,
> I don't mind the idea of MM2 users implementing the AdminClient interface.
> However, there're two disadvantages to this.
>
>    1. Having around 70 method definitions that throw "NotImplemented" is
>    one downside, as is keeping up with them if the AdminClient interface changes.
>    2. It makes it hard to list what admin functionality MM2 uses, as MM2's
>    interactions with AdminClient are spread across many places in the codebase.
>
> I guess it's OK for MM2 users who want to build their admin client to
> carry this burden, as I explained in my previous response to the discussion
> thread. And we can do some cleanup to the codebase to have all Admin
> interactions in MM2 in a utils class or something like that to make it
> easier to navigate what MM2 needs from the Admin interface.
>
>> Maybe I'm misunderstanding the use-case you're describing here. But it
>> seems to me that if you create a proxy that has the ability to do any admin
>> operation, and give MM2 access to that proxy, the security model is the
>> same as just giving MM2 admin access. (Or it may be worse if the sysadmin
>> doesn't know what this proxy is doing, and doesn't lock it down...)
>>
>
> MM2 runs with the assumption that it has
>
>    - "CREATE" ACLs for topics on the source clusters to create
>    `heartbeat` topics.
>    - "CREATE" and "ALTER" ACLs to create topics, add partitions, update
>    topics' config and topics' ACLs (in future, will also include group ACLs as
>    Mickael mentioned before in the thread) on the destination clusters.
>
> Most organisations have some resource management or federated solutions
> (some would even have a budget system as part of these systems) to manage
> Kafka resources, and these systems are usually the only applications allowed
> to initialize a client with "CREATE" and "ALTER" ACLs. They don't grant
> these ACLs to any other teams/groups/applications to create such a client
> outside these systems, so assuming MM2 can bypass these systems and use the
> AdminClient directly to create/update resources isn't valid. This is the
> primary concern here.
>
> The KIP is trying to give MM2 more flexibility to allow organisations to
> integrate MM2 with their resource management system as they see fit without
> forcing them to disable most MM2 features.
>
> Hope this makes sense and clears it up.
>
>
> On Wed, May 11, 2022 at 9:09 PM Colin McCabe  wrote:
>
>> Hi Omnia Ibrahim,
>>
>> I'm sorry, but I am -1 on adding competing Admin interfaces. This would
>> create confusion and a heavier maintenance burden for the project.
>>
>> Since the org.apache.kafka.clients.admin.Admin interface is a Java
>> interface, any third-party software that wants to insert its own
>> implementation of the interface can do so already.
>>
>> A KIP to make the Admin class used pluggable for MM2 would be reasonable.
>> Adding a competing admin API is not.
>>
>> It's true that there are many Admin methods, but you do not need to
>> implement all of them -- just the ones that MirrorMaker uses. The other
>> ones can throw a NotImplementedException or similar.
>>
>> > The current approach also assumes that the user running MM2 has the Admin right to
>> > create/update topics, which is only valid if the user who runs MM2 also manages both
>> > source and destination clusters.
>>
>> Maybe I'm misunderstanding the use-case you're describing here. But it
>> seems to me that if you create a proxy that has the ability to do any admin
>> operation, and give MM2 access to that proxy, the security model is the
>> same as just giving MM2 admin access. (Or it may be worse if the sysadmin
>> doesn't know what this proxy is doing, and doesn't lock it down...)
>>
>> best,
>> Colin
>>
>>
>> On Mon, May 9, 2022, at 13:21, Omnia Ibrahim wrote:
>> > Hi, I gave the KIP another look after talking to some people at the Kafka
>> > Summit in London. And I would like to clear up the motivation of this KIP.
>> >
>> >
>> > At the moment, MM2 has some opinionated decisions that are creating issues
>> > for teams that use IaC, federated solutions or have a capacity/budget
>> > planning system for Kafka destination clusters. To explain it better, let's
>> > assume we have MM2 with the following configurations to highlight these
>> > problems.
>> >
>> > ```
>> >
>> > topics = .*
>> >
>> > refresh.topics.enabled = true
>> >
>> > sync.topic.configs.enabled = true
>> >
>> > sync.topic.acls.enabled = true
>> >
>> > // Maybe in future we can have sync.group.acls.enabled = true
>> >
>> > ```
>> >
>> >
>> > These configurations allow us to run MM2 with its full feature set.
>> > However, there are two main concerns when we run at scale with
>> > these configs:
>> >
>> > 1. *Capacity/Budgeting Planning:*
>> >
>> > Functionality or features that impact c

Re: [DISCUSS] KIP-836: Addition of Information in DescribeQuorumResponse about Voter Lag

2022-05-12 Thread Niket Goel
Thanks for the suggestion Colin.

> One minor point: I suspect that whatever we end up naming the additional
fields here, should also be the name of the metrics in KIP-835. So if we go
with a metric named "last-applied-offset" we'd want a lastAppliedOffset
field here, and so on.

This is a good point. Will respond to the discussion thread on KIP-835
about the dependency here.

> I also wonder if it makes sense for us to report the timestamp of the
latest batch that has been fetched (and not necessarily applied) rather
than the wall clock time at which the leader made the latest fetch.

In theory I am on board with your suggestion, and honestly I too wanted to
add something similar. However, from what I understand (and please correct
me if my understanding is off), the `DescribeQuorum` API as it is
implemented lives in the Raft layer and uses the data available within
that layer to fill out the response. To get more accurate information on
what was applied, as you recommend, we would need to look into the
log.
This leaves us with two high-level options:
1. Peek into the log in the raft layer:
  I think this is definitely not the way to go as it breaks the isolation
the raft layer has from the contents of the log and also introduces more
computational work which would hurt performance.
2. Have the layer above the Raft Client (so the controller) provide the
required information:
  We can consider this approach, however it will break the separation
between the layers. IIUC, the `DescribeQuorum` API is intended to be a Raft
API, but doing this will result in it being dependent on the controller (or
some layer driving the raft client). I am not sure if that is the direction
we want to go in the long term.

I think my meta point is that there might be a way to get more accurate
"lag" information into the response, but the question is whether that
additional fidelity is worth the cost we will end up paying to add it.
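
To make the trade-off a bit more concrete, here is a rough sketch of the two kinds of lag being discussed: an offset lag that needs nothing but offsets, and a time lag derived only from batch timestamps in the log, which needs a lookup into log contents. Everything here is hypothetical: the `VoterLagSketch` class, the in-memory map standing in for the log, and the `-1` value for "unknown" are assumptions for illustration, not what the KIP or the Raft layer defines.

```java
import java.util.Map;
import java.util.TreeMap;

final class VoterLagSketch {
    // Hypothetical view of the metadata log:
    // batch base offset -> batch append timestamp (ms).
    private final TreeMap<Long, Long> batchTimestamps = new TreeMap<>();

    void append(long baseOffset, long timestampMs) {
        batchTimestamps.put(baseOffset, timestampMs);
    }

    // Offset lag only needs the two log end offsets.
    static long offsetLag(long leaderLogEndOffset, long voterLogEndOffset) {
        return Math.max(0L, leaderLogEndOffset - voterLogEndOffset);
    }

    // Timestamp of the batch containing the given offset, or -1 if unknown.
    long batchTimestampAt(long offset) {
        Map.Entry<Long, Long> entry = batchTimestamps.floorEntry(offset);
        return entry == null ? -1L : entry.getValue();
    }

    // Time lag computed only from timestamps stored in the log, so both sides
    // of the subtraction come from the same source. Returns -1 if the log is
    // empty or the voter's offset precedes the first known batch.
    long logTimeLagMs(long voterLastFetchedOffset) {
        if (batchTimestamps.isEmpty()) {
            return -1L;
        }
        long fetched = batchTimestampAt(voterLastFetchedOffset);
        if (fetched < 0) {
            return -1L;
        }
        return batchTimestamps.lastEntry().getValue() - fetched;
    }

    public static void main(String[] args) {
        VoterLagSketch log = new VoterLagSketch();
        log.append(0L, 1000L);
        log.append(10L, 1500L);
        log.append(20L, 2300L);
        System.out.println(offsetLag(30L, 12L));   // offsets only
        System.out.println(log.logTimeLagMs(12L)); // needs a log lookup
    }
}
```

The `floorEntry` lookup is the part that corresponds to "peeking into the log": it is cheap in this toy map, but in a real log it means reading batch metadata that the Raft layer currently does not need to touch.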

Let me know your thoughts on this.

On Wed, May 11, 2022 at 12:56 PM Colin McCabe  wrote:

> Thanks, Niket. I also agree with Jason that this is a public API despite
> the lack of command-line tool, so we do indeed need a KIP. :)
>
> One minor point: I suspect that whatever we end up naming the additional
> fields here, should also be the name of the metrics in KIP-835. So if we go
> with a metric named "last-applied-offset" we'd want a lastAppliedOffset
> field here, and so on.
>
> I also wonder if it makes sense for us to report the timestamp of the
> latest batch that has been fetched (and not necessarily applied) rather
> than the wall clock time at which the leader made the latest fetch. If we
> take both timestamps directly from the metadata log, we know they'll be
> comparable even in the presence of clock skew. And we know because of
> KIP-835 that the metadata log won't go quiet for prolonged periods.
>
> best,
> Colin
>
>
> On Tue, May 10, 2022, at 13:30, Niket Goel wrote:
> >> @Niket does it make sense to add the Admin API to this KIP?
> >
> > Thanks Deng for pointing this out. I agree with Jason's suggestion. I will
> > go ahead and add the admin API to this KIP.
> >
> > - Niket
> >
> > On Tue, May 10, 2022 at 11:44 AM Jason Gustafson wrote:
> >
> >> > Hello Niket, currently DescribeQuorumResponse is not a public API, we
> >> > don’t have an Admin API or shell script to get DescribeQuorumResponse, so
> >> > it’s unnecessary to submit a KIP to change it, you can just submit a PR to
> >> > accomplish this.
> >>
> >> Hey Ziming, I think it is public. It was documented in KIP-595 and we
> have
> >> implemented the API on the server. However, it looks like I never added
> >> the Admin API (even though it is assumed by the
> `kafka-metadata-quorum.sh`
> >> tool). @Niket does it make sense to add the Admin API to this KIP?
> >>
> >> Best,
> >> Jason
> >>
> >> On Mon, May 9, 2022 at 8:09 PM deng ziming 
> >> wrote:
> >>
> >> > Hello Niket, currently DescribeQuorumResponse is not a public API, we
> >> > don’t have a Admin api or shell script to get DescribeQuorumResponse,
> so
> >> > it’s unnecessary to submit a KIP to change it, you can just submit a
> PR
> >> to
> >> > accomplish this.
> >> >
> >> > --
> >> > Thanks
> >> > Ziming
> >> >
> >> > > On May 10, 2022, at 1:33 AM, Niket Goel  >
> >> > wrote:
> >> > >
> >> > > Hi all,
> >> > >
> >> > > I created a KIP to add some more information to
> >> > `DescribeQuorumResponse` to enable ascertaining voter lag in the
> quorum
> >> a
> >> > little better.
> >> > > Please see KIP --
> >> >
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-836%3A+Additional+Information+in+DescribeQuorumResponse+about+Voter+Lag
> >> > >
> >> > > Thanks for your feedback,
> >> > > Niket Goel
> >> >
> >> >
> >>
> >
> >
> > --
> > - Niket
>


-- 
- Niket


Re: [DISCUSS] KIP-836: Addition of Information in DescribeQuorumResponse about Voter Lag

2022-05-12 Thread Ron Dagostino
Hi Niket.  Thanks for the KIP.  Are all the fields you specified
always known?  For example, might a new controller not have a last
fetch time for other voters, and then what would it send in the
response?  If this is possible then we should be explicit about what
is to be sent in this case.

Ron

On Thu, May 12, 2022 at 12:54 PM Niket Goel  wrote:
>
> Thanks for the suggestion Colin.
>
> > One minor point: I suspect that whatever we end up naming the additional
> fields here, should also be the name of the metrics in KIP-835. So if we go
> with a metric named "last-applied-offset" we'd want a lastAppliedOffset
> field here, and so on.
>
> This is a good point. Will respond to the discussion thread on KIP-835
> about the dependency here.
>
> > I also wonder if it makes sense for us to report the timestamp of the
> latest batch that has been fetched (and not necessarily applied) rather
> than the wall clock time at which the leader made the latest fetch.
>
> In theory I am onboard with your suggestion and honestly I too wanted to
> add something similar. However, from what I understand (and please correct
> me if my understanding is off), the `DescribeQuorum` API as it is
> implemented lives in the Raft layer and utilizes the data available within
> that layer to fill out the response. To achieve a more accurate info on
> what was applied etc like you recommend, we would need to look into the
> log.
> This leaves us with two high-level options --
> 1. Peek into the log in the raft layer:
>   I think this is definitely not the way to go as it breaks the isolation
> the raft layer has from the contents of the log and also introduces more
> computational work which would hurt performance.
> 2. Have the layer above the Raft Client (so the controller) provide the
> required information:
>   We can consider this approach, however it will break the separation
> between the layers. IIUC, the `DescribeQuorum` API is intended to be a Raft
> API, but doing this will result in it being dependent on the controller (or
> some layer driving the raft client). I am not sure if that is the direction
> we want to go in the long term.
>
> I think my meta point is that there might be a way to get more accurate
> information of "lag" into the response, but the question is that if that
> additional fidelity in the accuracy of the lag is worth the cost we will
> end up paying to add it.
>
> Let me know your thoughts on this.
>
> On Wed, May 11, 2022 at 12:56 PM Colin McCabe  wrote:
>
> > Thanks, Niket. I also agree with Jason that this is a public API despite
> > the lack of command-line tool, so we do indeed need a KIP. :)
> >
> > One minor point: I suspect that whatever we end up naming the additional
> > fields here, should also be the name of the metrics in KIP-835. So if we go
> > with a metric named "last-applied-offset" we'd want a lastAppliedOffset
> > field here, and so on.
> >
> > I also wonder if it makes sense for us to report the timestamp of the
> > latest batch that has been fetched (and not necessarily applied) rather
> > than the wall clock time at which the leader made the latest fetch. If we
> > take both timestamps directly from the metadata log, we know they'll be
> > comparable even in the presence of clock skew. And we know because of
> > KIP-835 that the metadata log won't go quiet for prolonged periods.
> >
> > best,
> > Colin
> >
> >
> > On Tue, May 10, 2022, at 13:30, Niket Goel wrote:
> > >> @Niket does it make sense to add the Admin API to this KIP?
> > >
> > > Thanks Deng for pointing this out. I agree with Jason's suggestion. I
> > will
> > > go ahead and add the admin API to this KIP.
> > >
> > > - Niket
> > >
> > > On Tue, May 10, 2022 at 11:44 AM Jason Gustafson
> > 
> > > wrote:
> > >
> > >> > Hello Niket, currently DescribeQuorumResponse is not a public API, we
> > >> don’t have a Admin api or shell script to get DescribeQuorumResponse, so
> > >> it’s unnecessary to submit a KIP to change it, you can just submit a PR
> > to
> > >> accomplish this.
> > >>
> > >> Hey Ziming, I think it is public. It was documented in KIP-595 and we
> > have
> > >> implemented the API on the server. However, it looks like I never added
> > >> the Admin API (even though it is assumed by the
> > `kafka-metadata-quorum.sh`
> > >> tool). @Niket does it make sense to add the Admin API to this KIP?
> > >>
> > >> Best,
> > >> Jason
> > >>
> > >> On Mon, May 9, 2022 at 8:09 PM deng ziming 
> > >> wrote:
> > >>
> > >> > Hello Niket, currently DescribeQuorumResponse is not a public API, we
> > >> > don’t have a Admin api or shell script to get DescribeQuorumResponse,
> > so
> > >> > it’s unnecessary to submit a KIP to change it, you can just submit a
> > PR
> > >> to
> > >> > accomplish this.
> > >> >
> > >> > --
> > >> > Thanks
> > >> > Ziming
> > >> >
> > >> > > On May 10, 2022, at 1:33 AM, Niket Goel  > >
> > >> > wrote:
> > >> > >
> > >> > > Hi all,
> > >> > >
> > >> > > I created a KIP to add some more informati

Re: [DISCUSS] KIP-836: Addition of Information in DescribeQuorumResponse about Voter Lag

2022-05-12 Thread Niket Goel
Hey Ron,

That's a good callout. One minor note so that we are on the same
page: this API is always answered by the Raft leader.
As you pointed out, it is possible that the leader has not heard
from the voters yet. We will need to add a state describing this UNKNOWN
fetch time for the voters. I will update the KIP to reflect this.
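
To make the UNKNOWN case concrete, here is a rough Java sketch of one way
the leader could track it. Everything in it is an assumption for
illustration only: the class name, the method names, and the choice of -1
as the sentinel are hypothetical and are not taken from the KIP.

```java
import java.util.OptionalLong;

// Hypothetical per-voter state kept by the leader (illustrative names only).
final class VoterFetchState {
    // Assumed sentinel meaning "the leader has not seen a fetch from this voter yet".
    static final long UNKNOWN_TIMESTAMP = -1L;

    private long lastFetchTimestampMs = UNKNOWN_TIMESTAMP;

    // Called when the leader handles a fetch request from this voter.
    void recordFetch(long nowMs) {
        lastFetchTimestampMs = nowMs;
    }

    // Raw value as it might be written into a response field.
    long wireValue() {
        return lastFetchTimestampMs;
    }

    // Caller-facing view that makes the unknown case explicit.
    OptionalLong lastFetchTimestamp() {
        return lastFetchTimestampMs == UNKNOWN_TIMESTAMP
            ? OptionalLong.empty()
            : OptionalLong.of(lastFetchTimestampMs);
    }
}
```

A freshly elected leader would report the sentinel for every other voter
until the first fetch from that voter arrives, and callers would see an
empty OptionalLong rather than a misleading timestamp.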

Thanks


On Thu, May 12, 2022 at 10:57 AM Ron Dagostino  wrote:

> Hi Niket.  Thanks for the KIP.  Are all the fields you specified
> always known?  For example, might a new controller not have a last
> fetch time for other voters, and then what would it send in the
> response?  If this is possible then we should be explicit about what
> is to be sent in this case.
>
> Ron
>
> On Thu, May 12, 2022 at 12:54 PM Niket Goel 
> wrote:
> >
> > Thanks for the suggestion Colin.
> >
> > > One minor point: I suspect that whatever we end up naming the
> additional
> > fields here, should also be the name of the metrics in KIP-835. So if we
> go
> > with a metric named "last-applied-offset" we'd want a lastAppliedOffset
> > field here, and so on.
> >
> > This is a good point. Will respond to the discussion thread on KIP-835
> > about the dependency here.
> >
> > > I also wonder if it makes sense for us to report the timestamp of the
> > latest batch that has been fetched (and not necessarily applied) rather
> > than the wall clock time at which the leader made the latest fetch.
> >
> > In theory I am onboard with your suggestion and honestly I too wanted to
> > add something similar. However, from what I understand (and please
> correct
> > me if my understanding is off), the `DescribeQuorum` API as it is
> > implemented lives in the Raft layer and utilizes the data available
> within
> > that layer to fill out the response. To achieve a more accurate info on
> > what was applied etc like you recommend, we would need to look into the
> > log.
> > This leaves us with two high-level options --
> > 1. Peek into the log in the raft layer:
> >   I think this is definitely not the way to go as it breaks the isolation
> > the raft layer has from the contents of the log and also introduces more
> > computational work which would hurt performance.
> > 2. Have the layer above the Raft Client (so the controller) provide the
> > required information:
> >   We can consider this approach, however it will break the separation
> > between the layers. IIUC, the `DescribeQuorum` API is intended to be a
> Raft
> > API, but doing this will result in it being dependent on the controller
> (or
> > some layer driving the raft client). I am not sure if that is the
> direction
> > we want to go in the long term.
> >
> > I think my meta point is that there might be a way to get more accurate
> > information of "lag" into the response, but the question is that if that
> > additional fidelity in the accuracy of the lag is worth the cost we will
> > end up paying to add it.
> >
> > Let me know your thoughts on this.
> >
> > On Wed, May 11, 2022 at 12:56 PM Colin McCabe 
> wrote:
> >
> > > Thanks, Niket. I also agree with Jason that this is a public API
> despite
> > > the lack of command-line tool, so we do indeed need a KIP. :)
> > >
> > > One minor point: I suspect that whatever we end up naming the
> additional
> > > fields here, should also be the name of the metrics in KIP-835. So if
> we go
> > > with a metric named "last-applied-offset" we'd want a lastAppliedOffset
> > > field here, and so on.
> > >
> > > I also wonder if it makes sense for us to report the timestamp of the
> > > latest batch that has been fetched (and not necessarily applied) rather
> > > than the wall clock time at which the leader made the latest fetch. If
> we
> > > take both timestamps directly from the metadata log, we know they'll be
> > > comparable even in the presence of clock skew. And we know because of
> > > KIP-835 that the metadata log won't go quiet for prolonged periods.
> > >
> > > best,
> > > Colin
> > >
> > >
> > > On Tue, May 10, 2022, at 13:30, Niket Goel wrote:
> > > >> @Niket does it make sense to add the Admin API to this KIP?
> > > >
> > > > Thanks Deng for pointing this out. I agree with Jason's suggestion. I
> > > will
> > > > go ahead and add the admin API to this KIP.
> > > >
> > > > - Niket
> > > >
> > > > On Tue, May 10, 2022 at 11:44 AM Jason Gustafson
> > > 
> > > > wrote:
> > > >
> > > >> > Hello Niket, currently DescribeQuorumResponse is not a public
> API, we
> > > >> don’t have a Admin api or shell script to get
> DescribeQuorumResponse, so
> > > >> it’s unnecessary to submit a KIP to change it, you can just submit
> a PR
> > > to
> > > >> accomplish this.
> > > >>
> > > >> Hey Ziming, I think it is public. It was documented in KIP-595 and
> we
> > > have
> > > >> implemented the API on the server. However, it looks like I never
> added
> > > >> the Admin API (even though it is assumed by the
> > > `kafka-metadata-quorum.sh`
> > > >> tool). @Niket does it make sense to add 

Re: [DISCUSS] KIP-836: Addition of Information in DescribeQuorumResponse about Voter Lag

2022-05-12 Thread José Armando García Sancio
Thanks for the Kafka improvement Niket.

1. For the fields `LastFetchTime` and `LastCaughtUpTime`, Kafka tends
to use the suffix "Timestamp" when the value is an absolute wall clock
value.

2. The method `result()` for the type `DescribeQuorumResult` returns
the type `DescribeQuorumResponseData`. The types generated from the
RPC JSON schema are internal to Kafka and not exposed to clients. For
the admin client we should use a different type that is explicitly
public. See `org.apache.kafka.clients.admin.DescribeTopicsResult` for
an example.

3. The proposed section has this sentence: "Whenever a new fetch request
comes in the replica's last caught up time is updated to the time of
the fetch request if it requests an offset greater than the leader's
current end offset." Did you mean "previous fetch time" instead of
"last caught up time"? What do you mean by "requests an offset greater
than the leader's current end offset"? Excluding diverging logs, the
follower fetch offset should never be greater than the leader LEO.
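
To illustrate point 2, here is a minimal Java sketch of a result type that
converts the internal data into an explicitly public type before handing it
to callers. All class names below are hypothetical stand-ins, not the KIP's
proposed API, and CompletableFuture is used only to keep the sketch free of
Kafka dependencies (the real admin client returns KafkaFuture).

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.stream.Collectors;

// Stand-in for a type generated from the RPC JSON schema (internal to Kafka).
final class GeneratedVoterState {
    final int replicaId;
    final long logEndOffset;

    GeneratedVoterState(int replicaId, long logEndOffset) {
        this.replicaId = replicaId;
        this.logEndOffset = logEndOffset;
    }
}

// Hypothetical public, immutable type that admin-client callers would see.
final class VoterInfo {
    private final int replicaId;
    private final long logEndOffset;

    VoterInfo(int replicaId, long logEndOffset) {
        this.replicaId = replicaId;
        this.logEndOffset = logEndOffset;
    }

    int replicaId() { return replicaId; }
    long logEndOffset() { return logEndOffset; }
}

// Hypothetical result wrapper: the generated type never leaves this class.
final class QuorumResultSketch {
    private final CompletableFuture<List<GeneratedVoterState>> response;

    QuorumResultSketch(CompletableFuture<List<GeneratedVoterState>> response) {
        this.response = response;
    }

    // Maps each internal entry to the public type once the response arrives.
    CompletableFuture<List<VoterInfo>> voters() {
        return response.thenApply(states -> states.stream()
            .map(s -> new VoterInfo(s.replicaId, s.logEndOffset))
            .collect(Collectors.toList()));
    }
}
```

With this shape the RPC schema can change without breaking callers, because
only the mapping inside `voters()` needs to be updated.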

Thanks,
-José


Re: [DISCUSS] KIP-836: Addition of Information in DescribeQuorumResponse about Voter Lag

2022-05-12 Thread Niket Goel
Appreciate the careful review, José!

Ack on 1 and 2. Will fix.

For number 3 (and I am using [1] as a reference for this discussion), I
think the correct language to use would be:

"Whenever a new fetch request comes in the replica's last caught up time
is updated to the time of this fetch request if it requests an offset
*greater than or equal to* the leader's current end offset"

Does that sound right now?

That said, I think I will go ahead and rewrite the explanation in a way that
is easier to understand. Thanks for pointing this out.
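
As a rough illustration of the rule as worded above, here is a small Java
sketch. It is not the code in [1]: the class and method names are
hypothetical, the initial value of -1 is an assumption, and it covers only
the single case described in the sentence (the referenced code may handle
additional cases).

```java
// Hypothetical tracker for one follower, kept by the leader.
final class LastCaughtUpTracker {
    // Assumed initial value meaning "never observed as caught up".
    private long lastCaughtUpTimeMs = -1L;

    // Applies the rule: a fetch at or beyond the leader's current end offset
    // means the follower was caught up at the time of that fetch.
    void onFetch(long fetchOffset, long leaderEndOffset, long fetchTimeMs) {
        if (fetchOffset >= leaderEndOffset) {
            lastCaughtUpTimeMs = fetchTimeMs;
        }
    }

    long lastCaughtUpTimeMs() {
        return lastCaughtUpTimeMs;
    }
}
```

A fetch that asks for an offset below the leader's end offset leaves the
value untouched, so the gap between "now" and the last caught up time grows
for as long as the follower stays behind.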

Thanks

[1]
https://github.com/apache/kafka/blob/fa59be4e770627cd34cef85986b58ad7f606928d/core/src/main/scala/kafka/cluster/Replica.scala#L97



On Thu, May 12, 2022 at 3:20 PM José Armando García Sancio
 wrote:

> Thanks for the Kafka improvement Niket.
>
> 1. For the fields `LastFetchTime` and `LastCaughtUpTime`, Kafka tends
> to use the suffix "Timestamp" when the value is an absolute wall clock
> value.
>
> 2. The method `result()` for the type `DescribeQuorumResult` returns
> the type `DescribeQuorumResponseData`. The types generated from the
> RPC JSON schema are internal to Kafka and not exposed to clients. For
> the admin client we should use a different type that is explicitly
> public. See `org.apache.kafka.clients.admin.DescribeTopicsResult` for
> an example.
>
> 3. The proposed section has this sentence: "Whenever a new fetch request
> comes in the replica's last caught up time is updated to the time of
> the fetch request if it requests an offset greater than the leader's
> current end offset." Did you mean "previous fetch time" instead of
> "last caught up time"? What do you mean by "requests an offset greater
> than the leader's current end offset"? Excluding diverging logs, the
> follower fetch offset should never be greater than the leader LEO.
>
> Thanks,
> -José
>


-- 
- Niket


Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #928

2022-05-12 Thread Apache Jenkins Server
See