[jira] [Created] (KAFKA-16844) ByteArrayConverter can't convert ByteBuffer

2024-05-27 Thread Fan Yang (Jira)
Fan Yang created KAFKA-16844:


 Summary: ByteArrayConverter can't convert ByteBuffer
 Key: KAFKA-16844
 URL: https://issues.apache.org/jira/browse/KAFKA-16844
 Project: Kafka
  Issue Type: Improvement
  Components: connect
Reporter: Fan Yang


In the current Schema design, the schema type Bytes corresponds to two kinds of 
classes, byte[] and ByteBuffer. But the current ByteArrayConverter can only 
convert byte[]. My suggestion is to add ByteBuffer support to the current 
ByteArrayConverter.
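For illustration, the conversion could be handled along these lines (a sketch only; the class and method names here are hypothetical and not Connect's actual ByteArrayConverter API):

```java
import java.nio.ByteBuffer;

public class ByteBufferSupport {

    // Accept either byte[] or ByteBuffer for a Bytes-typed value, as an
    // extended ByteArrayConverter might. Hypothetical helper, not Kafka code.
    static byte[] toBytes(Object value) {
        if (value == null) {
            return null;
        }
        if (value instanceof byte[]) {
            return (byte[]) value;
        }
        if (value instanceof ByteBuffer) {
            // Copy from a slice so the caller's position/limit are untouched.
            ByteBuffer buffer = ((ByteBuffer) value).slice();
            byte[] out = new byte[buffer.remaining()];
            buffer.get(out);
            return out;
        }
        throw new IllegalArgumentException(
            "Expected byte[] or ByteBuffer, got " + value.getClass().getName());
    }

    public static void main(String[] args) {
        byte[] raw = {1, 2, 3};
        System.out.println(toBytes(ByteBuffer.wrap(raw)).length); // prints 3
    }
}
```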



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (KAFKA-16845) Migrate ReplicationQuotasTestRig to new test infra

2024-05-27 Thread Chia-Ping Tsai (Jira)
Chia-Ping Tsai created KAFKA-16845:
--

 Summary: Migrate ReplicationQuotasTestRig to new test infra
 Key: KAFKA-16845
 URL: https://issues.apache.org/jira/browse/KAFKA-16845
 Project: Kafka
  Issue Type: Improvement
Reporter: Chia-Ping Tsai
Assignee: Chia-Ping Tsai


As the title says: migrate ReplicationQuotasTestRig to the new test infrastructure.





Re: [VOTE] KIP-1040: Improve handling of nullable values in InsertField, ExtractField, and other transformations

2024-05-27 Thread Mario Fiore Vitale
After 7 days I received only one vote. Should I suppose this will not be
approved?

On Mon, May 20, 2024 at 4:14 PM Chris Egerton 
wrote:

> Thanks for the KIP! +1 (binding)
>
> On Mon, May 20, 2024 at 4:22 AM Mario Fiore Vitale 
> wrote:
>
> > Hi everyone,
> >
> > I'd like to call a vote on KIP-1040 which aims to improve handling of
> > nullable values in InsertField, ExtractField, and other transformations
> >
> > KIP -
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=303794677
> >
> > Discussion thread -
> > https://lists.apache.org/thread/ggqqqjbg6ccpz8g6ztyj7oxr80q5184n
> >
> > Thanks and regards,
> > Mario
> >
>


-- 

Mario Fiore Vitale

Senior Software Engineer

Red Hat 



Re: [DISCUSS] KIP-1048 Improve kafka-consumer-perf-test to benchmark single partition

2024-05-27 Thread Harsh Panchal
Bumping up this thread since I cannot find it in the mail archive.

On Wed, 22 May 2024 at 18:09, Harsh Panchal 
wrote:

> Hi,
>
> I would like to propose a change in the kafka-consumer-perf-test tool to
> support perf testing specific partitions.
>
> kafka-consumer-perf-test is a great tool to quickly check raw consumer
> performance. Currently, it subscribes to all the partitions and gives
> overall cluster performance; however, if we want to test the performance of a
> single broker/partition, the existing tool does not support it.
>
> I propose two optional flags, --partitions and --offsets, which give the
> flexibility to benchmark only specific partitions, optionally from specified
> offsets.
>
> KIP:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1048%3A+Improve+kafka-consumer-perf-test+to+benchmark+single+partition
>
> Regards,
> Harsh Panchal
>


[jira] [Created] (KAFKA-16846) Should TxnOffsetCommit API fail all the offsets if any fails the validation?

2024-05-27 Thread David Jacot (Jira)
David Jacot created KAFKA-16846:
---

 Summary: Should TxnOffsetCommit API fail all the offsets if any 
fails the validation?
 Key: KAFKA-16846
 URL: https://issues.apache.org/jira/browse/KAFKA-16846
 Project: Kafka
  Issue Type: Improvement
Reporter: David Jacot


While working on KAFKA-16371, we realized that the handling of 
INVALID_COMMIT_OFFSET_SIZE errors while committing transaction offsets is a bit 
inconsistent between the server and the client. On the server, the offsets are 
validated independently from each other. Hence, if two offsets A and B are 
committed and A fails the validation, B is still written to the log as part of 
the transaction. On the client, when INVALID_COMMIT_OFFSET_SIZE is received, 
the transaction transitions to the fatal state. Hence the transaction will 
eventually be aborted.

The client-side API is quite limiting here because it does not return an error 
per committed offset. It is all or nothing. From this point of view, the 
current behaviour is correct. It seems that we could either change the API and 
let the user decide what to do; or we could strengthen the validation on the 
server to fail all the offsets if any of them fails (all or nothing). We could 
also leave it as it is.
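For comparison, the "strengthen the validation on the server" option could be sketched as follows (illustrative only; the size limit and names are hypothetical, not the actual group coordinator code):

```java
import java.util.HashMap;
import java.util.Map;

public class AllOrNothingValidation {

    static final int MAX_METADATA_SIZE = 4096; // hypothetical limit

    // Sketch: reject the whole batch if any offset's metadata is over the
    // limit, instead of validating each offset independently.
    static boolean validateAll(Map<String, String> offsetsMetadata) {
        for (String metadata : offsetsMetadata.values()) {
            if (metadata != null && metadata.length() > MAX_METADATA_SIZE) {
                return false; // one oversized entry fails the entire commit
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<String, String> offsets = new HashMap<>();
        offsets.put("tp-0", "ok");
        offsets.put("tp-1", "x".repeat(5000)); // over the limit
        System.out.println(validateAll(offsets)); // prints false
    }
}
```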





Re: [DISCUSS] KIP-1044: A proposal to change idempotent producer -- server implementation

2024-05-27 Thread Claude Warren, Jr
Igor,

Thanks for the well thought out comment.  Do you have a suggestion for a
fast way to write to disk?  Since the design requires random access perhaps
just a random access file?
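For illustration, one non-mmap option along those lines is positional IO through a FileChannel, which behaves like a random access file (a sketch under an assumed record layout; not from the KIP):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class PositionalIo {

    // Write a long at an arbitrary offset and read it back using positional
    // FileChannel IO: explicit syscalls instead of mmap page faults, so a
    // thread blocked on IO does not stall a JVM safepoint the way a faulting
    // memory access can.
    static long writeAndReadBack(Path path, long value, long offset) {
        try (FileChannel channel = FileChannel.open(path,
                StandardOpenOption.READ, StandardOpenOption.WRITE)) {
            ByteBuffer record = ByteBuffer.allocate(Long.BYTES);
            record.putLong(value).flip();
            channel.write(record, offset);      // random-access write

            ByteBuffer readBack = ByteBuffer.allocate(Long.BYTES);
            channel.read(readBack, offset);     // random-access read
            readBack.flip();
            return readBack.getLong();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("producer-ids", ".bin");
        try {
            System.out.println(writeAndReadBack(path, 42L, 128)); // prints 42
        } finally {
            Files.deleteIfExists(path);
        }
    }
}
```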

Claude

On Thu, May 23, 2024 at 1:17 PM Igor Soarez  wrote:

> Hi Claude,
>
> Thanks for writing this KIP. This issue seems particularly
> thorny, and I appreciate everyone's effort to address this.
>
> I want to share my concern with the KIP's proposal of the
> use of memory mapped files – mmap is Java's achilles heel,
> Kafka should make less use of it, not more.
>
> The JVM often needs to stop all application threads (aka
> mutator threads) before some operations, such as GC,
> optimizations, redefinitions, internal cleanups and various
> other internal reasons. This is known as Safepointing.
>
> Because the JVM cannot forcefully stop threads, it must instead
> wait for each thread to observe the Safepointing request,
> mark itself as safe, and stop.
> A single delayed thread can leave the whole JVM hanging, waiting.
>
> Reads and writes to memory mapped files can trigger system interrupts,
> which can block on IO for prolonged amounts of time.
> One particularly bad example is hitting the page cache dirty ratio,
> and having to flush all of the page cache, in a potentially large
> (high RAM) system, into a potentially slow filesystem.
> I have seen pauses as extreme as 1 minute, and there are other
> public reports on this. [1][2]
>
> Safepointing in the JVM is designed with mechanisms to prevent having
> to wait for a single busy thread: Threads mark themselves as safe before
> waiting on locks, before system calls, before doing JNI, etc, and upon
> returning they check if a Safepoint is ongoing.
> So if a read or write syscall takes a bit longer that's fine, the JVM
> won't halt for Safepointing; it will proceed knowing that any thread stuck
> on a syscall will stop if necessary when it returns.
> But there's no protection against long system interrupts.
> From the JVM's perspective the use of mmap is just a simple memory access,
> so there's no Safepointing protection around that.
> The kernel neither knows nor cares about Java's Safepointing, and does not
> treat halting a single unsuspecting thread for a longer period of time with
> the severity that it may imply during a JVM Safepoint.
>
> So for this reason, I urge you to consider alternatives to the use
> of memory mapped files.
>
> Best,
>
> --
> Igor
>
> https://groups.google.com/g/mechanical-sympathy/c/LFrJPhyVOJ4
>
> https://groups.google.com/g/mechanical-sympathy/c/tepoA7PRFRU/m/7HbSINaFBgAJ
>
>


[jira] [Resolved] (KAFKA-16371) Unstable committed offsets after triggering commits where metadata for some partitions are over the limit

2024-05-27 Thread David Jacot (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Jacot resolved KAFKA-16371.
-
Fix Version/s: 3.8.0
   3.7.1
 Assignee: David Jacot
   Resolution: Fixed

> Unstable committed offsets after triggering commits where metadata for some 
> partitions are over the limit
> -
>
> Key: KAFKA-16371
> URL: https://issues.apache.org/jira/browse/KAFKA-16371
> Project: Kafka
>  Issue Type: Bug
>  Components: offset manager
>Affects Versions: 3.7.0
>Reporter: mlowicki
>Assignee: David Jacot
>Priority: Major
> Fix For: 3.8.0, 3.7.1
>
>
> Issue is reproducible with simple CLI tool - 
> [https://gist.github.com/mlowicki/c3b942f5545faced93dc414e01a2da70]
> {code:java}
> #!/usr/bin/env bash
> for i in {1..100}
> do
> kafka-committer --bootstrap "ADDR:9092" --topic "TOPIC" --group foo 
> --metadata-min 6000 --metadata-max 1 --partitions 72 --fetch
> done{code}
> What it does is that initially it fetches committed offsets and then tries to 
> commit for multiple partitions. If some of the commits have metadata over the 
> allowed limit then:
> 1. I see errors about too large commits (expected)
> 2. On another run, the tool fails at the stage of fetching commits with (this 
> is the problem):
> {code:java}
> config: ClientConfig { conf_map: { "group.id": "bar", "bootstrap.servers": 
> "ADDR:9092", }, log_level: Error, }
> fetching committed offsets..
> Error: Meta data fetch error: OperationTimedOut (Local: Timed out) Caused by: 
> OperationTimedOut (Local: Timed out){code}
> On the Kafka side I see _unstable_offset_commits_ errors reported by our 
> internal metric which is derived from:
> {noformat}
>  
> kafka.network:type=RequestMetrics,name=ErrorsPerSec,request=X,error=Y{noformat}
> Increasing the timeout doesn't help and the only solution I've found is to 
> trigger commits for all partitions with metadata below the limit or to use: 
> {code:java}
> isolation.level=read_uncommitted{code}
>  
> I don't know that code very well but 
> [https://github.com/apache/kafka/blob/3.7/core/src/main/scala/kafka/coordinator/group/GroupMetadataManager.scala#L492-L496]
>  seems fishy:
> {code:java}
>     if (isTxnOffsetCommit) {
>       addProducerGroup(producerId, group.groupId)
>       group.prepareTxnOffsetCommit(producerId, offsetMetadata)
>     } else {
>       group.prepareOffsetCommit(offsetMetadata)
>     }{code}
> as it's using _offsetMetadata_ and not _filteredOffsetMetadata_ and I see 
> that while removing those pending commits we use filtered offset metadata 
> around 
> [https://github.com/apache/kafka/blob/3.7/core/src/main/scala/kafka/coordinator/group/GroupMetadataManager.scala#L397-L422]
>  
> {code:java}
>       val responseError = group.inLock {
>         if (status.error == Errors.NONE) {
>           if (!group.is(Dead)) {
>             filteredOffsetMetadata.forKeyValue { (topicIdPartition, 
> offsetAndMetadata) =>
>               if (isTxnOffsetCommit)
>                 group.onTxnOffsetCommitAppend(producerId, topicIdPartition, 
> CommitRecordMetadataAndOffset(Some(status.baseOffset), offsetAndMetadata))
>               else
>                 group.onOffsetCommitAppend(topicIdPartition, 
> CommitRecordMetadataAndOffset(Some(status.baseOffset), offsetAndMetadata))
>             }
>           }
>           // Record the number of offsets committed to the log
>           offsetCommitsSensor.record(records.size)
>           Errors.NONE
>         } else {
>           if (!group.is(Dead)) {
>             if (!group.hasPendingOffsetCommitsFromProducer(producerId))
>               removeProducerGroup(producerId, group.groupId)
>             filteredOffsetMetadata.forKeyValue { (topicIdPartition, 
> offsetAndMetadata) =>
>               if (isTxnOffsetCommit)
>                 group.failPendingTxnOffsetCommit(producerId, topicIdPartition)
>               else
>                 group.failPendingOffsetWrite(topicIdPartition, 
> offsetAndMetadata)
>             }
>           }
> {code}
> so the problem might be related to not cleaning up the data structure with 
> pending commits properly.





[jira] [Resolved] (KAFKA-16841) ZKMigrationIntegrationTests broken

2024-05-27 Thread Justine Olshan (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Justine Olshan resolved KAFKA-16841.

Resolution: Fixed

fixed by 
https://github.com/apache/kafka/commit/bac8df56ffdf8a64ecfb78ec0779bcbc8e9f7c10

> ZKMigrationIntegrationTests broken
> --
>
> Key: KAFKA-16841
> URL: https://issues.apache.org/jira/browse/KAFKA-16841
> Project: Kafka
>  Issue Type: Task
>Reporter: Justine Olshan
>Priority: Blocker
>
> A recent merge to trunk seems to have broken tests so that I see 78 failures 
> in the CI. 
> I see lots of timeout errors and `Alter Topic Configs had an error`





[jira] [Resolved] (KAFKA-16418) Review/split long-running admin client integration tests

2024-05-27 Thread Lianet Magrans (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lianet Magrans resolved KAFKA-16418.

Resolution: Not A Problem

> Review/split long-running admin client integration tests
> 
>
> Key: KAFKA-16418
> URL: https://issues.apache.org/jira/browse/KAFKA-16418
> Project: Kafka
>  Issue Type: Task
>  Components: clients
>Reporter: Lianet Magrans
>Assignee: Lianet Magrans
>Priority: Major
>
> Review PlaintextAdminIntegrationTest and attempt to split it to allow for 
> parallelization and improve build times. This test is the longest-running 
> integration test in kafka.api, so a similar approach to what has been done 
> with the consumer tests in PlaintextConsumerTest should be a good 
> improvement. 





[VOTE] KIP-1033: Add Kafka Streams exception handler for exceptions occurring during processing

2024-05-27 Thread Loic Greffier
Hi everyone,

A first pull request has been opened: 
https://github.com/apache/kafka/pull/16090.

As Bruno mentioned in the following discussion: 
https://github.com/apache/kafka/pull/16090#discussion_r1616264629, we bring a 
minor change to KIP-1033 by renaming both ProcessingExceptionHandler 
implementations to stick with the name of the interface:
- "ProcessingLogAndContinueExceptionHandler" to 
"LogAndContinueProcessingExceptionHandler"
- "ProcessingLogAndFailExceptionHandler" to 
"LogAndFailProcessingExceptionHandler"

Regards,
Loïc


Build failed in Jenkins: Kafka » kafka-2.7-jdk8 #191

2024-05-27 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 3.46 MB...]

org.apache.kafka.streams.TestTopicsTest > testStartTimestamp PASSED

org.apache.kafka.streams.TestTopicsTest > testNegativeAdvance STARTED

org.apache.kafka.streams.TestTopicsTest > testNegativeAdvance PASSED

org.apache.kafka.streams.TestTopicsTest > shouldNotAllowToCreateWithNullDriver 
STARTED

org.apache.kafka.streams.TestTopicsTest > shouldNotAllowToCreateWithNullDriver 
PASSED

org.apache.kafka.streams.TestTopicsTest > testDuration STARTED

org.apache.kafka.streams.TestTopicsTest > testDuration PASSED

org.apache.kafka.streams.TestTopicsTest > testOutputToString STARTED

org.apache.kafka.streams.TestTopicsTest > testOutputToString PASSED

org.apache.kafka.streams.TestTopicsTest > testValue STARTED

org.apache.kafka.streams.TestTopicsTest > testValue PASSED

org.apache.kafka.streams.TestTopicsTest > testTimestampAutoAdvance STARTED

org.apache.kafka.streams.TestTopicsTest > testTimestampAutoAdvance PASSED

org.apache.kafka.streams.TestTopicsTest > testOutputWrongSerde STARTED

org.apache.kafka.streams.TestTopicsTest > testOutputWrongSerde PASSED

org.apache.kafka.streams.TestTopicsTest > 
shouldNotAllowToCreateOutputTopicWithNullTopicName STARTED

org.apache.kafka.streams.TestTopicsTest > 
shouldNotAllowToCreateOutputTopicWithNullTopicName PASSED

org.apache.kafka.streams.TestTopicsTest > testWrongSerde STARTED

org.apache.kafka.streams.TestTopicsTest > testWrongSerde PASSED

org.apache.kafka.streams.TestTopicsTest > testKeyValuesToMapWithNull STARTED

org.apache.kafka.streams.TestTopicsTest > testKeyValuesToMapWithNull PASSED

org.apache.kafka.streams.TestTopicsTest > testNonExistingOutputTopic STARTED

org.apache.kafka.streams.TestTopicsTest > testNonExistingOutputTopic PASSED

org.apache.kafka.streams.TestTopicsTest > testMultipleTopics STARTED

org.apache.kafka.streams.TestTopicsTest > testMultipleTopics PASSED

org.apache.kafka.streams.TestTopicsTest > testKeyValueList STARTED

org.apache.kafka.streams.TestTopicsTest > testKeyValueList PASSED

org.apache.kafka.streams.TestTopicsTest > 
shouldNotAllowToCreateOutputWithNullDriver STARTED

org.apache.kafka.streams.TestTopicsTest > 
shouldNotAllowToCreateOutputWithNullDriver PASSED

org.apache.kafka.streams.TestTopicsTest > testValueList STARTED

org.apache.kafka.streams.TestTopicsTest > testValueList PASSED

org.apache.kafka.streams.TestTopicsTest > testRecordList STARTED

org.apache.kafka.streams.TestTopicsTest > testRecordList PASSED

org.apache.kafka.streams.TestTopicsTest > testNonExistingInputTopic STARTED

org.apache.kafka.streams.TestTopicsTest > testNonExistingInputTopic PASSED

org.apache.kafka.streams.TestTopicsTest > testKeyValuesToMap STARTED

org.apache.kafka.streams.TestTopicsTest > testKeyValuesToMap PASSED

org.apache.kafka.streams.TestTopicsTest > testRecordsToList STARTED

org.apache.kafka.streams.TestTopicsTest > testRecordsToList PASSED

org.apache.kafka.streams.TestTopicsTest > testKeyValueListDuration STARTED

org.apache.kafka.streams.TestTopicsTest > testKeyValueListDuration PASSED

org.apache.kafka.streams.TestTopicsTest > testInputToString STARTED

org.apache.kafka.streams.TestTopicsTest > testInputToString PASSED

org.apache.kafka.streams.TestTopicsTest > testTimestamp STARTED

org.apache.kafka.streams.TestTopicsTest > testTimestamp PASSED

org.apache.kafka.streams.TestTopicsTest > testWithHeaders STARTED

org.apache.kafka.streams.TestTopicsTest > testWithHeaders PASSED

org.apache.kafka.streams.TestTopicsTest > testKeyValue STARTED

org.apache.kafka.streams.TestTopicsTest > testKeyValue PASSED

org.apache.kafka.streams.TestTopicsTest > 
shouldNotAllowToCreateTopicWithNullTopicName STARTED

org.apache.kafka.streams.TestTopicsTest > 
shouldNotAllowToCreateTopicWithNullTopicName PASSED

> Task :streams:upgrade-system-tests-0100:processTestResources NO-SOURCE
> Task :streams:upgrade-system-tests-0100:testClasses
> Task :streams:upgrade-system-tests-0100:checkstyleTest
> Task :streams:upgrade-system-tests-0100:spotbugsMain NO-SOURCE
> Task :streams:upgrade-system-tests-0100:test
> Task :streams:upgrade-system-tests-0101:compileJava NO-SOURCE
> Task :streams:upgrade-system-tests-0101:processResources NO-SOURCE
> Task :streams:upgrade-system-tests-0101:classes UP-TO-DATE
> Task :streams:upgrade-system-tests-0101:checkstyleMain NO-SOURCE
> Task :streams:upgrade-system-tests-0101:compileTestJava
> Task :streams:upgrade-system-tests-0101:processTestResources NO-SOURCE
> Task :streams:upgrade-system-tests-0101:testClasses
> Task :streams:upgrade-system-tests-0101:checkstyleTest
> Task :streams:upgrade-system-tests-0101:spotbugsMain NO-SOURCE
> Task :streams:upgrade-system-tests-0101:test
> Task :streams:upgrade-system-tests-0102:compileJava NO-SOURCE
> Task :streams:upgrade-system-tests-0102:processResources NO-SOURCE
> Task :streams:upgrade-system-tests-0102:classes U

Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #2937

2024-05-27 Thread Apache Jenkins Server
See 




Action requested: Changes to CI for JDK 11 & 17 builds on Pull Requests

2024-05-27 Thread Greg Harris
Hello Apache Kafka Developers,

In order to better utilize scarce CI resources shared with other Apache
projects, the Kafka project will no longer be running full test suites for
the JDK 11 & 17 components of PR builds.

*Action requested: If you have an active pull request, please merge or
rebase the latest trunk into your branch* before continuing development as
normal. You may wait to push the resulting branch until you make another
commit, or push the result immediately.

What to expect with this change:
* Trunk (and release branch) builds will not be affected.
* JDK 8 and 21 builds will not be affected.
* Compilation will not be affected.
* Static analysis (spotbugs, checkstyle, etc) will not be affected.
* Overall build execution time should be similar or slightly better than
before.
* You can expect fewer tests to be run on your PRs (~6 instead of
~12).
* Test flakiness should be similar or slightly better than before.

And as a reminder, build failures (red indicators in CloudBees) are always
blockers for merging. Starting now, the 11 and 17 builds should always pass
(green indicators in CloudBees) before merging, as failed tests (yellow
indicators in CloudBees) should no longer be present.

Thanks everyone,
Greg Harris


[jira] [Resolved] (KAFKA-16709) alter logDir within broker might cause log cleanup hanging

2024-05-27 Thread Luke Chen (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-16709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Luke Chen resolved KAFKA-16709.
---
Resolution: Fixed

> alter logDir within broker might cause log cleanup hanging
> --
>
> Key: KAFKA-16709
> URL: https://issues.apache.org/jira/browse/KAFKA-16709
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 3.7.0
>Reporter: Luke Chen
>Assignee: Luke Chen
>Priority: Major
> Fix For: 3.8.0
>
>
> When doing alter replica logDirs, we'll create a future log and pause log 
> cleaning for the partition 
> ([here|https://github.com/apache/kafka/blob/643db430a707479c9e87eec1ad67e1d4f43c9268/core/src/main/scala/kafka/server/ReplicaManager.scala#L1200]).
>  Log cleaning resumes after alter replica logDirs completes 
> ([here|https://github.com/apache/kafka/blob/643db430a707479c9e87eec1ad67e1d4f43c9268/core/src/main/scala/kafka/log/LogManager.scala#L1254]).
>  When resuming log cleaning, we decrement the LogCleaningPaused count by 1. 
> Once the count reaches 0, cleaning actually resumes 
> ([here|https://github.com/apache/kafka/blob/643db430a707479c9e87eec1ad67e1d4f43c9268/core/src/main/scala/kafka/log/LogCleanerManager.scala#L310]).
>  More explanation about the logCleaningPaused state can be found 
> [here|https://github.com/apache/kafka/blob/643db430a707479c9e87eec1ad67e1d4f43c9268/core/src/main/scala/kafka/log/LogCleanerManager.scala#L55].
>  
> But, there's still one factor that could increase the LogCleaningPaused 
> count: leadership change 
> ([here|https://github.com/apache/kafka/blob/643db430a707479c9e87eec1ad67e1d4f43c9268/core/src/main/scala/kafka/server/ReplicaManager.scala#L2126]).
>  When there's a leadership change, we'll check if there's a future log for 
> this partition; if so, we'll create a future log and pause cleaning 
> (LogCleaningPaused count + 1). So, during alter replica logDirs:
>  # alter replica logDirs for tp0 is triggered (LogCleaningPaused count = 1)
>  # tp0 leadership changes (LogCleaningPaused count = 2)
>  # alter replica logDirs completes, resuming log cleaning (LogCleaningPaused 
> count = 1)
>  # log cleaning stays paused because the count is always > 0
>  
> Log cleaning is not just about compacting logs; it also affects normal log 
> retention processing, which means the log retention for these paused 
> partitions will be pending. This issue is only fixed when the broker is 
> restarted.
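The count imbalance in the scenario above can be modeled with a toy counter (names here are illustrative, not the actual LogCleanerManager API):

```java
import java.util.concurrent.atomic.AtomicInteger;

public class PausedCountSketch {

    // Toy model of the LogCleaningPaused count for one partition.
    static final AtomicInteger pausedCount = new AtomicInteger();

    static void pauseCleaning() { pausedCount.incrementAndGet(); }
    static void resumeCleaning() { pausedCount.decrementAndGet(); }
    static boolean cleaningAllowed() { return pausedCount.get() == 0; }

    public static void main(String[] args) {
        pauseCleaning();   // 1. alter replica logDirs for tp0 triggered
        pauseCleaning();   // 2. tp0 leadership change also pauses
        resumeCleaning();  // 3. alter logDirs completes: only one resume
        System.out.println(cleaningAllowed()); // prints false: stays paused
    }
}
```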





Re: Action requested: Changes to CI for JDK 11 & 17 builds on Pull Requests

2024-05-27 Thread Ismael Juma
Hi Greg,

Thanks for making this change.

Note that compilation with Java 11/17 doesn't add any value over compiling
with Java 21 with the appropriate --release config (which we set). So, this
part of the build process is wasteful. Running the tests does add some
value (and hence why we originally had it), but the return on investment is
not good enough given our CI issues (and hence why the change is good).

Ismael

On Mon, May 27, 2024, 8:20 PM Greg Harris 
wrote:

> Hello Apache Kafka Developers,
>
> In order to better utilize scarce CI resources shared with other Apache
> projects, the Kafka project will no longer be running full test suites for
> the JDK 11 & 17 components of PR builds.
>
> *Action requested: If you have an active pull request, please merge or
> rebase the latest trunk into your branch* before continuing development as
> normal. You may wait to push the resulting branch until you make another
> commit, or push the result immediately.
>
> What to expect with this change:
> * Trunk (and release branch) builds will not be affected.
> * JDK 8 and 21 builds will not be affected.
> * Compilation will not be affected.
> * Static analysis (spotbugs, checkstyle, etc) will not be affected.
> * Overall build execution time should be similar or slightly better than
> before.
> * You can expect fewer tests to be run on your PRs (~6 instead of
> ~12).
> * Test flakiness should be similar or slightly better than before.
>
> And as a reminder, build failures (red indicators in CloudBees) are always
> blockers for merging. Starting now, the 11 and 17 builds should always pass
> (green indicators in CloudBees) before merging, as failed tests (yellow
> indicators in CloudBees) should no longer be present.
>
> Thanks everyone,
> Greg Harris
>


[jira] [Created] (KAFKA-16847) Revise the README for recent CI changes

2024-05-27 Thread Chia-Ping Tsai (Jira)
Chia-Ping Tsai created KAFKA-16847:
--

 Summary: Revise the README for recent CI changes 
 Key: KAFKA-16847
 URL: https://issues.apache.org/jira/browse/KAFKA-16847
 Project: Kafka
  Issue Type: Improvement
Reporter: Chia-Ping Tsai
Assignee: Chia-Ping Tsai


The recent change [0] removes the tests for JDK 11 and 17, and that is good for 
our CI resources. However, in the root README we still claim "We build and test 
Apache Kafka with Java 8, 11, 17 and 21" [1].


[0] 
https://github.com/apache/kafka/commit/adab48df6830259d33bd9705b91885c4f384f267
[1] https://github.com/apache/kafka/blob/trunk/README.md?plain=1#L7





Re: Action requested: Changes to CI for JDK 11 & 17 builds on Pull Requests

2024-05-27 Thread Chia-Ping Tsai
Dear all, 

I do love Harris's patch, as no one loves a slow CI, I believe. Separately, I filed 
https://issues.apache.org/jira/browse/KAFKA-16847 just now to revise our README 
about the JDK versions. I'd like to raise more discussion here.

> Note that compilation with Java 11/17 doesn't add any value over compiling
> with Java 21 with the appropriate --release config (which we set). So, this
> part of the build process is wasteful.

I have not seen a build failure that happens on 11 and 17 but not on 8 or 21, and 
dropping them can also save more CI resources and make our CI leaner. Hence, I'm 
+1 to drop 11 and 17 entirely.

Best,
Chia-Ping


On 2024/05/28 04:40:48 Ismael Juma wrote:
> Hi Greg,
> 
> Thanks for making this change.
> 
> Note that compilation with Java 11/17 doesn't add any value over compiling
> with Java 21 with the appropriate --release config (which we set). So, this
> part of the build process is wasteful. Running the tests does add some
> value (and hence why we originally had it), but the return on investment is
> not good enough given our CI issues (and hence why the change is good).
> 
> Ismael
> 
> On Mon, May 27, 2024, 8:20 PM Greg Harris 
> wrote:
> 
> > Hello Apache Kafka Developers,
> >
> > In order to better utilize scarce CI resources shared with other Apache
> > projects, the Kafka project will no longer be running full test suites for
> > the JDK 11 & 17 components of PR builds.
> >
> > *Action requested: If you have an active pull request, please merge or
> > rebase the latest trunk into your branch* before continuing development as
> > normal. You may wait to push the resulting branch until you make another
> > commit, or push the result immediately.
> >
> > What to expect with this change:
> > * Trunk (and release branch) builds will not be affected.
> > * JDK 8 and 21 builds will not be affected.
> > * Compilation will not be affected.
> > * Static analysis (spotbugs, checkstyle, etc) will not be affected.
> > * Overall build execution time should be similar or slightly better than
> > before.
> > * You can expect fewer tests to be run on your PRs (~6 instead of
> > ~12).
> > * Test flakiness should be similar or slightly better than before.
> >
> > And as a reminder, build failures (red indicators in CloudBees) are always
> > blockers for merging. Starting now, the 11 and 17 builds should always pass
> > (green indicators in CloudBees) before merging, as failed tests (yellow
> > indicators in CloudBees) should no longer be present.
> >
> > Thanks everyone,
> > Greg Harris
> >
> 


Re: Action requested: Changes to CI for JDK 11 & 17 builds on Pull Requests

2024-05-27 Thread Luke Chen
> I did not see build failure that happens in 11 and 17 but not in 8 or 21,
and also it can save more CI resources and make our CI be thinner.
Same here. I've never seen a build pass on JDK 21 but fail on 11 or 17.
But even if it happened, it would be rare. I think we are just making a
trade-off to make CI more reliable and faster.

Thanks.
Luke

On Tue, May 28, 2024 at 2:22 PM Chia-Ping Tsai  wrote:

> Dear all,
>
> I do love Harris's patch as no one love slow CI I believe. For another, I
> file https://issues.apache.org/jira/browse/KAFKA-16847 just now to revise
> our readme about JDK. I'd like to raise more discussion here.
>
> > Note that compilation with Java 11/17 doesn't add any value over
> compiling
> > with Java 21 with the appropriate --release config (which we set). So,
> this
> > part of the build process is wasteful.
>
> I did not see build failure that happens in 11 and 17 but not in 8 or 21,
> and also it can save more CI resources and make our CI be thinner. Hence,
> I'm +1 to drop 11 and 17 totally.
>
> Best,
> Chia-Ping
>
>
> On 2024/05/28 04:40:48 Ismael Juma wrote:
> > Hi Greg,
> >
> > Thanks for making this change.
> >
> > Note that compilation with Java 11/17 doesn't add any value over
> compiling
> > with Java 21 with the appropriate --release config (which we set). So,
> this
> > part of the build process is wasteful. Running the tests does add some
> > value (and hence why we originally had it), but the return on investment
> is
> > not good enough given our CI issues (and hence why the change is good).
> >
> > Ismael
> >
> > On Mon, May 27, 2024, 8:20 PM Greg Harris 
> > wrote:
> >
> > > Hello Apache Kafka Developers,
> > >
> > > In order to better utilize scarce CI resources shared with other Apache
> > > projects, the Kafka project will no longer be running full test suites
> for
> > > the JDK 11 & 17 components of PR builds.
> > >
> > > *Action requested: If you have an active pull request, please merge or
> > > rebase the latest trunk into your branch* before continuing
> development as
> > > normal. You may wait to push the resulting branch until you make
> another
> > > commit, or push the result immediately.
> > >
> > > What to expect with this change:
> > > * Trunk (and release branch) builds will not be affected.
> > > * JDK 8 and 21 builds will not be affected.
> > > * Compilation will not be affected.
> > > * Static analysis (spotbugs, checkstyle, etc) will not be affected.
> > > * Overall build execution time should be similar or slightly better
> than
> > > before.
> > > * You can expect fewer tests to be run on your PRs (~6 instead of
> > > ~12).
> > > * Test flakiness should be similar or slightly better than before.
> > >
> > > And as a reminder, build failures (red indicators in CloudBees) are
> always
> > > blockers for merging. Starting now, the 11 and 17 builds should always
> pass
> > > (green indicators in CloudBees) before merging, as failed tests (yellow
> > > indicators in CloudBees) should no longer be present.
> > >
> > > Thanks everyone,
> > > Greg Harris
> > >
> >
>


Re: [DISCUSS] Apache Kafka 3.8.0 release

2024-05-27 Thread Josep Prat
Hi Kafka developers,

This is a reminder about the upcoming deadlines:
- Feature freeze is on May 29th
- Code freeze is June 12th

I'll cut the new branch during morning hours (CEST) on May 30th.

Thanks all!

On Thu, May 16, 2024 at 8:34 AM Josep Prat  wrote:

> Hi all,
>
> We are now officially past the KIP freeze deadline. KIPs that are approved
> after this point in time shouldn't be adopted in the 3.8.x release (except
> the 2 already mentioned KIPs, 989 and 1028, assuming no vetoes occur).
>
> Reminder of the upcoming deadlines:
> - Feature freeze is on May 29th
> - Code freeze is June 12th
>
> If you have an approved KIP that you know already you won't be able to
> complete before the feature freeze deadline, please update the Release
> column in the
> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals
> page.
>
> Thanks all,
>
> On Wed, May 15, 2024 at 8:53 PM Josep Prat  wrote:
>
>> Hi Nick,
>>
>> If nobody comes up with concerns or pushback until the time of closing
>> the vote, I think we can take it for 3.8.
>>
>> Best,
>>
>> -
>>
>> Josep Prat
>> Open Source Engineering Director, Aiven | josep.p...@aiven.io
>> +491715557497 | aiven.io
>> Aiven Deutschland GmbH
>> Alexanderufer 3-7, 10117 Berlin
>> Geschäftsführer: Oskari Saarenmaa & Hannu Valtonen
>> Amtsgericht Charlottenburg, HRB 209739 B
>>
>> On Wed, May 15, 2024, 20:48 Nick Telford  wrote:
>>
>>> Hi Josep,
>>>
>>> Would it be possible to sneak KIP-989 into 3.8? Just as with 1028, it's
>>> currently being voted on and has already received the requisite votes.
>>> The
>>> only thing holding it back is the 72 hour voting window.
>>>
>>> Vote thread here:
>>> https://lists.apache.org/thread/nhr65h4784z49jbsyt5nx8ys81q90k6s
>>>
>>> Regards,
>>>
>>> Nick
>>>
>>> On Wed, 15 May 2024 at 17:47, Josep Prat 
>>> wrote:
>>>
>>> > And my maths are wrong! I added 24 hours more to all the numbers in
>>> there.
>>> > If after 72 hours no vetoes appear, I have no objections on adding this
>>> > specific KIP as it shouldn't have a big blast radius of affectation.
>>> >
>>> > Best,
>>> >
>>> > On Wed, May 15, 2024 at 6:44 PM Josep Prat 
>>> wrote:
>>> >
>>> > > Ah, I see Chris was faster writing this than me.
>>> > >
>>> > > On Wed, May 15, 2024 at 6:43 PM Josep Prat 
>>> wrote:
>>> > >
>>> > >> Hi all,
>>> > >> You still have the full day of today (independently for the
>>> timezone) to
>>> > >> get KIPs approved. Tomorrow morning (CEST timezone) I'll send
>>> another
>>> > email
>>> > >> asking developers to assign future approved KIPs to another version
>>> > that is
>>> > >> not 3.8.
>>> > >>
>>> > >> So, the only problem I see with KIP-1028 is that it hasn't been
>>> open for
>>> > >> a vote for 72 hours (48 hours as of now). If there is no negative
>>> > voting on
>>> > >> the KIP I think we can let that one in, given it would only miss the
>>> > >> deadline by less than 12 hours (if my timezone maths add up).
>>> > >>
>>> > >> Best,
>>> > >>
>>> > >> On Wed, May 15, 2024 at 6:35 PM Ismael Juma 
>>> wrote:
>>> > >>
>>> > >>> The KIP freeze is just about having the KIP accepted. Not sure why
>>> we
>>> > >>> would
>>> > >>> need an exception for that.
>>> > >>>
>>> > >>> Ismael
>>> > >>>
>>> > >>> On Wed, May 15, 2024 at 9:20 AM Chris Egerton <
>>> fearthecel...@gmail.com
>>> > >
>>> > >>> wrote:
>>> > >>>
>>> > >>> > FWIW I think that the low blast radius for KIP-1028 should allow
>>> it
>>> > to
>>> > >>> > proceed without adhering to the usual KIP and feature freeze
>>> dates.
>>> > >>> Code
>>> > >>> > freeze is probably worth still  respecting, at least if changes
>>> are
>>> > >>> > required to the docker/jvm/Dockerfile. But I defer to Josep's
>>> > >>> judgement as
>>> > >>> > the release manager.
>>> > >>> >
>>> > >>> > On Wed, May 15, 2024, 06:59 Vedarth Sharma <
>>> vedarth.sha...@gmail.com
>>> > >
>>> > >>> > wrote:
>>> > >>> >
>>> > >>> > > Hey Josep!
>>> > >>> > >
>>> > >>> > > The KIP 1028 has received the required votes. Voting thread:-
>>> > >>> > >
>>> https://lists.apache.org/thread/cdq4wfv5v1gpqlxnf46ycwtcwk5wos4q
>>> > >>> > > But we are keeping the vote open for 72 hours as per the
>>> process.
>>> > >>> > >
>>> > >>> > > I would like to request you to please consider it for the 3.8.0
>>> > >>> release.
>>> > >>> > >
>>> > >>> > > Thanks and regards,
>>> > >>> > > Vedarth
>>> > >>> > >
>>> > >>> > >
>>> > >>> > > On Wed, May 15, 2024 at 1:14 PM Josep Prat
>>> > >>> 
>>> > >>> > > wrote:
>>> > >>> > >
>>> > >>> > > > Hi Kafka developers!
>>> > >>> > > >
>>> > >>> > > > Today is the KIP freeze deadline. All KIPs should be
>>> accepted by
>>> > >>> EOD
>>> > >>> > > today.
>>> > >>> > > > Tomorrow morning (CEST timezone) I'll start summarizing all
>>> KIPs
>>> > >>> that
>>> > >>> > > have
>>> > >>> > > > been approved. Please any KIP approved after tomorrow should
>>> be
>>> > >>> adopted
>>> > >>> > > in
>>> > >>> > > > a future release version, not 3.8.
>>> > >>> > > >
>>> > >>> > > > Other

[jira] [Created] (KAFKA-16848) Reverting KRaft migration for "Migrating brokers to KRaft" state is wrong

2024-05-27 Thread Luke Chen (Jira)
Luke Chen created KAFKA-16848:
-

 Summary: Reverting KRaft migration for "Migrating brokers to 
KRaft" state is wrong
 Key: KAFKA-16848
 URL: https://issues.apache.org/jira/browse/KAFKA-16848
 Project: Kafka
  Issue Type: Bug
Affects Versions: 3.7.0
Reporter: Luke Chen


Hello,

 

I would like to report a mistake in the Kafka 3.7 Documentation -> 6.10 KRaft
-> ZooKeeper to KRaft Migration -> Reverting to ZooKeeper mode During the
Migration.

 

While migrating my Kafka + ZooKeeper cluster to KRaft and testing rollbacks at
different migration stages, I noticed that the "Directions for reverting"
provided for the "Migrating brokers to KRaft" state are wrong.



Following the first step in the documentation, you are supposed to: _On each
broker, remove the process.roles configuration, and restore the
zookeeper.connect configuration to its previous value. If your cluster requires
other ZooKeeper configurations for brokers, such as zookeeper.ssl.protocol,
re-add those configurations as well. Then perform a rolling restart._


In that case, if you remove the process.roles configuration and restore
zookeeper.connect (as well as any other ZooKeeper configuration your cluster
requires), you will receive an error that looks like this:
[2024-05-28 08:09:49,396] lvl=ERROR Exiting Kafka due to fatal exception logger=kafka.Kafka$

java.lang.IllegalArgumentException: requirement failed: controller.listener.names must be empty when not running in KRaft mode: [CONTROLLER]
    at scala.Predef$.require(Predef.scala:337)
    at kafka.server.KafkaConfig.validateValues(KafkaConfig.scala:2441)
    at kafka.server.KafkaConfig.<init>(KafkaConfig.scala:2290)
    at kafka.server.KafkaConfig.<init>(KafkaConfig.scala:1639)
    at kafka.Kafka$.buildServer(Kafka.scala:71)
    at kafka.Kafka$.main(Kafka.scala:90)
    at kafka.Kafka.main(Kafka.scala)
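The message points at the cause: controller.listener.names is only valid when the broker runs in KRaft mode (or in migration mode), so once process.roles is removed the leftover setting fails validation. A minimal sketch of the conflicting broker configuration that produces this error (connection strings hypothetical):

```properties
# ZooKeeper settings restored per the documented first step
zookeeper.connect=zk1:2181,zk2:2181,zk3:2181
# process.roles has been removed, but this leftover KRaft setting
# fails validation because the broker is no longer in KRaft or migration mode:
controller.listener.names=CONTROLLER
```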

 

However, I was able to perform the rollback successfully by performing these
additional steps:
 * Restore the zookeeper.metadata.migration.enable=true line in the broker
configuration;
 * We are using authorizer.class.name, so it also had to be reverted:
org.apache.kafka.metadata.authorizer.StandardAuthorizer ->
kafka.security.authorizer.AclAuthorizer;

 

I believe that should be mentioned.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)