Re: Re: [DISCUSS] KIP-842: Add richer group offset reset mechanisms

2022-06-07 Thread hudeqi
I think so too. What about Guozhang Wang and Luke Chen? Can I initiate a voting
process?

Best,
hudeqi

> -----Original Message-----
> From: "邓子明" 
> Sent: 2022-06-07 10:23:37 (Tuesday)
> To: dev@kafka.apache.org
> Cc: 
> Subject: Re: [DISCUSS] KIP-842: Add richer group offset reset mechanisms
> 


Jenkins build is unstable: Kafka » Kafka Branch Builder » trunk #982

2022-06-07 Thread Apache Jenkins Server
See 




[jira] [Created] (KAFKA-13962) KRaft StripedReplicaPlacer should handle replicas in controlled shutdown

2022-06-07 Thread David Jacot (Jira)
David Jacot created KAFKA-13962:
---

 Summary: KRaft StripedReplicaPlacer should handle replicas in 
controlled shutdown
 Key: KAFKA-13962
 URL: https://issues.apache.org/jira/browse/KAFKA-13962
 Project: Kafka
  Issue Type: Improvement
Reporter: David Jacot
Assignee: David Jacot


[KIP-841|https://cwiki.apache.org/confluence/display/KAFKA/KIP-841%3A+Fenced+replicas+should+not+be+allowed+to+join+the+ISR+in+KRaft]
 added the in-controlled-shutdown state to the quorum controller. The 
StripedReplicaPlacer should be aware of brokers in this state and treat them 
like fenced brokers (place replicas on them only as a last resort).
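A minimal sketch (hypothetical helper, not the actual placer code) of the
intended preference:

{code:java}
// Sketch only: a broker that is fenced OR in controlled shutdown should be
// chosen for new replicas only when no fully-live broker is available.
static boolean lastResortOnly(boolean fenced, boolean inControlledShutdown) {
    return fenced || inControlledShutdown;
}
{code}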



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


RE: Re: KIP 678: New Kafka Connect SMT for plainText => Struct(or Map) with Regex

2022-06-07 Thread gyejun choi
Hi Chris,

Thank you for your positive feedback, and sorry about the late reply (I 
didn't notice your reply email... TT).

I updated the KIP and PR based on your review:
https://cwiki.apache.org/confluence/display/KAFKA/KIP+678%3A+New+Kafka+Connect+SMT+for+plainText+%3D%3E+Struct%28or+Map%29+with+Regex

1. typecast function removed
2. struct.field removed
3. Proposed Interface changed


I will wait for your second round of feedback.

Thanks~

On 2021/07/15 15:57:14 Chris Egerton wrote:
> Hi whsoul,
>
> Thanks for the KIP. The two use cases you identified seem quite appealing,
> especially the first one: parsing structured log messages.
>
> Here are my initial thoughts on the proposed design:
>
> 1. I wonder if it's necessary to include support for type casting with this
> SMT. We already have a Cast SMT (
> https://github.com/apache/kafka/blob/trunk/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/Cast.java
> )
> that can parse multiple fields of a structured record value with differing
> types. Would it be enough for your new SMT to only produce string values
> for its structured data, and then allow users to perform casting logic
> using the Cast SMT afterward?
>
> 2. It seems like the "struct.field" property is similar; based on the
> examples, it looks like when the SMT is configured with a value for that
> property, it will first pull out a field from a structured record value
> (for example, it would pull out the value
> "https://kafka.apache.org/documentation/#connect" from a map of {"url":
> "https://kafka.apache.org/documentation/#connect"}), then parse that
> field's value, and replace the entire record value (or key) with the
> result of the parsing stage. It seems like this could be accomplished
> using the ExtractField SMT (
> https://github.com/apache/kafka/blob/trunk/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/ExtractField.java
> )
> as a preliminary step before passing it to your new SMT. Is this correct?
> And if so, could we simplify the interface for your SMT by removing the
> "struct.field" property in favor of the existing ExtractField SMT?
>
> 3. (Nit) I believe the property names in the table beneath the "Proposed
> Interfaces" section should have the "transforms.RegexTransform." prefix
> removed, since users will be able to configure the SMT with any name, not
> just "RegexTransform". This keeps in line with similar KIPs such as
> KIP-437, which added a new property to the MaskField SMT (
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-437%3A+Custom+replacement+for+MaskField+SMT
> ).
>
> Cheers,
>
> Chris
>
> On Thu, Jul 15, 2021 at 7:56 AM gyejun choi  wrote:
>
> > is there someone who any feed back about this?
> >
> > On Fri, Oct 23, 2020 at 2:56 PM, gyejun choi wrote:
> >
> > > Hi,
> > >
> > > I've opened KIP-678 which is intended to provide a new SMT in Kafka
> > Connect.
> > > I'd be grateful for any
> > > feedback:
> >
https://cwiki.apache.org/confluence/display/KAFKA/KIP+678%3A+New+Kafka+Connect+SMT+for+plainText+%3D%3E+Struct%28or+Map%29+with+Regex
> > >
> > > thanks,
> > >
> > > whsoul
> > >
> > >
> >
>


[jira] [Created] (KAFKA-13963) Topology Description ignores context.forward

2022-06-07 Thread Tomasz Kaszuba (Jira)
Tomasz Kaszuba created KAFKA-13963:
--

 Summary: Topology Description ignores context.forward
 Key: KAFKA-13963
 URL: https://issues.apache.org/jira/browse/KAFKA-13963
 Project: Kafka
  Issue Type: Bug
Affects Versions: 2.7.2
Reporter: Tomasz Kaszuba


I have a simple topology:
{code:java}
      val topology = new Topology
      topology
        .addSource("source", Serdes.stringSerde.deserializer, 
Serdes.stringSerde.deserializer, inputTopic)
        .addProcessor(
          "process",
          new ProcessorSupplier[String, String] {
            override def get(): Processor[String, String] =
              new ContextForwardProcessor()
          },
          "source"
        ) {code}
And a simple processor that uses context.forward to forward messages:
{code:java}
  private class ContextForwardProcessor extends AbstractProcessor[String, String]() {
    override def process(key: String, value: String): Unit =
      context().forward("key", "value", To.child("output"))
    override def close(): Unit = ()
  } {code}
When I call topology.describe(), I receive this:
{noformat}
Topologies:
   Sub-topology: 0
    Source: source (topics: [input])
      --> process
    Processor: process (stores: [])
      --> none
      <-- source {noformat}
Ignoring the fact that this will not run, since it will throw a runtime 
exception, why is the To.child ignored?

Taking it one step further, if I add multiple sinks to the topology like so:
{code:java}
val topology = new Topology
      topology
        .addSource("source", Serdes.stringSerde.deserializer, 
Serdes.stringSerde.deserializer, inputTopic)
        .addProcessor(
          "process",
          new ProcessorSupplier[String, String] {
            override def get(): Processor[String, String] =
              new ContextForwardProcessor()
          },
          "source"
        )
        .addSink("sink", "output1", Serdes.stringSerde.serializer(), 
Serdes.stringSerde.serializer(), "process")
        .addSink("sink2", "output2", Serdes.stringSerde.serializer(), 
Serdes.stringSerde.serializer(), "process")  {code}
but have the processor only output to "output1", this is in no way reflected 
in the described topology graph.

I assume this is by design, since it's a lot more work to interpret what 
context.forward is doing at runtime, but when I tried to look for this 
information in the Javadoc I couldn't find it.

 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


Re: [VOTE] KIP-840: Config file option for MessageReader/MessageFormatter in ConsoleProducer/ConsoleConsumer

2022-06-07 Thread Alexandre Garnier
Hi!

A little reminder to vote for this KIP.

Thanks.


On Wed, Jun 1, 2022 at 10:58 AM, Alexandre Garnier wrote:
>
> Hi everyone!
>
> I propose to start voting for KIP-840:
> https://cwiki.apache.org/confluence/x/bBqhD
>
> Thanks,
> --
> Alex


[jira] [Resolved] (KAFKA-13335) Upgrading connect from 2.7.0 to 2.8.0 causes worker instability

2022-06-07 Thread John Gray (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13335?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Gray resolved KAFKA-13335.
---
Resolution: Not A Problem

Finally got back to this after a long time. This is no bug or fault of Kafka 
Connect. We have a lot of connectors, so it takes a while to rebalance all of 
them. We were simply constantly hitting the rebalance.timeout.ms, leaving us in 
an endless loop of rebalancing. Not sure what changed between 2.7.0 and 2.8.0 
to enforce this timeout or to lengthen the time to rebalance, but something 
did. Bumped the timeout to 3 minutes from 1 minute and we are good to go! 

> Upgrading connect from 2.7.0 to 2.8.0 causes worker instability
> ---
>
> Key: KAFKA-13335
> URL: https://issues.apache.org/jira/browse/KAFKA-13335
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 2.8.0
>Reporter: John Gray
>Priority: Major
> Attachments: image-2021-09-29-09-15-18-172.png
>
>
> After recently upgrading our connect cluster to 2.8.0 (via 
> strimzi+Kubernetes, brokers are still on 2.7.0), I am noticing that the 
> cluster is struggling to stabilize. Connectors are being 
> unassigned/reassigned/duplicated continuously, and never settling back down. 
> A downgrade back to 2.7.0 fixes things immediately. I have attached a picture 
> of our Grafana dashboards showing some metrics. We have a connect cluster 
> with 4 nodes, trying to maintain about 1000 connectors, each connector with a 
> maxTask of 1. 
> We are noticing a slow increase in memory usage with big random peaks of 
> tasks counts and thread counts.
> I do also notice over the course of letting 2.8.0 run a huge increase in logs 
> stating that {code}ERROR Graceful stop of task (task name here) 
> failed.{code}, but the logs do not seem to indicate a reason. The connector 
> appears to be stopped only seconds after its creation. It appears to only 
> affect our source connectors. These logs stop after downgrading back to 2.7.0.
> I am also seeing an increase in logs stating that {code}Couldn't instantiate 
> task (task name) because it has an invalid task configuration. This task will 
> not execute until reconfigured. 
> (org.apache.kafka.connect.runtime.distributed.DistributedHerder) 
> [StartAndStopExecutor-connect-1-1]
> org.apache.kafka.connect.errors.ConnectException: Task already exists in this 
> worker: (task name)
>   at org.apache.kafka.connect.runtime.Worker.startTask(Worker.java:512)
>   at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.startTask(DistributedHerder.java:1251)
>   at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder.access$1700(DistributedHerder.java:127)
>   at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder$10.call(DistributedHerder.java:1266)
>   at 
> org.apache.kafka.connect.runtime.distributed.DistributedHerder$10.call(DistributedHerder.java:1262)
>   at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
>   at 
> java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
>   at java.base/java.lang.Thread.run(Thread.java:834){code}
> I am not sure what could be causing this, any insight would be appreciated! 
> I do notice Kafka 2.7.1/2.8.0 contains a bugfix related to connect rebalances 
> (KAFKA-10413). Is that fix potentially causing instability? 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


Re: [DISCUSS] KIP-714: Client metrics and observability

2022-06-07 Thread Magnus Edenhill
Hey Jun,

I've clarified the scope of the standard metrics in the KIP, but basically:

 * We define a standard set of generic metrics that should be relevant to
most client implementations, e.g., each producer implementation probably
has some sort of per-partition message queue.
 * A client implementation should strive to implement as many of the
standard metrics as possible, but only the ones that make sense.
 * For metrics that are not in the standard set, a client maintainer can
choose to either submit a KIP to add additional standard metrics - if
they're relevant, or go ahead and add custom metrics that are specific to
that client implementation. These custom metrics will have a prefix
specific to that client implementation, as opposed to the standard metric
set that resides under "org.apache.kafka...". E.g.,
"se.edenhill.librdkafka" or whatever.
 * Existing non-KIP-714 metrics should remain untouched. In some cases we
might be able to use the same meter given it is compatible with the
standard metric set definition, in other cases a semi-duplicate meter may
be needed. Thus this will not affect the metrics exposed through JMX, or
vice versa.

Thanks,
Magnus



On Wed, Jun 1, 2022 at 18:55, Jun Rao wrote:

> Hi, Magnus,
>
> 51. Just to clarify my question.  (1) Are standard metrics required for
> every client for this KIP to function?  (2) Are we converting existing java
> metrics to the standard metrics and deprecating the old ones? If so, could
> we list all existing java metrics that need to be renamed and the
> corresponding new name?
>
> Thanks,
>
> Jun
>
> On Tue, May 31, 2022 at 3:29 PM Jun Rao  wrote:
>
> > Hi, Magnus,
> >
> > Thanks for the reply.
> >
> > 51. I think it's fine to have a list of recommended metrics for every
> > client to implement. I am just not sure that standardizing on the metric
> > names across all clients is practical. The list of common metrics in the
> > KIP have completely different names from the java metric names. Some of
> > them have different types. For example, some of the common metrics have a
> > type of histogram, but the java client metrics don't use histogram in
> > general. Requiring the operator to translate those names and understand
> > the subtle differences across clients seems to cause more confusion
> > during troubleshooting.
> >
> > Thanks,
> >
> > Jun
> >
> > On Tue, May 31, 2022 at 5:02 AM Magnus Edenhill 
> > wrote:
> >
> >> On Fri, May 20, 2022 at 01:23, Jun Rao wrote:
> >>
> >> > Hi, Magus,
> >> >
> >> > Thanks for the reply.
> >> >
> >> > 50. Sounds good.
> >> >
> >> > 51. I misunderstood the proposal in the KIP then. The proposal is to
> >> > define a set of common metric names that every client should
> implement.
> >> The
> >> > problem is that every client already has its own set of metrics with
> its
> >> > own names. I am not sure that we could easily agree upon a common set
> of
> >> > metrics that work with all clients. There are likely to be some
> metrics
> >> > that are client specific. Translating between the common name and
> client
> >> > specific name is probably going to add more confusion. As mentioned in
> >> the
> >> > KIP, similar metrics from different clients could have subtle
> >> > semantic differences. Could we just let each client use its own set of
> >> > metric names?
> >> >
> >>
> >> We identified a common set of metrics that should be relevant for most
> >> client implementations; they're the ones listed in the KIP.
> >> A supporting client does not have to implement all those metrics, only
> >> the ones that make sense based on that client implementation, and a
> >> client may implement other metrics that are not listed in the KIP under
> >> its own namespace.
> >> This approach has two benefits:
> >>  - there will be a common set of metrics that most/all clients
> >> implement, which makes monitoring and troubleshooting easier across
> >> fleets with multiple Kafka client languages/implementations.
> >>  - client-specific metrics are still possible, so if there is no
> >> suitable standard metric a client can still provide whatever special
> >> metrics it has.
> >>
> >>
> >> Thanks,
> >> Magnus
> >>
> >>
> >> On Thu, May 19, 2022 at 10:39 AM Magnus Edenhill 
> >> wrote:
> >> >
> >> > > On Wed, May 18, 2022 at 19:57, Jun Rao wrote:
> >> > >
> >> > > > Hi, Magnus,
> >> > > >
> >> > >
> >> > > Hi Jun
> >> > >
> >> > >
> >> > > >
> >> > > > Thanks for the updated KIP. Just a couple of more comments.
> >> > > >
> >> > > > 50. To troubleshoot a particular client issue, I imagine that the
> >> > client
> >> > > > needs to identify its client_instance_id. How does the client find
> >> this
> >> > > > out? Do we plan to include client_instance_id in the client log,
> >> expose
> >> > > it
> >> > > > as a metric or something else?
> >> > > >
> >> > >
> >> > > The KIP suggests that client implementations emit an informative log
> >> > > message
> >> > > with the assigned client-instance-id onc

RE: [External] WELCOME to dev@kafka.apache.org

2022-06-07 Thread Nozza, Nicola
Hi,

I'm contacting you to ask for an ETA, if one is planned, for the fix tracked 
in the following link:

[KAFKA-9366] Upgrade log4j to log4j2 - ASF JIRA (apache.org)

Hope to hear from you soon,

Thanks,

Nicola Nozza

-Original Message-
From: dev-h...@kafka.apache.org 
Sent: Tuesday, June 7, 2022 16:33
To: Nozza, Nicola 
Subject: [External] WELCOME to dev@kafka.apache.org

Hi! This is the ezmlm program. I'm managing the dev@kafka.apache.org mailing 
list.

I'm working for my owner, who can be reached at dev-ow...@kafka.apache.org.

Acknowledgment: I have added the address

   nicola.no...@accenture.com

to the dev mailing list.

Welcome to dev@kafka.apache.org!

Please save this message so that you know the address you are subscribed under, 
in case you later want to unsubscribe or change your subscription address.


--- Administrative commands for the dev list ---

I can handle administrative requests automatically. Please do not send them to 
the list address! Instead, send your message to the correct command address:

To subscribe to the list, send a message to:
   

To remove your address from the list, send a message to:
   

Send mail to the following for info and FAQ for this list:
   
   

Similar addresses exist for the digest list:
   
   

To get messages 123 through 145 (a maximum of 100 per request), mail:
   

To get an index with subject and author for messages 123-456 , mail:
   

They are always returned as sets of 100, max 2000 per request, so you'll 
actually get 100-499.

To receive all messages with the same subject as message 12345, send a short 
message to:
   

The messages should contain one line or word of text to avoid being treated as 
sp@m, but I will ignore their content.
Only the ADDRESS you send to is important.

You can start a subscription for an alternate address, for example 
"john@host.domain", just add a hyphen and your address (with '=' instead of 
'@') after the command word:


To stop subscription for this address, mail:


In both cases, I'll send a confirmation message to that address. When you 
receive it, simply reply to it to complete your subscription.

If despite following these instructions, you do not get the desired results, 
please contact my owner at dev-ow...@kafka.apache.org. Please be patient, my 
owner is a lot slower than I am ;-)


Upgrade log4j to log4j2

2022-06-07 Thread Nozza, Nicola
Hi,

I'm contacting you to ask for an ETA, if one is planned, for the fix tracked 
in the following link:

[KAFKA-9366] Upgrade log4j to log4j2 - ASF JIRA 
(apache.org)

Hope to hear from you soon,

Thanks,

Nicola Nozza



This message is for the designated recipient only and may contain privileged, 
proprietary, or otherwise confidential information. If you have received it in 
error, please notify the sender immediately and delete the original. Any other 
use of the e-mail by you is prohibited. Where allowed by local law, electronic 
communications with Accenture and its affiliates, including e-mail and instant 
messaging (including content), may be scanned by our systems for the purposes 
of information security and assessment of internal compliance with Accenture 
policy. Your privacy is important to us. Accenture uses your personal data only 
in compliance with data protection laws. For further information on how 
Accenture processes your personal data, please see our privacy statement at 
https://www.accenture.com/us-en/privacy-policy.
__

www.accenture.com


Re: [DISCUSS] KIP-827: Expose logdirs total and usable space via Kafka API

2022-06-07 Thread Mickael Maison
Hi Cong,

Some of the use cases I have in mind are more around validating that an
operation was successful.
- Let's say you trigger a reassignment to even out disk usage on some
brokers. Your tool will submit the reassignment and wait for
completion using the AlterPartitionReassignments and
ListPartitionReassignments APIs respectively. Once it is complete you
may want to check the new disk usage on your brokers to validate the
reassignment did what you wanted. In that case you'd only call
DescribeLogDirs once.

- A similar example is when resizing volumes. You will typically poll the
API that handles the volume resize repeatedly, and at the end make a single
call to DescribeLogDirs to validate that brokers indeed picked up the
updated volumes.
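
As an illustration, a minimal sketch of such a one-shot validation call using
the existing Admin API (the bootstrap address and broker IDs are assumed; the
total/usable space values are the fields this KIP proposes to add):

    import java.util.List;
    import java.util.Map;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.Admin;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.LogDirDescription;

    public class LogDirCheck {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            try (Admin admin = Admin.create(props)) {
                // A single DescribeLogDirs call once the reassignment or
                // volume resize has completed.
                Map<Integer, Map<String, LogDirDescription>> dirs =
                    admin.describeLogDirs(List.of(0, 1, 2)).allDescriptions().get();
                // A tool would inspect the per-directory descriptions here,
                // including the proposed total/usable space values.
                dirs.forEach((broker, byPath) ->
                    byPath.keySet().forEach(path ->
                        System.out.printf("broker %d: %s%n", broker, path)));
            }
        }
    }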

Thanks,
Mickael

On Fri, Jun 3, 2022 at 9:08 PM Cong Ding  wrote:
>
> Thank you, Mickael. One more question: are you imagining this
> tooling/automation calling this API at a very low frequency, since
> high-frequency calls to this API are prohibitively expensive? Can you
> give some examples of low-frequency call use cases? I can think of
> some high-frequency call use cases which are valid here, but I
> had a hard time coming up with low-frequency ones.
>
> The one you give in the KIP is validating whether disk resize
> operations have been completed. However, this looks like a
> high-frequency call use case to me because we need to keep monitoring
> disk usage before and after resizing.
>
> Cong
>
> On Fri, Jun 3, 2022 at 5:22 AM Mickael Maison  
> wrote:
> >
> > Hi Cong,
> >
> > Maybe some people can do without this KIP.
> > But in many cases, especially around tooling and automation, it's
> > useful to be able to retrieve disk utilization values via the Kafka
> > API rather than interfacing with a metrics system.
> >
> > Does that clarify the motivation?
> >
> > Thanks,
> > Mickael
> >
> > On Wed, Jun 1, 2022 at 7:10 PM Cong Ding  wrote:
> > >
> > > Thanks for the explanation. I think the question is that if we have disk
> > > utilization in our environment, what is the use case for KIP-827? The disk
> > > utilization in our environment can already do the job. Is there anything I
> > > missed?
> > >
> > > Thanks,
> > > Cong
> > >
> > > On Tue, May 31, 2022 at 2:57 AM Mickael Maison 
> > > wrote:
> > >
> > > > Hi Cong,
> > > >
> > > > Kafka does not expose disk utilization metrics. This is something you
> > > > need to provide in your environment. You definitively should have a
> > > > mechanism for exposing metrics from your Kafka broker hosts and you
> > > > should absolutely monitor disk usage and have appropriate alerts.
> > > >
> > > > Thanks,
> > > > Mickael
> > > >
> > > > On Thu, May 26, 2022 at 7:34 PM Jun Rao  
> > > > wrote:
> > > > >
> > > > > Hi, Igor,
> > > > >
> > > > > Thanks for the reply.
> > > > >
> > > > > I agree that this KIP could be useful for improving the tool for 
> > > > > moving
> > > > > data across disks. It would be useful to clarify on the main 
> > > > > motivation
> > > > of
> > > > > the KIP. Also, DescribeLogDirsResponse already includes the size of 
> > > > > each
> > > > > partition on a disk. So, it seems that UsableBytes is redundant since
> > > > it's
> > > > > derivable.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jun
> > > > >
> > > > > On Thu, May 26, 2022 at 3:30 AM Igor Soarez  wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > This can also be quite useful to make better use of existing
> > > > functionality
> > > > > > in the Kafka API — moving replicas between log directories via
> > > > > > ALTER_REPLICA_LOG_DIRS. If usable space information is also 
> > > > > > available
> > > > the
> > > > > > caller can make better decisions using the same API. It means a more
> > > > > > consistent way of interacting with Kafka to manage replicas 
> > > > > > locations
> > > > > > within a broker without having to correlate Kafka metrics with
> > > > information
> > > > > > from the Kafka API.
> > > > > >
> > > > > > --
> > > > > > Igor
> > > > > >
> > > > > > On Wed, May 25, 2022, at 8:16 PM, Jun Rao wrote:
> > > > > > > Hi, Mickael,
> > > > > > >
> > > > > > > Thanks for the KIP.  Since this is mostly for monitoring and
> > > > alerting,
> > > > > > > could we expose them as metrics instead of as part of the API? We
> > > > already
> > > > > > > have a size metric per log. Perhaps we could extend that to add
> > > > > > used/total
> > > > > > > metrics per disk?
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Jun
> > > > > > >
> > > > > > > On Thu, May 19, 2022 at 10:21 PM Raman Verma
> > > >  > > > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > >> Hello Mikael,
> > > > > > >>
> > > > > > >> Thanks for the KIP.
> > > > > > >>
> > > > > > >> I see that the API response contains some information about each
> > > > > > partition.
> > > > > > >> ```
> > > > > > >> { "name": "PartitionSize", "type": "int64", "versions": "0+",
> > > > > > >>   "about": "The size of the log segments in th

Build failed in Jenkins: Kafka » Kafka Branch Builder » 3.1 #127

2022-06-07 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 506055 lines...]
[2022-06-07T16:54:05.646Z] > Task :metadata:testClasses UP-TO-DATE
[2022-06-07T16:54:06.579Z] > Task 
:clients:generateMetadataFileForMavenJavaPublication
[2022-06-07T16:54:06.579Z] > Task 
:clients:generatePomFileForMavenJavaPublication
[2022-06-07T16:54:06.579Z] 
[2022-06-07T16:54:06.579Z] > Task :streams:processMessages
[2022-06-07T16:54:06.579Z] Execution optimizations have been disabled for task 
':streams:processMessages' to ensure correctness due to the following reasons:
[2022-06-07T16:54:06.579Z]   - Gradle detected a problem with the following 
location: 
'/home/jenkins/jenkins-agent/workspace/Kafka_kafka_3.1/streams/src/generated/java/org/apache/kafka/streams/internals/generated'.
 Reason: Task ':streams:srcJar' uses this output of task 
':streams:processMessages' without declaring an explicit or implicit 
dependency. This can lead to incorrect results being produced, depending on 
what order the tasks are executed. Please refer to 
https://docs.gradle.org/7.2/userguide/validation_problems.html#implicit_dependency
 for more details about this problem.
[2022-06-07T16:54:06.579Z] MessageGenerator: processed 1 Kafka message JSON 
files(s).
[2022-06-07T16:54:06.579Z] 
[2022-06-07T16:54:06.579Z] > Task :streams:compileJava UP-TO-DATE
[2022-06-07T16:54:06.579Z] > Task :streams:classes UP-TO-DATE
[2022-06-07T16:54:06.579Z] > Task :streams:test-utils:compileJava UP-TO-DATE
[2022-06-07T16:54:06.579Z] > Task :streams:copyDependantLibs
[2022-06-07T16:54:06.579Z] > Task :streams:jar UP-TO-DATE
[2022-06-07T16:54:06.579Z] > Task 
:streams:generateMetadataFileForMavenJavaPublication
[2022-06-07T16:54:10.144Z] > Task :connect:api:javadoc
[2022-06-07T16:54:10.144Z] > Task :connect:api:copyDependantLibs UP-TO-DATE
[2022-06-07T16:54:10.144Z] > Task :connect:api:jar UP-TO-DATE
[2022-06-07T16:54:10.144Z] > Task 
:connect:api:generateMetadataFileForMavenJavaPublication
[2022-06-07T16:54:10.144Z] > Task :connect:json:copyDependantLibs UP-TO-DATE
[2022-06-07T16:54:10.144Z] > Task :connect:json:jar UP-TO-DATE
[2022-06-07T16:54:10.144Z] > Task 
:connect:json:generateMetadataFileForMavenJavaPublication
[2022-06-07T16:54:10.144Z] > Task :connect:api:javadocJar
[2022-06-07T16:54:10.144Z] > Task :connect:api:compileTestJava UP-TO-DATE
[2022-06-07T16:54:10.144Z] > Task :connect:api:testClasses UP-TO-DATE
[2022-06-07T16:54:10.144Z] > Task 
:connect:json:publishMavenJavaPublicationToMavenLocal
[2022-06-07T16:54:10.144Z] > Task :connect:json:publishToMavenLocal
[2022-06-07T16:54:10.144Z] > Task :connect:api:testJar
[2022-06-07T16:54:10.144Z] > Task :connect:api:testSrcJar
[2022-06-07T16:54:10.144Z] > Task 
:connect:api:publishMavenJavaPublicationToMavenLocal
[2022-06-07T16:54:10.144Z] > Task :connect:api:publishToMavenLocal
[2022-06-07T16:54:13.713Z] > Task :streams:javadoc
[2022-06-07T16:54:13.713Z] > Task :streams:javadocJar
[2022-06-07T16:54:13.713Z] 
[2022-06-07T16:54:13.713Z] > Task :clients:javadoc
[2022-06-07T16:54:13.713Z] 
/home/jenkins/jenkins-agent/workspace/Kafka_kafka_3.1/clients/src/main/java/org/apache/kafka/common/security/oauthbearer/secured/OAuthBearerLoginCallbackHandler.java:147:
 warning - Tag @link: reference not found: 
[2022-06-07T16:54:14.644Z] 1 warning
[2022-06-07T16:54:15.577Z] 
[2022-06-07T16:54:15.577Z] > Task :clients:javadocJar
[2022-06-07T16:54:16.510Z] 
[2022-06-07T16:54:16.511Z] > Task :clients:srcJar
[2022-06-07T16:54:16.511Z] Execution optimizations have been disabled for task 
':clients:srcJar' to ensure correctness due to the following reasons:
[2022-06-07T16:54:16.511Z]   - Gradle detected a problem with the following 
location: 
'/home/jenkins/jenkins-agent/workspace/Kafka_kafka_3.1/clients/src/generated/java'.
 Reason: Task ':clients:srcJar' uses this output of task 
':clients:processMessages' without declaring an explicit or implicit 
dependency. This can lead to incorrect results being produced, depending on 
what order the tasks are executed. Please refer to 
https://docs.gradle.org/7.2/userguide/validation_problems.html#implicit_dependency
 for more details about this problem.
[2022-06-07T16:54:17.445Z] 
[2022-06-07T16:54:17.445Z] > Task :clients:testJar
[2022-06-07T16:54:17.445Z] > Task :clients:testSrcJar
[2022-06-07T16:54:17.445Z] > Task 
:clients:publishMavenJavaPublicationToMavenLocal
[2022-06-07T16:54:17.445Z] > Task :clients:publishToMavenLocal
[2022-06-07T16:54:36.390Z] > Task :core:compileScala
[2022-06-07T16:55:43.400Z] > Task :core:classes
[2022-06-07T16:55:43.400Z] > Task :core:compileTestJava NO-SOURCE
[2022-06-07T16:56:05.522Z] > Task :core:compileTestScala
[2022-06-07T16:56:54.006Z] > Task :core:testClasses
[2022-06-07T16:57:03.994Z] > Task :streams:compileTestJava
[2022-06-07T16:57:03.994Z] > Task :streams:testClasses
[2022-06-07T16:57:03.994Z] > Task :streams:testJar
[2022-06-07T16:57:03.994Z] > Tas

[jira] [Created] (KAFKA-13964) kafka-configs.sh ends with UnsupportedVersionException when describing TLS user with quotas

2022-06-07 Thread Jakub Stejskal (Jira)
Jakub Stejskal created KAFKA-13964:
--

 Summary: kafka-configs.sh ends with UnsupportedVersionException 
when describing a TLS user with quotas 
 Key: KAFKA-13964
 URL: https://issues.apache.org/jira/browse/KAFKA-13964
 Project: Kafka
  Issue Type: Bug
  Components: admin, kraft
Affects Versions: 3.2.0
 Environment: Kafka 3.2.0 running on OpenShift 4.10 in KRaft mode 
managed by Strimzi
Reporter: Jakub Stejskal


Usage of kafka-configs.sh ends with 
org.apache.kafka.common.errors.UnsupportedVersionException: The broker does 
not support DESCRIBE_USER_SCRAM_CREDENTIALS when describing a TLS user with 
quotas enabled.

 
{code:java}
bin/kafka-configs.sh --bootstrap-server localhost:9092 --describe --user 
CN=encrypted-arnost` got status code 1 and stderr: -- Error while executing 
config command with args '--bootstrap-server localhost:9092 --describe --user 
CN=encrypted-arnost' java.util.concurrent.ExecutionException: 
org.apache.kafka.common.errors.UnsupportedVersionException: The broker does not 
support DESCRIBE_USER_SCRAM_CREDENTIALS{code}
STDOUT contains all necessary data, but the script itself ends with return code 
1 and the error above. Scram-sha has not been configured anywhere in that case 
(not supported by KRaft). This might be fixed by adding support for scram-sha 
in the next version (not reproducible without KRaft enabled).

 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


Re: [DISCUSS] KIP-844: Transactional State Stores

2022-06-07 Thread Alexander Sorokoumov
Hi,

As a status update, I did the following changes to the KIP:
* replaced configuration via the top-level config with configuration via
Stores factory and StoreSuppliers,
* added IQv2 and elaborated how readCommitted will work when the store is
not transactional,
* removed claims about ALOS.

I am going to be OOO in the next couple of weeks and will resume working on
the proposal and responding to the discussion in this thread starting June
27. My next top priorities are:
1. Prototype the rollback approach as suggested by Guozhang.
2. Replace in-memory batches with the secondary-store approach as the
default implementation to address the feedback about memory pressure as
suggested by Sagar and Bruno.
3. Adjust Stores methods to make transactional implementations pluggable.
4. Publish the POC for the first review.

Best regards,
Alex
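
As context for item 1, a minimal sketch, with hypothetical names, of the
token-based flush/rollback shape Guozhang suggested below (this is not an
actual Kafka Streams interface):

    // Hypothetical interface illustrating the token/rollback idea:
    public interface TransactionalStore {
        // Persist pending writes and return an opaque token identifying the
        // resulting snapshot (e.g., the flushed changelog offset).
        byte[] flushAndToken();

        // Restore the store to the snapshot identified by a token from an
        // earlier flush, discarding anything persisted after it.
        void rollback(byte[] token);
    }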

On Wed, Jun 1, 2022 at 2:52 PM Guozhang Wang  wrote:

> Alex,
>
> Thanks for your replies! That is very helpful.
>
> Just to broaden our discussions a bit here, I think there are some other
> approaches in parallel to the idea of "enforce to only persist upon
> explicit flush" and I'd like to throw one here -- not really advocating it,
> but just for us to compare the pros and cons:
>
> 1) We let the StateStore's `flush` function to return a token instead of
> returning `void`.
> 2) We add another `rollback(token)` interface of StateStore which would
> effectively rollback the state as indicated by the token to the snapshot
> when the corresponding `flush` is called.
> 3) We encode the token and commit as part of
> `producer#sendOffsetsToTransaction`.
>
> Users could optionally implement the new functions, or they can just not
> return the token at all and not implement the second function. Again, the
> APIs are just for the sake of illustration, not feeling they are the most
> natural :)
>
> Then the procedure would be:
>
> 1. the previous checkpointed offset is 100
> ...
> 3. flush store, make sure all writes are persisted; get the returned token
> that indicates the snapshot of 200.
> 4. producer.sendOffsetsToTransaction(token); producer.commitTransaction();
> 5. Update the checkpoint file (say, the new value is 200).
>
> Then if there's a failure, say between 3/4, we would get the token from the
> last committed txn, and first we would do the restoration (which may get
> the state to somewhere between 100 and 200), then call
> `store.rollback(token)` to rollback to the snapshot of offset 100.
>
> The pros is that we would then not need to enforce the state stores to not
> persist any data during the txn: for stores that may not be able to
> implement the `rollback` function, they can still reduce its impl to "not
> persisting any data" via this API, but for stores that can indeed support
> the rollback, their implementation may be more efficient. The cons though,
> on top of my head are 1) more complicated logic differentiating between EOS
> with and without store rollback support, and ALOS, 2) encoding the token as
> part of the commit offset is not ideal if it is big, 3) the recovery logic
> including the state store is also a bit more complicated.
>
>
> Guozhang
>
>
>
>
>
> On Wed, Jun 1, 2022 at 1:29 PM Alexander Sorokoumov
>  wrote:
>
> > Hi Guozhang,
> >
> > But I'm still trying to clarify how it guarantees EOS, and it seems that
> we
> > > would achieve it by enforcing to not persist any data written within
> this
> > > transaction until step 4. Is that correct?
> >
> >
> > This is correct. Both alternatives - in-memory WriteBatchWithIndex and
> > transactionality via the secondary store guarantee EOS by not persisting
> > data in the "main" state store until it is committed in the changelog
> > topic.
> >
> > Oh what I meant is not what KStream code does, but that StateStore impl
> > > classes themselves could potentially flush data to become persisted
> > > asynchronously
> >
> >
> > Thank you for elaborating! You are correct, the underlying state store
> > should not persist data until the streams app calls StateStore#flush.
> There
> > are 2 options how a State Store implementation can guarantee that -
> either
> > keep uncommitted writes in memory or be able to roll back the changes
> that
> > were not committed during recovery. RocksDB's WriteBatchWithIndex is an
> > implementation of the first option. A considered alternative,
> Transactions
> > via Secondary State Store for Uncommitted Changes, is the way to
> implement
> > the second option.
> >
> > As everyone correctly pointed out, keeping uncommitted data in memory
> > introduces a very real risk of OOM that we will need to handle. The more
> I
> > think about it, the more I lean towards going with the Transactions via
> > Secondary Store as the way to implement transactionality as it does not
> > have that issue.
> >
> > Best,
> > Alex
> >
> >
> > On Wed, Jun 1, 2022 at 12:59 PM Guozhang Wang 
> wrote:
> >
> > > Hello Alex,
> > >
> > > > we flush the cache, but not the underlying state store.
> > >
> 

Re: [DISCUSS] KIP-714: Client metrics and observability

2022-06-07 Thread Jun Rao
Hi, Magnus,

Thanks for the reply.

So, the standard set of generic metrics is just a recommendation and not a
requirement? This sounds good to me since it makes the adoption of the KIP
easier.

Regarding the metric names, I have two concerns. (1) If a client already
has an existing metric similar to the standard one, duplicating the metric
seems to be confusing. (2) If a client needs to implement a standard metric
that doesn't exist yet, using a naming convention (e.g., using dash vs dot)
different from other existing metrics also seems a bit confusing. It seems
that the main benefit of having standard metric names across clients is for
better server side monitoring. Could we do the standardization in the
plugin on the server?

Thanks,

Jun



On Tue, Jun 7, 2022 at 6:53 AM Magnus Edenhill  wrote:

> Hey Jun,
>
> I've clarified the scope of the standard metrics in the KIP, but basically:
>
>  * We define a standard set of generic metrics that should be relevant to
> most client implementations, e.g., each producer implementation probably
> has some sort of per-partition message queue.
>  * A client implementation should strive to implement as many of the
> standard metrics as possible, but only the ones that make sense.
>  * For metrics that are not in the standard set, a client maintainer can
> choose to either submit a KIP to add additional standard metrics - if
> they're relevant, or go ahead and add custom metrics that are specific to
> that client implementation. These custom metrics will have a prefix
> specific to that client implementation, as opposed to the standard metric
> set that resides under "org.apache.kafka...". E.g.,
> "se.edenhill.librdkafka" or whatever.
>  * Existing non-KIP-714 metrics should remain untouched. In some cases we
> might be able to use the same meter given it is compatible with the
> standard metric set definition, in other cases a semi-duplicate meter may
> be needed. Thus this will not affect the metrics exposed through JMX, or
> vice versa.
>
> Thanks,
> Magnus
>
>
>
> On Wed, Jun 1, 2022 at 18:55, Jun Rao wrote:
>
> > Hi, Magnus,
> >
> > 51. Just to clarify my question.  (1) Are standard metrics required for
> > every client for this KIP to function?  (2) Are we converting existing
> java
> > metrics to the standard metrics and deprecating the old ones? If so,
> could
> > we list all existing java metrics that need to be renamed and the
> > corresponding new name?
> >
> > Thanks,
> >
> > Jun
> >
> > On Tue, May 31, 2022 at 3:29 PM Jun Rao  wrote:
> >
> > > Hi, Magnus,
> > >
> > > Thanks for the reply.
> > >
> > > 51. I think it's fine to have a list of recommended metrics for every
> > > client to implement. I am just not sure that standardizing on the
> metric
> > > names across all clients is practical. The list of common metrics in
> the
> > > KIP have completely different names from the java metric names. Some of
> > > them have different types. For example, some of the common metrics
> have a
> > > type of histogram, but the java client metrics don't use histogram in
> > > general. Requiring the operator to translate those names and understand
> > the
> > > subtle differences across clients seem to cause more confusion during
> > > troubleshooting.
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Tue, May 31, 2022 at 5:02 AM Magnus Edenhill 
> > > wrote:
> > >
> > >> On Fri, May 20, 2022 at 01:23, Jun Rao wrote:
> > >>
> > >> > Hi, Magus,
> > >> >
> > >> > Thanks for the reply.
> > >> >
> > >> > 50. Sounds good.
> > >> >
> > >> > 51. I misunderstood the proposal in the KIP then. The proposal is
> to
> > >> > define a set of common metric names that every client should
> > implement.
> > >> The
> > >> > problem is that every client already has its own set of metrics with
> > its
> > >> > own names. I am not sure that we could easily agree upon a common
> set
> > of
> > >> > metrics that work with all clients. There are likely to be some
> > metrics
> > >> > that are client specific. Translating between the common name and
> > client
> > >> > specific name is probably going to add more confusion. As mentioned
> in
> > >> the
> > >> > KIP, similar metrics from different clients could have subtle
> > >> > semantic differences. Could we just let each client use its own set
> of
> > >> > metric names?
> > >> >
> > >>
> > >> We identified a common set of metrics that should be relevant for most
> > >> client implementations; they're the ones listed in the KIP.
> > >> A supporting client does not have to implement all those metrics, only
> > >> the ones that make sense based on that client implementation, and a
> > >> client may implement other metrics that are not listed in the KIP
> > >> under its own namespace.
> > >> This approach has two benefits:
> > >>  - there will be a common set of metrics that most/all clients
> > >> implement, which makes monitoring and troubleshooting easier across
> > >> fleets with multiple Kafka client
> > 

Re: Re: KIP 678: New Kafka Connect SMT for plainText => Struct(or Map) with Regex

2022-06-07 Thread Chris Egerton
Hi whsoul,

Thanks for the updates. I have a few more thoughts but this is looking
pretty good:

1. The name "ToStructByRegexTransform" is a little unwieldy. What do you
think about something shorter like ParseStruct, ParseRegex, or
ParseStructByRegex?

2. What happens if a line that the SMT sees doesn't match the regex
supplied by the user? Will it throw an exception, silently skip the message
and return it as-is, allow the user to configure it to do either, something
else entirely? (I'd personally lean towards just throwing an exception
since users can configure regexes to be lenient via the optional
quantifier, i.e. "?")

3. The description for the "regex" property still includes the "( with
:{TYPE} )" snippet; should that be removed?

4. Is it worth adding support to this SMT to operate on an individual field
of a message? I.e., you specify a "field" of "log_line" and the SMT expects
to see a struct or a map with a field/key of "log_line" and parses that
instead of the entire message. If so, it might be worth specifying that
this property would follow any precedents set by KIP-821 [1] so that nested
fields could be accessed instead of just top-level fields.

5. What happens if the user specifies more names in the "mapping" property
than there are groups in the "regex" property?

6. (Nit) I don't think the GroupRegexValidator class needs to be called out
as part of the changes to public interface if it's just going to be used by
this transform.
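
For concreteness, a rough sketch of exercising the proposed transform (the
class and property names follow the current KIP draft and may change; the
regex, mapping, and record values are made up for illustration):

    import java.util.Map;
    import org.apache.kafka.connect.data.Schema;
    import org.apache.kafka.connect.source.SourceRecord;

    // Hypothetical test-style usage, assuming the SMT follows the standard
    // Connect Transformation contract (configure + apply):
    ToStructByRegexTransform<SourceRecord> smt = new ToStructByRegexTransform<>();
    smt.configure(Map.of(
        "regex", "^(\\S+) (\\S+) (\\S+)$",    // three capture groups
        "mapping", "ip,method,url"));         // names for the groups
    SourceRecord parsed = smt.apply(new SourceRecord(
        null, null, "logs", Schema.STRING_SCHEMA,
        "111.61.73.113 GET /apache-kafka/"));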

[1] -
https://cwiki.apache.org/confluence/display/KAFKA/KIP-821%3A+Connect+Transforms+support+for+nested+structures

Cheers,

Chris

On Tue, Jun 7, 2022 at 4:47 AM gyejun choi  wrote:

> Hi Chris,
>
> Thank you for your positive feedback, and sorry about the late reply (I
> didn't notice your reply email... TT).
>
> I updated the KIP and PR based on your review:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP+678%3A+New+Kafka+Connect+SMT+for+plainText+%3D%3E+Struct%28or+Map%29+with+Regex
>
> 1. typecast function removed
> 2. struct.field removed
> 3. Proposed Interface changed
>
>
> I will wait for your second round of feedback.
>
> Thanks~
>
> On 2021/07/15 15:57:14 Chris Egerton wrote:
> > Hi whsoul,
> >
> > Thanks for the KIP. The two use cases you identified seem quite
> appealing,
> > especially the first one: parsing structured log messages.
> >
> > Here are my initial thoughts on the proposed design:
> >
> > 1. I wonder if it's necessary to include support for type casting with
> > this SMT. We already have a Cast SMT (
> > https://github.com/apache/kafka/blob/trunk/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/Cast.java
> > )
> > that can parse multiple fields of a structured record value with
> > differing types. Would it be enough for your new SMT to only produce
> > string values for its structured data, and then allow users to perform
> > casting logic using the Cast SMT afterward?
> >
> > 2. It seems like the "struct.field" property is similar; based on the
> > examples, it looks like when the SMT is configured with a value for that
> > property, it will first pull out a field from a structured record value
> > (for example, it would pull out the value
> > "https://kafka.apache.org/documentation/#connect" from a map of {"url":
> > "https://kafka.apache.org/documentation/#connect"}), then parse that
> > field's value, and replace the entire record value (or key) with the
> > result of the parsing stage. It seems like this could be accomplished
> > using the ExtractField SMT (
> > https://github.com/apache/kafka/blob/trunk/connect/transforms/src/main/java/org/apache/kafka/connect/transforms/ExtractField.java
> > )
> > as a preliminary step before passing it to your new SMT. Is this correct?
> > And if so, could we simplify the interface for your SMT by removing the
> > "struct.field" property in favor of the existing ExtractField SMT?
> >
> > 3. (Nit) I believe the property names in the table beneath the "Proposed
> > Interfaces" section should have the "transforms.RegexTransform." prefix
> > removed, since users will be able to configure the SMT with any name, not
> > just "RegexTransform". This keeps in line with similar KIPs such as
> > KIP-437, which added a new property to the MaskField SMT (
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-437%3A+Custom+replacement+for+MaskField+SMT
> > ).
> >
> > Cheers,
> >
> > Chris
> >
> > On Thu, Jul 15, 2021 at 7:56 AM gyejun choi  wrote:
> >
> > > is there someone who any feed back about this?
> > >
> > > On Fri, Oct 23, 2020 at 2:56 PM, gyejun choi wrote:
> > >
> > > > Hi,
> > > >
> > > > I've opened KIP-678 which is intended to provide a new SMT in Kafka
> > > Connect.
> > > > I'd be grateful for any
> > > > feedback:
> > >
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP+678%3A+New+Kafka+Connect+SMT+for+plainText+%3D%3E+Struct%28or+Map%29+with+Regex
> > > >
> > > > thanks,
> > > >
> > > > whsoul
> > > >
> > > >
> > >
> >
>


[jira] [Resolved] (KAFKA-13942) LogOffsetTest occasionally hangs during Jenkins build

2022-06-07 Thread Jason Gustafson (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-13942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson resolved KAFKA-13942.
-
Resolution: Fixed

> LogOffsetTest occasionally hangs during Jenkins build
> -
>
> Key: KAFKA-13942
> URL: https://issues.apache.org/jira/browse/KAFKA-13942
> Project: Kafka
>  Issue Type: Bug
>  Components: unit tests
>Reporter: David Arthur
>Priority: Minor
>
> [~hachikuji] parsed the log output of one of the recent stalled Jenkins 
> builds and singled out LogOffsetTest as a likely culprit for not completing.
> I looked closely at the following build which appeared to be stuck and found 
> this test case had STARTED but not PASSED or FAILED.
> 15:19:58  LogOffsetTest > 
> testFetchOffsetByTimestampForMaxTimestampWithUnorderedTimestamps(String) > 
> kafka.server.LogOffsetTest.testFetchOffsetByTimestampForMaxTimestampWithUnorderedTimestamps(String)[2]
>  STARTED



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (KAFKA-13965) Document broker-side socket-server-metrics

2022-06-07 Thread James Cheng (Jira)
James Cheng created KAFKA-13965:
---

 Summary: Document broker-side socket-server-metrics
 Key: KAFKA-13965
 URL: https://issues.apache.org/jira/browse/KAFKA-13965
 Project: Kafka
  Issue Type: Improvement
  Components: documentation
Affects Versions: 3.2.0
Reporter: James Cheng


There are a bunch of broker JMX metrics in the "socket-server-metrics" space 
that are not documented on kafka.apache.org/documentation

 
 * MBean: kafka.server:type=socket-server-metrics,listener=,networkProcessor=
 ** From KIP-188: https://cwiki.apache.org/confluence/display/KAFKA/KIP-188+-+Add+new+metrics+to+support+health+checks
 * kafka.server:type=socket-server-metrics,name=connection-accept-rate,listener={listenerName}
 ** From KIP-612: https://cwiki.apache.org/confluence/display/KAFKA/KIP-612%3A+Ability+to+Limit+Connection+Creation+Rate+on+Brokers

It would be helpful to get all the socket-server-metrics documented

 



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (KAFKA-13966) Flaky test `QuorumControllerTest.testUnregisterBroker`

2022-06-07 Thread Jason Gustafson (Jira)
Jason Gustafson created KAFKA-13966:
---

 Summary: Flaky test `QuorumControllerTest.testUnregisterBroker`
 Key: KAFKA-13966
 URL: https://issues.apache.org/jira/browse/KAFKA-13966
 Project: Kafka
  Issue Type: Bug
Reporter: Jason Gustafson


We have seen the following assertion failure in 
`QuorumControllerTest.testUnregisterBroker`:

```

org.opentest4j.AssertionFailedError: expected: <2> but was: <0>
    at org.junit.jupiter.api.AssertionUtils.fail(AssertionUtils.java:55)
    at org.junit.jupiter.api.AssertionUtils.failNotEqual(AssertionUtils.java:62)
    at org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:166)
    at org.junit.jupiter.api.AssertEquals.assertEquals(AssertEquals.java:161)
    at org.junit.jupiter.api.Assertions.assertEquals(Assertions.java:628)
    at org.apache.kafka.controller.QuorumControllerTest.testUnregisterBroker(QuorumControllerTest.java:494)

```

I reproduced it by running the test in a loop. It looks like what happens is 
that the BrokerRegistration request is able to get interleaved between the 
leader change event and the write of the bootstrap metadata. Something like 
this:
 1. handleLeaderChange() start
 2. appendWriteEvent(registerBroker)
 3. appendWriteEvent(bootstrapMetadata)
 4. handleLeaderChange() finish
 5. registerBroker() -> writes broker registration to log
 6. bootstrapMetadata() -> writes bootstrap metadata to log



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


[jira] [Created] (KAFKA-13967) Guarantees for producer callbacks on transaction commit should be documented

2022-06-07 Thread Chris Egerton (Jira)
Chris Egerton created KAFKA-13967:
-

 Summary: Guarantees for producer callbacks on transaction commit 
should be documented
 Key: KAFKA-13967
 URL: https://issues.apache.org/jira/browse/KAFKA-13967
 Project: Kafka
  Issue Type: Improvement
  Components: clients
Reporter: Chris Egerton
Assignee: Chris Egerton


As discussed in 
https://github.com/apache/kafka/pull/11780#discussion_r891835221, part of the 
contract for a transactional producer is that all callbacks given to the 
producer will have been invoked and completed (either successfully or by 
throwing an exception) by the time that {{KafkaProducer::commitTransaction}} 
returns. This should be documented so that users of the clients library can 
have a guarantee that they're not on the hook to do that kind of bookkeeping 
themselves.
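
A minimal sketch (topic name and config values assumed) of the guarantee to 
document:

{code:java}
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class TxnCallbackExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.TRANSACTIONAL_ID_CONFIG, "example-txn-id");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.initTransactions();
            producer.beginTransaction();
            producer.send(new ProducerRecord<>("example-topic", "key", "value"),
                (metadata, exception) -> {
                    // Per the guarantee described above, this callback has
                    // completed (successfully or exceptionally) before the
                    // commitTransaction() call below returns.
                });
            producer.commitTransaction();
        }
    }
}
{code}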



--
This message was sent by Atlassian Jira
(v8.20.7#820007)


Build failed in Jenkins: Kafka » Kafka Branch Builder » trunk #985

2022-06-07 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 634096 lines...]
[2022-06-08T04:31:11.461Z] 
[2022-06-08T04:31:11.461Z] 
org.apache.kafka.streams.integration.StoreUpgradeIntegrationTest > 
shouldMigrateInMemoryKeyValueStoreToTimestampedKeyValueStoreUsingPapi STARTED
[2022-06-08T04:32:04.706Z] 
[2022-06-08T04:32:04.706Z] 
org.apache.kafka.streams.integration.StoreUpgradeIntegrationTest > 
shouldMigrateInMemoryKeyValueStoreToTimestampedKeyValueStoreUsingPapi PASSED
[2022-06-08T04:32:04.706Z] 
[2022-06-08T04:32:04.706Z] 
org.apache.kafka.streams.integration.StoreUpgradeIntegrationTest > 
shouldProxyWindowStoreToTimestampedWindowStoreUsingPapi STARTED
[2022-06-08T04:32:49.747Z] 
[2022-06-08T04:32:49.747Z] 
org.apache.kafka.streams.integration.StoreUpgradeIntegrationTest > 
shouldProxyWindowStoreToTimestampedWindowStoreUsingPapi PASSED
[2022-06-08T04:32:49.747Z] 
[2022-06-08T04:32:49.747Z] 
org.apache.kafka.streams.integration.StoreUpgradeIntegrationTest > 
shouldMigratePersistentKeyValueStoreToTimestampedKeyValueStoreUsingPapi STARTED
[2022-06-08T04:33:39.836Z] 
[2022-06-08T04:33:39.836Z] 
org.apache.kafka.streams.integration.StoreUpgradeIntegrationTest > 
shouldMigratePersistentKeyValueStoreToTimestampedKeyValueStoreUsingPapi PASSED
[2022-06-08T04:33:39.836Z] 
[2022-06-08T04:33:39.836Z] 
org.apache.kafka.streams.integration.StoreUpgradeIntegrationTest > 
shouldMigratePersistentWindowStoreToTimestampedWindowStoreUsingPapi STARTED
[2022-06-08T04:34:33.783Z] 
[2022-06-08T04:34:33.783Z] 
org.apache.kafka.streams.integration.StoreUpgradeIntegrationTest > 
shouldMigratePersistentWindowStoreToTimestampedWindowStoreUsingPapi PASSED
[2022-06-08T04:34:33.783Z] 
[2022-06-08T04:34:33.783Z] 
org.apache.kafka.streams.integration.StoreUpgradeIntegrationTest > 
shouldProxyKeyValueStoreToTimestampedKeyValueStoreUsingPapi STARTED
[2022-06-08T04:35:20.177Z] 
[2022-06-08T04:35:20.177Z] 
org.apache.kafka.streams.integration.StoreUpgradeIntegrationTest > 
shouldProxyKeyValueStoreToTimestampedKeyValueStoreUsingPapi PASSED
[2022-06-08T04:35:20.177Z] 
[2022-06-08T04:35:20.177Z] 
org.apache.kafka.streams.integration.StoreUpgradeIntegrationTest > 
shouldMigrateInMemoryWindowStoreToTimestampedWindowStoreUsingPapi STARTED
[2022-06-08T04:36:05.075Z] 
[2022-06-08T04:36:05.075Z] 
org.apache.kafka.streams.integration.StoreUpgradeIntegrationTest > 
shouldMigrateInMemoryWindowStoreToTimestampedWindowStoreUsingPapi PASSED
[2022-06-08T04:36:05.075Z] 
[2022-06-08T04:36:05.075Z] 
org.apache.kafka.streams.integration.StreamTableJoinIntegrationTest > 
testInner[caching enabled = true] STARTED
[2022-06-08T04:36:11.949Z] 
[2022-06-08T04:36:11.949Z] 
org.apache.kafka.streams.integration.StreamTableJoinIntegrationTest > 
testInner[caching enabled = true] PASSED
[2022-06-08T04:36:11.949Z] 
[2022-06-08T04:36:11.949Z] 
org.apache.kafka.streams.integration.StreamTableJoinIntegrationTest > 
testLeft[caching enabled = true] STARTED
[2022-06-08T04:36:20.072Z] 
[2022-06-08T04:36:20.072Z] 
org.apache.kafka.streams.integration.StreamTableJoinIntegrationTest > 
testLeft[caching enabled = true] PASSED
[2022-06-08T04:36:20.072Z] 
[2022-06-08T04:36:20.072Z] 
org.apache.kafka.streams.integration.StreamTableJoinIntegrationTest > 
testInner[caching enabled = false] STARTED
[2022-06-08T04:36:27.158Z] 
[2022-06-08T04:36:27.158Z] 
org.apache.kafka.streams.integration.StreamTableJoinIntegrationTest > 
testInner[caching enabled = false] PASSED
[2022-06-08T04:36:27.158Z] 
[2022-06-08T04:36:27.158Z] 
org.apache.kafka.streams.integration.StreamTableJoinIntegrationTest > 
testLeft[caching enabled = false] STARTED
[2022-06-08T04:36:35.661Z] 
[2022-06-08T04:36:35.661Z] 
org.apache.kafka.streams.integration.StreamTableJoinIntegrationTest > 
testLeft[caching enabled = false] PASSED
[2022-06-08T04:36:35.661Z] 
[2022-06-08T04:36:35.661Z] 
org.apache.kafka.streams.integration.StreamsUncaughtExceptionHandlerIntegrationTest
 > shouldReplaceThreads STARTED
[2022-06-08T04:36:37.630Z] 
[2022-06-08T04:36:37.630Z] Exception: java.lang.AssertionError thrown from the 
UncaughtExceptionHandler in thread 
"appId_StreamsUncaughtExceptionHandlerIntegrationTestnull-871964ad-7d9d-4298-860d-5106ef5a9240-StreamThread-1"
[2022-06-08T04:36:40.816Z] 
[2022-06-08T04:36:40.816Z] Exception: java.lang.AssertionError thrown from the 
UncaughtExceptionHandler in thread 
"appId_StreamsUncaughtExceptionHandlerIntegrationTestnull-871964ad-7d9d-4298-860d-5106ef5a9240-StreamThread-2"
[2022-06-08T04:36:43.075Z] 
[2022-06-08T04:36:43.075Z] 
org.apache.kafka.streams.integration.StreamsUncaughtExceptionHandlerIntegrationTest
 > shouldReplaceThreads PASSED
[2022-06-08T04:36:43.075Z] 
[2022-06-08T04:36:43.075Z] 
org.apache.kafka.streams.integration.StreamsUncaughtExceptionHandlerIntegrationTest
 > shouldShutdownClientWhenIllegalStateException STARTED
[2022-06-08T04:36:49.770Z] 
[2022-06-08T04:36:49.770Z] 
org.apache

[jira] [Created] (KAFKA-13968) Broker should not generate a snapshot until it has been unfenced

2022-06-07 Thread dengziming (Jira)
dengziming created KAFKA-13968:
--

 Summary: Broker should not generate a snapshot until it has been unfenced
 Key: KAFKA-13968
 URL: https://issues.apache.org/jira/browse/KAFKA-13968
 Project: Kafka
  Issue Type: Bug
  Components: kraft
Reporter: dengziming
Assignee: dengziming


 

There is a bug when computing `FeaturesDelta` which causes us to generate a 
snapshot on every commit.

 

[2022-06-08 13:07:43,010] INFO [BrokerMetadataSnapshotter id=0] Creating a new 
snapshot at offset 0... (kafka.server.metadata.BrokerMetadataSnapshotter:66)

[2022-06-08 13:07:43,222] INFO [BrokerMetadataSnapshotter id=0] Creating a new 
snapshot at offset 2... (kafka.server.metadata.BrokerMetadataSnapshotter:66)

[2022-06-08 13:07:43,727] INFO [BrokerMetadataSnapshotter id=0] Creating a new 
snapshot at offset 3... (kafka.server.metadata.BrokerMetadataSnapshotter:66)

[2022-06-08 13:07:44,228] INFO [BrokerMetadataSnapshotter id=0] Creating a new 
snapshot at offset 4... (kafka.server.metadata.BrokerMetadataSnapshotter:66)

 

Before a broker is unfenced, it won't start publishing metadata, so it's 
meaningless to generate a snapshot.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)