Re: [DISCUSS] KIP-877: Mechanism for plugins and connectors to register metrics

2022-12-27 Thread Chris Egerton
Hi Yash,

Yes, a default no-op is exactly what I had in mind should the Connector and
Task classes implement the Monitorable interface.

Cheers,

Chris

On Tue, Dec 20, 2022 at 2:46 AM Yash Mayya wrote:

> Hi Mickael,
>
> Thanks for creating this KIP, this will be a super useful feature to
> enhance existing connectors in the Kafka Connect ecosystem.
>
> I have some similar concerns to the ones that Chris has outlined above,
> especially with regard to directly exposing Connect's Metrics object to
> plugins. I believe it would be a lot friendlier to developers if we instead
> exposed wrapper methods in the context classes - such as one for
> registering a new metric, one for recording metric values and so on. This
> would also have the added benefit of minimizing the surface area for
> potential misuse by custom plugins.
>
> > for connectors and tasks they should handle the
> > metrics() method returning null when deployed on
> > an older runtime.
>
> I believe this won't be the case, and instead they'll need to handle a
> `NoSuchMethodError` right? This is similar to previous KIPs that added
> methods to connector context classes and will arise due to an
> incompatibility between the `connect-api` dependency that a plugin will be
> compiled against versus what it will actually get at runtime.
>
> Hi Chris,
>
> > WDYT about having the Connector and Task classes
> > implement the Monitorable interface, both for
> > consistency's sake, and to prevent classloading
> > headaches?
>
> Are you suggesting that the framework should configure connectors / tasks
> with a Metrics instance during their startup rather than the connector /
> task asking the framework to provide one? In this case, I'm guessing you're
> envisioning a default no-op implementation for the metrics configuration
> method rather than the framework having to handle the case where the
> connector was compiled against an older version of Connect right?
>
> Thanks,
> Yash
>
> On Wed, Nov 30, 2022 at 1:38 AM Chris Egerton wrote:
>
> > Hi Mickael,
> >
> > Thanks for the KIP! This seems especially useful to reduce the
> > implementation cost and divergence in behavior for connectors that choose
> > to publish their own metrics.
> >
> > My initial thoughts:
> >
> > 1. Are you certain that the default implementation of the "metrics" method
> > for the various connector/task context classes will be used on older
> > versions of the Connect runtime? My understanding was that a
> > NoSuchMethodError (or some similar classloading exception) would be thrown
> > in that case. If that turns out to be true, WDYT about having the Connector
> > and Task classes implement the Monitorable interface, both for
> > consistency's sake, and to prevent classloading headaches?
> >
> > 2. Although I agree that administrators should be careful about which
> > plugins they run on their clients, Connect clusters, etc., I wonder if
> > there might still be value in wrapping the Metrics class behind a new
> > interface, for a few reasons:
> >
> >   a. Developers and administrators may still make mistakes, and if we can
> > reduce the blast radius by preventing plugins from, e.g., closing the
> > Metrics instance we give them, it may be worth it. This could also be
> > accomplished by forbidding plugins from invoking these methods, and giving
> > them a subclass of Metrics that throws UnsupportedOperationException from
> > these methods.
> >
> >   b. If we don't know of any reasonable use cases for closing the instance,
> > adding new reporters, removing metrics, etc., it can make the API cleaner
> > and easier for developers to grok if they don't even have the option to do
> > any of those things.
> >
> >   c. Interoperability between plugins that implement Monitorable and their
> > runtime becomes complicated. For example, a connector may be built against
> > a version of Kafka that introduces new methods for the Metrics class, which
> > introduces risks of incompatibility if its developer chooses to take
> > advantage of these methods without realizing that they will not be
> > available on Connect runtimes built against an older version of Kafka. With
> > a wrapper interface, we at least have a chance to isolate these issues so
> > that the Metrics class can be expanded without adding footguns for plugins
> > that implement Monitorable, and to call out potential compatibility
> > problems in documentation more clearly if/when we do expand the wrapper
> > interface.
> >
> > 3. It'd be nice to see a list of exactly which plugins will be able to take
> > advantage of the new Monitorable interface.
> >
> > Looking forward to your thoughts!
> >
> > Cheers,
> >
> > Chris
> >
> > On Mon, Nov 7, 2022 at 11:42 AM Mickael Maison wrote:
> >
> > > Hi,
> > >
> > > I have opened KIP-877 to make it easy for plugins and connectors to
> > > register their own metrics:
> > >
> > > https://eu01.z.antigena.com/l/9lWv8kbU9CKs2LajwgfKF~yMNQVM7rWRxYmYVNrHU
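
The two ideas discussed in this thread (a default no-op on the plugin side,
and a narrow wrapper instead of the full Metrics object) can be sketched
roughly as follows. This is only an illustration under stated assumptions:
Monitorable is the interface name used in the thread, but the method name
withPluginMetrics, the PluginMetrics wrapper, and every other class below are
hypothetical and are not taken from the KIP or from Kafka's code base.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.DoubleSupplier;

/** Hypothetical narrow wrapper: plugins can register gauges, nothing else. */
interface PluginMetrics {
    void addGauge(String name, DoubleSupplier value);
}

/** Stand-in for the Monitorable interface; the method name is an assumption. */
interface Monitorable {
    // Default no-op: a plugin that never overrides this still loads and runs.
    default void withPluginMetrics(PluginMetrics metrics) {
    }
}

/** Hypothetical runtime-side registry; lifecycle methods stay off the wrapper. */
final class RuntimePluginMetrics implements PluginMetrics {
    private final Map<String, DoubleSupplier> gauges = new LinkedHashMap<>();

    @Override
    public void addGauge(String name, DoubleSupplier value) {
        if (gauges.putIfAbsent(name, value) != null) {
            throw new IllegalArgumentException("Metric already registered: " + name);
        }
    }

    // Only the runtime holds this concrete type, so only it can read or close.
    double read(String name) {
        return gauges.get(name).getAsDouble();
    }

    int size() {
        return gauges.size();
    }

    void close() {
        gauges.clear();
    }
}

/** A plugin that ignores metrics entirely and relies on the default no-op. */
final class LegacyConnector implements Monitorable {
}

/** A plugin that opts in and registers one gauge through the wrapper. */
final class CountingConnector implements Monitorable {
    private long records;

    @Override
    public void withPluginMetrics(PluginMetrics metrics) {
        metrics.addGauge("records-seen", () -> records);
    }

    void onRecord() {
        records++;
    }
}
```

Because the plugin only ever sees PluginMetrics, it has no way to close the
registry or add reporters, which is the "blast radius" point from 2a and 2b.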

[jira] [Created] (KAFKA-14554) Move ClassLoaderAwareRemoteStorageManagerTest to storage module

2022-12-27 Thread Federico Valeri (Jira)
Federico Valeri created KAFKA-14554:
---

             Summary: Move ClassLoaderAwareRemoteStorageManagerTest to storage module
                 Key: KAFKA-14554
                 URL: https://issues.apache.org/jira/browse/KAFKA-14554
             Project: Kafka
          Issue Type: Sub-task
            Reporter: Federico Valeri






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


Re: Request Permissions To Contribute To Apache Kafka

2022-12-27 Thread Chris Egerton
Hi Terry,

You should be good to go.

Cheers,

Chris

On Sat, Dec 24, 2022 at 12:57 PM @T B.. wrote:

> Wiki ID:  beardt
> Jira ID:  beardt
>
> Kind Regards,
>
> -Terry Beard
>


Re: [VOTE] KIP-887 - Add ConfigProvider to make use of environment variables

2022-12-27 Thread Chris Egerton
Hi Roman,

Thanks for the KIP! I'm +1 (binding)

Cheers,

Chris

On Wed, Dec 14, 2022 at 3:52 PM Roman Schmitz wrote:

> Hi all,
>
> Thank you for the feedback so far.
> The KIP is rather straightforward and I'd like to start a vote on it.
> Please have a look at the KIP:
> https://eu01.z.antigena.com/l/EXPIk5DmddkPFlqfPnlswu2VHYg_8h-TuWq8d3DskL7C2Rgsv7AwoRLT9J1PT-WH2TaJ9SSZSW9IvgzjtTq4ksyl~QkZThD9b5tl_IhLpkq_OT2u-nL~lu3jT3a3DabKzOo5NUdNPsmM34PAefwMFE~QOWHNaYIqWpXSIsu2IXd_C_4
>
> Thanks,
> Roman
>


[jira] [Resolved] (KAFKA-14548) Stable streams applications stall due to infrequent restoreConsumer polls

2022-12-27 Thread Greg Harris (Jira)


     [ https://issues.apache.org/jira/browse/KAFKA-14548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Greg Harris resolved KAFKA-14548.
-
Resolution: Duplicate

> Stable streams applications stall due to infrequent restoreConsumer polls
> -
>
>                 Key: KAFKA-14548
>                 URL: https://issues.apache.org/jira/browse/KAFKA-14548
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>            Reporter: Greg Harris
>            Priority: Major
>
> We have observed behavior with Streams where otherwise healthy applications 
> stall and become unable to process data after a rebalance 
> (https://issues.apache.org/jira/browse/KAFKA-13405). The root cause is that 
> a restoreConsumer can be partitioned from a Kafka cluster with stale 
> metadata, while the mainConsumer is healthy with up-to-date metadata. This is 
> due to both an issue in Streams and an issue in the consumer logic.
> In StoreChangelogReader, a long-lived restoreConsumer is kept instantiated 
> while the streams app is running. This consumer is only `poll()`ed when the 
> ChangelogReader::restore method is called and at least one changelog is in 
> the RESTORING state. This may be very infrequent if the streams app is stable.
> This is an anti-pattern, as frequent poll()s are expected to keep kafka 
> consumers in contact with the kafka cluster. Infrequent polls are considered 
> failures from the perspective of the consumer API. From the [official Kafka 
> Consumer 
> documentation|https://kafka.apache.org/33/javadoc/org/apache/kafka/clients/consumer/KafkaConsumer.html]:
> {noformat}
> The poll API is designed to ensure consumer liveness.
> ...
> So to stay in the group, you must continue to call poll.
> ...
> The recommended way to handle these cases [where the main thread is not ready 
> for more data] is to move message processing to another thread, which allows 
> the consumer to continue calling poll while the processor is still working.
> ...
> Note also that you will need to pause the partition so that no new records 
> are received from poll until after thread has finished handling those 
> previously returned.{noformat}
> With the current behavior, it is expected that the restoreConsumer will fall 
> out of the group regularly and be considered failed, when the rest of the 
> application is running exactly as intended.
> This is not normally an issue, as falling out of the group is easily repaired 
> by joining the group during the next poll. It does mean that there is 
> slightly higher latency to performing a restore, but that does not appear to 
> be a major concern at this time.
> This does become an issue when other deeper assumptions about the usage of 
> Kafka clients are violated. Relevant to this issue, it is assumed by the 
> client metadata management logic that regular polling will take place, and 
> that the regular poll call can be piggy-backed to initiate a metadata update. 
> Without a regular poll, the regular metadata update cannot be performed, and 
> the consumer violates its own `metadata.max.age.ms` configuration. This leaves 
> the restoreConsumer with much older metadata that contains none of the 
> currently live brokers, partitioning it from the cluster.
> Alleviating this failure mode does not _require_ the streams' polling 
> behavior to change, as solutions for all clients have been considered 
> (https://issues.apache.org/jira/browse/KAFKA-3068 and that family of 
> duplicate issues).
> However, as a tactical fix for the issue, and one which does not require a 
> KIP changing the behavior of {_}every kafka client{_}, we should consider 
> changing the restoreConsumer poll behavior to bring it closer to the expected 
> happy-path of at least one poll() every max.poll.interval.ms.
> If there is another hidden assumption of the clients that relies on regular 
> polling, then this tactical fix may prevent users of the streams library from 
> being affected, reducing the impact of that hidden assumption through 
> defense-in-depth.
> This would also be a backport-able fix for streams users, instead of a fix to 
> the consumers which would only apply to new versions of the consumers.
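
The tactical fix proposed above ("at least one poll() per interval, even when
there is nothing to restore") can be sketched as a small helper. This is a
minimal sketch under stated assumptions, not the change that was made in
Streams: KeepAlivePoller, maxIdle, and the injected clock and poll action are
all hypothetical names. In real use the poll action would presumably pause
every assigned partition and then call poll with a zero timeout on the
restore consumer, so that no records are fetched.

```java
import java.time.Duration;
import java.util.function.LongSupplier;

/** Hypothetical helper: issues a keep-alive poll once the consumer has been idle too long. */
final class KeepAlivePoller {
    private final long maxIdleMs;
    private final LongSupplier clockMs; // injected so the timing logic is testable
    private final Runnable pollAction;  // assumption: wraps pause(assignment()) + poll(Duration.ZERO)
    private long lastPollMs;

    KeepAlivePoller(Duration maxIdle, LongSupplier clockMs, Runnable pollAction) {
        this.maxIdleMs = maxIdle.toMillis();
        this.clockMs = clockMs;
        this.pollAction = pollAction;
        this.lastPollMs = clockMs.getAsLong();
    }

    /** Call after every real poll (for example, one issued during a restore). */
    void recordPoll() {
        lastPollMs = clockMs.getAsLong();
    }

    /**
     * Call once per main-loop iteration. Runs the poll action only if at least
     * maxIdle has elapsed since the last poll, and reports whether it did.
     */
    boolean maybePoll() {
        long now = clockMs.getAsLong();
        if (now - lastPollMs < maxIdleMs) {
            return false;
        }
        pollAction.run();
        lastPollMs = now;
        return true;
    }
}
```

Calling maybePoll() from the main loop bounds the gap between polls by maxIdle
plus one loop iteration, which is what lets the regular metadata refresh
piggy-back on the poll again.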





Re: [ANNOUNCE] New committer: Satish Duggana

2022-12-27 Thread Matthias J. Sax

Congrats!

On 12/27/22 10:20 AM, Kirk True wrote:

Congrats, Satish!

On Fri, Dec 23, 2022, at 10:07 AM, Jun Rao wrote:

Hi, Everyone,

The PMC of Apache Kafka is pleased to announce a new Kafka committer Satish
Duggana.

Satish has been a long time Kafka contributor since 2017. He is the main
driver behind KIP-405 that integrates Kafka with remote storage, a
significant and much anticipated feature in Kafka.

Congratulations, Satish!

Thanks,

Jun (on behalf of the Apache Kafka PMC)





[jira] [Created] (KAFKA-14555) Segfault in RocksDB DumpDataBlocks

2022-12-27 Thread Greg Harris (Jira)
Greg Harris created KAFKA-14555:
---

             Summary: Segfault in RocksDB DumpDataBlocks
                 Key: KAFKA-14555
                 URL: https://issues.apache.org/jira/browse/KAFKA-14555
             Project: Kafka
          Issue Type: Bug
          Components: streams
    Affects Versions: 3.4.0
            Reporter: Greg Harris


I encountered this SIGSEGV while running the streams tests with gradle locally. 
I am unable to reproduce the crash reliably.

Here's the native stacktrace:
{noformat}
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x0001269f2f2c, pid=88913, tid=40199
#
# JRE version: OpenJDK Runtime Environment Corretto-17.0.4.9.1 (17.0.4.1+9) (build 17.0.4.1+9-LTS)
# Java VM: OpenJDK 64-Bit Server VM Corretto-17.0.4.9.1 (17.0.4.1+9-LTS, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, parallel gc, bsd-aarch64)
# Problematic frame:
# C  [librocksdbjni15989196819046251041.jnilib+0x2def2c]  _ZN7rocksdb15BlockBasedTable14DumpDataBlocksERNSt3__113basic_ostreamIcNS1_11char_traitsIc+0x1650

---  T H R E A D  ---
Current thread is native thread

Stack: [0x000171704000,0x000171787000],  sp=0x000171784e90,  free space=515k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [librocksdbjni15989196819046251041.jnilib+0x2def2c]  _ZN7rocksdb15BlockBasedTable14DumpDataBlocksERNSt3__113basic_ostreamIcNS1_11char_traitsIc+0x1650
C  [librocksdbjni15989196819046251041.jnilib+0x2cff98]  _ZN7rocksdb15BlockBasedTable28PrefetchIndexAndFilterBlocksERKNS_11ReadOptionsEPNS_18FilePrefetchBufferEPNS_20InternalIteratorBaseINS_5SliceEEEPS0_bRKNS_22BlockBasedTableOptionsEimmPNS_23BlockCacheLookupContextE+0x354
C  [librocksdbjni15989196819046251041.jnilib+0x2ce4d4]  _ZN7rocksdb15BlockBasedTable4OpenERKNS_11ReadOptionsERKNS_16ImmutableOptionsERKNS_10EnvOptionsERKNS_22BlockBasedTableOptionsERKNS_21InternalKeyComparatorEONSt3__110unique_ptrINS_22RandomAccessFileReaderENSG_14default_deleteISI_yPNSH_INS_11TableReaderENSJ_ISN_RKNSG_10shared_ptrIKNS_14SliceTransformEEEbbibybPNS_17TailPrefetchStatsEPNS_16BlockCacheTracerEmRKNSG_12basic_stringIcNSG_11char_traitsIcEENSG_9allocatorIcy+0xaa0
C  [librocksdbjni15989196819046251041.jnilib+0x2bbd04]  _ZNK7rocksdb22BlockBasedTableFactory14NewTableReaderERKNS_11ReadOptionsERKNS_18TableReaderOptionsEONSt3__110unique_ptrINS_22RandomAccessFileReaderENS7_14default_deleteIS9_yPNS8_INS_11TableReaderENSA_ISE_b+0x8c
C  [librocksdbjni15989196819046251041.jnilib+0x18de18]  _ZN7rocksdb10TableCache14GetTableReaderERKNS_11ReadOptionsERKNS_11FileOptionsERKNS_21InternalKeyComparatorERKNS_14FileDescriptorEbbPNS_13HistogramImplEPNSt3__110unique_ptrINS_11TableReaderENSF_14default_deleteISH_RKNSF_10shared_ptrIKNS_14SliceTransformEEEbibmNS_11TemperatureE+0x418
C  [librocksdbjni15989196819046251041.jnilib+0x18e5e8]  _ZN7rocksdb10TableCache9FindTableERKNS_11ReadOptionsERKNS_11FileOptionsERKNS_21InternalKeyComparatorERKNS_14FileDescriptorEPPNS_5Cache6HandleERKNSt3__110shared_ptrIKNS_14SliceTransformEEEbbPNS_13HistogramImplEbibmNS_11TemperatureE+0x22c
C  [librocksdbjni15989196819046251041.jnilib+0x18e96c]  _ZN7rocksdb10TableCache11NewIteratorERKNS_11ReadOptionsERKNS_11FileOptionsERKNS_21InternalKeyComparatorERKNS_12FileMetaDataEPNS_18RangeDelAggregatorERKNSt3__110shared_ptrIKNS_14SliceTransformEEEPPNS_11TableReaderEPNS_13HistogramImplENS_17TableReaderCallerEPNS_5ArenaEbimPKNS_11InternalKeyESW_b+0x1ac
C  [librocksdbjni15989196819046251041.jnilib+0x8fdc8]  _ZN7rocksdb13CompactionJob25ProcessKeyValueCompactionEPNS0_18SubcompactionStateE+0x1be4
C  [librocksdbjni15989196819046251041.jnilib+0x8d92c]  _ZN7rocksdb13CompactionJob3RunEv+0xed8
C  [librocksdbjni15989196819046251041.jnilib+0xfb318]  _ZN7rocksdb6DBImpl20BackgroundCompactionEPbPNS_10JobContextEPNS_9LogBufferEPNS0_19PrepickedCompactionENS_3Env8PriorityE+0xbc8
C  [librocksdbjni15989196819046251041.jnilib+0xf9484]  _ZN7rocksdb6DBImpl24BackgroundCallCompactionEPNS0_19PrepickedCompactionENS_3Env8PriorityE+0xc0
C  [librocksdbjni15989196819046251041.jnilib+0xf6f58]  _ZN7rocksdb6DBImpl16BGWorkCompactionEPv+0x30
C  [librocksdbjni15989196819046251041.jnilib+0x3561dc]  _ZN7rocksdb14ThreadPoolImpl4Impl8BGThreadEm+0x1ec
C  [librocksdbjni15989196819046251041.jnilib+0x35645c]  _ZN7rocksdb14ThreadPoolImpl4Impl15BGThreadWrapperEPv+0x7c
C  [librocksdbjni15989196819046251041.jnilib+0x357ed8]  _ZN7rocksdb13NewThreadPoolEi+0x2b0
C  [libsystem_pthread.dylib+0x726c]  _pthread_start+0x94
{noformat}
 





Jenkins build is unstable: Kafka » Kafka Branch Builder » trunk #1468

2022-12-27 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: Kafka » Kafka Branch Builder » 3.4 #19

2022-12-27 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: Kafka » Kafka Branch Builder » trunk #1469

2022-12-27 Thread Apache Jenkins Server
See 




Jenkins build is still unstable: Kafka » Kafka Branch Builder » 3.4 #20

2022-12-27 Thread Apache Jenkins Server
See 




Build failed in Jenkins: Kafka » Kafka Branch Builder » trunk #1470

2022-12-27 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 525303 lines...]
[2022-12-28T04:40:16.774Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 165 > KafkaZkClientTest > testChrootExistsAndRootIsLocked() STARTED
[2022-12-28T04:40:17.857Z] 
[2022-12-28T04:40:17.857Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 165 > KafkaZkClientTest > testChrootExistsAndRootIsLocked() PASSED
[2022-12-28T04:40:17.857Z] 
[2022-12-28T04:40:17.857Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 165 > KafkaZkClientTest > testCreateTopLevelPaths() STARTED
[2022-12-28T04:40:17.857Z] 
[2022-12-28T04:40:17.857Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 165 > KafkaZkClientTest > testCreateTopLevelPaths() PASSED
[2022-12-28T04:40:17.857Z] 
[2022-12-28T04:40:17.857Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 165 > KafkaZkClientTest > 
testGetAllTopicsInClusterDoesNotTriggerWatch() STARTED
[2022-12-28T04:40:18.776Z] 
[2022-12-28T04:40:18.776Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 165 > KafkaZkClientTest > 
testGetAllTopicsInClusterDoesNotTriggerWatch() PASSED
[2022-12-28T04:40:18.776Z] 
[2022-12-28T04:40:18.776Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 165 > KafkaZkClientTest > testIsrChangeNotificationGetters() STARTED
[2022-12-28T04:40:18.776Z] 
[2022-12-28T04:40:18.776Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 165 > KafkaZkClientTest > testIsrChangeNotificationGetters() PASSED
[2022-12-28T04:40:18.776Z] 
[2022-12-28T04:40:18.776Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 165 > KafkaZkClientTest > testLogDirEventNotificationsDeletion() 
STARTED
[2022-12-28T04:40:18.776Z] 
[2022-12-28T04:40:18.776Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 165 > KafkaZkClientTest > testLogDirEventNotificationsDeletion() PASSED
[2022-12-28T04:40:18.776Z] 
[2022-12-28T04:40:18.776Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 165 > KafkaZkClientTest > testGetLogConfigs() STARTED
[2022-12-28T04:40:19.694Z] 
[2022-12-28T04:40:19.694Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 165 > KafkaZkClientTest > testGetLogConfigs() PASSED
[2022-12-28T04:40:19.694Z] 
[2022-12-28T04:40:19.694Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 165 > KafkaZkClientTest > testBrokerSequenceIdMethods() STARTED
[2022-12-28T04:40:19.694Z] 
[2022-12-28T04:40:19.694Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 165 > KafkaZkClientTest > testBrokerSequenceIdMethods() PASSED
[2022-12-28T04:40:19.694Z] 
[2022-12-28T04:40:19.694Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 165 > KafkaZkClientTest > testAclMethods() STARTED
[2022-12-28T04:40:20.612Z] 
[2022-12-28T04:40:20.612Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 165 > KafkaZkClientTest > testAclMethods() PASSED
[2022-12-28T04:40:20.612Z] 
[2022-12-28T04:40:20.612Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 165 > KafkaZkClientTest > testCreateSequentialPersistentPath() STARTED
[2022-12-28T04:40:20.612Z] 
[2022-12-28T04:40:20.612Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 165 > KafkaZkClientTest > testCreateSequentialPersistentPath() PASSED
[2022-12-28T04:40:20.612Z] 
[2022-12-28T04:40:20.612Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 165 > KafkaZkClientTest > testConditionalUpdatePath() STARTED
[2022-12-28T04:40:20.612Z] 
[2022-12-28T04:40:20.612Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 165 > KafkaZkClientTest > testConditionalUpdatePath() PASSED
[2022-12-28T04:40:20.612Z] 
[2022-12-28T04:40:20.612Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 165 > KafkaZkClientTest > testGetAllTopicsInClusterTriggersWatch() 
STARTED
[2022-12-28T04:40:21.611Z] 
[2022-12-28T04:40:21.611Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 165 > KafkaZkClientTest > testGetAllTopicsInClusterTriggersWatch() 
PASSED
[2022-12-28T04:40:21.611Z] 
[2022-12-28T04:40:21.611Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 165 > KafkaZkClientTest > testDeleteTopicZNode() STARTED
[2022-12-28T04:40:21.611Z] 
[2022-12-28T04:40:21.611Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 165 > KafkaZkClientTest > testDeleteTopicZNode() PASSED
[2022-12-28T04:40:21.611Z] 
[2022-12-28T04:40:21.611Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 165 > KafkaZkClientTest > testDeletePath() STARTED
[2022-12-28T04:40:21.611Z] 
[2022-12-28T04:40:21.611Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 165 > KafkaZkClientTest > testDeletePath() PASSED
[2022-12-28T04:40:21.611Z] 
[2022-12-28T04:40:21.611Z] Gradle Test Run :core:integrationTest > Gradle Test 
Executor 165 > KafkaZkClientTest > test