Build failed in Jenkins: Kafka » Kafka Branch Builder » trunk #2441

2023-12-02 Thread Apache Jenkins Server
See 


Changes:


--
[...truncated 436568 lines...]
Gradle Test Run :core:test > Gradle Test Executor 92 > KafkaZkClientTest > testRegisterBrokerInfo() STARTED

Gradle Test Run :core:test > Gradle Test Executor 92 > KafkaZkClientTest > testRegisterBrokerInfo() PASSED

Gradle Test Run :core:test > Gradle Test Executor 92 > KafkaZkClientTest > testRetryRegisterBrokerInfo() STARTED

Gradle Test Run :core:test > Gradle Test Executor 92 > KafkaZkClientTest > testRetryRegisterBrokerInfo() PASSED

Gradle Test Run :core:test > Gradle Test Executor 92 > KafkaZkClientTest > testConsumerOffsetPath() STARTED

Gradle Test Run :core:test > Gradle Test Executor 92 > KafkaZkClientTest > testConsumerOffsetPath() PASSED

Gradle Test Run :core:test > Gradle Test Executor 92 > KafkaZkClientTest > testDeleteRecursiveWithControllerEpochVersionCheck() STARTED

Gradle Test Run :core:test > Gradle Test Executor 92 > KafkaZkClientTest > testDeleteRecursiveWithControllerEpochVersionCheck() PASSED

Gradle Test Run :core:test > Gradle Test Executor 92 > KafkaZkClientTest > testTopicAssignments() STARTED

Gradle Test Run :core:test > Gradle Test Executor 92 > KafkaZkClientTest > testTopicAssignments() PASSED

Gradle Test Run :core:test > Gradle Test Executor 92 > KafkaZkClientTest > testControllerManagementMethods() STARTED

Gradle Test Run :core:test > Gradle Test Executor 92 > KafkaZkClientTest > testControllerManagementMethods() PASSED

Gradle Test Run :core:test > Gradle Test Executor 92 > KafkaZkClientTest > testTopicAssignmentMethods() STARTED

Gradle Test Run :core:test > Gradle Test Executor 92 > KafkaZkClientTest > testTopicAssignmentMethods() PASSED

Gradle Test Run :core:test > Gradle Test Executor 92 > KafkaZkClientTest > testConnectionViaNettyClient() STARTED

Gradle Test Run :core:test > Gradle Test Executor 92 > KafkaZkClientTest > testConnectionViaNettyClient() PASSED

Gradle Test Run :core:test > Gradle Test Executor 92 > KafkaZkClientTest > testPropagateIsrChanges() STARTED

Gradle Test Run :core:test > Gradle Test Executor 92 > KafkaZkClientTest > testPropagateIsrChanges() PASSED

Gradle Test Run :core:test > Gradle Test Executor 92 > KafkaZkClientTest > testControllerEpochMethods() STARTED

Gradle Test Run :core:test > Gradle Test Executor 92 > KafkaZkClientTest > testControllerEpochMethods() PASSED

Gradle Test Run :core:test > Gradle Test Executor 92 > KafkaZkClientTest > testDeleteRecursive() STARTED

Gradle Test Run :core:test > Gradle Test Executor 92 > KafkaZkClientTest > testDeleteRecursive() PASSED

Gradle Test Run :core:test > Gradle Test Executor 92 > KafkaZkClientTest > testGetTopicPartitionStates() STARTED

Gradle Test Run :core:test > Gradle Test Executor 92 > KafkaZkClientTest > testGetTopicPartitionStates() PASSED

Gradle Test Run :core:test > Gradle Test Executor 92 > KafkaZkClientTest > testCreateConfigChangeNotification() STARTED

Gradle Test Run :core:test > Gradle Test Executor 92 > KafkaZkClientTest > testCreateConfigChangeNotification() PASSED

Gradle Test Run :core:test > Gradle Test Executor 92 > KafkaZkClientTest > testDelegationTokenMethods() STARTED

Gradle Test Run :core:test > Gradle Test Executor 92 > KafkaZkClientTest > testDelegationTokenMethods() PASSED

Gradle Test Run :core:test > Gradle Test Executor 92 > ReassignPartitionsZNodeTest > testDecodeInvalidJson() STARTED

Gradle Test Run :core:test > Gradle Test Executor 92 > ReassignPartitionsZNodeTest > testDecodeInvalidJson() PASSED

Gradle Test Run :core:test > Gradle Test Executor 92 > ReassignPartitionsZNodeTest > testEncode() STARTED

Gradle Test Run :core:test > Gradle Test Executor 92 > ReassignPartitionsZNodeTest > testEncode() PASSED

Gradle Test Run :core:test > Gradle Test Executor 92 > ReassignPartitionsZNodeTest > testDecodeValidJson() STARTED

Gradle Test Run :core:test > Gradle Test Executor 92 > ReassignPartitionsZNodeTest > testDecodeValidJson() PASSED

Gradle Test Run :core:test > Gradle Test Executor 92 > ZkMigrationIntegrationTest > testMigrateTopicDeletions(ClusterInstance) > [1] Type=ZK, MetadataVersion=3.4-IV0, Security=PLAINTEXT STARTED

Gradle Test Run :core:test > Gradle Test Executor 92 > ZkMigrationIntegrationTest > testMigrateTopicDeletions(ClusterInstance) > [1] Type=ZK, MetadataVersion=3.4-IV0, Security=PLAINTEXT SKIPPED

Gradle Test Run :core:test > Gradle Test Executor 92 > ZkMigrationIntegrationTest > testMigrateTopicDeletions(ClusterInstance) > [2] Type=ZK, MetadataVersion=3.5-IV2, Security=PLAINTEXT STARTED

Gradle Test Run :core:test > Gradle Test Executor 92 > ZkMigrationIntegrationTest > testMigrateTopicDeletions(ClusterInstance) > [2] Type=ZK, MetadataVersion=3.5-IV2, Security=PLAINTEXT SKIPPED

Gradle Test Run :core:test > Gradle Test Executor 92 > ZkMigrationIntegrationTest > testMigrateTopicDeletions(ClusterInstance) > [3] Type=

Re: [DISCUSS] KIP-1008: ParKa - the Marriage of Parquet and Kafka

2023-12-02 Thread Xinli shang
Hi Steven,

Thank you for your question! Firstly, the statistics such as min/max and null
count exist inside the file (in the page and column indexes), so you can
consider them part of the Parquet segment itself. In our proposal, these
statistics are generated at the Kafka producer when the Parquet format is
applied; they are part of the Parquet format. Secondly, when a table format
(like Delta, Iceberg, or Hudi) is applied during ingestion, the statistics
inside the file are rolled up into the table format's metadata.

The second part is out of scope for this KIP because it is entirely within
the realm of ingestion. This KIP provides a byte buffer to the ingestion
application, and the data inside that byte buffer is in Parquet format. Let
me know if you have any questions.
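
To make the first point concrete, here is a minimal sketch of reading those
per-column statistics back out of a Parquet footer. It uses the parquet-mr
library, and the file path is a placeholder; both are illustrative
assumptions on my part, not something this KIP mandates:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.parquet.column.statistics.Statistics;
import org.apache.parquet.hadoop.ParquetFileReader;
import org.apache.parquet.hadoop.metadata.BlockMetaData;
import org.apache.parquet.hadoop.metadata.ColumnChunkMetaData;
import org.apache.parquet.hadoop.util.HadoopInputFile;

public class ParquetStatsDump {
    public static void main(String[] args) throws Exception {
        // Placeholder path; in this KIP's scenario the bytes would come from
        // a Kafka segment whose payload is already Parquet-encoded.
        Path path = new Path("/tmp/segment.parquet");
        try (ParquetFileReader reader = ParquetFileReader.open(
                HadoopInputFile.fromPath(path, new Configuration()))) {
            // Min/max and null counts live in the footer's column chunk
            // metadata -- no row data needs to be decoded to read them.
            for (BlockMetaData block : reader.getFooter().getBlocks()) {
                for (ColumnChunkMetaData column : block.getColumns()) {
                    Statistics<?> stats = column.getStatistics();
                    System.out.printf("%s: min=%s max=%s nulls=%d%n",
                            column.getPath(),
                            stats.genericGetMin(),
                            stats.genericGetMax(),
                            stats.getNumNulls());
                }
            }
        }
    }
}

An ingestion job rolling these up into Delta/Iceberg/Hudi table metadata
would read exactly this footer information, which is why no re-encoding of
the row data is required.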

Xinli

On Sun, Nov 26, 2023 at 9:42 AM Steven Wu  wrote:

> >  if we can produce the segment with Parquet, which is the native format
> in a data lake, the consumer application (e.g., Spark jobs for ingestion)
> can directly dump the segments as raw byte buffer into the data lake
> without unwrapping each record individually and then writing to the Parquet
> file one by one with expensive steps of encoding and compression again.
>
> This sounds like an interesting idea. I have one concern though. Data
> Lake/table formats (like Delta Lake, Hudi, Iceberg) have column-level
> statistics, which are important for query performance. How would column
> stats be handled in this proposal?
>
> On Tue, Nov 21, 2023 at 9:21 AM Xinli shang 
> wrote:
>
> > Hi, all
> >
> > Can I ask for a discussion on the newly created KIP-1008: ParKa - the
> > Marriage of Parquet and Kafka
> > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-1008%3A+ParKa+-+the+Marriage+of+Parquet+and+Kafka>?
> >
> > --
> > Xinli Shang
> >
>


-- 
Xinli Shang


Re: [DISCUSS] KIP-954: expand default DSL store configuration to custom types

2023-12-02 Thread Guozhang Wang
Hey Almog,

Sorry for the late reply.

Re: 2) above, maybe I'm just overthinking it. What I had in mind is the
case where we have, say, a remote store impl customized by the user.
Besides being used inside the KS app itself, the user may want to access
the store instance outside the KS app as well. If that's the case, maybe
it's still worth having an interface in KS that exposes the store
instance directly.


Guozhang


On Sun, Nov 19, 2023 at 5:26 PM Almog Gavra  wrote:
>
> Hello Guozhang,
>
> Thanks for the feedback! For 1, there are tests verifying this (and I
> checked manually as well): it does not reveal anything about the store
> types -- just the names -- so I think we're good there. I've put an example
> at the bottom of this reply for people following the conversation.
>
> I'm not sure I understand your question about 2. What's the integration
> point with the actual store for this external component? What does that
> have to do with this PR, and how does it differ from what's available today
> (with the default.dsl.store configuration)? In either scenario, getting the
> actual instantiated store supplier can only be done after the topology is
> built and rewritten (it can be passed in via Materialized/StreamJoined in
> the DSL code, via TopologyConfig overrides, or in the global StreamsConfig
> passed in to KafkaStreams). Today, AFAIK, this isn't possible (you can't
> get the instantiated store supplier from the built topology).
>
> Thanks,
> Almog
>
> 
>
> Topologies:
>Sub-topology: 0
> Source: KSTREAM-SOURCE-00 (topics: [test_topic])
>   --> KSTREAM-TRANSFORMVALUES-01
> Processor: KSTREAM-TRANSFORMVALUES-01 (stores: [])
>   --> Aggregate-Prepare
>   <-- KSTREAM-SOURCE-00
> Processor: Aggregate-Prepare (stores: [])
>   --> KSTREAM-AGGREGATE-03
>   <-- KSTREAM-TRANSFORMVALUES-01
> Processor: KSTREAM-AGGREGATE-03 (stores:
> [Aggregate-Aggregate-Materialize])
>   --> Aggregate-Aggregate-ToOutputSchema
>   <-- Aggregate-Prepare
> Processor: Aggregate-Aggregate-ToOutputSchema (stores: [])
>   --> Aggregate-Project
>   <-- KSTREAM-AGGREGATE-03
> Processor: Aggregate-Project (stores: [])
>   --> KTABLE-TOSTREAM-06
>   <-- Aggregate-Aggregate-ToOutputSchema
> Processor: KTABLE-TOSTREAM-06 (stores: [])
>   --> KSTREAM-SINK-07
>   <-- Aggregate-Project
> Sink: KSTREAM-SINK-07 (topic: S2)
>   <-- KTABLE-TOSTREAM-06
>
> On Sat, Nov 18, 2023 at 6:05 PM Guozhang Wang 
> wrote:
>
> > Hello Almog,
> >
> > I left a comment in the PR before I got to read the newest updates
> > from this thread. My 2c:
> >
> > 1. I liked the idea of delaying the instantiation of StoreBuilder from
> > suppliers until after the Topology is created. It has been a bit annoying
> > for many other features we were trying back then. The only thing is,
> > we need to check whether calling Topology.describe() (which returns a
> > TopologyDescription) already reveals anything about the source-of-truth
> > store impl types; if it does not, then we are good to go.
> >
> > 2. I originally thought (and commented in the PR) that maybe we could
> > just add this new func "resolveDslStoreSuppliers" to StreamsConfig
> > directly and mark it as EVOLVING, because it was not clear to me that we
> > were trying to do 1) above. Now I'm leaning more towards what you
> > proposed. But I still have a question in mind: even after we've done
> > https://github.com/apache/kafka/pull/14548 later, don't we still need
> > some interface that users can call to get the actual instantiated store
> > supplier? Think of external custom logic, like an external controller or
> > scheduler developed by a different group of people than the Streams app
> > developers themselves, that can only turn on certain features after
> > learning the actual store impl suppliers in use.
> >
> > Guozhang
> >
> > On Sat, Nov 18, 2023 at 2:46 PM Almog Gavra  wrote:
> > >
> > > Hello Everyone - one more minor change to the KIP that came up during
> > > implementation (reflected in the KIP itself). I will be adding the
> > > method below to TopologyConfig. It allows us to determine whether the
> > > DslStoreSuppliers was explicitly passed in via either
> > > DSL_STORE_SUPPLIERS_CLASS_CONFIG or DEFAULT_DSL_STORE_CONFIG (if it was
> > > not explicitly passed in, we will use the one configured in
> > > StreamsConfig, or the default of RocksDBDslStoreSuppliers).
> > >
> > > See the discussion on the PR (
> > > https://github.com/apache/kafka/pull/14648#discussion_r1394939779) for
> > > more context. Ideally this would be an internal utility method, but
> > > there's no clean way to get that done now, so the goal was to minimize
> > > the surface area of what's being exposed (as opposed to exposing a
> > > generic method like isOverridden(Stri
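
For readers following the thread, here is a rough sketch of the
configuration precedence Almog describes, assuming the
DSL_STORE_SUPPLIERS_CLASS_CONFIG key and the BuiltInDslStoreSuppliers
classes land as proposed in KIP-954 (an illustration, not the final API):

import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.state.BuiltInDslStoreSuppliers;

public class Kip954ConfigSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "kip-954-demo");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.StringSerde.class);
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.StringSerde.class);

        // Explicit suppliers choice: per the precedence described above,
        // this overrides the legacy default.dsl.store setting; if neither
        // config is set, the RocksDB suppliers remain the default.
        props.put(StreamsConfig.DSL_STORE_SUPPLIERS_CLASS_CONFIG,
                BuiltInDslStoreSuppliers.InMemoryDslStoreSuppliers.class);

        StreamsBuilder builder = new StreamsBuilder();
        // The backing state store for this table is created from the
        // configured suppliers when the topology is built and rewritten.
        builder.table("test_topic");

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
    }
}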

[jira] [Created] (KAFKA-15959) Replace byte handling classes with synchronized methods

2023-12-02 Thread Ismael Juma (Jira)
Ismael Juma created KAFKA-15959:
---

 Summary: Replace byte handling classes with synchronized methods
 Key: KAFKA-15959
 URL: https://issues.apache.org/jira/browse/KAFKA-15959
 Project: Kafka
  Issue Type: Improvement
Reporter: Ismael Juma


The JDK has a number of old byte handling classes whose methods are 
synchronized. This wasn't too bad until biased locking was disabled by default 
(Java 15) and later removed outright.

The overhead can now be significant if one of these methods lands on a hot 
path, and it is unnecessary whenever the classes are used by a single thread 
(which is very common).

The classes we should replace:
 # ByteArrayInputStream
 # ByteArrayOutputStream
 # DataOutputStream
 # BufferedInputStream
 # BufferedOutputStream
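
As an illustration of the kind of replacement this implies, here is a minimal 
unsynchronized sketch of the ByteArrayInputStream case; the class name is 
hypothetical and not part of this ticket:

import java.io.InputStream;

// Hypothetical unsynchronized replacement for java.io.ByteArrayInputStream,
// whose read() and read(byte[], int, int) are declared synchronized -- pure
// overhead when the stream is confined to a single thread.
public final class UnsynchronizedByteArrayInputStream extends InputStream {
    private final byte[] buf;
    private final int count;
    private int pos;

    public UnsynchronizedByteArrayInputStream(byte[] buf) {
        this.buf = buf;
        this.count = buf.length;
        this.pos = 0;
    }

    @Override
    public int read() { // no "synchronized" modifier, unlike the JDK class
        return (pos < count) ? (buf[pos++] & 0xff) : -1;
    }

    @Override
    public int read(byte[] b, int off, int len) {
        if (pos >= count) return -1;
        int n = Math.min(len, count - pos);
        System.arraycopy(buf, pos, b, off, n);
        pos += n;
        return n;
    }

    @Override
    public int available() {
        return count - pos;
    }
}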



--
This message was sent by Atlassian Jira
(v8.20.10#820010)