Using GlobalKTable/KeyValueStore for topic cache

2018-11-13 Thread Chris Toomey
We're considering using GlobalKTables / KeyValueStores for locally caching topic content in services. The topics would be compacted such that only the latest key/value pair would exist for a given key. One question that's come up is how to determine, when bootstrapping the app, when the cache has

Re: Using GlobalKTable/KeyValueStore for topic cache

2018-11-13 Thread Chris Toomey
Definitely Ryanne -- that's what I meant by "topics would be compacted". But that doesn't obviate checking bootstrapping progress. On Tue, Nov 13, 2018 at 5:04 PM Ryanne Dolan wrote: > Chris, consider using log compaction. > > Ryanne > > On Tue, Nov 13

Re: Using GlobalKTable/KeyValueStore for topic cache

2018-11-13 Thread Chris Toomey
when restoring state stores (including > GlobalStores) and may provide what you are looking for. > > Thanks, > Bill > > On Tue, Nov 13, 2018 at 8:49 PM Chris Toomey wrote: > > > Definitely Ryanne -- that's what I meant by "topics would be compacted". >

Re: Using GlobalKTable/KeyValueStore for topic cache

2018-11-14 Thread Chris Toomey
t; This can take a little for larger topic and/or multiple global stores. > > We are blocking access until they are available although this is not ideal > in terms of timeout tuning. > > Any ideas are welcome. > > Best regards > > Patrik > > > Am 14.11.201

Needless group coordination overhead for GlobalKTables

2019-10-29 Thread Chris Toomey
We have some simple Kafka streams apps that populate GlobalKTables to use as caches for topic contents. When running them with info-level logging enabled, I noticed unexpected activity around group coordination (joining, rebalancing, leaving, rejoining) that I didn't expect given that they need to

Re: Needless group coordination overhead for GlobalKTables

2019-10-29 Thread Chris Toomey
ee the following code line > > https://github.com/apache/kafka/blob/065411aa2273fd393e02f0af46f015edfc9f9b55/streams/src/main/java/org/apache/kafka/streams/StreamsConfig.java#L1149 > > Best, > Bruno > > On Tue, Oct 29, 2019 at 8:12 PM Chris Toomey wrote: > > > > We have s

Re: Needless group coordination overhead for GlobalKTables

2019-10-31 Thread Chris Toomey
ble improvement. > > Best, > Bruno > > On Wed, Oct 30, 2019 at 1:20 AM Chris Toomey wrote: > > > > Bruno, > > > > I'm using a fork based off the 2.4 branch .It's not the global consumer > but > > the stream thread consumer that has the g

Kafka Streams OffsetOutOfRangeException / restart to recover

2019-10-31 Thread Chris Toomey
I'm getting an OffsetOutOfRangeException accompanied by the log message "Updating global state failed. You can restart KafkaStreams to recover from this error." But I've restarted the app several times and it's not recovering, it keeps failing the same way. Is this error message just wrong (and sh

Re: Kafka Streams OffsetOutOfRangeException / restart to recover

2019-11-04 Thread Chris Toomey
Filed https://issues.apache.org/jira/browse/KAFKA-9141 On Thu, Oct 31, 2019 at 7:30 PM Chris Toomey wrote: > I'm getting an OffsetOutOfRangeException accompanied by the log message > "Updating global state failed. You can restart KafkaStreams to recover from > this error.&

Consumer call retries

2020-05-06 Thread Chris Toomey
We're using Kafka 2.4.0 and trying to understand the Java consumer behavior w.r.t. transient network problems, such as getting temporarily disconnected from a broker due to temporary broker or network issue. The following consumer config. settings imply that in the above scenario, all consumer cli

Re: Kafka consumer

2020-05-06 Thread Chris Toomey
You can set the max.poll.records config. setting to 1 in order to pull down and process 1 record at a time. See https://kafka.apache.org/documentation/#consumerconfigs . On Mon, May 4, 2020 at 1:04 AM vishnu murali wrote: > Hey Guys, > > I am having a topic and in that topic I am having 3000 me

Re: Is committing offset required for Consumer

2020-05-07 Thread Chris Toomey
If you choose to manually assign topic partitions, then you won't be using the group protocol to dynamically manage partition assignments and thus don't have a need to poll or heartbeat at any interval. See "Manual Partition Assignment" in https://kafka.apache.org/24/javadoc/org/apache/kafka/client

Re: Kafka long running job consumer config best practices and what to do to avoid stuck consumer

2020-05-07 Thread Chris Toomey
You really have to decide what behavior it is you want when one of your consumers gets "stuck". If you don't like the way the group protocol dynamically manages topic partition assignments or can't figure out an appropriate set of configuration settings that achieve your goal, you can always elect

Re: Is committing offset required for Consumer

2020-05-07 Thread Chris Toomey
titions and offset manually? > > On Thu, May 7, 2020 at 8:02 PM Chris Toomey wrote: > > > If you choose to manually assign topic partitions, then you won't be > using > > the group protocol to dynamically manage partition assignments and thus > > don't have a nee

Re: Kafka long running job consumer config best practices and what to do to avoid stuck consumer

2020-05-08 Thread Chris Toomey
ou please elaborate more on this? > > Regards, > Ali > > On Fri, May 8, 2020 at 1:09 PM Chris Toomey wrote: > > > You really have to decide what behavior it is you want when one of your > > consumers gets "stuck". If you don't like the way the group pro

Re: JDBC SINK SCHEMA

2020-05-08 Thread Chris Toomey
You have to either 1) use one of the Confluent serializers when you publish to the topic, so that the schema (or reference to it) is included, or 2) write and use a custom converter

Re: Kafka long running job consumer config best practices and what to do to avoid stuck consumer

2020-05-08 Thread Chris Toomey
> What I am looking for is how I can deal with long-running jobs in Apache > Kafka. > > Thanks, > Ali > > On Sat, May 9, 2020 at 4:25 AM Chris Toomey wrote: > > > I interpreted your post as saying "when our consumer gets stuck, Kafka's > > automatic

Re: Write to database directly by referencing schema registry, no jdbc sink connector

2020-05-08 Thread Chris Toomey
Write your own implementation of the JDBC sink connector and use the avro serializer to convert the kafka record into a connect record that your connector takes and writes to DB via JDBC. On Fri, May 8, 2020 at 7:38 PM wangl...@geekplus.com.cn < wangl...@geekplus.com.cn> wrote: > > Using debeziu

RocksDB state store disk space estimation

2021-02-18 Thread Chris Toomey
We're using RocksDB as a persistent Kafka state store for compacted topics and need to be able to estimate the maximum disk space required. We're using the default config. settings provided by Kafka, which include Universal compaction, no compression, and 4k block size. Given these settings and a

Re: RocksDB state store disk space estimation

2021-02-18 Thread Chris Toomey
n also monitor the sizes of your RocksDB state > stores with the metric total-sst-files-size > (https://kafka.apache.org/documentation/#kafka_streams_rocksdb_monitoring) > > Best, > Bruno > > On 18.02.21 17:43, Chris Toomey wrote: > > We're using RocksDB as a persist