Re: Kafka memory estimation

2019-04-12 Thread Peter Bukowinski
The memory that a kafka broker uses is the java heap + the page cache. If you’re able to split your memory metrics by memory-used and memory-cached, you should see that the majority of a broker’s memory usage is cached memory. As a broker receives data from producers, the data first enters the p

Re: Kafka memory estimation

2019-04-12 Thread Rammohan Vanteru
Hi Steve, We are using Prometheus jmx exporter and Prometheus to scrape metrics based on memory metric we are measuring. Jmx exporter: https://github.com/prometheus/jmx_exporter/blob/master/README.md Thanks, Ramm. On Fri, Apr 12, 2019 at 12:43 PM Steve Howard wrote: > Hi Rammohan, > > How are

Re: Using processor API via DSL

2019-04-12 Thread Alessandro Tagliapietra
Hi Bruno, Thank you for the quick answer. I'm actually trying to do that since it seems there is really no way to have it use `Processor`. I just wanted (if that would've made any sense) to use the Processor in both DSL and non-DSL pipelines. Anyway, regarding `transformValues()` I don't think I

Re: Using processor API via DSL

2019-04-12 Thread Bruno Cadonna
Hi Alessandro, Have you considered using `transform()` (actually in your case you should use `transformValues()`) instead of `.process()`? `transform()` and `transformValues()` are stateful operations similar to `.process` but they return a `KStream`. On a `KStream` you can then apply a windowed a

Using processor API via DSL

2019-04-12 Thread Alessandro Tagliapietra
Hi there, I'm just starting with Kafka and I'm trying to create a stream processor that in multiple stages: - filters messages using a kv store so that only messages with higher timestamp gets processed - aggregates the message metrics by minute giving e.g. the avg of those metrics in that minut

Re: Race condition with stream use of Global KTable

2019-04-12 Thread Raman Gupta
Currently topic-1 and topic-2 have a different number of partitions (due to vastly different concurrency/processing time requirements). So in order to accomplish this, I'd also need to open a can of worms and repartition topic-1 to create topic-1a, so that it can be co-partitioned with topic-2a, wh

Re: Race condition with stream use of Global KTable

2019-04-12 Thread Guozhang Wang
As for 2), just to clarify that co-partitioning is still needed to make sure that a record from topic-2 can get the expected data from the local materialized store from topic-1 --- this would be required for either DSL or for processor API. What I was suggesting is that, when sending to topic-2, b

Method to check if the log-cleaner of a Kafka broker is running or not

2019-04-12 Thread Karthik Yadav
I have a few brokers which take care of the data of a few applications. I have the log-cleaner for all the brokers enabled with thecleanup.policy as delete. There was an issue a few weeks back where the log-cleaner stopped working in one of the brokers and this led complete disk usage, and it wasn'

Re: Kafka memory estimation

2019-04-12 Thread Steve Howard
Hi Rammohan, How are you measuring "Kafka seems to be reserving most of the memory"? Thanks, Steve On Thu, Apr 11, 2019 at 11:53 PM Rammohan Vanteru wrote: > Hi Users, > > As per the article here: > https://docs.confluent.io/current/kafka/deployment.html#memory memory > requirement is roughl