Kafka streaming partition assignment

2018-05-13 Thread Liam Clarke
s that we'd see an even split of partitions across both instances - is this a realistic expectation? Or is this working as intended? Regards, Liam Clarke

Re: Kafka streaming partition assignment

2018-05-13 Thread Liam Clarke
/kafka/pull/4410 > > Thus, it's "by design" (for 1.0 and older) but we want to improve it. > Cf: https://issues.apache.org/jira/browse/KAFKA-4969 > > -Matthias > > On 5/13/18 7:52 PM, Liam Clarke wrote: > > Hi all, > > > > We are running a KSt

Re: Kafka consumer to unzip stream of .gz files and read

2018-05-21 Thread Liam Clarke
Also, if I recall correctly - the console producer uses a BufferedReader to read from the console and assumes that a newline terminates a message, so any byte of value 0A in your gzipped file will send a message. I suggest using a Python producer to send your gzipped file. Regards, Liam Clarke
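
A minimal sketch of why newline framing breaks a gzipped payload, in Python with only the standard library. It assumes a producer that, like the console producer described above, treats every 0A byte as the end of a message; the payload and the helper names are made up for illustration.

```python
import gzip
import zlib

def split_on_newlines(blob):
    """Mimic a line-oriented producer: every 0x0A byte ends a message."""
    return blob.split(b"\n")

def can_decompress(piece):
    """Return True if the piece is a complete gzip stream on its own."""
    try:
        gzip.decompress(piece)
        return True
    except (EOFError, OSError, zlib.error):
        return False

# A 10-byte payload: the gzip trailer stores the uncompressed length
# (10 == 0x0A) as a little-endian integer, so the compressed bytes are
# guaranteed to contain at least one newline byte.
payload = b"0123456789"
compressed = gzip.compress(payload, mtime=0)

pieces = split_on_newlines(compressed)
print(len(pieces), [can_decompress(p) for p in pieces])
```

The whole blob decompresses, but the first fragment does not, which is what a consumer would see if each fragment arrived as its own message.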

Re: Frequent "offset out of range" messages, partitions deserted by consumer

2018-06-19 Thread Liam Clarke
How frequently are your consumers committing offsets? On Wed, 20 Jun. 2018, 4:52 pm Shantanu Deshmukh, wrote: > I desperately need help. Facing this issue on production since a while now. > Someone please help me out. > > On Fri, Jun 15, 2018 at 2:02 AM Lawrence Weikum > wrote: > > > unsubscrib

Re: Frequent "offset out of range" messages, partitions deserted by consumer

2018-06-19 Thread Liam Clarke
How often is the consumer actually consuming? I know there's an issue where old committed offsets expire after a period of time. On Wed, 20 Jun. 2018, 5:46 pm Shantanu Deshmukh, wrote: > It is happening via auto-commit. Frequency is 3000 ms > > On Wed, Jun 20, 2018 at 10:31

Re: Spark structured streaming + kafka

2018-07-09 Thread Liam Clarke
In your SBT script you've specified that your Kafka data source jar is provided. Please read the documentation of dependency scopes. https://maven.apache.org/guides/introduction/introduction-to-dependency-mechanism.html#Dependency_Scope Kind regards, Liam Clarke On Tue, 10 Jul. 2018, 9:

Re: Question - Partition Lag

2018-07-21 Thread Liam Clarke
Presumably your consumers are blocking or taking a very long time to process consumed records. I'd suggest implementing in-depth logging of your consumers to investigate their performance. On Sat, 21 Jul. 2018, 11:51 am Antony A, wrote: > Hello users. > > > We have a 7 broker cluster running 0.11

Re: Sometimes KafkaProducer.send() is blocked without throwing any Exception

2018-09-04 Thread Liam Clarke
What are your producer acks configured to? On Wed, 5 Sep. 2018, 2:46 pm LEE, Tung-Yu, wrote: > Actually we are not sure how often the blocking happens. Whenever it > happens, it seems never to stop until we kill the process. > > > > We start 12 processes at the same time using KafkaProducer to

Re: Sometimes KafkaProducer.send() is blocked without throwing any Exception

2018-09-04 Thread Liam Clarke
Okay so acks will be 1 then. Anything in the broker logs when this occurs? On Wed, 5 Sep. 2018, 3:32 pm LEE, Tung-Yu, wrote: > We didn't specify the acks in our producer.properties. I guess it will be > the default value? > > On Wed, 5 Sep 2018 at 11:09, Liam Clarke wrote: > > &

Re: Sometimes KafkaProducer.send() is blocked without throwing any Exception

2018-09-04 Thread Liam Clarke
ep. 2018 3:59 pm, "Liam Clarke" wrote: Okay so acks will be 1 then. Anything in the broker logs when this occurs? On Wed, 5 Sep. 2018, 3:32 pm LEE, Tung-Yu, wrote: > We didn't specify the acks in our producer.properties. I guess it will be > the default value? > >

Re: Kafka on Windows

2018-09-04 Thread Liam Clarke
't unless Microsoft built it. Kind regards, Liam Clarke On Wed, 8 Aug. 2018, 2:42 am jan, wrote: > This is an excellent suggestion and I intend to do so henceforth > (thanks!), but it would be an adjunct to my request rather than the > answer; it still needs to be made clear

Re: Kafka on Windows

2018-09-05 Thread Liam Clarke
a NuGet. Consider those three days valuable learning instead. Kind regards, Liam Clarke On Wed, 5 Sep. 2018, 9:22 pm jan, wrote: > Hi Liam, > as a DB guy that does MSSQL (on windows, obviously) I literally have > no idea what a .sh file is,or what that would imply. I guess it'

Re: SAM Scala aggregate

2018-09-10 Thread Liam Clarke
https://mvnrepository.com/artifact/org.apache.kafka/kafka-streams/2.0.0 Kind regards, Liam Clarke On Tue, Sep 11, 2018 at 1:43 PM, Michael Eugene wrote: > Well to bring up that kafka to 2.0, do I just need for sbt kafka clients > and kafka streams 2.0 for sbt? And it doesn't matter if the syste

Re: Batch vs buffer differences

2018-09-11 Thread Liam Clarke
Batch size is a threshold that determines when a batch sends. Buffer size is total memory available to the producer. It determines how many records the producer can store in memory to send. While they relate somewhat in that buffer size couldn't be less than (batch size * num partitions), you can
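
As a rough worked example of the lower bound mentioned above (buffer size at least batch size * number of partitions), here is a small Python sketch. The two sizes are the producer defaults for batch.size and buffer.memory as far as I know, and the partition count is hypothetical, so substitute your own values.

```python
def min_buffer_bytes(batch_size, num_partitions):
    """Lower bound from the post: room for one full batch per partition."""
    return batch_size * num_partitions

BATCH_SIZE = 16_384          # assumed: producer default batch.size, in bytes
BUFFER_MEMORY = 33_554_432   # assumed: producer default buffer.memory, in bytes
NUM_PARTITIONS = 100         # hypothetical partition count

needed = min_buffer_bytes(BATCH_SIZE, NUM_PARTITIONS)
print(needed, needed <= BUFFER_MEMORY)
```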

Re: Batch vs buffer differences

2018-09-11 Thread Liam Clarke
Just to clarify - a batch of records is sent when batch.size or linger.ms are hit - so to control messaging rates/sizes you tweak both of those. On Tue, 11 Sep. 2018, 7:38 pm darekAsz, wrote: > Thank you so much :) > I think that now all is clear for me > > Tue, 11 Sep 2018 at 09:28
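
The "whichever is hit first" rule can be sketched as a toy model in Python. This is only an illustration of the two thresholds, not the producer's actual accumulator logic (it ignores explicit flushes and memory pressure), and ToyBatch and ready_to_send are made-up names.

```python
from dataclasses import dataclass

@dataclass
class ToyBatch:
    size_bytes: int   # bytes accumulated so far
    age_ms: int       # time since the first record was appended

def ready_to_send(batch, batch_size, linger_ms):
    """A batch goes out when either threshold is reached, whichever is first."""
    return batch.size_bytes >= batch_size or batch.age_ms >= linger_ms

# Full batch, no waiting needed.
print(ready_to_send(ToyBatch(16_384, 0), batch_size=16_384, linger_ms=5))
# Nearly empty batch, but linger.ms has elapsed.
print(ready_to_send(ToyBatch(200, 5), batch_size=16_384, linger_ms=5))
# Nearly empty batch and still inside the linger window.
print(ready_to_send(ToyBatch(200, 1), batch_size=16_384, linger_ms=5))
```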

Re: Need info

2018-09-12 Thread Liam Clarke
ce. E.g., https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines Kind regards, Liam Clarke On Wed, 12 Sep. 2018, 7:32 pm Chanchal Chatterji, < chanchal.chatte...@infosys.com> wrote: > Hi, > > In the process of mainframe modernizatio

Re: Kafka compression - results

2018-09-12 Thread Liam Clarke
500 million * 64B is 32GB. Are you sure you actually sent 500 million messages? (I assumed that mln = million) On Wed, 12 Sep. 2018, 9:54 pm darekAsz, wrote: > sorry, I wrote bad results :/ > here are correctly > > Directory size after sending uncompressed data: 1.5 GB > > Directory size after s
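
The arithmetic behind that question, as a quick Python check. It assumes decimal gigabytes and ignores per-record overhead and compression, so treat the second number as a rough upper bound on how many 64-byte messages fit in the quoted 1.5 GB directory.

```python
MESSAGES = 500_000_000
MESSAGE_BYTES = 64

# Raw payload size if 500 million 64-byte messages were really written.
expected_bytes = MESSAGES * MESSAGE_BYTES
print(expected_bytes / 1e9)

def messages_that_fit(total_bytes, message_bytes):
    """How many fixed-size messages fit in a given number of bytes."""
    return total_bytes // message_bytes

reported_bytes = 1_500_000_000  # the 1.5 GB directory size quoted above
print(messages_that_fit(reported_bytes, MESSAGE_BYTES))
```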

Re: Need info

2018-09-12 Thread Liam Clarke
So you need to figure out your needs. Kafka can deliver near real time streaming, and it can function as a data store. It can handle significantly large messages if you want, but there are tradeoffs - you'd obviously need more hardware. I have no idea how many MB a bank transaction is, but you nee

Re: Kafka consumer offset topic deletion

2018-09-18 Thread Liam Clarke
Odd that the log compaction isn't working. What OS is your broker running on and can you please post your server.properties? On Wed, 19 Sep. 2018, 2:13 am Kaushik Nambiar, wrote: > > > > Hello, > > We have a Kafka 0.11.xx version setup. > > So the system topic which is __consumer_offset, we are

Re: unable to start kafka when zookeeper cluster is in working but unhealthy state

2018-09-26 Thread Liam Clarke
Hi James, That's not an unresponsive node that's killing Kafka, that's a failure to resolve the address that's killing it - my personal expectation would be that even though zookeeper-2.zookeeper.etc may be down, its name should still resolve. Regards, Liam Clarke On Wed

Re: The limit on the number of consumers in a group.

2018-10-25 Thread Liam Clarke
How many partitions? On Fri, 26 Oct. 2018, 2:52 pm Dominic Kim, wrote: > Dear all. > > Is there any limit on the number of consumers in a group? > I want to utilize about 300 or more consumers in a group, but rebalancing > hangs and never get finished. > When I invoke only 130~140 consumers in a

Session Window emitting null values when converted to stream?

2018-12-02 Thread Liam Clarke
uch needs to be explicitly handled? I'm worried that my filtering those values from the ongoing stream is hiding a problem further upstream, but if it's working as designed, then sweet as. Kind regards, Liam Clarke

Re: Session Window emitting null values when converted to stream?

2018-12-04 Thread Liam Clarke
Thanks Matthias, That's far cleaner :) Cheers, Liam Clarke On Mon, Dec 3, 2018 at 4:59 PM Matthias J. Sax wrote: > The nulls are expected. > > It's not about expired session windows though: sessions window are > stored as `<(key,start-timestamp,end-timestamp

Re: Configuration of log compaction

2018-12-17 Thread Liam Clarke
Hi Claudia, Anything useful in the log cleaner log files? Cheers, Liam Clarke On Tue, 18 Dec. 2018, 3:18 am Claudia Wegmann Hi, > > thanks for the quick response. > > My problem is not, that no new segments are created, but that segments > with old data do not get compac

Re: maven dependency problems

2018-12-17 Thread Liam Clarke
Hi, Is it a transitive dependency of any of your other dependencies? Cheers, Liam Clarke On Tue, 18 Dec. 2018, 2:57 pm big data Hi, > our project includes this dependency by: > > org.apache.spark > spark-streaming-kafka_2.11 > 1.6.3 > > From dependenc

Re: maven dependency problems

2018-12-17 Thread Liam Clarke
Can you please post the full output of maven dependency:tree from the parent POM in both scenarios? Thanks, Liam Clarke On Tue, 18 Dec. 2018, 3:26 pm big data Hi, > > No other dependency include kafka's jar. > > The project structure is: > > pom.xml > > |

Re: Configuration of log compaction

2018-12-18 Thread Liam Clarke
? 3) No, it's a thread started by the broker. I imagine you're using an old broker version, around 0.8.2 perhaps? We've found that rolling broker restarts with 0.11 are rather easy and not to be feared. Kind regards, Liam Clarke On Tue, Dec 18, 2018 at 10:43 PM Claudia Wegmann wro

Re: Knowing when to grow a Kafka cluster

2018-12-26 Thread Liam Clarke
of whether to scale horizontally or vertically, that really depends on the costs involved in either option. Although if you're saturating the network interface on a node, you can't really scale that one vertically. Kind regards, Liam Clarke On 27 Dec. 2018 1:24 pm, "Harper Henn"

Re: Kafka compatible with Java 11?

2019-01-31 Thread Liam Clarke
That's a warning, but it's not yet an error. On Fri, 1 Feb. 2019, 2:24 pm pradeep bansal I am receiving below warnings when running kafka-acl and other tools which > interact with zookeeper. > > WARNING: An illegal reflective access operation has occurred > WARNING: Illegal reflective access by >

Re: Kafka CDC from Databases to S3

2019-02-14 Thread Liam Clarke
Kafka Connect JDBC source can pick up changes based on primary keys and mod time columns. Debezium consumes the DBs change log directly. Both would work for your purpose. Have a look at the documentation for the Kafka Connect S3 sink to see if it fits your needs on the other end. Cheers, Liam

Re: Kafka console consumer group

2019-03-29 Thread Liam Clarke
You can specify an offset to consume from with --offset, it defaults to latest if not provided. On Sat, 30 Mar. 2019, 4:03 am Sharmadha Sainath, wrote: > Hi all, > > > I have a basic question regarding kafka consumer groups. I will detail it > with an example : > > > 1.I run a console-producer o

Re: Something like a unique key to prevent same record from being inserted twice?

2019-04-03 Thread Liam Clarke
iness key derived from the message. Kind regards, Liam Clarke On Thu, Apr 4, 2019 at 4:09 AM Hans Jespersen wrote: > Ok what you are describing is different from accidental duplicate message > pruning which is what the idempotent publish feature does. > > You are describing a situat

Re: Streaming Data

2019-04-09 Thread Liam Clarke
Hi Nick, Have you looked into KSQL? Kind regards, Liam Clarke On Wed, 10 Apr. 2019, 8:26 am Nick Torenvliet, wrote: > Hi all, > > Just looking for some general guidance. > > We have a kafka -> druid pipeline we intend to use in an industrial setting > to monitor proces

Re: KSQL Question

2019-05-01 Thread Liam Clarke
Hi Shalom, If you're familiar with Docker the Confluent images are fantastic for experimentation and prototyping. https://docs.confluent.io/current/ksql/docs/installation/install-ksql-with-docker.html On Wed, May 1, 2019 at 7:04 PM Liam Clarke wrote: > > > On Wed, May 1, 2019 at

Re: KSQL Question

2019-05-01 Thread Liam Clarke
On Wed, May 1, 2019 at 7:04 PM shalom sagges wrote: > Thanks a lot Vahid! > > I will definitely give it a try. > > Thanks again. :) > > On Tue, Apr 30, 2019 at 6:30 PM Vahid Hashemian > > wrote: > > > Hi Shalom, > > > > This is the Github repo for KSQL: https://github.com/confluentinc/ksql > > H

Re: kafka-consumers-group.sh kafka_2.12-2.2.0 does not recognize --discribe option

2019-05-22 Thread Liam Clarke
You've spelled it wrong :) It's --describe On Thu, 23 May 2019, 2:04 pm jee yong kim, wrote: > hi, > > > I installed kafka_2.12-2.2.0 tarball on a CentOS server. > > > I have a strange behavior of the utility bin/kafka-consumer-groups.sh when > > it is called with --discribe option, like: > > >

Re: Customers are getting same emails for roughly 30-40 times

2019-05-24 Thread Liam Clarke
se it took too long, and hit the session timeout, and then wasn't able to commit its offsets. So I'd look closely at your consuming code and log every possible source of exceptions. Kind regards, Liam Clarke On Fri, 24 May 2019, 7:37 pm ASHOK MACHERLA, wrote: > Dear Team Membe

Re: Detecting server unavailable

2019-07-08 Thread Liam Clarke
Hi Steve, Have you tried setting "default.api.timeout.ms" to something lower in your consumer configuration? It'll then throw that exception earlier. Kind regards, Liam Clarke On Tue, Jul 9, 2019 at 3:01 AM Gorman, Steve A. wrote: > I have a springboot microservice that l

Re: Questions for platform to choose

2019-08-21 Thread Liam Clarke
C Hi Eliza, Kafka Streaming, Spark Streaming, Flink and Storm are all good. They also all have their caveats. It's really hard to say that X is the best. For example, Kafka Streaming can't read from one Kafka cluster and write to another, but Spark can. But then Spark offers two flavours of str

Re: OOM for large messages with compression?

2019-08-21 Thread Liam Clarke
ine 300 of StringCoding.java is byte[] ba = new byte[en]; Compression is applied after the string is serialized to bytes. So you'll need to increase your heap size to support this. Hope that helps :) Liam Clarke On Thu, Aug 22, 2019 at 1:52 AM l vic wrote: > I have to deal with large ( 16M

Re: Metrics of BytesRejectedPerSec keeping rising

2019-08-27 Thread Liam Clarke
Hi, The broker exposes per topic error rates, that might help? https://docs.confluent.io/current/kafka/monitoring.html#per-topic-metrics Kind regards, Liam Clarke On Tue, 27 Aug. 2019, 8:07 pm 李海军, wrote: > Hi all, > > We have a kafka cluster consisted of 6 brokers with the v

Re: Shipping Kafka logs

2019-09-23 Thread Liam Clarke
Hi Eva, Hope this helps. https://www.elastic.co/guide/en/logstash/current/plugins-inputs-kafka.html Kind regards, Liam Clarke On Tue, Sep 24, 2019 at 2:57 AM Eva Sheeva wrote: > Hello > > Can you please point me to the doc or guide me on shipping kafka logs to > elk. I unders

Fwd: Shipping Kafka logs

2019-09-24 Thread Liam Clarke
Cc list -- Forwarded message - From: Liam Clarke Date: Tue, 24 Sep. 2019, 7:40 pm Subject: Re: Shipping Kafka logs To: Eva Sheeva Ah, for that you need something like Filebeat - it's an agent that forwards monitored log files to ELK. On Tue, 24 Sep. 2019, 7:17 pm Eva S

Re: Monitoring Broker/Prod/Cons on Kubernetes

2019-09-29 Thread Liam Clarke
I'm using the Prometheus jmx_exporter to export key metrics from the brokers. Takes a bit of fiddling to get the regexes right, but works well. On Fri, 27 Sep. 2019, 2:02 am M. Manna, wrote: > Hello, > > Has anyone got any experience in using monitoring tool (e.g. Prometheus, > DataDog, or custo

Re: Partition Reassignment is getting stuck

2019-11-13 Thread Liam Clarke
If only one broker isn't in sync, it can be caused by a dead replica fetcher thread in my experience. I fixed it by restarting the affected broker, but this was on 0.11, so YMMV. On Thu, Nov 14, 2019 at 9:35 AM Koushik Chitta wrote: > The topic partition having the ISR issue might be on a offline

Re: Health Check

2019-12-19 Thread Liam Clarke
se the JMX exporter runs within the Kafka process, any sustained failure to scrape also sets off an alert. Cheers, Liam Clarke On Thu, Dec 19, 2019 at 12:42 AM Miguel Silvestre wrote: > A simple tcp connection to kafka port (9092) should be enough no? > -- > Miguel Silvestre > > >

Re: [EXTERNAL] Re: Health Check

2019-12-21 Thread Liam Clarke
> *From: *Liam Clarke > *Reply-To: *"users@kafka.apache.org" > *Date: *Thursday, Dec

Re: Quotation Request for Kafka Apache

2020-02-18 Thread Liam Clarke
Kia ora, You'll want to contact Confluent: https://www.confluent.io/subscription Kind regards, Liam Clarke On Wed, 19 Feb. 2020, 11:32 am Doug Zellers, wrote: > Good afternoon, > > I work for a small, federally-focused value added reseller based in > McLean, VA. Do you

Re: Urgent helep please! How to measure producer and consumer throughput for single partition?

2020-02-19 Thread Liam Clarke
, Liam Clarke On Thu, 20 Feb. 2020, 5:57 pm Sunil CHAUDHARI, wrote: > Hi > I was referring to the article by Mr. June Rao about partitions in kafka > cluster. > https://www.confluent.io/blog/how-choose-number-topics-partitions-kafka-cluster/ > > "A rough formula fo

Re: Need details about required complete assessment guide to setup Kafka as messaging broker to de-couple applications/systems

2020-02-20 Thread Liam Clarke
cluster to fit it. But you also need to ensure that one partition can fit on disk on a broker. So if the above topic has 20 partitions, you'll need to make sure that each broker has a disk that can fit 50GB * 14. Plus you need some additional space for Kafka metadata and OS overhead etc. Hope th

Re: [External] Mirror Maker Questions

2020-02-20 Thread Liam Clarke
On #2, you can provide an implementation of a MirrorMakerMessageHandler that will be called for each record - you ensure it's in the classpath and pass the class name to MM using --message.handler. On Thu, 20 Feb. 2020, 9:49 pm Dean Barlan, wrote: > Hi everyone, > > I have a few small questions f

Re: [External] Re: Urgent helep please! How to measure producer and consumer throughput for single partition?

2020-02-20 Thread Liam Clarke
. If you're not familiar with JMX or JConsole, there's plenty of great documentation on the Internet, have a Google :) On Thu, 20 Feb. 2020, 11:51 pm Sunil CHAUDHARI, wrote: > Hi Liam Clarke, > Sorry but this is bit unclear for me. > Can you please elaborate your answer? I a

Re: Need details about required complete assessment guide to setup Kafka as messaging broker to de-couple applications/systems

2020-02-20 Thread Liam Clarke
Message retention = 5 days > Number of topics = 4 (4 partitions are enough) > > And thinking to use below configuration. > > RAM to run Kafka cluster = 16GB (For all nodes including Kafka (2 nodes) > and Zookeeper (2 nodes)) > SD space= 20GB > > Thanks and regards, > Naveen

Re: [External] Re: Urgent helep please! How to measure producer and consumer throughput for single partition?

2020-02-20 Thread Liam Clarke
Hi Sunil, Looks like Metricbeat has a Jolokia module that will capture JMX exposed metrics for you: https://www.elastic.co/blog/brewing-in-beats-add-support-for-jolokia-lmx Kind regards, Liam Clarke On Fri, Feb 21, 2020 at 6:16 PM Sunil CHAUDHARI wrote: > Hi Liam Clarke, > Thanks fo

Re: [External] Re: Urgent helep please! How to measure producer and consumer throughput for single partition?

2020-02-21 Thread Liam Clarke
er:type=producer-metrics,client-id=(.+),topic=(.+)record-send-rate > > It seems I have to add few metrics in the *jolokia.yml. *is that > correct? > > > > > > > > -Original Message- > From: Liam Clarke > Sent: Friday, February 21, 2020 10:52 AM > To: u

Re: [External] Example of Producer and Consumer Config Files for MM

2020-02-23 Thread Liam Clarke
Hi Dean, Here's my producer conf, but might work better if you post yours for us to look at. As you can see from the port number, the configured bootstrap server is Kafka, not Zookeeper. bootstrap.servers=kafka.service.consul:9092 batch.size=10 linger.ms=1 compression.type=none client.id=

Re: Trouble understanding tuning batching config

2020-03-20 Thread Liam Clarke
Hi Ryan, Firstly, what version of Kafka? Secondly check the broker's message.max.bytes and the topic's max.message.bytes, I suspect they're set a lot lower (or not at all) and will override your fetch.min.bytes. Cheers, Liam Clarke On Sat, 21 Mar. 2020, 11:09 am Ryan Schachte,

Re: Trouble understanding tuning batching config

2020-03-20 Thread Liam Clarke
Hi Ryan, That'll be per poll. Kind regards, Liam Clarke On Sat, 21 Mar. 2020, 11:41 am Ryan Schachte, wrote: > I do see the default for message.max.bytes is set to 1MB though. That would > be for each record or each poll? > > On Fri, Mar 20, 2020 at 3:36 PM Ryan Schachte

Re: Trouble understanding tuning batching config

2020-03-21 Thread Liam Clarke
sting data into datastores that prefer large bulk imports over smaller ones. :) Kind regards, Liam Clarke-Hutchinson On Sun, 22 Mar. 2020, 6:09 am Ryan Schachte, wrote: > You don't think it's weird if I just batch in memory manually do you? I > wrote a small snippet: > > // Rep

Re: Kubernetes Pod got restarted Terminating process due to signal SIGTERM (org.apache.kafka.common.utils.LoggingSignalHandler)

2020-03-29 Thread Liam Clarke
-up period for the broker into a 2 hour start-up period. So be very careful about how long a graceful shutdown of your broker will actually take. Kind regards, Liam Clarke-Hutchinson On Sat, Mar 28, 2020 at 3:49 AM Samir Tusharbhai Chauhan wrote: > Hi, > > My pod got restarted with

Re: Broker always out of ISRs

2020-03-31 Thread Liam Clarke
Hi Zach, Any issues with partitions broker 2 is leader of? Also, have you checked b2's server.log? Cheers, Liam Clarke-Hutchinson On Wed, 1 Apr. 2020, 11:02 am Zach Cox, wrote: > Hi - We have a small Kafka 2.0.0 (Zookeeper 3.4.13) cluster with 3 brokers: > 0, 1, and 2. Each bro

Re: Broker always out of ISRs

2020-04-01 Thread Liam Clarke
issue where CPU was very high or the network saturated. Cheers, Liam Clarke-Hutchinson On Thu, Apr 2, 2020 at 8:51 AM Zach Cox wrote: > Hi Liam, > > > > Any issues with partitions broker 2 is leader of? > > > > Earlier today, broker 2 was not leader of any partitions.

Re: java kafka-client use "props.put("retries", "5")" ,why print 2 log ?

2020-04-05 Thread Liam Clarke
Hi 直以来, Retries only applies to attempts to send a record - What you're seeing in the logs (NetworkClient) is the producer attempting to load cluster metadata (which broker is the leader for which partition etc.) on instantiation. Kind regards, Liam Clarke-Hutchinson On Sun, Apr 5, 2020

Re: High write operations rate on disk

2020-04-06 Thread Liam Clarke
Are you using log compaction? On Tue, Apr 7, 2020 at 6:29 PM Soumyajit Sahu wrote: > We are running Kafka on AWS EC2 instances (m5.2xlarge) with mounted EBS st1 > volume (one on each machine). > Occasionally, we have noticed that the write ops/second goes through the > roof and we get throttled

Kafka Streams - issues with windowing and suppress

2020-04-14 Thread Liam Clarke
lly appreciated. Kind regards, Liam Clarke

Re: Kafka Streams - issues with windowing and suppress

2020-04-14 Thread Liam Clarke
Okay, doing some debugging it looks like I'm seeing this behaviour because it's picking up a grace duration of 86,395,000 ms in KTableImpl.buildSuppress, which would happen to be 5000 millis (my window size) off 24 hours, so I've got some clues! On Wed, Apr 15, 2020 at 3:43 PM Lia
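
A quick check of that number in Python, using only the standard library. It simply confirms that 24 hours minus the 5000 ms window size gives the grace duration seen in the debugger; the variable names are made up.

```python
from datetime import timedelta

def to_ms(delta):
    """Convert a timedelta to whole milliseconds."""
    return delta // timedelta(milliseconds=1)

window_size = timedelta(milliseconds=5000)
one_day = timedelta(hours=24)

# 24 hours minus the window size, as observed in the debugger.
implied_grace = one_day - window_size
print(to_ms(implied_grace))
```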

Re: Kafka Streams - issues with windowing and suppress

2020-04-14 Thread Liam Clarke
And the answer is to change .windowedBy(TimeWindows.of(Duration.ofMillis(5000))) and specify the grace period: windowedBy(TimeWindows.of(Duration.ofMillis(5000)).grace(Duration.ofMillis(100))) On Wed, Apr 15, 2020 at 4:34 PM Liam Clarke wrote: > Okay, doing some debugging it looks like
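
To see why the explicit grace period matters, here is a toy Python model. It assumes a suppressed window result is only released once stream time has passed the window end plus the grace period; the exact boundary condition and the function names are my assumptions, not the Streams implementation.

```python
def window_close_ms(window_start_ms, size_ms, grace_ms):
    """Stream time at which a window stops accepting late records."""
    return window_start_ms + size_ms + grace_ms

def is_final(stream_time_ms, window_start_ms, size_ms, grace_ms):
    """True once stream time has reached the point where the window closes."""
    return stream_time_ms >= window_close_ms(window_start_ms, size_ms, grace_ms)

SIZE_MS = 5000
# With the 100 ms grace from the fix, a window starting at 0 closes at 5100 ms.
print(is_final(6000, 0, SIZE_MS, grace_ms=100))
# With a grace of 86,395,000 ms, the same window stays open for 24 hours.
print(is_final(6000, 0, SIZE_MS, grace_ms=86_395_000))
```

Under that assumption, a 5 second window with the long grace would not emit anything through suppress until roughly a day of stream time had gone by.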

Re: Kafka Streams - issues with windowing and suppress

2020-04-19 Thread Liam Clarke
Hi John, I can't really think of a way to make it more obvious without breaking backwards compatibility - e.g., obvious easy fix is that grace period is a mandatory arg to TimeWindows, but that would definitely break compatibility. Cheers, Liam Clarke-Hutchinson On Thu, Apr 16, 2020 at 1:

Unexpected behaviour on windowing aggregations

2020-04-19 Thread Liam Clarke
s, or I've just misunderstood something about this. Does it matter that the window size in the persistent window store doesn't match the windowing time + grace time in the windowing clause? Any pointers gratefully welcome. Kind regards, Liam Clarke-Hutchinson

Re: Retention period for __consumer_offsets topic

2020-04-19 Thread Liam Clarke
Hi Nitin, Default in Kafka 2.0+ is 7 days, previously it was 24 hours IIRC. Only reason you need to change it is if you anticipate having a whole consumer group offline for longer than your current retention period for debugging/maintenance etc. Cheers, Liam Clarke-Hutchinson On Sun, 19 Apr
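
For reference, the two defaults expressed in minutes, which I believe is the unit the broker setting offsets.retention.minutes uses (the setting name is not in the message above, so check it against your broker docs). The offsets_at_risk helper is a made-up illustration of the "whole group offline" case.

```python
from datetime import timedelta

def retention_minutes(period):
    """Express a retention period in whole minutes."""
    return period // timedelta(minutes=1)

NEW_DEFAULT = timedelta(days=7)     # Kafka 2.0+ per the message above
OLD_DEFAULT = timedelta(hours=24)   # earlier versions per the message above

print(retention_minutes(NEW_DEFAULT), retention_minutes(OLD_DEFAULT))

def offsets_at_risk(group_offline_for, retention):
    """True if a fully offline group outlasts the offset retention period."""
    return group_offline_for > retention

print(offsets_at_risk(timedelta(days=3), OLD_DEFAULT))
print(offsets_at_risk(timedelta(days=3), NEW_DEFAULT))
```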

Re: Need advice on how to deploy and update a streams app

2020-04-19 Thread Liam Clarke
consuming app remain and drain the old topic until empty. I'm afraid I'm not available for voice chats etc., but if you can provide more examples of your pipeline composition and what changes you envisage happening, I can give you more focused advice. Cheers, Liam Clarke On Mon, Apr

Re: Unexpected behaviour on windowing aggregations

2020-04-19 Thread Liam Clarke
Hi John, Thanks for the reply - yep, that was a dumb copy and paste error, which is what I get for coding while surrounded by kids. >_< I'm deploying a fixed version of it as we speak. Thanks for the reply though :) Kind regards, Liam Clarke On Mon, 20 Apr. 2020, 2:08 am John Roes

Re: Kafka Streams - issues with windowing and suppress

2020-04-19 Thread Liam Clarke
with those changes if it seems appropriate. Cheers, Liam Clarke-Hutchinson On Mon, Apr 20, 2020 at 12:12 PM Matthias J. Sax wrote: > I would prefer to not make the grace-period a mandatory argument and > keep the API as-is. I understand the issue of backward compatibility, > but I wo

Re: Unexpected behaviour on windowing aggregations

2020-04-19 Thread Liam Clarke-Hutchinson
t; > On Sun, Apr 19, 2020, at 18:43, Liam Clarke wrote: > > Hi John, > > > > Thanks for the reply - yep, that was a dumb copy and paste error, which > is > > what I get for coding while surrounded by kids. >_< I'm deploying a fixed > > version of it as

Re: Kafka Streams - issues with windowing and suppress

2020-04-19 Thread Liam Clarke-Hutchinson
bother with an extra example. That would put it front and center. > > A PR would be greatly appreciated! Thanks for the offer! > > Thanks, > John > > On Sun, Apr 19, 2020, at 19:58, Liam Clarke wrote: > > Hi Matthias, > > > > I think as an interim measure

Re: kafka-consumer-groups.sh CURRENT-OFFSET column show "-" , what mean about it ! thank you !

2020-04-21 Thread Liam Clarke-Hutchinson
Hi, It means the consumer group exists, and Kafka's aware of its topic subscriptions, but either the consumers haven't consumed yet (if using auto offset commit) or the consumers haven't committed any offsets, if using manual offset committing. Does that make sense? Kind r
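
A toy Python sketch of the difference between "-" and "0" in that column. It is not the tool's code; describe_row is a made-up helper that assumes a missing committed offset is shown as "-" and that lag is the log end offset minus the committed offset.

```python
def describe_row(committed, log_end):
    """Return (CURRENT-OFFSET, LAG) as display strings for one partition."""
    if committed is None:
        # Nothing committed yet: there is no offset to show and no lag to compute.
        return "-", "-"
    return str(committed), str(log_end - committed)

# No commit has happened for this partition.
print(describe_row(None, 42))
# An offset of 0 was committed, which is a real value, not a missing one.
print(describe_row(0, 42))
```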

Re: Kafka Streams - issues with windowing and suppress

2020-04-21 Thread Liam Clarke-Hutchinson
Hi John, Yep, I saw there were a few issues filed around default grace periods, and I have a few ideas about sensible defaults and possible APIs. I'll sign up for a Jira account in the next few days to join the discussion :) Kind regards, Liam Clarke-Hutchinson On Tue, Apr 21, 2020 at 9:

Re: Help with setting up Kafka Node

2020-04-21 Thread Liam Clarke-Hutchinson
a` on the same box? Kind regards, Liam Clarke-Hutchinson On Tue, Apr 21, 2020 at 7:28 PM wrote: > Zookepper: /home/kafka/kafka/bin/zookeeper-server-start.sh > /home/kafka/kafka/config/zookeeper.properties > Kafka: /home/kafka/kafka/bin/kafka-server-start.sh > /home/kafka/kafka/config/

Re: Unknown topic at org.apache.kafka.streams.TopologyTestDriver.pipeRecord

2020-04-21 Thread Liam Clarke-Hutchinson
Hi Nicu, I'd need to see more context to help - for example, what is the value of `topicName`? I've just finished writing Streams tests using the test driver, so can hopefully help with more code :) Cheers, Liam Clarke-Hutchinson On Tue, Apr 21, 2020 at 8:40 PM Dumitru-Nicola

Re: thank you ! which java-client api can has same effect about kafka-consumer-groups.sh command ?

2020-04-22 Thread Liam Clarke-Hutchinson
Looking at the source code, try listConsumerGroupOffsets(String groupId, ListConsumerGroupOffsetsOptions options) instead? On Wed, Apr 22, 2020 at 6:40 PM 一直以来 <279377...@qq.com> wrote: > ./kafka-consumer-groups.sh --bootstrap-server localhost:9081 --describe > --group test > > > use describeCons

Re: thank you ! which java-client api can has same effect about kafka-consumer-groups.sh command ?

2020-04-22 Thread Liam Clarke-Hutchinson
Yep, looking at the source code of our app we use to track lag, we're using that method. On Wed, Apr 22, 2020 at 7:35 PM Liam Clarke-Hutchinson < liam.cla...@adscale.co.nz> wrote: > Looking at the source code, try listConsumerGroupOffsets(String > groupId, ListConsumerGr

Re: kafka-consumer-groups.sh CURRENT-OFFSET column show "-" , what mean about it ! thank you !

2020-04-22 Thread Liam Clarke-Hutchinson
> but show "0" and "_", show two value difference??? > thank you ! > > > -- Original Message -- > From: "Liam Clarke-Hutchinson" Sent: Tuesday, 21 April 2020, 4:22 am > To: "users" > Subject: Re: kafka-consumer-groups.sh CURRENT-

Re: thank you ! which java-client api can has same effect about kafka-consumer-groups.sh command ?

2020-04-22 Thread Liam Clarke-Hutchinson
picPartition key = iterator.next(); > OffsetAndMetadata value = map.get(key); > System.out.println(key.toString() + " " + > value.toString()); > } > } > > but i not find PARTITION,CURRENT-OFFSET,LOG

Re: Clarification regarding multi topics implementation

2020-04-22 Thread Liam Clarke-Hutchinson
ology = sb.build(); KafkaStreams streams = new KafkaStreams(topology, streamsConfig); streams.start(); Hope that helps, Liam Clarke-Hutchinson On Thu, 23 Apr. 2020, 4:41 am Suresh Chidambaram, wrote: > Hi Team, > > Greetings. > > I have a use-case wherein I have to consume messages from multiple

Re: Kafka: Messages disappearing from topics, largestTime=0

2020-04-29 Thread Liam Clarke-Hutchinson
Hmm, how are you doing your rolling deploys? I'm wondering if the time indexes are being corrupted by unclean shutdowns. I've been reading code and the only path I could find that led to a largest timestamp of 0 was, as you've discovered, where there was no time index. WRT to the corruption - th

Re: Apache Kafka cluster to cluster

2020-04-29 Thread Liam Clarke-Hutchinson
Hi Blake, Replicator is, AFAIK, not FOSS - however, Mirror Maker 2.0, which is built along very similar lines (i.e., on top of Kafka Connect) is, as is Mirror Maker 1.0. On Thu, Apr 30, 2020 at 6:51 AM Blake Miller wrote: > Oh, and it looks like Confluent has released a newer replacement for >

Re: Apache Kafka cluster to cluster

2020-04-29 Thread Liam Clarke-Hutchinson
efaults to "latest" for auto offset reset - so when starting MM 1 it won't automatically replicate all existing data, only data created after it begins consuming. You can override this by setting the consumer property "auto.offset.reset" to "earliest". Hope that he
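
The effect of that setting can be sketched as a small Python function. This is a simplified model of where a consumer starts reading, assuming the reset policy only applies when the group has no committed offset; starting_offset is a made-up name, and other policy values are left out.

```python
def starting_offset(committed, log_start, log_end, auto_offset_reset):
    """Pick the first offset to read for one partition."""
    if committed is not None:
        # A committed offset always wins; the reset policy is not consulted.
        return committed
    if auto_offset_reset == "earliest":
        return log_start   # replicate everything already in the topic
    if auto_offset_reset == "latest":
        return log_end     # only records produced from now on
    raise ValueError("unsupported policy in this sketch: " + auto_offset_reset)

# A fresh consumer group against a partition holding offsets 0..999.
print(starting_offset(None, 0, 1000, "latest"))
print(starting_offset(None, 0, 1000, "earliest"))
```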

Re: Kafka: Messages disappearing from topics, largestTime=0

2020-04-30 Thread Liam Clarke-Hutchinson
I'd also suggest eyeballing your systemd conf to verify that someone hasn't set a very low TimeoutStopSec, or that KillSignal/RestartKillSignal haven't been configured to SIGKILL (confusingly named, imo, as the default for KillSignal is SIGTERM). Also, the Kafka broker logs at shutdown look very d
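
If you'd rather script that eyeballing, here is a rough Python sketch that scans the text of a unit file for the settings mentioned above. The 90 second threshold, the sample unit and its paths are all hypothetical, and the parser only understands plain integer seconds for TimeoutStopSec.

```python
import configparser

MIN_STOP_TIMEOUT_SEC = 90   # hypothetical threshold, pick one that suits your brokers

def shutdown_warnings(unit_text):
    """Return a list of warnings about settings that can cause unclean shutdowns."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep systemd's CamelCase key names
    parser.read_string(unit_text)
    service = parser["Service"] if parser.has_section("Service") else {}
    warnings = []
    for key in ("KillSignal", "RestartKillSignal"):
        if service.get(key) == "SIGKILL":
            warnings.append(key + " is SIGKILL, the broker cannot shut down cleanly")
    timeout = service.get("TimeoutStopSec")
    if timeout is not None and timeout.isdigit() and int(timeout) < MIN_STOP_TIMEOUT_SEC:
        warnings.append("TimeoutStopSec=" + timeout + " may be too short")
    return warnings

SAMPLE_UNIT = """
[Service]
ExecStart=/opt/kafka/bin/kafka-server-start.sh /opt/kafka/config/server.properties
KillSignal=SIGKILL
TimeoutStopSec=10
"""

for warning in shutdown_warnings(SAMPLE_UNIT):
    print(warning)
```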

Re: Kafka: Messages disappearing from topics, largestTime=0

2020-04-30 Thread Liam Clarke-Hutchinson
t; Once again, thanks for the help. > > Em qui., 30 de abr. de 2020 às 15:04, Liam Clarke-Hutchinson < > liam.cla...@adscale.co.nz> escreveu: > > > I'd also suggest eyeballing your systemd conf to verify that someone > hasn't > > set a very low TimeoutStop

Re: Consumer not receiving messages after connection loss

2020-05-02 Thread Liam Clarke-Hutchinson
hard to read also. Could I suggest something like pastebin.com or gist.github.com for future issues? They keep the formatting intact and make it easier to delve into your logs. Cheers, Liam Clarke-Hutchinson On Tue, Apr 28, 2020 at 5:59 AM Goran Sliskovic wrote: > Apparently the issue is

Re: Connector For MirrorMaker

2020-05-02 Thread Liam Clarke-Hutchinson
luck, any questions, let me know, Liam Clarke-Hutchinson On Sat, May 2, 2020 at 2:05 AM vishnu murali wrote: > Hi Robin > > I am using Apache Kafka there is service called kafka-mirror-maker.bat with > the consumer and producer properties to copy topic from one cluster to > anoth

Re: Connect-Mirror 2.5.0

2020-05-02 Thread Liam Clarke-Hutchinson
Hi Vishnu, As per my earlier email: https://cwiki.apache.org/confluence/display/KAFKA/KIP-382%3A+MirrorMaker+2.0#KIP-382:MirrorMaker2.0-Walkthrough:RunningMirrorMaker2.0 In the same vein, any questions, hit me up, Liam Clarke-Hutchinson On Sat, May 2, 2020 at 9:56 PM vishnu murali wrote

Re: Kafka: Messages disappearing from topics, largestTime=0

2020-05-02 Thread Liam Clarke-Hutchinson
Good luck JP, do try it with the volume switching commented out, and see how it goes. On Fri, May 1, 2020 at 6:50 PM JP MB wrote: > Thank you very much for the help anyway. > > Best regards > > On Fri, May 1, 2020, 00:54 Liam Clarke-Hutchinson < > liam.cla...@adscale.co.

Re: kafka rdd save to hive errer

2020-05-02 Thread Liam Clarke-Hutchinson
Hello 姜戎 , Unfortunately there's not enough information in your email for us to help you. Are you trying to use Spark Batch to read from Kafka? Have you tried setting "endingOffsets" to "latest" instead of an arbitrary number? Kind regards, Liam Clarke-Hutchinson On

Re: kafka rdd save to hive errer

2020-05-02 Thread Liam Clarke-Hutchinson
.option("startingOffsets", "earliest") .option("endingOffsets", "latest") .load() On Sun, May 3, 2020 at 1:50 AM Liam Clarke-Hutchinson < liam.cla...@adscale.co.nz> wrote: > Hello 姜戎 , > > Unfortunately there's not enough information in y

Re: Separate Kafka partitioning from key compaction

2020-05-06 Thread Liam Clarke-Hutchinson
Could you deploy a Kafka Streams app that implemented your desired partitioning? Obviously this would require a duplication in topics between those produced to initially, and those partitioned the way you'd like, but it would solve the issue you're having. On Wed, 6 May 2020, 10:25 pm Young, Ben

Re: JDBC Sink Connector

2020-05-07 Thread Liam Clarke-Hutchinson
he update count in the DB will be 10, not 15. Kind regards, Liam Clarke-Hutchinson On Thu, 7 May 2020, 5:48 pm vishnu murali, wrote: > Hey Guys, > > I am working on JDBC Sink Connector to take data from kafka topic to > mysql. > > I am having 2 questions. > > I am using

Kafka Connect SMT to insert key into message

2020-05-07 Thread Liam Clarke-Hutchinson
Hi all, I've been double checking the docs, and before I write a custom transform, am I correct that there's no supported way in the InsertField SMT to insert the key as a field of the value? Cheers, Liam Clarke-Hutchinson

Re: Kafka Connect SMT to insert key into message

2020-05-07 Thread Liam Clarke-Hutchinson
s with the aims of the project. My use case was records keyed on the topic by user id, and I wanted to persist them into S3 with the user id included in each line of the one-JSON-object-per-line data. Thanks, Liam Clarke-Hutchinson On Fri, May 8, 2020 at 12:14 PM Liam Clarke-Hutchinson < liam.cla...@ad
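To make the use case concrete, here is a small sketch of the record-level effect being asked for: the record key is copied into the JSON value before it is written as one object per line. This is plain Python for illustration, not the Kafka Connect transform API (which is Java); the function name, the "user_id" field name and the sample record are assumptions.

```python
import json


def insert_key_into_value(key, value_json, field="user_id"):
    """Return the JSON value with the record key added under `field`, as a single line."""
    record = json.loads(value_json)
    if not isinstance(record, dict):
        raise ValueError("value must be a JSON object to receive a key field")
    if field in record:
        raise ValueError(f"value already has a field named {field!r}")
    record[field] = key
    return json.dumps(record, sort_keys=True)


# Hypothetical record keyed by user id.
line = insert_key_into_value("u-42", '{"event": "click"}')
print(line)
```

Refusing to overwrite an existing field is a design choice made for this sketch; a real transform might instead make the behaviour configurable.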

Re: Kafka - FindCoordinator error

2020-05-07 Thread Liam Clarke-Hutchinson
Hi Rajib, We can't see the args you're passing the consumer, and the error message indicates the consumer can't find the cluster. Thanks, Liam Clarke-Hutchinson On Fri, 8 May 2020, 3:04 pm Rajib Deb, wrote: > I wanted to check if anyone has faced this issue > > Thanks
