Re: Kafka windowed table not aggregating correctly

2016-12-12 Thread Sachin Mittal
Hi, Well it does help in the case you mentioned, but in the case when on 2017 Dec 12 12:01 AM we receive a message stamped 2017 Dec 11 11:59 PM, it will either drop this message or create a fresh older window and aggregate the message in that, and then drop the window. It is not clear which of the c

What does GetOffsetShell result represent

2016-12-12 Thread Sachin Mittal
Hi, I have some trouble interpreting the result of GetOffsetShell command. Say if I run bin\windows\kafka-run-class.bat kafka.tools.GetOffsetShell --broker-list localhost:9092 --topic test-window-stream --time -2 test-window-stream:0:0 D:\kafka_2.10-0.10.2.0-SNAPSHOT>bin\windows\kafka-run-class.b

Re: checking consumer lag on KStreams app?

2016-12-12 Thread Sachin Mittal
Hi, I used the following command bin\windows\kafka-consumer-groups.bat --bootstrap-server localhost:9092 --describe --group test and I get the following output Note: This will only show information about consumers that use the Java consumer API (non-ZooKeeper-based consumers). Error: Consumer gro

failed to delete kafka topic when building from source

2016-12-12 Thread Sachin Mittal
Hi, I recently built an application from source and I get the following exception when trying to delete a topic kafka.common.KafkaStorageException: Failed to rename log directory from D:\tmp\kafka-logs\test-window-stream-0 to D:\tmp\kafka-logs\test-window-stream-0.0ce9f915431397d1c2dad4f535a3-

Re: min.insync.replica for __consumer_offsets topic

2016-12-12 Thread Manikumar
There is no "topic.replication.factor" client/server config property. Yes, the min.insync.replicas config property is applicable to internal topics also. We can override it with topic-level configs. You can use the kafka-topics command to describe the __consumer_offsets topic. You can use kafka-topics.sh

min.insync.replica for __consumer_offsets topic

2016-12-12 Thread Fang Wong
Hi, In our application, we set topic.replication.factor to 3 in the client side and min.insync.replicas = 2 in the kafka server side (server.properties). Does min.insync.replicas = 2 apply to kafka internal topic __consumer_offsets (we are using kafka version 0.9.0.1 and have offsets.storage = ka

Re: "Log end offset should not change while restoring"

2016-12-12 Thread Jon Yeargers
What is the specific cache config setting? On Mon, Dec 12, 2016 at 1:49 PM, Matthias J. Sax wrote: > We discovered a few more bugs and a bug fix release 0.10.1.1 is planned > already. > > The voting started for it, and it should get released in the next weeks. > > If your issue is related to this ca

Re: Struggling with Kafka Streams rebalances under load / in production

2016-12-12 Thread Guozhang Wang
Robert, To validate if a rebalance happens, you can check the server-side logs starting with "Preparing to restabilize group %s with old generation..", and if that is triggered by a consumer failure detected, it will have some entries like "Member XX in group YY has failed" before the "preparing"

Re: Another odd error

2016-12-12 Thread Guozhang Wang
Jon, To help investigate this issue, could you let me know 1) your topology sketch and 2) your app configs? For example did you enable caching in your apps with the cache.max.bytes.buffering config? Guozhang On Sun, Dec 11, 2016 at 3:44 PM, Jon Yeargers wrote: > I get this one quite a bit.

Re: Kafka windowed table not aggregating correctly

2016-12-12 Thread Guozhang Wang
Hi Sachin, Note that "until" means that the window will be retained for that period of time after the window starting time. So when you set the time to 1 year, if there is a message whose timestamp is 1 year + 1 sec beyond the "current stream time", then yes it will cause the window to be dropped.
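The retention rule described above can be sketched as a small check. This is a simplified model under stated assumptions, not the Kafka Streams implementation: `window_is_expired` and its parameters are hypothetical names, and the real store drops data in segments rather than window by window.

```python
def window_is_expired(window_start_ms, until_ms, stream_time_ms):
    """Simplified model: a window is kept for `until_ms` after its start time.

    Hypothetical helper for illustration only; not a Kafka Streams API.
    """
    return stream_time_ms > window_start_ms + until_ms


DAY_MS = 24 * 60 * 60 * 1000

# A window starting at t=0 with until() set to one day.
print(window_is_expired(0, DAY_MS, DAY_MS - 1))     # stream time still inside retention
print(window_is_expired(0, DAY_MS, DAY_MS + 1000))  # stream time 1 sec past retention
```

Under this model, a late record can only be aggregated into its window while the stream time has not yet passed the window start plus the `until` value.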

Re: partition count multiples - adverse effects on rebalancing?

2016-12-12 Thread Matthias J. Sax
If you have fewer threads than partitions, some threads will process multiple partitions. -Matthias On 12/12/16 3:45 AM, Jon Yeargers wrote: > Just curious - how is rebalancing handled when the number of potential > consumer threads isn't a multiple of the number of partitions? > > IE If I have 9
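To picture the answer for the 9-partition, 6-thread case asked about in this thread, here is a minimal sketch. The round-robin split and the name `spread_partitions` are assumptions made for illustration; the actual Kafka assignor may distribute partitions differently, but the counts per thread come out the same way: some threads simply own more than one partition.

```python
from collections import defaultdict


def spread_partitions(num_partitions, num_threads):
    """Hypothetical round-robin split; NOT Kafka's real partition assignor."""
    assignment = defaultdict(list)
    for partition in range(num_partitions):
        assignment[partition % num_threads].append(partition)
    return dict(assignment)


assignment = spread_partitions(9, 6)
for thread, partitions in sorted(assignment.items()):
    print(f"thread-{thread}: partitions {partitions}")
```

With 9 partitions and 6 threads, three threads end up with two partitions and three threads with one. That is a stable assignment, so an uneven count alone gives no reason for continuous rebalancing.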

Re: "Log end offset should not change while restoring"

2016-12-12 Thread Matthias J. Sax
We discovered a few more bugs and a bug fix release 0.10.1.1 is planned already. The voting started for it, and it should get released in the next weeks. If your issue is related to this caching problem, disabling the cache via StreamsConfig should fix the problem for now. Just set the cache size to

Re: How does 'TimeWindows.of().until()' work?

2016-12-12 Thread Matthias J. Sax
Sachin, There is no reason to have an .until() AND a .retain() -- just increase the value of .until() If you have a window of let's say 1h size and you set .until() also to 1h -- you can obviously not process any late arriving data. If you set until() to 2h in this example, you can process data t

Re: 10.1.0 client and 10.0.1 brokers

2016-12-12 Thread Samuel Taylor
Yup! "0.10.1.x clients only support 0.10.1.x or later brokers while 0.10.1.x brokers also support older clients" https://kafka.apache.org/documentation.html#upgrade - Samuel On Mon, Dec 12, 2016 at 1:54 PM, Aaron Niskode-Dossett < aniskodedoss...@etsy.com.invalid> wrote: > I'd just like to con

10.1.0 client and 10.0.1 brokers

2016-12-12 Thread Aaron Niskode-Dossett
I'd just like to confirm my understanding of kafka compatibility that this scenario will NOT work, is that right? A 10.1.0 client must read from a 10.1.0 or greater broker? Thanks! -Aaron

Re: rocksdb error(s)

2016-12-12 Thread Jon Yeargers
The attached error log shows several of the problems I've run into. After hitting this list (and typically much less) the app dies. On Mon, Dec 12, 2016 at 4:48 AM, Damian Guy wrote: > Just set the log level to debug and then run your app until you start > seeing the problem. > Thanks > > On Mon

Re: stunning error - Request of length 1550939497 is not valid, it is larger than the maximum size of 104857600 bytes

2016-12-12 Thread Martin Gainty
Hi Todd, take a look at chunked encoding as implemented by NIOConnector: https://publib.boulder.ibm.com/wasce/V2.1.0/en/http-connector.html

Re: NotEnoughReplication

2016-12-12 Thread Mohit Anchlia
It looks like replicas never catch up even when there is no load. Am I missing something? On Sat, Dec 10, 2016 at 8:09 PM, Mohit Anchlia wrote: > Does Kafka automatically replicate the under replicated partitions? > > I looked at these metrics through jmxterm the Value of > Underreplicatedpartit

Re: ActiveControllerCount is always will be either 0 or 1 in 3 nodes kafka cluster?

2016-12-12 Thread Tom Crayford
This is confluent documentation, not Apache documentation. I'd recommend talking to Confluent about that. On Mon, Dec 12, 2016 at 4:57 AM, Sven Ludwig wrote: > Hi, > > in JMX each Kafka broker has a value 1 or 0 for ActiveControllerCount. As > I understood from this thread, the sum of these valu

Re: Struggling with Kafka Streams rebalances under load / in production

2016-12-12 Thread Jay Kreps
I think the most common cause of rebalancing is still GC that exceeds the consumer liveness timeout you've configured. Might be worth enabling GC logging in java and then checking the pause times. If they exceed the timeout you have for liveness then you will detect that as a process failure and re

Re: [Streams] Threading Frustration

2016-12-12 Thread Avi Flax
> On Dec 12, 2016, at 12:29, Damian Guy wrote: > > Yep - that looks correct Fantastic! Thanks again! Software Architect @ Park Assist We’re hiring! http://tech.parkassist.com/jobs/

Re: [Streams] Threading Frustration

2016-12-12 Thread Damian Guy
It is the same. The ruby code in this example is just calling into the Java API. On Mon, 12 Dec 2016 at 17:40 Ali Akhtar wrote: > @Damian, > > In the Java equivalent of this, does each KStream / KStreamBuilder.stream() > invocation create its own topic group, i.e its own thread? > > On Mon, Dec

Re: [Streams] Threading Frustration

2016-12-12 Thread Ali Akhtar
@Damian, In the Java equivalent of this, does each KStream / KStreamBuilder.stream() invocation create its own topic group, i.e its own thread? On Mon, Dec 12, 2016 at 10:29 PM, Damian Guy wrote: > Yep - that looks correct > > On Mon, 12 Dec 2016 at 17:18 Avi Flax wrote: > > > > > > On Dec 12,

Re: stunning error - Request of length 1550939497 is not valid, it is larger than the maximum size of 104857600 bytes

2016-12-12 Thread Todd Palino
Are you actually getting requests that are 1.3 GB in size, or is something else happening, like someone trying to make HTTP requests against the Kafka broker port? -Todd On Mon, Dec 12, 2016 at 4:19 AM, Ramya Ramamurthy < ramyaramamur...@teledna.com> wrote: > We have got exactly the same proble
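One way to check the "HTTP request against the broker port" hypothesis is to look at the reported size as raw bytes. Kafka requests start with a 4-byte big-endian length, so if a non-Kafka client connects, the broker reads the first four bytes of whatever was sent as that length. The sketch below decodes the size quoted in the "Invalid receive" report elsewhere in this thread; `size_prefix_as_ascii` is a hypothetical helper name.

```python
def size_prefix_as_ascii(size):
    """Interpret a 4-byte big-endian length prefix as the raw bytes it came from."""
    return size.to_bytes(4, byteorder="big")


# Size quoted in the "Invalid receive" report in this thread.
reported = 1347375956
print(size_prefix_as_ascii(reported))  # b'POST'
```

The bytes spell `POST`, which is consistent with an HTTP client sending a request to the broker port rather than a genuine 1.3 GB Kafka request. Raising the maximum request size would not help in that case.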

Re: Deleting a topic without delete.topic.enable=true?

2016-12-12 Thread Todd Palino
If it issues topic metadata requests to find out what brokers the topic partitions live on (so it can make specific requests, like offsets, or getting JMX metrics from those brokers for the partitions), then it could easily have caused this. We’ve seen this behavior from multiple types of monitorin

Re: [Streams] Threading Frustration

2016-12-12 Thread Damian Guy
Yep - that looks correct On Mon, 12 Dec 2016 at 17:18 Avi Flax wrote: > > > On Dec 12, 2016, at 11:42, Damian Guy wrote: > > > > If you want to split these out so that they can run in parallel, then you > > will need to create a new stream for each topic. > > > Just to make sure I’m understandi

Re: [Streams] Threading Frustration

2016-12-12 Thread Avi Flax
> On Dec 12, 2016, at 11:42, Damian Guy wrote: > > If you want to split these out so that they can run in parallel, then you > will need to create a new stream for each topic. Just to make sure I’m understanding this, does this code change look right? From this: ```ruby topic_names = SDP::Co

Re: stunning error - Request of length 1550939497 is not valid, it is larger than the maximum size of 104857600 bytes

2016-12-12 Thread Ramya Ramamurthy
We have exactly the same problem: Invalid receive (size = 1347375956 larger than 104857600). When we try to increase the size, we get a Java out-of-memory exception. Did you find a workaround? Thanks.

Re: [Streams] Threading Frustration

2016-12-12 Thread Avi Flax
> On Dec 12, 2016, at 11:42, Damian Guy wrote: > > If you want to split these out so that they can run in parallel, then you > will need to create a new stream for each topic. Super, super helpful — thanks so much! Avi

Re: Deleting a topic without delete.topic.enable=true?

2016-12-12 Thread Tim Visher
I wonder if datadog monitoring triggers that behavior. That's the only other piece of our infrastructure that may have been talking to that topic. On Mon, Dec 12, 2016 at 12:40 AM, Surendra , Manchikanti < surendra.manchika...@gmail.com> wrote: > If "auto.create.topics.enable" is set to true in y

Re: [Streams] Threading Frustration

2016-12-12 Thread Damian Guy
Hi Avi, Thanks for sharing your code. I believe the reason you are only seeing 20 threads used is due to a couple of things: KafkaStreams uses a concept of topic groups to group related topics into StreamTasks. A StreamTask is an independent part of the topology that can be executed on its

Re: [Streams] Threading Frustration

2016-12-12 Thread Avi Flax
> On Dec 12, 2016, at 10:24, Damian Guy wrote: > > The code for your Streams Application. Doesn't have to be the actual code, > but an example of how you are using Kafka Streams. OK, I’ve prepared a gist with (I hope) the relevant code, and also some log records just in case they might help:

Re: doing > 1 'parallel' operation on a stream

2016-12-12 Thread Eno Thereska
Hi there, Just pass your stream twice to the aggregator. No need to duplicate it or copy it etc. Eg. stream2 = stream1.aggregate(...) stream3 = stream1.aggregate(/* some different aggregation */) Thanks Eno > On 12 Dec 2016, at 11:09, Jon Yeargers wrote: > > If I want to aggregate a stream t

Re: [Streams] Threading Frustration

2016-12-12 Thread Damian Guy
Hi Avi, The code for your Streams Application. Doesn't have to be the actual code, but an example of how you are using Kafka Streams. On Mon, 12 Dec 2016 at 15:13 Avi Flax wrote: > > > On Dec 12, 2016, at 10:08, Damian Guy wrote: > > > > Can you provide an example of your topology? > > I’d be

Re: [Streams] Threading Frustration

2016-12-12 Thread Avi Flax
> On Dec 12, 2016, at 10:08, Damian Guy wrote: > > Can you provide an example of your topology? I’d be happy to, but I’m not sure what you mean? Are you interested in the implementation at the code level? You want a better understanding of the work being done? The infrastructure?

Re: [Streams] Threading Frustration

2016-12-12 Thread Damian Guy
Hi Avi, Can you provide an example of your topology? Thanks, Damian On Mon, 12 Dec 2016 at 15:02 Avi Flax wrote: Hi all, I’m running Kafka 0.10.0.1 on Java 8 on Linux — same for brokers and streams nodes. I’m attempting to scale up a Streams app that does a lot of I/O — in particular, I’m ho

Re: checking consumer lag on KStreams app?

2016-12-12 Thread Damian Guy
Hi Sachin, You should use the kafka-consumer-groups.sh command. The ConsumerOffsetChecker is deprecated and is only for the old consumer. Thanks, Damian On Mon, 12 Dec 2016 at 14:32 Sachin Mittal wrote: > Hi, > I have a streams application running with application id test. > When I try to chec

[Streams] Threading Frustration

2016-12-12 Thread Avi Flax
Hi all, I’m running Kafka 0.10.0.1 on Java 8 on Linux — same for brokers and streams nodes. I’m attempting to scale up a Streams app that does a lot of I/O — in particular, I’m hoping to isolate each partition into its own thread — and I’m confused: * I recently created 11 new topics and part

Re: checking consumer lag on KStreams app?

2016-12-12 Thread Sachin Mittal
Hi, I have a streams application running with application id test. When I try to check consumer lag like you suggested I get the following issue: bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --zookeeper localhost:2181 --group test [2016-12-12 10:26:01,348] WARN WARNING: ConsumerOffsetC

RE: ActiveControllerCount is always will be either 0 or 1 in 3 nodes kafka cluster?

2016-12-12 Thread Sven Ludwig
Hi, in JMX each Kafka broker has a value 1 or 0 for ActiveControllerCount. As I understood from this thread, the sum of these values across the cluster should never be something other than 1. The documentation at http://docs.confluent.io/3.1.0/kafka/monitoring.html should be improved to make
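The invariant described here can be expressed as a small check over per-broker readings. This is only a sketch: the broker names and the `controller_count_ok` helper are hypothetical, and collecting the ActiveControllerCount values from JMX is assumed to happen elsewhere.

```python
def controller_count_ok(counts_by_broker):
    """Return True when exactly one broker reports itself as active controller.

    `counts_by_broker` maps a (hypothetical) broker name to its
    ActiveControllerCount reading, which is 0 or 1 per broker.
    """
    return sum(counts_by_broker.values()) == 1


# Hypothetical readings from a 3-node cluster.
healthy = {"broker-1": 0, "broker-2": 1, "broker-3": 0}
split = {"broker-1": 1, "broker-2": 1, "broker-3": 0}

print(controller_count_ok(healthy))
print(controller_count_ok(split))
```

A monitoring rule built on this would alert on the cluster-wide sum rather than on any single broker's value, since a 0 on an individual broker is normal.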

Re: rocksdb error(s)

2016-12-12 Thread Damian Guy
Just set the log level to debug and then run your app until you start seeing the problem. Thanks On Mon, 12 Dec 2016 at 12:47 Jon Yeargers wrote: > I can log whatever you need. Tell me what is useful. > > On Mon, Dec 12, 2016 at 4:43 AM, Damian Guy wrote: > > > If you provide the logs from your

Re: rocksdb error(s)

2016-12-12 Thread Jon Yeargers
I can log whatever you need. Tell me what is useful. On Mon, Dec 12, 2016 at 4:43 AM, Damian Guy wrote: > If you provide the logs from your streams application then we might have > some chance of working out what is going on. Without logs then we really > don't have much hope of diagnosing the p

Re: rocksdb error(s)

2016-12-12 Thread Damian Guy
If you provide the logs from your streams application then we might have some chance of working out what is going on. Without logs then we really don't have much hope of diagnosing the problem. On Mon, 12 Dec 2016 at 12:18 Jon Yeargers wrote: > Im running as many threads as I have partitions on

Re: rocksdb error(s)

2016-12-12 Thread Jon Yeargers
I'm running as many threads as I have partitions on this topic. Just curious if it would make any difference to the seemingly endless rebalancing woes. So far no change. In fact, I'll often see all 10 partitions (plus the 2 x 10 for the two aggregations) assigned to a single thread. On Mon, Dec 12

Re: rocksdb error(s)

2016-12-12 Thread Jon Yeargers
At this moment I have 5 instances each running 2 threads. Single instance / machine. Define 'full logs' ? On Mon, Dec 12, 2016 at 3:54 AM, Damian Guy wrote: > Jon, > > How many StreamThreads do you have running? > How many application instances? > Do you have more than one instance per machine?

Re: rocksdb error(s)

2016-12-12 Thread Damian Guy
Jon, How many StreamThreads do you have running? How many application instances? Do you have more than one instance per machine? If yes, are they sharing the same State Directory? Do you have full logs that can be provided so we can try and see how/what is happening? Thanks, Damian On Mon, 12 De

partition count multiples - adverse effects on rebalancing?

2016-12-12 Thread Jon Yeargers
Just curious - how is rebalancing handled when the number of potential consumer threads isn't a multiple of the number of partitions? IE If I have 9 partitions and 6 threads - will the cluster be forever trying to balance this?

doing > 1 'parallel' operation on a stream

2016-12-12 Thread Jon Yeargers
If I want to aggregate a stream twice using different windows do I need to split / copy / duplicate the source stream somehow? Or will this be handled without my interference?

Re: Running cluster of stream processing application

2016-12-12 Thread Damian Guy
In the scenario you mention above about max.poll.interval.ms, yes if the timeout was reached then there would be a rebalance and one of the standby tasks would take over. However the original task may still be processing the data when the rebalance occurs and would throw an exception when it tries

"Log end offset should not change while restoring"

2016-12-12 Thread Jon Yeargers
I'm seeing this error occur more frequently of late. I ran across this thread: https://groups.google.com/forum/#!topic/confluent-platform/AH5QClSNZBw The implication from the thread is that a fix is available. Where can I get it?

Re: Running cluster of stream processing application

2016-12-12 Thread Sachin Mittal
Understood. Also the line Thanks for the application. It is not clear that clustering depends on how source topics are partitioned. Should be read as Thanks for the explanation. It is now clear that clustering depends on how source topics are partitioned. Apologies for auto-correct. One thing I

Re: rocksdb error(s)

2016-12-12 Thread Jon Yeargers
No luck here. Moved all state storage to a non-tmp folder and restarted. Still hitting the 'No locks available' error quite frequently. On Sun, Dec 11, 2016 at 3:45 PM, Jon Yeargers wrote: > I moved the state folder to a separate drive and linked out to it. > > I'll try your suggestion and point

internals.AbstractCoordinator {} - Marking the coordinator ip-xyz:9092 (id: 2144 rack: null) dead for group

2016-12-12 Thread shahab
Hello, Sorry for posting this question again; I didn't get any response last time. I am using Kafka 1.0.1 and on the Kafka consumer side I am regularly (every 10-20 min) seeing the following Kafka log message, which causes the partition assignment to consumer ,... Consumer s

Re: Running cluster of stream processing application

2016-12-12 Thread Damian Guy
Hi Sachin, The KafkaStreams StreamsPartitionAssignor will take care of assigning the Standby Tasks to the other instances of your Kafka Streams application. The state store updates are all handled by reading from the change-logs and updating local copies, there is no communication required between

Looking for some info on kafka homepage

2016-12-12 Thread Aki Yoshida
Hi, I am looking for the following information on Kafka's project home page but am not able to find it. The home page seems to have been updated recently (looks really nice ;-) and maybe I just don't know where to navigate. The information that I would like to find is: - List of Kafka r

Re: Running cluster of stream processing application

2016-12-12 Thread Sachin Mittal
Hi, Thanks for the application. It is not clear that clustering depends on how source topics are partitioned. In our case I guess num.standby.replicas settings is best suited. If say I set this to 2 and run two more same application in two different machines, how would my original instance know in

Re: How does 'TimeWindows.of().until()' work?

2016-12-12 Thread Sachin Mittal
Hi, We are facing the exact problem described by Matthias above. We are keeping the default until, which is 1 day. Our record's timestamp extractor has a field which increases with time. However, for short periods we cannot guarantee that the timestamp always increases. So at the boundary, i.e. after 24 hr

Re: Upgrading from 0.10.0.1 to 0.10.1.0

2016-12-12 Thread Hagen Rother
Thanks Ewen, I am just trying to ensure a smooth rollover. So what I was able to test so far was a 0.10.1 mirror maker with source (0.10.0.1) and target (0.10.1) brokers. That doesn't work, even if I specify zookeepers in the consumer config. Not work as in "comes up, does nothing, eventually thro