Re: [VOTE] KIP-657: Add Customized Kafka Streams Logo

2020-08-19 Thread Michael Noll
I echo what Michael says here. Another consideration is that logos are often shrunk (when used on slides) an

Re: [VOTE] KIP-657: Add Customized Kafka Streams Logo

2020-08-19 Thread Michael Noll
Hi all! Great to see we are in the process of creating a cool logo for Kafka Streams. First, I apologize for sharing feedback so late -- I just learned about it today. :-) Here's my *personal, subjective* opinion on the two current logo candidates for Kafka Streams. TL;DR: Sorry, but I really

Re: Kafka-streams: mix Processor API with windowed grouping

2018-04-10 Thread Michael Noll
Also, if you want (or can tolerate) probabilistic counting, with the option to also do TopN in that manner, we also have an example that uses Count Min Sketch: https://github.com/confluentinc/kafka-streams-examples/blob/4.0.0-post/src/test/scala/io/confluent/examples/streams/ProbabilisticCountingSc

Re: Kafka 11 | Stream Application crashed the brokers

2017-12-01 Thread Michael Noll
Thanks for reporting back, Sameer! On Fri, Dec 1, 2017 at 2:46 AM, Guozhang Wang wrote: > Thanks for confirming Sameer. Guozhang On Thu, Nov 30, 2017 at 3:52 AM, Sameer Kumar wrote: > Just wanted to let everyone know that this issue got fixed in Kafka 1.0.0. I recently mi

Re: left join between PageViews(KStream) & UserProfile (KTable)

2017-10-23 Thread Michael Noll
> *What key should the join on?* The message key, in both cases, should contain the user ID in String format. > *There seems to be no common key (eg. user) between the 2 classes - PageView and UserProfile* The user ID is the common key, but the user ID is stored in the respective message *keys

Re: KTable-KTable Join Semantics on NULL Key

2017-09-14 Thread Michael Noll
Perhaps a clarification to what Damian said: It is shown in the (HTML) table at the link you shared [1] what happens when you get null values for a key. We also have slightly better join documentation at [2], the content/text of which we are currently migrating over to the official Apache Kafka d

Re: Avro Serialization & Schema Registry ..

2017-07-19 Thread Michael Noll
In short, Avro serializers/deserializers provided by Confluent always integrate with (and thus require) Confluent Schema Registry. That's why you must set the `schema.registry.url` configuration for them. If you want to use Avro but without a schema registry, you'd need to work with the Avro API
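The thread above notes that Confluent's Avro serializers require the `schema.registry.url` setting. As a minimal, hedged sketch of what that producer configuration looks like (the serializer class name and config key are taken from Confluent's documentation of that era; the endpoint URLs are placeholders):

```java
import java.util.Properties;

public class AvroSerdeConfig {
    public static Properties producerConfig() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder broker address
        // Confluent's Avro serializers integrate with Schema Registry and
        // therefore require its endpoint; without it they fail at config time.
        props.put("key.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("schema.registry.url", "http://localhost:8081"); // placeholder registry URL
        return props;
    }

    public static void main(String[] args) {
        System.out.println(producerConfig().getProperty("schema.registry.url"));
    }
}
```

If you want Avro without a registry, you would instead serialize with the plain Avro API and ship the schema (or a schema ID of your own) alongside the bytes yourself.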

Re: Finding StreamsMetadata with value-dependent partitioning

2017-06-06 Thread Michael Noll
Happy to hear you found a working solution, Steven! -Michael On Sat, Jun 3, 2017 at 12:53 AM, Steven Schlansker <sschlans...@opentable.com> wrote: > On Jun 2, 2017, at 3:32 PM, Matthias J. Sax wrote: > Thanks. That helps to understand the use case better. > Rephrase to ma

Re: [DISCUSS]: KIP-161: streams record processing exception handlers

2017-05-30 Thread Michael Noll
Thanks for your work on this KIP, Eno -- much appreciated! - I think it would help to improve the KIP by adding an end-to-end code example that demonstrates, with the DSL and with the Processor API, how the user would write a simple application that would then be augmented with the proposed KIP ch

Re: Kafka Streams 0.10.0.1 - multiple consumers not receiving messages

2017-04-28 Thread Michael Noll
To add to what Eno said: You can of course use the Kafka Streams API to build an application that consumes from multiple Kafka topics. But, going back to your original question, the scalability of Kafka and the Kafka Streams API is based on partitions, not on topics. -Michael On Fri, Apr 28,

Re: [ANNOUNCE] New committer: Rajini Sivaram

2017-04-24 Thread Michael Noll
Congratulations, Rajini! On Mon, Apr 24, 2017 at 11:50 PM, Ismael Juma wrote: > Congrats Rajini! Well-deserved. :) > > Ismael > > On Mon, Apr 24, 2017 at 10:06 PM, Gwen Shapira wrote: > > > The PMC for Apache Kafka has invited Rajini Sivaram as a committer and we > > are pleased to announce tha

Re: Joining on non-keyed values - how to lookup fields

2017-04-20 Thread Michael Noll
Jon, the recently introduced GlobalKTable ("global tables") allow you to perform non-key lookups. See http://docs.confluent.io/current/streams/developer-guide.html#kstream-globalktable-join (and the javadocs link) > So called "internal" values can't be looked up. If I understand you correctly:
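The point of a KStream-GlobalKTable join is that the lookup key can be extracted from the stream record's *value* rather than its key. A minimal pure-Java sketch of that "non-key lookup" idea, with a plain `Map` standing in for the fully replicated global table (all names and the value format are hypothetical):

```java
import java.util.Map;

public class NonKeyLookupSketch {
    // Stand-in for a GlobalKTable: userId -> region, replicated locally.
    static final Map<String, String> regions = Map.of("u1", "EU", "u2", "US");

    // The "join key" (userId) is extracted from the stream record's value,
    // which is exactly what a KStream-GlobalKTable join permits.
    static String enrich(String orderValue) {
        String userId = orderValue.split(",")[1]; // assumed value format: "orderId,userId"
        return orderValue + "," + regions.getOrDefault(userId, "UNKNOWN");
    }

    public static void main(String[] args) {
        System.out.println(enrich("o42,u1")); // -> o42,u1,EU
    }
}
```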

Re: Kafka Streams: a back-pressure question for windowed streams

2017-04-20 Thread Michael Noll
Hi there! In short, Kafka Streams ensures that your application consumes only as much data (or: as fast) as it can process it. The main "problem" you might encounter is not that you run into issues with state stores (like in-memory stores or RocksDB stores), but -- which is a more general issue -

Re: auto.offset.reset for Kafka streams 0.10.2.0

2017-04-11 Thread Michael Noll
It's also documented at http://docs.confluent.io/current/streams/developer-guide.html#non-streams-configuration-parameters . FYI: We have already begun syncing the Confluent docs for Streams into the Apache Kafka docs for Streams, but there's still quite some work left (volunteers are welcome :-P)

Re: weird SerializationException when consumer is fetching and parsing record in streams application

2017-03-30 Thread Michael Noll
Sachin, there's a JIRA that seems related to what you're seeing: https://issues.apache.org/jira/browse/KAFKA-4740 Perhaps you could check the above and report back? -Michael On Thu, Mar 30, 2017 at 3:23 PM, Michael Noll wrote: > Hmm, I re-read the stacktrace again. It does

Re: weird SerializationException when consumer is fetching and parsing record in streams application

2017-03-30 Thread Michael Noll
Hmm, I re-read the stacktrace again. It does look like the value-side being the culprit (as Sachin suggested earlier). -Michael On Thu, Mar 30, 2017 at 3:18 PM, Michael Noll wrote: > Sachin, > > you have this line: > > > builder.stream(Serdes.String(), serde, "advice

Re: weird SerializationException when consumer is fetching and parsing record in streams application

2017-03-30 Thread Michael Noll
V data = null; try { data = objectMapper.readValue(paramArrayOfByte, new TypeReference() {}); } catch (Exception e) { e.printStackTra

Re: Understanding ReadOnlyWindowStore.fetch

2017-03-30 Thread Michael Noll
ey().reduce(..., "somekeystore"); and then call this: kt.forEach() -> ... Can I assume that everything that comes out will be available in "

Re: weird SerializationException when consumer is fetching and parsing record in streams application

2017-03-30 Thread Michael Noll
Could this be a corrupted message ("poison pill") in your topic? If so, take a look at http://docs.confluent.io/current/streams/faq.html#handling-corrupted-records-and-deserialization-errors-poison-pill-messages FYI: We're currently investigating a more elegant way to address such poison pill pro
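The usual way to survive a poison pill is a deserializer that catches the failure and signals "skip this record" instead of throwing. A minimal pure-Java sketch of that log-and-skip pattern (class and method names are illustrative, not the Kafka API):

```java
import java.nio.charset.StandardCharsets;

public class SafeLongDeserializer {
    // Returns null instead of throwing when a record can't be deserialized,
    // so one corrupted message ("poison pill") doesn't kill the application.
    public static Long deserialize(byte[] bytes) {
        if (bytes == null) return null;
        try {
            return Long.parseLong(new String(bytes, StandardCharsets.UTF_8));
        } catch (NumberFormatException poisonPill) {
            // skip strategy; a real app might also log or write to a dead-letter topic
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(deserialize("42".getBytes(StandardCharsets.UTF_8)));           // 42
        System.out.println(deserialize("not-a-number".getBytes(StandardCharsets.UTF_8))); // null
    }
}
```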

Re: ThoughWorks Tech Radar: Assess Kafka Streams

2017-03-30 Thread Michael Noll
Aye! Thanks for sharing, Jan. :-) On Wed, Mar 29, 2017 at 8:56 PM, Eno Thereska wrote: > Thanks for the heads up Jan! > > Eno > > > On 29 Mar 2017, at 19:08, Jan Filipiak wrote: > > > > Regardless of how usefull you find the tech radar. > > > > Well deserved! even though we all here agree that

Re: Using Kafka Stream in the cluster of kafka on one or multiple docker-machine/s

2017-03-30 Thread Michael Noll
main/java/io/confluent/examples/streams/WordCountLambdaExample.java#L55-L62 and https://github.com/confluentinc/examples/tree/3.2.x/kafka-streams#packaging-and-running I missed the fact that the jar should be run in a separate container. Bes

Re: Understanding ReadOnlyWindowStore.fetch

2017-03-29 Thread Michael Noll
Jon, there's a related example, using a window store and a key-value store, at https://github.com/confluentinc/examples/blob/3.2.x/kafka-streams/src/test/java/io/confluent/examples/streams/ValidateStateWithInteractiveQueriesLambdaIntegrationTest.java (this is for Confluent 3.2 / Kafka 0.10.2). -M

Re: Custom stream processor not triggering #punctuate()

2017-03-28 Thread Michael Noll
Elliot, in the current API, `punctuate()` is called based on the current stream-time (which defaults to event-time), not based on the current wall-clock time / processing-time. See http://docs.confluent.io/ current/streams/faq.html#why-is-punctuate-not-called. The stream-time is advanced only wh
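A simplified model of the stream-time semantics described above: punctuations fire only as record timestamps advance the stream-time past interval boundaries, never from wall-clock time alone. This is a sketch of the behavior, not Kafka's actual internals:

```java
import java.util.ArrayList;
import java.util.List;

public class StreamTimePunctuate {
    // punctuate() fires based on stream-time (derived from record timestamps),
    // not wall-clock time: it only advances when new records arrive.
    static List<Long> punctuationsFor(long[] recordTimestamps, long intervalMs) {
        List<Long> fired = new ArrayList<>();
        long nextPunctuate = intervalMs;
        long streamTime = Long.MIN_VALUE;
        for (long ts : recordTimestamps) {
            streamTime = Math.max(streamTime, ts);  // stream-time never goes backwards
            while (streamTime >= nextPunctuate) {   // may fire several times after a time gap
                fired.add(nextPunctuate);
                nextPunctuate += intervalMs;
            }
        }
        return fired;
    }

    public static void main(String[] args) {
        // With interval 10: records at t=3, 12, 35 advance stream-time past 10, 20, 30.
        System.out.println(punctuationsFor(new long[]{3, 12, 35}, 10)); // [10, 20, 30]
    }
}
```

Note how no punctuation fires for the record at t=3: stream-time has not yet reached the first interval boundary, which is exactly why `punctuate()` appears "not called" when no new data arrives.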

Re: APPLICATION_SERVER_CONFIG ?

2017-03-27 Thread Michael Noll
kafka since I joined this company last year. I think my largest issue is rethinking some preexisting notions about streaming to make them work in the kstream universe. On Fri, Mar 24, 2017 at 6:07 AM, Michael Noll wrote: > If I understand thi

Re: using a state store for deduplication

2017-03-27 Thread Michael Noll
Jon, Damian already answered your direct question, so my comment is a FYI: There's a demo example at https://github.com/confluentinc/examples/blob/3.2.x/kafka-streams/src/test/java/io/confluent/examples/streams/EventDeduplicationLambdaIntegrationTest.java (this is for Confluent 3.2 / Kafka 0.10.2
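The core of state-store-based de-duplication is: forward a record only if its id has not been seen within some retention window. A minimal pure-Java sketch with a `HashMap` standing in for the windowed state store (names and the TTL strategy are illustrative):

```java
import java.util.HashMap;
import java.util.Map;

public class DedupSketch {
    // Stand-in for a windowed state store: event id -> timestamp first seen.
    private final Map<String, Long> seen = new HashMap<>();
    private final long retentionMs;

    DedupSketch(long retentionMs) { this.retentionMs = retentionMs; }

    // Returns true if the record should be forwarded (first occurrence within
    // the retention period), false if it is a duplicate to be dropped.
    boolean isFirstOccurrence(String eventId, long timestamp) {
        Long first = seen.get(eventId);
        if (first != null && timestamp - first < retentionMs) {
            return false; // duplicate within the de-duplication window
        }
        seen.put(eventId, timestamp); // a real store would also expire old entries
        return true;
    }

    public static void main(String[] args) {
        DedupSketch dedup = new DedupSketch(1000);
        System.out.println(dedup.isFirstOccurrence("e1", 0));    // true
        System.out.println(dedup.isFirstOccurrence("e1", 500));  // false (duplicate)
        System.out.println(dedup.isFirstOccurrence("e1", 2000)); // true (outside window)
    }
}
```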

Re: YASSQ (yet another state store question)

2017-03-27 Thread Michael Noll
IIRC this may happen, for example, if the first host runs all the stream tasks (here: 2 in total) and migration of stream task(s) to the second host hasn't happened yet. -Michael On Sun, Mar 26, 2017 at 3:14 PM, Jon Yeargers wrote: > Also - if I run this on two hosts - what does it imply if t

Re: Streams RocksDBException with no message?

2017-03-27 Thread Michael Noll
We're talking about `ulimit` (CLI tool) and the `nofile` limit (number of open files), which you can access via `ulimit -n`. Examples: https://access.redhat.com/solutions/61334 https://stackoverflow.com/questions/21515463/how-to-increase-maximum-file-open-limit-ulimit-in-ubuntu Depending on the o

Re: APPLICATION_SERVER_CONFIG ?

2017-03-24 Thread Michael Noll
> If I understand this correctly: assuming I have a simple aggregator > distributed across n-docker instances each instance will _also_ need to > support some sort of communications process for allowing access to its > statestore (last param from KStream.groupby.aggregate). Yes. See http://docs.c

Re: clearing an aggregation?

2017-03-23 Thread Michael Noll
his sort of 'aggregate and clear' approach still requires an external datastore (like Redis). Please correct me if I'm wrong. On Mon, Mar 20, 2017 at 9:29 AM, Michael Noll wrote: > Jon, the windowing operation of Kafka's Streams API

Re: Getting current value of aggregated key

2017-03-23 Thread Michael Noll
Jon, you can use Kafka's interactive queries feature for this: http://docs.confluent.io/current/streams/developer-guide.html#interactive-queries -Michael On Thu, Mar 23, 2017 at 1:52 PM, Jon Yeargers wrote: > If I have an aggregation : > > KTable, VideoLogLine> outTable = > sourceStream.grou

Re: Error in running PageViewTypedDemo

2017-03-23 Thread Michael Noll
mThread.runLoop(StreamThread.java:415) at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:242) Any example on the correct input value is really appreciated. Thanks On Wed, Mar 2

Re: kafka streams in-memory Keyvalue store iterator remove broken on upgrade to 0.10.2.0 from 0.10.1.1

2017-03-22 Thread Michael Noll
To add to what Matthias said, in case the following isn't clear: - You should not (and, in 0.10.2, cannot any longer) call the iterator's remove() method, i.e. `KeyValueIterator#remove()` when iterating through a `KeyValueStore`. Perhaps this is something we should add to the `KeyValueIterator` j
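The "iterator's remove() is unsupported" behavior described above is the same contract plain Java exposes on immutable collections; this pure-JDK sketch only mirrors that contract, it does not use the Kafka `KeyValueIterator` itself:

```java
import java.util.Iterator;
import java.util.List;

public class IteratorRemoveDemo {
    public static void main(String[] args) {
        // List.of returns an immutable list; its iterator's remove() throws,
        // mirroring how removing entries while iterating a KeyValueStore is unsupported.
        Iterator<String> it = List.of("a", "b").iterator();
        it.next();
        try {
            it.remove();
            System.out.println("remove succeeded");
        } catch (UnsupportedOperationException e) {
            System.out.println("remove not supported");
        }
    }
}
```

The takeaway is the same as in the thread: treat store iterators as read-only views, and perform deletions through the store's own API instead.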

Re: Error in running PageViewTypedDemo

2017-03-22 Thread Michael Noll
IIRC the PageViewTypedDemo example requires input data where the username/userId is captured in the keys of messages/records, and further information in the values of those messages. The problem you are running into is that, when you are writing your input data via the console producer, the record

Re: [DISCUSS] KIP-120: Cleanup Kafka Streams builder API

2017-03-22 Thread Michael Noll
Forwarding to kafka-user. -- Forwarded message -- From: Michael Noll Date: Wed, Mar 22, 2017 at 8:48 AM Subject: Re: [DISCUSS] KIP-120: Cleanup Kafka Streams builder API To: d...@kafka.apache.org Matthias, > @Michael: > > You seemed to agree with Jay about not exp

Re: Using Kafka Stream in the cluster of kafka on one or multiple docker-machine/s

2017-03-21 Thread Michael Noll
Typically you'd containerize your app and then launch e.g. 10 containers if you need to run 10 instances of your app. Also, what do you mean by "in a cluster of Kafka containers" and "in the cluster of Kafkas"? On Tue, Mar 21, 2017 at 9:08 PM, Mina Aslani wrote: > Hi, > > I am trying to underst

Re: Re: [DISCUSS] KIP-120: Cleanup Kafka Streams builder API

2017-03-20 Thread Michael Noll
d the topology if provided as a constructor argument. However, especially for DSL (not sure if it would make sense for PAPI), the DSL builder could create the client for the user. Something like this: > KStreamBuilder bu

Re: Out of order message processing with Kafka Streams

2017-03-20 Thread Michael Noll
ucket) should be started, and future messages should belong to that 'session', until the next 30+ min gap). On Mon, Mar 20, 2017 at 11:44 PM, Michael Noll wrote: > Can windows only be used for aggregations, or can they also be used for fore

Re: Out of order message processing with Kafka Streams

2017-03-20 Thread Michael Noll
t possible to get metadata on the message, such as whether or not it's late, its index/position within the other messages, etc? On Mon, Mar 20, 2017 at 9:44 PM, Michael Noll wrote: > And since you asked for a pointer, Ali: http://docs.confluent.io/current/st

Re: clearing an aggregation?

2017-03-20 Thread Michael Noll
Jon, the windowing operation of Kafka's Streams API (in its DSL) aligns time-based windows to the epoch [1]: Quoting from e.g. hopping windows (sometimes called sliding windows in other technologies): > Hopping time windows are aligned to the epoch, with the lower interval bound > being inclusiv
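The epoch alignment described above can be sketched in a few lines of plain Java: every multiple of the advance interval is a window start, and a record belongs to every aligned window `[start, start + size)` that contains its timestamp. This is a simplified model of the semantics, not the Kafka implementation:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class HoppingWindows {
    // Window start times are aligned to the epoch: every multiple of the
    // advance interval is a window start, regardless of when data arrives.
    static List<Long> windowsFor(long timestamp, long sizeMs, long advanceMs) {
        List<Long> starts = new ArrayList<>();
        long start = timestamp - (timestamp % advanceMs); // latest aligned start <= timestamp
        while (start > timestamp - sizeMs) {              // [start, start+size) contains timestamp
            if (start >= 0) starts.add(start);
            start -= advanceMs;
        }
        Collections.reverse(starts);
        return starts;
    }

    public static void main(String[] args) {
        // size 60, advance 15: a record at t=100 falls into 4 overlapping windows
        System.out.println(windowsFor(100, 60, 15)); // [45, 60, 75, 90]
    }
}
```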

Re: Out of order message processing with Kafka Streams

2017-03-20 Thread Michael Noll
Late-arriving and out-of-order data is only treated specially for windowed aggregations. For stateless operations such as `KStream#foreach()` or `KStream#map()`, records are processed in the order they arrive (per partition). -Michael On Sat, Mar 18, 2017 at 10:47 PM, Ali Akhtar wrote: > >

Re: Out of order message processing with Kafka Streams

2017-03-20 Thread Michael Noll
And since you asked for a pointer, Ali: http://docs.confluent.io/current/streams/concepts.html#windowing On Mon, Mar 20, 2017 at 5:43 PM, Michael Noll wrote: > Late-arriving and out-of-order data is only treated specially for windowed > aggregations. > > For stateless operat

Re: Not Serializable Result Error

2017-03-15 Thread Michael Noll
Hi Armaan, > org.apache.spark.SparkException: Job aborted due to stage failure: >Task 0.0 in stage 0.0 (TID 0) had a not serializable result: org.apache.kafka.clients.consumer.ConsumerRecord perhaps you should ask that question in the Spark mailing list, which should increase your chances of

Re: Trying to use Kafka Stream

2017-03-15 Thread Michael Noll
ob/3.2.x/kafka-streams/src/main/java/io/confluent/examples/streams/WordCountLambdaExample.java#L178-L181) in my IDE was not and still is not working. Best regards, Mina On Wed, Mar 15, 2017 at 4:43 AM, Michael Noll wrote: > Mina,

Re: Trying to use Kafka Stream

2017-03-15 Thread Michael Noll
Mina, in your original question you wrote: > However, I do not see the word count when I try to run below example. Looks like that it does not connect to Kafka. The WordCount demo example writes its output to Kafka only -- it *does not* write any results to the console/STDOUT. From what I can

Re: [DISCUSS] KIP-120: Cleanup Kafka Streams builder API

2017-03-14 Thread Michael Noll
I see Jay's point, and I agree with much of it -- notably about being careful which concepts we do and do not expose, depending on which user group / user type is affected. That said, I'm not sure yet whether or not we should get rid of "Topology" (or a similar term) in the DSL. For what it's wor

Re: Kafka Streams question

2017-03-14 Thread Michael Noll
application's processing needs. -Michael On Tue, Mar 14, 2017 at 9:00 AM, BYEONG-GI KIM wrote: > Dear Michael Noll, > > I have a question; Is it possible converting JSON format to YAML format via > using Kafka Streams? > > Best Regards > > KIM > > 2017-03-10 11:36 GMT+09

Re: ~20gb of kafka-streams state, unexpected?

2017-03-10 Thread Michael Noll
In addition to what Eno already mentioned here's some quick feedback: - Only for reference, I'd add that 20GB of state is not necessarily "massive" in absolute terms. I have talked to users whose apps manage much more state than that (1-2 orders of magnitude more). Whether or not 20 GB is massiv

Re: Streams - Got bit by log levels of the record cache

2017-03-10 Thread Michael Noll
I think a related JIRA ticket is https://issues.apache.org/jira/browse/KAFKA-4829 (see Guozhang's comment about the ticket's scope). -Michael On Thu, Mar 9, 2017 at 6:22 PM, Damian Guy wrote: > Hi Nicolas, > > Please do file a JIRA. > > Many thanks, > Damian > > On Thu, 9 Mar 2017 at 15:54 Nic

Re: Kafka Streams question

2017-03-09 Thread Michael Noll
There's actually a demo application that demonstrates the simplest use case for Kafka's Streams API: to read data from an input topic and then write that data as-is to an output topic. https://github.com/confluentinc/examples/blob/3.2.x/kafka-streams/src/test/java/io/confluent/examples/streams/Pa

Re: [DISCUSS] KIP-120: Cleanup Kafka Streams builder API

2017-03-09 Thread Michael Noll
Thanks for the update, Matthias. +1 to the points 1,2,3,4 you mentioned. Naming is always a tricky subject, but renaming KStreamBuilder to StreamsTopologyBuilder looks ok to me (I would have had a slight preference towards DslTopologyBuilder, but hey.) The most important aspect is, IMHO, what yo

Re: Can I create user defined stream processing function/api?

2017-03-07 Thread Michael Noll
There's also an end-to-end example for DSL and Processor API integration: https://github.com/confluentinc/examples/blob/3.2.x/kafka-streams/src/test/java/io/confluent/examples/streams/MixAndMatchLambdaIntegrationTest.java Best, Michael On Tue, Mar 7, 2017 at 4:51 PM, LongTian Wang wrote: > Re

Re: Kafka streams DSL advantage

2017-03-06 Thread Michael Noll
The DSL has some unique features that aren't in the Processor API, such as: - KStream and KTable abstractions. - Support for time windows (tumbling windows, hopping windows) and session windows. The Processor API only has stream-time based `punctuate()`. - Record caching, which is slightly better

Re: Writing data from kafka-streams to remote database

2017-03-06 Thread Michael Noll
I'd use option 2 (Kafka Connect). Advantages of #2: - The code is decoupled from the processing code and easier to refactor in the future. (same as #4) - The runtime/uptime/scalability of your Kafka Streams app (processing) is decoupled from the runtime/uptime/scalability of the data ingestion in

Re: Chatty StreamThread commit messages

2017-03-01 Thread Michael Noll
Good point, Steven. +1 here. On Wed, Mar 1, 2017 at 8:52 AM, Damian Guy wrote: > +1 > On Wed, 1 Mar 2017 at 07:15, Guozhang Wang wrote: > > > Hey Steven, > > > > That is a good question, and I think your proposal makes sense. Could you > > file a JIRA for this change to keep track of it? > > >

Re: Kafka Streams vs Spark Streaming

2017-02-28 Thread Michael Noll
size 1 hour and retention of 3 hours. So to conclude, if you can manage rocksdb, then kafka streams is good to start with; it's simple and very intuitive to use. Again on the rocksdb side, is there a way to eliminate that and is disableLogging for that?

Re: Kafka Streams vs Spark Streaming

2017-02-27 Thread Michael Noll
> Also, is it possible to stop the syncing between state stores to brokers, if I am fine with failures? Yes, you can disable the syncing (or the "changelog" feature) of state stores: http://docs.confluent.io/current/streams/developer-guide.html#enable-disable-state-store-changelogs > I do have a

Re: Simple data-driven app design using Kafka

2017-02-23 Thread Michael Noll
Pete, have you looked at Kafka's Streams API yet? There are many examples available in the `kafka-streams` folder at https://github.com/confluentinc/examples. The simplest example of "Do sth to a new data record as soon as it arrives" might be the MapFunctionLambdaExample. You can create differ

Re: KTable send old values API

2017-02-22 Thread Michael Noll
Dmitry, I think your use case is similar to the one I described in the link below (discussion in the kafka-dev mailing list): http://search-hadoop.com/m/uyzND1rVOQ12OJ84U&subj=Re+Streams+TTLCacheStore Could you take a quick look? -Michael On Wed, Feb 22, 2017 at 12:39 AM, Dmitry Minkovsky w

Re: Kafka streams: Getting a state store associated to a processor

2017-02-14 Thread Michael Noll
> By the way - do I understand correctly that when a state store is persistent, it is logged by default? Yes. > So enableLogging(Map) only is a way to provide default configuration to the default logging? Yes. That is, any configs that should be applied to the state store's changelog topic. >

Re: Kafka streams: Getting a state store associated to a processor

2017-02-13 Thread Michael Noll
Adam, also a FYI: The upcoming 0.10.2 version of the Streams API will be backwards compatible with 0.10.1 clusters, so you can keep your brokers on 0.10.1.1 and still use the latest Streams API version (including the one from trunk, as Matthias mentioned). -Michael On Mon, Feb 13, 2017 at 1:04

Re: KIP-121 [Discuss]: Add KStream peek method

2017-02-07 Thread Michael Noll
Many thanks for the KIP and the PR, Steven! My opinion, too, is that we should consider including this. One thing that I would like to see clarified is the difference between the proposed peek() and existing functions map() and foreach(), for instance. My understanding (see also the Java 8 links
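Since the message references the Java 8 analogues, the distinction between the proposed `peek()` and the existing `map()`/`foreach()` can be illustrated with the standard `java.util.stream` API, which has the same three operations (this is plain JDK code, not the Kafka Streams API):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class PeekVsMapVsForEach {
    public static void main(String[] args) {
        List<Integer> observed = new ArrayList<>();

        // peek: side effect only; passes elements through unchanged and
        // returns a stream so the chain can continue (unlike forEach).
        List<Integer> doubled = Stream.of(1, 2, 3)
                .peek(observed::add)                // observe without transforming
                .map(x -> x * 2)                    // map: transforms each element
                .collect(Collectors.toList());

        System.out.println(observed); // [1, 2, 3]
        System.out.println(doubled);  // [2, 4, 6]

        // forEach: terminal operation, returns nothing -- it ends the pipeline.
        doubled.forEach(x -> {});
    }
}
```

That mirrors the KIP's motivation: `KStream#foreach()` terminates the topology branch, while a `peek()` would let you attach side effects (logging, metrics) mid-chain.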

Re: Kafka docs for current trunk

2017-01-31 Thread Michael Noll
Thanks for bringing this up, Matthias. +1 On Wed, Feb 1, 2017 at 8:15 AM, Gwen Shapira wrote: > +1 > > On Tue, Jan 31, 2017 at 5:57 PM, Matthias J. Sax > wrote: > > Hi, > > > > I want to collect feedback about the idea to publish docs for current > > trunk version of Apache Kafka. > > > > Curr

Re: Fwd: [DISCUSS] KIP-114: KTable materialization and improved semantics

2017-01-27 Thread Michael Noll
t;>>>> afterwards >>>>> >>>>>> but we have already decided to materialize it, we can replace the >>>>>>>> >>>>>>> internal >>>>>>> >>>>>>>> name with the user's provided

Re: Streams: Global state & topic multiplication questions

2017-01-20 Thread Michael Noll
As Eno said I'd use the interactive queries API for Q2. Demo apps: - https://github.com/confluentinc/examples/blob/3.1.x/kafka-streams/src/main/java/io/confluent/examples/streams/interactivequeries/kafkamusic/KafkaMusicExample.java - https://github.com/confluentinc/examples/blob/3.1.x/kafka-stream

Re: Kafka Streams: how can I get the name of the Processor when calling `KStream.process`

2017-01-18 Thread Michael Noll
essor was generated by a high-level topology. And names of processors created by `KStreamBuilder` are not accessible (unless by inspecting the topology nodes, I guess). [1] https://gist.github.com/nfo/c4936a24601352db23b18653a8ccc352 Thanks. Nicolas

Re: Kafka Streams: how can I get the name of the Processor when calling `KStream.process`

2017-01-18 Thread Michael Noll
Nicolas, if I understand your question correctly you'd like to add further operations after having called `KStream#process()`, which -- as you report -- doesn't work because `process()` returns void. If that's indeed the case, +1 to Damian's suggest to use `KStream.transform()` instead of `KStrea

Re: Kafka Streams: from a KStream, aggregating records with the same key and updated metrics ?

2017-01-17 Thread Michael Noll
1:00 Nicolas Fouché: > Hi Michael, got it. I understand that it would be less error-prone to generate the final "altered" timestamp on the Producer side, instead of trying to compute it each time the record is consumed. T

Re: Kafka Streams: from a KStream, aggregating records with the same key and updated metrics ?

2017-01-16 Thread Michael Noll
Nicolas, quick feedback on timestamps: > In our system, clients send data to an HTTP API. This API produces the > records in Kafka. I can't rely on the clock of the clients sending the > original data, (so the records' timestamps are set by the servers ingesting > the records in Kafka), but I can

Re: [ANNOUNCE] New committer: Grant Henke

2017-01-16 Thread Michael Noll
My congratulations, Grant -- more work's awaiting you then. ;-) Best wishes, Michael On Fri, Jan 13, 2017 at 2:50 PM, Jeff Holoman wrote: > Well done Grant! Congrats! > > On Thu, Jan 12, 2017 at 1:13 PM, Joel Koshy wrote: > > > Hey Grant - congrats! > > > > On Thu, Jan 12, 2017 at 10:00 AM,

Re: kafka streams consumer partition assignment is uneven

2017-01-09 Thread Michael Noll
What does the processing topology of your Kafka Streams application look like, and what's the exact topic and partition configuration? You say you have 12 partitions in your cluster, presumably across 7 topics -- that means that most topics have just a single partition. Depending on your topology

Re: Kafka Logo as HighRes or Vectorgraphics

2016-12-02 Thread Michael Noll
Jan, Here's vector files for the logo. One of our teammates went ahead and helped resized it to fit nicely into a 2x4m board with 15cm of margin all around. Note: I was told to kindly remind you (and other readers of this) to follow the Apache branding guidelines for the logo, and please not mani

Re: I need some help with the production server architecture

2016-12-01 Thread Michael Noll
+1 to what Dave said. On Thu, Dec 1, 2016 at 4:29 PM, Tauzell, Dave wrote: > For low volume zookeeper doesn't seem to use many resources. I would put > it on nodejs server as that will have less IO and heavy IO could impact > zookeeper. Or, you could put some ZK nodes on nodejs and some on

Re: Interactive Queries

2016-11-28 Thread Michael Noll
There are also some examples/demo applications at https://github.com/confluentinc/examples that demonstrate the use of interactive queries: - https://github.com/confluentinc/examples/blob/3.1.x/kafka-streams/src/main/java/io/confluent/examples/streams/interactivequeries/kafkamusic/KafkaMusicExampl

Re: Data (re)processing with Kafka (new wiki page)

2016-11-25 Thread Michael Noll
Thanks a lot, Matthias! I have already begun to provide feedback. -Michael On Wed, Nov 23, 2016 at 11:41 PM, Matthias J. Sax wrote: > Hi, > > we added a new wiki page that is supposed to collect data (re)processing > scenario with Kafka: > > https://cwiki.apache.org/confluence/display/KAFKA/

Re: KafkaStreams KTable#through not creating changelog topic

2016-11-25 Thread Michael Noll
works perfectly fine with a naming convention for the topics and by creating them in Kafka upfront. My point is that it would help me (and maybe others), if the API of

Re: KafkaStreams KTable#through not creating changelog topic

2016-11-23 Thread Michael Noll
e created by `through()` and `to()` [...] Addendum: And that's because the topic that is created by `KTable#through()` and `KTable#to()` is, by definition, a changelog of that KTable and the associated state store. On Wed, Nov 23, 2016 at 4:15 PM, Michael Noll wrote: > Mikael,

Re: KafkaStreams KTable#through not creating changelog topic

2016-11-23 Thread Michael Noll
Mikael, regarding your second question: > 2) Regarding the use case, the topology looks like this: > > .stream(...) > .aggregate(..., "store-1") > .mapValues(...) > .through(..., "store-2") The last operator above would, without "..." ellipsis, be sth like `KTable#through("through-topic", "store

Re: Kafka Streaming message loss

2016-11-21 Thread Michael Noll
Also: Since your testing is purely local, feel free to share the code you have been using so that we can try to reproduce what you're observing. -Michael On Mon, Nov 21, 2016 at 4:04 PM, Michael Noll wrote: > Please don't take this comment the wrong way, but have you d

Re: Kafka Streaming message loss

2016-11-21 Thread Michael Noll
Please don't take this comment the wrong way, but have you double-checked whether your counting code is working correctly? (I'm not implying this could be the only reason for what you're observing.) -Michael On Fri, Nov 18, 2016 at 4:52 PM, Eno Thereska wrote: > Hi Ryan, > > Perhaps you could

Re: Kafka windowed table not aggregating correctly

2016-11-21 Thread Michael Noll
On Mon, Nov 21, 2016 at 1:06 PM, Sachin Mittal wrote: > I am using kafka_2.10-0.10.0.1. > Say I am having a window of 60 minutes advanced by 15 minutes. > If the stream app using timestamp extractor puts the message in one or more > bucket(s), it will get aggregated in those buckets. > I assume t

Re: Kafka Streams internal topic naming

2016-11-18 Thread Michael Noll
Srikanth On Wed, Nov 16, 2016 at 2:32 PM, Michael Noll wrote: > Srikanth, no, there isn't any API to control the naming of internal topics. Is the reason you'

Re: Kafka Streams internal topic naming

2016-11-16 Thread Michael Noll
Srikanth, no, there isn't any API to control the naming of internal topics. Is the reason you're asking for such functionality only/mostly about multi-tenancy issues (as you mentioned in your first message)? -Michael On Wed, Nov 16, 2016 at 8:20 PM, Srikanth wrote: > Hello, > > Does kafka

Re: Process KTable on Predicate

2016-11-15 Thread Michael Noll
Nick, if I understand you correctly you can already do this today: Think: KTable.toStream().filter().foreach() (or just KTable.filter().foreach(), depending on what you are aiming to do) Would that work for you? On Sun, Nov 13, 2016 at 12:12 AM, Nick DeCoursin wrote: > Feature proposal: > >

Re: Is it possible to resubcribe KafkaStreams in runtime to different set of topics?

2016-11-09 Thread Michael Noll
I am not aware of any short-term plans to support that, but perhaps others in the community / mailing list are. On Wed, Nov 9, 2016 at 11:15 AM, Timur Yusupov wrote: > Are there any nearest plans to support that? > > On Wed, Nov 9, 2016 at 1:11 PM, Michael Noll wrote: > >

Re: Is it possible to resubcribe KafkaStreams in runtime to different set of topics?

2016-11-09 Thread Michael Noll
This is not possible at the moment. However, depending on your use case, you might be able to leverage regex topic subscriptions (think: "b*" to read from all topics starting with letter `b`). On Wed, Nov 9, 2016 at 10:56 AM, Timur Yusupov wrote: > Hello, > > In our system it is possible to add
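The `"b*"` in the message is shorthand; Kafka's pattern subscriptions use `java.util.regex`, where "all topics starting with the letter b" would be written `b.*`. A pure-Java sketch of which topic names such a subscription would match (the topic names are hypothetical):

```java
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

public class RegexSubscriptionSketch {
    public static void main(String[] args) {
        // java.util.regex syntax: "b.*" matches any name starting with 'b'.
        Pattern subscription = Pattern.compile("b.*");
        List<String> topics = List.of("billing", "orders", "backups", "audit");

        List<String> matched = topics.stream()
                .filter(t -> subscription.matcher(t).matches())
                .collect(Collectors.toList());

        System.out.println(matched); // [billing, backups]
    }
}
```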

Re: Kafka Streaming

2016-10-20 Thread Michael Noll
I suspect you are running Kafka 0.10.0.x on Windows? If so, this is a known issue that is fixed in Kafka 0.10.1 that was just released today. Also: which examples are you referring to? And, to confirm: which git branch / Kafka version / OS in case my guess above was wrong. On Thursday, October

Re: Dismissing late messages in Kafka Streams

2016-10-20 Thread Michael Noll
Nicolas, > I set the maintain duration of the window to 30 days. > If it consumes a message older than 30 days, then a new aggregate is created for this old window. I assume you mean: If a message should have been included in the original ("old") window but that message happens to arrive late (a
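The "too late" condition being discussed reduces to a simple comparison: a record's window has been expired once its timestamp falls more than the retention period behind the current stream-time. A minimal sketch of that check (method and class names are illustrative, not the Kafka API):

```java
public class LateRecordCheck {
    // A record is "too late" when its window has already been expired:
    // its timestamp is more than the retention period behind stream-time.
    static boolean isExpired(long recordTs, long streamTime, long retentionMs) {
        return recordTs < streamTime - retentionMs;
    }

    public static void main(String[] args) {
        long retention = 30L * 24 * 60 * 60 * 1000; // 30 days in ms, as in the thread
        long streamTime = 100L * 24 * 60 * 60 * 1000;
        System.out.println(isExpired(streamTime - retention - 1, streamTime, retention)); // true
        System.out.println(isExpired(streamTime - 1000, streamTime, retention));          // false
    }
}
```

A custom processor could apply this check to drop such records explicitly, instead of letting them open a fresh aggregate for an already-expired window.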

Re: How to block tests of Kafka Streams until messages processed?

2016-10-20 Thread Michael Noll
dvantage to using the kafka connect method? Seems like > it'd just add an extra step of overhead? > > On Thu, Oct 20, 2016 at 12:35 PM, Michael Noll > wrote: > > > Ali, > > > > my main feedback is similar to what Eno and Dave have already said. In > >

Re: How to block tests of Kafka Streams until messages processed?

2016-10-20 Thread Michael Noll
Ali, my main feedback is similar to what Eno and Dave have already said. In your situation, options like these are what you'd currently need to do since you are writing directly from your Kafka Streams app to Cassandra, rather than writing from your app to Kafka and then using Kafka Connect to ing

Re: kafka streams metadata request fails on 0.10.0.1 broker/topic from 0.10.1.0 client

2016-10-20 Thread Michael Noll
t; > Just my 2 cents > > Decoupling the kafka streams from the core kafka changes will help so that > the broker can be upgraded without notice and streaming apps can evolve to > newer streaming features on their own pace > > Regards > Sai > > > On Wednesday, October 19, 2

Re: kafka streams metadata request fails on 0.10.0.1 broker/topic from 0.10.1.0 client

2016-10-19 Thread Michael Noll
Apps built with Kafka Streams 0.10.1 only work against Kafka clusters running 0.10.1+. This explains your error message above. Unfortunately, Kafka's current upgrade story means you need to upgrade your cluster in this situation. Moving forward, we're planning to improve the upgrade/compatibilit

Re: How to detect an old consumer

2016-10-17 Thread Michael Noll
Old consumers use ZK to store their offsets. Could you leverage the timestamps of the corresponding znodes [1] for this? [1] https://zookeeper.apache.org/doc/r3.4.5/zookeeperProgrammers.html#sc_zkDataModel_znodes On Mon, Oct 17, 2016 at 4:45 PM, Fernando Bugni wrote: > Hello, > > I want to de

Re: Understanding out of order message processing w/ Streaming

2016-10-13 Thread Michael Noll
> But if they arrive out of order, I have to detect / process that myself in > the processor logic. Yes -- if your processing logic depends on the specific ordering of messages (which is the case for you), then you must manually implement this ordering-specific logic at the moment. Other use case

Re: KafkaStream Merging two topics is not working fro custom datatypes

2016-10-12 Thread Michael Noll
t; > >> } > >> > >> > >> *Serilizer/Deserializer* > >> > >> > >> > >> public class KafkaPayloadSerializer implements Serializer, > >> Deserializer { > >> > >> private static final Logger log = org.apache.logging.log4j.LogManager > >

Re: Support for Kafka

2016-10-11 Thread Michael Noll
Actually, I wanted to include the following link for the JVM docs (the information matches what's written in the earlier link I shared): http://kafka.apache.org/documentation#java On Tue, Oct 11, 2016 at 11:21 AM, Michael Noll wrote: > Regarding the JVM, we recommend running the latest

Re: Support for Kafka

2016-10-11 Thread Michael Noll
Regarding the JVM, we recommend running the latest version of JDK 1.8 with the G1 garbage collector: http://docs.confluent.io/current/kafka/deployment.html#jvm And yes, Kafka does run on Ubuntu 16.04, too. (Confluent provides .deb packages [1] for Apache Kafka if you are looking for these to inst

Re: puncutuate() never called

2016-10-11 Thread Michael Noll
re empty > topics. Punctuate will never be called. > > -David ” > > On 10/10/16, 1:55 AM, "Michael Noll" wrote: > > > We have run the application (and have confirmed data is being > received) > for over 30 mins…with a 60-second timer. > >

Re: KafkaStream Merging two topics is not working fro custom datatypes

2016-10-11 Thread Michael Noll
When I wrote: "If you haven't changed to default key and value serdes, then `to()` will fail because [...]" it should have read: "If you haven't changed the default key and value serdes, then `to()` will fail because [...]" On Tue, Oct 11, 2016 at

Re: KafkaStream Merging two topics is not working fro custom datatypes

2016-10-11 Thread Michael Noll
Ratha, if you based your problematic code on the PipeDemo example, then you should have these two lines in your code (which most probably you haven't changed): props.put(StreamsConfig.KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass()); props.put(StreamsConfig.VALUE_SERDE_CLASS_CONFIG, Se
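For reference, here is a self-contained sketch of that configuration using the underlying property names rather than the `StreamsConfig` constants (`key.serde` and `value.serde` are, to my knowledge, the 0.10.x names behind those constants — later versions renamed them; the application id and bootstrap servers are placeholders):

```java
import java.util.Properties;

public class StreamsSerdeConfig {

    // Build a minimal Streams configuration with String serdes as the defaults.
    // These defaults are what to()/through() fall back to when no explicit
    // serdes are passed per call.
    static Properties baseConfig() {
        Properties props = new Properties();
        props.put("application.id", "my-streams-app");     // placeholder
        props.put("bootstrap.servers", "localhost:9092");  // placeholder
        // 0.10.x property names behind StreamsConfig.KEY_SERDE_CLASS_CONFIG
        // and StreamsConfig.VALUE_SERDE_CLASS_CONFIG (assumed here).
        props.put("key.serde", "org.apache.kafka.common.serialization.Serdes$StringSerde");
        props.put("value.serde", "org.apache.kafka.common.serialization.Serdes$StringSerde");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(baseConfig().getProperty("key.serde"));
    }
}
```

If your records are not plain strings, the defaults must be overridden, either globally here or per operation, with serdes that match your actual key and value types; otherwise calls like `to()` fail at runtime with serialization errors.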

Re: sFlow/NetFlow/Pcap Plugin for Kafka Producer

2016-10-10 Thread Michael Noll
kafka ecosystem page. > > https://cwiki.apache.org/confluence/display/KAFKA/Ecosystem > > > Best Regards, > Aris. > > > On Mon, Oct 10, 2016 at 6:55 PM, Michael Noll > wrote: > > > Aris, > > > > even today you can already use Kafka to deliver Netflow/Pcap/et
