groupBy without auto-repartition topics for Kafka Streams

2017-03-01 Thread Tianji Li
Hi there, I wonder if it makes sense to give the option to disable auto-repartitioning when doing groupBy. I understand that with https://issues.apache.org/jira/browse/KAFKA-3561, an internal repartition topic is automatically created and synced to brokers, which is useful when aggregation k

Re: groupBy without auto-repartition topics for Kafka Streams

2017-03-02 Thread Tianji Li
eam.groupBy(...) then we see it as a key changing operation, hence we need to repartition the data.
> >>
> >> On Wed, 1 Mar 2017 at 18:59 Tianji Li wrote:
> >>
> >>> Hi there,
> >>>
> >>> I wonder if it makes sense to

Re: [DISCUSS] KIP-138: Change punctuate semantics

2017-04-05 Thread Tianji Li
Hi Jay,

The hybrid solution is exactly what I expect and need for our use cases when dealing with telecom data.

Thanks
Tianji

On Wed, Apr 5, 2017 at 12:01 AM, Jay Kreps wrote:
> Hey guys,
>
> One thing I've always found super important for this kind of design work is to do a really good job

Re: [DISCUSS] KIP-138: Change punctuate semantics

2017-04-07 Thread Tianji Li
> > > absence of events. Otherwise the event counts for unexpired windows would be 0 which is bad.
> > >
> > > "Maybe a hybrid solution works: I window by event time but trigger results by system time for windows that have upda

Re: Kafka Connect / Access to OffsetStorageReader from SourceConnector

2017-02-22 Thread Tianji Li
Hi Florian, Just curious, what 'shared storage' do you guys use to keep the files before they are ingested into Kafka? In our case, we could not find a nice distributed, shared file system that is NOT HDFS-like and runs before Kafka. So we use individual hard disks on connector machines and keep of

[jira] [Commented] (KAFKA-4371) Sporadic ConnectException shuts down the whole connect process

2016-11-03 Thread Tianji Li (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-4371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15632506#comment-15632506 ] Tianji Li commented on KAFKA-4371: -- This seems like a bug in the https://github

[jira] [Comment Edited] (KAFKA-4371) Sporadic ConnectException shuts down the whole connect process

2016-11-03 Thread Tianji Li (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-4371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15632506#comment-15632506 ] Tianji Li edited comment on KAFKA-4371 at 11/3/16 11:5

[jira] [Commented] (KAFKA-4400) Prefix for sink task consumer groups should be configurable

2016-11-16 Thread Tianji Li (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-4400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15671457#comment-15671457 ] Tianji Li commented on KAFKA-4400: -- [~ewencp] We got bugged by this very issue a