Hi there,
I have a question about how to realize 'back-pressure' for windowed
streams.
Say I have a pipeline that consumes from a topic on a windowed basis, then
does some processing (whenever punctuate is called), and produces into
another topic.
If the incoming rate from all consumers is 10M/s
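For reference, a minimal sketch of the kind of pipeline described above, written against the 0.10.x Processor API (this thread is on 0.10.2); the topic wiring, the 60-second punctuate interval, and the buffering logic are illustrative assumptions, not details from the thread:

import org.apache.kafka.streams.processor.Processor;
import org.apache.kafka.streams.processor.ProcessorContext;

import java.util.ArrayList;
import java.util.List;

// Buffers records as they arrive and only emits downstream when punctuate()
// fires, i.e. the windowed/batched processing described above.
public class WindowedForwardingProcessor implements Processor<String, String> {

    private ProcessorContext context;
    private final List<String> buffer = new ArrayList<>();

    @Override
    public void init(ProcessorContext context) {
        this.context = context;
        context.schedule(60_000L);   // punctuate() roughly every 60s (stream time in 0.10.x)
    }

    @Override
    public void process(String key, String value) {
        buffer.add(value);           // accumulate until the next punctuate()
    }

    @Override
    public void punctuate(long timestamp) {
        // Per-window processing happens here; results go to the downstream sink topic.
        context.forward("window-" + timestamp, String.join(",", buffer));
        buffer.clear();
        context.commit();
    }

    @Override
    public void close() {}
}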
>
> One caveat is that RocksDb uses off-heap memory, but that can also be
> tuned.
>
> Thanks
> Eno
>
> > On 6 Apr 2017, at 14:23, Tianji Li wrote:
> >
> > If so, how to?
> >
> > Thanks
> > Tianji
>
>
If so, how to?
Thanks
Tianji
215 <https://github.com/apache/kafka/blob/trunk/streams/src/main/java/org/apache/kafka/streams/processor/internals/StreamTask.java#L215>
>
> Cheers
> Eno
> > On 1 Apr 2017, at 17:14, Tianji Li wrote:
> >
> > Hi Eno,
> >
> > Could you point to
Eno
>
> > On Apr 1, 2017, at 3:26 PM, Tianji Li wrote:
> >
> > Hi there,
> >
> > Say a processor is consuming topic A and producing into topic B, and
> > somehow the processing takes a long time; is it possible to pause
> > consuming from topic A,
Hi there,
Say a processor is consuming topic A and producing into topic B, and
somehow the processing takes a long time; is it possible to pause
consuming from topic A, and later on resume?
Or does it make sense to do so? If not, what are the options to resolve
this issue?
Thanks
Tianji
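For illustration, the plain consumer API does expose pause() and resume(), so one option outside the Streams DSL is to stop fetching from topic A while a slow worker catches up. A rough sketch, where the topic name, group id, queue thresholds, and the slow-worker setup are all assumptions for the example:

import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

import java.util.Collections;
import java.util.Properties;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Reads from topic A, hands records to a (possibly slow) worker queue, and
// pauses the assigned partitions while the queue is backed up, so poll() keeps
// the group membership alive without fetching more data.
public class PausableConsumerLoop {

    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");   // illustrative
        props.put("group.id", "slow-processor");             // illustrative
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        // A separate worker thread (not shown) drains this queue and produces to topic B.
        BlockingQueue<ConsumerRecord<String, String>> work = new LinkedBlockingQueue<>();
        boolean paused = false;

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("topic-A"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(100);
                for (ConsumerRecord<String, String> r : records) {
                    work.add(r);
                }
                if (!paused && work.size() > 10_000) {
                    consumer.pause(consumer.assignment());   // stop fetching, keep polling
                    paused = true;
                } else if (paused && work.size() < 1_000) {
                    consumer.resume(consumer.assignment());  // backlog drained, fetch again
                    paused = false;
                }
            }
        }
    }
}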
evicted because max.poll.interval exceeds the set limit.
> > >
> > > Try running RocksDB in memory:
> > > https://github.com/facebook/rocksdb/wiki/How-to-persist-in-memory-RocksDB-database%3F
> > >
> > > Thanks
> >
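A related option inside Kafka Streams itself (distinct from the RocksDB in-memory tuning in that wiki link) is to declare the state store as in-memory with the 0.10.x Stores builder, which avoids RocksDB entirely; the store name and serdes below are illustrative:

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.processor.StateStoreSupplier;
import org.apache.kafka.streams.state.Stores;

// An in-memory key-value store: no RocksDB files on disk; state is still
// rebuilt from its changelog topic on restart (changelogs are on by default).
StateStoreSupplier countStore = Stores.create("counts")
        .withKeys(Serdes.String())
        .withValues(Serdes.Long())
        .inMemory()
        .build();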
to increase the
> max.poll.interval.ms config value.
> If that doesn't work, could you revert to using one thread for now? Also
> let us know either way, since we might need to open a bug report.
>
> Thanks
> Eno
>
> > On 16 Mar 2017, at 20:47, Tianji Li wrote:
> >
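For completeness, in a Streams application that consumer setting can be raised straight from the streams properties; a small sketch where the application id, bootstrap servers, and the 10-minute value are only illustrative:

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.streams.StreamsConfig;

import java.util.Properties;

Properties props = new Properties();
props.put(StreamsConfig.APPLICATION_ID_CONFIG, "my-streams-app");      // illustrative
props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");   // illustrative
// Give long processing/punctuate cycles more headroom before the consumer
// is evicted from the group for not calling poll() in time.
props.put(ConsumerConfig.MAX_POLL_INTERVAL_MS_CONFIG, 600_000);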
Hi there,
I keep getting these crashes and wonder if anyone knows why. Please let me
know what information I should provide to help with troubleshooting.
I am using 0.10.2.0. My application is reading one topic and then doing
groupBy().aggregate() 50 times on different keys.
I use a memory store, without
In summary, I'd recommend you use RocksDB as is, since 7 vs 5 is a
> reasonable difference.
>
> However, the real performance numbers will come when you actually enable
> logging, right? You might want RocksDB to be backed up to Kafka for fault
> tolerance.
>
> Finally, make sure to use 0.10.2, the l
k that
> has largely reduced the rebalance latency, so I'd recommend trying a
> build from Kafka trunk for testing if possible.
>
>
> Guozhang
>
> On Wed, Mar 15, 2017 at 10:46 AM, Tianji Li wrote:
>
> > It seems independent of the RocksDB sizes. It also took 5
Stores.create(storeName)
> >     .withKeys(stringSerde)
> >     .withValues(avroSerde)
> >     .persistent()
> >     .disableLogging()
> >     .build();
>
>
> Thanks
> Eno
>
>
> > On 15 Mar 2017, at 13:02, Tianji Li wrote:
> >
>
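As a side note on the store definition quoted above: the earlier point about backing RocksDB to Kafka for fault tolerance corresponds to keeping the changelog enabled rather than calling disableLogging(). A sketch using the same 0.10.x builder and the same variable names as in the snippet; the extra topic config map is purely illustrative:

import org.apache.kafka.streams.processor.StateStoreSupplier;
import org.apache.kafka.streams.state.Stores;

import java.util.Collections;

// Same store as quoted above, but with the changelog topic enabled so the
// RocksDB contents can be restored from Kafka after a failure.
StateStoreSupplier store = Stores.create(storeName)
        .withKeys(stringSerde)
        .withValues(avroSerde)
        .persistent()
        .enableLogging(Collections.singletonMap("cleanup.policy", "compact"))
        .build();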
Maybe partitioning the
> source topic, increasing the number of threads/instances processing the
> source, and reducing the time window of aggregation can help in reducing
> the startup time.
>
>
>
> On Wed, Mar 15, 2017 at 6:36 PM, Tianji Li wrote:
>
> > Hi there,
> >
Hi there,
In the experiments I am doing now, if I restart the streams application, I
have to wait for around 5 minutes for some reason.
I can see something in the Kafka logs:
[2017-03-15 08:36:18,118] INFO [GroupCoordinator 0]: Preparing to
restabilize group xxx-test25 with old generation 2
(kaf
Hi there,
It seems that the RocksDB state store is quite slow in my case and I wonder
if I did anything wrong.
I have a topic that I groupBy() and then aggregate() 50 times. That is, I
will create 50 result topics and a lot more changelog and repartition
topics.
There are a few things that are
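To make the setup concrete, each of those aggregations in the 0.10.2 DSL would look roughly like the following; the topic names, serdes, key selector, and counting logic are all placeholders, not details from the thread:

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.KStreamBuilder;
import org.apache.kafka.streams.kstream.KTable;

// One of the ~50 aggregations: re-keying with groupBy() creates a repartition
// topic, aggregate() creates a state store plus its changelog topic, and
// to() writes the result topic.
KStreamBuilder builder = new KStreamBuilder();
KStream<String, String> input =
        builder.stream(Serdes.String(), Serdes.String(), "input-topic");

KTable<String, Long> agg1 = input
        .groupBy((key, value) -> value, Serdes.String(), Serdes.String())  // key selector is a placeholder
        .aggregate(
                () -> 0L,                            // initializer
                (key, value, total) -> total + 1L,   // aggregator
                Serdes.Long(),
                "agg-store-1");                      // state store / changelog name

agg1.to(Serdes.String(), Serdes.Long(), "result-topic-1");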
Hi Guys,
Thanks very much for your help.
A final question: is it possible to use different commit intervals for
state-store changelog topics and for sink topics?
Thanks
Tianji
, are you doing 100
> > different aggregations against the record key? Then you should have a single
> > data object for those 100 values; anyway, it sounds like we have a similar
> > problem.
> >
> > -Kohki
> >
>
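Reading Kohki's suggestion literally, the "single data object" approach puts all the per-key values into one value class and runs a single aggregate() over it, so there is one state store and one changelog instead of one per metric. A rough sketch; the Metrics class, its update logic, and the metricsSerde mentioned in the comment are invented for illustration:

import java.util.HashMap;
import java.util.Map;

// Hypothetical composite aggregate: all per-key metrics live in one object,
// so one aggregation covers what would otherwise be dozens of separate ones.
public class Metrics {
    private final Map<String, Long> counters = new HashMap<>();

    public Metrics update(String record) {
        // Illustrative update logic; real code would compute each needed metric here.
        counters.merge("total", 1L, Long::sum);
        counters.merge("length:" + record.length(), 1L, Long::sum);
        return this;
    }
}

// Given a KGroupedStream<String, String> named 'grouped' and a Serde for
// Metrics, a single aggregation replaces the many parallel ones:
//   KTable<String, Metrics> table = grouped.aggregate(
//           Metrics::new,
//           (key, value, metrics) -> metrics.update(value),
//           metricsSerde,
//           "metrics-store");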
a lot of tiny
> > information and I have a big concern about the size of the state store
> > topic, so my decision is that I'm going with my own handling of the
> > Kafka API ..
> >
> > If you do stateless operations and don't have a Spark cluster, yea
Hi there,
Can anyone give a good explanation of the cases in which Kafka Streams is
preferred, and the cases in which Spark Streaming is better?
Thanks
Tianji
Just curious...
Thanks
Tianji