[
https://issues.apache.org/jira/browse/KAFKA-3062?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15082004#comment-15082004
]
xingang edited comment on KAFKA-3062 at 1/4/16 11:30 PM:
---------------------------------------------------------
Yes! Example:
A high volume of data is produced to more than 60 partitions, and 15 consumers
work on this data.
10 of them are latency-sensitive and do near-real-time processing. It is better
for them to read the data from the page cache; they can sometimes even tolerate
a little data loss, since they display results in real time.
5 of them generate reports from the data. It is OK for them to run as hourly or
even daily jobs, and they do not need to show results quickly.
Now consider what happens if the 5 report consumers fall behind: they read from
disk and fill the page cache with historical data, because they consume that
history N times faster than the producing rate. As a result, the 10
latency-sensitive consumers suffer: once they lag even briefly, they always hit
page cache misses.
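To make the scenario concrete, here is a rough toy simulation. It is only a sketch under stated assumptions, not a model of Kafka or of the real kernel page cache: the cache is a plain LRU over "pages", one page is produced per tick, and all names and numbers (LruPageCache, realtime_hit_rate, 100 cache pages, a lag of 20 ticks) are made up for illustration.

```python
from collections import OrderedDict


class LruPageCache:
    """Toy LRU cache standing in for the OS page cache (an assumption)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()

    def touch(self, page):
        """Access a page; return True on a hit, False if it had to be loaded."""
        hit = page in self.pages
        if hit:
            self.pages.move_to_end(page)  # mark as most recently used
        else:
            self.pages[page] = True
            if len(self.pages) > self.capacity:
                self.pages.popitem(last=False)  # evict least recently used
        return hit


def realtime_hit_rate(catchup_speed, ticks=500, history=10_000,
                      cache_pages=100, realtime_lag=20):
    """Page cache hit rate seen by one slightly lagging real-time consumer.

    catchup_speed is the number of historical pages a lagging report
    consumer reads per tick (the "N times faster" in the comment above).
    """
    cache = LruPageCache(cache_pages)
    catchup_offset = 0  # the report consumer starts at the oldest page
    hits = reads = 0
    for t in range(ticks):
        head = history + t
        cache.touch(head)  # producer appends one page; it lands in the cache
        for _ in range(catchup_speed):
            if catchup_offset < head:
                cache.touch(catchup_offset)  # cold read pulled from disk
                catchup_offset += 1
        if t >= realtime_lag:
            reads += 1
            hits += cache.touch(head - realtime_lag)
    return hits / reads


if __name__ == "__main__":
    print("no catch-up reader:  ", realtime_hit_rate(catchup_speed=0))
    print("catch-up at 10x rate:", realtime_hit_rate(catchup_speed=10))
```

With these made-up numbers, the real-time consumer hits the cache on every read when nobody is catching up, and misses on every read while a catch-up reader runs at 10 pages per tick, because the historical pages push the recent ones out before they are read.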
Thanks for your quick response!
> Read from kafka replication to get data likely Version based
> ------------------------------------------------------------
>
> Key: KAFKA-3062
> URL: https://issues.apache.org/jira/browse/KAFKA-3062
> Project: Kafka
> Issue Type: Improvement
> Reporter: xingang
>
> Kafka currently requires all reads to go through the leader for consistency.
> If reads could also be served from replicas, then for data with many
> consumers, those that are not latency-sensitive but are sensitive to data loss
> could fetch from a replica. In that case they would not pollute the page cache
> for the other, latency-sensitive consumers.