[ 
https://issues.apache.org/jira/browse/KAFKA-6953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax resolved KAFKA-6953.
------------------------------------
    Resolution: Abandoned

> [Streams] Schedulable KTable as Graph source (for minimizing aggregation 
> pressure)
> ----------------------------------------------------------------------------------
>
>                 Key: KAFKA-6953
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6953
>             Project: Kafka
>          Issue Type: New Feature
>          Components: streams
>            Reporter: Flavio Stutz
>            Priority: Major
>
> === PROBLEM ===
> We have faced the following scenario/problem in a lot of situations with 
> KStreams:
>    - Huge incoming data being processed by numerous application instances
>    - Need to aggregate, or count the overall data as a single value 
> (something like "count the total number of messages that has been processed 
> among all distributed instances")
>    - The challenge here is to manage this kind of situation without any 
> bottlenecks. We don't need the overall aggregation of all instances states at 
> each processed message, so it is possible to store the partial aggregations 
> on local stores and, at time to time, query those states and aggregate them, 
> avoiding bottlenecks.
> Some ways we KNOW it wouldn't work because of bottlenecks:
>     - Sink all instances local counter/aggregation result to a Topic with a 
> single partition so that we could have another Graph with a single instance 
> that could aggregate all results
>          - In this case, if I had 500 instances processing 1000/s each (with 
> no bottlenecks), I would have a single partition topic with 500k messages/s 
> for my single aggregating instance to process that much messages (IMPOSSIBLE 
> bottleneck)
> === TRIALS ===
> These are some ways we managed to do this:
>    - Expose a REST endpoint so that Prometheus could extract local metrics of 
> each application instance's state stores and them calculate the total count 
> on Prometheus using queries
>          - we don't like this much because we believe KStreams was meant to 
> INPUT and OUTPUT data using Kafka Topics for simplicity and power
>    - Create a scheduled Punctuate at the end of the Graph so that we can 
> query (using getAllMetadata) all other instances's state store counters, sum 
> them all and them publish to another Kafka Topic from time to time.
>           - For this to work we created a way so that only one application 
> instance's Punctuate algorithm would perform the calculations (something like 
> a master election through instance ids and metadata)
> === PROPOSAL ===
> Create a new DSL Source with the following characteristics:
>    - Source parameters: "scheduled time" (using cron's like config), "state 
> store name", bool "from all application instances"
>    - Behavior: At the desired time, query all K,V tuples from the state store 
> and source those messages to the Graph
>           - If "from all application instances" is true, query the tuples 
> from all application instances state stores and source them all, concatenated
>    - This is a way to create a "timed aggregation barrier" to avoid 
> bottlenecks. With this we could enhance the ability of KStreams to better 
> handle the CAP Theorem characteristics, so that one could choose to have 
> Consistency over Availability.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to