That really was a helpful overview, Israel. Might make a good blog post! 😀

Ola, choosing C# would rule out Kafka Streams, which is a Java library, but 
you may not need it.  The Kafka Consumer API, which is available in C#, might 
be enough for you. 
 

For a good explanation of topics, partitions, and pretty much everything else 
Israel mentioned, I would suggest you go to http://developer.confluent.io 

There you’ll find free video courses, quick-starts, tutorials, and more.  
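
On the question of partitions and a changing number of devices: as far as I 
understand it, you would not create one partition per device. Each event 
carries a key (a device ID, say), and the producer hashes that key onto one of 
a fixed number of partitions, so events from the same device land on the same 
partition. Below is a minimal Java sketch of that idea using only the standard 
library. The class name, the partitionFor and assign helpers, the device IDs, 
and the partition count are all made up for illustration, and 
String.hashCode() is a stand-in: Kafka's default partitioner uses a murmur2 
hash of the serialized key, so real partition numbers will differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class DevicePartitionSketch {

    // Hypothetical stand-in for a partitioner: hash the key, then reduce it
    // modulo the partition count. Kafka's default partitioner hashes the
    // serialized key with murmur2 rather than String.hashCode().
    static int partitionFor(String deviceId, int numPartitions) {
        return Math.floorMod(deviceId.hashCode(), numPartitions);
    }

    // Group device IDs by the partition they map to.
    static Map<Integer, List<String>> assign(List<String> deviceIds, int numPartitions) {
        Map<Integer, List<String>> byPartition = new TreeMap<>();
        for (String id : deviceIds) {
            int partition = partitionFor(id, numPartitions);
            byPartition.computeIfAbsent(partition, p -> new ArrayList<>()).add(id);
        }
        return byPartition;
    }

    public static void main(String[] args) {
        // Assumed value: sized for parallelism, not for the number of devices.
        int numPartitions = 6;

        // Made-up device IDs; the list can grow without touching the topic.
        List<String> devices = new ArrayList<>();
        for (int i = 1; i <= 20; i++) {
            devices.add("device-" + i);
        }

        assign(devices, numPartitions).forEach((partition, ids) ->
                System.out.println("partition " + partition + " <- " + ids));
    }
}
```

With this kind of scheme, a new device simply hashes onto one of the existing 
partitions, so you do not need to know the device count up front. One caveat: 
if you change the partition count later, a given key may map to a different 
partition than before.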

Sounds like you are at the beginning of an exciting journey!  Enjoy!

Dave


> On Dec 30, 2021, at 8:29 AM, Ola Bissani <ola.biss...@easysoft.com.lb> wrote:
> 
> Dear Israel,
> 
> Thank you so much for your support, I will check the links you sent in your 
> email to start my service. 
> 
> As for your question, yes, the events generated by the devices are similar 
> in data structure. I would also like to state that my service will be 
> written in either Java or C#. Would using C# be an issue? Also, is there a 
> link you recommend I check before writing my code?
> 
> I also have one more question. In your mail you mentioned using one topic 
> with many partitions. The number of devices I'm using is dynamic. Are you 
> suggesting I create a partition for each device, and would that be possible 
> if I don't know the exact number of devices I have? Or should I create 
> multiple partitions for the purpose of multi-processing only?
> 
> Thank you,
> 
> Best Regards
> Ola Bissani
> Developer Manager
> Easysoft
> Mobile Lebanon: +961 3 61 16 90
> Office Lebanon: +961 1 33 55 15/17
> E mail: ola.biss...@easysoft.com.lb
> web site: www.easysoft.com.lb
> "Tailored to Perfection"
> 
> 
> The information transmitted is intended only for the person or entity to 
> which it is addressed and it may contain proprietary, business-confidential, 
> and/or legally privileged information. If you are not the intended recipient 
> of this email you are hereby notified that any use, review, retransmission, 
> dissemination, distribution, reproduction or any other action taken in 
> reliance upon this email is strictly prohibited. If you have received this 
> email in error, please contact the sender and delete this email and its 
> contents from any computer. Any views expressed in this email are those of 
> the individual sender and may not necessarily reflect the views of the 
> company.
> 
> Please consider the environment before printing this email.
> 
> -----Original Message-----
> From: Israel Ekpo <israele...@gmail.com> 
> Sent: Thursday, December 30, 2021 3:47 PM
> To: Users <users@kafka.apache.org>
> Subject: Re: Kafka-Real Time Update
> 
> Ola,
> 
> Let's review the Apache Kafka ecosystem briefly, and then I will make an 
> attempt to address your concerns:
> 
> In the Kafka Ecosystem, we have the following components:
> 
> - Brokers (store events in logical containers called Topics. Topics are 
> analogous to Tables in relational databases like MySQL or PostgreSQL)
> - Producers (generate events and send them to the brokers for storage)
> - Consumers (pick up the events from the Topics and process or consume
> them)
> - Streams (at a high level, combines the Consumer and Producer mechanisms to 
> process events in near real time and send them back to the Topics)
> - Schema Registry (keeps track of data structures in the topics. Can be used 
> for Avro, JSON, and Protobuf formats)
> 
> https://kafka.apache.org/documentation/#api
> 
> https://github.com/confluentinc/schema-registry
> 
> There are two main things to consider here in your scenario.
> 
> Each of the devices is a prospective Producer of events that will be sent to 
> the topic.
> 
> You don't necessarily need to dedicate a topic to each producer, just as 
> you would not create a table for each customer record that you need to 
> store.
> Events sent to a topic are generally grouped together because they have a 
> similar data structure, so if your devices are generating messages with the 
> same data structure, then regardless of the number of devices, you should 
> still be able to send them to the same topic. Just make sure that you have 
> enough partitions and you should be able to consume them in parallel. The 
> partition count is important because the maximum number of consumers within a 
> group of Consumers is limited by default by the number of partitions in the 
> topic. If you are looking to have up to, let's say, 50 parallel processors in 
> your Consumer Group, then you need to specify at least 50 partitions when 
> creating the topic.
> 
> Nevertheless, you can mitigate this partition limitation by using the 
> parallel consumer by Confluent to process your events with key-based 
> ordering.
> 
> https://github.com/confluentinc/parallel-consumer
> 
> Key-based ordering essentially eliminates this limitation:
> https://github.com/confluentinc/parallel-consumer#ordered-by-key
> 
> The second item of consideration is that you wanted to "loop" to process the 
> events. I don't think you need to do this. You can consider the Streams API 
> to process your events as they arrive, without needing to loop yourself.
> 
> https://kafka.apache.org/30/documentation/streams/
> 
> The Streams API has many built-in mechanisms that allow you to just focus 
> on how to process, join, and aggregate your events as they arrive at the 
> topics, without the need to loop.
> 
> I definitely would not recommend having a topic (table) for each device.
> Find a way to group events with similar data structures into a particular 
> topic; then you can use the Consumer API or Streams API to process the events 
> in near real time.
> 
> If you are not really comfortable with writing Java code for the stream 
> processing, you can also take a look at ksqlDB, which allows you to use 
> SQL-like syntax to process streams arriving in Kafka Brokers.
> 
> https://ksqldb.io/
> 
> These systems are capable of handling a very large number of events per 
> second at scale, so I have no doubt that you will be able to figure out how 
> to implement an architecture that meets your needs.
> 
> When you have a moment, could you confirm whether the events generated by 
> the devices are similar in data structure?
> 
> I hope this message gives you enough information to get started.
> 
> Sincerely,
> 
> Israel Ekpo
> Lead Instructor, IzzyAcademy.com
> https://izzyacademy.com/
> https://www.youtube.com/c/izzyacademy
> 
> 
>> On Thu, Dec 30, 2021 at 5:13 AM Ola Bissani <ola.biss...@easysoft.com.lb>
>> wrote:
>> 
>> Dears,
>> 
>> I'm looking for a way to get real-time updates using my service. I 
>> believe Kafka is the way to go, but I am still unsure how to use it.
>> 
>> My system gets data from devices using GPRS. I then read this data and 
>> analyze it to check what action I should take afterwards. I need the 
>> analysis step to be as fast as possible. I was thinking of two options:
>> 
>> The first option is to gather all the data sent from all the devices 
>> into one huge topic and then read all the data from this topic and 
>> analyze it. The downside of this option is that the data analysis 
>> step delays my work, since I have to loop through the topic data. 
>> The advantage is that I have a manageable number of topics 
>> (only 1 topic).
>> 
>> The other option is to divide the data I'm gathering into several 
>> small topics by giving each device its own topic. Take into 
>> consideration that the number of devices is large: I'm talking about 
>> more than 5000 devices. The downside of this option is that I have 
>> thousands of topics; the advantage is that each topic will have 
>> a manageable amount of data, allowing me to get my analysis done in 
>> a much more reasonable time.
>> 
>> Can you advise on which option is better and whether there is a third 
>> option that I'm not considering?
>> 
>> *Best Regards*
>> 
>> *Ola Bissani*
>> 
>> 
> 
