Paolo,

Hi, that sounds like a perfect use case for Kafka, probably in conjunction with 
Kafka Connect and maybe Kafka Streams too.

I've built some demo Kafka IoT applications over the last 6 years; you may get 
some ideas from them.

Summary of blogs here: 
https://www.linkedin.com/pulse/complete-guide-apache-kafka-developers-everything-i-know-paul-brebner/

There's also a Use Cases section (4) which would be useful.
"A Complete Guide to Apache Kafka for Developers (or, everything I know about 
Kafka in one place)", 11 April 2022, last updated 15 November 2022.


"Kongo" IoT logistics application included a streams example: 
https://www.instaclustr.com/blog/kongo-5-3-apache-kafka-streams-examples/
Apache Kafka® "Kongo" 5.3: Kongo Streams 
Example<https://www.instaclustr.com/blog/kongo-5-3-apache-kafka-streams-examples/>
Continuing our blog on Kongo IoT application, where we develop a more complex 
Kafka streams application to keep track of the weight of goods in trucks
www.instaclustr.com
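
To give you a flavour (this isn't the Kongo code, just a minimal sketch), a 
Kafka Streams topology for the sort of per-sensor aggregation you describe 
could look something like the following. The topic names ("iot-readings", 
"iot-averages") and the assumption that records are keyed by the sensor serial 
number with a numeric value are made up for the example:

import java.time.Duration;
import java.util.Properties;

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.kstream.Produced;
import org.apache.kafka.streams.kstream.TimeWindows;

public class SensorAveragesSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "sensor-averages-sketch");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        StreamsBuilder builder = new StreamsBuilder();

        builder.stream("iot-readings", Consumed.with(Serdes.String(), Serdes.Double()))
               .groupByKey(Grouped.with(Serdes.String(), Serdes.Double()))
               // 1 minute windows, with a 5 minute grace period for late/out-of-order data
               .windowedBy(TimeWindows.ofSizeAndGrace(Duration.ofMinutes(1), Duration.ofMinutes(5)))
               // keep a "count,sum" pair per sensor per window (a real app would use a
               // proper aggregate class and Serde rather than a String)
               .aggregate(() -> "0,0.0",
                          (serial, reading, agg) -> {
                              String[] parts = agg.split(",");
                              long count = Long.parseLong(parts[0]) + 1;
                              double sum = Double.parseDouble(parts[1]) + reading;
                              return count + "," + sum;
                          },
                          Materialized.with(Serdes.String(), Serdes.String()))
               .toStream()
               // turn "count,sum" into the average, re-keyed by the sensor serial number
               .map((windowedSerial, agg) -> {
                   String[] parts = agg.split(",");
                   double avg = Double.parseDouble(parts[1]) / Long.parseLong(parts[0]);
                   return KeyValue.pair(windowedSerial.key(), avg);
               })
               .to("iot-averages", Produced.with(Serdes.String(), Serdes.Double()));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}

Keying by serial number keeps each device's readings in the same partition, and 
event-time windows with a grace period are how Streams copes with the late data 
you mention; min/max and over/under quota checks are just further aggregates on 
the same grouped stream.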


The Kafka Connect pipeline series may also be relevant, as it ingested IoT data 
from a REST API into Kafka for further processing: 
https://www.instaclustr.com/blog/kafka-connect-pipelines-conclusion-pipeline-series-part-10/
That's the final part, "Conclusions" (Part 10), which compares the 
Elasticsearch™/Kibana™ and PostgreSQL®/Apache Superset pipelines for 
functionality, robustness, and performance.
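
In that series the REST polling is all done by a Connect source connector, so 
it's configuration rather than code. The other option, since you already have a 
web API receiving the data, is for the API handler to publish each reading 
straight to a topic with the plain producer API instead of inserting it into 
the database first. A rough sketch (the topic name "iot-raw" and the JSON 
payload are just placeholders):

import java.util.Properties;

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class IotIngestProducer {
    private final KafkaProducer<String, String> producer;

    public IotIngestProducer(String bootstrapServers) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        producer = new KafkaProducer<>(props);
    }

    // Called by the web API for each incoming reading; keying by serial number
    // keeps each device's events together in one partition even when they arrive late.
    public void publish(String serialNumber, String jsonPayload) {
        producer.send(new ProducerRecord<>("iot-raw", serialNumber, jsonPayload));
    }

    public void close() {
        producer.close();
    }
}

Sink connectors (or a Streams application like the one above) can then take it 
from there, so nothing has to poll the database.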


And the "Anomalia Machina" series is an example of a combined Kafka+Cassandra 
(database) approach to anomaly detection, 
https://www.instaclustr.com/blog/kafka-connect-pipelines-conclusion-pipeline-series-part-10/
 (final part).
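
The shape of that application, very roughly (this is a sketch of the pattern, 
not the Anomalia Machina code; the topic name, the Cassandra table, and the 
check itself are placeholders), is a consumer that writes each event to 
Cassandra, reads back that key's recent history, and runs a detector over it:

import java.time.Duration;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class AnomalyCheckerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "anomaly-checker");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("iot-readings"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> record : records) {
                    double reading = Double.parseDouble(record.value());
                    // 1. Write the new reading to Cassandra, keyed by sensor serial number
                    //    (e.g. an INSERT into a (serial, event_time, value) table).
                    // 2. Read back that sensor's most recent N readings from Cassandra.
                    // 3. Compare the new reading against the history (the blog series used
                    //    a CUSUM-style detector; a crude threshold stands in for it here).
                    double historicalAverage = 0.0; // placeholder for the value read in step 2
                    if (Math.abs(reading - historicalAverage) > 10.0) {
                        System.out.printf("Possible anomaly for %s: %f%n", record.key(), reading);
                    }
                }
            }
        }
    }
}

The interesting part of that series is mostly about scaling the consumers and 
Cassandra together as the event rate goes up.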

Good luck! Paul Brebner

________________________________
From: paolo francia <paolo.francia1...@gmail.com>
Sent: 23 February 2023 05:13 PM
To: users@kafka.apache.org <users@kafka.apache.org>
Subject: Kafka for IoT ingestion pipeline

Hello,
I'd like to ask if there are any cases/examples in which Kafka has been used
in the backend of an ingestion pipeline for IoT data, with the purpose of
making it scalable.

My case briefly looks like this:
- a web API is waiting for data from IoT devices (10 million expected per day).
The data are not ordered in time; we could receive old data that the device
couldn't send previously.
- the data are then stored in a database to be processed
- a database job pulls and processes the data (computes the average, min, max,
over/under quota, adds the internal sensor id from the serial number, ...)

I'm wondering if Kafka could be the right choice to replace the database
polling, and which benefits it would bring.
I would really appreciate an example or case study.

Thank you very much
Paolo
