Hi Paolo,

That sounds like a perfect use case for Kafka, probably in conjunction with Kafka Connect and maybe Kafka Streams too.
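To give you a rough feel for the Streams side, here's a minimal sketch of a topology that computes min/max/average per device over time windows, which is roughly what your database job does today. It's not taken from any of my demos — the topic names, the "epochMillis,measurement" record format, window sizes and grace period are just placeholder assumptions — but the event-time windowing with a grace period is the part that helps with your out-of-order data:

import java.time.Duration;
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.kstream.Consumed;
import org.apache.kafka.streams.kstream.Grouped;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.kstream.Produced;
import org.apache.kafka.streams.kstream.TimeWindows;

public class SensorStatsSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "sensor-stats-sketch");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

        StreamsBuilder builder = new StreamsBuilder();

        // Readings keyed by device serial number, value = "epochMillis,measurement"
        // (a real application would use JSON/Avro serdes and a timestamp extractor).
        builder.stream("iot-readings", Consumed.with(Serdes.String(), Serdes.String()))
            .groupByKey(Grouped.with(Serdes.String(), Serdes.String()))
            // 1-hour event-time windows; the grace period tolerates late/out-of-order data.
            .windowedBy(TimeWindows.ofSizeAndGrace(Duration.ofHours(1), Duration.ofHours(6)))
            // Running aggregate kept as "count,sum,min,max" in a plain String for the sketch.
            .aggregate(
                () -> "0,0,Infinity,-Infinity",
                (serial, value, agg) -> {
                    double reading = Double.parseDouble(value.split(",")[1]);
                    String[] a = agg.split(",");
                    long count = Long.parseLong(a[0]) + 1;
                    double sum = Double.parseDouble(a[1]) + reading;
                    double min = Math.min(Double.parseDouble(a[2]), reading);
                    double max = Math.max(Double.parseDouble(a[3]), reading);
                    return count + "," + sum + "," + min + "," + max;
                },
                Materialized.with(Serdes.String(), Serdes.String()))
            .toStream()
            // Emit per-window stats (avg = sum / count) to an output topic.
            .map((windowedSerial, agg) -> {
                String[] a = agg.split(",");
                double avg = Double.parseDouble(a[1]) / Long.parseLong(a[0]);
                return KeyValue.pair(windowedSerial.key(),
                    "min=" + a[2] + ",max=" + a[3] + ",avg=" + avg);
            })
            .to("sensor-stats", Produced.with(Serdes.String(), Serdes.String()));

        KafkaStreams streams = new KafkaStreams(builder.build(), props);
        streams.start();
        Runtime.getRuntime().addShutdownHook(new Thread(streams::close));
    }
}

The over/under-quota check could be a simple filter() or split() on the same stream, and a Kafka Connect sink connector could then write the results into your database, so nothing has to poll.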
I've built some demo Kafka IoT applications over the last 6 years; you may get some ideas from them. A summary of the blogs is here, and there's also a Use Cases section (4) which would be useful:
https://www.linkedin.com/pulse/complete-guide-apache-kafka-developers-everything-i-know-paul-brebner/

The "Kongo" IoT logistics application included a streams example:
https://www.instaclustr.com/blog/kongo-5-3-apache-kafka-streams-examples/

The Kafka Connect pipeline series may be relevant, as it ingested IoT data from a REST API into Kafka for further processing:
https://www.instaclustr.com/blog/kafka-connect-pipelines-conclusion-pipeline-series-part-10/

And the "Anomalia Machina" series is an example of a combined Kafka + Cassandra (database) approach to anomaly detection:
https://www.instaclustr.com/blog/kafka-connect-pipelines-conclusion-pipeline-series-part-10/ (final part)

Good luck!
Paul Brebner

________________________________
From: paolo francia <paolo.francia1...@gmail.com>
Sent: 23 February 2023 05:13 PM
To: users@kafka.apache.org <users@kafka.apache.org>
Subject: Kafka for IoT ingestion pipeline
Hello,

I would like to ask if there are some cases/examples in which Kafka has been used in the backend of an ingestion pipeline for IoT data, with the purpose of making it scalable. Briefly, my case does this:
- a web API waits for data from IoT devices (10 million expected per day); the data are not ordered in time, as we can receive old data that a device couldn't send earlier
- the data are then stored in a database to be processed
- a database job pulls and processes the data (computes the average, min, max, over/under quota, adds the internal sensor id from the serial number, ...)

I'm wondering if Kafka could be the right choice to move away from database polling, and which benefits it would bring. I would really appreciate an example or case study.

Thank you very much
Paolo