Hey Mich,
Thanks for this introduction to your forthcoming proposal, "Spark Structured
Streaming and Flask REST API for Real-Time Data Ingestion and Analytics". I
recently came across a Databricks article titled "Scalable Spark Structured
Streaming for REST API Destinations". Their use case is similar to yours, but
they start from an incoming stream of data from sources like Kafka, AWS
Kinesis, or Azure Event Hubs. In other words, it is a continuous flow of data
where messages are sent to a REST API as soon as they are available in the
streaming source. Their approach looks practical, but I wanted to get your
thoughts on the article so that I can better understand your proposal and how
the two differ.
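
To make the comparison concrete, here is a rough sketch of the "stream to REST
API" direction as I understand it. It is not taken from the article. It assumes
Spark's built-in "rate" test source in place of Kafka, Kinesis, or Event Hubs,
and the names post_json, make_batch_sender, and start_forwarding are
hypothetical, as is whatever URL is passed in. Posting from the driver like
this would not scale; it only shows the shape of the idea.

```python
import json
import urllib.request


def post_json(url, payload, timeout=10):
    """POST one JSON document and return the HTTP status code."""
    # default=str turns values such as timestamps into strings for JSON.
    body = json.dumps(payload, default=str).encode("utf-8")
    req = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status


def make_batch_sender(url, post=post_json):
    """Build a foreachBatch callback that forwards every row to `url`."""

    def send_batch(batch_df, batch_id):
        # Rows are pulled to the driver one partition at a time and
        # posted sequentially. Simple, but not a scalable design.
        for row in batch_df.toLocalIterator():
            post(url, row.asDict())

    return send_batch


def start_forwarding(spark, url):
    """Start a streaming query that sends each micro-batch to `url`.

    `spark` is an existing SparkSession. The "rate" source stands in
    for a real streaming source purely for illustration.
    """
    source = spark.readStream.format("rate").option("rowsPerSecond", 5).load()
    return source.writeStream.foreachBatch(make_batch_sender(url)).start()
```

The `post` argument is injectable so the callback can be exercised without a
live endpoint or a Spark cluster.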
Thanks
On Tuesday, 9 January 2024 at 00:24:19 GMT, Mich Talebzadeh
<mich.talebza...@gmail.com> wrote:
Please also note that Flask's built-in development server is not designed for
production use. While it is suitable for development and small-scale
applications, it may not handle concurrent requests efficiently in a production
environment. In production, one can use Gunicorn (Green Unicorn), a WSGI (Web
Server Gateway Interface) HTTP server that is commonly used to serve Flask
applications. It runs multiple worker processes, each of which (with the
default synchronous worker type) handles a single request at a time. This lets
Gunicorn handle multiple simultaneous requests and improves the concurrency and
performance of your Flask application.
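
As an illustration only, a minimal Gunicorn configuration file might look like
the sketch below. The bind address, the timeout, and the worker count are
assumptions to be tuned for your own workload; (2 x cores) + 1 is the starting
point suggested in the Gunicorn documentation, not a rule.

```python
# gunicorn.conf.py: illustrative settings, not a tuned production config
import multiprocessing

# Address and port to listen on (assumed values).
bind = "0.0.0.0:8000"

# Number of worker processes; each sync worker serves one request at a time.
workers = multiprocessing.cpu_count() * 2 + 1

# The default worker type, stated explicitly for clarity.
worker_class = "sync"

# Seconds a worker may stay silent before it is killed and restarted.
timeout = 30
```

It could then be started with something like
`gunicorn -c gunicorn.conf.py app:app`, where `app:app` is a hypothetical
module and Flask application object.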
HTH
Mich Talebzadeh,
Dad | Technologist | Solutions Architect | Engineer
London
United Kingdom
view my LinkedIn profile
https://en.everybodywiki.com/Mich_Talebzadeh
Disclaimer: Use it at your own risk. Any and all responsibility for any loss,
damage or destruction of data or any other property which may arise from relying
on this email's technical content is explicitly disclaimed. The author will in
no case be liable for any monetary damages arising from such loss, damage or
destruction.
On Mon, 8 Jan 2024 at 19:30, Mich Talebzadeh <mich.talebza...@gmail.com> wrote:
I thought it might be useful to share my idea with fellow forum members. During
the break, I worked on integrating Spark Structured Streaming with a Flask REST
API for real-time data ingestion and analytics. The use case revolves around a
scenario where data is generated through REST API requests in real time. The
Flask REST API captures and processes this data and saves it to a Spark
Structured Streaming DataFrame. Subsequently, the processed data can be
channelled into any sink of your choice, including a Kafka pipeline, giving an
end-to-end solution for dynamic and responsive data streaming. I will delve
into the architecture, implementation, and benefits of this combination,
enabling one to build an agile and efficient real-time data application. I will
put the code on GitHub for everyone's benefit. Hopefully your comments will
help me to improve it.
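
Until the code is published, one possible shape of the idea can be sketched as
follows. This is not the implementation described above, only a guess at one
way to wire it up: the Flask endpoint lands each request as a JSON file in a
directory, and Spark reads that directory as a file stream. The /ingest route,
the landing directory, the schema, and the names land_event, create_app, and
build_stream are all hypothetical.

```python
import json
import os
import uuid


def land_event(payload, landing_dir):
    """Write one event as a JSON file and return its final path."""
    os.makedirs(landing_dir, exist_ok=True)
    name = uuid.uuid4().hex + ".json"
    # Write under a dot-prefixed temporary name, then rename, so a reader
    # of the directory should never see a half-written file.
    tmp_path = os.path.join(landing_dir, "." + name + ".tmp")
    final_path = os.path.join(landing_dir, name)
    with open(tmp_path, "w", encoding="utf-8") as fh:
        json.dump(payload, fh)
    os.replace(tmp_path, final_path)
    return final_path


def create_app(landing_dir):
    """Build a Flask app whose /ingest route lands each JSON request."""
    # Imported here so the helpers above work without Flask installed.
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    @app.route("/ingest", methods=["POST"])
    def ingest():
        payload = request.get_json(force=True)
        land_event(payload, landing_dir)
        return jsonify({"status": "accepted"}), 202

    return app


def build_stream(spark, landing_dir, schema="sensor STRING, value DOUBLE"):
    """Read landed files as a stream and print them to the console.

    `spark` is an existing SparkSession. The console sink is a stand-in
    for any sink of your choice, such as Kafka.
    """
    events = spark.readStream.schema(schema).json(landing_dir)
    return events.writeStream.format("console").outputMode("append").start()
```

A landing directory is used here only because it needs no extra
infrastructure; a message queue between Flask and Spark would be another
option.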
Cheers
Mich Talebzadeh,