Hi,

What is the definition of real time here?
The engineering definition of real time is roughly "fast enough to be interactive". However, I would put a stronger definition on it: in a real-time application or data feed there is no such thing as an answer that is late but still correct. Timeliness is part of the application; if we get the right answer too slowly, it becomes useless or wrong. We also need to be aware that latency trades off with throughput, so it all depends on what you want to do with these artifacts for your needs.

Also, within a larger architecture, latency is often dictated by the lowest common denominator, which frequently does not adhere to our definition of low latency. For example, Kafka as widely deployed today in Big Data architectures is micro-batch: a moderate-latency message queue (Kafka) plus a low-latency processor still equals a moderate-latency architecture. Hence any low-latency architecture must be treated within that context. I have added a small Structured Streaming sketch at the end of this mail to illustrate the point.

Have a look at this article of mine:
https://www.linkedin.com/pulse/real-time-processing-trade-data-kafka-flume-spark-talebzadeh-ph-d-/

HTH

Dr Mich Talebzadeh

LinkedIn: https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
http://talebzadehmich.wordpress.com

Disclaimer: Use it at your own risk. Any and all responsibility for any loss, damage or destruction of data or any other property which may arise from relying on this email's technical content is explicitly disclaimed. The author will in no case be liable for any monetary damages arising from such loss, damage or destruction.

On Wed, 21 Aug 2019 at 09:53, Eliza <e...@chinabuckets.com> wrote:

> Hi,
>
> on 2019/8/21 16:44, Aziret Satybaldiev wrote:
> > In my experience, Kafka + Spark Streaming + (perhaps HBase if you want
> > to store the metrics) is so far the best combo. Not only because the
> > technology is mature, but also because there are a lot of examples
> > available on the web and in books. They should cover most of what you
> > would probably need, and you can easily build something more complex
> > on top of that. However, if you can, I would advise building several
> > prototypes and then choosing the best stack. At least that's how we
> > got where we are.
>
> Someone else told me that we should consider using Flink for real-time
> streaming and Kafka for message queues, that Spark Streaming is weak,
> and that we should never consider Storm.
>
> I will try to test them in my cases.
>
> Thanks & regards.
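P.S. To make the micro-batch point concrete, here is a minimal Structured Streaming sketch. It is only an assumed setup, not the one from my article: a local Kafka broker on localhost:9092, a hypothetical topic named "trades", and the spark-sql-kafka-0-10 connector on the classpath. The point is that the ProcessingTime trigger puts a floor under end-to-end latency regardless of how fast the processing itself runs.

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.Trigger

object KafkaLatencySketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("kafka-latency-sketch")
      .getOrCreate()

    // Read from an assumed local broker and a hypothetical "trades" topic.
    val stream = spark.readStream
      .format("kafka")
      .option("kafka.bootstrap.servers", "localhost:9092")
      .option("subscribe", "trades")
      .load()
      .selectExpr("CAST(key AS STRING)", "CAST(value AS STRING)", "timestamp")

    // The trigger interval is a floor on end-to-end latency: results are only
    // emitted once per micro-batch, however fast the work inside the batch is.
    val query = stream.writeStream
      .format("console")
      .trigger(Trigger.ProcessingTime("2 seconds"))
      .start()

    query.awaitTermination()
  }
}

Changing the processing model (for example an experimental continuous trigger, or a non-micro-batch engine such as Flink) is what moves that floor, not tuning the transformations inside each micro-batch.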