Hello,

Regarding the Lambda architecture, there is the following book: https://www.manning.com/books/big-data (Big Data: Principles and best practices of scalable realtime data systems, by Nathan Marz and James Warren).

Regards,
Roman

2015-11-12 4:47 GMT+03:00 Welly Tambunan <if05...@gmail.com>:

> Hi Stephan,
>
> Thanks for your response.
>
> We are trying to justify whether it's enough to use the Kappa Architecture
> with Flink. This is more about resiliency, message loss, and similar issues.
>
> The article worries about message loss even if you are using Kafka:
>
> No matter the message queue or broker you rely on whether it be RabbitMQ,
> JMS, ActiveMQ, Websphere, MSMQ and yes even Kafka you can lose messages in
> any of the following ways:
>
>    - A downstream system from the broker can have data loss
>    - All message queues today can lose already acknowledged messages
>      during failover or leader election.
>    - A bug can send the wrong messages to the wrong systems.
>
> Cheers
>
> On Wed, Nov 11, 2015 at 4:13 PM, Stephan Ewen <se...@apache.org> wrote:
>
>> Hi!
>>
>> Can you explain a little more what you want to achieve? Maybe then we can
>> give a few more comments...
>>
>> I briefly read through some of the articles you linked, but did not quite
>> understand their train of thought.
>> For example, letting Tomcat write to Cassandra directly, and to Kafka,
>> might just be redundant. Why not let the streaming job that reads the Kafka
>> queue move the data to Cassandra as one of its results? Furthermore,
>> durably storing the sequence of events is exactly what Kafka does, but the
>> article suggests using Cassandra for that, which I find very
>> counterintuitive.
>> It looks a bit like the suggested approach is only adopting streaming for
>> half the task.
>>
>> Greetings,
>> Stephan
>>
>>
>> On Tue, Nov 10, 2015 at 7:49 AM, Welly Tambunan <if05...@gmail.com>
>> wrote:
>>
>>> Hi All,
>>>
>>> I read a couple of articles about the Kappa and Lambda architectures.
>>>
>>> http://www.confluent.io/blog/real-time-stream-processing-the-next-step-for-apache-flink/
>>>
>>> I'm convinced that Flink will simplify this with streaming.
>>>
>>> However, I also stumbled upon this blog post, which makes a valid
>>> argument for having a system-of-record store (event sourcing), and in the
>>> end the Lambda architecture appears as the solution. Basically, it writes
>>> twice, to the queuing system and to C*, for safety. The system of record
>>> here basically stores the events (deltas).
>>>
>>> [image: Inline image 1]
>>>
>>> https://lostechies.com/ryansvihla/2015/09/17/event-sourcing-and-system-of-record-sane-distributed-development-in-the-modern-era-2/
>>>
>>> Another approach uses the Lambda architecture to maintain the
>>> correctness of the system.
>>>
>>> https://lostechies.com/ryansvihla/2015/09/17/real-time-analytics-with-spark-streaming-and-cassandra/
>>>
>>> Given that he's using Spark as the streaming processor, do we have to
>>> do the same thing with Apache Flink?
>>>
>>> Cheers
>>> --
>>> Welly Tambunan
>>> Triplelands
>>>
>>> http://weltam.wordpress.com
>>> http://www.triplelands.com <http://www.triplelands.com/blog/>
>>>
>>
>
> --
> Welly Tambunan
> Triplelands
>
> http://weltam.wordpress.com
> http://www.triplelands.com <http://www.triplelands.com/blog/>
>
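
To make Stephan's suggestion quoted above a bit more concrete (write each event once to the log, and let the streaming job that reads the log move the data to the serving store as one of its results), here is a rough, in-memory sketch. It does not use the Kafka, Flink, or Cassandra APIs; `EventLog`, `ViewStore`, and `run_job` are hypothetical names made up for illustration, and the offset check is just one simple way to make replays safe. It also assumes the log itself is durable, so it does not answer the article's concern about a broker losing acknowledged messages.

```python
from dataclasses import dataclass, field


@dataclass
class EventLog:
    """Hypothetical stand-in for a durable, replayable log (a Kafka topic in the thread)."""
    events: list = field(default_factory=list)

    def append(self, key, delta):
        # The application writes each event exactly once, here.
        self.events.append((key, delta))
        return len(self.events) - 1  # offset of the new event

    def read_from(self, offset):
        # Yield (offset, key, delta) for every event at or after `offset`.
        for i in range(offset, len(self.events)):
            key, delta = self.events[i]
            yield i, key, delta


@dataclass
class ViewStore:
    """Hypothetical stand-in for the serving store (Cassandra in the thread)."""
    totals: dict = field(default_factory=dict)
    last_offset: int = -1

    def apply(self, offset, key, delta):
        if offset <= self.last_offset:
            return  # already applied, so a replayed event is ignored
        self.totals[key] = self.totals.get(key, 0) + delta
        self.last_offset = offset


def run_job(log, store):
    """Hypothetical streaming job: resume after the last applied offset."""
    for offset, key, delta in log.read_from(store.last_offset + 1):
        store.apply(offset, key, delta)


log = EventLog()
log.append("page_a", 5)
log.append("page_b", 3)
log.append("page_a", 2)

store = ViewStore()
run_job(log, store)
run_job(log, store)           # running again finds nothing new
store.apply(0, "page_a", 5)   # a redelivered event is not double counted

# If the serving store is lost, the view is rebuilt by replaying the log.
rebuilt = ViewStore()
run_job(log, rebuilt)
assert rebuilt.totals == store.totals
```

In this sketch the only write path from the application is `log.append`; the store is a derived view, which is why losing it is recoverable as long as the log survives.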