Hi Ted,

Maybe it's useful to take a look at Samza (http://samza.apache.org/); they
use Kafka in a way that sounds similar to how you want to use it. As I
recall from a conference talk on YouTube, the creator of Samza also
mentioned never deleting the events. These things are of course very
dependent on your use case; some events aren't worth keeping around for
long.
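For what it's worth: if you do keep everything (as far as I know, setting
retention.ms=-1 on the topic disables time-based deletion), a new consumer
can replay a topic from day one just by starting at the earliest offset and
then keep tailing the live stream, no HDFS handoff needed. A rough sketch
with the 0.9 Java consumer; the topic name, group id and broker address are
made up:

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ReplayConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // hypothetical broker
        props.put("group.id", "replay-consumer");         // hypothetical group id
        // With no committed offsets for this group, start from the
        // beginning of the log instead of the latest offset.
        props.put("auto.offset.reset", "earliest");
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");

        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(Collections.singletonList("events")); // hypothetical topic

        // Replays all retained history first, then keeps polling the
        // live topic once it has caught up.
        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(1000);
            for (ConsumerRecord<String, String> record : records) {
                System.out.printf("offset=%d key=%s value=%s%n",
                        record.offset(), record.key(), record.value());
            }
        }
    }
}

Because auto.offset.reset only applies when the group has no committed
offsets, a fresh group id gives you the full replay and the same loop then
serves as the real-time consumer afterwards.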

On Wed, Feb 24, 2016 at 9:08 AM Ted Swerve <ted.swe...@gmail.com> wrote:

> Hello,
>
> One of the big attractions of Kafka for me was the ability to write new
> consumers of topics that would then be able to connect to a topic and
> replay all the previous events.
>
> However, most of the time, Kafka appears to be used with a retention period
> - presumably in such cases, the events have been warehoused into HDFS
> or something similar.
>
> So my question is - how do people typically approach the scenario where a
> new piece of code needs to process all events in a topic from "day one",
> but has to source some of them from e.g. HDFS and then connect to the
> real-time Kafka topic?  Are there any wrinkles with such an approach?
>
> Thanks,
> Ted
>
