On Fri, Aug 14, 2015 at 10:57 AM, Jay Kreps <j...@confluent.io> wrote:

> I thought batch was dead? :-)
>
> Yeah I think this would be really useful. Kafka kind of allows you to unify
> batch and streams since you produce or consume your stream on your own
> schedule so you would want the ingress/egress to work the same.
>
> Ewen, rather than sleeping, I think the use case is that I want to be able
> to crontab up the copycat process to run hourly or daily to either push or
> pull data and then quit when there is no more data. Scheduling the process
> to start is easy, the challenge is how does copycat know it is done?
>
> The sink side is a little easier since you can define the end of the stream
> to be the last offset for each partition at the time the connector starts
> (this is what Camus does iirc). So at startup you check the end offset for
> each partition; a partition is complete when it reaches that offset. When
> all jobs are complete the process exits.
>
> Not sure how the source side could work since the offset concept is
> heterogeneous across different systems.
>
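
For reference, the sink-side rule quoted above (snapshot the end offset of
each partition at startup, then stop once every partition reaches its
snapshot) could be sketched roughly as below. This is only an illustration
under stated assumptions: partitions are modelled as in-memory lists, and
snapshot_end_offsets, run_batch_sink and the write callback are hypothetical
names, not Copycat or Kafka APIs.

```python
from typing import Callable, Dict, List


def snapshot_end_offsets(partitions: Dict[str, List[str]]) -> Dict[str, int]:
    """Record, per partition, the offset the next appended record would get."""
    return {name: len(records) for name, records in partitions.items()}


def run_batch_sink(
    partitions: Dict[str, List[str]],
    start_offsets: Dict[str, int],
    write: Callable[[str, int, str], None],
) -> Dict[str, int]:
    """Copy records to the sink until every partition reaches its snapshot.

    Assumption: `partitions` stands in for Kafka topic partitions and
    `write` stands in for the external system being written to.
    """
    # The snapshot is taken once, at startup. Records appended after this
    # point belong to the next scheduled run.
    end_offsets = snapshot_end_offsets(partitions)
    positions = dict(start_offsets)
    for name, end in end_offsets.items():
        pos = positions.get(name, 0)
        while pos < end:  # a partition is complete when pos == end
            write(name, pos, partitions[name][pos])
            pos += 1
        positions[name] = pos
    # All partitions are complete, so the process can exit. The returned
    # positions are where the next run would resume.
    return positions
```

The point of fixing the end offsets up front is that the run terminates even
while producers keep appending; anything written after startup is simply
picked up by the next cron invocation.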

On the source side, you can indicate that you are done by returning from
poll() without any data. Since poll() is allowed to block indefinitely, a
source running in streaming mode never needs to return empty, so an empty
return can unambiguously mean "no more data".
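
A rough sketch of that convention, in case it helps. The names here
(TableDumpSource, run_batch_source, send) are made up for illustration and
are not the Copycat API; the only idea taken from the above is that an empty
return from poll() means the source is finished.

```python
from typing import Callable, Iterable, List


class TableDumpSource:
    """Hypothetical batch source that hands out a fixed set of rows in chunks."""

    def __init__(self, rows: Iterable[int], chunk_size: int) -> None:
        self._rows = list(rows)
        self._chunk_size = chunk_size
        self._next = 0

    def poll(self) -> List[int]:
        """Return the next chunk, or an empty list once all rows are read."""
        chunk = self._rows[self._next:self._next + self._chunk_size]
        self._next += len(chunk)
        return chunk


def run_batch_source(source: TableDumpSource, send: Callable[[int], None]) -> int:
    """Forward records until poll() comes back empty, then stop.

    Assumption: `send` stands in for producing a record to Kafka.
    Returns the number of records forwarded.
    """
    total = 0
    while True:
        records = source.poll()
        if not records:
            # Empty poll: the source has nothing left, so the run is done.
            return total
        for record in records:
            send(record)
        total += len(records)
```

A streaming source would just block inside poll() instead of returning
empty, so the same loop would serve both modes.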

-Ewen


>
> -Jay
>
> On Thu, Aug 13, 2015 at 10:23 PM, Gwen Shapira <g...@confluent.io> wrote:
>
> > Hi Team Kafka,
> >
> > (sorry for the flood, this is the last one! promise!)
> >
> > If you tried out PR-99, you know that CopyCat now does on-going
> > export/import. So it will continuously read data from a source and write it
> > to Kafka (or vice versa). This is great for tailing logs and replicating
> > from MySQL binlog.
> >
> > But, I'm wondering if there's a need for a batch-mode too.
> > This can be useful for:
> > * Camus-like thing. You can stream data to HDFS, but the benefits are
> > limited and there are some known issues there.
> > * Dump large parts of an RDBMS at once.
> >
> > Do you agree that this need exists? Or is stream export/import good enough?
> >
> > Also, does anyone have ideas on how they would like the batch mode to work?
> >
> > Gwen
> >
>



-- 
Thanks,
Ewen
