Hey Xiao,

Yeah I agree that without fsync you will not get durability in the case of
a power outage or other correlated failure, and likewise without
replication you won't get durability in the case of disk failure.

If each batch is fsync'd it will definitely be slower; how much slower
depends on the capability of the disk subsystem. Either way, that feature
is there now.

-Jay

On Wed, Mar 4, 2015 at 8:50 AM, Xiao <lixiao1...@gmail.com> wrote:

> Hey Jay,
>
> Yeah, I understand that Kafka's advantage is one-to-many distribution.
> That is why I am reading Kafka's source code. You guys built a good
> product! : )
>
> Our major concern is message persistence. Zero data loss is a must in our
> applications. Below is what I copied from the Kafka documentation.
>
> "The log takes two configuration parameters: M, which gives the number of
> messages to write before forcing the OS to flush the file to disk, and S,
> which gives a number of seconds after which a flush is forced. This gives
> a durability guarantee of losing at most M messages or S seconds of data
> in the event of a system crash."
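>
> A toy model may make that guarantee concrete. The sketch below is
> illustrative only (class and parameter names are made up, not broker
> code); in a real broker, M and S correspond to the settings
> log.flush.interval.messages and log.flush.interval.ms.

```python
class FlushPolicyLog:
    """Toy model of the M-messages / S-seconds flush policy (illustrative)."""

    def __init__(self, m, s):
        self.m = m                # M: max unflushed messages before an fsync
        self.s = s                # S: max seconds between fsyncs
        self.unflushed = 0
        self.since_flush = 0.0    # seconds since the last fsync

    def append(self, dt):
        """Append one message; dt is the time since the previous append."""
        self.unflushed += 1
        self.since_flush += dt
        if self.unflushed >= self.m or self.since_flush >= self.s:
            self.fsync()

    def fsync(self):
        # In the real broker this is a forced flush of the log file to disk.
        self.unflushed = 0
        self.since_flush = 0.0

log = FlushPolicyLog(m=100, s=2.0)
for _ in range(1000):
    log.append(dt=0.001)
# A crash at any point loses at most M messages or S seconds of data.
assert log.unflushed < 100
```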
>
> Basically, our producers need to know whether the data has been
> flushed/fsynced to disk. Our model is disconnected: producers and
> consumers do not talk to each other. The only medium between them is a
> Kafka-like persistent message queue.
>
> Unplanned power outages are not rare in 24/7 operation. Any data loss
> could force a very expensive full refresh, which is not acceptable for
> many financial companies.
>
> If we fsync on each transaction or each batch, won't the throughput be
> low? Alternatively, we could have our producers check recovery points
> very frequently, but then the performance bottleneck would be reading and
> copying the recovery-point file. Any other ideas?
>
> I have not read the source code for synchronous disk replication yet;
> that will be my next focus. I am not sure whether it can resolve the
> concern above.
>
> BTW, do you have any plan to support mainframe?
>
> Thanks,
>
> Xiao Li
>
>
> On Mar 4, 2015, at 8:01 AM, Jay Kreps <jay.kr...@gmail.com> wrote:
>
> > Hey Xiao,
> >
> > 1. Nothing prevents applying transactions transactionally on the
> > destination side, though that is obviously more work. But I think the key
> > point here is that much of the time the replication is not Oracle=>Oracle,
> > but Oracle=>{W, X, Y, Z}, where W/X/Y/Z are totally heterogeneous systems
> > that aren't necessarily RDBMSs.
> >
> > 2. I don't think fsync is really relevant. You can fsync on every message
> > if you like, but Kafka's durability guarantees don't depend on this, as it
> > allows synchronous commit across replicas. This changes the guarantee from
> > "won't be lost unless the disk dies" to "won't be lost unless all replicas
> > die", but the latter is generally a stronger guarantee in practice given
> > the empirical reliability of disks (the #1 reason for server failure in my
> > experience was disk failure).
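> >
> > To make that concrete, here is a hedged sketch of the configuration
> > involved. "acks" and "min.insync.replicas" are real Kafka settings; the
> > replication-factor entry and the helper function are illustrative only.

```python
# Sketch: with acks=all, a write is acknowledged only once every in-sync
# replica has it, so durability rests on "all replicas die" rather than
# "this one disk dies". Values are illustrative.
producer_config = {
    "acks": "all",              # wait for the full in-sync replica set
}
topic_config = {
    "replication.factor": 3,    # illustrative: each partition on 3 brokers
    "min.insync.replicas": 2,   # refuse writes unless >= 2 replicas in sync
}

def commit_succeeds(acked_replicas: int, min_insync: int) -> bool:
    """Illustrative helper: an acks=all write succeeds only if enough
    in-sync replicas have acknowledged it."""
    return acked_replicas >= min_insync

assert commit_succeeds(2, topic_config["min.insync.replicas"])
assert not commit_succeeds(1, topic_config["min.insync.replicas"])
```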
> >
> > -Jay
> >
> > On Tue, Mar 3, 2015 at 4:23 PM, Xiao <lixiao1...@gmail.com> wrote:
> >
> >> Hey Josh,
> >>
> >> If you put different tables into different partitions or topics, it
> >> might break transaction ACID at the target side. This is risky for some
> >> use cases. Besides unit-of-work issues, you also need to think about
> >> load balancing.
> >>
> >> For failover, you have to find the timestamp for point-in-time
> >> consistency. This part is tricky. You have to ensure all the changes
> >> before a specific timestamp have been flushed to disk. Normally, you
> >> maintain a bookmark for each partition at the target side to know the
> >> oldest transaction that has been flushed to disk. Unfortunately, based
> >> on my understanding, Kafka is unable to do this because it does not
> >> fsync regularly, in order to achieve better throughput.
> >>
> >> Best wishes,
> >>
> >> Xiao Li
> >>
> >>
> >> On Mar 3, 2015, at 3:45 PM, Xiao <lixiao1...@gmail.com> wrote:
> >>
> >>> Hey Josh,
> >>>
> >>> Transactions can be applied in parallel on the consumer side based on
> >>> transaction dependency checking.
> >>>
> >>> http://www.google.com.ar/patents/US20080163222
> >>>
> >>> This patent documents how it works. It is easy to understand; however,
> >>> you also need to consider hash collision issues. This has been
> >>> implemented in IBM Q Replication since 2001.
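> >>>
> >>> A hypothetical sketch of the idea (not IBM's actual implementation):
> >>> two transactions may be applied in parallel only if the row keys they
> >>> touch hash into disjoint buckets, and a collision between unrelated
> >>> keys forces needless serialization -- the issue noted above.

```python
# Hypothetical dependency check via key hashing; bucket count is made up.
def hash_buckets(keys, buckets=1024):
    # Map each touched row key to a bucket; collisions are possible.
    return {hash(k) % buckets for k in keys}

def can_apply_in_parallel(txn_a_keys, txn_b_keys, buckets=1024):
    # No shared bucket => no detected dependency => safe to parallelize.
    return hash_buckets(txn_a_keys, buckets).isdisjoint(
        hash_buckets(txn_b_keys, buckets))

# Transactions touching the same row must be serialized.
assert not can_apply_in_parallel({("orders", 42)}, {("orders", 42)})
```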
> >>>
> >>> Thanks,
> >>>
> >>> Xiao Li
> >>>
> >>>
> >>> On Mar 3, 2015, at 3:36 PM, Jay Kreps <jay.kr...@gmail.com> wrote:
> >>>
> >>>> Hey Josh,
> >>>>
> >>>> As you say, ordering is per partition. Technically it is generally
> >>>> possible to publish all changes to a database to a single
> >>>> partition--generally the Kafka partition should be high-throughput
> >>>> enough to keep up. However there are a couple of downsides to this:
> >>>> 1. Consumer parallelism is limited to one. If you want a total order
> >>>> to the consumption of messages you need to have just one process, but
> >>>> often you would want to parallelize.
> >>>> 2. Often what people want is not a full stream of all changes in all
> >>>> tables in a database, but rather the changes to a particular table.
> >>>>
> >>>> To some extent the best way to do this depends on what you will do
> >>>> with the data. However if you intend to have lots
> >>>>
> >>>> I have seen pretty much every variation on this in the wild, and here
> >>>> is what I would recommend:
> >>>> 1. Have a single publisher process that publishes events into Kafka.
> >>>> 2. If possible use the database log to get these changes (e.g. MySQL
> >>>> binlog, Oracle XStream, GoldenGate, etc.). This will be more complete
> >>>> and more efficient than polling for changes, though that can work too.
> >>>> 3. Publish each table to its own topic.
> >>>> 4. Partition each topic by the primary key of the table.
> >>>> 5. Include in each message the database's transaction id, SCN, or
> >>>> other identifier that gives the total order within the record stream.
> >>>> Since there is a single publisher this id will be monotonic within
> >>>> each partition.
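> >>>>
> >>>> A minimal sketch of steps 3-5 (the topic name, SCN value, and payload
> >>>> are made up for illustration; the partitioner mirrors hash-of-key mod
> >>>> partition-count keyed partitioning):

```python
import zlib

def partition_for(primary_key: str, num_partitions: int) -> int:
    # Stable hash so every change to a given row lands in the same
    # partition, preserving per-row ordering.
    return zlib.crc32(primary_key.encode("utf-8")) % num_partitions

record = {
    "topic": "db.orders",                        # step 3: one topic per table
    "partition": partition_for("order-42", 8),   # step 4: keyed by primary key
    "scn": 1001,                                 # step 5: total order from the DB
    "payload": {"order_id": "order-42", "status": "SHIPPED"},
}
assert record["partition"] == partition_for("order-42", 8)  # deterministic
```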
> >>>>
> >>>> This seems to be the best set of tradeoffs for most use cases:
> >>>> - You can have parallel consumers, up to the number of partitions you
> >>>> chose, that still get messages in order per ID'd entity.
> >>>> - You can subscribe to just one table if you like, or to multiple
> >>>> tables.
> >>>> - Consumers who need a total order over all updates can do a "merge"
> >>>> across the partitions to reassemble the fully ordered set of changes
> >>>> across all tables/partitions.
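> >>>>
> >>>> That "merge" can be sketched like this (data is made up; heapq.merge
> >>>> assumes each input is already sorted, which holds here because the id
> >>>> from step 5 is monotonic per partition):

```python
import heapq

# Per-partition streams of (scn, change) pairs; scn is monotonic per stream.
partition_0 = [(1, "orders: insert"), (4, "orders: update")]
partition_1 = [(2, "customers: update"), (3, "orders: delete")]

# K-way merge by scn reassembles the fully ordered change stream.
total_order = list(heapq.merge(partition_0, partition_1))
assert [scn for scn, _ in total_order] == [1, 2, 3, 4]
```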
> >>>>
> >>>> One thing to note is that the requirement of having a single consumer
> >>>> process/thread to get the total order isn't really so much a Kafka
> >>>> restriction as a restriction imposed by the world: even if you had
> >>>> multiple threads and delivered messages to them in order, their
> >>>> processing might happen out of order (just due to the random timing of
> >>>> the processing).
> >>>>
> >>>> -Jay
> >>>>
> >>>>
> >>>>
> >>>> On Tue, Mar 3, 2015 at 3:15 PM, Josh Rader <jrader...@gmail.com> wrote:
> >>>>
> >>>>> Hi Kafka Experts,
> >>>>>
> >>>>>
> >>>>>
> >>>>> We have a use case around RDBMS replication where we are
> >>>>> investigating Kafka.  In this case ordering is very important.  Our
> >>>>> understanding is that ordering is only preserved within a single
> >>>>> partition.  This makes sense, as a single thread will consume these
> >>>>> messages, but our question is: can we somehow parallelize this for
> >>>>> better performance?  Is there maybe some partition-key strategy trick
> >>>>> to have your cake and eat it too, in terms of keeping ordering but
> >>>>> also being able to parallelize the processing?
> >>>>>
> >>>>>
> >>>>>
> >>>>> I am sorry if this has already been asked, but we tried to search
> >>>>> through the archives and couldn't find an answer.
> >>>>>
> >>>>>
> >>>>>
> >>>>> Thanks,
> >>>>>
> >>>>> Josh
> >>>>>
> >>>
> >>
> >>
>
>
