Hi, we ran into the same issue, and there is actually another use case:
live migration of VMs. This requires an RBD image to be mapped on two
clients simultaneously, so it is intentional. If multiple clients map an
image in RW mode, the Ceph back end will cycle the write lock between the
clients to allow each of them to flush writes; this, too, is intentional.
Coordinating the clients is the job of the orchestrator. In this case
specifically, it explicitly manages the write lock during live migration
so that writes occur in the correct order.
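
As a rough sketch of such a hand-off using the librbd Python bindings (my
illustration only, not something Ceph does by itself; the pool and image
names are made up, and the image needs the exclusive-lock feature):

    import rados, rbd

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    ioctx = cluster.open_ioctx('rbd')              # hypothetical pool name
    with rbd.Image(ioctx, 'vm-disk') as image:     # hypothetical image name
        # Destination host: take the exclusive lock only after the source
        # has flushed its writes and released the lock.
        image.lock_acquire(rbd.RBD_LOCK_MODE_EXCLUSIVE)
        # ... this host now owns the write path ...
        image.lock_release()
    ioctx.close()
    cluster.shutdown()

The orchestrator runs the release on the source and the acquire on the
destination, in that order.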

It's not a Ceph job, it's an orchestration job. The RBD interface just
provides the tools to do it; for example, you can attach information that
helps you hunt down dead-looking clients and kill them properly before
mapping an image somewhere else.
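
For example, a sketch with made-up names: tag the image with its current
owner, and fence a client that looks dead before mapping the image
elsewhere (on older releases the mon command is "osd blacklist" rather
than "osd blocklist"):

    import json
    import rados, rbd

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    ioctx = cluster.open_ioctx('rbd')              # hypothetical pool name
    with rbd.Image(ioctx, 'vm-disk') as image:     # hypothetical image name
        # Record who owns the mapping right now (free-form key/value).
        image.metadata_set('owner', 'hypervisor-01')
        print(image.metadata_get('owner'))

    # Fence the stale client so it can no longer write; take the address
    # from the image's watcher list (e.g. "rbd status <pool>/<image>").
    cmd = json.dumps({'prefix': 'osd blocklist', 'blocklistop': 'add',
                      'addr': '192.168.0.10:0/123456'})  # made-up address
    ret, out, errs = cluster.mon_command(cmd, b'')
    ioctx.close()
    cluster.shutdown()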

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Ilya Dryomov <idryo...@gmail.com>
Sent: Thursday, May 23, 2024 2:05 PM
To: Yuma Ogami
Cc: ceph-users@ceph.io
Subject: [ceph-users] Re: does the RBD client block write when the Watcher 
times out?

On Thu, May 23, 2024 at 4:48 AM Yuma Ogami <yuma.ogami.cyb...@gmail.com> wrote:
>
> Hello.
>
> I'm currently verifying the behavior of RBD on failure. I'm wondering
> about the consistency of RBD images after network failures. As a
> result of my investigation, I found that RBD sets a Watcher to RBD
> image if a client mounts this volume to prevent multiple mounts. In

Hi Yuma,

The watcher is created to watch for updates (technically, to listen to
notifications) on the RBD image, not to prevent multiple mounts.  RBD
allows the same image to be mapped multiple times on the same node or
on different nodes.
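
As a rough sketch with the librbd Python bindings (made-up pool/image
names; the watcher listing call here is from newer bindings, and
"rbd status <pool>/<image>" shows the same information on the command
line), you can list those watchers to see every client that currently has
the image open:

    import rados, rbd

    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf')
    cluster.connect()
    ioctx = cluster.open_ioctx('rbd')              # hypothetical pool name
    with rbd.Image(ioctx, 'vm-disk') as image:     # hypothetical image name
        for watcher in image.watchers_list():      # one entry per client
            print(watcher)                         # address, id, cookie
    ioctx.close()
    cluster.shutdown()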

> addition, I found that if the client is isolated from the network for
> a long time, the Watcher is released. However, the client still mounts
> this image. In this situation, if another client can also mount this
> image and the image is writable from both clients, data corruption
> occurs. Could you tell me whether this is a realistic scenario?

Yes, this is a realistic scenario which can occur even if the client
isn't isolated from the network.  If the user does this, it's up to the
user to ensure that everything remains consistent.  One use case for
mapping the same image on multiple nodes is a clustered (also referred
to as a shared disk) filesystem, such as OCFS2.

Thanks,

                Ilya
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io