[ceph-users] Re: what happens if a server crashes with cephfs?

2022-12-07 Thread Gregory Farnum
More generally, as Manuel noted, you can (and should!) make use of fsync et al. for data safety. At the application layer, Ceph's async operations are no different from data sent to a hard drive sitting in volatile caches until a consistency point like fsync is invoked. -Greg On
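To make that concrete, a minimal sketch of forcing a consistency point from the shell (the CephFS mount point and file name are placeholders):

    # write a file and force it to stable storage before dd exits
    dd if=/dev/zero of=/mnt/cephfs/testfile bs=4M count=256 conv=fsync

    # or flush an already-written file explicitly (coreutils sync accepts file arguments)
    sync /mnt/cephfs/testfile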

[ceph-users] Re: what happens if a server crashes with cephfs?

2022-12-07 Thread Dhairya Parmar
Hi Charles, There are many scenarios where a write/close operation can fail, but failures/errors are generally logged (normally every time) to help debug the case. So there are no silent failures as such, unless you hit a very rare bug. - Dhairya On Wed, Dec 7, 2022 at 11:38 PM C
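As a hedged sketch of where such errors typically surface (log locations and debug levels are illustrative, not from the original message):

    # kernel CephFS client: errors end up in the kernel log
    dmesg | grep -i ceph

    # ceph-fuse / libcephfs clients: raise client-side debug logging
    ceph config set client debug_client 20
    # then check the client logs, e.g. under /var/log/ceph/ on the client host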

[ceph-users] Re: rbd-mirror stops replaying journal on primary cluster

2022-12-07 Thread Josef Johansson
Hi, I've updated https://tracker.ceph.com/issues/57396 with some more info, it seems that disabling discard within a guest solves the problem (or switching from virtio-scsi-single to virtio-blk in older kernels). I'm testing two different VMs on the same hypervisor with identical configs, one work
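For context, a sketch of what "disabling discard within a guest" can look like on a typical Linux guest (unit name and mount options are the usual defaults, not taken from the tracker issue):

    # stop periodic trims issued by the guest
    systemctl disable --now fstrim.timer

    # remount without online discard (and drop "discard" from /etc/fstab)
    mount -o remount,nodiscard /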

[ceph-users] Re: Questions about r/w low performance on ceph pacific vs ceph luminous

2022-12-07 Thread Paul Mezzanini
I looked into nocache vs direct. It looks like nocache just requests that the cached pages be dropped before doing its operations, while direct uses direct I/O. Writes getting cached would make it appear much faster. Those tests are not apples-to-apples. I'm also trying to decode how you did your
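To make the distinction concrete, a sketch of the two dd variants being compared (paths and sizes are placeholders):

    # oflag=nocache: only hints that cached pages be dropped; writes still go
    # through the page cache, so results can look much faster
    dd if=/dev/zero of=/mnt/test/file bs=4M count=1024 oflag=nocache

    # oflag=direct: bypasses the page cache entirely (O_DIRECT)
    dd if=/dev/zero of=/mnt/test/file bs=4M count=1024 oflag=direct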

[ceph-users] Re: Anyone else having Problems with lots of dying Seagate Exos X18 18TB Drives ?

2022-12-07 Thread Paul Mezzanini
I started installing the SAS version of these drives two years ago in our cluster and I haven't had one fail yet. I've been working on replacing every spinner we have with them. I know it's not helping you figure out what is going on in your environment but hopefully a "the drive works for me"

[ceph-users] Re: Questions about r/w low performance on ceph pacific vs ceph luminous

2022-12-07 Thread Marc
> 2. Why, when using oflag=direct on both luminous and pacific, is there such a big > difference in I/O performance? Is it related to moving from ceph-disk to > ceph-volume? How do you know it is ceph and not the OS, kernel and/or virtualisation environment?
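One way to narrow that down is to benchmark the cluster directly with rados bench, bypassing the guest kernel and virtualisation stack (pool name and runtime are illustrative):

    # 30-second 4 MiB write test straight against a pool
    rados bench -p testpool 30 write --no-cleanup

    # matching sequential read test, then clean up the benchmark objects
    rados bench -p testpool 30 seq
    rados -p testpool cleanup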

[ceph-users] Re: octopus rbd cluster just stopped out of nowhere (>20k slow ops)

2022-12-07 Thread Anthony D'Atri
Especially on SSDs. > On Dec 7, 2022, at 14:16, Matthias Ferdinand wrote: > > The usefulness of %util is limited anyway.

[ceph-users] Re: what happens if a server crashes with cephfs?

2022-12-07 Thread Manuel Holtgrewe
POSIX fsync guarantees should hold with cephfs, I guess? Nathan Fish wrote on Wed., Dec. 7, 2022, 20:13: > OSDs journal writes and can ACK before writeback finishes. But the > journal is still stable storage. I'm not aware of anything important > that is ACK'd while only in RAM. > > On Wed, Dec

[ceph-users] Re: octopus rbd cluster just stopped out of nowhere (>20k slow ops)

2022-12-07 Thread Matthias Ferdinand
On Wed, Dec 07, 2022 at 11:13:49AM +0100, Boris Behrens wrote: > Hi Sven, > thanks for the input. > > So I did some testing and "maybe" optimization. > The same disk type in two different hosts (one Ubuntu and one Centos7) have > VERY different iostat %util values: I guess Centos7 has a rather ol

[ceph-users] Re: what happens if a server crashes with cephfs?

2022-12-07 Thread Nathan Fish
OSDs journal writes and can ACK before writeback finishes. But the journal is still stable storage. I'm not aware of anything important that is ACK'd while only in RAM. On Wed, Dec 7, 2022 at 1:08 PM Charles Hedrick wrote: > > I believe asynchronous operations are used for some operations in ceph

[ceph-users] what happens if a server crashes with cephfs?

2022-12-07 Thread Charles Hedrick
I believe asynchronous operations are used for some operations in cephfs. That means the server acknowledges before data has been written to stable storage. Does that mean there are failure scenarios where a write or close will return an error, or fail silently?

[ceph-users] Re: octopus rbd cluster just stopped out of nowhere (>20k slow ops)

2022-12-07 Thread Alex Gorbachev
Hi Boris, You have sysfs access in /sys/block/<device>/device - this will show a lot of settings. You can go to this directory on CentOS vs. Ubuntu, and see if any setting is different? -- Alex Gorbachev https://alextelescope.blogspot.com On Wed, Dec 7, 2022 at 5:14 AM Boris Behrens wrote: > Hi Sven
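A quick way to do that comparison is to dump the sysfs settings on both hosts and diff the results (device name is a placeholder):

    # run on each host, then bring the files together and diff them
    grep -r . /sys/block/sdX/queue/ 2>/dev/null | sort > /tmp/$(hostname)-sdX-queue.txt
    grep -r . /sys/block/sdX/device/ 2>/dev/null | sort > /tmp/$(hostname)-sdX-device.txt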

[ceph-users] Re: cephfs snap-mirror stalled

2022-12-07 Thread Holger Naundorf
On 06.12.22 14:17, Venky Shankar wrote: On Tue, Dec 6, 2022 at 6:34 PM Holger Naundorf wrote: On 06.12.22 09:54, Venky Shankar wrote: Hi Holger, On Tue, Dec 6, 2022 at 1:42 PM Holger Naundorf wrote: Hello, we have set up a snap-mirror for a directory on one of our clusters - running cep

[ceph-users] Questions about r/w low performance on ceph pacific vs ceph luminous

2022-12-07 Thread Shai Levi (Nokia)
Hey, We saw that rbd image IOPS on Ceph Pacific are much lower than rbd image IOPS on Ceph Luminous. We performed a simple dd write test with the following results: **the hardware and the OSD layout are the same in both environments. Writing to rbd image (via
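For reference, a sketch of the kind of dd test being described, run against a mapped rbd image (pool and image names are placeholders):

    # map the image and write to it with direct I/O
    rbd map testpool/testimage
    dd if=/dev/zero of=/dev/rbd/testpool/testimage bs=4M count=1024 oflag=direct
    rbd unmap testpool/testimage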

[ceph-users] Extending RadosGW HTTP Request Body With Additional Claim Values Present in OIDC token.

2022-12-07 Thread Ahmad Alkhansa
Hi, We are using RadosGW STS functionality to allow OIDC AuthN/Z of Ceph users. In addition, we have enabled Open Policy Agent (OPA) to manage AuthZ policies in a continuous integration environment. After performing Assume Role with Web Identity with RadosGW, the HTTP request body that is sen
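For reference, a sketch of the AssumeRoleWithWebIdentity call against RGW that triggers such a request (endpoint, role ARN and token are placeholders):

    aws sts assume-role-with-web-identity \
        --endpoint-url https://rgw.example.com:8000 \
        --role-arn "arn:aws:iam:::role/S3Access" \
        --role-session-name test-session \
        --web-identity-token "$OIDC_ACCESS_TOKEN"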

[ceph-users] Anyone else having Problems with lots of dying Seagate Exos X18 18TB Drives ?

2022-12-07 Thread Christoph Adomeit
Hi, I am using Seagate Exos X18 18TB drives in a Ceph archive cluster which is mainly write-once/read-sometimes. The drives are about 6 months old. I use them in a ceph cluster and also in a ZFS server. Different servers (all Supermicro) and different controllers, but all of type LSI SAS3008. I
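When drives start failing like that, SMART data and the kernel log are the usual first stop (device names are placeholders):

    # full SMART report, including reallocated/pending sector counters
    smartctl -a /dev/sdX

    # drives behind a SAS HBA may need to be addressed as SCSI generic devices
    smartctl -a -d scsi /dev/sgY

    # look for link resets and I/O errors
    dmesg | grep -iE 'sdX|i/o error|reset'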

[ceph-users] Re: Newer linux kernel cephfs clients is more trouble?

2022-12-07 Thread William Edwards
> On 7 Dec 2022 at 11:59, Stefan Kooman wrote the following: > > On 5/13/22 09:38, Xiubo Li wrote: >>> On 5/12/22 12:06 AM, Stefan Kooman wrote: >>> Hi List, >>> >>> We have quite a few linux kernel clients for CephFS. One of our customers >>> has been running mainline kernels (C

[ceph-users] Re: Newer linux kernel cephfs clients is more trouble?

2022-12-07 Thread Stefan Kooman
On 5/13/22 09:38, Xiubo Li wrote: On 5/12/22 12:06 AM, Stefan Kooman wrote: Hi List, We have quite a few linux kernel clients for CephFS. One of our customers has been running mainline kernels (CentOS 7 elrepo) for the past two years. They started out with 3.x kernels (default CentOS 7), bu

[ceph-users] Re: octopus rbd cluster just stopped out of nowhere (>20k slow ops)

2022-12-07 Thread Boris Behrens
Hi Sven, thanks for the input. So I did some testing and "maybe" optimization. The same disk type in two different hosts (one Ubuntu and one Centos7) shows VERY different iostat %util values:
Ubuntu:
Device  r/s  rkB/s  rrqm/s  %rrqm  r_await  rareq-sz  w/s  wkB/s  wrqm/s  %wrqm  w_
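For anyone reproducing the comparison, extended per-device statistics like the ones above come from something like the following (interval is arbitrary):

    # extended device statistics, 5-second samples
    iostat -x 5

    # also record the kernel version on each host, since the thread suspects it matters
    uname -r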

[ceph-users] add an existing rbd image to iscsi target

2022-12-07 Thread farhad kh
I have a cluster (v17.2.4) deployed with cephadm
---
[root@ceph-01 ~]# ceph -s
  cluster:
    id:     c61f6c8a-42a1-11ed-a5f1-000c29089b59
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum ceph-01.fns.com,ceph-03,ceph-02 (age 109m)
    mgr: ceph-01.fns.com.vdoxhd (active, since 1
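For reference, a rough sketch of the ceph-iscsi gwcli steps for exposing an RBD image through an iSCSI target (IQNs, pool and image names are placeholders, and whether "create" reuses an existing image of the same name should be verified for your ceph-iscsi version):

    gwcli
    /> cd /disks
    /disks> create pool=rbd image=disk_1 size=100G
    /> cd /iscsi-targets/iqn.2003-01.com.example.gw:iscsi-igw/hosts/iqn.1994-05.com.example:client1
    > disk add rbd/disk_1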