Frank,

Sorry for the late reply.

On 24/03/2023 01:56, Frank Schilder wrote:
Hi Xiubo and Gregory,

sorry for the slow reply, I did some more debugging and didn't have much time. 
First, some questions about collecting logs, but please also see below for how 
you can reproduce the issue yourselves.

I can reproduce it reliably but need some input for these:

enabling the kclient debug logs and
How do I do that? I thought the kclient ignores the ceph.conf and I'm not aware of a 
mount option to this effect. Is there a "ceph config set ..." setting I can 
change for a specific client (by host name/IP) and how exactly?

$ echo "module ceph +p" > /sys/kernel/debug/dynamic_debug/control

This will enable the debug logs in the ceph kernel module. Then please provide the kernel message logs.
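
In case it helps, a minimal capture sequence could look like the following (just 
a sketch; it assumes debugfs is mounted at /sys/kernel/debug and the log file 
path is only an example):

# mount -t debugfs none /sys/kernel/debug     # only if debugfs is not mounted yet
# echo "module ceph +p" > /sys/kernel/debug/dynamic_debug/control
# dmesg -wT > /tmp/kclient-debug.log &        # capture kernel messages while reproducing
  ... reproduce the tar extraction on the NFS client ...
# echo "module ceph -p" > /sys/kernel/debug/dynamic_debug/control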


also the mds debug logs
I guess here I should set a higher log level for the MDS serving this directory 
(it is pinned to a single rank), or is it something else?

$ ceph daemon mds.X config set debug_mds 25
$ ceph daemon mds.X config set debug_ms 1
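
If you want to restore the previous levels afterwards, it is probably easiest to 
record the current values first, e.g.:

$ ceph daemon mds.X config get debug_mds     # note the value before raising it
$ ceph daemon mds.X config get debug_ms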


The issue seems to require a certain load to show up. I created a minimal tar 
file mimicking the problem, with 2 directories and a hard link from a file in 
the first to a new name in the second directory. This does not cause any 
problems, so it's not that easy to reproduce.
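
For reference, that minimal archive was built roughly along these lines (the 
names are made up):

$ mkdir -p repro/dir1 repro/dir2
$ echo test > repro/dir1/file
$ ln repro/dir1/file repro/dir2/file          # hard link under a second directory
$ tar -czf repro.tgz repro                    # tar stores the second entry as a hard link
$ tar -tvzf repro.tgz | grep 'link to'        # verify the hard-link entry is in the archive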

How you can reproduce it:

As an alternative to my limited skills at pulling logs out, I am making the 
tgz-archive available to you both. You will receive an e-mail from our 
OneDrive with a download link. If you un-tar the archive on an NFS client dir 
that's a re-export of a kclient mount, after some time you should see the 
errors showing up.

I can reliably reproduce these errors on our production cluster as well as on 
our test cluster. You should be able to reproduce it too with the tgz file.
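
In other words, a reproduction run looks roughly like this (IPs, paths and 
options are placeholders; the real setup is described further down in this 
thread):

# on the file server: kernel-mount the fs and re-export it via NFSv4
$ sudo mount -t ceph MON-IPs:/shares/folder /shares/nfs/folder -o name=NAME,secretfile=sec.file,mds_namespace=FS-NAME
$ sudo exportfs -o rw,async,no_root_squash,mountpoint,no_subtree_check DEST-IP:/shares/nfs/folder
# on the NFS client:
$ sudo mount -t nfs FILE-SERVER-IP:/shares/nfs/folder /mnt/folder
$ cd /mnt/folder && time tar -xzf /path/to/conda.tgz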

Here is a result on our set-up:

- production cluster (executed in a sub-dir conda to make cleanup easy):

$ time tar -xzf ../conda.tgz
tar: mambaforge/pkgs/libstdcxx-ng-9.3.0-h6de172a_18/lib/libstdc++.so.6.0.28: 
Cannot hard link to ‘envs/satwindspy/lib/libstdc++.so.6.0.28’: Read-only file 
system
[...]
tar: mambaforge/pkgs/boost-cpp-1.72.0-h9d3c048_4/lib/libboost_log.so.1.72.0: 
Cannot hard link to ‘envs/satwindspy/lib/libboost_log.so.1.72.0’: Read-only 
file system
^C

real    1m29.008s
user    0m0.612s
sys     0m6.870s

By this time there are already hard links created, so it doesn't fail right 
away:
$ find -type f -links +1
./mambaforge/pkgs/libev-4.33-h516909a_1/share/man/man3/ev.3
./mambaforge/pkgs/libev-4.33-h516909a_1/include/ev++.h
./mambaforge/pkgs/libev-4.33-h516909a_1/include/ev.h
...

- test cluster (octopus latest stable, 3 OSD hosts with 3 HDD OSDs each, simple 
ceph-fs):

# ceph fs status
fs - 2 clients
==
RANK  STATE     MDS        ACTIVITY     DNS    INOS
  0    active  tceph-02  Reqs:    0 /s  1807k  1739k
   POOL      TYPE     USED  AVAIL
fs-meta1  metadata  18.3G   156G
fs-meta2    data       0    156G
fs-data     data    1604G   312G
STANDBY MDS
   tceph-01
   tceph-03
MDS version: ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) 
octopus (stable)

It's the new recommended 3-pool layout, with fs-data being a 4+2 EC pool.
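
For anyone wanting to replicate the layout: such a layout can be created roughly 
like this (pool names as above; pg counts and the EC profile name are only 
examples):

$ ceph osd erasure-code-profile set ec42 k=4 m=2 crush-failure-domain=host
$ ceph osd pool create fs-meta1 32 32 replicated
$ ceph osd pool create fs-meta2 32 32 replicated
$ ceph osd pool create fs-data 128 128 erasure ec42
$ ceph osd pool set fs-data allow_ec_overwrites true
$ ceph fs new fs fs-meta1 fs-meta2            # fs-meta2 is the default (replicated) data pool
$ ceph fs add_data_pool fs fs-data
$ setfattr -n ceph.dir.layout.pool -v fs-data /mnt/fs   # send file data to the EC pool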

$ time tar -xzf / ... /conda.tgz
tar: mambaforge/ssl/cacert.pem: Cannot hard link to 
‘envs/satwindspy/ssl/cacert.pem’: Read-only file system
[...]
tar: mambaforge/lib/engines-1.1/padlock.so: Cannot hard link to 
‘envs/satwindspy/lib/engines-1.1/padlock.so’: Read-only file system
^C

real    6m23.522s
user    0m3.477s
sys     0m25.792s

Same story here, a large number of hard links has already been created before 
it starts failing:

$ find -type f -links +1
./mambaforge/lib/liblzo2.so.2.0.0
...

Looking at the output of find in both cases, the point at which it starts 
failing also looks a bit non-deterministic.

It would be great if you could reproduce the issue on a similar test setup 
using the archive conda.tgz. If not, I'm happy to collect any type of logs on 
our test cluster.

We now have one user who has problems with rsync to an NFS share, and it would 
be really appreciated if this could be sorted out.

The ceph QA teuthology test suite already has one similar test, which untars a 
kernel tarball, but I have never seen this issue there.

I will try this again tomorrow without the NFS client.

Thanks

- Xiubo


Thanks for your help and best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Xiubo Li <xiu...@redhat.com>
Sent: Thursday, March 23, 2023 2:41 AM
To: Frank Schilder; Gregory Farnum
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Re: ln: failed to create hard link 'file name': 
Read-only file system

Hi Frank,

Could you reproduce it again by enabling the kclient debug logs and also 
the mds debug logs?

I need to know what exactly happened on the kclient and MDS sides. 
Locally I couldn't reproduce it.

Thanks

- Xiubo

On 22/03/2023 23:27, Frank Schilder wrote:
Hi Gregory,

thanks for your reply. First, a quick update. Here is how I get ln to work 
after it has failed; there seems to be no timeout involved:

$ ln envs/satwindspy/include/ffi.h 
mambaforge/pkgs/libffi-3.3-h58526e2_2/include/ffi.h
ln: failed to create hard link 
'mambaforge/pkgs/libffi-3.3-h58526e2_2/include/ffi.h': Read-only file system
$ ls -l envs/satwindspy/include mambaforge/pkgs/libffi-3.3-h58526e2_2
envs/satwindspy/include:
total 7664
-rw-rw-r--.   1 rit rit    959 Mar  5  2021 ares_build.h
[...]
$ ln envs/satwindspy/include/ffi.h 
mambaforge/pkgs/libffi-3.3-h58526e2_2/include/ffi.h

After an ls -l on both directories ln works.

To the question: how can I pull a log out of the NFS server? There is nothing 
in /var/log/messages.
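
One way to get something out of knfsd (assuming the in-kernel NFS server from 
nfs-utils) might be to turn on its debug flags and watch the kernel log, e.g.:

# rpcdebug -m nfsd -s all      # enable all nfsd debug flags
# rpcdebug -m rpc -s all       # optionally also the sunrpc layer
# journalctl -kf               # or dmesg -wT; the debug output goes to the kernel log
# rpcdebug -m nfsd -c all      # clear the flags again afterwards
# rpcdebug -m rpc -c all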

I can't reproduce it with simple commands on the NFS client. It seems to occur 
only when a large number of files/dirs is created. I can make the archive 
available to you if this helps.

Best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Gregory Farnum <gfar...@redhat.com>
Sent: Wednesday, March 22, 2023 4:14 PM
To: Frank Schilder
Cc: ceph-users@ceph.io
Subject: Re: [ceph-users] Re: ln: failed to create hard link 'file name': 
Read-only file system

Do you have logs of what the NFS server is doing?
Have you managed to reproduce it in terms of direct CephFS ops?


On Wed, Mar 22, 2023 at 8:05 AM Frank Schilder <fr...@dtu.dk> wrote:
I have to correct myself. It also fails on an export with "sync" mode. Here is 
an strace on the client (strace ln envs/satwindspy/include/ffi.h 
mambaforge/pkgs/libffi-3.3-h58526e2_2/include/ffi.h):

[...]
stat("mambaforge/pkgs/libffi-3.3-h58526e2_2/include/ffi.h", 0x7ffdc5c32820) = 
-1 ENOENT (No such file or directory)
lstat("envs/satwindspy/include/ffi.h", {st_mode=S_IFREG|0664, st_size=13934, 
...}) = 0
linkat(AT_FDCWD, "envs/satwindspy/include/ffi.h", AT_FDCWD, 
"mambaforge/pkgs/libffi-3.3-h58526e2_2/include/ffi.h", 0) = -1 EROFS (Read-only file 
system)
[...]
write(2, "ln: ", 4ln: )                     = 4
write(2, "failed to create hard link 'mamb"..., 80failed to create hard link 
'mambaforge/pkgs/libffi-3.3-h58526e2_2/include/ffi.h') = 80
[...]
write(2, ": Read-only file system", 23: Read-only file system) = 23
write(2, "\n", 1
)                       = 1
lseek(0, 0, SEEK_CUR)                   = -1 ESPIPE (Illegal seek)
close(0)                                = 0
close(1)                                = 0
close(2)                                = 0
exit_group(1)                           = ?
+++ exited with 1 +++
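
As a side note, one way to check whether the EROFS comes from the NFS path or 
from the CephFS side might be to run the same link operation directly on the 
kclient mount on the file server while the NFS client is failing, e.g.:

$ cd /shares/nfs/folder          # the CephFS kernel mount on the file server
$ strace -f -e trace=linkat ln envs/satwindspy/include/ffi.h mambaforge/pkgs/libffi-3.3-h58526e2_2/include/ffi.h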

Does anyone have advice?

Thanks!
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Frank Schilder <fr...@dtu.dk>
Sent: Wednesday, March 22, 2023 2:44 PM
To: ceph-users@ceph.io
Subject: [ceph-users] ln: failed to create hard link 'file name': Read-only 
file system

Hi all,

on an NFS re-export of a ceph-fs mount (kernel client) I observe a very strange 
error. I'm un-tarring a larger package (1.2G) and after some time I get these 
errors:

ln: failed to create hard link 'file name': Read-only file system

The strange thing is that this seems to be only temporary. When I used "ln src dst" 
for manual testing, the command failed as above. However, after that I tried 
"ln -v src dst", and this command created the hard link with exactly the same path 
arguments. During the period when the error occurs, I can't see any FS in read-only 
mode, neither on the NFS client nor on the NFS server. The funny thing is that file 
creation and writes still work; it's only the hard-link creation that fails.

For details, the set-up is:

file-server: mount ceph-fs at /shares/path, export /shares/path as nfs4 to 
other server
other server: mount /shares/path as NFS

More precisely, on the file-server:

fstab: MON-IPs:/shares/folder /shares/nfs/folder ceph 
defaults,noshare,name=NAME,secretfile=sec.file,mds_namespace=FS-NAME,_netdev 0 0
exports: /shares/nfs/folder 
-no_root_squash,rw,async,mountpoint,no_subtree_check DEST-IP

On the host at DEST-IP:

fstab: FILE-SERVER-IP:/shares/nfs/folder /mnt/folder nfs defaults,_netdev 0 0

Both the file server and the client server are virtual machines. The file 
server is on CentOS 8 Stream (4.18.0-338.el8.x86_64) and the client machine is 
on AlmaLinux 8 (4.18.0-425.13.1.el8_7.x86_64).

When I change the NFS export from "async" to "sync" everything works. However, 
that's a rather bad workaround and not a solution. Although this looks like an 
NFS issue, I'm afraid it is a problem with hard links and ceph-fs. It looks 
like a race between scheduling and executing operations on the ceph-fs kernel 
mount.
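
For reference, the working export line only differs in the sync flag, i.e. 
something like:

/shares/nfs/folder -no_root_squash,rw,sync,mountpoint,no_subtree_check DEST-IP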

Has anyone seen something like that?

Thanks and best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14


--
Best Regards,

Xiubo Li (李秀波)

Email: xiu...@redhat.com/xiu...@ibm.com
Slack: @Xiubo Li
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
