On 01.05.19 at 00:51, Patrick Donnelly wrote:
> On Tue, Apr 30, 2019 at 8:01 AM Oliver Freyermuth
> <freyerm...@physik.uni-bonn.de> wrote:
>>
>> Dear Cephalopodians,
>>
>> we have a classic libvirtd / KVM based virtualization cluster using Ceph-RBD 
>> (librbd) as the backend, sharing the libvirtd configuration between the nodes 
>> via CephFS (all on Mimic).
>>
>> To share the libvirtd configuration between the nodes, we have symlinked 
>> some folders from /etc/libvirt to their counterparts on /cephfs,
>> so all nodes see the same configuration.
>> In general, this works very well (of course, there's a "gotcha": Libvirtd 
>> needs a reload / restart for some changes to the XMLs, which we have 
>> automated),
>> but there is one issue caused by Yum's cleverness (that's on CentOS 7). 
>> Whenever there's a libvirtd update, unattended upgrades fail, and we see:
>>
>>    Transaction check error:
>>      installing package libvirt-daemon-driver-network-4.5.0-10.el7_6.7.x86_64 needs 2 inodes on the /cephfs filesystem
>>      installing package libvirt-daemon-config-nwfilter-4.5.0-10.el7_6.7.x86_64 needs 18 inodes on the /cephfs filesystem
>>
>> So it seems yum follows the symlinks and checks the available inodes on 
>> /cephfs. Sadly, that reveals:
>>    [root@kvm001 libvirt]# LANG=C df -i /cephfs/
>>    Filesystem     Inodes IUsed IFree IUse% Mounted on
>>    ceph-fuse          68    68     0  100% /cephfs
>>
>> I think that's just because there is no real "limit" on the maximum number 
>> of inodes on CephFS. However, returning 0 breaks some existing tools 
>> (notably, Yum).
>>
>> What do you think? Should CephFS return something other than 0 here so as 
>> not to break existing tools?
>> Or should the tools behave differently? One might also argue that if the 
>> total number of inodes matches the used number of inodes, the FS is indeed 
>> "full".
>> It's just unclear to me whom to file a bug against ;-).
>>
>> Right now, I am just using:
>> yum -y --setopt=diskspacecheck=0 update
>> as a manual workaround, but this is naturally rather cumbersome.
> 
> This is fallout from [1]. See discussion on setting f_free to 0 here
> [2]. In summary, userland tools are trying to be too clever by looking
> at f_free. [I could be convinced to go back to f_free = ULONG_MAX if
> there are other instances of this.]
> 
> [1] https://github.com/ceph/ceph/pull/23323
> [2] https://github.com/ceph/ceph/pull/23323#issuecomment-409249911

Thanks for the references! That certainly enlightens me as to why this decision 
was taken, and of course I applaud the attempt to prevent false monitoring. 
Still, even though I don't have other instances at hand (yet), I am not yet 
convinced "0" is a better choice than "ULONG_MAX". 
It certainly alerts users / monitoring software that they are doing something 
wrong, but it prevents a check which any file system (or rather, any file 
system I have encountered so far) allows. 
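
For illustration, here is a minimal C sketch (my own, not Yum's actual code) of 
the statvfs()-based view that df -i and, presumably, Yum's disk space check get 
of the mount point; with f_ffree = 0, IUse% always comes out as 100%:

    #include <stdio.h>
    #include <sys/statvfs.h>

    /* Print the inode view of a mount point the way df -i computes it:
     * IUsed = f_files - f_ffree. With CephFS reporting f_ffree = 0,
     * this yields IUse% = 100% no matter how many inodes the MDS
     * could actually still hand out. */
    int main(void)
    {
        struct statvfs vfs;
        const char *path = "/cephfs";  /* our mount point from above */

        if (statvfs(path, &vfs) != 0) {
            perror("statvfs");
            return 1;
        }

        unsigned long used = vfs.f_files - vfs.f_ffree;
        printf("Inodes=%lu IUsed=%lu IFree=%lu IUse%%=%.0f%%\n",
               (unsigned long)vfs.f_files, used, (unsigned long)vfs.f_ffree,
               vfs.f_files ? 100.0 * used / vfs.f_files : 0.0);
        return 0;
    }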

Yum (and other package managers doing things in a safe manner) needs to ensure 
it can fully install a package in an "atomic" way before doing so, since 
rolling back may be complex or even impossible (on most file systems). 
So it needs a way to check whether a file system can store the additional files, 
in terms of both space and inodes, before placing the data there, or it risks 
installing something only partially and potentially being unable to roll back. 
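
Conceptually, such a pre-flight check might look like the following C sketch 
(the function name and parameters are made up by me; this is not Yum's actual 
implementation):

    #include <sys/statvfs.h>

    /* Hypothetical pre-flight check: can this file system take another
     * need_bytes bytes and need_inodes inodes? */
    int can_install(const char *path, unsigned long long need_bytes,
                    unsigned long need_inodes)
    {
        struct statvfs vfs;

        if (statvfs(path, &vfs) != 0)
            return 0;  /* be conservative on error */

        unsigned long long free_bytes =
            (unsigned long long)vfs.f_bavail * vfs.f_frsize;

        /* With CephFS reporting f_favail = 0, any need_inodes > 0 fails
         * here - which is exactly the "needs N inodes" error above. */
        return free_bytes >= need_bytes && vfs.f_favail >= need_inodes;
    }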

In most cases, the number of free inodes allows for that check. Of course, that 
number has no (direct) meaning for CephFS, so one might argue the tools should 
add an exception for CephFS - 
but as the discussion correctly stated, there's no defined way to find out 
whether a file system even has a notion of "free inodes", and - if we go for 
exceptional treatment of a list of file systems - 
not even a "clean" way to find out that the file system is CephFS (the tools 
will only see it is FUSE for ceph-fuse) [1]. 
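
To illustrate the detection problem: the closest thing to a portable type check 
is statfs() and its numeric f_type, and for a ceph-fuse mount that only reveals 
FUSE. A minimal sketch (magic numbers as I find them in linux/magic.h):

    #include <stdio.h>
    #include <sys/vfs.h>

    #define FUSE_SUPER_MAGIC 0x65735546  /* any FUSE fs, incl. ceph-fuse */
    #define CEPH_SUPER_MAGIC 0x00c36400  /* kernel CephFS client only */

    int main(void)
    {
        struct statfs fs;

        if (statfs("/cephfs", &fs) != 0) {
            perror("statfs");
            return 1;
        }

        if (fs.f_type == CEPH_SUPER_MAGIC)
            printf("kernel CephFS mount\n");
        else if (fs.f_type == FUSE_SUPER_MAGIC)
            printf("some FUSE fs - ceph-fuse or anything else\n");
        else
            printf("f_type 0x%lx\n", (unsigned long)fs.f_type);
        return 0;
    }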

So my question is: 
How can tools which need to ensure that a file system can accept a given number 
of bytes and inodes before actually placing the data there perform that check 
in the case of CephFS? 
And if they should not, how do they find out that this check, which is valid on 
e.g. ext4, is not useful on CephFS? 
(Or, in other words: if I were to file a bug report against Yum, I could not 
think of any implementation they could adopt to solve this issue.)

Of course, if it's just us, we can live with the workaround. We monitor space 
consumption on all file systems, and may start monitoring free inodes on our 
ext4 file systems, 
so that we can safely disable the Yum check on the affected nodes. 
But I wonder whether this is the best way to go (it prevents a valid use case 
of a package manager, and there seems to be no clean way to fix it inside Yum 
that I am aware of). 
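
For the record, on the affected nodes the workaround can at least be made 
persistent instead of being passed on every invocation - if I read yum.conf(5) 
correctly, the same diskspacecheck option can go into /etc/yum.conf:

    [main]
    diskspacecheck=0

That of course disables the check for all transactions on those nodes, not only 
for the ones touching /cephfs.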

Hence, my personal preference would be ULONG_MAX, but of course feel free to 
stay with 0. If nobody else complains, it's probably a non-issue for other 
users ;-). 

Cheers,
        Oliver

[1] https://github.com/ceph/ceph/pull/23323#issuecomment-409249911
