If an OSD becomes unavailable (broken disk, rebooting server) then all
I/O to the PGs stored on that OSD will block until a replication level
of 2 is reached again. So, for a highly available cluster you need a
replication level of 3.
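The size settings are per pool; a minimal sketch (assuming a pool
called "mypool" - the name is just an example):

ceph osd pool set mypool size 3
ceph osd pool set mypool min_size 2

With size 3 and min_size 2, a single failed OSD no longer blocks
writes.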
On Wed, 2021-02-03 at 10:24 +0100, Mario Giammarco wrote:
> Hello,
On Wed, 2021-02-03 at 09:39 +, Max Krasilnikov wrote:
> > If an OSD becomes unavailable (broken disk, rebooting server) then
> > all I/O to the PGs stored on that OSD will block until a
> > replication level of 2 is reached again. So, for a highly available
> > cluster you need a replication level of 3
Hi there,
we are in the process of growing our Nautilus ceph cluster. Currently,
we have 6 nodes: 3 nodes with 2×5.5TB and 6×11TB disks plus 8×186GB
SSDs, and 3 nodes with 6×5.5TB and 6×7.5TB disks, all with dual-link
10GE NICs.
The SSDs are used for the CephFS metadata pool, the hard drives are
used for
On Wed, 2021-03-17 at 08:26 +, Andrew Walker-Brown wrote:
> When setting a quota on a pool (or directory in Cephfs), is it the
> amount of client data written or the client data x number of replicas
> that counts toward the quota?
It's the amount of data stored, so it is independent of the replication
level.
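For a pool quota that is set with (pool name and size are just
examples):

ceph osd pool set-quota mypool max_bytes 100000000000

and for a CephFS directory the quota lives in the ceph.quota.max_bytes
xattr on the directory.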
we recently added 3 new nodes with 12×12TB OSDs. It took 3 days or so
to reshuffle the data and another 3 days to split the PGs. I did
increase the number of max backfills to speed up the process. We didn't
notice the reshuffling in normal operation.
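Raising the backfill limit was done via the config database; the value
below is just what we happened to use, not a recommendation:

ceph config set osd osd_max_backfills 4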
On Wed, 2021-03-24 at 19:32 +0100, Dan van der Ster wrote:
On Wed, 2021-04-14 at 08:55 +0200, Martin Palma wrote:
> Hello,
>
> what is the currently preferred method, in terms of stability and
> performance, for exporting a CephFS directory with Samba?
>
> - locally mount the CephFS directory and export it via Samba?
> - using the "vfs_ceph" module of Samba?
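For reference, the vfs_ceph variant looks roughly like this in
smb.conf; share name, path and the ceph user are placeholders, so
treat this as a sketch rather than a tested config:

[cephfs]
   path = /
   vfs objects = ceph
   ceph:config_file = /etc/ceph/ceph.conf
   ceph:user_id = samba
   read only = no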
I totally agree - we use a management system to manage all our Linux
machines. Adding containers just makes that a lot more complex,
especially since our management system does not support containers.
Regards
magnus
On Wed, 2021-06-02 at 10:36 +0100, Matthew Vernon wrote:
Hi all,
I know this came up before but I couldn't find a resolution.
We get the error
libceph: monX session lost, hunting for new mon
a lot on our samba servers that reexport cephfs. A lot means more than
once a minute. On other machines that are less busy we get it about
every 10-30 minutes. We on
We are using SL7 to export our cephfs via samba to windows. The
RHEL7/Centos7/SL7 distros do not come with packages for the samba
cephfs module. This is one of the reasons why we are mounting the file
system locally using the kernel cephfs module with the automounter and
reexporting it using vanilla Samba.
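The local mount is the stock kernel one; a minimal sketch (monitor
names, client name and secret file are placeholders):

mount -t ceph mon1,mon2,mon3:/ /cephfs -o name=samba,secretfile=/etc/ceph/samba.secret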
Hi all,
we have hit the problem where a directory tree containing over a
million entries was deleted on a snapshotted cephfs. The cluster
reports mostly healthy except for some slow MDS responses. However, the
filesystem became unusable. The MDS reports
ceph daemon mds.`hostname -s` perf dump | gr
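In case it helps others, the counters we watch after a big delete are
the stray counts (assuming the default MDS perf counters):

ceph daemon mds.`hostname -s` perf dump | grep strays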
Hi there,
further to my earlier email
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/46BETLK5CIHBLRLCP5ZW4IAWTY4POADL/
we tried to reduce the number of metadata servers from 2 to 1.
rank.1 is now sitting in the stopping state but nothing is happening.
We no longer have any c
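For context, the reduction itself was the standard command (the
filesystem name is a placeholder):

ceph fs set cephfs max_mds 1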
On Thu, 2021-10-14 at 14:25 +0200, Dan van der Ster wrote:
> Is that confirmed with a higher debug_mds setting on that "stuck"
> mds?
>
> You should try to understand what mds.1 is doing, via debug_mds=10 or
> so.
>
> If it really looks idle, then it might be worth restarting mds.1's
> da
Hi all,
we seem to have recovered from our cephfs misadventure. Having said
that I would like to better understand what went wrong and if/how we
can avoid that in future.
We have a nautilus ceph cluster that provides cephfs to our school. We
keep nightly snapshots for one week.
One user has a parti
Hi all,
during our recent cephfs misadventure we have been staring a lot at the
output from
ceph fs status
and we were wondering what the numbers under the dns and inos headings
mean?
Cheers
magnus
Hi there,
on our pacific (16.2.9) cluster one of the OSD daemons has died and
fails to restart. The OSD exposes an NVMe drive and is one of 4
identical machines. We are using podman to orchestrate the ceph
daemons. The underlying OS is managed. The system worked fine without
any issues until recentl
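If anyone wants to dig in with us, the first place we looked was the
container logs (assuming a cephadm-style deployment; the OSD id is a
placeholder):

cephadm logs --name osd.12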
We have increased the cache on our MDS, which makes this issue mostly
go away. It is due to an interaction between the MDS and the Ganesha
NFS server, which keeps its own cache. I believe newer versions of
Ganesha can deal with it.
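The cache bump was just the standard setting; the value shown is an
example, not a recommendation:

ceph config set mds mds_cache_memory_limit 17179869184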
Sent from Android device
On 20 Oct 2021 09:37, Marc wrote:
This
Hi Artur,
we did write a script (in fact a series of scripts) that we use to
manage our users and their quotas. Our script adds a new user to our
LDAP and sets the default quotas for various storage areas. Quota
information is kept in the LDAP. Another script periodically scans the
LDAP for changes
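The quota part of the script boils down to setting the CephFS xattrs
on each user directory; a minimal sketch (the path and limits are made
up):

setfattr -n ceph.quota.max_bytes -v 500000000000 /cephfs/home/someuser
setfattr -n ceph.quota.max_files -v 1000000 /cephfs/home/someuser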
Hi Mathias,
I have noticed in the past that moving directories within the same
mount point can take a very long time using the system mv command. I use a
python script to archive old user directories by moving them to a
different part of the filesystem which is not exposed to the users. I
use the re
Hi all,
it seems to be the time of stuck MDSs. We also have our ceph filesystem
degraded. The MDS is stuck in replay for about 20 hours now.
We run a nautilus ceph cluster with about 300TB of data and many
millions of files. We run two MDSs with a particularly large directory
pinned to one of them
at this stage we are not so worried about recovery since we moved to
our new pacific cluster. The problem arose during one of the nightly
syncs of the old cluster to the new cluster. However, we are quite keen
to use this as a learning opportunity to see what we can do to bring
this filesystem back
On Sat, 2022-06-04 at 14:36 -0400, Ramana Venkatesh Raja wrote:
> If that's not helpful, then try setting `ceph config set mds
> debug_objecter 10`, restart the MDS, and check the objecter related
> logs in the MDS?
This didn't reveal anything useful - I just got the occasional tick. I
restar
Hi there,
we currently have a ceph cluster with 6 nodes and a public and cluster
network. Each node has two bonded 2×1GE network interfaces, one for the
public and one for the cluster network. We are planning to upgrade the
networking to 10GE. Given the modest size of our cluster we would like
to s
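For reference, the two networks are just the standard ceph.conf pair;
the subnets below are placeholders:

[global]
public network = 10.0.1.0/24
cluster network = 10.0.2.0/24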
Hi there,
we reconfigured our ceph cluster yesterday to remove the cluster
network and things didn't quite go to plan. I am trying to figure out
what went wrong and also what to do next.
We are running nautilus 14.2.10 on Scientific Linux 7.8.
So, we are using a mixture of RBDs and cephfs. For th
Hi Patrick,
thanks for the reply
On Fri, 2020-09-04 at 10:25 -0700, Patrick Donnelly wrote:
> > We then started using the cephfs (we keep VM images on the
> > cephfs). The MDS were showing an error. I restarted the MDS but
> > they didn't come back. We then followed the instructions here:
> > h
On Sat, 2020-09-05 at 08:10 +, Magnus HAGDORN wrote:
> > I don't have any recent data on how long it could take but you
> > might try using at least 8 workers.
>
> We are using 4 workers and the first stage hasn't completed yet. Is
> it
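Assuming the workers here are the cephfs-data-scan recovery workers
(as in the CephFS disaster-recovery docs), the first stage is run in
parallel like this; the data pool name is a placeholder and each
worker gets its own shell:

cephfs-data-scan scan_extents --worker_n 0 --worker_m 8 cephfs_data
cephfs-data-scan scan_extents --worker_n 1 --worker_m 8 cephfs_data
# ... up to --worker_n 7, one invocation per worker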