We're happy to announce the availability of the third Octopus stable
release series. This release is mainly a workaround for a potential OSD
corruption in v15.2.2. We advise users to upgrade to v15.2.3 directly.
For users running v15.2.2, please execute the following:
ceph config set osd blue
Jarett;
It is and it isn't. Replication can be thought of as continuous backups.
Backups, especially as SpiderFox is suggesting, are point-in-time, immutable
copies of data. Until they are written over, they don't change, even if the
data does.
In Ceph's RadosGW (RGW) multi-site replication
For some reason I’d thought replication between clusters was an “official”
method of backing up.
> On May 29, 2020, at 4:31 PM,
> wrote:
>
> Ludek;
>
> As a cluster system, Ceph isn't really intended to be backed up. It's
> designed to take quite a beating, and preserve your data.
>
> Fro
SpiderFox;
If you're concerned about ransomware (and you should be), then you should:
a) protect the cluster from the internet AND from USERS.
b) place another technology between your cluster and your users (I use
Nextcloud backed by RadosGW through S3 buckets)
c) turn on versioning in your bucke
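For point (c), a minimal sketch of enabling bucket versioning through RGW's S3 API, assuming a hypothetical bucket "mybucket", a hypothetical endpoint http://rgw.example.com, and the bucket owner's keys configured for the aws CLI:
# enable versioning on an existing bucket (names/endpoint are placeholders)
aws --endpoint-url http://rgw.example.com s3api put-bucket-versioning \
    --bucket mybucket --versioning-configuration Status=Enabled
# verify
aws --endpoint-url http://rgw.example.com s3api get-bucket-versioning --bucket mybucket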
Hi all,
I am attempting to prevent bluestore rocksdb Level 3/4 spillover with
a 150GB logical volume for the db/wal.
I am thinking of setting max_bytes_for_level_base to about 1.3G
(1342177280). This should let Level 3 fill up the 150GB logical
volume. I don't expect to ever actually need L4.
An
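A sketch of how such a value is usually passed, via bluestore_rocksdb_options in ceph.conf; the numbers simply mirror the post above and are untested assumptions, and note that setting this option replaces Ceph's entire default RocksDB option string, so the remaining defaults need to be carried over:
[osd]
# only the level base/multiplier reflect the intent above;
# append the rest of the stock bluestore_rocksdb_options defaults here
bluestore_rocksdb_options = max_bytes_for_level_base=1342177280,max_bytes_for_level_multiplier=10,compression=kNoCompression
# with a 10x multiplier: L1 ~ 1.34 GB, L2 ~ 13.4 GB, L3 ~ 134 GB,
# so L1+L2+L3 ~ 149 GB, which just fits inside the 150 GB db/wal LV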
Willi;
ZFS on RBD seems like a waste, and overkill. A redundant storage solution on
top of a redundant storage solution?
You can have multiple file systems within CephFS, the thing to note is that
each CephFS MUST have a SEPARATE active MDS.
For failover, each should have a secondary MDS, and
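A minimal sketch of adding a second filesystem, with hypothetical pool and fs names; each extra filesystem needs its own active MDS (plus ideally a standby):
ceph osd pool create fs2_metadata
ceph osd pool create fs2_data
# some releases require multiple filesystems to be allowed explicitly:
# ceph fs flag set enable_multiple true --yes-i-really-mean-it
ceph fs new fs2 fs2_metadata fs2_data
ceph fs status          # shows which MDS is active/standby for each filesystem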
On Fri, 29 May 2020 at 23:32, wrote:
> Ludek;
>
> As a cluster system, Ceph isn't really intended to be backed up. It's
> designed to take quite a beating, and preserve your data.
>
>
But that does not save me when a crypto trojan encrypts all my data. There
should always be an offline bac
Ludek;
As a cluster system, Ceph isn't really intended to be backed up. It's designed
to take quite a beating, and preserve your data.
From a broader disaster recovery perspective, here's how I architected my
clusters:
Our primary cluster is laid out in such a way that an entire rack can fail
BR;
I've built my own iSCSI targets (using Fedora and CentOS), and use them in
production. I've also built 2 different Ceph clusters.
They are completely different. Set aside everything you know about iSCSI, it
doesn't apply.
Ceph is a clustered object store, it can dynamically expand (nearl
Hi, the pg balancer is not working at all, and if I call status the plugin
does not respond; it just hangs forever. Restarting the mgr doesn't help. I
have a PG distribution issue now, how do I fix this? v14.2.5
kind regards
I have a 3-host ceph storage setup with 10 4TB HDDs per host. I defined a 3
replica rbd pool and some images and presented them to a VMware host via iSCSI,
but the write performance is so bad that I managed to freeze a VM doing a big
rsync to a datastore inside ceph and had to reboot its host (seem
Hello List,
first of all: yes, I made mistakes. Now I am trying to recover :-/
I had a healthy 3-node cluster which I wanted to convert to a single one.
My goal was to reinstall a fresh 3-node cluster and start with 2 nodes.
I was able to turn it, still healthy, from a 3-node cluster to a 2-node cluste
Hi all,
what is the best approach for OSD backups and recovery?
We use only Radosgw with the S3 API and I need to back up the content of S3 buckets.
Currently I sync the S3 buckets to a local filesystem and back up the content using
Amanda.
I believe that there must be a better way to do this but I couldn't fi
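One commonly used alternative is to pull the buckets with an S3-aware sync tool instead of a plain filesystem copy; a sketch, assuming an rclone remote named "rgw" configured with the RGW endpoint and keys (names are placeholders):
# incremental, checksum-verified copy of one bucket to a staging area,
# which Amanda (or any other backup tool) can then pick up
rclone sync rgw:my-bucket /backup/s3/my-bucket --checksum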
OK, where to start. I have been debugging intensively the last two days,
but can't seem to wrap my head around the performance issues we see in
one of our two hyperconverged (ceph) proxmox clusters.
Let me introduce our two clusters and some of the debugging results.
*1. Cluster for internal p
We have been using a CephFS pool to store machine data; the data is not
overly critical at this time.
It has grown to around 8TB and we started to see kernel panics on the hosts that
had the mounts in place.
Now when we try to start the MDSs they cycle through Active, Replay,
ClientReplay
I’m pretty sure I’ve seen that happen with QFX5100 switches and
net.core.netdev_max_backlog=25
net.ipv4.tcp_max_syn_backlog=10
net.ipv4.tcp_max_tw_buckets=200
> On May 29, 2020, at 10:53 AM, Dave Hall wrote:
>
> I agree with Paul 100%. Going further - there are many more 'knobs
Hello Everyone,
I'm still having issues getting the OSDs to properly create on a brand new
Ceph 15.2.2 cluster. I don't seem to be able to have OSDs created based on a
service definition of 2 osds per disk and encryption. It seems to hang
and/or I see "No Deployments..."
Has anyone had luck with t
I agree with Paul 100%. Going further - there are many more 'knobs to
turn' than just Jumbo Frames, which makes the problem even harder.
Changing any one setting may just move the bottleneck, or possibly
introduce instabilities. In the worst case, one might tune their Linux
system so well th
Phil, this would be an excellent contribution to the blog or the introductory
documentation. I've been using Ceph for over a year and this brought together a lot
of concepts that I hadn’t related so succinctly in the past.
One of the things that I hadn’t really conceptualized well was “why size of
What are the specs of your nodes? And what specific harddisks are you using?
On Fri, May 29, 2020, 18:41 Salsa wrote:
> I have a 3 hosts, 10 4TB HDDs per host ceph storage set up. I deined a 3
> replica rbd pool and some images and presented them to a Vmware host via
> ISCSI, but the write perfo
Dear Igor,
thanks a lot for the analysis and recommendations.
> Here is a brief analysis:
> 1) Your DB is pretty large - 27GB at DB device (making it full) and
> 279GB at main spinning one. I.e. RocksDB is experiencing huge
> spillover to slow main device - expect performance drop. And generall
Doesn’t SMB support require a paid subscription?
> On Feb 13, 2020, at 3:12 AM, Martin Verges wrote:
>
> Hello Lars,
>
> we have full SMB Support in our Ceph management solution. You can create
> simple (user+pass) or complex SMB (AD) high available shares on CTDB
> clustered Samba with ease.
On Fri, May 29, 2020 at 12:09 PM Miguel Castillo
wrote:
> Happy New Year Ceph Community!
>
> I'm in the process of figuring out RBD mirroring with Ceph and having a
> really tough time with it. I'm trying to set up just one way mirroring
> right now on some test systems (baremetal servers, all De
I have a 3 node ceph cluster for my house that I have been using for a few
years now without issue. Each node is a MON, MGR, and MDS, and has 2-3 OSDs
on them. It has, however, been slow. I decided to finally move the bluestore
DBs to SSDs. I did one OSD as a test case to make sure everything was go
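For reference, a sketch of the usual migration path on recent releases (osd.0 and the target LV are placeholders; stop the OSD first and make sure you have backups):
systemctl stop ceph-osd@0
ceph-bluestore-tool bluefs-bdev-new-db \
    --path /var/lib/ceph/osd/ceph-0 \
    --dev-target /dev/ceph-db-0/db-0      # pre-created LV on the SSD
systemctl start ceph-osd@0
# newer releases also offer "ceph-volume lvm migrate" for the same job,
# which additionally updates the LVM tags for you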
Phil;
I like to refer to basic principles, and design assumptions / choices when
considering things like this. I also like to refer to more broadly understood
technologies. Finally, I'm still relatively new to Ceph, so here it goes...
TLDR: Ceph is (likes to be) double-redundant (like RAID-6)
Yeah, it's been released in Octopus. But this is rather about making better
use of DB space, not adjusting rocksdb level sizes.
See https://github.com/ceph/ceph/pull/29687
Thanks,
Igor
On 5/28/2020 7:18 PM, Frank R wrote:
If I remember correctly, being able to configure the rocksdb level
sizes was
Hello,
Unfortunately, due to the COVID-19 pandemic, the Ceph Foundation is
looking into running future Ceph Days virtually. We have created a
survey to gather feedback from the community on how these events should run.
You can access the survey here: https://survey.zohopublic.com/zs/jsCsIn
W
Simon, Harald
thanks for the information. Got your log offline too.
Here is a brief analysis:
1) Your DB is pretty large - 27GB at DB device (making it full) and
279GB at main spinning one. I.e. RocksDB is experiencing huge spillover
to slow main device - expect performance drop. And general
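A quick, hedged way to see per-OSD DB vs. slow-device usage (run on the OSD host; osd.N is a placeholder):
ceph daemon osd.N perf dump bluefs | egrep 'db_(total|used)_bytes|slow_(total|used)_bytes'
# recent releases also surface this as a BLUEFS_SPILLOVER warning in "ceph health detail"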
Hello,
What I can see here: http://download.ceph.com/rpm-octopus/el8/ is that the
first Ceph release available on CentOS 8 is Octopus, and it is already
accessible.
Thanks, Eric.
On Fri, May 29, 2020 at 5:44 PM Guillaume Abrioux
wrote:
> Hi Jan,
>
> I might be wrong but I don't think download.ceph.
On Fri, May 29, 2020 at 5:43 PM 李亚锋 wrote:
>
> hi:
>
> I deployed ceph cluster with rdma, it's version is "15.0.0-7282-g05d685d
> (05d685dd37b34f2a015e77124c537f3f8e663152) octopus (dev)".
>
> the cluster status is ok as follows:
>
> [root@node83 lyf]# ceph -s
> cluster:
> id: cd389d63
On Fri, May 29, 2020 at 11:38 AM Palanisamy wrote:
> Hello Team,
>
> Can I get any update on this request.
>
The Ceph team is not really involved in the out-of-tree rbd-provisioner.
Both the in-tree and this out-of-tree RBD provisioner are deprecated in favor
of the ceph-csi [1][2] RBD provisioner. The c
Hi all,
I have an issue on my Ceph cluster.
For one of my pools I have 107TiB STORED and 298TiB USED.
This is strange, since I've configured erasure coding (6 data chunks, 3
coding chunks).
So, in an ideal world this should result in approx. 160.5TiB USED.
The question now is why this is the case
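A rough sanity check of the expectation above, plus commands to confirm what the pool really uses (pool and profile names are placeholders):
# expected raw usage for k=6, m=3: USED ~ STORED * (k+m)/k = 107 TiB * 9/6 ~ 160.5 TiB
ceph osd pool get <pool-name> erasure_code_profile
ceph osd erasure-code-profile get <profile-name>
ceph df detail          # compare STORED vs USED per pool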
-- Forwarded message -
From: David Seith
Date: Wed, 12 Feb 2020 at 11:24
Subject: Finding erasure-code-profile of crush rule
To:
Dear all,
On our ceph cluster we have created multiple erasure coding profiles and
then created a number of crush rules for these profiles using:
cep
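A hedged sketch of mapping pools, crush rules and EC profiles back to each other (names are examples); the pool listing shows both the rule and the profile, which ties them together:
ceph osd pool ls detail                  # shows crush_rule and erasure profile per pool
ceph osd crush rule dump <rule-name>
ceph osd erasure-code-profile ls
ceph osd erasure-code-profile get <profile-name>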
Upgraded to 14.2.7, doesn't appear to have affected the behavior. As requested:
~$ ceph tell mds.mds1 heap stats
2020-02-10 16:52:44.313 7fbda2cae700 0 client.59208005
ms_handle_reset on v2:x.x.x.x:6800/3372494505
2020-02-10 16:52:44.337 7fbda3cb0700 0 client.59249562
ms_handle_reset on v2:x.x.x
Hi,
I'd like to fix the crush tree and crush rule, and I would like to know the
correct steps and, as a worst-case scenario, what can happen during the
maintenance.
Steps should be like:
1. Create the rack structured crush tree under root default
2. create the replicated crush rules
3. Move the nodes under
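A hedged sketch of steps 1-3 with made-up rack and host names; expect data movement, and consider setting norebalance while shuffling buckets:
ceph osd set norebalance
ceph osd crush add-bucket rack1 rack
ceph osd crush move rack1 root=default
ceph osd crush move node1 rack=rack1
ceph osd crush rule create-replicated replicated_racks default rack
# then per pool: ceph osd pool set <pool> crush_rule replicated_racks
ceph osd unset norebalance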
Dear all
Running nautilus 14.2.7. The data in the FS are important and cannot be
lost.
Today I increased the PGS of the volume pool from 8k to 16k. The active
mds started reporting slow ops. (The filesystem is not in the volume
pool.) After a few hours the FS was very slow, so I reduced the backf
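For reference, the usual knobs for throttling recovery/backfill while the cluster stays usable (values are illustrative assumptions, not tuned advice):
ceph config set osd osd_max_backfills 1
ceph config set osd osd_recovery_max_active 1
ceph config set osd osd_recovery_sleep_hdd 0.2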
Hello,
I'm a beginner on ceph. I set up some ceph clusters in google cloud.
Cluster1 has three nodes and each node has three disks. Cluster2 has three
nodes and each node has two disks. Cluster3 has five nodes and each node
has five disks. Disk speed shown by `dd if=/dev/zero of=here bs=1G count=1
Hello All,
I have a HW RAID based 240 TB data pool with about 200 million files for
users in a scientific institution. Data sizes range from tiny parameter
files for scientific calculations and experiments to huge images of
brain scans. There are group directories, home directories, Windows
r
Hello all,
I have a problem with my CephFS that I'm stumped on. I recently had to rebuild
a node whose system disk failed. Once I did that, I re-created the osd
directory structure in /var/lib/ceph/osd and the osds came back into the
cluster, then had to backfill. However, I now have the pro
Hi all,
I've been trying out RHCS 3.3 using Bluestore for a few months now, however I
came across the challenge below...
Can someone please assist with a guide on how to map and replace failed
bluestore OSD disks in a containerized Red Hat Ceph environment?
Thanks in advance.
Regards
John.
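A generic, non-containerized outline of the replacement, assuming osd.12 is the dead OSD and /dev/sdX the replacement disk; in a containerized RHCS deployment the same ceph-volume steps are driven through the ceph-ansible playbooks and OSD container rather than run directly on the host:
ceph osd out 12
systemctl stop ceph-osd@12                  # or stop the OSD container
ceph osd purge 12 --yes-i-really-mean-it    # removes it from crush, auth and the osdmap
ceph-volume lvm zap /dev/sdX --destroy
ceph-volume lvm create --bluestore --data /dev/sdX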
Vry
Hi Igor,
On 23. 01. 2020 at 15:37, Igor Fedotov wrote:
>
> Martin,
>
> suggest a couple more checks:
>
> 1) Try different value(s) for memory target. Including one that is
> equal to default 4Gb
>
I've tried many random numbers and even the original value acquired by:
# ceph daemon osd.0 config get
Hi Igor,
unfortunately same result:
# ceph config dump
WHO   MASK   LEVEL   OPTION              VALUE        RO
osd          basic   osd_memory_target   2147483648
# /usr/bin/ceph-osd -d --cluster ceph --id 0 --setuser ceph --setgroup ceph
0> 2020-01-23 10:48:04.436 7fc61b5b5c80 -1 *** Caught sig
Hey all,
We’ve been running some benchmarks against Ceph which we deployed using the
Rook operator in Kubernetes. Everything seemed to scale linearly until a point
where I see a single OSD receiving much higher CPU load than the other OSDs
(nearly 100% saturation). After some investigation we no
Hello everyone
With ms_type = async+dpdk, can Ceph work in the master version? I have a
problem using dpdk in the master version: the mon cannot be started. Please
help me, thank you very much!
1. My network card configuration information:
[root@ebs12 dpdk]# python usertools/dpdk-devbind.py --s
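For comparison, a hedged example of the messenger options such a setup typically needs in ceph.conf (all values are placeholders for this hardware, not a tested configuration):
[global]
ms_type = async+dpdk
ms_dpdk_coremask = 0x3
ms_dpdk_host_ipv4_addr = 192.168.0.10
ms_dpdk_gateway_ipv4_addr = 192.168.0.1
ms_dpdk_netmask_ipv4_addr = 255.255.255.0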
Has anyone ever tried using this feature? I've added it to the [global]
section of the ceph.conf on my POC cluster but I'm not sure how to tell if
it's actually working. I did find a reference to this feature via Google and
they had it in their [OSD] section?? I've tried that too..
TIA
Adam
oops.
I posted this to the "Old" list, but supposedly this is the new list and the
better place to ask questions?
A Google search didn't seem to find the answer on this, so I thought I'd ask here:
what determines if an rbd is "100% busy"?
I have some backend OSDs, and an iSCSI gateway, serving out
Hi all,
I am new to Ceph. But I have a some good understanding of iSCSI protocol. I
will dive into Ceph because it looks promising. I am particularly
interested in Ceph-RBD. I have a request. Can you please tell me, if any,
what are the common similarities between iSCSI and Ceph. If someone has to
Happy New Year Ceph Community!
I'm in the process of figuring out RBD mirroring with Ceph and having a really
tough time with it. I'm trying to set up just one way mirroring right now on
some test systems (baremetal servers, all Debian 9). The first cluster is 3
nodes, and the 2nd cluster is 2
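For reference, a condensed sketch of one-way, pool-mode mirroring with placeholder cluster names ("site-a" primary, "site-b" backup running the rbd-mirror daemon); journal-based mirroring also needs the journaling feature enabled on the images:
# on both clusters
rbd mirror pool enable rbd pool --cluster site-a
rbd mirror pool enable rbd pool --cluster site-b
# on the backup cluster only (one-way): register the peer and run the daemon
rbd mirror pool peer add rbd client.rbd-mirror-peer@site-a --cluster site-b
systemctl enable --now ceph-rbd-mirror@admin
rbd mirror pool status rbd --cluster site-b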
Thanks Jason, you are right. My code wasn't updated in time; the bug has been
fixed in https://tracker.ceph.com/issues/42248.
zheng...@cmss.chinamobile.com
From: Jason Dillaman
Date: 2020-01-03 00:25
To: zheng...@cmss.chinamobile.com
CC: yangjun; ceph-users
Subject: Re: report librbd bug export-
Hi,
I am trying to get ceph cluster health and cluster status via REST api. My
cluster is running the latest nautilus release (v14). I found that
"ceph-rest-api" is deprecated
(https://docs.ceph.com/docs/mimic/releases/luminous/) and now there is the ceph
RESTful module. But cluster health inform
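Two hedged options on Nautilus: the mgr "restful" module, or the "prometheus" module, whose /metrics output includes a ceph_health_status gauge (host names are placeholders):
ceph mgr module enable restful
ceph restful create-self-signed-cert
ceph restful create-key admin               # prints an API key; the module listens on port 8003

ceph mgr module enable prometheus
curl -s http://<active-mgr-host>:9283/metrics | grep ceph_health_status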
Hi
I can't find el8 rpm package here.
http://download.ceph.com/rpm-hammer/
Might I know when the rpm package will be ready for centos8 ?
The radosgw can't start normally; the error in the log file is:
---
2019-12-20 14:37:04.058 7fd5b088f700 -1 Initialization timeout, failed
to initialize
2019-12-20 14:37:04.304 7fe7148c0780 0 deferred set uid:gid to
167:167 (ceph:ceph)
2019-12-20 14:37:04.304 7fe7148c0780 0 c
Based on what we've seen with perf, we think this is the relevant section.
(attached is also the whole file)
Thread: 73 (mgr-fin) - 1000 samples
+ 100.00% clone
+ 100.00% start_thread
+ 100.00% Finisher::finisher_thread_entry()
+ 99.40% Context::complete(int)
| + 99.40% Fun
Hi, are the CentOS 8 packages for ceph ready to use?
Thanks.
--
Hi Lars,
> Is there a way to list all snapshots existing in a (subdir of)
> Cephfs?
> I can't use the find command to look for the ".snap" dirs.
You can, but you can't search for the '.snap' directories; you have to
append them to the directory, like `find $cephFsDir/.snap`, but it's
better to u
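Since .snap is a virtual directory that never shows up in a normal readdir, a sketch like this (mountpoint is a placeholder) lists the snapshots under each directory of a subtree:
find /mnt/cephfs/subdir -type d -print0 | while IFS= read -r -d '' d; do
    ls -d "$d"/.snap/* 2>/dev/null
done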
Hello David,
I'm experiencing issues with OSD balancing, too.
My ceph cluster is running on release
ceph version 14.2.4.1 (596a387fb278758406deabf997735a1f706660c9)
nautilus (stable)
Would you be able to test (the latest code) on my OSDmap and verify if
balancing would work?
I have attached it to
ceph version 14.2.4 (75f4de193b3ea58512f204623e6c5a16e6c1e1ba) nautilus (stable)
os :CentOS Linux release 7.7.1908 (Core)
single node ceph cluster with 1 mon, 1 mgr, 1 mds, 1 rgw and 12 osds, but only
cephfs is used.
ceph -s is blocked after shutting down the machine (192.168.0.104), then ip
add
On 12/7/2019 3:56 PM, yang...@cmss.chinamobile.com wrote:
Hi dongsheng
https://tracker.ceph.com/issues/43178: can I open a PR to fix it like
you said? Thanks.
Sure, feel free to open a PR to fix it.
yang...@cmss.chinamobil
Hi dongsheng
https://tracker.ceph.com/issues/43178: can I open a PR to fix it like you
said? Thanks.
yang...@cmss.chinamobile.com
From: Jason Dillaman
Date: 2019-12-06 22:58
To: Dongsheng Yang
CC: yang...@cmss.chinamobile.com; ceph-users
Subject: Re: rbd_open_by_id crash when connection time
On 12/6/2019 9:46 PM, Jason Dillaman wrote:
On Fri, Dec 6, 2019 at 12:12 AM Dongsheng Yang
wrote:
On 12/06/2019 12:50 PM, yang...@cmss.chinamobile.com wrote:
Hi Jason, dongsheng
I found a problem using rbd_open_by_id when connection timeout(errno = 110,
ceph version 12.2.8, there is no change
On 12/06/2019 12:50 PM, yang...@cmss.chinamobile.com wrote:
Hi Jason, dongsheng
I found a problem using rbd_open_by_id when connection timeout(errno =
110, ceph version 12.2.8, there is no change about rbd_open_by_id in
master branch).
int r = ictx->state->open(false);
if (r < 0) { //
Hi Jason, dongsheng
I found a problem using rbd_open_by_id when the connection times out (errno = 110,
ceph version 12.2.8; there is no change to rbd_open_by_id in the master branch).
int r = ictx->state->open(false);
if (r < 0) { // r = -110
delete ictx; // crash, the stack is shown below:
} e
On 12/4/19 9:19 AM, Konstantin Shalygin wrote:
> CephFS indeed support snapshots. Since Samba 4.11 support this feature
> too with vfs_ceph_snapshots.
You can snapshot, but you cannot export a diff of snapshots
Hi Jan,
I might be wrong but I don't think download.ceph.com provides RPMs that can
be consumed using CentOS 8 at the moment.
Internally, for testing ceph@master on CentOS8, we use RPMs hosted in
chacra.
Dimitri who has worked a bit on this topic might have more inputs.
Thanks,
*Guillaume Abrio
hi:
I deployed ceph cluster with rdma, it's version is "15.0.0-7282-g05d685d (05d685dd37b34f2a015e77124c537f3f8e663152) octopus (dev)".
the cluster status is ok as follows:
[root@node83 lyf]# ceph -s
  cluster:
    id:     cd389d63-3eda-406b-8025-b26bba106d91
    health: HEALTH_OK
  service
Guillaume, Dimitry, what does ceph-ansible do?
Thanks!
–
Sébastien Han
Senior Principal Software Engineer, Storage Architect
"Always give 100%. Unless you're giving blood."
On Mon, Dec 2, 2019 at 11:16 AM Jan Kasprzak wrote:
>
> Hello, Ceph users,
>
> does anybody use Ceph on rec
On Mon, Dec 02, 2019 at 10:27:21AM +0100, Marc Roos wrote:
>
> I have been asking about this before [1]. Since the Nautilus upgrade I am
> having these, with a total node failure as a result(?). I was not expecting
> this in my 'low load' setup. Maybe now someone can help resolve this? I am
> also waiting quit
Thanks for the information, I'll take a look at this pr and think it over.
Jeff Layton wrote on Wed, 27 Nov 2019 at 6:50 PM:
> On Wed, 2019-11-27 at 15:14 +0800, j j wrote:
> > Hi all,
> >
> > Recently I encountered a situation that requires reliable file storage
> with cephfs, and the point is that those data is n
Hello Team,
Can I get any update on this request.
*Best Regards,*
*Palanisamy*
On Fri, Nov 22, 2019 at 1:53 PM Palanisamy wrote:
> Hello Team,
>
> We've integrated Ceph cluster storage with Kubernetes and provisioning
> volumes through rbd-provisioner. When we're creating volumes from yaml
>
Hi,
Recently, I have been trying to mount cephfs as a non-privileged user via
ceph-fuse, but it always fails. I looked at the code and found that there
will be a remount operation when using ceph-fuse mount. Remount will
execute the 'mount -i -o remount {mountpoint}' command and that causes the
mount to
Hello
Please, can you help me with this?
I'm on Ubuntu 18.04 bionic trying to install ceph version 12.2.12 with
ceph-deploy on 3 nodes of odroid xu4 (armhf).
Creating OSDs with lvm volumes seems to work, but after starting the manager with
the command "ceph-deploy mgr create node1", the mgr goes to starting mod
Hello Team,
We've integrated Ceph cluster storage with Kubernetes and are provisioning
volumes through rbd-provisioner. When we're creating volumes from yaml
files in Kubernetes (pv > pvc > mounting to pod), on the Kubernetes end the
PVCs show a meaningful naming convention as defined in the yaml file. But
You need to disable _MSPAC_SUPPORT to get rid of this dep.
Daniel
On 11/17/19 5:55 AM, Marc Roos wrote:
==
Package Arch Version
Hi guys, I deployed an EFK cluster and use ceph as block storage in Kubernetes,
but RBD write iops sometimes drop to zero and stay there for a few minutes. I
want to check the RBD logs, so I added some config to ceph.conf and restarted ceph.
Here is my ceph.conf:
[global]
fsid = 53f4e1d5-32ce-4e9c-bf36-f6b
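A hedged sketch of client-side debug settings that are usually added for this; they only affect librbd clients, not the kernel RBD driver, and are very verbose (paths are placeholders):
[client]
debug rbd = 20
debug rados = 10
log file = /var/log/ceph/$name.$pid.log
admin socket = /var/run/ceph/$cluster-$name.$pid.asok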
Hi,
Is anyone using librados AIO APIs? I seem to have a problem with that where
the rados_aio_wait_for_complete() call just waits for a long period of time
before it finishes without error.
More info on my setup:
I am using Ceph 14.2.4 and write 8MB objects.
I run my AIO program on 24 nodes at t
Hi,
to have more meaningful names, I also prefer LVM for the db drive.
Create the volume on the OSD drive by hand too:
vgcreate ceph-block-0 /dev/sda
lvcreate -l 100%FREE -n block-0 ceph-block-0
DB Drive:
vgcreate ceph-db-0 /dev/nvme0n1
lvcreate -L 50GB -n db-0 ceph-db-0
lvcreate -L 50GB -n db-1 ceph-db
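Those LVs are then typically handed to ceph-volume in vg/lv notation, e.g. (following the names above):
ceph-volume lvm create --bluestore \
    --data ceph-block-0/block-0 \
    --block.db ceph-db-0/db-0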
Hi,
> so I need to transfer the data from the failed OSD to the other OSDs that are
> healthy.
It's risky, but if you think the failing disk is healthy "enough", you
can try to migrate the data off of it with "ceph osd out {osd-num}" and wait
for it to empty. I'm assuming you have eno
Dear ceph users,
we're experiencing a segfault during MDS startup (replay process) which is
making our FS inaccessible.
MDS log messages:
Oct 15 03:41:39.894584 mds1 ceph-mds: -472> 2019-10-15 00:40:30.201
7f3c08f49700 1 -- 192.168.8.195:6800/3181891717 <== osd.26
192.168.8.209:6821/2419345 3
Dear Igor,
thanks a lot for your assistance. We're still trying to bring OSDs back
up... the cluster is not in great shape right now.
> so from the log in the ticket I can see a huge (400+ MB) bluefs log
> kept over many small non-adjacent extents.
> Presumably it was caused by either setting
Can you manually flush/evict the cache? Maybe reduce target_max_bytes
and target_max_objects to see if that triggers anything. We use
cache_mode writeback, maybe give that a try?
I don't see many differences between our and your cache tier config,
except for cache_mode and we don't have a max
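For reference, the manual flush/evict and the target knobs mentioned above look like this (pool name and values are placeholders; the flush blocks while objects drain to the base tier):
rados -p cache-pool cache-flush-evict-all
ceph osd pool set cache-pool target_max_bytes 100000000000
ceph osd pool set cache-pool target_max_objects 1000000
ceph osd pool set cache-pool cache_mode writeback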
Simon, Harry,
so from the log in the ticket I can see a huge (400+ MB) bluefs log kept
over many small non-adjacent extents.
Presumably it was caused by either setting small bluefs_alloc_size or
high disk space fragmentation or both. Now I'd like more details on your
OSDs.
Could you please
Actually it's the opposite: I have enabled it: `ceph config set
'osd.*' osd_scrub_during_recovery true`, but still no scrubbing. But
now I've started thinking that the change of `osd_scrub_during_recovery
true` did not take effect immediately. I waited for some time, and then reverted
it back to the default `
Hi Simon,
your analysis is correct, you've stepped into an unexpected state for
BlueFS log.
This is the second occurrence of the issue, the first one is mentioned at
https://tracker.ceph.com/issues/45519
Looking if we can get out of this state and how to fix that...
Thanks,
Igor
On 5/29/
Please do not apply any optimization without benchmarking *before* and
*after* in a somewhat realistic scenario.
No, iperf is likely not a realistic setup because it will usually be
limited by available network bandwidth which is (should) rarely be maxed
out on your actual Ceph setup.
Paul
--
P
Hello!
Fri, May 29, 2020 at 09:58:58AM +0200, pr wrote:
> Hans van den Bogert (hansbogert) writes:
> > I would second that, there's no winning in this case for your requirements
> > and single PSU nodes. If there were 3 feeds, then yes; you could make an
> > extra layer in your crushmap much l
Did you disable "osd scrub during recovery"?
Paul
--
Paul Emmerich
Looking for help with your Ceph cluster? Contact us at https://croit.io
croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90
On Fri, May 29, 2020 at 12:04 AM Vytenis A wrote:
> Forgot to mention th
There are two bugs that may cause the tag to be missing from the pools, you
can somehow manually add these tags with "ceph osd pool application ..."; I
think I posted these commands some time ago on tracker.ceph.com
Paul
--
Paul Emmerich
Looking for help with your Ceph cluster? Contact us at ht
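Hedged examples of the commands referred to above, with placeholder pool and filesystem names:
ceph osd pool application enable my-rbd-pool rbd
ceph osd pool application enable my-cephfs-data cephfs
ceph osd pool application set my-cephfs-data cephfs data my-fs-name
ceph osd pool application get my-cephfs-data        # verify the tags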
Colleague of Harry's here...
Harald Staub writes:
> This is again about our bad cluster, with too much objects, and the
> hdd OSDs have a DB device that is (much) too small (e.g. 20 GB, i.e. 3
> GB usable). Now several OSDs do not come up any more.
> Typical error message:
> /build/ceph-14.2.8/sr
Thanks for pointing this out!
One thing to mention is that we're not using cache tiering, as
described on https://tracker.ceph.com/issues/44286 , but it's a good
lead.
This means that we can't restart (or experience crashes of) OSDs
during rebalancing.
On Fri, May 29, 2020 at 4:18 AM Chad Willia
Hi,
Is there a way to recover UUID from the partition?
Someone mapped it in fstab to /dev/sd* rather than by UUID, and all the metadata is gone.
The data is there, we just can't mount it to access the data.
Any idea how to get it back, or to determine what it was before?
Thank you.
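If the filesystems themselves are intact, the UUIDs are still on the partitions and can be read back, e.g. (device names are placeholders):
blkid /dev/sdb1
lsblk -o NAME,FSTYPE,UUID,MOUNTPOINT
mount UUID=<uuid-from-blkid> /mnt/restore     # then fix fstab to use UUID=...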
This
Does this happen with any random node or specific to 1 node?
If specific to 1 node, does this node hold more data compared to the other nodes
(ceph osd df)?
Sinan Polat
> On 29 May 2020 at 09:56, Boris Behrens wrote the
> following:
>
> Well, this happens when any OSD goes offline. (I
Hi Sinan,
this happens with any node, and any single OSD.
On Fri, May 29, 2020 at 10:09 AM si...@turka.nl wrote:
>
> Does this happen with any random node or specific to 1 node?
>
> If specific to 1 node, does this node holds more data compared to other nodes
> (ceph osd df)?
>
> Sinan Polat
>
Burkhard Linke (Burkhard.Linke) writes:
>
> Buy some power transfer switches. You can connect those to the two PDUs, and
> in case of a power failure on one PDUs they will still be able to use the
> second PDU.
ATS = power switches (in my original mail).
> We only use them for "small" ma
Hans van den Bogert (hansbogert) writes:
> I would second that, there's no winning in this case for your requirements
> and single PSU nodes. If there were 3 feeds, then yes; you could make an
> extra layer in your crushmap much like you would incorporate a rack topology
> in the crushmap.
Chris Palmer (chris.palmer) writes:
> Immediate thought: Forget about crush maps, osds, etc. If you lose half the
> nodes (when one power rail fails) your MONs will lose quorum. I don't see
> how you can win with that configuration...
That's a good point, I'll have to think that one throug