turning it off with ceph balancer off
Best,
Laimis J.
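For context, a minimal sketch of checking and pausing the balancer around an upgrade (generic commands, not specific advice from this thread):
"
ceph balancer status
ceph balancer off
# ...later, once things look healthy again...
ceph balancer on
"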
On 17 Dec 2024, at 13:15, Torkil Svensgaard wrote:
On 17/12/2024 12:05, Torkil Svensgaard wrote:
Hi
Running upgrade from 18.2.4 to 19.2.0 and it managed to upgrade the
managers but no further progress.
Now it actually seems to have upgraded 1 MON, but then the orchestrator
crashed again:
"
{
"mon": {
"
[17/Dec/2024:10:43:11] ENGINE Bus STARTED
2024-12-17T10:43:11.964+ 7f70ebaf6640 0 log_channel(cephadm) log
[INF] : [17/Dec/2024:10:43:11] ENGINE Bus STARTED
...
"
It will recover after some timeout, maybe 5-10 mins, and then just sit
there with no upgrade progress.
Nothing in mgr/ceph
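A few read-only commands that can help narrow down a stalled upgrade like this (just a generic sketch, not something taken from the thread):
"
ceph orch upgrade status
ceph health detail
ceph crash ls-new
"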
Hi
18.2.4
We had some hard drives going AWOL due to a failing SAS expander so I
initiated "ceph orch host drain host". After a couple days I'm now
looking at this:
"
OSD  HOST   STATE                    PGS  REPLACE  FORCE  ZAP  DRAIN STARTED AT
528  gimpy  done, waiting for purge  0
ceph mgr fail
And then it hopefully works again.
Indeed it did, thanks! =)
Mvh.
Torkil
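For anyone finding this later, the workaround amounts to bouncing the active mgr and then checking that the orchestrator answers again (a sketch):
"
ceph mgr fail
ceph orch status
ceph orch osd rm status
"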
Quoting Torkil Svensgaard:
On 12-11-2024 09:29, Eugen Block wrote:
Hi Torkil,
Hi Eugen
this sounds suspiciously like https://tracker.ceph.com/issues/67329
Do you have the same (or similar) stack trace in t
It's unclear to me from the tracker how to recover, though. The issue
seems to be resolved, so should I be able to just pull new container
images somehow?
Mvh.
Torkil
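If the fix only exists in a newer point release, the usual route would be to point the upgrade at a newer image rather than pulling containers by hand; a hedged sketch (the version tag is a placeholder, pick whichever release actually contains the fix):
"
ceph orch upgrade start --image quay.io/ceph/ceph:v18.2.x   # x = placeholder
ceph orch upgrade status
"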
Regards,
Eugen
Quoting Torkil Svensgaard:
Hi
18.2.4.
After failing over the active manager, ceph orch commands seem to have
stopped working. There's this in the mgr log:
"
2024-11-12T08:16:30.136+ 7f1b2d887640 0 log_channel(audit) log
[DBG] : from='client.2088861125 -' entity='client.admin' cmd=[{"prefix":
"orch osd rm status"
o
firewall logs.
Joachim
On Tue, 13 Aug 2024 at 14:36, Eugen Block wrote:
Hi Torkil,
did anything change in the network setup? If those errors haven't
popped up before, what changed? I'm not sure if I have seen this one
yet...
Quoting Torkil Svensgaard:
Ceph version 18.2.1.
Hypervisor <-> Palo Alto firewall <-> OpenBSD firewall <-> Ceph
Any ideas? I haven't found anything in the ceph logs yet.
Mvh.
Torkil
--
Torkil Svensgaard
Sysadmin
MR-Forskningssektionen, afs. 714
DRCMR, Danish Research Centre for Magnetic Resonance
Hvidovre Hospital
Kettegå
On 06/08/2024 12:37, Ilya Dryomov wrote:
On Tue, Aug 6, 2024 at 11:55 AM Torkil Svensgaard wrote:
Hi
[ceph: root@ceph-flash1 /]# rbd info rbd_ec/projects
rbd image 'projects':
size 750 TiB in 196608000 objects
order 22 (4 MiB objects)
snapsho
size not compatible with object map
We can do 800T though:
[ceph: root@ceph-flash1 /]# rbd resize rbd_ec/projects --size 800T
Resizing image: 100% complete...done.
A problem with the --size 1024T notation? Or are we hitting some sort of
size limit for RBD?
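For what it's worth, I believe the object map puts an upper bound on the number of objects an image can have (somewhere around 2^28 with default striping), and with 4 MiB objects that works out to roughly 1 PiB, so 1024T lands right at the limit while 800T stays under it. Rough arithmetic (my own numbers, not from the thread):
"
1024T / 4M = 268435456 objects   (2^50 / 2^22 = 2^28)
 800T / 4M = 209715200 objects
"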
Mvh.
Torkil
--
Torkil Svensgaard
S
and use models instead of sizes for everything that isn't HDD, but we
have a lot of different models, so as long as it's not broken this will
do.
Thanks for the suggestions!
Mvh.
Torkil
Regards,
Frédéric.
- On 26 Jun 24, at 8:48, Torkil Svensgaard tor...@drcmr.dk wrote:
Hi
We have a
On 26/06/2024 08:48, Torkil Svensgaard wrote:
Hi
We have a bunch of HDD OSD hosts with DB/WAL on PCI NVMe, either 2 x
3.2TB or 1 x 6.4TB. We used to have 4 SSDs per node for journals before
bluestore and those have been repurposed for an SSD pool (wear level is
fine).
We've been usin
ifier to be AND.
I can do an osd.fast2 spec with size: 7000G: and change the db_devices
size for osd.slow to something like 1000G:7000G, but I'm curious to see
if anyone has a different suggestion?
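As an illustration of what those specs could look like (a sketch only; placement and the exact size bounds are assumptions, not taken from this cluster):
"
service_type: osd
service_id: slow
placement:
  host_pattern: '*'
spec:
  data_devices:
    rotational: 1
  db_devices:
    size: '1000G:7000G'
---
service_type: osd
service_id: fast2
placement:
  host_pattern: '*'
spec:
  data_devices:
    size: '7000G:'
"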
Mvh.
Torkil
--
Torkil Svensgaard
Sysadmin
MR-Forskningssektionen, afs. 714
DRCMR, Danish Resea
ceph kept going even
though I panicked and flailed with my arms a lot until I managed to
revert the bad crush map changes.
Good to know, thanks =)
Mvh.
Torkil
--
Torkil Svensgaard
Sysadmin
MR-Forskningssektionen, afs. 714
DRCMR, Danish Research Centre for Magnetic Resonance
Hvidovre Hosp
638492
> Com. register: Amtsgericht Munich HRB 231263
> Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx
>
>> On 12. Jun 2024, at 09:13, Torkil Svensgaard <tor...@drcmr.dk> wrote:
>>
CEO: Martin Verges - VAT-ID: DE310638492
Com. register: Amtsgericht Munich HRB 231263
Web: https://croit.io | YouTube: https://goo.gl/PGE1Bx
On 12. Jun 2024, at 09:33, Torkil Svensgaard wrote:
On 12/06/2024 10:22, Matthias Grandl wrote:
Correct, this should only result in misplaced objects.
https://goo.gl/PGE1Bx
On 12. Jun 2024, at 09:13, Torkil Svensgaard wrote:
Hi
We have 3 servers for replica 3 with failure domain datacenter:
-1 4437.29248 root default
-33 1467.84814 datacenter 714
-69 69.86389 host ceph-flash1
-34 1511.25378
If we just move the hosts with "ceph osd crush move ceph-flashX
datacenter=Y" we will just end up with a lot of misplaced data and some
churn, right? Or will the affected pool go degraded/unavailable?
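For reference, the move itself is a single crush command per host, and as confirmed elsewhere in the thread it should only result in misplaced objects as long as the rule can still be satisfied; a sketch using names from the tree above:
"
ceph osd crush move ceph-flash1 datacenter=714
ceph -s
"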
Mvh.
Torkil
--
Torkil Svensgaard
Sysadmin
MR-Forskningssektionen, afs. 714
DRCMR, Danish Research Centre for Magnetic Resonance
Hvidovre Hospital
and if so how?
Do I run multiple SSDs in RAID?
I do realize that for some of these, there might not be the one perfect
answer that fits all use cases. I am looking for best practices and in
general just trying to avoid any obvious mistakes.
Any advice is much appreciated.
Sincerely
and the alert module is somewhat broken?
Mvh.
Torkil
--
Torkil Svensgaard
Systems Administrator
Danish Research Centre for Magnetic Resonance DRCMR, Section 714
Copenhagen University Hospital Amager and Hvidovre
Kettegaard Allé 30, 2650 Hvidovre, Denmark
On 06-04-2024 18:10, Torkil Svensgaard wrote:
Hi
Cephadm Reef 18.2.1
Started draining 5 18-20 TB HDD OSDs (DB/WAL on NVMe) on one host. Even
with osd_max_backfills at 1 the OSDs get slow ops from time to time
which seems odd as we recently did a huge reshuffle[1] involving the
same host
ed (3.073%)
9931 active+clean
893 active+remapped+backfill_wait
24 active+remapped+backfilling
1 active+clean+inconsistent
io:
client: 3.5 KiB/s rd, 2.0 MiB/s wr, 5 op/s rd, 115 op/s wr
"
Any ideas on how to get the nfsd threa
On 25-03-2024 23:07, Kai Stian Olstad wrote:
On Mon, Mar 25, 2024 at 10:58:24PM +0100, Kai Stian Olstad wrote:
On Mon, Mar 25, 2024 at 09:28:01PM +0100, Torkil Svensgaard wrote:
My tally came to 412 out of 539 OSDs showing up in a blocked_by list
and that is about every OSD with data prior
On 25-03-2024 22:58, Kai Stian Olstad wrote:
On Mon, Mar 25, 2024 at 09:28:01PM +0100, Torkil Svensgaard wrote:
My tally came to 412 out of 539 OSDs showing up in a blocked_by list
and that is about every OSD with data prior to adding ~100 empty OSDs.
How 400 read targets and 100 write
t the
numbers I want so that will do. Thank you all for taking the time to
look at this.
Mvh.
Torkil
On 25-03-2024 20:44, Anthony D'Atri wrote:
First try "ceph osd down 89"
On Mar 25, 2024, at 15:37, Alexander E. Patrakov wrote:
On Mon, Mar 25, 2024 at 7:37 PM Torkil Svens
On 24/03/2024 01:14, Torkil Svensgaard wrote:
On 24-03-2024 00:31, Alexander E. Patrakov wrote:
Hi Torkil,
Hi Alexander
Thanks for the update. Even though the improvement is small, it is
still an improvement, consistent with the osd_max_backfills value, and
it proves that there are still
On 24-03-2024 13:41, Tyler Stachecki wrote:
On Sat, Mar 23, 2024, 4:26 AM Torkil Svensgaard wrote:
Hi
... Using mclock with high_recovery_ops profile.
What is the bottleneck here? I would have expected a huge number of
simultaneous backfills. Backfill reservation logjam?
mClock is very
No latency spikes seen the last 24 hours after manually compacting all
the OSDs so it seemed to solve it for us at least. Thanks all.
Mvh.
Torkil
On 23-03-2024 12:32, Torkil Svensgaard wrote:
Hi guys
Thanks for the suggestions, we'll do the offline compaction and see how
big an impa
rt.
On Sun, Mar 24, 2024 at 4:56 AM Torkil Svensgaard wrote:
On 23-03-2024 21:19, Alexander E. Patrakov wrote:
Hi Torkil,
Hi Alexander
I have looked at the CRUSH rules, and the equivalent rules work on my
test cluster. So this cannot be the cause of the blockage.
Thank you for taki
mon_max_pg_per_osd
250
Mvh.
Torkil
On Sun, Mar 24, 2024 at 1:08 AM Torkil Svensgaard wrote:
On 2024-03-23 17:54, Kai Stian Olstad wrote:
On Sat, Mar 23, 2024 at 12:09:29PM +0100, Torkil Svensgaard wrote:
The other output is too big for pastebin and I'm not familiar with
paste services,
output of "ceph osd pool ls detail".
On Sun, Mar 24, 2024 at 1:43 AM Alexander E. Patrakov
wrote:
Hi Torkil,
Unfortunately, your files contain nothing obviously bad or suspicious,
except for two things: more PGs than usual and bad balance.
What's your "mon max pg per osd"
he whole host and your failure domain
allows for that)
3. ceph config set osd osd_compact_on_start false
The OSD will restart, but will not show as "up" until the compaction
process completes. In your case, I would expect it to take up to 40
minutes.
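Compaction can also be kicked off per OSD without going through a restart; a minimal sketch (osd.0 is a placeholder):
"
ceph tell osd.0 compact
"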
On Fri, Mar 22, 2024 at 3:46 PM Torkil S
too big for pastebin and I'm not familiar with paste
services, any suggestion for a preferred way to share such output?
Mvh.
Torkil
On Sat, Mar 23, 2024 at 4:26 PM Torkil Svensgaard wrote:
Hi
We have this after adding some hosts and changing crush failure domain
to datacenter:
with 6 hosts and ~400 HDD OSDs with DB/WAL on
NVMe. Using mclock with high_recovery_ops profile.
What is the bottleneck here? I would have expected a huge number of
simultaneous backfills. Backfill reservation logjam?
Mvh.
Torkil
--
Torkil Svensgaard
Systems Administrator
Danish Research
Thanks.
Mvh.
Torkil
Thanks,
Igor
On 3/22/2024 9:59 AM, Torkil Svensgaard wrote:
Good morning,
Cephadm Reef 18.2.1. We recently added 4 hosts and changed a failure
domain from host to datacenter which is the reason for the large
misplaced percentage.
We were seeing some pretty crazy
n between with normal low latencies I think
it unlikely that it is just because the cluster is busy.
Also, how come there's only a small amount of PGs doing backfill when we
have such a large misplaced percentage? Can this be just from backfill
reservation logjam?
Mvh.
Torkil
--
Tork
I could just change 3 to 2 in the chooseleaf line for the 4+2 rule,
since for 4+5 each DC needs 3 shards and for 4+2 each DC needs 2 shards.
Comments?
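A sketch of what the relevant steps of such a rule might look like after the edit (rule name and id are made up; only the choose/chooseleaf steps matter here):
"
rule rbd_ec_4plus2 {
    id 5
    type erasure
    step take default
    step choose indep 3 type datacenter
    step chooseleaf indep 2 type host
    step emit
}
"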
Mvh.
Torkil
[1] https://docs.ceph.com/en/reef/rados/operations/crush-map-edits/
--
Torkil Svensgaard
Systems Administrator
Danish Research Centre for
On 13/02/2024 13:31, Torkil Svensgaard wrote:
Hi
Cephadm Reef 18.2.0.
We would like to remove our cluster_network without stopping the cluster
and without having to route between the networks.
global advanced cluster_network 192.168.100.0/24
*
global
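A hedged sketch of the config part of that change (not a full procedure; the OSDs would still need a rolling restart to stop using the old network):
"
ceph config rm global cluster_network
ceph config get osd cluster_network
"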
On 07/03/2024 08:52, Torkil Svensgaard wrote:
Hi
I tried to do offline read optimization[1] this morning but I am now
unable to map the RBDs in the pool.
I did this prior to running the pg-upmap-primary commands suggested by
the optimizer, as suggested by the latest documentation[2
Mvh.
Torkil
[1] https://docs.ceph.com/en/reef/rados/operations/read-balancer/
[2] https://docs.ceph.com/en/latest/rados/operations/read-balancer/
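In case someone hits the same thing: if the clients are older kernels that don't understand the new pg-upmap-primary entries, the way out I'm aware of is to remove those entries again until the clients are new enough; a sketch (the pg id is a placeholder):
"
ceph osd dump | grep upmap_primary
ceph osd rm-pg-upmap-primary <pgid>
"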
--
Torkil Svensgaard
Sysadmin
MR-Forskningssektionen, afs. 714
DRCMR, Danish Research Centre for Magnetic Resonance
Hvidovre Hospital
Kettegård
g-ref/#id3
--
Torkil Svensgaard
Sysadmin
MR-Forskningssektionen, afs. 714
DRCMR, Danish Research Centre for Magnetic Resonance
Hvidovre Hospital
Kettegård Allé 30
DK-2650 Hvidovre
Denmark
Tel: +45 386 22828
E-mail: tor...@drcmr.dk
github.com/ceph/ceph-container.git, CEPH_POINT_RELEASE=-18.2.0, org.label-schema.build-date=20231212, org.label-schema.name=CentOS Stream 8 Base Image, org.label-schema.schema-version=1.0)
...
Feb 01 04:10:08 dopey practical_hypatia[766758]: 167 167
...
Feb 01 04:10:08 dopey systemd[1]:
libpod-conmon-95967a040795bd61588dcfdc6ba5daf92553cd2cb3ecd7318cd8b16c1b15782d.scope:
Deactivated successfully
"
Mvh.
Torkil
On 01/02/2024 08:24, Torkil Svensgaard wrote:
We have ceph (currently 18.2.0) log to
: {}
2024-02-01T05:42:17+01:00 dopey goofy_hypatia[845150]: 167 167
"
Anyone else had this issue? Suggestions on how to get a real program
name instead?
Mvh.
Torkil
--
Torkil Svensgaard
Sysadmin
MR-Forskningssektionen, afs. 714
DRCMR, Danish Research Centre for Magnetic Resonance
Hvidovre
ig
in the logs. Thanks!
Mvh.
Torkil
Quoting Torkil Svensgaard:
On 31/01/2024 09:36, Eugen Block wrote:
Hi,
if I understand this correctly, with the "keepalive-only" option only
one ganesha instance is supposed to be deployed:
If a user additionally supplies --ingress-mode
The ingress service is puzzling me, as it worked just fine prior to the
upgrade, and the upgrade shouldn't have touched the service spec in any
way?
Mvh.
Torkil
[1]
https://docs.ceph.com/en/latest/cephadm/services/nfs/#nfs-with-virtual-ip-but-no-haproxy
Regards,
Eugen
Quoting Torkil Svensgaard:
On 31/01/2024 08:38, Torkil Svensgaard wrote:
Hi
Last week we created an NFS service like this:
"
ceph nfs cluster create jumbo "ceph-flash1,ceph-flash2,ceph-flash3"
--ingress --virtual_ip 172.21.15.74/22 --ingress-mode keepalive-only
"
Worked like a charm. Yester
}
],
"virtual_ip": null
}
}
"
Service spec:
"
service_type: nfs
service_id: jumbo
service_name: nfs.jumbo
placement:
  count: 1
  hosts:
  - ceph-flash1
  - ceph-flash2
  - ceph-flash3
spec:
  port: 2049
  virtual_ip: 172.21.15.74
"
I've tried restarting the nf
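A few read-only commands that show whether the keepalive/ingress side got deployed at all (just an inspection sketch):
"
ceph nfs cluster info jumbo
ceph orch ls nfs --export
ceph orch ps | grep jumbo
"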
Pool rbd_ec_data stores 683TB in 4096 pgs -> warn should be 1024
Pool rbd_internal stores 86TB in 1024 pgs -> warn should be 2048
That makes no sense to me based on the amount of data stored. Is this a
bug or what am I missing? Ceph version is 17.2.7.
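If it helps, the autoscaler's suggestions depend on more than the stored bytes (target ratios, the bulk flag, expected pool growth), so the full status output is the first place I'd look; a sketch:
"
ceph osd pool autoscale-status
ceph osd pool ls detail
"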
Mvh.
Torkil
--
Torkil Svensgaard
Systems
k-max-backfills-recovery-limits
-Sridhar
--
Torkil Svensgaard
Systems Administrator
Danish Research Centre for Magnetic Resonance DRCMR, Section
5139 active+remapped+backfill_wait
109 active+remapped+backfilling
io:
client: 28 MiB/s rd, 258 MiB/s wr, 677 op/s rd, 772 op/s wr
recovery: 3.5 GiB/s, 1.00k objects/s
"
Thanks again.
Mvh.
Torkil
On 18-01-2024 13:26, Torkil Svensgaard wrote:
Np. Thanks, we'
ceph config set osd osd_op_queue wpq
[1]
https://docs.ceph.com/en/quincy/rados/configuration/mclock-config-ref/
[2]
https://docs.clyso.com/blog/2023/03/22/ceph-how-do-disable-mclock-scheduler/
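One caveat worth adding (as far as I understand it): osd_op_queue is only read at OSD startup, so the OSDs need a restart for the switch to take effect. A sketch (the OSD id is a placeholder):
"
ceph config set osd osd_op_queue wpq
ceph orch daemon restart osd.0
"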
Quoting Torkil Svensgaard:
Hi
Our 17.2.7 cluster:
"
-33 886.00842
m anywhere
near the target capacity, and the one we just added has 22 empty OSDs,
having just 22 PGs backfilling and 1 recovering seems somewhat
underwhelming.
Is this to be expected with such a pool? Mclock profile is
high_recovery_ops.
Mvh.
Torkil
--
Torkil Svensgaard
Sysadmin
MR-Fors
, 2024 9:46 AM
To: ceph-users@ceph.io
Subject: [ceph-users] Re: Adding OSD's results in slow ops, inactive PG's
I'm glad to hear (or read) that it worked for you as well. :-)
Quoting Torkil Svensgaard:
On 18/01/2024 09:30, Eugen Block wrote:
Hi,
[ceph: root@lazy /]# ceph-con
Some came in right away, some are stuck on the aio thing. Hopefully
they will recover eventually.
Thank you again for the osd_max_pg_per_osd_hard_ratio suggestion, that
seems to have solved the core issue =)
Mvh.
Torkil
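For the archive, the knob in question can be inspected and raised cluster-wide like this (the value 5 is only an example):
"
ceph config get osd osd_max_pg_per_osd_hard_ratio
ceph config set osd osd_max_pg_per_osd_hard_ratio 5
"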
Quoting Torkil Svensgaard:
On 18/01/2024 07:48, Eugen Block wrot
bdev(0x56295d586400
/var/lib/ceph/osd/ceph-436/block) aio_submit retries 108
...
"
Daemons are running but those last OSDs won't come online.
I've tried upping bdev_aio_max_queue_depth but it didn't seem to make a
difference.
Mvh.
Torkil
Quoting Torkil Svensgaard:
ol, placed on
spinning rust, some 200-ish disks distributed across 13 nodes. I'm not
sure if other pools break, but that particular 4+2 EC pool is rather
important so I'm a little wary of experimenting blindly.
Any thoughts on where to look next?
Thanks,
Ruben Vestergaard
[1] https://docs.ceph.com/en/reef/rados/trouble
r
AIT Risø Campus
Bygning 109, rum S14
From: Torkil Svensgaard
Sent: Friday, January 12, 2024 10:17 AM
To: Frédéric Nass
Cc: ceph-users@ceph.io; Ruben Vestergaard
Subject: [ceph-users] Re: 3 DC with 4+5 EC not quite working
On 12-01-2024 09:35, Frédéric
related replicated
pools. Looking at it now I guess that was because the 5 OSDs were
blocked for everything and not just the PGs for that data pool?
We tried restarting the 5 blocked OSDs to no avail and eventually
resorted to deleting the cephfs.hdd.data data pool to restore service.
Any suggestions as to what we did wrong? Something to do with min_size?
The crush rule?
Thanks.
Mvh.
Torkil
--
Torkil Svens
age quay.io/ceph/ceph:v17.2.7
"
Mvh.
Torkil
Any advice or feedback is much appreciated.
Best,
Josh
--
Torkil Svensgaard
Systems Admini
On 13-10-2023 16:57, John Mulligan wrote:
On Friday, October 13, 2023 10:46:24 AM EDT Torkil Svensgaard wrote:
On 13-10-2023 16:40, Torkil Svensgaard wrote:
On 13-10-2023 14:00, John Mulligan wrote:
On Friday, October 13, 2023 6:11:18 AM EDT Torkil Svensgaard wrote:
On 13-10-2023 14:00, John Mulligan wrote:
On Friday, October 13, 2023 6:11:18 AM EDT Torkil Svensgaard wrote:
Hi
We have kerberos working with bare metal kernel NFS exporting RBDs. I
can see in the ceph documentation[1] that nfs-ganesha should work with
kerberos but I'm having little
/#create-cephfs-export
--
Torkil Svensgaard
Systems Administrator
Danish Research Centre for Magnetic Resonance DRCMR, Section 714
Copenhagen University Hospital Amager and Hvidovre
Kettegaard Allé 30, 2650 Hvidovre, Denmark
y: 96 MiB/s, 49 objects/s
progress:
Global Recovery Event (2d)
[===.] (remaining: 59m)
"
Mvh.
Torkil
Thanks,
Kevin
________
From: Torkil Svensgaard
Sent: Tuesday, January 10, 2023 2:36 AM
To: ceph-users-a8pt6iju...@public.g
30% 1.0 261 osd.12
53.58% 1.0 51 osd.82 |32.17% 1.0 172 osd.4
53.52% 1.0 50 osd.72 |0% 0 0 osd.49
+--------
"
Mvh.
Torkil
--
Torkil Svensgaard
Systems Administrator
Danish Researc
scrubs
and snaptrim, no difference.
Am I missing something obvious I should have done after the upgrade?
Mvh.
Torkil
--
Torkil Svensgaard
Sysadmin
MR-Forskningssektionen, afs. 714
DRCMR, Danish Research Centre for Magnetic Resonance
Hvidovre Hospital
Kettegård Allé 30
DK-2650 Hvidovre
Denmark
Tel
On 8/4/22 09:17, Torkil Svensgaard wrote:
Hi
We have a lot of OSDs flapping during recovery and eventually they don't
come up again until kicked with "ceph orch daemon restart osd.x".
This is the end of the log for one OSD going down for good:
"
2022-08-04T09:57:31.7
r::cpu_tp thread 0x7fab9ede5700' had timed out after 0.0s
Aug 04 06:59:29 dcn-ceph-01 bash[5230]: debug
2022-08-04T06:59:29.808+ 7fab9cde1700 1 mon.dcn-ceph-01@4(electing)
e21 collect_metadata md0: no unique device id for md0: fallback method
has no model nor serial'
"
to avoid the fraction:
290966526510 / 4194304 = 69371.82581663132 extents per db
290963062784 / 4194304 = 69371 extents per db
69371 x 11 = 763081 total extents
69371 x 10 = 693710 used extents
"
pvdisplay /dev/nvme0n1
PE Size 4.00 MiB
Total PE 763089
Free PE
On 12/15/21 14:18, Arthur Outhenin-Chalandre wrote:
On 12/15/21 13:50, Torkil Svensgaard wrote:
Ah, so as long as I don't run the mirror daemons on site-a there is no
risk of overwriting production data there?
To be perfectly clear there should be no risk whatsoever (as Ilya also
sai
On 15/12/2021 13.58, Ilya Dryomov wrote:
Hi Torkil,
Hi Ilya
I would recommend sticking to rx-tx to make potential failback back to
the primary cluster easier. There shouldn't be any issue with running
rbd-mirror daemons at both sites either -- it doesn't start replicating
until it is instruc
On 15/12/2021 10.17, Arthur Outhenin-Chalandre wrote:
Hi Torkil,
Hi Arthur
On 12/15/21 09:45, Torkil Svensgaard wrote:
I'm having trouble getting snapshot replication to work. I have 2
clusters, 714-ceph on RHEL/16.2.0-146.el8cp and dcn-ceph on CentOS
Stream 8/16.2.6. I'm trying to e
On 15/12/2021 13.44, Arthur Outhenin-Chalandre wrote:
Hi Torkil,
Hi Arthur
On 12/15/21 13:24, Torkil Svensgaard wrote:
I'm confused by the direction parameter in the documentation[1]. If I
have my data at site-a and want one way replication to site-b should the
mirroring be configur
Hi
I'm confused by the direction parameter in the documentation[1]. If I
have my data at site-a and want one way replication to site-b should the
mirroring be configured as the documentation example, directionwise?
E.g.
rbd --cluster site-a mirror pool peer bootstrap create --site-name
site
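A sketch of how I read the one-way setup (the pool name is a placeholder; rx-only on the receiving side is the strictly one-way variant, while the advice elsewhere in the thread is to prefer rx-tx to ease failback):
"
rbd --cluster site-a mirror pool peer bootstrap create --site-name site-a pool > token
rbd --cluster site-b mirror pool peer bootstrap import --site-name site-b --direction rx-only pool token
"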
Hi
I'm having trouble getting snapshot replication to work. I have 2
clusters, 714-ceph on RHEL/16.2.0-146.el8cp and dcn-ceph on CentOS
Stream 8/16.2.6. I'm trying to enable one-way replication from 714-ceph ->
dcn-ceph.
Adding peer:
"
# rbd mirror pool info
Mode: image
Site Name: dcn-ceph
P
On 22/08/2021 00.42, Torkil Svensgaard wrote:
Hi
Any suggestions as to the cause of this error? The device list seems
fine, a mix of already active OSDs and 4 empty, available drives.
There were 2 orphaned LVs on the db device. After I removed those the 4
available devices came up as OSDs
Hi
Any suggestions as to the cause of this error? The device list seems
fine, a mix of already active OSDs and 4 empty, available drives.
RuntimeError: Failed command: /usr/bin/docker run --rm --ipc=host
--stop-signal=SIGTERM --net=host --entrypoint /usr/sbin/ceph-volume
--privileged --group
On 18/08/2021 21.26, Torkil Svensgaard wrote:
Did I miss something obvious?
Restarting the rbd-mirror daemons was the thing I missed. All good now.
Thanks,
Torkil
Thanks,
Torkil
On 18/08/2021 14.30, Ilya Dryomov wrote:
On Wed, Aug 18, 2021 at 12:40 PM Torkil Svensgaard
wrote:
Hi
I
6.84360 GiB
Did I miss something obvious?
Thanks,
Torkil
On 18/08/2021 14.30, Ilya Dryomov wrote:
On Wed, Aug 18, 2021 at 12:40 PM Torkil Svensgaard wrote:
Hi
I am looking at one-way mirroring from cluster A to cluster B.
As per [1] I have configured two pools for RBD on cluster B:
1
--
Torkil Svensgaard
Sysadmin
MR-Forskningssektionen, afs. 714
DRCMR, Danish Research Centre for Magnetic Resonance
Hvidovre Hospital
Kettegård Allé 30
DK-2650 Hvidovre
Denmark
Tel: +45 386 22828
E-mail: tor...@drcmr.dk
"
Mvh.
Torkil
On 15/06/2021 11.38, Sebastian Wagner wrote:
Hi Torkil,
you should see more information in the MGR log file.
Might be an idea to restart the MGR to get some recent logs.
On 15.06.21 at 09:41, Torkil Svensgaard wrote:
Hi
Looking at this error in v15.2.13:
"
[
Hi
Looking at this error in v15.2.13:
"
[ERR] MGR_MODULE_ERROR: Module 'devicehealth' has failed:
Module 'devicehealth' has failed:
"
It used to work. Since the module is always on I can't seem to restart
it and I've found no clue as to why it failed. I've tried rebooting all
hosts to no avail.