Thanks, Eugen
I'll look into how to do this.
Eugen Block wrote on Wed, Nov 27, 2024 at 15:21:
>
> Sure, I haven't done that with rbd-mirror but I don't see a reason why
> it shouldn't work. You can create different clusters within the same
> subnet, you just have to pay attention to the correct configs etc.
Sure, I haven't done that with rbd-mirror but I don't see a reason why
it shouldn't work. You can create different clusters within the same
subnet, you just have to pay attention to the correct configs etc.
Each cluster has its own UUID and separate MONs, so it should just work.
If not, let us
Hi Eugen
Do you mean that it is possible to create multiple clusters on one
infrastructure and then perform backups in each cluster?
Eugen Block wrote on Wed, Nov 27, 2024 at 14:48:
>
> Hi,
>
> I don't think there's a way to achieve that. You could create two
> separate backup clusters and configure mirrorin
Hi,
I don't think there's a way to achieve that. You could create two
separate backup clusters and configure mirroring on both of them. You
should already have enough space (enough OSDs) to mirror both pools,
so that would work. You can colocate MONs with OSDs so you don't need
additional
Hi,
I'd like to understand how the free space allocation is calculated when an OSD
crashes and reports no free space on the device (maybe due to fragmentation
or an allocation issue).
I checked all the graphs back to September, when we had multiple OSD failures on
Octopus 15.2.17 with co-located wal+db
Got it, the perf dump can give that information:
ceph daemon osd.x perf dump | jq .bluefs
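For reference, a rough sketch of what the .bluefs section reports (counter names can vary a bit between releases; this is roughly what an Octopus-era OSD returns). Comparing db_used_bytes against db_total_bytes shows how full the DB is, and a non-zero slow_used_bytes means DB data has spilled over onto the main device:
  {
    "db_total_bytes": ...,
    "db_used_bytes": ...,
    "wal_total_bytes": ...,
    "wal_used_bytes": ...,
    "slow_total_bytes": ...,
    "slow_used_bytes": ...
  }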
From: Szabo, Istvan (Agoda)
Sent: Wednesday, November 27, 2024 9:20 AM
To: Frédéric Nass; John Jasen; Igor Fedotov
Cc: ceph-users
Subject: [ceph-users] Re: down OSDs, Bluesto
Hello, everyone.
Under normal circumstances, we synchronize from PoolA of ClusterA to
PoolA of ClusterB (same pool name), which is also easy to configure.
Now the requirements are as follows:
ClusterA/Pool synchronizes to BackCluster/PoolA
ClusterB/Pool synchronizes to BackCluster/PoolB
After re
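For context, a minimal sketch of the standard same-name setup (assuming two clusters reachable as "site-a" and "backup" via their configs/keyrings under /etc/ceph, and one-way mirroring in image mode); note that the pool name is the same on both ends, which is exactly the limitation in question:
  # enable mirroring on the pool in both clusters (pool names must match)
  rbd --cluster site-a mirror pool enable poolA image
  rbd --cluster backup mirror pool enable poolA image
  # bootstrap the peering on the primary and import it on the backup side
  rbd --cluster site-a mirror pool peer bootstrap create --site-name site-a poolA > peer-token
  rbd --cluster backup mirror pool peer bootstrap import --site-name backup --direction rx-only poolA peer-token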
Hi,
Is there a way to check, without shutting down an OSD, how much free space remains
for the DB on a co-located OSD, i.e. how full it is?
From: Frédéric Nass
Sent: Wednesday, November 27, 2024 6:11 AM
To: John Jasen
Cc: Igor Fedotov; ceph-users
Sub
I haven’t gone through all the details since you seem to know you’ve done
some odd stuff, but the “.snap” issue is because you’ve run into a CephFS
feature which I recently discovered is embarrassingly under-documented:
https://docs.ceph.com/en/reef/dev/cephfs-snapshots
So that’s a special fake di
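For anyone hitting this for the first time: every directory on a CephFS mount contains a hidden .snap pseudo-directory (assuming snapshots are allowed on the filesystem, e.g. via "ceph fs set <fs> allow_new_snaps true"), and creating or removing entries in it takes or drops snapshots of that subtree. A quick illustration:
  cd /mnt/cephfs/some/dir
  mkdir .snap/before-change   # take a snapshot of this subtree
  ls .snap                    # list existing snapshots
  rmdir .snap/before-change   # remove the snapshot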
Don't laugh. I am experimenting with Ceph in an enthusiast,
small-office, home-office setting. Yes, this is not the conventional
use case, but I think Ceph almost is, or almost could be, used for this.
Do I need to explain why? These kinds of people (i.e. me) already run
RAID. And maybe CIFS/Samba or
Hi,
This issue should not happen anymore from 17.2.8 onwards, am I correct? In this
version all the fragmentation issues should be gone, even with collocated wal+db+block.
From: Frédéric Nass
Sent: Wednesday, November 27, 2024 6:12:46 AM
To: John Jasen
Cc: Igor Fedotov
Hello Ceph community,
I wanted to highlight one observation and hear from any Squid users having similar
experiences.
Since upgrading to 19.2.0 (from 18.4.0) we have observed that pg deep scrubbing
times have drastically increased. Some pgs take 2-5 days to complete deep
scrubbing while others incre
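In case others want to compare, a couple of commands we've been using to watch this (run on a node with an admin keyring; defaults may differ per release):
  # PGs that are deep scrubbing right now
  ceph pg dump pgs 2>/dev/null | grep 'scrubbing+deep'
  # currently configured deep scrub interval (the default is one week)
  ceph config get osd osd_deep_scrub_interval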
Hi John,
That's about right. Two potential solutions exist:
1. Adding a new drive to the server and sharing it for RocksDB metadata, or
2. Repurposing one of the failed OSDs for the same purpose (if adding more
drives isn't feasible).
Igor's post #6 [1] explains the challenges with co-located
May not apply, but usually when I have strange (and bad) behaviours like
these, I double-check name resolution/DNS configuration on all hosts
involved.
On 26/11/2024 11:19, Martin Gerhard Loschwitz wrote:
Hi Alex,
thank you for the reply. Here are all the steps we’ve done in the last
On 26/11/2024 15:09, Peter Grandi wrote:
Regardless of the specifics: 4KiB write IOPS is definitely not
what Ceph was designed for. Yet so many people know better and
use Ceph for VM disk images, even with logs and databases on
them.
It depends...for a single thread/queue depth Ceph will give
That is indeed a lot nicer hardware and 1804 IOPS is faster, but still
lower than a USB thumb drive.
The thing with Ceph is that it scales out really, really well, but
scaling up is harder. That is, if you run something like 500 of these tests at the
same time, then you can see what it can do.
Some guy
Not really. I'm assuming that they have been working hard at it and I
remember hearing something about a more recent RocksDB version shaving
off significant time. It would also depend on your CPU and memory speed.
I wouldn't be at all surprised if latency is lower today, but I haven't
really measu
Hi Martin,
I think what Peter suggests is that you should try with --numjobs=128 and
--iodepth=16 to see what your hardware is really capable of with this very
small I/O workload.
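Something along these lines (adjust the target device; note this is a destructive raw-device write test):
  fio --name=parallel-4k --ioengine=libaio --filename=/dev/sdX --direct=1 \
      --rw=randwrite --bs=4k --numjobs=128 --iodepth=16 \
      --runtime=60 --time_based --group_reporting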
Regards,
Frédéric.
From: Martin Gerhard Loschwitz
Sent: Tuesday, 26 November 202
Can you check if you have any power-saving settings? Make sure the CPU is
set to max performance; use the cpupower tool to check and disable all
C-states, and run at max frequency.
For HDD at qd=1, 60 IOPS is OK.
For SSD at qd=1, you should get roughly 3-5k IOPS read, 1k IOPS write, but
if your CPU is pow
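Roughly like this (flags from the cpupower tool; settings don't persist across reboots):
  cpupower frequency-info                 # show current governor and frequency range
  cpupower frequency-set -g performance   # switch to the performance governor
  cpupower idle-info                      # list available C-states
  cpupower idle-set -D 0                  # disable all idle states deeper than C0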
Here’s a benchmark of another setup I did a few months back, with NVME flash
drives and a Mellanox EVPN fabric (Spectrum ASIC) between the nodes (no RDMA).
3 hosts and 24 drives in total.
root@test01:~# fio --ioengine=libaio --filename=/dev/sdb --direct=1 --sync=1
--rw=write --bs=4K --numjobs=1
>
> In my experience, ceph will add around 1ms even if only on localhost. If
> this is in the client code or on the OSDs, I don't really know. I don't
> even know the precise reason, but the latency is there nevertheless.
> Perhaps you can find the reason here among the tradeoffs ceph and
> simila
In my experience, ceph will add around 1ms even if only on localhost. If
this is in the client code or on the OSDs, I don't really know. I don't
even know the precise reason, but the latency is there nevertheless.
Perhaps you can find the reason here among the tradeoffs ceph and
similar systems
With qd=1 (queue depth?) and a single thread, this isn't totally
unreasonable.
Ceph will have an internal latency of around 1ms or so, add some network
to that and an operation can take 2-3ms. With a single operation in
flight all the time, this means 333-500 operations per second. With
hdds,
Martin, are MONs set up on the same hosts, or is there latency to them by
any chance?
--
Alex Gorbachev
https://alextelescope.blogspot.com
On Tue, Nov 26, 2024 at 5:20 AM Martin Gerhard Loschwitz <
martin.loschw...@true-west.com> wrote:
> Hi Alex,
>
> thank you for the reply. Here are all the s
Let me see if I have the approach right'ish:
scrounge some more disk for the servers with full/down OSDs.
partition the new disks into LVs for each downed OSD.
Attach as an LVM new-db to the downed OSDs.
Restart the OSDs.
Profit.
Is that about right?
On Tue, Nov 26, 2024 at 11:28 AM Igor Fedotov
Well, so there is a single shared volume (disk) per OSD, right?
If so, one can add a dedicated DB volume to such an OSD - once done, the OSD will
have two underlying devices: the main one (the original shared disk) and a new
dedicated DB one. And hence this will effectively provide additional
space for Blu
They're all bluefs_single_shared_device, if I understand your question.
There's no room left on the devices to expand.
We started at quincy with this cluster, and didn't vary too much from the
Redhat Ceph storage 6 documentation for setting it up.
On Tue, Nov 26, 2024 at 4:48 AM Igor Fedotov wr
Wait … 1 gigabit?? That sure isn’t doing you any favors. Remember that RADOS
sends replication sub-ops over that, though you mentioned a size=1 pool.
You’ll have mon <-> OSD traffic and OSD <-> OSD heartbeats going over that
link as well.
> On Nov 26, 2024, at 5:22 AM, Martin Gerhard Los
Hi,
> On 26 Nov 2024, at 16:10, Matthew Darwin wrote:
>
> I guess there is a missing dependency (which really should be
> auto-installed), which is also not documented in the release notes as a new
> requirement. This seems to fix it:
This is caused by [1]; the fix was not backported to quincy,
> [...] All-SSD cluster I will get roughly 400 IOPS over more
> than 250 devices. I know SAS-SSDs are not ideal, but 250
> looks a bit on the low side of things to me. In the second
> cluster, also All-SSD based, I get roughly 120 4k IOPS. And
> the HDD-only cluster delivers 60 4k IOPS.
Regardl
>>> On Mon, 25 Nov 2024 15:22:32 +0100, Martin Gerhard Loschwitz
>>> said:
> [...] abysmal 4k IOPS performance [...
Also:
https://www.google.com/search?q=ceph+bench+small+blocks
https://www.reddit.com/r/ceph/comments/10b2846/low_iops_with_all_ssd_cluster_for_4k_writes/
https://www.r
I guess there is a missing dependency (which really should be
auto-installed), which is also not documented in the release notes as
a new requirement. This seems to fix it:
$ apt install --no-install-recommends python3-packaging
On 2024-11-26 08:03, Matthew Darwin wrote:
I have upgraded from
I have upgraded from 17.2.7 to 17.2.8 on Debian 11 and the OSDs fail
to start. Advice on how to proceed would be welcome.
This is my log from ceph-volume-systemd.log
[2024-11-26 12:51:30,251][systemd][INFO ] raw systemd input received:
lvm-1-1c136e54-6f58-4f36-af10-d47d215b991b
[2024-11-26 12:5
Good morning!
This is something new in our situation. The list of cluster locks now contains
entries for the addresses of the servers on which the MDS is running. The
lifetime of the records is one day. For now, we decided to wait until the records
are deleted and see how the cluster behaves.
Best
Hi everyone,
I encountered an issue while setting up a multisite configuration in Ceph.
After adding the second secondary zone and enabling the archive module, the
RGW daemons started crashing repeatedly after running the period update
command.
The crash log shows the following error:
Caught
Hi,
I continuously see these kinds of messages in our logs:
RGW-SYNC:data:sync:shard[7]:entry: ERROR: failed to remove omap key from error
repo
(hkg.rgw.log:datalog.sync-status.shard.61c9d940-fde4-4bed-9389-edc8d7741817.7.retry
retcode=-1
And we started to receive large omaps in the log pool
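In case it helps with debugging, the sync error repo and the large omap object can be inspected with something like this (the object name is taken from the log line above):
  radosgw-admin sync error list
  rados -p hkg.rgw.log listomapkeys datalog.sync-status.shard.61c9d940-fde4-4bed-9389-edc8d7741817.7.retry | wc -l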
Hi Anthony,
I think problems have always been like this, albeit these setups are a bit
older already. We’ve specifically set the MTU to 9000 on both switches and all
affected machines, but MTU 1500 or MTU 9000 literally doesn’t make a difference.
Network is non-LACP on one of the test clusters
Hi Alex,
thank you for the reply. Here are all the steps we’ve done in the last weeks to
reduce complexity (we’re focussing on the HDD cluster for now in which we are
seeing the worst results in relation — but it also happens to be the easiest
setup network-wise, despite only having a 1G link b
Hi John,
you haven't described your OSD volume configuration, but you might want
to try adding a standalone DB volume if the OSD uses LVM and has a single main
device only.
'ceph-volume lvm new-db' command is the preferred way of doing that, see
https://docs.ceph.com/en/quincy/ceph-volume/lvm/newdb/
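A rough sketch of the procedure (device, VG/LV names and size are just examples, and the OSD must be stopped first; the systemd unit name below assumes a package-based install, cephadm units are named differently):
  # carve an LV for the DB out of the new device
  vgcreate ceph-db /dev/sdX
  lvcreate -n db-osd5 -L 60G ceph-db
  # attach it to the stopped OSD, then start the OSD again
  systemctl stop ceph-osd@5
  ceph-volume lvm new-db --osd-id 5 --osd-fsid <osd-fsid> --target ceph-db/db-osd5
  systemctl start ceph-osd@5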
On Tue, Nov 26, 2024 at 4:38 PM Gregory Orange
wrote:
> I simply registered for the dev event, and put a note in there that I
> was only going for the 3pm session.
>
If this is a common case, does it mean there are still some spots
available at the Dev Summit?
I hope the info can be updated.
Best
We have ceph clusters in multiple regions to provide rbd services.
We are currently preparing a remote backup plan, which is to
synchronize pools with the same name in each region to different pools
in one cluster.
For example:
Cluster A Pool synchronized to backup cluster poolA
Cluster B Pool sy
And, Eugen, try looking at ceph fs status during writes.
I can see the following INOS, DNS and Reqs distribution:
RANK  STATE   MDS  ACTIVITY      DNS    INOS   DIRS  CAPS
 0    active  c    Reqs: 127 /s  12.6k  12.5k  333   505
 1    active  b    Reqs: 11 /s   21     24     19    1
I mean that e
>
> Hm, the same test worked for me with version 16.2.13... I mean, I only
> do a few writes from a single client, so this may be an invalid test,
> but I don't see any interruption.
I tried many times and I'm sure that my test is correct.
Yes, writes can remain active for some time after rank 1 went
On 26/11/24 09:47, Stefan Kooman wrote:
> The dev event is full. So unable to register anymore and leave a note
> like you did. Hence my question.
I guess the contact email address on that page is worth a shot
ceph-devsummit-2...@cern.ch
On 26-11-2024 09:37, Gregory Orange wrote:
On 25/11/24 15:57, Stefan Kooman wrote:
Update: The Ceph Developer Summit is nearing capacity for "Developers".
There is still room for "Power Users" to register for the afternoon
session. See below for details...
However, it's unclear to me if you nee
On 25/11/24 15:57, Stefan Kooman wrote:
> Update: The Ceph Developer Summit is nearing capacity for "Developers".
> There is still room for "Power Users" to register for the afternoon
> session. See below for details...
>
> However, it's unclear to me if you need to register for the "Power
> Users