Any comments regarding `osd noin`, please?
/Z
On Tue, 2 Apr 2024 at 16:09, Zakhar Kirpichenko wrote:
> Hi,
>
> I'm adding a few OSDs to an existing cluster, the cluster is running with
> `osd noout,noin`:
>
> cluster:
> id: 3f50555a-ae2a-11eb-a2fc-ffde44714d86
> health: HEALTH_WARN
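For reference, a minimal sketch of working with these flags (standard CLI; the osd id is a placeholder):

# keep booting OSDs from being marked "in" automatically
ceph osd set noin
# ... add the new OSDs ...
# bring a new OSD in explicitly, then clear the flags
ceph osd in osd.42
ceph osd unset noin
ceph osd unset noout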
Hello,
We are currently experiencing a lot of rgw service crashes that all seem to
terminate with the same message. We have kept our RGW services at 17.2.5,
but the rest of the cluster is on 17.2.7, due to a bug introduced in 17.2.7.
terminate called after throwing an instance of 'ceph::buffer::v15_
Thanks. I'll PR up some doc updates reflecting this and run them by the RGW /
RADOS folks.
> On Apr 3, 2024, at 16:34, Joshua Baergen wrote:
>
> Hey Anthony,
>
> Like with many other options in Ceph, I think what's missing is the
> user-visible effect of what's being altered. I believe the re
Hey Anthony,
Like with many other options in Ceph, I think what's missing is the
user-visible effect of what's being altered. I believe the reason why
synchronous recovery is still used is that, assuming that per-object
recovery is quick, it's faster to complete than asynchronous recovery,
which h
Depending on your Ceph release you might need to enable rbdstats.
Are you after provisioned, allocated, or both sizes? Do you have object-map
and fast-diff enabled? They speed up `rbd du` massively.
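If the features aren't on yet, a minimal sketch (pool/image names are placeholders; assumes exclusive-lock is already enabled, which is the default for images created by recent rbd):

rbd feature enable mypool/myimage object-map fast-diff
rbd object-map rebuild mypool/myimage   # build the map for existing data
rbd du mypool/myimage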
> On Apr 3, 2024, at 00:26, Szabo, Istvan (Agoda)
> wrote:
>
> Hi,
>
> Trying to pull out s
We currently have in src/common/options/global.yaml.in:

- name: osd_async_recovery_min_cost
  type: uint
  level: advanced
  desc: A mixture measure of number of current log entries difference and historical
    missing objects, above which we switch to use asynchronous recovery when appropriate
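To see how this renders on a live cluster:

ceph config help osd_async_recovery_min_cost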
On Wed, Apr 3, 2024 at 3:09 PM Lorenz Bausch wrote:
>
> Hi Casey,
>
> thank you so much for the analysis! We tested the upgrade intensively, but
> the buckets in our test environment were probably too small to get
> dynamically resharded.
>
> > after upgrading to the Quincy release, rgw would
> > loo
Hi and sorry for the delay, I was on vacation last week. :-) I just
read your responses. I have no idea how to modify the default timeout
for cephadm, maybe Adam or someone else can comment on that. But
every time I've been watching cephadm (ceph-volume) create new OSDs
they are not created
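I can't confirm this from the thread, but recent cephadm versions appear to expose a command timeout in the mgr; treat the option name below as an assumption and check what your release actually has:

ceph config ls | grep -i cephadm    # look for a *_timeout option
# if present in your release (assumption, not confirmed here):
ceph config set mgr mgr/cephadm/default_cephadm_command_timeout 3600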
Hi,
1. I see no systemd units with the fsid in them, as described in the
document above. Both before and after the upgrade, my mon and other
units are:
ceph-mon@.service, ceph-osd@[N].service, etc.
Should I be concerned?
I think this is expected because it's not containerized, no reason to
be concerned.
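A quick way to check which naming scheme is in effect on a host (cephadm/containerized deployments embed the fsid in the unit name, e.g. ceph-<fsid>@osd.N.service):

systemctl list-units 'ceph*' --all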
Hi,
how many OSDs do you have in total? Can you share your osd tree, please?
You could check the unit.meta file on each OSD host to see which
service it refers to and simply change it according to the service you
intend to keep:
host1:~ # grep -r service_name
/var/lib/ceph/543967bc-e586-
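A hypothetical complete invocation, with <fsid> standing in for the cluster fsid that is truncated above:

host1:~ # grep -r service_name /var/lib/ceph/<fsid>/osd.*/unit.meta
host1:~ # ceph orch ps --daemon-type osd    # cross-check what each daemon reports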
Hi Casey,
thank you so much for the analysis! We tested the upgrade intensively, but
the buckets in our test environment were probably too small to get
dynamically resharded.
after upgrading to the Quincy release, rgw would
look at the wrong object names when trying to list those buckets.
As we
Hi,
you need to deploy more daemons because your current active MDS is
responsible for the already existing CephFS. There are several ways to
do this, I like the yaml file approach and increase the number of MDS
daemons, just as an example from a test cluster with one CephFS I
added the l
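A sketch of that approach (service name and count are illustrative):

cat > mds.yaml <<EOF
service_type: mds
service_id: mycephfs
placement:
  count: 2
EOF
ceph orch apply -i mds.yaml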
to expand on this diagnosis: with multisite resharding, we changed how
buckets name/locate their bucket index shard objects. any buckets that
were resharded under this Red Hat pacific release would be using the
new object names. after upgrading to the Quincy release, rgw would
look at the wrong object names when trying to list those buckets.
We've had success using osd_async_recovery_min_cost=0 to drastically
reduce slow ops during index recovery.
Josh
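A minimal sketch of applying that at runtime:

ceph config set osd osd_async_recovery_min_cost 0
ceph config get osd osd_async_recovery_min_cost    # verify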
On Wed, Apr 3, 2024 at 11:29 AM Wesley Dillingham
wrote:
>
> I am fighting an issue on an 18.2.0 cluster where a restart of an OSD which
> supports the RGW index pool causes cripplin
I am fighting an issue on an 18.2.0 cluster where a restart of an OSD which
supports the RGW index pool causes crippling slow ops. If the OSD is marked
with primary-affinity of 0 prior to the OSD restart no slow ops are
observed. If the OSD has a primary affinity of 1 slow ops occur. The slow
ops o
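For reference, a minimal sketch of the workaround being described (the osd id is a placeholder):

ceph osd primary-affinity osd.12 0    # before the restart
# restart the OSD and let recovery settle, then restore:
ceph osd primary-affinity osd.12 1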
Everything goes fine except executing "ceph fs new kingcephfs
cephfs-king-metadata cephfs-king-data": it shows 1 filesystem is offline and 1
filesystem is online with fewer MDS than max_mds.
But I see there is one MDS service running. Please help me fix the issue,
thanks a lot.
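If no MDS service has been scheduled for the new filesystem, a sketch of deploying one on a cephadm-managed cluster (the placement count is illustrative):

ceph orch apply mds kingcephfs --placement=2
ceph fs status kingcephfs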
We have a ceph cluster of only nvme drives.
Very recently our overall OSD write latency increased pretty dramatically and
our overall throughput has really decreased.
One thing that seems to correlate with the start of this problem is the below
ERROR line from the logs. All our OSD nodes a
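Not from the original report, but a quick first check when latency jumps (standard CLI):

ceph osd perf               # per-OSD commit/apply latency
ceph tell osd.0 perf dump   # detailed counters for one OSD (id is a placeholder)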
I have the same issue, can someone help me, thanks in advance!
bash-4.4$ ceph fs new kingcephfs cephfs-king-metadata cephfs-king-data
new fs with metadata pool 7 and data pool 8
bash-4.4$
bash-4.4$ ceph -s
cluster:
id: de9af3fe-d3b1-4a4b-bf61-929a990295f6
health: HEALTH_ERR
On Wed, Apr 3, 2024 at 11:58 AM Lorenz Bausch wrote:
>
> Hi everybody,
>
> we upgraded our containerized Red Hat Pacific cluster to the latest
> Quincy release (Community Edition).
I'm afraid this is not an upgrade path that we try to test or support.
Red Hat makes its own decisions about what to
Hi everybody,
we upgraded our containerized Red Hat Pacific cluster to the latest
Quincy release (Community Edition).
The upgrade itself went fine, the cluster is HEALTH_OK, all daemons run
the upgraded version:
%<
$ ceph -s
cluster:
id: 68675a58-cf09-4ebd-949c-b9fcc4f2264
Call for Submission
Stabilization Period: Monday, April 1st - Friday, April 15th, 2024
Submission Deadline: Tuesday, May 3rd, 2024 AoE
The IO500 is now accepting and encouraging submissions for the upcoming
14th semi-annual IO500 Production and Research lists, in conjunction
with ISC24. Once a
removed the config setting for mon.a001s016.
Here it is
# ceph config get mon container_image
docker.io/ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697a5a2cb7ff7668f776b61b9d586
# ceph config get osd container_image
docker.io/ceph/daemon@sha256:261bbe628f4b438f5bf10de5a8ee05282f2697
Thanks for your consideration.
On 4/3/24 13:08, Janne Johansson wrote:
Hi every one,
I'm new to ceph and I'm still studying it.
In my company we decided to test ceph for possible further implementations.
Although I understood its capabilities I'm still doubtful about how to
setup replication.
Defa
> Hi every one,
> I'm new to ceph and I'm still studying it.
> In my company we decided to test ceph for possible further implementations.
>
> Although I understood its capabilities I'm still doubtful about how to
> setup replication.
Default settings in ceph will give you replication = 3, which i
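A minimal sketch of inspecting and setting this per pool (the pool name is a placeholder):

ceph osd pool get mypool size
ceph osd pool set mypool size 3
ceph osd pool set mypool min_size 2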
Hi every one,
I'm new to ceph and I'm still studying it.
In my company we decided to test ceph for possible further implementations.
Although I understood its capabilities I'm still doubtful about how to
setup replication.
Once implemented in production I can accept a little lack of
performance
Hi GM,
sorry for the late reply. anyway, you are right.
in "quincy" (v17) only the owner of the bucket was allowed to set a
notification on the bucket.
in "reef" (v18) we fixed that, so that we follow the permissions set on the
bucket.
you can use the "s3PutBucketNotification" policy on the bucket
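A sketch of such a policy (principal, bucket name, endpoint, and client are all placeholders; any S3 client that can put a bucket policy works):

cat > policy.json <<EOF
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {"AWS": ["arn:aws:iam:::user/notification-user"]},
    "Action": ["s3:PutBucketNotification"],
    "Resource": ["arn:aws:s3:::mybucket"]
  }]
}
EOF
aws --endpoint-url http://rgw.example.com s3api put-bucket-policy \
    --bucket mybucket --policy file://policy.json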
I have no idea what you did there ;-) I would remove that config
though and rather configure the ceph image globally, there have been
several issues when cephadm tries to launch daemons with different
ceph versions. Although in your case it looks like they are actually
the same images accor
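A minimal sketch of that cleanup (the image reference is illustrative; point it at whatever the cluster should actually run):

ceph config rm mon container_image
ceph config set global container_image quay.io/ceph/ceph:v17.2.7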