Hi Frank,
thanks for looking up those trackers. I haven't looked into them yet,
I'll read your response in detail later, but I wanted to add a new
observation:
I added another root bucket (custom) to the osd tree:
# ceph osd tree
ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT
Hello everyone,
Nice talk yesterday. :-)
Regarding containers vs RPMs and orchestration, and the related discussion from
yesterday, I wanted to share a few things (which I wasn't able to share
yesterday on the call due to a headset/bluetooth stack issue) to explain why we
use cephadm and ceph
Hi,
thanks for picking that up so quickly!
I haven't used a host spec file yet to add new hosts, but if you read
my thread about the unknown PGs, this might be my first choice to do
that in the future. So thanks again for bringing it to my attention. ;-)
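For the archives, a minimal host spec might look roughly like the sketch
below (hostname, address and label are just placeholders, not from my setup),
applied with "ceph orch apply -i":
service_type: host
hostname: ceph-node-04
addr: 192.168.10.14
labels:
  - osd
# ceph orch apply -i host-spec.yaml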
Regards,
Eugen
Quoting Matthew Ve
Hi,
we are currently in the process of adopting the main s3 cluster to the
orchestrator.
We have two realms (one for us and one for the customer).
The old config worked fine, and depending on the port I requested, I got a
different x-amz-request-id header back:
x-amz-request-id: tx0307170ac0d734ab4-
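For reference, the per-port check is something along these lines (host name
and ports are placeholders):
# curl -sI http://rgw-host:8000/ | grep -i x-amz-request-id
# curl -sI http://rgw-host:8001/ | grep -i x-amz-request-id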
Thanks for being my rubber ducky.
Turns out I didn't have the rgw_zonegroup configured in the first apply.
Adding it to the config and applying it afterwards does not restart or
reconfigure the containers.
After doing a ceph orch restart rgw.customer it seems to work now.
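For anyone following along, the relevant part of the spec now looks roughly
like this sketch (service id, realm/zonegroup/zone names, placement and port
are placeholders, not our exact values):
service_type: rgw
service_id: customer
placement:
  count: 2
spec:
  rgw_realm: customer
  rgw_zonegroup: customer-zg
  rgw_zone: customer-zone
  rgw_frontend_port: 8001
followed by a re-apply and the restart mentioned above:
# ceph orch apply -i rgw-customer.yaml
# ceph orch restart rgw.customer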
Happy weekend everybody.
On Fri
I'm starting to think that the root cause of the remapping is simply that
the crush rule(s) contain(s) the "step take default" line:
step take default class hdd
My interpretation is that crush simply tries to honor the rule:
consider everything underneath the "default" root, so PG
Hi Frédéric,
I agree. Maybe we should re-frame things? Containers can run on
bare metal and containers can run virtualized. And distribution packages
can run on bare metal and virtualized as well.
What about asking independently about:
* Do you run containers or distribution packages?
* Do yo
Hello Sebastian,
I just checked the survey and you're right, the issue was with the question.
Got me a bit confused when I read it but I clicked anyway. Who doesn't like
clicking? :-D
What best describes your deployment target? *
1/ Bare metal (RPMs/Binary)
2/ Containers (cephadm/Rook)
3/ Bot
Hi Eugen,
so it is partly "unexpectedly expected" and partly buggy. I really wish the
crush implementation honoured a few obvious invariants. It is extremely
counter-intuitive that mappings taken from a sub-set change even if both the
sub-set and the mapping instructions themselves don't.
Thanks Enrico,
We are only syncing metadata between sites, so I don't think that bug will be
the cause of our issues.
I have been able to delete ~30k objects without causing the RGW to stop
processing.
Thanks
Iain
From: Enrico Bocchi
Sent: 22 May 2024 13:48
T
Hi all,
Goodness I'd say it's been at least 3 major releases since I had to do a
recovery. I have disks with 60-75,000 power_on_hours. I just updated from
Octopus to Reef last month and I'm hit with 3 disk failures and the mclock
ugliness. My recovery is moving at a wondrous 21 mb/sec after some
Hey Chris,
A number of users have been reporting issues with recovery on Reef
with mClock. Most folks have had success reverting to
osd_op_queue=wpq. AIUI 18.2.3 should have some mClock improvements but
I haven't looked at the list myself yet.
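If it helps, the revert itself is just something like the line below; note
that it only takes effect after the OSDs have been restarted:
# ceph config set osd osd_op_queue wpq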
Josh
On Fri, May 24, 2024 at 10:55 AM Mazzystr wrot
Is that a setting that can be applied runtime or does it req osd restart?
On Fri, May 24, 2024 at 9:59 AM Joshua Baergen
wrote:
> Hey Chris,
>
> A number of users have been reporting issues with recovery on Reef
> with mClock. Most folks have had success reverting to
> osd_op_queue=wpq. AIUI 18.
It requires an OSD restart, unfortunately.
Josh
On Fri, May 24, 2024 at 11:03 AM Mazzystr wrote:
>
> Is that a setting that can be applied runtime or does it req osd restart?
>
> On Fri, May 24, 2024 at 9:59 AM Joshua Baergen
> wrote:
>
> > Hey Chris,
> >
> > A number of users have been reporti
Hi,
I guess you mean use something like "step take DCA class hdd"
instead of "step take default class hdd" as in:
rule rule-ec-k7m11 {
    id 1
    type erasure
    min_size 3
    max_size 18
    step set_chooseleaf_tries 5
    step set_choose_tries 100
    step ta
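For comparison, a complete rule of that shape might look like the sketch
below; the chooseleaf step and the host failure domain are assumptions on my
side, not taken from your actual rule:
rule rule-ec-k7m11 {
    id 1
    type erasure
    min_size 3
    max_size 18
    step set_chooseleaf_tries 5
    step set_choose_tries 100
    step take DCA class hdd
    step chooseleaf indep 0 type host
    step emit
}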
I did the obnoxious task of updating ceph.conf and restarting all my osds.
ceph --admin-daemon /var/run/ceph/ceph-osd.*.asok config get osd_op_queue
{
"osd_op_queue": "wpq"
}
I have some spare memory on my target host/osd and increased the target
memory of that OSD to 10 GB and restarted. No
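For reference, the equivalent via the config database would be something
like this (the OSD id is a placeholder, 10737418240 bytes = 10 GiB):
# ceph config set osd.17 osd_memory_target 10737418240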
Now that you're on wpq, you can try tweaking osd_max_backfills (up)
and osd_recovery_sleep (down).
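For example, something along these lines (the values are just starting
points to experiment with, and on HDDs the _hdd variant of the sleep is
usually the one that applies):
# ceph config set osd osd_max_backfills 4
# ceph config set osd osd_recovery_sleep_hdd 0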
Josh
On Fri, May 24, 2024 at 1:07 PM Mazzystr wrote:
>
> I did the obnoxious task of updating ceph.conf and restarting all my osds.
>
> ceph --admin-daemon /var/run/ceph/ceph-osd.*.asok config get
On 24.05.2024 21:07, Mazzystr wrote:
I did the obnoxious task of updating ceph.conf and restarting all my
osds.
ceph --admin-daemon /var/run/ceph/ceph-osd.*.asok config get
osd_op_queue
{
"osd_op_queue": "wpq"
}
I have some spare memory on my target host/osd and increased the target
memo
When running a cephfs scrub, the MDS will crash with the following backtrace:
-1> 2024-05-25T09:00:23.028+1000 7ef2958006c0 -1
/usr/src/debug/ceph/ceph-18.2.2/src/mds/MDSRank.cc: In function 'void
MDSRank::abort(std::string_view)' thread 7ef2958006c0 time
2024-05-25T09:00:23.031373+1000
/usr/src
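In case it matters, a scrub like this is typically started with something
along the lines of the following (fs name, rank and flags here are
placeholders):
# ceph tell mds.<fsname>:0 scrub start / recursive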
Hi Everyone,
I'm putting together an HDD cluster with an EC pool dedicated to the backup
environment. Traffic via s3. Version 18.2, 7 OSD nodes, 12 * 12TB HDD +
1 NVMe each, 4+2 EC pool.
Wondering if there is some general guidance for initial setup/tuning with
regard to s3 object size. Files are
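For context, a pool like the one described above would typically be created
with something like the following sketch (profile and pool names, PG count
and host failure domain are my assumptions):
# ceph osd erasure-code-profile set ec-4-2 k=4 m=2 crush-device-class=hdd crush-failure-domain=host
# ceph osd pool create default.rgw.buckets.data 1024 1024 erasure ec-4-2
# ceph osd pool application enable default.rgw.buckets.data rgw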