On Tue, Mar 23, 2021 at 6:13 AM duluxoz wrote:
>
> Hi All,
>
> I've got a new issue (hopefully this one will be the last).
>
> I have a working Ceph (Octopus) cluster with a replicated pool
> (my-pool), an erasure-coded pool (my-pool-data), and an image (my-image)
> created - all *seems* to be wor
Hi All,
I've got a new issue (hopefully this one will be the last).
I have a working Ceph (Octopus) cluster with a replicated pool
(my-pool), an erasure-coded pool (my-pool-data), and an image (my-image)
created - all *seems* to be working correctly. I also have the correct
Keyring specified
Hi,
Does anyone know how to find out which client holds the lock on a file in CephFS?
I've hit a deadlock problem where a client is blocked waiting to acquire the lock, but I
don't know which client is holding it.
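For anyone hitting the same thing, one way to narrow down the holder is via the MDS
admin socket. This is only a sketch, assuming a reasonably recent release, with
mds.<name> as a placeholder:
ceph daemon mds.<name> dump_blocked_ops   # blocked requests include the client id
ceph daemon mds.<name> ops                # all in-flight requests
ceph daemon mds.<name> session ls         # map the client id to an address/mount
The session listing shows the client address, which usually identifies the machine
holding the caps/lock.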
Hi Dan:
Aha - I think the first commit is probably it - before that commit, the
fact that lo is highest in the interfaces enumeration didn't matter for us
[since it would always be skipped].
This is actually almost certainly also associated with that other site with
a similar problem (OSDs drop o
There are two commits between 14.2.16 and 14.2.18 related to loopback
network. Perhaps one of these is responsible for your issue [1].
I'd try playing with the options like cluster/public bind addr and
cluster/public bind interface until you can convince the osd to bind to the
correct listening IP
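For reference, these are the kinds of settings to experiment with in ceph.conf. A
sketch only, with placeholder networks/addresses; the exact option names are worth
double-checking with `ceph config help` for your release:
[osd]
    public network = 10.0.0.0/24    # network the OSD should advertise on
    cluster network = 10.0.1.0/24
    # or pin an explicit address per host:
    public addr = 10.0.0.21
    cluster addr = 10.0.1.21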
I don't think we explicitly set any ms settings in the OSD host ceph.conf
[all the OSDs' ceph.conf files are identical across the entire cluster].
ip a gives:
ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group
default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 1
Which `ms` settings do you have in the OSD host's ceph.conf or the ceph
config dump?
And how does `ip a` look on one of these hosts where the osd is registering
itself as 127.0.0.1?
You might as well set nodown again now. This will make ops pile up, but
that's the least of your concerns at the m
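Concretely, something along these lines (a sketch; the grep patterns are just examples):
ceph config dump | grep -E 'ms_|_addr|_network'    # centralized config
grep -Ei 'ms |addr|network' /etc/ceph/ceph.conf    # per-host config
ceph osd set nodown                                # stop the mons marking osds down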
Hm, yes it does [and I was wondering why loopbacks were showing up suddenly
in the logs]. This wasn't happening with 14.2.16 so what's changed about
how we specify stuff?
This might correlate with the other person on the IRC list who has problems
with 14.2.18 and their OSDs deciding they don't wor
What's with the OSDs having loopback addresses? E.g. v2:
127.0.0.1:6881/17664667,v1:127.0.0.1:6882/17664667
Does `ceph osd dump` show those same loopback addresses for each OSD?
This sounds familiar... I'm trying to find the recent ticket.
.. dan
On Mon, Mar 22, 2021, 6:07 PM Sam Skipsey wrot
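For anyone following along, a quick check for loopback registrations looks roughly like
this (a sketch; the osd id is a placeholder):
ceph osd dump | grep 127.0.0.1
ceph osd metadata <id> | grep -E 'front_addr|back_addr|hostname'
If the metadata also shows loopback front/back addresses, the OSD itself bound to lo,
rather than the mons recording it wrongly.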
hi Dan:
So, unsetting nodown results in... almost all of the OSDs being marked
down. (231 down out of 328).
Checking the actual OSD services, most of them were actually up and active
on the nodes, even when the mons had marked them down.
(On a few nodes, the down services corresponded to OSDs that
There will be a DocuBetter meeting on Thursday, 25 Mar 2021 at 0100 UTC.
We will discuss the Google Season of Docs proposal (the Comprehensive
Contribution Guide), the rewriting of the cephadm documentation and the new
section of the Teuthology Guide.
DocuBetter Meeting -- APAC
25 Mar 2021
0100 UT
Hi everyone!
I'm excited to announce two talks we have on the schedule for March 2021:
Persistent Bucket Notifications By Yuval Lifshitz
https://ceph.io/ceph-tech-talks/
The stream starts on March 25th at 17:00 UTC / 18:00 CET / 1:00 PM
EST / 10:00 AM PST
Persistent bucket notifications are go
Hi,
I would unset nodown (which hides osd failures) and norecover (which blocks PGs
from recovering degraded objects), then start the osds.
As soon as you have some osd logs reporting some failures, then share those...
- Dan
On Mon, Mar 22, 2021 at 3:49 PM Sam Skipsey wrote:
>
> So, we started the
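In other words, roughly (a sketch; the systemd unit names assume a plain package-based
install, not containers):
ceph osd unset nodown
ceph osd unset norecover
# then, per OSD host, bring daemons back a few at a time:
systemctl start ceph-osd@<id>
# or everything on that host at once:
systemctl start ceph-osd.target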
Hello,
following up on my mail from 2020 [0], it seems that OSDs sometimes have
"multiple classes" assigned:
[15:47:15] server6.place6:/var/lib/ceph/osd/ceph-4# ceph osd crush
rm-device-class osd.4
done removing class of osd(s): 4
[15:47:17] server6.place6:/var/lib/ceph/osd/ceph-4# ceph osd cru
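For anyone else seeing this, resetting the class usually goes along these lines (a
sketch; 'hdd' is just an example class):
ceph osd crush rm-device-class osd.4
ceph osd crush set-device-class hdd osd.4
ceph osd crush class ls-osd hdd    # verify the osd shows up under one class only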
So, we started the mons and mgr up again, and here's the relevant logs,
including also ceph versions. We've also turned off all of the firewalls on
all of the nodes so we know that there can't be network issues [and,
indeed, all of our management of the OSDs happens via logins from the
service node
Hi everyone,
We are approaching the April 2nd deadline in two weeks, so we should
start proposing the next meeting to go over the survey results.
Anybody in the community is welcome to join the Ceph Working Groups.
Please add your name to:
https://ceph.io/user-survey/
I have started a doodle:
https
I tried a cache tier in write-back mode in my cluster, but because my SSD
drive is a consumer (home-use) model, it cannot satisfy the IOPS requirements. Now I
want to disable write-back mode. I found the official documentation, but the doc was
outdated:
https://docs.ceph.com/en/latest/rados/operations/cache-tiering/?highlight=cache%20
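For the record, the rough procedure for dismantling a writeback tier is below; this is a
sketch with 'hot-pool'/'cold-pool' as placeholder names, so double-check it against the
docs for your release first:
# stop new writes landing in the cache, while reads still proxy through it
ceph osd tier cache-mode hot-pool proxy
# flush and evict everything still sitting in the cache pool
rados -p hot-pool cache-flush-evict-all
# once the cache pool is empty, detach it
ceph osd tier remove-overlay cold-pool
ceph osd tier remove cold-pool hot-pool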
Hey Rich!
Appreciate the info. This did work successfully! Just wanted to share my
experience in case others run into a similar situation:
First step, I disabled the tcmu-runner process on all 3 of our previous
iSCSI gateway nodes. Then from our MONs, I confirmed there were no
current locks
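For anyone searching later, leftover locks on the backing image can be checked with rbd
directly (a sketch; the pool/image names are placeholders):
rbd lock ls rbd-pool/iscsi-image
# a stale lock from an old gateway can be removed with
rbd lock rm rbd-pool/iscsi-image <lock-id> <locker>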
Thank you~
I will try to upgrade the cluster too. Seems like this is the only way for now.
😭
I will let you know once I complete testing. :)
Have a good day
Szabo, Istvan (Agoda) wrote on Monday, 22 March 2021 at 3:38 PM:
> Yeah, doesn't work. Last week they fixed my problem ticket which caused
> the crashes, and due
Some news: since the ceph pg inactive listing reported that 0 objects
are in this pg, I've marked it complete on the primary osd, and now it is unfound. So
I'm stuck again 😕
[WRN] OBJECT_UNFOUND: 4/58369044 objects unfound (0.000%)
pg 44.1aa has 4 unfound objects
[ERR] PG_DAMAGED: Possib
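For the archives, the usual commands at this point look like the following (a sketch;
mark_unfound_lost discards data, so it is a last resort only):
ceph pg 44.1aa list_unfound    # which objects are unfound, and which osds might hold them
ceph pg 44.1aa query           # check 'might_have_unfound' for osds worth bringing back
# last resort, when no copy can be recovered:
ceph pg 44.1aa mark_unfound_lost revert
# or, if no earlier version exists either:
ceph pg 44.1aa mark_unfound_lost delete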
Hi.
Unfortunately, there isn't a good guide for sizing Ganesha. It's pretty
lightweight, and the machines it needs are generally smaller than
what Ceph needs, so you probably won't have much of a problem.
Ganesha's scaling comes down to two factors, depending on the workload involved:
the CPU us
Hi Dan:
Thanks for the reply - at present, our mons and mgrs are off [because of
the unsustainable nature of the filesystem usage]. We'll try putting them
on again for long enough to get "ceph status" out of them, but because the
mgr was unable to actually talk to anything, and reply at that point
Hi Sam,
The daemons restart (for *some* releases) because of this:
https://tracker.ceph.com/issues/21672
In short, if the selinux module changes, and if you have selinux
enabled, then midway through yum update, there will be a systemctl
restart ceph.target issued.
For the rest -- I think you shou
Forgot to say, this is an Octopus 15.2.9 cluster, and there isn't any
force_create_pg option, which a couple of threads suggest using to make it work.
https://tracker.ceph.com/issues/10411
https://www.oreilly.com/library/view/mastering-proxmox-/9781788397605/42d80c67-10aa-4cf2-8812-e38c861cdc5d.xhtml
[https://www.or
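For what it's worth, in Octopus the old pg-level force_create_pg seems to have become an
osd subcommand; a sketch (it recreates the PG empty, so any data still in it is gone):
ceph osd force-create-pg 44.1aa --yes-i-really-mean-it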
Yeah, it doesn't work. Last week they fixed my problem ticket for the bug which caused
the crashes, and the crashes had stopped the replication. I'll give it another try this
week after the update; if the daemon doesn't crash, maybe it will work, because whenever
no crash happened, the data was synced. Fingers cro
Hi,
What can I do with this pg to make it work?
We lost osds 61 and 122 and no longer have them, but we still have 32, 33, and 70. I've
exported the pg chunks from them, but they are very small, and when I imported one back
into another osd, that osd never started again, so I had to remove the chunk
(44.1aas2, 44.1aas3
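For anyone attempting the same, PG shard export/import is typically done with
ceph-objectstore-tool roughly as below (a sketch; the paths and osd ids are placeholders,
and the OSD daemons must be stopped first):
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-32 \
    --pgid 44.1aas2 --op export --file /tmp/44.1aas2.export
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<dest> \
    --op import --file /tmp/44.1aas2.export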
Hi everyone:
I posted to the list on Friday morning (UK time), but apparently my email
is still in moderation (I have an email from the list bot telling me that
it's held for moderation but no updates).
Since this is a bit urgent - we have ~3PB of storage offline - I'm posting
again.
To save ret