Hi,
I noticed a strange situation in one of our clusters. The OSD daemons are taking
too much RAM.
We are running 12.2.12 and have the default configuration of osd_memory_target
(4 GiB).
Heap dump shows:
osd.2969 dumping heap profile now.
MALLOC: 638
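For reference, assuming the OSDs are built with tcmalloc (the default allocator), heap statistics can be pulled and freed-but-cached memory handed back to the OS with something like:
ceph tell osd.2969 heap stats
ceph tell osd.2969 heap release
The release sometimes brings RSS down noticeably without restarting the OSD, which can help tell allocator caching apart from real memory growth.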
Hi Mark
Thank you for your feedback!
The maximum number of PGs per OSD is only 123, but we have PGs with a
lot of objects. For RGW, there is an 8+3 EC pool with 1024 PGs holding 900M
objects; maybe this is the problematic part. The OSDs are 510 HDD, 32 SSD.
I'm not sure, do you suggest using somet
Hi Herald,
Changing the bluestore cache settings will have no effect at all on
pglog memory consumption. You can try either reducing the number of PGs
(you might want to check how many PGs you have and, specifically,
how many PGs are on that OSD), or decreasing the number of pglog entries pe
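A rough sketch of both checks (option names as in Luminous-era configs; the runtime change may be reported as needing a restart and should also go into ceph.conf to persist):
ceph osd df tree   # the PGS column shows the PG count per OSD
ceph tell osd.* injectargs '--osd_max_pg_log_entries 1000 --osd_min_pg_log_entries 500'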
Hi,
I am running a small 3-node Ceph Nautilus 14.2.8 cluster on Ubuntu 18.04.
I am testing the cluster to expose a CephFS volume as a Samba v4 share for users
to access from Windows for later use.
Samba version is 4.7.6-Ubuntu and mount.cifs version is 6.8.
When I did a test with dd write (600 MB/s) and
Matching other fields in the token as part of the Condition statement is
work in progress, but isn't there in Nautilus.
Thanks,
Pritha
On Tue, May 12, 2020 at 10:21 PM Wyllys Ingersoll <
wyllys.ingers...@keepertech.com> wrote:
> Does STS support using other fields from the token as part of the
>
For now, the new cluster is using more storage than the old one, while
the sync still has about 50% to go. Besides, the number of objects in the
'rgw.buckets.data' pool is larger too.
I'm not sure whether the new cluster has enough space for all the
data.
On Tue, May 12, 2020 at 11:40 AM Zhenshi Zhou wrote:
Hello David
I have physical devices I can use to mirror the OSDs, no problem. But I
don't think those disks are actually failing, since there are no bad sectors
on them and they are brand new with no issues reading from them. But they got
a corrupt OSD superblock, which I believe happened because of bad SAS
Do you have access to another Ceph cluster with enough available space to
create RBDs that you can dd these failing disks into? That's what I'm doing
right now with some failing disks. I've recovered 2 out of 6 OSDs that
failed in this way. I would recommend against using the same cluster for
this, but
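In case it's useful, a minimal sketch of what I mean (pool and image names are just placeholders):
rbd create rescue/osd-failing-1 --size 4T
rbd map rescue/osd-failing-1
dd if=/dev/sdX of=/dev/rbd/rescue/osd-failing-1 bs=4M conv=noerror,sync status=progress
conv=noerror,sync keeps dd going past unreadable sectors and pads them with zeros, which is usually what you want on a dying disk.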
Hi Paul
I was able to successfully mount both OSDs I need data from using
"ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-92 --op fuse
--mountpoint /osd92/"
I can see the PG slices that are missing in the mounted folder,
"41.b3s3_head", "41.ccs5_head", etc. And I can copy any data from inside
topic to a bucket, the topic and bucket must have
the same owner. So I tried creating a topic using AWS auth.
The credential header I tried was the same as the one I use to get/put items to a
bucket:
Credential=/20200512/us-east-1/s3/aws4_request
However, in this case, rather than succeeding, I
Several OSDs in one of our clusters are currently down because RAM usage
has increased over the last few days. Now it is more than we can handle on
some systems, and OSDs frequently get killed by the OOM killer. Looking at
"ceph daemon osd.$OSD_ID dump_mempools", it shows that nearly all (about
8.5 G
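For reference, a quick way to see which mempools dominate in that output (osd_pglog and buffer_anon are the usual suspects when pglog is the problem):
ceph daemon osd.$OSD_ID dump_mempools | grep -A 2 -E 'osd_pglog|buffer_anon'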
Maybe some points to think about:
- Have/allow semi-beginners to add manual pages. I have been making
my own manuals, but it is of course better to put these online. Maybe
something like the PHP user contributions[1]
- When people have errors, put the solutions in the general
docum
Thank you. I looked through both logs and noticed this in the cancel one:
osd_op(unknown.0.0:4164 41.2 41:55b0279d:reshard::reshard.09:head [call
rgw.reshard_remove] snapc 0=[] ondisk+write+known_if_redirected e24984) v8 --
0x7fe9b3625710 con 0
osd_op_reply(4164 reshard.09 [call]
Perhaps the next step is to examine the generated logs from:
radosgw-admin reshard status --bucket=foo --debug-rgw=20 --debug-ms=1
radosgw-admin reshard cancel --bucket foo --debug-rgw=20 --debug-ms=1
Eric
--
J. Eric Ivancich
he / him / his
Red Hat Storage
Ann Arbor, Michigan, US
> I think, however, that a disappearing back network has no real consequences
> as the heartbeats always go over both.
FWIW this has not been my experience, at least through Luminous.
What I’ve seen is that when the cluster/replication net is configured but
unavailable, OSD heartbeats fail an
Does STS support using other fields from the token as part of the Condition
statement? For example looking for specific "sub" identities or matching
on custom token fields like lists of roles?
On Tue, May 12, 2020 at 11:50 AM Matt Benjamin wrote:
> yay! thanks Wyllys, Pritha
>
> Matt
>
> On
Hi MJ,
this should work. Note that when using cloned devices, all traffic will go
through the same VLAN. In that case, I believe you can simply remove the cluster
network definition and use just one IP; there is no point in having a second IP
on the same VLAN. You will probably have to do "noout,n
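A rough sketch of the sequence (option names as in a stock ceph.conf; adjust the subnet to your VLAN):
ceph osd set noout
ceph osd set norebalance
# in ceph.conf on every node: keep "public network = <VLAN subnet>" and
# drop the "cluster network = ..." line, then restart OSDs node by node
ceph osd unset norebalance
ceph osd unset noout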
Hello,
Bumping this in hopes that someone can shed some light on this. I've tried
to find details on these metrics but have come up empty-handed.
Thank you,
John
There is a general documentation meeting called the "DocuBetter Meeting",
and it is held every two weeks. The next DocuBetter Meeting will be on 13
May 2020 at 0830 PST, and will run for thirty minutes. Everyone with a
documentation-related request or complaint is invited. The meeting will be
held
Thanks a lot to all. It looks like dd'ing zeros does not help much with improving
security, but OSD encryption would be sufficient.
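For the record, a dmcrypt-encrypted OSD can be created with ceph-volume roughly like this (the device path is just a placeholder; existing unencrypted OSDs would need to be destroyed and recreated this way, one failure domain at a time):
ceph-volume lvm create --bluestore --data /dev/sdX --dmcrypt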
best regards,
Samuel
huxia...@horebdata.cn
From: Wido den Hollander
Date: 2020-05-12 14:03
To: Paul Emmerich; Dillaman, Jason
CC: Marc Roos; ceph-users
Subject: [ceph
yay! thanks Wyllys, Pritha
Matt
On Tue, May 12, 2020 at 11:38 AM Wyllys Ingersoll
wrote:
>
>
> Thanks for the hint, I fixed my keycloak configuration for that application
> client so the token only includes a single audience value and now it works
> fine.
>
> thanks!!
>
>
> On Tue, May 12, 20
Thanks for the hint. I fixed my Keycloak configuration for that application
client so the token only includes a single audience value, and now it works
fine.
thanks!!
On Tue, May 12, 2020 at 11:11 AM Wyllys Ingersoll <
wyllys.ingers...@keepertech.com> wrote:
> The "aud" field in the introspectio
The "aud" field in the introspection result is a list, not a single string.
On Tue, May 12, 2020 at 11:02 AM Pritha Srivastava
wrote:
> app_id must match with the 'aud' field in the token introspection result
> (In the example the value of 'aud' is 'customer-portal')
>
> Thanks,
> Pritha
>
> On
app_id must match the 'aud' field in the token introspection result
(in the example, the value of 'aud' is 'customer-portal').
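For reference, the role trust policy this maps to looks roughly like the documentation example (the realm URL, role name and 'customer-portal' value here are just illustrative):
radosgw-admin role create --role-name=S3Access \
  --assume-role-policy-doc='{"Version":"2012-10-17","Statement":[{"Effect":"Allow","Principal":{"Federated":["arn:aws:iam:::oidc-provider/localhost:8080/auth/realms/quickstart"]},"Action":["sts:AssumeRoleWithWebIdentity"],"Condition":{"StringEquals":{"localhost:8080/auth/realms/quickstart:app_id":"customer-portal"}}}]}'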
Thanks,
Pritha
On Tue, May 12, 2020 at 8:16 PM Wyllys Ingersoll <
wyllys.ingers...@keepertech.com> wrote:
>
> Running Nautilus 14.2.9 and trying to follow the STS exa
Hello,
Thank you very much Joshua, it worked.
I have set up three nodes with the cephadm tool, which was very easy.
But I asked myself, what if node 1 goes down?
Before cephadm, I could just manage everything from the other nodes with the
ceph commands.
Now I'm a bit stuck, because this cepha
Rgw users are a higher-level feature, and they don't have a direct
relationship to rados pools. Their permissions are controlled at the
bucket/object level by the S3/Swift APIs. I would start by reading
about S3's ACLs and bucket policies.
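As a tiny illustration (bucket, user, and file names are made up), granting another user read access via a bucket policy could look something like:
cat > policy.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {"AWS": ["arn:aws:iam:::user/other-user"]},
    "Action": ["s3:GetObject", "s3:ListBucket"],
    "Resource": ["arn:aws:s3:::mybucket", "arn:aws:s3:::mybucket/*"]
  }]
}
EOF
s3cmd setpolicy policy.json s3://mybucket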
On Mon, May 11, 2020 at 1:42 AM Vishwas Bm wrote:
>
> Hi,
The first thing I'd try is to use ceph-objectstore-tool to scrape the
inactive/broken PGs from the dead OSDs using its PG export feature.
Then import these PGs into any other OSD, which will automatically recover
them.
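A minimal sketch of the two steps (the OSD IDs and pgid are placeholders; the source and target OSDs must be stopped while the tool runs):
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-92 \
  --op export --pgid 41.b3s3 --file /root/41.b3s3.export
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-10 \
  --op import --file /root/41.b3s3.export
The import reads the pgid from the export file, so it only needs the file and the target OSD's data path.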
Paul
--
Paul Emmerich
Looking for help with your Ceph cluster? Contact us at https://cr
Yes
ceph osd df tree and ceph -s are at https://pastebin.com/By6b1ps1
On Tue, May 12, 2020 at 10:39 AM Eugen Block wrote:
> Can you share your osd tree and the current ceph status?
>
>
> Quoting Kári Bertilsson:
>
> > Hello
> >
> > I had an incidence where 3 OSD's crashed at once completely an
On 5/12/20 1:54 PM, Paul Emmerich wrote:
> And many hypervisors will turn writing zeroes into an unmap/trim (qemu
> detect-zeroes=unmap), so running trim on the entire empty disk is often the
> same as writing zeroes.
> So +1 for encryption being the proper way here
>
+1
And to add to this: N
And many hypervisors will turn writing zeroes into an unmap/trim (qemu
detect-zeroes=unmap), so running trim on the entire empty disk is often the
same as writing zeroes.
So +1 for encryption being the proper way here
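For completeness, assuming discards actually reach RBD (e.g. virtio-scsi with discard=unmap), the trim can be issued from inside the guest with something like:
fstrim -av            # trims all mounted filesystems that support discard
blkdiscard /dev/vdb   # discards an entire (unused) block device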
Paul
--
Paul Emmerich
Looking for help with your Ceph cluster? Contact us at
I would also like to add that the OSDs can (and will) use redirect-on-write
techniques (not to mention the physical device hardware as well).
Therefore, your zeroing of the device might just cause the OSDs to allocate
new extents of zeros while the old extents remain intact (albeit
unreferenced and
Thanks Eric.
Using your command for SET reported that the OSD may need a restart (which would
set it back to the default anyway), but the below seems to work:
ceph tell osd.24 config set objecter_inflight_op_bytes 1073741824
ceph tell osd.24 config set objecter_inflight_ops 10240
reading back the setting
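For what it's worth, the runtime values can be read back through the admin socket:
ceph daemon osd.24 config get objecter_inflight_op_bytes
ceph daemon osd.24 config get objecter_inflight_ops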
Hi,
On 11/05/2020 08:50, Wido den Hollander wrote:
Great to hear! I'm still behind this idea and all the clusters I design
have a single (or LACP) network going to the host.
One IP address per node where all traffic goes over. That's Ceph, SSH,
(SNMP) Monitoring, etc.
Wido
We have an 'old-st
dd if=/dev/zero of=rbd :) But if you have encrypted OSDs, what
would be the use of this?
-Original Message-
From: huxia...@horebdata.cn [mailto:huxia...@horebdata.cn]
Sent: 12 May 2020 12:55
To: ceph-users
Subject: [ceph-users] Zeroing out rbd image or volume
Hi, Ceph folks,
I
Hi, Ceph folks,
Is there an rbd command, or any other way, to zero out RBD images or volumes? I
would like to write all-zero data to an RBD image/volume before removing it.
Any comments would be appreciated.
best regards,
samuel
Horebdata AG
Switzerland
huxia...@horebdata.cn
Can you share your osd tree and the current ceph status?
Quoting Kári Bertilsson:
Hello
I had an incident where 3 OSDs crashed completely at once and won't power
up. And during recovery, 3 OSDs in another host have somehow become
corrupted. I am running erasure coding with an 8+2 setup usin
Hello
I had an incident where 3 OSDs crashed completely at once and won't power
up. And during recovery, 3 OSDs in another host have somehow become
corrupted. I am running erasure coding with an 8+2 setup using a crush map which
takes 2 OSDs per host, and after losing the other 2 OSDs I have a few PGs
Hi,
Any input on this?
Thanks & Regards,
Vishwas
On Mon, May 11, 2020 at 11:11 AM Vishwas Bm wrote:
> Hi,
>
> I am a newbie to ceph. I have gone through the ceph docs, we are planning
> to use rgw for object storage.
>
> From the docs, what I have understood is that there are two types