[ceph-users] Ceph Days Dublin CFP ends today

2022-08-16 Thread Mike Perez
Hi everyone,

Ceph Days are returning, and we're starting with Dublin on September
13th! If you're attending Open Source Summit EU, consider adding this
event to your week!

A full-day event dedicated to sharing Ceph's transformative power and
fostering the vibrant Ceph community, Ceph Day Dublin is hosted by
WeWork, the Ceph community, and our friends.

The expert Ceph team, Ceph's customers and partners, and the Ceph
community join forces to discuss things like the status of the Ceph
project, recent Ceph project improvements and roadmap, and Ceph
community news. The day ends with a networking reception to foster
more Ceph learning.

Important Dates

CFP Opens: 2022-07-21
CFP Closes: 2022-08-17
Schedule Announcement: 2022-08-22
Event Date: 2022-09-13

CFP: https://survey.zohopublic.com/zs/xpD7fN
Event link: https://ceph.io/en/community/events/2022/ceph-days-dublin/
Registration: 
https://www.eventbrite.com/e/ceph-days-dublin-2022-tickets-388837191507


--
Mike Perez



[ceph-users] Re: CephFS performance degradation in root directory

2022-08-16 Thread Robert Sander

On 16.08.22 at 08:43, Gregory Farnum wrote:
I was wondering if it had something to do with quota enforcement. The
other possibility that occurs to me is that if other clients are monitoring
the system, or an admin pane (e.g. the dashboard) is displaying per-volume
or per-client stats, they may be poking at the mountpoint and
interrupting exclusive client caps?


It is really strange behavior.

It only happens directly in the mountpoint directory, and it only affects
the first file that gets written to. Additional files can be written at
full speed at the same time if they are started a little bit later.


We do see some activity on the MDS when the slowdown happens.

The write speed of the first process stays slow until it finishes (or the
process is cancelled).


It does not happen in a subdirectory of the mountpoint.

There is no quota set in the filesystem. There are approximately 270 files
stored (qcow2 images). There may be a dozen clients for this filesystem.
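
For completeness, this is how I checked that there is no quota and which
clients currently hold caps (the vxattr names are from the CephFS quota
docs and the MDS daemon name is a placeholder, so treat this as a sketch):

# no quota set on the root of the mount
getfattr -n ceph.quota.max_bytes /mnt/cephfs
getfattr -n ceph.quota.max_files /mnt/cephfs

# client sessions and their cap counts on the active MDS
ceph daemon mds.<name> session ls | grep -E '"id"|"num_caps"'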


Regards
--
Robert Sander
Heinlein Consulting GmbH
Schwedter Str. 8/9b, 10119 Berlin

http://www.heinlein-support.de

Tel: 030 / 405051-43
Fax: 030 / 405051-19

Mandatory disclosures per §35a GmbHG:
HRB 220009 B / Amtsgericht Berlin-Charlottenburg (district court),
Managing director: Peer Heinlein -- Registered office: Berlin


[ceph-users] Re: What is client request_load_avg? Troubleshooting MDS issues on Luminous

2022-08-16 Thread Eugen Block

Hi,


However, the ceph-mds process is pretty much constantly over 100% CPU
and often over 200%. It's a single process, right? That makes me
think that some operations are too slow or some task is pegging the CPU
at 100%.


you might want to look into multi-active MDS, especially with 5000
clients. We encountered the same thing in a cluster with only around
20 clients (kernel client mount) with lots of small files. The
single active MDS at that time was not heavily loaded, the mds cache size
was also not the problem, but the performance was not good at all. So
we decided to increase the number of MDS processes per MDS server and
also started to use directory pinning, which increased the performance
significantly.
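
In case it helps, the gist of what we did was roughly the following (the
rank count and the path are only an example, not a recommendation for your
cluster):

# allow a second active MDS rank for the filesystem
ceph fs set <fsname> max_mds 2

# pin a busy directory tree to a specific rank (0 or 1)
setfattr -n ceph.dir.pin -v 1 /mnt/cephfs/some/busy/dir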



Quoting distro...@gmail.com:


On Mon, 2022-08-15 at 08:33 +, Eugen Block wrote:

Hi,

do you see high disk utilization on the OSD nodes? 


Hi Eugen, thanks for the reply, much appreciated.


How is the load on 
the active MDS?


Yesterday I rebooted the three MDS nodes one at a time (which obviously
included a failover to a freshly booted node) and since then the
performance has improved. It could be a total coincidence though and
I'd really like to try and understand more of what's really going on.

The load average seems to stay pretty low on the active MDS server
(currently 1.56, 1.62, 1.57) and it has free RAM (60G used, 195G free).

The MDS servers almost never have CPU time spent waiting on I/O
(occasionally ~0.2 wa), so there does not seem to be a bottleneck to
disk or network.

However, the ceph-mds process is pretty much constantly over 100% CPU
and often over 200%. It's a single process, right? That makes me
think that some operations are too slow or some task is pegging the CPU
at 100%.

Perhaps profiling the MDS server somehow might tell me the kind of
thing it's stuck on?


How much RAM is configured for the MDS 
(mds_cache_memory_limit)?


Currently set to 51539607552, so ~50G?

We do often see it go over this limit and, as far as I understand, that
triggers the MDS to ask clients to release unused caps (we do get clients
that don't respond).

I think restarting the MDS causes the clients to drop all of their
unused caps, but hold the used ones for when the new MDS comes online
(so as not to overwhelm it)?

I'm not sure whether increasing the cache size helps (because it can
store more caps and put less pressure on the system when it tries to
drop them), or whether that actually increases pressure (because it has
more to track and more things to do).

We do have RAM free on the node though so we could increase it if you
think it might help?
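
For reference, this is roughly how I check the current value and the actual
cache usage, and how I would bump it at runtime (I'm assuming the admin
socket 'cache status' and 'config set' commands behave on Luminous the way
the docs describe, so treat this as a sketch):

# what is it set to right now?
ceph daemon mds.$(hostname) config show | grep mds_cache_memory_limit

# how much of it is actually used?
ceph daemon mds.$(hostname) cache status

# bump it at runtime to 64 GiB (ceph.conf still needed to persist it)
ceph daemon mds.$(hostname) config set mds_cache_memory_limit 68719476736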


You can list all MDS sessions with 'ceph daemon mds.<name> session ls'
to identify all your clients


Thanks, yeah there is a lot of nice info in there, although I'm not
quite sure which elements are useful. That's where I saw
"request_load_avg", which I'm not quite sure what it means.

We do have ~5000 active clients (and that number is pretty consistent).

The top 5 clients have over a million caps each, with the top client
having over 5 million itself.
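
In case it's useful to anyone else, this is the sort of thing I've been
running to find the cap-heavy clients (the field names are as they appear
in our Luminous 'session ls' output, so they may differ on newer releases):

ceph daemon mds.$(hostname) session ls | \
  jq -r '.[] | [.num_caps, .request_load_avg, .inst] | @tsv' | \
  sort -rn | head -5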


and 'ceph daemon mds.<name> dump_blocked_ops' to show blocked requests.


There are no blocked ops at the moment, according to (ceph daemon
mds.$(hostname) dump_blocked_ops) but I can try again once the system
performance degrades.

I feel like I need to get some of these metrics out into Prometheus or
something, so that I can look for historical trends (and add alerts).
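
As a stopgap I'm thinking of just scraping a few counters off the admin
socket (the counter names below are taken from a 'perf dump' on our MDS, so
treat this as a sketch rather than a recommendation):

ceph daemon mds.$(hostname) perf dump | \
  jq '{client_requests: .mds_server.handle_client_request,
       inodes: .mds.inodes, caps: .mds.caps}'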


But simply killing 
sessions isn't a solution, so first you need to find out where the 
bottleneck is.


Yeah, I totally agree with finding the real bottleneck, thanks for your
help.

My thinking could be totally wrong, but the reason I was looking into
identifying and killing problematic clients is that we get bursts where
some clients make harsh requests (like multiple jobs trying to
read/link/unlink millions of tiny files at once). If I can identify them,
I could try to 1) stop them to restore cluster performance for everyone
else and 2) get them to find a better way to do that task so we can avoid
the issue...

To your point about finding the source of the bottleneck though, I'd
much rather the Ceph cluster was able to handle anything that was
thrown at it... :-) My feeling is that the MDS is easily overwhelmed,
hopefully profiling somehow can help shine a light there.


Do you see hung requests or something? Anything in 
'dmesg' on the client side?


I don't see anything useful on the client side in dmesg, unfortunately.
Just lots of clients talking to mons successfully. The clients are
using the kernel ceph client, mounting with relatime (that could explain
lots of caps, even on a ro mount) and acl (I assume this puts extra
load/checks on the MDS).

At a guess, we could probably optimise the client mounts with noatime
instead, and maybe remove acl if we're not using ACLs - not sure of the
impact on workloads though, so I haven't tried.
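
Something like this is what I had in mind for the mounts (untested on our
workloads; the monitor hostnames and client name are placeholders, and I'm
assuming the kernel client accepts noacl the way the docs suggest):

mount -t ceph mon1,mon2,mon3:/ /mnt/cephfs \
  -o name=myclient,secretfile=/etc/ceph/myclient.secret,ro,noatime,noacl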

I'm not quite sure of the relationship of operations between MDS and
OSD da

[ceph-users] Re: What is client request_load_avg? Troubleshooting MDS issues on Luminous

2022-08-16 Thread Chris Smart
On Tue, 2022-08-16 at 07:50 +, Eugen Block wrote:
> Hi,
> 
> > However, the ceph-mds process is pretty much constantly over 100% CPU
> > and often over 200%. It's a single process, right? That makes me
> > think that some operations are too slow or some task is pegging the
> > CPU at 100%.
> 
> you might want to look into multi-active MDS, especially with 5000
> clients. We encountered the same thing in a cluster with only around
> 20 clients (kernel client mount) with lots of small files. The
> single active MDS at that time was not heavily loaded, the mds cache
> size was also not the problem, but the performance was not good at all.

Thanks, yeah I agree that multi-active MDS is probably the way to
scale, but I'm not sure how well that would work on Luminous. I guess
it might be worth a shot and if it doesn't behave well, just turn it
off... I'll think about this some more, thanks. Perhaps the first step
is to upgrade the cluster to a more recent version where multi-MDS will
be more stable, but that's a whole separate issue.

And that does bring me back to trying to identify "bad" clients and ask
them to change the way their jobs work, to help relieve the pressure
for everyone else until a longer-term solution can be applied.

> So we decided to increase the number of MDS processes per MDS server
> and also started to use directory pinning, which increased the
> performance significantly.
> 

Is "increasing the number of MDS processes per MDS server" only a
setting that's available in multi-MDS mode (I'm guessing so)? It'd be
kinda cool if there was a way to do that on the one MDS, then at least
I could use some of the other dozen or so CPU cores on the machine...

Thank you for taking the time to respond, much appreciated.

Cheers,
-c


[ceph-users] How to verify the use of wire encryption?

2022-08-16 Thread Martin Traxl
Hi,

I am running a Ceph 16.2.9 cluster with wire encryption. From my ceph.conf:
_
  ms client mode = secure
  ms cluster mode = secure
  ms mon client mode = secure
  ms mon cluster mode = secure
  ms mon service mode = secure
  ms service mode = secure
_

My cluster is running both messenger v1 and messenger v2, listening on the 
default ports 6789 and 3300. Now I have Nautilus clients (krbd) mounting rados 
block devices from this cluster.
When looking at the current sessions (ceph daemon <mon> sessions) for my 
rbd clients I see something like this:
_
{
    "name": "client.*",
    "entity_name": "client.fe-*",
    "addrs": {
        "addrvec": [
            {
                "type": "v1",
                "addr": "10.238.194.4:0",
                "nonce": 2819469832
            }
        ]
    },
    "socket_addr": {
        "type": "v1",
        "addr": "10.238.194.4:0",
        "nonce": 2819469832
    },
    "con_type": "client",
    "con_features": 3387146417253690110,
    "con_features_hex": "2f018fb87aa4aafe",
    "con_features_release": "luminous",
    "open": true,
    "caps": {
        "text": "profile rbd"
    },
    "authenticated": true,
    "global_id": 256359885,
    "global_id_status": "reclaim_ok",
    "osd_epoch": 13120,
    "remote_host": ""
},
_

As I understand it, "type": "v1" means messenger v1 is used and therefore no 
secure wire encryption, which comes with messenger v2. Is this understanding 
correct? How can I enable wire encryption here? Nautilus should be able to use 
msgr2. In general, how can I verify whether a client is using wire encryption?
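
For reference, this is roughly what I was planning to try next (the ms_mode
map option and the v2 port check are my assumptions from the docs, so please
correct me if that is not the right way to verify this):

# do the mons advertise msgr2 (port 3300) endpoints?
ceph mon dump | grep v2

# are the client connections actually going to the v2 port?
ss -tnp | grep ':3300'

# map an image forcing secure msgr2.1 mode (needs a recent kernel, 5.11+
# as far as I know, so probably not my Nautilus-era clients)
rbd map mypool/myimage --id fe-client -o ms_mode=secure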

Thank you,
Martin



[ceph-users] Re: What is client request_load_avg? Troubleshooting MDS issues on Luminous

2022-08-16 Thread Chris Smart
On Tue, 2022-08-16 at 10:52 +, Frank Schilder wrote:
> Hi Chris,
> 
> I would strongly advise not to use multi-MDS with 5000 clients on
> luminous. I enabled it on mimic with ca. 1750 clients and it was
> extremely dependent on luck whether it converged to a stable distribution
> of dirfrags or ended up doing export_dir operations all the time,
> completely killing the FS performance. Also, even in mimic where
> multi-MDS is no longer experimental, it still has a lot of bugs. You
> will need to monitor the cluster tightly and might be forced to
> intervene regularly, including going back and forth between single-
> and multi-MDS.
> 

Hi Frank,

Thanks a lot for passing on your experience, that's really valuable
info for a CephFS n00b like me. I have been wary of enabling multi-MDS
as I figured I'd end up hitting a lot of issues on Luminous, plus I'd
be in even deeper over my head...

> My recommendation would be to upgrade to octopus as fast as possible.
> It's the first version that supports ephemeral pinning, which I would
> say is pretty much the most useful multi-MDS mode, because it uses a
> static dirfrag distribution over all MDSes, avoiding the painful
> export_dir operations.
> 

OK yeah, I was just reading about ephemeral pinning, actually. Sounds
like the best plan is to move to Octopus and then also ensure we have a
solid upgrade plan moving forward. I only inherited this a couple of
months ago and it's still the same original Luminous cluster.
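
From what I've read so far, once on Octopus the distributed ephemeral
pinning is just an xattr on the parent directory plus more than one active
rank, something like the lines below - I'm treating this as a sketch until
I can actually test it:

ceph fs set <fsname> max_mds 2
setfattr -n ceph.dir.pin.distributed -v 1 /mnt/cephfs/home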

> You are in the unlucky situation that you will need 2 upgrades. I
> think going L->M->O might be the least painful as it requires only 1
> OSD conversion. If you are a bit more adventurous, you could also aim
> for L->N->P. Nautilus will probably not solve your performance issue
> and any path including nautilus will have an extra OSD conversion.
> However, in case you are using file store, you might want to go this
> route and change from file store to bluestore with a re-deployment of
> OSDs when you are on pacific. You will get out of some performance
> issues with upgraded OSDs and pacific has fixes for a boat load of FS
> snapshot issues.
> 

I am wary of upgrading between releases in general; I've looked into
this a bit and have noticed a number of people hit some strange issues.
I guess the fortunate thing is that most people have probably
experienced them already, so solutions should be relatively easy to
find. On the downside, I'm not sure many people will be able to help,
as this cluster is so old that people have probably forgotten or moved on.

But I guess I don't really have any other choice, it's either upgrade
or perhaps building a brand new cluster and migrating data.

Yeah, the cluster is also using filestore and it would be good to get
onto bluestore at some point. The cache is already on NVMe at least, so
that's helped.

> In the meantime, can you roll out something like ganglia on all
> client and storage nodes and collect network traffic stats? I found
> the packet report combined with bytes-in/out extremely useful to hunt
> down rogue FS clients. If you use snapshots, kworker CPU and
> wait-IO on the client node are also indicative of problems with this
> client.
> 

That's a good idea, I'll look into that.

Thanks again for the input, it's really helpful!

Cheers,
-c


> Best regards,
> =
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14



[ceph-users] ceph kernel client RIP when quota exceeded

2022-08-16 Thread Andrej Filipcic


Hi,

we experienced massive node failures when a user whose cephfs quota was 
exceeded submitted many jobs to a slurm cluster (home directories are on 
cephfs). The nodes still work for some time, but they eventually freeze 
due to too many stuck CPUs.


Is this a kernel ceph client bug? We are running kernel 5.10.123, and the 
ceph cluster is 16.2.9.


Best regards,
Andrej

2022-08-15T20:08:01+02:00 cn0539 kernel: [ cut here 
]
2022-08-15T20:08:01+02:00 cn0539 kernel: Attempt to access reserved 
inode number 0x101
2022-08-15T20:08:01+02:00 cn0539 kernel: WARNING: CPU: 172 PID: 4185848 
at fs/ceph/super.h:547 __lookup_inode+0x161/0x180 [ceph]
2022-08-15T20:08:14+02:00 cn0539 kernel: Modules linked in: squashfs 
loop overlay fuse ceph libceph mgc(O) lustre(O) lmv(O) mdc(O) fid(O) 
lov(O) fld(O) osc(O) ko2iblnd(O) ptlrpc(O) obdclass(O) lnet(O) libcfs(O) 
rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace nfs_ssc 
fscache rfkill ipmi_ssif nft_limit amd64_edac_mod edac_mce_amd 
amd_energy nft_ct kvm_amd nf_conntrack
nf_defrag_ipv6 kvm nf_defrag_ipv4 irqbypass crct10dif_pclmul 
crc32_pclmul ghash_clmulni_intel rapl pcspkr nf_tables libcrc32c 
nfnetlink sp5100_tco ccp acpi_ipmi k10temp i2c_piix4 ipmi_si rdma_ucm(O) 
rdma_cm(O) iw_cm(O) acpi_cpufreq ib_ipoib(O) ib_cm(O) ib_umad(O) sunrpc 
vfat fat ext4 mbcache jbd2 mlx5_ib(O) ib_uverbs(O) ib_core(O) 
mlx5_core(O) mlxfw(O) pci_hyperv_intf crc32c_inte
l tls ahci nvme psample igb libahci mlxdevm(O) auxiliary(O) nvme_core 
i2c_algo_bit libata t10_pi dca mlx_compat(O) pinctrl_amd xpmem(O) 
ipmi_devintf ipmi_msghandler
2022-08-15T20:08:14+02:00 cn0539 kernel: CPU: 172 PID: 4185848 Comm: 
slurm_script Tainted: G    W  O  5.10.123-2.el8.x86_64 #1
2022-08-15T20:08:16+02:00 cn0539 kernel: Hardware name: To be filled by 
O.E.M. To be filled by O.E.M./CER, BIOS BIOS_RME090.22.37.001 10/05/2021
2022-08-15T20:08:17+02:00 cn0539 kernel: RIP: 
0010:__lookup_inode+0x161/0x180 [ceph]
2022-08-15T20:08:18+02:00 cn0539 kernel: Code: dd 48 85 db 0f 85 27 ff 
ff ff 45 85 e4 0f 89 5d ff ff ff 49 63 ec e9 16 ff ff ff 48 89 de 48 c7 
c7 58 bb 40 c1 e8 1e 21 d8 d0 <0f> 0b e9 3f ff ff ff e8 53 3d 01 00 eb 
c6 be 03 00 00 00 e8 97 a2
2022-08-15T20:08:21+02:00 cn0539 kernel: RSP: 0018:b6d8de33fc18 
EFLAGS: 00010286
2022-08-15T20:08:22+02:00 cn0539 kernel: RAX:  RBX: 
0101 RCX: 0027
2022-08-15T20:08:23+02:00 cn0539 kernel: RDX: 0027 RSI: 
95f2afd207e0 RDI: 95f2afd207e8
2022-08-15T20:08:24+02:00 cn0539 kernel: RBP: 965345e568a0 R08: 
 R09: c000fffe
2022-08-15T20:08:25+02:00 cn0539 kernel: R10: 0001 R11: 
b6d8de33fa20 R12: 959e55081aa8
2022-08-15T20:08:27+02:00 cn0539 kernel: R13: 965345e568a8 R14: 
9593ea333e00 R15: 959e55081a80
2022-08-15T20:08:28+02:00 cn0539 kernel: FS:  7fbf7c8ba740() 
GS:95f2afd0() knlGS:
2022-08-15T20:08:29+02:00 cn0539 kernel: CS:  0010 DS:  ES:  
CR0: 80050033
2022-08-15T20:08:30+02:00 cn0539 kernel: CR2: 564324b8a588 CR3: 
004d5115 CR4: 00150ee0

2022-08-15T20:08:31+02:00 cn0539 kernel: Call Trace:
2022-08-15T20:08:31+02:00 cn0539 kernel: ? __do_request+0x3f0/0x450 [ceph]
2022-08-15T20:08:32+02:00 cn0539 kernel: ceph_lookup_inode+0xa/0x30 [ceph]
2022-08-15T20:08:34+02:00 cn0539 kernel: 
lookup_quotarealm_inode.isra.9+0x188/0x210 [ceph]
2022-08-15T20:08:34+02:00 cn0539 kernel: 
check_quota_exceeded+0x1bc/0x220 [ceph]

2022-08-15T20:08:34+02:00 cn0539 kernel: ceph_write_iter+0x1bf/0xc90 [ceph]
2022-08-15T20:08:35+02:00 cn0539 kernel: ? path_openat+0x666/0x1050
2022-08-15T20:08:36+02:00 cn0539 kernel: ? __touch_cap+0x1f/0xd0 [ceph]
2022-08-15T20:08:36+02:00 cn0539 kernel: ? ptep_set_access_flags+0x23/0x30
2022-08-15T20:08:37+02:00 cn0539 kernel: ? wp_page_reuse+0x5f/0x70
2022-08-15T20:08:38+02:00 cn0539 kernel: ? new_sync_write+0x11f/0x1b0
2022-08-15T20:08:38+02:00 cn0539 kernel: new_sync_write+0x11f/0x1b0
2022-08-15T20:08:39+02:00 cn0539 kernel: vfs_write+0x1bd/0x270
2022-08-15T20:08:40+02:00 cn0539 kernel: ksys_write+0x59/0xd0
2022-08-15T20:08:40+02:00 cn0539 kernel: do_syscall_64+0x33/0x40
2022-08-15T20:08:41+02:00 cn0539 kernel: 
entry_SYSCALL_64_after_hwframe+0x44/0xa9

2022-08-15T20:08:41+02:00 cn0539 kernel: RIP: 0033:0x7fbf7bfc65a8
2022-08-15T20:08:42+02:00 cn0539 kernel: Code: 89 02 48 c7 c0 ff ff ff 
ff eb b3 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 f5 3f 2a 00 8b 00 85 
c0 75 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 
00 00 00 41 54 49 89 d4 55
2022-08-15T20:08:45+02:00 cn0539 kernel: RSP: 002b:7ffcc4ad6dd8 
EFLAGS: 0246 ORIG_RAX: 0001
2022-08-15T20:08:46+02:00 cn0539 kernel: RAX: ffda RBX: 
0417 RCX: 7fbf7bfc65a8
2022-08-15T20:08:47+02:00 cn0539 kernel: RDX: 0417 RSI: 
564324baa470 RDI: 0004
2022-08-15T20:08:48+02:00 cn0539 kernel: RBP: 564324baa470 R08: 
000

[ceph-users] Re: ceph kernel client RIP when quota exceeded

2022-08-16 Thread Xiubo Li

Hi Andrej,

The upstream kernel has one commit:

commit 0078ea3b0566e3da09ae8e1e4fbfd708702f2876
Author: Jeff Layton 
Date:   Tue Nov 9 09:54:49 2021 -0500

    ceph: don't check for quotas on MDS stray dirs

    玮文 胡 reported seeing the WARN_RATELIMIT pop when writing to an
    inode that had been transplanted into the stray dir. The client was
    trying to look up the quotarealm info from the parent and that tripped
    the warning.

    Change the ceph_vino_is_reserved helper to not throw a warning for
    MDS stray directories (0x100 - 0x1ff), only for reserved dirs that
    are not in that range.

    Also, fix ceph_has_realms_with_quotas to return false when encountering
    a reserved inode.

    URL: https://tracker.ceph.com/issues/53180
    Reported-by: Hu Weiwen 
    Signed-off-by: Jeff Layton 
    Reviewed-by: Luis Henriques 
    Reviewed-by: Xiubo Li 
    Signed-off-by: Ilya Dryomov 

It's not a bug, just a warning; you can safely ignore it.

Thanks.

On 8/16/22 7:39 PM, Andrej Filipcic wrote:


Hi,

we experienced massive node failures when a user whose cephfs quota was 
exceeded submitted many jobs to a slurm cluster (home directories are on 
cephfs). The nodes still work for some time, but they eventually freeze 
due to too many stuck CPUs.


Is this a kernel ceph client bug? We are running kernel 5.10.123, and the 
ceph cluster is 16.2.9.


Best regards,
Andrej

[kernel log snipped - identical to the trace quoted in the original message above]

[ceph-users] Announcing go-ceph v0.17.0

2022-08-16 Thread Sven Anderson
We are happy to announce another release of the go-ceph API library. This is
a regular release following our every-two-months release cadence.

https://github.com/ceph/go-ceph/releases/tag/v0.17.0

Changes include additions to the rados and rgw packages. More details are
available at the link above.

The library includes bindings that aim to play a similar role to the
"pybind"
python bindings in the ceph tree but for the Go language. The library also
includes additional APIs that can be used to administer cephfs, rbd, and rgw
subsystems.
There are already a few consumers of this library in the wild, including the
ceph-csi project.
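
As usual, pulling the release into a Go module is a plain go get (building
requires cgo and the Ceph development packages, e.g. librados-dev,
librbd-dev and libcephfs-dev on Debian/Ubuntu):

go get github.com/ceph/go-ceph@v0.17.0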

Sven


[ceph-users] RBD images Prometheus metrics : not all pools/images reported

2022-08-16 Thread Gilles Mocellin

Hello Cephers,

I'm trying to diagnose who's doing what on our cluster, which has suffered 
from SLOW_OPS and high-latency periods since Pacific.


And I can't see all pools/images in the RBD stats.
I had activated RBD image stats while running Octopus; now it seems we 
only need to define mgr/prometheus/rbd_stats_pools.

I have put '*' to catch all pools (see the commands below).
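
For reference, this is how I enabled it and how I check that per-image
series show up (config key names as documented for the Prometheus mgr
module; 9283 is the default mgr endpoint port):

ceph config set mgr mgr/prometheus/rbd_stats_pools "*"
ceph config set mgr mgr/prometheus/rbd_stats_pools_refresh_interval 300

curl -s http://<active-mgr>:9283/metrics | grep '^ceph_rbd_write_bytes' | head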

First question: even when explicitly specifying an EC data pool, it doesn't 
seem to have stats.

I could understand that image stats would be collected on the metadata pool.
Is that correct?

But, second question: I can only see 3 pools in Prometheus metrics like 
ceph_rbd_read_ops (out of ~20; I use OpenStack with all its pools).


So, either in the Dashboard graphs or in my Grafana, I can only see 
metrics concerning these pools.


Hmm, I'm just seeing one thing... I have no images in the other pools... 
Gnocchi does not store images, my cinder-backup pool is empty, and my second 
cinder pool is empty as well.

And finally, the radosgw pools do not store RBD images either...

So I think I have my answer to that second question.

Anyway, it's strange that I can't get the same value when comparing the pool 
statistics with the sum over the RBD images in it:


sum(irate(ceph_rbd_write_bytes{cluster="mycluster",pool="myvolumepool"}[1m]))
irate(ceph_pool_wr_bytes{cluster="mycluster",pool_id="myvolumedatapoolid"}[1m])

ceph_pool_wr_bytes on the data pool is more than 10 times the sum of all 
ceph_rbd_write_bytes on the metadata pool.




[ceph-users] Re: Quincy: Corrupted devicehealth sqlite3 database from MGR crashing bug

2022-08-16 Thread Patrick Donnelly
Thank you, that's helpful. I have created a ticket with my findings so far:

https://tracker.ceph.com/issues/57152

Please follow there for updates.

On Mon, Aug 15, 2022 at 4:12 PM Daniel Williams  wrote:
>
> ceph-post-file: a9802e30-0096-410e-b5c0-f2e6d83acfd6
>
> On Tue, Aug 16, 2022 at 3:13 AM Patrick Donnelly  wrote:
>>
>> On Mon, Aug 15, 2022 at 11:39 AM Daniel Williams  wrote:
>> >
>> > Using ubuntu with apt repository from ceph.
>> >
>> > Ok that helped me figure out that it's .mgr not mgr.
>> > # ceph -v
>> > ceph version 17.2.3 (dff484dfc9e19a9819f375586300b3b79d80034d) quincy 
>> > (stable)
>> > # export CEPH_CONF='/etc/ceph/ceph.conf'
>> > # export CEPH_KEYRING='/etc/ceph/ceph.client.admin.keyring'
>> > # export CEPH_ARGS='--log_to_file true --log-file ceph-sqlite.log 
>> > --debug_cephsqlite 20 --debug_ms 1'
>> > # sqlite3
>> > SQLite version 3.31.1 2020-01-27 19:55:54
>> > Enter ".help" for usage hints.
>> > sqlite> .load libcephsqlite.so
>> > sqlite> .open file:///.mgr:devicehealth/main.db?vfs=ceph
>> > sqlite> .tables
>> > Segmentation fault (core dumped)
>> >
>> > # dpkg -l | grep ceph | grep sqlite
>> > ii  libsqlite3-mod-ceph  17.2.3-1focal 
>> >  amd64SQLite3 VFS for Ceph
>> >
>> > Attached ceph-sqlite.log
>>
>> No real good hint in the log unfortunately. I will need the core dump
>> to see where things went wrong. Can you upload it with
>>
>> https://docs.ceph.com/en/quincy/man/8/ceph-post-file/
>>
>> ?
>>
>> --
>> Patrick Donnelly, Ph.D.
>> He / Him / His
>> Principal Software Engineer
>> Red Hat, Inc.
>> GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
>>


-- 
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat, Inc.
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
