[ceph-users] Re: help me enable ceph iscsi gatewaty in ceph octopus
Hi Sharad, To add the first gateway you need to execute `gwcli` on `ceph-gw-1` host, and guarantee that you use the host FQDN on `create` command (in this case 'ceph-gw-1'). You can check your host FQDN by running the following script: python -c 'import socket; print(socket.getfqdn())' From: Sharad Mehrotra Sent: Thursday, August 6, 2020 1:03 AM To: ceph-users@ceph.io Subject: [ceph-users] Re: help me enable ceph iscsi gatewaty in ceph octopus Adding some additional context for my question below. I am following the directions here: https://docs.ceph.com/docs/master/rbd/iscsi-target-cli/, but am getting stuck on step #3 of the "Configuring" section, similar to the issue reported above that you worked on. FYI, I installed my ceph-iscsi package manually using these directions: https://docs.ceph.com/docs/master/rbd/iscsi-target-cli-manual-install, and I am running CentOS 7.6 on the nodes where I am installing the iscsi gateways. Here is what happens when I try to create the first gateway: /iscsi-target...-igw/gateways> create ceph-gw-1 10.0.201.66 skipchecks=true *The first gateway defined must be the local machine* /iscsi-target...-igw/gateways> And here is my iscsi-gateway.cfg file. [root@sv49 ceph]# cat iscsi-gateway.cfg [config] # Name of the Ceph storage cluster. A suitable Ceph configuration file allowing # access to the Ceph storage cluster from the gateway node is required, if not # colocated on an OSD node. cluster_name = ceph # Place a copy of the ceph cluster's admin keyring in the gateway's /etc/ceph # directory and reference the filename here gateway_keyring = ceph.client.admin.keyring # API settings. # The API supports a number of options that allow you to tailor it to your # local environment. If you want to run the API under https, you will need to # create cert/key files that are compatible for each iSCSI gateway node, that is # not locked to a specific node. SSL cert and key files *must* be called # 'iscsi-gateway.crt' and 'iscsi-gateway.key' and placed in the '/etc/ceph/' directory # on *each* gateway node. With the SSL files in place, you can use 'api_secure = true' # to switch to https mode. # To support the API, the bear minimum settings are: api_secure = false # Additional API configuration options are as follows, defaults shown. api_user = admin api_password = admin api_port = 5001 trusted_ip_list = 10.0.201.66, 10.0.201.67 Any help you can provide would be appreciated, thanks! On Wed, Aug 5, 2020 at 11:38 AM Sharad Mehrotra wrote: > Sebastian et al: > > How did you solve the "The first gateway > defined must be the local machine" issue that I asked about on another > thread? > > I am deploying ceph-iscsi manually as described in the link that you sent > out (https://docs.ceph.com/docs/master/rbd/iscsi-target-cli/). > Thank you! > > > On Wed, Aug 5, 2020 at 2:37 AM Sebastian Wagner wrote: > >> hi David, hi Ricardo, >> >> I think we first have to clarify, if that was actually a cephadm >> deployment (and not ceph-ansible). >> >> If you install Ceph using ceph-ansible, then please refer to the >> ceph-ansible docs. >> >> If we're actually talking about cephadm here (which is not clear to me): >> iSCSI for cephadm will land in the next octopus release and at that >> point we can add a proper documentation. 
>> >> >> >> Hope that helps, >> >> Sebastian >> >> Am 05.08.20 um 11:11 schrieb Ricardo Marques: >> > Hi David, >> > >> > I was able to configure iSCSI gateways on my local test environment >> using the following spec: >> > >> > ``` >> > # tail -14 service_spec_gw.yml >> > --- >> > service_type: iscsi >> > service_id: iscsi_service >> > placement: >> > hosts: >> > - 'node1' >> > - 'node2' >> > spec: >> > pool: rbd >> > trusted_ip_list: 10.20.94.201,10.20.94.202,10.20.94.203 >> > api_port: 5000 >> > api_user: admin1 >> > api_password: admin2 >> > api_secure: False >> > >> > # ceph orch apply -i service_spec_gw.yml >> > ``` >> > >> > You can use this spec as a starting point, but note that the pool must >> exist (in this case `rbd` pool), and you will need to adapt `hosts`, >> `trusted_ip_list`, etc... >> > >> > You may also want to change `api_secure` to `True` and set `ssl_cert` >> and `ssh_key` accordingly. >> > >> > Unfortunately, iSCSI deployment is not included in the documentation >> yet (coming soon). >> > >> > [1] https://docs.ceph.com/docs/octopus/cephadm/install/ >> > >> > >> > Ricardo Marques >> > >> > >> > From: David Thuong >> > Sent: Wednesday, August 5, 2020 5:16 AM >> > To: ceph-users@ceph.io >> > Subject: [ceph-users] help me enable ceph iscsi gatewaty in ceph octopus >> > >> > Please help me enable ceph iscsi gatewaty in ceph octopus . when i >> install ceph complete . i see iscsi gateway not enable. please help me >> config it >> > ___ >> > ceph-users mailing lis
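To illustrate the advice above about running `gwcli` on the gateway node itself and using its FQDN, here is a minimal sketch; the target IQN is the example from the upstream docs and the IP address is the one from this thread, so substitute your own values:

```
# run these on the ceph-gw-1 host itself
python -c 'import socket; print(socket.getfqdn())'   # must print the exact name passed to 'create'
gwcli
/> cd /iscsi-targets
/iscsi-targets> create iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw
/iscsi-targets> cd iqn.2003-01.com.redhat.iscsi-gw:iscsi-igw/gateways
/iscsi-target...-igw/gateways> create ceph-gw-1 10.0.201.66 skipchecks=true
```

If `getfqdn()` returns a short hostname or a name that does not match what you pass to `create`, the "first gateway defined must be the local machine" error above is the expected symptom.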
[ceph-users] Re: Ceph influxDB support versus Telegraf Ceph plugin?
On 2020-04-01 19:00, Stefan Kooman wrote: > > That said there are plenty of metrics not available outside of > prometheus plugin. I would recommend pushing ceph related metrics from > the built-in ceph plugin in the telegraf client as well. The PR to add > support for MDS and RGW to Ceph plugin has not been merged yet [1], but > can be used to built a telegraf daemon that does. > > [1]: https://github.com/influxdata/telegraf/pull/6915 ^^ This has been merged and MDS and RGW support is available since telegraf 1.15. FYI, Gr. Stefan -- | BIT BV https://www.bit.nl/Kamer van Koophandel 09090351 | GPG: 0xD14839C6 +31 318 648 688 / i...@bit.nl signature.asc Description: OpenPGP digital signature ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
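For anyone wiring this up, a hedged sketch of the telegraf side on a Ceph node; the option names follow the `inputs.ceph` sample config as far as I recall it (the MDS/RGW prefixes need telegraf >= 1.15), so check the plugin README for your version before relying on it:

```
cat <<'EOF' > /etc/telegraf/telegraf.d/ceph.conf
[[inputs.ceph]]
  ## scrape the daemon admin sockets on this host
  socket_dir = "/var/run/ceph"
  socket_suffix = "asok"
  mon_prefix = "ceph-mon"
  osd_prefix = "ceph-osd"
  mds_prefix = "ceph-mds"      # available since telegraf 1.15
  rgw_prefix = "ceph-client"   # available since telegraf 1.15
  gather_admin_socket_stats = true
  gather_cluster_stats = false # set true (needs ceph.conf/keyring) for cluster-wide stats
EOF
systemctl restart telegraf
```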
[ceph-users] I can just add 4Kn drives, not?
I can just add 4Kn drives to my existing setup not? Since this technology is only specific to how the osd daemon is talking to the disk? ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: I can just add 4Kn drives, not?
Yes, no problem -- Martin Verges Managing director Mobile: +49 174 9335695 E-Mail: martin.ver...@croit.io Chat: https://t.me/MartinVerges croit GmbH, Freseniusstr. 31h, 81247 Munich CEO: Martin Verges - VAT-ID: DE310638492 Com. register: Amtsgericht Munich HRB 231263 Web: https://croit.io YouTube: https://goo.gl/PGE1Bx Am Do., 6. Aug. 2020 um 12:13 Uhr schrieb Marc Roos < m.r...@f1-outsourcing.eu>: > > > I can just add 4Kn drives to my existing setup not? Since this > technology is only specific to how the osd daemon is talking to the > disk? > > > > > > > ___ > ceph-users mailing list -- ceph-users@ceph.io > To unsubscribe send an email to ceph-users-le...@ceph.io > ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Can you block gmail.com or so!!!
Can you block gmail.com or so!!! ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: Can you block gmail.com or so!!!
Please, not a simple gmail block 8) Not everyone wants to use their corporate account, self-host email, or use a marginally better/worse commercial gmail alternative. On 8/6/20 12:52 PM, Marc Roos wrote: Can you block gmail.com or so!!! ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: block.db/block.wal device performance dropped after upgrade to 14.2.10
Hi, I found the reason for this behavior change. With 14.2.10 the default value of "bluefs_buffered_io" was changed from true to false. https://tracker.ceph.com/issues/44818 Configuring this back to true seems to have solved my problems. Regards Manuel On Wed, 5 Aug 2020 13:30:45 +0200 Manuel Lausch wrote: > Hello Vladimir, > > I just tested this with a single node testcluster with 60 HDDs (3 of > them with bluestore without separate wal and db). > > With the 14.2.10, I see on the bluestore OSDs a lot of read IOPs while > snaptrimming. With 14.2.9 this was not an issue. > > I wonder if this would explain the huge amount of slowops on my big > testcluster (44 Nodes 1056 OSDs) while snaptrimming. I > cannot test a downgrade there, because there are no packages of older > releases for CentOS 8 available. > > Regards > Manuel > ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
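For reference, a sketch of how the revert described above can be rolled out with the centralized config (Mimic and later); older setups can put the option in the [osd] section of ceph.conf instead, and the OSDs need a restart to pick it up, so do them one host at a time once the cluster is healthy:

```
ceph config set osd bluefs_buffered_io true
# then, per OSD host:
systemctl restart ceph-osd.target
```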
[ceph-users] Re: snaptrim blocks IO on ceph nautilus
Hi, I think I found the reason why snaptrimming causes slow ops in my use case. With 14.2.10 the default value of "bluefs_buffered_io" was changed from true to false. https://tracker.ceph.com/issues/44818 Configuring this back to true seems to have solved my problems. Regards Manuel On Mon, 3 Aug 2020 10:31:49 +0200 Manuel Lausch wrote: > Hi, > > the problem still exists and I don't know whats the reason and how to > fix it. > > I figured out, that only about 20 OSDs was affected. After I did a > ceph daemon osd. compact on this the problem was gone. > > I compacted all OSDs in the hope my issue will be fixed with this. But > over the weekend I run into the same issue. > > Does anyone have/had the same issue and has a solution for this? > > Regards > Manuel > > > > On Mon, 27 Jul 2020 15:59:01 +0200 > Manuel Lausch wrote: > ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
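As a side note, the manual compaction mentioned in the quoted message can be triggered like this (osd.12 is a placeholder id); it is I/O-heavy, so compacting one OSD at a time is the safer approach:

```
# via the admin socket on the OSD's host
ceph daemon osd.12 compact
# or remotely from an admin node
ceph tell osd.12 compact
```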
[ceph-users] Re: Nautilus slow using "ceph tell osd.* bench"
Hi, I echo Jim's findings. Going from lower Nautilus versions up to .10 on 4 installations suddenly gave a huge drop in read performance, probably by about 3/4, so that my users were complaining that VMs were taking ages to boot up. Strangely, write performance was not affected so much. Kind regards, Dave -- Original Message -- From: "rainning" To: "Jim Forde" ; "ceph-users" Sent: Thursday, 6 Aug, 2020 At 02:02 Subject: [ceph-users] Re: Nautilus slow using "ceph tell osd.* bench" Hi Jim, did you check system stat (e.g. iostat, top, etc.) on both osds when you ran osd bench? Those might be able to give you some clues. Moreover, did you compare both osds' configurations? -- Original -- From: "Jim Forde" Date: Thu, Aug 6, 2020 06:51 AM To: "ceph-users"I have 2 clusters. Cluster 1 started at Hammer and has upgraded through the versions all the way to Nautilus 14.2.10 (Luminous to Nautilus in July 2020) . Cluster 2 started as Luminous and is now Nautilus 14.2.2 (Upgraded in September 2019) The clusters are basically identical 5 OSD Nodes with 6 osd's per node. They are both using disk drives. No SSD's. Prior to upgrading Cluster 1 running "ceph tell osd.0 bench -f plain" produced similar results across both clusters. ceph tell osd.0 bench -f plain bench: wrote 1 GiB in blocks of 4 MiB in 0.954819 sec at 1.0 GiB/sec 268 IOPS Now cluster 1 results are terrible, about 25% from before the upgrade. ceph tell osd.0 bench -f plain bench: wrote 1 GiB in blocks of 4 MiB in 4.03434 sec at 254 MiB/sec 63 IOPS Ceph -s shows HEALTH_OK. Dashboard looks good. 2 pools MON Dump min_mon_release 14 (nautilus) OSD Dump require_min_compat_client luminous min_compat_client jewel require_osd_release nautilus Not sure what is causing the slow performance. Ideas? ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
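A small sketch of the comparison suggested above: run the bench on one OSD while watching its backing device, and dump the effective OSD configuration so it can be diffed between the two clusters (device name and OSD id are placeholders):

```
# terminal 1, on the host carrying osd.0
iostat -x 1 /dev/sdb
# terminal 2
ceph tell osd.0 bench -f plain
# dump effective settings for comparison between cluster 1 and cluster 2
ceph config show osd.0 | sort > osd0-settings.txt
```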
[ceph-users] Re: Can you block gmail.com or so!!!
On 6/08/2020 8:52 pm, Marc Roos wrote: Can you block gmail.com or so!!! ! Gmail account here :( Can't we just restrict the list to emails from members? -- Lindsay ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: Can you block gmail.com or so!!!
or just block that user On Thu, Aug 6, 2020 at 2:06 PM Lindsay Mathieson < lindsay.mathie...@gmail.com> wrote: > On 6/08/2020 8:52 pm, Marc Roos wrote: > > Can you block gmail.com or so!!! > > ! Gmail account here :( > > > Can't we just restrict the list to emails from members? > > -- > Lindsay > ___ > ceph-users mailing list -- ceph-users@ceph.io > To unsubscribe send an email to ceph-users-le...@ceph.io > > ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: Quick interruptions in the Ceph cluster
Hi, I’m pretty sure that the deep-scrubs are causing the slow requests. There have been several threads about it on this list [3], there are two major things you can do: 1. Change default settings for deep-scrubs [1] to run them outside business hours to avoid additional load. 2: Change OSD op queue [2]: osd op queue = wpq osd op queue cut off = high We were able to reduce the slow requests drastically in our production cluster with these actions. Regards Eugen [1] https://docs.ceph.com/docs/mimic/rados/configuration/osd-config-ref/#scrubbing [2] https://docs.ceph.com/docs/mimic/rados/configuration/osd-config-ref/#operations [3] https://www.spinics.net/lists/ceph-users/msg60589.html Zitat von Gesiel Galvão Bernardes : Hi, I have been experiencing rapid outage events in the Ceph cluster. During these events I receive messages from slow ops, OSD downs, but at the same time it is operating. Magically everything is back to normal. These events usually last about 2 minutes. I couldn't find anything that could direct me to what causes these events, can you help me? I'm using Mimic (13.2.6) and CentOS7 on all nodes. Below is "ceph -s" and log of when event occurs: # ceph -s cluster: id: 4ea72929-6f9e-453a-8cd5-bb0712f6b874 health: HEALTH_OK services: mon: 2 daemons, quorum cmonitor,cmonitor2 mgr: cmonitor(active), standbys: cmonitor2 osd: 74 osds: 74 up, 74 in tcmu-runner: 10 daemons active data: pools: 7 pools, 3072 pgs objects: 22.17 M objects, 83 TiB usage: 225 TiB used, 203 TiB / 428 TiB avail pgs: 3063 active+clean 9active+clean+scrubbing+deep == Log of event: 2020-08-05 18:00:00.000179 [INF] overall HEALTH_OK 2020-08-05 17:55:28.905024 [INF] Cluster is now healthy 2020-08-05 17:55:28.904975 [INF] Health check cleared: PG_DEGRADED (was: Degraded data redundancy: 1/60350974 objects degraded (0.000%), 1 pg degraded) 2020-08-05 17:55:27.746606 [WRN] Health check update: Degraded data redundancy: 1/60350974 objects degraded (0.000%), 1 pg degraded (PG_DEGRADED) 2020-08-05 17:55:22.745820 [WRN] Health check update: Degraded data redundancy: 55/60350897 objects degraded (0.000%), 26 pgs degraded, 1 pg undersized (PG_DEGRADED) 2020-08-05 17:55:17.744218 [WRN] Health check update: Degraded data redundancy: 123/60350666 objects degraded (0.000%), 63 pgs degraded (PG_DEGRADED) 2020-08-05 17:55:12.743568 [WRN] Health check update: Degraded data redundancy: 192/60350660 objects degraded (0.000%), 88 pgs degraded (PG_DEGRADED) 2020-08-05 17:55:07.741759 [WRN] Health check update: Degraded data redundancy: 290/60350737 objects degraded (0.000%), 117 pgs degraded (PG_DEGRADED) 2020-08-05 17:55:02.737913 [WRN] Health check update: Degraded data redundancy: 299/60350764 objects degraded (0.000%), 119 pgs degraded (PG_DEGRADED) 2020-08-05 17:54:57.736694 [WRN] Health check update: Degraded data redundancy: 299/60350746 objects degraded (0.000%), 119 pgs degraded (PG_DEGRADED) 2020-08-05 17:54:52.736132 [WRN] Health check update: Degraded data redundancy: 299/60350731 objects degraded (0.000%), 119 pgs degraded (PG_DEGRADED) 2020-08-05 17:54:47.735612 [WRN] Health check update: Degraded data redundancy: 299/60350689 objects degraded (0.000%), 119 pgs degraded (PG_DEGRADED) 2020-08-05 17:54:42.734877 [WRN] Health check update: Degraded data redundancy: 301/60350677 objects degraded (0.000%), 120 pgs degraded (PG_DEGRADED) 2020-08-05 17:54:38.210906 [INF] Health check cleared: SLOW_OPS (was: 35 slow ops, oldest one blocked for 1954017 sec, daemons [mon.cmonitor,mon.cmonitor2] have slow ops.) 
2020-08-05 17:54:37.734218 [WRN] Health check update: 35 slow ops, oldest one blocked for 1954017 sec, daemons [mon.cmonitor,mon.cmonitor2] have slow ops. (SLOW_OPS) 2020-08-05 17:54:37.734132 [WRN] Health check update: Degraded data redundancy: 380/60350611 objects degraded (0.001%), 154 pgs degraded (PG_DEGRADED) 2020-08-05 17:54:34.171483 [INF] Health check cleared: PG_AVAILABILITY (was: Reduced data availability: 3 pgs inactive, 6 pgs peering) 2020-08-05 17:54:32.733499 [WRN] Health check update: Degraded data redundancy: 52121/60350544 objects degraded (0.086%), 211 pgs degraded (PG_DEGRADED) 2020-08-05 17:54:27.080529 [WRN] Monitor daemon marked osd.72 down, but it is still running 2020-08-05 17:54:32.102889 [WRN] Health check failed: 60 slow ops, oldest one blocked for 1954017 sec, daemons [osd.16,osd.22,osd.23,osd.27,osd.28,osd.29,osd.30,osd.35,osd.48,osd.5]... have slow ops. (SLOW_OPS) 2020-08-05 17:54:32.102699 [WRN] Health check update: Reduced data availability: 3 pgs inactive, 6 pgs peering (PG_AVAILABILITY) 2020-08-05 17:54:27.951343 [INF] osd.72 192.168.200.25:6844/64565 boot 2020-08-05 17:54:27.935996 [INF] Health check cleared: OSD_DOWN (was: 1 osds down) 2020-08-05 17:54:27.732679 [WRN] Health check update: Degraded data redundancy: 17
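A sketch of the two changes suggested above as centralized config commands (the hours are examples, adjust to your off-peak window; the op queue options only take effect after an OSD restart):

```
# 1. confine (deep-)scrubs to off-peak hours, e.g. 22:00-06:00
ceph config set osd osd_scrub_begin_hour 22
ceph config set osd osd_scrub_end_hour 6
# 2. op queue settings, then restart OSDs host by host
ceph config set osd osd_op_queue wpq
ceph config set osd osd_op_queue_cut_off high
```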
[ceph-users] Bluestore cache size, bluestore cache settings with nvme
Hi, Any idea what happens when "bluestore_cache_autotune" is true and "bluestore_cache_size" is 0, but the server has only NVMe devices? If bluestore_cache_size is 0, the OSD is supposed to fall back to the SSD or HDD default, but if there are no SSDs or HDDs, what happens when autotune is true? How should I size this value? Any help? Also, is it good to use NUMA pinning or not? Different information about this is going around on the internet. Thank you ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
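On the question above: as far as I understand it, when bluestore_cache_autotune is true the cache is sized out of osd_memory_target rather than from the per-device bluestore_cache_size_* defaults (an NVMe-only OSD would otherwise fall under the "ssd" value), so the memory target is usually the knob to size. A hedged sketch for inspecting and adjusting it, with osd.0 and the 6 GiB value as placeholders:

```
# what the OSD is actually running with
ceph config show osd.0 | egrep 'bluestore_cache|osd_memory_target'
# with autotuning on, size the cache indirectly via the memory target
ceph config set osd osd_memory_target 6442450944   # ~6 GiB, example only
# live cache/mempool usage, via the admin socket on the OSD host
ceph daemon osd.0 dump_mempools
```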
[ceph-users] Re: block.db/block.wal device performance dropped after upgrade to 14.2.10
Maneul, thank you for your input. This is actually huge, and the problem is exactly that. On a side note I will add, that I observed lower memory utilisation on OSD nodes since the update, and a big throughput on block.db devices(up to 100+MB/s) that was not there before, so logically that meant that some operations that were performed in memory before, now were executed directly on block device. Was digging through possible causes, but your time-saving message arrived earlier. Thank you! чт, 6 авг. 2020 г. в 14:56, Manuel Lausch : > Hi, > > I found the reasen of this behavior change. > With 14.2.10 the default value of "bluefs_buffered_io" was changed from > true to false. > https://tracker.ceph.com/issues/44818 > > configureing this to true my problems seems to be solved. > > Regards > Manuel > > On Wed, 5 Aug 2020 13:30:45 +0200 > Manuel Lausch wrote: > > > Hello Vladimir, > > > > I just tested this with a single node testcluster with 60 HDDs (3 of > > them with bluestore without separate wal and db). > > > > With the 14.2.10, I see on the bluestore OSDs a lot of read IOPs while > > snaptrimming. With 14.2.9 this was not an issue. > > > > I wonder if this would explain the huge amount of slowops on my big > > testcluster (44 Nodes 1056 OSDs) while snaptrimming. I > > cannot test a downgrade there, because there are no packages of older > > releases for CentOS 8 available. > > > > Regards > > Manuel > > > ___ > ceph-users mailing list -- ceph-users@ceph.io > To unsubscribe send an email to ceph-users-le...@ceph.io > ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] OSD Shard processing operations slowly
Hi, I'm running FIO benchmark to test my simple cluster (3 OSD's, 128 pg's - using Nautilus - v14.2.10) and after certain load of clients performing random read operations, the OSDs show very different performances in terms of op latency. In extreme cases there is an OSD that performs much worse than the others, despite receiving a similar number of operations. Getting more information on the distribution of operations, I can see that the operations are well distributed among the OSD's and the PG's, but in the OSD with poor performance, there is an internal queue (OSD Shard) that is dispatching requests very slowly. In my use case, for example, there is a OSD shard whose average wait time for operations was 120 ms and a OSD Shard that served a few more requests with an average wait time of 1.5 sec. The behavior of this queue ends up affecting the performance of ceph as a whole. The osd op queue implementation used is wpq, and during the execution I get a specific attribute of this queue (probably total_priority) that remains unchanged for a long time. The strange behavior is also repeated in other implementations (prio, m_clock). I've used the mimic version, another pg's distribution and the behavior is always the same, but it can happen in a different OSD or in a different shard. By default, the OSD has 5 shards. Increasing the number of shards considerably improves the performance of this OSD, but I would like to understand what is happening with this specific queue in the default configuration. Does anyone have any idea what might be happening? Thanks, Mafra. ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
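A sketch of the shard experiment described above (the value 8 is an example); as far as I recall, when osd_op_num_shards is left at 0 the effective count comes from osd_op_num_shards_hdd / osd_op_num_shards_ssd, and changes require an OSD restart:

```
ceph config show osd.0 | grep osd_op_num_shards
ceph config set osd osd_op_num_shards 8
systemctl restart ceph-osd.target        # per OSD host
# per-shard queue state can be inspected on the OSD host afterwards
ceph daemon osd.0 dump_op_pq_state
```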
[ceph-users] Re: Ceph does not recover from OSD restart
Hi Eric, yes, I had network restarts as well along the way. However, these should also not lead to the redundancy degradation I observed, it doesn't really explain why ceph lost track of so many objects. A temporary network outage on a server is an event that the cluster ought to survive without such damage. What does "transitioning to Stray" mean/indicate here? I did another test today and collected logs for a tracker issue. The problem can be reproduced and occurs if an "old" OSD is restarted, it does not happen when a "new" OSD restarts. Ceph seems to loose track of any placement information computed according to the original crush map from before adding OSDs. It looks like PG remappings are deleted when an OSD shuts down and can only be recovered for placements according to the new crush map, hence, permanent loss of information on restart of an "old" OSD. If one restores the original crush map for long enough, for example, by moving OSDs out of the sub-tree, the cluster can restore all PG remappings and restore full redundancy. Correct behaviour would be either to maintain the remappings until an OSD is explicitly purged from the cluster, or to check for object locations with respect to all relevant crush maps in the history. Another option would be, that every OSD checks on boot if it holds a copy of a certain version of an object that the cluster is looking for (reports as missing) and says "hey, I have it here" if found. This is, in fact, what I expected was implemented. The current behaviour is a real danger. Rebalancing after storage extension is *not* a degraded state, but a normal and usually lengthy maintenance operation. A network fail or host reboot during such operations should not cause havoc, in particular, not this kind of "virtual" data loss while all data is physically present and all hrdware is healthy. Objects that are present on some disk should be found automatically under any circumstances. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Eric Smith Sent: 05 August 2020 13:35:44 To: Frank Schilder; ceph-users Subject: RE: Ceph does not recover from OSD restart You have a LOT of state transitions during your maintenance and I'm not really sure why (There are a lot of complaints about the network). There's are also a lot of "transitioning to Stray" after initial startup of an OSD. I'd say let your cluster heal first before you start doing a ton a maintenance so old PG maps can be trimmed. That's the best I can ascertain from the logs for now. -Original Message- From: Frank Schilder Sent: Tuesday, August 4, 2020 8:35 AM To: Eric Smith ; ceph-users Subject: Re: Ceph does not recover from OSD restart If with monitor log you mean the cluster log /var/log/ceph/ceph.log, I should have all of it. Please find a tgz-file here: https://linkprotect.cudasvc.com/url?a=https%3a%2f%2ffiles.dtu.dk%2fu%2ftFCEZJzQhH2mUIRk%2flogs.tgz%3fl&c=E,1,uqVWoKuvpNjjLYU1JO2De96Pz8ZN-UBmy7cFmI6RllcEJg1Nboe8wNTzEx0kJ4WGDxciAY2Mnq_jWNncInKPg-wSwWzu2kV-ZmWlJVb_O9P-At48cWcXTDI9&typo=1 (valid 100 days). Contents: logs/ceph-2020-08-03.log - cluster log for the day of restart logs/ceph-osd.145.2020-08-03.log - log of "old" OSD trimmed to day of restart logs/ceph-osd.288.log - entire log of "new" OSD Hope this helps. 
Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Eric Smith Sent: 04 August 2020 14:15:11 To: Frank Schilder; ceph-users Subject: RE: Ceph does not recover from OSD restart Do you have any monitor / OSD logs from the maintenance when the issues occurred? Original message From: Frank Schilder Date: 8/4/20 8:07 AM (GMT-05:00) To: Eric Smith , ceph-users Subject: Re: Ceph does not recover from OSD restart Hi Eric, thanks for the clarification, I did misunderstand you. > You should not have to move OSDs in and out of the CRUSH tree however > in order to solve any data placement problems (This is the baffling part). Exactly. Should I create a tracker issue? I think this is not hard to reproduce with a standard crush map where host-bucket=physical host and I would, in fact, expect that this scenario is part of the integration test. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Eric Smith Sent: 04 August 2020 13:58:47 To: Frank Schilder; ceph-users Subject: RE: Ceph does not recover from OSD restart All seems in order in terms of your CRUSH layout. You can speed up the rebalancing / scale-out operations by increasing the osd_max_backfills on each OSD (Especially during off hours). The unnecessary degradation is not expected behavior with a cluster in HEALTH_OK status, but with backfill / rebalancing ongoing it's not unexpected. You should not have to move OSDs in and out
[ceph-users] Re: RGW unable to delete a bucket
BUMP... - Original Message - > From: "Andrei Mikhailovsky" > To: "ceph-users" > Sent: Tuesday, 4 August, 2020 17:16:28 > Subject: [ceph-users] RGW unable to delete a bucket > Hi > > I am trying to delete a bucket using the following command: > > # radosgw-admin bucket rm --bucket= --purge-objects > > However, in console I get the following messages. About 100+ of those messages > per second. > > 2020-08-04T17:11:06.411+0100 7fe64cacf080 1 > RGWRados::Bucket::List::list_objects_ordered INFO ordered bucket listing > requires read #1 > > > The command has been running for about 35 days days and it still hasn't > finished. The size of the bucket is under 1TB for sure. Probably around 500GB. > > I have recently removed about a dozen of old buckets without any issues. It's > this particular bucket that is being very stubborn. > > Anything I can do to remove it, including it's objects and any orphans it > might > have? > > > Thanks > > Andrei > ___ > ceph-users mailing list -- ceph-users@ceph.io > To unsubscribe send an email to ceph-users-le...@ceph.io ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: block.db/block.wal device performance dropped after upgrade to 14.2.10
Yeah, there are cases where enabling it will improve performance as rocksdb can then used the page cache as a (potentially large) secondary cache beyond the block cache and avoid hitting the underlying devices for reads. Do you have a lot of spare memory for page cache on your OSD nodes? You may be able to improve the situation with bluefs_buffered_io=false by increasing the osd_memory_target which should give the rocksdb block cache more memory to work with directly. One downside is that we currently double cache onodes in both the rocksdb cache and bluestore onode cache which hurts us when memory limited. We have some experimental work that might help in this area by better balancing bluestore onode and rocksdb block caches but it needs to be rebased after Adam's column family sharding work. The reason we had to disable bluefs_buffered_io again was that we had users with certain RGW workloads where the kernel started swapping large amounts of memory on the OSD nodes despite seemingly have free memory available. This caused huge latency spikes and IO slowdowns (even stalls). We never noticed it in our QA test suites and it doesn't appear to happen with RBD workloads as far as I can tell, but when it does happen it's really painful. Mark On 8/6/20 6:53 AM, Manuel Lausch wrote: Hi, I found the reasen of this behavior change. With 14.2.10 the default value of "bluefs_buffered_io" was changed from true to false. https://tracker.ceph.com/issues/44818 configureing this to true my problems seems to be solved. Regards Manuel On Wed, 5 Aug 2020 13:30:45 +0200 Manuel Lausch wrote: Hello Vladimir, I just tested this with a single node testcluster with 60 HDDs (3 of them with bluestore without separate wal and db). With the 14.2.10, I see on the bluestore OSDs a lot of read IOPs while snaptrimming. With 14.2.9 this was not an issue. I wonder if this would explain the huge amount of slowops on my big testcluster (44 Nodes 1056 OSDs) while snaptrimming. I cannot test a downgrade there, because there are no packages of older releases for CentOS 8 available. Regards Manuel ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: Ceph does not recover from OSD restart
I created https://tracker.ceph.com/issues/46847 = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Frank Schilder Sent: 06 August 2020 13:17:20 To: Eric Smith; ceph-users Subject: [ceph-users] Re: Ceph does not recover from OSD restart Hi Eric, yes, I had network restarts as well along the way. However, these should also not lead to the redundancy degradation I observed, it doesn't really explain why ceph lost track of so many objects. A temporary network outage on a server is an event that the cluster ought to survive without such damage. What does "transitioning to Stray" mean/indicate here? I did another test today and collected logs for a tracker issue. The problem can be reproduced and occurs if an "old" OSD is restarted, it does not happen when a "new" OSD restarts. Ceph seems to loose track of any placement information computed according to the original crush map from before adding OSDs. It looks like PG remappings are deleted when an OSD shuts down and can only be recovered for placements according to the new crush map, hence, permanent loss of information on restart of an "old" OSD. If one restores the original crush map for long enough, for example, by moving OSDs out of the sub-tree, the cluster can restore all PG remappings and restore full redundancy. Correct behaviour would be either to maintain the remappings until an OSD is explicitly purged from the cluster, or to check for object locations with respect to all relevant crush maps in the history. Another option would be, that every OSD checks on boot if it holds a copy of a certain version of an object that the cluster is looking for (reports as missing) and says "hey, I have it here" if found. This is, in fact, what I expected was implemented. The current behaviour is a real danger. Rebalancing after storage extension is *not* a degraded state, but a normal and usually lengthy maintenance operation. A network fail or host reboot during such operations should not cause havoc, in particular, not this kind of "virtual" data loss while all data is physically present and all hrdware is healthy. Objects that are present on some disk should be found automatically under any circumstances. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Eric Smith Sent: 05 August 2020 13:35:44 To: Frank Schilder; ceph-users Subject: RE: Ceph does not recover from OSD restart You have a LOT of state transitions during your maintenance and I'm not really sure why (There are a lot of complaints about the network). There's are also a lot of "transitioning to Stray" after initial startup of an OSD. I'd say let your cluster heal first before you start doing a ton a maintenance so old PG maps can be trimmed. That's the best I can ascertain from the logs for now. -Original Message- From: Frank Schilder Sent: Tuesday, August 4, 2020 8:35 AM To: Eric Smith ; ceph-users Subject: Re: Ceph does not recover from OSD restart If with monitor log you mean the cluster log /var/log/ceph/ceph.log, I should have all of it. Please find a tgz-file here: https://linkprotect.cudasvc.com/url?a=https%3a%2f%2ffiles.dtu.dk%2fu%2ftFCEZJzQhH2mUIRk%2flogs.tgz%3fl&c=E,1,uqVWoKuvpNjjLYU1JO2De96Pz8ZN-UBmy7cFmI6RllcEJg1Nboe8wNTzEx0kJ4WGDxciAY2Mnq_jWNncInKPg-wSwWzu2kV-ZmWlJVb_O9P-At48cWcXTDI9&typo=1 (valid 100 days). Contents: logs/ceph-2020-08-03.log - cluster log for the day of restart logs/ceph-osd.145.2020-08-03.log - log of "old" OSD trimmed to day of restart logs/ceph-osd.288.log - entire log of "new" OSD Hope this helps. 
Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Eric Smith Sent: 04 August 2020 14:15:11 To: Frank Schilder; ceph-users Subject: RE: Ceph does not recover from OSD restart Do you have any monitor / OSD logs from the maintenance when the issues occurred? Original message From: Frank Schilder Date: 8/4/20 8:07 AM (GMT-05:00) To: Eric Smith , ceph-users Subject: Re: Ceph does not recover from OSD restart Hi Eric, thanks for the clarification, I did misunderstand you. > You should not have to move OSDs in and out of the CRUSH tree however > in order to solve any data placement problems (This is the baffling part). Exactly. Should I create a tracker issue? I think this is not hard to reproduce with a standard crush map where host-bucket=physical host and I would, in fact, expect that this scenario is part of the integration test. Best regards, = Frank Schilder AIT Risø Campus Bygning 109, rum S14 From: Eric Smith Sent: 04 August 2020 13:58:47 To: Frank Schilder; ceph-users Subject: RE: Ceph does not recover from OSD restart All seems in order in terms of your CRUSH layout. You can speed up the reba
[ceph-users] Re: Can you block gmail.com or so!!!
Hi all, As previously mentioned, blocking the gmail domain isn't a feasible solution since the vast majority of @gmail.com subscribers (about 500 in total) are likely legitimate Ceph users. A mailing list member recommended some additional SPF checking a couple weeks ago which I just implemented today. I think what's actually happening is a bot will subscribe using a gmail address and then "clicks" the confirmation link. They then spam from a different domain pretending to be coming from gmail.com but it's not. The new config I put in place should block that. Hopefully this should cut down on the spam. I took over the Ceph mailing lists last year and it's been a never-ending cat and mouse game of spam filters/services, configuration changes, etc. I'm still learning how to be a mail admin so your patience and understanding is appreciated. -- David Galloway Systems Administrator, RDU Ceph Engineering IRC: dgalloway On 8/6/20 8:13 AM, Osama Elswah wrote: > or just block that user > > On Thu, Aug 6, 2020 at 2:06 PM Lindsay Mathieson < > lindsay.mathie...@gmail.com> wrote: > >> On 6/08/2020 8:52 pm, Marc Roos wrote: >>> Can you block gmail.com or so!!! >> >> ! Gmail account here :( >> >> >> Can't we just restrict the list to emails from members? >> >> -- >> Lindsay ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: Can you block gmail.com or so!!!
Thanks for you hard work David! Mark On 8/6/20 1:09 PM, David Galloway wrote: Hi all, As previously mentioned, blocking the gmail domain isn't a feasible solution since the vast majority of @gmail.com subscribers (about 500 in total) are likely legitimate Ceph users. A mailing list member recommended some additional SPF checking a couple weeks ago which I just implemented today. I think what's actually happening is a bot will subscribe using a gmail address and then "clicks" the confirmation link. They then spam from a different domain pretending to be coming from gmail.com but it's not. The new config I put in place should block that. Hopefully this should cut down on the spam. I took over the Ceph mailing lists last year and it's been a never-ending cat and mouse game of spam filters/services, configuration changes, etc. I'm still learning how to be a mail admin so your patience and understanding is appreciated. ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: Can you block gmail.com or so!!!
I looked at the received-from headers, and it looks to me like these messages are being fed into the list from the web interface. The first received from is from mailman web and a private IP. On 8/6/20 2:09 PM, David Galloway wrote: > Hi all, > > As previously mentioned, blocking the gmail domain isn't a feasible > solution since the vast majority of @gmail.com subscribers (about 500 in > total) are likely legitimate Ceph users. > > A mailing list member recommended some additional SPF checking a couple > weeks ago which I just implemented today. I think what's actually > happening is a bot will subscribe using a gmail address and then > "clicks" the confirmation link. They then spam from a different domain > pretending to be coming from gmail.com but it's not. The new config I > put in place should block that. > > Hopefully this should cut down on the spam. I took over the Ceph > mailing lists last year and it's been a never-ending cat and mouse game > of spam filters/services, configuration changes, etc. I'm still > learning how to be a mail admin so your patience and understanding is > appreciated. > -- Tony Lill, OCT, ajl...@ajlc.waterloo.on.ca President, A. J. Lill Consultants (519) 650 0660 539 Grand Valley Dr., Cambridge, Ont. N3H 2S2 (519) 241 2461 -- http://www.ajlc.waterloo.on.ca/ --- signature.asc Description: OpenPGP digital signature ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: block.db/block.wal device performance dropped after upgrade to 14.2.10
In my case I only have 16GB RAM per node with 5 OSD on each of them, so I actually have to tune osd_memory_target=2147483648 because with the default value of 4GB my osd processes tend to get killed by OOM. That is what I was looking into before the correct solution. I disabled osd_memory_target limitation essentially setting it to default 4GB - it helped in a sense that workload on the block.db device significantly dropped, but overall pattern was not the same - for example there still were no merges on the block.db device. It all came back to the usual pattern with bluefs_buffered_io=true. osd_memory_target limitation was implemented somewhere around 10 > 12 release upgrade I think, before memory auto scaling feature for bluestore was introduced - that's when my osds started to get OOM. They worked fine before that. чт, 6 авг. 2020 г. в 20:28, Mark Nelson : > Yeah, there are cases where enabling it will improve performance as > rocksdb can then used the page cache as a (potentially large) secondary > cache beyond the block cache and avoid hitting the underlying devices > for reads. Do you have a lot of spare memory for page cache on your OSD > nodes? You may be able to improve the situation with > bluefs_buffered_io=false by increasing the osd_memory_target which > should give the rocksdb block cache more memory to work with directly. > One downside is that we currently double cache onodes in both the > rocksdb cache and bluestore onode cache which hurts us when memory > limited. We have some experimental work that might help in this area by > better balancing bluestore onode and rocksdb block caches but it needs > to be rebased after Adam's column family sharding work. > > The reason we had to disable bluefs_buffered_io again was that we had > users with certain RGW workloads where the kernel started swapping large > amounts of memory on the OSD nodes despite seemingly have free memory > available. This caused huge latency spikes and IO slowdowns (even > stalls). We never noticed it in our QA test suites and it doesn't > appear to happen with RBD workloads as far as I can tell, but when it > does happen it's really painful. > > > Mark > > > On 8/6/20 6:53 AM, Manuel Lausch wrote: > > Hi, > > > > I found the reasen of this behavior change. > > With 14.2.10 the default value of "bluefs_buffered_io" was changed from > > true to false. > > https://tracker.ceph.com/issues/44818 > > > > configureing this to true my problems seems to be solved. > > > > Regards > > Manuel > > > > On Wed, 5 Aug 2020 13:30:45 +0200 > > Manuel Lausch wrote: > > > >> Hello Vladimir, > >> > >> I just tested this with a single node testcluster with 60 HDDs (3 of > >> them with bluestore without separate wal and db). > >> > >> With the 14.2.10, I see on the bluestore OSDs a lot of read IOPs while > >> snaptrimming. With 14.2.9 this was not an issue. > >> > >> I wonder if this would explain the huge amount of slowops on my big > >> testcluster (44 Nodes 1056 OSDs) while snaptrimming. I > >> cannot test a downgrade there, because there are no packages of older > >> releases for CentOS 8 available. > >> > >> Regards > >> Manuel > >> > > ___ > > ceph-users mailing list -- ceph-users@ceph.io > > To unsubscribe send an email to ceph-users-le...@ceph.io > > > ___ > ceph-users mailing list -- ceph-users@ceph.io > To unsubscribe send an email to ceph-users-le...@ceph.io > ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: Can you block gmail.com or so!!!
Oh, interesting. You appear to be correct. I'm running each of the mailing lists' services in their own containers so the private IP makes sense. I just commented on a FR for Hyperkitty to disable posting via Web UI: https://gitlab.com/mailman/hyperkitty/-/issues/264 Aside from that, I can confirm my new SPF filter has already blocked one spam e-mail from getting through so that's good. Thanks for the tip. On 8/6/20 2:34 PM, Tony Lill wrote: > I looked at the received-from headers, and it looks to me like these > messages are being fed into the list from the web interface. The first > received from is from mailman web and a private IP. > > On 8/6/20 2:09 PM, David Galloway wrote: >> Hi all, >> >> As previously mentioned, blocking the gmail domain isn't a feasible >> solution since the vast majority of @gmail.com subscribers (about 500 in >> total) are likely legitimate Ceph users. >> >> A mailing list member recommended some additional SPF checking a couple >> weeks ago which I just implemented today. I think what's actually >> happening is a bot will subscribe using a gmail address and then >> "clicks" the confirmation link. They then spam from a different domain >> pretending to be coming from gmail.com but it's not. The new config I >> put in place should block that. >> >> Hopefully this should cut down on the spam. I took over the Ceph >> mailing lists last year and it's been a never-ending cat and mouse game >> of spam filters/services, configuration changes, etc. I'm still >> learning how to be a mail admin so your patience and understanding is >> appreciated. >> > > > ___ > ceph-users mailing list -- ceph-users@ceph.io > To unsubscribe send an email to ceph-users-le...@ceph.io > ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: Remapped PGs
Still haven't figured this out. We went ahead and upgraded the entire cluster to Podman 2.0.4 and in the process did OS/Kernel upgrades and rebooted every node, one at a time. We've still got 5 PGs stuck in 'remapped' state, according to 'ceph -s' but 0 in the pg dump output in that state. Does anybody have any suggestions on what to do about this? On Wed, Aug 5, 2020 at 10:54 AM David Orman wrote: > Hi, > > We see that we have 5 'remapped' PGs, but are unclear why/what to do about > it. We shifted some target ratios for the autobalancer and it resulted in > this state. When adjusting ratio, we noticed two OSDs go down, but we just > restarted the container for those OSDs with podman, and they came back up. > Here's status output: > > ### > root@ceph01:~# ceph status > INFO:cephadm:Inferring fsid x > INFO:cephadm:Inferring config x > INFO:cephadm:Using recent ceph image docker.io/ceph/ceph:v15 > cluster: > id: 41bb9256-c3bf-11ea-85b9-9e07b0435492 > health: HEALTH_OK > > services: > mon: 5 daemons, quorum ceph01,ceph04,ceph02,ceph03,ceph05 (age 2w) > mgr: ceph03.ytkuyr(active, since 2w), standbys: ceph01.aqkgbl, > ceph02.gcglcg, ceph04.smbdew, ceph05.yropto > osd: 168 osds: 168 up (since 2d), 168 in (since 2d); 5 remapped pgs > > data: > pools: 3 pools, 1057 pgs > objects: 18.00M objects, 69 TiB > usage: 119 TiB used, 2.0 PiB / 2.1 PiB avail > pgs: 1056 active+clean > 1active+clean+scrubbing+deep > > io: > client: 859 KiB/s rd, 212 MiB/s wr, 644 op/s rd, 391 op/s wr > > root@ceph01:~# > > ### > > When I look at ceph pg dump, I don't see any marked as remapped: > > ### > root@ceph01:~# ceph pg dump |grep remapped > INFO:cephadm:Inferring fsid x > INFO:cephadm:Inferring config x > INFO:cephadm:Using recent ceph image docker.io/ceph/ceph:v15 > dumped all > root@ceph01:~# > ### > > Any idea what might be going on/how to recover? All OSDs are up. Health is > 'OK'. This is Ceph 15.2.4 deployed using Cephadm in containers, on Podman > 2.0.3. > ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
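A few commands that may help narrow down where the counter disagreement above comes from; `ceph pg ls remapped` and any lingering pg_temp/pg_upmap entries in the osdmap usually show which PGs 'ceph -s' is counting:

```
ceph pg ls remapped
ceph osd dump | egrep 'pg_temp|pg_upmap'
ceph pg dump pgs_brief | grep -v 'active+clean'
```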
[ceph-users] Re: block.db/block.wal device performance dropped after upgrade to 14.2.10
a 2GB memory target will absolutely starve the OSDs of memory for rocksdb block cache which probably explains why you are hitting the disk for reads and a shared page cache is helping so much. It's definitely more memory efficient to have a page cache scheme rather than having more cache for each OSD, but for NVMe drives you can end up having more contention and overhead. For older systems with slower devices and lower amounts of memory the page cache is probably a win. FWIW with a 4GB+ memory target I suspect you would see far fewer cache miss reads (but obviously you can't do that on your nodes). Mark On 8/6/20 1:47 PM, Vladimir Prokofev wrote: In my case I only have 16GB RAM per node with 5 OSD on each of them, so I actually have to tune osd_memory_target=2147483648 because with the default value of 4GB my osd processes tend to get killed by OOM. That is what I was looking into before the correct solution. I disabled osd_memory_target limitation essentially setting it to default 4GB - it helped in a sense that workload on the block.db device significantly dropped, but overall pattern was not the same - for example there still were no merges on the block.db device. It all came back to the usual pattern with bluefs_buffered_io=true. osd_memory_target limitation was implemented somewhere around 10 > 12 release upgrade I think, before memory auto scaling feature for bluestore was introduced - that's when my osds started to get OOM. They worked fine before that. чт, 6 авг. 2020 г. в 20:28, Mark Nelson : Yeah, there are cases where enabling it will improve performance as rocksdb can then used the page cache as a (potentially large) secondary cache beyond the block cache and avoid hitting the underlying devices for reads. Do you have a lot of spare memory for page cache on your OSD nodes? You may be able to improve the situation with bluefs_buffered_io=false by increasing the osd_memory_target which should give the rocksdb block cache more memory to work with directly. One downside is that we currently double cache onodes in both the rocksdb cache and bluestore onode cache which hurts us when memory limited. We have some experimental work that might help in this area by better balancing bluestore onode and rocksdb block caches but it needs to be rebased after Adam's column family sharding work. The reason we had to disable bluefs_buffered_io again was that we had users with certain RGW workloads where the kernel started swapping large amounts of memory on the OSD nodes despite seemingly have free memory available. This caused huge latency spikes and IO slowdowns (even stalls). We never noticed it in our QA test suites and it doesn't appear to happen with RBD workloads as far as I can tell, but when it does happen it's really painful. Mark On 8/6/20 6:53 AM, Manuel Lausch wrote: Hi, I found the reasen of this behavior change. With 14.2.10 the default value of "bluefs_buffered_io" was changed from true to false. https://tracker.ceph.com/issues/44818 configureing this to true my problems seems to be solved. Regards Manuel On Wed, 5 Aug 2020 13:30:45 +0200 Manuel Lausch wrote: Hello Vladimir, I just tested this with a single node testcluster with 60 HDDs (3 of them with bluestore without separate wal and db). With the 14.2.10, I see on the bluestore OSDs a lot of read IOPs while snaptrimming. With 14.2.9 this was not an issue. I wonder if this would explain the huge amount of slowops on my big testcluster (44 Nodes 1056 OSDs) while snaptrimming. 
I cannot test a downgrade there, because there are no packages of older releases for CentOS 8 available. Regards Manuel ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: RGW unable to delete a bucket
Hi Folks, I don't know of a downstream issue that looks like this, and we've upstreamed every fix for bucket listing and cleanup we have. We are pursuing a space leak believed to arise in "radosgw-admin bucket rm --purge-objects" but not a non-terminating listing. The only upstream release not planned to get a backport of orphans list tools is Luminous. I thought backport to Octopus was already done by the backport team? regards, Matt On Thu, Aug 6, 2020 at 2:40 PM EDH - Manuel Rios wrote: > > You'r not the only one affected by this issue > > As far as i know several huge companies hitted this bug too, but private > patches or tools are not public released. > > This is caused for the a resharding process during upload in previous > versions. > > Workarround for us.: > > - Delete objects of the bucket at rados level. > - Delete the index file of the bucket. > > Pray to god to not happen again. > > Still pending backporting to Nautilus of the new experimental tool to find > orphans in RGW > > Maybe @Matt Benjamin can give us and ETA for get ready that tool backported... > > Regards > > > > -Mensaje original- > De: Andrei Mikhailovsky > Enviado el: jueves, 6 de agosto de 2020 13:55 > Para: ceph-users > Asunto: [ceph-users] Re: RGW unable to delete a bucket > > BUMP... > > > - Original Message - > > From: "Andrei Mikhailovsky" > > To: "ceph-users" > > Sent: Tuesday, 4 August, 2020 17:16:28 > > Subject: [ceph-users] RGW unable to delete a bucket > > > Hi > > > > I am trying to delete a bucket using the following command: > > > > # radosgw-admin bucket rm --bucket= --purge-objects > > > > However, in console I get the following messages. About 100+ of those > > messages per second. > > > > 2020-08-04T17:11:06.411+0100 7fe64cacf080 1 > > RGWRados::Bucket::List::list_objects_ordered INFO ordered bucket > > listing requires read #1 > > > > > > The command has been running for about 35 days days and it still > > hasn't finished. The size of the bucket is under 1TB for sure. Probably > > around 500GB. > > > > I have recently removed about a dozen of old buckets without any > > issues. It's this particular bucket that is being very stubborn. > > > > Anything I can do to remove it, including it's objects and any orphans > > it might have? > > > > > > Thanks > > > > Andrei > > ___ > > ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an > > email to ceph-users-le...@ceph.io > ___ > ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to > ceph-users-le...@ceph.io > -- Matt Benjamin Red Hat, Inc. 315 West Huron Street, Suite 140A Ann Arbor, Michigan 48103 http://www.redhat.com/en/technologies/storage tel. 734-821-5101 fax. 734-769-8938 cel. 734-216-5309 ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: Can you block gmail.com or so!!!
No please :-( ! I'm a Ceph user with a gmail account.

On Thursday, August 6, 2020, David Galloway wrote:
> Oh, interesting. You appear to be correct. I'm running each of the mailing lists' services in their own containers so the private IP makes sense.
> I just commented on a FR for Hyperkitty to disable posting via Web UI: https://gitlab.com/mailman/hyperkitty/-/issues/264
> Aside from that, I can confirm my new SPF filter has already blocked one spam e-mail from getting through so that's good.
> Thanks for the tip.
>
> On 8/6/20 2:34 PM, Tony Lill wrote:
>> I looked at the received-from headers, and it looks to me like these messages are being fed into the list from the web interface. The first received-from is from mailman web and a private IP.
>>
>> On 8/6/20 2:09 PM, David Galloway wrote:
>>> Hi all,
>>> As previously mentioned, blocking the gmail domain isn't a feasible solution since the vast majority of @gmail.com subscribers (about 500 in total) are likely legitimate Ceph users.
>>> A mailing list member recommended some additional SPF checking a couple weeks ago which I just implemented today. I think what's actually happening is a bot will subscribe using a gmail address and then "clicks" the confirmation link. They then spam from a different domain pretending to be coming from gmail.com but it's not. The new config I put in place should block that.
>>> Hopefully this should cut down on the spam. I took over the Ceph mailing lists last year and it's been a never-ending cat and mouse game of spam filters/services, configuration changes, etc. I'm still learning how to be a mail admin so your patience and understanding is appreciated.
>>
>> ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
> ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: RGW unable to delete a bucket
You're not the only one affected by this issue. As far as I know several huge companies hit this bug too, but private patches or tools have not been publicly released. This is caused by a resharding process during upload in previous versions.

Workaround for us:
- Delete the objects of the bucket at the rados level.
- Delete the index file of the bucket.

Pray to God it does not happen again. The backport to Nautilus of the new experimental tool to find orphans in RGW is still pending. Maybe @Matt Benjamin can give us an ETA for getting that tool backported... Regards

-Original Message- From: Andrei Mikhailovsky Sent: Thursday, 6 August 2020 13:55 To: ceph-users Subject: [ceph-users] Re: RGW unable to delete a bucket

BUMP...

- Original Message -
> From: "Andrei Mikhailovsky" To: "ceph-users" Sent: Tuesday, 4 August, 2020 17:16:28 Subject: [ceph-users] RGW unable to delete a bucket
> Hi
> I am trying to delete a bucket using the following command:
> # radosgw-admin bucket rm --bucket= --purge-objects
> However, in the console I get the following messages, about 100+ of them per second:
> 2020-08-04T17:11:06.411+0100 7fe64cacf080 1 RGWRados::Bucket::List::list_objects_ordered INFO ordered bucket listing requires read #1
> The command has been running for about 35 days and it still hasn't finished. The size of the bucket is under 1TB for sure, probably around 500GB.
> I have recently removed about a dozen old buckets without any issues. It's this particular bucket that is being very stubborn.
> Anything I can do to remove it, including its objects and any orphans it might have?
> Thanks
> Andrei
> ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
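For readers who end up taking the rados-level route described in that workaround, a rough sketch of its shape is below. This is a destructive last resort, not an endorsed procedure: the pool names are assumptions (verify with `ceph osd pool ls`), the marker and bucket id must come from `radosgw-admin bucket stats`, and you should back up the index objects before removing anything.

```
# Find the bucket's marker and id (the marker is the object-name prefix in the data pool)
radosgw-admin bucket stats --bucket=<bucket-name> | grep -E '"(marker|id)"'

# Remove the bucket's data objects at the rados level
# ('default.rgw.buckets.data' is an assumed pool name)
rados -p default.rgw.buckets.data ls | grep '^<marker>_' | \
  while read -r obj; do rados -p default.rgw.buckets.data rm "$obj"; done

# Remove the bucket's index objects, named .dir.<bucket-id>[.<shard>]
rados -p default.rgw.buckets.index ls | grep '^\.dir\.<bucket-id>' | \
  while read -r obj; do rados -p default.rgw.buckets.index rm "$obj"; done
```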
[ceph-users] Is it possible to rebuild a bucket instance?
I have a cluster running Nautilus where the bucket instance (backups.190) has gone missing:

# radosgw-admin metadata list bucket | grep 'backups.19[0-1]' | sort
    "backups.190",
    "backups.191",

# radosgw-admin metadata list bucket.instance | grep 'backups.19[0-1]' | sort
    "backups.191:00f195a6-c5f9-440f-9deb-62a35bd2b695.5684060.2808",

Is there a way to recreate the bucket instance? Perhaps by using the bucket instance from backups.191 as a basis? From what I can tell the bucket index is fine, but since the bucket instance is gone I get this error:

# radosgw-admin bucket list --bucket=backups.190
could not get bucket info for bucket=backups.190
ERROR: could not init bucket: (2) No such file or directory
2020-08-06 16:29:23.457 7f9a625076c0 -1 ERROR: get_bucket_instance_from_oid failed: -2

Thanks, Bryan ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
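No answer appears in this digest, but one way to attempt what Bryan suggests (using backups.191's instance as a template) is sketched below. This assumes the bucket entrypoint for backups.190 still exists and carries the original bucket_id; edit the JSON carefully, and test on a non-production cluster first, since putting a wrong instance record can make things worse.

```
# Find the bucket_id/marker that backups.190 should have
radosgw-admin metadata get bucket:backups.190

# Dump the surviving instance as a template
radosgw-admin metadata get \
  bucket.instance:backups.191:00f195a6-c5f9-440f-9deb-62a35bd2b695.5684060.2808 > instance.json

# Edit instance.json: change the bucket name, bucket_id and marker to the
# values from backups.190's entrypoint, then write it back under the matching key
# (<bucket-id-from-entrypoint> is a placeholder, not a value from this thread)
radosgw-admin metadata put \
  bucket.instance:backups.190:<bucket-id-from-entrypoint> < instance.json

# Afterwards, verify the bucket and its index
radosgw-admin bucket check --bucket=backups.190
```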
[ceph-users] Re: Nautilus slow using "ceph tell osd.* bench"
SOLUTION FOUND! Reweight the OSD to 0, then set it back to where it belongs:

ceph osd crush reweight osd.0 0.0

Original:
ceph tell osd.0 bench -f plain
bench: wrote 1 GiB in blocks of 4 MiB in 4.03434 sec at 254 MiB/sec 63 IOPS

After reweight of osd.0:
ceph tell osd.0 bench -f plain
bench: wrote 1 GiB in blocks of 4 MiB in 1.54555 sec at 663 MiB/sec 165 IOPS
ceph tell osd.1 bench -f plain
bench: wrote 1 GiB in blocks of 4 MiB in 3.54652 sec at 289 MiB/sec 72 IOPS

After reweight of osd.1:
ceph tell osd.0 bench -f plain
bench: wrote 1 GiB in blocks of 4 MiB in 0.948457 sec at 1.1 GiB/sec 269 IOPS
ceph tell osd.1 bench -f plain
bench: wrote 1 GiB in blocks of 4 MiB in 0.949384 sec at 1.1 GiB/sec 269 IOPS
ceph tell osd.2 bench -f plain
bench: wrote 1 GiB in blocks of 4 MiB in 3.56726 sec at 287 MiB/sec 71 IOPS

I have finished the reweight procedure on OSD node 1 and all 6 OSDs are back where they belong, but I have 4 more nodes to go. It looks like this should fix it. If anyone has an alternative method for getting around this I am all ears. Dave, would be interested to hear if this works for you. -Jim ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
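A minimal sketch of how Jim's per-node procedure could be scripted is below. It assumes jq is installed and that the listed OSD IDs are the ones on the host being treated; whether to wait for rebalancing between dropping and restoring the weight is exactly the open question raised in the follow-up message below, so treat this as illustrative only.

```
#!/bin/sh
# Bounce the CRUSH weight of each OSD on this node: record the current weight,
# set it to 0, then restore it. OSD IDs 0-5 are examples for a single node.
for id in 0 1 2 3 4 5; do
    w=$(ceph osd df --format json | jq -r ".nodes[] | select(.id == $id) | .crush_weight")
    echo "osd.$id: crush_weight=$w"
    ceph osd crush reweight "osd.$id" 0.0
    ceph osd crush reweight "osd.$id" "$w"
done
```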
[ceph-users] Re: Nautilus slow using "ceph tell osd.* bench"
Hi Jim, when you do the reweighting, rebalancing will be triggered. How did you set it back: immediately, or after waiting for the rebalancing to complete? I tried both on my cluster and couldn't see the osd bench results change significantly like yours did (actually no change at all); however, my cluster is on 12.2.12, so I'm not sure if that is the reason. Moreover, I really can't figure out why flipping the reweight can make such a difference. Hope the experts can explain that. -- Original -- From: "Jim Forde";