[ceph-users] Re: cephadm - How to deploy ceph cluster with a partition on SSD for block.db
I found out that it's already possible to specify a storage path in the OSD service specification YAML. It works for data_devices, but unfortunately not for db_devices and wal_devices, at least not in my case:

service_type: osd
service_id: osd_spec_default
placement:
  host_pattern: '*'
data_devices:
  paths:
    - /dev/vdb1
db_devices:
  paths:
    - /dev/vdb2
wal_devices:
  paths:
    - /dev/vdb3
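For anyone trying the same thing, a minimal sketch of feeding such a spec to the cephadm orchestrator; the file name osd_spec.yml is an assumption, and --dry-run (where available in your release) only previews what would be created:

# Preview which OSDs the spec would create (file name is an assumption):
ceph orch apply -i osd_spec.yml --dry-run

# Apply the spec once the preview looks right:
ceph orch apply -i osd_spec.yml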
[ceph-users] Spam here still
Do know that this is the only mailing list I am subscribed to that sends me so much spam. Maybe the list admin should finally have a word with the other list admins on how they are managing their lists.
[ceph-users] Re: Spam here still
On 8/09/2020 5:30 pm, Marc Roos wrote:
> Do know that this is the only mailing list I am subscribed to that
> sends me so much spam. Maybe the list admin should finally have a word
> with the other list admins on how they are managing their lists.

Same. I'm missing a number of legit emails as the list gets classified as spam.

--
Lindsay
[ceph-users] Syncing cephfs from Ceph to Ceph
Hello,

Is it possible to somehow sync a Ceph cluster from one site to a Ceph cluster at another site? I'm only using the CephFS feature, no block devices.

Being able to sync CephFS pools between two sites would be great for a hot backup, in case one site fails.

Thanks in advance,
Simon
[ceph-users] Re: Syncing cephfs from Ceph to Ceph
On 2020-09-08 11:22, Simon Sutter wrote:
> Is it possible to somehow sync a Ceph cluster from one site to a Ceph
> cluster at another site? I'm only using the CephFS feature, no block
> devices.
>
> Being able to sync CephFS pools between two sites would be great for a
> hot backup, in case one site fails.

It's a work in progress [1]. This might do what you want right now: [2]. Note: I haven't used [2] myself.

Gr. Stefan

[1]: https://docs.ceph.com/docs/master/dev/cephfs-mirroring/
[2]: https://github.com/oliveiradan/cephfs-sync
[ceph-users] Re: Syncing cephfs from Ceph to Ceph
Thanks Stefan,

First of all, for a bit more context: we use this Ceph cluster just for hot backups, so 99% write, 1% read, and no need for low latency.

OK, so the snapshot function would mean we would have something like a colder backup. Just like a snapshot of a VM, without any incremental functionality, which also means scheduled but huge transfers.

What about the idea of creating the cluster over two data centers? Would it be possible to modify the CRUSH map so one pool gets replicated over those two data centers, and if one fails, the other would still be functional? Additionally, would it be possible to prioritize one data center over the other? This would allow saving data from site1 to a pool on site2, so in case of a disaster on site1, site2 would still have those backups. We have a 10G connection with around 0.5 ms latency.

Thanks in advance,
Simon

From: Stefan Kooman
Sent: Tuesday, 8 September 2020 11:38:29
To: Simon Sutter; ceph-users@ceph.io
Subject: Re: [ceph-users] Syncing cephfs from Ceph to Ceph

> [...]
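On the two-data-center CRUSH question above, a rough sketch of what such a layout can look like; the bucket names (dc1, dc2), host names and rule name are placeholders I made up, so treat this as an illustration rather than a recipe:

# Create datacenter buckets under the default root and move hosts into them
# (dc1/dc2 and ceph01/ceph04 are placeholder names):
ceph osd crush add-bucket dc1 datacenter
ceph osd crush add-bucket dc2 datacenter
ceph osd crush move dc1 root=default
ceph osd crush move dc2 root=default
ceph osd crush move ceph01 datacenter=dc1
ceph osd crush move ceph04 datacenter=dc2

# Replicated rule that spreads copies across datacenters, assigned to a pool:
ceph osd crush rule create-replicated rep-2dc default datacenter
ceph osd pool set backups crush_rule rep-2dc

# Preferring one site can be approximated by lowering primary affinity on the
# remote site's OSDs, so primaries (and thus client I/O) stay in the preferred site:
ceph osd primary-affinity osd.42 0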
[ceph-users] Re: Spam here still
Just my 5 cents: the admin should disable postings on the web interface ... all the spam is injected via HyperKitty!

Since there is no parameter to accomplish this, the admin should hack into "post_to_list" and raise an exception upon posting attempts to mitigate this!

regards

Gerhard W. Recher

net4sec UG (haftungsbeschränkt)
Leitenweg 6
86929 Penzing
+49 8191 4283888
+49 171 4802507

On 08.09.2020 at 10:50, Lindsay Mathieson wrote:
> On 8/09/2020 5:30 pm, Marc Roos wrote:
>> [...]
>
> Same. I'm missing a number of legit emails as the list gets classified
> as spam.
[ceph-users] Re: Spam here still
Update: the admin should consider using version 1.3.4: https://hyperkitty.readthedocs.io/en/latest/news.html

* Implemented a new HYPERKITTY_ALLOW_WEB_POSTING setting that allows disabling the web posting feature. (Closes #264)

Gerhard W. Recher

net4sec UG (haftungsbeschränkt)
Leitenweg 6
86929 Penzing
+49 8191 4283888
+49 171 4802507

On 08.09.2020 at 12:53, Gerhard W. Recher wrote:
> Just my 5 cents: the admin should disable postings on the web interface
> ... all the spam is injected via HyperKitty!
> [...]
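For the archive, a sketch of what enabling that setting can look like on a Mailman 3 / HyperKitty installation; the settings path and service name below are assumptions and vary by distribution:

# Requires HyperKitty >= 1.3.4; path and service name are assumptions:
echo 'HYPERKITTY_ALLOW_WEB_POSTING = False' >> /etc/mailman3/settings.py
systemctl restart mailman3-web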
[ceph-users] Re: cephadm - How to deploy ceph cluster with a partition on SSD for block.db
https://tracker.ceph.com/issues/46558
[ceph-users] Multipart uploads with partsizes larger than 16MiB failing on Nautilus
Hey all,

I'm creating a new post for this issue as we've narrowed the problem down to a part-size limitation on multipart upload. We have discovered that in our production Nautilus (14.2.11) cluster and our lab Nautilus (14.2.10) cluster, multipart uploads with a configured part size greater than 16777216 bytes (16MiB) will return a status 500 / internal server error from radosgw.

So far I have increased the following rgw settings/values that looked suspect, without any success/improvement with part sizes. Such as:

"rgw_get_obj_window_size": "16777216",
"rgw_put_obj_min_window_size": "16777216",

I am trying to determine if this is because of a conservative default setting somewhere that I don't know about, or if this is perhaps a bug?

I would appreciate it if someone on Nautilus with rgw could also test / provide feedback. It's very easy to reproduce; configuring your part size with aws2cli requires you to put the following in your aws 'config':

s3 =
  multipart_chunksize = 32MB

rgw server logs during a failed multipart upload (32MB chunk/part size):

2020-09-08 15:59:36.054 7f2d32fa6700 1 == starting new request req=0x55953dc36930 =
2020-09-08 15:59:36.082 7f2d32fa6700 -1 res_query() failed
2020-09-08 15:59:36.138 7f2d32fa6700 1 == req done req=0x55953dc36930 op status=0 http_status=200 latency=0.0839988s ==
2020-09-08 16:00:07.285 7f2d3dfbc700 1 == starting new request req=0x55953dc36930 =
2020-09-08 16:00:07.285 7f2d3dfbc700 -1 res_query() failed
2020-09-08 16:00:07.353 7f2d00741700 1 == starting new request req=0x55954dd5e930 =
2020-09-08 16:00:07.357 7f2d00741700 -1 res_query() failed
2020-09-08 16:00:07.413 7f2cc56cb700 1 == starting new request req=0x55953dc02930 =
2020-09-08 16:00:07.417 7f2cc56cb700 -1 res_query() failed
2020-09-08 16:00:07.473 7f2cb26a5700 1 == starting new request req=0x5595426f6930 =
2020-09-08 16:00:07.473 7f2cb26a5700 -1 res_query() failed
2020-09-08 16:00:09.465 7f2d3dfbc700 0 WARNING: set_req_state_err err_no=35 resorting to 500
2020-09-08 16:00:09.465 7f2d3dfbc700 1 == req done req=0x55953dc36930 op status=-35 http_status=500 latency=2.17997s ==
2020-09-08 16:00:09.549 7f2d00741700 0 WARNING: set_req_state_err err_no=35 resorting to 500
2020-09-08 16:00:09.549 7f2d00741700 1 == req done req=0x55954dd5e930 op status=-35 http_status=500 latency=2.19597s ==
2020-09-08 16:00:09.605 7f2cc56cb700 0 WARNING: set_req_state_err err_no=35 resorting to 500
2020-09-08 16:00:09.609 7f2cc56cb700 1 == req done req=0x55953dc02930 op status=-35 http_status=500 latency=2.19597s ==
2020-09-08 16:00:09.641 7f2cb26a5700 0 WARNING: set_req_state_err err_no=35 resorting to 500
2020-09-08 16:00:09.641 7f2cb26a5700 1 == req done req=0x5595426f6930 op status=-35 http_status=500 latency=2.16797s ==

awscli client side output during a failed multipart upload:

root@jump:~# aws --no-verify-ssl --endpoint-url http://lab-object.cancercollaboratory.org:7480 s3 cp 4GBfile s3://troubleshooting
upload failed: ./4GBfile to s3://troubleshooting/4GBfile An error occurred (UnknownError) when calling the UploadPart operation (reached max retries: 2): Unknown

Thanks,

Jared Baker
Cloud Architect for the Cancer Genome Collaboratory
Ontario Institute for Cancer Research
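As a side note for anyone reproducing this, the part size can (to my knowledge) also be set from the command line instead of editing the config file by hand; the profile name "default" is an assumption:

# Set a 32MB multipart part size for the default profile, then retry the upload:
aws configure set default.s3.multipart_chunksize 32MB
aws --no-verify-ssl --endpoint-url http://lab-object.cancercollaboratory.org:7480 s3 cp 4GBfile s3://troubleshooting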
[ceph-users] Re: cephadm didn't create journals
journal_devices is for filestore, and filestore isn't supported with cephadm.
[ceph-users] Re: Multipart uploads with partsizes larger than 16MiB failing on Nautilus
I had been looking into this issue all day, and during testing found that a specific configuration option we had been setting for years was the culprit. Not setting this value and letting it fall back to the default seems to have fixed our issue with multipart uploads.

If you are curious, the configuration option is rgw_obj_stripe_size, which was being set to 67108864 bytes (64MiB). The default is 4194304 bytes (4MiB). This is a documented option (https://docs.ceph.com/docs/nautilus/radosgw/config-ref/) and from my testing it seems like using anything but the default (I only tried larger values) breaks multipart uploads.

On Tue, Sep 8, 2020 at 12:12 PM shubjero wrote:
>
> Hey all,
>
> I'm creating a new post for this issue as we've narrowed the problem
> down to a part-size limitation on multipart upload. We have discovered
> that in our production Nautilus (14.2.11) cluster and our lab Nautilus
> (14.2.10) cluster, multipart uploads with a configured part size
> greater than 16777216 bytes (16MiB) will return a status 500 /
> internal server error from radosgw.
> [...]
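Following up on the rgw_obj_stripe_size finding above, a hedged sketch of how one might check what the running radosgw actually uses and remove the override; the daemon name client.rgw.gateway1 is a placeholder, and whether the option lives in ceph.conf or the central config store depends on how it was set originally:

# Ask the running daemon for its effective value (daemon name is a placeholder):
ceph daemon client.rgw.gateway1 config get rgw_obj_stripe_size

# If the override was set in the central config store rather than ceph.conf:
ceph config rm client.rgw.gateway1 rgw_obj_stripe_size

# Otherwise, remove the rgw_obj_stripe_size line from ceph.conf and restart radosgw.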
[ceph-users] ceph pgs inconsistent, always the same checksum
Hi,

I've got a Ceph cluster: 7 nodes, 168 OSDs, with 96G of RAM on each server. Ceph has been instructed to set a memory target of 3G until we increase RAM to 128G per node. Available memory tends to hover around 14G. I do see a tiny bit (KB) of swap utilization per ceph-osd process, but there's no reason for it, so unsure what that's about:

root@ceph02:~# cat /proc/14363/status | egrep 'Name|VmSwap'
Name:   ceph-osd
VmSwap: 464 kB

We're seeing repeated inconsistent PG warnings, generally on the order of 3-10 per week.

pg 2.b9 is active+clean+inconsistent, acting [25,117,128,95,151,15]

PG query on that PG:

INFO:cephadm:Using recent ceph image docker.io/ceph/ceph:v15
{
    "snap_trimq": "[]",
    "snap_trimq_len": 0,
    "state": "active+clean+inconsistent",
    "epoch": 20278,
    "up": [25, 117, 128, 95, 151, 15],
    "acting": [25, 117, 128, 95, 151, 15],
    "acting_recovery_backfill": ["15(5)", "25(0)", "95(3)", "117(1)", "128(2)", "151(4)"],
    "info": {
        "pgid": "2.b9s0",
        "last_update": "20278'445510",
        "last_complete": "20278'445510",
        "log_tail": "20278'438137",
        "last_user_version": 445510,
        "last_backfill": "MAX",
        "purged_snaps": [],
        "history": {
            "epoch_created": 573,
            "epoch_pool_created": 100,
            "last_epoch_started": 14679,
            "last_interval_started": 14678,
            "last_epoch_clean": 14716,
            "last_interval_clean": 14678,
            "last_epoch_split": 573,
            "last_epoch_marked_full": 0,
            "same_up_since": 14678,
            "same_interval_since": 14678,
            "same_primary_since": 14396,
            "last_scrub": "20278'444009",
            "last_scrub_stamp": "2020-09-08T16:57:22.430246+",
            "last_deep_scrub": "20278'444009",
            "last_deep_scrub_stamp": "2020-09-08T16:57:22.430246+",
            "last_clean_scrub_stamp": "2020-09-07T06:34:26.320796+",
            "prior_readable_until_ub": 0
        },
        "stats": {
            "version": "20278'445510",
            "reported_seq": "896803",
            "reported_epoch": "20278",
            "state": "active+clean+inconsistent",
            "last_fresh": "2020-09-08T18:06:45.463880+",
            "last_change": "2020-09-08T16:57:22.430293+",
            "last_active": "2020-09-08T18:06:45.463880+",
            "last_peered": "2020-09-08T18:06:45.463880+",
            "last_clean": "2020-09-08T18:06:45.463880+",
            "last_became_active": "2020-08-06T19:35:02.634999+",
            "last_became_peered": "2020-08-06T19:35:02.634999+",
            "last_unstale": "2020-09-08T18:06:45.463880+",
            "last_undegraded": "2020-09-08T18:06:45.463880+",
            "last_fullsized": "2020-09-08T18:06:45.463880+",
            "mapping_epoch": 14678,
            "log_start": "20278'438137",
            "ondisk_log_start": "20278'438137",
            "created": 573,
            "last_epoch_clean": 14716,
            "parent": "0.0",
            "parent_split_bits": 10,
            "last_scrub": "20278'444009",
            "last_scrub_stamp": "2020-09-08T16:57:22.430246+",
            "last_deep_scrub": "20278'444009",
            "last_deep_scrub_stamp": "2020-09-08T16:57:22.430246+",
            "last_clean_scrub_stamp": "2020-09-07T06:34:26.320796+",
            "log_size": 7373,
            "ondisk_log_size": 7373,
            "stats_invalid": false,
            "dirty_stats_invalid": false,
            "omap_stats_invalid": false,
            "hitset_stats_invalid": false,
            "hitset_bytes_stats_invalid": false,
            "pin_stats_invalid": false,
            "manifest_stats_invalid": false,
            "snaptrimq_len": 0,
            "stat_sum": {
                "num_bytes": 322985947136,
                "num_objects": 78724,
                "num_object_clones": 0,
                "num_object_copies": 472344,
                "num_objects_missing_on_primary": 0,
                "num_objects_missing": 0,
                "num_objects_degraded": 0,
                "num_objects_misplaced": 0,
                "num_objects_unfound": 0,
                "num_objects_dirty": 78724,
                "num_whiteouts": 0,
                "num_read": 430713,
                "num_read_kb": 121695928,
                "num_write": 445501,
                "num_write_kb": 405283436,
                "num_scrub_errors": 1,
                "num_shallow_scrub_errors": 0,
                "num_deep_scrub_errors": 1,
                "num_objects_recovered": 21,
                "num_bytes_recovered": 88080384,
                "num_key
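Not part of the original post, but for context, these are the commands commonly used to dig further into an inconsistent PG like 2.b9: listing the per-object shard errors, and triggering a repair once the cause of the mismatch is understood.

# Show which objects and shards carry the scrub errors:
rados list-inconsistent-obj 2.b9 --format=json-pretty

# After reviewing the output (and the underlying cause), ask Ceph to repair the PG:
ceph pg repair 2.b9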
[ceph-users] Re: Multipart uploads with partsizes larger than 16MiB failing on Nautilus
Thanks, Shubjero. Would you consider creating a ceph tracker issue for this?

regards,

Matt

On Tue, Sep 8, 2020 at 4:13 PM shubjero wrote:
>
> I had been looking into this issue all day, and during testing found
> that a specific configuration option we had been setting for years was
> the culprit. Not setting this value and letting it fall back to the
> default seems to have fixed our issue with multipart uploads.
>
> If you are curious, the configuration option is rgw_obj_stripe_size,
> which was being set to 67108864 bytes (64MiB). The default is 4194304
> bytes (4MiB).
> [...]

--
Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel. 734-821-5101
fax. 734-769-8938
cel. 734-216-5309
[ceph-users] The confusing output of ceph df command
Hi,

I have changed most of the pools in my cluster from 3-replica to EC 4+2. When I use the ceph df command to show the used capacity of the cluster:

RAW STORAGE:
    CLASS        SIZE        AVAIL       USED       RAW USED    %RAW USED
    hdd          1.8 PiB     788 TiB     1.0 PiB     1.0 PiB        57.22
    ssd          7.9 TiB     4.6 TiB     181 GiB     3.2 TiB        41.15
    ssd-cache    5.2 TiB     5.2 TiB      67 GiB      73 GiB         1.36
    TOTAL        1.8 PiB     798 TiB     1.0 PiB     1.0 PiB        56.99

POOLS:
    POOL                              ID    STORED     OBJECTS    USED       %USED    MAX AVAIL
    default-oss.rgw.control            1       0 B           8       0 B         0      1.3 TiB
    default-oss.rgw.meta               2     22 KiB          97    3.9 MiB        0      1.3 TiB
    default-oss.rgw.log                3    525 KiB         223    621 KiB        0      1.3 TiB
    default-oss.rgw.buckets.index      4     33 MiB          34     33 MiB        0      1.3 TiB
    default-oss.rgw.buckets.non-ec     5    1.6 MiB          48    3.8 MiB        0      1.3 TiB
    .rgw.root                          6    3.8 KiB          16    720 KiB        0      1.3 TiB
    default-oss.rgw.buckets.data       7    274 GiB     185.39k    450 GiB     0.14      212 TiB
    default-fs-metadata                8    488 GiB     153.10M    490 GiB    10.65      1.3 TiB
    default-fs-data0                   9    374 TiB       1.48G    939 TiB    74.71      212 TiB
    ...

USED = 3 * STORED for the 3-replica pools is completely right, but for the EC 4+2 pool (default-fs-data0) USED is not equal to 1.5 * STORED. Why...? :(
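To make the mismatch concrete (my arithmetic, using only the figures above): an EC 4+2 pool writes 6 chunks for every 4 data chunks, so the expected space amplification is 6/4 = 1.5.

    expected USED = STORED * (k+m)/k = 374 TiB * 6/4 = 561 TiB
    reported USED = 939 TiB, i.e. roughly 2.5 * STORED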
[ceph-users] How to delete OSD benchmark data
Dear Ceph Users,

I am testing my 3-node Proxmox + Ceph cluster. I have performed an OSD benchmark with the command below:

# ceph tell osd.0 bench

Do I need to perform any cleanup to delete the benchmark data from the OSD? I have googled for this, but nowhere are any post-benchmark steps mentioned after the osd bench command.

Thanks
Jayesh
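Not an authoritative answer, but one way to check for yourself whether the bench leaves data behind is to compare the OSD's utilization before and after a run; the byte counts below just spell out the documented defaults (1 GiB total, 4 MiB writes):

# Note the row for osd.0 (ID 0) before the benchmark:
ceph osd df

# Run the bench with the default sizes made explicit:
ceph tell osd.0 bench 1073741824 4194304

# Check osd.0's utilization again; if it returns to the pre-bench value,
# no manual cleanup is needed:
ceph osd df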
[ceph-users] Re: ceph pgs inconsistent, always the same checksum
I googled "got 0x6706be76, expected" and found some hits regarding ceph, so whatever it is, you are not the first, and that number has some internal meaning. Redhat solution for similar issue says that checksum is for seeing all zeroes, and hints at a bad write cache on the controller or something that ends up clearing data instead of writing the correct information on shutdowns. Den tis 8 sep. 2020 kl 23:21 skrev David Orman : > > > We're seeing repeated inconsistent PG warnings, generally on the order of > 3-10 per week. > > pg 2.b9 is active+clean+inconsistent, acting [25,117,128,95,151,15] > > > Every time we look at them, we see the same checksum (0x6706be76): > > debug 2020-08-13T18:39:01.731+ 7fbc037a7700 -1 > bluestore(/var/lib/ceph/osd/ceph-25) _verify_csum bad crc32c/0x1000 > checksum at blob offset 0x0, got 0x6706be76, expected 0x61f2021c, device > location [0x12b403c~1000], logical extent 0x0~1000, object > 2#2:0f1a338f:::rbd_data.3.20d195d612942.01db869b:head# > > This looks a lot like: https://tracker.ceph.com/issues/22464 > That said, we've got the following versions in play (cluster was created > with 15.2.3): > ceph version 15.2.4 (7447c15c6ff58d7fce91843b705a268a1917325c) octopus > (stable) > -- May the most significant bit of your life be positive. ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io