[ceph-users] Re: Manual bucket resharding problem

2020-11-27 Thread Mateusz Skała
Hi.
Thank you, I will probably try this solution on Sunday and will write the results here.
Regards
Mateusz Skała

> On 24.11.2020, at 11:12, Amit Ghadge wrote:
> 
> Sorry for the delayed reply. I have never tried this on production, but after
> reverting the changes I was able to reshard again.
> You will see two entries for the same bucket in the metadata:
> radosgw-admin metadata list bucket.instance | grep bucket
> Once you know which entry is the older bucket instance, update these two parameters first:
> radosgw-admin metadata get bucket.instance:bucket: > bucket.json
> Set reshard_status to 0 and new_bucket_instance_id to ""
> Then update the bucket instance with: radosgw-admin metadata put 
> bucket.instance:bucket: < bucket.json
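> 
> For reference, a rough end-to-end sketch of the above (the bucket name
> "objects" and <instance_id> are placeholders to substitute from your own
> `metadata list` output; please double-check on a test cluster first):
> # find the bucket instance entries
> radosgw-admin metadata list bucket.instance | grep objects
> # dump the old instance's metadata
> radosgw-admin metadata get bucket.instance:objects:<instance_id> > bucket.json
> # edit bucket.json: set reshard_status to 0 and new_bucket_instance_id to ""
> # write the edited metadata back
> radosgw-admin metadata put bucket.instance:objects:<instance_id> < bucket.json
> # then retry the reshard
> radosgw-admin bucket reshard --bucket objects --num-shards 2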
> 
> On Sun, Nov 22, 2020 at 6:04 PM Mateusz Skała wrote:
> Thank you for the response. How can I upload this to the metadata? Is this 
> operation safe?
> Regards
> Mateusz Skała
> 
> On Sat, 21.11.2020 at 18:01, Amit Ghadge wrote:
> I went through this, and you need to update the bucket metadata: radosgw-admin 
> metadata get bucket.instance:bucket:xxx > bucket.json, then update two parameters. I 
> don't remember them exactly, but it looks like setting reshard to false and next_marker to empty.
> 
> -AmitG
> On Sat, 21 Nov 2020 at 2:04 PM, Mateusz Skała wrote:
> Hello Community.
> I need your help. A few days ago I started a manual reshard of one bucket with 
> large objects. Unfortunately I interrupted it with Ctrl+C, and now I can’t 
> start this process again. 
> There is this message:
> # radosgw-admin bucket reshard --bucket objects --num-shards 2
> ERROR: the bucket is currently undergoing resharding and cannot be added to 
> the reshard list at this time
> 
> But the list of reshard processes is empty:
> # radosgw-admin reshard list
> []
> 
> # radosgw-admin reshard status --bucket objects
> [
> {
> "reshard_status": "not-resharding",
> "new_bucket_instance_id": "",
> "num_shards": -1
> }
> ]
> 
> How can I fix this situation? How can I restore the ability to reshard this 
> bucket? 
> And BTW, does the resharding process lock writes/reads on the bucket?
> Regards
> Mateusz Skała 
> ___
> ceph-users mailing list -- ceph-users@ceph.io 
> To unsubscribe send an email to ceph-users-le...@ceph.io 
> 

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Manual bucket resharding problem

2020-11-27 Thread Mateusz Skała
Hi.
This list is empty :(
# radosgw-admin reshard stale-instances list
[]

I will try to put the fixed metadata on Sunday, as I wrote before. Has anyone done 
this on production?
Regards
Mateusz Skała

> On 24.11.2020, at 11:35, Konstantin Shalygin wrote:
> 
> Try the `radosgw-admin reshard stale-instances list` command. If the 
> list is not empty, just rm the stale reshard instance and then start the reshard 
> process again.
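> 
> A minimal sketch of that, assuming the stale instance actually shows up in
> the list (verify the exact subcommands on your release first):
> radosgw-admin reshard stale-instances list
> radosgw-admin reshard stale-instances rm
> radosgw-admin bucket reshard --bucket objects --num-shards 2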
> 
> 
> k
> 
> Sent from my iPhone
> 
>> On 22 Nov 2020, at 15:35, Mateusz Skała  wrote:
>> 
>> Thank you for the response. How can I upload this to the metadata? Is this
>> operation safe?
>> Regards
>> Mateusz Skała
>> 
>> On Sat, 21.11.2020 at 18:01, Amit Ghadge wrote:
>> 
>>> I went through this, and you need to update the bucket metadata: radosgw-admin
>>> metadata get bucket.instance:bucket:xxx > bucket.json, then update two parameters.
>>> I don't remember them exactly, but it looks like setting reshard to false and next_marker to empty.
>>> 
>>> -AmitG
 On Sat, 21 Nov 2020 at 2:04 PM, Mateusz Skała wrote:
 
 Hello Community.
 I need your help. A few days ago I started a manual reshard of one bucket
 with large objects. Unfortunately I interrupted it with Ctrl+C, and now I
 can’t start this process again.
 There is this message:
 # radosgw-admin bucket reshard --bucket objects --num-shards 2
 ERROR: the bucket is currently undergoing resharding and cannot be added
 to the reshard list at this time
 
 But the list of reshard processes is empty:
 # radosgw-admin reshard list
 []
 
 # radosgw-admin reshard status --bucket objects
 [
   {
   "reshard_status": "not-resharding",
   "new_bucket_instance_id": "",
   "num_shards": -1
   }
 ]
 
 How can I fix this situation? How can I restore the ability to reshard this
 bucket?
 And BTW, does the resharding process lock writes/reads on the bucket?
 Regards
 Mateusz Skała
 ___
 ceph-users mailing list -- ceph-users@ceph.io
 To unsubscribe send an email to ceph-users-le...@ceph.io
 
>>> 
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
> 
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Public Swift yielding errors since 14.2.12

2020-11-27 Thread Jukka Nousiainen
Greetings Vladimir, 

> Do you have anything interesting in rgw debug log (debug rgw = 20) or in
> keystone log?
This is with RadosGW 14.2.14: 

2020-11-27 11:10:17.582 7f2406c6c700 5 Searching permissions for uid=anonymous 
2020-11-27 11:10:17.582 7f2406c6c700 5 Permissions for user not found 
2020-11-27 11:10:17.582 7f2406c6c700 5 Searching permissions for group=1 
mask=49 
2020-11-27 11:10:17.582 7f2406c6c700 5 Permissions for group not found 
2020-11-27 11:10:17.582 7f2406c6c700 5 req 18 0.000s swift:list_bucket -- 
Getting permissions done for 
identity=rgw::auth::ThirdPartyAccountApplier(e5162d3caf094a159d00d80418f6f1c4) 
-> rgw::auth::SysReqApplier -> rgw::auth::LocalApplier(acct_user=anonymous, 
acct_name=, subuser=, perm_mask=15, is_admin=0), 
owner=e5162d3caf094a159d00d80418f6f1c4$anonymous, perm=0 
2020-11-27 11:10:17.582 7f2406c6c700 10 req 18 0.000s swift:list_bucket 
identity=rgw::auth::ThirdPartyAccountApplier(e5162d3caf094a159d00d80418f6f1c4) 
-> rgw::auth::SysReqApplier -> rgw::auth::LocalApplier(acct_user=anonymous, 
acct_name=, subuser=, perm_mask=15, is_admin=0) requested perm (type)=1, policy 
perm=0, user_perm_mask=1, acl perm=0 
2020-11-27 11:10:17.582 7f2406c6c700 20 op->ERRORHANDLER: err_no=-13 
new_err_no=-13 
2020-11-27 11:10:17.582 7f2406c6c700 2 req 18 0.000s swift:list_bucket op 
status=0 
2020-11-27 11:10:17.582 7f2406c6c700 2 req 18 0.000s swift:list_bucket http 
status=403 
2020-11-27 11:10:17.582 7f2406c6c700 1 == req done req=0x7f2406c657f0 op 
status=0 http_status=403 latency=0s == 
2020-11-27 11:10:17.582 7f2406c6c700 20 process_request() returned -13 
2020-11-27 11:10:17.582 7f2406c6c700 1 civetweb: 0x55bde0d96000: 10.72.0.124 - 
- [27/Nov/2020:11:10:17 +0200] "GET 
/swift/v1/AUTH_e5162d3caf094a159d00d80418f6f1c4/publictesti4/ HTTP/1.1" 403 335 
- curl/7.29.0 

> Could you provide the full ceph.conf?
Here you go: 

[client.rgw.HOSTNAME.ZONE] 
rgw_period_root_pool = ZONE.rgw.root 
delay_auth_decision = true 
rgw_swift_versioning_enabled = true 
rgw_swift_account_in_url = true 
rgw_num_rados_handles = 16 
rgw_region_root_pool = ZONE.rgw.root 
keyring = /etc/ceph/ceph.client.rgw.HOSTNAME.ZONE.keyring 
rgw_dns_name = SERVICE_FQDN 
rgw_trust_forwarded_https = true 
rgw_zone = ZONE 
log_to_syslog = true 
rgw_frontends = civetweb num_threads=4096 port=7480 
rgw_realm_root_pool = ZONE.rgw.root 
rgw_s3_auth_use_keystone = true 
rgw_keystone_api_version = 3 
rgw_realm = REALM 
rgw_zonegroup_root_pool = ZONE.rgw.root 
user = ceph 
rgw_keystone_url = https://:35357 
rgw_zone_root_pool = ZONE.rgw.root 
rgw_keystone_implicit_tenants = false 
rgw_s3_auth_order = local,external 
rgw_swift_url = http://:7480 
rgw_bucket_default_quota_max_size = -1 
rgw_keystone_token_cache_size = 500 
rgw_swift_enforce_content_length = true 
rgw_zonegroup = ZONEGROUP 
log_file = /var/log/ceph/radosgw.log 
rgw_bucket_default_quota_max_objects = 50 
rgw_user_default_quota_max_size = 10995116278000 
rgw_user_default_quota_max_objects = -1 
host = HOSTNAME 
rgw_keystone_accepted_roles = object_store_user 
rgw_thread_pool_size = 4096 

> As of today, I suspect that this could be a Keystone problem talking to the new Ceph
> releases, 14.2.12+ in your case and Octopus 15.2.x in mine.
I'm not sure I understand this suspicion. When doing an authenticated Swift 
call, RadosGW reaches out to Keystone for tokens -- this can be validated with 
tcpdump on the Keystone port and/or logs. Conversely, when accessing a public 
bucket/object, RadosGW (at least testing with our current 14.2.11) can 
determine internally that the bucket is public and no calls to Keystone are 
made. So from our perspective this does seem like a regression in the RadosGW 
Swift internals, not in the Keystone integration, but as mentioned we are happy 
to hear pointers on where the above config might be wrong. 
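
For completeness, the failing call can be reproduced with a plain anonymous
request (the hostname, port and AUTH_ tenant ID below are placeholders for our
setup, since rgw_swift_account_in_url is enabled):

curl -i "http://SERVICE_FQDN:7480/swift/v1/AUTH_<tenant_id>/publictesti4/"

With 14.2.11 this returns the container listing when the container has a public
read ACL; with 14.2.12+ the same request gets the 403 shown in the log above.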

BR, 
Jukka 

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Tracing in ceph

2020-11-27 Thread Seena Fallah
Thanks so much for your help. Do we have to add tracing for the OSDs too?
It seems it only covers radosgw and doesn't trace the request in the OSDs.

On Tue, Nov 24, 2020 at 5:04 PM Abhinav Singh 
wrote:

> Hi Seena, sorry for the late reply.
> I have used Jaeger to trace the RGW requests. The PR is still not merged into
> the official repo, but you can give it a try:
> https://github.com/suab321321/ceph/tree/wip-jaegerTracer-noNamespace
> 1. The CMake option to build Jaeger is on by default, so you don't need to
> pass any extra CMake CLI parameters to build it.
> 2. Download "jaeger-all-in-one" from here:
> https://www.jaegertracing.io/download/
> 3. When you are ready to send requests to RGW, just run the "jaeger-all-in-one"
> executable in one terminal and continue with your RGW requests.
> 4. Browse to http://localhost:16686 for the Jaeger frontend (this video,
> https://www.youtube.com/watch?v=-9_53PtwQHk, only shows how to view the
> tracing in the frontend). A rough sketch of the whole flow follows below.
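> 
> A rough sketch of the whole flow (the s3cmd example and bucket name are just
> placeholders, not from my setup):
> # terminal 1: start the all-in-one Jaeger agent/collector/UI
> ./jaeger-all-in-one
> # terminal 2: send a request through RGW as usual, e.g.
> s3cmd ls s3://some-bucket
> # then open http://localhost:16686 in a browser to inspect the recorded traces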
>
> On Thu, Nov 19, 2020 at 6:02 AM Seena Fallah 
> wrote:
>
>> Isn't there any plan to update this doc?
>> https://docs.ceph.com/en/latest/dev/blkin/
>>
>> On Fri, Nov 13, 2020 at 3:21 AM Seena Fallah 
>> wrote:
>>
>> > Hi all,
>> >
>> > Does this project work with the latest zipkin apis?
>> > https://github.com/ceph/babeltrace-zipkin
>> >
>> > Also, what do you prefer for tracing requests for rgw and rbd in Ceph?
>> >
>> > Thanks.
>> >
>> ___
>> ceph-users mailing list -- ceph-users@ceph.io
>> To unsubscribe send an email to ceph-users-le...@ceph.io
>>
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Access/Delete RGW user with leading whitespace

2020-11-27 Thread Benjamin . Zieglmeier
Hello,

In our environment we have a user that has a leading whitespace in the UID. I 
don’t know how it was created, however I am unable to GET or DELETE it either 
using `radosgw-admin` or the Admin API:

# radosgw-admin user list | grep rgw
" rgw-prometheus",
"rgw-prometheus",

When I try to get info of the user directly, I get:

# radosgw-admin user info --uid=" rgw-prometheus"
could not fetch user info: no user info saved

I’ve tried using the admin API as well while URL encoding the whitespace to %20 
and get “Invalid Argument”.
This user is not important in any way, however it creates issues trying to 
monitor the rgw usage logs using 
https://github.com/blemmenes/radosgw_usage_exporter

I’ve considered modifying the script to ignore that user, but clearly Ceph is 
having trouble addressing it as well, so I figured I’d try to get to the 
bottom of how to remove this user.
We are running 12.2.11 currently, however this cluster was built on 12.2.5 and 
I’m 99% certain the user was created in 12.2.5

Thanks,
Ben
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: [EXTERNAL] Access/Delete RGW user with leading whitespace

2020-11-27 Thread Benjamin . Zieglmeier
Following up on this one as I just figured it out in case it helps anyone else:

The leading whitespace must have been some other sort of non-standard 
whitespace. If I copied the entire output of the user value from the `user 
list` command, and pasted that as the value for --uid in the `radosgw-admin 
user info` command, I was able to retrieve user details.
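
For anyone hitting something similar, a quick way to see what the character
actually is (just a sketch; od is any POSIX od):

radosgw-admin user list | grep rgw | od -c | head

That prints the raw bytes, so a non-breaking space or other unusual whitespace
shows up as its escape/byte sequence instead of looking like a normal space.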

Thanks,
Ben

On 11/27/20, 12:05 PM, "Benjamin.Zieglmeier"  
wrote:

Hello,

In our environment we have a user that has a leading whitespace in the UID. 
I don’t know how it was created, however I am unable to GET or DELETE it either 
using `radosgw-admin` or the Admin API:

# radosgw-admin user list | grep rgw
" rgw-prometheus",
"rgw-prometheus",

When I try to get info of the user directly, I get:

# radosgw-admin user info --uid=" rgw-prometheus"
could not fetch user info: no user info saved

I’ve tried using the admin API as well while URL encoding the whitespace to 
%20 and get “Invalid Argument”.
This user is not important in any way, however it creates issues trying to 
monitor the rgw usage logs using 
https://github.com/blemmenes/radosgw_usage_exporter

I’ve considered modifying the script to ignore that user, but clearly Ceph 
is having trouble addressing it as well, so I figured I’d try to get to the 
bottom of how to remove this user.
We are running 12.2.11 currently, however this cluster was built on 12.2.5 
and I’m 99% certain the user was created in 12.2.5

Thanks,
Ben
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: replace osd with Octopus

2020-11-27 Thread Tony Liu
> >> When replacing an osd, there will be no PG remapping, and backfill
> >>> will restore the data on the new disk, right?
> >>
> >> That depends on how you decide to go through the replacement process.
> >> Usually without your intervention (e.g. setting the appropriate OSD
> >> flags) the remapping will happen after an OSD goes down and out.
> >
> > This has been unclear to me. Is the OSD going to be marked out and the PGs
> > going to be remapped during the replacement? Or does it depend on the process?
> >
> > When mark an OSD out, remapping will happen and it will take some time
> > for data migration. Is cluster in degraded state during such duration?
> 
> If you set the `noout` flag on the affected OSDs or the entire cluster,
> there won’t be remapping.
> 
> If the OSD fails and is marked `out`, there will be remapping and
> balancing.

Here is the context.
https://docs.ceph.com/en/latest/mgr/orchestrator/#replace-an-osd

When the disk is broken (a concrete example with placeholder values follows the steps):
1) orch osd rm  --replace [--force]
2) Replace disk.
3) ceph orch apply osd -i 
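
For concreteness, with hypothetical values (OSD id 12 and an osd_spec.yaml
drive-group file, not from my cluster):
ceph orch osd rm 12 --replace
# physically swap the drive, then re-apply the OSD service spec
ceph orch apply osd -i osd_spec.yaml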

Step #1 marks the OSD "destroyed". I assume it has the same effect as
"ceph osd destroy", and that keeps the OSD "in", with no PG remapping, while
the cluster is in a "degraded" state.

After step #3, the OSD will be "up" and "in", and data will be recovered
back to the new disk. Is that right?
Is the cluster "degraded" or "healthy" during such recovery?

For another option, the difference is no "--replace" in step #1.
1) orch osd rm  [--force]
2) Replace disk.
3) ceph orch apply osd -i 

Step #1 evacuates PGs from the OSD and removes it from the cluster.
If the disk is broken or the OSD daemon is down, is this evacuation still
going to work?
Is it going to take a while if there is a lot of data on this disk?

After step #3, PGs will be rebalanced/remapped again when the new OSD
joins the cluster.

I think that to replace with the same disk model, option #1 is preferred,
while to replace with a different disk model it needs to be option #2.
Am I right? Any comments are welcome.

> > My understanding is that remapping only happens when the OSD is
> > marked out.
> 
> CRUSH topology and rule changes can result in misplaced objects too, but
> that’s a tangent.
> 
> > The replacement process will keep the OSD always in, assuming we replace with
> > the same disk model.
> 
> `ceph osd destroy` is your friend.
> 
> > In the case of replacing with a different size, it could be more complicated,
> > because the weight has to be adjusted for the size change and PGs may be
> > rebalanced.
> 
> If you replace an OSD with a drive of a different size, and you do so in
> a way such that the CRUSH weight is changed to match, then yes almost
> certainly some PG acting sets will change.
> 
> 
> >>> The key here is how much time backfilling and rebalancing will take?
> >>> The intention is to not keep cluster in degraded state for too long.
> >>> I assume they are similar, because either of them is to copy the
> >>> same amount of data?
> >>> If that's true, then option #2 is pointless.
> >>> Could anyone share such experiences, like how long time it takes to
> >>> recover how much data on what kind of networking/computing env?
> >>
> >> No, option 2 is not pointless, it helps you prevent a degraded state.
> >> With a small cluster or CRUSH rules that only allow a few failed OSDs,
> >> it could be dangerous to take out an entire node, risking another
> >> failure and potential data loss. It highly depends on your specific
> >> setup and whether you're willing to take the risk during the rebuild of a node.
> 
> Agreed.  Overlapping failures can and do happen.  The flipside is that
> if one lets recovery complete, there has to be enough unused capacity in
> the right places to accommodate new data replicas.
> 
> If, say, the affected cluster is on a different continent and you don’t
> have trustworthy 24x7 remote hands, then it could take some time to
> replace a failed drive or node.  In this case, it likely is advantageous
> to let the cluster recover.
> 
> If however you can get the affected drive / node back faster than
> recovery would take, it can be advantageous to prevent recovery until
> the OSDs are back up.  Either way, Ceph has to create data replicas from
> survivors.  *If* you can replace a drive immediately, then there’s no
> extra risk and you can cut data movement very roughly in half.
> 
> This ties into the `mon_osd_down_out_subtree_limit` setting.  Depending
> on one’s topology, it can prevent a thundering herd of recovery, with
> the idea that it’s often faster to get a node back up than it would be
> to recover all that data.  This also avoids surviving OSDs potentially
> becoming full, but one has to have good monitoring so that this state
> does not continue indefinitely.
> 
> Basically, any time PGs are undersized, there’s risk of an overlapping
> failure.  The best course is often a question of which strategy will get
> them back to full size.  Remapped PGs aren’t so big a deal, because at
> all times you have the desired number of replicas.
> 
> >> The recovery/backfill speed is also depending on the size 

[ceph-users] rbd image backup best practice

2020-11-27 Thread Marc Roos


Is there a best practice or guide for backing up rbd images?




___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: Planning: Ceph User Survey 2020

2020-11-27 Thread Adiga, Anantha
Hi Yuval,

Your questions have been added.

Thank you,
Anantha

From: Yuval Lifshitz 
Sent: Wednesday, November 25, 2020 6:30 AM
To: Mike Perez 
Cc: ceph-users ; Adiga, Anantha ; 
Paul Mezzanini ; Anthony D'Atri 
Subject: Re: [ceph-users] Planning: Ceph User Survey 2020

Hi Mike,
Could we add more questions on RGW use cases and functionality adoption?

For instance:

bucket notifications:
* do you use "bucket notifications"?
* if so, which endpoint do you use: kafka, amqp, http?
* which other endpoints would you like to see there?

sync modules:
* do you use the cloud sync module? if so, with which cloud provider?
* do you use an archive zone?
* do you use the elasticsearch module?

multisite:
* do you have more than one realm in your setup? if so, how many?
* do you have more than one zone group in your setup?
* do you have more than one zone in your setup? if so, how many in the largest 
zone group?
* is the syncing policy between zones global or per bucket?

On Tue, Nov 24, 2020 at 8:06 PM Mike Perez wrote:
Hi everyone,

The Ceph User Survey 2020 is being planned by our working group. Please
review the draft survey pdf, and let's discuss any changes. You may also
join us in the next meeting on November 25th at 12pm PT

https://tracker.ceph.com/projects/ceph/wiki/User_Survey_Working_Group

https://tracker.ceph.com/attachments/download/5260/Ceph%20User%20Survey%202020.pdf

We're aiming to have something ready by mid-December.

--

Mike Perez

he/him

Ceph / Rook / RDO / Gluster Community Architect

Open-Source Program Office (OSPO)


M: +1-951-572-2633

494C 5D25 2968 D361 65FB 3829 94BC D781 ADA8 8AEA
@Thingee

___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to 
ceph-users-le...@ceph.io
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io


[ceph-users] Re: replace osd with Octopus

2020-11-27 Thread Anthony D'Atri


>> 
> 
> Here is the context.
> https://docs.ceph.com/en/latest/mgr/orchestrator/#replace-an-osd
> 
> When disk is broken,
> 1) orch osd rm  --replace [--force]
> 2) Replace disk.
> 3) ceph orch apply osd -i 
> 
> Step #1 marks OSD "destroyed". I assume it has the same effect as
> "ceph osd destroy". And that keeps OSD "in", no PG remapping and
> cluster is in "degrade" state.
> 
> After step #3, OSD will be "up" and "in", data will be recovered
> back to new disk. Is that right?

Yes.

> Is cluster "degrade" or "healthy" during such recovery?

It will be degraded, because there are fewer copies of some data available than 
during normal operation.  Clients will continue to access all data.

> For another option, the difference is no "--replace" in step #1.
> 1) orch osd rm  [--force]
> 2) Replace disk.
> 3) ceph orch apply osd -i 
> 
> Step #1 evacuates PGs from OSD and removes it from cluster.
> If disk is broken or OSD daemon is down, is this evacuation still
> going to work?

Yes, of course — broken drives are the typical reason for removing OSDs.

> Is it going to take a while if there is lots data on this disk?

Yes, depending on what “a while” means to you, the size of the cluster, whether 
the pool is replicated or EC, and whether these are HDDs or SSDs.

> After step #3, PGs will be rebalanced/remapped again when new OSD
> joins the cluster.
> 
> I think, to replace with the same disk model, option #1 is preferred,
> to replace with different disk model, it needs to be option #2.

I haven’t tried it under Octopus, but I don’t think this is strictly true.  If 
you replace it with a different model that is approximately the same size, 
everything will be fine.  Through Luminous and I think Nautilus at least, if 
you `destroy` and replace with a larger drive, the CRUSH weight of the OSD will 
still reflect that of the old drive.  You could then run `ceph osd crush 
reweight` after deploying to adjust the size.  You could record the CRUSH 
weights of all your drive models for initial OSD deploys, or you could `ceph 
osd tree` and look for another OSD of the same model, and set the CRUSH weight 
accordingly.
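
A minimal sketch of that adjustment (osd.17, osd.42 and the 10.91 weight are
hypothetical; the CRUSH weight is normally the drive's size in TiB):

# find the CRUSH weight of an existing OSD of the same model
ceph osd tree | grep osd.17
# set the new OSD's CRUSH weight to match
ceph osd crush reweight osd.42 10.91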

If you replace with a smaller drive, your cluster will lose a small amount of 
usable capacity.  If you replace with a larger drive, the cluster may or may 
not enjoy a slight increase in capacity — that depends on replication strategy, 
rack/host weights, etc.

My personal philosophy on drive replacements:

o Build OSDs with `--dmcrypt` so that you don’t have to worry about data if/when 
you RMA or recycle bad drives.  RMAs are a hassle, so pick a certain value 
threshold before a drive is worth the effort.  This might be in the $250-500 
range for example, which means that for many HDDs it isn’t worth RMAing them.

o If you have an exact replacement, use it

o When buying spares, buy the largest size drive you have deployed — or will 
deploy within the next year or so.  That way you know that your spares can take 
the place of any drive you have, so you don’t have to maintain stock of more 
than one size. Worst case you don’t immediately make good use of that extra 
capacity, but you may in the future as drives in other failure domains fail and 
are replaced.  Be careful, though, of mixing drives that are a lot different in 
size.  Mixing 12 and 14 TB drives, or even 12 and 16, is usually no big deal, but 
if you mix, say, 1 TB and 16 TB drives, you can end up exceeding 
`mon_max_pg_per_osd`, which is one reason why I like to increase it from the 
default value to, say, 400.
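
A quick sketch of checking and raising that on a recent release via the central
config database (the values shown are just illustrative):

ceph config get mon mon_max_pg_per_osd
ceph config set global mon_max_pg_per_osd 400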
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io