[ceph-users] Re: Need help integrating radosgw with keystone for openstack swift

2020-10-22 Thread Burkhard Linke
Hi, in our setup (Ceph 15.2.4, OpenStack Train) the swift endpoint URLs are different, e.g. # openstack endpoint list --service swift
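For comparison, a minimal sketch of how the swift endpoint and the matching radosgw keystone options are commonly wired up (service/region/URLs are placeholders, not values from this thread):
  openstack service create --name swift --description "Object Storage" object-store
  openstack endpoint create --region RegionOne object-store public http://rgw.example.com:8080/swift/v1
  openstack endpoint create --region RegionOne object-store internal http://rgw.example.com:8080/swift/v1
  # and on the rgw host, in ceph.conf:
  #   rgw keystone url = http://keystone.example.com:5000
  #   rgw keystone api version = 3
  #   rgw keystone accepted roles = member,admin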

[ceph-users] Re: 6 PG's stuck not-active, remapped

2020-10-22 Thread Burkhard Linke
Hi, On 10/21/20 10:01 PM, Mac Wynkoop wrote: *snipsnap* up: {0: 113, 1: 138, 2: 30, 3: 132, 4: 105, 5: 57, 6: 106, 7: 140, 8: 161} acting: {0: 72, 1: 150, 2: 2147483647, 3: 2147483647, 4: 24, 5: 48, 6: 32, 7: 157, 8: 103} 2147483647 is -1 as unsigned integer. This value means that the CRUSH algorithm did not produce enough
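A quick way to see why CRUSH comes up short for such a PG (pgid, pool and rule names are placeholders):
  ceph pg <pgid> query                  # look at the up/acting sets and the recovery state
  ceph osd pool get <pool> crush_rule
  ceph osd crush rule dump <rule>       # compare the rule's choose steps with the failure domains actually available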

[ceph-users] Re: Need help integrating radosgw with keystone for openstack swift

2020-10-22 Thread Bujack, Stefan
Hi, I tried your endpoint configuration but with the same outcome. Maybe I am missing something; I also don't know if I am testing the right way. But thank you for your answer and your help. Greets, Stefan Bujack root@keystone:~# openstack endpoint list | grep swift | 0ee9c91af2424e33a91a4c

[ceph-users] Re: multiple OSD crash, unfound objects

2020-10-22 Thread Frank Schilder
Sounds good. Did you re-create the pool again? If not, please do so, to give the devicehealth manager module its storage. In case you can't see any IO, it might be necessary to restart the MGR to flush out a stale rados connection. I would probably give the pool 10 PGs instead of 1, but that's up to
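A sketch of what that could look like (pool and application names assume the device health module's defaults):
  ceph osd pool create device_health_metrics 10 10
  ceph osd pool application enable device_health_metrics mgr_devicehealth
  ceph mgr fail <active-mgr>            # or restart the active ceph-mgr to drop a stale rados connection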

[ceph-users] Urgent help needed please - MDS offline

2020-10-22 Thread David C
Hi All, My main CephFS data pool on a Luminous 12.2.10 cluster hit capacity overnight; metadata is on a separate pool which didn't hit capacity, but the filesystem stopped working, which I'd expect. I increased the osd full-ratio to give me some breathing room to get some data deleted once the filesy

[ceph-users] Re: multiple OSD crash, unfound objects

2020-10-22 Thread Frank Schilder
Could you also execute (and post the output of) # osdmaptool osd.map --test-map-pgs-dump --pool 7 with the osd map you pulled out (pool 7 should be the fs data pool)? Please check what mapping is reported for PG 7.39d? Just checking if osd map and pg dump agree here. Best regards, ==
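For reference, the full round trip looks roughly like this (file name arbitrary, PG id taken from the thread):
  ceph osd getmap -o osd.map                         # grab the current osdmap
  osdmaptool osd.map --test-map-pgs-dump --pool 7 | grep '7\.39d'
  ceph pg dump pgs | grep '^7\.39d'                  # compare with what the cluster itself reports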

[ceph-users] Hardware needs for MDS for HPC/OpenStack workloads?

2020-10-22 Thread Matthew Vernon
Hi, We're considering the merits of enabling CephFS for our main Ceph cluster (which provides object storage for OpenStack), and one of the obvious questions is what sort of hardware we would need for the MDSs (and how many!). These would be for our users' scientific workloads, so they would

[ceph-users] Re: Urgent help needed please - MDS offline

2020-10-22 Thread Dan van der Ster
You can disable that beacon by increasing mds_beacon_grace to 300 or 600. This will stop the mon from failing that mds over to a standby. I don't know if that is set on the mon or mgr, so I usually set it on both. (You might as well disable the standby too -- no sense in something failing back and
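On a Luminous cluster that could look roughly like this (runtime injection plus ceph.conf so it survives restarts; the value is the one suggested above):
  ceph tell mon.* injectargs '--mds_beacon_grace=600'
  # and in ceph.conf on the mon/mgr hosts:
  #   [global]
  #   mds_beacon_grace = 600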

[ceph-users] OSD Failures after pg_num increase on one of the pools

2020-10-22 Thread Артём Григорьев
Hello everyone, I recently created a new Ceph 14.2.7 Nautilus cluster. The cluster consists of 3 racks with 2 OSD nodes in each rack and 12 new HDDs in each node. The HDD model is TOSHIBA MG07ACA14TE 14TB. All data pools are EC pools. Yesterday I decided to increase the pg number on one of the pools with the command
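The usual form of that command, for reference (pool name and target count are placeholders, not the exact values from this message):
  ceph osd pool set <ec-pool> pg_num 2048
  # on Nautilus the pg/pgp counts are then ramped up gradually in the background; watch with:
  ceph osd pool ls detail | grep <ec-pool>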

[ceph-users] Re: Huge RAM Usage on OSD recovery

2020-10-22 Thread Mark Nelson
On 10/21/20 10:54 PM, Ing. Luis Felipe Domínguez Vega wrote: On 2020-10-20 17:57, Ing. Luis Felipe Domínguez Vega wrote: Hi, today my infra provider had a blackout, then Ceph tried to recover but is in an inconsistent state because many OSDs can't recover themselves because the kernel kills
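When OSDs are OOM-killed during recovery, the knobs usually discussed are the OSD memory target and recovery/backfill throttling; a hedged example, values illustrative only:
  ceph config set osd osd_memory_target 2147483648   # ~2 GiB per OSD daemon
  ceph config set osd osd_max_backfills 1
  ceph config set osd osd_recovery_max_active 1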

[ceph-users] Re: Urgent help needed please - MDS offline

2020-10-22 Thread David C
Dan, many thanks for the response. I was going down the route of looking at mds_beacon_grace but I now realise that when I start my MDS, it's swallowing up memory rapidly and it looks like the oom-killer is eventually killing the MDS. With debug upped to 10, I can see it's doing EMetaBlob.replays on vario

[ceph-users] Ceph Octopus and Snapshot Schedules

2020-10-22 Thread Adam Boyhan
Hey all. I was wondering if Ceph Octopus is capable of automating/managing snapshot creation/retention and then replication? I've seen some notes about it, but can't seem to find anything solid. Open to suggestions as well. Appreciate any input!
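If the question is about RBD: Octopus ships snapshot-based mirroring with built-in schedules; a minimal sketch (pool/image names and interval are placeholders, and a configured peer cluster is assumed):
  rbd mirror image enable rbd/myimage snapshot       # per-image, snapshot-based mirroring
  rbd mirror snapshot schedule add --pool rbd 3h     # take mirror snapshots every 3 hours
  rbd mirror snapshot schedule ls --pool rbd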

[ceph-users] Re: Ceph Octopus and Snapshot Schedules

2020-10-22 Thread Martin Verges
Hello Adam, in our croit Ceph Management Software, we have a snapshot manager feature that is capable of doing that. -- Martin Verges Managing director Mobile: +49 174 9335695 E-Mail: martin.ver...@croit.io Chat: https://t.me/MartinVerges croit GmbH, Freseniusstr. 31h, 81247 Munich CEO: Martin

[ceph-users] Re: Huge RAM Usage on OSD recovery

2020-10-22 Thread Ing . Luis Felipe Domínguez Vega
On 2020-10-22 09:07, Mark Nelson wrote: On 10/21/20 10:54 PM, Ing. Luis Felipe Domínguez Vega wrote: On 2020-10-20 17:57, Ing. Luis Felipe Domínguez Vega wrote: Hi, today my infra provider had a blackout, then Ceph tried to recover but is in an inconsistent state because many OSDs c

[ceph-users] Re: Urgent help needed please - MDS offline

2020-10-22 Thread Dan van der Ster
You could decrease the mds_cache_memory_limit but I don't think this will help here during replay. You can see a related tracker here: https://tracker.ceph.com/issues/47582 This is possibly caused by replaying a very large journal. Did you increase the journal segments? -- dan On T

[ceph-users] Re: Urgent help needed please - MDS offline

2020-10-22 Thread David C
I've not touched the journal segments, current value of mds_log_max_segments is 128. Would you recommend I increase (or decrease) that value? And do you think I should change mds_log_max_expiring to match that value? On Thu, Oct 22, 2020 at 3:06 PM Dan van der Ster wrote: > > You could decrease t
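To just inspect the current values on the running daemon before changing anything (daemon id is a placeholder):
  ceph daemon mds.<id> config get mds_log_max_segments
  ceph daemon mds.<id> config get mds_log_max_expiring
  ceph daemon mds.<id> perf dump mds_log             # journal event/segment counters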

[ceph-users] Re: Urgent help needed please - MDS offline

2020-10-22 Thread Dan van der Ster
I wouldn't adjust it. Do you have the impression that the mds is replaying the exact same ops every time the mds is restarting? or is it progressing and trimming the journal over time? The only other advice I have is that 12.2.10 is quite old, and might miss some important replay/mem fixes. I'm th

[ceph-users] Strange USED size

2020-10-22 Thread Marcelo
Hello. I've searched a lot but couldn't find out why the USED column in the output of ceph df is many times bigger than the actual size. I'm using Nautilus (14.2.8), and I have 1000 buckets with 100 objects in each bucket. Each object is around 10B. ceph df RAW STORAGE: CLASS SIZE
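One common explanation (an assumption here, since the thread is truncated) is BlueStore's minimum allocation size: on HDD OSDs Nautilus allocates 64 KiB per object, so tiny objects inflate USED enormously:
  # 100,000 objects x 64 KiB min_alloc x 3 replicas ~= 18.3 GiB USED for ~1 MB of data
  ceph osd pool get <pool> size
  ceph daemon osd.0 config get bluestore_min_alloc_size_hdd   # 65536 by default on Nautilus HDDs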

[ceph-users] Hardware for new OSD nodes.

2020-10-22 Thread Dave Hall
Hello, (BTW, Nautilus 14.2.7 on Debian non-container.) We're about to purchase more OSD nodes for our cluster, but I have a couple of questions about hardware choices. Our original nodes were 8 x 12TB SAS drives and a 1.6TB Samsung NVMe card for WAL, DB, etc. We chose the NVMe card for perform

[ceph-users] Re: Huge RAM Usage on OSD recovery

2020-10-22 Thread Mark Nelson
On 10/22/20 9:02 AM, Ing. Luis Felipe Domínguez Vega wrote: On 2020-10-22 09:07, Mark Nelson wrote: On 10/21/20 10:54 PM, Ing. Luis Felipe Domínguez Vega wrote: On 2020-10-20 17:57, Ing. Luis Felipe Domínguez Vega wrote: Hi, today my infra provider had a blackout, then Ceph tried

[ceph-users] Re: Hardware needs for MDS for HPC/OpenStack workloads?

2020-10-22 Thread Dan van der Ster
Hi Matthew, On Thu, Oct 22, 2020 at 2:35 PM Matthew Vernon wrote: > > Hi, > > We're considering the merits of enabling CephFS for our main Ceph > cluster (which provides object storage for OpenStack), and one of the > obvious questions is what sort of hardware we would need for the MDSs > (and ho
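Whatever the eventual sizing, the main MDS resource is RAM for the metadata cache; the knob it is usually sized against (16 GiB here purely as an illustration, not a recommendation from this thread) is:
  ceph config set mds mds_cache_memory_limit 17179869184   # 16 GiB of cache per MDS daemon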

[ceph-users] Re: Large map object found

2020-10-22 Thread Peter Eisch
Thank you! This was helpful. I opted for a manual reshard: [root@cephmon-s03 ~]# radosgw-admin bucket reshard --bucket=d2ff913f5b6542cda307c9cd6a95a214/NAME_segments --num-shards=3 tenant: d2ff913f5b6542cda307c9cd6a95a214 bucket name: backups_sql_dswhseloadrepl_segments old bucket instance id:
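To verify the result afterwards, something like this should do (same bucket spec as above):
  radosgw-admin reshard status --bucket=d2ff913f5b6542cda307c9cd6a95a214/NAME_segments
  radosgw-admin bucket limit check      # reports num_shards / objects_per_shard / fill_status per bucket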

[ceph-users] Re: Large map object found

2020-10-22 Thread DHilsbos
Peter; I believe shard counts should be powers of two. Also, resharding makes the buckets unavailable, but occurs very quickly. As such it is not done in the background, but in the foreground, for a manual reshard. Notice the statement: "reshard of bucket from to completed successfully."

[ceph-users] Re: Hardware for new OSD nodes.

2020-10-22 Thread Brian Topping
> On Oct 22, 2020, at 9:14 AM, Eneko Lacunza wrote: > > Don't stripe them, if one NVMe fails you'll lose all OSDs. Just use 1 NVMe > drive for 2 SAS drives and provision 300GB for WAL/DB for each OSD (see > related threads on this mailing list about why that exact size). > > This way if a
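A sketch of that layout with LVM and ceph-volume (device and VG/LV names are examples; one ~300 GB DB/WAL LV per OSD on the shared NVMe):
  vgcreate ceph-db /dev/nvme0n1
  lvcreate -L 300G -n db-sdb ceph-db
  lvcreate -L 300G -n db-sdc ceph-db
  ceph-volume lvm create --data /dev/sdb --block.db ceph-db/db-sdb
  ceph-volume lvm create --data /dev/sdc --block.db ceph-db/db-sdc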

[ceph-users] Re: Huge RAM Usage on OSD recovery

2020-10-22 Thread Ing . Luis Felipe Domínguez Vega
On 2020-10-22 10:48, Mark Nelson wrote: On 10/22/20 9:02 AM, Ing. Luis Felipe Domínguez Vega wrote: On 2020-10-22 09:07, Mark Nelson wrote: On 10/21/20 10:54 PM, Ing. Luis Felipe Domínguez Vega wrote: On 2020-10-20 17:57, Ing. Luis Felipe Domínguez Vega wrote: Hi, today my infra provi

[ceph-users] Re: Urgent help needed please - MDS offline

2020-10-22 Thread David C
I'm pretty sure it's replaying the same ops every time, the last "EMetaBlob.replay updated dir" before it dies is always referring to the same directory. Although interestingly that particular dir shows up in the log thousands of times - the dir appears to be where a desktop app is doing some analy

[ceph-users] Re: Hardware for new OSD nodes.

2020-10-22 Thread Anthony D'Atri
> Also, any thoughts/recommendations on 12TB OSD drives? For price/capacity > this is a good size for us Last I checked HDD prices seemed linear from 10-16TB. Remember to include the cost of the drive bay, ie. the cost of the chassis, the RU(s) it takes up, power, switch ports etc. I’ll gu

[ceph-users] Re: Urgent help needed please - MDS offline

2020-10-22 Thread Dan van der Ster
I assume you aren't able to quickly double the RAM on this MDS? Or fail over to a new MDS with more RAM? Failing that, you shouldn't reset the journal without recovering dentries, otherwise the cephfs_data objects won't be consistent with the metadata. The full procedure to be used is here: https:
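The procedure being referred to boils down to something like the following (abridged; read the cephfs disaster-recovery docs before running any of it):
  cephfs-journal-tool journal export backup.bin          # take a backup first
  cephfs-journal-tool event recover_dentries summary     # write dentries from the journal back into the metadata pool
  cephfs-journal-tool journal reset                      # only after the dentries have been recovered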

[ceph-users] Re: Hardware for new OSD nodes.

2020-10-22 Thread Anthony D'Atri
> Yeah, didn't think about a RAID10 really, although there wouldn't be enough > space for 8x300GB = 2400GB WAL/DBs. 300 is overkill for many applications anyway. > > Also, using a RAID10 for WAL/DBs will: > - make OSDs less movable between hosts (they'd have to be moved all > together -

[ceph-users] 14.2.12 breaks mon_host pointing to Round Robin DNS entry

2020-10-22 Thread Wido den Hollander
Hi, I already submitted a ticket: https://tracker.ceph.com/issues/47951 Maybe other people noticed this as well. Situation: - Cluster is running IPv6 - mon_host is set to a DNS entry - DNS entry is a Round Robin with three AAAA-records root@wido-standard-benchmark:~# ceph -s unable to parse add
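For context, the failing configuration is of this general shape (hostname and addresses are placeholders); listing the monitor addresses explicitly is the obvious stop-gap until the regression is fixed:
  [global]
  mon_host = mons.example.org     # one name resolving to three AAAA records -> fails on 14.2.12
  # workaround: spell the monitors out, e.g.
  # mon_host = [v2:[2001:db8::a]:3300,v1:[2001:db8::a]:6789],[v2:[2001:db8::b]:3300,v1:[2001:db8::b]:6789]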

[ceph-users] Re: Hardware for new OSD nodes.

2020-10-22 Thread Brian Topping
> On Oct 22, 2020, at 10:34 AM, Anthony D'Atri wrote: >> - You must really be sure your raid card is dependable. (sorry but I have seen so many management problems with top-tier RAID cards that I avoid them like the plague). > This. I’d definitely avoid a RAID card. If I can do adva

[ceph-users] Re: 14.2.12 breaks mon_host pointing to Round Robin DNS entry

2020-10-22 Thread Jason Dillaman
This backport [1] looks suspicious as it was introduced in v14.2.12 and directly changes the initial MonMap code. If you revert it in a dev build does it solve your problem? [1] https://github.com/ceph/ceph/pull/36704 On Thu, Oct 22, 2020 at 12:39 PM Wido den Hollander wrote: > > Hi, > > I alrea

[ceph-users] Re: Urgent help needed please - MDS offline

2020-10-22 Thread David C
Thanks, guys. I can't add more RAM right now or get access to a server that has more; I'd fear it wouldn't be enough anyway. I'll give the swap idea a go and try to track down the thread you mentioned, Frank. 'cephfs-journal-tool journal inspect' tells me the journal is fine. I was able to back it u

[ceph-users] Re: Urgent help needed please - MDS offline

2020-10-22 Thread Dan van der Ster
On Thu, 22 Oct 2020, 19:03 David C, wrote: > Thanks, guys > > I can't add more RAM right now or have access to a server that does, > I'd fear it wouldn't be enough anyway. I'll give the swap idea a go > and try and track down the thread you mentioned, Frank. > > 'cephfs-journal-tool journal inspe

[ceph-users] Re: Urgent help needed please - MDS offline

2020-10-22 Thread David C
On Thu, Oct 22, 2020 at 6:09 PM Dan van der Ster wrote: > > > > On Thu, 22 Oct 2020, 19:03 David C, wrote: >> >> Thanks, guys >> >> I can't add more RAM right now or have access to a server that does, >> I'd fear it wouldn't be enough anyway. I'll give the swap idea a go >> and try and track down

[ceph-users] Re: Urgent help needed please - MDS offline

2020-10-22 Thread Dan van der Ster
On Thu, 22 Oct 2020, 19:14 David C, wrote: > On Thu, Oct 22, 2020 at 6:09 PM Dan van der Ster > wrote: > > > > > > > > On Thu, 22 Oct 2020, 19:03 David C, wrote: > >> > >> Thanks, guys > >> > >> I can't add more RAM right now or have access to a server that does, > >> I'd fear it wouldn't be en

[ceph-users] Re: Hardware for new OSD nodes.

2020-10-22 Thread Dave Hall
Eneko, On 10/22/2020 11:14 AM, Eneko Lacunza wrote: Hi Dave, On 22/10/20 16:48, Dave Hall wrote: Hello, (BTW, Nautilus 14.2.7 on Debian non-container.) We're about to purchase more OSD nodes for our cluster, but I have a couple of questions about hardware choices. Our original nodes

[ceph-users] Re: Hardware for new OSD nodes.

2020-10-22 Thread Eneko Lacunza
Hi Dave, On 22/10/20 16:48, Dave Hall wrote: Hello, (BTW, Nautilus 14.2.7 on Debian non-container.) We're about to purchase more OSD nodes for our cluster, but I have a couple of questions about hardware choices. Our original nodes were 8 x 12TB SAS drives and a 1.6TB Samsung NVMe car

[ceph-users] Re: multiple OSD crash, unfound objects

2020-10-22 Thread Michael Thomas
Done. I gave it 4 PGs (I read somewhere that PG counts should be multiples of 2), and restarted the mgr. I still don't see any traffic to the pool, though I'm also unsure how much traffic is to be expected. --Mike On 10/22/20 2:32 AM, Frank Schilder wrote: Sounds good. Did you re-create the

[ceph-users] Re: multiple OSD crash, unfound objects

2020-10-22 Thread Michael Thomas
On 10/22/20 3:22 AM, Frank Schilder wrote: Could you also execute (and post the output of) # osdmaptool osd.map --test-map-pgs-dump --pool 7 osdmaptool dumped core. Here is stdout: https://pastebin.com/HPtSqcS1 The PG map for 7.39d matches the pg dump, with the expected difference of 21

[ceph-users] Re: Hardware for new OSD nodes.

2020-10-22 Thread Eneko Lacunza
Hi Brian, On 22/10/20 17:50, Brian Topping wrote: On Oct 22, 2020, at 9:14 AM, Eneko Lacunza wrote: Don't stripe them; if one NVMe fails you'll lose all OSDs. Just use 1 NVMe drive for 2 SAS drives and provision 300GB for WAL/DB for each OSD (see rel

[ceph-users] Re: Urgent help needed please - MDS offline

2020-10-22 Thread Frank Schilder
If you can't add RAM, you could try provisioning swap on a reasonably fast drive. There is a thread from this year where someone had a similar problem: the MDS running out of memory during replay. He was able to quickly add sufficient swap and the MDS managed to come up. It took a long time though, but m
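A quick way to add temporary swap on a spare fast device or file (size and path are examples):
  fallocate -l 64G /var/swapfile   # or: dd if=/dev/zero of=/var/swapfile bs=1M count=65536
  chmod 600 /var/swapfile
  mkswap /var/swapfile
  swapon /var/swapfile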

[ceph-users] Re: Urgent help needed please - MDS offline

2020-10-22 Thread Frank Schilder
The post was titled "mds behind on trimming - replay until memory exhausted". > Load up with swap and try the up:replay route. > Set the beacon to 10 until it finishes. Good point! The MDS will not send beacons for a long time. Same was necessary in the other case. Good luck! ==

[ceph-users] Switch docker image?

2020-10-22 Thread Harry G. Coin
This has got to be ceph/docker "101" but I can't find the answer in the docs and need help. The latest Docker Octopus images support using the ntpsec time daemon. The default stable Octopus image doesn't yet. I want to add a mon to a cluster that needs to use ntpsec (just go with it..), so I
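If the cluster is managed with cephadm (an assumption; the message doesn't say how it was deployed), pointing newly added daemons at a different image looks roughly like this, with the image tag being a guess:
  ceph config set global container_image docker.io/ceph/daemon-base:latest-octopus-devel
  ceph orch daemon add mon newhost:10.1.2.123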