[ceph-users] Existing Cluster to cephadm - mds start failing

2020-04-12 Thread Ashley Merrick
Completed the migration of an existing Ceph cluster on Octopus to cephadm. All OSD/MON/MGR daemons moved fine; however, upon running the command to set up some new MDS daemons for cephfs, they both failed to start. After looking into the cephadm logs I found the following error: Apr 13 06:26:15 sn-s01 syst
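
For context, a minimal sketch of how MDS daemons are usually deployed and debugged under cephadm on Octopus; the filesystem name, host names and daemon id below are placeholders, not values from this thread:

    # ask the orchestrator to place two MDS daemons for an existing filesystem
    ceph orch apply mds cephfs --placement="2 host1 host2"

    # check what cephadm actually started
    ceph orch ps | grep mds

    # on the host of a failing daemon, pull its container/systemd logs
    cephadm logs --name mds.cephfs.host1.abcdef
    journalctl -u "ceph-$(ceph fsid)@mds.cephfs.host1.abcdef.service"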

[ceph-users] Re: [Octopus] OSD overloading

2020-04-12 Thread Jack
Yep I am. The issue is solved now .. and by solved, brace yourselves, I mean I had to recreate all OSDs. And as the cluster would not heal itself (because of the original issue), I had to drop every rados pool, stop all OSDs, and destroy & recreate them .. Yeah, well, hum. There is definitely an under
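
As a rough sketch of the kind of per-OSD teardown and rebuild described above (not the exact commands used in the thread; OSD id, host and device are placeholders):

    # evacuate and remove one OSD
    ceph osd out 12
    ceph osd purge 12 --yes-i-really-mean-it

    # wipe the backing device on its host, then recreate the OSD
    ceph-volume lvm zap --destroy /dev/sdb
    ceph-volume lvm create --data /dev/sdb

    # dropping a rados pool (requires mon_allow_pool_delete=true)
    ceph osd pool delete mypool mypool --yes-i-really-really-mean-it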

[ceph-users] Re: Recommendation for decent write latency performance from HDDs

2020-04-12 Thread Maged Mokhtar
On 12/04/2020 21:41, huxia...@horebdata.cn wrote: thanks again. I will try PetaSAN later. How big is the recommended cache size (dm-writecache) for an OSD? The actual number of partitions per SSD is more important: each partition serves 1 HDD/OSD, and we allow 1-8. For size, anything above 50GB is
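
A sketch of how one SSD partition per HDD can be wired up with LVM's dm-writecache target, following the sizing advice above; device, VG and LV names are invented for illustration and this is not PetaSAN's actual provisioning code:

    # the SSD has already been split into one partition per HDD (1-8 per SSD)
    vgcreate osd1 /dev/sda /dev/nvme0n1p1
    lvcreate -n main  -l 100%PVS osd1 /dev/sda        # data LV on the HDD
    lvcreate -n cache -L 60G     osd1 /dev/nvme0n1p1  # >50G cache LV on the SSD partition
    lvconvert --type writecache --cachevol cache osd1/main

    # the cached LV is then handed to ceph-volume as the OSD data device
    ceph-volume lvm create --data osd1/main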

[ceph-users] Re: Recommendation for decent write latency performance from HDDs

2020-04-12 Thread huxia...@horebdata.cn
Thanks again. I will try PetaSAN later. How big is the recommended cache size (dm-writecache) for an OSD? huxia...@horebdata.cn From: Maged Mokhtar Date: 2020-04-12 21:34 To: huxia...@horebdata.cn; Reed Dier; jesper CC: ceph-users Subject: Re: [ceph-users] Re: Recommendation for decent writ

[ceph-users] Re: Recommendation for decent write latency performance from HDDs

2020-04-12 Thread Maged Mokhtar
On 12/04/2020 20:35, huxia...@horebdata.cn wrote: That said, with a recent kernel such as the 4.19 stable release, and a decent enterprise SSD such as Intel D4510/4610, I do not need to worry about the data safety related to dm-writecache. Thanks a lot. samuel The patch recently went in 5.

[ceph-users] Re: MDS: obscene buffer_anon memory use when scanning lots of files

2020-04-12 Thread Dan van der Ster
Hi John, Did you make any progress on investigating this? Today I also saw huge relative buffer_anon usage on our 2 active MDSs running 14.2.8: "mempool": { "by_pool": { "bloom_filter": { "items": 2322, "bytes": 2322 },
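
For reference, a small sketch of how these per-pool numbers can be pulled from a running MDS admin socket and compared with the configured cache target; the daemon name is an example:

    # dump mempool statistics from the MDS admin socket
    ceph daemon mds.cephfs-a dump_mempools | jq '.mempool.by_pool.buffer_anon'

    # the cache limit the MDS is supposed to respect
    ceph config get mds mds_cache_memory_limit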

[ceph-users] Re: Fwd: question on rbd locks

2020-04-12 Thread Void Star Nill
Paul, Ilya, others, Any inputs on this? Thanks, Shridhar On Thu, 9 Apr 2020 at 12:30, Void Star Nill wrote: > Thanks Ilya, Paul. > > I don't have the panic traces, and they are probably not related to rbd. I > was merely describing our use case. > > On our setup that we manage, we have a softwa
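
For anyone following along, a hedged sketch of the commands usually involved when inspecting and breaking rbd locks left behind by dead clients; pool/image name, lock id and client address are placeholders:

    # show advisory locks and current watchers on an image
    rbd lock ls rbd/myimage
    rbd status rbd/myimage

    # remove a stale lock; the lock id and locker come from "rbd lock ls"
    rbd lock rm rbd/myimage "auto 123456789" client.4567

    # optionally blacklist the dead client so it cannot keep writing
    ceph osd blacklist add 192.168.1.10:0/123456789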

[ceph-users] Re: Recommendation for decent write latency performance from HDDs

2020-04-12 Thread huxia...@horebdata.cn
That said, with a recent kernel such as the 4.19 stable release, and a decent enterprise SSD such as Intel D4510/4610, I do not need to worry about the data safety related to dm-writecache. Thanks a lot. samuel huxia...@horebdata.cn From: Maged Mokhtar Date: 2020-04-12 20:03 To: huxia...@hor

[ceph-users] Re: Recommendation for decent write latency performance from HDDs

2020-04-12 Thread Maged Mokhtar
On 12/04/2020 18:10, huxia...@horebdata.cn wrote: Dear Maged Mokhtar, It is very interesting to know that your experiment shows dm-writecache would be better than other alternatives. I have two questions: yes, much better. 1) Can one cache device serve multiple HDDs? I know bcache can do

[ceph-users] Re: Recommendation for decent write latency performance from HDDs

2020-04-12 Thread huxia...@horebdata.cn
Dear Maged Mokhtar, It is very interesting to know that your experiment shows dm-writecache would be better than other alternatives. I have two questions: 1) Can one cache device serve multiple HDDs? I know bcache can do this, which is convenient. I don't know whether dm-writecache has such a feat

[ceph-users] Re: Recommendation for decent write latency performance from HDDs

2020-04-12 Thread Maged Mokhtar
On 10/04/2020 23:17, Reed Dier wrote: Going to resurrect this thread to provide another option: LVM-cache, i.e. putting a cache device in front of the bluestore LVM LV. I only mention this because I noticed it in the SUSE documentation for SES6 (based on Nautilus) here: https://documentation.su
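
A rough lvmcache sketch of the setup being described (an SSD-backed cache pool in front of the bluestore LV); all VG/LV/device names are invented for illustration and the exact steps in the SUSE docs may differ:

    # add the fast device to the VG that holds the bluestore block LV
    vgextend ceph-block-0 /dev/nvme0n1p1

    # build a cache pool (data + metadata) on the fast device
    lvcreate -n cache0     -L 60G ceph-block-0 /dev/nvme0n1p1
    lvcreate -n cache0meta -L 1G  ceph-block-0 /dev/nvme0n1p1
    lvconvert --type cache-pool --poolmetadata ceph-block-0/cache0meta ceph-block-0/cache0

    # attach it to the OSD's block LV in writeback mode
    lvconvert --type cache --cachepool ceph-block-0/cache0 --cachemode writeback ceph-block-0/osd-block-0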

[ceph-users] How to fix 1 pg stale+active+clean of cephfs pool

2020-04-12 Thread Marc Roos
The cause of the stale pg is the fs_data.r1 1-replica pool. This should be empty, but ceph df shows 128 KiB used. I have already marked the osd as lost and removed the osd from the crush map. PG_AVAILABILITY Reduced data availability: 1 pg stale pg 30.4 is stuck stale for 407878.113092,
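
A sketch of the usual way out of this situation once the only OSD holding a size=1 PG is gone; note that force-creating the PG discards whatever it contained (which, per the above, should be nothing):

    # confirm which PG is stale and where it maps
    ceph pg dump_stuck stale
    ceph pg map 30.4

    # recreate the PG empty so the pool can go active+clean again
    ceph osd force-create-pg 30.4 --yes-i-really-mean-it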

[ceph-users] MDS: what's the purpose of using LogEvent with empty metablob?

2020-04-12 Thread Xinying Song
Hi, cephers: What's the purpose of using a LogEvent with an empty metablob? For example, in a link/unlink operation across two active MDSs, when the slave receives OP_FINISH it will write an ESlaveUpdate::OP_COMMIT to the journal, then send OP_COMMITTED to the master. When the master receives OP_COMMITTED it will write
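
Not an answer, but for anyone digging into this: the journal events in question (including the slave-update entries) can be listed with cephfs-journal-tool, which reads the MDS journal from RADOS; the filesystem name and rank below are examples, and event type names in the output are whatever the tool reports:

    cephfs-journal-tool --rank=cephfs:0 journal inspect
    cephfs-journal-tool --rank=cephfs:0 event get summary   # per-type counts, e.g. UPDATE vs. slave updates
    cephfs-journal-tool --rank=cephfs:0 event get list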