[ceph-users] Re: Provide more documentation for MDS performance tuning on large file systems

2020-12-15 Thread Janek Bevendorff
My current settings are:

mds      advanced  mds_beacon_grace                  15.00
mds      basic     mds_cache_memory_limit            4294967296
mds      advanced  mds_cache_trim_threshold          393216
global   advanced  mds_export_ephemeral_distributed  true
mds      advanced  mds_recall_global_max
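
For reference, settings like these are usually applied through the centralized config database (a minimal sketch, assuming a Mimic-or-later cluster; only the values quoted above are shown):

ceph config set mds mds_cache_memory_limit 4294967296
ceph config set mds mds_cache_trim_threshold 393216
ceph config set global mds_export_ephemeral_distributed true
ceph config dump | grep mds    # review the resulting scope/value table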

[ceph-users] Weird ceph df

2020-12-15 Thread Szabo, Istvan (Agoda)
Hi, it is a Nautilus 14.2.13 ceph. The quota on the pool is 745GiB, so how can the stored data be 788GiB? (2-replica pool.) Based on the USED column it means just 334GiB is used, because the pool has only 2 replicas. I don't understand. POOLS: POOL  ID  STORED  OBJECTS
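
For reference, the figures being compared here can be pulled with the following commands (a sketch; 'mypool' is a placeholder pool name):

ceph osd pool get-quota mypool    # configured max_objects / max_bytes
ceph osd pool get mypool size     # replica count
ceph df detail                    # STORED = user data, USED is roughly STORED x replica count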

[ceph-users] Re: performance degradation every 30 seconds

2020-12-15 Thread Sebastian Trojanowski
Hi, check your rbd cache; by default it's enabled, and for SSD/NVMe it's better to disable it. Looks like your cache/buffers are full and need a flush. It could harm your env. BR, Sebastian On 11.12.2020 19:08, Philip Brown wrote: I have a new 3 node octopus cluster, set up on SSDs. I'm runnin
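
Disabling the RBD cache can be done on the client side, either in ceph.conf or via the config database (a sketch; clients must be restarted or images remapped for it to take effect):

# /etc/ceph/ceph.conf on the client
[client]
rbd cache = false

# or, centrally (Mimic and later):
ceph config set client rbd_cache false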

[ceph-users] Re: Whether read I/O is accepted when the number of replicas is under pool's min_size

2020-12-15 Thread Eugen Block
Hi, it's correct that both read and write I/O are paused when a pool's min_size is not met. Regards, Eugen Zitat von Satoru Takeuchi : Hi, Could you tell me whether read I/O is accepted when the number of replicas is under the pool's min_size? I read the official document and found that ther
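
The relevant thresholds can be inspected or changed per pool (a sketch; 'mypool' is a placeholder):

ceph osd pool get mypool size        # replica count
ceph osd pool get mypool min_size    # below this, both reads and writes are paused
ceph osd pool set mypool min_size 2  # example only; min_size 1 risks data loss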

[ceph-users] Re: PGs down

2020-12-15 Thread Wout van Heeswijk
Hi Igor, Are you referring to the bug reports:
- https://tracker.ceph.com/issues/48276 | OSD Crash with ceph_assert(is_valid_io(off, len))
- https://tracker.ceph.com/issues/46800 | Octopus OSD died and fails to start with FAILED ceph_assert(is_valid_io(off, len))
If that is the case, do you th

[ceph-users] Re: PGs down

2020-12-15 Thread Igor Fedotov
Hi Wout, On 12/15/2020 1:18 PM, Wout van Heeswijk wrote:
> Hi Igor, Are you referring to the bug reports:
> - https://tracker.ceph.com/issues/48276 | OSD Crash with ceph_assert(is_valid_io(off, len))
> - https://tracker.ceph.com/issues/46800 | Octopus OSD died and fails to start with FAILED ceph_a

[ceph-users] osd has slow request and currently waiting for peered

2020-12-15 Thread 912273...@qq.com
Hi all, After rebooting one node, one OSD on another node had 'slow requests' and was 'currently waiting for peered' for a long time, until that OSD was restarted. Is this a bug? See the attachment for more OSD log.
2020-12-11 15:39:12.837391 7f3906fa2700 0 log_channel(cluster) log [WRN] : 15 slow requests, 1 inc
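
For situations like this, the usual starting points are (a sketch; OSD id and PG id are placeholders):

ceph health detail                       # which OSDs report slow/blocked requests
ceph pg dump_stuck inactive              # PGs stuck peering or otherwise not active
ceph pg <pgid> query                     # "recovery_state" shows what peering is waiting on
ceph daemon osd.<id> dump_ops_in_flight  # the blocked ops on the affected OSD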

[ceph-users] Re: Provide more documentation for MDS performance tuning on large file systems

2020-12-15 Thread Patrick Donnelly
On Tue, Dec 15, 2020 at 12:50 AM Janek Bevendorff wrote:
> My current settings are:
> mds advanced mds_beacon_grace 15.00
This should be a global setting. It is used by the mons and MDSs.
> mds basic mds_cache_memory_limit 4294967296
> mds advanced mds
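
Moving the option to the global scope as suggested could look roughly like this (a sketch using the centralized config):

ceph config rm mds mds_beacon_grace
ceph config set global mds_beacon_grace 15
ceph config get mon mds_beacon_grace    # verify the mons pick it up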

[ceph-users] Re: Provide more documentation for MDS performance tuning on large file systems

2020-12-15 Thread Janek Bevendorff
> My current settings are:
> mds advanced mds_beacon_grace 15.00
True. I might as well remove it completely; it's an artefact of earlier experiments.
> This should be a global setting. It is used by the mons and MDSs.
> mds basic mds_cache_memory_limit 4294967296

[ceph-users] Re: performance degradation every 30 seconds

2020-12-15 Thread Philip Brown
It won't be on the same node... but since, as you saw, the problem still shows up with iodepth=32, it seems we're still in the same problem ballpark. Also... there may be 100 client machines, but each client can have anywhere between 1-30 threads running at a time. As far as fio using the rados e

[ceph-users] Re: performance degradation every 30 seconds

2020-12-15 Thread Jason Dillaman
On Tue, Dec 15, 2020 at 12:24 PM Philip Brown wrote:
> It wont be on the same node...
> but since as you saw, the problem still shows up with iodepth=32 seems we're still in the same problem ball park
> also... there may be 100 client machines.. but each client can have anywhere betwee

[ceph-users] Re: performance degradation every 30 seconds

2020-12-15 Thread Philip Brown
I did a git pull of the latest fio from git://git.kernel.dk/fio.git and built it with:
# gcc --version
gcc (GCC) 7.3.1 20180303 (Red Hat 7.3.1-5)
Results were as expected. Using straight rados, there were no performance hiccups. But using fio --direct=1 --rw=randwrite --bs=4k --ioengine=rbd --pool=te
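
For comparison, a complete fio invocation against the rbd engine typically looks like this (a sketch; pool/image names and runtime are placeholders rather than the values from the truncated command above):

fio --ioengine=rbd --clientname=admin --pool=testpool --rbdname=testrbd \
    --direct=1 --rw=randwrite --bs=4k --iodepth=256 --numjobs=1 \
    --runtime=120 --time_based --group_reporting --name=iops-rbd-test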

[ceph-users] Re: performance degradation every 30 seconds

2020-12-15 Thread Philip Brown
I would think it should be something like that. However, I just tried:
rbd image-meta set testpool/testrbd conf_rbd_cache false
fio --direct=1 --rw=randwrite --bs=4k --ioengine=rbd --pool=testpool --rbdname=testrbd --iodepth=256 --numjobs=1 --time_based --group_reporting --name=iops-rbd-t
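
To confirm the per-image override actually took effect, the image metadata and the effective image config can be listed (a sketch):

rbd image-meta list testpool/testrbd                       # should show conf_rbd_cache = false
rbd config image list testpool/testrbd | grep rbd_cache    # effective value (Mimic and later)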

[ceph-users] Re: multiple OSD crash, unfound objects

2020-12-15 Thread Michael Thomas
Hi Frank, I was able to migrate the data off of the "broken" pool (fs.data.archive.frames) and onto the new one (fs.data.archive.newframes). I verified that no useful data is left on the "broken" pool:
* 'find + getfattr -n ceph.file.layout.pool' shows no files on the bad pool
* 'find + ge
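
The layout check mentioned above follows this pattern (a sketch; the mount point is a placeholder):

find /mnt/cephfs -type f -exec getfattr -n ceph.file.layout.pool {} + 2>/dev/null | grep fs.data.archive.frames
find /mnt/cephfs -type d -exec getfattr -n ceph.dir.layout.pool {} + 2>/dev/null | grep fs.data.archive.frames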

[ceph-users] Ceph Outage (Nautilus) - 14.2.11

2020-12-15 Thread Suresh Rama
Dear All, We have a 38 node HP Apollo cluster with 24 3.7T spinning disks and 2 NVMe for journal. This is one of our 13 clusters, which was upgraded from Luminous to Nautilus (14.2.11). When one of our OpenStack customers uses Elasticsearch (they offer Logging as a Service) to their end users rep

[ceph-users] Re: performance degradation every 30 seconds

2020-12-15 Thread Philip Brown
btw, I also tried putting
[client]
rbd cache = false
in the /etc/ceph/ceph.conf file on the main node, then doing
systemctl stop ceph.target
systemctl status ceph.target
on the main node. But after restart, it tells me rbd cache is still enabled:
# ceph --admin-daemon /var/run/ceph/7994e544
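
Note that rbd_cache is a librbd (client-side) option, so restarting the cluster daemons via ceph.target does not change what an already-running client reports; the client's own admin socket has to be queried, roughly like this (a sketch; the socket name is a placeholder):

ceph --admin-daemon /var/run/ceph/ceph-client.admin.<pid>.asok config show | grep rbd_cache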

[ceph-users] Re: multiple OSD crash, unfound objects

2020-12-15 Thread Frank Schilder
Hi Michael, that sounds like a big step forward. I would probably remove the data pool from the ceph fs first before doing anything on it. Is the new pool set as data pool on the root of the entire ceph fs? If so, I see no reason for not detaching the pool from the ceph fs right away. Also to
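
Detaching a data pool from a CephFS file system, once nothing references it anymore, is done with (a sketch; the fs name is a placeholder):

ceph fs ls                                              # lists the attached data pools
ceph fs rm_data_pool <fs_name> fs.data.archive.frames   # note: the initial/default data pool cannot be removed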

[ceph-users] Re: Whether read I/O is accepted when the number of replicas is under pool's min_size

2020-12-15 Thread Satoru Takeuchi
On Tue, Dec 15, 2020 at 18:48, Eugen Block wrote:
> Hi,
> it's correct that both read and write I/O is paused when a pool's min_size is not met.
> Regards,
> Eugen
Thank you! I'll send a PR to fix the pool's configuration document.
Regards,
Satoru
> Zitat von Satoru Takeuchi :
> > Hi,

[ceph-users] issue on adding SSD to SATA cluster for db/wal

2020-12-15 Thread Zhenshi Zhou
Hi all, I have a 14.2.15 cluster with all-SATA OSDs. Now we plan to add SSDs to the cluster for db/wal usage. I checked the docs and found that the 'ceph-bluestore-tool' command can deal with this. I added a db/wal to the OSD in my test environment, but in the end it still gets the warning message: "os
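
For reference, attaching a new DB device to an existing BlueStore OSD typically follows this pattern (a sketch; OSD id and LV path are placeholders, and the OSD must be stopped first). The ceph-volume LVM tags may also need updating afterwards so the OSD still activates correctly:

systemctl stop ceph-osd@3
ceph-bluestore-tool bluefs-bdev-new-db --path /var/lib/ceph/osd/ceph-3 --dev-target /dev/ceph-db/db-osd3
systemctl start ceph-osd@3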

[ceph-users] Re: issue on adding SSD to SATA cluster for db/wal

2020-12-15 Thread Eugen Block
Hi, does 'show-label' reflect your changes for block.db?
---snip---
host2:~ # ceph-bluestore-tool show-label --path /var/lib/ceph/osd/ceph-3/
inferring bluefs devices from bluestore path
{
    "/var/lib/ceph/osd/ceph-3/block": {
        [...]
    },
    "/var/lib/ceph/osd/ceph-3/block.db": {
        "os

[ceph-users] Re: issue on adding SSD to SATA cluster for db/wal

2020-12-15 Thread Zhenshi Zhou
Hi Eugen, I checked the LVM label and there was no tag for db or wal. I solved the issue by running bluefs-bdev-migrate. Thanks
On Wed, Dec 16, 2020 at 2:29 PM, Eugen Block wrote:
> Hi,
> does 'show-label' reflect your changes for block.db?
> ---snip---
> host2:~ # ceph-bluestore-tool show-label --path
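
The migration step that resolved this would look roughly like the following (a sketch; the OSD must be stopped and the OSD id is a placeholder). bluefs-bdev-new-db only attaches the device, while bluefs-bdev-migrate moves the existing RocksDB data off the slow device:

ceph-bluestore-tool bluefs-bdev-migrate --path /var/lib/ceph/osd/ceph-3 \
    --devs-source /var/lib/ceph/osd/ceph-3/block \
    --dev-target /var/lib/ceph/osd/ceph-3/block.db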

[ceph-users] ceph stuck removing image from trash

2020-12-15 Thread Andre Gebers
Hi, I'm running a 15.2.4 test cluster in a rook-ceph environment. The cluster is reporting HEALTH_OK, but it seems it is stuck removing an image. Last section of 'ceph status' output:
progress:
    Removing image replicapool/43def5e07bf47 from trash (6h)
        [] (r

[ceph-users] Re: ceph stuck removing image from trash

2020-12-15 Thread 胡 玮文
Hi Andre, I once faced the same problem. It turns out that Ceph needs to scan every object in the image when deleting it, if the object map is not enabled. This will take years on such a huge image. I ended up deleting the whole pool to get rid of the huge image. Maybe you can scan all the objects i
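
Enabling the object map (and fast-diff) on large images avoids the full per-object scan during deletion (a sketch; the image name is a placeholder):

rbd feature enable replicapool/myimage object-map fast-diff
rbd object-map rebuild replicapool/myimage    # build the map for an already-existing image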