My current settings are:
mds advanced mds_beacon_grace 15.00
mds basic mds_cache_memory_limit 4294967296
mds advanced mds_cache_trim_threshold 393216
global advanced mds_export_ephemeral_distributed true
mds advanced mds_recall_global_max
Hi,
It is a Nautilus 14.2.13 cluster.
The quota on the pool is 745GiB, so how can the stored data be 788GiB? (It's a
2-replica pool.)
Based on the USED column, that means just 334GiB is actually used, because the
pool has only 2 replicas. I don't understand.
POOLS:
POOL     ID     STORED     OBJECTS
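For anyone comparing these numbers: in Nautilus, 'ceph df detail' shows the logical STORED and the raw USED values alongside the configured quota, and the quota itself can be printed per pool (the pool name below is a placeholder):

ceph df detail
ceph osd pool get-quota <pool>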
Hi,
check your rbd cache; it's enabled by default, and for SSD/NVMe it's better to
disable it. It looks like your cache/buffers are full and need a flush. It
could harm your environment.
BR,
Sebastian
On 11.12.2020 19:08, Philip Brown wrote:
I have a new 3-node Octopus cluster, set up on SSDs.
I'm runnin
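As an aside, a minimal sketch of the two ways of disabling the RBD cache that come up elsewhere in this digest (pool/image names are placeholders): either add

[client]
rbd cache = false

to /etc/ceph/ceph.conf on the client and restart the client processes, or set it per image without touching ceph.conf:

rbd image-meta set <pool>/<image> conf_rbd_cache false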
Hi,
it's correct that both read and write I/O is paused when a pool's
min_size is not met.
Regards,
Eugen
Quoting Satoru Takeuchi:
Hi,
Could you tell me whether read I/O is accepted when the number of replicas
is under the pool's min_size?
I read the official document and found that ther
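For reference, the relevant pool settings can be checked like this (pool name is a placeholder):

ceph osd pool get <pool> size
ceph osd pool get <pool> min_size

With the default size=3 and min_size=2, a PG that drops below two available replicas pauses both reads and writes until min_size is met again.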
Hi Igor,
Are you referring to the bug reports:
- https://tracker.ceph.com/issues/48276 | OSD Crash with
ceph_assert(is_valid_io(off, len))
- https://tracker.ceph.com/issues/46800 | Octopus OSD died and fails to start
with FAILED ceph_assert(is_valid_io(off, len))
If that is the case, do you th
Hi Wout,
On 12/15/2020 1:18 PM, Wout van Heeswijk wrote:
Hi Igor,
Are you referring to the bug reports:
- https://tracker.ceph.com/issues/48276 | OSD Crash with
ceph_assert(is_valid_io(off, len))
- https://tracker.ceph.com/issues/46800 | Octopus OSD died and fails to start
with FAILED ceph_a
Hi all,
After rebooting one node, one OSD on another node had 'slow requests' and
'currently waiting for peered' for a long time, until I restarted that OSD.
Is this a bug? See the attachment for more osd log.
2020-12-11 15:39:12.837391 7f3906fa2700 0 log_channel(cluster) log [WRN] : 15
slow requests, 1 inc
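Two commands that are commonly used to narrow down such slow requests while they are happening (the OSD id is a placeholder):

ceph health detail
ceph daemon osd.<id> dump_ops_in_flight

The second one lists the operations the OSD currently has in flight, including which step (e.g. waiting for peered) each one is blocked on.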
On Tue, Dec 15, 2020 at 12:50 AM Janek Bevendorff
wrote:
>
> My current settings are:
>
> mds advanced mds_beacon_grace 15.00
This should be a global setting. It is used by the mons and mdss.
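For example, with the centralized config database that would be (assuming the value was previously set at mds scope, as above):

ceph config rm mds mds_beacon_grace
ceph config set global mds_beacon_grace 15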
> mds basic mds_cache_memory_limit 4294967296
> mds advanced mds
My current settings are:
mds advanced mds_beacon_grace 15.00
True. I might as well remove it completely; it's an artefact of earlier
experiments.
This should be a global setting. It is used by the mons and mdss.
mds basic mds_cache_memory_limit 4294967296
It won't be on the same node...
But since, as you saw, the problem still shows up with iodepth=32, it seems
we're still in the same problem ballpark.
Also... there may be 100 client machines, but each client can have anywhere
between 1 and 30 threads running at a time.
As far as fio using the rados e
On Tue, Dec 15, 2020 at 12:24 PM Philip Brown wrote:
>
> It wont be on the same node...
> but since as you saw, the problem still shows up with iodepth=32 seems
> we're still in the same problem ball park
> also... there may be 100 client machines.. but each client can have anywhere
> betwee
I did a git pull of the latest fio from git://git.kernel.dk/fio.git
and built with
# gcc --version
gcc (GCC) 7.3.1 20180303 (Red Hat 7.3.1-5)
Results were as expected.
Using straight rados, there were no performance hiccups.
But using
fio --direct=1 --rw=randwrite --bs=4k --ioengine=rbd --pool=te
I would think it should be something like that.
However, I just tried:
rbd image-meta set testpool/testrbd conf_rbd_cache false
fio --direct=1 --rw=randwrite --bs=4k --ioengine=rbd --pool=testpool
--rbdname=testrbd --iodepth=256 --numjobs=1 --time_based --group_reporting
--name=iops-rbd-t
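For reference, a complete invocation along those lines might look like the following; the runtime and the job name are illustrative, and --clientname defaults to admin:

fio --ioengine=rbd --clientname=admin --pool=testpool --rbdname=testrbd \
    --direct=1 --rw=randwrite --bs=4k --iodepth=256 --numjobs=1 \
    --time_based --runtime=120 --group_reporting --name=iops-rbd-test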
Hi Frank,
I was able to migrate the data off of the "broken" pool
(fs.data.archive.frames) and onto the new one
(fs.data.archive.newframes). I verified that no useful data is left on
the "broken" pool:
* 'find + getfattr -n ceph.file.layout.pool' shows no files on the bad pool
* 'find + ge
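The first check could look roughly like this (the mount point is a placeholder; the pool name is the "broken" one from this thread); an empty result means no file layout still points at the old pool:

find /mnt/cephfs -type f -exec getfattr -n ceph.file.layout.pool {} + 2>/dev/null | grep -B1 fs.data.archive.frames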
Dear All,
We have a 38-node HP Apollo cluster with 24 3.7T spinning disks and 2 NVMe
devices for journal. This is one of our 13 clusters, which was upgraded from
Luminous to Nautilus (14.2.11). When one of our OpenStack customers uses
Elasticsearch (they offer Logging as a Service) for their end users
rep
btw, I also tried putting
[client]
rbd cache = false
in the /etc/ceph/ceph.conf file on the main node, then doing
systemctl stop ceph.target
systemctl status ceph.target
on the main node.
but after restarting, it tells me rbd cache is still enabled:
# ceph --admin-daemon
/var/run/ceph/7994e544
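One caveat: rbd cache is a librbd client-side option, so the effective value has to be read from the client process itself; a daemon's admin socket on a storage node only reports that daemon's own config. With an admin socket configured under [client], something like this shows what the client actually uses (the socket path is a placeholder):

ceph --admin-daemon /var/run/ceph/ceph-client.admin.<pid>.asok config get rbd_cache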
Hi Michael,
that sounds like a big step forward.
I would probably remove the data pool from the ceph fs before doing
anything on it. Is the new pool set as the data pool on the root of the entire
ceph fs? If so, I see no reason not to detach the pool from the ceph fs right
away. Also to
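A sketch of that detach step (fs and pool names are placeholders; note that the default/first data pool of a ceph fs cannot be removed):

ceph fs ls                            # lists the data pools attached to each fs
ceph fs rm_data_pool <fs_name> <pool_name>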
On Tue, Dec 15, 2020 at 18:48, Eugen Block wrote:
>
> Hi,
>
> it's correct that both read and write I/O is paused when a pool's
> min_size is not met.
>
> Regards,
> Eugen
Thank you! I'll send a PR to fix the Pool's configuration document.
Regards,
Satoru
>
>
> Quoting Satoru Takeuchi:
>
> > Hi,
> >
Hi all,
I have a 14.2.15 cluster with all SATA OSDs. Now we plan to add SSDs to the
cluster for db/wal usage. I checked the docs and found that the
'ceph-bluestore-tool' command can deal with this.
I added db/wal to the OSD in my test environment, but in the end it still
gets the warning message:
"os
Hi,
does 'show-label' reflect your changes for block.db?
---snip---
host2:~ # ceph-bluestore-tool show-label --path /var/lib/ceph/osd/ceph-3/
inferring bluefs devices from bluestore path
{
"/var/lib/ceph/osd/ceph-3/block": {
[...]
},
"/var/lib/ceph/osd/ceph-3/block.db": {
"os
Hi Eugen,
I checked the LVM label and there was no tag for db or wal.
I solved the issue by running bluefs-bdev-migrate.
Thanks
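For reference, the migrate call that moves the existing BlueFS data onto the new DB device looks roughly like this (OSD id is a placeholder, run with the OSD stopped):

ceph-bluestore-tool bluefs-bdev-migrate --path /var/lib/ceph/osd/ceph-<id> \
    --devs-source /var/lib/ceph/osd/ceph-<id>/block \
    --dev-target /var/lib/ceph/osd/ceph-<id>/block.db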
On Wed, Dec 16, 2020 at 2:29 PM, Eugen Block wrote:
> Hi,
>
> does 'show-label' reflect your changes for block.db?
>
> ---snip---
> host2:~ # ceph-bluestore-tool show-label --path
Hi,
I'm running a 15.2.4 test cluster in a rook-ceph environment. The cluster is
reporting HEALTH_OK but it seems it is stuck removing an image. Last section of
'ceph status' output:
progress:
Removing image replicapool/43def5e07bf47 from trash (6h)
[] (r
Hi Andre,
I once faced the same problem. It turns out that Ceph needs to scan every object
in the image when deleting it if the object map is not enabled. This will take
ages on such a huge image. I ended up deleting the whole pool to get rid of the
huge image.
Maybe you can scan all the objects i
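For anyone hitting this, a sketch of enabling the object map on an existing image so that later deletes (and 'rbd du') don't have to scan every object; pool/image names are placeholders, exclusive-lock is a prerequisite, and the rebuild itself still walks the objects once:

rbd feature enable <pool>/<image> exclusive-lock
rbd feature enable <pool>/<image> object-map fast-diff
rbd object-map rebuild <pool>/<image>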