ceph df [detail] output (POOLS section) has been modified in plain format:
- 'BYTES USED' column renamed to 'STORED'. Represents the amount of data
  stored by the user.
- 'USED' column now represents the amount of space allocated purely for data
  by all OSD nodes in KB.
source: https://d
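For illustration only, a POOLS section in the new plain format looks roughly like this (pool name and numbers are made up; the exact column set can differ between releases):
POOL  ID  STORED   OBJECTS  USED     %USED  MAX AVAIL
rbd    1  1.2 GiB      330  3.6 GiB   0.12    1.0 TiB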
Perhaps WAL is filling up when iodepth is so high? Is WAL on the same
SSDs? If you double the WAL size, does it change?
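A quick way to check whether an OSD has a dedicated WAL device is to look at its metadata (a sketch; the osd id is a placeholder and the exact field names can vary by release):
# look for the wal/db related fields reported by the OSD
ceph osd metadata 0 | grep -Ei 'wal|db|devices'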
On Mon, Dec 14, 2020 at 9:05 PM Jason Dillaman wrote:
>
> On Mon, Dec 14, 2020 at 1:28 PM Philip Brown wrote:
> >
> > Our goal is to put up a high performance ceph cluster tha
On Mon, Dec 14, 2020 at 1:28 PM Philip Brown wrote:
>
> Our goal is to put up a high performance ceph cluster that can deal with 100
> very active clients. So for us, testing with iodepth=256 is actually fairly
> realistic.
100 active clients on the same node or just 100 active clients?
> but
I found a merge request; ceph mon has a new option: mon_sync_max_payload_keys
https://github.com/ceph/ceph/commit/d6037b7f484e13cfc9136e63e4cf7fac6ad68960#diff-495ccc5deb4f8fbd94e795e66c3720677f821314d4b9042f99664cd48a9506fd
My value of the option mon_sync_max_payload_size is 4096.
If mon_sync_max_pa
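For reference, both options can be read and changed through the monitor config store (a sketch; the value shown for mon_sync_max_payload_keys is only an example, not a recommendation):
ceph config get mon mon_sync_max_payload_size
ceph config get mon mon_sync_max_payload_keys
# example: cap the number of keys per sync message
ceph config set mon mon_sync_max_payload_keys 2000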
We had to rebuild our mons on a few occasions because of this. Only one mon
was ever dropped from quorum at a time in our case. In other scenarios with
the same error the mon was able to rejoin after thirty minutes or so. We
believe we may have tracked it down (in our case) to the upgrade of an AV
On Mon, Dec 7, 2020 at 12:06 PM Patrick Donnelly wrote:
>
> Hi Dan & Janek,
>
> On Sat, Dec 5, 2020 at 6:26 AM Dan van der Ster wrote:
> > My understanding is that the recall thresholds (see my list below)
> > should be scaled proportionally. OTOH, I haven't played with the decay
> > rates (and d
I forgot to mention: "If, with bluefs_buffered_io=false, the %util is over
75% most of the time **during data removal (like snapshot removal)**,
then you'd better change it to true."
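For reference, a minimal way to flip the option and keep an eye on device utilization (a sketch; whether the change takes effect without an OSD restart depends on the release):
# set buffered bluefs reads for all OSDs
ceph config set osd bluefs_buffered_io true
# watch %util on the OSD data devices during removal
iostat -x 1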
Regards,
Frédéric.
On 14/12/2020 at 21:35, Frédéric Nass wrote:
Hi Stefan,
Initial data removal could also
Hi Frédéric,
Thanks for the additional input. We are currently only running RGW on the
cluster, so no snapshot removal, but there have been plenty of remappings with
the OSDs failing (all of them at first during and after the OOM incident, then
one-by-one). I haven't had a chance to look into o
Hi Stefan,
Initial data removal could also have resulted from a snapshot removal
leading to OSDs OOMing and then pg remappings leading to more removals
after OOMed OSDs rejoined the cluster and so on.
As mentioned by Igor : "Additionally there are users' reports that
recent default value's m
Our goal is to put up a high performance ceph cluster that can deal with 100
very active clients. So for us, testing with iodepth=256 is actually fairly
realistic.
but it does also exhibit the problem with iodepth=32
[root@irviscsi03 ~]# fio --filename=/dev/rbd0 --direct=1 --rw=randwrite --bs=4
On Mon, Dec 14, 2020 at 12:46 PM Philip Brown wrote:
>
> Further experimentation with fio's -rw flag, setting to rw=read, and
> rw=randwrite, in addition to the original rw=randrw, indicates that it is
> tied to writes.
>
> Possibly some kind of buffer flush delay or cache sync delay when using
Further experimentation with fio's -rw flag, setting to rw=read, and
rw=randwrite, in addition to the original rw=randrw, indicates that it is tied
to writes.
Possibly some kind of buffer flush delay or cache sync delay when using an rbd
device, even though fio specified --direct=1?
- Or
Aha! Insightful question!
Running rados bench write to the same pool does not exhibit any problems. It
consistently shows around 480M/sec throughput, every second.
So this would seem to be something to do with using rbd devices. Which we need
to do.
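(For comparison, the rados bench run referred to above would have been something along these lines; the pool name, runtime and thread count are placeholders, not the exact parameters used here.)
rados bench -p rbd 60 write -b 4M -t 16
rados -p rbd cleanup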
For what it's worth, I'm using Micron 520
On Mon, Dec 14, 2020 at 11:28 AM Philip Brown wrote:
>
>
> I have a new 3 node octopus cluster, set up on SSDs.
>
> I'm running fio to benchmark the setup, with
>
> fio --filename=/dev/rbd0 --direct=1 --rw=randrw --bs=4k --ioengine=libaio
> --iodepth=256 --numjobs=1 --time_based --group_reporting
I have a new 3 node octopus cluster, set up on SSDs.
I'm running fio to benchmark the setup, with
fio --filename=/dev/rbd0 --direct=1 --rw=randrw --bs=4k --ioengine=libaio
--iodepth=256 --numjobs=1 --time_based --group_reporting --name=iops-test-job
--runtime=120 --eta-newline=1
However, I
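One way to check whether the krbd block device itself is a factor is to drive the same image through librbd with fio's rbd engine instead of /dev/rbd0 (a sketch; pool, image and client names are placeholders):
fio --ioengine=rbd --pool=rbd --rbdname=testimg --clientname=admin \
    --direct=1 --rw=randrw --bs=4k --iodepth=256 --numjobs=1 \
    --time_based --runtime=120 --group_reporting --name=iops-test-job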
Hi,
could you share more information about your setup? How much bandwidth
does the uplink have? Are there any custom configs regarding
rbd_journal or rbd_mirror settings? If there were lots of changes on
those images, the sync would always be behind by design. But if
there's no activity i
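To see how far behind the replication actually is, the mirror status commands are the usual starting point (a sketch; pool and image names are placeholders):
rbd mirror pool status rbd --verbose
rbd mirror image status rbd/<image>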
On Mon, Dec 14, 2020 at 9:39 AM Marc Boisis wrote:
>
>
> Hi,
>
> I would like to know if you support iser in gwcli like the traditional
> targetcli or if this is planned in a future version of ceph ?
We don't have the (HW) resources to test with iSER so it's not
something that anyone is looking
Hi,
I had an OSD crash yesterday, with 15.2.7.
It seems similar:
ceph crash info
2020-12-13T02:37:57.475315Z_63f91999-ca9c-49a5-b381-5fad9780dbbb
{
"backtrace": [
"(()+0x12730) [0x7f6bccbb5730]",
"(std::_Rb_tree,
boost::intrusive_ptr,
std::_Identity >,
std::less >,
std::allocator
Hi,
I would like to know if you support iser in gwcli like the traditional
targetcli or if this is planned in a future version of ceph ?
Thanks
Marc
The ceph balancer sets upmap items that violate my crush rule
the rule:
rule cslivebapfirst {
        id 0
        type replicated
        min_size 2
        max_size 4
        step take csliveeubap-u01dc
        step chooseleaf firstn 2 type room
        step emit
        step take csliveeubs-u01dc
        step chooseleaf firstn
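If needed, the exceptions the balancer created can be listed and removed by hand while this is investigated (a sketch; the pg id is a placeholder):
# list pg_upmap_items exceptions currently in the osdmap
ceph osd dump | grep pg_upmap_items
# drop a specific exception
ceph osd rm-pg-upmap-items <pgid>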
Thank you Eugen, it worked.
For the record this is what I have done to remove the services completely. My
CephFS had the name "testfs".
* `ceph orch ls mds mds.testfs --export yaml >change.yaml`
* removed the placement-spec from `change.yaml`.
* reapplied using `cephadm shell -m change.yaml -- c
Hi Igor,
Thank you for the detailed analysis. That makes me hopeful we can get the
cluster back on track. No pools have been removed, but yes, due to the initial
crash of multiple OSDs and the subsequent issues with individual OSDs we’ve had
substantial PG remappings happening constantly.
I wi
Just a note - all the below is almost completely unrelated to high RAM
usage. The latter is a different issue which presumably just triggered
PG removal one...
On 12/14/2020 2:39 PM, Igor Fedotov wrote:
Hi Stefan,
given the crash backtrace in your log I presume some data removal is
in progr
Hi Kalle,
Memory usage is back on track for the OSDs since the OOM crash. I don’t know
what caused it back then, but until all OSDs were back up together, each one of
them (10 TiB capacity, 7 TiB used) ballooned to over 15 GB memory used. I’m
happy to dump the stats if they’re showing any histo
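If it helps, the per-OSD memory accounting can be dumped from the admin socket (a sketch; the osd id is a placeholder):
ceph daemon osd.0 dump_mempools
ceph tell osd.0 heap stats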
Hi Stefan,
given the crash backtrace in your log I presume some data removal is in
progress:
Dec 12 21:58:38 ceph-tpa-server1 bash[784256]: 3:
(KernelDevice::direct_read_unaligned(unsigned long, unsigned long,
char*)+0xd8) [0x5587b9364a48]
Dec 12 21:58:38 ceph-tpa-server1 bash[784256]: 4:
Hi Jeremy,
I think you lost the data for OSD.11 & .12. I'm not aware of any
reliable enough way to recover RocksDB from this sort of error.
Theoretically you might want to disable auto compaction for RocksDB for
these daemons, try to bring them up, and attempt to drain the data out
of the
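(A rough sketch of how that could be attempted, as an assumption on my part rather than a tested recipe; note that ceph config set replaces the whole rocksdb options string, so copy the existing value first.)
# read the current rocksdb options string for the affected OSD
ceph config get osd.11 bluestore_rocksdb_options
# set it back with disable_auto_compactions=true appended, e.g.:
ceph config set osd.11 bluestore_rocksdb_options "<existing options>,disable_auto_compactions=true"
# if the OSD manages to start, mark it out so its data drains elsewhere
ceph osd out 11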
Do you have a spec file for the mds services or how did you deploy the
services? If you have a yml file with the mds placement just remove
the entries from that file and run 'ceph orch apply -i mds.yml'.
You can export your current config with this command and then modify
the file to your n
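Put together, removing the placement for a no-longer-needed mds service would look roughly like this (a sketch; the file name is an example):
ceph orch ls mds --export > mds.yml
# edit mds.yml and drop the entries for the removed filesystem
ceph orch apply -i mds.yml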
Hi Samuel,
I think we're hitting some niche cases. Most of our experience (and links to
other posts) is here.
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/EWPPEMPAJQT6GGYSHM7GIM3BZWS2PSUY/
For the pg_log issue, the default of 3000 might be too large for some
installations, de
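The knobs in question, should one decide to lower them, can be set cluster-wide (a sketch; 500 is just an example value, not a recommendation):
ceph config set osd osd_min_pg_log_entries 500
ceph config set osd osd_max_pg_log_entries 500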
Hi,
we created multiple CephFS filesystems; this involved deploying multiple
mds services using
`ceph orch apply mds [...]`. Worked like a charm.
Now the filesystem has been removed and the leftovers of the filesystem should
also be removed, but I can't delete the services as cephadm/orchestration
module
Hello, Kalle,
Your comments about some bugs with pg_log memory and buffer_anon memory
growth worry me a lot, as I am planning to build a cluster with the latest
Nautilus version.
Could you please comment on how to safely deal with these bugs, or how to
avoid them, if indeed they occur?
thanks a lot
Hi all,
Ok, so I have some updates on this.
We noticed that we had a bucket with tons of RGW garbage collection pending. It
was growing faster than we could clean it up.
We suspect this was because users tried to do "s3cmd sync" operations on SWIFT
uploaded large files. This could logically cau
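For anyone in a similar spot, the size of the GC backlog and a manual pass can be checked with radosgw-admin (a sketch; the grep is only a rough count of entries):
radosgw-admin gc list --include-all | grep -c '"tag"'
radosgw-admin gc process --include-all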