Hi,
I have often observed that recovery/rebalance in Nautilus starts quite
fast but becomes extremely slow (2-3 objects/s), even with something like 20
OSDs involved. Right now I am draining (reweighted to 0) 16x 8TB disks; it has
been running for 4 days and for the last 12 hours it has been more or less stuck at
cluster:
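In case it helps, recovery speed on Nautilus is usually bounded by the backfill throttles; here is a rough sketch of how to inspect and carefully loosen them (the values below are examples only, not a recommendation for this cluster):
$ ceph -s                                       # how much misplaced/degraded data remains
$ ceph config get osd osd_max_backfills         # Nautilus default is 1
$ ceph config get osd osd_recovery_max_active
# example only: allow more parallel backfills cluster-wide, revert afterwards
$ ceph config set osd osd_max_backfills 4
$ ceph config set osd osd_recovery_max_active 8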
On Wed, Oct 2, 2019 at 9:00 PM Marc Roos wrote:
>
>
>
> Hi Brad,
>
> I was following the thread where you advised on this pg repair.
>
> I ran 'rados list-inconsistent-obj' / 'rados
> list-inconsistent-snapset' and have output on the snapset. I tried to
> extrapolate your comment on the data/omap_digest_mismatch_info
On Wed, Oct 02, 2019 at 01:48:40PM +0200, Christian Pedersen wrote:
> Hi Martin,
>
> Even before adding cold storage on HDD, I had the cluster with SSD only. That
> also could not keep up with deleting the files.
> I am nowhere near I/O exhaustion on the SSDs or even the HDDs.
Please see my pres
On 10/02/2019 02:15 PM, Kilian Ries wrote:
> Ok, I just compared my local python files and the git commit you sent me
> - it really looks like I have the old files installed. All the changes
> are missing in my local files.
>
> Where can I get a new ceph-iscsi-config package that has the fix
Hello
I am running a Ceph 14.2.2 cluster and a few days ago, memory
consumption of our OSDs started to unexpectedly grow on all 5 nodes,
after being stable for about 6 months.
Node memory consumption: https://icecube.wisc.edu/~vbrik/graph.png
Average OSD resident size: https://icecube.wisc.ed
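In case it is useful, a rough way to see where that memory is going (osd.0 below is just an example id; the default osd_memory_target in 14.2.x is 4 GiB):
$ ceph config get osd osd_memory_target       # per-OSD autotune target
$ ceph daemon osd.0 dump_mempools             # run on the host of osd.0
$ ceph tell osd.0 heap stats                  # tcmalloc heap usage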
And now, to come full circle.
Sadly my solution was to run
> $ ceph pg repair 33.0
which returned
> 2019-10-02 15:38:54.499318 osd.12 (osd.12) 181 : cluster [DBG] 33.0 repair
> starts
> 2019-10-02 15:38:55.502606 osd.12 (osd.12) 182 : cluster [ERR] 33.0 repair :
> stat mismatch, got 264/26
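For completeness, a sketch of how the outcome can be verified once the repair run finishes (pg 33.0 as in the log above):
$ ceph health detail                  # should no longer list 33.0 as inconsistent
$ ceph pg 33.0 query | grep state     # expect active+clean once the repair completes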
Ok, I just compared my local python files and the git commit you sent me - it
really looks like I have the old files installed. All the changes are missing
in my local files.
Where can I get a new ceph-iscsi-config package that has the fix included? I
have installed version:
ceph-iscsi-config-2.6-2.6.el7.noarch
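For reference, a quick way to compare what is actually installed against upstream (assuming an RPM-based system, as the .el7 package name suggests):
$ rpm -q ceph-iscsi-config                        # installed version
$ rpm -ql ceph-iscsi-config | grep '\.py$'        # python files shipped by the package
$ git clone https://github.com/ceph/ceph-iscsi-config.git   # upstream source to diff against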
Yes, I created all four LUNs with these sizes:
lun0 - 5120G
lun1 - 5121G
lun2 - 5122G
lun3 - 5123G
It's always one GB more per LUN... Is there any newer ceph-iscsi-config package
than I have installed?
ceph-iscsi-config-2.6-2.6.el7.noarch
Then I could try to update the package and see if
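If it helps to narrow this down, the size the gateway has recorded and the size of the backing RBD image can be compared roughly like this (pool/image names below are placeholders):
$ gwcli ls                  # sizes as stored in the iSCSI gateway configuration
$ rbd info rbd/lun0         # size of the backing RBD image
$ rbd du rbd/lun0           # provisioned vs. actually used space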
Hi,
According to [1] there are new parameters in place to make the MDS
behave more stably. Quoting that blog post: "One of the more recent
issues we've discovered is that an MDS with a very large cache (64+GB)
will hang during certain recovery events."
For all of us that are not (yet) running Nautilus
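One commonly used mitigation on pre-Nautilus releases is simply to keep the MDS cache smaller via mds_cache_memory_limit; a minimal sketch of inspecting and lowering it at runtime ("mds.a" and the 32 GiB value are only examples):
$ ceph daemon mds.a config get mds_cache_memory_limit      # run on the MDS host
$ ceph tell mds.a injectargs '--mds_cache_memory_limit=34359738368'   # 32 GiB
# to persist it, set mds_cache_memory_limit in the [mds] section of ceph.conf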
On Wed, Oct 2, 2019 at 9:50 AM Kilian Ries wrote:
>
> Hi,
>
>
> I'm running a Ceph Mimic cluster with 4x iSCSI gateway nodes. The cluster was
> set up via ceph-ansible v3.2-stable. I just checked my nodes and saw that only
> two of the four configured iSCSI gateway nodes are working correctly. I first
> noticed via
Hi,
I'm running a Ceph Mimic cluster with 4x iSCSI gateway nodes. The cluster was
set up via ceph-ansible v3.2-stable. I just checked my nodes and saw that only
two of the four configured iSCSI gateway nodes are working correctly. I first
noticed via gwcli:
###
$ gwcli -d ls
Traceback (most recent call last):
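A first thing worth checking on the broken nodes is whether the gateway services are running at all; something along these lines (service names can vary a bit between ceph-iscsi versions):
$ systemctl status rbd-target-api rbd-target-gw tcmu-runner
$ journalctl -u rbd-target-api --since "1 hour ago"     # usually shows the full traceback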
Hi Martin,
Even before adding cold storage on HDD, I had the cluster with SSD only. That
also could not keep up with deleting the files.
I am nowhere near I/O exhaustion on the SSDs or even the HDDs.
Cheers,
Christian
On Oct 2 2019, at 1:23 pm, Martin Verges wrote:
> Hello Christian,
>
> the
Hello Christian,
the problem is that HDDs are not capable of providing the large number of IOPS
required for "~4 million small files".
--
Martin Verges
Managing director
Mobile: +49 174 9335695
E-Mail: martin.ver...@croit.io
Chat: https://t.me/MartinVerges
croit GmbH, Freseniusstr. 31h, 81247 Munich
CEO: Mar
Hi Brad,
I was following the thread where you advised on this pg repair.
I ran 'rados list-inconsistent-obj' / 'rados
list-inconsistent-snapset' and have output on the snapset. I tried to
extrapolate your comment on the data/omap_digest_mismatch_info onto my
situation. But I don't know
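For context, a rough sketch of the commands being discussed and where to look first (pg 33.0 is only an example id):
$ ceph health detail | grep inconsistent                     # find the affected PG(s)
$ rados list-inconsistent-obj 33.0 --format=json-pretty      # per-object digest/size errors
$ rados list-inconsistent-snapset 33.0 --format=json-pretty  # snapshot/clone metadata errors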
Hi,
Using the S3 gateway I store ~4 million small files in my cluster every
day. I have a lifecycle policy set up to move these files to cold storage after
a day and delete them after two days.
The default storage is SSD-based and the cold storage is HDD.
However the rgw lifecycle process cannot keep up
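For what it's worth, lifecycle progress and the main tuning knob can be checked roughly like this (values below are examples, not recommendations):
$ radosgw-admin lc list        # per-bucket lifecycle status
$ radosgw-admin lc process     # run a lifecycle pass manually to gauge its speed
# the default processing window is 00:00-06:00; it can be widened in ceph.conf, e.g.
#   [client.rgw.<name>]
#   rgw_lifecycle_work_time = "00:00-23:59"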
>
> I created this issue: https://tracker.ceph.com/issues/42116
>
> Seems to be related to the 'crash' module not being enabled.
>
> If you enable the module the problem should be gone. Now I need to check
> why this message is popping up.
Yup, with the crash module enabled the error message is gone. Either w
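In case anyone else runs into the same message, a quick sketch of enabling and checking the module:
$ ceph mgr module enable crash
$ ceph mgr module ls | grep -A 10 enabled_modules    # confirm "crash" is listed
$ ceph crash ls                                      # crash reports collected so far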
On 10/1/19 4:38 PM, Stefan Kooman wrote:
> Quoting Wido den Hollander (w...@42on.com):
>> Hi,
>>
>> The Telemetry [0] module has been in Ceph since the Mimic release and
>> when enabled it sends an anonymized JSON report back to
>> https://telemetry.ceph.com/ every 72 hours with information about t
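For anyone who wants to see exactly what would be sent before opting in, roughly:
$ ceph mgr module enable telemetry
$ ceph telemetry show       # preview the JSON report without sending anything
$ ceph telemetry on         # opt in once you are happy with the contents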