Hello all,
We have a production cluster that has been running for weeks, deployed on CentOS 8
with cephadm (3 nodes with 6x12TB HDDs each, plus one mon and one mgr on
each node). Today the "master" node (the one from which I run all
the setup commands) crashed, so I rebooted the server. Now all ceph
co
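(A hedged aside: the preview is cut off above, but for a cephadm-managed
Octopus cluster the usual first checks after such a reboot look roughly like
this; exact output and next steps depend on what actually failed.)
  cephadm ls                    # daemons cephadm knows about on this host
  systemctl status ceph.target  # whether the local Ceph services came back up
  ceph -s                       # cluster health and mon quorum, if reachable
  ceph orch ps                  # per-daemon status via the mgr orchestrator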
Hi,
I have a problem with crashing OSD daemons in our Ceph 15.2.6 cluster. The
problem was temporarily resolved by disabling scrub and deep-scrub. All PGs
are active+clean. After a few days I tried to enable scrubbing again, but
the problem persists: OSDs with high latencies, laggy PGs, OSDs not
res
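(For reference, a minimal sketch of the scrub toggling described above; these
are standard cluster-wide flags, re-enabled the same way they were set.)
  ceph osd set noscrub          # stop scheduling new scrubs
  ceph osd set nodeep-scrub     # stop scheduling new deep scrubs
  ceph osd unset noscrub        # re-enable later
  ceph osd unset nodeep-scrub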
Hi,
There is a default limit of 1TiB for the max_file_size in CephFS. I altered
that to 2TiB, but I now got a request for storing a file up to 7TiB.
I'd expect the limit to be there for a reason, but what is the risk of setting
that value to say 10TiB?
--
Mark Schouten
Tuxis, Ede, https://w
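(For reference, a sketch of how the limit above is raised, assuming a
filesystem named "cephfs"; the value is given in bytes.)
  ceph fs set cephfs max_file_size 10995116277760   # 10 TiB = 10 * 2^40 bytes
  ceph fs get cephfs | grep max_file_size           # verify the new value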
Hi Ken,
This seems to have fixed that issue. It exposed another:
https://tracker.ceph.com/issues/39264, which is causing ceph-mgr to become
entirely unresponsive across the cluster, but cheroot seems to be OK.
David
On Wed, Dec 9, 2020 at 12:25 PM David Orman wrote:
> Ken,
>
> We have rebuilt t
Just run the tool from a client that is not part of the Ceph nodes. Then
it can do nothing that you did not configure Ceph to allow it to do ;)
Besides, you should never run software from 'unknown' sources in an
environment where it can use 'admin' rights.
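(A minimal sketch of the kind of restricted client this advice implies; the
client name, pool, and tool invocation are illustrative, not from the original
mail.)
  # create a key that can only read the mon map and one pool
  ceph auth get-or-create client.limited \
      mon 'allow r' \
      osd 'allow r pool=mypool' \
      -o /etc/ceph/ceph.client.limited.keyring
  # run the external tool as that identity instead of client.admin
  some-tool --id limited --keyring /etc/ceph/ceph.client.limited.keyring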
-----Original Message-----
To: ce
Hi Miroslav,
haven't you performed massive data removal (PG migration) recently?
If so, you might want to apply a manual DB compaction to your OSDs.
The positive effect might be only temporary if background removals are
still in progress, though.
See https://tracker.ceph.com/issues/47044 and t
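(A sketch of the manual compaction being suggested; osd.0 is just an example,
and the offline variant requires stopping the OSD first.)
  ceph tell osd.0 compact                                      # online
  # or, with the daemon stopped:
  systemctl stop ceph-osd@0
  ceph-kvstore-tool bluestore-kv /var/lib/ceph/osd/ceph-0 compact
  systemctl start ceph-osd@0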
This is what the "use_existing" flag is for (on by default). It
short-circuits initialize() which is what actually does the whole
shutdown/creation/startup procedure.
https://github.com/ceph/cbt/blob/master/cluster/ceph.py#L149-L151
That is invoked before shutdown() and make_osds():
https
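(For context, a hypothetical sketch of how the flag ends up in a cbt job file;
apart from use_existing itself, a real config needs several more keys.)
  # produce a minimal cluster section that reuses the already-running cluster
  printf 'cluster:\n  use_existing: True\n' > existing-cluster.yaml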
Hi Igor,
thank you. Yes, you are right.
It seems that the background removal is completed.
Is the correct way to fix it to run "ceph-kvstore-tool bluestore-kv
compact" on all OSDs (one by one)?
Regards,
Miroslav
pá 11. 12. 2020 v 14:19 odesílatel Igor Fedotov napsal:
> Hi Miroslav,
>
> haven't you p
Miroslav,
On 12/11/2020 4:57 PM, Miroslav Boháč wrote:
Hi Igor,
thank you. Yes you are right.
It seems that the background removal is completed.
You can inspect the "numpg_removing" performance counter to make sure it's
been completed.
The correct way to fix it is "ceph-kvstore-tool bluestore-k
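(A sketch of how that counter can be checked; osd.0 is an example, and the
first form needs access to the OSD's admin socket on its host.)
  ceph daemon osd.0 perf dump | grep numpg_removing
  ceph tell osd.0 perf dump | grep numpg_removing   # remote variant on recent releases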
On 11/12/2020 00:12, David Orman wrote:
Hi Janek,
We realize this; we referenced that issue in our initial email. We do want
the metrics exposed by Ceph internally, and would prefer to work towards a
fix upstream. We appreciate the suggestion for a workaround, however!
Again, we're happy to
No; since the responses we've seen on the mailing lists and on the
bug report(s) indicated it fixed the situation, we didn't proceed down
that path (it seemed highly probable it would resolve things). If it's of
additional value, we can disable the module temporarily to see if the
probl
Hi Friends,
We have 2 Ceph clusters on campus and we set up the second cluster as the DR
solution.
The images on the DR side are always behind the master.
Ceph Version : 12.2.11
VMWARE_LUN0:
global_id: 23460954-6986-4961-9579-0f2a1e58e2b2
state: up+replaying
descrip
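(A sketch of how the replay lag on the DR side is usually inspected; pool and
image names are placeholders.)
  rbd mirror image status <pool>/<image>      # per-image state, as quoted above
  rbd mirror pool status <pool> --verbose     # pool-wide summary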
After confirming that the corruption was limited to a single object, we deleted
the object (first via radosgw-admin, and then via a rados rm), and restarted the
new OSD in the set. The backfill has continued past the point of the original
crash, so things are looking promising.
I'm still concern
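(For reference, a hedged sketch of the two deletion steps described above; the
bucket, object, and pool names are placeholders, not the ones from this
incident.)
  radosgw-admin object rm --bucket=<bucket> --object=<key>   # remove via RGW first
  rados -p <data-pool> rm <rados-object-name>                # then the raw RADOS object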
I just went to set up an iSCSI gateway on a Debian Buster / Octopus
cluster and hit a brick wall with packages. I had perhaps naively
assumed they were in with the rest. Now I understand that it can exist
separately, but then so can RGW.
I found some ceph-iscsi rpm builds for CentOS, but nothin
Hi Mark,
On Fri, Dec 11, 2020 at 4:21 AM Mark Schouten wrote:
> There is a default limit of 1TiB for the max_file_size in CephFS. I altered
> that to 2TiB, but I now got a request for storing a file up to 7TiB.
>
> I'd expect the limit to be there for a reason, but what is the risk of
> setting
I've had this set to 16TiB for several years now.
I've not seen any ill effects.
--
Adam
On Fri, Dec 11, 2020 at 12:56 PM Patrick Donnelly wrote:
>
> Hi Mark,
>
> On Fri, Dec 11, 2020 at 4:21 AM Mark Schouten wrote:
> > There is a default limit of 1TiB for the max_file_size in CephFS. I altere
From how I understand it, that setting is a rev-limiter to prevent users from
creating HUGE sparse files and then wasting cluster resources firing off
deletes.
We have ours set to 32T and haven't seen any issues with large files.
--
Paul Mezzanini
Sr Systems Administrator / Engineer, Researc
Any idea whether 'diskprediction_local' will ever work in containers?
I'm running 15.2.7, which has a dependency on scikit-learn v0.19.2
that isn't in the container. It's been throwing that error for a year
now on all the Octopus container versions I've tried. It used to be on the
baremetal v
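(A sketch of the checks involved, assuming the module name from the mail; the
exact error text varies by release.)
  ceph mgr module enable diskprediction_local
  ceph health detail    # any MGR_MODULE_ERROR for the module shows up here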
I know this isn't what you asked for, but I do know that Canonical is building
this package for focal and up.
While it's not Buster, it could possibly be a compromise to move things forward
without huge plumbing changes between Debian and Ubuntu.
You may also be able to hack and slash your way through t
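(If the Ubuntu route is taken, something along these lines is presumably all
it needs; the package name and service are assumptions, not verified for every
release.)
  apt install ceph-iscsi                   # on focal or newer
  # after writing /etc/ceph/iscsi-gateway.cfg:
  systemctl enable --now rbd-target-api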
Hi,
In my environment I have a single node and I'm trying to run a Ceph monitor
as a Docker container using a KV store.
Version: Octopus (stable-5.0)
2020-12-12 00:24:28 /opt/ceph-container/bin/entrypoint.sh: STAYALIVE:
container will not die if a command fails.
2020-12-12 00:24:28 /opt/ceph-
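(For comparison, a rough sketch of the usual ceph-container invocation for a
KV-backed mon; the etcd backend and all addresses here are assumptions, not
taken from the log above.)
  docker run -d --net=host \
    -e KV_TYPE=etcd -e KV_IP=192.168.0.10 -e KV_PORT=2379 \
    -e MON_IP=192.168.0.10 -e CEPH_PUBLIC_NETWORK=192.168.0.0/24 \
    ceph/daemon mon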