> personal desktop, but on servers where I keep data I’m doing it.
> > but what Canonical did in this case is… this is an LTS version :/
> >
> >
> > BR,
> > Sebastian
> >
> >
> >> On 13 Jun 2024, at 19:47, David C. wrote:
> >>
> >
In addition to Robert's recommendations, remember to respect the update
order (mgr->mon->(crash->)osd->mds->...).
Before everything was containerized, it was not recommended to have
different services on the same machine.
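As an illustration only (the image tag is an example), on a cephadm cluster the
orchestrator enforces that daemon order for you:
ceph orch upgrade start --image quay.io/ceph/ceph:v18.2.2
ceph orch upgrade status
On a package-based cluster you would restart the daemon targets host by host in
that same order (ceph-mgr.target, ceph-mon.target, ceph-crash, ceph-osd.target,
ceph-mds.target, ...).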
On Thu, 13 Jun 2024 at 19:37, Robert Sander
wrote:
> On 13.06.24 18:
Hi Pablo,
Could you tell us a little more about how that happened?
Do you have a min_size >= 2 (or E/C equivalent) ?
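For example, to check and (if needed) raise it on a replicated pool (pool name
is a placeholder):
ceph osd pool get mypool min_size
ceph osd pool set mypool min_size 2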
Regards,
*David CASIER*
On Mon, 17 Jun 2024 at 16:26, c
ools min_size.
>> >
>> > If it is an EC setup, it might be quite a bit more painful, depending
>> on what happened to the dead OSDs and whether they are at all recoverable.
>> >
>> >
>> > Matthi
>> Matthias Grandl
>>>> Head Storage Engineer
>>>> matthias.gra...@croit.io
>>>>
>>>> > On 17. Jun 2024, at 16:56, Matthias Grandl
>>>> wrote:
>>&
In Pablo's unfortunate incident, it was because of a SAN incident, so it's
possible that Replica 3 didn't save him.
In this scenario, the architecture is more the origin of the incident than
the number of replicas.
It seems to me that replica 3 has been the default since Firefly => making
replica 2,
Hi,
This type of incident is often resolved by setting the public_network
option to the "global" scope, in the configuration:
ceph config set global public_network a:b:c:d::/64
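You can then verify it is picked up (a quick check):
ceph config get mon public_network
ceph config dump | grep public_network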
On Fri, 21 Jun 2024 at 03:36, Eugen Block wrote:
> Hi,
>
> this is only a theory, not a proven answer or anything.
Hi Albert,
I think it's related to your network change.
Can you send me the output of "ceph report" ?
On Tue, 16 Jul 2024 at 14:34, Albert Shih wrote:
> Hi everyone
>
> My Ceph cluster currently runs 18.2.2 and ceph -s says everything is OK
>
> root@cthulhu1:/var/lib/ceph/crash# ceph -s
>
h wrote:
> On 16/07/2024 at 15:04:05+0200, David C. wrote
> Hi,
>
> >
> > I think it's related to your network change.
>
> I thought about it, but in that case why does the old (and pre-upgrade) server
> work?
>
> > Can you send me the output of "
k.
> >
> > However, strangely, the osd and mds did not activate msgr v2 (msgr v2 was
> > activated on mon).
> >
> > It is possible to bypass by adding the "ms_mode=legacy" option but you
> need
> > to find out why msgr v2 is not activated
> >
&
Hi,
It would seem that the order in which the mon addresses are declared (v2 then
v1, not the other way around) is important.
Albert restarted all services after this modification and everything is
back to normal.
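As an illustration (addresses are placeholders), a mon_host line in ceph.conf
declared v2-first looks like:
mon_host = [v2:192.168.1.11:3300/0,v1:192.168.1.11:6789/0] [v2:192.168.1.12:3300/0,v1:192.168.1.12:6789/0]
and the current monmap can be checked with:
ceph mon dump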
On Wed, 17 Jul 2024 at 09:40, David C. wrote:
> Hi Frédéric,
>
> The
Hi Albert,
perhaps a conflict with the udev rules of locally installed packages.
Try uninstalling ceph-*
On Thu, 18 Jul 2024 at 09:57, Albert Shih wrote:
> Hi everyone.
>
> After my upgrade from 17.2.7 to 18.2.2 I noticed that each time I restart I
> get a permissions issue on
>
> /var/lib
On Thu, 18 Jul 2024 at 10:34, Albert Shih wrote:
> On 18/07/2024 at 10:27:09+0200, David C. wrote
> Hi,
>
> >
> > perhaps a conflict with the udev rules of locally installed packages.
> >
> > Try uninstalling ceph-*
>
> Sorry...not sure I
h wrote
> > On 18/07/2024 at 10:56:33+0200, David C. wrote
> >
> Hi,
>
> >
> > > Your ceph processes are in containers.
> >
> > Yes I know but in my install process I just install
> >
> > ceph-common
> > ceph-base
> >
> >
Thanks Christian,
I see the fix is in the postinst, so presumably a reboot shouldn't put
"nobody" back, right?
On Thu, 18 Jul 2024 at 11:44, Christian Rohmann <
christian.rohm...@inovex.de> wrote:
> On 18.07.24 9:56 AM, Albert Shih wrote:
> >Error scraping /var/lib/ceph/crash: [Errno 13
ig
> so it's never updated with the new mon addresses. This change is to
> have us recreate the OSD config when we redeploy or reconfig an OSD
> so it gets the new mon addresses."
>
> You mentioned a network change. Maybe the orch failed to update
> /var/lib/ceph/$(ceph fsi
Hi All
My main CephFS data pool on a Luminous 12.2.10 cluster hit capacity
overnight; metadata is on a separate pool which didn't hit capacity, but the
filesystem stopped working, which I'd expect. I increased the osd full-ratio
to give me some breathing room to get some data deleted once the filesy
s it doing anything?
> Is it using lots of CPU/RAM? If you increase debug_mds do you see some
> progress?
>
> -- dan
>
>
> On Thu, Oct 22, 2020 at 2:01 PM David C wrote:
> >
> > Hi All
> >
> > My main CephFS data pool on a Luminous 12.2.10 clust
>
> -- dan
>
> -- dan
>
> On Thu, Oct 22, 2020 at 3:35 PM David C wrote:
> >
> > Dan, many thanks for the response.
> >
> > I was going down the route of looking at mds_beacon_grace but I now
> > realise when I start my MDS,
> manifested on a multi-mds cluster, so I am not sure if it is the root
> cause here https://tracker.ceph.com/issues/45090 )
> I don't know enough about the changelog diffs to suggest upgrading
> right now in the middle of this outage.
>
>
> -- dan
>
> On Thu, Oct
_______
> From: Dan van der Ster
> Sent: 22 October 2020 18:11:57
> To: David C
> Cc: ceph-devel; ceph-users
> Subject: [ceph-users] Re: Urgent help needed please - MDS offline
>
> I assume you aren't able to quickly double the RAM on thi
On Thu, Oct 22, 2020 at 6:09 PM Dan van der Ster wrote:
>
>
>
> On Thu, 22 Oct 2020, 19:03 David C, wrote:
>>
>> Thanks, guys
>>
>> I can't add more RAM right now or have access to a server that does,
>> I'd fear it wouldn't be enough
Success!
I remembered I had a server I'd taken out of the cluster to
investigate some issues, which had some good-quality 800GB Intel DC
SSDs. I dedicated an entire drive to swap, tuned up min_free_kbytes,
added an MDS to that server and let it run. It took 3-4 hours but
eventually came back online. I
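For anyone hitting the same thing, a rough sketch of those steps (device name
and value are examples only; size the swap and threshold to your box):
mkswap /dev/sdX
swapon /dev/sdX
sysctl vm.min_free_kbytes=4194304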
Someone correct me if I'm saying something stupid, but from what I see in
the code, there is a check each time to make sure rctime doesn't go backwards,
which seems logical to me because otherwise you would have to go through
all the children to determine the correct ctime.
I don't have the impression t
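For context, rctime is exposed as a recursive xattr on directories, e.g. (the
mount path is just an example):
getfattr -n ceph.dir.rctime /mnt/cephfs/some/dir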
. 20 Oct 2023 at 13:08, David C. wrote:
> Someone correct me if I'm saying something stupid but from what I see in
> the code, there is a check each time to make sure rctime doesn't go back.
> Which seems logical to me because otherwise you would have to go through
&g
Hi Michel,
(I'm just discovering the existence of this module, so it's possible I'm
making mistakes)
The rgw module is new and only seems to be there to configure multisite.
It is present on the v17.2.6 branch but I don't see it in the container for
this version.
In any case, if you're not usin
On Tue, 24 Oct 2023 at 18:11, David C. wrote:
> Hi Michel,
>
> (I'm just discovering the existence of this module, so it's possible I'm
> making mistakes)
>
> The rgw module is new and only seems to be there to configure multisite.
>
> It is present on the v17.2.6
Hi Hubert,
It's error 125 (ECANCELED), and there may be many reasons for it.
I see high latency (144 sec); is the object big?
Any network problems?
Regards,
*David CASIER*
___
Hi Mohamed,
I understand there's only one operational monitor left, is that right?
If so, you need to reprovision the other monitors with an empty store so that
they synchronize with the only remaining monitor.
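With cephadm, a hedged sketch of what that can look like once the surviving
mon has quorum on its own (host names and IPs are placeholders; proceed one
monitor at a time):
ceph orch daemon rm mon.host2 --force
ceph orch daemon add mon host2:10.0.0.2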
Regards,
*David CASIER*
_
Hi,
I've just checked with the team and the situation is much more serious than
it seems: the lost disks contained the MON AND OSD databases (5 servers
down out of 8, replica 3).
It seems that the team fell victim to a bad batch of Samsung 980 Pros (I'm
not a big fan of this "Pro" range, but th
Hi Dominique,
The consistency of the data should not be at risk with such a problem.
But on the other hand, it's better to solve the network problem.
Perhaps look at the state of bond0 :
cat /proc/net/bonding/bond0
As well as the usual network checks
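By the usual checks I mean things along these lines (interface names are
examples):
ip -s link show bond0
ethtool eno1
dmesg -T | grep -i -e bond -e 'link is'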
__
Hi,
It seems to me that before removing buckets from the crushmap, it is
necessary to do the migration first.
I think you should restore the initial crushmap by adding the default root
next to it and only then do the migration.
There should be some backfill (probably a lot).
__
11:50, David C. wrote:
> Hi,
>
> It seems to me that before removing buckets from the crushmap, it is
> necessary to do the migration first.
> I think you should restore the initial crushmap by adding the default root
> next to it and only then do the migration.
> There should
So the next step is to place the pools on the right rule:
ceph osd pool set db-pool crush_rule fc-r02-ssd
On Wed, 8 Nov 2023 at 12:04, Denny Fuchs wrote:
> hi,
>
> I've forget to write the command, I've used:
>
> =
> ceph osd crush move fc-r02-ceph-osd-01 root=default
> ceph osd crush
Without a (RAID/JBOD) controller?
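If there is no controller in between, the cache is usually toggled on the drive
itself; a sketch of the common commands (device is a placeholder, and as you
noticed not every drive honours them):
hdparm -W 0 /dev/sdX    # SATA: disable the volatile write cache
sdparm --clear WCE /dev/sdX    # SAS/SCSI: clear the Write Cache Enable bit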
On Wed, 8 Nov 2023 at 18:36, Peter wrote:
> Hi All,
>
> I note that HDD cluster commit delay improves after I turn off the HDD cache.
> However, I also note that not all HDDs are able to turn off the cache.
> In particular, I found that two HDDs with the same model number, on
Hi Albert,
What would be the number of replicas (in total and on each row) and their
distribution on the tree ?
On Wed, 8 Nov 2023 at 18:45, Albert Shih wrote:
> Hi everyone,
>
> I'm a total newbie with Ceph, so sorry if I'm asking a stupid question.
>
> I'm trying to understand how the
ossible on this architecture.
Regards,
*David CASIER*
On Thu, 9 Nov 2023 at 08:48, Albert Shih wrote:
> On 08/11/2023 at 19:29:19+0100, David C. wrote
> Hi David
Hi Daniel,
it's perfectly normal for a PG to freeze when the primary OSD is not stable.
It can sometimes happen that the disk fails but doesn't immediately return
I/O errors (which would crash the OSD).
When the OSD is stopped, there's a 5-minute delay before it goes down in
the crushmap.
Le ve
Hi Jean Marc,
maybe look at the "rgw_enable_apis" parameter and check whether the values you
have correspond to the default (an RGW restart is needed after changing it):
https://docs.ceph.com/en/quincy/radosgw/config-ref/#confval-rgw_enable_apis
ceph config get client.rgw rgw_enable_apis
_
rbd create testpool/test3 --size=100M
rbd snap limit set testpool/test3 --limit 3
On Wed, 15 Nov 2023 at 17:58, Wesley Dillingham
wrote:
> looking into how to limit snapshots at the ceph level for RBD snapshots.
> Ideally ceph would enforce an arbitrary number of snapshots allowable per
> rb
t for each rbd?
>
> Respectfully,
>
> *Wes Dillingham*
> w...@wesdillingham.com
> LinkedIn <http://www.linkedin.com/in/wesleydillingham>
>
>
> On Wed, Nov 15, 2023 at 1:14 PM David C. wrote:
>
>> rbd create testpool/test3 --size=100M
>> rbd snap limit set
Hi Albert,
5 MONs instead of 3 will allow you to limit the impact if you break a MON
(for example, with a full file system).
5 MDSs instead of 3 makes sense if the workload can be distributed
over several trees in your file system. Sometimes it can also make sense to
have several FSs in ord
On Fri, 17 Nov 2023 at 11:22, Jean-Marc FONTANA
wrote:
> Hello, everyone,
>
> There's no cephadm.log in /var/log/ceph.
>
> To get something else, we tried what David C. proposed (thanks to him !!)
> and found:
>
> nov. 17 10:53:54 svtcephmonv3 ceph-mgr[727]:
Hi,
You can use the cephadm account (instead of root) to control machines with
the orchestrator.
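For example (assuming the cephadm package created that user and you distribute
the orchestrator's SSH key to it on every host):
ceph cephadm set-user cephadm
ceph cephadm get-pub-key > ~/ceph.pub
ssh-copy-id -f -i ~/ceph.pub cephadm@<host>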
On Fri, 17 Nov 2023 at 13:30, Luis Domingues
wrote:
> Hi,
>
> I noticed when installing the cephadm rpm package, to bootstrap a cluster
> for example, that a user cephadm was created. But I do n
figure out how to enable cephadm's access to the
> machines.
>
> Anyway, thanks for your reply.
>
> Luis Domingues
> Proton AG
>
>
> On Friday, 17 November 2023 at 13:55, David C.
> wrote:
>
>
> > Hi,
> >
> > You can use the cephadm account (i
Hello Albert,
5 vs 3 MON => you won't notice any difference
5 vs 3 MGR => by default, only 1 will be active
On Sat, 18 Nov 2023 at 09:28, Albert Shih wrote:
> On 17/11/2023 at 11:23:49+0100, David C. wrote
>
> Hi,
>
> >
> > 5 instead of 3 mon will a
Hi Guiseppe,
Could you have clients that heavily load the MDS with concurrent access
to the same trees?
Perhaps also look at the stability of all your clients (even if there are
many) [dmesg -T, ...]
How are your 4 active MDSs configured (pinning?)
Probably nothing to do but normal for 2
Hi,
It looks like a trim/discard problem.
I would try my luck by activating discard on one disk, to validate.
I have no feedback on the reliability of the bdev_*_discard parameters.
Maybe dig a little deeper into the subject or if anyone has any feedback...
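For reference, the knobs I mean (the option names as they appear in the OSD
config dump in this thread; test on a single OSD first):
ceph config set osd bdev_enable_discard true
ceph config set osd bdev_async_discard true
ceph config show osd.0 | grep discard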
___
*
On Mon, 4 Dec 2023 at 06:01, Szabo, Istvan (Agoda)
wrote:
> With the nodes that have some free space in that namespace we don't have the
> issue, only with this one, which is weird.
> --
> *From:* Anthony D'Atri
> *Sent:* Friday, December 1, 2023
v_async_discard": "false",
> "bdev_enable_discard": "false",
>
>
>
> Istvan Szabo
> Staff Infrastructure Engineer
> ---
> Agoda Services Co., Ltd.
> e: istvan.sz...@agoda.com
> -------
Hi Matthew,
To make a simplistic comparison, it is generally not recommended to use RAID 5
with large disks (>1 TB) due to the probability (low but not zero) of
losing another disk during the rebuild.
So imagine losing a host full of disks.
Additionally, min_size=1 means you can no longer maintain yo
Hi,
To return to my comparison with SANs, on a SAN you have spare disks to
repair a failed disk.
On Ceph, you therefore need at least one more host (k+m+1).
If we take into consideration the formalities/delivery times for a new
server, k+m+2 is not a luxury (depending on the growth of your volume).
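To make it concrete, a hedged example of an EC profile with host failure
domain (k=4, m=2 chosen arbitrarily), which by that reasoning wants at least
7-8 hosts:
ceph osd erasure-code-profile set ec42 k=4 m=2 crush-failure-domain=host
ceph osd pool create mypool-ec erasure ec42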
in the point of view of
> global storage capacity.
>
> Patrick
>
> On 05/12/2023 at 12:19, David C. wrote:
>
> Hi,
>
> To return to my comparison with SANs, on a SAN you have spare disks to
> repair a failed disk.
>
> On Ceph, you therefore need at least o
Hi Mohamed,
Changing weights is no longer good practice.
The balancer is supposed to do the job.
The number of PGs per OSD is really tight on your infrastructure.
Can you share the output of the ceph osd tree command?
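To check/enable the balancer (upmap mode assumed here):
ceph balancer status
ceph balancer mode upmap
ceph balancer on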
Regards,
*David CASIER*
Hi Sake,
I would start by decrementing max_mds by 1:
ceph fs set atlassian-prod max_mds 2
Does mds.1 no longer restart?
Any logs?
On Thu, 21 Dec 2023 at 08:11, Sake Ceph wrote:
> Starting a new thread, forgot subject in the previous.
> So our FS down. Got the following error, what can I do?
Hello Nicolas,
I don't know if it's an update issue.
If this is not a problem for you, you can consider redeploying
grafana/prometheus.
It is also possible to inject your own certificates :
https://docs.ceph.com/en/latest/cephadm/services/monitoring/#example
https://github.com/ceph/ceph/blob/m
function to create this certificate inside the Key
> store but how ... that's the point :-(
>
> Regards.
>
>
>
> On Tue, 23 Jan 2024 at 15:52, David C. wrote:
>
>> Hello Nicolas,
>>
>> I don't know if it's an update issue.
>>
orError: unknown daemon type
> node-exporter
>
> Tried to remove & recreate service : it's the same ... how to stop the
> rotation now :-/
>
>
>
> On Tue, 23 Jan 2024 at 17:18, David C. wrote:
>
Hi Albert,
In this scenario, it is more consistent to work with subvolumes.
Regarding security, you can use namespaces to isolate access at the OSD
level.
What Robert emphasizes is that creating pools dynamically is not without
effect on the number of PGs and (therefore) on the architecture (PG
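A hedged sketch of the subvolume approach mentioned above (fs/group/subvolume
names are placeholders):
ceph fs subvolumegroup create cephfs apps
ceph fs subvolume create cephfs app1 --group_name apps --namespace-isolated
ceph fs subvolume getpath cephfs app1 --group_name apps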
Hi,
The client calculates the location (PG) of an object from its name and the
crushmap.
This is what makes it possible to parallelize the flows directly from the
client.
The client also has the map of the PGs which are relocated to other OSDs
(upmap, temp, etc.)
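You can see that computation from the command line (pool and object names are
examples):
ceph osd map mypool myobject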
_
Albert,
Never used EC for (root) data pool.
On Thu, 25 Jan 2024 at 12:08, Albert Shih wrote:
> On 25/01/2024 at 08:42:19+, Eugen Block wrote
> > Hi,
> >
> > it's really as easy as it sounds (fresh test cluster on 18.2.1 without
> any
> > pools yet):
> >
> > ceph:~ # ceph fs volume creat
> override.
>
> ceph:~ # ceph fs new cephfs cephfs_metadata cephfs_data --force
> new fs with metadata pool 6 and data pool 8
>
> CC'ing Zac here to hopefully clear that up.
>
> Zitat von "David C." :
>
> > Albert,
> > Never used EC for (root) data pool
; then this should definitely be in the docs as a warning for EC pools
> in cephfs!
>
> Zitat von "David C." :
>
> > In case the root is EC, it is likely that it is not possible to apply the
> > disaster recovery procedure,
20e87354373b0fac
>
> This example shows that it's impossible to get any metrics in an IPv6-only
> network (discovery is impossible), and it's visible at install, so is there no
> test for IPv6-only environments before release?
>
> Now I'm seriously
Hello Albert,
this should return the sockets used on the cluster network:
ceph report | jq '.osdmap.osds[] | .cluster_addrs.addrvec[] | .addr'
Regards,
*David CASIER*
Hi,
The problem seems to come from the clients (reconnect).
Test by disabling metrics on all clients:
echo Y > /sys/module/ceph/parameters/disable_send_metrics
Regards,
*David CASIER*
look at ALL cephfs kernel clients (no effect on RGW)
On Fri, 23 Feb 2024 at 16:38, wrote:
> And we don't have a parameters folder
>
> cd /sys/module/ceph/
> [root@cephgw01 ceph]# ls
> coresize holders initsize initstate notes refcnt rhelversion
> sections srcversion taint uevent
>
> My
Is it possible for you to stop/unmount the CephFS clients?
If so, do that and restart the MDS.
It should restart.
Have the clients restart one by one and check that the MDS does not crash
(by monitoring the logs)
Regards,
*David C
If rebalancing tasks have been launched, it's not a big deal, but I don't
think that's the priority.
The priority is to get the MDS back on its feet.
I haven't seen an answer to this question: can you stop/unmount the CephFS
clients or not?
There are other solutions but as you are not comfortable I a
Hello,
Does each rack work on different trees, or is everything parallelized?
Are the metadata pools distributed over racks 1, 2, 4, 5?
If they are distributed, even if the MDS being addressed is on the same switch
as the client, that MDS will still consult/write (NVMe) OSDs in the other ra
I came across an enterprise NVMe used for BlueFS DB whose performance
dropped sharply a few months after delivery (I won't mention the brand
here but it was not among these 3: Intel, Samsung, Micron).
It is clear that enabling bdev_enable_discard impacted performance, but
this option also saved
user IO. Keep an eye on your
> discards being sent to devices and the discard latency, as well (via
> node_exporter, for example).
>
> Matt
>
>
> On 2024-03-02 06:18, David C. wrote:
> > I came across an enterprise NVMe used for BlueFS DB whose performance
> > dr
Hello everybody,
I'm encountering strange behavior on an infrastructure (it's pre-production,
but it's very ugly). After a "drain" on a monitor (and a manager), the MGRs all
crash on startup:
Mar 07 17:06:47 pprod-mon1 ceph-mgr[564045]: mgr ms_dispatch2 standby
mgrmap(e 1310) v1
Mar 07 17:06:47 pprod-mo
I took the wrong line =>
https://github.com/ceph/ceph/blob/v17.2.6/src/mon/MonClient.cc#L822
On Thu, 7 Mar 2024 at 18:21, David C. wrote:
>
> Hello everybody,
>
> I'm encountering strange behavior on an infrastructure (it's
> pre-production but it's very
"name": "pprod-mon3",
"weight": 10,
"name": "pprod-osd2",
"weight": 0,
"name": "pprod-osd1",
"weight": 0,
"name": "pprod-osd
> mon weight myself, do you know how that happened?
>
> Zitat von "David C." :
>
> Ok, got it :
>>
>> [root@pprod-admin:/var/lib/ceph/]# ceph mon dump -f json-pretty
>> |egrep "name|weigh"
>> dumped monmap epoch 14
>>
Hi Daniel,
Changing pg_num when some OSDs are almost full is not a good strategy (it's
even dangerous).
What is causing this backfilling? Loss of an OSD? The balancer? Something else?
What is the fill level of the least busy OSD (sort -nrk17)?
Is the balancer activated (upmap)?
Once the situation stabilizes, it becomes i
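What I mean by that sort (adjust the column index to wherever %USE sits in the
output of ceph osd df on your version):
ceph osd df | sort -nrk17 | head    # most-full OSDs first
ceph osd df | sort -nk17 | head     # least-full OSDs first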
Hi,
Do slow ops impact data integrity => No
Can I generally ignore it => No :)
This means that some client transactions are blocked for 120 sec (that's a
lot).
This could be a lock on the client side (CephFS, essentially), an incident
on the infrastructure side (a disk about to fail, network inst
My understanding is BeeGFS doesn't offer data redundancy by default,
you have to configure mirroring. You've not said how your Ceph cluster
is configured but my guess is you have the recommended 3x replication
- I wouldn't be surprised if BeeGFS was much faster than Ceph in this
case. I'd be intere
Hi All
I'm planning to upgrade a Luminous 12.2.10 cluster to Pacific 16.2.10; the
cluster is primarily used for CephFS, with a mix of Filestore and Bluestore
OSDs, mons/osds collocated, running on CentOS 7 nodes
My proposed upgrade path is: Upgrade to Nautilus 14.2.22 -> Upgrade to
EL8 on the nodes (probabl
o 8 without a reinstall. Rocky has a
> similar path I think.
>
> - you will need to move those Filestore OSDs to BlueStore before hitting
> Pacific, might even be part of the Nautilus upgrade. This takes some time
> if I remember correctly.
>
> - You may need to upgrade monito
>
> I don't think this is necessary. It _is_ necessary to convert all
> leveldb to rocksdb before upgrading to Pacific, on both mons and any
> filestore OSDs.
Thanks, Josh, I guess that explains why some people had issues with
Filestore OSDs post Pacific upgrade
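For anyone following along, a quick (hedged) way to spot what still needs
converting:
ceph osd metadata | jq -r '.[] | select(.osd_objectstore=="filestore") | .id'
cat /var/lib/ceph/mon/ceph-$(hostname -s)/kv_backend    # should say rocksdb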
On Tue, Dec 6, 2022 at 4:07 PM Jo
Hi Michel,
the pool already appears to be in automatic autoscale ("autoscale_mode on").
If you're worried (if, for example, the platform is having trouble handling
a large data shift) then you can set the parameter to warn (like the
rjenkis pool).
If not, as Hervé says, the transition to 2048
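For the warn option mentioned above, something like (pool name is a
placeholder):
ceph osd pool set mypool pg_autoscale_mode warn
ceph osd pool autoscale-status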
Hi Albert,
(an open question, without judgment)
What is the purpose of importing users recurrently?
It seems to me that import is the complement of export, for restores.
Isn't creating in Ceph and exporting (possibly in JSON format) enough?
On Tue, 3 Dec 2024 at 13:29, Albert Shih wrote:
> On
Hi,
In this case, the tool that adds the account should perform a caps check
(for security reasons) and probably use get-or-create/caps (not import)
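A minimal sketch of what I mean (client name, caps and pool are examples only):
ceph auth get-or-create client.app mon 'allow r' osd 'allow rw pool=app-data' -o /etc/ceph/ceph.client.app.keyring
ceph auth caps client.app mon 'allow r' osd 'allow rw pool=app-data'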
On Wed, 4 Dec 2024 at 10:42, Albert Shih wrote:
> On 03/12/2024 at 18:27:57+0100, David C. wrote
> Hi,
>
> >
> > (o