"ceph daemon osd.x ops" shows ops currently in flight, the number is different
from "ceph osd status
Hi All
Is there perhaps any updated documentation about ceph OSD node optimised
sysctl configuration?
I'm seeing a lot of these:
$ netstat -s
...
4955341 packets pruned from receive queue because of socket buffer overrun
...
5866 times the listen queue of a socket overflowed
...
TCPB
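For reference, those two counters usually point at undersized receive buffers
and an undersized accept backlog. A minimal sketch of the sysctl knobs commonly
discussed for them; the values below are only illustrative, not official Ceph
recommendations:
# /etc/sysctl.d/90-osd-network.conf (illustrative values)
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.core.netdev_max_backlog = 50000
net.core.somaxconn = 4096
Apply with "sysctl --system" and watch whether the netstat -s counters keep
growing.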
Hi Jan,
Jan Pekař wrote:
> Also, I'm concerned that this OSD restart caused data degradation and
> recovery - the cluster should be clean immediately after the OSD comes up
> when no client was uploading/modifying data during my tests.
We're experiencing the same thing on our 14.2.10 cluster. After marking a
Hi,
We have a large production cluster with a writeback cache tier. Recently, we
observed that some of the OSDs went from nearfull to full because of the cache
tier, and the cluster is now in an error state. target_max_bytes was not set
correctly, so I think flushing and eviction never happened.
I wis
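For context, a sketch of the pool settings involved; the values and the pool
name "hot-pool" are placeholders, not recommendations:
ceph osd pool set hot-pool target_max_bytes 1099511627776
ceph osd pool set hot-pool target_max_objects 1000000
ceph osd pool set hot-pool cache_target_dirty_ratio 0.4
ceph osd pool set hot-pool cache_target_full_ratio 0.8
Without target_max_bytes/target_max_objects the dirty/full ratios have nothing
to be relative to, so the tier never flushes or evicts on its own.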
When I make my hard efforts to print the documents through my HP printer,
suddenly my HP printer goes offline mode. I am applying an appropriate command
to my HP printer, I am facing HP printer offline problem. This offline error is
an annoying issue for me, so I am not able to work on my HP pri
Dear Mark and Dan,
I'm in the process of restarting all OSDs and could use some quick advice on
bluestore cache settings. My plan is to set higher minimum values and deal with
accumulated excess usage via regular restarts. Looking at the documentation
(https://docs.ceph.com/docs/mimic/rados/con
Hi,
I made a fresh install of Ceph Octopus 15.2.3 recently.
And after a few days, the 2 standby MDS daemons suddenly crashed with a
segmentation fault error.
I tried to restart them, but they do not start.
Here is the error:
-20> 2020-07-17T13:50:27.888+ 7fc8c6c51700 10 monclient: _renew_subs
-19> 2020
After trying to restart the MDS master, it also failed. Now the cluster state
is:
# ceph status
  cluster:
    id:     dd024fe1-4996-4fed-ba57-03090e53724d
    health: HEALTH_WARN
            1 filesystem is degraded
            insufficient standby MDS daemons available
            29 daemons have recently crashed

  services:
    m
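For what it's worth, the "29 daemons have recently crashed" warning comes from
the crash module; the stored reports can be listed and, once reviewed,
acknowledged (crash-id is a placeholder):
# ceph crash ls
# ceph crash info <crash-id>
# ceph crash archive-all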
Hi Mateusz,
I think you might be hit by:
https://tracker.ceph.com/issues/44213
This is fixed in the upcoming Pacific release. A Nautilus/Octopus backport is
under discussion for now.
Thanks,
Igor
On 7/18/2020 8:35 AM, Mateusz Skała wrote:
Hello Community,
I would like to ask for help in exp
On 7/20/20 3:23 AM, Frank Schilder wrote:
Dear Mark and Dan,
I'm in the process of restarting all OSDs and could use some quick advice on
bluestore cache settings. My plan is to set higher minimum values and deal with
accumulated excess usage via regular restarts. Looking at the documentation
On 20/07/2020 10:48 pm, carlimeun...@gmail.com wrote:
After trying to restart the MDS master, it also failed. Now the cluster state
is:
Try deleting and recreating one of the MDS.
--
Lindsay
Hi Igor,
Given the patch histories, and the rejection of the previous patch in favor
of the one defaulting to a 4k block size, does this essentially mean
Ceph does not support larger block sizes when using erasure coding? Will
the Ceph project be updating its documentation and references to let
On Mon, Jul 20, 2020 at 5:38 AM wrote:
>
> Hi,
>
> I made a fresh install of Ceph Octopus 15.2.3 recently.
> And after a few days, the 2 standby MDS daemons suddenly crashed with a
> segmentation fault error.
> I tried to restart them, but they do not start.
> [...]
Can you please increase MDS debugging:
ceph
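Raising the MDS debug level is typically done along these lines (a sketch of
likely settings, not necessarily the exact command intended here):
# ceph config set mds debug_mds 20
# ceph config set mds debug_ms 1
Then reproduce the startup crash and collect the MDS log.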
I just want to thank the Ceph community and the Ceph developers for such a
wonderful product.
We had a power outage on Saturday, and both Ceph clusters went offline, along
with all of our other servers.
Bringing Ceph back to full functionality was an absolute breeze, no problems,
no hiccups,
If there was a “like” button, I would have just clicked that to keep the list
noise down. I have smaller operations and so my cluster goes down a lot more
often. I keep dreading my abuse of the cluster and it just keeps coming back
for more.
Ceph really is amazing, and it’s hard to fully appre
Hi All,
Did more tests: just one client, big object / small object, then several
clients with big and small objects - and it seems like I'm getting
absolutely reasonable numbers. Big objects are saturating the network,
small objects the IOPS on the disks. Overall I have a better understanding and
I'm happy a
Dear Mark,
thank you very much for the very helpful answers. I will raise
osd_memory_cache_min, leave everything else alone and watch what happens. I
will report back here.
Thanks also for raising this as an issue.
Best regards,
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum
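For illustration, raising that floor might look like the following; the 2 GiB
value is only an example, not something recommended in this thread:
# ceph config set osd osd_memory_cache_min 2147483648
# ceph config get osd osd_memory_target
The overall per-OSD memory target is left untouched; only the minimum size the
bluestore caches are allowed to shrink to is raised.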
On Mon, Jul 20, 2020 at 12:43 PM Brian Topping
wrote:
> If there was a “like” button, I would have just clicked that to keep the
> list noise down. I have smaller operations and so my cluster goes down a
> lot more often. I keep dreading my abuse of the cluster and it just keeps
> coming back for
I agree; thanks from me as well. I am also really impressed by this
storage solution, as well as by something like Apache Mesos. Those are the
most impressive technologies introduced and developed in the last 5(?) years.
-Original Message-
To: ceph-users
Cc: dhils...@performair.com
Subject: [c
Hi Mark and others,
last week we were finally able to solve the problem. We are using Gentoo
on our test cluster, and as it turned out the official ebuilds do not set
CMAKE_BUILD_TYPE=RelWithDebInfo, which alone caused the performance degradation
we had been seeing after upgrading to
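For anyone checking their own builds, the relevant CMake switch is simply:
$ cmake -DCMAKE_BUILD_TYPE=RelWithDebInfo ..
$ grep CMAKE_BUILD_TYPE CMakeCache.txt
An empty build type leaves compiler optimisation off, which by itself is enough
to explain a large performance gap.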
Hi Adam,
We test it fairly regularly on our development test nodes. Basically what this
does is cache data in the bluestore buffer cache on write. By default we only
cache things when they are first read. The advantage of enabling this is that
you immediately have data in the cache once it's
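Assuming the option under discussion is bluestore_default_buffered_write (the
excerpt does not name it explicitly), enabling it for a test looks roughly
like:
# ceph config set osd bluestore_default_buffered_write true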
Hi,
Today, checking the OSD logs at boot after upgrading to 14.2.10, we found this:
set_numa_affinity unable to identify public interface 'p3p1.4094' numa node:
(2) No such file or directory
"2020-07-20 20:41:41.134 7f2cd15ca700 -1 osd.12 1120769 set_numa_affinity
unable to identify public inte
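As an illustration (assuming p3p1 is the underlying physical NIC): the NUMA
node of an interface is read from sysfs, and a VLAN interface such as
p3p1.4094 has no device entry there, which would match the "No such file or
directory" in the log:
$ cat /sys/class/net/p3p1/device/numa_node
The same path does not exist for the VLAN interface itself.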
Hi,
I managed to activate the OSDs after adding the keys with:
for i in `seq 0 8`; do ceph auth get-or-create osd.$i mon 'profile osd' mgr
'profile osd' osd 'allow *'; done
# ceph osd status
++--+---+---++-++-++
| id | host | used | ava
Aha, thanks very much for pointing that out, Anthony!
Just a summary of the screenshot pasted in my previous email: based on my
understanding, "ceph daemon osd.x ops" or "ceph daemon osd.x
dump_ops_in_flight" shows the ops currently being processed by osd.x. I
also noticed that there is anothe
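For reference, a sketch of the admin-socket commands in question; osd.0 is a
placeholder id:
# ceph daemon osd.0 ops
# ceph daemon osd.0 dump_ops_in_flight
# ceph daemon osd.0 dump_historic_ops
The first two report ops currently in flight on that one daemon, while "ceph
osd status" summarises per-OSD state and recent client I/O for the whole
cluster, which would explain why the numbers differ.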