Hello Kotresh,
We have been having the same problem quite frequently for a few months now with Ceph 16.2.7.
For us the only thing that helps is a
reboot of the MDS/client, or the warning might disappear by itself after a
few days. It's an Ubuntu kernel (5.13) client.
Best,
Alex
On Wednesday, July 6,
Hello everybody,
since auto sharding does not work on replicated clusters (we only sync the
user accounts and metadata, not the actual data), I would like to
implement it on my own.
But when I reshard a bucket from 53 to 101 shards (yep, we have two buckets
with around 8M files in them) it takes a long
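The manual reshard itself is just one command. A sketch, with a hypothetical bucket name (101 being prime is deliberate, since prime shard counts distribute objects more evenly):

radosgw-admin bucket reshard --bucket=mybucket --num-shards=101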
OK, but I don't really have SSDs. My SSDs are only for the DB, not for data.
Jof
On Tue, Jul 5, 2022 at 18:01, Tatjana Dehler wrote:
> Hi,
>
> On 7/5/22 13:17, Joffrey wrote:
> > Hi,
> >
> > I upgraded from 16.2.4 to 17.2.0
> >
> > Now, I have a CephImbalance alert with many errors on my OSD "deviate
Hi all,
I have a 10-node cluster with fairly modest hardware (6 HDDs, 1 shared NVMe
for DB on each) that I use for archival. After upgrading to Quincy I noticed
that the load average on my servers is very high during recovery or
rebalance. Changing the OSD recovery priority does not work, I assume
Hi all,
Thanks for opening this discussion,
Let me share with you some thoughts..
We discussed this in the PetaSAN project a while ago, after getting
complaints about PGs not being deep scrubbed in time.
The main question was whether Ceph should be responsible for finishing
scrubbing within the specified
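For context, the main knobs around scrub scheduling and the "not deep scrubbed in time" warning are along these lines (a sketch; intervals are in seconds):

ceph config set osd osd_scrub_max_interval 1209600    # force a scrub at least every 14 days
ceph config set osd osd_deep_scrub_interval 1209600   # deep scrub interval, 14 days here
ceph config set osd osd_scrub_load_threshold 0.5      # skip starting scrubs above this load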
Hi Jimmy,
As you rightly pointed out, the OSD recovery priority does not work because
of the change to mClock. By default, the "high_client_ops" profile is
enabled, and this optimizes client ops over recovery ops. Recovery ops will
take the longest to complete with this profile and
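For what it's worth, switching the active profile is a single config change (assuming the stock Quincy mClock options):

ceph config set osd osd_mclock_profile high_recovery_ops   # favor recovery over client I/O
ceph config set osd osd_mclock_profile high_client_ops     # revert to the default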
Do you mean load average as reported by `top` or `uptime`?
That figure can be misleading on multi-core systems. What CPU are you using?
For context, when I ran systems with 32C/64T and 24x SATA SSD, the load average
could easily hit 40-60 without anything being wrong.
What CPU percentages in
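A quick way to put a number on it is load per logical core rather than the raw load average, e.g.:

# 1-minute load average divided by logical core count; sustained values
# well above 1 per core indicate real queueing
awk -v cores=$(nproc) '{print $1/cores}' /proc/loadavg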
Thanks for your reply.
What I meant by high load was the load as seen by the top command; all the
servers have a load average over 10.
I added one more node to add more space.
This is what I get from ceph status:
cluster:
id:
health: HEALTH_WARN
2 failed cephadm daemon(s
Hi Cephers
I've got a missing object, can anyone point me to a simple method of
turning the OID into a /path/filename that I could then recover from
backup?
root@ceph-s1 15:52 [~]: ceph pg 2.fff list_unfound
{
"num_missing": 1,
"num_unfound": 1,
"objects": [
{
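Not a full answer, but if this is a CephFS data pool, the object name is normally <inode-in-hex>.<block-number>, so the inode can be converted and looked up on a mounted client. A sketch, assuming a hypothetical object name 10000001234.00000000 and a mount at /mnt/cephfs:

# hex inode -> decimal
printf '%d\n' 0x10000001234
# -> 1099511632436
# find the path for that inode on a mounted filesystem
find /mnt/cephfs -xdev -inum 1099511632436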
> Do you mean load average as reported by `top` or `uptime`?
yes
> That figure can be misleading on multi-core systems. What CPU are you
using?
It's a 4c/4t low end CPU
/Jimmy
On Wed, Jul 6, 2022 at 4:52 PM Anthony D'Atri wrote:
> Do you mean load average as reported by `top` or `uptime`?
>
>
Hey Andreas,
thanks for the info.
We also had our MGR reporting crashes related to the module.
We have a second cluster as a mirror, which we also updated to Quincy.
But there the MGR is able to use the snap_module (so "ceph fs
snap-schedule status" etc. are not complaining).
And I'm able to schedu
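In case it helps with triage: the crashes behind that health warning can be inspected with the built-in crash module:

ceph crash ls                # list recent daemon crashes
ceph crash info <crash-id>   # full backtrace for one of them
ceph crash archive-all       # clear the warning once triaged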
I am wondering if it is safe to delete the following pool, which rados ls
reports as empty but rados df indicates has a few thousand objects?
[root@ceph-admin ~]# rados -p fs.data.user.hdd.ec ls | wc -l
0
[root@ceph-admin ~]# rados df | egrep -e 'POOL|fs.data.user.hdd.ec'
POOL_NAME USED OBJECTS C
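One thing worth checking before deleting anything: rados ls only lists the default namespace, so objects that live in other namespaces are invisible to a plain ls. Comparing against --all may explain the mismatch:

rados -p fs.data.user.hdd.ec ls --all | wc -l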
This is from an RBD HDD pool, 3x replication (not really fast drives, 2.2 GHz
CPUs with the power profile on balanced rather than optimized; Nautilus):
[@~]# rados bench -p rbd 60 write
hints = 1
Maintaining 16 concurrent writes of 4194304 bytes to objects of size 4194304
for up to 60 seconds or 0 objects
Object prefix: benchmar
This is from an RBD SSD pool, 3x replication (SATA SSD drives, 2.2 GHz CPUs
with the power profile on balanced rather than optimized; Nautilus):
[@~]# rados bench -p rbd.ssd 60 write
hints = 1
Maintaining 16 concurrent writes of 4194304 bytes to objects of size 4194304
for up to 60 seconds or 0 objects
Object prefix: benchmark_d
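To compare read throughput on the same pool, the write benchmark objects can be kept and then read back:

rados bench -p rbd.ssd 60 write --no-cleanup   # keep the benchmark objects
rados bench -p rbd.ssd 60 seq                  # sequential reads against them
rados -p rbd.ssd cleanup                       # remove them afterwards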
Here are the main topics of discussion during the CLT meeting today:
- make-check/API tests
- Ignoring the doc/ directory would skip an expensive git checkout
operation and save time
- Stale PRs
- Currently an issue with stalebot which is being investigated
- Cephalocon
I have a problem with I/O performance on an OpenStack block device.
*Environment:*
*OpenStack version: Ussuri*
- OS: CentOS8
- Kernel: 4.18.0-240.15.1.el8_3.x86_64
- KVM: qemu-kvm-5.1.0-20.el8
*Ceph version: Octopus*
- OS: CentOS8
- Kernel: 4.18.0-240.15.1.el8_3.x86_64
In the Ceph cluster we have 2
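When debugging this kind of problem it helps to first measure inside the guest, e.g. with fio against the attached volume. A sketch (the device name /dev/vdb is an assumption, and writing to it is destructive):

fio --name=randwrite --ioengine=libaio --direct=1 --rw=randwrite \
    --bs=4k --iodepth=32 --numjobs=1 --size=1G --filename=/dev/vdb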
Hi Andreas,
On Wed, Jul 6, 2022 at 8:36 PM Andreas Teuchert wrote:
>
> Hello Mathias and others,
>
> I also ran into this problem after upgrading from 16.2.9 to 17.2.1.
>
> Additionally I observed a health warning: "3 mgr modules have recently
> crashed".
>
> Those are actually two distinct crash
Hi,
I want to test compression performance in Ceph on my cluster,
but I cannot find a tool to set the compression ratio or generate
compressible data.
Right now I use warp (https://github.com/minio/warp) to test compression
performance, but its data is random and cannot be compressed.
So who knows which tools can test s
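One low-tech approach is to generate payloads with a known mix of random and zero bytes, which gives a controllable compression ratio. A sketch, with made-up pool and object names:

# enable compression on the pool
ceph osd pool set testpool compression_algorithm snappy
ceph osd pool set testpool compression_mode aggressive
# 4 MiB incompressible + 12 MiB zeroes -> roughly 4:1 compressible
head -c 4M /dev/urandom > /tmp/obj
head -c 12M /dev/zero >> /tmp/obj
rados -p testpool put obj-1 /tmp/obj
# compare STORED vs USED to estimate the achieved ratio
ceph df detail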