Hi Milan
Please DO NOT delete the object for all the EC shards (i.e. at all three
OSDs)
Sorry, I missed that you have three shards crashing... Removing that
many object shards will cause data loss.
Theoretically removing just a single object replica and then doing a
scrub might help t
Interesting. There is a more forceful way to disable progress which I had
to do as we have an older version. Basically, you stop the mgrs, and then
move the progress module files:
systemctl stop ceph-mgr.target
mv /usr/share/ceph/mgr/progress {some backup location}
systemctl start ceph-mgr.target
Hi Paul,
We had a similar experience with Red Hat Ceph, and it turned out to be the mgr
progress module. I think there is some work underway to fix this, though the
fix I thought would affect you seems to already be in 14.2.11.
https://github.com/ceph/ceph/pull/36076
If you have 14.2.15, you can try turning off th
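For reference, here is roughly how the toggle looks once the command is
available (a minimal sketch; "ceph progress off" is mentioned further down this
thread):
# disable event tracking in the mgr progress module
ceph progress off
# re-enable it later if desired
ceph progress on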
Hi Igor,
Thank you for quick and useful answer. We are looking at our options.
Milan
On 2020-11-24 06:49, Igor Fedotov wrote:
> Another workaround would be to delete the object in question using
> ceph-objectstore-tool and then do a scrub on the corresponding PG to fix
> the absent object.
>
While the "progress off" was hung, I did a systemctl restart of the active
ceph-mgr. The progress toggle command completed and reported that progress
disabled.
All commands that were hanging before are still unresponsive. That was worth a
shot.
Thanks
--
Paul Mezzanini
Sr Systems Administrator / Engineer, Research Computing
"ceph progress off" is just hanging like the others.
I'll fiddle with it later tonight to see if I can get it to stick when I bounce
a daemon.
--
Paul Mezzanini
Sr Systems Administrator / Engineer, Research Computing
Information & Technology Services
Finance & Administration
Rochester Institute of Technology
Can you clarify, Istvan, what you plan on setting to 64K? If it’s the number of
shards for a bucket, that would be a mistake.
> On Nov 21, 2020, at 2:09 AM, Szabo, Istvan (Agoda)
> wrote:
>
> It seems like this sharding needs to be planned carefully from the beginning.
> I'm thinking of setting the
Starting in stable release Octopus 15.2.0 and continuing through Octopus 15.2.6,
there is a bug in RGW that could result in data loss. There is an immediate
configuration work-around, and a fix is intended for Octopus 15.2.7.
[Note: the bug was first merged in a pre-stable release — Octopus 1
Context: JSON output was added to smartmontools 7 explicitly for Ceph use
>
> I had to roll an upstream version of smartmontools because everything
> shipped with Red Hat 7/8 was too old to support the JSON option.
>
I had to roll an upstream version of smartmontools because everything shipped
with Red Hat 7/8 was too old to support the JSON option.
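As a quick sanity check, something like this shows whether the installed
smartctl is new enough (a minimal sketch; smartmontools 7.0+ is needed for
--json, and the device path is just an example):
# smartmontools 7.0 or newer is required for JSON output
smartctl --version
# older builds will simply reject the --json flag
smartctl --json -a /dev/sda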
--
Paul Mezzanini
Sr Systems Administrator / Engineer, Research Computing
Information & Technology Services
Finance & Administration
Rochester Institute of Technology
Hi,
I did some searching about replacing an OSD and found several different
procedures, probably for different releases.
Is there a recommended process to replace an OSD with Octopus?
Two cases here:
1) replace an HDD whose WAL and DB are on an SSD.
1-1) the failed disk is replaced by the same model.
1-2) the working disk is
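Not an authoritative procedure, but a rough sketch of one common cephadm-based
flow for case 1-1, using hypothetical OSD id, host and device names:
# drain the OSD and keep its id reserved for the replacement disk
ceph orch osd rm 12 --replace
# after the physical swap, wipe the new device so it can be reused
ceph orch device zap ceph-host1 /dev/sdk --force
# a matching OSD service spec (or ceph-volume) can then recreate the OSD;
# reusing the freed DB/WAL LV on the SSD may need extra ceph-volume cleanup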
Hi,
With Ceph Octopus 15.2.5, here is the output of command
"ceph device get-health-metrics SEAGATE_DL2400MM0159_WBM2WP2S".
===
"20201123-000939": {
"dev": "/dev/sde",
"error": "smartctl failed",
"nvme_smart_health_information_add_log_error":
I had hoped to stay out of this, but here I go.
> 4) SATA controller and PCIe throughput
SoftIron claims “wire speed” with their custom hardware FWIW.
> Unfortunately these are the kinds of things that you can't easily generalize
> between ARM vs x86. Some ARM processors are going to do wildl
Ever since we jumped from 14.2.9 to .12 (and beyond) a lot of the ceph commands
just hang. The mgr daemon also just stops responding to our Prometheus scrapes
occasionally. A daemon restart and it wakes back up. I have nothing pointing
to these being related but it feels that way.
I also tri
Hi everyone,
The Ceph User Survey 2020 is being planned by our working group. Please
review the draft survey PDF, and let's discuss any changes. You may also
join us at the next meeting on November 25th at 12pm PT:
https://tracker.ceph.com/projects/ceph/wiki/User_Survey_Working_Group
https://tr
So yes, you can get the servers for a considerably lower price than Intel.
It's not just about the CPU cost: many ARM servers are based on an SoC that
includes networking, so the overall cost of the motherboard/processor/networking
is a lot lower.
It doesn't reduce the price of the storage or me
Hello,
I'm having difficulty setting up the web certificates for the
Dashboard on hostnames ceph*01..n*.domain.tld.
I set the keys and certificates with ceph config-key. ceph config-key get
mgr/dashbord/crt shows the correct certificate,
and the same applies to mgr/dashbord/key and mgr/cephadm/grafana_key
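Not a definitive fix, but for comparison, a sketch of the dashboard-specific way
to load certificates in recent releases (file names are hypothetical; there is
also a per-host variant that takes the hostname as an extra argument):
ceph dashboard set-ssl-certificate -i dashboard.crt
ceph dashboard set-ssl-certificate-key -i dashboard.key
# restart the dashboard so it picks up the new certificate
ceph mgr module disable dashboard
ceph mgr module enable dashboard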
Hi Nathan
Thanks for the reply.
root@ceph1 16:30 [~]: ceph osd pool autoscale-status
POOL      SIZE   TARGET SIZE  RATE  RAW CAPACITY  RATIO   TARGET RATIO  EFFECTIVE RATIO  BIAS  PG_NUM  NEW PG_NUM  AUTOSCALE
ec82pool  2886T               1.25  4732T         0.7625
I've been running ceph on a heterogeneous mix of rock64 and rpi4 SBCs. I've
had to do my own builds, as the upstream ones started off with thunked-out
checksumming due to (afaict) different ARM feature sets between upstream's
build targets and my SBCs, but other than that one, haven't run into any
Adrian;
I've always considered the advantage of ARM to be the reduction in the failure
domain. Instead of one server with 2 processors and 2 power supplies in 1
case running 48 disks, you can do 4 cases containing 8 power supplies and 32
processors running 32 (or 64...) disks.
The archit
Hi guys,
I was looking at some Huawei ARM-based servers and the datasheets are
very interesting. The high CPU core numbers and the SoC architecture
should be ideal for a distributed storage like Ceph, at least in theory.
I'm planning to build a new Ceph cluster in the future and my best
cas
2nd that. Why even remove old documentation before it is migrated to the
new environment? It should be left online until the migration has
successfully completed.
-Original Message-
Sent: Tuesday, November 24, 2020 4:23 PM
To: Frank Schilder
Cc: ceph-users
Subject: [ceph-users] Re: Docu
Oliver;
You might consider asking this question of the CentOS folks. Possibly at
cen...@centos.org.
Thank you,
Dominic L. Hilsbos, MBA
Director – Information Technology
Perform Air International Inc.
dhils...@performair.com
www.PerformAir.com
-Original Message-
From: Oliver Weinman
No, you are not affected. Only clusters with mixed versions are affected.
k
Sent from my iPhone
> On 24 Nov 2020, at 18:25, Rainer Krienke wrote:
>
> Hello,
>
> thanks for your answer. If I understand you correctly then only if I
> upgrade from 14.2.11 to 14.2.(12|14) this could lead to problem
I want to just echo this sentiment. I thought the lack of older docs would
be a very temporary issue, but they are still not available. It is
especially frustrating when half the Google searches also return a
page-not-found error. The migration has been very badly done.
Sincerely,
On Tue, Nov 24,
I am gathering Prometheus metrics from my (unhealthy) Octopus (15.2.4)
cluster and notice a discrepancy (or misunderstanding) with the Ceph
dashboard.
In the dashboard, and with ceph -s, it reports 807 million objects:
pgs: 169747/807333195 objects degraded (0.021%)
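One way to compare is to pull the raw per-pool object counts straight from the
mgr metrics endpoint; a sketch assuming the default prometheus module port, and
a metric name that may vary between releases:
# dump the object counts the prometheus module exposes
curl -s http://mgr-host:9283/metrics | grep '^ceph_pool_objects'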
Hi Seena, sorry for the late reply.
I have used Jaeger to trace the RGW requests; the PR is still not merged into
the official repo, but you can give it a try:
https://github.com/suab321321/ceph/tree/wip-jaegerTracer-noNamespace
1. The cmake option to build Jaeger is on by default, so you don't need to
give
Hello,
> I'm curious however if the ARM servers are better or not for this use case
> (object-storage only). For example, instead of using 2xSilver/Gold server, I
> can use a Taishan 5280 server with 2x Kunpeng 920 ARM CPUs with up to 128
> cores in total. So I can have twice as many CPU cor
Am 24.11.20 um 13:12 schrieb Adrian Nicolae:
> Has anyone tested Ceph in such scenario ? Is the Ceph software
> really optimised for the ARM architecture ?
Personally I have not run Ceph on ARM, but there are companies selling
such setups:
https://softiron.com/
https://www.ambedded.com.tw/
At least in the past, there have been a couple of things you really want
to focus on regarding ARM and performance (beyond the obvious core
count/clockspeed/ipc/etc):
1) HW acceleration for things like CRC32, MD5, etc
2) context switching overhead
3) memory throughput
4) SATA controller and
Older versions are available here:
https://web.archive.org/web/20191226012841/https://docs.ceph.com/docs/mimic/
I'm actually also a bit unhappy about older versions missing. Mimic is not end
of life and a lot of people still use luminous. Since there are such dramatic
differences between interf
We made the same observation and found out that for CentOS 8 there are extra
Samba packages that provide VFS modules for certain storage systems (search
for all available package names containing samba and they show up in the list).
One is available and supports GlusterFS. The corresponding p
Another workaround would be to delete the object in question using
ceph-objectstore-tool and then do a scrub on the corresponding PG to fix
the absent object.
But I would greatly appreciate it if we could dissect this case for a bit
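For concreteness, a sketch of that workaround with purely hypothetical PG/OSD
ids and object name; note the warning elsewhere in this thread about not doing
this on more than one EC shard:
# stop the OSD holding the bad shard, then remove the object offline
systemctl stop ceph-osd@12
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-12 --pgid 7.1as0 BADOBJECT remove
systemctl start ceph-osd@12
# a repair scrub should then restore the object from the remaining shards
ceph pg repair 7.1a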
On 11/24/2020 9:55 AM, Milan Kupcevic wrote:
Hello,
Three OSD d
Hi Milan,
given the log output mentioning 32768 spanning blobs I believe you're
facing https://tracker.ceph.com/issues/48216
The root cause for this case is still unknown, but the PR attached to the
ticket allows fixing the issue using the objectstore's fsck/repair.
Hence if you're able to deploy a
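If it comes to that, a minimal sketch of the offline check/repair with
ceph-bluestore-tool, assuming a hypothetical OSD id (the OSD must be stopped,
and per the ticket a patched build may be needed for the repair to actually fix
this case):
systemctl stop ceph-osd@12
# read-only consistency check first
ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-12
# then the actual repair pass
ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-12
systemctl start ceph-osd@12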
Take a look at the `radosgw-admin reshard stale-instances list` command. If the
list is not empty, just remove the stale reshard instances and then start the
reshard process again.
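A sketch of that sequence, with a hypothetical bucket name and shard count:
# list leftover bucket index instances from earlier reshard attempts
radosgw-admin reshard stale-instances list
# clean them up
radosgw-admin reshard stale-instances rm
# then kick off the reshard again
radosgw-admin bucket reshard --bucket=mybucket --num-shards=101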
k
Sent from my iPhone
> On 22 Nov 2020, at 15:35, Mateusz Skała wrote:
>
> Thank you for the response. How can I upload this to metadat
This bug may affect you when you upgrade from 14.2.11 to 14.2.(12|14) slowly
(e.g. one node at a time). If you have already upgraded from 14.2.11, you
simply jumped over this bug.
k
Sent from my iPhone
> On 24 Nov 2020, at 10:43, Rainer Krienke wrote:
>
> Hello,
>
> I am running a productive cep
Just rpmbuild the latest Samba version with the VFS features enabled. These
modules are stable.
k
Sent from my iPhone
> On 24 Nov 2020, at 10:51, Frank Schilder wrote:
>
> We made the same observation and found out that for CentOS8 there are extra
> modules for samba that provide vfs modules for certa
I added one OSD node to the cluster and got 500 MB/s throughput over my
disks, which was 2 or 3 times better than before, but my latency rose 5
times!
When I enable bluefs_buffered_io, the throughput on the disks drops to
200 MB/s and my latency goes down!
Is there any kernel config/tuning that should be
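For what it's worth, a sketch of how that option is usually toggled and checked
(OSD id is hypothetical; the default for bluefs_buffered_io has changed between
releases, and a restart may be needed for it to apply):
# set the option cluster-wide for OSDs via the monitor config store
ceph config set osd bluefs_buffered_io true
# confirm what a running OSD actually has
ceph daemon osd.0 config get bluefs_buffered_io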