This is the seventh update to the Ceph Nautilus release series. This is
a hotfix release primarily fixing a couple of security issues. We
recommend that all users upgrade to this release.
Notable Changes
---
* CVE-2020-1699: Fixed a path traversal flaw in Ceph dashboard that
  could allow for potential information disclosure (Ernesto Puerta)
Update: the primary data pool (con-fs2-meta2) does store data:
NAME            ID  USED     %USED  MAX AVAIL  OBJECTS
con-fs2-meta1   12  240 MiB   0.02    1.1 TiB     6437
con-fs2-meta2   13      0 B   0       373 TiB    72167
con-fs2-data    14  103 GiB   0.01    894 TiB
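The listing above appears to be the POOLS section of "ceph df" output (USED, %USED, MAX AVAIL, OBJECTS); assuming that is the source, the same per-pool view can be re-checked with:

    ceph df detail
    rados df          # per-pool object counts and I/O statistics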
On Fri, Jan 31, 2020 at 4:57 PM Dan van der Ster wrote:
>
> Hi Ilya,
>
> On Fri, Jan 31, 2020 at 11:33 AM Ilya Dryomov wrote:
> >
> > On Fri, Jan 31, 2020 at 11:06 AM Dan van der Ster
> > wrote:
> > >
> > > Hi all,
> > >
> > > We are quite regularly (a couple times per week) seeing:
> > >
> > >
Dear Gregory and Philip,
I'm also experimenting with a replicated primary data pool and an erasure-coded
secondary data pool. I make the same observation with regards to objects and
activity as Philip. However, it does seem to make a difference. If I run a very
aggressive fio test as in:
fio -
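The exact command is cut off above; purely as an illustrative sketch (not the poster's command, and with a made-up test directory on a CephFS mount), an aggressive small-block random-write run could look like:

    fio --name=randwrite-test --directory=/mnt/cephfs/fiotest \
        --ioengine=libaio --direct=1 --rw=randwrite --bs=4k \
        --iodepth=64 --numjobs=8 --size=2G --time_based --runtime=120 \
        --group_reporting

With several jobs and a deep queue this pushes a lot of small writes at once, which is the kind of load where the choice of primary data pool would show up.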
Hi Ilya,
On Fri, Jan 31, 2020 at 11:33 AM Ilya Dryomov wrote:
>
> On Fri, Jan 31, 2020 at 11:06 AM Dan van der Ster wrote:
> >
> > Hi all,
> >
> > We are quite regularly (a couple times per week) seeing:
> >
> > HEALTH_WARN 1 clients failing to respond to capability release; 1 MDSs
> > report slow requests
I tried that (and just tried again by setting it in /etc/ceph/ceph.conf). OSD
still won’t start.
Dr. T.J. Ragan
Senior Research Computation Officer
Leicester Institute of Structural and Chemical Biology
University of Leicester, University Road, Leicester LE1 7RH, UK
t: +44 (0)116 223 1287
e:
Was probably an over-paranoid question. The upgrade 13.2.2 -> 13.2.8 went
smoothly. Only this one didn't do what was expected:
# ceph osd set pglog_hardlimit
Invalid command: pglog_hardlimit not in
full|pause|noup|nodown|noout|noin|nobackfill|norebalance|norecover|noscrub|nodeep-scrub|notieragen
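A likely reason, not spelled out in the message, is that the mon only accepts the pglog_hardlimit flag once it runs a release that knows about it (13.2.5 or later); after the whole cluster is on 13.2.8, something like the following should work:

    ceph versions                 # every mon, mgr and osd should report 13.2.8
    ceph osd set pglog_hardlimit
    ceph osd dump | grep flags    # the flags line should now include pglog_hardlimit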
Hello,
in my cluster, one OSD after another dies; eventually I realized that it
was simply an "abort" in the daemon, probably caused by
2020-01-31 15:54:42.535930 7faf8f716700 -1 log_channel(cluster) log
[ERR] : trim_object Snap 29c44 not in clones
Close to this message I get a stack trace:
ceph ver
If you don't care about the data: set
osd_find_best_info_ignore_history_les = true on the affected OSDs
temporarily.
This means losing data.
For anyone else reading this: don't ever use this option. It's evil
and causes data loss (but gets your PG back and active, yay!)
Paul
--
Paul Emmerich
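As a hedged sketch of what "temporarily" means here (osd.123 is a placeholder ID): the option is normally applied via ceph.conf on the affected host plus an OSD restart, and reverted as soon as the PG has peered:

    # /etc/ceph/ceph.conf on the host carrying the affected OSD
    [osd.123]
        osd_find_best_info_ignore_history_les = true

    systemctl restart ceph-osd@123
    # once the PG is active again: remove the option and restart the OSD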
The use case is for KVM RBD volumes.
Our environment will be roughly 80% random reads/writes; 40/60 or 30/70 is probably a
good estimate. All 4k-8k IO sizes. We currently run on a Nimble Hybrid array
which runs in the 5k-15k IOPS range with some spikes up to 20-25k IOPS (Capable
of 100k iops per Nimb
The RocksDB levels are 256MB, 2.5GB, 25GB, and 250GB. Unless you have a
workload that uses a lot of metadata, taking care of the first 3 and providing
room for compaction should be fine. To allow for compaction room, 60GB should
be sufficient. Add 4GB to accommodate WAL and you're at a nice m
vitalif@yourcmc.ru wrote:
> I think 800 GB NVMe per 2 SSDs is an overkill. 1 OSD usually only
> requires 30 GB block.db, so 400 GB per OSD is a lot. On the other
> hand, does 7300 have twice the iops of 5300? In fact, I'm not sure if a
> 7300 + 5300 OSD will perform better than just a 5300 OSD at all.
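As a hedged example of acting on the sizing above (device names are placeholders, not a recommendation for this thread): a roughly 64GB partition per OSD can be handed to ceph-volume as the DB device, and the WAL will co-locate on it:

    ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/nvme0n1p1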
Hi All,
Long story short, we’re doing disaster recovery on a CephFS cluster, and are at
a point where we have 8 pgs stuck incomplete. Just before the disaster, I
increased the pg_count on two of the pools, and they had not completed
increasing the pgp_num yet. I’ve since forced pgp_num to the
Ok, so 100G seems to be the better choice. I will probably go with some of
these.
https://www.fs.com/products/75808.html
From: "Paul Emmerich"
To: "EDH"
Cc: "adamb" , "ceph-users"
Sent: Friday, January 31, 2020 8:49:29 AM
Subject: Re: [c
Hello Adam,
Can you describe what performance values you want to gain out of your
cluster?
What's the use case?
EC or replica?
In general, more disks are preferred over bigger ones.
As Micron has not provided us with demo hardware, we can't say how fast
these disks are in reality. Before I thin
I think 800 GB NVMe per 2 SSDs is an overkill. 1 OSD usually only
requires 30 GB block.db, so 400 GB per OSD is a lot. On the other
hand, does 7300 have twice the iops of 5300? In fact, I'm not sure if a
7300 + 5300 OSD will perform better than just a 5300 OSD at all.
It would be interestin
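As a generic sketch for that kind of comparison (not from this thread, the device path is a placeholder, and the test destroys data on the device), the usual way to compare such drives for OSD use is a single-job sync-write test:

    fio --name=syncwrite --filename=/dev/nvme0n1 --ioengine=libaio \
        --direct=1 --sync=1 --rw=write --bs=4k --iodepth=1 --numjobs=1 \
        --time_based --runtime=60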
On Fri, Jan 31, 2020 at 2:06 PM EDH - Manuel Rios
wrote:
>
> Hmm, change 40Gbps to 100Gbps networking.
>
> 40Gbps technology is just a bond of 4x10 links, with some latency due to link
> aggregation.
> 100Gbps and 25Gbps have lower latency and good performance. In Ceph, about 50% of
> the latency comes from network commits and the other 50% from disk commits.
Please check whether you support RDMA to improve access.
A 40Gbps transceiver is internally 4x10 ports; that's why you can split a 40Gbps
switch port into 4x10 multiports over the same link.
25G is a newer base technology with latency improvements over 10Gbps.
Regards
Manuel
From: Adam Boyhan
Appreciate the input.
Looking at those articles, they make me feel like the 40G they are talking about
is 4x bonded 10G connections.
I'm looking at 40Gbps without bonding for throughput. Is that still the same?
https://www.fs.com/products/29126.html
Hmm, change 40Gbps to 100Gbps networking.
40Gbps technology is just a bond of 4x10 links, with some latency due to link
aggregation.
100Gbps and 25Gbps have lower latency and good performance. In Ceph, about 50% of the
latency comes from network commits and the other 50% from disk commits.
A quick graph:
Looking to roll out an all-flash Ceph cluster. Wanted to see if anyone else is
using Micron drives, and to get some basic input on my design so far.
Basic Config
Ceph OSD Nodes
8x Supermicro A+ Server 2113S-WTRT
- AMD EPYC 7601 32 Core 2.2GHz
- 256G Ram
- AOC-S3008L-L8e HBA
- 10Gb SFP+ for
I am seeing very few such error messages in the mon logs (~ a couple per
day).
If I issue the command "ceph daemon osd.$id dump_osd_network" on every OSD
with the default 1000 ms threshold, I don't see any entries.
I guess this is because that command only considers the last (15?) minutes.
Am I sup
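Not part of the message above, but for reference: dump_osd_network takes an optional threshold argument in milliseconds, so passing 0 dumps every tracked ping time instead of only those above 1000 ms. A sketch for sweeping all OSDs on a host (default socket paths assumed):

    for sock in /var/run/ceph/ceph-osd.*.asok; do
        echo "== $sock =="
        ceph daemon "$sock" dump_osd_network 0   # 0 ms threshold: show all entries
    done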
On Fri, Jan 31, 2020 at 11:06 AM Dan van der Ster wrote:
>
> Hi all,
>
> We are quite regularly (a couple times per week) seeing:
>
> HEALTH_WARN 1 clients failing to respond to capability release; 1 MDSs
> report slow requests
> MDS_CLIENT_LATE_RELEASE 1 clients failing to respond to capability release
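Not from the thread itself, but the usual way to see which session is holding the caps is to ask the active MDS directly, and only as a last resort evict it (the session id below is a placeholder; eviction blocklists the client):

    ceph tell mds.0 client ls                 # inspect num_caps and the client named in the warning
    ceph tell mds.0 client evict id=1234      # last resort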
Turns out it is probably orphans.
We are running Ceph Luminous 12.2.12,
and the orphans find has been stuck in the stage "iterate_bucket_index" on shard
"0" for 2 days now.
Is anyone else facing this issue?
Regards,
From: ceph-users <ceph-users-boun...@lists.ceph.com>
Sent: 21 January 2020
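For context (job name and pool are placeholders, not taken from the message), the luminous-era orphan scan is driven entirely through radosgw-admin, and a stuck job can at least be inspected and cleaned up:

    radosgw-admin orphans find --pool=default.rgw.buckets.data --job-id=orphans1
    radosgw-admin orphans list-jobs                 # is the job still registered?
    radosgw-admin orphans finish --job-id=orphans1  # remove the job's own metadata when done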
Dear all,
is it possible to upgrade from 13.2.2 directly to 13.2.8 after setting "ceph
osd set pglog_hardlimit" (mimic 13.2.5 release notes), or do I need to follow
this path:
13.2.2 -> 13.2.5 -> 13.2.6 -> 13.2.8?
Thanks!
=
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14
Hi all,
We are quite regularly (a couple times per week) seeing:
HEALTH_WARN 1 clients failing to respond to capability release; 1 MDSs
report slow requests
MDS_CLIENT_LATE_RELEASE 1 clients failing to respond to capability release
mdshpc-be143(mds.0): Client hpc-be028.cern.ch: failing to respond to capability release
Hi,
On 1/31/20 12:09 AM, Nigel Williams wrote:
Did you end up having all new IPs for your MONs? I've wondered how
a large KVM deployment should be handled when the instance metadata
has a hard-coded list of MON IPs for the cluster. How are they changed
en masse with running VMs? Or do these move