On Sun, 15 Mar 2020 at 14:06, Виталий Филиппов wrote:
> WAL is 1G (you can allocate 2 to be sure), DB should always be 30G. And
> this doesn't depend on the size of the data partition :-)
>
DB should be either 3, 30 or 300 GB depending on how much you can spare on the
fast devices. 30 is probably g
Hi,
I was planning to activate the pg_autoscaler on an EC (6+3) pool which I
created two years ago.
Back then I calculated the total # of PGs for this pool with a target
per-OSD PG count of 150 (this was the recommended per-OSD PG number as far as I
recall).
I used the RedHat ceph pg per pool calculator [
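For anyone repeating that exercise, the calculation behind the calculator is simple enough to do by hand. A hedged sketch (the same formula the public PG calculator uses, as far as I know; the 240-OSD figure in the example is made up):

# Per-pool PG count: target PGs per OSD * number of OSDs * this pool's
# share of the data, divided by the shards per PG, rounded to a power of two.
import math

def pg_count(num_osds, target_pg_per_osd=150, shards=9, data_percent=1.0):
    """shards = k+m for an EC pool (9 for 6+3), or the replica size for a
    replicated pool; data_percent is this pool's share of the cluster data."""
    raw = num_osds * target_pg_per_osd * data_percent / shards
    # round to the nearest power of two (the calculator applies a
    # similar rounding rule)
    return 2 ** max(1, round(math.log2(raw)))

print(pg_count(num_osds=240))   # hypothetical 240 OSDs, EC 6+3 -> 4096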
Over the weekend, all five MGRs failed, which means we have no more
Prometheus monitoring data. We are obviously monitoring the MGR status
as well, so we can detect the failure, but it's still a pretty serious
issue. Any ideas as to why this might happen?
On 13/03/2020 16:56, Janek Bevendorff
This was a bug in 14.2.7 and calculation for EC pools.
It has been fixed in 14.2.8
On Mon, 16 Mar 2020 16:21:41 +0800 Dietmar Rieder
wrote
Hi,
I was planning to activate the pg_autoscaler on an EC (6+3) pool which I
created two years ago.
Back then I calculated the total #
On 3/16/20 5:21 AM, Konstantin Shalygin wrote:
On 3/13/20 8:49 PM, Marc Roos wrote:
Can you also create snapshots via the vfs_ceph solution?
Yes! Since Samba 4.11 this is supported via the vfs_ceph_snapshots module.
Just out of curiosity: We are currently running a samba server with RBD
disks a
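For anyone wanting to try the vfs_ceph / vfs_ceph_snapshots route, here is a minimal smb.conf sketch of how the two modules are typically stacked; the share name, path and CephX user are placeholders, and the exact option names should be checked against the vfs_ceph(8) and vfs_ceph_snapshots(8) man pages for your Samba version:

[cephfs-share]
    ; ceph_snapshots exposes CephFS snapshots as Windows "previous versions";
    ; vfs_ceph talks to CephFS directly via libcephfs instead of a local mount
    vfs objects = ceph_snapshots ceph
    path = /shares/data
    ceph:config_file = /etc/ceph/ceph.conf
    ceph:user_id = samba
    read only = no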
On 3/16/20 4:10 PM, mj wrote:
Just out of curiosity: We are currently running a samba server with
RBD disks as a VM on our proxmox/ceph cluster.
I see the advantage of having vfs_ceph_snapshots of the samba
user-data. But then again: re-sharing data using samba vfs_ceph adds a
layer of comp
Hello all,
We are having an issue with the Ceph Zabbix module: it is failing to send
data. The reason is that in our Zabbix infrastructure we use encryption and
certificate-based agent connections. I can see in the logs on the Zabbix proxy
servers that the sends are failing for that reason.
1329:
Hi Derek,
first of all, some BlueStore design overview to make sure we're on the
same page.
BlueFS doesn't keep all the BlueStore data but just RocksDB part of it.
In your case BlueFS shares the same device with BlueStore user data.
Some space rebalance procedure periodically takes place t
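As a side note, the current split between DB and slow-device usage is visible in the OSD's perf counters. A small sketch, assuming it is run on the OSD host where the admin socket lives (the OSD id is a placeholder; adjust to one of yours):

# Hedged sketch: inspect BlueFS space usage via the OSD admin socket.
import json
import subprocess

def bluefs_usage(osd_id):
    out = subprocess.check_output(
        ["ceph", "daemon", f"osd.{osd_id}", "perf", "dump"])
    bluefs = json.loads(out)["bluefs"]
    gib = 1024 ** 3
    print(f"db used:   {bluefs['db_used_bytes'] / gib:.1f} GiB "
          f"of {bluefs['db_total_bytes'] / gib:.1f} GiB")
    # slow_used_bytes > 0 means RocksDB has spilled onto the slow device
    print(f"slow used: {bluefs.get('slow_used_bytes', 0) / gib:.1f} GiB")

bluefs_usage(716)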
Hi Victor,
1) RocksDB doesn't put L4 on the fast device if it's less than ~ 286 GB,
so no. But, anyway, there's usually no L4, so 30 GB is usually
sufficient. I had ~17 GB block.dbs even for 8 TB hard drives used for
RBD... RGW probably uses slightly more if stored objects are small...
but yo
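Put differently, a given block.db only helps up to the deepest RocksDB level that fits on it in full. A quick sketch of that check, again assuming the usual ~256 MB base and 10x multiplier rather than values read from a real OSD:

# Which RocksDB levels can BlueFS keep entirely on the fast device?
def deepest_level_that_fits(db_size_gb, base_gb=0.256, multiplier=10):
    cumulative, levels = 0.0, 0
    while cumulative + base_gb * multiplier ** levels <= db_size_gb:
        cumulative += base_gb * multiplier ** levels
        levels += 1
    return levels, cumulative

for size in (30, 286, 300):
    levels, used = deepest_level_that_fits(size)
    print(f"{size} GB block.db -> L1..L{levels} fit (~{used:.0f} GB used)")
# ~30 GB covers L1..L3; L4 only starts to fit around the ~286 GB mark.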
13:51' starts looking like it is peering a few PGs; then at '2020-03-15 14:40'
OSD 716 fails, and then, for example, OSD 719 fails 1 min later at
'2020-03-15 14:41'.
[1] - ftp://ftp.umiacs.umd.edu/pub/derek/ceph-osd.716.log-20200316.gz
[2] - ftp://ftp.umiacs.umd.edu/pub/derek/ceph
Oh, didn't realize, Thanks
Dietmar
On 2020-03-16 09:44, Ashley Merrick wrote:
> This was a bug in 14.2.7 and calculation for EC pools.
>
> It has been fixed in 14.2.8
>
>
> On Mon, 16 Mar 2020 16:21:41 +0800 Dietmar Rieder wrote
>
> Hi,
>
> I was planning to activate th
other
OSDs reported(-ing) something similar?
Here is another node which at around '2020-03-15 13:51' starts looking like it
is peering a few PGs; then at '2020-03-15 14:40' OSD 716 fails, and then, for
example, OSD 719 fails 1 min later at '2020-03-15 14:41'.
[1] -
Hi Wido,
can you please share some detailed instructions how to do this?
And what do you mean with "respect your failure domain"?
THX
On 04.03.2020 at 11:27, Wido den Hollander wrote:
> On 3/4/20 11:15 AM, Thomas Schneider wrote:
>> Hi,
>>
>> Ceph balancer is not working correctly; there's an o
Hi Dan,
I have opened this bug report for the balancer not working as expected.
https://tracker.ceph.com/issues/43586
Then I thought it could make sense to balance the cluster manually by
means of moving PGs from a heavily loaded OSD to another.
I found your slides "Luminous: pg upmap (dev)
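For others following along: the primitive those slides describe is the explicit upmap entry, which pins a replacement OSD for a single PG. A hedged sketch of using it to move one PG off an overloaded OSD (the PG id and OSD numbers are made up; upmap needs luminous-or-newer clients, i.e. ceph osd set-require-min-compat-client luminous):

# Hedged sketch: move one PG via an explicit upmap entry.
import subprocess

def move_pg(pgid, from_osd, to_osd):
    # same as: ceph osd pg-upmap-items <pgid> <from> <to>
    subprocess.run(["ceph", "osd", "pg-upmap-items",
                    pgid, str(from_osd), str(to_osd)], check=True)

def undo(pgid):
    # same as: ceph osd rm-pg-upmap-items <pgid>
    subprocess.run(["ceph", "osd", "rm-pg-upmap-items", pgid], check=True)

move_pg("1.2f", 74, 12)   # remap PG 1.2f: replace osd.74 with osd.12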
On 3/16/2020 3:25 PM, vita...@yourcmc.ru wrote:
Hi Victor,
1) RocksDB doesn't put L4 on the fast device if it's less than ~ 286
GB, so no. But, anyway, there's usually no L4, so 30 GB is usually
sufficient. I had ~17 GB block.dbs even for 8 TB hard drives used for
RBD... RGW probably uses s
Hi Thomas,
I lost track of your issue. Are you just trying to balance the PGs?
14.2.8 has big improvements -- check the release notes / blog post
about setting the upmap_max_deviations down to 2 or 1.
-- Dan
On Mon, Mar 16, 2020 at 4:00 PM Thomas Schneider <74cmo...@gmail.com> wrote:
>
> Hi Dan,
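For reference, the knob Dan mentions is a balancer-module option set through the MGR config. A sketch of what that looks like; the option name below (singular upmap_max_deviation) is my assumption and worth double-checking against the 14.2.8 release notes:

# Hedged sketch: tighten the upmap balancer after upgrading to 14.2.8.
import subprocess

def ceph(*args):
    subprocess.run(["ceph", *args], check=True)

ceph("balancer", "mode", "upmap")
ceph("config", "set", "mgr", "mgr/balancer/upmap_max_deviation", "1")
ceph("balancer", "on")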
Hi Dan,
indeed I'm trying to balance the PGs.
In order to keep the Ceph cluster operational I used OSD reweight, which means
some specific OSDs are now at reweight 0.8 and 0.9 respectively.
Question:
Can I upgrade to Ceph 14.2.8 w/o resetting the weight to 1.0?
Or should I clean up this reweight first,
Hi,
I would upgrade, configure the balancer correctly, then wait a bit for
it to smooth things out.
Afterwards you can reweight back to 1.0.
-- dan
On Mon, Mar 16, 2020 at 4:19 PM Thomas Schneider <74cmo...@gmail.com> wrote:
>
> Hi Dan,
>
> indeed I'm trying to balance the PGs.
>
> In order to ens
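Once the balancer has evened things out, the remaining reweight overrides can be found and reset in one go. A rough sketch that just shells out to the ceph CLI and assumes the usual nodes[].id / nodes[].reweight fields in the "ceph osd df" JSON output; it only prints by default:

# Hedged sketch: reset reweight overrides to 1.0 once the balancer is happy.
import json
import subprocess

def reset_reweights(apply=False):
    out = subprocess.check_output(["ceph", "osd", "df", "--format", "json"])
    for node in json.loads(out)["nodes"]:
        if node["reweight"] < 1.0:
            print(f"osd.{node['id']}: reweight {node['reweight']:.2f} -> 1.0")
            if apply:
                subprocess.run(["ceph", "osd", "reweight",
                                str(node["id"]), "1.0"], check=True)

reset_reweights(apply=False)   # set apply=True to actually change weights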
He means that if, e.g., you enforce 1 copy of a PG per rack, any upmaps you
enter must not result in 2 or 3 copies in the same rack. If your CRUSH policy is
one copy per *host*, the danger is even higher that data could become
unavailable or even lost in case of a failure.
> On Mar 16, 2020,
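One way to sanity-check an upmap against the failure domain is to look at where the PG's OSDs actually sit in the CRUSH tree. A rough sketch, assuming "ceph pg map" reports an "up" set in JSON and "ceph osd find" reports a crush_location containing your failure-domain key (here "host"; use "rack" as appropriate):

# Hedged sketch: warn if a PG's up set puts two shards in one failure domain.
import json
import subprocess

def check_pg(pgid, domain="host"):
    pg = json.loads(subprocess.check_output(
        ["ceph", "pg", "map", pgid, "--format", "json"]))
    locations = {}
    for osd in pg["up"]:
        info = json.loads(subprocess.check_output(
            ["ceph", "osd", "find", str(osd), "--format", "json"]))
        locations.setdefault(info["crush_location"].get(domain), []).append(osd)
    for dom, osds in locations.items():
        if len(osds) > 1:
            print(f"WARNING: {pgid} has osds {osds} all in {domain} {dom}")

check_pg("1.2f")   # placeholder PG id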
Hi Igor,
On 3/16/20 10:34 AM, Igor Fedotov wrote:
> I can suggest the following non-straightforward way for now:
>
> 1) Check osd startup log for the following line:
>
> 2020-03-15 14:43:27.845 7f41bb6baa80 1
> bluestore(/var/lib/ceph/osd/ceph-681) _open_alloc loaded 23 GiB in 97
> extents
>
>
Hi,
thanks for this clarification.
I'm running a 7-node cluster and this risk should be manageable.
On 16.03.2020 at 16:57, Anthony D'Atri wrote:
> He means that if eg. you enforce 1 copy of a PG per rack, that any upmaps you
> enter don’t result in 2 or 3 in the same rack. If your CRUSH poil
On 2020-03-03 13:36, Abhishek Lekshmanan wrote:
>
> This is the eighth update to the Ceph Nautilus release series. This release
> fixes issues across a range of subsystems. We recommend that all users upgrade
> to this release. Please note the following important changes in this
> release; as alwa
Hi!
I'm stuck with "no available blob id" during the start of an OSD.
It seems there's a workaround backported only to Nautilus (bug
https://tracker.ceph.com/issues/38272), but I use Mimic for now.
Does someone have an operational workaround?
Or should I recreate my OSD?
And what is the easiest way
I've been trying the upmap balancer on a new Nautilus cluster. We have three
main pools: a triple replicated pool (id 1) and two 6+3 erasure coded
pools (ids 4 and 5). The balancer does a very nice job on the triple
replicated pool, but does something strange on the EC pools. Here is a
sample of