Hi!
I am planning a new Flash-based cluster. In the past we used SAMSUNG PM863a
480G as journal drives in our HDD cluster.
After a lot of tests with luminous and bluestore on HDD clusters, we plan
to re-deploy our whole RBD pool (OpenNebula cloud) using these disks.
As far as I understand, it wou
2018-02-02 12:44 GMT+01:00 Richard Hesketh :
> On 02/02/18 08:33, Kevin Olbrich wrote:
> > Hi!
> >
> > I am planning a new Flash-based cluster. In the past we used SAMSUNG
> PM863a 480G as journal drives in our HDD cluster.
> > After a lot of tests with luminous and
Hi!
Currently I am trying to re-deploy a cluster from filestore to bluestore.
I zapped all disks (multiple times) but adding a disk from the array fails:
Prepare:
> ceph-deploy --overwrite-conf osd prepare --bluestore --block-wal /dev/sdb
> --block-db /dev/sdb osd01.cloud.example.local:/dev/mapper/mpatha
Ac
I also noticed there are no folders under /var/lib/ceph/osd/ ...
Mit freundlichen Grüßen / best regards,
Kevin Olbrich.
2018-02-04 19:01 GMT+01:00 Kevin Olbrich :
> Hi!
>
> Currently I try to re-deploy a cluster from filestore to bluestore.
> I zapped all disks (multiple times
Partitions 1 and 2 were not added (this disk has only two partitions).
Should I open a bug?
Kind regards,
Kevin
2018-02-04 19:05 GMT+01:00 Kevin Olbrich :
> I also noticed there are no folders under /var/lib/ceph/osd/ ...
>
>
> Mit freundlichen Grüßen / best regards,
> Kevi
Would be interested as well.
- Kevin
2018-02-04 19:00 GMT+01:00 Yoann Moulin :
> Hello,
>
> What is the best kernel for Luminous on Ubuntu 16.04 ?
>
> Is linux-image-virtual-lts-xenial still the best one ? Or
> linux-virtual-hwe-16.04 will offer some improvement ?
>
> Thanks,
>
> --
> Yoann Moul
2018-02-08 11:20 GMT+01:00 Martin Emrich :
> I have a machine here mounting a Ceph RBD from luminous 12.2.2 locally,
> running linux-generic-hwe-16.04 (4.13.0-32-generic).
>
> Works fine, except that it does not support the latest features: I had to
> disable exclusive-lock,fast-diff,object-map,de
OSDs (and setting size
to 3).
I want to make sure we can tolerate two hosts being offline (in terms of hardware).
Is my assumption correct?
Mit freundlichen Grüßen / best regards,
Kevin Olbrich.
regards,
Kevin Olbrich.
>
> Original Message
> Subject: Re: [ceph-users] degraded objects after osd add (17-Nov-2016 9:14)
> From: Burkhard Linke
> To: c...@dolphin-it.de
>
> Hi,
>
>
> On 11/17/2016 08:07 AM, Steffen Weißgerber wrot
them run remote services (terminal).
My question is: are 80 VMs hosted on 53 disks (mostly 7.2k SATA) too much?
We sometimes experience lags where nearly all servers suffer from "blocked
IO > 32 seconds".
What are your experiences?
Mit freundlichen Grüßen / best regards,
Hi!
I want to deploy two nodes with 4 OSDs each. I already prepared OSDs and
only need to activate them.
What is better? One by one or all at once?
Kind regards,
Kevin.
I need to note that I already have 5 hosts with one OSD each.
Mit freundlichen Grüßen / best regards,
Kevin Olbrich.
2016-11-28 10:02 GMT+01:00 Kevin Olbrich :
> Hi!
>
> I want to deploy two nodes with 4 OSDs each. I already prepared OSDs and
> only need to activate them.
> What
is safe regardless of full outage.
Mit freundlichen Grüßen / best regards,
Kevin Olbrich.
2016-12-07 21:10 GMT+01:00 Wido den Hollander :
>
> > Op 7 december 2016 om 21:04 schreef "Will.Boege" >:
> >
> >
> > Hi Wido,
> >
> > Just curious how
Hi,
just in case: What happens when all replica journal SSDs are broken at once?
The PGs most likely will be stuck inactive but as I read, the journals just
need to be replaced (http://ceph.com/planet/ceph-recover-osds-after-ssd-
journal-failure/).
Does this also work in this case?
Kind regards,
Ok, thanks for your explanation!
I read those warnings about size 2 + min_size 1 (we are using ZFS raidz2, a
RAID6 equivalent, as the OSD backing).
Time to raise replication!
Kevin
2016-12-13 0:00 GMT+01:00 Christian Balzer :
> On Mon, 12 Dec 2016 22:41:41 +0100 Kevin Olbrich wrote:
>
> > Hi,
>
2016-12-14 2:37 GMT+01:00 Christian Balzer :
>
> Hello,
>
Hi!
>
> On Wed, 14 Dec 2016 00:06:14 +0100 Kevin Olbrich wrote:
>
> > Ok, thanks for your explanation!
> > I read those warnings about size 2 + min_size 1 (we are using ZFS as
> RAID6,
> > called
Hi!
I tried to convert a qcow2 file to rbd and set the wrong pool.
I stopped the transfer immediately, but the image is stuck locked:
Previously when that happened, I was able to remove the image after 30 seconds.
[root@vm2003 images1]# rbd -p rbd_vms_hdd lock list fpi_server02
There is 1 exclusive
ope id suffix within the address.
>
> On Mon, Jul 9, 2018 at 2:47 PM Kevin Olbrich wrote:
>
>> Hi!
>>
>> I tried to convert an qcow2 file to rbd and set the wrong pool.
>> Immediately I stopped the transfer but the image is stuck locked:
>>
>> Previusl
and IPv6 addresses
>> since it is failing to parse the address as valid. Perhaps it's barfing on
>> the "%eth0" scope id suffix within the address.
>>
>> On Mon, Jul 9, 2018 at 2:47 PM Kevin Olbrich wrote:
>>
>>> Hi!
>>>
>>> I tri
ink local when there is an ULA-prefix available.
The address is available on brX on this client node.
- Kevin
> On Mon, Jul 9, 2018 at 3:43 PM Kevin Olbrich wrote:
>
>> 2018-07-09 21:25 GMT+02:00 Jason Dillaman :
>>
>>> BTW -- are you running Ceph on a one-node computer
2018-07-10 14:37 GMT+02:00 Jason Dillaman :
> On Tue, Jul 10, 2018 at 2:37 AM Kevin Olbrich wrote:
>
>> 2018-07-10 0:35 GMT+02:00 Jason Dillaman :
>>
>>> Is the link-local address of "fe80::219:99ff:fe9e:3a86%eth0" at least
>>> present on the clien
Sounds a little bit like the problem I had on OSDs:
[ceph-users] Blocked requests activating+remapped after extending pg(p)_num
<http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-May/026680.html>
Kevin Olbrich
You can keep the same layout as before. Most place DB/WAL combined in one
partition (similar to the journal on filestore).
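A minimal sketch of that layout with ceph-volume, assuming one small DB/WAL
partition per OSD on the former journal SSD (device names and sizes are
examples, not from this thread):

# create one DB/WAL partition per OSD on the former journal SSD (size is an example)
sgdisk --new=0:0:+60G /dev/nvme0n1
# when only --block.db is given, the WAL is co-located on the same partition
ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/nvme0n1p1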
Kevin
2018-07-13 12:37 GMT+02:00 Robert Stanford :
>
> I'm using filestore now, with 4 data devices per journal device.
>
> I'm confused by this: "BlueStore manages either
Hi,
why do I see activating followed by peering during OSD add (refill)?
I did not change pg(p)_num.
Is this normal? From my other clusters, I don't think that happened...
Kevin
PS: It's luminous 12.2.5!
Mit freundlichen Grüßen / best regards,
Kevin Olbrich.
2018-07-14 15:19 GMT+02:00 Kevin Olbrich :
> Hi,
>
> why do I see activating followed by peering during OSD add (refill)?
> I did not change pg(p)_num.
>
> Is this normal? From my other
Hi,
on upgrading from 12.2.4 to 12.2.5 the balancer module broke (the mgr crashes
minutes after the service starts).
The only solution was to disable the balancer (the service has been running
fine since).
Is this fixed in 12.2.7?
I was unable to locate the bug in the bug tracker.
Kevin
2018-07-17 18:28 GMT+02:00 Abhishek
Am Fr., 10. Aug. 2018 um 19:29 Uhr schrieb :
>
>
> Am 30. Juli 2018 09:51:23 MESZ schrieb Micha Krause :
> >Hi,
>
> Hi Micha,
>
> >
> >I'm Running 12.2.5 and I have no Problems at the moment.
> >
> >However my servers reporting daily that they want to upgrade to 12.2.7,
> >is this save or should I
Hi!
I am in the process of moving a local ("large", 24x1TB) ZFS RAIDZ2 to
CephFS.
This storage is used for backup images (large sequential reads and writes).
To save space and have a RAIDZ2 (RAID6) like setup, I am planning the
following profile:
ceph osd erasure-code-profile set myprofile \
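The parameters above were cut off; a RAIDZ2-like profile (two coding chunks)
might look like the following sketch, where k, m, and the pool name are
assumptions rather than the original values:

ceph osd erasure-code-profile set myprofile \
    k=6 \
    m=2 \
    crush-failure-domain=host
ceph osd pool create cephfs_data_ec 256 256 erasure myprofile
# required so CephFS/RBD can write partial objects to the EC pool
ceph osd pool set cephfs_data_ec allow_ec_overwrites true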
Hi!
During our move from filestore to bluestore, we removed several Intel P3700
NVMe from the nodes.
Is someone running a SPDK/DPDK NVMe-only EC pool? Is it working well?
The docs are very short about the setup:
http://docs.ceph.com/docs/master/rados/configuration/bluestore-config-ref/#spdk-usage
Hi!
Today one of our nfs-ganesha gateways experienced an outage and has since
crashed every time the client behind it tries to access the data.
This is a Ceph Mimic cluster with nfs-ganesha from ceph-repos:
nfs-ganesha-2.6.2-0.1.el7.x86_64
nfs-ganesha-ceph-2.6.2-0.1.el7.x86_64
There were fixes for th
Hi!
is the compressible hint / incompressible hint supported on qemu+kvm?
http://docs.ceph.com/docs/mimic/rados/configuration/bluestore-config-ref/
If not, only aggressive would work in this case for rbd, right?
Kind regards
Kevin
Hi!
Currently I have a cluster with four hosts and 4x HDDs + 4 SSDs per host.
I also have replication rules to distinguish between HDD and SSD (and
failure-domain set to rack) which are mapped to pools.
What happens if I add a heterogeneous host with 1x SSD and 1x NVMe (where
NVMe will be a new d
To answer my own question:
ceph osd crush tree --show-shadow
Sorry for the noise...
Am Do., 20. Sep. 2018 um 14:54 Uhr schrieb Kevin Olbrich :
> Hi!
>
> Currently I have a cluster with four hosts and 4x HDDs + 4 SSDs per host.
> I also have replication rules to distinguish between
r example, if you have a hierarchy like root --> host1, host2, host3
> --> nvme/ssd/sata OSDs, then you'll actually have 3 trees:
>
> root~ssd -> host1~ssd, host2~ssd ...
> root~sata -> host~sata, ...
>
>
> Paul
>
> 2018-09-20 14:54 GMT+02:00 Kevin Olbrich :
Hi!
Is it possible to set data-pool for ec-pools on qemu-img?
For repl-pools I used "qemu-img convert" to convert from e.g. vmdk to raw
and write to rbd/ceph directly.
The rbd utility is able to do this for raw or empty images but without
convert (converting 800G and writing it again would now ta
there a better way?
Kevin
Am So., 23. Sep. 2018 um 18:08 Uhr schrieb Paul Emmerich
:
>
> The usual trick for clients not supporting this natively is the option
> "rbd_default_data_pool" in ceph.conf which should also work here.
>
>
> Paul
> Am So., 23. Sep. 2018 um
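For reference, a sketch of that workaround; the data pool name is an example,
not from the thread:

# ceph.conf on the host running qemu-img: image metadata stays in the
# replicated pool, data objects land in the EC pool
[client]
rbd default data pool = rbd_ec_data

# then convert into the replicated pool as usual
qemu-img convert -p -O raw /target/test-vm.qcow2 rbd:rbd_vms_ssd_01/test_vm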
Hi!
Yesterday one of our (non-priority) clusters failed when 3 OSDs went down
(EC 8+2) together.
*This is strange as we did an upgrade from 13.2.1 to 13.2.2 one or two
hours before.*
They failed exactly at the same moment, rendering the cluster unusable
(CephFS).
We are using CentOS 7 with latest
Small addition: the failing disks are in the same host.
This is a two-host cluster with the failure domain set to OSD.
Am Mi., 3. Okt. 2018 um 10:13 Uhr schrieb Kevin Olbrich :
> Hi!
>
> Yesterday one of our (non-priority) clusters failed when 3 OSDs went down
> (EC 8+2) together.
> *This is st
se LVM volumes?
>
> On 10/3/2018 11:22 AM, Kevin Olbrich wrote:
>
> Small addition: the failing disks are in the same host.
> This is a two-host, failure-domain OSD cluster.
>
>
> Am Mi., 3. Okt. 2018 um 10:13 Uhr schrieb Kevin Olbrich :
>
>> Hi!
>>
>>
place all. Most of the current disks are of the same age.
Kevin
Am Mi., 3. Okt. 2018 um 13:52 Uhr schrieb Paul Emmerich <
paul.emmer...@croit.io>:
> There's "ceph-bluestore-tool repair/fsck"
>
> In your scenario, a few more log files would be interesting: try
> setting debug
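For reference, the invocations Paul mentions look roughly like this; run them
against a stopped OSD (the OSD id/path is an example):

systemctl stop ceph-osd@12
ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-12
ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-12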
Hi!
Is there an easy way to find raw disks (e.g. sdd/sdd1) by OSD id?
Before I migrated from filestore with simple-mode to bluestore with lvm, I
was able to find the raw disk with "df".
Now, I need to go from LVM LV to PV to disk every time I need to
check/smartctl a disk.
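A sketch of two ways to do the lookup, assuming a Luminous or newer cluster
(the OSD id is an example):

# resolves the LVM chain back to the physical device for every OSD on this host
ceph-volume lvm list
# or ask the cluster for a single OSD's backing device
ceph osd metadata 12 | grep -E '"devices"|dev_node'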
Kevin
> Wido
>
> On 10/08/2018 12:01 PM, Kevin Olbrich wrote:
> > Hi!
> >
> > Is there an easy way to find raw disks (eg. sdd/sdd1) by OSD id?
> > Before I migrated from filestore with simple-mode to bluestore with lvm,
> > I was able to find the raw disk with "df&
kub
>
> pon., 8 paź 2018, 19:32 użytkownik Alfredo Deza
> napisał:
>
>> On Mon, Oct 8, 2018 at 6:09 AM Kevin Olbrich wrote:
>> >
>> > Hi!
>> >
>> > Yes, thank you. At least on one node this works, the other node just
>> freezes but this mi
I had a similar problem:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-September/029698.html
But even the recent 2.6.x releases were not working well for me (many, many
segfaults). I am on the master branch (2.7.x) and that works well with far
fewer crashes.
Cluster is 13.2.1/.2 with nfs-ganes
Hi!
On a new cluster, I get the following error. All three mons are connected to
the same switch and ping between them works (firewalls disabled).
The mon nodes are Ubuntu 16.04 LTS on Ceph Luminous.
[ceph_deploy.mon][ERROR ] Some monitors have still not reached quorum:
[ceph_deploy.mon][ERROR ] mon03
"0.0.0.0:0/2",
[mon01][DEBUG ] "rank": 2
[mon01][DEBUG ] }
[mon01][DEBUG ] ]
DNS is working fine and the hostnames are also listed in /etc/hosts.
I already purged the mon but still the same problem.
- Kevin
2018-02-23 10:26 GMT+01:00 Kevin Olbrich :
> Hi!
I found a fix: it is *mandatory* to set the public network to the same
network the mons use.
Skipping this while the mon has another network interface writes garbage
into the monmap.
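A minimal sketch of the relevant ceph.conf stanza; the subnet is an example:

# ceph.conf: the public network must cover the addresses the mons bind to
[global]
public network = 192.168.10.0/24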
- Kevin
2018-02-23 11:38 GMT+01:00 Kevin Olbrich :
> I always see this:
>
> [mon01][DEBUG ] "mon
Hi,
how can I back up the dmcrypt keys on luminous?
The folder under /etc/ceph does not exist anymore.
Kind regards
Kevin
Hi!
On a small cluster I have an Intel P3700 as the journaling device for 4
HDDs.
While using filestore, I used it as journal.
On bluestore, is it safe to move both Block-DB and WAL to this journal NVMe?
Easy maintenance is first priority (on filestore we just had to flush and
replace the SSD).
k-DB and WAL to this journal
> NVMe?
> Yes, just specify block-db with ceph-volume and wal also use that
> partition. You can put 12-18 HDDs per NVMe
>
> >What happens im the NVMe dies?
> You lost OSDs backed by that NVMe and need to re-add them to cluster.
>
> On Th
Hi!
Yesterday I deployed 3x SSDs as OSDs without issues, but today I get this
error when deploying an HDD with separate WAL/DB:
stderr: 2018-04-26 11:58:19.531966 7fe57e5f5e00 -1
bluestore(/var/lib/ceph/osd/ceph-0/) _read_fsid unparsable uuid
Command:
ceph-deploy --overwrite-conf osd create --dmcrypt --blue
Hi!
Today I added some new OSDs (nearly doubled) to my luminous cluster.
I then changed pg(p)_num from 256 to 1024 for that pool because it was
complaining about too few PGs. (I have since realized this should have been
done in smaller steps.)
This is the current status:
health: HEALTH_ERR
336
> Hi,
>
>
>
> On 05/17/2018 01:09 PM, Kevin Olbrich wrote:
>
>> Hi!
>>
>> Today I added some new OSDs (nearly doubled) to my luminous cluster.
>> I then changed pg(p)_num from 256 to 1024 for that pool because it was
>> complaining about to few PGs. (
PS: The cluster is currently size 2. I used PGCalc on the Ceph website which,
by default, will place 200 PGs on each OSD.
I read about the protection in the docs and later realized I should have
placed only 100 PGs.
2018-05-17 13:35 GMT+02:00 Kevin Olbrich :
> Hi!
>
> Thanks for your qu
max_pg_per_osd_hard_ratio 32'
Sure, mon_max_pg_per_osd is oversized but this is just temporary.
Calculated PGs per OSD is 200.
I searched the net and the bugtracker but most posts suggest
osd_max_pg_per_osd_hard_ratio
= 32 to fix this issue but this time, I got more stuck PGs.
Any more hints?
K
but
why are they failing to proceed to active+clean or active+remapped?
Kind regards,
Kevin
2018-05-17 14:05 GMT+02:00 Kevin Olbrich :
> Ok, I just waited some time but I still got some "activating" issues:
>
> data:
> pools: 2 pools, 1536 pgs
> objec
up 1.0 1.0
>> 21 ssd 0.43700 osd.21   up 1.0 1.0
>> 22 ssd 0.43700 osd.22   up 1.0 1.0
>>
>>
>> Pools are size 2, min_size 1 during setup.
>>
>> The count of PGs in activate st
Hi!
When we installed our new luminous cluster, we had issues with the cluster
network (setup of the mons failed).
We moved on with a single network setup.
Now I would like to set the cluster network again but the cluster is in use
(4 nodes, 2 pools, VMs).
What happens if I set the cluster network o
Really?
I always thought that splitting the replication network is best practice.
Keeping everything in the same IPv6 network is much easier.
Thank you.
Kevin
2018-06-07 10:44 GMT+02:00 Wido den Hollander :
>
>
> On 06/07/2018 09:46 AM, Kevin Olbrich wrote:
> > Hi!
> >
Hi!
*Is it safe to run GFS2 on a Ceph RBD and mount it on approx. 3 to 5 VMs?*
The idea is to consolidate 3 webservers which sit behind proxies. The
old infrastructure is not HA-capable or able to load balance.
I would like to set up a webserver, clone the image and mount the GFS2 disk
as shared
hing).
I hope this helps all Ceph users who are interested in the idea of running
Ceph on ZFS.
Kind regards,
Kevin Olbrich.
the OSDs out one by one or with norefill, norecovery flags set
but all at once?
If the latter, which other flags should also be set?
Thanks!
Kind regards,
Kevin Olbrich.
Hi!
Currently I am deploying a small cluster with two nodes. I installed ceph
jewel on all nodes and made a basic deployment.
After "ceph osd create..." I am now getting "Failed to start Ceph disk
activation: /dev/dm-18" on boot. All 28 OSDs were never active.
This server has a 14 disk JBOD with 4
l entry in "df" and a reboot fixed it.
Then OSDs were failing again. Cause: IPv6 DAD on the bond interface. Disabled
via sysctl.
After another reboot the cluster was immediately online.
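A sketch of that sysctl change, assuming the bond is named bond0:

# /etc/sysctl.d/90-ceph-dad.conf: disable IPv6 duplicate address detection on the bond
net.ipv6.conf.bond0.accept_dad = 0
# apply without reboot
sysctl --system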
Kind regards,
Kevin.
2017-05-16 16:59 GMT+02:00 Kevin Olbrich :
> HI!
>
> Currently I am deploying a small c
Hi!
A customer is running a small two-node ceph cluster with 14 disks each.
He has min_size 1 and size 2 and it is only used for backups.
If we add a third member with 14 identical disks and keep size = 2,
replicas should be distributed evenly, right?
Or is an uneven count of hosts inadvisable
Hi!
Is there an easy way to check when an image was last modified?
I want to make sure that the images I want to clean up have not been used
for a long time.
Kind regards
Kevin
Is it possible to use qemu-img with rbd support on Debian Stretch?
I am on Luminous and am trying to connect my image build server to load
images into a ceph pool.
root@buildserver:~# qemu-img convert -p -O raw /target/test-vm.qcow2
> rbd:rbd_vms_ssd_01/test_vm
> qemu-img: Unknown protocol 'rbd'
Kevin
host_cdrom host_device http https iscsi iser luks nbd null-aio
> null-co parallels qcow qcow2 qed quorum raw rbd replication sheepdog
> throttle vdi vhdx vmdk vpc vvfat zeroinit
> On Tue, Oct 30, 2018 at 12:08 PM Kevin Olbrich wrote:
>
>> Is it possible to use qemu-img with r
I ran into the same problem. I had to create a GPT table for each disk,
create a first partition spanning the full disk, and then feed those
partitions to ceph-volume (it should be similar for ceph-deploy).
Also, I am not sure you can combine fs-type btrfs with bluestore (afaik
that option is for filestore).
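A sketch of the preparation described above; the device name is an example:

# wipe any old labels, create a fresh GPT and one partition spanning the whole disk
sgdisk --zap-all /dev/sdd
sgdisk --new=1:0:0 /dev/sdd
ceph-volume lvm create --bluestore --data /dev/sdd1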
Kevin
Am Di., 6. Nov. 2
Am Mi., 7. Nov. 2018 um 07:40 Uhr schrieb Nicolas Huillard <
nhuill...@dolomede.fr>:
>
> > It lists rbd but still fails with the exact same error.
>
> I stumbled upon the exact same error, and since there was no answer
> anywhere, I figured it was a very simple problem: don't forget to
> install t
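The package name is cut off above; on Debian/Ubuntu it is presumably
qemu-block-extra, which ships the rbd protocol driver as a loadable module:

apt install qemu-block-extra
# afterwards qemu-img should list rbd among its supported protocols
qemu-img convert -p -O raw /target/test-vm.qcow2 rbd:rbd_vms_ssd_01/test_vm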
Am Mi., 7. Nov. 2018 um 16:40 Uhr schrieb Gregory Farnum :
> On Wed, Nov 7, 2018 at 5:58 AM Simon Ironside
> wrote:
>
>>
>>
>> On 07/11/2018 10:59, Konstantin Shalygin wrote:
>> >> I wonder if there is any release announcement for ceph 12.2.9 that I
>> missed.
>> >> I just found the new packages
Hi!
ZFS won't play nicely on top of Ceph. The best option would be to mount CephFS
directly with the ceph-fuse driver on the endpoint.
If you definitely want to put a storage gateway between the data and the
compute nodes, then go with nfs-ganesha, which can export CephFS directly
without a local ("proxy") mount.
I had
for you.
>
> -- Dan
>
> On Mon, Nov 12, 2018 at 3:01 PM Kevin Olbrich wrote:
> >
> > Hi!
> >
> > ZFS won't play nice on ceph. Best would be to mount CephFS directly with
> the ceph-fuse driver on the endpoint.
> > If you definitely want to put a st
I read the whole thread and it looks like the write cache should always be
disabled, since in the worst case the performance stays the same(?).
This is based on this discussion.
I will test some WD4002FYYZ drives, which don't mention "media cache".
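For reference, disabling the volatile write cache on a SATA drive is usually
done with hdparm (the device name is an example; the setting is not persistent
on all models, so many people re-apply it via udev or rc.local):

hdparm -W 0 /dev/sdd    # disable the volatile write cache
hdparm -W /dev/sdd      # verify the current setting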
Kevin
Am Di., 13. Nov. 2018 um 09:27 Uhr schrieb Виталий Фили
I now had the time to test, and after installing this package, uploads to
rbd are working perfectly.
Thank you very much for sharing this!
Kevin
Am Mi., 7. Nov. 2018 um 15:36 Uhr schrieb Kevin Olbrich :
> Am Mi., 7. Nov. 2018 um 07:40 Uhr schrieb Nicolas Huillard <
> nhuill...@do
Hi!
Currently I am planning a migration of a large VM (MS Exchange, 300 mailboxes
and a 900GB DB) from qcow2 on ext4 (RAID1) to an all-flash Ceph luminous
cluster (which already holds lots of images).
The server has access to both local and cluster storage; I only need
to live migrate the storage, not mac
> > Assuming everything is on LVM including the root filesystem, only moving
> > the boot partition will have to be done outside of LVM.
>
> Since the OP mentioned MS Exchange, I assume the VM is running windows.
> You can do the same LVM-like trick in Windows Server via Disk Manager
> though; add
Hi!
On a medium-sized cluster with device classes, I am experiencing a
problem with the SSD pool:
root@adminnode:~# ceph osd df | grep ssd
ID CLASS WEIGHT REWEIGHT SIZE USE AVAIL %USE VAR PGS
2 ssd 0.43700 1.0 447GiB 254GiB 193GiB 56.77 1.28 50
3 ssd 0.43700 1.0 4
Hi!
I wonder if changing qdisc and congestion_control (for example fq with
Google BBR) on Ceph servers / clients has positive effects during high
load.
Google BBR:
https://cloud.google.com/blog/products/gcp/tcp-bbr-congestion-control-comes-to-gcp-your-internet-just-got-faster
I am running a lot
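For reference, switching a Linux host to fq + BBR is a sysctl change
(requires a 4.9+ kernel with the tcp_bbr module available):

# /etc/sysctl.d/90-bbr.conf
net.core.default_qdisc = fq
net.ipv4.tcp_congestion_control = bbr
# apply without reboot
sysctl --system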
Hi!
I did what you wrote but my MGRs started to crash again:
root@adminnode:~# ceph -s
cluster:
id: 086d9f80-6249-4594-92d0-e31b6a9c
health: HEALTH_WARN
no active mgr
105498/6277782 objects misplaced (1.680%)
services:
mon: 3 daemons, quorum mon01,m
PS: Could be http://tracker.ceph.com/issues/36361
There is one HDD OSD that is out (which will not be replaced because
the SSD pool will get the images and the hdd pool will be deleted).
Kevin
Am Fr., 4. Jan. 2019 um 19:46 Uhr schrieb Kevin Olbrich :
>
> Hi!
>
> I did what you wrote
If you really created and destroyed OSDs before the cluster healed
itself, this data will be permanently lost (not found / inactive).
Also, your PG count is so oversized that the calculation for peering
will most likely break, because this was never tested.
If this is a critical cluster, I would sta
n I check Hard Disk Iops on new server which are very low compared to
> existing cluster server.
>
> Indeed this is a critical cluster but I don't have expertise to make it
> flawless.
>
> Thanks
> Arun
>
> On Fri, Jan 4, 2019 at 11:35 AM Kevin Olbrich wrote:
&
1151 stale+down
> 667 activating+degraded
> 159 stale+activating
> 116 down
> 77    activating+remapped
> 34    stale+activating+degraded
> 21    stale+activating+remapped
> 9
5:12 Uhr schrieb Konstantin Shalygin :
>
> On 1/5/19 1:51 AM, Kevin Olbrich wrote:
> > PS: Could be http://tracker.ceph.com/issues/36361
> > There is one HDD OSD that is out (which will not be replaced because
> > the SSD pool will get the images and the hdd pool will be deleted
If I understand the balancer correctly, it balances PGs, not data.
That worked perfectly fine in your case.
I prefer a PG count of ~100 per OSD; you are at 30. Maybe it would
help to bump the PGs.
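A sketch of the bump; the pool name and target count are assumptions. The
usual rule of thumb is (OSD count * 100) / replica size, rounded to a power
of two:

ceph osd pool set rbd pg_num 1024
ceph osd pool set rbd pgp_num 1024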
Kevin
Am Sa., 5. Jan. 2019 um 14:39 Uhr schrieb Marc Roos :
>
>
> I have straw2, balancer=on, crush-co
Looks like the same problem as mine:
http://lists.ceph.com/pipermail/ceph-users-ceph.com/2019-January/032054.html
The free space shown is the cluster total, while Ceph is limited by the OSD
with the least free space (the worst OSD).
Please check your (re-)weights.
Kevin
Am Di., 8. Jan. 2019 um 14:32 Uhr schrieb Rodrigo Embeita
:
>
> H
You use replication size 3 with failure domain host.
OSDs 2 and 4 are full, and that's why your pool is also full.
You need to add two disks to pf-us1-dfs3 or swap one from the larger
nodes to this one.
Kevin
Am Di., 8. Jan. 2019 um 15:20 Uhr schrieb Rodrigo Embeita
:
>
> Hi Yoann, thanks for your response.
> He
ards
>
> On Tue, Jan 8, 2019 at 11:28 AM Kevin Olbrich wrote:
>>
>> You use replication 3 failure-domain host.
>> OSD 2 and 4 are full, thats why your pool is also full.
>> You need to add two disks to pf-us1-dfs3 or swap one from the larger
>> nodes to this one.
Are you sure no service such as firewalld is running?
Did you check that all machines have the same MTU and that jumbo frames are
enabled if needed?
I had this problem when I first started with ceph and forgot to
disable firewalld.
Replication worked perfectly fine, but the OSD was kicked out every few se
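Some quick checks on every node (the peer address is a placeholder):

systemctl status firewalld          # should be inactive/disabled if not managed
ip link | grep -i mtu               # MTU must match on all hosts and switches
ping -M do -s 8972 <peer-ip>        # verifies that 9000-byte jumbo frames really pass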
Am Sa., 26. Jan. 2019 um 13:43 Uhr schrieb Götz Reinicke
:
>
> Hi,
>
> I have a fileserver which mounted a 4TB rbd, which is ext4 formatted.
>
> I grow that rbd and ext4 starting with an 2TB rbd that way:
>
> rbd resize testpool/disk01 --size 4194304
>
> resize2fs /dev/rbd0
>
> Today I wanted to ext
2019 um 07:34 Uhr schrieb Konstantin Shalygin :
>
> On 1/5/19 4:17 PM, Kevin Olbrich wrote:
> > root@adminnode:~# ceph osd tree
> > ID CLASS WEIGHT TYPE NAME STATUS REWEIGHT PRI-AFF
> > -1 30.82903 root default
> > -16 30.82903
Are you sure that firewalld is stopped and disabled?
This looks exactly like what happened when I missed one host in a test cluster.
Kevin
Am Di., 12. März 2019 um 09:31 Uhr schrieb Zhenshi Zhou :
> Hi,
>
> I deployed a ceph cluster with good performance. But the logs
> indicate that the cluster is not as st
Hi!
How can I determine which client compatibility level (luminous, mimic,
nautilus, etc.) is supported in Qemu/KVM?
Does it depend on the version of the ceph packages on the system? Or do I
need a recent version of Qemu/KVM?
Which component defines which client level will be supported?
Thank you very
Hollander :
>
>
> On 5/28/19 7:52 AM, Kevin Olbrich wrote:
> > Hi!
> >
> > How can I determine which client compatibility level (luminous, mimic,
> > nautilus, etc.) is supported in Qemu/KVM?
> > Does it depend on the version of ceph packages on the s
Am Di., 28. Mai 2019 um 10:20 Uhr schrieb Wido den Hollander :
>
>
> On 5/28/19 10:04 AM, Kevin Olbrich wrote:
> > Hi Wido,
> >
> > thanks for your reply!
> >
> > For CentOS 7, this means I can switch over to the "rpm-nautilus/el7"
> > repos
Hi!
Today some OSDs went down, a temporary problem that was solved easily.
The mimic cluster is working and all OSDs are complete, all active+clean.
Completely new to me is this:
> 25 slow ops, oldest one blocked for 219 sec, mon.mon03 has slow ops
The cluster itself looks fine, monitoring for
OK, it looks like clock skew is the problem. I thought this was caused by the
reboot, but it did not fix itself after a few minutes (mon3 was 6 seconds
ahead).
After forcing a time sync against the same server, it seems to be solved now.
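For reference, a couple of commands to spot the skew (assuming chrony is the
time daemon):

ceph time-sync-status        # per-mon clock offsets as seen by the leader
chronyc tracking             # local offset against the configured time source
chronyc sources -v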
Kevin
Am Fr., 20. Sept. 2019 um 07:33 Uhr schrieb Kevin Olbrich