Re: [ceph-users] RBD image format v1 EOL ...

2019-02-22 Thread koukou73gr
On 2019-02-20 17:38, Mykola Golub wrote: Note, even if rbd supported live (without any downtime) migration, you would still need to restart the client after the upgrade to a new librbd with migration support. You could probably get away with executing the client with a new librbd version by li
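
For context, the image live-migration workflow being discussed (available from Nautilus onwards) looks roughly like this; image names are placeholders, and as noted above the clients still have to be restarted to pick up a librbd that understands it:

    rbd migration prepare rbd/old-image rbd/new-image
    rbd migration execute rbd/new-image
    rbd migration commit rbd/new-image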

Re: [ceph-users] PG auto repair with BlueStore

2018-11-15 Thread koukou73gr
Are there any means to notify the administrator that an auto-repair has taken place? -K. On 2018-11-15 20:45, Mark Schouten wrote: As a user, I’m very surprised that this isn’t a default setting. Mark Schouten On 15 Nov 2018, at 18:40, Wido den Hollander wrote the following: Hi
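
The setting under discussion can be enabled per OSD in ceph.conf; a minimal sketch (option names as of Luminous/Mimic):

    [osd]
    osd scrub auto repair = true
    # upper bound on how many errors auto-repair will attempt to fix
    osd scrub auto repair num errors = 5

Repairs still show up in the cluster log (ceph -w / ceph.log), which is currently the closest thing to a notification.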

Re: [ceph-users] PGs inconsistent, do I fear data loss?

2017-11-02 Thread koukou73gr
The scenario is actually a bit different, see: Let's assume size=2, min_size=1
- We are looking at pg "A" acting [1, 2]
- osd 1 goes down
- osd 2 accepts a write for pg "A"
- osd 2 goes down
- osd 1 comes back up, while osd 2 still down
- osd 1 has no way to know osd 2 accepted a write in pg "A"
- osd 1
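
This is the kind of situation that size=3 with min_size=2 is meant to prevent; tightening an existing pool is a one-liner per setting (pool name is an example):

    ceph osd pool set rbd size 3
    ceph osd pool set rbd min_size 2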

Re: [ceph-users] Bluestore with SSD-backed DBs; what if the SSD fails?

2017-10-25 Thread koukou73gr
On 2017-10-25 11:21, Wido den Hollander wrote: > >> On 25 October 2017 at 5:58, Christian Sarrasin >> wrote: >> >> The one thing I'm still wondering about is failure domains. With >> Filestore and SSD-backed journals, an SSD failure would kill writes but >> OSDs were otherwise still whole. Re
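
For reference, a BlueStore OSD with its DB on a separate SSD partition is typically created along these lines (device paths are examples; a failed DB device still takes the whole OSD with it, which is the failure-domain point being raised):

    ceph-volume lvm create --bluestore --data /dev/sdb --block.db /dev/nvme0n1p1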

Re: [ceph-users] Hard disk bad manipulation: journal corruption and stale pgs

2017-06-05 Thread koukou73gr
Is your min_size at least 2? Is it just one OSD that is affected? If yes, and if it is only the journal that is corrupt while the actual OSD store is intact (although now lagging behind in writes), and you do have healthy copies of its PGs elsewhere (hence the min_size requirement), you could resolve this situ
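
Assuming filestore with an external journal, a rough sketch of the kind of procedure hinted at (OSD id and service names are placeholders; the journaled writes are lost, which is why healthy copies elsewhere are required):

    ceph osd set noout
    systemctl stop ceph-osd@7
    # replace or re-create the journal partition, then:
    ceph-osd -i 7 --mkjournal
    systemctl start ceph-osd@7
    ceph osd unset noout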

Re: [ceph-users] RBD exclusive-lock and lqemu/librbd

2017-06-02 Thread koukou73gr
On 2017-06-02 14:07, Peter Maloney wrote: > On 06/02/17 12:25, koukou73gr wrote: >> On 2017-06-02 13:01, Peter Maloney wrote: >>>> Is it easy for you to reproduce it? I had the same problem, and the same >>>> solution. But it isn't easy to reproduce... Ja

Re: [ceph-users] RBD exclusive-lock and lqemu/librbd

2017-06-02 Thread koukou73gr
On 2017-06-02 13:22, Peter Maloney wrote: > On 06/02/17 12:06, koukou73gr wrote: >> Thanks for the reply. >> >> Easy? >> Sure, it happens reliably every time I boot the guest with >> exclusive-lock on :) > If it's that easy, also try with only exclusive

Re: [ceph-users] RBD exclusive-lock and lqemu/librbd

2017-06-02 Thread koukou73gr
On 2017-06-02 13:01, Peter Maloney wrote: >> Is it easy for you to reproduce it? I had the same problem, and the same >> solution. But it isn't easy to reproduce... Jason Dillaman asked me for >> a gcore dump of a hung process but I wasn't able to get one. Can you do >> that, and when you reply, C

Re: [ceph-users] RBD exclusive-lock and lqemu/librbd

2017-06-02 Thread koukou73gr
Thanks for the reply. Easy? Sure, it happens reliably every time I boot the guest with exclusive-lock on :) I'll need some walkthrough on the gcore part though! -K. On 2017-06-02 12:59, Peter Maloney wrote: > On 06/01/17 17:12, koukou73gr wrote: >> Hello list, >>
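
For the gcore part: gdb's gcore utility dumps a core of a running process without killing it, roughly like this (pid and output path are examples):

    gcore -o /tmp/qemu-hung $(pidof qemu-kvm)   # if several VMs run, use the pid of the hung guest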

[ceph-users] RBD exclusive-lock and lqemu/librbd

2017-06-01 Thread koukou73gr
Hello list, Today I had to create a new image for a VM. This was the first time since our cluster was updated from Hammer to Jewel. So far I was just copying an existing golden image and resizing it as appropriate. But this time I used rbd create. So I "rbd create"d a 2T image and attached it to
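
For reference, creating the image and inspecting what it got looks roughly like this (name and size are examples; older rbd takes --size in MB). The "features" line is what most often differs between a Hammer-era golden image and a freshly created Jewel image:

    rbd create --image-format 2 --size 2097152 rbd/newimage   # 2 TB, size given in MB
    rbd info rbd/newimage                                      # compare the "features" line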

Re: [ceph-users] - permission denied on journal after reboot

2017-02-13 Thread koukou73gr
On 2017-02-13 13:47, Wido den Hollander wrote: > > The udev rules of Ceph should chown the journal to ceph:ceph if it's set to > the right partition UUID. > > This blog shows it partially: > http://ceph.com/planet/ceph-recover-osds-after-ssd-journal-failure/ > > This is done by *95-ceph-osd.r
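
In practice the fix referred to here is to stamp the journal partition with the Ceph journal partition type GUID so 95-ceph-osd.rules matches it; a hedged example (device and partition number are placeholders):

    sgdisk --typecode=2:45b0969e-9b03-4f30-b4c6-b4b80ceff106 /dev/sdb
    partprobe /dev/sdb   # or reboot, so udev re-applies the ownership rules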

Re: [ceph-users] virt-install into rbd hangs during Anaconda package installation

2017-02-07 Thread koukou73gr
On 2017-02-07 10:11, Tracy Reed wrote: > Weird. The VMs that were hung in uninterruptible wait state have now > disappeared. No idea why. Have you tried the same procedure but with local storage instead? -K.

Re: [ceph-users] v0.94.7 Hammer released

2016-05-17 Thread koukou73gr
Same here. Warnings appeared for OSDs running the .6 version each time one of the rest was restarted to the .7 version. When the last .6 OSD host was upgraded, there were no more warnings from the rest. Cluster seems happy :) -K. On 05/17/2016 11:04 AM, Dan van der Ster wrote: > Hi Sage et a

Re: [ceph-users] Advice on OSD upgrades

2016-04-14 Thread koukou73gr
If you have empty drive slots in your OSD hosts, I'd be tempted to insert the new drive in a slot, set noout, shut down one OSD, unmount the OSD directory, dd the old drive to the new one, remove the old drive, and restart the OSD. No rebalancing and minimal data movement when the OSD rejoins. -K. On 04/14/2016 04:29 P
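
Spelled out, the suggested slot swap looks roughly like this (OSD id and devices are examples, systemd-style service names shown; double-check which disk is which before dd'ing):

    ceph osd set noout
    systemctl stop ceph-osd@12
    umount /var/lib/ceph/osd/ceph-12
    dd if=/dev/sdc of=/dev/sdf bs=4M conv=noerror,sync
    # pull the old drive, let udev/ceph-disk activate the new one, then:
    systemctl start ceph-osd@12
    ceph osd unset noout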

Re: [ceph-users] 1 pg stuck

2016-03-24 Thread koukou73gr
Space on the hosts in rack2 does not add up to cover the space in rack1. After enough data is written to the cluster, all pgs on rack2 would be allocated and the cluster won't be able to find a free pg to map new data to for the 3rd replica. Bottom line: spread your big disks across all 4 hosts, or add some mo
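
A quick way to see how raw capacity is actually spread across the racks and hosts (reasonably recent ceph CLI):

    ceph osd df tree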

Re: [ceph-users] dealing with the full osd / help reweight

2016-03-24 Thread koukou73gr
What is your pool size? 304 pgs sounds awfully small for 20 OSDs. More pgs will help distribute full pgs better. But with a full or near-full OSD in hand, increasing pgs is a no-no operation. If you search the list archive, I believe there was a thread a month or so ago which provided a walkthroug
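
The usual rule of thumb, as a worked example with these numbers (replica count assumed to be 3):

    (20 OSDs x 100) / 3 replicas ~= 666 PGs  ->  round to a power of two: 512 (or 1024)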

Re: [ceph-users] Need help for PG problem

2016-03-23 Thread koukou73gr
Are you running with the default failure domain of 'host'? If so, with a pool size of 3 and your 20 OSDs physically spread over only 2 hosts, Ceph is unable to find a 3rd host to map the 3rd replica to. Either add a host and move some OSDs there, or reduce the pool size to 2. -K. On 03/23/2016 02:17 PM, Zhang Qia
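
If reducing the replica count is the chosen way out, that is a one-liner per pool (pool name is an example):

    ceph osd pool set rbd size 2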

Re: [ceph-users] Need help for PG problem

2016-03-23 Thread koukou73gr
You should have settled on the nearest power of 2, which for 666 is 512. Since you created the cluster and IIRC it is a testbed, you may as well recreate it again; however, it will be less of a hassle to just increase the pgs to the next power of two: 1024. Your 20 OSDs appear to be equal sized in your c
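
Increasing the PG count in place would look like this (pool name is an example; pgp_num has to follow pg_num for the data to actually rebalance):

    ceph osd pool set rbd pg_num 1024
    ceph osd pool set rbd pgp_num 1024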

Re: [ceph-users] Help: pool not responding

2016-02-14 Thread koukou73gr
Have you tried restarting osd.0 ? -K. On 02/14/2016 09:56 PM, Mario Giammarco wrote: > Hello, > I am using ceph hammer under proxmox. > I have a working cluster that I have been using for several months. > For reasons yet to be discovered, I am now in this situation: > > HEALTH_WARN 4 pgs incomplete; 4 pgs st

Re: [ceph-users] Ceph + Libvirt + QEMU-KVM

2016-01-28 Thread koukou73gr
On 01/28/2016 03:44 PM, Simon Ironside wrote: > Btw, using virtio-scsi devices as above and discard='unmap' above > enables TRIM support. This means you can use fstrim or mount file > systems with discard inside the VM to free up unused space in the image. Doesn't discard require the pc-q35-rhel7
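
The configuration being referred to is a virtio-scsi disk with discard='unmap'; a minimal libvirt sketch (pool, image and monitor names are placeholders, cephx auth omitted):

    <controller type='scsi' model='virtio-scsi'/>
    <disk type='network' device='disk'>
      <driver name='qemu' type='raw' discard='unmap'/>
      <source protocol='rbd' name='rbd/server'>
        <host name='mon1.example.com' port='6789'/>
      </source>
      <target dev='sda' bus='scsi'/>
    </disk>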

Re: [ceph-users] Intel S3710 400GB and Samsung PM863 480GB fio results

2015-12-22 Thread koukou73gr
Even the cheapest stuff nowadays has a more or less decent wear-leveling algorithm built into its controller, so this won't be a problem. Wear-leveling algorithms cycle the blocks internally so wear evens out across the whole disk. -K. On 12/22/2015 06:57 PM, Alan Johnson wrote: > I would also ad

Re: [ceph-users] Multiple journals and an OSD on one SSD doable?

2015-06-09 Thread koukou73gr
On 06/08/2015 11:54 AM, Jan Schermer wrote:
> This should indicate the real wear:
> 100 Gigabytes_Erased        0x0032   000   000   000   Old_age   Always   -   62936
> Bytes written after compression:
> 233 SandForce_Internal      0x       000   000   000   O
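
The attributes quoted above come straight from SMART; on most distros they can be read with smartmontools, e.g.:

    smartctl -A /dev/sdb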

Re: [ceph-users] QEMU Venom Vulnerability

2015-05-21 Thread koukou73gr
On 05/21/2015 02:36 PM, Brad Hubbard wrote: If that's correct then starting from there and building a new RPM with RBD support is the proper way of updating. Correct? I guess there are two ways to approach this. 1. use the existing ceph source rpm here. http://ceph.com/packages/ceph-extras/
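
A generic sketch of the rebuild route (package names and spec changes are illustrative, not the exact ceph-extras recipe):

    rpm -ivh qemu-kvm-*.src.rpm
    # edit ~/rpmbuild/SPECS/qemu-kvm.spec so configure gets --enable-rbd, then:
    rpmbuild -ba ~/rpmbuild/SPECS/qemu-kvm.spec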

Re: [ceph-users] live migration fails with image on ceph

2015-04-15 Thread koukou73gr
Hello, Can't really help you with nova, but using plain libvirt-1.1.1 and qemu-1.5.3, live migration of rbd-backed VMs is (almost*) instant on the client side. We have the rbd write-back cache enabled everywhere and have no problem at all. -K. *There is about a 1-2 second hitch at wors
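
The client-side cache mentioned here is just a couple of ceph.conf lines on the hypervisors, roughly:

    [client]
    rbd cache = true
    rbd cache writethrough until flush = true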

Re: [ceph-users] Force an OSD to try to peer

2015-03-31 Thread koukou73gr
On 03/31/2015 09:23 PM, Sage Weil wrote: It's nothing specific to peering (or ceph). The symptom we've seen is just that bytes stop passing across a TCP connection, usually when there are some largish messages being sent. The ping/heartbeat messages get through because they are small and we disa

Re: [ceph-users] qemu-kvm and cloned rbd image

2015-03-09 Thread koukou73gr
On 03/05/2015 07:19 PM, Josh Durgin wrote:
client.libvirt
  key:
  caps: [mon] allow r
  caps: [osd] allow class-read object_prefix rbd_children, allow rw class-read pool=rbd
This includes everything except class-write on the pool you're using. You'll need that so that a copy_up c
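
One common way to grant the missing class-write as well, assuming the images live in the rbd pool:

    ceph auth caps client.libvirt mon 'allow r' \
        osd 'allow class-read object_prefix rbd_children, allow rwx pool=rbd'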

Re: [ceph-users] qemu-kvm and cloned rbd image

2015-03-05 Thread koukou73gr
On 03/05/2015 03:40 AM, Josh Durgin wrote: It looks like your libvirt rados user doesn't have access to whatever pool the parent image is in:
librbd::AioRequest: write 0x7f1ec6ad6960 rbd_data.24413d1b58ba.0186 1523712~4096 should_complete: r = -1
-1 is EPERM, for operation not
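
To check what the libvirt user is actually allowed to do on each pool:

    ceph auth get client.libvirt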

Re: [ceph-users] qemu-kvm and cloned rbd image

2015-03-04 Thread koukou73gr
Hi Josh, Thanks for taking a look at this. I'm answering your questions inline. On 03/04/2015 10:01 PM, Josh Durgin wrote: [...] And then proceeded to create a qemu-kvm guest with rbd/server as its backing store. The guest booted, but as soon as it got to mount the root fs, things got weird:

Re: [ceph-users] qemu-kvm and cloned rbd image

2015-03-04 Thread koukou73gr
On 03/03/2015 05:53 PM, Jason Dillaman wrote: Your procedure appears correct to me. Would you mind re-running your cloned image VM with the following ceph.conf properties:
[client]
rbd cache off
debug rbd = 20
log file = /path/writeable/by/qemu.$pid.log
If you recreate the issue, would you mi

[ceph-users] qemu-kvm and cloned rbd image

2015-03-02 Thread koukou73gr
Hello, Today I thought I'd experiment with snapshots and cloning. So I did:
rbd import --image-format=2 vm-proto.raw rbd/vm-proto
rbd snap create rbd/vm-proto@s1
rbd snap protect rbd/vm-proto@s1
rbd clone rbd/vm-proto@s1 rbd/server
And then proceeded to create a qemu-kvm guest with rbd/server
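
To confirm the clone relationship and see which features the clone picked up:

    rbd info rbd/server
    rbd children rbd/vm-proto@s1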