[ceph-users] Openstack on ceph rbd installation failure
Hi,
I have a three-node Ceph cluster; ceph -w reports health OK. I have OpenStack on the same cluster and am trying to map Cinder and Glance onto RBD. I have followed the steps given in http://ceph.com/docs/next/rbd/rbd-openstack/

New settings added to cinder.conf on all three nodes:

    volume_driver=cinder.volume.drivers.rbd.RBDDriver
    rbd_pool=volumes
    glance_api_version=2
    rbd_user=volumes
    rbd_secret_uuid=62d0b384-50ad-2e17-15ed-66bfeda40252  (different for each node)

Logs seen when I run ./rejoin.sh:

2013-07-22 20:35:01.900 INFO cinder.service [-] Starting 1 workers
2013-07-22 20:35:01.909 INFO cinder.service [-] Started child 2290
2013-07-22 20:35:01.965 AUDIT cinder.service [-] Starting cinder-volume node (version 2013.2)
2013-07-22 20:35:02.129 ERROR cinder.volume.drivers.rbd [req-d3bc2e86-e9db-40e8-bcdb-08c609ce44c3 None None] error connecting to ceph cluster
2013-07-22 20:35:02.129 TRACE cinder.volume.drivers.rbd Traceback (most recent call last):
2013-07-22 20:35:02.129 TRACE cinder.volume.drivers.rbd   File "/opt/stack/cinder/cinder/volume/drivers/rbd.py", line 243, in check_for_setup_error
2013-07-22 20:35:02.129 TRACE cinder.volume.drivers.rbd     with RADOSClient(self):
2013-07-22 20:35:02.129 TRACE cinder.volume.drivers.rbd   File "/opt/stack/cinder/cinder/volume/drivers/rbd.py", line 215, in __init__
2013-07-22 20:35:02.129 TRACE cinder.volume.drivers.rbd     self.cluster, self.ioctx = driver._connect_to_rados(pool)
2013-07-22 20:35:02.129 TRACE cinder.volume.drivers.rbd   File "/opt/stack/cinder/cinder/volume/drivers/rbd.py", line 263, in _connect_to_rados
2013-07-22 20:35:02.129 TRACE cinder.volume.drivers.rbd     client.connect()
2013-07-22 20:35:02.129 TRACE cinder.volume.drivers.rbd   File "/usr/lib/python2.7/dist-packages/rados.py", line 192, in connect
2013-07-22 20:35:02.129 TRACE cinder.volume.drivers.rbd     raise make_ex(ret, "error calling connect")
2013-07-22 20:35:02.129 TRACE cinder.volume.drivers.rbd ObjectNotFound: error calling connect
2013-07-22 20:35:02.149 ERROR cinder.service [req-d3bc2e86-e9db-40e8-bcdb-08c609ce44c3 None None] Unhandled exception
2013-07-22 20:35:02.149 TRACE cinder.service Traceback (most recent call last):
2013-07-22 20:35:02.149 TRACE cinder.service   File "/opt/stack/cinder/cinder/service.py", line 228, in _start_child
2013-07-22 20:35:02.149 TRACE cinder.service     self._child_process(wrap.server)
2013-07-22 20:35:02.149 TRACE cinder.service   File "/opt/stack/cinder/cinder/service.py", line 205, in _child_process
2013-07-22 20:35:02.149 TRACE cinder.service     launcher.run_server(server)
2013-07-22 20:35:02.149 TRACE cinder.service   File "/opt/stack/cinder/cinder/service.py", line 96, in run_server
2013-07-22 20:35:02.149 TRACE cinder.service     server.start()
2013-07-22 20:35:02.149 TRACE cinder.service   File "/opt/stack/cinder/cinder/service.py", line 359, in start
2013-07-22 20:35:02.149 TRACE cinder.service     self.manager.init_host()
2013-07-22 20:35:02.149 TRACE cinder.service   File "/opt/stack/cinder/cinder/volume/manager.py", line 139, in init_host
2013-07-22 20:35:02.149 TRACE cinder.service     self.driver.check_for_setup_error()
2013-07-22 20:35:02.149 TRACE cinder.service   File "/opt/stack/cinder/cinder/volume/drivers/rbd.py", line 248, in check_for_setup_error
2013-07-22 20:35:02.149 TRACE cinder.service     raise exception.VolumeBackendAPIException(data=msg)
2013-07-22 20:35:02.149 TRACE cinder.service VolumeBackendAPIException: Bad or unexpected response from the storage volume backend API: error connecting to ceph cluster
2013-07-22 20:35:02.191 INFO cinder.service [-] Child 2290 exited with status 2
2013-07-22 20:35:02.192 INFO cinder.service [-] _wait_child 1
2013-07-22 20:35:02.193 INFO cinder.service [-] wait wrap.failed True

Can someone help me with some debug points and solve it?
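[Note] "ObjectNotFound: error calling connect" from librados usually means the client could not locate /etc/ceph/ceph.conf or the keyring for the Ceph user it connects as. A quick way to check this from the node running cinder-volume, assuming the user is client.volumes and the keyring path follows the rbd-openstack guide (both are assumptions, adjust to your setup):

    # Is the cluster config and the keyring present and readable?
    ls -l /etc/ceph/ceph.conf /etc/ceph/ceph.client.volumes.keyring

    # Can this node reach the monitors and the 'volumes' pool as client.volumes?
    ceph -s --id volumes --keyring /etc/ceph/ceph.client.volumes.keyring
    rados -p volumes ls --id volumes --keyring /etc/ceph/ceph.client.volumes.keyring

If either command fails, fix the ceph.conf/keyring distribution before looking at Cinder itself.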
Re: [ceph-users] Openstack on ceph rbd installation failure
There is a hidden bug which I couldn't reproduce. I was using devstack for OpenStack and I enabled the syslog option to get nova and cinder logs. After a reboot, everything was fine: I was able to create volumes and I verified them in rados.

Another thing I noticed is that I don't have a cinder user as in the devstack script. Hence, I didn't change owner permissions for the keyring files and they are owned by root. But it works fine nonetheless.

On Tue, Jul 23, 2013 at 6:19 AM, Sebastien Han wrote:
> Can you send your ceph.conf too?
>
> Is /etc/ceph/ceph.conf present? Is the key of user volume present too?
>
> Sébastien Han
> Cloud Engineer
>
> "Always give 100%. Unless you're giving blood."
>
> Phone: +33 (0)1 49 70 99 72 – Mobile: +33 (0)6 52 84 44 70
> Email: sebastien@enovance.com – Skype: han.sbastien
> Address: 10, rue de la Victoire – 75009 Paris
> Web: www.enovance.com – Twitter: @enovance
>
> On Jul 23, 2013, at 5:39 AM, johnu wrote:
>
> [original post and cinder-volume traceback quoted in full; snipped]
Re: [ceph-users] Openstack on ceph rbd installation failure
[start of message truncated in the archive; tail of a nova.compute.manager traceback for instance 4b58dea1-f281-4818-82da-8b9f5f923f64]

Jul 23 17:17:18 slave2 2013-07-23 17:17:18.380 ERROR nova.virt.libvirt.driver [req-560b46ed-e96e-4645-a23e-3eba6f51437c admin admin] An error occurred while trying to launch a defined domain with xml: [libvirt domain XML for instance-000b / 4b58dea1-f281-4818-82da-8b9f5f923f64; the XML is mangled in the archive (tags stripped, #012/#033 syslog escape codes), but it references /usr/bin/qemu-system-x86_64, the instance's kernel/ramdisk under /opt/stack/data/nova/instances/4b58dea1-f281-4818-82da-8b9f5f923f64/, and an rbd disk for user 'volumes']

johnu wrote:
> There is a hidden bug which I couldn't reproduce. I was using devstack for
> openstack and I enabled syslog option for getting nova and cinder logs.
> After reboot, everything was fine. I was able to create volumes and I
> verified in rados.
>
> Another thing I noticed is, I don't have cinder user as in devstack
> script. Hence, I didn't change owner permissions for keyring files and they
> are owned by root. But, it works fine though
>
> [Sebastien Han's reply and the original post with the cinder-volume traceback quoted in full; snipped]
[ceph-users] Error when volume is attached in openstack
I was trying OpenStack on Ceph. I can create volumes, but I am not able to attach a volume to any running instance. If I attach a volume to an instance and reboot it, the instance goes into an error state. Compute error logs are given below (the logs went through syslog, so they contain #033[..m colour codes and #012 in place of newlines).

2013-07-23 17:15:32.666 ERROR nova.compute.manager [req-464776fd-2832-4f76-91fa-3e4eff173064 None None] [instance: 4b58dea1-f281-4818-82da-8b9f5f923f64] error during stop() in sync_power_state.
2013-07-23 17:15:32.666 TRACE nova.compute.manager [instance: 4b58dea1-f281-4818-82da-8b9f5f923f64] Traceback (most recent call last):
2013-07-23 17:15:32.666 TRACE nova.compute.manager [instance: 4b58dea1-f281-4818-82da-8b9f5f923f64]   File "/opt/stack/nova/nova/compute/manager.py", line 4421, in _sync_instance_power_state
2013-07-23 17:15:32.666 TRACE nova.compute.manager [instance: 4b58dea1-f281-4818-82da-8b9f5f923f64]     self.conductor_api.compute_stop(context, db_instance)
2013-07-23 17:15:32.666 TRACE nova.compute.manager [instance: 4b58dea1-f281-4818-82da-8b9f5f923f64]   File "/opt/stack/nova/nova/conductor/api.py", line 333, in compute_stop
2013-07-23 17:15:32.666 TRACE nova.compute.manager [instance: 4b58dea1-f281-4818-82da-8b9f5f923f64]     return self._manager.compute_stop(context, instance, do_cast)
2013-07-23 17:15:32.666 TRACE nova.compute.manager [instance: 4b58dea1-f281-4818-82da-8b9f5f923f64]   File "/opt/stack/nova/nova/conductor/rpcapi.py", line 483, in compute_stop
2013-07-23 17:15:32.666 TRACE nova.compute.manager [instance: 4b58dea1-f281-4818-82da-8b9f5f923f64]     return self.call(context, msg, version='1.43')
2013-07-23 17:15:32.666 TRACE nova.compute.manager [instance: 4b58dea1-f281-4818-82da-8b9f5f923f64]   File "/opt/stack/nova/nova/openstack/common/rpc/proxy.py", line 126, in call
2013-07-23 17:15:32.666 TRACE nova.compute.manager [instance: 4b58dea1-f281-4818-82da-8b9f5f923f64]     result = rpc.call(context, real_topic, msg, timeout)

Jul 23 17:17:18 slave2 2013-07-23 17:17:18.380 ERROR nova.virt.libvirt.driver [req-560b46ed-e96e-4645-a23e-3eba6f51437c admin admin] An error occurred while trying to launch a defined domain with xml: [libvirt domain XML for instance-000b / 4b58dea1-f281-4818-82da-8b9f5f923f64; the XML is mangled in the archive, but it defines an rbd disk for username 'volumes' alongside the instance's local disk, kernel and ramdisk under /opt/stack/data/nova/instances/4b58dea1-f281-4818-82da-8b9f5f923f64/]
Re: [ceph-users] Error when volume is attached in openstack
I followed the same steps earlier. How can I verify it?

On Wed, Jul 24, 2013 at 11:26 AM, Abel Lopez wrote:
> There's your problem:
> error rbd username 'volumes' specified but secret not found
>
> You need to follow the steps in the doc for creating the secret using
> virsh.
> http://ceph.com/docs/next/rbd/rbd-openstack/
>
> On Jul 24, 2013, at 11:20 AM, johnu wrote:
>
> [original post and nova/libvirt logs quoted in full; snipped]
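[Note] The virsh secret workflow Abel refers to (from the rbd-openstack guide) looks roughly like this on each compute node; the file names are just examples:

    # secret.xml
    <secret ephemeral='no' private='no'>
      <usage type='ceph'>
        <name>client.volumes secret</name>
      </usage>
    </secret>

    sudo virsh secret-define --file secret.xml        # prints the generated UUID
    ceph auth get-key client.volumes > client.volumes.key
    sudo virsh secret-set-value --secret <uuid> --base64 $(cat client.volumes.key)

The UUID printed by secret-define is what has to go into rbd_secret_uuid in cinder.conf.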
Re: [ceph-users] Error when volume is attached in openstack
sudo virsh secret-list
 UUID                                  Usage
-----------------------------------------------------------
 bdf77f5d-bf0b-1053-5f56-cd76b32520dc  Unused

All nodes have a secret set.

On Wed, Jul 24, 2013 at 11:30 AM, Abel Lopez wrote:
> You need to do this on each compute node, and you can verify with
> virsh secret-list
>
> On Jul 24, 2013, at 11:20 AM, johnu wrote:
>
> [original post and nova/libvirt logs quoted in full; snipped]
Re: [ceph-users] Error when volume is attached in openstack
Abel,
   What did you change in nova.conf? I have added rbd_username and rbd_secret_uuid in cinder.conf, and I verified that rbd_secret_uuid is the same as what virsh secret-list shows.

On Wed, Jul 24, 2013 at 11:49 AM, Abel Lopez wrote:
> One thing I had to do, and it's not really in the documentation:
> I created the secret once on one compute node, then I reused the UUID when
> creating it on the rest of the compute nodes.
> I then was able to use this value in cinder.conf AND nova.conf.
>
> On Jul 24, 2013, at 11:39 AM, johnu wrote:
>
> [earlier messages and logs quoted in full; snipped]
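[Note] Abel's "reuse the UUID" trick can be done by pinning the UUID in the secret XML itself, so virsh secret-define creates the same UUID on every compute node instead of generating a new one per host. A sketch, using the UUID from this thread as a placeholder:

    # secret.xml, identical on every compute node
    <secret ephemeral='no' private='no'>
      <uuid>bdf77f5d-bf0b-1053-5f56-cd76b32520dc</uuid>
      <usage type='ceph'>
        <name>client.volumes secret</name>
      </usage>
    </secret>

    sudo virsh secret-define --file secret.xml
    sudo virsh secret-set-value --secret bdf77f5d-bf0b-1053-5f56-cd76b32520dc \
        --base64 $(ceph auth get-key client.volumes)

With that, a single rbd_secret_uuid value can be used in the OpenStack config on all nodes.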
Re: [ceph-users] Error when volume is attached in openstack
Yes. It matches for all nodes in the cluster.

On Wed, Jul 24, 2013 at 1:12 PM, Abel Lopez wrote:
> You are correct, I didn't add that to nova.conf, only cinder.conf.
> If you do
>     virsh secret-get-value bdf77f5d-bf0b-1053-5f56-cd76b32520dc
> do you see the key that you have for your client.volumes?
>
> On Jul 24, 2013, at 12:11 PM, johnu wrote:
>
> [earlier messages and logs quoted in full; snipped]
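[Note] A quick way to run Abel's check against all compute nodes at once (a sketch; host names and the UUID are the ones from this thread, adjust as needed):

    KEY=$(ceph auth get-key client.volumes)
    for h in master slave1 slave2; do
        echo -n "$h: "
        ssh $h "sudo virsh secret-get-value bdf77f5d-bf0b-1053-5f56-cd76b32520dc" \
            | grep -qF "$KEY" && echo OK || echo MISMATCH
    done

Every node should report OK, and the same UUID should appear as rbd_secret_uuid in every node's cinder.conf.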
[ceph-users] Cinder volume creation issues
Hi all,
I need to know whether someone else has also faced the same issue.

I tried OpenStack + Ceph integration. I have seen that I can create volumes from Horizon and they are created in rados. When I check the created volumes in the admin panel, all volumes are shown as created on the same host (I tried creating 10 volumes, but all are created on the same host, 'slave1'). I haven't changed the crushmap and am using the default one which came along with ceph-deploy.

nova-manage version: 2013.2

host master {
        id -2           # do not change unnecessarily
        # weight 0.010
        alg straw
        hash 0  # rjenkins1
        item osd.0 weight 0.010
}
host slave1 {
        id -3           # do not change unnecessarily
        # weight 0.010
        alg straw
        hash 0  # rjenkins1
        item osd.1 weight 0.010
}
host slave2 {
        id -4           # do not change unnecessarily
        # weight 0.010
        alg straw
        hash 0  # rjenkins1
        item osd.2 weight 0.010
}
root default {
        id -1           # do not change unnecessarily
        # weight 0.030
        alg straw
        # do not change bucket size (3) unnecessarily
        hash 0  # rjenkins1
        item master weight 0.010 pos 0
        item slave1 weight 0.010 pos 1
        item slave2 weight 0.010 pos 2
}
rule data {
        ruleset 0
        type replicated
        min_size 1
        max_size 10
        step take default
        step chooseleaf firstn 0 type host
        step emit
}

ceph osd dump:

pool 0 'data' rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 owner 0 crash_replay_interval 45
pool 1 'metadata' rep size 2 min_size 1 crush_ruleset 1 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 owner 0
pool 2 'rbd' rep size 2 min_size 1 crush_ruleset 2 object_hash rjenkins pg_num 64 pgp_num 64 last_change 1 owner 0
pool 3 'volumes' rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 19 owner 0
pool 4 'images' rep size 2 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 128 pgp_num 128 last_change 21 owner 0

Second issue:
I am not able to attach volumes to instances if the hosts differ. E.g., if the volumes are created on host 'slave1', instance1 runs on host 'master' and instance2 runs on host 'slave1', I am able to attach volumes to instance2 but not to instance1.

Did someone face this issue with OpenStack on Ceph?
Re: [ceph-users] Cinder volume creation issues
Greg,
      I verified on all cluster nodes that rbd_secret_uuid is the same as what virsh secret-list shows, and if I do virsh secret-get-value of this uuid, I get back the auth key for client.volumes. What did you mean by the same configuration? Did you mean the same secret for all compute nodes?

      When we log in as admin, there is a column in the admin panel which gives the 'host' where the volume lies. I know that volumes are striped across the cluster, but it gives the same host for all volumes. That is why I got a little confused.

On Fri, Jul 26, 2013 at 9:23 AM, Gregory Farnum wrote:
> On Fri, Jul 26, 2013 at 9:17 AM, johnu wrote:
> > Hi all,
> > I need to know whether someone else also faced the same issue.
> >
> > I tried openstack + ceph integration. I have seen that I could create
> > volumes from horizon and it is created in rados.
> >
> > When I check the created volumes in admin panel, all volumes are shown to be
> > created in the same host. (I tried creating 10 volumes, but all are created
> > in same host 'slave1'.) I haven't changed crushmap and I am using the
> > default one which came along with ceph-deploy.
>
> RBD volumes don't live on a given host in the cluster; they are
> striped across all of them. What do you mean the volume is "in"
> slave1?
>
> > Second issue
> > I am not able to attach volumes to instances if hosts differ. Eg: If volumes
> > are created in host 'slave1', instance1 is created in host 'master' and
> > instance2 is created in host 'slave1', I am able to attach volumes to
> > instance2 but not to instance1.
>
> This sounds like maybe you don't have quite the same configuration on
> both hosts. Due to the way OpenStack and virsh handle their config
> fragments and secrets, you need to have the same virsh secret IDs both
> configured (in the OpenStack config files) and set (in virsh's
> internal database) on every compute host and the Cinder/Nova manager.
>
> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com
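[Note] Greg's point that an RBD volume has no single "host" can be checked directly: every RBD image is split into many RADOS objects, and each object maps to its own set of OSDs. A sketch (the volume name is a placeholder; Cinder names images volume-<uuid> in the volumes pool):

    rbd -p volumes info volume-<uuid>            # note the block_name_prefix
    rados -p volumes ls | grep <prefix> | head   # list a few of the backing objects
    ceph osd map volumes <one-object-name>       # shows the PG and the OSDs holding it

Running "ceph osd map" for a few different objects of the same volume will show them landing on different OSDs and hosts; the "host" column in Horizon refers to the cinder-volume service that handled the request, not where the data lives.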
Re: [ceph-users] Cinder volume creation issues
Greg,
   Yes, the outputs match.

master node:

ceph auth get-key client.volumes
AQC/ze1R2EOWNBAAmLUE4U7zO1KafZ/CzVVTqQ==

virsh secret-get-value bdf77f5d-bf0b-1053-5f56-cd76b32520dc
AQC/ze1R2EOWNBAAmLUE4U7zO1KafZ/CzVVTqQ==

/etc/cinder/cinder.conf
volume_driver=cinder.volume.drivers.rbd.RBDDriver
rbd_pool=volumes
glance_api_version=2
rbd_user=volumes
rbd_secret_uuid=bdf77f5d-bf0b-1053-5f56-cd76b32520dc

slave1

/etc/cinder/cinder.conf
volume_driver=cinder.volume.drivers.rbd.RBDDriver
rbd_pool=volumes
glance_api_version=2
rbd_user=volumes
rbd_secret_uuid=62d0b384-50ad-2e17-15ed-66bfeda40252

virsh secret-get-value 62d0b384-50ad-2e17-15ed-66bfeda40252
AQC/ze1R2EOWNBAAmLUE4U7zO1KafZ/CzVVTqQ==

slave2

/etc/cinder/cinder.conf
volume_driver=cinder.volume.drivers.rbd.RBDDriver
rbd_pool=volumes
glance_api_version=2
rbd_user=volumes
rbd_secret_uuid=33651ba9-5145-1fda-3e61-df6a5e6051f5

virsh secret-get-value 33651ba9-5145-1fda-3e61-df6a5e6051f5
AQC/ze1R2EOWNBAAmLUE4U7zO1KafZ/CzVVTqQ==

Yes, OpenStack Horizon is showing the same host for all volumes. Somehow, if a volume is attached to an instance on the same host it works; otherwise it doesn't. Might be a coincidence. And I am surprised that no one else has seen or reported this issue. Any idea?

On Fri, Jul 26, 2013 at 9:45 AM, Gregory Farnum wrote:
> On Fri, Jul 26, 2013 at 9:35 AM, johnu wrote:
> > Greg,
> > I verified in all cluster nodes that rbd_secret_uuid is same as
> > virsh secret-list. And if I do virsh secret-get-value of this uuid, I
> > get back the auth key for client.volumes. What did you mean by same
> > configuration? Did you mean same secret for all compute nodes?
>
> If you run "virsh secret-get-value" with that rbd_secret_uuid on each
> compute node, does it return the right secret for client.volumes?
>
> > When we login as admin, there is a column in admin panel which gives
> > the 'host' where the volumes lie. I know that volumes are striped across the
> > cluster but it gives same host for all volumes. That is why I got a little
> > confused.
>
> That's not something you can get out of the RBD stack itself; is this
> something that OpenStack is showing you? I suspect it's just making up
> information to fit some API expectations, but somebody more familiar
> with the OpenStack guts can probably chime in.
> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com
Re: [ceph-users] Cinder volume creation issues
Greg,
      :) I am not seeing where the mistake in the configuration was. virsh secret-define generated a different secret UUID on each node:

sudo virsh secret-define --file secret.xml
sudo virsh secret-set-value --secret {uuid of secret} --base64 $(cat client.volumes.key)

On Fri, Jul 26, 2013 at 10:16 AM, Gregory Farnum wrote:
> On Fri, Jul 26, 2013 at 10:11 AM, johnu wrote:
> > Greg,
> > Yes, the outputs match
>
> Nope, they don't. :) You need the secret_uuid to be the same on each
> node, because OpenStack is generating configuration snippets on one
> node (which contain these secrets) and then shipping them to another
> node where they're actually used.
>
> Your secrets are also different despite having the same rbd user
> specified, so that's broken too; not quite sure how you got there...
> -Greg
> Software Engineer #42 @ http://inktank.com | http://ceph.com
>
> [config dumps from the previous message quoted in full; snipped]
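[Note] To summarise Greg's point as config: the fix is to make the UUID identical everywhere, either by pinning it with a <uuid> element in secret.xml or by re-defining the secret on slave1/slave2 with the master's UUID, and then using that one value on every node. A sketch using the UUIDs from this thread:

    # on slave1: redefine the secret with the master's UUID
    sudo virsh secret-undefine 62d0b384-50ad-2e17-15ed-66bfeda40252   # old per-node UUID
    sudo virsh secret-define --file secret.xml   # secret.xml contains <uuid>bdf77f5d-bf0b-1053-5f56-cd76b32520dc</uuid>
    sudo virsh secret-set-value --secret bdf77f5d-bf0b-1053-5f56-cd76b32520dc \
        --base64 $(ceph auth get-key client.volumes)

    # /etc/cinder/cinder.conf on every node
    rbd_secret_uuid=bdf77f5d-bf0b-1053-5f56-cd76b32520dc

Repeat the virsh steps on slave2 with its old UUID (33651ba9-5145-1fda-3e61-df6a5e6051f5).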
[ceph-users] rbd read write very slow for heavy I/O operations
Hi,
   I have an OpenStack cluster which runs on Ceph. I tried running Hadoop inside the VMs and I noticed that map tasks take a long time to complete and finally fail. RBD reads/writes are getting slower over time. Is it because of too many objects in Ceph per volume?

I have an 8-node cluster with 24 x 1TB disks per node:

master:  mon
slave1:  1 osd per disk, i.e. 23
slave2:  1 osd per disk, i.e. 23
.
.
slave7:  1 osd per disk, i.e. 23

replication factor: 2
pg nums in default pool: 128

In OpenStack, I have 14 instances. Fourteen 5TB volumes are created and each one is attached to an instance. I am using the default stripe settings.

rbd -p volumes info volume-1
size 5000GB in 128 objects
order 22 (4096kb objects)

1. I couldn't find documentation for stripe settings that can be used for volume creation in OpenStack. Can they be exposed through any configuration file? (http://ceph.com/docs/master/rbd/rbd-openstack/) Like the 64MB default block size in HDFS, how do we set the layout for the objects? Can we change it after volume creation? Will this affect performance for heavy I/O applications like MapReduce?

2. How can RBD caching improve the performance?

3. HDFS gives priority to localized writes; how can we implement the same, given that rbd volumes are striped across the cluster? I am not sure of crush rulesets which can help this situation.

Can someone give me debug points and ideas related to this? I have not used CephFS for now.
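[Note] On the striping and caching questions above, a hedged sketch of the knobs involved (option names as of the Dumpling/Emperor era; check the docs for your version):

    # RBD striping is fixed at image creation time and cannot be changed afterwards.
    # Outside of OpenStack it is set like this:
    rbd create volumes/test --size 10240 --order 23 \
        --stripe-unit 65536 --stripe-count 16        # 8 MB objects, fancy striping

    # Client-side RBD cache, enabled in the [client] section of ceph.conf on the
    # compute nodes (QEMU should also use cache=writeback for the rbd disk):
    [client]
        rbd cache = true
        rbd cache size = 33554432          # 32 MB, example value
        rbd cache max dirty = 25165824

The Cinder RBD driver of that era did not expose stripe-unit/stripe-count options, so non-default striping for Cinder volumes generally isn't reachable from cinder.conf alone; that last part is an assumption worth verifying against your Cinder version.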
Re: [ceph-users] How to set Object Size/Stripe Width/Stripe Count?
This can help you:
http://www.sebastien-han.fr/blog/2013/02/11/mount-a-specific-pool-with-cephfs/

On Thu, Aug 8, 2013 at 7:48 AM, Da Chun wrote:
> Hi list,
> I saw the info about data striping in
> http://ceph.com/docs/master/architecture/#data-striping .
> But I couldn't find the way to set these values.
>
> Could you please tell me how to do that, or give me a link? Thanks!
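[Note] For CephFS specifically, file and directory layouts can also be inspected and set through virtual extended attributes on a mounted filesystem (a sketch; the mount point is a placeholder and the exact attribute set depends on the Ceph version):

    getfattr -n ceph.file.layout /mnt/cephfs/somefile
    setfattr -n ceph.dir.layout.object_size  -v 8388608 /mnt/cephfs/somedir
    setfattr -n ceph.dir.layout.stripe_count -v 4       /mnt/cephfs/somedir
    setfattr -n ceph.dir.layout.pool         -v mypool  /mnt/cephfs/somedir

Directory layouts only affect files created after the attribute is set; existing files keep their original layout.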
Re: [ceph-users] Crushmap ruleset for rack aware PG placement
Hi Daniel,
     Can you provide your exact crush map and the exact crushtool command that results in segfaults?

Johnu

On 9/16/14, 10:23 AM, "Daniel Swarbrick" wrote:

>Replying to myself, and for the benefit of other caffeine-starved people:
>
>Setting the last rule to "chooseleaf firstn 0" does not generate the
>desired results, and ends up sometimes putting all replicas in the same
>zone.
>
>I'm slowly getting the hang of customised crushmaps ;-)
>
>On 16/09/14 18:39, Daniel Swarbrick wrote:
>>
>> One other area I wasn't sure about - can the final "chooseleaf" step
>> specify "firstn 0" for simplicity's sake (and to automatically handle a
>> larger pool size in future)? Would there be any downside to this?
>>
>> Cheers
>>
>> On 16/09/14 16:20, Loic Dachary wrote:
>>> Hi Daniel,
>>>
>>> When I run
>>>
>>> crushtool --outfn crushmap --build --num_osds 100 host straw 2 rack straw 10 default straw 0
>>> crushtool -d crushmap -o crushmap.txt
>>> cat >> crushmap.txt <<EOF
>>> rule myrule {
>>>         ruleset 1
>>>         type replicated
>>>         min_size 1
>>>         max_size 10
>>>         step take default
>>>         step choose firstn 2 type rack
>>>         step chooseleaf firstn 2 type host
>>>         step emit
>>> }
>>> EOF
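[Note] For rules like the one quoted above, crushtool can also simulate placements without touching a live cluster, which is a safe way to see whether replicas end up spread across racks (flags below are standard crushtool options; run crushtool --help to confirm for your version):

    crushtool -c crushmap.txt -o crushmap           # compile the edited map
    crushtool -i crushmap --test --rule 1 --num-rep 4 --show-mappings | head
    crushtool -i crushmap --test --rule 1 --num-rep 4 --show-utilization

--show-mappings prints the OSDs chosen for each simulated input, so it is easy to eyeball whether the "choose firstn 2 type rack / chooseleaf firstn 2 type host" steps behave as intended.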
[ceph-users] Multi node dev environment
Hi,
   I was trying to set up a multi-node dev environment. Until now, I was building Ceph by running ./configure and make, and I tested features with vstart on a single node. If I instead need a multi-node cluster for testing, what is the proper way to do it? If I need to run benchmarks (using rados bench or other benchmarking tools) after a code change, what is the right practice for testing that change on a multi-node dev setup? (A multi-node setup is needed to get meaningful performance results in benchmark tests.)

Thanks,
Johnu
Re: [ceph-users] Multi node dev environment
How do I use ceph-deploy in this case? How do I get ceph-deploy to use my privately built ceph package (with my changes) and install it on all ceph nodes?

Johnu

On 10/2/14, 7:22 AM, "Loic Dachary" wrote:

>Hi,
>
>I would use ceph-deploy
>http://ceph.com/docs/master/start/quick-start-preflight/#ceph-deploy-setup
>but ... I've only done tests a few times and other people may have a
>more elaborate answer to this question ;-)
>
>Cheers
>
>On 02/10/2014 15:44, Johnu George (johnugeo) wrote:
>> Hi,
>> I was trying to set up a multi node dev environment. Till now, I was
>> building ceph by executing ./configure and make. I then used to test the
>> features by using vstart in a single node. Instead of it, if I still
>> need to use the multi node cluster for testing, what is the proper way
>> to do? If I need to run benchmarks (using rados bench or other
>> benchmarking tools) after any code change, what is the right practice to
>> test some change in a multi node dev setup? (Multi node setup is needed
>> as part of getting right performance results in benchmark tests)
>>
>> Thanks,
>> Johnu
>
>--
>Loïc Dachary, Artisan Logiciel Libre
Re: [ceph-users] Multi node dev environment
Hi Somnath,
         I will try the --dev option which you mentioned. Does it mean that I have to remove the osds and mon each time and then do ceph-deploy install --dev, ceph mon create, and ceph osd create again? The problem with the first option is that I have to manually install on 5-6 nodes for every small change.

Johnu

On 10/2/14, 1:55 PM, "Somnath Roy" wrote:

>I think you should just skip the 'ceph-deploy install' command and install
>your version of the ceph package on all the nodes manually.
>Otherwise there is ceph-deploy install --dev <branch> you can try out.
>
>Thanks & Regards
>Somnath
>
>[earlier messages in the thread and the corporate confidentiality notice snipped]
Re: [ceph-users] Multi node dev environment
Even when I try ceph-deploy install --dev <branch>, I am seeing that it gets installed from the official ceph repo. How can I install ceph from my github repo or my local repo on all ceph nodes? (Or is there any other possibility?) Can someone help me in setting this up?

Johnu

On 10/2/14, 1:55 PM, "Somnath Roy" wrote:

>I think you should just skip the 'ceph-deploy install' command and install
>your version of the ceph package on all the nodes manually.
>Otherwise there is ceph-deploy install --dev <branch> you can try out.
>
>Thanks & Regards
>Somnath
>
>[rest of the quoted thread and the confidentiality notice snipped]
Re: [ceph-users] Multi node dev environment
Thanks Alfredo. Is there any other possible way that will work for my situation? Anything would be helpful Johnu On 10/7/14, 2:25 PM, "Alfredo Deza" wrote: >On Tue, Oct 7, 2014 at 5:05 PM, Johnu George (johnugeo) > wrote: >> Even when I try ceph-deploy install --dev , I >> am seeing that it is getting installed from official ceph repo. How can >>I >> install ceph from my github repo or my local repo in all ceph nodes? (Or >> any other possibility? ). Someone can help me in setting this? > >That is just not possible. Only branches that are pushed to the Ceph >repo are available through the >`--dev` flag because they rely on a URL structure and repo that we >maintain. > > >> >> Johnu >> >> >> >> On 10/2/14, 1:55 PM, "Somnath Roy" wrote: >> >>>I think you should just skip 'ceph-deploy install' command and install >>>your version of the ceph package in all the nodes manually. >>>Otherwise there is ceph-deploy install --dev you can try >>>out. >>> >>>Thanks & Regards >>>Somnath >>> >>>-Original Message- >>>From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of >>>Johnu George (johnugeo) >>>Sent: Thursday, October 02, 2014 1:08 PM >>>To: Loic Dachary >>>Cc: ceph-users@lists.ceph.com >>>Subject: Re: [ceph-users] Multi node dev environment >>> >>>How do I use ceph-deploy in this case?. How do I get ceph-deploy to use >>>my privately built ceph package (with my changes) and install them in >>>all >>>ceph nodes? >>> >>> >>>Johnu >>> >>>On 10/2/14, 7:22 AM, "Loic Dachary" wrote: >>> >>>>Hi, >>>> >>>>I would use ceph-deploy >>>>http://ceph.com/docs/master/start/quick-start-preflight/#ceph-deploy-se >>>>tup but ... I've only done tests a few times and other people may have >>>>a more elaborate answer to this question ;-) >>>> >>>>Cheers >>>> >>>>On 02/10/2014 15:44, Johnu George (johnugeo) wrote:> Hi, >>>>> I was trying to set up a multi node dev environment. Till now, I was >>>>>building ceph by executing ./configure and make. I then used to test >>>>>the features by using vstart in a single node. Instead of it, if I >>>>>still need to use the multi node cluster for testing, what is the >>>>>proper way to do?. If I need to run benchmarks(using rados bench or >>>>>other benchmarking tools) after any code change, what is the right >>>>>practice to test some change in a multi node dev setup? ( Multi node >>>>>setup is needed as part of getting right performance results in >>>>>benchmark tests) >>>>> >>>>> >>>>> Thanks, >>>>> Johnu >>>>> >>>>> >>>>> ___ >>>>> ceph-users mailing list >>>>> ceph-users@lists.ceph.com >>>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com >>>>> >>>> >>>>-- >>>>Loïc Dachary, Artisan Logiciel Libre >>>> >>> >>>___ >>>ceph-users mailing list >>>ceph-users@lists.ceph.com >>>http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com >>> >>> >>> >>>PLEASE NOTE: The information contained in this electronic mail message >>>is >>>intended only for the use of the designated recipient(s) named above. If >>>the reader of this message is not the intended recipient, you are hereby >>>notified that you have received this message in error and that any >>>review, dissemination, distribution, or copying of this message is >>>strictly prohibited. If you have received this communication in error, >>>please notify the sender by telephone or e-mail (as shown above) >>>immediately and destroy any and all copies of this message in your >>>possession (whether hard copies or electronically stored copies). 
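Following up on Somnath's suggestion above to skip 'ceph-deploy install' and push a privately built package set to the nodes by hand, here is a minimal sketch of what that could look like. It is only an illustration under stated assumptions: Debian/Ubuntu nodes reachable over passwordless SSH with sudo rights, and .deb packages already built locally (for example with dpkg-buildpackage); the hostnames and build directory are made-up placeholders, not anything from this thread.

#!/usr/bin/env python
# Rough sketch (not a supported tool): copy locally built ceph .deb
# packages to each node and install them with dpkg. Assumes passwordless
# SSH with sudo, Debian/Ubuntu nodes, and packages already built into
# BUILD_DIR on this host. Hostnames and path are placeholders.
import glob
import subprocess

NODES = ["ceph-node1", "ceph-node2", "ceph-node3"]   # placeholder hostnames
BUILD_DIR = "/home/build/ceph-debs"                  # placeholder path

def run(cmd):
    print(" ".join(cmd))
    subprocess.check_call(cmd)

debs = glob.glob(BUILD_DIR + "/*.deb")
if not debs:
    raise SystemExit("no .deb packages found in " + BUILD_DIR)

for node in NODES:
    # copy the freshly built packages to the node
    run(["scp"] + debs + [node + ":/tmp/"])
    # install them; 'apt-get -f install' pulls in any missing dependencies
    run(["ssh", node,
         "sudo dpkg -i /tmp/*.deb || sudo apt-get -f install -y"])

After that, the usual ceph-deploy steps (mon create, osd prepare, and so on) can still be used for cluster setup, since only the package installation step is being replaced here.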
[ceph-users] Regarding Primary affinity configuration
Hi All,
    I have a few questions regarding primary affinity. In the original blueprint (https://wiki.ceph.com/Planning/Blueprints/Firefly/osdmap%3A_primary_role_affinity), one example is given:

For PG x, CRUSH returns [a, b, c]. If a has a primary_affinity of .5, and b and c have 1, then with 50% probability we will choose b or c instead of a (25% for b, 25% for c).

A) I was browsing through the code, but I could not find the logic that splits the remaining probability between the other OSDs. How is this handled? (See the sketch after this message.)

  if (a < CEPH_OSD_MAX_PRIMARY_AFFINITY &&
      (crush_hash32_2(CRUSH_HASH_RJENKINS1,
                      seed, o) >> 16) >= a) {
    // we chose not to use this primary. note it anyway as a
    // fallback in case we don't pick anyone else, but keep looking.
    if (pos < 0)
      pos = i;
  } else {
    pos = i;
    break;
  }
}

B) Since the primary affinity value is configured independently per OSD, there can be a situation like [0.1, 0.1, 0.1] with a total value that doesn't add up to 1. How is this taken care of?

C) Slightly confused: what happens in a situation like [1, 0.5, 1]? Is osd.0 always returned?

D) After calculating the primary based on the affinity values, I see a shift of OSDs so that the primary comes to the front. Why is this needed? I thought the primary affinity value affects only reads, and hence the OSD ordering need not be changed.

Thanks,
Johnu
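A minimal Python sketch of how that loop behaves (an illustration only, not the actual Ceph code: a uniform random draw per trial stands in for hashing a different PG seed, and the OSD names and trial counts are made up):

import random

MAX_AFFINITY = 0x10000   # stands in for CEPH_OSD_MAX_PRIMARY_AFFINITY

def choose_primary(osds, affinity):
    # Walk the CRUSH-ordered list. An osd with less than full affinity is
    # skipped when the (simulated) hash draw lands at or above its affinity,
    # but the first skipped osd is remembered as a fallback in case nobody
    # else is accepted.
    pos = -1
    for i, o in enumerate(osds):
        a = affinity[o]
        if a < MAX_AFFINITY and random.randrange(MAX_AFFINITY) >= a:
            if pos < 0:
                pos = i          # note as fallback, keep looking
        else:
            pos = i              # accepted: this is the primary
            break
    return osds[pos]

# Example: one PG whose CRUSH order is [a, b, c] with affinities 0.5, 1, 1.
affinity = {"a": MAX_AFFINITY // 2, "b": MAX_AFFINITY, "c": MAX_AFFINITY}
counts = {"a": 0, "b": 0, "c": 0}
for _ in range(10000):
    counts[choose_primary(["a", "b", "c"], affinity)] += 1
print(counts)   # roughly half "a" and half "b"; "c" is never reached here

For the fixed ordering [a, b, c] with affinities [0.5, 1, 1], the fallback in this sketch always lands on b and c is never picked; the 25%/25% split between b and c only appears once the CRUSH ordering itself varies from PG to PG, which is what the replies below get into.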
Re: [ceph-users] Monitor segfaults when updating the crush map
Stephen,
    You are right, the crash can happen if the replica size doesn't match the number of OSDs. I am not sure if there exists any other solution for your problem ("choose the first 2 replicas from one rack and choose the third replica from a different rack"). Some different thoughts:

1) If you have 3 racks, you can try to choose 3 racks and chooseleaf 1 host, ensuring three separate racks and three replicas.

2) Another thought:

  take rack1
  chooseleaf firstn 2 type host
  emit
  take rack2
  chooseleaf firstn 1 type host
  emit

This of course restricts the first 2 replicas to rack1 and may become unbalanced (ensure there is enough storage in rack1).

Thanks,
Johnu

From: Stephen Jahl <stephenj...@gmail.com>
Date: Thursday, October 9, 2014 at 11:11 AM
To: Loic Dachary <l...@dachary.org>
Cc: "ceph-users@lists.ceph.com" <ceph-users@lists.ceph.com>
Subject: Re: [ceph-users] Monitor segfaults when updating the crush map

Thanks Loic,

In my case, I actually only have three replicas for my pools -- with this rule, I'm trying to ensure that OSDs in at least two racks are selected. Since the replica size is only 3, I think I'm still affected by the bug (unless of course I set my replica size to 4). Is there a better way I can express what I want in the crush rule, preferably in a way not hit by that bug ;) ?

Is there an ETA on when that bugfix might land in firefly?

Best,
-Steve

On Thu, Oct 9, 2014 at 1:59 PM, Loic Dachary <l...@dachary.org> wrote:

Hi Stephen,

It looks like you're hitting http://tracker.ceph.com/issues/9492 which has been fixed but is not yet available in firefly. The simplest workaround is to use min_size 4 in this case.

Cheers

On 09/10/2014 19:31, Stephen Jahl wrote:
> Hi All,
>
> I'm trying to add a crush rule to my map, which looks like this:
>
> rule rack_ruleset {
>         ruleset 1
>         type replicated
>         min_size 1
>         max_size 10
>         step take default
>         step choose firstn 2 type rack
>         step chooseleaf firstn 2 type host
>         step emit
> }
>
> I'm not configuring any pools to use the ruleset at this time. When I
> recompile the map and test the rule with crushtool --test, everything
> seems fine, and I'm not noticing anything out of the ordinary.
>
> But when I try to inject the compiled crush map back into the cluster
> like this:
>
> ceph osd setcrushmap -i /path/to/compiled-crush-map
>
> the monitor process appears to stop, and I see a monitor election
> happening. Things hang until I ^C the setcrushmap command, and I need to
> restart the monitor processes to make things happy again (and the crush
> map never ends up getting updated).
>
> In the monitor logs, I see several segfaults that look like this:
> http://pastebin.com/K1XqPpbF
>
> I'm running ceph 0.80.5-1trusty on Ubuntu 14.04 with kernel
> 3.13.0-35-generic.
>
> Anyone have any ideas as to what is happening?
>
> -Steve

--
Loïc Dachary, Artisan Logiciel Libre
Re: [ceph-users] Regarding Primary affinity configuration
Hi Greg,
    Thanks for your extremely informative post. My related questions are posted inline.

On 10/9/14, 2:21 PM, "Gregory Farnum" wrote:

>On Thu, Oct 9, 2014 at 10:55 AM, Johnu George (johnugeo) wrote:
>> Hi All,
>> I have a few questions regarding primary affinity. In the original
>> blueprint
>> (https://wiki.ceph.com/Planning/Blueprints/Firefly/osdmap%3A_primary_role_affinity),
>> one example is given.
>>
>> For PG x, CRUSH returns [a, b, c]. If a has a primary_affinity of .5,
>> and b and c have 1, then with 50% probability we will choose b or c
>> instead of a (25% for b, 25% for c).
>>
>> A) I was browsing through the code, but I could not find the logic that
>> splits the remaining probability between the other OSDs. How is this
>> handled?
>
>It's a fallback mechanism: if the chosen primary for a PG has primary
>affinity less than the default (max), we (probabilistically) look for
>a different OSD to be the primary. We decide whether to offload by
>running a hash and discarding the OSD if the output value is greater
>than the OSD's affinity, and then we go through the list and run that
>calculation in order (obviously if the affinity is 1, then it passes
>without needing to run the hash).
>If no OSD in the list has a high enough hash value, we take the
>originally-chosen primary.

As in the example for [0.5, 1, 1], I got your point that with 50% probability the first OSD will be chosen. But how do we ensure that the second and third OSDs get the remaining 25% and 25% respectively? I could see only individual primary affinity values, but no sum value anywhere to ensure that.

>> B) Since the primary affinity value is configured independently, there
>> can be a situation like [0.1, 0.1, 0.1] with a total value that doesn't
>> add up to 1. How is this taken care of?
>
>These primary affinity values are just compared against the hash
>output I mentioned, so the sum doesn't matter. In general we simply
>expect that OSDs which don't have the max weight value will be chosen
>as primary in proportion to their share of the total weight of their
>PG membership (ie, if they have a weight of .5 and everybody else has
>weight 1, they will be primary in half the normal number of PGs. If
>everybody has a weight of .5, they will be primary in the normal
>proportions. Etc).

I got your idea, but I couldn't figure that out from the code. You said that OSDs will be chosen as primary in proportion to their share of the total weight of their PG membership. But from what I understood from the code, if it is [0.1, 0.1, 0.1], the first OSD will always be chosen. (Probabilistically, for 10% of the reads it will choose the first OSD; however, the first OSD will still be chosen for the rest of the reads through the fallback mechanism, since it is the originally chosen primary.) Am I wrong?

>> C) Slightly confused. What happens for a situation with [1, 0.5, 1]?
>> Is osd.0 always returned?
>
>If the first OSD in the PG list has a primary affinity of 1 then it is
>always the primary for that PG, yes. That's not osd.0, though; just
>the first OSD in the PG list. ;)

Sorry, I meant the first OSD but accidentally wrote osd.0.
As you said, if the first OSD in the PG list is always selected in this scenario, doesn't it violate our assumption of probabilistically having 25%, 50%, 25% of reads on the first, second and third OSD respectively?

>> D) After calculating the primary based on the affinity values, I see a
>> shift of OSDs so that the primary comes to the front. Why is this
>> needed? I thought the primary affinity value affects only reads, and
>> hence the OSD ordering need not be changed.
>
>Primary affinity impacts which OSD is chosen to be primary; the
>primary is the ordering point for *all* access to the PG. That
>includes writes as well as reads, plus coordination of the cluster on
>map changes. We move the primary to the front of the list...well, I
>think it's just because we were lazy and there are a bunch of places
>that assume the first OSD in a replicated pool is the primary.

Does that mean that the OSD set ordering keeps on changing (in real time) for
Re: [ceph-users] Regarding Primary affinity configuration
Thanks for the detailed post, Greg. I was trying to configure primary affinity in my cluster but I didn't see the expected results. As you said, I was just looking at a single PG and got it wrong. I also had a primary affinity value configured for multiple OSDs in a PG, which makes the calculation more complex. As in your example: if osd0, osd1, osd2 have primary affinity values of [1, 0.5, 0.1] and there are 600 PGs, the final distribution comes out at 440:140:20, or 22:7:1, which is slightly skewed from what I expected. (A small simulation along these lines is sketched after this message.)

Johnu

On 10/9/14, 4:51 PM, "Gregory Farnum" wrote:

>On Thu, Oct 9, 2014 at 4:24 PM, Johnu George (johnugeo) wrote:
>> Hi Greg,
>> Thanks for your extremely informative post. My related questions are
>> posted inline.
>>
>> On 10/9/14, 2:21 PM, "Gregory Farnum" wrote:
>>
>>>> A) I was browsing through the code, but I could not find the logic
>>>> that splits the remaining probability between the other OSDs. How is
>>>> this handled?
>>>
>>>It's a fallback mechanism: if the chosen primary for a PG has primary
>>>affinity less than the default (max), we (probabilistically) look for
>>>a different OSD to be the primary. We decide whether to offload by
>>>running a hash and discarding the OSD if the output value is greater
>>>than the OSD's affinity, and then we go through the list and run that
>>>calculation in order (obviously if the affinity is 1, then it passes
>>>without needing to run the hash).
>>>If no OSD in the list has a high enough hash value, we take the
>>>originally-chosen primary.
>>
>> As in the example for [0.5, 1, 1], I got your point that with 50%
>> probability the first OSD will be chosen. But how do we ensure that the
>> second and third OSDs get the remaining 25% and 25% respectively? I
>> could see only individual primary affinity values, but no sum value
>> anywhere to ensure that.
>
>Well, for any given PG with that pattern, the second OSD in the list
>is going to be chosen. But *which* osd is listed second is random, so
>if you only have 3 OSDs 0,1,2 (with weights .5, 1, 1, respectively),
>then the PGs in total will work in a 1:2:2 ratio because OSDs 1 and 2
>will between themselves be first in half of the PG lists.
>
>>>> B) Since the primary affinity value is configured independently,
>>>> there can be a situation like [0.1, 0.1, 0.1] with a total value that
>>>> doesn't add up to 1. How is this taken care of?
>>>
>>>These primary affinity values are just compared against the hash
>>>output I mentioned, so the sum doesn't matter. In general we simply
>>>expect that OSDs which don't have the max weight value will be chosen
>>>as primary in proportion to their share of the total weight of their
>>>PG membership (ie, if they have a weight of .5 and everybody else has
>>>weight 1, they will be primary in half the normal number of PGs. If
>>>everybody has a weight of .5, they will be primary in the normal
>>>proportions. Etc).
>>
>> I got your idea, but I couldn't figure that out from the code. You said
>> that OSDs will be chosen as primary in proportion to their share of the
>> total weight of their PG membership. But from what I understood from
>> the code, if it is [0.1
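To make the ratio discussion above concrete, here is a rough simulation of 600 PGs over three OSDs with primary affinities [1, 0.5, 0.1]. It is only a toy model: a random shuffle stands in for the CRUSH ordering of each PG's acting set, a random draw stands in for the hash, and every PG is assumed to map to all three OSDs.

import random

MAX_AFFINITY = 0x10000   # stands in for CEPH_OSD_MAX_PRIMARY_AFFINITY

def choose_primary(osds, affinity):
    # Same fallback logic as the snippet quoted earlier in the thread: an
    # osd with less than full affinity passes only if the (simulated) hash
    # draw falls below its affinity; the first rejected osd is kept as a
    # fallback in case nobody else is accepted.
    pos = -1
    for i, o in enumerate(osds):
        a = affinity[o]
        if a < MAX_AFFINITY and random.randrange(MAX_AFFINITY) >= a:
            if pos < 0:
                pos = i
        else:
            pos = i
            break
    return osds[pos]

affinity = {0: MAX_AFFINITY,          # osd0: primary affinity 1.0
            1: MAX_AFFINITY // 2,     # osd1: 0.5
            2: MAX_AFFINITY // 10}    # osd2: 0.1
counts = {0: 0, 1: 0, 2: 0}
for pg in range(600):
    order = [0, 1, 2]
    random.shuffle(order)   # stands in for CRUSH ordering the acting set
    counts[choose_primary(order, affinity)] += 1
print(counts)

In runs of this toy model the split tends to land in the same neighbourhood as the 440:140:20 reported above rather than in proportion to the raw 1 : 0.5 : 0.1 values, because each affinity acts as an independent pass/fail threshold at its position in the list rather than as a normalized weight.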