Clark,
What about ephemeral storage on the OVH VMs? I have been seeing many storage-related errors there these days (full output below). Basically, Docker cannot create its storage device on the local drive:

-- Logs begin at Mon 2015-12-14 06:40:09 UTC, end at Mon 2015-12-14 07:00:38 UTC. --

Dec 14 06:45:50 <http://logs.openstack.org/58/251158/3/check/gate-functional-dsvm-magnum-k8s/5ed0e01/logs/bay-nodes/worker-test_replication_controller_apis-172.24.5.11/docker.txt.gz#_Dec_14_06_45_50> te-egw4i5xthw-0-nmaiwpjhkqg6-kube-minion-5emvszmbwpi2 systemd[1]: Stopped Docker Application Container Engine.
Dec 14 06:47:54 <http://logs.openstack.org/58/251158/3/check/gate-functional-dsvm-magnum-k8s/5ed0e01/logs/bay-nodes/worker-test_replication_controller_apis-172.24.5.11/docker.txt.gz#_Dec_14_06_47_54> te-egw4i5xthw-0-nmaiwpjhkqg6-kube-minion-5emvszmbwpi2 systemd[1]: Starting Docker Application Container Engine...
Dec 14 06:48:00 <http://logs.openstack.org/58/251158/3/check/gate-functional-dsvm-magnum-k8s/5ed0e01/logs/bay-nodes/worker-test_replication_controller_apis-172.24.5.11/docker.txt.gz#_Dec_14_06_48_00> te-egw4i5xthw-0-nmaiwpjhkqg6-kube-minion-5emvszmbwpi2 docker[1022]: Warning: '-d' is deprecated, it will be removed soon. See usage.
Dec 14 06:48:00 <http://logs.openstack.org/58/251158/3/check/gate-functional-dsvm-magnum-k8s/5ed0e01/logs/bay-nodes/worker-test_replication_controller_apis-172.24.5.11/docker.txt.gz#_Dec_14_06_48_00> te-egw4i5xthw-0-nmaiwpjhkqg6-kube-minion-5emvszmbwpi2 docker[1022]: time="2015-12-14T06:48:00Z" level=warning msg="please use 'docker daemon' instead."
Dec 14 06:48:03 <http://logs.openstack.org/58/251158/3/check/gate-functional-dsvm-magnum-k8s/5ed0e01/logs/bay-nodes/worker-test_replication_controller_apis-172.24.5.11/docker.txt.gz#_Dec_14_06_48_03> te-egw4i5xthw-0-nmaiwpjhkqg6-kube-minion-5emvszmbwpi2 docker[1022]: time="2015-12-14T06:48:03.447936206Z" level=info msg="Listening for HTTP on unix (/var/run/docker.sock)"
Dec 14 06:48:06 <http://logs.openstack.org/58/251158/3/check/gate-functional-dsvm-magnum-k8s/5ed0e01/logs/bay-nodes/worker-test_replication_controller_apis-172.24.5.11/docker.txt.gz#_Dec_14_06_48_06> te-egw4i5xthw-0-nmaiwpjhkqg6-kube-minion-5emvszmbwpi2 docker[1022]: time="2015-12-14T06:48:06.280086735Z" level=fatal msg="Error starting daemon: error initializing graphdriver: Non existing device docker-docker--pool"
Dec 14 06:48:06 <http://logs.openstack.org/58/251158/3/check/gate-functional-dsvm-magnum-k8s/5ed0e01/logs/bay-nodes/worker-test_replication_controller_apis-172.24.5.11/docker.txt.gz#_Dec_14_06_48_06> te-egw4i5xthw-0-nmaiwpjhkqg6-kube-minion-5emvszmbwpi2 systemd[1]: docker.service: main process exited, code=exited, status=1/FAILURE
Dec 14 06:48:06 <http://logs.openstack.org/58/251158/3/check/gate-functional-dsvm-magnum-k8s/5ed0e01/logs/bay-nodes/worker-test_replication_controller_apis-172.24.5.11/docker.txt.gz#_Dec_14_06_48_06> te-egw4i5xthw-0-nmaiwpjhkqg6-kube-minion-5emvszmbwpi2 systemd[1]: Failed to start Docker Application Container Engine.
Dec 14 06:48:06 <http://logs.openstack.org/58/251158/3/check/gate-functional-dsvm-magnum-k8s/5ed0e01/logs/bay-nodes/worker-test_replication_controller_apis-172.24.5.11/docker.txt.gz#_Dec_14_06_48_06> te-egw4i5xthw-0-nmaiwpjhkqg6-kube-minion-5emvszmbwpi2 systemd[1]: Unit docker.service entered failed state.
Dec 14 06:48:06 <http://logs.openstack.org/58/251158/3/check/gate-functional-dsvm-magnum-k8s/5ed0e01/logs/bay-nodes/worker-test_replication_controller_apis-172.24.5.11/docker.txt.gz#_Dec_14_06_48_06> te-egw4i5xthw-0-nmaiwpjhkqg6-kube-minion-5emvszmbwpi2 systemd[1]: docker.service failed.

Full log: http://logs.openstack.org/58/251158/3/check/gate-functional-dsvm-magnum-k8s/5ed0e01/logs/bay-nodes/worker-test_replication_controller_apis-172.24.5.11/docker.txt.gz

— Egor

On 12/13/15, 10:51, "Clark Boylan" <cboy...@sapwetik.org> wrote:

>On Sat, Dec 12, 2015, at 02:16 PM, Hongbin Lu wrote:
>> Hi,
>>
>> As Kai Qiang mentioned, magnum gate recently had a bunch of random
>> failures, which occurred on creating a nova instance with 2G of RAM.
>> According to the error message, it seems that the hypervisor tried to
>> allocate memory to the nova instance but couldn’t find enough free memory
>> in the host. However, by adding a few “nova hypervisor-show XX” before,
>> during, and right after the test, it showed that the host has 6G of free
>> RAM, which is far more than 2G. Here is a snapshot of the output [1]. You
>> can find the full log here [2].
>
>If you look at the dstat log
>http://logs.openstack.org/07/244907/5/check/gate-functional-dsvm-magnum-k8s/5305d7a/logs/screen-dstat.txt.gz
>the host has nowhere near 6GB free memory and less than 2GB. I think you
>actually are just running out of memory.
>
>> Another observation is that most of the failure happened on a node with
>> name “devstack-trusty-ovh-*” (You can verify it by entering a query [3]
>> at http://logstash.openstack.org/ ). It seems that the jobs will be fine
>> if they are allocated to a node other than “ovh”.
>
>I have just done a quick spot check of the total memory on
>devstack-trusty hosts across HPCloud, Rackspace, and OVH using `free -m`
>and the results are 7480, 7732, and 6976 megabytes respectively. Despite
>using 8GB flavors in each case there is variation and OVH comes in on
>the low end for some reason.
>I am guessing that you fail here more often
>because the other hosts give you just enough extra memory to boot these
>VMs.
>
>We will have to look into why OVH has less memory despite using flavors
>that should be roughly equivalent.
>
>> Any hints to debug this issue further? Suggestions are greatly
>> appreciated.
>>
>> [1] http://paste.openstack.org/show/481746/
>> [2] http://logs.openstack.org/48/256748/1/check/gate-functional-dsvm-magnum-swarm/56d79c3/console.html
>> [3] https://review.openstack.org/#/c/254370/2/queries/1521237.yaml
>>
>> Best regards,
>> Hongbin
>
>Clark
>
>__________________________________________________________________________
>OpenStack Development Mailing List (not for usage questions)
>Unsubscribe: openstack-dev-requ...@lists.openstack.org?subject:unsubscribe
>http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
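For anyone who wants to reproduce Clark's spot check on a suspect host, here is a minimal sketch that reads `/proc/meminfo` directly instead of `free -m`. The 2048 MB threshold mirrors the instance size from the failing tests and is an assumption of this sketch; note that MemFree undercounts reclaimable page cache, so treat the warning as a rough lower bound rather than a definitive verdict.

```shell
# Report total and free memory in MB so an OVH host can be compared with
# the HPCloud/Rackspace numbers quoted in the thread (7480, 7732, 6976 MB).
total_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
free_kb=$(awk '/^MemFree:/ {print $2}' /proc/meminfo)
echo "MemTotal: $((total_kb / 1024)) MB"
echo "MemFree:  $((free_kb / 1024)) MB"

# The failing tests boot a 2048 MB nova instance; warn when headroom
# falls short of that (assumed threshold, ignoring reclaimable cache).
if [ $((free_kb / 1024)) -lt 2048 ]; then
    echo "WARNING: less than 2048 MB free; a 2G instance may fail to boot"
fi
```

Running this in the job (for example from a devstack post-test hook) alongside the dstat log would show whether the OVH hosts ever have 2 GB of real headroom at the moment the instance is scheduled.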