Hi,

This is a tale of a problem and its solution. I'm posting it to the mailing list so that anyone who hits the same problem can find it with a search engine. This is also why I'm packing it with SEO keywords.

===The only thing I'd like to ask the community=== is to help me identify the proper places in the documentation and wiki which could be improved with this experience. Just point me to them, and I'll improve them.

* I suffered from not knowing what the network is supposed to look like in each networking mode: which interfaces there should be, how they should be configured (e.g. which should have an IP address and which shouldn't), and which pings should work and which shouldn't. Also, which piece of code is supposed to do each piece of network configuration, on behalf of whom, and when.
* I suffered from the lack of a troubleshooting guide.

Now, to the actual business.

I hereby glorify the person with IRC nick Diopter, who spent a lot of time helping me resolve this problem. Many of the things I say "I tried" below, I tried after he suggested them.

== Environment ==

I have a few machines (actually VMs, but that doesn't matter): a controller and a couple of compute nodes. They are connected by 2 networks:
1) internal network on eth1
2) VM network (10.0.0.0/24) on eth2

I've freshly deployed OpenStack on them in Flat DHCP mode (FlatDHCPManager), following the recipes of https://github.com/puppetlabs/puppetlabs-openstack.

I imported the Cirros image I got from http://wiki.openstack.org/GettingImages [as it turns out, incorrectly] and "nova boot"-ed an instance, which got the IP 10.0.0.3. The instance status was ACTIVE.

The network configuration looked exactly as it's supposed to look in Flat DHCP mode: all machines had br100 attached to eth2, and the VM's vnet0 was attached to br100 as well. None of br100, eth2, vnet0 had an IP address (this is correct too).

== Problem ==

Then I pinged the VM:

$ ping 10.0.0.3

and got "Destination Host Unreachable".

== Investigations ==

A really weird thing is that I would usually get a single ping response per VM lifecycle.

Another weird thing is that dnsmasq on the controller was properly configured, and /var/log/syslog had DHCPREQUEST and even DHCPACK entries for the VM's MAC and IP.

I tried "arp 10.0.0.3" from both the controller and the VM, and I also tried "arping", but to no avail: tcpdump always showed that ARP requests arrived where they should, but there were no ARP responses.

Another sad thing is that you can't catch ARP packets with iptables -j TRACE and ipt_LOG, and arptables has no logging facilities at all.

I authorized the default security group to allow ICMP traffic (Horizon -> Projects -> Security -> allow ICMP protocol code -1 type -1 CIDR 0.0.0.0). That didn't help.

Then I looked at the instance log (/var/lib/nova/instances/instance*/console.log), and it turned out to be empty! Now this was not expected behavior.

Since the controller and VMs were headless, I couldn't VNC to the instance. So I took a screenshot with "virsh screenshot instance-00000009". The screenshot said "Boot failed: not a bootable disk", then showed the PXE boot process, then "No more network devices" and finally "No bootable device".

== True problem and solution ==

So, it turns out the ACTUAL PROBLEM was this: I imported the image incorrectly (glance add name=cirros disk_format=raw container_format=bare < cirros.img), so the instance did not see a bootable hard disk.

The sole ping that got through apparently did so while the BIOS was attempting a PXE boot; after that failed, the instance wasn't listening to network packets anymore.

The proper way to import the Cirros image is this:

* Download it from https://launchpad.net/cirros/trunk/0.3.0/+download/cirros-0.3.0-x86_64-disk.img (just getting the .img from inside the -uec- image .tar.gz at http://wiki.openstack.org/GettingImages won't do!)
* Import it as "glance add name=cirros container_format=bare disk_format=qcow2 < cirros-0.3.0-x86_64-disk.img"
* Then everything works fine.

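A sanity check along the following lines might have caught the mismatch before the import. This is only a minimal sketch, not part of glance or nova: the function name guess_disk_format is made up for illustration, and the only fact it relies on is that qcow2 files start with the four magic bytes 'Q' 'F' 'I' 0xfb.

```shell
# Hypothetical helper (not an OpenStack tool): guess whether an image
# file is qcow2 by looking at its first four bytes.
# qcow2 files start with the magic bytes 'Q' 'F' 'I' 0xfb.
guess_disk_format() {
    # Dump the first 4 bytes as hex and strip spaces and newlines.
    magic=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')
    if [ "$magic" = "514649fb" ]; then
        echo qcow2
    else
        # Raw images carry no signature, so all we can say is "not qcow2".
        echo not-qcow2
    fi
}

# Example use (the file name is the one from the download step above):
#   guess_disk_format cirros-0.3.0-x86_64-disk.img
```

If the helper prints qcow2, disk_format=qcow2 is the value to pass to glance. An answer of not-qcow2 does not prove the file is a bootable raw disk; it only says the qcow2 signature is absent. Where qemu-img is installed, "qemu-img info <file>" reports the file format as well.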
REMINDER: Please help me find the places in the documentation that you think could benefit from being improved based on this experience.

--
Eugene Kirpichov
http://www.linkedin.com/in/eugenekirpichov

_______________________________________________
Mailing list: https://launchpad.net/~openstack
Post to     : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp