There is more history and documentation in Question #112157.

Effectively, I am running the cluster controller of a MANAGED-NOVLAN UEC on an instance of Ubuntu 10.04 Desktop. What I have noticed over time is that running instances go deaf, often while what look like valid IP addresses are still listed in the describe-instances output. Both the public and private IP addresses become unreachable. What's odd is that when a VPN session intervened, the IP endpoints to that cloud instance were removed after the VPN session was closed.
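One way to check whether the endpoints are being dropped on the CC is to scan syslog for avahi-daemon withdrawal messages. The following is a minimal sketch, assuming the message format matches the avahi-daemon lines in the syslog excerpt quoted later in this report. The function name withdrawn_addresses and the hard-coded sample lines are illustrative only, and the sketch is not part of UEC or avahi.

```python
import re

# Matches lines such as:
#   ... avahi-daemon[1175]: Withdrawing address record for 172.19.1.1 on eth0.
# The pattern is an assumption based on the log lines quoted in this report.
WITHDRAW_RE = re.compile(
    r"avahi-daemon\[\d+\]: Withdrawing address record for (\S+) on (\w+)\."
)


def withdrawn_addresses(lines):
    """Return (address, interface) pairs for each avahi withdrawal line."""
    found = []
    for line in lines:
        match = WITHDRAW_RE.search(line)
        if match:
            found.append((match.group(1), match.group(2)))
    return found


# Sample lines copied from the syslog excerpt in this report.
sample = [
    "May 28 18:28:26 cor720 vpnc[3537]: terminated by signal: 15",
    "May 28 18:28:26 cor720 avahi-daemon[1175]: Withdrawing address record for 172.19.1.1 on eth0.",
    "May 28 18:28:26 cor720 avahi-daemon[1175]: Withdrawing address record for 192.168.3.100 on eth0.",
    "May 28 18:28:27 cor720 NetworkManager: <info> Policy set 'Auto eth0' (eth0) as default for routing and DNS.",
]

for address, interface in withdrawn_addresses(sample):
    print(address, interface)
```

To run it against a live system, the sample list could be replaced with the lines of the syslog file on the CC (its path depends on the local syslog configuration).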
Even with a separate, dedicated CC, would one not still lose connectivity from the client machine? My existing (and previously existing) network was 192.168.0.xxx, served by my wireless router. The segment reserved for the cloud instances' public IP addresses was 192.168.3.0 to 192.168.3.50, or some such limited range.

What is avahi, and why is it withdrawing the endpoint IPs to the cloud instance?

More information... consider this:

May 28 18:26:19 cor720 NetworkManager: <info> Maximum Segment Size (MSS): 0
May 28 18:26:19 cor720 NetworkManager: <info> Static Route: 10.0.0.0/8 Next Hop: 10.0.0.0
May 28 18:26:19 cor720 NetworkManager: <info> Static Route: 192.168.251.0/24 Next Hop: 192.168.251.0
May 28 18:26:19 cor720 NetworkManager: <info> Static Route: 192.168.22.0/24 Next Hop: 192.168.22.0
May 28 18:26:19 cor720 NetworkManager: <info> Static Route: 192.168.23.0/24 Next Hop: 192.168.23.0
May 28 18:26:19 cor720 NetworkManager: <info> Static Route: 192.168.24.0/24 Next Hop: 192.168.24.0
May 28 18:26:19 cor720 NetworkManager: <info> Static Route: 63.131.134.0/24 Next Hop: 63.131.134.0
May 28 18:26:19 cor720 NetworkManager: <info> Static Route: 208.111.81.157/32 Next Hop: 208.111.81.157
May 28 18:26:19 cor720 NetworkManager: <info> Static Route: 208.111.81.159/32 Next Hop: 208.111.81.159
May 28 18:26:19 cor720 NetworkManager: <info> Static Route: 72.20.25.16/32 Next Hop: 72.20.25.16
May 28 18:26:19 cor720 NetworkManager: <info> Static Route: 209.249.222.54/32 Next Hop: 209.249.222.54
May 28 18:26:19 cor720 NetworkManager: <info> Internal IP4 DNS: 10.50.33.21
May 28 18:26:19 cor720 NetworkManager: <info> Internal IP4 DNS: 10.5.4.1
May 28 18:26:19 cor720 NetworkManager: <info> DNS Domain: 'na.global.ad'
May 28 18:26:19 cor720 NetworkManager: <info> Login Banner:
May 28 18:26:19 cor720 NetworkManager: <info> -----------------------------------------
May 28 18:26:19 cor720 NetworkManager: <info> -----------------------------------------
May 28 18:26:19 cor720 vpnc[3537]: can't open pidfile /var/run/vpnc/pid for writing
May 28 18:26:20 cor720 NetworkManager: <info> VPN connection 'Monster (Maynard)' (IP Config Get) complete.
May 28 18:26:20 cor720 NetworkManager: <info> Policy set 'Monster (Maynard)' (tun0) as default for routing and DNS.
May 28 18:26:20 cor720 vmnetBridge: RTM_NEWROUTE: index:5
May 28 18:26:20 cor720 NetworkManager: <info> VPN plugin state changed: 4
May 28 18:26:20 cor720 nm-dispatcher.action: Script '/etc/NetworkManager/dispatcher.d/01ifupdown' exited with error status 1.
May 28 18:27:36 cor720 dhcpd: DHCPREQUEST for 172.19.1.2 from d0:0d:30:cf:06:f7 via eth0
May 28 18:27:36 cor720 dhcpd: DHCPACK on 172.19.1.2 to d0:0d:30:cf:06:f7 via eth0
May 28 18:28:26 cor720 vpnc[3537]: select: Interrupted system call
May 28 18:28:26 cor720 vpnc[3537]: terminated by signal: 15
May 28 18:28:26 cor720 avahi-daemon[1175]: Withdrawing address record for 172.19.1.1 on eth0.
May 28 18:28:26 cor720 avahi-daemon[1175]: Withdrawing address record for 192.168.3.100 on eth0.
May 28 18:28:27 cor720 NetworkManager: <info> Policy set 'Auto eth0' (eth0) as default for routing and DNS.
May 28 18:28:27 cor720 vmnetBridge: RTM_NEWROUTE: index:2

There was a likely corresponding loss of signal on the open connection to the instance. When I saw it, I tried to log on again. I also tried restarting eucalyptus on the cluster, as well as eucalyptus-cc:

ubu...@ip-172-19-1-2:~$ Write failed: Broken pipe
w...@cor720:~$ ssh -i .euca/mykey.priv ubu...@192.168.3.100
ssh: connect to host 192.168.3.100 port 22: Connection timed out
w...@cor720:~$ ssh -i .euca/mykey.priv ubu...@192.168.3.100
ssh: connect to host 192.168.3.100 port 22: Connection timed out
w...@cor720:~$

I also tried rebooting the instance. I looked at the tail end of the console log and found this:

Begin: Running /scripts/local-bottom ...
Done.
Done.
Begin: Running /scripts/init-bottom ...
Done.
cloud-init running: Sat, 29 May 2010 01:29:42 +0000. up 9.06 seconds
waiting for metadata service at http://169.254.169.254/2009-04-04/meta-data/instance-id
01:29:44 [ 1/100]: url error [timed out]
01:29:47 [ 2/100]: url error [timed out]
01:29:50 [ 3/100]: url error [timed out]
01:29:51 [ 4/100]: url error [[Errno 113] No route to host]
01:29:54 [ 5/100]: url error [timed out]
01:29:57 [ 6/100]: url error [timed out]
01:30:01 [ 7/100]: url error [timed out]
01:30:05 [ 8/100]: url error [timed out]
01:30:09 [ 9/100]: url error [timed out]
01:30:13 [10/100]: url error [timed out]
01:30:17 [11/100]: url error [timed out]
01:30:22 [12/100]: url error [timed out]
01:30:27 [13/100]: url error [timed out]
01:30:32 [14/100]: url error [timed out]
01:30:37 [15/100]: url error [timed out]
01:30:42 [16/100]: url error [timed out]
01:30:48 [17/100]: url error [timed out]
01:30:54 [18/100]: url error [timed out]
01:31:00 [19/100]: url error [[Errno 113] No route to host]
01:31:06 [20/100]: url error [timed out]
01:31:12 [21/100]: url error [timed out]
01:31:19 [22/100]: url error [timed out]

The system log on the CC shows:

May 28 21:23:47 cor720 dhcpd: DHCPREQUEST for 172.19.1.2 from d0:0d:30:cf:06:f7 via eth0
May 28 21:23:47 cor720 dhcpd: DHCPACK on 172.19.1.2 to d0:0d:30:cf:06:f7 via eth0
May 28 21:23:47 cor720 dhcpd: DHCPDISCOVER from d0:0d:30:cf:06:f7 via eth0
May 28 21:23:47 cor720 dhcpd: DHCPOFFER on 172.19.1.2 to d0:0d:30:cf:06:f7 via eth0
May 28 21:23:47 cor720 dhcpd: DHCPREQUEST for 172.19.1.2 (169.254.169.254) from d0:0d:30:cf:06:f7 via eth0
May 28 21:23:47 cor720 dhcpd: DHCPACK on 172.19.1.2 to d0:0d:30:cf:06:f7 via eth0
May 28 21:24:16 cor720 dhcpd: DHCPREQUEST for 172.19.1.2 from d0:0d:30:cf:06:f7 via eth0
May 28 21:24:16 cor720 dhcpd: DHCPACK on 172.19.1.2 to d0:0d:30:cf:06:f7 via eth0
May 28 21:27:47 cor720 init: uec-component-listener main process (3949) killed by TERM signal
May 28 21:28:47 cor720 init: uec-component-listener main process (22579) killed by TERM signal
May 28 21:29:45 cor720 dhcpd: DHCPREQUEST for 172.19.1.2 from d0:0d:30:cf:06:f7 via eth0
May 28 21:29:45 cor720 dhcpd: DHCPACK on 172.19.1.2 to d0:0d:30:cf:06:f7 via eth0

The instance, while continuing to show a running state, never shows a re-established IP address pair:

RESERVATION r-4481087B admin default
INSTANCE i-30CF06F7 emi-DEF41072 0.0.0.0 0.0.0.0 running mykey 0 m1.large 1970-01-01T00:00:00.65Z cluster1 eki-F52010F2 eri-0960114A

--
otherwise live instance goes deaf connection refused
https://bugs.launchpad.net/bugs/587340