Thanks to all who replied on and off list! tl;dr: dhclient died and the instances gave up their IPs.
Turns out this one was inadvertently my fault. I got bitten by a bug in an old version of NetworkManager. Something triggered a package update on some of my instances, which led to this bug showing up. The bug appears in versions of NetworkManager prior to NetworkManager-1.0.0-14.git2015012

https://bugzilla.redhat.com/show_bug.cgi?id=1285974
https://bugzilla.redhat.com/show_bug.cgi?id=1136836
https://rhn.redhat.com/errata/RHBA-2015-0311.html

Thanks!
Grant

On Fri, Jan 15, 2016 at 2:02 PM, Grant Ridder <shortdudey...@gmail.com> wrote:

> Gotcha, thanks for the info.
> I am at 128 instances and counting in the last 8 hrs
>
> -Grant
>
> On Fri, Jan 15, 2016 at 1:58 PM, Neil Robst <neil.ro...@piksel.com> wrote:
>
>> Hi Grant,
>> We saw the first confirmed issue last week. So far only experienced 2
>> confirmed - that last week and one this morning, but its possible there
>> have been others.
>>
>> Neil
>>
>> From: Grant Ridder <shortdudey...@gmail.com>
>> Date: Friday, January 15, 2016 at 1:54 PM
>> To: Neil Robst <neil.ro...@piksel.com>
>> Cc: "do...@telecurve.com" <do...@telecurve.com>, NANOG
>> <nanog-boun...@nanog.org>, "nanog@nanog.org" <nanog@nanog.org>
>> Subject: Re: network issue on ec2 classic us-east-1??
>>
>> Neil / Dovid,
>> How long ago did your issues start? Symptoms are the same, but the issue
>> for me started early this morning at an alarming rate.
>>
>> -Grant
>>
>> On Fri, Jan 15, 2016 at 1:45 PM, Neil Robst
>> <neil.ro...@piksel.com> wrote:
>>
>> Hi David and Grant,
>>
>> We have been experiencing exactly the same issue also now whereby our
>> instances randomly stop getting their DHCP reservation and then drop
>> offline. A simple reboot in the AWS console usually sorts it but as yet we
>> do not know the root cause.
>>
>> Regards,
>> Neil
>>
>> On 1/15/16, 1:31 PM, "NANOG on behalf of Dovid Bender"
>> <nanog-boun...@nanog.org on behalf of
>> do...@telecurve.com> wrote:
>>
>> >Grant,
>> >
>> >We have been having issues for a few weeks now with instances that
>> >randomly stop getting their IP from DHCP. Did you see any dhcp errors?
>> >
>> >Regards,
>> >
>> >Dovid
>> >
>> >-----Original Message-----
>> >From: Grant Ridder <shortdudey...@gmail.com>
>> >Sender: "NANOG" <nanog-boun...@nanog.org>
>> >Date: Fri, 15 Jan 2016 12:58:58
>> >To: nanog@nanog.org<nanog@nanog.org>
>> >Subject: network issue on ec2 classic us-east-1??
>> >
>> >Hi,
>> >
>> >Over the last 6 hrs i have had over 100 instances in us-east-1 in EC2
>> >Classic fail their instance health checks and a reboot via the console
>> >solves them. Logs on the host point to a loss of all network
>> >connectivity. Anyone else experiencing something like this?
>> >
>> >Reached out to AWS support and haven't gotten anywhere with that yet.
>> >
>> >-Grant