On 9/25/18 12:32 PM, Neal Becker wrote:
> I'm using f28 cloud on AWS as a compute farm.  It seems that instances 
> randomly shutdown within hours of starting.  An example log:
> 
> ...
> Fedora 28 (Cloud Edition)
> Kernel 4.16.3-301.fc28.x86_64 on an x86_64 (ttyS0)
> 
>          Stopping Restore /run/initramfs on shutdown...
> [  OK  ] Removed slice system-sshd\x2dkeygen.slice.
>          Stopping User Manager for UID 1000...
> ...
> 
> In this case after about 4 hours it seems to have spontaneously shutdown.  
> This happens with high probability - maybe 2/10 instances I start 
> spontaneously shutdown.
> 
> Any ideas what's going on?  I'm just wondering if this is something specific 
> to fedora cloud edition, because it doesn't seem to be a common complaint on 
> AWS (most of which is ubuntu).

Are you getting emails from AWS that they're shutting down your
instance? AWS does some testing and, should your instance fail their
tests, they will shut it down "to protect others sharing the hardware".
If this is what's happening, you may get an email about it (we get
one perhaps 20% of the time). If not, check the AWS admin portal
under "Events" right after a restart. There should be a record about it.
That record goes away after a while (not sure how long it hangs around).
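
If you'd rather not click through the portal on every restart, the same
event records can be pulled through the API. What follows is only a rough
sketch, assuming you have boto3 installed and credentials configured. The
helper names (summarize_events, fetch_events) and the default region are my
own placeholders, not anything from AWS. describe_instance_status is the EC2
call that reports scheduled events for an instance.

```python
def summarize_events(response):
    """Flatten a describe_instance_status response into simple tuples.

    Returns a list of (instance_id, code, description, not_before).
    """
    rows = []
    for status in response.get("InstanceStatuses", []):
        # "Events" is absent when nothing is scheduled for the instance.
        for event in status.get("Events", []):
            rows.append((
                status["InstanceId"],
                event.get("Code"),
                event.get("Description"),
                event.get("NotBefore"),
            ))
    return rows


def fetch_events(instance_ids, region="us-east-1"):
    """Query EC2 for events on the given instances (region is a placeholder)."""
    import boto3  # imported here so summarize_events works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.describe_instance_status(
        InstanceIds=instance_ids,
        IncludeAllInstances=True,  # also report instances that are not running
    )
    return summarize_events(response)
```

Running fetch_events() from cron on some other box and saving the output
would keep a copy of the record after it disappears from the portal.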

In my experience, AWS is rather vague as to just _what_ tests they use
to determine if your instance is dangerous so it can be difficult to fix
your code. We've got some AWS stuff that's been up for well over a year,
but others they shut down because they fail these mysterious tests.

If you're using instance store disks, their contents are lost when the
instance stops, so your logs probably don't show why the system
shut down the last time. The only way to hang onto that stuff is to use
persistent (EBS) storage for your machine--at least for the logs (I'd
recommend st1-type storage for logs). Persistent storage at AWS can get
expensive depending on how big it is, but it may be necessary to sort
this out. Once figured out, you can get rid of the EBS storage to
minimize costs.
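
Even with the on-instance logs gone, EC2 keeps a short reason for the last
state change of each instance. It can help tell an OS-initiated shutdown
apart from one started by you or by AWS. Here is a minimal sketch, again
assuming boto3 and working credentials. The function names and the default
region are placeholders of mine. The StateReason and StateTransitionReason
fields come back from describe_instances.

```python
def shutdown_reasons(response):
    """Map instance id -> state and recorded reason from describe_instances."""
    reasons = {}
    for reservation in response.get("Reservations", []):
        for inst in reservation.get("Instances", []):
            # StateReason is only present once a state change has a reason.
            reason = inst.get("StateReason", {})
            reasons[inst["InstanceId"]] = {
                "state": inst.get("State", {}).get("Name"),
                "code": reason.get("Code"),
                "message": reason.get("Message"),
                "transition": inst.get("StateTransitionReason"),
            }
    return reasons


def fetch_shutdown_reasons(instance_ids, region="us-east-1"):
    """Ask EC2 why the given instances changed state (region is a placeholder)."""
    import boto3  # imported here so shutdown_reasons works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.describe_instances(InstanceIds=instance_ids)
    return shutdown_reasons(response)
```

Terminated instances drop out of the listing after a while, so this needs
to be run fairly soon after the shutdown.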

This may be a Fedora Cloud issue. It may be something you're doing in an
application. It may be AWS protecting itself. Hard to tell.
----------------------------------------------------------------------
- Rick Stevens, Systems Engineer, AllDigital    ri...@alldigital.com -
- AIM/Skype: therps2        ICQ: 226437340           Yahoo: origrps2 -
-                                                                    -
-      The moving cursor writes, and having written, blinks on.      -
----------------------------------------------------------------------
_______________________________________________
users mailing list -- users@lists.fedoraproject.org
To unsubscribe send an email to users-le...@lists.fedoraproject.org
Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: 
https://lists.fedoraproject.org/archives/list/users@lists.fedoraproject.org
