On 29/02/2024 17:21, Jared Mauch wrote:
On behalf of cisco-nsp and outages - we salute you.
-Hank
On Feb 28, 2024, at 1:30 AM, Daniel Marks via NANOG <nanog@nanog.org> wrote:
We’re getting rocked by storms here in Michigan, could be related.
[Brief version of what happened, from what I can tell reconstructing things]
I was alerted ~4am US/E yesterday about the issue. This machine has been
generously hosted by my previous employer for quite some time; funnily enough,
it is 7 years almost to the day since I started my current employment.
The IPMI was not responsive and the machine was located in 350 Cermak, on a
floor that was not impacted by the heat/cold event.
Off and on, I have been meaning to move things, but never quite had the
motivation to tackle the task. Yesterday forced my hand.
Once I confirmed that we could get the machine out of the colocation facility
(thank you again NTT), I drove from Michigan to Chicago, got lunch, picked up
the machine, and headed back to the colocation that I have in Michigan at the
123Net/DetroitIX site.
Once I had a console on it, I determined that this old machine had a few things
that had been gradually updated and upgraded over time, and not all the
filesystem options were set correctly. After I set some tune2fs options and
updated fstab to ensure everything was fully migrated from ext2 -> ext4, the
system booted without issues.
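
For anyone carrying a similarly long-lived install, a minimal sketch of one
sanity check: scan an fstab for mounts still declared as ext2 or ext3. The
function name and the sample fstab contents are hypothetical, and this only
reports what fstab says; it does not change any on-disk filesystem features
and is not the exact procedure used on this machine.

```python
def legacy_ext_entries(fstab_text, legacy=("ext2", "ext3")):
    """Return (device, mountpoint, fstype) for fstab lines using a legacy ext type."""
    entries = []
    for raw in fstab_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # fstab fields: device, mountpoint, fstype, options, dump, pass
        if len(fields) >= 3 and fields[2] in legacy:
            entries.append((fields[0], fields[1], fields[2]))
    return entries


# Hypothetical fstab contents, for illustration only.
SAMPLE_FSTAB = """\
# <device>  <mountpoint>  <type>  <options>  <dump>  <pass>
/dev/sda1   /             ext4    defaults   0       1
/dev/sda2   /var          ext2    defaults   0       2
/dev/sda3   none          swap    sw         0       0
"""

for device, mountpoint, fstype in legacy_ext_entries(SAMPLE_FSTAB):
    print(f"{device} on {mountpoint} is still declared as {fstype}")
```

On a real system you would feed it the contents of /etc/fstab instead of the
sample string.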
Afterwards I determined that there is still a hardware-related problem, so I
am now going to move it to new hardware later today, schedule permitting, as I
want to go onsite and make sure that the I/O is performant.
Feb 28 22:09:05 kernel: Memory: 32816872K/33544380K available (20480K kernel
code, 3276K rwdata, 14748K rodata, 4588K init, 4892K bss, 727248K reserved, 0K
cma-reserved)
Feb 29 00:20:07 kernel: Memory: 16326408K/16767164K available (20480K kernel
code, 3276K rwdata, 14748K rodata, 4588K init, 4892K bss, 440496K reserved, 0K
cma-reserved)
Not a great thing when nobody is onsite, the machine needs to be power cycled,
and the amount of memory changes between boots.
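
A minimal sketch of how one might catch this automatically: parse the kernel
"Memory:" boot line and compare the total between two boots. The function
names and the 5% tolerance are assumptions for illustration; the two log
lines are the ones quoted above, unwrapped.

```python
import re

# Matches e.g. "Memory: 32816872K/33544380K available"
MEMORY_RE = re.compile(r"Memory:\s+(\d+)K/(\d+)K available")


def parse_memory_line(line):
    """Return (available_kib, total_kib) from a kernel 'Memory:' line, or None."""
    match = MEMORY_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def total_changed(before, after, tolerance=0.05):
    """True if total memory differs by more than `tolerance` (a fraction) between boots."""
    total_before = before[1]
    total_after = after[1]
    return abs(total_after - total_before) > tolerance * total_before


BOOT_1 = (
    "Feb 28 22:09:05 kernel: Memory: 32816872K/33544380K available (20480K kernel "
    "code, 3276K rwdata, 14748K rodata, 4588K init, 4892K bss, 727248K reserved, 0K "
    "cma-reserved)"
)
BOOT_2 = (
    "Feb 29 00:20:07 kernel: Memory: 16326408K/16767164K available (20480K kernel "
    "code, 3276K rwdata, 14748K rodata, 4588K init, 4892K bss, 440496K reserved, 0K "
    "cma-reserved)"
)

first = parse_memory_line(BOOT_1)
second = parse_memory_line(BOOT_2)
if total_changed(first, second):
    missing_kib = first[1] - second[1]
    print(f"total memory changed by {missing_kib}K between boots")
```

For these two lines the totals differ by 16777216K, which is exactly 16 GiB.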
If you are seeing any other issues, do let me know. I did move the IPv4 space
but have renumbered for IPv6, so if you use my free secondary DNS service with
your own vanity name, you will need to update your AAAA records.
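
A minimal sketch for spotting records that still point into an old IPv6
range, using Python's standard ipaddress module. The prefix, host names, and
addresses below are hypothetical placeholders from the documentation range
(2001:db8::/32), not the actual old or new prefixes.

```python
import ipaddress


def stale_aaaa(records, old_prefix):
    """Return the (name, address) pairs whose IPv6 address is inside old_prefix."""
    old_net = ipaddress.ip_network(old_prefix)
    stale = []
    for name, addr in records:
        ip = ipaddress.ip_address(addr)
        # Only IPv6 addresses can be AAAA records; ignore anything else.
        if ip.version == 6 and ip in old_net:
            stale.append((name, addr))
    return stale


# Hypothetical old prefix and vanity-name records, for illustration only.
OLD_PREFIX = "2001:db8:1::/48"
RECORDS = [
    ("ns2.example.net", "2001:db8:1::53"),  # still in the old range
    ("ns2.example.org", "2001:db8:2::53"),  # already renumbered
]

for name, addr in stale_aaaa(RECORDS, OLD_PREFIX):
    print(f"{name} AAAA {addr} still points into {OLD_PREFIX}")
```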
If you are seeing any reachability issues, let me know; there should be ROAs
and other objects in place for things.
Sorry everyone got this email; I feel bad, it’s like when Warren asked the list
for some personal details :-)
- Jared
(Even more details: changing disk images from qcow -> qcow2, and other things
like ext2 -> ext3/4, over all the years as the machine has gone from Linux ->
FreeBSD -> Linux again, is always a fun way to keep bringing your legacy around
with you. It’s good overall.)