On Tue, Dec 24, 2024 at 8:26 AM Mike Hammett <na...@ics-il.net> wrote:
> In the articles I've read and videos I've watched, they have mentioned
> varying amounts of reduced power. I didn't commit them to memory because
> that wasn't the part I was interested in at the moment.
>

I'd think that, especially as data rates climb, the power consumption is going to get really important, fast. When a single device requires ~50 kW to run ... I think you'll want to make sure you have the space/power to deal with that :(

I'm not sure that distributed fabric plans make that problem better? (Maybe it's all the same problem in the end, because the fabric interconnect is going to be distance limited/etc. too.)

> Management of the things is a big thing I've been concerned about going
> into more modern systems. So often there's hand waving regarding the
> orchestration piece of non-traditional systems. From what I've seen (and I
> would love to be wrong), you either build it in-house (not a small lift) or
> you buy something that ends up taking away all of the cost advantages that
> path had.
>

You almost certainly get into (pretty quickly) something that smells a bunch like: "here's my pile of ansible recipes for...." (ansible chosen here as an example only; s/ansible/<whatever>/ of course, to whatever you feel like.)

That's maybe fine if that's your jam? I think it's hard to reason/plan/build without some automation plan 'now', and it looks like a ton of folk start without one and then try to retrofit once "omg this is very large now... ugh" happens. (1-10 devices? sure, fine, do it by hand; 5-><bunches more> you really ought to have had an automation plan at ~5 ... my opinion, clearly.)

> Failure domain stuff is part of what I'm trying to learn more about, which
> goes back to more about the fundamentals of how the fabric works.
>

yea... This part (reasoning about failure domains) I assume is also a tad hard.
A scenario is: "I built this 200T fabric; I interconnect to the outside with ~100T max and internally with ~100T." Now that ~100T breaks and (ideally!) everything on the outside re-routes around to a different front door... oops, are you prepared for an extra ~100T arriving?

How do you deal with fabric parts failing in part? "Oops, only 50T of my 100T can get through here and ... I am also still telling my external neighbors all's good."

Really, that failure-domain problem is tightly linked to the 'manage a ton of things' problem too, at least for containing damage quickly.
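
To put rough numbers on that scenario, here is a minimal Python sketch of the two questions: how much re-routed traffic the surviving front doors cannot absorb, and how much load no longer fits through a partly failed fabric. The names (FrontDoor, overflow_after_failure, excess_load) and every capacity/load figure are made up for illustration, and it optimistically assumes re-routed traffic spreads freely across the survivors, which real routing will not guarantee.

```python
from dataclasses import dataclass


@dataclass
class FrontDoor:
    """One external interconnect point (hypothetical model)."""
    name: str
    capacity_t: float  # usable capacity in Tbps (illustrative figure)
    load_t: float      # traffic currently arriving in Tbps (illustrative figure)

    @property
    def headroom_t(self) -> float:
        # Spare capacity, never negative.
        return max(self.capacity_t - self.load_t, 0.0)


def overflow_after_failure(doors, failed_name):
    """Tbps that cannot be absorbed if `failed_name` fails completely.

    Assumes all of the failed door's traffic re-routes to the survivors
    and can be spread across them without restriction (optimistic).
    """
    failed = next(d for d in doors if d.name == failed_name)
    spare = sum(d.headroom_t for d in doors if d.name != failed_name)
    return max(failed.load_t - spare, 0.0)


def excess_load(working_t, load_t):
    """Partial failure: Tbps that no longer fits through what still works.

    Anything above zero means neighbors are being told "all's good"
    while traffic is in fact being dropped.
    """
    return max(load_t - working_t, 0.0)


if __name__ == "__main__":
    doors = [FrontDoor("A", 100.0, 80.0), FrontDoor("B", 100.0, 60.0)]
    # A fails: 80T needs a new home, B has 40T spare.
    print(overflow_after_failure(doors, "A"))  # 40.0
    # Only 50T of a 100T path works while 80T is still arriving.
    print(excess_load(50.0, 80.0))             # 30.0
```

Running a check like this continuously against real telemetry, and acting on it (draining traffic, withdrawing announcements), is the point where the failure-domain problem meets the automation problem.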