On Thursday, 25 August 2016 13:21:24 UTC+1, Marc Haber wrote:
>
> On Wed, Aug 24, 2016 at 08:36:49AM -0700, Luke Bigum wrote:
> > Here we have very strict control over our hardware and what interface
> > goes where. We keep CentOS 6's naming scheme on Dell hardware, so p2p1
> > is PCI slot 2, Port 1, and don't try to rename it.
>
> Isn't CentOS 6 still using eth0, 1, 2, 3? How do you handle different
> hardware having different slot numbers, or PCI bridges shifting bus
> numbers?
>
I find this depends on the manufacturer. I've never come across a Dell
server newer than an R510 that *doesn't* give you PCI-based names - I just
checked an R510 and it does. All of our ancient HP gear (7 years old, older
than the R510s, which are themselves old) gives the ethX names, as does
random SuperMicro hardware. I don't really know what's missing for the
kernel / udev to name them that way, but for us it doesn't really matter.

> > We have a 3rd party patch manager tool (patchmanager.com), LLDP on
> > our switches, and a Nagios check that tells me if an interface is not
> > plugged into the switch port it is supposed to be plugged into
> > (according to patchmanager).
>
> Nice ;-) Is the code for the Nagios stuff public?

Unfortunately no :-( It's another one of those LMAX modules that's had
years of development but has too much company-specific stuff hard-coded in
it to release. It's not a huge amount though, and I did just ask my Lead if
I could clean up our networking module and release it, and he was more than
happy, so I'm sure I could do the same for our Nagios module. Watch this
space, but don't hold your breath.

> > This works perfectly on Dell hardware because the PCI name mapping
> > works.
>
> And you don't have many different kinds of servers.

We try to keep as few as possible, but it's not that small a list:

*******************
[root@puppet ~]# mco facts productname
Report for fact: productname
                                 found 1 times
    KVM                          found 603 times
    OptiPlex 7010                found 1 times
    OptiPlex 7020                found 2 times
    PowerEdge FC430              found 15 times
    PowerEdge FC630              found 56 times
    PowerEdge R220               found 1 times
    PowerEdge R320               found 92 times
    PowerEdge R330               found 1 times
    PowerEdge R510               found 17 times
    PowerEdge R520               found 66 times
    PowerEdge R720               found 36 times
    PowerEdge R720xd             found 30 times
    PowerEdge R730               found 7 times
    PowerEdge R730xd             found 37 times
    Precision Tower 5810         found 10 times
    Precision WorkStation T5500  found 7 times
    ProLiant DL360 G6            found 2 times
    ProLiant DL380 G5            found 16 times
    ProLiant DL380 G6            found 11 times
    To Be Filled By O.E.M.       found 1 times
    X9SCL/X9SCM                  found 6 times
*************************

> > On really old HP gear it doesn't work,
>
> What does that mean?

I meant that on our very old HP servers the PCI device name mapping
doesn't come up, so you end up with eth0, eth1, etc.

> > We still need some sort of "glue record" that says "this interface
> > should be up and have this IP". In our older designs this was managed
> > entirely in Hiera - so there's a giant multi-level hash that we run
> > create_resources() over to define every single network interface. You
> > can imagine the amount of Hiera data we have.
>
> That's what we're trying to avoid. Can you share example snippets?

Here is a snippet of the older style, in a node's Hiera. It is what I'm
trying to move away from, because if you want to create 20 of these
machines you've got to copy this Hiera hash around 20 times over. Oh, the
number of typos...
You can probably infer the defined types that this data has
create_resources() run over; the key names are pretty Red Hat specific:

*******************************
networking::interfaces:
  bond1:
    bonding_opts: mode=802.3ad xmit_hash_policy=layer3+4 lacp_rate=slow miimon=100
    enable: true
    onboot: 'yes'
    type: Bonding
  bond1.3:
    broadcast: 1.1.3.255
    enable: true
    ipaddr: 1.1.3.7
    netmask: 255.255.255.0
    network: 1.1.3.0
    onboot: 'yes'
    vlan: 'yes'
  p4p1:
    enable: true
    master: bond1
    onboot: 'yes'
    slave: 'yes'
    type: Ethernet
  p4p2:
    enable: true
    master: bond1
    onboot: 'yes'
    slave: 'yes'
    type: Ethernet
networking::routes:
  bond1:
    device: bond1
    routes:
      - 1.1.2.0/24 via 1.1.3.1
*******************************

> > In the newer designs, which are a lot more of a role/profile approach,
> > I've been trying to conceptualise the networking based on our
> > profiles. So if one of our servers is fulfilling function "database"
> > there will be a Class[profile::database]. This Class might create a
> > bonded interface for the "STORAGE" network and another interface for
> > the "CLIENT" network.
>
> That is interesting and a nice concept. But nothing one introduces
> just to remedy an error report named "help, my interface names do not
> fit any more".

Probably not - it's a lot of work if burying an error message is your only
aim. What I get from the abstraction above is being able to take our
profiles and re-use them in a completely different site on the other side
of the world, or in a staging / testing environment. So I don't have the
concept of "VLAN 123 in Production UK", I've just got "the STORAGE
network", which in Production UK happens to be VLAN 123 (buried low down
in Hiera, and only specified once), but in Dev it's 456, and over there it
doesn't exist so we'll give it the same VLAN tag as the CLIENT network,
etc... The physical-ness of the network is abstracted away from the
concepts our software relies on.
> So you do create network interfaces in the profile and not in the
> role?

We try to follow the design rule that "Roles only include Profiles". Since
our software stack is heavily dependent on the networking architecture,
our profiles for our software are not designed to exist on the same
server. I would never have profile::database and profile::frontend on the
same server, as they might both try to create a STORAGE network and fail
catalog compilation. As such our roles generally contain only one
"business level" profile, and look something like:

*******************
class role::database {
  include profile::mandatory      # Everything mandatory on EL6
  include profile::authentication # Authentication is not mandatory
  include profile::database       # The profile that does most of the work for our software
}
*******************

This is more a function of our software being heavily tied to the
networking layout, though. I do have other roles and profiles for internal
(office) systems that can move around a lot more easily, as they are just
your standard "run a webapp and DB" thing.

> > Through various levels of Hiera I can define the STORAGE network as
> > VLAN 100, because it might be a different vlan tag at a different
> > location. Then at the Hiera node level (on each individual server) I
> > will have something like:
> >
> > profile::database::bond_storage_slaves: [ 'p2p1', 'p2p2' ]
> >
> > That's the glue. At some point I need to tell Puppet that on this
> > specific server, the storage network is a bond of p2p1 and p2p2.
>
> So you need to know when writing this code what kind of hardware the
> system is running on, probably down to firmware version and hardware
> extras?

No, the exact opposite, ideally. You need to know *conceptually* what the
requirements of our software are.
So, sticking with the same fictitious "database" example: you must have a
STORAGE network and you must have a CLIENT network, otherwise the App
simply won't run (we're a little bit more complicated than a
LAMP-stack-in-AWS company). When we've got this coded correctly it should
be hardware independent, but there is this "mandatory data" that we need
to supply to get it to build (which interfaces are for which network).
These are the "glue records" I keep talking about (to borrow a term from
DNS). Ideally this would be zero. We *could* programmatically determine
it, but the arguments against are part "effort vs gain" and part "what do
you want your source of truth to be". Maybe the best way for us to do it
auto-magically would be to query Patch Manager to determine what network
interfaces should be present and what logical networks they attach to. We
already run Nagios checks against it... That's not trivial though, and it
also means my Puppet builds would rely on an externally hosted SaaS (not
going to fly).

> > I have bounced around the idea of removing this step and trusting the
> > switch - ie: write a fact to do an LLDP query for the VLAN of the
> > switch port each interface is connected to. That way you wouldn't
> > need the glue; there'd be a fact called vlan_100_interfaces.
>
> So the fact would be coming from the _server_ based on what lldpcli
> show neighbors detail returns, which is supposed to include the VLAN
> information? Would this work on 802.1Q trunks as well?

That was the idea, yes. I don't know about 802.1Q; it depends on what the
switch OS does for such interfaces.

> > Two problems with this approach: we end up trusting the switch to be
> > our source of truth (it may not be correct,
>
> The switch uses its own source of truth which also influences which
> network traffic gets sent down the link, so trusting the switch will
> at least fail to the safe side and avoid accidentally putting full
> trust on an Internet link.
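For what it's worth, the vlan_100_interfaces fact idea above could be
sketched roughly like this in Ruby. This is purely illustrative: vlan_map
is a hypothetical helper, and the lldpcli text format varies between
versions, so the line shapes parsed below are assumptions rather than a
tested parser:

```ruby
# Hypothetical sketch of an LLDP-derived VLAN fact. Assumed input shapes
# ("lldpcli show neighbors details" style output - varies by version):
#   Interface:    em1, via: LLDP, RID: 1, ...
#     VLAN:         123, pvid: yes
def vlan_map(lldp_output)
  map = {}
  current = nil
  lldp_output.each_line do |line|
    if line =~ /^Interface:\s+(\S+?),/
      current = Regexp.last_match(1)           # start of a new neighbour block
    elsif current && line =~ /^\s*VLAN:\s+(\d+)/
      map[current] = Regexp.last_match(1).to_i # first VLAN TLV wins
    end
  end
  map
end

# In a real custom fact this would shell out, something like:
#
# Facter.add('vlan_100_interfaces') do
#   setcode do
#     out = Facter::Core::Execution.execute('lldpcli show neighbors details')
#     vlan_map(out).select { |_iface, vlan| vlan == 100 }.keys.sort.join(',')
#   end
# end
```

You'd still hit the trust and data-quality problems discussed in this
thread, of course - this only shows the mechanics.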
Yeah, if that suits your use case, you could do that. For me though, I'd
much prefer a Puppet manifest to fail to compile because someone hasn't
supplied the correct data. It forces an engineer to think about what they
are building, and where it's attached.

> > and, what if the switch port is down?).
>
> One would have to fall back to a certain safety net then.
>
> > Secondly, the quality and consistency of the LLDP information you get
> > out of various manufacturers of networking hardware is very
> > different, so relying on LLDP information to define your OS network
> > config is a bit risky for me.
>
> Is it really this bad? I do have experience with HP and Cisco, and
> their LLDP/CDP information is usually fine.

In my opinion it is, yes. One of our Network Engineers changed a Dell FX2
chassis internal I/O switch between one mode and the other to get MLAG
working (these are Dell Force10 switches internally) and the structure of
the LLDP information changed - and this was simple shit too: the switch
description just "disappeared" :-(

Here's one part of our client-side Nagios monitoring, a script that
converts the LLDP information into parse-able CSV. Our Nagios servers
query this data via SNMP and compare it to Patch Manager, thereby telling
us if something is plugged into the wrong port. Sanitised to match the
"database" example, it looks like this:

[root@server ~]$ sudo ./interfaces.py
p3p2,yes,clientswitch01.example.com,16,456,Arista DCS-7124SX
em1,yes,storageswitch01.example.com,8/1/20,123,Brocade ICX6450-48
em2,yes,storageswitch01.example.com,8/1/20,123,Brocade ICX6450-48
p4p1,yes,clientswitch02.example.com,16,456,Arista DCS-7124SX

And here's the Python that generates that output.
Note the number of if statements in the function
parse_switch_type_from_data(), and how I have to fall back on MAC address
checks because some models simply don't want to report that they are a
"Brocade", etc:

https://gist.github.com/lukebigum/efb5b789bfeaf962ef15128092015d08

I haven't read the LLDP standard, but from personal experience I assume it
reads something like "Here is a list of optional fields, put whatever you
want in them".

> > It's a different story for our VMs. Since they are Puppet defined we
> > specify a MAC address and so we "know" which MAC will be attached to
> > which VM bridge. We drop a MAC based udev rule into the guest to name
> > them similarly, ie: eth100 is on br100.
>
> How do you puppet define your MAC addresses? Which virtualization do
> you use? Can i see a code snippet?

KVM. MAC addresses are statically defined by a deterministic formula - if
the IP is 1.2.3.4, the MAC address is 52:54:00:02:03:04 - the last three
IP octets become the last three bytes of the MAC, in hex. This means no
MAC address clashes :-) Unfortunately I haven't got everything magically
defined once, so we must define a VM in Hiera (not the best place for it,
but it's what we've got):

*****************
libvirt::vms:
  database:
    cpus: '4'
    ensure: running
    interfaces:
      - bridge:br123,54:52:00:01:02:03
      - bridge:br456,54:52:00:
    memory: '4096'
    on_crash: restart
    on_poweroff: destroy
    on_reboot: restart
    virt_disk: path=/var/lib/libvirt/images/ld4deploy01/ld4deploy01.img,size=16,bus=virtio,sparse=false
    virt_type: kvm
*****************

And then we must duplicate the MAC address in the Hiera of the VM itself
when creating the networking inside the VM. This is crap, as it's the same
MAC address in multiple places, but it's tricky to fix. I might be able to
solve it with exported resources... but I'd probably get VM definitions
out of Hiera first before I fixed this.
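The IP-to-MAC formula is simple enough to write down. Here's a sketch in
Ruby (mac_for_ip is a hypothetical helper name; 52:54:00 is the locally
administered prefix conventionally used for QEMU/KVM guests):

```ruby
# Derive a deterministic guest MAC from an IPv4 address: keep the 52:54:00
# QEMU/KVM prefix and encode the last three IP octets as the last three
# MAC bytes, in hex. Two IPs differing anywhere in their last three octets
# can never produce the same MAC.
def mac_for_ip(ip)
  octets = ip.split('.').map { |o| Integer(o, 10) }
  unless octets.size == 4 && octets.all? { |o| (0..255).cover?(o) }
    raise ArgumentError, "not an IPv4 address: #{ip}"
  end
  format('52:54:00:%02x:%02x:%02x', octets[1], octets[2], octets[3])
end
```

Note that above 9 the hex and decimal forms diverge - 1.2.3.40 maps to
52:54:00:02:03:28 - still unique, just no longer reading the same as the
IP at a glance.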
> > That's what we do, but it's made easy by an almost homogeneous
> > hardware platform and strict physical patch management.
>
> Yes. The homogeneous hardware platform is probably something that can
> only be maintained for really large installations.
>
> > When I read about your problem, it sounds like you are missing a
> > "glue record" that describes your logical interfaces to your physical
> > devices.
>
> We're desperately trying to avoid having this in Hiera.

I can understand that, and it's good you've got that mindset. I'd like to
get to the same place eventually. For me, going from hundreds of lines of
Hiera for a node to fewer than 20 is good enough so far.

> > If you were to follow something along the lines of our approach, you
> > might have something like this:
> >
> > class profile::some_firewall (
> >   $external_interface_name  = 'eth0',
> >   $internal_interface_name  = 'eth1',
> >   $perimeter_interface_name = 'eth2',
> > ) {
> >   firewall { '001_allow_internal':
> >     chain   => 'INPUT',
> >     iniface => $internal_interface_name,
> >     action  => 'accept',
> >     proto   => 'all',
> >   }
> >
> >   firewall { '002_some_external_rule':
> >     chain   => 'INPUT',
> >     iniface => $external_interface_name,
> >     action  => 'accept',
> >     proto   => 'tcp',
> >     dport   => '443',
> >   }
> > }
> >
> > That very simple firewall profile probably already works on your HP
> > hardware, and on your Dell hardware you'd need to override the 3
> > parameters in Hiera:
> >
> > profile::some_firewall::internal_interface_name: 'em1'
> > profile::some_firewall::external_interface_name: 'p3p1'
> > profile::some_firewall::perimeter_interface_name: 'p1p1'
>
> On the Dell R680, yes. A hypothetical "R680s" would need some other
> definition, and a VMware VM comes up with eno<number> interfaces with
> eight-digit <number>, basically random.

Thanks for your input, I appreciate it.

> Greetings
> Marc
>
> --
> -----------------------------------------------------------------------------
> Marc Haber         |  "I don't trust Computers. They  | Mailadresse im Header
> Leimen, Germany    |   lose things."     Winona Ryder | Fon: *49 6224 1600402
> Nordisch by Nature |  How to make an American Quilt   | Fax: *49 6224 1600421

--
You received this message because you are subscribed to the Google Groups
"Puppet Users" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to puppet-users+unsubscr...@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/puppet-users/f041efb7-8dcb-4011-a6ba-67a7b7597020%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.