OK, so I said I'd get some notes together on the environment within which IoT crypto has to function; here's what the peanut gallery came up with. A lot of this isn't my own work, and I don't claim it to be: it's a collaboration created by people who, for various business/legal reasons, can't attach their names to public comments. Note that every one of the for-instances given below is an actual real-life example, not something someone invented to make a good story.
Peter.

IoT Crypto, What you Need to Know
---------------------------------

  The problem we have is not how to get stronger crypto in place, it's
  how to get more crypto in place.
    -- Ian Grigg, 28 August 2016.

  ... and to raise the level of security of the rest of the system so
  that attackers are actually forced to target the crypto rather than
  just strolling around it.
    -- Peter Gutmann, in corollary.

The device may be operating under severe power constraints. There are IoT devices that need to run for several years on a single battery pack. If you're lucky, it's a bundle of 18650s. If you're less lucky, it's a CR2032. Renegotiating protocol state on every wake event is incompatible with low power consumption, so the crypto should be able to resume from a pause an arbitrary amount of time later (see the sketch below). Even if the device is constantly powered, many components will be powered only when needed, and will lose state when powered off (they work by warm-starting very quickly rather than saving state across restarts).

Some IoT chips can't cost more than a few cents each. If you're lucky, they're allowed to cost tens of cents. Any fancy crypto hardware will break the budget. When crypto hardware support is available, it's universally AES, occasionally SHA-1 and/or DES, and very rarely RSA and/or DH and/or ECDSA (there are also oddballs like ones that do SHA-1 but not AES, but they're pretty special cases, and AES in software is very efficient in any case).

Any crypto had therefore better be based mostly, or exclusively, around AES. As a convenient side-effect of this, you won't have to worry about which flavour of PKC will be in fashion in ten years' time, or what keysize they're wearing in Paris that year.
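To make the pause-and-resume and AES-only points above concrete, here's a minimal sketch in which the entire cipher state is a small struct that can be parked in whatever nonvolatile storage exists and picked up again an arbitrary amount of time later. Everything in it (the aes_encrypt_block() primitive, the struct layout, the names) is invented for the example rather than being any particular product's API:

  #include <stdint.h>
  #include <string.h>

  /* Assumed vendor/hardware primitive: encrypt one AES-128 block, with
     distinct input and output buffers */
  extern void aes_encrypt_block( const uint8_t key[ 16 ],
                                 const uint8_t in[ 16 ],
                                 uint8_t out[ 16 ] );

  /* Everything needed to resume an AES-CBC encryption session: the key
     (in practice more likely an index into key storage) and the current
     chaining block, i.e. the IV or most recent ciphertext block.  This
     struct is what gets written to nonvolatile storage before a
     power-down */
  typedef struct {
      uint8_t key[ 16 ];
      uint8_t chainBlock[ 16 ];
      } CBC_STATE;

  /* Encrypt in place, updating the persistent chaining state.  For
     simplicity the length is assumed to be a multiple of the block
     size, with any padding handled elsewhere */
  void cbc_encrypt( CBC_STATE *state, uint8_t *data, size_t length )
      {
      uint8_t temp[ 16 ];
      size_t i;

      while( length >= 16 )
          {
          for( i = 0; i < 16; i++ )
              temp[ i ] = data[ i ] ^ state->chainBlock[ i ];
          aes_encrypt_block( state->key, temp, data );
          memcpy( state->chainBlock, data, 16 );
          data += 16;
          length -= 16;
          }
      }

Since resuming depends on the saved chainBlock, the state has to be committed to storage before sleeping rather than reconstructed after waking. CBC is used here rather than a counter mode for the reason given at the end of this post: if the chaining state is ever lost or mangled, CBC merely degrades towards ECB instead of failing outright.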
Even if the device includes crypto hardware, the HAL or vendor-supplied firmware may not make it available. In addition, the crypto engine will be in a separate IP core that's effectively an external peripheral, and accessing it is so painful that it's quicker to do it in software (you never get AES instructions, you get something that you talk to via PIO). As a result you have to run your crypto in software while the crypto hardware sits idle.

Devices have a design lifetime of ten to twenty years, possibly more. There is hardware deployed today that was designed when the people now maintaining it were in kindergarten.

Firmware is never updated, and frequently *can* never be updated. This is typically because it's not writeable, or because there's no room: the code already occupies 120% of available storage, brought down to 100% by replacing the certificate-handling code with a memcpy() for encoding and a seek + read of { n, e } for decoding (see below), leaving 0% available for firmware updates. Alternatively, there's no connectivity to anything that could provide updates, whether of firmware or anything else (for example in one globally-deployed system the CRL arrives once every 6-12 months via sneakernet, although I'm not sure why they use CRLs, since they can just disable the certificate or device centrally). Or the device, once approved and operational, can't ever be changed. Like children, make one mistake here and you have to live with it for the next 15-20 years. Even if the hardware and/or firmware could be updated, the rest of the infrastructure often can't.

Some firmware needs to be built with a guaranteed correspondence between the source code and the binary. This means not only using approved compilers from the late 1990s that cleanly translate the code without using any tricks or fancy optimisations, but also scouring eBay for the appropriate late-1990s hardware, because it's not guaranteed that the compiler running on current CPUs will produce the same result.

Don't bother asking "have you thought about using $shiny_new_thing from $vendor" (or its closely-related overgeneralisation "Moore's Law means that real soon now infinite CPU/memory/crypto will be available to anyone for free"). They're already aware of $shiny_new_thing, $shiny_other_thing, and $shiny_thing_you_havent_even_heard_of_yet, but aren't about to redo their entire hardware design, software toolchain, BSP, system firmware, certification, licensing, and product roadmap for any of them, no matter how shiny they are.

The device may have no, or only inadequate, entropy sources. Alternatively, if there is an entropy source, it may lose state when it's powered off (see the earlier comment on power management), requiring it to perform a time-consuming entropy-collection step before it can be used. Since this can trigger the watchdog (see the comment further down), it'll end up not being used. Any crypto protocol should therefore allow the entropy used in it to be injected by both parties, like TLS' client and server random values, because one party may not have any entropy to inject. In addition, it's best to prefer algorithms that aren't dependent on high-quality randomness (ECDSA is a prime example of something that fails catastrophically when there are problems with randomness).

Many SoCs have different portions developed by different vendors, and the only way to communicate between them is via predefined APIs. If you need entropy for your crypto and the entropy source is on a separate piece of IP that doesn't provide an entropy-extraction interface, you either need to spend twelve months negotiating access to the source, and pay handsomely for the privilege, or do without.

(The following is a special case that only applies to very constrained devices: as a variant of the above, there may be no accessible writeable non-volatile memory on your section of the device. Storing a seed for crypto keys may work when you bake it into the firmware, but you can't update it once the firmware is installed because there's no access to writeable nonvolatile memory, unless you negotiate it with one of the vendors whose IP has access to it.) Fuses are expensive, and per-device provisioning is prohibitively expensive for low-cost IoT chips.

As mentioned previously, certificates are handled by memcpy()ing a pre-encoded certificate blob to the output, and by seeking to the appropriate location in an incoming certificate and extracting { n, e } (note that that's { n, e }, not { p, q, g, y } or { p, a, b, G, n, ... }). If you've ever wondered why you can feed a device an expired, digital-signature-only certificate and use it for encryption, this is why (but see also the point on error handling below). This is precisely what you get when you take a hardware spec targeted at Cortex M0s, ColdFires, AVRs, and MSP430s and write a security spec that requires the use of a PKI.
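In case it's hard to picture, here's roughly what that memcpy()-style certificate handling looks like. The offsets and sizes below are made-up values for illustration; in a real device they're hardwired from the known, fixed layout of the only certificate profile the device will ever see:

  #include <stdint.h>
  #include <string.h>

  /* Pre-encoded certificate blob baked into the firmware image */
  extern const uint8_t ourCert[ 512 ];

  /* Made-up offsets of n and e within the peer's fixed-format
     certificate */
  #define N_OFFSET  198
  #define N_LENGTH  256     /* RSA-2048 modulus */
  #define E_OFFSET  ( N_OFFSET + N_LENGTH + 4 )
  #define E_LENGTH  3       /* Typically 0x010001 */

  /* "Encode" our certificate: memcpy() the canned blob to the output */
  size_t writeCert( uint8_t *output )
      {
      memcpy( output, ourCert, sizeof( ourCert ) );
      return sizeof( ourCert );
      }

  /* "Decode" the peer's certificate: seek to fixed offsets and read out
     { n, e }.  Nothing else (validity dates, key usage, the signature)
     is ever looked at, which is why the expired signature-only
     certificate mentioned above works fine for encryption */
  void readCert( const uint8_t *cert, uint8_t n[ N_LENGTH ],
                 uint8_t e[ E_LENGTH ] )
      {
      memcpy( n, cert + N_OFFSET, N_LENGTH );
      memcpy( e, cert + E_OFFSET, E_LENGTH );
      }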
The whole device may be implemented as a single event-driven loop, with no processes, threads, or different address spaces. In addition, there are hard real-time constraints on processing: you can't go off and do ECC or RSA or DH and halt all other processing while you do so, because the system watchdog will hard-reset the CPU if you spend too long on something. While it is possible, with some effort, to write a manually-timesliced modmult implementation, the result is horribly inefficient and a goldmine of timing channels. It's also painful to implement, and a specific implementation is completely tied to a particular CPU architecture and clock speed.

MSP430s. Apologies to all the embedded devs who have just gone into anaphylactic shock at the mention of that name. There are billions of these things in active use. For a very recent one (August 2016), look at http://www.ti.com/lit/ds/slase78a/slase78a.pdf.

Hardware-wise, a Raspberry Pi is a desktop PC, not an embedded device. So is an Alix APU, a BeagleBoard, a SheevaPlug, a CI20, a CubieBoard, a C.H.I.P, and any number of similar things that people like to cite as examples of IoT devices.

In general terms, errors are divided into two classes, recoverable and non-recoverable (this really is generalising a lot in order to avoid writing a small essay). Recoverable errors are typically handled by trying to find a way to continue, possibly in slightly degraded form. Non-recoverable errors are typically handled by forcing a hard fault, which restarts the system in a known-good state. For example one system that uses the event-loop model has, sprinkled throughout the code, "if ( errType == FATAL ) while ( 55 );" (the 55 has no special significance, it's just a way to force an endless loop, which causes the watchdog to reset the system; a sketch of this pattern is given at the end of this message). An expired certificate or incorrect key usage is a soft error for which the appropriate handling action is to continue, so there's no point in even checking for it (see the earlier point on the lack of checking for this sort of thing).

(The equivalent in standard PCs, which includes tablets, phones, and other devices, is to dump responsibility on the user, popping up a dialog that they have to click past in order to continue, but at least now it's the user's fault and not the developer's. Embedded systems developers don't have the luxury of doing this, but have to explicitly manage these types of error condition themselves. So when a protocol spec says SHOULD NOT or MUST NOT, then for standard PCs it means "throw up a warning/error dialog and blame the user if they continue", and for embedded devices it means "continue if possible". Have you ever seen a security spec of any kind that tells you what step to take next when a problem occurs?)

Development will be done by embedded systems engineers who are good at making things work in tough environments but aren't crypto experts. In addition, portions of the device won't always work as they should. Any crypto used had better be able to take a huge amount of abuse without failing: AES-CBC, even in the worst-case scenario of a constant, all-zero IV, at worst degrades to AES-ECB. AES-GCM (and related modes like AES-CTR), on the other hand, fails completely, for both confidentiality and integrity protection. And you don't even want to think about all the ways ECDSA can fail (see, for example, the issues with entropy and timing mentioned above).
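Finally, here's a sketch of the recoverable-vs-fatal error handling described above. The error type and handler are invented for the example, but the while ( 55 ) idiom is the real one quoted earlier, an endless loop that stalls the event loop until the watchdog hard-resets the system into a known-good state:

  typedef enum { RECOVERABLE, FATAL } ERR_TYPE;

  void handleError( const ERR_TYPE errType )
      {
      /* A nonrecoverable error: spin until the watchdog resets the
         system */
      if( errType == FATAL )
          while( 55 );

      /* A recoverable error: find some way to continue, possibly in
         slightly degraded form, for example by falling back to cached
         data or default settings */
      }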