Hi Tom,

On Fri, 2 Aug 2024 at 11:34, Raymond Mao <raymond....@linaro.org> wrote:
> Hi Tom,
>
> On Thu, 1 Aug 2024 at 16:46, Tom Rini <tr...@konsulko.com> wrote:
>
>> On Wed, Jul 31, 2024 at 10:25:10AM -0700, Raymond Mao wrote:
>> >
>> > Integrate MbedTLS v3.6 LTS (currently v3.6.0) with U-Boot.
>> >
>> > Motivations:
>> > ------------
>> >
>> > 1. MbedTLS is well maintained with LTS versions.
>> > 2. LWIP is integrated with MbedTLS and easily to enable HTTPS.
>> > 3. MbedTLS recently switched license back to GPLv2.
>> >
>> > Prerequisite:
>> > -------------
>> >
>> > This patch series requires mbedtls git repo to be added as a
>> > subtree to the main U-Boot repo via:
>> > $ git subtree add --prefix lib/mbedtls/external/mbedtls \
>> >       https://github.com/Mbed-TLS/mbedtls.git \
>> >       v3.6.0 --squash
>> > Moreover, due to the Windows-style files from mbedtls git repo,
>> > we need to convert the CRLF endings to LF and do a commit manually:
>> > $ git add --renormalize .
>> > $ git commit
>> >
>> > New Kconfig options:
>> > --------------------
>> >
>> > `MBEDTLS_LIB` is for MbedTLS general switch.
>> > `MBEDTLS_LIB_CRYPTO` is for replacing original digest and crypto libs with
>> > MbedTLS.
>> > `MBEDTLS_LIB_X509` is for replacing original X509, PKCS7, MSCode, ASN1,
>> > and Pubkey parser with MbedTLS.
>> > `LEGACY_CRYPTO` is introduced as a main switch for legacy crypto library.
>> > `LEGACY_CRYPTO_BASIC` is for the basic crypto functionalities and
>> > `LEGACY_CRYPTO_CERT` is for the certificate related functionalities.
>> > For each of the algorithm, a pair of `<alg>_LEGACY` and `<alg>_MBEDTLS`
>> > Kconfig options are introduced. Meanwhile, `SPL_` Kconfig options are
>> > introduced.
>> >
>> > In this patch set, MBEDTLS_LIB, MBEDTLS_LIB_CRYPTO and MBEDTLS_LIB_X509
>> > are by default enabled in qemu_arm64_defconfig and sandbox_defconfig
>> > for testing purpose.
>> >
>> > Patches for external MbedTLS project:
>> > -------------------------------------
>> >
>> > Since U-Boot uses Microsoft Authentication Code to verify PE/COFFs
>> > executables which is not supported by MbedTLS at the moment,
>> > addtional patches for MbedTLS are created to adapt with the EFI loader:
>> > 1. Decoding of Microsoft Authentication Code.
>> > 2. Decoding of PKCS#9 Authenticate Attributes.
>> > 3. Extending MbedTLS PKCS#7 lib to support multiple signer's certificates.
>> > 4. MbedTLS native test suites for PKCS#7 signer's info.
>> >
>> > All above 4 patches (tagged with `mbedtls/external`) are submitted to
>> > MbedTLS project and being reviewed, eventually they should be part of
>> > MbedTLS LTS release.
>> > But before that, please merge them into U-Boot, otherwise the building
>> > will be broken when MBEDTLS_LIB_X509 is enabled.
>> >
>> > See below PR link for the reference:
>> > https://github.com/Mbed-TLS/mbedtls/pull/9001
>> >
>> > Miscellaneous:
>> > --------------
>> >
>> > Optimized MbedTLS library size by tailoring the config file
>> > and disabling all unnecessary features for EFI loader.
>> > From v2, original libs (rsa, asn1_decoder, rsa_helper, md5, sha1, sha256,
>> > sha512) are completely replaced when MbedTLS is enabled.
>> > From v3, the size-growth is slightly reduced by refactoring Hash functions.
>> >
>> > Target(QEMU arm64) size-growth when enabling MbedTLS:
>> > v1: 6.03%
>> > v2: 4.66%
>> > From v3: 4.55%
>> >
>> > Please see the latest output from buildman for size-growth on QEMU arm64,
>> > Sandbox and Nanopi A64.
>> > [1]
>>
>> Let us inline the growth on qemu_arm64 for a moment:
>>    aarch64: (for 1/1 boards) all +6916.0 bss -32.0 data -64.0 rodata +200.0 text +6812.0
>>       qemu_arm64 : all +6916 bss -32 data -64 rodata +200 text +6812
>>          u-boot: add: 28/-17, grow: 12/-16 bytes: 15492/-8304 (7188)
>>   function                          old   new  delta
>>   mbedtls_internal_sha1_process       -  4540  +4540
>>   mbedtls_internal_md5_process        -  2928  +2928
>>   mbedtls_internal_sha256_process     -  2052  +2052
>>   mbedtls_internal_sha512_process     -  1056  +1056
>>   K                                   -   896   +896
>>   mbedtls_sha512_finish               -   556   +556
>>   mbedtls_sha256_finish               -   484   +484
>>   mbedtls_sha1_finish                 -   420   +420
>>   mbedtls_sha512_starts               -   340   +340
>>   mbedtls_md5_finish                  -   336   +336
>>   mbedtls_sha512_update               -   264   +264
>>   mbedtls_sha256_update               -   252   +252
>>   mbedtls_sha1_update                 -   236   +236
>>   mbedtls_md5_update                  -   236   +236
>>   mbedtls_sha512                      -   148   +148
>>   mbedtls_sha256_starts               -   124   +124
>>   hash_init_sha512                   52   128    +76
>>   hash_init_sha256                   52   128    +76
>>   mbedtls_sha1_starts                 -    72    +72
>>   mbedtls_md5_starts                  -    60    +60
>>   hash_init_sha1                     52   112    +60
>>   mbedtls_platform_zeroize            -    56    +56
>>   mbedtls_sha512_free                 -    16    +16
>>   mbedtls_sha256_free                 -    16    +16
>>   mbedtls_sha1_free                   -    16    +16
>>   mbedtls_md5_free                    -    16    +16
>>   hash_finish_sha512                 72    88    +16
>>   hash_finish_sha256                 72    88    +16
>>   hash_finish_sha1                   72    88    +16
>>   sha512_csum_wd                     68    80    +12
>>   sha256_csum_wd                     68    80    +12
>>   sha1_csum_wd                       68    80    +12
>>   md5_wd                             68    80    +12
>>   mbedtls_sha512_init                 -    12    +12
>>   mbedtls_sha256_init                 -    12    +12
>>   mbedtls_sha1_init                   -    12    +12
>>   mbedtls_md5_init                    -    12    +12
>>   memset_func                         -     8     +8
>>   sha512_update                       4     8     +4
>>   sha384_update                       4     8     +4
>>   sha256_update                      12     8     -4
>>   sha1_update                        12     8     -4
>>   sha256_process                     16     -    -16
>>   sha1_process                       16     -    -16
>>   hash_update_sha512                 36    16    -20
>>   hash_update_sha256                 36    16    -20
>>   hash_update_sha1                   36    16    -20
>>   MD5Init                            56    36    -20
>>   sha1_starts                        60    36    -24
>>   hash_update_sha384                 36     -    -36
>>   hash_init_sha384                   52     -    -52
>>   sha384_csum_wd                     68    12    -56
>>   sha256_starts                     104    40    -64
>>   sha256_padding                     64     -    -64
>>   sha1_padding                       64     -    -64
>>   hash_finish_sha384                 72     -    -72
>>   sha512_finish                     152    36   -116
>>   sha512_starts                     168    40   -128
>>   sha384_starts                     168    40   -128
>>   sha384_finish                     152     4   -148
>>   MD5Final                          196    44   -152
>>   sha512_base_do_finalize           160     -   -160
>>   static.sha256_update              228     -   -228
>>   static.sha1_update                240     -   -240
>>   sha512_base_do_update             244     -   -244
>>   MD5Update                         260     -   -260
>>   sha1_finish                       300    36   -264
>>   sha256_finish                     404    36   -368
>>   sha256_armv8_ce_process           428     -   -428
>>   sha1_armv8_ce_process             484     -   -484
>>   sha512_K                          640     -   -640
>>   sha512_block_fn                  1212     -  -1212
>>   MD5Transform                     2552     -  -2552
>>
>> And to start with, that's not bad. In fact, tossing LTO in before mbedTLS
>> only changes the top-line a little:
>>    aarch64: (for 1/1 boards) all +5120.0 bss -16.0 data -64.0 rodata +200.0 text +5000.0
>>       qemu_arm64 : all +5120 bss -16 data -64 rodata +200 text +5000
>>          u-boot: add: 19/-18, grow: 11/-7 bytes: 14696/-7884 (6812)
>>
>> But, is there something we can do still? mbedTLS is a more robust
>> solution and I'm accepting there will be growth. But still the
>> process/start/finish is much larger. Is there something configurable
>> there?
>>
> I have investigated all those MbedTLS native functions with big-size
> (_process/_update/_finish).
> For MD5 and SHA1, we don't have turnable configs.
> For SHA256 and SHA512, there are a few configs:
> 1. Performance configs only for Armv8/a64.
>    I didn't turn that on, which might affect the target size as well.
> 2. Smaller implementation with lower size (only for non-Armv8/a64) at the
>    expense of losing performance.
> I didn't enable both, as #1 is more for performance and might potentially
> increase target size;
> #2 compromises the performance and only for non-Armv8/a64.
> Looks like that both don't help in reducing the size of qemu_arm64.
> But I will try #1 on qemu_arm64 and #2 on sandbox and let you know
> the size impact soon.
>

The smaller-footprint implementation for SHA256/512 significantly reduces the
size of the "<hash>_process()" functions. On qemu_arm64, for example, the
overall growth drops from +6916 to +4608 bytes. Please see the buildman output
below:

```
aarch64: (for 2/2 boards) all -1468.0 bss +16.0 data -64.0 rodata +200.0 text -1620.0
   qemu_arm64 : all +4608 bss +80 data -64 rodata +200 text +4392
      u-boot: add: 29/-17, grow: 12/-16 bytes: 13072/-8304 (4768)
   nanopi_a64 : all -7544 bss -48 data -64 rodata +200 text -7632
      u-boot: add: 21/-8, grow: 4/-8 bytes: 10692/-4364 (6328)
sandbox: (for 1/1 boards) all +19312.0 data +1440.0 rodata -4128.0 text +22000.0
   sandbox : all +19312 data +1440 rodata -4128 text +22000
      u-boot: add: 258/-206, grow: 122/-59 bytes: 90286/-76286 (14000)
```

Since this is a trade-off between size and performance, I will add one more
Kconfig option to allow the user to turn it on/off. What are your thoughts?

On the other hand, the "Armv8/a64 only" options depend on NEON instructions,
so I will keep them off.

Regards,
Raymond
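
For illustration only, the on/off switch proposed above could be wired into
the MbedTLS config header roughly as sketched below. This is a minimal sketch,
not the actual patch: `CONFIG_MBEDTLS_SHA_SMALLER_SKETCH` is a hypothetical
symbol name, `sha_smaller_selected()` is a hypothetical helper that exists only
to make the result checkable, and the sketch assumes the "smaller
implementation" refers to MbedTLS's `MBEDTLS_SHA256_SMALLER` and
`MBEDTLS_SHA512_SMALLER` options.

```c
/*
 * Sketch only. CONFIG_MBEDTLS_SHA_SMALLER_SKETCH is a hypothetical name
 * standing in for the proposed Kconfig option; it is not an existing
 * U-Boot symbol. In a real build it would come from the generated
 * configuration rather than being defined by hand.
 */
#define CONFIG_MBEDTLS_SHA_SMALLER_SKETCH 1

/* Select the smaller, slower SHA-256/SHA-512 code when the switch is on. */
#ifdef CONFIG_MBEDTLS_SHA_SMALLER_SKETCH
#define MBEDTLS_SHA256_SMALLER
#define MBEDTLS_SHA512_SMALLER
#endif

/*
 * Hypothetical helper: returns 1 when both small-footprint options are
 * selected and 0 otherwise.
 */
int sha_smaller_selected(void)
{
#if defined(MBEDTLS_SHA256_SMALLER) && defined(MBEDTLS_SHA512_SMALLER)
	return 1;
#else
	return 0;
#endif
}
```

Leaving the hypothetical symbol undefined would keep the default, faster
implementations, so boards that are not size-constrained would see no change.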