On 1/18/2013 1:25 AM, Jeffrey Walton wrote:
...
That's actually covered in the FIPS User Guide.

3.2.3 Assembler Optimizations
...

For the x86/x86-64 and ARM processors several levels of optimization
are supported by the code. Note that most such optimizations, if
compiled into executable code, are selectively enabled at runtime
depending on the capabilities of the target processor. If the Module
is built and executed on the same platform (the build-time and
run-time systems are the same) then the appropriate optimization will
be utilized (assuming that the build+target system corresponds to a
formally tested platform).

For x86-64 there are three possible optimization levels:
   1. No optimization (plain C)
   2. SSE2 optimization
   3. AES-NI+PCLMULQDQ+SSSE3 optimization

Note that other theoretically possible combinations (e.g. AES-NI only,
or SSE3 only) are not addressed individually, so that a processor
which does not support all three of AES-NI, PCLMULQDQ, and SSSE3 will
fall back to only SSE2 optimization. The runtime environment variable
OPENSSL_ia32cap=~0x200000200000000 disables use of AES-NI, PCLMULQDQ,
and SSSE3 optimizations for x86-64.
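
As an illustration (not taken from the User Guide), the fallback rule
and the mask value above can be modelled in a few lines of Python. This
is a simplified sketch, not OpenSSL source code: the function names are
hypothetical, and the bit names assume the numbering described in the
OPENSSL_ia32cap documentation (bit 33 for PCLMULQDQ, bit 57 for AES-NI).

```python
# Simplified model of the x86-64 rules quoted above; not OpenSSL source code.

MASK = 0x200000200000000  # value from OPENSSL_ia32cap=~0x200000200000000

# Assumed bit numbering, as described in the OPENSSL_ia32cap documentation.
BIT_NAMES = {33: "PCLMULQDQ", 57: "AES-NI"}


def masked_bits(mask):
    """Return the bit positions (0 = least significant) that are set in mask."""
    return [i for i in range(64) if (mask >> i) & 1]


def apply_tilde_mask(detected, mask=MASK):
    """Model the '~' prefix: clear the masked bits from a 64-bit capability word."""
    return detected & ~mask & 0xFFFFFFFFFFFFFFFF


def select_x86_64_level(features):
    """Pick one of the three listed levels from a set of feature names
    (hypothetical helper that mirrors the fallback rule in the text)."""
    if {"AES-NI", "PCLMULQDQ", "SSSE3"} <= set(features):
        return "AES-NI+PCLMULQDQ+SSSE3"
    if "SSE2" in features:
        return "SSE2"
    return "plain C"


if __name__ == "__main__":
    print([BIT_NAMES.get(b, b) for b in masked_bits(MASK)])
    print(select_x86_64_level({"SSE2", "SSSE3", "AES-NI"}))
```

The mask has exactly two bits set, 33 and 57. In this model, removing
any one of the three features from the set is enough to drop the result
to the SSE2 level, which matches the "all three or fall back" wording.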

For ARM there are two possible optimization levels:
   1. Without NEON
   2. With NEON (ARM7 only)

The runtime variable OPENSSL_armcap=0 disables use of NEON
optimizations for ARM.

In the case where the build and runtime systems are different, care
must be taken to verify that the optimizations enabled at run-time on
the target system correspond to a formally tested platform. For
instance, if "Windows on x86 32bit" was formally tested but "Windows
on x86 with AES-NI 32 bit" was not, then a Module built on an AES-NI
capable build system would be validated when executed on a non-AES-NI
capable target processor, but would not be validated when executed on
an AES-NI capable system (such as the build system itself).
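
The example in the quoted paragraph can be restated as a small check.
This is only a sketch of the reasoning, assuming that what matters is
the platform label that results from the optimizations enabled at run
time. The labels and function names are hypothetical and are not taken
from any validation certificate.

```python
# Sketch of the quoted example: validation status depends on the optimizations
# enabled on the target at run time, not on the capabilities of the build host.

def effective_platform(base, target_has_aesni):
    """Return the platform label that results at run time (hypothetical labels)."""
    return base + " with AES-NI" if target_has_aesni else base


def is_validated(base, target_has_aesni, tested_platforms):
    """True if the run-time platform label is among the formally tested ones."""
    return effective_platform(base, target_has_aesni) in tested_platforms


# Assumption for the example: only the non-AES-NI variant was formally tested.
TESTED = {"Windows on x86 32bit"}

if __name__ == "__main__":
    for has_aesni in (False, True):
        print(has_aesni, is_validated("Windows on x86 32bit", has_aesni, TESTED))
```

Note that the build host does not appear as a parameter at all: in this
model the same binary is validated on one target and not on another.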


This sounds very wrong!

For platforms with runtime capability detection (such as x86 and
x86_64), modules compiled for that target platform *on any computer
capable of compiling or cross compiling for that target* should
include all the run-time selectable variants.

Otherwise users who use robotic autobuilders running on a farm of build
machines will be getting somewhat random results depending on which
machine picks up the build job on any given day.

With that approach, there is only one possible x86 compilation result,
with 3 possible hardware-dependent runtime behaviors, not a 3x3 matrix
of possible host/target capability combinations.  Ditto for x86_64
(maybe 2 possibilities, not 2x2) and ARM (2 possibilities, not 2x2).
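
To make the counting concrete, here is a small sketch that enumerates
both cases for a platform with three capability levels. The level names
are placeholders, and the model assumes nothing beyond "one level per
build host, one level per target".

```python
# Count the cases to consider when the build result does or does not depend
# on the build host's capabilities. Level names are placeholders.
from itertools import product

LEVELS = ["level 1", "level 2", "level 3"]

# Build result depends on the host: every (host level, target level) pair differs.
host_dependent = list(product(LEVELS, LEVELS))

# Build result is the same on any host: only the target's level matters.
host_independent = [("any host", target) for target in LEVELS]

if __name__ == "__main__":
    print(len(host_dependent), len(host_independent))
```

With three levels this gives 9 host/target pairs in the first case and
3 in the second.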


Enjoy

Jakob
--
Jakob Bohm, CIO, Partner, WiseMo A/S.  http://www.wisemo.com
Transformervej 29, 2730 Herlev, Denmark.  Direct +45 31 13 16 10
This public discussion message is non-binding and may contain errors.
WiseMo - Remote Service Management for PCs, Phones and Embedded
______________________________________________________________________
OpenSSL Project                                 http://www.openssl.org
User Support Mailing List                    openssl-users@openssl.org
Automated List Manager                           majord...@openssl.org
