Geert Bosch wrote:
> On Nov 19, 2010, at 11:53, Eric Botcazou wrote:
>
>>> Yes, if all the people who want only one set of libraries agree on what
>>> that set shall be (or this can be selected with existing configure flags),
>>> this is the simplest way.
>>>
>> Yes, this can be selected at configure time with --with-cpu and --with-float.
>>
>> The default configuration is also straightforward: LEON is an implementation
>> of the SPARC-V8 architecture so --with-cpu=v8 and --with-float=hard.
>>
>
> There is LEON2, which is V7, and LEON3/LEON4, which are V8.
> While LEON3 can support all of V8 in hardware, LEON3 is a
> configurable system-on-a-chip, targeting both FPGAs and ASICs,
> where users can configure and synthesize different aspects of
> the CPU:
>
>  * CONFIG_PROC_NUM: The number of processor cores.
>
>  * CONFIG_IU_V8MULDIV: Implements V8 multiply and divide instructions
>    UMUL, UMULCC, SMUL, SMULCC, UDIV, UDIVCC, SDIV, SDIVCC.
>    Costs about 8k gates.
>
>  * CONFIG_IU_MUL_MAC: Implements the SPARC V8e UMAC/SMAC
>    (multiply-accumulate) instructions with a 40-bit accumulator
>
>  * CONFIG_FPU_ENABLE: Enable or disable floating point unit
>
> Apart from these settings that determine whether instructions are
> present at all, other settings allow selection of FPU implementation
> (trading off between cycle count, area and timing), such as:
>
>  * CONFIG_IU_MUL_LATENCY_2: Implementation options for the integer multiplier.
>      Type       Implementation               issue-rate/latency
>      2-clocks   32x32 pipelined multiplier   1/2
>      4-clocks   16x16 standard multiplier    4/4
>      5-clocks   16x16 pipelined multiplier   4/5
>
>  * CONFIG_IU_LDELAY: One cycle load delay for best performance, or 2 cycles
>    to improve timing at the cost of about 5% reduced performance.
>
> CONFIG_FPU_ENABLE Y/N would correspond to --with-float=hard/soft, and
> I believe setting CONFIG_IU_V8MULDIV to Y/N requires --with-cpu=V8/V7,
> is that correct? I think it would make sense to build these as multilibs,
> so the user can experiment to find out performance impacts of
> the various hardware configurations on generated code.
>
> I wonder if it also would be worthwhile to have compiler options
> for fpu=fast/slow and multiply=fast/slow, so we can schedule
> appropriately. For the FPU, issue-rate/latency are as follows:
>   GR FPU:      1/4, with FDIV? 16 and FSQRT? 24 cycles,
>                non-pipelined on separate unit
>   GR FPU Lite: 8/8, with FDIVS/FDIVD/FSQRTS/FSQRTD 31/57/46/57 cycles,
>                non-pipelined on same unit
>
> While the FPU Lite is not pipelined, integer instructions can be
> executed in parallel with an FPU instruction as long as no new FPU
> instructions are pending.
>
>   -Geert
>

Just a humble opinion: Geert points out a very important fact. LEON's RTL
is very configurable, and it would be a bit of a pity if the compiler took
that flexibility away. Maybe the user should always have the choice of
implementing any given feature in either software or hardware.

Jorge
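
For illustration only, the correspondence Geert proposes between the two
LEON3 synthesis options and the GCC configure flags could be sketched as
below. This assumes his mapping is right (he asks for confirmation and the
quoted text does not settle it). The function names are hypothetical and
are not part of GCC or of the LEON tooling; only the --with-cpu and
--with-float flags come from the thread.

```python
from itertools import product


def configure_flags(v8_muldiv, fpu_enable):
    """Return the configure flags for one hardware configuration.

    Assumed mapping, as proposed (not confirmed) in the thread:
      CONFIG_IU_V8MULDIV Y/N -> --with-cpu=v8 / v7
      CONFIG_FPU_ENABLE  Y/N -> --with-float=hard / soft
    """
    cpu = "v8" if v8_muldiv else "v7"
    fp = "hard" if fpu_enable else "soft"
    return ["--with-cpu=" + cpu, "--with-float=" + fp]


def all_variants():
    """Enumerate the four cpu/float combinations a multilib set would cover."""
    return {
        (muldiv, fpu): configure_flags(muldiv, fpu)
        for muldiv, fpu in product((True, False), repeat=2)
    }


if __name__ == "__main__":
    for (muldiv, fpu), flags in all_variants().items():
        print("V8MULDIV=%s FPU=%s: %s" % (muldiv, fpu, " ".join(flags)))
```

The four entries correspond to the library variants that would have to be
built as multilibs if users are to try every combination of these two
hardware options.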