Geert Bosch wrote:
> On Nov 19, 2010, at 11:53, Eric Botcazou wrote:
>   
>>> Yes, if all the people who want only one set of libraries agree on what
>>> that set shall be (or this can be selected with existing configure flags),
>>> this is the simplest way.
>>>       
>> Yes, this can be selected at configure time with --with-cpu and --with-float.
>>
>> The default configuration is also straightforward: LEON is an implementation 
>> of the SPARC-V8 architecture so --with-cpu=v8 and --with-float=hard.
>>     
>
> There is LEON2, which is V7, and LEON3/LEON4, which are V8.
> While LEON3 can support all of V8 in hardware, LEON3 is a 
> configurable system-on-a-chip, targeting both FPGAs and ASICs, 
> where users can configure and synthesize different aspects of
> the CPU:
>
> * CONFIG_PROC_NUM: The number of processor cores.
>
> * CONFIG_IU_V8MULDIV: Implements V8 multiply and divide instructions
>   UMUL, UMULCC, SMUL, SMULCC, UDIV, UDIVCC, SDIV, SDIVCC.
>   Costs about 8k gates.
>
> * CONFIG_IU_MUL_MAC: Implements the SPARC V8e UMAC/SMAC
>   (multiply-accumulate) instructions with a 40-bit accumulator.
>
> * CONFIG_FPU_ENABLE: Enable or disable floating point unit
>
> Apart from these settings that determine whether instructions are
> present at all, other settings allow selection of FPU implementation
> (trading off between cycle count, area and timing), such as:
>
> * CONFIG_IU_MUL_LATENCY_2: Implementation options for the integer multiplier.
>   Type        Implementation              issue-rate/latency
>   2-clocks    32x32 pipelined multiplier     1/2 
>   4-clocks    16x16 standard multiplier      4/4
>   5-clocks    16x16 pipelined multiplier     4/5
>
> * CONFIG_IU_LDELAY: One cycle load delay for best performance, or 2 cycles
>   to improve timing at the cost of about 5% reduced performance.
>
> CONFIG_FPU_ENABLE Y/N would correspond to --with-float=hard/soft, and
> I believe setting CONFIG_IU_V8MULDIV to Y/N requires --with-cpu=v8/v7,
> is that correct? I think it would make sense to build these as multilibs,
> so the user can experiment to find out performance impacts of
> the various hardware configurations on generated code.
>
> I wonder if it also would be worthwhile to have compiler options
> for fpu=fast/slow and multiply=fast/slow, so we can schedule
> appropriately. For the FPU, issue-rate/latency are as follows:
>   GR FPU:      1/4, with FDIV? 16 and FSQRT? 24 cycles,
>                     non-pipelined on separate unit
>   GR FPU Lite: 8/8, with FDIVS/FDIVD/FSQRTS/FSQRTD 31/57/46/57 cycles,
>                     non-pipelined on same unit
>
> While the FPU Lite is not pipelined, integer instructions can be
> executed in parallel with an FPU instruction as long as no new FPU
> instructions are pending.
>
>   -Geert
>
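A rough way to compare the multiplier options in the table quoted above is a small Python sketch like the one below. It assumes "issue-rate" means the number of cycles between the starts of two consecutive multiplies and "latency" the number of cycles until a result is available; that reading, the simplified timing model, and the names MULTIPLIERS and multiply_cycles are my own assumptions and not taken from the LEON documentation.

```python
# Issue-rate/latency pairs (in clock cycles) copied from the quoted table.
MULTIPLIERS = {
    "2-clocks": (1, 2),  # 32x32 pipelined multiplier
    "4-clocks": (4, 4),  # 16x16 standard multiplier
    "5-clocks": (4, 5),  # 16x16 pipelined multiplier
}


def multiply_cycles(option: str, count: int) -> int:
    """Estimate cycles for `count` back-to-back independent multiplies.

    Simplified model (an assumption): a new multiply can start every
    `issue_rate` cycles, and the last result is ready `latency` cycles
    after that multiply started.
    """
    if count <= 0:
        return 0
    issue_rate, latency = MULTIPLIERS[option]
    return (count - 1) * issue_rate + latency


if __name__ == "__main__":
    for name in MULTIPLIERS:
        print(f"{name}: {multiply_cycles(name, 10)} cycles for 10 multiplies")
```

This ignores dependencies between multiplies and everything else in the pipeline, so it only hints at why a scheduler might care which multiplier was synthesized.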
>   
Just a humble opinion: Geert points out a very important fact. LEON's
RTL is very configurable, and it would be a bit of a pity if the compiler
took that flexibility away. Maybe the user should always have the choice
of implementing any given feature in software or in hardware.
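
To make that choice concrete, here is a small Python sketch of the mapping Geert describes, from the two synthesis options to the configure flags Eric mentioned. The --with-cpu and --with-float flags come from this thread; the v7/v8 mapping for CONFIG_IU_V8MULDIV is Geert's unconfirmed guess, and the function names configure_flags and all_variants are hypothetical.

```python
from typing import Dict, List, Tuple


def configure_flags(fpu_enable: bool, v8_muldiv: bool) -> List[str]:
    """Map two LEON synthesis options to configure flags.

    fpu_enable corresponds to CONFIG_FPU_ENABLE, v8_muldiv to
    CONFIG_IU_V8MULDIV. The v8/v7 mapping is an assumption taken from
    the question in the quoted mail, not a confirmed rule.
    """
    cpu = "v8" if v8_muldiv else "v7"
    flt = "hard" if fpu_enable else "soft"
    return [f"--with-cpu={cpu}", f"--with-float={flt}"]


def all_variants() -> Dict[Tuple[bool, bool], List[str]]:
    """Enumerate the four combinations a multilib setup would have to cover."""
    return {
        (fpu, mul): configure_flags(fpu, mul)
        for fpu in (True, False)
        for mul in (True, False)
    }


if __name__ == "__main__":
    for (fpu, mul), flags in all_variants().items():
        print(f"FPU={fpu!s:5} MULDIV={mul!s:5} -> {' '.join(flags)}")
```

Four combinations is small enough that building them all as multilibs, as Geert suggests, looks manageable.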


Jorge

