Re: [Qemu-devel] Poking a sun4v machine

Alexander Graf Sun, 06 May 2012 02:16:45 -0700


On 06.05.2012 at 11:13, Blue Swirl <blauwir...@gmail.com> wrote:


> On Sun, May 6, 2012 at 8:58 AM, Alexander Graf <ag...@suse.de> wrote:
>> 
>> 
>> On 06.05.2012 at 10:29, Blue Swirl <blauwir...@gmail.com> wrote:
>> 
>>> On Wed, May 2, 2012 at 2:38 PM, Artyom Tarasenko <atar4q...@gmail.com> wrote:
>>>> On Tue, May 1, 2012 at 4:06 PM, Blue Swirl <blauwir...@gmail.com> wrote:
>>>>> On Tue, May 1, 2012 at 13:54, Artyom Tarasenko <atar4q...@gmail.com> wrote:
>>>>>> On Tue, May 1, 2012 at 11:25 AM, Blue Swirl <blauwir...@gmail.com> wrote:
>>>>>>> On Mon, Apr 30, 2012 at 17:38, Artyom Tarasenko <atar4q...@gmail.com> wrote:
>>>>>>>> On Mon, Apr 30, 2012 at 7:15 PM, Andreas Färber <afaer...@suse.de> wrote:
>>>>>>>>> On 30.04.2012 18:39, Artyom Tarasenko wrote:
>>>>>>>>>> Tried to boot QEMU Niagara machine with the firmware from the
>>>>>>>>>> OpenSPARC T1 emulator (www.opensparc.net/opensparc-t1/download.html),
>>>>>>>>>> and it dies very early.
>>>>>>>>>> The reason: in translate.c
>>>>>>>>>> 
>>>>>>>>>> #define hypervisor(dc) (dc->mem_idx == MMU_HYPV_IDX)
>>>>>>>>>> #define supervisor(dc) (dc->mem_idx >= MMU_KERNEL_IDX)
>>>>>>>>>> 
>>>>>>>>>> and the dc->mem_idx is initialized like this:
>>>>>>>>>> 
>>>>>>>>>>     if (env1->tl > 0) {
>>>>>>>>>>         return MMU_NUCLEUS_IDX;
>>>>>>>>>>     } else if (cpu_hypervisor_mode(env1)) {
>>>>>>>>>>         return MMU_HYPV_IDX;
>>>>>>>>>>     } else if (cpu_supervisor_mode(env1)) {
>>>>>>>>>>         return MMU_KERNEL_IDX;
>>>>>>>>>>     } else {
>>>>>>>>>>         return MMU_USER_IDX;
>>>>>>>>>>     }
>>>>>>>>>> 
>>>>>>>>>> This seems to be conceptually incorrect. After reset, tl == MAXTL, but
>>>>>>>>>> the supervisor and hypervisor bits are still set, so both supervisor(dc)
>>>>>>>>>> and hypervisor(dc) must return 1, which is impossible in the current
>>>>>>>>>> implementation.
>>>>>>>>>> 
>>>>>>>>>> What would be the proper way to fix it? Make mem_idx a bitmap, add two
>>>>>>>>>> more variables to DisasContext, or ...?
>>>>>>>>>> 
>>>>>>>>>> Some other findings/questions:
>>>>>>>>>> 
>>>>>>>>>>     /* Sun4v generic Niagara machine */
>>>>>>>>>>     {
>>>>>>>>>>         .default_cpu_model = "Sun UltraSparc T1",
>>>>>>>>>>         .console_serial_base = 0xfff0c2c000ULL,
>>>>>>>>>> 
>>>>>>>>>> Where is this address coming from? The OpenSPARC Niagara machine has a
>>>>>>>>>> "dumb serial" at 0x1f10000000ULL.
>>>>>>>>>> 
>>>>>>>>>> And the biggest issue: UA2005 (as well as UA2007) describes a totally
>>>>>>>>>> different format for an MMU TTE entry than the one sun4u CPUs use.
>>>>>>>>>> I think the best way to handle it would be splitting off the Niagara
>>>>>>>>>> machine, and #defining MMU bits differently for sun4u and sun4v
>>>>>>>>>> machines.
>>>>>>>>>> 
>>>>>>>>>> Do we have cases in QEMU where more than two binaries
>>>>>>>>>> (qemu-system-xxx and qemu-system-xxx64) are produced?
>>>>>>>>>> Would the name qemu-system-sun4v fit the naming convention?
>>>>>>>>> 
>>>>>>>>> We have such a case for ppc (ppcemb) and it is kind of a maintenance
>>>>>>>>> nightmare - I'm working towards getting rid of it with my QOM CPU work.
>>>>>>>>> Better avoid it for sparc in the first place.
>>>>>>>>> 
>>>>>>>>> Instead, you should add a callback function pointer to SPARCCPUClass
>>>>>>>>> that you initialize based on CPU model so that it behaves differently
>>>>>>>>> at runtime rather than at compile time.
>>>>>>>>> Or if it's just about the class_init then after the Hard Freeze I can
>>>>>>>>> start polishing my subclasses for sparc so that you can add a special
>>>>>>>>> class_init for Niagara.
>>>>>>>> 
>>>>>>>> But this would mean that the defines from
>>>>>>>> #define TTE_NFO_BIT (1ULL << 60)
>>>>>>>> to
>>>>>>>> #define TTE_PGSIZE(tte)     (((tte) >> 61) & 3ULL)
>>>>>>>> 
>>>>>>>> inclusive would need to be replaced with functions and variables?
>>>>>>>> Sounds like a further performance regression for sun4u?
>>>>>>> 
>>>>>>> There could be parallel definitions for sun4u (actually, from
>>>>>>> UltraSparc-III onwards the MMU is different again) and sun4v.
>>>>>>> 
>>>>>>> At tlb_fill(), different implementations can be selected based on MMU
>>>>>>> model. For ASI accesses, we can add conditional code but for higher
>>>>>>> performance, some checks can be moved to translation time.
>>>>>> 
>>>>>> Can be done, but what is the gain of having it runtime configurable?
>>>>> 
>>>>> I was thinking of code like this in:
>>>>> 
>>>>> switch (env->mmu_model) {
>>>>> case MMU_US2:
>>>>>   return tlb_fill_us2(..);
>>>>> case MMU_US3:
>>>>>   return tlb_fill_us3(..);
>>>>> case MMU_US4:
>>>>>   return tlb_fill_us4(..);
>>>>> case MMU_T1:
>>>>>   return tlb_fill_t1(..);
>>>>> case MMU_T2:
>>>>>   return tlb_fill_t2(..);
>>>>> }
>>>>> 
>>>>> The performance cost shouldn't be too high. Alternatively a function
>>>>> pointer could be set up.
>>>> 
>>>> Actually I was more worried about get_physical_address_* than filling,
>>>> there we would have to use variables instead of constants and
>>>> functions instead of macros.
>>> 
>>> Preferably entirely different functions with constants.
>>> 
>>>> 
>>>>> Yes, we can always provide the register bank, older models just access
>>>>> some of those.
>>>>> 
>>>>>> cpu_change_pstate should probably have another parameter (new_GL)
>>>>>> which is only valid for sun4v.
>>>>>> And, depending on a trap type, env->htba has to be taken instead of
>>>>>> env->tbr. To me it looks like at the end do_interrupt will have less
>>>>>> common parts between sun4u and sun4v than specific ones.
>>>>> 
>>>>> Same as tlb_fill(), switch() or function pointer. The functions are different.
>>>>> 
>>>>> This is unavoidable (unless maybe in the future the TLB handling can
>>>>> be pushed partially higher so mmu_idx parameters can be eliminated)
>>>>> and the performance cost is not great.
>>>> 
>>>> So, altogether you'd still prefer run-time checks over having
>>>> qemu-system-sun4v (or -sparc64v) ?
>>> 
>>> Yes. Architectures are not meant to handle small issues like this.
>>> Should performance become a problem, there is plenty of lower-hanging
>>> fruit to start optimizing first.
>>> 
>>> Even in this case, rather than the new architecture solution, it could
>>> be possible to build separate TLB handlers which directly call the
>>> correct MMU functions without switches and these would be selected at
>>> translation time or earlier. For the PPCEMB case, maybe the memory API
>>> could be changed to handle different page sizes without loss of
>>> performance, I don't know. Devices should not depend on
>>> TARGET_PAGE_SIZE.
>> 
>> It's not a matter of an API. The main problem is that the QEMU TLB has to be 
>> fine grained enough to handle 1k faults, so it has to be in 1k-steps in its 
>> current design.
>> 
>> That'd hurt performance quite a bit. The softmmu already is a very big chunk 
>> of execution time on ppc and I really don't want that number to go up.
> 
> Yes, but that's not what I proposed. Now the translator arranges a
> call to, for example, qemu_ld32u, which uses this fixed TLB size.
> Instead, it should generate calls to qemu_ld32u_ppcemb (which should
> use 1k pages) or qemu_ld32u_ppc (4k?) as needed, maybe also combining
> with MMU_IDX they would expand to qemu_ld32u_ppc_hypv etc. Obviously
> this would need big changes everywhere and MMU_IDX change should be
> compared with the negative cache effects of having many separate,
> often used functions in the hot path.

Ah, I see :). Yeah, that'd work and sounds like a great idea!

Alex

> 
> Memory API (or actually the related pieces in exec.c) hardcodes
> TARGET_PAGE_SIZE assumptions in many places, but it might be possible
> to make this dynamic.
> 
>> 
>> 
>> Alex
>>
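
The function-pointer alternative to the switch() discussed above can be
sketched roughly as follows. This is only an illustration, not QEMU code:
every Demo*/demo_* name is hypothetical, the sun4u decoder mirrors the
TTE_PGSIZE() macro quoted in the thread, and the sun4v decoder assumes the
size field sits in the low four bits of the TTE data word, which should be
checked against the UA2005 specification before relying on it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical MMU model identifiers (only two of the models named in the thread). */
typedef enum { DEMO_MMU_US2, DEMO_MMU_T1, DEMO_MMU_COUNT } DemoMMUModel;

/* sun4u-style decoder: same bit positions as the quoted TTE_PGSIZE() macro. */
static unsigned demo_tte_pgsize_us2(uint64_t tte)
{
    return (unsigned)((tte >> 61) & 3ULL);
}

/* sun4v-style decoder: ASSUMED layout, size in the low four bits. */
static unsigned demo_tte_pgsize_t1(uint64_t tte)
{
    return (unsigned)(tte & 0xfULL);
}

/* Hypothetical per-CPU state holding the decoder chosen at init time. */
typedef struct DemoCPU {
    DemoMMUModel mmu_model;
    unsigned (*tte_pgsize)(uint64_t tte);
} DemoCPU;

/* Pick the decoder once, so callers never branch on the MMU model. */
void demo_cpu_init_mmu(DemoCPU *cpu, DemoMMUModel model)
{
    /* Table order must match the DemoMMUModel enumerator order. */
    static unsigned (*const decoders[DEMO_MMU_COUNT])(uint64_t) = {
        demo_tte_pgsize_us2,
        demo_tte_pgsize_t1,
    };

    assert(cpu != 0);
    assert((unsigned)model < (unsigned)DEMO_MMU_COUNT);
    cpu->mmu_model = model;
    cpu->tte_pgsize = decoders[model];
}
```

Each decoder keeps its bit positions as compile-time constants, in line with
the "entirely different functions with constants" preference above; the only
run-time cost is one indirect call through cpu->tte_pgsize.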
