On 02.02.16 18:49, Mark Rutland wrote:
> On Tue, Feb 02, 2016 at 03:45:01AM +0100, Alexander Graf wrote:
>> EFI uses the PE binary format for its application images. Add support to EFI 
>> PE
>> binaries as well as all necessary bits for the "EFI image loader" interfaces.
>>
>> Signed-off-by: Alexander Graf <ag...@suse.de>
>>
>> ---
>>
>> v1 -> v2:
>>
>>   - move memory allocation to separate patch
>>   - limit 32/64 to hosts that support it
>>   - check 32bit optional nt header magic
>>   - switch to GPL2+
>>
>> v2 -> v3:
>>
>>   - use efi_alloc
>>   - add EFIAPI to function prototypes
>>   - remove unused macros
>>   - reorder header inclusion
>>   - split relocation code into function
>>   - flush cache after loading
>> ---
>>  include/efi_loader.h              |  20 +++
>>  include/pe.h                      | 263 
>> ++++++++++++++++++++++++++++++++++++++
>>  lib/efi_loader/efi_image_loader.c | 182 ++++++++++++++++++++++++++
>>  3 files changed, 465 insertions(+)
>>  create mode 100644 include/efi_loader.h
>>  create mode 100644 include/pe.h
>>  create mode 100644 lib/efi_loader/efi_image_loader.c
> 
> [...]
> 
>> +static void efi_loader_relocate(const IMAGE_BASE_RELOCATION *rel,
>> +                    unsigned long rel_size, void *efi_reloc)
>> +{
>> +    const IMAGE_BASE_RELOCATION *end;
>> +    int i;
>> +
>> +    end = (const IMAGE_BASE_RELOCATION *)((const char *)rel + rel_size);
>> +    while (rel < end - 1 && rel->SizeOfBlock) {
>> +            const uint16_t *relocs = (const uint16_t *)(rel + 1);
>> +            i = (rel->SizeOfBlock - sizeof(*rel)) / sizeof(uint16_t);
>> +            while (i--) {
>> +                    uint16_t offset = (*relocs & 0xfff) +
>> +                                      rel->VirtualAddress;
>> +                    int type = *relocs >> 12;
>> +                    unsigned long delta = (unsigned long)efi_reloc;
>> +                    uint64_t *x64 = efi_reloc + offset;
>> +                    uint32_t *x32 = efi_reloc + offset;
>> +                    uint16_t *x16 = efi_reloc + offset;
>> +
>> +                    switch (type) {
>> +                    case IMAGE_REL_BASED_ABSOLUTE:
>> +                            break;
>> +                    case IMAGE_REL_BASED_HIGH:
>> +                            *x16 += ((uint32_t)delta) >> 16;
>> +                            break;
>> +                    case IMAGE_REL_BASED_LOW:
>> +                            *x16 += (uint16_t)delta;
>> +                            break;
>> +                    case IMAGE_REL_BASED_HIGHLOW:
>> +                            *x32 += (uint32_t)delta;
>> +                            break;
>> +                    case IMAGE_REL_BASED_DIR64:
>> +                            *x64 += (uint64_t)delta;
>> +                            break;
>> +                    default:
>> +                            printf("Unknown Relocation off %x type %x\n",
>> +                                   offset, type);
>> +                    }
>> +                    relocs++;
>> +            }
>> +            rel = (const IMAGE_BASE_RELOCATION *)relocs;
>> +    }
>> +}
> 
> [...]
> 
>> +    /* Load sections into RAM */
>> +    for (i = num_sections - 1; i >= 0; i--) {
>> +            IMAGE_SECTION_HEADER *sec = &sections[i];
>> +            memset(efi_reloc + sec->VirtualAddress, 0,
>> +                   sec->Misc.VirtualSize);
>> +            memcpy(efi_reloc + sec->VirtualAddress,
>> +                   efi + sec->PointerToRawData,
>> +                   sec->SizeOfRawData);
>> +    }
>> +
>> +    /* Run through relocations */
>> +    efi_loader_relocate(rel, rel_size, efi_reloc);
>> +
>> +    /* Flush cache */
>> +    flush_cache((ulong)efi_reloc, virt_size);
> 
> Where's the I-cache maintenance for the image performed? I can't see it
> here and I didn't spot it in later patches.

I've added a call to invalidate_icache_all(); at this spot now.

> 
> Given that speculative instruction fetches can happen at any time for
> anything not marked NX, there may already be stale entries in the
> I-caches.
> 
> Also, flush_cache seems to perform DC CIVAC in a loop, which is
> excessively expensive. To make the instructions visible to instruction
> fetches you only need DC CVAU (i.e. clean by VA to the PoU), and you
> only need to do that for executable sections.

While I see where you're trying to aim, I don't think the performance
overhead will be really worth premature optimization. flush_cache is a
globally available operation for all u-boot target platforms. Having
something that is platform agnostic (and very simple) is worth more than
a few us performance for now imho.


Alex
_______________________________________________
U-Boot mailing list
U-Boot@lists.denx.de
http://lists.denx.de/mailman/listinfo/u-boot

Reply via email to