Hi,
I'm writing a video codec. I need some of my memory aligned to 16-byte boundaries, because my asm code uses SSE2 instructions that assume this alignment. I was using a custom getmem that called the standard GetMem to get a memory block, then cast the pointer to an integer and adjusted it to the next aligned address. The offset from the original pointer was saved as a byte at (aligned_address - 1). The custom freemem took the aligned pointer, read the byte at (ptr - 1) to get the offset, restored the original pointer and called the standard FreeMem. This worked well, though the compiler complained that casting the pointer to an integer was not portable. Then I discovered the Align() function in the System unit, so I modified the code like this:

function evk_malloc (size: longword): pointer;
const
  ALIGNMENT = 16;
begin
  result := GetMem(size + ALIGNMENT);
  result := Align (result, ALIGNMENT);
end;

procedure evk_free (ptr: pointer);
begin
  Freemem(ptr);
end;

I assumed that FreeMem would handle the aligned pointer correctly. This even worked for a few weeks, until I added some more getmem/freemem calls to the code and it started crashing in freemem. Even valgrind complained. I discovered that my assumption was false: I need to free the original pointer. So I combined the two versions and came up with something like this:

function evk_malloc (size: longword): pointer;
const
  ALIGNMENT = 16;
var
  ptr: pointer;
begin
  ptr := getmem(size + ALIGNMENT);
  result := Align (ptr, ALIGNMENT);
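  //if ptr was already aligned, move on one full step so there is
  //always at least one byte in front of result for the offset
  if result = ptr then
      pbyte(result) += ALIGNMENT;
  //store offset to original address (1..ALIGNMENT, fits in a byte)
  (pbyte(result) - 1)^ := result - ptr;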
end;

procedure evk_free (ptr: pointer);
begin
  if ptr = nil then exit;
  //adjust to original address
  pbyte(ptr) -= pbyte(ptr-1)^ ;
  freemem(ptr);
end;
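
For reference, a small self-contained test program along these lines is one way to check that every returned pointer is 16-byte aligned and that freeing goes through the original address. This is only a sketch: the program name and the sizes tried are arbitrary, and it spells out the pointer arithmetic instead of using += and -=, so it should build without -Sc.

```pascal
program aligncheck;
{$mode objfpc}

const
  ALIGNMENT = 16;

function evk_malloc (size: longword): pointer;
var
  ptr: pointer;
begin
  ptr := GetMem(size + ALIGNMENT);
  result := Align (ptr, ALIGNMENT);
  //already aligned: step forward so one byte is free in front of result
  if result = ptr then
    result := pbyte(result) + ALIGNMENT;
  //store offset to original address (1..ALIGNMENT)
  (pbyte(result) - 1)^ := byte(pbyte(result) - pbyte(ptr));
end;

procedure evk_free (ptr: pointer);
begin
  if ptr = nil then exit;
  //adjust to original address
  ptr := pbyte(ptr) - (pbyte(ptr) - 1)^;
  FreeMem(ptr);
end;

var
  p: pointer;
  i: integer;
begin
  for i := 1 to 100 do
  begin
    p := evk_malloc(i * 3);
    //PtrUInt is pointer-sized on both 32-bit and 64-bit targets
    if PtrUInt(p) mod ALIGNMENT <> 0 then
      writeln('misaligned block for size ', i * 3);
    //touch the whole block so valgrind can flag overruns
    FillChar(p^, i * 3, 0);
    evk_free(p);
  end;
  writeln('done');
end.
```

Running it under valgrind should show whether any freemem call receives an address that getmem did not return.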

Is this correct to use? Are there any catches, for example when running on a 64-bit machine?

Also, does FPC support aligning stack variables and record members to 16-byte addresses, as GCC does (http://gcc.gnu.org/onlinedocs/gcc-4.2.2/gcc/Variable-Attributes.html#Variable-Attributes)? I would also like an array inside a rather large record to start at an aligned address for SSE2 use.

Best regards,

David


_______________________________________________
fpc-pascal maillist  -  fpc-pascal@lists.freepascal.org
http://lists.freepascal.org/mailman/listinfo/fpc-pascal
