Re: Optimization tips for alpha blending / rasterization loop

Mikko Ronkainen Fri, 22 Nov 2013 07:04:35 -0800

Do you want to use a ubyte instead of a byte here?

Yes, that was a silly mistake. It seems that fixing it removed the need for all the masking operations, which gave the biggest speedup.

Also, for your alpha channel:

int alpha = (fg[3] & 0xff) + 1;
int inverseAlpha = 257 - alpha;

If fg[3] = 0 then inverseAlpha = 256, which is out of the range
that can be stored in a ubyte.

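I think my logic should be correct. The calculations are done with ints, and the result is then just cast down to a ubyte. The reason for the +1 is the >> 8, which divides by 256. Since alpha and inverseAlpha always sum to 257, the weighted sum is at most 255 * 257 = 65535, so the shifted result always fits in a ubyte.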
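To check the extremes, here is a minimal sketch of the per-channel arithmetic. Note that blendChannel is only a hypothetical helper for illustration, it is not part of the actual code:

```d
// Hypothetical helper (illustration only): blends one 8-bit channel
// with the same integer arithmetic as the drawing loop.
ubyte blendChannel(ubyte fg, ubyte bg, ubyte a)
{
    immutable uint alpha = a + 1;           // 1 .. 256
    immutable uint invAlpha = 257 - alpha;  // 256 .. 1, only ever held in a uint

    // alpha + invAlpha == 257, so the sum is at most 255 * 257 == 65535
    // and the value after >> 8 is at most 255.
    return cast(ubyte)((alpha * fg + invAlpha * bg) >> 8);
}

unittest
{
    // a == 255: the foreground fully replaces the background.
    assert(blendChannel(200, 17, 255) == 200);
    // a == 0: invAlpha is 256, yet the result still fits in a ubyte.
    assert(blendChannel(0, 255, 0) == 255);
    // Worst case for overflow: both channels at 255.
    assert(blendChannel(255, 255, 128) == 255);
}
```

So invAlpha = 256 is never stored in a ubyte; it only exists as an intermediate uint, and the final value is back in the 0 .. 255 range before the cast.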


class Framebuffer
{
  uint[] framebufferData;
  uint framebufferWidth;
  uint framebufferHeight;
}

void drawRectangle(Framebuffer framebuffer, uint x, uint y, uint width, uint height, uint color)
{
  immutable ubyte* fg = cast(immutable ubyte*)&color;
  immutable uint alpha = fg[3] + 1;
  immutable uint invAlpha = 257 - alpha;
  immutable uint afg0 = alpha * fg[0];
  immutable uint afg1 = alpha * fg[1];
  immutable uint afg2 = alpha * fg[2];

  foreach (i; y .. y + height)
  {
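    // Index of the first pixel of this row; field names match the Framebuffer class.
    uint start = x + i * framebuffer.framebufferWidth;

    foreach (j; 0 .. width)
    {
      ubyte* bg = cast(ubyte*)(&framebuffer.framebufferData[start + j]);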

      bg[0] = cast(ubyte)((afg0 + invAlpha * bg[0]) >> 8);
      bg[1] = cast(ubyte)((afg1 + invAlpha * bg[1]) >> 8);
      bg[2] = cast(ubyte)((afg2 + invAlpha * bg[2]) >> 8);
      bg[3] = 0xff;
    }
  }
}

Can this be made faster with SIMD? (I don't know much about it, maybe the data and algorithm don't fit it?)


Can this be parallelized with any real gains?
