On Friday, 22 November 2013 at 08:44:06 UTC, Andrea Fontana wrote:
On Friday, 22 November 2013 at 03:36:38 UTC, Craig Dillabaugh wrote:
On Friday, 22 November 2013 at 02:24:56 UTC, Mikko Ronkainen wrote:
I'm trying to learn some software rasterization stuff. Here's what I'm doing:

32-bit DMD on 64-bit Windows
Framebuffer is an int[]; each int is a pixel in 0xAABBGGRR format (this seems fastest on my CPU + GPU).
Framebuffer is passed as-is to OpenGL and rendered as a textured quad.

Here's a simple rectangle drawing algorithm that also does alpha blending. I tried quite a few variations (for example without the byte casting, using ints and shifting instead), but none was as fast as this:

class Framebuffer
{
int[] data;
int width;
int height;
}

void drawRectangle(Framebuffer framebuffer, int x, int y, int width, int height, int color)
{
foreach (i; y .. y + height)
{
  int start = x + i * framebuffer.width;

  foreach(j; 0 .. width)
  {
    byte* bg = cast(byte*)&framebuffer.data[start + j];
    byte* fg = cast(byte*)&color;

    int alpha = (fg[3] & 0xff) + 1;
    int inverseAlpha = 257 - alpha;

    bg[0] = cast(byte)((alpha * (fg[0] & 0xff) + inverseAlpha * (bg[0] & 0xff)) >> 8);
    bg[1] = cast(byte)((alpha * (fg[1] & 0xff) + inverseAlpha * (bg[1] & 0xff)) >> 8);
    bg[2] = cast(byte)((alpha * (fg[2] & 0xff) + inverseAlpha * (bg[2] & 0xff)) >> 8);
    bg[3] = cast(byte)0xff;
  }
}
}

I would like to make this as fast as possible as it is done for almost every pixel every frame.

Am I doing something stupid that is slowing things down? Cache thrashing, or even branch prediction errors? :) Is this kind of algorithm + data even a candidate for SIMD usage?

Even if fg is of type byte, fg[0] would return a value greater than 0xff. It needs to be (fg[0] & 0xff) to make things work. I wonder why?

Do you want to use a ubyte instead of a byte here?
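
To illustrate the byte versus ubyte question, here is a minimal sketch. It assumes a little-endian machine (as in the original x86 setup); the helper names channelSigned and channelUnsigned and the sample color value are made up for illustration. In D, byte is signed (-128 .. 127), so a channel of 0x80 or above is sign-extended when promoted to int unless it is masked, while ubyte reads as 0 .. 255 directly.

```d
import std.stdio;

// Hypothetical helper: read channel i of a packed pixel through a signed byte view.
int channelSigned(uint color, size_t i)
{
    byte* p = cast(byte*)&color;
    return p[i]; // sign-extended on promotion to int
}

// Hypothetical helper: read channel i through an unsigned ubyte view.
int channelUnsigned(uint color, size_t i)
{
    ubyte* p = cast(ubyte*)&color;
    return p[i]; // zero-extended, always 0 .. 255
}

void main()
{
    uint color = 0x80FF9020; // made-up 0xAABBGGRR sample; green channel is 0x90

    int signedGreen = channelSigned(color, 1);
    int maskedGreen = channelSigned(color, 1) & 0xff;
    int unsignedGreen = channelUnsigned(color, 1);

    assert(signedGreen == -112);   // 0x90 interpreted as a signed byte
    assert(maskedGreen == 0x90);   // the mask recovers the intended value
    assert(unsignedGreen == 0x90); // ubyte needs no mask

    writefln("signed: %d, masked: %d, unsigned: %d",
             signedGreen, maskedGreen, unsignedGreen);
}
```

With a ubyte pointer the & 0xff masks in the blending expression should no longer be needed.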

Also, for your alpha channel:

int alpha = (fg[3] & 0xff) + 1;
int inverseAlpha = 257 - alpha;

If fg[3] = 0 then inverseAlpha = 256, which is out of the range
that can be stored in a ubyte.

Craig
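
The range concern above can be checked with a short sketch. It reuses the arithmetic from the quoted code on a single channel; the function name blendChannel is made up for illustration. Note that inverseAlpha does reach 256 when fg[3] is 0, but in the quoted code it is declared as int, so the value is representable there; only the final shifted result is narrowed to a byte.

```d
import std.stdio;

// Hypothetical helper: the quoted blend arithmetic for one 0 .. 255 channel.
int blendChannel(int fg, int bg, int a)
{
    int alpha = a + 1;              // 1 .. 256
    int inverseAlpha = 257 - alpha; // 256 .. 1, held in an int
    return (alpha * fg + inverseAlpha * bg) >> 8;
}

void main()
{
    // Check the extreme channel values for every possible alpha byte.
    foreach (a; 0 .. 256)
    {
        foreach (fg; [0, 255])
        {
            foreach (bg; [0, 255])
            {
                int r = blendChannel(fg, bg, a);
                assert(r >= 0 && r <= 255);
            }
        }
    }

    assert(blendChannel(200, 50, 0) == 50);    // fully transparent keeps bg
    assert(blendChannel(200, 50, 255) == 200); // fully opaque keeps fg

    writeln("all results stay within 0 .. 255");
}
```

The two weights always sum to 257, so the largest intermediate value is 257 * 255 = 65535, which shifts down to 255.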

If I'm right all of these lines:

byte* fg = cast(byte*)&color;
int alpha = (fg[3] & 0xff) + 1;
int inverseAlpha = 257 - alpha;

are constant, so you can move them outside both foreach loops using an enum;

you can also pre-calculate this:
alpha * (fg[0] & 0xff)

before the foreach.

Of course I mean immutable, not enum :)
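
A minimal sketch of the hoisting suggested above, combined with the ubyte suggestion from earlier in the thread. The function name drawRectangleHoisted and the names fgR, fgG, fgB are made up for illustration, and a little-endian machine is assumed. Whether this is actually faster than the original has not been measured here; the compiler may already hoist some of these values.

```d
class Framebuffer
{
    int[] data;
    int width;
    int height;
}

// Hypothetical variant of drawRectangle with loop-invariant work moved out.
void drawRectangleHoisted(Framebuffer framebuffer, int x, int y,
                          int width, int height, int color)
{
    // These depend only on color, so compute them once per call.
    auto fg = cast(ubyte*)&color;
    immutable int alpha = fg[3] + 1;
    immutable int inverseAlpha = 257 - alpha;
    immutable int fgR = alpha * fg[0];
    immutable int fgG = alpha * fg[1];
    immutable int fgB = alpha * fg[2];

    foreach (i; y .. y + height)
    {
        immutable int start = x + i * framebuffer.width;

        foreach (j; 0 .. width)
        {
            ubyte* bg = cast(ubyte*)&framebuffer.data[start + j];

            bg[0] = cast(ubyte)((fgR + inverseAlpha * bg[0]) >> 8);
            bg[1] = cast(ubyte)((fgG + inverseAlpha * bg[1]) >> 8);
            bg[2] = cast(ubyte)((fgB + inverseAlpha * bg[2]) >> 8);
            bg[3] = 0xff;
        }
    }
}

void main()
{
    auto fb = new Framebuffer;
    fb.width = 4;
    fb.height = 4;
    fb.data = new int[](16); // zero-initialized pixels

    // Draw an opaque 2x2 rectangle at (1, 1).
    drawRectangleHoisted(fb, 1, 1, 2, 2, cast(int)0xFF2010F0);

    assert(fb.data[5] == cast(int)0xFF2010F0); // inside the rectangle
    assert(fb.data[10] == cast(int)0xFF2010F0);
    assert(fb.data[0] == 0);                   // outside, untouched
}
```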
