Hi, thanks for the feedback. I agree with most of what you said. Just a few comments:

> I think the main thing is that you really don't want branches inside the
> inner loop. You can handle the cases where a component is unused
> by building a mask before entering the loop and just or'ing with that
> mask inside the loop.

Good point. I'm not sure whether that eliminates the need for both the 'if(DestFormat.bits[x])' and the 'if(SrcFormat.bits[x])' checks, but I can at least merge them into one 'if'.

> As for the shifts,
> in the large majority of cases you will only need two shifts. The only
> time you need more than that is when the destination component is more
> than two times the size of the source component, so I don't think you
> want to bother with that in the common function.

I don't think the for loop adds much overhead, since even the cases that need only two shifts require at least one 'if(SrcFormat.bits < DestFormat.bits)' check. The code inside the for loop just adds one assignment and one additional 'if'. Perhaps later on this could be split into a separate function, but I don't think it's worth adding another hundred lines of code for this small optimization :/

Apart from that, I'll change my code according to your suggestions.

Best regards,
Tony
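
P.S. To make the two points above concrete, here is a rough sketch of how the mask and the shift loop could fit together. It is only an illustration under assumptions: the PixelFormat struct, the function names, and the choice to fill components missing from the source with all ones (e.g. opaque alpha) are hypothetical placeholders, not the actual code, and component widths are assumed to be below 32 bits.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical format description: width and bit offset of each component. */
typedef struct {
    unsigned bits[4];
    unsigned shift[4];
} PixelFormat;

/* Convert one component value from sbits to dbits wide (both non-zero). */
static uint32_t convert_component(uint32_t v, unsigned sbits, unsigned dbits)
{
    uint32_t out = 0;
    int pos;

    if (sbits >= dbits)
        return v >> (sbits - dbits);        /* narrowing: drop the low bits */

    /* Widening: replicate the source bits until the destination is full.
     * Takes more than two shifts only when dbits > 2 * sbits. */
    for (pos = (int)dbits - (int)sbits; pos > 0; pos -= (int)sbits)
        out |= v << pos;
    out |= v >> (unsigned)(-pos);           /* last, possibly partial, copy */
    return out;
}

static void convert_pixels(const uint32_t *src, uint32_t *dst, size_t n,
                           const PixelFormat *sf, const PixelFormat *df)
{
    uint32_t fill = 0;      /* bits for components the source does not have */
    unsigned used[4];       /* components present in both formats */
    unsigned nused = 0;
    unsigned c, k;
    size_t i;

    /* All per-format decisions happen once, before the pixel loop. */
    for (c = 0; c < 4; c++) {
        if (df->bits[c] == 0)
            continue;                       /* destination drops it */
        if (sf->bits[c] == 0)
            fill |= ((1u << df->bits[c]) - 1u) << df->shift[c];
        else
            used[nused++] = c;
    }

    for (i = 0; i < n; i++) {
        uint32_t out = fill;                /* just OR the mask in */
        for (k = 0; k < nused; k++) {
            uint32_t v;
            c = used[k];
            v = (src[i] >> sf->shift[c]) & ((1u << sf->bits[c]) - 1u);
            out |= convert_component(v, sf->bits[c], df->bits[c])
                   << df->shift[c];
        }
        dst[i] = out;
    }
}
```

With this layout the checks on the bits fields never run per pixel, and the only branch left in the per-component path is the narrowing/widening test.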