On 12/21/2011 08:25 PM, Maarten Lankhorst wrote:
> Hey Christian,
>
> On 12/21/2011 07:25 PM, Christian König wrote:
>> Hi guys,
>>
>> On 21.12.2011 17:27, Maarten Lankhorst wrote:
>>> Also a remark about your patch, wish you had inlined it..
>> Sorry, not at home right now and I can't get this Thunderbird here to inline
>> the patches correctly.
>>
>>> diff --git a/src/gallium/auxiliary/vl/vl_vlc.h
>>> b/src/gallium/auxiliary/vl/vl_vlc.h
>>> index dc4faed..f961b8b 100644
>>> --- a/src/gallium/auxiliary/vl/vl_vlc.h
>>> +++ b/src/gallium/auxiliary/vl/vl_vlc.h
>>> @@ -115,22 +164,24 @@ vl_vlc_init(struct vl_vlc *vlc, const uint8_t *data,
>>> unsigned len)
>>> static INLINE unsigned
>>> vl_vlc_bits_left(struct vl_vlc *vlc)
>>> {
>>> + unsigned i;
>>> signed bytes_left = ((uint8_t*)vlc->end)-((uint8_t*)vlc->data);
>>> + for (i = 0; i< vlc->num_inputs; ++i)
>>> + bytes_left += vlc->sizes[i];
>>> Calculating this more than once is wasteful, shouldn't this be done once in
>>> init?
>> Good idea, but doesn't make so much of a difference, since vl_vlc_bits_left
>> is only called once per macroblock.
>>
>>> return bytes_left * 8 + vlc->valid_bits;
>>> }
>>>
>>> ....
>>>
>>> @@ -86,28 +123,40 @@ vl_vlc_fillbits(struct vl_vlc *vlc)
>>> vlc->buffer |= (uint64_t)value<< (32 - vlc->valid_bits);
>>> ++vlc->data;
>>> vlc->valid_bits += 32;
>>> +
>>> + /* if this input is depleted, go to next one */
>>> + if (vlc->data>= vlc->end&& vlc->num_inputs) {
>>> + unsigned bytes_overdue =
>>> ((uint8_t*)vlc->data)-((uint8_t*)vlc->end);
>>> +
>>> + /* adjust valid bits */
>>> + vlc->valid_bits -= bytes_overdue * 8;
>>> +
>>> + /* clear not valid bits */
>>> + vlc->buffer&= ~((1LL<< (64 - vlc->valid_bits)) - 1);
>>> +
>>> + /* go on to next input */
>>> + vl_vlc_next_input(vlc);
>>> +
>>> + /* and retry to fill the buffer */
>>> + vl_vlc_fillbits(vlc);
>>> Is this recursion necessary? Can't you simply change the if to a while?
>>> Also if you are evil and put in a zero-sized buffer, this code will not
>>> work,
>>> so the if checks needs to be reworked. Same problem with vl_vlc_next_input
>>> and
>>> unaligned pointers, you can cause it to crash with alignment 4n+1 and len<=
>>> 2.
>> Yeah, all the corner cases like len < 4, badly aligned start and end of
>> buffer etc where not really covered by the last version of the patch.
>>
>> Take a look at the attached one, does it now cover all cases?
>>
>> On 21.12.2011 17:24, Younes Manton wrote:
>>> 2011/12/21 Maarten Lankhorst<m.b.lankho...@gmail.com>:
>>>> Hey Christian,
>>>>
>>>> On 12/21/2011 04:41 PM, Christian König wrote:
>>>>> Those functions are called a couple of million times a second, so even if
>>>>> the assertion is only tested in debug builds it has quite an effect on
>>>>> performance, sometimes even masquerading real performance problems. So my
>>>>> practice was to only uncomment them when I work on that part of the code.
>>>>> I don't want to change the Assert semantics in any way, just a simple
>>>>> define enabling certain assertions only under certain conditions should
>>>>> be sufficient.
>>>> I think it makes sense to have the debug builds enabling those asserts by
>>>> default.
>>>>
>>>> You could always do this in vl_mpeg12_bitstream.c when testing:
>>>> #include "u_debug.h"
>>>> #undef assert
>>>> #include "vl_vlc.h"
>>>> #define assert debug_assert
>>>>
>>>> On a related note, why is the vl code using assert.h directly instead of
>>>> u_debug.h ?
>>>>
>>>> ~Maarten
>>> u_debug.h didn't always have assert wrappers.
>> I wasn't even aware that gallium has its own assert implementation in
>> u_debug.h, cause a whole bunch of state trackers and drivers are using
>> assert.h instead.
>>
>> Anyway, I have no idea what I tried the last time to make it perform so
>> awfuly that I need to comment it out. So I think just leaving the assertions
>> enabled for now should be ok. As Maarten pointed out disabling it is just a
>> two liner.
> It would be nice if you inlined patches for easier reviewing. :)
>
> + while (vlc->valid_bits < 32) {
> + /* if this input is depleted */
> + while (vlc->data >= vlc->end) {
> +
> + if (vlc->num_inputs)
> + /* go on to next input */
> + vl_vlc_next_input(vlc);
> + else
> + /* or give up since we don't have anymore inputs */
> + return;
> + }
>
> I'm spotting an overflow that could be triggered with 64 single-byte
> unaligned buffers, maybe this is better:
>
> + while (vlc->valid_bits < 32) {
> + uint32_t value;
> + if (vlc->data >= vlc->end) {
> + if (vlc->num_inputs) {
> + vl_vlc_next_input(vlc);
> + continue;
> + } else
> + return;
> + }
> + value = *(const uint32_t*)vlc->data;
> + vlc->data += 4;
> #ifndef PIPE_ARCH_BIG_ENDIAN
> value = util_bswap32(value);
> #endif
> + vlc->buffer |= (uint64_t)value << (32 - vlc->valid_bits);
> + vlc->valid_bits += 32;
> + if (vlc->data > vlc->end) {
> ....
>
>
> With all the pointer math, maybe change the type for 'end' and 'data' to
> uint8_t? Then you would only need that single cast in fillbits (which
> I did above) and you can kill all the casts everywhere.
>
Another thought, this would prevent the need to read past end of file which
could show up erroneously in valgrind, with something like this: if (end-data
>= 4) { // Nom nom 4 bytes } else { // Read at most 3 bytes }
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev