On 08/16/2011 02:10 PM, Christian König wrote:
> On Tuesday, 16.08.2011 at 01:15 -0400, Younes Manton wrote:
>> Anyway, here are some specific comments:
>>
>> +   for (; num_macroblocks > 0; --num_macroblocks) {
>> +      mb->base.codec = PIPE_VIDEO_CODEC_MPEG12;
>> +      mb->macroblock_address = xvmc_mb->x + context->width_in_macroblocks * xvmc_mb->y;
>> +      mb->macroblock_type = xvmc_mb->macroblock_type;
>>
>> Why macroblock_address? A decoder that prefers x/y will have to do a
>> div/mod to get it back, so it's better to just pass x/y along and let
>> each decoder do what it wants with it.
>
> I also need x/y for shaders, but macroblock_address is defined in the
> spec, so a (new/better) bitstream decoder could just pass the value
> found in the bitstream into the field instead of needing to calculate
> the x/y values.
>
> If you say that the Nvidia HW also needs this as x/y, I'm quite happy
> with passing x/y directly in the structure.

AFAICT, it needs x/y.

>> +   void (*begin_frame)(struct pipe_video_decoder *decoder);
>> +   void (*end_frame)(struct pipe_video_decoder *decoder);
>>
>> The Nvidia HW decoder doesn't need these, but I think
>> begin_frame/end_frame are not as useful as they could be, because you
>> don't always know which video buffers they will apply to. When you
>> call this in RecursiveEndFrame() it can result in 0, 1, 2, or 3
>> end_frame() calls, but which video buffers does it apply to? The 0
>> and 3 cases are obvious, but 1 or 2 is ambiguous if the current video
>> buffer has 1 or 2 reference frames.
>
> Take a look at "SetDecoderStatus"; it should restore the correct
> decoder state before calling begin/decode/end, so the video buffer,
> reference frames, and picture info should be available when the call
> into the driver happens.
> I thought about making the decode buffer, target, and reference
> frames parameters to begin/decode/end instead of state-like settings,
> to make it obvious that you need to set up a correct environment
> before calling any of these functions. But then it would also make
> sense to pass the picture info and quant matrix as parameters, and in
> the case of the quant matrix this doesn't make much sense, because it
> isn't needed for MC/IDCT-only stages.
>
> I think the question is where to draw the line between state-like
> information and information passed as parameters.
>
>> Also, based on the above, I think there is a bug in
>> vl_mpeg12_end_frame(struct pipe_video_decoder *decoder). It will
>> unmap and destroy the tex_transfers attached to the current buf, but
>> what about the reference frames? It has the same problem: it doesn't
>> know about the reference frames when end_frame is called, so it
>> operates on the same buf multiple times in RecursiveEndFrame().
>
> While developing the patch I actually had the problem that end_frame
> was called multiple times for the same frame or called for a wrong
> buffer. But I think I have ruled out all the corner cases by restoring
> the state in "SetDecoderStatus"; at least mplayer doesn't crash any
> more.
>
>> +   vldecoder->cur_buffer %= vldecoder->num_buffers;
>> +
>> +   vldecoder->decoder->set_decode_buffer(vldecoder->decoder, vldecoder->buffers[vldecoder->cur_buffer]);
>> +   vldecoder->decoder->set_decode_target(vldecoder->decoder, vlsurf->video_buffer);
>> +
>> +   return vlVdpDecoderRenderMpeg12(vldecoder->decoder, (VdpPictureInfoMPEG1Or2 *)picture_info,
>> +                                   bitstream_buffer_count, bitstream_buffers);
>>
>> Not sure I understand why this is necessary. Why do you need to use a
>> different decode buffer for each frame? In the XvMC case you want to
>> keep the data for each target video buffer separate, because the XvMC
>> API allows you to submit macroblocks to any surface at any time.
>> Why is this necessary for VDPAU if it requires the user to submit the
>> entire frame at once?
>
> The short answer is: I don't need one for every frame. What I need is
> plain old round-robin (triple) buffering.
>
> The kernel makes pretty sure that a buffer isn't accessed from user
> mode while the hardware is accessing it. So what happens when you try
> to map a buffer again shortly after you passed it to the hardware is
> that the kernel blocks your user-mode process. In the end, instead of
> a constant flow of commands with everybody (CPU/GPU) busy all the
> time, you get peaks of work on both of them.
>
> I added a capability to specify how many buffers should be used round
> robin, but the same as with begin/end/flush applies here: if you don't
> need it, don't use it. OpenGL (or more specifically the pipe_context
> interface) offers at least three different ways of handling this:
>
> 1. Let the process block until the buffer is unused; that usually
> happens when you don't otherwise take any care.
>
> 2. Specify the DONT_BLOCK flag to the map function; if the buffer
> isn't ready, it isn't mapped and the map function returns a NULL
> pointer. With this the state tracker can react accordingly (allocate a
> new buffer, say "I'm busy" to the application, etc.).
>
> 3. Call the flush function and use the returned fence to figure out if
> all scheduled operations are already done and the buffer can be
> reused.
>
> I don't think we need such a complex interface for video decoding, but
> all the current interfaces seem to have one way or another to ask the
> driver if an operation is completed and a buffer/surface can be
> reused, so at least a combination of flush/fence seems necessary here.
>
>> Those are the issues that I noticed at first. Martin may have some
>> comments specific to his work with VPE. What do you think?
> I also started to reimplement the bitstream parser this morning.
> Maarten also stated that he is interested in doing this, but I don't
> know how far he got. It doesn't make much sense if we work on the same
> thing at the same time, so what are you currently doing?

I have a crappy one now, but it's only compile tested. If you want to
verify that it's correct, please!! :)

The file wasn't based on any (L)GPL implementation, but on my reading
of the spec and the VLC tables from mwk's stubby parser. It's available
at http://repo.or.cz/w/mesa/nouveau-pmpeg.git/ as
src/gallium/drivers/nouveau/nouveau_pmpeg_bitstream.[ch]
The header contains the tables. The MPEG specification is dense
reading, so if you can verify their correctness, please! :)

VLC table B15 is still missing, and I'm wondering why 1 is subtracted
from f_code in vdpau/decode.c; it doesn't really make sense to do it
there.

At the moment I don't care about speed. I want it correct first, then I
can optimize it.

~Maarten

_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev