Hello, I have a question regerding the internal decoding/encoding API. There seems to be two functions that may be alternatively called, on the basis of whether they are defined or not, both defined in the AVCodec struct: 1. decode() 2. receive_frame()
From the comments above the definition of receive_frame(): "Decode API with decoupled packet/frame dataflow. This function is called to get one output frame. It should call ff_decode_get_packet() to obtain input data." From what I can see, if receive_frame() is not defined, a separate function is called, decode_simple_receive_frame(), which in turn calls decode_receive_frame_internal(). * receive_frame() therefore allows more "fine grained" control on how the frame generation and return values are handled, from what I can see. * But for most purposes defining decode() is enough. * receive_frame() takes priority over decode(). Is this correct? --- I have gone through theory regarding arithmetic coding and have been able to build my own arithmetic encoder. Range coding seems to differ from an arithmetic coder in only two cases, which is the renormalisation and ending the encoding/decoding process. Here's what I have managed to gather: * The general process of renormalisation in an arithcoder is [2]: (Here, `n` is initially 0) While range is too small: (range < Quarter of original) | If range is in first half of original range ("Upper" condition) | Output a 0, and `n` number of 1's | Else If range is in second half of original range: ("Lower" condidtion) | Output a 1, and `n` number of 0's | Else If range straddles the middle: ("Straddle" condition) | Increment `n` by 1 | | Blow up the range by multiplying `range` and `low` by 2 | (or shifting left by 1) The range-low notation (instead of the supposedly conventional high-low notation) is used in most of the rangecoder algorithms/implementations I have seen. * Whereas, in the range coder implementation of [1], the following happens: (here, `buffer` is 1 byte in size, and `low`, `range` are 4 bytes in size) (All arithmetic is in unsigned integers) Upper bound of the range is 1 << 31, and lower bound is (upper bound >> 8) (0x80000000 and 0x00800000 respectively) While range is too small: (range < lower bound) | If `low` is less than 0x7F800000 (0xFF << 23) | Output `buffer`, and `n` number of 0xFF's | Put MSByte of `low` into `buffer` | Else if MSBit of `low` is 1 (The "Carry" bit) | Output (`buffer` + 1) and `n` number of 0x0's | Put MSByte of `low` into `buffer` | Else | Increment `n` by 1 | | Left Shift range by 8 | Left Shift low by 8 | Clear the carry bit of `low` So we can make out the limits to be as: "Lower" half: low < 0x7F800000 "Upper" half: low >= 0x80000000 (Counting in the previous condition) "Straddle" condition: `low` is greater than or eq. to 0x7F800000, and the number's carry bit is 0, i.e., 0x7F800000 <= `low` < 0x80000000. There isn't a definitive "point" for the straddling but a range, from what I can see. Note that 0x7F800000 + 0x00800000 = 0x80000000. The renormalisation used in the RAC FLIF uses [3] is much more simpler than the one described above. So more or less easily implementable. However, the upper and lower limits used in ffmpeg's rangecoder implementation cannot be changed and would need some rewriting. Consequently, we will have to rewrite/change the code in any codec that uses the functions. I will also add in an explanation regarding the range coder in the FLIF specification, which is not present, using whatever I have understood. The rangecoder implementation used in ffmpeg uses a very small range (0xFF00). I do not know the exact rationale behind it since I think it will require more frequent renormalisation, but it may have someting to do with the codecs it was implemented for, or the probability model. For now, I will implement the RAC in a secondary file without altering rangecoder.c/rangecoder.h. However, in the FLIF sourcecode, src/maniac/chance.cpp [4] uses an almost exact copy of the function in rangecoder.c for building the probability model (libavcodec/rangcoder.c, ff_build_rac_states). Therefore we can reuse that. It might also be possible that the original author copy pasted it there with some modifications. Dealing with the second header shouldn't be much of a problem now. An interesting thing is that the uniform symbol coder (see src/maniac/symbol.hpp) [5] is implemented recursively, which might mean that the recursion was written in mind that a stack overflow will not occur. --- As for the pixel transformations, it will mostly consist of translating the existing reference implementation into the ffmpeg API/C. About the same goes for the MANIAC encoding. --- In conclusion, I believe writing the decoder will involve a lot of "copying and pasting" (or closely following) the FLIF reference implementation, which I really don't want to do. I want to do it with the full knowledge of how the codec works. Although reading through the reference implementation will also inevitably include review of the sourcecode and hence chanves. Regards, Anamitra Ghorui --- [1]: http://www.compressconsult.com/rangecoder/ , https://archive.fo/5F1y [2]: https://web.stanford.edu/class/ee398a/handouts/papers/WittenACM87ArithmCoding.pdf [3]: https://github.com/FLIF-hub/FLIF/blob/master/src/maniac/rac.hpp [4]: https://github.com/FLIF-hub/FLIF/blob/master/src/maniac/chance.cpp (function build_table) [5]: https://github.com/FLIF-hub/FLIF/blob/master/src/maniac/symbol.hpp (not listed in the order of occurence) _______________________________________________ ffmpeg-devel mailing list ffmpeg-devel@ffmpeg.org https://ffmpeg.org/mailman/listinfo/ffmpeg-devel To unsubscribe, visit link above, or email ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".