Hello Ruiling,

On Fri, 4 May 2018 15:32:58 +0800, Ruiling Song <ruiling.s...@intel.com> wrote:
> It basically does hdr to sdr conversion with tonemapping.
>
> Signed-off-by: Ruiling Song <ruiling.s...@intel.com>
> ---
> This patch tries to add a filter to do hdr to sdr conversion with tonemapping.
> The filter does all the job of tonemapping in one pass, which is quite
> different from the vf_tonemap.c
> I choose this way because I think this would introduce less memory access.
>
> And I find that tonemaping shares lots of code with colorspace conversion.
> So I move color space related code into seprated files (both OpenCL kernel
> and host code).
>
> I am not sure whether the design seems OK?
> Is there anybody would like to give some comments on the overall design or
> implementation details?
>
>
> Thanks!
> Ruiling
As the original author of the tone mapping code that inspired vf_tonemap and (by proxy) vf_tonemap_opencl, I can provide a handful of comments.

> +float3 map_one_pixel_rgb(float3 rgb, float peak) {
> +    // de-saturate
> +    float luma = get_luma(rgb.x, rgb.y, rgb.z);
> +    float overbright = max(luma - 2.0f, 1e-6f) / max(luma, 1e-6f);
> +    rgb = mix(rgb, (float3)luma, (float3)overbright);
> +
> +    float sig = max(max(rgb.x, max(rgb.y, rgb.z)), 1e-6f);
> +    float sig_old = sig;
> +    sig = TONE_FUNC(sig, peak);
> +    rgb *= (sig/sig_old);
> +    return rgb;
> +}

I consider this desaturation algorithm outdated. It works, but I think it produces lower-quality results than this one:

    float sig = max(max(rgb.x, max(rgb.y, rgb.z)), 1e-6f);
    float luma = get_luma(rgb.x, rgb.y, rgb.z);
    float coeff = max(sig - 0.18f, 1e-6f) / max(sig, 1e-6f);
    const float desaturation_coefficient = 0.5f; // or configurable!
    coeff = pow(coeff, 10.0f / desaturation_coefficient);
    rgb = mix(rgb, (float3)luma, coeff);
    sig = mix(sig, luma, coeff);

    // do the rest of tone-mapping on `sig`
    float sig_old = sig;
    ...

Basically, I've done the following:

- Calculate the overbright coefficient on `sig` rather than `luma` alone
- Instead of using an offset of 2.0f, use an offset of 0.18f
- Take the coefficient to a high exponent (a lower exponent means more desaturation, which is why I invert the user-configurable parameter)

Since the new code needs to read `sig`, we also have to move the `sig` calculation up and then correctly adjust it afterwards as well.

----

More importantly, this algorithm is missing something that I now consider very important, and which would align well with OpenCL: source brightness detection. Just doing the tone-mapping "blind" like this works to some extent, but I think the best results also require adjusting the exposure, in order to compensate for Hollywood's tendency to ship poorly mastered, over-illuminated HDR material.
The basic premise is to calculate both the peak brightness and the average brightness on a frame-by-frame basis, and to incorporate those measured values into the algorithm, in order to re-normalize overly bright scenes to a typical SDR average of around 0.25. In addition, calculating the peak explicitly allows us to tailor the hable() function exactly to this specific frame, even if the mastering metadata is missing or useless. (Which it almost always is.)

Doing this in OpenCL would essentially require implementing a big map-reduce to keep track of, respectively, the sum and the max of each pixel's brightness. In addition, I recommend averaging the results over a few frames (I like to use around one second), with the caveat that this is best paired with at least a naive scene change detection heuristic to make sure the averaging window gets reset on a scene change.

> +static double determine_signal_peak(AVFrame *in)
> +{
> +    AVFrameSideData *sd = av_frame_get_side_data(in, AV_FRAME_DATA_CONTENT_LIGHT_LEVEL);
> +    double peak = 0;
> +
> +    if (sd) {
> +        AVContentLightMetadata *clm = (AVContentLightMetadata *)sd->data;
> +        peak = clm->MaxCLL;
> +    }
> +
> +    sd = av_frame_get_side_data(in, AV_FRAME_DATA_MASTERING_DISPLAY_METADATA);
> +    if (!peak && sd) {
> +        AVMasteringDisplayMetadata *metadata = (AVMasteringDisplayMetadata *)sd->data;
> +        if (metadata->has_luminance)
> +            peak = av_q2d(metadata->max_luminance);
> +    }
> +
> +    /* smpte2084 needs the side data above to work correctly
> +     * if missing, assume that the original transfer was arib-std-b67 */
> +    if (!peak)
> +        peak = 1200;
> +
> +    return peak;
> +}

This seems strange. For a source without peak tagging, you should probably decide based on the video frame's tagged transfer function (ST.2084 or STD-B67). If it's the former, default to 10,000 rather than 1200.
> +    switch(input_frames_ctx->sw_format) {
> +    case AV_PIX_FMT_P010:
> +        err = launch_kernel(avctx, ctx->kernel, output, input, peak);
> +        if (err < 0) goto fail;
> +        break;
> +    default:
> +        av_log(ctx, AV_LOG_ERROR, "unsupported format in tonemap_opencl.\n");
> +        err = AVERROR(ENOSYS);
> +        goto fail;
> +    }

This seems a bit needlessly restrictive? At the very least, I would expect high-bit RGB and ideally float formats to also be supported.

Finally, I'm not entirely sure how you ingest HLG, but in the case of HLG it's important to run the HLG OOTF (to take the content from scene-referred to display-referred light) *before* tone-mapping, rather than after it. I would assume this is probably handled by other FFmpeg components, but it might be worth double checking just to be sure.

----

As a last note, you can find my GLSL(+Vulkan) implementations of the algorithm changes described above, as well as all of the related color-management code and the various decision logic for what values to infer/default, here:

https://github.com/haasn/libplacebo/blob/master/src/shaders/colorspace.c

The documentation for the tunable parameters is here:

https://github.com/haasn/libplacebo/blob/master/src/include/libplacebo/shaders/colorspace.h

Of specific interest are the functions `pl_shader_tone_map` (the core tone mapping logic), `hdr_update_peak` (peak/avg detection using compute shaders + SSBOs) and `pl_shader_color_map` (the calling code that also does color conversions, OOTF application, etc.)

The way I implemented the logic there is fully generalized and allows going from any combination of source (space, peak, average) to any destination (space, peak, average), doing only the operations necessary while making sure to compensate for brightness differences. This also works well when viewing HDR material on HDR devices with a lower dynamic range than the original material, even when those devices are calibrated to SDR curves (search for `hdr_simulation` in that file).
I hope that may provide some assistance in (ultimately) making the ffmpeg tone mapping filter as good as it can be.

Thanks for reading,
Niklas Haas
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
http://ffmpeg.org/mailman/listinfo/ffmpeg-devel