graphicsub2text: Add new graphicsub2text filter (OCR)

Daniel Cantarín Sat, 11 Dec 2021 07:18:03 -0800

Hi there softworkz.

Having worked before with OCR filter output, I suggest a modification for your new filter. It's not something that should delay the patch, just a nice addendum. It could be done in another patch, or I could even do it myself in the future. But I leave the comment here anyway, for you to consider.

If you take a look at vf_ocr, you'll see that it sets the "lavfi.ocr.confidence" metadata field. Well... downstream filters can check that field so that they only consider results above a certain confidence threshold, discarding the rest. This is very useful when doing OCR with non-ASCII chars, like I do with the Spanish language.


So I propose an option like this:

{ "confidence", "Sets the confidence threshold for valid OCR. Default 80." , OFFSET(confidence), AV_OPT_TYPE_INT, {.i64=80}, 0, 100, FLAGS },

Then you do an average of all confidences detected by tesseract after OCR but before converting to text subtitle frame, and compare that option value to the average result.

Something like this:

  int average = sum_of_all_confidences / number_of_confidence_items;
  if (average >= s->confidence) {
    do_your_thing();
  } else {
    av_log(ctx, AV_LOG_DEBUG, "Confidence average %d under threshold. Text detected: '%s'\n", average, text);
  }
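To make the averaging step a bit more concrete, here is a minimal sketch. It assumes the confidences arrive as an int array terminated by -1, which is the layout I understand TessBaseAPIAllWordConfidences() to return. The helper names average_confidence() and passes_threshold() are hypothetical, not part of the patch or of tesseract.

```c
#include <stddef.h>

/* Hypothetical helper: averages a -1-terminated array of word confidences
 * (assumed layout of TessBaseAPIAllWordConfidences() output).
 * Returns -1 if the array is NULL or empty, so we never divide by zero. */
static int average_confidence(const int *confs)
{
    long sum = 0;
    int count = 0;

    if (!confs)
        return -1;
    for (; confs[count] != -1; count++)
        sum += confs[count];
    return count ? (int)(sum / count) : -1;
}

/* Hypothetical gate: returns 1 when the OCR result should be kept,
 * 0 when it has no confidences or its average is under the threshold. */
static int passes_threshold(const int *confs, int threshold)
{
    int average = average_confidence(confs);

    return average >= 0 && average >= threshold;
}
```

The filter would then call something like passes_threshold(confs, s->confidence) before building the text subtitle frame. Treating "no words detected" as a failure is a design choice of this sketch; the filter may prefer to handle that case differently.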

Also, I would like to do some tests with Spanish OCR, as I had to explicitly allowlist our non-ASCII chars when using the OCR filter, and I don't know how yours will behave in that situation. Maybe having the chars allowlist option here too is a good idea. But, again: none of this should delay the patch, as your work is much more important than this kind of nice-to-have functionality, which could easily be implemented later by anyone.


Thanks,
Daniel.

Re: [FFmpeg-devel] [PATCH v23 19/21] avfilter/graphicsub2text: Add new graphicsub2text filter (OCR)
