Re: [FFmpeg-devel] [PATCH v2] Add SRV3 decoder/demuxer

fishhh Thu, 26 Dec 2024 15:16:30 -0800

instructions on how to obtain a srv3 sample file would be helpful too.
this is so we can actually test your patch.


downloading the file from YouTube is probably the easiest way,
you can achieve this by the means of yt-dlp like this:

yt-dlp --skip-download --write-subs --sub-langs en --sub-format srv3<URL>

--skip-download makes it so the video itself is not downloaded

if you need a video with complex subtitles for testing, this
one exists: https://www.youtube.com/watch?v=gxWAM5mgH_o

also I just realized adding `ctx->q.keep_duplicates = 1;`
somewhere in srv3_read_header significantly improves the quality
of some conversions because it keeps the queue from removing
events with the same content but different styles
this also prompted me to make EDGE_GLOW emit a smaller glow
(`{\\bord1\\blur1}`) which looks way better
you can test with those changes if you care about the result's quality

usually we leave the version.h modifications out of patches posted tothemailining list, and only make those changes at git push time. thismakes iteasier to apply your patch in the future when those files may havechanged.
also for new codec we bump avcodec/version.h.


right, makes perfect sense, the documentation at
https://ffmpeg.org/developer.html#New-codecs-or-formats-checklist
should probably be updated to not tell you to change it yourself

this patch sets adds a demuxer and a decoder. imho, it should be splitinto
two patches. the decoder patch comes first, then the demuxer.
...
allcodecs.c sections are sorted by name.


okay, I'll do those things for v3.

+enum SRV3PenAttrs {
+    SRV3_PEN_ATTR_ITALIC = 1,
+    SRV3_PEN_ATTR_BOLD = 2,
+};


you use 0 elsewhere in the code.
should there be a SRV3_PEN_ATTR_NONE too?


the difference between PenAttrs and the other enums is that
PenAttrs is a bitfield because the pen can be both italic and bold
of course.
now that I've looked into this it does seem ffmpeg uses macros
for defining flags so maybe I should just make these
#define SRV3_PEN_ATTR_ITALIC (1 << 0)
#define SRV3_PEN_ATTR_BOLD (1 << 1)

+typedef struct SRV3Pen {
+    int id;
+
+    int font_size, font_style;
+    int attrs;
+
+    int edge_type, edge_color;
+
+    int ruby_part;

these should be enum SRV3PenAttrs, enum SRV3EdgeType, enumSRV3RubyPart,

instead of int. this helps static analysis. HOWEVER, as part of reading
in the values you assume they are int and this approach has merrit too.


yeah I didn't do that specifically so I could just pass these as
int* to the parsing functions, it would be nice if we could just do
enum SRV3EdgeType : int {}
but that's a C23 thing so not really acceptable.

= SRV3_EDGE_NONE

+#define FREE_LIST(type, list, until)                     \
+do {                                                                \

+ for (void *current = list; current && current != until; current =next) { \

+        next = ((type*)current)->next;                              \
+        av_free(current);                                           \
+    }                                                               \
+} while(0)


syyle: the \'s should all line up


will fix

+
+static int srv3_probe(const AVProbeData *p)
+{
+    if (strstr(p->buf, "<timedtext format=\"3\">"))
+        return AVPROBE_SCORE_MAX;
+


use strstr(p->buf, "<timedtext") && strstr(p->buf, "format=\"3\"");

this handle the situaton wehere an extra tag is inserted between the
two, e.g.
     <timedtext x="y" format="3" ....>


okay

+
+static int srv3_parse_numeric_attr(SRV3Context *ctx, const char*parent, xmlAttrPtr attr, int *out, int min, int max)
+{
+ return srv3_parse_numeric_value(ctx, parent, attr->name,attr->children->content, 10, out, min, max) == 0;
+}
+
+static void srv3_parse_color_attr(SRV3Context *ctx, const char*parent, xmlAttrPtr attr, int *out)
+{
+ srv3_parse_numeric_value(ctx, parent, attr->name,attr->children->content + (*attr->children->content == '#'), 16, out,0, 0xFFFFFF);
+}
+
for both functions, should probably return thesrv3_parse_numeric_value() error value


well actually the only reason srv3_parse_numeric_value returns an
error value is because attributes like "p" and "wp" have to
process the number if it is parsed successfully to find the
structure with the right id. I can make parse_color_attr return an error
but it'd be ignored in all cases.

srv3_parse_numeric_value() returns an error, but you are discarding it
everywhere here.


the error value from srv3_parse_numeric_value is discarded to allow
parsing slightly incorrect subtitles since a single event's invalid
opacity really shouldn't abort parsing and srv3_parse_numeric_value
already logs an error so its return value is only necessary in the
special cases I mentioned above.

i would probably convert these attributes ("sz", "fs", "et", etc.), thepen struct offsets, andmin/max values into an array, and then iterate through that arraysearching for the attribute name.
something like AVOption.


nice idea I'll do that

+static int srv3_clean_segment_text(char *text) {
+    char *out = text;
+    const char *start = text;
+
+    while (1) {
+        const char *end = strstr(start, ZERO_WIDTH_SPACE);

+ size_t cnt = end ? (size_t)(end - start) :(size_t)strlen(start);


strlen returns size_t, should be no need to cast it.

right thanks

ditto for end-start;

I think pointer subtraction returns ptrdiff_t which is supposed
to be signed but it appears compilers don't complain about that
without -Wconversion (gcc doesn't even with -Wconversion which
I find amusing) since I don't see any mention of -Wconversion
in ffmpeg I can remove that cast too.

+                } else if (!strcmp(attr->name, "ws")) {
+                    // TODO: Handle window styles


use avpriv_report_missing_feature()?


I can do that but it'll trigger on many subtitle files as it's
pretty commonly used, just doesn't usually change all that much
which is why it's unimplemented.

+                if(textlen == 0)
+                    continue;


if (textlen...

I assume you mean if(!textlen) and aren't suggesting to put
everything into an if block?

+
+static const AVOption options[] = {
+    { NULL }
+};
+
+static const AVClass srv3_demuxer_class = {
+    .class_name  = "SRV3 demuxer",
+    .option      = options,
+    .version     = LIBAVUTIL_VERSION_INT,
+};


are you planning to use options?
if not, options and srv3_demuxer_class can be removed.


when I wrote this code I did have in mind a hypothetical
option that would enable using non-standard libass functionality
like BorderStyle = 3 so that's a possibility but I probably won't
implement that myself anytime soon.
can a class be added retroactively? if yes then yeah this
should probably be removed.
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".

Re: [FFmpeg-devel] [PATCH v2] Add SRV3 decoder/demuxer

Reply via email to