On Tue, 29 Jul 2025 at 22:11, James Almer <jamr...@gmail.com> wrote:
>
> On 7/29/2025 5:02 PM, Kieran Kunhya via ffmpeg-devel wrote:
> > Hello,
> >
> > It seems there is strong evidence that AI wrote TLS code as part of the
> > WHIP patch. It goes without saying why this is bad. Further discussion
> > here:
> > https://code.ffmpeg.org/FFmpeg/FFmpeg/pulls/20053
> >
> > This patch was pushed without ML review.
> >
> > I think this code should be removed before the FFmpeg release. I
> > include TC in this email for that reason.
>
> The UTF8 dashes are not so much an indication of LLM output but one that
> it was written with an unusual locale, I'd say.

I disagree. I wouldn't call out AI if there weren't a good
indication that this is where those hyphens came from. I tested many
LLMs to evaluate their usefulness, and this is the kind of thing that
they love to insert, even in code. I would expect any developer (even
one natively using a different locale) to use a plain - in the .c
file; after all, it is a common token in the code too.

Additionally, I now see there is also a ’ (0x2019) a few lines below in
`to a av_malloc’d PEM string.`, which is also something that LLMs love
to insert. I can even remove those comments right now, ask one of
the biggest LLMs to comment the code, and reproduce the same 0x2019
being inserted.

Lastly, a strong indication of LLM use is the dummy comments on every
operation. LLMs love to explain themselves. Comments in code are very
useful tools, but you don't have to comment every function call and
every label. IMHO it adds more noise than information, and SNR is
important. It's harmless, but look at pkey_to_pem_string() and tell me
it really is organic to add `// Copy data & NUL-terminate` to a memcpy
call. Again, I can reproduce this by querying an LLM to do so.

I'm not saying we should revert this code, but a good review would be
in order to ensure we are not shipping something bad in there.

Note that my intention was not to start some big discussion, just to
clean the file of unnecessary look-alike UTF-8 characters. I'm
not opposed to AI/LLM use, but their output should be heavily
sanitized as they are not reliable on their own.
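
As a rough sketch of that kind of cleanup check (not taken from the
patch or from any FFmpeg tooling), a few lines of Python can list every
non-ASCII character in a piece of source text together with its code
point. The function name find_non_ascii and the sample string are made
up for illustration:

```python
import unicodedata


def find_non_ascii(text):
    """Return (line, column, char, code point, name) for each non-ASCII character."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col_no, ch in enumerate(line, start=1):
            if ord(ch) > 0x7F:
                hits.append((
                    line_no,
                    col_no,
                    ch,
                    f"U+{ord(ch):04X}",
                    unicodedata.name(ch, "UNKNOWN"),
                ))
    return hits


# Hypothetical sample: a comment containing U+2019 instead of an ASCII apostrophe.
sample = "/* to a av_malloc\u2019d PEM string. */\nmemcpy(dst, src, len);\n"
for hit in find_non_ascii(sample):
    print(hit)
```

To check a real file, one would read it with open(path, encoding="utf-8")
and pass the contents to the function. Legitimate non-ASCII text (names in
copyright headers, for example) would still need a manual look.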

- Kacper
_______________________________________________
ffmpeg-devel mailing list
ffmpeg-devel@ffmpeg.org
https://ffmpeg.org/mailman/listinfo/ffmpeg-devel

To unsubscribe, visit link above, or email
ffmpeg-devel-requ...@ffmpeg.org with subject "unsubscribe".