David Kastrup <d...@gnu.org>:

> Marko Rauhamaa <ma...@pacujo.net> writes:
>> You probably cannot produce valid UTF-8 out of invalid UTF-8 snippets
>> with split(1). However split(1) does form filenames out of its
>> arguments by concatenation:
>>
>>     split --additional-suffix=suffix file prefix
>>
>> produces these kinds of filenames:
>>
>>     <prefix><ordinal><suffix>
>
> I don't really get your point here.  Why would you start with invalid
> UTF-8 sequences in the filenames?

There's nothing preventing such filenames from appearing on a Linux
system. They might come from a zip file with Latin-1-encoded names, for
example.
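
To illustrate, here is a minimal Python 3 sketch. The name "café.txt"
and the helper is_valid_utf8 are made up for the example. It shows that
the bytes of a Latin-1-encoded name are not valid UTF-8, while the
UTF-8 encoding of the same name is:

```python
def is_valid_utf8(name: bytes) -> bool:
    """Return True if the raw filename bytes decode as strict UTF-8."""
    try:
        name.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Hypothetical example name, encoded two ways.
latin1_name = "caf\u00e9.txt".encode("latin-1")  # b'caf\xe9.txt'
utf8_name = "caf\u00e9.txt".encode("utf-8")      # b'caf\xc3\xa9.txt'

print(is_valid_utf8(latin1_name))  # False
print(is_valid_utf8(utf8_name))    # True
```

The same check can be applied to the bytes names that os.listdir(b".")
returns.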

I have files older than UTF-8 on my Linux system. I have files encoded
in Latin-3, for example.

Worst of all, they might be part of an attack on your system. For
example, files whose names contain invalid UTF-8 could evade file
listing altogether, make your program crash in unexpected ways, or
prove impossible to remove.
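
As one possible way such a crash could arise, here is a small Python 3
sketch. It assumes a UTF-8 filesystem encoding, and the byte string is
a made-up example name. The name is decoded with the surrogateescape
error handler, then encoded strictly, as would happen when it is
written to a UTF-8 output stream:

```python
raw_name = b"caf\xe9.txt"  # hypothetical Latin-1 name; not valid UTF-8

# Decode the way a str-based file API would under the stated assumption:
# undecodable bytes become lone surrogate code points.
text_name = raw_name.decode("utf-8", "surrogateescape")

# Encoding the name strictly fails, because lone surrogates are not
# allowed in UTF-8.
try:
    text_name.encode("utf-8")
    crashed = False
except UnicodeEncodeError:
    crashed = True

print(crashed)  # True

# Encoding with the same error handler recovers the original bytes.
roundtrip = text_name.encode("utf-8", "surrogateescape")
print(roundtrip == raw_name)  # True
```

A program that does not expect the UnicodeEncodeError would terminate
at the strict encode.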


Marko
