David Kastrup <d...@gnu.org>:

> Marko Rauhamaa <ma...@pacujo.net> writes:
>
>> You probably cannot produce valid UTF-8 out of invalid UTF-8 snippets
>> with split(1). However split(1) does form filenames out of its
>> arguments by concatenation:
>>
>>    split --additional-suffix=suffix file prefix
>>
>> produces these kinds of filenames:
>>
>>    <prefix><ordinal><suffix>
>
> I don't really get your point here. Why would you start with invalid
> UTF-8 sequences in the filenames?

There's nothing preventing such filenames from appearing on a Linux
system. They might come from a zip file with Latin-1-encoded names, for
example.

I have files older than UTF-8 on my Linux system. I have files encoded
in Latin-3, for example.

Worst of all, they might be part of an attack on your system. For
example, files whose names contain invalid UTF-8 could evade file
listing altogether, they might make your program crash in unexpected
ways, or you might not be able to remove them.


Marko
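
For illustration, here is a minimal Python sketch of how such a name behaves, assuming a POSIX system with a UTF-8 locale, where Python decodes filenames with the "surrogateescape" error handler. The name b"caf\xe9.txt" and the helper functions are made up for this example. No file is actually created, so the snippet only shows the encoding side of the problem.

```python
# A Latin-1-encoded name: byte 0xE9 ("e" with acute accent in Latin-1)
# is not a valid UTF-8 sequence in this position.
raw_name = b"caf\xe9.txt"


def is_valid_utf8(name: bytes) -> bool:
    """Return True if the raw filename bytes decode as strict UTF-8."""
    try:
        name.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def to_text(name: bytes) -> str:
    """Decode a raw filename, mapping undecodable bytes to lone surrogates."""
    return name.decode("utf-8", errors="surrogateescape")


def to_bytes(name: str) -> bytes:
    """Inverse of to_text(): recover the original bytes exactly."""
    return name.encode("utf-8", errors="surrogateescape")


text_name = to_text(raw_name)

# The str form round-trips to the original bytes, so the file can still
# be opened or removed through it...
assert to_bytes(text_name) == raw_name

# ...but it cannot be encoded as strict UTF-8, which is what happens
# when a program tries to print or log the name.
try:
    text_name.encode("utf-8")
except UnicodeEncodeError as exc:
    print("cannot encode name:", exc.reason)
```

A program that lists a directory and prints each name without handling this case would stop with UnicodeEncodeError at the first such file, which is one way the "crash in unexpected ways" scenario can come about.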