Phillip Wood <phillip.wood...@gmail.com> writes:

> Being picky, I'll point out that ':' is not valid in refs
> either. Looking at
> https://docs.microsoft.com/en-us/windows/win32/fileio/naming-a-file I
> think only " and | are not allowed on NTFS/FAT but are valid in refs
> (see the man page for git check-ref-format for all the details). So
> the main limitation is actually what git allows in refs.

Yeah, trying to use the contents of the log message without
sufficient sanitization is looking for trouble.

>>              for (p1 = label.buf; *p1; p1++)
>> -                    if (isspace(*p1))
>> +                    if (!(*p1 & 0x80) && !isalnum(*p1))
>>                              *(char *)p1 = '-';
>
> I'm slightly concerned that this opens the possibility for unexpected
> effects if two different labels get sanitized to the same string. I
> suspect it's unlikely to happen in practice but doing something like
> percent encoding non-alphanumeric characters would avoid the problem
> entirely.

I'd rather see 'x' used instead of '-' (double-or-more dashes and
leading dash in refnames may currently be allowed but double-or-more
exes and leading ex would be much more likely to stay valid) if we
just want to redact invalid characters.
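
To make the redaction concrete, here is a minimal sketch of the loop
from the hunk above with 'x' as the replacement character. The helper
name sanitize_label() is hypothetical, and plain <ctype.h> is assumed
in place of whatever ctype wrappers the real code uses. It also shows
the collision being discussed: "a b" and "a-b" both come out as "axb".

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical helper, not the actual sequencer code: overwrite every
 * ASCII byte that is not alphanumeric with 'x'.  Bytes with the high
 * bit set (parts of UTF-8 multibyte sequences) are left untouched,
 * matching the "!(*p1 & 0x80)" test in the quoted hunk.
 */
void sanitize_label(char *label)
{
	char *p;

	assert(label != NULL);
	for (p = label; *p; p++) {
		unsigned char c = (unsigned char)*p;

		if (!(c & 0x80) && !isalnum(c))
			*p = 'x';
	}
}

/* Return 1 if two already-sanitized labels are the same string. */
int labels_collide(const char *a, const char *b)
{
	return !strcmp(a, b);
}
```

Any two inputs that differ only in their redacted characters end up
colliding this way, so redaction alone does not guarantee uniqueness.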

I see there is "let's make sure it is unique by suffixing '-%d'"
logic in other codepaths; would that help if this piece of code
yields a label that is not unique?
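
For illustration only, a sketch of what such a fallback could look
like. The names label_in_use() and make_unique_label() and the flat
array of already-used labels are assumptions made for this example,
not the data structures the existing codepaths use.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical lookup: is `name` among the `nr` labels in `used`? */
int label_in_use(const char *const *used, size_t nr, const char *name)
{
	size_t i;

	for (i = 0; i < nr; i++)
		if (!strcmp(used[i], name))
			return 1;
	return 0;
}

/*
 * Write into `out` either `base` itself or, if that is already taken,
 * the first free "<base>-<n>" for n = 1, 2, ...  Returns 0 on success
 * and -1 if a candidate does not fit in `outsz` bytes.
 */
int make_unique_label(const char *const *used, size_t nr,
		      const char *base, char *out, size_t outsz)
{
	int n, len;

	assert(used != NULL || nr == 0);
	len = snprintf(out, outsz, "%s", base);
	if (len < 0 || (size_t)len >= outsz)
		return -1;
	for (n = 1; label_in_use(used, nr, out); n++) {
		len = snprintf(out, outsz, "%s-%d", base, n);
		if (len < 0 || (size_t)len >= outsz)
			return -1;
	}
	return 0;
}
```

With "axb" and "axb-1" already taken, asking for "axb" yields "axb-2",
so two labels that sanitize to the same string still end up distinct.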
