On Sat, Aug 10, 2024 at 5:39 PM pip...@protonmail.com wrote:

(Quick caveat: I'm going to use the term OpenType, exclusively, when
referring to the font format / specification, because TrueType and OpenType
are the same format and we need to get past that misunderstanding. Others
out there may disagree, but my mind is made up on the matter.)


> The Debian project contains TrueType fonts without sources. The "source"
> packages provide .ttf files which contain binary information compiled
> from secret sources.
>
> This is not about the license the .ttf files are made available under;
> it's about whether these distributions include source code, as
> required by the DFSG and the OSI Open Source definition.
>

I guess I'm not confident that I understand what point you're trying to get
at with these font examples. But, in case it helps from the standpoint of
font formats, I would personally push back on some of the particulars in
this first part of the email, since I don't think it frames the contents
and editability of OpenType fonts entirely accurately. The format itself is
not human-readable, but it's far more akin to a media format than to an
executable in my opinion, and that distinction often seems to quickly get
lost in a very broad "where's the source" discussion.

Whether or not an OpenType file constitutes 'code' is going to depend on
assumptions people bring with them. Is "a list of coordinate points" code?
Or is that data? Is it code if it's labeled as points that constitute a
path to be drawn? That's what we're talking about with glyphs in OpenType.

The rubber-meets-the-road reality of the OpenType format is that the font
files themselves are, in essence, source code, if they're code at all
rather than data. Namely, they are instructions that must be run through an
interpreter; they do not execute themselves, they don't get control of the
processor, they don't access the system or any of that stuff.

If we're measuring by weight, the bulk of a font file is path instructions:
curve segments to be drawn, advances to move the pen by, etc. For that
reason, it's typically not much of a loss of information if the path data
was previously in some other format, because the OpenType format is
editable as-is.
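
To make the "list of coordinate points" idea concrete, here is a rough
sketch of one curve segment treated purely as data. It assumes a quadratic
Bezier (the curve type used in TrueType-flavored outlines); the coordinates
and the function names are made up for illustration and not taken from any
font or library.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def quadratic_bezier(p0: Point, p1: Point, p2: Point, t: float) -> Point:
    """Evaluate a quadratic Bezier curve at parameter t in [0, 1]."""
    u = 1.0 - t
    x = u * u * p0[0] + 2 * u * t * p1[0] + t * t * p2[0]
    y = u * u * p0[1] + 2 * u * t * p1[1] + t * t * p2[1]
    return (x, y)


def flatten(p0: Point, p1: Point, p2: Point, steps: int = 4) -> List[Point]:
    """Approximate the curve with steps + 1 evenly spaced sample points."""
    return [quadratic_bezier(p0, p1, p2, i / steps) for i in range(steps + 1)]


# Hypothetical segment in font units: on-curve, off-curve, on-curve.
segment = ((0.0, 0.0), (100.0, 200.0), (200.0, 0.0))
polyline = flatten(*segment)
```

Editing the outline amounts to changing the numbers in `segment`; nothing
in the data runs until a rasterizer decides to draw it.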

I'd fully agree that it'd be fantastic (and life would be easier) if many
of the other tables and strings inside an OpenType font were in an
easier-to-process format (metadata in modern multimedia files for
example....), but the format exists the way it does mainly because it was
optimized for size reasons back in the bad old days of printer memory and
so on. But OpenType files are editable and examinable (and diffable and
roundtrippable, as Felipe noted...) with free software, despite their not
being human-readable on-disk.
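
As a small illustration of "examinable", the sketch below reads the table
directory at the start of an OpenType file using nothing but the Python
standard library. The layout (a 12-byte header followed by 16-byte table
records) is from the OpenType spec; the function name and the synthetic
test bytes are my own, and the sketch ignores font collections and WOFF
wrappers.

```python
import struct
from typing import Dict, Tuple


def read_table_directory(data: bytes) -> Dict[str, Tuple[int, int]]:
    """Map each table tag to its (offset, length) within the file."""
    # Header: sfntVersion, numTables, searchRange, entrySelector, rangeShift
    _version, num_tables, _sr, _es, _rs = struct.unpack(">IHHHH", data[:12])
    tables = {}
    for i in range(num_tables):
        start = 12 + 16 * i
        # Record: 4-byte tag, checksum, offset, length (all big-endian)
        tag, _checksum, offset, length = struct.unpack(
            ">4sIII", data[start:start + 16]
        )
        tables[tag.decode("latin-1")] = (offset, length)
    return tables


# Synthetic directory for illustration only: dummy checksums and no table
# data after the records, so this is not a loadable font.
fake = struct.pack(">IHHHH", 0x00010000, 2, 32, 1, 0)
fake += struct.pack(">4sIII", b"fpgm", 0, 44, 6)
fake += struct.pack(">4sIII", b"glyf", 0, 52, 128)
directory = read_table_directory(fake)
```

Pointed at the bytes of a real .ttf or .otf file, the same function should
list every table along with where it lives in the file.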

Because of that, standard practice has for a long time been to regard the
Beziers and the compact representations of all the other junk in an
OpenType font file as, so to speak, usable enough. Not _fun to use_
necessarily, but then again what is?


> Fonts are, of course, nontrivial computer programs. In addition
> (usually, see below) to the glyph outlines, which can be retrieved
> from the TTF files, they contain a large amount of code and additional
> data. For example, the "fpgm" table may specify a sub-program for
> which source code is not available.
>

This is slightly different, enough to comment on IMHO.

The fpgm table contains TrueType Instructions (more commonly referred to as
"hints", but again the choice of terminology kinda gets on my nerves),
which are more or less an assembly language. It has a very limited
instruction set, affecting the path data that is later to be interpreted.
So there is every likelihood that there was never any other source that was
compiled into the TrueType Instructions as delivered.

It seems clear that there have been tools that output TrueType Instructions
as a way of making families / libraries of vendor fonts more consistent and
easier to maintain (Microsoft's Visual TrueType is one of the few still
around), but as shocking as it seems, people also wrote and edited the
instructions by hand.... In those cases, there may be no other source.

The format for the assembly language is documented, of course:
https://developer.apple.com/fonts/TrueType-Reference-Manual/RM03/Chap3.html

Here, too, because the instructions it contains are only ever run on the
interpreter in the font stack (e.g., FreeType), it's a bit different than
the case of a binary blob that can do something in its own process. I guess
I don't know how it compares to the case of binary-blob firmware; I suppose
others know that better than I.

But there are some free-software tools that can decompile the format, and
I'm not entirely sure that doing so would get you something meaningfully
unlike the hypothetical "original source" used by the foundry. It might be
that there are pre-processors and things like macros that are not
recoverable that way, but I don't know if it's been studied. And you can't
rule out that it might just have been done by hand.
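
For a sense of how thin the layer between the bytes and the "assembly" is,
here is a toy disassembler covering only a handful of opcodes (NPUSHB,
PUSHB, DUP, POP, SWAP, CALL, FDEF, ENDF). It is a sketch, not a
replacement for the real tools: the opcode values are from the TrueType
instruction set as I understand it, the output format is my own, and the
sample byte string is made up rather than taken from any font.

```python
from typing import List

# A few opcodes that take no inline data. The real instruction set has
# many more; anything unknown is printed as a hex byte.
SIMPLE = {
    0x20: "DUP", 0x21: "POP", 0x23: "SWAP",
    0x2B: "CALL", 0x2C: "FDEF", 0x2D: "ENDF",
}


def disassemble(code: bytes) -> List[str]:
    """Turn instruction bytes into one mnemonic string per instruction."""
    out = []
    i = 0
    while i < len(code):
        op = code[i]
        i += 1
        if op == 0x40:  # NPUSHB: a count byte, then that many data bytes
            n = code[i]
            i += 1
            args = code[i:i + n]
            i += n
            out.append("NPUSHB " + " ".join(str(b) for b in args))
        elif 0xB0 <= op <= 0xB7:  # PUSHB: low 3 bits give (count - 1)
            n = op - 0xB0 + 1
            args = code[i:i + n]
            i += n
            out.append("PUSHB " + " ".join(str(b) for b in args))
        elif op in SIMPLE:
            out.append(SIMPLE[op])
        else:
            out.append("0x%02X" % op)
    return out


# Made-up bytes: define function 0 with body DUP POP, then call it.
sample = bytes([0xB0, 0x00, 0x2C, 0x20, 0x21, 0x2D, 0xB0, 0x00, 0x2B])
listing = disassemble(sample)
```

The mapping is essentially one byte to one mnemonic, which is why a
decompiled listing may not differ much from whatever was written
originally.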


> Some fonts even specify entire font families as "variable font"
> programs which take various parameters, so the outlines are very
> little help at all since they vary depending on the parameters, in
> non-trivial, non-modifiable ways hidden in the binary files.
>

I don't think I can call this statement accurate. OpenType font variations
are well-defined; the axes of each variable font are defined in the `fvar`
table, and the alterations made to each glyph are (X,Y) deltas on the
points in the contours, nothing more (*).

[* technically, other things are also subject to deltas, such as the
metrics, and it's possible for a variable font to declare that at some axis
position, a specific glyph from one master is replaced by a specific glyph
from another master — the canonical example being a dollar sign with bars
that go through the middle being swapped out for one where the bars are
only shown above & below, in a super-heavy weight where retaining the bars
in the middle would fill it in entirely.]

It's most certainly not arbitrary code, its meaning is not opaque, and it's
not _any_ less modifiable than the non-variable path data in a non-variable
OpenType font. The outlines are a great deal more than "very little help";
they're in fact the entire game.
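
A minimal sketch of what "deltas on the points" means in practice,
assuming a single weight axis and a single set of deltas whose influence
scales linearly from 0 at the default to 1 at the heavy extreme. Real
fonts store per-glyph deltas in the `gvar` table (for TrueType-flavored
outlines) and can combine several regions and axes; the function name and
all the numbers here are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def apply_deltas(
    points: List[Point], deltas: List[Point], scalar: float
) -> List[Point]:
    """Move each point by its (dx, dy) delta, scaled by the axis position."""
    return [
        (x + scalar * dx, y + scalar * dy)
        for (x, y), (dx, dy) in zip(points, deltas)
    ]


# Made-up data: a rectangular stem at the default weight, plus the deltas
# that widen it at the heavy end of a hypothetical weight axis.
stem = [(100.0, 0.0), (180.0, 0.0), (180.0, 700.0), (100.0, 700.0)]
bold_deltas = [(-20.0, 0.0), (20.0, 0.0), (20.0, 0.0), (-20.0, 0.0)]

halfway = apply_deltas(stem, bold_deltas, 0.5)
```

The deltas are as readable and as editable as the outline points they
apply to; changing the bold design means changing the numbers in
`bold_deltas`.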


> In particular, this affects at least the following fonts:
>
> Noto CJK: in this case, something *closer* to the source is available
> from Adobe's GitHub pages
> (https://github.com/adobe-fonts/source-han-sans) but even that font
> (a Type 1 PostScript font packed in a CID) was produced by a
> "proprietary application"
> (https://blog.typekit.com/2014/08/14/interview-with-ryoko-nishizuka/) I
> believe it's highly likely this proprietary tool consumed additional
> source data which is not available.
>

If the upstream project is using a proprietary tool, then my take on the
situation would be that it becomes problematic iff that tool produces
output that the free-software tools cannot generate equivalent output for.
I'm not clear if that's the case here, since I only have a bare-bones
understanding of Hangul and basically zero for the other writing systems
used in CJK fonts.


>
> Noto Emoji Color: the GitHub repository indicates
> (
> https://github.com/adobe-fonts/noto-emoji-svg?tab=readme-ov-file#generating-png-and-svg-files
> )
> that there are Adobe Illustrator files which constitute "the original
> Ai artwork". These files, which may include valuable information, are
> not included, only SVGs and PNGs generated from them.
>

I guess I would agree that the best case scenario would be if there were
Illustrator files (if those were what was originally used), but SVG itself
is a source format and I would call it the preferred format for editing; it
can not only be edited by GUI tools like Inkscape, but also generated
programmatically. I'm skeptical that there would be valuable data in the
Illustrator format of something made to be exported to SVG: meaning, things
that do *not* end up in the exported SVG file, but were valuable to
exporting/generating it. Possibly metadata, but I'm not aware of things
that would be exported to SVG by Illustrator but couldn't be generated
otherwise.
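
On the "generated programmatically" point, a rough sketch using only the
Python standard library. The shape, the color, and the `make_svg` helper
are invented for illustration and have nothing to do with the actual Noto
emoji artwork.

```python
import xml.etree.ElementTree as ET


def make_svg(path_data: str, fill: str, size: int = 128) -> str:
    """Build a one-path SVG document and return it as a string."""
    svg = ET.Element("svg", {
        "xmlns": "http://www.w3.org/2000/svg",
        "viewBox": "0 0 %d %d" % (size, size),
    })
    ET.SubElement(svg, "path", {"d": path_data, "fill": fill})
    return ET.tostring(svg, encoding="unicode")


# A made-up shape (a filled diamond), not taken from any emoji font.
document = make_svg("M64 8 L120 64 L64 120 L8 64 Z", "#ffcc00")
```

The resulting string can be written to a .svg file and opened in Inkscape
or a browser, and it parses back with the same library.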


> Droid Sans Mono: the font includes a non-trivial "fpgm" table which
> changes the appearance of some glyphs. No source code or instruction
> for rebuilding this table has been made available.
>

I think you would need to do some investigating of the TrueType
Instructions actually in the table in order to determine whether they can
be modified the way you want. Like I was saying above, as an instruction
set it's pretty small, so that is feasible. I don't know that decompiling
it into some other format would leave you with something different (much
less, something less complete) than whatever might have been used
originally, but it isn't so opaque that you would be left wondering what
it's doing; you could recreate it or modify it with instructions of your
own.


>
> 1. What is the source code for fonts? Is there some argument that .ttf
> files, by some process, become the source code even when they're
> generated from other sources?
>

I suppose the above might sort of provide my personal perspective on this
question. Fonts are half-data, half-interpreter-code; the interpretable
stuff does not have a lot of complexity to it (in programming-language
terms), so it's easy enough to verify what code-ish parts do, when that
needs doing, and those parts are easily edited.

I don't have any opinion on questions 2 or 3.

Anyway, I hope that's at least useful to think about.

Nate

-- 
nathan.p.willis
nwil...@glyphography.com <http://identi.ca/n8>
