First of all, I just want to say I love almost everything about this, and I
especially love the Hermann Hesse reference!

I am quite curious what you mean by saying you use Lilypond to write
poetry. Do you mean that you use Guile Scheme to generate the text and use
Lilypond to typeset it as top level markups, with no musical score?

I can see why Lilypond fits the spirit of your literary idea, as a
text-based format for encoding music, and especially one based on LISP. It
would definitely be cool to see Lilypond popularized a bit more in that
way. That said, I can't help but wonder whether specifying the use of
Lilypond is necessary for your literary purpose. If your reason for
specifying the use of Lilypond is technical, I outline below some thoughts
on its suitability to be the language of musical LLMs.

I am a bit skeptical that an LLM is actually what you want for your story.
LLMs as they currently exist need to overcome some major technical hurdles,
outlined below, before they will be able to compose sheet music competently.
Maybe it can be assumed that for fictional purposes the
practical challenges are solved, but even then, I wonder if there might be
other types of algorithms that are a better fit both technically (more
thoughts on that below) and in spirit. Admittedly my sense of the Glass
Bead Game is quite clouded by memory, but I would have thought that to get
at Hesse's idea, you would want a mechanism that is "conscious" of the
relationships of ideas, which is pretty much the opposite of what an LLM
does. For your literary purposes, do you even need to specify the type of
AI algorithm?

Serialist systems are exactly the sort of generative scheme for musical
material that LLMs are worst at, but serialism also lends itself to other
sorts of AI techniques, such as genetic algorithms. I
wonder also if the strength of the serialist constraints you place on the
output of each successive token would in some way render the LLM equivalent
to other, non-LLM based algorithmic methods; by crude analogy, if I asked
an LLM to calculate for me the output of some very complicated numerical
function, and then performed validation filtering out all the incorrect
answers, the resulting output would be the same as if I had entered that
function into a calculator.
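
To make the analogy concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: `propose_pitch` is a hypothetical stand-in for a model's next-token guess (here just uniform random noise), and the only constraint enforced is the basic twelve-tone rule that no pitch class repeats within a row. It is not meant to represent how any real LLM or serialist system works.

```python
import random

PITCH_CLASSES = list(range(12))  # 0 = C, 1 = C sharp, ... 11 = B


def propose_pitch(rng):
    """Hypothetical stand-in for a model's next-token guess (uniform noise here)."""
    return rng.choice(PITCH_CLASSES)


def filtered_row(rng):
    """Build a tone row by rejecting every proposal that repeats a pitch class."""
    row = []
    while len(row) < 12:
        candidate = propose_pitch(rng)
        if candidate not in row:  # the serialist constraint does the real work
            row.append(candidate)
    return row


def direct_row(rng):
    """Build a tone row with no proposer at all: shuffle the 12 pitch classes."""
    row = PITCH_CLASSES[:]
    rng.shuffle(row)
    return row


def is_tone_row(row):
    """True if the row contains each of the 12 pitch classes exactly once."""
    return sorted(row) == PITCH_CLASSES


rng = random.Random(2024)
print(filtered_row(rng), is_tone_row(filtered_row(rng)))
print(direct_row(rng), is_tone_row(direct_row(rng)))
```

Both functions can only ever return valid rows, so with a constraint this strict the proposer contributes little beyond what a plain shuffle already provides.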

I wouldn't claim to have a deep understanding of LLMs, but I have played
around with them quite a bit in trying to get them to generate music in
Lilypond format. My strong impression is that even a very powerful
current-generation LLM trained on a large dataset of musical repertoire in
a parsable format would not be well suited to generating new musical
scores, except perhaps if they were limited to a single melodic voice. The
reason is that while LLMs are good at suggesting appropriate
continuations to an existing text, which makes them a good tool for some
types of creative writing, non-local relationships between semantic tokens
are a critical aspect of musical scores to a degree that they are not in
text. LLMs are by design not well equipped, for example, to validate that
the rhythms in each part add up to the correct total durations, or to check
harmonic and contrapuntal relationships. Lilypond actually makes this
particularly hard because Lilypond input files are typically written
"horizontally," whereas some other notation programs store data
"vertically" across all instruments measure by measure or beat by beat.
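
As a rough illustration of the duration check and of the horizontal/vertical distinction, here is a small Python sketch. The data layout is hypothetical (it is not Lilypond's internal representation or any notation program's file format): each voice is stored "horizontally" as a list of bars, each bar as a list of (pitch, duration) pairs, with durations expressed as fractions of a whole note.

```python
from fractions import Fraction

# Hypothetical "horizontal" encoding: one list of bars per voice.
score = {
    "soprano": [
        [("c''", Fraction(1, 2)), ("d''", Fraction(1, 2))],
        [("e''", Fraction(1, 1))],
    ],
    "bass": [
        [("c", Fraction(1, 4)), ("g", Fraction(1, 4)), ("c", Fraction(1, 2))],
        [("g,", Fraction(3, 4))],  # deliberately a quarter note short
    ],
}


def bad_bars(score, bar_length=Fraction(1, 1)):
    """Return (voice, bar_number) for each bar whose durations miss bar_length."""
    return [
        (voice, i + 1)
        for voice, bars in score.items()
        for i, bar in enumerate(bars)
        if sum(duration for _, duration in bar) != bar_length
    ]


def vertical_view(score):
    """Regroup the voices bar by bar so simultaneous material sits together."""
    n_bars = min(len(bars) for bars in score.values())
    return [{voice: bars[i] for voice, bars in score.items()} for i in range(n_bars)]


print(bad_bars(score))          # bars that do not add up
print(vertical_view(score)[0])  # everything sounding in bar 1
```

A conventional program can do this bookkeeping trivially once the input is parsed; the difficulty described above is that a model predicting text left to right has to keep the same totals consistent across parts that may be written far apart in the file.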

In order to suggest a musical continuation of a polyphonic score, the LLM
would first need to parse the Lilypond code sufficiently to understand what
the *semantic* tokens are from a *musical* perspective across all the
simultaneous parts, which is quite different from parsing the tokens within
the code itself, or even from parsing the notes and rhythms, since musical
ideas can consist of complex interrelationships. That said, the same
mechanisms that are already being developed to allow AIs to, for example,
perform logical validation of software code or mathematical derivations,
will also provide the needed baseline capability for composing music.
However, music is likely to test AI logical reasoning capabilities to a
greater extent than even most mathematical and engineering use cases,
because in a musical score the relationship of every note to every other
note in a given passage must be considered.

Even if the logical reasoning challenge were solved, that would only bring
LLMs to the starting line of being able to coherently add a new polyphonic
musical idea onto an existing sequence of polyphonic musical ideas. But
this would be the musical equivalent of the type of trite, contentless
drivel that GPT-3 was famous for. To write actually competent music
requires a whole additional set of layers of non-local relationships,
namely those between ideas. Strong LLMs are capable of *imitating* such
relationships in text because they are trained on datasets that contain
them; if you ask an LLM to write a short story about characters named Adam
and Eve, it will know that in sentences where the name of a person is
appropriate, it should prefer one of those names to some arbitrary name.
But that is not the same thing as knowing that the story is about those
characters, and the difference becomes apparent in situations where a
reasonable choice of words based purely on probability leads to nonsense or
inconsistency. This, it seems to me, is the very sort of understanding that
is critical to composing music of any quality.

Now, it might be that with a sufficiently large training dataset, the model
would be able to make good enough guesses that its lack of non-local
understanding wouldn't matter in any practical sense. Clearly that is the
case at least for certain types of prose writing. But it's difficult for me
to imagine that being the case with music, simply because even if the
entire canon of published sheet music were available as training data, that
dataset would still be orders of magnitude smaller than what is available
for prose. Maybe an LLM could be expected to compose new Christmas carols,
possibly even baroque fugues or bel canto arias, but for instance if one
wanted it to write a completion of the Mozart Requiem, I am doubtful that
Mozart's catalogue would provide sufficient training data for the model to
succeed in picking up the thread of musical development where Mozart leaves
off, let alone to continue the thread of development in a convincingly
Mozartian manner.

On the subject of training data, one might naively think that text based
formats like Lilypond or MusicXML would be well suited to training LLMs on
sheet music, since interacting with code is an existing core capability of
LLMs. Personally, I think this is unlikely to be the path toward a model
that can actually compose, because the repertoire available in these types
of text formats is a tiny fraction of published sheet music. In order to
create a training dataset of sufficient size, either an implausibly large
human data entry effort or advanced OCR for sheet music would be required.
But OCR would probably itself use some type of AI model
to do the translation into text. And let's remember that Lilypond is
ultimately a language that is compiled into music notation. So if our AI
model is able to use OCR to parse sheet music into Lilypond and then write
Lilypond that compiles into sheet music, which can then be parsed back into
Lilypond by the AI...why not just let the AI read and write music notation
directly? The main benefit of Lilypond for algorithmic composition is that
the Scheme representation of music is useful for programmatic manipulation,
but this isn't the type of internal representation used by LLMs. All this
to say, if we ever get an LLM that can competently compose sheet music, I
think it will almost certainly be trained directly on music notation,
without the need for a text representation of the music.

I'm including the list in my reply, since I suspect the overall question of
AI composing music is of interest. Please pardon my ramblings.

Saul

On Mon, Oct 21, 2024 at 4:45 PM David Olson <[email protected]>
wrote:

> Dear Lilypond Composers,
>
> Questions are regularly posed on this list that suggest that many
> Lilyponders are composers who use Lilypond to generate music based on
> creative conceptions, rather than the traditional way.
>
> I'm writing a faux-philosophical novel in which characters occasionally
> speak speculatively about Hermann Hesse's "glass bead game". It is kind of
> an updated Search for the Holy Grail (actual Glasperlenspiel). How to
> decode Hesse's many hints. What are the rules? What if Lilypond is used to
> play the game?
>
> One of the characters says:
>
> The Glass Bead Game, while being set far in the future, is nothing other
> than sets of Leibniz's characteristica universalis identified by competing
> future large language model artificial intelligence, wherein the coding
> teams who have developed each LLM compete by generating serial music using
> dadaist algorithms for their own LLM AI, in which, not the notes, but the
> large language model universal characteristics (post-Platonic forms) can
> only appear once.
>
> (the character's nickname is "Nothing Other Than"; he's an academic
> philosopher)
>
> 1. Can this general idea (scenario) be tweaked to be more specific?
> (more interesting to the coding community)
>
> 2. Is there a way to mention Lilypond specifically in this paragraph?
> (e.g. replace everything that follows "wherein")
>
> 3. Is there any objection to mentioning Lilypond specifically in the novel?
>
> 4. Are there any other novels that mention Lilypond?
>
> 5. How might Lilypond be mentioned in some earlier chapter, to better
> prepare the reader?
>
> Honestly, No. 5 is entirely my responsibility, but having grown up in a
> Daoist country, I like Sets Of Five. On the other hand, if someone has had
> a humorous personal experience that could be imported into a narrative,
> I'm all ears. Maybe that would be a personal response.
>
> Hesse and dada. Hesse and dharma.
>
> I also welcome personal correspondence about thoughts Lilyponders might
> have about Das Glasperlenspiel (The Glass Bead Game). Because the
> protagonist is a member of a Freemason-like fraternity that has a karaoke
> room in every lodge, often used to sing a cappella rather than karaoke, and
> the protagonist travels frequently, it's easy to introduce a variety of
> characters with different views about music. Historically, Freemasons liked
> to sing together and had their own songbooks. Until 1991, The Sacred Harp
> contained a 4-page anthem titled "Masonic Ode." I'm not a mason, but I'm
> fascinated with Masonic music.
>
> For the record, I don't use Lilypond to compose music. I use it to write
> poetry. I feel that my first draft of Nothing Other Than's description is
> unsatisfying to people who actually write code. Suggestions welcome.
>
> Pond-fraternally yours,
>
> David Olson
> Los Angeles
>
>
