Re: Brief update about software freedom and artificial intelligence

Bradley M. Kuhn Thu, 02 Mar 2023 20:27:30 -0800

Hey, everyone, as many of you probably know, I've been involved with many of
the GPL and AGPL enforcement efforts that are (publicly) known to have happened
in the USA since 1999, and also have been involved with the drafting process
of various copyleft licenses.  I currently am continuing that ongoing work
along with my colleagues at Software Freedom Conservancy (SFC).


From that context and point of view, there are three main points I want to
contribute to this discussion:

Point 0:

Always keep in the forefront of your mind that the complexity of legal issues and
enforcement of licenses lags technology by a period measured in decades.  For
example, many folks have referenced the Google v. Oracle SCOTUS case — which
dealt with questions of software licensing and copyright that we were
discussing in the copyleft community as far back as the early 1980s.  Yet,
the case didn't come before SCOTUS for consideration until a few years ago,
and (on top of that) SCOTUS' decision was complex and didn't really resolve
some of the fundamental questions that we all have about how software
licensing works.  Most of the key issues that we in FOSS were worried about
(such as “where is the bright line for when it becomes copyright infringement
if you reimplement a known, documented API?”) *came up* in Google v. Oracle,
but they still remain open legal questions in the USA.


Point 1:

FOSS licensing doesn't rely solely on copyright law.  Yes, grant of a
copyright license is the fundamental part of all FOSS licenses, but they are
also contractual agreements.  (For folks unfamiliar with this point, I
encourage you to read the stuff we published at SFC when we filed our
contract case against Vizio <https://sfconservancy.org/vizio/>.)  So, when
thinking about these questions, an exclusive focus on copyright questions
might not be particularly helpful.

Furthermore, copyright law isn't a moral code: it's just an extremely flawed
legal system that we're forced to deal with because various regimes decided
back in the 1970s that software would be governed by copyright.  What
copyright law says or doesn't say in any particular jurisdiction never
provides us any moral compass as to what is wrong or right for software freedom.
We must approach *that* question “a priori” (and as philosophers) because all
the “a posteriori” explorations of the question in the real world are just too
heavily biased by the incumbent capitalist structures that serve and/or
benefit from the proprietarization of software.

On that point, I do invite everyone over to the mailing list we're hosting at
SFC to discuss the morality and ethical implications in FOSS of
machine-learning-assisted software development.  You can read more about
this, and subscribe to the mailing list, via:
<https://sfconservancy.org/news/2022/feb/23/committee-ai-assisted-software-github-copilot/>

Point 2:

There are a number of mistakes FOSS activists have made historically in
copyleft licensing creation and drafting.  Having been involved myself in the
invention and drafting of AGPLv3, and a somewhat-involved witness to the
GPLv3 drafting process, I learned the hard way that trying to address every
“issue of the day” quickly in a copyleft license draft leads to problems.

A big example appears in the patent provisions found in A/GPLv3§11¶3-6.  They
are complicated, unnecessarily wordy, and as full of loopholes as the worst
tax legislation.  Admittedly, the primary problem there may be that the
drafting process was over-influenced by large patent holders.  However, the
reason such influence was successful was a fervor of concern among
FOSS activists about seemingly-urgent patent issues of the day.  In
hindsight, those issues were either moot, or turned out even *worse* than we
imagined, and therefore poorly addressed by this section anyway.

To be abundantly clear so I'm not misunderstood: I'm analyzing these
issues in hindsight to help inform our current issues of the day.  Lots of
really experienced and smart policy people contemporaneously believed
(probably reasonably) that A/GPLv3§11 was the be-all-end-all of patent
language for copyleft.  But the behavior and legislation both changed in the
intervening years, *and* some seemingly huge problems of those days also seem
minuscule in the rear view mirror a decade later, and problems that we
thought were solved or could be solved stubbornly got worse.

Most importantly to this point, over the decade after GPLv3's release, lots
of corporate attorneys pushed heavily anti-GPLv3 agendas — claiming that the
patent language was the problem.  In fact, after years of work responding to
those (as it turned out, specious) criticisms, we later learned that the
patent language was just a convenient place to hang their hats in their
broader anti-GPLv3 campaign.  So, IMO, we (as a FOSS community) got basically
*no* policy gains on patent issues in GPLv3 that we didn't already have in
GPLv2, *but* we handed the opposition a bunch of text for them to paint as
“big scary reasons” to avoid GPLv3.  That's a huge factor in how we ended up
in the complex GPLv2-only / GPLv3-or-later divide in copyleft circles that we
have today.

IMO, this seemingly unrelated example really shows three key issues highly
relevant to the issue of machine-learning and FOSS:

  (a) it's very easy as a FOSS license drafter to be caught up in the issues
      of the day and overcompensate by writing more text into the license
      thinking it's great policy but then it backfires for
      political/social/enforceability/advocacy reasons,

  (b) the echo chambers and deference to incumbent authority that have
      historically dominated FOSS license drafting really have been
      problematic and we've not fully explored how to solve that for future
      drafting, and

  (c) because copyleft is such an amazing invention, we (as a FOSS community)
      have a tendency to see everywhere nails that we think the hammer of
      copyleft can hit — even when they may well be screws, not nails.

On (c), I point to my current-favorite license, AGPLv3, which I admittedly
helped design and draft.  Ultimately, AGPLv3 didn't do nearly as much as we'd
hoped to solve software rights for network-deployed software, precisely
because the software freedom and rights issues that come up in such software
*can't* fully be addressed merely by a copyleft provision.  We erred because
we didn't see the obvious: a good copyleft license is a *necessary* but not a
*sufficient* condition to assure users' software rights and freedoms.

Furthermore, we didn't carefully consider when building the Affero clause how
much it could be abused in proprietary licensing schemes by companies like
Neo4j, MongoDB, and others.  Specifically, only years later did the community
(thanks to Richard Fontana) figure out that a copyleft equality clause was
absolutely mandatory to offset the problem (more on this at
<https://sfconservancy.org/blog/2020/jan/06/copyleft-equality/>).  Proprietary
relicensing is a relatively simple problem to describe and
study, yet it took us about 30 years to come up with a copyleft clause that
can actually address the problem elegantly and in an enforceable way.

As such, based on all that I've learned in copyleft drafting, I advise
*extreme caution* about rushing to copyleft as an obvious solution to the
disturbing things happening with machine learning applications.  There may
well be ways copyleft can be used to fight back against the horrible things
that OpenAI, Microsoft's GitHub, and dozens of other for-profit companies are
doing with machine learning.  However, I'm quite sure that whatever ways we
think copyleft can (or can't) be modified/improved/changed/applied to help
may well turn out to be the wrong decision if we rush.

The most important thing we can do now is advocacy: first and foremost, we
need to raise awareness about why this technology is bad for users and
impedes their software freedom and rights.  There are natural allies around —
from folks in the visual arts, to those who have correctly pointed out that
machine learning systems trained on existing data usually propagate the
biases inherent in past decisions and work.  Time spent coalition building
will serve us better than more navel-gazing at copyleft terms on this front.

Ultimately, if there *is* a legalistic/licensing solution implementable in
copyleft, the right one won't become apparent until the dangers and problems
are fully understood by society.  Similar to the advent of copyleft itself as
a strategy: proprietary software had to actually become a thing and a problem
before we could figure out how to answer it with copyleft.

Inventing new copyleft terms shouldn't be the first place we run to when
facing a threat to software rights or freedom; it should be a solution used
only sparingly when we're sure no other solution (including, most
importantly, enforcing the copyleft terms that we *have* already) will work
to address the problem.

  -- bkuhn
