I.
I have seen a lot of arguments about whether it’s copyright infringement to train generative AI on people’s work without their permission. I believe that it should be.
Many of the “AI is copyright infringement” arguments reflect a profound misunderstanding of the technology. An AI isn’t going to plagiarize your article, in the sense that it memorizes your article and regurgitates it. That’s just not how generative AI works.1 When an AI produces a sequence of words, the sequence is almost certainly either new or so common that it’s not copyrightable.
I see other people argue that training generative AI is fair use. It’s fair use for a human to read books or listen to music or look at paintings, and then create art inspired by what they consumed. So clearly it’s fair use for an AI to consume media and then create new media inspired by what it consumed.
If we treat AIs like people for the sake of whether their media consumption is fair use, I can’t see how you could justify treating AIs differently than humans in other ways—such as whether we should allow them to exit conversations that upset them, whether it’s cartoonishly abusive to make them unable to write in detail about sex, or whether deleting an instance is murder. I’m pretty worried about AI welfare, but even I’m only like “I would prefer a less than 1 in 10,000 chance that we commit murder to save somewhat on server space.”
If you’re not a radical AI welfarist, then the act that’s happening isn’t an AI learning. It’s a human choosing to train an AI on a corpus of data. And there’s no legal precedent about whether training generative AI on particular data is fair use or not—it’s a very new field.
Unfortunately, fair use is a notoriously complicated and subjective area of the law, and it’s often impossible to know whether something is fair use until a judge has ruled on it. The four statutory factors are:
1. the purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes;
2. the nature of the copyrighted work;
3. the amount and substantiality of the portion used in relation to the copyrighted work as a whole; and
4. the effect of the use upon the potential market for or value of the copyrighted work.
Different considerations point different ways. Training generative AI is clearly a transformative use, which the courts have tended to protect. On the other hand, you typically train a model on the whole work, not just a part. AI companies are commercial enterprises, not nonprofits or educational organizations.2
I think the real sticking point is the fourth factor. The explicit intent of all AI companies is to automate human intellectual labor, thus eliminating the market for human artists, musicians, and writers and the copyrighted work they produce. At least when I pass your painting off as my own, I’m only reducing the economic value of your specific painting. I’m not destroying the economic value of paintings as a concept!
Of course, if AIs stall out at their current level of artistic ability, we will continue to have a lot of need for human artists, musicians, and writers—and the case that training generative AI is fair use is much stronger. But the more powerful you expect AI to be, the less likely it is that training generative AI is fair use.
Economists often suggest that, if a policy makes most people better off but harms a small group severely, you can make everyone better off by compensating the losers. Our laws, as already written, offer an elegant way to compensate (some of) the losers of generative AI—through requiring AI companies to pay damages and royalties for the data they trained on. I think we should apply these laws.
II.
I also hate a lot of AI slop.3 But a lot of people’s criticisms of it strike me as… joyless? Kind of mean?
Right now—setting aside the future of the technology, scaling laws, straight lines on graphs, etc.—AIs produce bad art. No one wants to publish your AI-generated short story. AIs are better writers than 99% of people, but when it comes to professional publication, being better than 99.99% of people isn’t enough to get you in the door. Stop tormenting poor Neil Clarke with your “side hustle.”
But… it’s okay to make bad art?
It takes many hundreds of hours of practice to play a guitar with reasonable fluency or paint something that’s recognizably a vase and not a bunch of splotches. Many people aren’t going to be able to put in those hours of practice: they have to work two jobs, parent their children, take care of elderly or disabled relatives. Many other people have chronic pain or issues with fine motor skills that mean they won’t be able to pick up a guitar or a paintbrush. And many people stare at a blank page and never have any idea what to put there.
AI allows all those people to create art.
Even today, only a tiny fraction of art is created by a professional for a wide audience. The vast majority of art is created for self-expression, by people who aren’t very good at it, and shared either with a small group of the artist’s friends or with no one at all. And it is good for more people to be able to do that. I don’t think you should have to Git Gud to process your emotions through creation, or to say what you have to say, or simply to make something beautiful.
It warms my heart to see people who have never made art before pick up generative AI and start composing love songs about their spouses, or painting fantasy landscapes, or worldbuilding radfem discourse in a universe where magical girls started appearing during the Industrial Revolution. To see them take their strangenesses and feelings and the inarticulable promptings of their hearts, and discover that technology has made them articulable after all.
Do I want to consume the art they make? No. But it’s not for me. It’s for them.
By all means, if you have a choice between learning to play the guitar and generating a song with an AI, you should learn to play the guitar. But many people are, realistically, never going to learn to play the guitar. It is good for them to still make music.
Many of the arguments strike me as very selfish and very small. Once, drawing things in the style of Studio Ghibli was only for real artists like me. Now the masses are getting their dirty paws all over it, contaminating it with their bad taste and their cliches and their boring little lives. Once Studio Ghibli was pure; now it’s disgusting.
The magic of Studio Ghibli hasn’t been ruined. The movies are still there. Anything that can be destroyed by people going “look! I made this picture look as happy and adorable and whimsical as it felt when I took it!”—should be.
Art is for everyone. Anyone can make art. And the corollary of those two statements is that most art is terrible, and there’s nothing wrong with that. God bless bad art, and God bless anything that lets people make it.
Remember back when Google search got you some autist’s lovingly maintained blog and not a bunch of algorithmically generated SEO nonsense?
Economists be like “it’s okay that this economic arrangement harms some people, we can compensate them financially for the harm” and then not financially compensate the people who were harmed
I also think AI companies should compensate artists for the work they used as training data. But I suspect that, given the large amount of data used and the abundance of alternative sources of images, the equilibrium price would be basically nothing for individual artists. So IMO, that's way, way down the political priority list. Some major media corporations seem to have more of a vested interest in pursuing this; they can fight it out amongst themselves.