How prestige outlets like The Guardian get away with copypasta
On the anatomy of malpractice
Major media outlets are attacked over their politics (too left-wing, too right-wing, etc.) all the time. But what’s not critiqued enough are the common practices behind how they write their articles, practices which are often terrible, misleading, and deeply dishonest.
I’m talking about what happens at the supposedly most prestigious organizations around, places like The Guardian or The New York Times (and a myriad of others). What I am here to tell you, as someone who writes online and therefore has some experience trying to put together pieces that I think hold up under scrutiny (which never means they’re perfect, but does mean I’m trying), is that it is incredibly common to find articles at these outlets that don’t hold up under minimal scrutiny, in which it’s obvious no one tried. I’m talking unprofessional. I’m talking basically lying by omission to their readers. And I’m talking utterly useless as an information source. If I used the same practices as the big outlets here, on Substack, everyone on the internet would be upset with me, and justifiably so. Paradoxically, these bad practices don’t dent the prestige of these publications, which are still thought of as the historical record of our very civilization.
What could outlets have possibly done to earn my ire? Let me give a couple examples of what I consider to be inexcusable practices that are not just common but essentially universal across the industry.
First, there is the practice that undermines any notion their purpose is to inform their readers. Which is that they don’t fucking cite anything.
There are zero, zip, nada, links or citations. And I’m talking most of the time, for most of their articles. Let’s take this recent article in The Guardian, concerning the new paper showing that depression is likely not caused by a chemical imbalance.
It’s about a scientific study, so you would think that the scientific study would be linked at least somewhere in the text; or, if not a link, perhaps a citation at the bottom or in a footnote. But is the paper linked? No. Is anything linked? No. They barely tell you what journal the paper appeared in. Here’s the information they give you, essentially guaranteeing that to find it you’d have to search through the journal Molecular Psychiatry by hand, guessing at what they’re talking about, since they don’t even say the name of the study, nor the issue number.
You might think that red link there, one of a mere two in The Guardian’s article, would go to the journal—nope, it just searches The Guardian for the term “psychiatry.” What if someone read that in a couple weeks, and wanted to look up that study? How could they ever find it? And why’d they put that red “psychiatry” link exactly right there? Precisely where, normally, a link to the actual journal would sensibly go? You may think this conspiratorial, but imagine if I wrote a post about the recent evidence contradicting the chemical imbalance theory of depression and included links precisely where you would expect them, but it turned out they all only went to my own Substack? That would be sociopathic. But it’s somehow just fine for a prestige outlet.
Their refusal to link or cite or provide any outside reference anywhere that might take you off their website means you never know where any fact they give you comes from—and without its origins, you can’t assess its veracity. Like:
Says who? Is this study 5 years old? 10? This year? No one can ever know, because where this is coming from is completely opaque.
One might argue that this is merely a function of historically being in print—but still, not even a citation? Even if it were printed (I can’t find any evidence it was) how would that actually make it better? The Guardian has been online for 24 years and has an openly “digital first” strategy! And don’t get me wrong, my standard is not that everything these outlets come out with is perfectly cited and linked and there are zero mistakes. I don’t meet that standard and frankly nobody meets that standard. But what matters is that a piece of writing should try to empower a reader, and the omnipresent lack of outside context indicates that the primary purpose of these websites is not to empower their readers.
That’s only the beginning. Let’s consider another common practice, one that few people ever notice but which is also essentially universal. It involves a misrepresentation ensconced in plausible deniability. What I’m talking about is how they manage quotes for their articles. Specifically, they will use a quote without giving a source, implicitly pawning it off as if they had done a bunch of investigative journalistic legwork and tracked it down. Really, all they did was poach it from another organization, eliding that the article is basically just a copypasta of some earlier article in a smaller outlet (“copypasta” is a term used on internet forums to describe widely shared blocks of text that can be copy/pasted with minimal changes to fit a new context). Here’s an example of the trick from the same Guardian article, where the lead researcher is quoted:
Does it not sound like they rang up Moncrieff, got that nice quote, and are now presenting you, readers of The Guardian, with it?
Well, they didn’t. That quote by Moncrieff is from an earlier source, a press release that has nothing to do with them, which they use without crediting, essentially making it look like they did investigative legwork for the piece when really they just copy/pasted from somewhere else. Here’s the exact same quote, published the day before at a much smaller outlet called EurekAlert.
Now, EurekAlert probably doesn’t mind; it specifically publishes press releases super early to be picked up by bigger outlets. My point is that this practice is hidden, and made ambiguous so that the larger outlets appear to have done more investigative work (without linking or citing it). And it’s not like The Guardian is alone in this: almost everyone who reported on the viral story used either the same pull quote or other duplicates, from Newsweek to Sci-news to Science Daily, with only a few providing a link to any sort of source; for it is indeed a common practice to pawn off quotes by leaving the origin ambiguous. It’s all carefully worded so that there’s plausible deniability. They’re not lying, like they would be if they said “obtained by The Guardian” or anything like that. But they are misrepresenting, I think on purpose, to appear more authoritative than they actually are. If it’s an accident, it sure is a beneficial one that takes maximum advantage of the ambiguity. The letter of the law is upheld (they’re all just cribbing off the same press release) but the spirit of the law is lacking. And the same goes for how similar all the stories are, almost identical. It’s like watching a bunch of college first-years try to avoid plagiarism while working off the same cheat sheet.
This aspect of the news being copypasta is the sort of thing that once you start noticing it, you can’t stop, and so I want as many people to notice as possible. As you might have guessed, I was actually going to write about the very-public shift in the “chemical imbalances” theory of brain disorders, but I got stuck at this Guardian article, since I immediately suspected that the quotes were being pawned off without crediting the source. How did I personally find out about this practice? Because The Guardian once used one of my quotes in exactly the same way.
Last year I published “The Overfitted Brain: Dreams evolved to assist generalization” in the journal Cell: Patterns, an article proposing a new theory of why dreaming occurs: that dreams are out-of-distribution stimuli designed to keep our brain from getting too well-trained on our daily (and often boring) routines, which would harm generalization. It is the only hypothesis I know of that explains dreams’ strangeness, their Lynchian quality, which is especially important since it’s a quality that often contradicts other theories and is normally very hard to explain.
The paper got a lot of press. I discussed the experience on The Intrinsic Perspective in “On being the subject of a media cycle,” writing:
. . . here’s an abridged list of the links:
The Guardian, IFLS, Nature, Gizmodo, Medical News Today, The Washington Post, EurekAlert, Dazed, ScienceAlert, Technology Networks, Big Think, iHeartRadio, Neuroscience News, Psychology Today, Inverse, Spokesman, The Boston Globe, etc.
There are plenty more. Some of my personal favorites include write-ups in Aviation Analysis and Martha Stewart Magazine. It also made its way to places like Japanese Media and the biggest daily morning newspaper in Sweden. How about Slovenia? I even ended up explaining the hypothesis on a couple of radio stations.
What I didn’t say at the time was that most of these articles were pretty close to copypasta, and many do the exact same thing of using quotes without sources. Not all, of course. But many. E.g., zooming in on The Guardian article that covered the Overfitted Brain Hypothesis, there was this quote:
I read that and thought—wait, I never talked to a journalist from The Guardian! And I never did. They just took it from somewhere else.
I suppose that some people might find this not shocking. After all, it’s not a lie. You don’t have to assume that they did the legwork to get that quote. They just leave it unsourced. It’s only a lie of omission. And a defender might point out that it’s only my assumption that readers think the reporters spoke to the people they’re quoting to begin with.
I can see the logic behind this response, but it doesn’t face the fact that if I were to be this cavalier, even once, it would be something I would be ashamed of, and additionally something I would be rightly called out on and try to correct. That’s a pretty simple litmus test—would I purposefully do it on Substack?—and these practices don’t pass. So I’ll be honest and say I think they dissemble in order to appear more authoritative than they actually are and to prevent the horrible realization of how these outlets actually function. For what most stories at major paper-of-record outlets are is copypasta, but due to practices like (a) not linking sources and (b) not crediting where statements come from, they are able to maintain the illusion that the copypasta is self-contained, original, and thoroughly investigated. If they actually sourced everything, then what would happen is that after a couple articles where you clicked through and found basically an identical article with the same quotes, the bloom would come off the rose real quick. And while it’s pretty much impossible to figure out what percent of articles these criticisms don’t apply to (perhaps by tracking, say, the pieces of investigative journalism that these outlets break), there is simply no way that the copypasta approach isn’t the majority, perhaps 95% or more in some places.
Oh, and the reason your computer slows down so much when you open these outlets (way slower than streaming HD video) is that a lot of these websites are loading not just dynamic ads but also analytics tracking software, or, even more intrusively, using your visiting computer to mine cryptocurrency. Salon has been open about using your computer to mine, and there’s evidence CBS has as well, but it’s rumored to be a quite widespread, if secretive, practice.
I want to point out that I’m not criticizing any particular journalist or writer. After all, they’re simply operating under the explicit guidelines they’ve been given, shackles they cannot break, and therefore cannot do things like include outside links, appropriately credit quotes that came from another source, or put a more personal, interesting, or involved take on a story to avoid the copypasta effect. Many of them want to do all that, and are prevented. I say that because I personally know, respect, and am friendly with multiple people, like journalists and science writers and freelancers, who’ve worked for these institutions. Heck, I’ve written for these sorts of outlets. So I want to make clear my criticism is of the guidelines of the institutions that hem their authors in. I want to see writers set free.
Also, if you’re a long-time fan of the institutions themselves, I understand that these criticisms may seem unfair; after all, they put out dozens of new takes every day, often within 24 hours of the events they cover, and therefore one could argue that they provide a valuable service, even if they don’t do it in the way I would want. This is why, as writers continue to leave and start independent publications like this one, the “unbundling” of media has been discussed mostly as if it’s a bad thing, as if it were dangerous to leave, say, long-form articles about some new scientific discovery to individuals rather than trusted institutions. But I don’t think this argument holds much water, at least for most subjects. For if these are the legacy practices we’re leaving behind, then I don’t see how it’s not an objective improvement. And maybe this is just local to my part of the internet, but it certainly feels like the standards are higher for individuals than what’s on display at major outlets. I don’t say that because individuals are so great. I say it because they seem to be at least trying to empower readers, and are quite easy to hold to account, while legacy outlets are explicitly not trying, are difficult to hold to account, and are therefore allowed to evolve disingenuous practices that conveniently cultivate a false authority. So maybe following a bunch of independent authors, journalists, and thinkers, people who would feel bad about (and be called out for) pumping out unsourced copypasta with uncredited quotes on a site that crypto-mines on your computer, is going to leave the average reader more informed, not less, and may, here’s a shocker, actually be a significant boon for our culture.
Plus, we allow comments. They mostly don’t, ever since around 2013. As time goes on I find that more and more telling.