62 Comments
Sep 21, 2022 · Liked by Erik Hoel

Also: ask one hundred humans to draw a horse riding an astronaut & I bet about fifty of them would draw an astronaut riding a horse.

Sep 21, 2022 · Liked by Erik Hoel

Interesting post. I have to say I had been very sympathetic to Marcus but this post makes me less so.

Something that leaves me somewhat sympathetic to Marcus' rhetoric calling these "parlor tricks" and such is the fact that these models, while superficially impressive, have yet to yield much value in real-world applications. I realize for PaLM and DALL-E 2 it's still super early, but we've had GPT-3 for years and as far as I know there have been no major applications of it beyond "hey look how novel it is that an AI wrote part of this article". Whenever I talk to people who program/use AI in commercial applications, they are much more cynical about its capacities.

Given that, it really does seem like something is missing, and maybe we’re prone to overestimating these models. While many of the specific critiques made by Marcus were wrong, I think he’s getting at that more general intuition.

As a side point, I'm also more skeptical than Erik is about how much progress we'll get from further scaling, given that we may be running out of the kind of data these models train on: https://www.lesswrong.com/posts/6Fpvch8RR29qLEWNH/chinchilla-s-wild-implications

Sep 21, 2022 · Liked by Erik Hoel

What conclusions would an alien Gary Marcus come to about human cognition if all they knew of it was that it produced all these after the prompt "draw a bicycle"?

https://www.washingtonpost.com/news/wonk/wp/2016/04/18/the-horrible-hilarious-pictures-you-get-when-you-ask-random-people-to-draw-a-bicycle/


From the very first paragraph, I thought, “Is this Erik Hoel, or a GPT-3 bot that was prompted by Erik Hoel to write Hoel-esque thoughts?”

I am expecting, one day soon, an AI preacher, and soon after that perhaps, an AI “Jesus” (liberal and conservative versions, of course). In the latter case, the AI will presumably ingest the Bible, particularly the red-letter sections, supplemented by associated religious writings across almost two millennia, to create a convincing semblance. Miracles to follow. The future will be strange and disturbing.

Perhaps, rather than more Turing tests, which investigate whether a person can distinguish AI from another person, we need a test to see whether a person can fall in love with AI (full well knowing it is AI), and mourn when the AI dies (again knowing it is AI), with the same poignancy and pain as they would for a real person. I expect that in many cases, the answer will be yes. And that will tell us something frightening, not about AI, but about us.

Perhaps what we need to be training is not AI, but human minds, to think comprehensively, and not just emotionally.


Good post!

I find myself somewhat conflicted on the issue. On the one hand, I'm sympathetic to the skeptical stance in general. I think the "clever Hans" problem is a real one––in fact, there were well-documented cases of previous LLMs like BERT performing >90% on a natural language inference task, only to find that the task's items basically baked in the answer via more superficial linguistic cues.
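To make the "clever Hans" worry concrete, here is a toy sketch of a hypothesis-only shortcut. The miniature dataset and the negation-cue rule are invented purely for illustration (not taken from any real benchmark); the point is just that a classifier which never reads the premise can still look accurate when the labels leak through superficial cues.

```python
# Toy illustration of annotation artifacts: a "model" that ignores the
# premise entirely and classifies from a superficial cue in the hypothesis.
# Dataset and cue rule are invented for illustration only.

TOY_NLI_ITEMS = [
    # (premise, hypothesis, gold_label)
    ("A man is playing a guitar.", "A man is not playing anything.", "contradiction"),
    ("A dog runs across the field.", "An animal is outdoors.", "entailment"),
    ("Two kids are on a swing.", "Nobody is at the playground.", "contradiction"),
    ("A woman reads a book.", "A person is reading.", "entailment"),
]

NEGATION_CUES = {"not", "no", "nobody", "nothing", "never"}

def hypothesis_only_predict(hypothesis: str) -> str:
    """Predict a label from the hypothesis alone, never looking at the premise."""
    words = set(hypothesis.lower().rstrip(".").split())
    return "contradiction" if words & NEGATION_CUES else "entailment"

correct = sum(
    hypothesis_only_predict(hyp) == gold for _premise, hyp, gold in TOY_NLI_ITEMS
)
print(f"Hypothesis-only accuracy: {correct / len(TOY_NLI_ITEMS):.0%}")
# High accuracy here says nothing about understanding the premise,
# which is exactly the worry with artifact-laden benchmarks.
```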

On the other hand, it's undeniable that these models have made incredible progress, and I'm not as sympathetic to what I see as widespread a priori assumptions that models like GPT-3 simply *can't* understand language (and therefore any demonstration of "understanding" is a kind of reductio ad absurdum of the task itself). In general it feels like there's a lack of rigorous philosophical work aimed at discovery as opposed to position-taking.

With that said, I'll put in a shameless plug for a pre-print I recently released: https://arxiv.org/abs/2209.01515

In it, my co-authors and I compared GPT-3's performance on a False Belief task to humans. The central question is where Theory of Mind comes from *in humans*, and the paper is written and framed that way (i.e., to what extent does language alone explain ToM). But beneath that result you'll find that––even though GPT-3 is not quite at human level––it's performing reasonably above chance. The study and its analyses were entirely pre-registered, and the stimuli were carefully controlled to try to omit potential confounds. (Impossible to know for sure, but there are very few differences across conditions other than the key experimental manipulation). Does this mean that GPT-3 has the beginnings of something like ToM? We don't take a strong stance on it in the paper, but I think more and more that we need to start asking ourselves hard questions like that––and about the link between experimental instruments and the constructs they're designed to test.

Anyway, I thought it was potentially relevant to your post and might be of interest to you!


Whenever I read these debates, I am always left wondering exactly what the claim is, and exactly what the criticism of the claim is.

AIs of this sort are pastiche generators. That is what they are. They respond to prompts. They are, in many ways, remarkable pastiche generators, if often quirky. But what is the claim? Does it go beyond saying, wow, look at what a great pastiche generator we built? Does the critique go beyond saying, nah, your pastiche generator ain't so great?

I ask this because if there is one thing that the world is not short of, it's artistic and literary pastiche. The stuff is literally being given away in vast quantities. The world is not suffering from a pastiche shortage. Why bother to automate its production?

Is it to prove a point about something else? Is this really all about proving that human beings are just automatons by building an automaton that is indistinguishable from a human being? Is the claim that humans are just robots because soon we will be able to build a robot that is just as human as a human? (Not exactly a novel concept in sci-fi.) And if so, is the critique that the robots are not really very lifelike at all (because, implicitly, human beings are not automatons, but are possessed of a divine spark)?

If that is the critique, then I'm not sure that it is all that different from your own conclusion that AI is not human-like and thus is not a philosophical challenge to human status, and therefore not a denial of the divine spark. Except that you would seem to be saying that that was never the claim in the first place and therefore it was never necessary to refute that claim.

In other words is this really all about saying that the question of whether Pinocchio is a real boy is irrelevant because all anybody claimed to be doing was building a really cool marionette?


Fascinating. Appreciate the context on this debate, and on all the conflations and the goalposts!


This is excellent! It also articulates a few of the points I was writing about for next week, on the Wittgensteinian language games within AI discourse, and much better than I have yet.

The idea that we can circumscribe the realities of the world, or indeed experience, to a few fuzzy categories that are neither well understood nor adequately explored, makes most of the conversations regarding this topic a fight about definitions.

Oct 3, 2022 · Liked by Erik Hoel

> Humans think about what they’re going to write or paint, and it takes them a long time,

> and they really try to get it correct on the first go. Throw all that out the window for AIs.

This, I think, is a really important point. When a human creates something, we plan ahead, write, revise, and iterate. The latest diffusion models actually do that, to some extent, with images; although "denoising" is different from revising. However, large language models generate their answers left-to-right in one shot. For a task like programming, this is an impossibly high bar. No human could write correct code of any length like that; a single missed semicolon would lead to instant failure.

The important thing to realize is that the inability to plan and revise is not a fundamental failure of deep learning; it is a flaw in the way our models are currently trained. Over the next few years, I fully expect that changes to architecture and/or training will fix that flaw. And when these models are able to iterate and fix their mistakes, it will be much easier to see what they are really capable of.

(Note that actor/critic RL models and GANs already have a "critic" that judges the likelihood (quality) of their own outputs, and LLMs return likelihood as well. AlphaGo has demonstrated how to use Monte Carlo tree search to do planning, and training tasks like infilling and masked language modeling show how to repair broken inputs. Thus, all of the pieces would seem to be in place for sophisticated architectures based on planning and revising.)
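As a rough sketch of the control flow being described, here is a minimal generate-critique-revise loop. The generator, critic, and reviser are passed in as plain callables and the lambdas at the bottom are toy stand-ins; none of these names refer to a real model API.

```python
# Minimal sketch of a generate -> critique -> revise loop.
# All model calls are stubbed out as placeholder callables.

from typing import Callable

def iterative_refine(
    prompt: str,
    generate: Callable[[str], str],
    score: Callable[[str, str], float],    # critic: higher means better
    revise: Callable[[str, str], str],     # (prompt, draft) -> improved draft
    max_rounds: int = 5,
    good_enough: float = 0.9,
) -> str:
    """Draft once, then let the critic drive repeated revisions."""
    draft = generate(prompt)
    for _ in range(max_rounds):
        if score(prompt, draft) >= good_enough:
            break                        # critic is satisfied; stop revising
        draft = revise(prompt, draft)    # otherwise produce a corrected draft
    return draft

# Toy stand-ins so the sketch runs end to end: the "critic" only checks for a
# trailing period, and the "reviser" appends one.
print(iterative_refine(
    prompt="Write one sentence about horses",
    generate=lambda p: "Horses are large, social grazing animals",
    score=lambda p, d: 1.0 if d.endswith(".") else 0.0,
    revise=lambda p, d: d + ".",
))
```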

Sep 26, 2022 · Liked by Erik Hoel

I am of two minds on this. On the one hand, we have made stunning strides in this last decade... downplaying this is very clearly moving goalposts.

Still, anyone who thinks that more of the same will yield AGI is also mistaken, I think. Present DL systems are roughly like human unconscious or "instant" thinking. It is very stunning how (like humans) these algorithms seem to pull many kinds of information into intelligent-looking outputs.

It's not a parlor trick... it is (very roughly) like human subconscious thought.

Still, it lacks the subsequent deliberative thinking that humans can apply to the raw materials provided by their subconscious.

But ten years ago we did not have the human-like generativity that we have now. Not even close. Who is to say that in ten more years we won't add a new class of deliberation on top of the subconscious outputs of today's DL? It seems more than plausible to me; it seems likely. And THEN these systems will be critiquing their own thinking, and the comparison to humans will be far, far more even.

The idea that we are not making vast strides is just ridiculous, even if we are not yet there.


As a recent entrant to these conversations, I really have to sift through a lot of different AI views. Reading Gary Marcus' essays, I do find that he often talks about the lack of regulation that comes with AI technology, due simply to a lack of knowledge. I think that is probably a question that needs more attention than the focus on replicating human intelligence.

Sep 21, 2022 · Liked by Erik Hoel

What about radiologists being replaced by AI? We were told that was imminent 10 years ago.


Brilliant Erik. I’m not a partisan in this debate, but your approach is very useful.

Sep 21, 2022 · Liked by Erik Hoel

This is a nicely done multi-pronged analysis, and the replies cover a lot of ground (my phone just put in the word "ground" on its own), so just a few thoughts...

Re intellectual capability vs. human brains, maybe consider an industrial machine... there is no human that can recall or calculate as well as a computer, but big deal, because no human can drill an oil well on their own. It's a big dumb task that we solved by imagining and creating the equipment. I think AI achievements mimic this process.

Maybe a new way to conceive of human-like intellectual behaviour is twigging. It's a mini eureka with utterly minimal input. Hey, I get it!

On self-driving cars...

Humans have the advantage of being trained over millions of years. But AI has the advantage of being trained on millions of TB of data and maybe by millions of scientists. Right now it does not seem that AI will ever drive a car at L5, i.e. all conditions including winter and city and off-road and emergencies, like I could when I was just a kid. Yes, AI can be astonishing, but compared to the little training that people can leverage into high-level capability, I'm very unsure how AI stacks up.


Well, great. Thanks for dispelling my wishful thinking that if I just ignored the obvious threats from AI, it wouldn’t become a problem.

Also, incredible flourish at the conclusion... corporations always exploring new ways to generate alien minds...
