Daniela Tafani <daniela.taf...@unipi.it> writes:

> Melanie Mitchell, Can Large Language Models reason?
>
> https://aiguide.substack.com/p/can-large-language-models-reason

also available here:
https://web.archive.org/web/20230911043145/https://aiguide.substack.com/p/can-large-language-models-reason
and here: https://archive.ph/TVsNy

> (I'm not copying it below because the images and formatting matter)

yes, but a bit of text could still be useful, both for archival purposes
and to help subscribers decide whether the article is worth a closer look

--8<---------------cut here---------------start------------->8---
[...]

Reasoning is a central aspect of human intelligence, and robust
domain-independent reasoning abilities have long been a key goal for AI
systems. While large language models (LLMs) are not explicitly trained
to reason, they have exhibited "emergent" behaviors that sometimes look
like reasoning. But are these behaviors actually driven by true abstract
reasoning abilities, or by some less robust and generalizable mechanism,
for example memorizing their training data and later matching patterns
in a given problem to those found in the training data?

Why does this matter? If robust general-purpose reasoning abilities have
emerged in LLMs, this bolsters the claim that such systems are an
important step on the way to trustworthy general intelligence. On the
other hand, if LLMs rely primarily on memorization and pattern-matching
rather than true reasoning, then they will not be generalizable: we
can't trust them to perform well on "out of distribution" tasks, those
that are not sufficiently similar to tasks they have seen in the
training data.

* What Is "Reasoning"?

The word "reasoning" is an umbrella term that covers deduction,
induction, abduction, analogy, common sense, and other "rational" or
systematic methods for solving problems. Reasoning is often a process
that involves composing multiple steps of inference, and it is typically
thought to require abstraction: the capacity to reason is not limited to
a particular example, but is more general. If I can reason about
addition, I can solve not only 23+37 but any addition problem that comes
my way. If I learn to add in base 10 and also learn about other number
bases, my reasoning abilities allow me to quickly learn to add in any
other base.
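To make this base-independence concrete, here is a minimal sketch (mine,
not from the article) of the carry procedure written once and
parameterized by the base; the digit-list representation, least
significant digit first, is an assumption chosen for brevity:

    def add_in_base(x, y, base):
        """Add two digit lists (least significant digit first) in any base >= 2."""
        result, carry = [], 0
        for i in range(max(len(x), len(y))):
            d = (x[i] if i < len(x) else 0) + (y[i] if i < len(y) else 0) + carry
            result.append(d % base)   # digit at this position
            carry = d // base         # carry into the next position
        if carry:
            result.append(carry)
        return result

    # 23 + 37 = 60 in base 10 (23 is written [3, 2], least significant first)
    assert add_in_base([3, 2], [7, 3], 10) == [0, 6]
    # the identical procedure gives 23 + 37 = 62 in base 8
    assert add_in_base([3, 2], [7, 3], 8) == [2, 6]

Nothing in the procedure mentions base 10; having it is exactly the kind
of content-independent ability the article calls abstraction.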
* "Chain of Thought" Reasoning in LLMs

In the last few years, there has been a deluge of papers making claims
for reasoning abilities in LLMs (Huang & Chang give one recent survey).
One of the most influential such papers (by Wei et al. from Google
Research) proposed that so-called "Chain of Thought" (CoT) prompting
elicits sophisticated reasoning abilities in these models. A CoT prompt
gives one or more examples of a problem and the reasoning steps needed
to solve it, and then poses a new problem.
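As a minimal sketch (again not from the article), a CoT prompt can be as
simple as the following Python string; the word problem is invented here
for illustration, in the spirit of the arithmetic examples in Wei et al.:

    # One worked example (question, reasoning steps, answer), followed by
    # a new question; the hope is that the model imitates the steps.
    COT_PROMPT = """\
    Q: A baker makes 4 trays of 6 muffins and sells 9 of them.
       How many muffins are left?
    A: 4 trays of 6 muffins is 4 * 6 = 24 muffins.
       After selling 9, 24 - 9 = 15 muffins are left. The answer is 15.

    Q: A library has 3 shelves of 12 books and receives 7 more books.
       How many books does it have now?
    A:"""

Removing the worked example leaves a standard prompt; the CoT variant
differs only in demonstrating intermediate reasoning steps before the
answer.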
[...]

* Are the Reasoning Steps Generated Under CoT Prompting Faithful to the
  Actual Reasoning Process?

[...]

* If LLMs Are Not Reasoning, What Are They Doing?

[...]

* The Trickiness of Evaluating LLMs for General Abilities

[...]

[...] One might argue that humans also rely on memorization and
pattern-matching when performing reasoning tasks. Many psychological
studies have shown that people are better at reasoning about familiar
than about unfamiliar situations; one group of AI researchers argued
that the same patterns of "content effects" affect both humans and LLMs.
However, it is also known that humans are (at least in some cases)
capable of abstract, content-independent reasoning when given the time
and incentive to do so, and moreover that we can adapt what we have
learned to wholly new situations.

Whether LLMs have such general abstract-reasoning capacities, elicited
through prompting tricks, scratchpads, or other external enhancements,
still needs to be systematically demonstrated.
--8<---------------cut here---------------end--------------->8---

Best regards, 380°

--
380° (Giovanni Biscuolo public alter ego)

«We, incompetent as we are, have no standing to suggest anything»

Disinformation flourishes because many people care deeply about
injustice but very few check the facts. Ask me about
<https://stallmansupport.org>.