Good morning Daniela,

Daniela Tafani <daniela.taf...@unipi.it> writes:

> Can Large Language Models Reason and Plan?
> Subbarao Kambhampati

[...]

> https://nyaspubs.onlinelibrary.wiley.com/doi/10.1111/nyas.15125
> or
> https://arxiv.org/pdf/2403.04121.pdf

thanks for the pointer.

What is the scientific definition of "reasoning"? :-D

The author writes:

--8<---------------cut here---------------start------------->8---

Let us pause to note that my interest here is not whether LLMs can
fake reasoning (by giving correct answers to reasoning tasks from
memory and pattern finding), but whether they can actually do
principled reasoning. Of course, seeing patterns in reasoning
problems is not anything to be sneezed at. After all, our interest
in mastering it is what is behind much of “street fighting” math
(e.g. George Pólya’s “How to Solve It”). But finding approximate
shortcuts over provably correct reasoning procedures is obviously
not equivalent to doing reasoning–unless you have an ability to
establish from first principles that your hunch is actually correct.
It is challenging to decide whether a system (or a human, for that
matter) is memorizing or solving a problem from scratch–especially
as the systems (or humans) get trained on larger and larger
“question banks.” This is a challenge that most instructors and
interviewers are acutely aware of. Think of that infamous “Why are
manhole covers round?” interview question. While it may well have
given the interviewer an insight into the candidate’s analytical
reasoning skills the very first time it was asked, all it does with
high probability now is to confirm whether the candidate trained on
the interview question banks!

--8<---------------cut here---------------end--------------->8---

...don't tell me that, to decide whether a system can reason, that
system needs to be capable of self-reflection!?! :-D
(was I lucky because I remembered it correctly, or did I actually
understand the argument?)

Uh, so _very_ probably LLM systems do _not_ know how to reason: so
what?!?

--8<---------------cut here---------------start------------->8---

While the foregoing questions the claims that LLMs are capable of
planning/reasoning, it is not meant to imply that LLMs don’t have
any constructive roles to play in solving planning/reasoning tasks.
In particular, their uncanny ability to generate ideas/potential
candidate solutions–albeit with no guarantees about those
guesses–can still be valuable in the “LLM-Modulo” setups [6], in
conjunction with either model-based verifiers or expert humans in
the loop. The trick to avoiding ascribing autonomous reasoning
capabilities to LLMs is to recognize that LLMs are generating
potential answers that still need to be checked by external
verifiers.

--8<---------------cut here---------------end--------------->8---
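
If I had to sketch that generate-and-check loop in code (a toy
sketch of my own, not the paper's: the hard-coded "plans" stand in
for the LLM, and the lambda stands in for the model-based verifier),
it would look more or less like this:

--8<---------------cut here---------------start------------->8---

import itertools
from typing import Callable, Iterable, Optional

def llm_modulo(candidates: Iterable[str],
               verify: Callable[[str], bool],
               budget: int = 10) -> Optional[str]:
    """Return the first candidate the external verifier accepts."""
    for guess in itertools.islice(candidates, budget):
        if verify(guess):  # soundness lives here, not in the generator
            return guess
    return None            # budget exhausted, no verified answer

# Toy usage: "plans" are strings, the verifier enforces a hard constraint.
plans = ["drop;pick", "pick;move;drop", "move;pick;drop"]
print(llm_modulo(plans, lambda p: p.startswith("pick") and p.endswith("drop")))
# -> pick;move;drop

--8<---------------cut here---------------end--------------->8---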

So maybe, just maybe, humans are at risk of still being
necessary?!?

--8<---------------cut here---------------start------------->8---

LLMs can be a rich source of approximate models of world/domain dynamics
and user preferences, as long as the humans (and any specialized
critics) in the loop verify and refine the models, and give them over to
model-based solvers. This way of using LLMs has the advantage that the
humans need only be present when the dynamics/preference model is being
teased out and refined, with the actual planning after that being left
to sound frameworks with correctness guarantees [6].

--8<---------------cut here---------------end--------------->8---
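
To see the division of labor in that quote, here is a sketch of my
own (the hand-written transition model stands in for LLM output that
a human in the loop has already verified; the solver is a plain
breadth-first search, i.e. something with actual guarantees):

--8<---------------cut here---------------start------------->8---

from collections import deque
from typing import Optional

# "Approximate world model": state -> {action: next_state}
model = {
    "at_home":     {"walk": "at_bus_stop"},
    "at_bus_stop": {"bus": "at_work", "walk": "at_home"},
    "at_work":     {},
}

def plan(start: str, goal: str) -> Optional[list]:
    """Breadth-first search: sound, complete, optimal in step count."""
    queue, seen = deque([(start, [])]), {start}
    while queue:
        state, actions = queue.popleft()
        if state == goal:
            return actions
        for action, nxt in model[state].items():
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, actions + [action]))
    return None

print(plan("at_home", "at_work"))  # -> ['walk', 'bus']

--8<---------------cut here---------------end--------------->8---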

Uh look, that /stuff/ described above looks an awful lot like
_programming_ :-O, it must be a stochastic coincidence!

Let me explain...

If I understood correctly, the model-based solvers are the /things/
that serve as "external verifiers" in the LLM-Modulo frameworks:
https://arxiv.org/abs/2402.01817
«LLMs Can't Plan, But Can Help Planning in LLM-Modulo Frameworks»

Again, if I understood correctly, the "external model-based
verifiers" are "symbolic solvers", i.e. *programs* written for CAS
systems
https://en.wikipedia.org/wiki/Computer_algebra_system
(about which the Wikipedia article states: «the result of a
computation commonly has an unpredictable form and an unpredictable
size; therefore user intervention is frequently needed»)

...or analogous symbolic systems.
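
To make that concrete, a toy of my own (sympy plays the CAS here, my
choice for the example; the hard-coded guesses stand in for LLM
output):

--8<---------------cut here---------------start------------->8---

import sympy as sp

x = sp.symbols("x")
lhs = x**2 - 5*x + 6  # problem: solve x^2 - 5x + 6 = 0

def verify_root(guess) -> bool:
    # Substitution plus simplification: a check with correctness
    # guarantees, no matter how the guess was produced.
    return sp.simplify(lhs.subs(x, guess)) == 0

for guess in [1, 2, 4, 3]:            # candidate answers, LLM-style
    print(guess, verify_root(guess))  # only 2 and 3 pass

--8<---------------cut here---------------end--------------->8---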

As long as there is programming, there is hope.

Did I get it wrong?

[...]

Regards, 380°


P.S.: the "AI-flavored" revival of the very famous figure
«Figure 2. Claimed reasoning capabilities of LLMs are sometimes
due to the subconscious helpful iterative prompting by the humans
in the loop (graphic adapted from https://xkcd.com/2347/ under
Creative Commons License)» (page 3 of the PDF) is a stroke of genius!

-- 
380° (Giovanni Biscuolo public alter ego)

«We, incompetent as we are,
 have no standing to suggest anything whatsoever»

Disinformation flourishes because many people care deeply about injustice
but very few check the facts.  Ask me about <https://stallmansupport.org>.
