Good morning Daniela,

Daniela Tafani <daniela.taf...@unipi.it> writes:
> Can Large Language Models Reason and Plan?
> Subbarao Kambhampati
> [...]
> https://nyaspubs.onlinelibrary.wiley.com/doi/10.1111/nyas.15125
> or
> https://arxiv.org/pdf/2403.04121.pdf

Thanks for the pointer. What is the scientific definition of "reasoning"? :-D

The author writes:

--8<---------------cut here---------------start------------->8---
Let us pause to note that my interest here is not whether LLMs can fake reasoning (by giving correct answers to reasoning tasks from memory and pattern finding), but whether they can actually do principled reasoning. Of course, seeing patterns in reasoning problems is not anything to be sneezed at. After all, our interest in mastering it is what is behind much of "street fighting" math (e.g. George Pólya's "How to Solve It"). But finding approximate shortcuts over provably correct reasoning procedures is obviously not equivalent to doing reasoning–unless you have an ability to establish from first principles that your hunch is actually correct. It is challenging to decide whether a system (or a human, for that matter) is memorizing or solving a problem from scratch–especially as the systems (or humans) get trained on larger and larger "question banks." This is a challenge that most instructors and interviewers are acutely aware of. Think of that infamous "Why are manhole covers round?" interview question. While it may well have given the interviewer an insight into the candidate's analytical reasoning skills the very first time it was asked, all it does with high probability now is to confirm whether the candidate trained on the interview question banks!
--8<---------------cut here---------------end--------------->8---

...don't tell me that, in order to decide whether a system can reason, that system needs self-reflection!?! :-D (Did I get lucky because I remembered correctly, or did I actually understand the argument?)

Uh, so LLM systems _very_ probably do _not_ know how to reason: so what?!?

--8<---------------cut here---------------start------------->8---
While the foregoing questions the claims that LLMs are capable of planning/reasoning, it is not meant to imply that LLMs don't have any constructive roles to play in solving planning/reasoning tasks. In particular, their uncanny ability to generate ideas/potential candidate solutions–albeit with no guarantees about those guesses–can still be valuable in the "LLM-Modulo" setups [6], in conjunction with either model-based verifiers or expert humans in the loop. The trick to avoiding ascribing autonomous reasoning capabilities to LLMs is to recognize that LLMs are generating potential answers that still need to be checked by external verifiers.
--8<---------------cut here---------------end--------------->8---

So maybe, just maybe, humans risk still being necessary?!?

--8<---------------cut here---------------start------------->8---
LLMs can be a rich source of approximate models of world/domain dynamics and user preferences, as long as the humans (and any specialized critics) in the loop verify and refine the models, and give them over to model-based solvers. This way of using LLMs has the advantage that the humans need only be present when the dynamics/preference model is being teased out and refined, with the actual planning after that being left to sound frameworks with correctness guarantees [6].
--8<---------------cut here---------------end--------------->8---

Hey look, that /stuff/ described above looks an awful lot like _programming_ :-O, surely a stochastic coincidence! Let me explain...

If I have not misunderstood, the model-based solvers are the /things/ that act as "external verifiers" in the LLM-Modulo Frameworks:

https://arxiv.org/abs/2402.01817
«LLMs Can't Plan, But Can Help Planning in LLM-Modulo Frameworks»

Again, if I have not misunderstood, the "external model-based verifiers" are "symbolic solvers", that is, *programs* written for CAS systems

https://en.wikipedia.org/wiki/Computer_algebra_system

(about which the Wikipedia article states: «the result of a computation commonly has an unpredictable form and an unpredictable size; therefore user intervention is frequently needed»)

...or analogous symbolic systems.

As long as there is programming, there is hope. Have I misunderstood?

[...]
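To make concrete what I have in mind (my own reading, not the paper's actual setup: propose_candidates is a made-up stub standing in for an LLM call, and the sympy check stands in for a CAS / model-based solver), here is a minimal sketch of that generate-and-verify division of labour:

--8<---------------cut here---------------start------------->8---
# Minimal sketch of an LLM-Modulo-style "generate, then verify" loop.
# The "LLM" here is only a stub emitting plausible but unguaranteed
# guesses; the external, sound verifier is a symbolic check with sympy.

import sympy as sp

x = sp.symbols("x")
problem = sp.Eq(x**2 - 5*x + 6, 0)  # toy "reasoning task"

def propose_candidates(prompt, n=5):
    """Placeholder for an LLM call: plausible guesses, no guarantees."""
    return [1, 2, 3, 4, 6][:n]

def verify(candidate):
    """External verifier: substitute and check symbolically."""
    return sp.simplify(problem.lhs.subs(x, candidate) - problem.rhs) == 0

accepted = [c for c in propose_candidates("solve x^2 - 5x + 6 = 0")
            if verify(c)]
print(accepted)  # -> [2, 3]: only verifier-certified candidates survive
--8<---------------cut here---------------end--------------->8---

Obviously a toy: if I read the paper correctly, in the real LLM-Modulo loop the critics' feedback also goes back to the LLM for another round of guessing, but the division of labour stays the same: the LLM guesses, the program decides.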
Regards,
380°

P.S.: the "AI-flavoured" reprise of the very famous figure «Figure 2. Claimed reasoning capabilities of LLMs are sometimes due to the subconscious helpful iterative prompting by the humans in the loop (graphic adapted from https://xkcd.com/2347/ under Creative Commons License)» (p. 3 of the PDF) is a stroke of genius!

--
380° (Giovanni Biscuolo public alter ego)

«We, incompetent as we are, have no standing to suggest anything whatsoever»

Disinformation flourishes because many people care deeply about injustice but very few check the facts. Ask me about <https://stallmansupport.org>.