On Mon, Feb 10, 2025 at 10:32 PM Matt Mahoney <mattmahone...@gmail.com> wrote:
> On Sun, Feb 9, 2025, 9:01 PM Rob Freeman <chaotic.langu...@gmail.com> wrote:
>
> ...
>
>> But re. your statement for the reason NNs have been so successful
>> developing language models being because "Language evolved to be learnable
>> by neural networks one layer at a time, segmentation first, then vocabulary
>> ... then semantics, then grammar." I dispute that. You're giving people a
>> bum steer if they read that and imagine it is "settled science". Not least
>> the hierarchical attribution of grammar over semantics (possibly influenced
>> by cognitivist/functionalist linguistic dogma? Or is that still a Hutter
>> Prize idea, that semantics will provide a minimal representation for
>> grammar?)
>>
>
> You need to understand words before you can parse sentences. How do you
> parse the following?
>
> I ate pizza with pepperoni.
> I ate pizza with a fork.
> I ate pizza with Bob.
>
> Children learn semantics before grammar. They start to learn the meanings
> of words at age 1. They learn content words like "ball" or "milk" before
> high frequency function words like "the" or "of". They learn sentences
> around age 2-3. That's more training data.
>
> To understand whale semantics, you have to observe how their behavior
> associates with different calls. Whales may have different languages. We
> don't know. Lexical structure (segmentation and Zipf's law) is language
> independent but semantics is not.
>

Lots of information surely goes into what we call a "parse" for natural language. If such a thing as a "parse" can really be said to exist as more than a multiplicity of perspectives.

Noting salience doesn't establish a hierarchy. More a heterarchy (in concordance with the multiplicity theme.)

For instance, the classic reference on formulaicity, formulaicity being idiosyncratic structure (idiosyncratic meaning beyond formalism), Pawley and Syder, emphasizes the independence of syntax and meaning.
By which I mean that they demonstrate language is formulaic beyond mere meaning restriction (e.g. why do we say "ham and eggs" and not "eggs and ham"? What is the meaning difference?) And indeed that language is formulaic beyond any formal syntactic restriction anyone can find either. Formulaic beyond both semantic and syntactic formalism.

I recall Chomsky's classic argument that syntax is independent of meaning was his "meaningless" sentence which nevertheless assumes a structure: "Colourless green ideas sleep furiously".

But obviously there are arguments all ways. Because there is no agreement. (On anything. To parse your examples above there might be 50 formalisms, syntactic and semantic, none of which disambiguate universally.)

For debates about whether "meaning" (undefined) is fundamental, or syntax (undefined) is fundamental, I recall the "Linguistics Wars" of the early '70s might be the most heated example. I think since then the different schools have given up trying to talk to each other.

Formal linguistics is still divided into schools which squabble irreconcilably with each other about this. Broadly the "functionalist" school, the "cognitivist" school, and the "generativist" school. Functionalists will argue that function is fundamental, cognitivists that meaning is fundamental, generativists that (universal) grammar/syntax is fundamental.

The cognitivists and functionalists are most friendly to each other. But only because they both tended in a humanist direction, tending to emphasize idiosyncrasy over formalism.

None of these schools have any impact on contemporary machine learning dogma. Despite "embeddings", the foundation of contemporary LLMs, being essentially the same distributional analysis pursued (and then dropped) by theoretical linguistics in the 1930s or earlier.

Certainly there's no agreement on hierarchy. And in LLMs you don't see one (at best you see many, many...)

> How would we find out without more data?
>

The answer to that is the same for all science, with better theory.

-Rob

------------------------------------------
Artificial General Intelligence List: AGI
Permalink: https://agi.topicbox.com/groups/agi/T73fe79f7d09a903a-M66d7456f40370334a6f3e106