On Tue, 11 Oct 2022 at 18:12, <avi.e.gr...@gmail.com> wrote:
>
> Thanks for a rather detailed explanation of some of what we have been
> discussing, Chris. The overall outline is about what I assumed was there but
> some of the details were, to put it politely, fuzzy.
>
> I see resemblances to something like how a web page is loaded and operated.
> I mean very different but at some level not so much.
>
> I mean a typical web page is read in as HTML with various keyword regions
> expected such as <BODY> ... </BODY> or <DIV ...> ... </DIV> with things
> often cleanly nested in others. The browser makes nodes galore in some kind
> of tree format with an assortment of objects whose attributes or methods
> represent aspects of what it sees. The resulting treelike structure has
> names like DOM.

Yes. The basic idea of "tokenize, parse, compile" can be used for
pretty much any language - even English, although its grammar is a bit
more convoluted than most programming languages, with many weird
backward compatibility features! I'll parse your last sentence above:

LETTERS The
SPACE
LETTERS resulting
SPACE
... you get the idea
LETTERS like
SPACE
LETTERS DOM
FULLSTOP # or call this token PERIOD if you're American

Now, we can group those tokens into meaningful sets.

Sentence(type=Statement,
    subject=Noun(name="structure", addenda=[
        Article(type=The),
        Adjective(name="treelike"),
    ]),
    verb=Verb(type=Being, name="has", addenda=[]),
    object=Noun(name="name", plural=True, addenda=[
        Adjective(phrase=Phrase(verb=Verb(name="like"), object=Noun(name="DOM"))),
    ]),
)

Grammar nerds will probably dispute some of the awful shorthanding I
did here, but I didn't want to devise thousands of AST nodes just for
this :)

> To a certain approximation, this tree starts a certain way but is regularly
> being manipulated (or perhaps a copy is) as it regularly is looked at to see
> how to display it on the screen at the moment based on the current tree
> contents and another set of rules in Cascading Style Sheets.

Yep; the DOM tree is initialized from the HTML (usually - it's
possible to start a fresh tree with no HTML) and then can be
manipulated afterwards.

> These are not at all the same thing but share a certain set of ideas and
> methods and can be very powerful as things interact.

Oh absolutely. That's why there are languages designed to help you
define other languages.

> In effect the errors in the web situation have such analogies too as in what
> happens if a region of HTML is not well-formed or uses a keyword not
> recognized.

Aaaaand they're horribly horribly messy, due to a few decades of
sloppy HTML programmers and the desire to still display the page even
if things are messed up :) But, again, there's a huge difference
between syntactic errors (like omitting a matching angle bracket) and
semantic errors (a keyword not known, like using <spam> when you
should have used <span>). In the latter case, you can still build a
DOM tree, but you have an unknown element; in the former case, you
have to guess at what the author meant, just to get anything going at
all.

> There was a guy around a few years ago who suggested he would create a
> system where you could create a series of some kind of configuration files
> for ANY language and his system would then compile or run programs for each
> and every such language? Was that on this forum? What ever happened to him?

That was indeed on this forum, and I have no idea what happened to
him. Maybe he realised that all he'd invented was the Unix shebang?

ChrisA
--
https://mail.python.org/mailman/listinfo/python-list
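
For anyone who wants to play with the "tokenize, then group" idea from
the reply above, here is a minimal Python sketch of it. The names
Token, TOKEN_SPEC, tokenize and split_sentences are made up for this
example and do not come from any library. The grouping step only
collects words up to a full stop, so it is far cruder than the
Sentence/Noun/Verb tree and should be read as a stand-in for a real
parser, not as one.

```python
import re
from typing import Iterator, List, NamedTuple


class Token(NamedTuple):
    kind: str  # "LETTERS", "SPACE" or "FULLSTOP"
    text: str


# Hypothetical token table: each entry pairs a token name with a regex.
TOKEN_SPEC = [
    ("LETTERS", r"[A-Za-z]+"),
    ("SPACE", r"\s+"),
    ("FULLSTOP", r"\."),
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def tokenize(text: str) -> Iterator[Token]:
    """Yield tokens left to right; fail on a character no rule matches."""
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(
                f"unexpected character {text[pos]!r} at offset {pos}"
            )
        # lastgroup is the name of the alternative that matched.
        yield Token(match.lastgroup, match.group())
        pos = match.end()


def split_sentences(tokens) -> List[List[str]]:
    """Group LETTERS tokens into word lists, closing a group at FULLSTOP."""
    sentences, words = [], []
    for tok in tokens:
        if tok.kind == "LETTERS":
            words.append(tok.text)
        elif tok.kind == "FULLSTOP":
            sentences.append(words)
            words = []
        # SPACE tokens only separate words, so they are dropped here.
    if words:
        raise SyntaxError("sentence is missing its full stop")
    return sentences


if __name__ == "__main__":
    sentence = "The resulting treelike structure has names like DOM."
    for token in tokenize(sentence):
        print(token.kind, repr(token.text))
    print(split_sentences(tokenize(sentence)))
```

The two SyntaxError cases are both raised while reading the input, the
first by the tokenizer and the second by the grouping step. Neither is
the "semantic" kind of error discussed above: an unknown word such as
"spam" still tokenizes and groups without complaint.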