Re: Parsing a language with optional spaces
On 7/7/20 05:35, Akim Demaille wrote:
> I believe you need to read again the documentation of /
>   'r/s'

It is not as simple as that. As I don't speak BASIC, let me rephrase this
problem in FORTRAN IV, which is also "blank agnostic":

   DO <label> <variable> = <expression>, <expression> [, <expression>]

It is not until you reach the comma after the first expression that you know
whether the statement is the beginning of a loop or an assignment. And the
expression can contain commas in function calls, which defeats any trivial
lookahead scanning. E.g.,

   D O 17 6PQ R=FUN X(1 4, V 8)

is an assignment to variable DO176PQR. The function arguments can also be
expressions that contain function calls.

As you can see, this more or less defeats any attempt to write a lex scanner.
And you cannot just squeeze out all blanks in a front end, because "Hollerith
fields" can contain blanks that are significant (must remain).
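To make the ambiguity concrete, here is a minimal Python sketch of a pre-scan (not a lex rule) that decides between the two readings. The function name `is_do_loop` is hypothetical, and the sketch assumes the statement contains no Hollerith fields, so that dropping every blank is safe. It only reports a loop when it finds a comma outside all parentheses after a top-level `=`.

```python
def is_do_loop(stmt: str) -> bool:
    """Guess whether a blank-agnostic statement is a DO loop header.

    Assumption: no Hollerith fields, so all blanks can be dropped.
    """
    text = "".join(stmt.split()).upper()  # remove all whitespace
    if not text.startswith("DO"):
        return False
    depth = 0            # current parenthesis nesting level
    seen_equals = False  # have we passed a top-level '='?
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        elif ch == "=" and depth == 0:
            seen_equals = True
        elif ch == "," and depth == 0 and seen_equals:
            return True  # top-level comma after '=': loop bounds
    return False         # no such comma: treat as an assignment


print(is_do_loop("D O 17 6PQ R=FUN X(1 4, V 8)"))  # comma is inside parentheses
print(is_do_loop("DO 17 I = MAX(1, J), N"))         # comma at top level
```

The decision needs the whole expression after `=`, including nested parentheses, which is more than a fixed amount of lookahead.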
Re: Parsing a language with optional spaces
Hi,

> On 7 Jul 2020, at 10:55, John P. Hartmann wrote:
>
> On 7/7/20 05:35, Akim Demaille wrote:
>> I believe you need to read again the documentation of /
>>   'r/s'
>
> It is not as simple as that. As I don't speak BASIC, let me rephrase this
> problem in FORTRAN IV which is also "blank agnostic":
>
>    DO <label> <variable> = <expression>, <expression> [, <expression>]
>
> It is not until you reach the comma after the first expression that you know
> whether the statement is the beginning of a loop or it is an assignment. And
> the expression can contain commas in function calls, which defeats any
> trivial lookahead scanning. E.g.,
>
>    D O 17 6PQ R=FUN X(1 4, V 8)
>
> is an assignment to variable DO176PQR. The function arguments can also be
> expressions that contain function calls.
>
> As you can see, this more or less defeats any attempt to write a lex scanner.
> And you cannot just squeeze out all blanks in a front end because "Hollerith
> fields" can contain blanks that are significant (must remain).

Then you couple the squeeze-out-all-blanks approach with BEGIN/END %x regions?
https://lists.gnu.org/archive/html/help-bison/2020-07/msg00012.html

FBCC uses regions; sorry, I can't find proper documentation, but see
https://bellard.org//fbcc/

I'd rather shoot myself in the foot than use regions. I don't know if that
closes the loop on the elegance question, but the tool has been there forever.
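As a rough illustration of the "squeeze blanks, but treat Hollerith fields as a separate region" idea, here is a Python sketch of such a front end. It is not Flex start-condition code, and the function name `squeeze_blanks` is hypothetical. It relies on a simplifying assumption that would need checking against a real FORTRAN grammar: a Hollerith count is a digit run that does not directly follow a letter or digit (ignoring blanks), followed by `H`.

```python
def squeeze_blanks(stmt: str) -> str:
    """Drop blanks everywhere except inside Hollerith fields (nH + n chars).

    Heuristic assumption: a count only starts after a non-alphanumeric
    character such as '(' or ',' (blanks ignored).
    """
    out = []
    i, n = 0, len(stmt)
    while i < n:
        ch = stmt[i]
        if ch in " \t":
            i += 1                      # blank outside a field: drop it
            continue
        prev = out[-1][-1] if out else ""
        if ch.isdigit() and not prev.isalnum():
            # Collect a candidate count, skipping blanks between digits.
            j, digits = i, ""
            while j < n and (stmt[j].isdigit() or stmt[j] in " \t"):
                if stmt[j].isdigit():
                    digits += stmt[j]
                j += 1
            if j < n and stmt[j] in "Hh":
                count = int(digits)
                field = stmt[j + 1 : j + 1 + count]  # copied verbatim
                out.append(digits + "H" + field)
                i = j + 1 + count
                continue
        out.append(ch)
        i += 1
    return "".join(out)


print(squeeze_blanks("CALL  P(5HA B C, X 1)"))
print(squeeze_blanks("D O 17 6PQ R=FUN X(1 4, V 8)"))
```

The heuristic for spotting a count is the fragile part; in a scanner generator the same role would be played by entering a dedicated region once the count and `H` have been read.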
Re: Parsing a language with optional spaces
Hi John,

> On 7 Jul 2020, at 10:55, John P. Hartmann wrote:
>
> On 7/7/20 05:35, Akim Demaille wrote:
>> I believe you need to read again the documentation of /
>>   'r/s'
>
> It is not as simple as that.

Actually, the message you are quoting was really just an answer to Maury, for
BASIC.

> As I don't speak BASIC, let me rephrase this problem in FORTRAN IV which is
> also "blank agnostic":
>
>    DO <label> <variable> = <expression>, <expression> [, <expression>]
>
> It is not until you reach the comma after the first expression that you know
> whether the statement is the beginning of a loop or it is an assignment. And
> the expression can contain commas in function calls, which defeats any
> trivial lookahead scanning. E.g.,
>
>    D O 17 6PQ R=FUN X(1 4, V 8)
>
> is an assignment to variable DO176PQR. The function arguments can also be
> expressions that contain function calls.
>
> As you can see, this more or less defeats any attempt to write a lex scanner.
> And you cannot just squeeze out all blanks in a front end because "Hollerith
> fields" can contain blanks that are significant (must remain).

I still think you can address this case with Flex, but I agree it's going to
be painful. I would go for something like

   sp  [ \t]*
   do  D{sp}O
   id  [a-zA-Z]({sp}[a-zA-Z_0-9]+)*

etc. This is tedious.

In Vcsn I had implemented the "shuffle" operator, which would have been helpful
(https://www.lrde.epita.fr/dload/vcsn/latest/notebooks/expression.shuffle.html).
"Shuffle" is definitely a valid operator: the shuffling of rational languages
is a rational language, so it is mathematically sound.

Cheers!
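The three definitions above can be tried out quickly by transliterating them into Python's `re` syntax. This is only a sketch: Python's matcher is not Flex (there is no longest-match competition between rules), and the helper name `leading_identifier` is hypothetical.

```python
import re

# Transliteration of the Flex definitions sp, do and id.
SP = r"[ \t]*"
DO = rf"D{SP}O"
ID = rf"[a-zA-Z](?:{SP}[a-zA-Z_0-9]+)*"


def leading_identifier(stmt):
    """Return the identifier at the start of stmt with blanks removed."""
    m = re.match(ID, stmt)
    if m is None:
        return None
    return re.sub(r"[ \t]", "", m.group(0))


line = "D O 17 6PQ R=FUN X(1 4, V 8)"
print(leading_identifier(line))      # identifier with blanks squeezed out
print(re.match(DO, line).group(0))   # the same prefix also matches 'do'
```

Both patterns match a prefix of the sample line, so the patterns alone do not settle whether `D O` is a keyword; the scanner still needs trailing context or a pre-scan to decide.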