Thanks Chris,

Changing "." to any of ".*", ".+", "[^\n]*" or "[^\n]+" solves the problem with multiple printables on one line, but the parser still throws a syntax error when two lines follow each other without an intervening blank line. In other words, it treats a paragraph of more than one line as a syntax error.
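A minimal sketch of the intended behaviour, in plain C rather than Flex/Bison, may help pin down what the grammar has to accept. As far as can be told from the quoted grammar, a PARATEXT must be followed by at least two NL tokens (or YYEOF), so the token sequence PARATEXT NL PARATEXT is rejected. The sketch assumes that a paragraph is any run of consecutive non-blank lines, and that a line containing only spaces and tabs counts as blank, as the "[ \t]*\n" rule in the quoted Flex program does. The names classify_line, count_paragraphs, LINE_TEXT and LINE_BLANK are made up for illustration and are not part of the posted programs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical line kinds for a line-based scanner; these names are
   illustrative and do not come from the posted Flex file. */
enum line_kind { LINE_TEXT, LINE_BLANK };

/* Classify one line, given as a start pointer and a length that excludes
   the trailing '\n'. A line holding only spaces and tabs is blank. */
static enum line_kind classify_line(const char *s, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (s[i] != ' ' && s[i] != '\t')
            return LINE_TEXT;
    }
    return LINE_BLANK;
}

/* Count paragraphs, where a paragraph is a maximal run of consecutive
   text lines. Leading, trailing and repeated blank lines are tolerated. */
int count_paragraphs(const char *input)
{
    int paragraphs = 0;
    bool in_paragraph = false;
    const char *p = input;

    while (*p != '\0') {
        const char *nl = strchr(p, '\n');
        size_t len = nl ? (size_t)(nl - p) : strlen(p);

        if (classify_line(p, len) == LINE_TEXT) {
            if (!in_paragraph) {
                paragraphs++;          /* first line of a new paragraph */
                in_paragraph = true;
            }
        } else {
            in_paragraph = false;      /* a blank line ends the paragraph */
        }

        p += len;                      /* skip the line body */
        if (nl)
            p++;                       /* skip the '\n' if there was one */
    }
    return paragraphs;
}
```

For the three sample inputs in the original post, count_paragraphs returns 3 each time, including the case where "b" and "x" sit on adjacent lines. In Bison terms this might correspond to making a paragraph a left-recursive list of LINE NL pairs, though that is a suggestion and not a tested grammar.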
I've already split paratext into multiple LINE tokens, each of which represents a line without its NL, and now I'm thinking of splitting each line into multiple chars ("[^\n]"). Perhaps this will make the rules less complicated, though longer.

Thanks,

SteveT

Steve Litt

Autumn 2023 featured book: Rapid Learning for the 21st Century
http://www.troubleshooters.com/rl21

Chris verBurg said on Tue, 12 Dec 2023 18:57:08 -0800

>Hey Steve,
>
>My reading of your code is that PARATEXT will only ever be a single
>character. I'm thinking you want the flex rule to be ".*" (etc)
>instead of just ".".
>
>I'm curious whether your paragraphs are allowed to contain NLs. If so,
>you're going to have to include them in the PARATEXT token value, and
>also update it to not match more than one. I don't know offhand if
>there's some cleverness with trailing context that could be used there.
>
>-Chris
>
>
>On Tue, Dec 12, 2023 at 9:29 AM Steve Litt <sl...@troubleshooters.com>
>wrote:
>
>> Hi all,
>>
>> I'm creating a parser that takes a text file whose paragraphs are
>> separated by blank lines. Unfortunately, if the input file contains a
>> paragraph with more than one non-space character, it gives me a
>> syntax error via yyerror(). So the following works, where ===== etc
>> are not in the input file but just signify beginning and end of
>> file:
>>
>> ================
>> a
>>
>> b
>>
>> c
>>
>> ================
>>
>> The following throws a syntax error:
>>
>> ================
>> a
>>
>> bx
>>
>> c
>>
>> ================
>>
>> The following also throws a syntax error:
>>
>> ================
>> a
>>
>> b
>> x
>>
>> c
>>
>> ================
>>
>> I'd appreciate any guidance as to what is wrong with my Flex and
>> Bison programs (I suspect my rules in the rules section of Bison).
>> My Flex and Bison programs follow, once again delineated by lines of
>> equal signs that don't exist in the program:
>>
>> ======= Flex Program =========
>> %option noinput nounput
>> %{
>> #include "paragraphs.tab.h"
>> %}
>>
>> %%
>>
>> [ \t]*\n   {strcpy (yylval.y_char, yytext); return NL; }
>> .          {strcpy (yylval.y_char, yytext); return PARATEXT; }
>>
>> %%
>>
>>
>> int yywrap(void)
>> {
>>     return 1;
>> }
>>
>> int yyerror(char *errormsg)
>> {
>>     fprintf(stderr, "%s\n", errormsg);
>>     exit(1);
>> }
>> ================
>>
>>
>> ======= Bison Program =========
>> %{
>>
>> #include <stdio.h>
>> #include <stdlib.h>
>> int yylex(void);
>> int yyerror (char *errmsg);
>> #define EOF_ 0
>> %}
>>
>> %union {
>>     char y_char [10000];
>> }
>> %token <y_char> PARATEXT
>> %token <y_char> NL
>> %%
>>
>> wholefile : wholefile2 {printf("End of file.\n");};
>>
>> wholefile2 : toptrash {printf("Beginning of file.\n");}
>>         multichunk {printf("dia multichunk\n");}
>>         ;
>>
>> toptrash : %empty {printf("dia empty multitrash\n");}
>>         | toptrash {printf("dia another toptrash\n");}
>>         NL {printf("dia another NL in toptrash\n");}
>>         ;
>>
>> multichunk : %empty {printf("dia empty multichunk\n");}
>>         | multichunk chunk {printf("dia multichunk chunk\n");}
>>         ;
>>
>> chunk : PARATEXT { printf("%s %s\n", "PARATEXT", $1); }
>>         parend {printf("dia parend\n");}
>>         ;
>>
>> parend :
>>         multinewline {printf("dia multinewline\n");}
>>         | YYEOF {printf("dia YYEOF\n");}
>>
>> multinewline : NL { printf("%s%s", "NL_tok1",$1); }
>>         NL { printf("%s%s", "NL_tok2",$1); }
>>         | multinewline NL { printf("%s\n", "dia multinewline NL"); }
>>         ;
>>
>> %%
>>
>> int main(int argc, char *argv[]){
>>     printf("\nStarting...\n");
>>     printf("dia value of YYEOF is %d\n", YYEOF);
>>     yyparse();
>>     printf("\nFinished...\n");
>> }
>>
>> ================
>>
>> Thanks,
>>
>> SteveT
>>
>> Steve Litt
>>
>> Autumn 2023 featured book: Rapid Learning for the 21st Century
>> http://www.troubleshooters.com/rl21