Re: Syntax error if paragraph contains more than 1 printable character

James K. Lowden Thu, 14 Dec 2023 08:01:29 -0800

On Wed, 13 Dec 2023 19:01:22 -0500
Steve Litt <sl...@troubleshooters.com> wrote:


> >.+/\n  { ... return LINE; }
> >(\n[[:blank:]]*){2,} { return SEP; } // two or more blank lines
> >\n       { /* ignore */ }
> 
> Thanks James, this looks great!

You're welcome.  It occurs to me that

        .+/\n

is the same as

        .+

because "." never matches a newline.  So, simpler still.  :-)


> I won't need to consider end of line spaces because I now have a sed 1
> liner preprocessor that gets rid of trailing space :-).

Flex is a regex engine, and can do anything sed can do.  Your system is
simpler if it can deal with all acceptable input, without
preprocessing.  

Rather than remove trailing blanks from the input, I would remove them
in flex.  The problem can be solved with regular expressions but,
since we're only matching one value, it's easily done in an action: 

        .+      {
                /* strip trailing blanks (spaces and tabs) in place */
                for( char *p = yytext + yyleng - 1; p >= yytext; p-- ) {
                        if( *p != ' ' && *p != '\t' ) break;
                        *p = '\0';
                }
                return LINE;
                }
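
For anyone who wants to try the trimming logic outside a scanner, here
is a minimal sketch of the same loop as a standalone C function.  The
name trim_trailing_blanks is made up for illustration, it is not part of
flex, and it assumes "blank" means space or tab, as [[:blank:]] does in
the C locale.

```c
#include <stddef.h>

/* Hypothetical helper, not part of flex.
 * Overwrites trailing spaces and tabs in buf[0..len) with '\0'
 * and returns the length of what remains. */
static size_t trim_trailing_blanks(char *buf, size_t len)
{
    while (len > 0 && (buf[len - 1] == ' ' || buf[len - 1] == '\t')) {
        buf[--len] = '\0';
    }
    return len;
}
```

Inside an action it could be called on the matched text, e.g.

        size_t n = trim_trailing_blanks(yytext, (size_t)yyleng);

and n used wherever the trimmed length is needed.  Counting down on the
length rather than on a pointer also avoids stepping a pointer to before
the start of the buffer when the whole match is blank.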


To solve it with regex, 

        ([[:blank:]]*[[:^space:]])+ { ... return LINE; }
        [[:blank:]]+$   { /* ignore */ }
        
--jkl
