> > Let me jump in for half a second here (no pun intended), but what
> > about the use of back quotes? ` `?  Use a very limited escaping
> > policy of \` => ` and \\ => \ .
> 
> Actually, having to double backslashes is one of the things I want
> to get rid of.  The here-document-based ideas seem to allow that.

Hrm, that would be nice to get rid of, as \ is a highly overloaded,
overused character.  As someone who is presently in the throes of
writing a new language, might I suggest using a non-newline-anchored
token as opposed to a more dynamic token?

Using $$[.*]\n as a lexical token is quasi-problematic as the anchor
is the newline, something that SQL has been free of for as long as I'm
aware of.  By using a static lexical token, such as @@, newlines
aren't important, thus reducing the number of accidental syntax errors
from programmers.  While I abhor the "let's put a magic token in this
context to handle this quirk" grammar design methodology that Perl has
brought, I do think that a simple doubling up of a nearly unused
operator would be sufficient, concise, and easy.  For example:

!!      Invalid as !! is a valid expression, though a NOOP.
@@      Valid candidate as @@ is an invalid expression
##      Valid candidate, but common comment syntax, avoid using
$$      Valid candidate, but again, a common syntax in shell-like languages
%%      Valid candidate, %% is an invalid expression
^^      Invalid candidate, ^^ is a valid expression
&&      Invalid as && is a valid token
**      Valid candidate, but ** is used as a power operator in Ruby

Of the above, I'd think @@, %%, or $$ would be the best choices.  If a
dynamic token is desired, use a token that is terminally anchored with
something other than a newline to keep PostgreSQL's SQL contextually
free from newlines.  If the desire for something HERE document-like is
strong enough... well, how about the following flex patterns:

@(@[^\n]+\n|[^@]*@)
%(%[^\n]+\n|[^%]*%)
$($[^\n]+\n|[^$]*$)

If the developer knows his/her string and opts to use an empty string
to name the token, so be it, @@ would be the beginning and terminating
token for a literal string block.  If the developer is writing something
with pl/autoconf (doesn't exist!!!  Just an example of where @@ is
used), then @autoconf me harder@ could be used as the start and ending
token, which should provide enough bits to make it unlikely that
the string appears in the enclosed data.  If a newline is desired,
it would be valid in the above:

@
@ Inside the block @
@

@[EMAIL PROTECTED] the [EMAIL PROTECTED]@

and the resulting string would be " Inside the block ".
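
To make the collision concern concrete, here is a minimal C sketch of the
producing side: wrap a payload in @name@ ... @name@ and refuse when the
chosen token would show up early in the data.  The function name
quote_literal_block is hypothetical (it is not a PostgreSQL function), and
only the @name@ form is handled, not the newline-anchored one.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not a PostgreSQL function): wrap `body` in
 * @name@ ... @name@.  Returns a malloc'd string, or NULL if the name
 * contains '@' or the token would be seen before the real terminator.
 */
static char *
quote_literal_block(const char *name, const char *body)
{
    size_t  tok_len = strlen(name) + 2;     /* '@' + name + '@' */
    size_t  out_len = 2 * tok_len + strlen(body);
    char   *out;
    char   *terminator;

    if (strchr(name, '@') != NULL)
        return NULL;

    out = malloc(out_len + 1);
    if (out == NULL)
        return NULL;
    snprintf(out, out_len + 1, "@%s@%s@%s@", name, body, name);

    /*
     * The terminator is the last tok_len bytes of the result.  Scanning
     * from just past the opening token, the first occurrence of the token
     * must be that terminator; anything earlier (including an overlap with
     * the end of the body) would close the block too soon.
     */
    terminator = out + out_len - tok_len;
    if (strstr(out + tok_len, terminator) != terminator)
    {
        free(out);
        return NULL;
    }
    return out;
}
```

With an empty name and the body " Inside the block " this yields
"@@ Inside the block @@"; a body containing "@@" is rejected for the empty
name, which is the case where a longer name is worth picking.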

%{
/* Headers/definitions/prototypes */
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>
static bool initialized = false;
static char *lit_name;
static char *lit_val;
%}
lit_quote_pattern       @(@[^\n]+\n|[^@]*@)
%x LIT_QUOTE
%x SQL
%%
%{
/* Init bits */
        if (!initialized) {
           BEGIN(SQL);
           initialized = true;
        }
%}
<SQL>{lit_quote_pattern}        {
                 /* -2 == leading/trailing chars, +1 '\0' = -1*/
                lit_name = malloc(yyleng - 1);
                strncpy(lit_name, &yytext[1], yyleng - 2);
                lit_name[yyleng - 2] = '\0';
                lit_val = NULL;
                BEGIN(LIT_QUOTE);
        }
<LIT_QUOTE>{lit_quote_pattern}  {
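                /* Lengths must match too, or a shorter token could match as a prefix */
                if (strlen(lit_name) == (size_t) (yyleng - 2) &&
                    strncmp(lit_name, &yytext[1], yyleng - 2) == 0) {
                  /* Found the terminator, set yylval.??? to lit_val after appending
                     yytext and return whatever the string type is to yyparse() */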
                  yylval.??? = strdup(lit_val);
                  free(lit_val);
                  free(lit_name);
                  lit_name = lit_val = NULL;
                  BEGIN(SQL);
                  return(tSTRING);
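                } else {
                  /* Not the terminator: keep the first char as data and
                     rescan the rest so a real terminator isn't swallowed */
                  pg_append_str_to_buf(lit_val, yytext, 1);
                  yyless(1);
                }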
        }

<LIT_QUOTE>.|\n {
                /* Not sure these func names off the top of my head: */
                pg_append_str_to_buf(lit_val, yytext, yyleng);
        }

%%
/* Or something similarly flexible */
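
The start-condition logic above (remember the opening token, collect text,
stop at the same token) can also be tried out without flex.  What follows
is a rough plain-C sketch under a few assumptions: the whole input is in
memory, only the @name@ form is handled, and extract_literal is a
hypothetical name rather than anything in the PostgreSQL tree.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: given text that starts with an opening token of
 * the form @name@, return a malloc'd copy of everything up to (but not
 * including) the next occurrence of the same token.  Returns NULL if the
 * input does not start with a token or the block is never terminated.
 */
static char *
extract_literal(const char *src)
{
    const char *name_end;
    const char *body;
    const char *close;
    size_t      tok_len;
    size_t      body_len;
    char       *tok;
    char       *out;

    if (src == NULL || src[0] != '@')
        return NULL;

    /* The opening token runs from the first '@' through the second one. */
    name_end = strchr(src + 1, '@');
    if (name_end == NULL)
        return NULL;
    tok_len = (size_t) (name_end - src) + 1;

    tok = malloc(tok_len + 1);
    if (tok == NULL)
        return NULL;
    memcpy(tok, src, tok_len);
    tok[tok_len] = '\0';

    /* The body ends where the identical token appears again. */
    body = src + tok_len;
    close = strstr(body, tok);
    free(tok);
    if (close == NULL)
        return NULL;

    body_len = (size_t) (close - body);
    out = malloc(body_len + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, body, body_len);
    out[body_len] = '\0';
    return out;
}
```

Nothing inside the block is unescaped, so backslashes and single quotes
come through untouched, which is the point of the exercise.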

-sc

-- 
Sean Chittenden
