exegesis 5 question: matching negative, multi-byte strings

esp5 Tue, 01 Oct 2002 12:03:46 -0700

I was wondering what the favored syntax in perl6 would be to match negative
multi-byte strings, i.e. text that does not contain a given multi-character
string. In perl 5:


        $sql = "select * from a where b union select * from c where d";

        my $nonunion = "[^u]|u[^n]|un[^i]|uni[^o]|unio[^n]";
        my (@subsqls) = ($sql =~ m"((?:$nonunion)*)");

guaranteeing that the subsqls have all text up to, but not including the string
"union".
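
For comparison, a minimal perl 5 sketch of the same idea using a negative
lookahead instead of the hand-expanded alternation. It assumes the $sql string
above; the names $first and the split-based @subsqls are only for illustration:

```perl
use strict;
use warnings;

my $sql = "select * from a where b union select * from c where d";

# (?!union) is a zero-width negative lookahead: consume one character
# at a time, but only while "union" does not start at that position.
my ($first) = $sql =~ m/^((?:(?!union).)*)/s;

# To collect every piece, splitting on the delimiter is simpler.
my @subsqls = split /\s*union\s*/, $sql;

print "$first\n";
print "$_\n" for @subsqls;
```

The lookahead version stops right before "union", so $first keeps the trailing
space; the split version trims the whitespace around each piece.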

I suppose I could say:

        rule nonunion { (.*) :: { fail if ($1 =~ m"union$"); } }

although that seems awfully slow. I suppose that I could also do the same thing
in perl6 as I did in perl5, although that gets ugly if you need to combine
matching strings without "union" in them with, say, parens:

rule parens                             {   \( [ <-[()]> + : | <self> ]*  \) }
rule non_union_non_parens       
{
                                [            < -[()u] > | 
                                        u    < -[()n] > | 
                                        un   < -[()i] > | 
                                        uni  < -[()o] > | 
                                        unio < -[()n] > 
                                ] 
}

my (@subsqls) = ($sql =~ m" ([ <non_union_non_parens> | <parens> ]*) ");
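
In perl 5 terms, the same combination might be sketched as below. This is an
assumption-laden illustration, not something from Exegesis 5: it needs a perl
of version 5.10 or later for recursive named groups, the query with a
parenthesised sub-select is made up, and the group name "parens" simply mirrors
the rule above:

```perl
use strict;
use warnings;
use 5.010;

# Hypothetical query: the first "union" sits inside parens and must be skipped.
my $sql = "select * from (select x from a union select x from b) t union select * from c";

my ($head) = $sql =~ m{
    ^(                          # capture everything before a top-level "union"
        (?:
            (?!union) [^()]     # one non-paren character that does not start "union"
          |
            (?<parens>          # or a balanced parenthesised group
                \( (?: [^()]++ | (?&parens) )* \)
            )
        )*
    )
}x;

print "$head\n";
```

Text inside balanced parens is consumed as a unit, so a "union" in a sub-select
does not end the match; only a "union" at the top level does.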

And finally, I suppose I could write a sql grammar (which, for this application
and most, is definitely overkill). So I guess I'd like something shorter,
something where you could say:

< -["union"] >

or 

< -["union"\(\)] >

or 

< -["union""select"\(\)] >

a generic negative, multi-byte string matching mechanism. Any thoughts? 
Am I missing something already present or otherwise obvious?

Ed
