Re: traversing a variable with regex instead of a file

James Edward Gray II Fri, 10 Oct 2003 07:41:30 -0700

Keep your replies on the list, so you can get help from all the people smarter than me. ;)

On Friday, October 10, 2003, at 08:58 AM, angie ahl wrote:

Or did you mean, how would you go through a variable's content
line-by-line?  For that, try something like this:

    my @lines = split /(\n)/, $data;
    foreach (@lines) { do_something() if /pattern/; }
    $data = join '', @lines;

Hope that helps.

James

That's exactly what I meant. Sorry, I didn't realise the email could be read two different ways ;)

No problem. We're on the same page now.
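
For reference, here is a minimal runnable sketch of the split-and-join idea quoted above. The sample text and the /^b/ test are made up for illustration. The point is only that a capturing group in the split pattern keeps the newlines in the list, so join '' puts the text back together unchanged apart from your edits.

```perl
use strict;
use warnings;

# Sample data; the contents and the pattern below are illustrative only.
my $data = "alpha\nbeta\ngamma\n";

# The parentheses make split return the separators as list elements too.
my @lines = split /(\n)/, $data;

# $_ is an alias for each element, so assigning to it changes @lines.
for (@lines) {
    $_ = uc if /^b/;
}

# The newlines are still in the list, so a plain join restores the layout.
$data = join '', @lines;
print $data;
```

Without the parentheses in the split pattern, the newlines would be discarded and you would have to join on "\n" instead.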

So split the var into an array. That makes sense. The only issue I will have with that is that the array will need to be remade at the end of each loop, as the number of line breaks will change at the end of each process.

Here's what I'm actually trying to do:

    # get anon hash from keywords array
    for my $href ( @Keywords ) {
        # get keyword from that hash
        for my $kw ( keys %$href ) {
            # see if keyword is in content
            if ($content =~ /\b($kw)\b/) {
                # do replacement patterns on content
            }
        }
    }

At the bit where it says "# do replacement patterns on content" I will be looking through $content and doing a substitution:

    $content =~ s/($kw)/\n$1\n/;

Okay, why put this inside an if block? If the substitution doesn't find a match, it fails and does nothing, which is what you want, right? I don't think you need the if.
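
A small sketch of why the if is redundant, using made-up sample text: s/// reports how many substitutions it made, and when nothing matches it returns false and leaves the string alone.

```perl
use strict;
use warnings;

# Illustrative content; the keywords 'fish' and 'bird' are hypothetical.
my $content = "one fish two fish";

# Without /g only the first occurrence is replaced; the return value is 1.
my $hits = ( $content =~ s/\b(fish)\b/\n$1\n/ );

# No match here: the return value is false and $content is not modified.
my $misses = ( $content =~ s/\b(bird)\b/\n$1\n/ );

print "replaced fish\n" if $hits;
print "no bird found, content left alone\n" unless $misses;
```

If you ever do need to know whether anything changed, you can test the return value of the substitution itself rather than matching first.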

So this will actually be adding extra newlines to the content. But I do need to read it a line at a time so that I can check the lines don't start with the letters qz (Don't ask... very long story ;)

Why don't we work on your regular expression a little and see if we can do it all in one move? We want to find all occurrences of the keyword, as long as they're not on a line beginning with qz, right? This seems to do that for me:

    $content =~ s/^([^\n]*)($kw)/substr($1, 0, 2) ne 'qz' ? "$1\n$2\n" : "$1$2"/mge;

This searches through $content for a line containing the keyword. It grabs all non-\n characters from the start of the line up to the keyword and stores them in $1. Then it stores the keyword in $2. The /m modifier at the end makes ^ match right after each \n inside the string as well as at the very start, and the /g modifier repeats the search on every line.

I used the /e modifier for the replacement, which allows me to use Perl code in there. It's pretty simple. If the line didn't start with qz, we do the normal replace and wrap the keyword in newlines. If it did start with qz, we replace with exactly what we found, causing no changes. (Add an lc(...) around that substr() call if you want it to skip QZ lines too.)
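
Here is a runnable sketch of that substitution on made-up input, so you can see what it does to each kind of line. The keyword and the sample text are assumptions for illustration. One thing it brings out: because every match is anchored at ^ and [^\n]* is greedy, a line that contains the keyword twice only gets its last occurrence wrapped.

```perl
use strict;
use warnings;

my $kw = 'perl';    # hypothetical keyword

# Three cases: a normal line, a qz line, and a line with two keywords.
my $content = "I like perl\nqz perl stays\nperl and perl\n";

# Same substitution as above: wrap the keyword in newlines unless the
# text captured from the start of the line begins with 'qz'.
$content =~ s/^([^\n]*)($kw)/substr($1, 0, 2) ne 'qz' ? "$1\n$2\n" : "$1$2"/mge;

print $content;
```

If you need every occurrence on a line wrapped, this one-liner is not enough by itself and the line-by-line approach is the easier route.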

Let me know if that will work for you.

So my fear is that I will need to recreate that array after each pass because the number of newlines will be growing.

I thought that would be terribly inefficient. I'm used to using languages that choke at the mere suggestion of doing work. I like Perl; it doesn't do that.

You're right about it being inefficient, of course. It was easier to read than my regex though, eh? <laughs> The first choice may be slow, but on modern computers both may well run in the blink of an eye. Save worrying about speed for when you need to, and try to keep your life as a programmer as easy as possible until then.
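
For comparison, a sketch of how the readable line-by-line version might look. The flat @keywords list and the sample text are assumptions (your real data is an array of hashes), and \Q...\E assumes the keywords are literal text rather than patterns. Since each element of @lines is checked against its original start, the array is split only once, even though the substitutions add newlines inside the elements.

```perl
use strict;
use warnings;

my @keywords = ( 'perl', 'python' );    # hypothetical keyword list
my $content  = "perl and python\nqz perl python\n";

# Split once, keeping the newlines as their own elements.
my @lines = split /(\n)/, $content;

for my $line (@lines) {
    # No /m here, so ^ only matches at the start of the original line.
    next if $line =~ /^qz/;

    # Wrap every occurrence of every keyword in newlines.
    for my $kw (@keywords) {
        $line =~ s/(\Q$kw\E)/\n$1\n/g;
    }
}

$content = join '', @lines;
print $content;
```

It is more code than the single substitution, but each step does one obvious thing, which is the trade-off being described.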

If you have any suggestions I would be most grateful to hear them.

Those are my best shots. Hope they help.

James

