Jess Balint wrote:
>
> Hello all. I am trying to 'code' an address into a certain format. The
> format is as follows:
>
> first 4 digits of street #
> first 4 street name
> first 2 of address line 2
> first 3 of zip code
>
> The data is a pipe delimited file with the following format:
>
> consumer_id|address_line_1|address_line_2|zip_code|
>
> ex:
>
> 123456789|123 s main st|apt 23|54321|
>
> I am trying to match the needed information with a regex, but can't quite
> seem to perfect it. I am running into troubles where there might not be
> anything in the address_line_2 field. I want to match the first characters
> up to a space. Here is what I have so far:
>
> /^\d+\|(.+)\s(.{4}).*\|.{2}\|(.*)\|/
         ^$1^  ^ $2 ^          ^$3^
You should only use the numbered variables ($1, $2, ...) if the regular
expression matched. You have only three capturing groups, but you are
using four variables ($1 through $4).
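As a minimal sketch of that advice: test the match first, and only read $1 through $4 inside the branch where it succeeded. The pattern below and the sub name address_code are illustrative assumptions, not taken from the original post. It assumes the street number is the leading run of digits in address_line_1, followed by whitespace, and it allows address_line_2 to be empty.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the coded address, or nothing (undef in
# scalar context) when the line does not look like a record.
sub address_code {
    my ($line) = @_;

    # Four capturing groups, one per piece of the code.
    if ( $line =~ m{
            ^\d+ \|                  # consumer_id (ignored)
            (\d{1,4}) \d* \s+        # group 1: up to 4 digits of street number
            ([^|]{1,4}) [^|]* \|     # group 2: up to 4 chars of street name
            ([^|]{0,2}) [^|]* \|     # group 3: up to 2 chars of line 2, may be empty
            (\d{1,3}) \d* \|         # group 4: up to 3 digits of zip
        }x )
    {
        # Safe to use the numbered variables here: the match succeeded.
        my $code = $1 . $2 . $3 . '$' . $4;
        $code =~ tr/ \t\r\n\f/$/;    # whitespace inside a piece becomes '$'
        return $code;
    }
    return;    # no match: do not touch the numbered variables
}

print address_code('123456789|123 s main st|apt 23|54321|'), "\n";
# prints 123s$maap$543
```

With an empty address_line_2 ('123456789|123 s main st||54321|') the third group simply captures an empty string, so the result is 123s$ma$543.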
> $code = $1 . $2 . $3 . " " . $4;
                         ^^^
You change this in the next line to '$'.

> $code =~ s/\s/\$/g;

Here is one way to do it:

    use strict;
    use warnings;

    my @cut  = qw/4 4 2 3/;    # widths: street number, street name, line 2, zip
    my $line = '123456789|123 s main st|apt 23|54321|';

    my @data = split /\|/, $line;

    # Overwrite consumer_id with the street number and keep the rest of
    # address_line_1 as the street name.
    ( $data[0], $data[1] ) = split /\s+/, $data[1], 2;

    my $code = '';
    for my $index ( 0 .. $#cut ) {
        $code .= substr $data[$index], 0, $cut[$index];
        $code .= '$' if $index == 2;    # separator before the zip digits
    }
    $code =~ tr/ \t\r\n\f/$/;    # any remaining whitespace becomes '$'

John
-- 
use Perl;
program
fulfillment