In article <[EMAIL PROTECTED]>,
[EMAIL PROTECTED] (Gary Stainburn) writes:
>Hi folks.
>
>I'm trying to locate the UK postcode located somewhere inside an address,
>extract it, and place it in a specific position. The data file I'm
>processing is generated by a COBOL file and is fixed-length format text.
>
>I've almost got it, but as you can see from the output, it's not quite right.
>The split's happening after the first letter not before it.
>
>The postcode is of the format XX99 9XX where the space is optional and the
>first 'XX' and the '99' may be single character. The 9XX is always 1 digit
>followed by 2 letters.
Then make your regex say that. What you have is:
> if ($$line=~/^(.*)(\D{1,2}\d{1,2}\s{0,1}\d\D{2})\s*/) {
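
That says: match the beginning of the string, followed by 0 or more characters
other than \n, greedily (this is superfluous), then 1 or 2 non-digits, 1 or 2
digits, 0 or 1 whitespace characters, a digit, and 2 non-digits, followed by
0 or more whitespace characters (also superfluous). Because the .* is greedy,
it swallows the first letter of a two-letter prefix and leaves \D{1,2} only
the second letter to match, which is why your split lands after the first
letter.
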
Ponder the meaning of "non-" for a bit and then chew on this:
/([A-Z]{1,2}\d{1,2} ?\d[A-Z]{2})/
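
In case it helps, here is a minimal sketch of how that pattern might be used to
pull the postcode out and keep the text on either side of it. The sub name
split_postcode and the sample address are made up for illustration; your real
fixed-length record layout will differ, so treat this as a starting point
rather than a finished solution.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): returns the text
# before the postcode, the postcode itself, and the text after it, or an
# empty list if nothing postcode-shaped is found.
sub split_postcode {
    my ($line) = @_;
    return unless $line =~ /([A-Z]{1,2}\d{1,2} ?\d[A-Z]{2})/;
    my $postcode = $1;
    my $before   = substr($line, 0, $-[1]);   # $-[1] is where group 1 starts
    my $after    = substr($line, $+[1]);      # $+[1] is where group 1 ends
    return ($before, $postcode, $after);
}

# Made-up sample address, padded as a fixed-length field might be.
my ($before, $postcode, $after) =
    split_postcode('1 HIGH STREET LEEDS LS12 3AB   ');
print "[$before][$postcode][$after]\n";
```

Note that [A-Z] only matches upper-case letters, and that an unanchored pattern
finds the leftmost postcode-shaped string in the line, so add /i or anchoring
if your data calls for it.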
--
Peter Scott
http://www.perldebugged.com