This and other RFCs are available on the web at http://dev.perl.org/rfc/ =head1 TITLE Numeric Value Ranges In Regular Expressions =head1 VERSION Maintainer: David Nicol <[EMAIL PROTECTED]> Date: 5 Sep 2000 Last Modified: 22 Sep 2000 Mailing List: [EMAIL PROTECTED] Number: 197 Version: 2 Status: Frozen =head1 CHANGES s/numberic/numeric in title (oops) expansion of implemention/optimization section =head1 ABSTRACT round and square bratches mated around two optional comma separated numbers match iff a gobbled number is within the described range. =head1 DESCRIPTION =head2 the syntax of the numeric range regex element Given a passage of regex text matching ($B1,$N1,$N2,$B2) = /(\[|\()(\-?\d*\.?\d*),(\-?\d*\.?\d*)(\]|\))/ and ($N1 <= $N2 or $N1 eq '' or $N2 eq '') we've got something we hereinafter call a "range." =head2 what the range matches A range matches, in the target string, a passage C<(\-?\d*\.?\d*)> also known as a "number" if and only if the number is within the range. In the normal agebraic sense. =head2 "within the range" Square bracket means, that end of the range may include the range specifying number, and round parenthesis means, that end of the range includes numbers ov value up to (or down to) the number but not equal to it. =head2 infinity in the event that one or the other of the range specifying numbers is the empty string, that end of the range is unbounded. In the further event that we have defined infinity and negative infinity on our numbers, the square/round distinction will come into play. The range end indicators are literal numbers, although they may be optimized immensely. No expression evaluation occurs w/in the range specifier, beyond the normal rules of double-quote interpolation. =head1 COMPATIBILITY To disambiguate ranges from character sets including digits, commas, and parentheses, either put a backslash on the right parentheses, or the comma, or arrange things so the left hand side of the comma is greater than the right hand side, that way this special case will not apply: /(37.3,200)/; # matches any number x, 37.3 < x < 200 /((37.3,200))/; # matches any number x, 37.3 < x < 200 and saves it /([37,))/; # matches and saves any number >= 37. /(37.3\,200)/; # matches and saves the literal text '37.3,200' /[-35,9)]/; # matches any number x, -35 <= x < 9; followed by a ] /[3-5,9)]/; # matches a string containing any of 3,4,5,,,9 or ) /[$low,$high]/; # matches a number $low <= $_ <= $high, provided # low and high are both numerics. /[$low,${\highf(@data)}/; # complex interpolation tricks Tieing variables to be interpolated into range matches to types which always produce numbers is reccommended. =head1 IMPLEMENTATION Yet more special cases for interpretation of ([)] in regular expressions. We match regular expressions against ($B1,$N1,$N2,$B2) = /(\[|\()(\-?\d*\.?\d*),(\-?\d*\.?\d*)(\]|\))/ and ($N1 <= $N2 or $N1 eq '' or $N2 eq '') and mark matching passages as ranges. When applying regular expressions to numeric data, ranges may optimize away all of the digit lookahead we must currently indulge in to implement them in perl5. IOW, if we know a string literal containing interpolated numeric scalars is going to get matched by an expression containing ranges, we may be able to skip both the interpolation and the deinterpolation and go straight to multi-way numeric comparison. If we have infinity defined, we'll have to look for its string representation. And if a "simple, fast regex match mode" is defined, this pass could be switched in or out: maybe we want fast range matching. =head1 BUT WAIT THERE'S MORE It is possible that the syntax described in this document may help slice multidimensional containers. (RFC 191) =head1 REFERENCES None.