"Jim Segrave" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
> In article <[EMAIL PROTECTED]>,
> Paul McGuire <[EMAIL PROTECTED]> wrote:
>
> >Not an re solution, but pyparsing makes for an easy-to-follow program.
> >TransformString only needs to scan through the string once - the
> >"reals-before-ints" testing is factored into the definition of the
> >formatters variable.
> >
> >Pyparsing's project wiki is at http://pyparsing.wikispaces.com.
>
> It fails for floats specified as ###. or .###, it outputs an integer
> format and the decimal point separately. It also ignores \# which
> should prevent the '#' from being included in a format.
>

Ah! This may be making some sense to me now. Here are the OP's original
re's for matching.

exponentPattern = regex.compile('\(^\|[^\\#]\)\(#+\.#+\*\*\*\*\)')
floatPattern = regex.compile('\(^\|[^\\#]\)\(#+\.#+\)')
integerPattern = regex.compile('\(^\|[^\\#]\)\(##+\)')
leftJustifiedStringPattern = regex.compile('\(^\|[^\\<]\)\(<<+\)')
rightJustifiedStringPattern = regex.compile('\(^\|[^\\>]\)\(>>+\)')

Each re seems to have two parts to it. The leading parts appear to be
guards against escaped #, <, or > characters, yes? The second part of
each re shows the actual pattern to be matched. If so:

It seems that we *don't* want "###." or ".###" to be recognized as
floats: floatPattern requires at least one "#" character on either side
of the ".". Also note that single #, <, and > characters don't seem to
be desired; at least two are required for matching. Pyparsing's Word
class accepts an optional min=2 constructor argument if this really is
the case.

And it also seems that the pattern is supposed to be enclosed in ()'s.
This seems especially odd to me, since one of the main points of this
funky format seems to be to set up formatting that preserves column
alignment of text, as if creating a tabular output - enclosing ()'s just
junks this up.

My example also omitted the exponent pattern. This can be handled with
another expression like realFormat, but with the trailing "****"
characters. Be sure to insert this expression before realFormat in the
list of formatters, so that the longer exponent form is tried first.

I may be completely off in my re interpretation. Perhaps one of the re
experts here can explain better what the OP's re's are all about. Can
anybody locate/cite the actual spec for this formatting, um, format?

-- Paul
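
For anyone who wants to poke at this, here is a rough sketch of how the
OP's patterns might look in the current re module. It rests on one big
assumption: that the originals use the Emacs-style syntax of the old
regex module, where \( ... \) is a group and \| is alternation. Under
that reading the ()'s would be grouping rather than literal characters.
The snake_case names and the first_field helper are mine, purely for
illustration, and are not from the OP's code.

```python
import re

# Assumed translation of the OP's patterns into modern re syntax.
# Group 1 is the guard: start of string, or one character that is
# neither a backslash nor the format character.
# Group 2 is the format field itself.
exponent_pattern = re.compile(r'(^|[^\\#])(#+\.#+\*\*\*\*)')
float_pattern = re.compile(r'(^|[^\\#])(#+\.#+)')
integer_pattern = re.compile(r'(^|[^\\#])(##+)')
left_justified_string_pattern = re.compile(r'(^|[^\\<])(<<+)')
right_justified_string_pattern = re.compile(r'(^|[^\\>])(>>+)')


def first_field(pattern, text):
    """Return the first format field matched by pattern, or None.

    Hypothetical helper, only here to make the experiments shorter.
    """
    match = pattern.search(text)
    return match.group(2) if match else None


# "#" is required on both sides of the "." for a float field.
print(first_field(float_pattern, "###.##"))
print(first_field(float_pattern, "###."))
print(first_field(float_pattern, ".###"))

# A single "#" is not an integer field; two or more are.
print(first_field(integer_pattern, "#"))
print(first_field(integer_pattern, "x ##"))

# The float pattern also matches the front of an exponent field,
# which is why the exponent form has to be tried first.
print(first_field(exponent_pattern, "##.##****"))
print(first_field(float_pattern, "##.##****"))
```

If that reading is right, "###." and ".###" are rejected by
float_pattern, and a run of #'s directly after a backslash is not picked
up by integer_pattern at all.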