"Jim Segrave" <[EMAIL PROTECTED]> wrote in message news:[EMAIL PROTECTED] > > If fails for floats specified as ###. or .###, it outputs an integer > format and the decimal point separately. It also ignores \# which > should prevent the '#' from being included in a format. >
True. What is the spec for these formatting strings, anyway? I Googled a
while, and it does not appear that this is really a Perl string formatting
technique, despite the OP's comments to the contrary. And I'm afraid my
limited regex knowledge leaves the OP's example impenetrable to me; I got
lost among the '\'s and parens.

I actually thought that "###." was *not* intended to be floating point, but
instead represented an integer before a sentence-ending period. You do have
to be careful of making *both* leading and trailing digits optional, or else
simple sentence-punctuating periods will get converted to "%1f"!

As for *ignoring* "\#", it would seem to me we would rather convert this to
"#", since "#" shouldn't be escaped in normal string interpolation.

The following modified version adds handling for "\#", "\<" and "\>", and
real numbers with no integer part. The resulting program isn't radically
different from the first version. (I've highlighted the changes with "<==="
marks.)

-- Paul

------------------
from pyparsing import Combine,Word,Optional,Regex

"""
read Perl-style formatting placeholders and replace with
proper %x string interp formatters
    ###### -> %6d
    ##.### -> %6.3f
    <<<<<  -> %-5s
    >>>>>  -> %5s
"""

# set up patterns to be matched
# (note use of results name in realFormat, for easy access to
# decimal places substring)
intFormat = Word("#")
realFormat = Combine(Optional(Word("#"))+"."+                    # <===
                     Word("#").setResultsName("decPlaces"))
leftString = Word("<")
rightString = Word(">")
escapedChar = Regex(r"\\[#<>]")                                  # <===

# define parse actions for each - the matched tokens are the third
# arg to parse actions; parse actions will replace the incoming tokens with
# value returned from the parse action
intFormat.setParseAction( lambda s,l,toks: "%%%dd" % len(toks[0]) )
realFormat.setParseAction( lambda s,l,toks: "%%%d.%df" %
                                (len(toks[0]),len(toks.decPlaces)) )
leftString.setParseAction( lambda s,l,toks: "%%-%ds" % len(toks[0]) )
rightString.setParseAction( lambda s,l,toks: "%%%ds" % len(toks[0]) )
# an escaped char is replaced by the char itself, minus the backslash
escapedChar.setParseAction( lambda s,l,toks: toks[0][1] )        # <===

# collect all formatters into a single "grammar"
# - note reals are checked before ints
formatters = rightString | leftString | realFormat | intFormat | escapedChar  # <===

# set up our test string, and use transform string to invoke parse actions
# on any matched tokens
testString = r"""
This is a string with
  ints: #### # ###############
  floats: #####.# ###.###### #.# .###
  left-justified strings: <<<<<<<< << <
  right-justified strings: >>>>>>>>>> >> >
  int at end of sentence: ####.
  I want \##, please.
"""

print(testString)
print(formatters.transformString( testString ))
------------------

Prints:

This is a string with
  ints: #### # ###############
  floats: #####.# ###.###### #.# .###
  left-justified strings: <<<<<<<< << <
  right-justified strings: >>>>>>>>>> >> >
  int at end of sentence: ####.
  I want \##, please.


This is a string with
  ints: %4d %1d %15d
  floats: %7.1f %10.6f %3.1f %4.3f
  left-justified strings: %-8s %-2s %-1s
  right-justified strings: %10s %2s %1s
  int at end of sentence: %4d.
  I want #%1d, please.
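
For anyone who wants to try the same conversion rules without pyparsing installed, here is a rough sketch using only the standard re module. It is not the program from this post, just an illustration of the same rules: escapes become the bare character, reals need at least one '#' after the '.', and reals are tried before ints. The names PLACEHOLDER, _replace and convert are made up for this example, and like the pyparsing version it assumes the template contains no literal '%'.

```python
import re

# Hypothetical names throughout; this is a sketch, not the posted program.
# Alternatives are tried left to right, so reals come before ints.
PLACEHOLDER = re.compile(r"""
    \\(?P<esc>[#<>])                # \#, \< or \>  -> the bare character
  | (?P<real>\#*\.(?P<dec>\#+))     # ##.### or .### ('#' required after '.')
  | (?P<int>\#+)                    # ####
  | (?P<left><+)                    # <<<<
  | (?P<right>>+)                   # >>>>
""", re.VERBOSE)


def _replace(m):
    """Return the %-style formatter for one matched placeholder."""
    if m.group("esc"):
        return m.group("esc")
    if m.group("real"):
        # total width counts the '.', as in the pyparsing version
        return "%%%d.%df" % (len(m.group("real")), len(m.group("dec")))
    if m.group("int"):
        return "%%%dd" % len(m.group("int"))
    if m.group("left"):
        return "%%-%ds" % len(m.group("left"))
    return "%%%ds" % len(m.group("right"))


def convert(template):
    """Rewrite #/</> placeholders in template as %-interpolation formatters."""
    return PLACEHOLDER.sub(_replace, template)


fmt = convert("Qty: ### Price: ###.## Name: <<<<<<<<")
print(fmt)                          # Qty: %3d Price: %6.2f Name: %-8s
print(fmt % (42, 3.5, "widget"))
```

Because the real-number alternative insists on a '#' after the '.', a trailing "####." still falls through to the int rule and keeps its period, which is the behaviour argued for above.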