Re: Reverse string-formatting (maybe?)

Tim Chase Sun, 15 Oct 2006 06:43:59 -0700

>>>  >>> template = '%s, %s, %s'
>>>  >>> values = ('Tom', 'Dick', 'Harry')
>>>  >>> formatted = template % values
>>>  >>> import re
>>>  >>> unformat_string = template.replace('%s', '([^, ]+)')
>>>  >>> unformatter = re.compile(unformat_string)
>>>  >>> extracted_values = unformatter.search(formatted).groups()
>>>
>>> using '[^, ]+' to mean "one or more characters that aren't a
>>> comma or a space".
>>
>> One more thing (I forgot to mention this other situation earlier)
>> The %s characters are ints, and outside can be anything except int
>> characters. I do have one situation of '%s%s%s', but I can change it to
>> '%s', and change the output into the needed output, so that's not
>> important. Think something along the lines of "abckdaldj iweo%s
>> qwierxcnv !%sjd".
> 
> That was written in haste. All the information is true. The question:
> I've already created a function to do this, using your original
> deformat function. Is there any way in which it might go wrong?


Only you know what anomalies will be found in your data-sets.  If 
you know/assert that

-the literal text in your format string is drawn from one known 
set of characters

-the replacement values can never include any of those 
format-string characters

-that you're not using funky characters/formatting in your format 
string (such as "%%" possibly followed by an "s" to get the 
resulting text of "%s" after formatting, or trying to use other 
formatters such as the aforementioned "%f" or possibly "%i")

then you should be safe.  It could also be possible (with my 
original replacement of "(.*)") if your values will never include 
any substring of your format string.  If you can't guarantee 
these conditions, you're trying to make a cow out of hamburger. 
Or a pig out of sausage.  Or a whatever out of a hotdog. :)
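
To make those conditions concrete, here is a minimal sketch of two ways the round trip can break. The helper name make_unformatter is hypothetical (it is not from the original deformat function), and it assumes the template only ever contains plain "%s" placeholders. It escapes the literal text and swaps each "%s" for "(.*)".

```python
import re

def make_unformatter(template, value_pattern='(.*)'):
    # Hypothetical helper: escape the literal pieces of the template and
    # join them with a capturing group where each %s used to be.
    parts = template.split('%s')
    return re.compile(value_pattern.join(re.escape(p) for p in parts))

template = '%s, %s, %s'
unformatter = make_unformatter(template)

# Safe case: no value contains the separator ', '
good = unformatter.search(template % ('Tom', 'Dick', 'Harry')).groups()

# Broken case: the last value contains ', ', so the greedy groups
# split the text in the wrong place.
values = ('Tom', 'Dick', 'Harry, Jr.')
extracted = unformatter.search(template % values).groups()
print(extracted == values)   # the round trip does not survive

# Broken case: '%%' formats to a single '%', so the literal text of the
# template no longer appears in the formatted string.
pct_template = 'Rate: %s%%'
pct_match = make_unformatter(pct_template).search(pct_template % ('5',))
print(pct_match)
```

In the second case the values come back as ('Tom, Dick', 'Harry', 'Jr.'), and in the third the search finds no match at all.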

Conventional wisdom would tell you to create a test-suite of 
format-strings and sample values (preferably worst-case funkiness 
in your expected format-strings/values), and then have a test 
function that will assert that the unformatting of every 
formatted string in the set returns the same set of values that 
went in.  Something like

tests = {
        'I was %s but now I am %s' : [
                ('hot', 'cold'),
                ('young', 'old'),
                ],
        'He has 3 %s and 2 %s' : [
                ('brothers', 'sisters'),
                ('cats', 'dogs')
                ]
        }

import re

for format_string, values in tests.items():
        # Turn each %s placeholder into a greedy capturing group
        unformatter = re.compile(format_string.replace('%s', '(.*)'))
        for value_tuple in values:
                formatted = format_string % value_tuple
                unformatted = unformatter.search(formatted).groups()
                if unformatted != value_tuple:
                        print("%s doesn't match %s when unformatting %s" % (
                                unformatted,
                                value_tuple,
                                format_string))
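
For the situation described in the quoted message (every "%s" is an int, and the surrounding text never contains digits), a rough sketch of the same round-trip check might look like the following. The function name unformat_ints is made up for illustration, and it assumes non-negative integers only, since "\d+" will not match a minus sign.

```python
import re

def unformat_ints(template, formatted):
    # Assumption: each %s was filled with a non-negative integer and the
    # literal text of the template contains no digits.
    pattern = r'(\d+)'.join(re.escape(p) for p in template.split('%s'))
    match = re.fullmatch(pattern, formatted)
    if match is None:
        return None
    return tuple(int(group) for group in match.groups())

# Digits separated by non-digit text round-trip cleanly.
template = 'abckdaldj iweo%s qwierxcnv !%sjd'
values = (12, 345)
recovered = unformat_ints(template, template % values)
print(recovered == values)

# Adjacent placeholders are ambiguous: nothing marks where one int ends
# and the next begins, so the greedy groups guess.
adjacent = '%s%s%s'
adjacent_values = (1, 23, 4)
adjacent_recovered = unformat_ints(adjacent, adjacent % adjacent_values)
print(adjacent_recovered == adjacent_values)
```

The first check passes; the second comes back as (12, 3, 4), which is why the '%s%s%s' template needs to be changed as the quoted message suggests.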

-tkc








-- 
http://mail.python.org/mailman/listinfo/python-list
