On Jan 18, 12:41 pm, Iain King <iaink...@gmail.com> wrote:
> On Jan 18, 10:21 am, superpollo <ute...@esempio.net> wrote:
>
>
>
> > superpollo wrote:
>
> > > hi.
>
> > > what is the most pythonic way to substitute substrings?
>
> > > eg: i want to apply:
>
> > > foo --> bar
> > > baz --> quux
> > > quuux --> foo
>
> > > so that:
>
> > > fooxxxbazyyyquuux --> barxxxquuxyyyfoo
>
> > > bye
>
> > i explain better:
>
> > say the subs are:
>
> > quuux --> foo
> > foo --> bar
> > baz --> quux
>
> > then i cannot apply the subs in sequence (say, .replace() in a loop),
> > otherwise:
>
> > fooxxxbazyyyquuux --> fooxxxbazyyyfoo --> barxxxbazyyybar -->
> > barxxxquuxyyybar
>
> > not as intended...
>
> Not sure if it's the most pythonic, but I'd probably do it like this:
>
> def token_replace(string, subs):
>         subs = dict(subs)
>         tokens = {}
>         for i, sub in enumerate(subs):
>                 tokens[sub] = i
>                 tokens[i] = sub
>         current = [string]
>         for sub in subs:
>                 new = []
>                 for piece in current:
>                         if type(piece) == str:
>                                 chunks = piece.split(sub)
>                                 new.append(chunks[0])
>                                 for chunk in chunks[1:]:
>                                         new.append(tokens[sub])
>                                         new.append(chunk)
>                         else:
>                                 new.append(piece)
>                 current = new
>         output = []
>         for piece in current:
>                 if type(piece) == str:
>                         output.append(piece)
>                 else:
>                         output.append(subs[tokens[piece]])
>         return ''.join(output)
>
> >>> token_replace("fooxxxbazyyyquuux", [("quuux", "foo"), ("foo", "bar"),
> ...               ("baz", "quux")])
>
> 'barxxxquuxyyyfoo'
>
> I'm sure someone could whittle that down to a handful of list comps...
> Iain

Slightly better: this version lets you have overlapping search strings,
which are applied in the order they are fed in (pass subs as a list of
(search, replacement) pairs if the order matters):

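def token_replace(string, subs):
        # tokens maps each search string to an integer placeholder,
        # and each placeholder to its replacement string
        tokens = {}
        if isinstance(subs, dict):
                for i, sub in enumerate(subs):
                        tokens[sub] = i
                        tokens[i] = subs[sub]
        else:
                # a sequence of (search, replacement) pairs: keep its order
                s = []
                for i, (k, v) in enumerate(subs):
                        tokens[k] = i
                        tokens[i] = v
                        s.append(k)
                subs = s
        # current holds str pieces (not yet replaced) and int placeholders
        current = [string]
        for sub in subs:
                new = []
                for piece in current:
                        if isinstance(piece, str):
                                # cut out each occurrence of sub and put
                                # its placeholder in the gap
                                chunks = piece.split(sub)
                                new.append(chunks[0])
                                for chunk in chunks[1:]:
                                        new.append(tokens[sub])
                                        new.append(chunk)
                        else:
                                # already a placeholder, leave it alone
                                new.append(piece)
                current = new
        # swap each placeholder for its replacement and join everything
        output = []
        for piece in current:
                if isinstance(piece, str):
                        output.append(piece)
                else:
                        output.append(tokens[piece])
        return ''.join(output)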
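
For comparison, the same substitutions can probably be done in a single
pass with re.sub. This is only a sketch, not the code above:
multi_replace is a made-up name, and it assumes the search strings are
plain text (they are escaped with re.escape) and that subs is either a
dict or a sequence of (search, replacement) pairs. Replaced text is never
scanned again, so quuux --> foo does not then turn into bar.

```python
import re

def multi_replace(text, subs):
    """Apply all substitutions in one left-to-right pass over text."""
    # hypothetical helper: accept a dict or a sequence of pairs
    pairs = list(subs.items()) if isinstance(subs, dict) else list(subs)
    if not pairs:
        # an empty alternation would match everywhere, so bail out early
        return text
    mapping = dict(pairs)
    # at each position the alternatives are tried in the order given
    pattern = re.compile("|".join(re.escape(k) for k, _ in pairs))
    # the replacement function looks up whichever search string matched
    return pattern.sub(lambda m: mapping[m.group(0)], text)

print(multi_replace("fooxxxbazyyyquuux",
                    [("quuux", "foo"), ("foo", "bar"), ("baz", "quux")]))
# prints barxxxquuxyyyfoo
```

One difference to be aware of: this picks the leftmost match in the
string first and only uses the order of subs to break ties at the same
position, whereas token_replace applies the first search string to the
whole string before looking at the next one. With overlapping search
strings the two can give different results.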
-- 
http://mail.python.org/mailman/listinfo/python-list
