Patrick Maupin wrote:
> On Apr 2, 6:24 am, Peter Otten <__pete...@web.de> wrote:
>> Thomas Heller wrote:
>> > Maybe I'm just lazy, but what is the fastest way to convert a string
>> > into a tuple containing character sequences and integer numbers, like
>> > this:
>>
>> > 'si_pos_99_rep_1_0.ita' -> ('si_pos_', 99, '_rep_', 1, '_', 0, '.ita')
>>
>> >>> parts = re.compile("([+-]?\d+)").split('si_pos_99_rep_1_0.ita')
>> >>> parts[1::2] = map(int, parts[1::2])
>> >>> parts
>> ['si_pos_', 99, '_rep_', 1, '_', 0, '.ita']
>>
>> Peter
>
> You beat me to it. re.split() seems underappreciated for some
> reason. When I first started using it (even though it was faster for
> the tasks I was using it for than other things) I was really annoyed
> at the empty strings it was providing between matches. It is only
> within the past couple of years that I have come to appreciate the
> elegant solutions that those empty strings allow for. In short,
> re.split() is by far the fastest and most elegant way to use the re
> module for a large class of problems.
>
> So, the only thing I have to add to this solution is that, for this
> particular regular expression, if the source string starts with or
> ends with digits, you will get empty strings at the beginning or end
> of the resultant list, so if this is a problem, you will want to check
> for and discard those.

Thanks to all for these code snippets. Peter's solution is the winner:
it is the most elegant and also the fastest. With an additional list
comprehension to remove the possible empty strings at the start and at
the end, I get 16 us. Interestingly, Xavier's solution (which is similar
to some code that I wrote myself) isn't much slower; I get timings of
around 22 us.

--
Thanks,
Thomas
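
For reference, here is a minimal sketch of Peter's re.split() approach combined with the list comprehension mentioned above. The function name split_ints and the module-level _INT_RE are made up for illustration, and this is not necessarily the exact code that was timed. Note that the filter below drops every empty string, including any that show up between two adjacent numbers, not only those at the start and end.

```python
import re

# Hypothetical name; the capturing group makes re.split() keep the numbers.
_INT_RE = re.compile(r"([+-]?\d+)")


def split_ints(s):
    """Split s into a tuple of text pieces and integers."""
    parts = _INT_RE.split(s)
    # With one capturing group, the odd indices hold the matched numbers.
    parts[1::2] = [int(p) for p in parts[1::2]]
    # Compare against "" explicitly: a plain truth test would also drop 0.
    return tuple(p for p in parts if p != "")
```

Calling split_ints('si_pos_99_rep_1_0.ita') returns the tuple from the original question, and a string that starts or ends with digits no longer produces empty strings at either end.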