On 2012-12-04, Nick Mellor <thebalance...@gmail.com> wrote:
> I have a file full of things like this:
>
> "CAPSICUM RED fresh from Queensland"
>
> Product names (all caps, at start of string) and descriptions
> (mixed case, to end of string) all muddled up in the same
> field. And I need to split them into two fields. Note that if
> the text had said:
>
> "CAPSICUM RED fresh from QLD"
>
> I would want QLD in the description, not shunted forwards and
> put in the product name. So (uncontrived) list comprehensions
> and regex's are out.
>
> I want to split the above into:
>
> ("CAPSICUM RED", "fresh from QLD")
>
> Enter dropwhile and takewhile. 6 lines later:
>
> from itertools import takewhile, dropwhile
> def split_product_itertools(s):
>     words = s.split()
>     allcaps = lambda word: word == word.upper()
>     product, description = takewhile(allcaps, words), dropwhile(allcaps, words)
>     return " ".join(product), " ".join(description)
>
> When I tried to refactor this code to use while or for loops, I
> couldn't find any way that felt shorter or more pythonic:
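For reference, here is a quick check of the quoted function against both sample strings from the post. The function body is the one quoted above, only split over two assignments for readability; the two print calls are an addition for illustration.

```python
from itertools import takewhile, dropwhile

def split_product_itertools(s):
    words = s.split()
    # A word belongs to the product name if upper-casing leaves it unchanged.
    allcaps = lambda word: word == word.upper()
    # takewhile yields the leading all-caps words; dropwhile yields the rest,
    # so a later all-caps word such as "QLD" stays in the description.
    product = takewhile(allcaps, words)
    description = dropwhile(allcaps, words)
    return " ".join(product), " ".join(description)

print(split_product_itertools("CAPSICUM RED fresh from Queensland"))
# ('CAPSICUM RED', 'fresh from Queensland')
print(split_product_itertools("CAPSICUM RED fresh from QLD"))
# ('CAPSICUM RED', 'fresh from QLD')
```

Because both iterators stop or start at the first mixed-case word, the trailing "QLD" is never considered for the product name.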
I'm really tempted to import re, and that means takewhile and
dropwhile need to stay. ;)

But seriously, this is a quick implementation of my first thought.

import string

# s is the combined field, e.g. "CAPSICUM RED fresh from QLD"
description = s.lstrip(string.ascii_uppercase + ' ')
product = s[:-len(description)-1]

-- 
Neil Cerutti
-- 
http://mail.python.org/mailman/listinfo/python-list
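A minimal sketch of the lstrip idea wrapped in a function, in case anyone wants to try it against real data. The name split_product_lstrip is made up for this example, and the slice is computed from the front of the string instead of using the -1 offset, on the assumption that some rows may have no description at all (with an empty description, s[:-len(description)-1] would drop the last character of the product name).

```python
import string

def split_product_lstrip(s):
    # Strip leading capitals and spaces; whatever is left is the description.
    description = s.lstrip(string.ascii_uppercase + " ")
    # Everything before the description is the product name; drop the
    # separating space(s) at its end.
    product = s[:len(s) - len(description)].rstrip()
    return product, description

print(split_product_lstrip("CAPSICUM RED fresh from QLD"))
# ('CAPSICUM RED', 'fresh from QLD')
print(split_product_lstrip("CAPSICUM RED"))
# ('CAPSICUM RED', '')
print(split_product_lstrip("CAPSICUM RED Fresh from QLD"))
# ('CAPSICUM RED F', 'resh from QLD')
```

The last call shows the main caveat: lstrip works per character, not per word, so a description that starts with a capital letter loses that letter to the product name. The takewhile/dropwhile version does not have this problem because it tests whole words.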