John Machin wrote:
> Ermmm ... only remove "the" when you are sure it is a whole word. Even
> then it's a dodgy idea. In the first 1000 lines of the nearest address
> file I had to hand, I found these: Catherine, Matthew, Rotherwood,
> Weatherall, and "The Avenue".

Partial apologies: I wasn't reading Skip's snippet correctly -- he had "THE ", I read "THE". Only "The Avenue" is a problem in the above list.

However Skip's snippet _does_ do damage in cases where the word ends in "the". Grepping lists of placenames found 25 distinct names in UK, including "The Mythe" and "The Wrythe".

Addendum: Given examples in the UK like "Barton in the Beans" (no kiddin') and "Barton-on-the-Heath", replacing "-" by space seems indicated.
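
For what it's worth, here is a rough sketch of the whole-word approach combined with the hyphen replacement. It assumes the original snippet amounted to an uppercase-then-replace of "THE " (I haven't reproduced it exactly), and the names naive() and normalise() are my own inventions for illustration. "Mythe Road" is a made-up address used to show the damage.

```python
import re

# Matches "the" only as a whole word, in any case.
_THE = re.compile(r"\bthe\b", re.IGNORECASE)

def naive(name):
    # Assumed shape of the substring approach: strips "THE " wherever
    # it occurs, including at the end of a longer word.
    return name.upper().replace("THE ", "")

def normalise(name):
    # Replace hyphens by spaces first, so "on-the-Heath" splits into words.
    words = name.replace("-", " ")
    # Drop "the" only where it stands alone as a word.
    words = _THE.sub(" ", words)
    # Collapse runs of whitespace and uppercase for comparison.
    return " ".join(words.split()).upper()

for s in ["Mythe Road", "Catherine", "Barton-on-the-Heath",
          "Barton in the Beans", "The Avenue"]:
    print(repr(s), "->", repr(naive(s)), "vs", repr(normalise(s)))
```

Note that normalise("The Avenue") still comes out as "AVENUE", so the "dodgy idea" caveat stands even with whole-word matching.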