I will take a patch. ;-)

On Jan 23, 7:03 pm, Jonathan Lundell <jlund...@pobox.com> wrote:
> urlify needs a comment to say explicitly what its intention is. That's partly
> because it suppresses quite a few characters that are normally legal in URLs,
> which is confusing.
>
> Also,
>
> > def urlify(s, max_length=80):
> >     s = s.lower()
> >     # string normalization, eg è => e, ñ => n
> >     s = unicodedata.normalize('NFKD', s.decode('utf-8')).encode('ASCII',
> >         'ignore')
> >     # strip entities
> >     s = re.sub('&\w+;', '', s)
>
> this should be '&\w+?;' (that is, non-greedy). Otherwise, a string like
> '&whatever&' will be completely eliminated.
>
> >     # strip everything but letters, numbers, dashes and spaces
> >     s = re.sub('[^a-z0-9\-\s]', '', s)
> >     # replace spaces with dashes
> >     s = s.replace(' ', '-')
> >     # strip multiple contiguous dashes
> >     s = re.sub('-{2,}', '-', s)
> >     # strip dashes at the beginning and end of the string
> >     s = s.strip('-')
> >     # ensure the maximum length
> >     s = s[:max_length-1]
> >     return s
>
> (Stylistically, I think it'd be more readable if the comments were appended
> to the relevant code lines.)
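
For reference, here is a rough sketch of what such a patch might look like. It is only a sketch under a few assumptions: it targets Python 3 (so `s` is already a `str` and the `decode('utf-8')` step from the quoted code is dropped), the docstring wording is a guess at the intent, since the thread does not state it, and the truncation is left exactly as quoted. The comments are appended to the code lines, as suggested.

```python
import re
import unicodedata


def urlify(s, max_length=80):
    """Reduce a string to a lowercase ASCII slug of letters, digits and dashes.

    Assumed intent (not stated in the thread): the result is meant to be a
    readable URL path segment, which is why many characters that are legal
    in URLs are dropped anyway.
    """
    s = s.lower()
    s = unicodedata.normalize('NFKD', s)             # decompose, e.g. è => e + combining accent
    s = s.encode('ascii', 'ignore').decode('ascii')  # drop whatever is not ASCII
    s = re.sub(r'&\w+?;', '', s)                     # strip entities (non-greedy, as suggested)
    s = re.sub(r'[^a-z0-9\-\s]', '', s)              # keep letters, digits, dashes, whitespace
    s = s.replace(' ', '-')                          # spaces become dashes
    s = re.sub(r'-{2,}', '-', s)                     # collapse runs of dashes
    s = s.strip('-')                                 # trim leading and trailing dashes
    s = s[:max_length - 1]                           # as quoted: at most max_length - 1 chars
    return s
```

With this sketch, `urlify('Año &amp; más')` gives `'ano-mas'`. One thing worth checking before committing: `\w` matches neither `&` nor `;`, so the greedy and non-greedy entity patterns should behave the same on inputs like these. The sketch uses the non-greedy form only to follow the suggestion above.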