I will take a patch. ;-)

On Jan 23, 7:03 pm, Jonathan Lundell <jlund...@pobox.com> wrote:
> urlify needs a comment to say explicitly what its intention is. That's partly 
> because it suppresses quite a few characters that are normally legal in URLs, 
> which is confusing.
>
> Also,
>
> > def urlify(s, max_length=80):
> >     s = s.lower()
> >     # string normalization, eg è => e, ñ => n
> >     s = unicodedata.normalize('NFKD', s.decode('utf-8')).encode('ASCII', 'ignore')
> >     # strip entities
> >     s = re.sub('&\w+;', '', s)
>
> this should be '&\w+?;' (that is, non-greedy). Otherwise, a string like 
> '&amp;whatever&amp;' will be completely eliminated.
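
A quick way to check the greedy versus non-greedy question is to run both patterns on that string. This is only a sketch, and the helper name strip_entities is made up for illustration. Since \w matches neither '&' nor ';', the greedy form cannot run past the first semicolon, so both patterns would be expected to leave 'whatever' in place here.

```python
import re

def strip_entities(s, pattern):
    # Hypothetical helper (not part of web2py): drop entity-like tokens from s.
    return re.sub(pattern, '', s)

sample = '&amp;whatever&amp;'
greedy = strip_entities(sample, r'&\w+;')       # pattern from the quoted urlify
non_greedy = strip_entities(sample, r'&\w+?;')  # proposed non-greedy variant
print(repr(greedy), repr(non_greedy))
```

The two would only differ if the repeated class could itself match ';' (for example '&.+;'), which is not the case for \w.
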
>
> >     # strip everything but letters, numbers, dashes and spaces
> >     s = re.sub('[^a-z0-9\-\s]', '', s)
> >     # replace spaces with dashes
> >     s = s.replace(' ', '-')
> >     # strip multiple contiguous dashes
> >     s = re.sub('-{2,}', '-', s)
> >     # strip dashes at the beginning and end of the string
> >     s = s.strip('-')
> >     # ensure the maximum length
> >     s = s[:max_length-1]
> >     return s
>
> (Stylistically, I think it'd be more readable if the comments were appended 
> to the relevant code lines.)
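
For reference, here is roughly what the two suggestions above (a statement of intent, and comments appended to the code lines) could look like. This is a sketch under assumptions, not a patch: it targets Python 3 and takes a str rather than the UTF-8 byte string the quoted code decodes, and the docstring wording is my guess at the intention.

```python
import re
import unicodedata

def urlify(s, max_length=80):
    """Reduce arbitrary text to a lowercase ASCII slug (letters, digits, dashes).

    Assumed intent: produce a readable URL component, not encode every
    character that is legal in a URL.
    """
    s = s.lower()
    s = unicodedata.normalize('NFKD', s)                 # split accents off: è => e + mark
    s = s.encode('ascii', 'ignore').decode('ascii')      # drop the non-ASCII marks
    s = re.sub(r'&\w+;', '', s)                          # strip entities
    s = re.sub(r'[^a-z0-9\-\s]', '', s)                  # keep letters, digits, dashes, whitespace
    s = s.replace(' ', '-')                              # spaces become dashes
    s = re.sub(r'-{2,}', '-', s)                         # collapse runs of dashes
    s = s.strip('-')                                     # no leading or trailing dashes
    return s[:max_length - 1]                            # same truncation as the quoted code
```

For example, urlify('Crème brûlée &amp; more') gives 'creme-brulee-more'. Note that the truncation is kept exactly as quoted, so the result is at most max_length - 1 characters long.
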
