Rudolph wrote:
Batiste wrote:
But it is possible to improve the quality of the generated slug
for some European characters such as é, è, à, â, È, É, À, Â, ö, ä ...
é -> e
è -> e
à -> a
À -> a
I did this once in PHP, it worked really well (yes, I know it's PHP,
that's why I switched to Django):
$slug = strtolower(htmlentities($title, ENT_NOQUOTES, 'UTF-8'));
$slug_no_accents =
preg_replace("/&(.)(acute|cedil|circ|grave|ring|tilde|uml);/", "$1", $slug);
One should be able to port this to Django in no time.
isn't this something that unicode should be able to do?
try this:
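    import unicodedata

    def strip(text):
        # NFD splits each accented character into a base letter plus combining marks
        decomposed_form = unicodedata.normalize('NFD', text)
        # keep only letters (category 'L*'); this also drops digits, spaces and punctuation
        simplechars = [c for c in decomposed_form
                       if unicodedata.category(c)[0] == 'L']
        return ''.join(simplechars)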
first it asks python's unicodedata module to decompose the string's
accented characters into a separate base character and accent-mark characters.
then it goes through the string and keeps only the characters whose
unicode category is a letter, which drops the accent marks.
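
A minimal sketch of the same decomposition idea, assuming Python 3 (where str is already unicode). The name strip_accents is hypothetical and not part of the code above. This variant filters out only the combining marks (category 'M*') instead of keeping only letters, so digits and spaces survive, which may suit slug generation better:

```python
import unicodedata

def strip_accents(text):
    """Drop combining marks but keep everything else (digits, spaces, ...)."""
    # NFD turns e.g. U+00E9 into 'e' followed by U+0301 (combining acute accent)
    decomposed = unicodedata.normalize('NFD', text)
    return ''.join(c for c in decomposed
                   if not unicodedata.category(c).startswith('M'))

# Show what the decomposition of a single accented character looks like
print([unicodedata.name(c) for c in unicodedata.normalize('NFD', 'é')])
# ['LATIN SMALL LETTER E', 'COMBINING ACUTE ACCENT']

print(strip_accents('Élève à 10 ans'))
# Eleve a 10 ans
```

Characters without a canonical decomposition, such as ø or ß, pass through unchanged with either approach and would need an explicit mapping.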
please note that it's 1:38AM here, so my code can be very wrong :) (but
it works :)...and it's clearly not optimized for speed :)
gabor