Maciej Bliziński wrote:
> On Sat, 2006-08-26 at 16:48 +0200, Michal wrote:
>> I also added a few Slovak characters (Czechs and Slovaks are 
>> brothers too, and they have similar alphabets).
> 
> I looked at the Latin Unicode article in Wikipedia:
> http://en.wikipedia.org/wiki/Latin_Unicode
> 
> There are characters with accents I have never seen before... The Vietnamese
> alphabet, for instance, has glyphs which are Latin characters with
> unusual accents, for example: ã, or even with two accents: ặ
> 
> For most of the characters, it's pretty easy to remove the accents.
> However, some characters are mysterious: should Ƨ be translated to S?
> I don't know. So I just deleted them from the accent removal list.
> 

Nice work, Maciej :)

In my first post I wrote: "I will be glad if some of you add your own 
national characters."
Each language has its own specific characters and rules for them, so 
I think somebody from each of those countries should check your version of 
the patch.


> I'm including a patch with "from" and "to" constants extended with all
> the characters I found on Wikipedia that seemed to be of any use. This
> should cover all the Slavic countries except those which use the Cyrillic
> alphabet.
> 
> One thing... some characters want to be translated into _two_ ASCII
> characters, for example Æ to AE. This would require a different data
> structure. In its present form, I just entered E. The same goes for ß,
> which I replaced with a single S.

Maybe we could write one new function that translates a single Unicode 
character into the appropriate two ASCII characters? (Translating accented 
characters would then be done in two steps: 1. replAccents, 2. the new function.)
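
A rough sketch of what that second function might look like, assuming a
small lookup table. The name expandLigatures and the table contents are
hypothetical and not part of the patch. I think it would have to run before
replAccents (or Æ, æ and ß would have to be dropped from the from/to lists),
since the patch as posted already turns those into single letters:

```javascript
// Hypothetical helper, not part of the patch: expands characters that
// need more than one ASCII letter. The table is illustrative only.
var EXPANSIONS = {
    '\u00C6': 'AE',  // Æ
    '\u00E6': 'ae',  // æ
    '\u00DF': 'ss',  // ß
    '\u0152': 'OE',  // Œ
    '\u0153': 'oe'   // œ
};

function expandLigatures(s) {
    var out = '';
    for (var i = 0; i < s.length; i++) {
        var c = s.charAt(i);
        // Use the expansion if one is listed, otherwise keep the character.
        out += EXPANSIONS.hasOwnProperty(c) ? EXPANSIONS[c] : c;
    }
    return out;
}
```

With this table, expandLigatures('Æther straße') returns 'AEther strasse'.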

> 
> Regards,
> Maciej
> 
> 
> 
> ------------------------------------------------------------------------
> 
> Index: django/contrib/admin/media/js/urlify.js
> ===================================================================
> --- django/contrib/admin/media/js/urlify.js   (revision 3618)
> +++ django/contrib/admin/media/js/urlify.js   (working copy)
> @@ -1,4 +1,43 @@
> +function replAccents(s)
> +{
> +    // Replacement lists based on article in Wikipedia,
> +    // http://en.wikipedia.org/wiki/Latin_Unicode
> +    // from and to strings must have same number of characters
> +    var from = 'ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝßàáâãäåæçèéêëìíîï';
> +    var to   = 'AAAAAAECEEEEIIIIDNOOOOOOUUUUYSaaaaaaaceeeeiiii';
> +    from += 'ñòóôõöøùúûüýÿĀāĂăĄąĆćĈĉĊċČčĎďĐđĒēĔĕĖėĘęĚěĜĝĞğĠġĢģ';
> +    to   += 'noooooouuuuyyaaaaaaccccccccddddeeeeeeeeeegggggggg';
> +    from += 'ĤĥĦħĨĩĪīĬĭĮįİıĴĵĶķĸĹĺĻļĽľĿŀŁłŃńŅņŇňŉŊŋŌōŎŏŐőŒœŔŕŖŗŘř';
> +    to   += 'hhhhiiiiiiiiiijjkkkllllllllllnnnnnnnnnoooooooorrrrrr';
> +    from += 'ŚśŜŝŞşŠšŢţŤťŦŧŨũŪūŬŭŮůŰűŲųŴŵŶŷŸŹźŻżŽžſƀƂƃƄƅƇƈƉƊƐƑƒƓƔ';
> +    to   += 'ssssssssttttttuuuuuuuuuuuuwwyyyzzzzzzfbbbbbccddeffgv';
> +    from += 'ƖƗƘƙƚƝƞƟƠƤƦƫƬƭƮƯưƱƲƳƴƵƶǍǎǏǐǑǒǓǔǕǖǗǘǙǚǛǜǝǞǟǠǡǢǣǤǥǦǧǨǩ';
> +    to   += 'likklnnoopettttuuuuyyzzaaiioouuuuuuuuuueaaaaeeggggkk';
> +    from += 'ǪǫǬǭǰǴǵǷǸǹǺǻǼǽǾǿȀȁȂȃȄȅȆȇȈȉȊȋȌȍȎȏȐȑȒȓȔȕȖȗȘșȚțȞȟȤȥȦȧȨȩ';
> +    to   += 'oooojggpnnaaeeooaaaaeeeeiiiioooorrrruuuusstthhzzaaee';
> +    from += 'ȪȫȬȭȮȯȰȱȲȳḀḁḂḃḄḅḆḇḈḉḊḋḌḍḎḏḐḑḒḓḔḕḖḗḘḙḚḛḜḝḞḟḠḡḢḣḤḥḦḧḨḩḪḫ';
> +    to   += 'ooooooooyyaabbbbbbccddddddddddeeeeeeeeeeffgghhhhhhhhhh';
> +    from += 'ḬḭḮḯḰḱḲḳḴḵḶḷḸḹḺḻḼḽḾḿṀṁṂṃṄṅṆṇṈṉṊṋṌṍṎṏṐṑṒṓṔṕṖṗṘṙṚṛṜṝṞṟ';
> +    to   += 'iiiikkkkkkllllllllmmmmmmnnnnnnnnoooooooopppprrrrrrrr';
> +    from += 'ṠṡṢṣṤṥṦṧṨṩṪṫṬṭṮṯṰṱṲṳṴṵṶṷṸṹṺṻṼṽṾṿẀẁẂẃẄẅẆẇẈẉẊẋẌẍẎẏẐẑẒẓẔẕ';
> +    to   += 'ssssssssssttttttttuuuuuuuuuuvvvvwwwwwwwwwwxxxxyyzzzzzz';
> +    from += 'ẖẗẘẙẚẛẠạẢảẤấẦầẨẩẪẫẬậẮắẰằẲẳẴẵẶặẸẹẺẻẼẽẾếỀềỂểỄễỆệỈỉỊị';
> +    to   += 'htwyafaaaaaaaaaaaaaaaaaaaaaaaaeeeeeeeeeeeeeeeeiiii';
> +    from += 'ỌọỎỏỐốỒồỔổỖỗỘộỚớỜờỞởỠỡỢợỤụỦủỨứỪừỬửỮữỰựỲỳỴỵỶỷỸỹ';
> +    to   += 'oooooooooooooooooooooooouuuuuuuuuuuuuuyyyyyyyy';
> +
> +    for (var i = 0; i != s.length; i++) {
> +        var x = from.indexOf(s.charAt(i));
> +        if (x != -1) {
> +            var r = new RegExp(from.charAt(x), 'g');
> +            s = s.replace(r, to.charAt(x));
> +        }
> +    }
> +    return s;
> +}
> +
>  function URLify(s, num_chars) {
> +    s = replAccents(s);
>      // changes, e.g., "Petty theft" to "petty_theft"
>      // remove all these words from the string before urlifying
>      removelist = ["a", "an", "as", "at", "before", "but", "by", "for", "from",
> 
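
Since the comment in the patch says the from and to strings must have the
same number of characters, a small check might catch slips when people add
their own characters. This is only a minimal sketch; checkTables is a
hypothetical name, not part of the patch or of Django:

```javascript
// Hypothetical sanity check, not part of the patch.
// Returns a list of problems found in a from/to pair (empty if none).
function checkTables(from, to) {
    var problems = [];
    if (from.length != to.length) {
        problems.push('length mismatch: ' + from.length + ' vs ' + to.length);
    }
    var seen = {};
    for (var i = 0; i < from.length; i++) {
        var c = from.charAt(i);
        // A source character listed twice makes the second entry unreachable.
        if (seen.hasOwnProperty(c)) {
            problems.push('duplicate source character at index ' + i);
        }
        seen[c] = true;
        // Every replacement is expected to be plain ASCII.
        if (i < to.length && to.charCodeAt(i) > 127) {
            problems.push('non-ASCII replacement at index ' + i);
        }
    }
    return problems;
}
```

An empty result means the pair is consistent. Note that JavaScript counts
UTF-16 code units, so a character pasted in decomposed form (base letter
plus a separate mark) shows up as a length mismatch.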


--~--~---------~--~----~------------~-------~--~----~
You received this message because you are subscribed to the Google Groups 
"Django users" group.
To post to this group, send email to django-users@googlegroups.com
To unsubscribe from this group, send email to [EMAIL PROTECTED]
For more options, visit this group at 
http://groups.google.com/group/django-users
-~----------~----~----~----~------~----~------~--~---
