On Sunday, 2 September 2012 at 11:07:35 UTC+2, Ian wrote:
> On Sun, Sep 2, 2012 at 1:36 AM, <wxjmfa...@gmail.com> wrote:
> > I still remember my thoughts when I read the PEP 393
> > discussion: "this is not logical", "they do no understand
> > typography", "atomic character ???", ...
>
> That would indicate one of two possibilities. Either:
>
> 1) Everybody in the PEP 393 discussion except for you is clueless
> about how to implement a Unicode type; or
>
> 2) You are clueless about how to implement a Unicode type.
>
> Taking into account Occam's razor, and also that you seem to be unable
> or unwilling to offer a solid rationale for those thoughts, I have to
> say that I'm currently leaning toward the second possibility.
>
> > Real world exemples.
> >
> > >>> import libfrancais
> > >>> li = ['noël', 'noir', 'nœud', 'noduleux', \
> > ...       'noétique', 'noèse', 'noirâtre']
> > >>> r = libfrancais.sortfr(li)
> > >>> r
> > ['noduleux', 'noël', 'noèse', 'noétique', 'nœud', 'noir',
> > 'noirâtre']
>
> libfrancais does not appear to be publicly available. It's not listed
> in PyPI, and googling for "python libfrancais" turns up nothing
> relevant.
>
> Rewriting the example to use locale.strcoll instead:
>
> >>> li = ['noël', 'noir', 'nœud', 'noduleux', 'noétique', 'noèse', 'noirâtre']
> >>> import locale
> >>> locale.setlocale(locale.LC_ALL, 'French_France')
> 'French_France.1252'
> >>> import functools
> >>> sorted(li, key=functools.cmp_to_key(locale.strcoll))
> ['noduleux', 'noël', 'noèse', 'noétique', 'nœud', 'noir', 'noirâtre']
>
> # Python 3.2
> >>> import timeit
> >>> timeit.repeat("sorted(li, key=functools.cmp_to_key(locale.strcoll))",
> ...     "import functools; import locale; li = ['noël', 'noir', 'nœud', "
> ...     "'noduleux', 'noétique', 'noèse', 'noirâtre']", number=10000)
> [0.5544277025009592, 0.5370117249557325, 0.5551836677925053]
>
> # Python 3.3
> >>> import timeit
> >>> timeit.repeat("sorted(li, key=functools.cmp_to_key(locale.strcoll))",
> ...     "import functools; import locale; li = ['noël', 'noir', 'nœud', "
> ...     "'noduleux', 'noétique', 'noèse', 'noirâtre']", number=10000)
> [0.1421166788364303, 0.12389078130001963, 0.13184190553613462]
>
> As you can see, Python 3.3 is about 77% faster than Python 3.2 on this
> example. If this was intended to show that the Python 3.3 Unicode
> representation is a regression over the Python 3.2 implementation,
> then it's a complete failure as an example.

- Unfortunately, I got the opposite, and even much worse, results on my
  Windows box.
- Bear in mind that libfrancais is one of my own modules, and that it
  does a little bit more than the std sorting tools.

My rationale is very simple.

1) I have never heard of anything better than sticking with one of the
Unicode coding schemes (general theory).

2) I am not at all convinced by the "new" Py 3.3 algorithm. I'm not the
only one who has noticed problems.

Arguing "it is fast enough" is not a correct answer.

jmf
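
Since the two of us report opposite timings, one script run unchanged under both interpreters on the same machine would make the numbers comparable. What follows is only a minimal sketch of the quoted locale.strcoll benchmark, not a replacement for libfrancais. The names set_french_collation, sort_words, time_sort and CANDIDATE_LOCALES are hypothetical. The locale names are assumptions that depend on the operating system, and the script falls back to the active locale if none of them is available. It sets LC_COLLATE only (the quoted session set LC_ALL), and the lambda adds a small constant call overhead to every run.

```python
import functools
import locale
import sys
import timeit

WORDS = ['noël', 'noir', 'nœud', 'noduleux', 'noétique', 'noèse', 'noirâtre']

# Assumed locale names; which ones exist depends on the operating system.
CANDIDATE_LOCALES = ('French_France', 'fr_FR.UTF-8', 'fr_FR')


def set_french_collation():
    """Try to enable French collation; return the locale set, or None."""
    for name in CANDIDATE_LOCALES:
        try:
            return locale.setlocale(locale.LC_COLLATE, name)
        except locale.Error:
            continue
    return None  # keep whatever collation locale is already active


def sort_words(words):
    """Sort with locale.strcoll, as in the quoted example."""
    return sorted(words, key=functools.cmp_to_key(locale.strcoll))


def time_sort(words, number=10000, repeat=3):
    """Return `repeat` timings in seconds, each covering `number` sorts."""
    timer = timeit.Timer(lambda: sort_words(words))
    return timer.repeat(repeat=repeat, number=number)


if __name__ == '__main__':
    print(sys.version)
    print('collation locale:', set_french_collation())
    print(sort_words(WORDS))
    print(time_sort(WORDS))
```

Posting the printed interpreter version and collation locale next to the timings shows whether two runs are actually comparable.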