In article ,
Neil Hodgson wrote:
> Low-level string manipulation often deals with blocks larger than
> an individual character for speed. Generally 32 or 64-bits at a time
> using the CPU or 128 or 256 using the vector unit. Then there may be
> entry/exit code to handle initial alignmen
Steven D'Aprano:
So while you might save memory by using "UTF-24" instead of UTF-32, it
would probably be slower because you would have to grab three bytes at a
time instead of four, and the hardware probably does not directly support
that.
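To make the fixed-width point concrete, here is a minimal Python sketch (an illustration, not something posted in the thread). It checks that UTF-32 uses exactly four bytes per code point, and packs code points into three bytes each to mimic the hypothetical "UTF-24". The helper names `encode_utf24` and `decode_utf24` are invented for this example; note that reading a value back needs manual byte assembly, because there is no native three-byte integer read to lean on.

```python
import struct

def encode_utf24(text):
    """Pack each code point into 3 big-endian bytes (hypothetical 'UTF-24')."""
    return b"".join(ord(ch).to_bytes(3, "big") for ch in text)

def decode_utf24(data):
    """Rebuild the string by reading 3 bytes at a time."""
    return "".join(
        chr(int.from_bytes(data[i:i + 3], "big"))
        for i in range(0, len(data), 3)
    )

sample = "a\u1234\U0001F600"   # an ASCII, a BMP and an astral code point

utf32 = sample.encode("utf-32-be")   # explicit byte order, so no BOM is added
assert len(utf32) == 4 * len(sample)

# UTF-32 can be decoded with one native 32-bit read per code point.
code_points = struct.unpack(">%dI" % len(sample), utf32)
assert code_points == tuple(ord(ch) for ch in sample)

utf24 = encode_utf24(sample)
assert len(utf24) == 3 * len(sample)
assert decode_utf24(utf24) == sample
```

Whether the three-byte form would really be slower depends on the hardware and the implementation; the sketch only shows the layout difference.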
Low-level string manipulation often deals with bl
--
utf-32 is already here. You are all most probably [*]
using it without noticing it. How? By using OpenType fonts,
without counting the text processing applications using them.
Why? Because there is no other way to do it.
[*] depending on the font, the internal table(s), e.g. the "cmap" table,
ar
In article
<70844d17-22bd-4394-86e2-d7ef3efc6...@ps9g2000pbb.googlegroups.com>,
rusi wrote:
> I believe that there are many things about unicode that are less than
> satisfactory. Some are downright asinine like the 'prime-real-estate'
> devoted to the control characters and never used.
Ah, bu
On Mar 16, 6:29 pm, Roy Smith wrote:
> In article <51440235$0$29965$c3e8da3$54964...@news.astraweb.com>,
> Steven D'Aprano wrote:
>
> > UTF-32 is a *fixed width* storage mechanism where every code point takes
> > exactly four bytes. Since the entire Unicode range will fit in four
> > bytes, that
In article <51440235$0$29965$c3e8da3$54964...@news.astraweb.com>,
Steven D'Aprano wrote:
> UTF-32 is a *fixed width* storage mechanism where every code point takes
> exactly four bytes. Since the entire Unicode range will fit in four
> bytes, that ensures that every code point is covered, and
On Fri, 15 Mar 2013 21:26:28 -0700, rusi wrote:
> The unicode standard is language-agnostic. Unicode implementations exist
> within a language x implementation x C-compiler implementation x … --
> Notice the gccs in Andriy's comparison. Do they signify?
They should not. Ideally, the behaviour
On Fri, 15 Mar 2013 21:35:42 -0700, rusi wrote:
> And ignores that 3.3 trades time for space.
So what? Lists, dicts and sets trade time for space: they are generally
over-allocated to ensure a certain level of performance. The language
designers are perfectly permitted to make that choice. If
On Sat, 16 Mar 2013 15:09:56 +1100, Chris Angelico wrote:
> On Sat, Mar 16, 2013 at 2:56 PM, Mark Lawrence
> wrote:
>> On 16/03/2013 02:44, Thomas 'PointedEars' Lahn wrote:
>>>
>>> Chris Angelico wrote:
>>>
>>>
>> Thomas and Chris, would the two of you be kind enough to explain to
>> morons such
On Mar 16, 9:12 am, Thomas 'PointedEars' Lahn
wrote:
> You have still no clue what you are talking about. Get yourself informed at
> least about the (deprecated/obsolete) “language” and the (standards-
> compliant) “type” attribute of SCRIPT/“script” elements before you post on
> this again.
>
>
On 3/16/2013 12:35 AM, rusi wrote:
> And ignores that 3.3 trades time for space.
This is at least a partial falsehood.
It is really sad to see you parroting this.
--
Terry Jan Reedy
--
http://mail.python.org/mailman/listinfo/python-list
On 16/03/2013 04:35, rusi wrote:
On Mar 16, 9:09 am, Chris Angelico wrote:
On Sat, Mar 16, 2013 at 2:56 PM, Mark Lawrence wrote:
On 16/03/2013 02:44, Thomas 'PointedEars' Lahn wrote:
Chris Angelico wrote:
Thomas and Chris, would the two of you be kind enough to explain to morons
such as
On Mar 16, 9:09 am, Chris Angelico wrote:
> On Sat, Mar 16, 2013 at 2:56 PM, Mark Lawrence
> wrote:
> > On 16/03/2013 02:44, Thomas 'PointedEars' Lahn wrote:
>
> >> Chris Angelico wrote:
>
> > Thomas and Chris, would the two of you be kind enough to explain to morons
> > such as myself how all t
On Mar 16, 8:56 am, Mark Lawrence wrote:
> On 16/03/2013 02:44, Thomas 'PointedEars' Lahn wrote:
>
> > Chris Angelico wrote:
>
> Thomas and Chris, would the two of you be kind enough to explain to
> morons such as myself how all the ECMAScript stuff relates to Python's
> unicode as implemented vi
On Sat, Mar 16, 2013 at 3:12 PM, Thomas 'PointedEars' Lahn
wrote:
> You have still no clue what you are talking about.
Fine. I'll shut up on the topic. Go ahead, the floor is yours, go and
make whatever point you want to make. Clearly I have absolutely no
idea about characters, strings, Unicode,
Chris Angelico wrote:
> On Sat, Mar 16, 2013 at 1:44 PM, Thomas 'PointedEars' Lahn
> wrote:
>> Chris Angelico wrote:
>>> The ECMAScript spec says that strings are stored and represented in
>>> UTF-16.
>>
>> No, it does not (which Edition?). It says in Edition 5.1:
>
> Okay, I was sloppy in my t
On Sat, Mar 16, 2013 at 2:56 PM, Mark Lawrence wrote:
> On 16/03/2013 02:44, Thomas 'PointedEars' Lahn wrote:
>>
>> Chris Angelico wrote:
>>
>
> Thomas and Chris, would the two of you be kind enough to explain to morons
> such as myself how all the ECMAScript stuff relates to Python's unicode as
>
On Sat, Mar 16, 2013 at 1:44 PM, Thomas 'PointedEars' Lahn
wrote:
> Chris Angelico wrote:
>> The ECMAScript spec says that strings are stored and represented in
>> UTF-16.
>
> No, it does not (which Edition?). It says in Edition 5.1:
Okay, I was sloppy in my terminology. A language will seldom,
On 16/03/2013 02:44, Thomas 'PointedEars' Lahn wrote:
Chris Angelico wrote:
Thomas and Chris, would the two of you be kind enough to explain to
morons such as myself how all the ECMAScript stuff relates to Python's
unicode as implemented via PEP 393 as you've lost me, easily done I know.
-
Chris Angelico wrote:
> Thomas 'PointedEars' Lahn […] wrote:
>> Chris Angelico wrote:
>>> jmf, I'd like to see evidence that there has been a performance
>>> regression compared against a wide build of Python 3.2. You still have
>>> never answered this fundamental, that the narrow builds of Python
000",number=1)
[0.901988381985575, 0.7517840950167738, 0.7540924890199676]
>>> repeat("s=s[:-1]+'\u1234'","s='\u1234sdf'*1",number=1)
[0.3069786810083315, 0.17701858800137416, 0.1769046070112381]
>>> repeat("s=s[:-1]+'
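For anyone wanting to rerun measurements of this shape, a self-contained version might look like the following. It reuses the statement and setup string visible in the transcript above; `repeat=3` is passed explicitly (an assumption on my part) so that three timings come back whatever the interpreter's default is, and the absolute numbers will differ by machine and build.

```python
from timeit import repeat

# Statement under test: drop the last character and append a non-ASCII one.
stmt = "s = s[:-1] + '\u1234'"
# Setup: build the starting string once per timing run.
setup = "s = '\u1234sdf' * 1"

timings = repeat(stmt, setup, number=1, repeat=3)
print(timings)        # three floats, in seconds
print(min(timings))   # the minimum is usually the least noisy figure
```

Running the same script under each interpreter being compared, on the same machine, keeps the comparison fair.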
3.2 and 2.7 results on my desktop using Chris's examples
(Hope I cut-pasted them correctly)
-
Welcome to the Emacs shell
~ $ python3
Python 3.2.3 (default, Feb 20 2013, 17:02:41)
[GCC 4.7.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
On 3/14/2013 7:14 PM, Terry Reedy wrote:
On 3/14/2013 6:48 AM, rusi wrote:
On Mar 14, 11:47 am, Chris Angelico wrote:
I expect that Python 3.2 will behave comparably to the 2.6 stats, but
I don't have 3.2s handy - can someone confirm please?
I have 3.2 but not 3.3. Can run it later today if
On 3/14/2013 6:48 AM, rusi wrote:
On Mar 14, 11:47 am, Chris Angelico wrote:
I expect that Python 3.2 will behave comparably to the 2.6 stats, but
I don't have 3.2s handy - can someone confirm please?
I have 3.2 but not 3.3. Can run it later today if no one does.
But better if someone with b
On Mar 14, 11:47 am, Chris Angelico wrote:
> I expect that Python 3.2 will behave comparably to the 2.6 stats, but
> I don't have 3.2s handy - can someone confirm please?
I have 3.2 but not 3.3. Can run it later today if no one does.
But better if someone with both on the same machine do the com
On Thu, Mar 14, 2013 at 3:05 PM, Steven D'Aprano
wrote:
> That depends on how you use the strings. Because strings are immutable,
> there isn't really anything like "switching between widths" -- the width
> is set when the string is created, and then remains fixed.
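One way to observe this on CPython 3.3 or later is to compare `sys.getsizeof` for strings of different lengths. The sketch below is illustrative only: `bytes_per_char` is a made-up helper, and the numbers rely on CPython's PEP 393 layout rather than on anything the language itself guarantees.

```python
import sys

def bytes_per_char(s):
    """Estimate storage per character for a non-empty string (CPython 3.3+).

    Doubling the string adds len(s) characters of the same width, so the
    fixed object header cancels out of the difference.
    """
    return (sys.getsizeof(s * 2) - sys.getsizeof(s)) // len(s)

ascii_text = "abc" * 100
bmp_text = ascii_text + "\u1234"         # a new string; ascii_text is unchanged
astral_text = ascii_text + "\U0001F600"  # again a new, wider string

print(bytes_per_char(ascii_text))   # 1 on CPython 3.3+
print(bytes_per_char(bmp_text))     # 2 on CPython 3.3+
print(bytes_per_char(astral_text))  # 4 on CPython 3.3+
```

Concatenation never widens `ascii_text` in place; each `+` builds a fresh string whose width is chosen from its widest character.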
The nearest thing to "switching
On Thu, Mar 14, 2013 at 1:35 PM, Terry Reedy wrote:
>On 3/13/2013 7:43 PM, Chris Angelico wrote:
>> It's complexity cost, though, and people would need to know when it
>> would be worth giving Python that switch to change its string format.
>> Plus, every C extension would need to cope with both f
On Thu, 14 Mar 2013 02:01:35 +, MRAB wrote:
> On 14/03/2013 00:55, Chris Angelico wrote:
>> On Thu, Mar 14, 2013 at 11:52 AM, MRAB
>> wrote:
>>> On 13/03/2013 23:43, Chris Angelico wrote:
On Thu, Mar 14, 2013 at 3:49 AM, rusi wrote:
>
> On Mar 13, 3:59 pm, Chris Angelico w
On 3/13/2013 7:43 PM, Chris Angelico wrote:
On Thu, Mar 14, 2013 at 3:49 AM, rusi wrote:
This assumes that there are only three choices:
- narrow build that is buggy (surrogate pairs for astral characters)
- wide build that is 4-fold space inefficient for wide variety of
common (ASCII) use-cas
On 14/03/2013 00:55, Chris Angelico wrote:
On Thu, Mar 14, 2013 at 11:52 AM, MRAB wrote:
On 13/03/2013 23:43, Chris Angelico wrote:
On Thu, Mar 14, 2013 at 3:49 AM, rusi wrote:
On Mar 13, 3:59 pm, Chris Angelico wrote:
On Wed, Mar 13, 2013 at 9:11 PM, rusi wrote:
> Uhhh..
> Making the
On Thu, Mar 14, 2013 at 11:52 AM, MRAB wrote:
> On 13/03/2013 23:43, Chris Angelico wrote:
>>
>> On Thu, Mar 14, 2013 at 3:49 AM, rusi wrote:
>>>
>>> On Mar 13, 3:59 pm, Chris Angelico wrote:
On Wed, Mar 13, 2013 at 9:11 PM, rusi wrote:
> Uhhh..
> Making the subject line use
On 13/03/2013 23:43, Chris Angelico wrote:
On Thu, Mar 14, 2013 at 3:49 AM, rusi wrote:
On Mar 13, 3:59 pm, Chris Angelico wrote:
On Wed, Mar 13, 2013 at 9:11 PM, rusi wrote:
> Uhhh..
> Making the subject line useful for all readers
I should have read this one before replying in the other t
On Thu, Mar 14, 2013 at 4:42 AM, Thomas 'PointedEars' Lahn
wrote:
> Chris Angelico wrote:
>
>> On Wed, Mar 13, 2013 at 9:11 PM, rusi wrote:
>>> Uhhh..
>>> Making the subject line useful for all readers
>>
>> I should have read this one before replying in the other thread.
>>
>> jmf, I'd like to s
On Thu, Mar 14, 2013 at 3:49 AM, rusi wrote:
> On Mar 13, 3:59 pm, Chris Angelico wrote:
>> On Wed, Mar 13, 2013 at 9:11 PM, rusi wrote:
>> > Uhhh..
>> > Making the subject line useful for all readers
>>
>> I should have read this one before replying in the other thread.
>>
>> jmf, I'd like to s
Chris Angelico wrote:
> On Wed, Mar 13, 2013 at 9:11 PM, rusi wrote:
>> Uhhh..
>> Making the subject line useful for all readers
>
> I should have read this one before replying in the other thread.
>
> jmf, I'd like to see evidence that there has been a performance
> regression compared against
On Mar 13, 3:59 pm, Chris Angelico wrote:
> On Wed, Mar 13, 2013 at 9:11 PM, rusi wrote:
> > Uhhh..
> > Making the subject line useful for all readers
>
> I should have read this one before replying in the other thread.
>
> jmf, I'd like to see evidence that there has been a performance
> regress
On Wed, Mar 13, 2013 at 9:11 PM, rusi wrote:
> Uhhh..
> Making the subject line useful for all readers
I should have read this one before replying in the other thread.
jmf, I'd like to see evidence that there has been a performance
regression compared against a wide build of Python 3.2. You stil
On Mar 13, 3:07 pm, rusi wrote:
> On Mar 13, 2:36 pm, jmfauth wrote:
>
> > As a reply to rusi's
> > comment:http://groups.google.com/group/comp.lang.python/browse_thread/thread/...
>
> > From string creation to the itertools usage. A medley. Some timings.
>
> > Important:
> > The