On 7/14/2017 9:20 PM, Steve D'Aprano wrote:
On Sat, 15 Jul 2017 07:12 am, Terry Reedy wrote:
Does Go use bytes for text, like most people did in Python 2, or a separate
text string class that hides the internal encoding format and
implementation? In other words, if you do the equivalent of print
On Sat, 15 Jul 2017 04:10 am, Marko Rauhamaa wrote:
> Steve D'Aprano :
>> On Fri, 14 Jul 2017 11:31 pm, Marko Rauhamaa wrote:
[...]
>>> As it stands, we have
>>>
>>>     è --[encode>-- Unicode --[reencode>-- UTF-8
>>
>> I can't even work out what you're trying to say here.
>
> I can tell, yet tha
On Sat, 15 Jul 2017 07:12 am, Terry Reedy wrote:
> Does Go use bytes for text, like most people did in Python 2, or a separate
> text string class that hides the internal encoding format and
> implementation? In other words, if you do the equivalent of print(s)
> where s is a text string with a mix
I just started using Python and I am writing code to access my serial port
using pyserial. I have no problem with Unix-based text coming in the stream
using an LF (0x0A) record separator. I am also using non-blocking I/O. However, I
have some sensor devices that use the Windows CRLF (0x0D, 0x0A) record
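
The question is cut off here, so the following is only a minimal sketch of one common approach, not an answer taken from the thread. It assumes the bytes read from the port are accumulated in a buffer by the caller; `split_records` is a hypothetical helper name and no pyserial calls are used. Splitting on LF and then stripping a trailing CR handles both LF and CRLF (0x0D, 0x0A) devices.

```python
def split_records(buffer):
    """Split a byte buffer into complete records.

    Returns (records, remainder). Records end at LF; a CR directly
    before the LF is dropped, so LF and CRLF streams both work.
    The remainder is the incomplete tail to keep for the next read.
    """
    *records, remainder = buffer.split(b"\n")
    return [record.rstrip(b"\r") for record in records], remainder


# Simulated chunks, as they might arrive from a non-blocking read.
pending = b""
for chunk in (b"temp=21\r\nhum", b"=40\r", b"\nok\n"):
    records, pending = split_records(pending + chunk)
    for record in records:
        print(record)
```

Keeping the remainder and prepending it to the next chunk is what makes this safe when a CRLF pair is split across two reads.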
On 7/14/2017 5:51 PM, Marko Rauhamaa wrote:
Yes, in Python2, Go, C and GNU textutils, when you print a text string
containing a mixture of languages, you see characters.
Why?
Because that's what the terminal emulator chooses to do upon receiving
those bytes.
>>> s = u'\u1171\u\u\u444
Terry Reedy :
> On 7/14/2017 10:30 AM, Michael Torrie wrote:
>> On 07/14/2017 07:31 AM, Marko Rauhamaa wrote:
>>> Of course, UTF-8 in a bytes object doesn't make the situation any
>>> better, but does it make it any worse?
>>
>>>
>>> As it stands, we have
>>>
>>> è --[encode>-- Unicode --[reen
On 7/14/2017 10:30 AM, Michael Torrie wrote:
On 07/14/2017 07:31 AM, Marko Rauhamaa wrote:
Of course, UTF-8 in a bytes object doesn't make the situation any
better, but does it make it any worse?
As it stands, we have
è --[encode>-- Unicode --[reencode>-- UTF-8
Why is one encoding form
Michael Torrie :
> On 07/14/2017 07:31 AM, Marko Rauhamaa wrote:
>> Of course, UTF-8 in a bytes object doesn't make the situation any
>> better, but does it make it any worse?
>>
>> As it stands, we have
>>
>>     è --[encode>-- Unicode --[reencode>-- UTF-8
>>
>> Why is one encoding format bette
Rhodri James :
> On 14/07/17 15:14, Marko Rauhamaa wrote:
>> I'd like to understand this better. Maybe you have a couple of
>> examples to share?
>
> Sure.
>
> What I've mostly been looking at recently has been the Expat XML parser.
> XML chooses to deal with one of your problems by defining that
On 2017-07-14, Rhodri James wrote:
> On 14/07/17 15:32, Michael Torrie wrote:
>> Are you saying that dealing with Unicode in Google Go, which
>> uses UTF-8 in memory, is adding an extra layer of complexity
>> and makes things worse than they might be in Python?
>
> I'm not familiar with Go. If th
Steve D'Aprano :
> On Fri, 14 Jul 2017 11:31 pm, Marko Rauhamaa wrote:
>> Of course, UTF-8 in a bytes object doesn't make the situation any
>> better, but does it make it any worse?
>
> Sure it does. You want the human reader to be able to predict the
> number of graphemes ("characters") by sight.
On Fri, 14 Jul 2017 11:31 pm, Marko Rauhamaa wrote:
> Steve D'Aprano :
>
>> These are only a *few* of the *easy* questions that need to be
>> answered before we can even consider your question:
>>
>>> So the question is, should we have a third type for text. Or should
>>> the semantics of strings
On 07/14/2017 03:57 PM, jorge.conr...@cptec.inpe.br wrote:
Hi,
I installed the GDAL 2.2.1 using conda. Then I did:
import gdal
and I had:
Traceback (most recent call last):
File "", line 1, in <module>
File "/home/conrado/miniconda2/lib/python2.7/site-packages/gdal.py",
line 2, in <module>
fr
On Fri, 14 Jul 2017 09:06 am, Ned Batchelder wrote:
> Steve's summary is qualitatively right, but a little off on the quantitative
> details. Lists don't resize to 2*N, they resize to ~1.125*N:
>
> new_allocated = (size_t)newsize + (newsize >> 3) + (newsize < 9 ? 3 : 6);
>
> (https://github
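
The growth rule quoted above can be tried out directly. The sketch below just transcribes that one C expression into Python; the function name is made up for illustration, and the formula is only the one quoted in the message (other CPython versions may use a different expression).

```python
def new_allocated(newsize):
    """Python transcription of the C expression quoted above."""
    return newsize + (newsize >> 3) + (3 if newsize < 9 else 6)


for n in (8, 9, 100, 1000):
    print(n, new_allocated(n), round(new_allocated(n) / n, 3))
```

For large sizes the ratio approaches 1.125, which matches the "~1.125*N" description.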
Rustom Mody writes:
> Yeah I know append method is supposedly O(1).
It's amortized O(1).
--
https://mail.python.org/mailman/listinfo/python-list
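
One rough way to see the amortized behaviour is to watch how rarely a list's allocation changes while appending. This is a sketch that assumes CPython, where `sys.getsizeof` reflects the currently allocated capacity of a list; `count_reallocations` is a hypothetical helper, and the exact counts depend on the interpreter version.

```python
import sys


def count_reallocations(n):
    """Append n items and count how often the list's size in memory changes."""
    items = []
    last_size = sys.getsizeof(items)
    reallocations = 0
    for i in range(n):
        items.append(i)
        size = sys.getsizeof(items)
        if size != last_size:
            reallocations += 1
            last_size = size
    return reallocations


for n in (1_000, 10_000, 100_000):
    print(n, count_reallocations(n))
```

The number of resizes grows roughly with the logarithm of n rather than with n itself, which is why the average cost per append stays constant.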
On Sat, Jul 15, 2017 at 12:32 AM, Michael Torrie wrote:
> On 07/14/2017 08:05 AM, Rhodri James wrote:
>> On 14/07/17 14:31, Marko Rauhamaa wrote:
>>> Of course, UTF-8 in a bytes object doesn't make the situation any
>>> better, but does it make it any worse?
>>
>> Speaking as someone who has been
On 14/07/17 15:14, Marko Rauhamaa wrote:
Rhodri James :
On 14/07/17 14:31, Marko Rauhamaa wrote:
Of course, UTF-8 in a bytes object doesn't make the situation any
better, but does it make it any worse?
Speaking as someone who has been up to his elbows in this recently, I
would say emphatical
On 14/07/17 15:32, Michael Torrie wrote:
On 07/14/2017 08:05 AM, Rhodri James wrote:
On 14/07/17 14:31, Marko Rauhamaa wrote:
Of course, UTF-8 in a bytes object doesn't make the situation any
better, but does it make it any worse?
Speaking as someone who has been up to his elbows in this rece
On 07/14/2017 08:05 AM, Rhodri James wrote:
> On 14/07/17 14:31, Marko Rauhamaa wrote:
>> Of course, UTF-8 in a bytes object doesn't make the situation any
>> better, but does it make it any worse?
>
> Speaking as someone who has been up to his elbows in this recently, I
> would say emphatically
On 07/14/2017 07:31 AM, Marko Rauhamaa wrote:
> Of course, UTF-8 in a bytes object doesn't make the situation any
> better, but does it make it any worse?
>
> As it stands, we have
>
>     è --[encode>-- Unicode --[reencode>-- UTF-8
>
> Why is one encoding format better than the other?
This is
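
As a small illustration of the diagram quoted above (a sketch in Python 3, not code from the thread): the character è corresponds to the single code point U+00E8, and encoding that code point as UTF-8 yields two bytes.

```python
text = "\u00e8"                  # è as a str holding one code point
encoded = text.encode("utf-8")   # the UTF-8 byte sequence for that code point

print(hex(ord(text)))            # code point value
print(encoded)                   # raw bytes
print(len(text), len(encoded))   # one code point, two bytes
print(encoded.decode("utf-8") == text)
```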
Rhodri James :
> On 14/07/17 14:31, Marko Rauhamaa wrote:
>> Of course, UTF-8 in a bytes object doesn't make the situation any
>> better, but does it make it any worse?
>
> Speaking as someone who has been up to his elbows in this recently, I
> would say emphatically that it does make things worse
Hi,
I installed the GDAL 2.2.1 using conda. Then I did:
import gdal
and I had:
Traceback (most recent call last):
File "", line 1, in <module>
File "/home/conrado/miniconda2/lib/python2.7/site-packages/gdal.py",
line 2, in <module>
from osgeo.gdal import deprecation_warn
File
"/home/conrado/mi
On 14/07/17 14:31, Marko Rauhamaa wrote:
Of course, UTF-8 in a bytes object doesn't make the situation any
better, but does it make it any worse?
Speaking as someone who has been up to his elbows in this recently, I
would say emphatically that it does make things worse. It adds an extra
laye
Steve D'Aprano :
> These are only a *few* of the *easy* questions that need to be
> answered before we can even consider your question:
>
>> So the question is, should we have a third type for text. Or should
>> the semantics of strings be changed to be based on characters?
Sure, but if they can'
On Fri, 14 Jul 2017 04:30 pm, Marko Rauhamaa wrote:
> Unicode was supposed to get us out of the 8-bit locale hole.
Which it has done. Apart from use for backwards compatibility, there is no good
reason to use the masses of legacy extensions to ASCII or the technically
fragile non-Unicode mul
On Fri, Jul 14, 2017 at 10:05 PM, Marko Rauhamaa wrote:
> Marko Rauhamaa :
>
>> Chris Angelico :
>>> If you're trying to use strings as identifiers in any way (say, file
>>> names, or document lookup references), using the NFC/NFD normalized
>>> form of the string should be sufficient.
>>
>> Show
On Fri, Jul 14, 2017 at 8:59 PM, Marko Rauhamaa wrote:
> Chris Angelico :
>
>> On Fri, Jul 14, 2017 at 6:53 PM, Marko Rauhamaa wrote:
>>> Chris Angelico :
>>> Then, why bother with Unicode to begin with? Why not just use bytes?
>>> After all, Python3's strings have the very same pitfalls:
>>>
>>>
Marko Rauhamaa :
> Chris Angelico :
>> If you're trying to use strings as identifiers in any way (say, file
>> names, or document lookup references), using the NFC/NFD normalized
>> form of the string should be sufficient.
>
> Show me ten Python3 database applications, and I'll show you ten Python
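
For reference, the normalization suggested in the quoted text can be sketched with the standard `unicodedata` module. The `as_key` helper is a hypothetical name, and choosing NFC over NFD here is an arbitrary assumption; the point is only that both spellings map to the same key.

```python
import unicodedata

composed = "\u00e8"       # è as a single precomposed code point
decomposed = "e\u0300"    # e followed by a combining grave accent


def as_key(s):
    """Normalize a string to NFC before using it as an identifier."""
    return unicodedata.normalize("NFC", s)


print(composed == decomposed)                  # the raw strings differ
print(as_key(composed) == as_key(decomposed))  # the normalized keys agree

lookup = {as_key(composed): "document 1"}
print(lookup[as_key(decomposed)])
```

Whether applications actually do this consistently is exactly what the reply above is questioning.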
Chris Angelico :
> On Fri, Jul 14, 2017 at 6:53 PM, Marko Rauhamaa wrote:
>> Chris Angelico :
>> Then, why bother with Unicode to begin with? Why not just use bytes?
>> After all, Python3's strings have the very same pitfalls:
>>
>> - you don't know the length of a text in characters
>> - chr
On Fri, Jul 14, 2017 at 6:53 PM, Marko Rauhamaa wrote:
> Chris Angelico :
>
>> On Fri, Jul 14, 2017 at 6:15 PM, Marko Rauhamaa wrote:
>>> Furthermore, you only dismissed my question about
>>>
>>>     len(text)
>>>
>>> What about
>>>
>>>     text[-1]
>>>     re.match("a.c", text)
>>
>> The considerat
Chris Angelico :
> On Fri, Jul 14, 2017 at 6:15 PM, Marko Rauhamaa wrote:
>> Furthermore, you only dismissed my question about
>>
>>     len(text)
>>
>> What about
>>
>>     text[-1]
>>     re.match("a.c", text)
>
> The considerations and concerns in the second half of my paragraph -
> the bit you d
On Fri, Jul 14, 2017 at 6:15 PM, Marko Rauhamaa wrote:
> Chris Angelico :
>
>> On Fri, Jul 14, 2017 at 4:30 PM, Marko Rauhamaa wrote:
>>> When people use Unicode, they are expecting to be able to deal in real
>>> characters. I would expect:
>>>
>>>     len(text) to give me the length
Chris Angelico :
> On Fri, Jul 14, 2017 at 4:30 PM, Marko Rauhamaa wrote:
>> When people use Unicode, they are expecting to be able to deal in real
>> characters. I would expect:
>>
>>     len(text) to give me the length in characters
>>     text[-1] to evaluate to the
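
The three operations raised in this part of the thread can be tried on a string that contains a combining character. This is a sketch in Python 3; the sample strings are made up for illustration. Python's `str` operations work on code points, so each result below counts or selects code points rather than what a reader would see as characters.

```python
import re

text = "cafe\u0301"                  # "café" spelled with e + combining acute accent

print(len(text))                     # counts code points, not visible characters
print(hex(ord(text[-1])))            # the last item is the combining accent alone

# "." matches one code point, so it can match a bare combining mark.
print(bool(re.match("a.c", "a\u0300c")))   # decomposed "àc": a, accent, c
print(bool(re.match("a.c", "\u00e0c")))    # precomposed "àc": starts with à, not a
```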
Hi,
I wrote one simple C code and integrated python interpreter.
I am using Python C API to run the python command.
Below code used Python C API inside .c file.
PyObject* PyFileObject = PyFile_FromString("test.py", (char *)"r");
int ret = PyRun_SimpleFile(PyFile_AsFile(PyFileObject), "test.py
On Fri, Jul 14, 2017 at 4:30 PM, Marko Rauhamaa wrote:
> Unicode was supposed to get us out of the 8-bit locale hole. Now it
> seems the Unicode hole is far deeper and we haven't reached the bottom
> of it yet. I wonder if the hole even has a bottom.
>
> We now have:
>
> - an encoding: a sequence