Re: Grapheme clusters, a.k.a.real characters

2017-07-14 Thread Terry Reedy
On 7/14/2017 9:20 PM, Steve D'Aprano wrote: On Sat, 15 Jul 2017 07:12 am, Terry Reedy wrote: Does Go use bytes for text, like most people did in Python 2, or a separate text string class that hides the internal encoding format and implementation? In other words, if you do the equivalent of print

Re: Grapheme clusters, a.k.a.real characters

2017-07-14 Thread Steve D'Aprano
On Sat, 15 Jul 2017 04:10 am, Marko Rauhamaa wrote: > Steve D'Aprano : >> On Fri, 14 Jul 2017 11:31 pm, Marko Rauhamaa wrote: [...] >>> As it stands, we have >>> >>>è --[encode>-- Unicode --[reencode>-- UTF-8 >> >> I can't even work out what you're trying to say here. > > I can tell, yet tha

Re: Grapheme clusters, a.k.a.real characters

2017-07-14 Thread Steve D'Aprano
On Sat, 15 Jul 2017 07:12 am, Terry Reedy wrote: > Does Go use bytes for text, like most people did in Python 2, or a separate > text string class that hides the internal encoding format and > implementation? In other words, if you do the equivalent of print(s) > where s is a text string with a mix

pyserial and end-of-line specification

2017-07-14 Thread F S
I just started using Python and I am writing code to access my serial port using pyserial. I have no problem with Unix-based text coming in the stream using an LF (0x0A) record separator. I am also using non-blocking I/O. However, I have some sensor devices that use the Windows CRLF (0x0D, 0x0A) record
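One way to handle both record separators is to buffer whatever bytes arrive and split on LF, dropping a preceding CR if there is one. The following is only a sketch: `LineSplitter` and `feed` are hypothetical names, and it assumes the chunks come from a non-blocking read on the serial port, which is not shown here.

```python
class LineSplitter:
    """Accumulate raw bytes and hand back complete lines (LF or CRLF terminated)."""

    def __init__(self):
        self._buffer = bytearray()

    def feed(self, chunk):
        """Add a chunk of bytes; return the complete lines, terminators removed."""
        self._buffer.extend(chunk)
        lines = []
        while True:
            index = self._buffer.find(b"\n")
            if index < 0:
                break  # no complete line yet; keep the partial data buffered
            line = bytes(self._buffer[:index])
            del self._buffer[:index + 1]
            # Drop a trailing CR so CRLF and LF records look identical.
            if line.endswith(b"\r"):
                line = line[:-1]
            lines.append(line)
        return lines
```

Because partial records stay in the buffer, a CRLF pair that is split across two reads is still treated as a single terminator.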

Re: Grapheme clusters, a.k.a.real characters

2017-07-14 Thread Terry Reedy
On 7/14/2017 5:51 PM, Marko Rauhamaa wrote: Yes, in Python2, Go, C and GNU textutils, when you print a text string containing a mixture of languages, you see characters. Why? Because that's what the terminal emulator chooses to do upon receiving those bytes. >>> s = u'\u1171\u\u\u444

Re: Grapheme clusters, a.k.a.real characters

2017-07-14 Thread Marko Rauhamaa
Terry Reedy : > On 7/14/2017 10:30 AM, Michael Torrie wrote: >> On 07/14/2017 07:31 AM, Marko Rauhamaa wrote: >>> Of course, UTF-8 in a bytes object doesn't make the situation any >>> better, but does it make it any worse? >> >>> >>> As it stands, we have >>> >>> è --[encode>-- Unicode --[reen

Re: Grapheme clusters, a.k.a.real characters

2017-07-14 Thread Terry Reedy
On 7/14/2017 10:30 AM, Michael Torrie wrote: On 07/14/2017 07:31 AM, Marko Rauhamaa wrote: Of course, UTF-8 in a bytes object doesn't make the situation any better, but does it make it any worse? As it stands, we have è --[encode>-- Unicode --[reencode>-- UTF-8 Why is one encoding form

Re: Grapheme clusters, a.k.a.real characters

2017-07-14 Thread Marko Rauhamaa
Michael Torrie : > On 07/14/2017 07:31 AM, Marko Rauhamaa wrote: >> Of course, UTF-8 in a bytes object doesn't make the situation any >> better, but does it make it any worse? >> >> As it stands, we have >> >>è --[encode>-- Unicode --[reencode>-- UTF-8 >> >> Why is one encoding format bette

Re: Grapheme clusters, a.k.a.real characters

2017-07-14 Thread Marko Rauhamaa
Rhodri James : > On 14/07/17 15:14, Marko Rauhamaa wrote: >> I'd like to understand this better. Maybe you have a couple of >> examples to share? > > Sure. > > What I've mostly been looking at recently has been the Expat XML parser. > XML chooses to deal with one of your problems by defining that

Re: Grapheme clusters, a.k.a.real characters

2017-07-14 Thread Neil Cerutti
On 2017-07-14, Rhodri James wrote: > On 14/07/17 15:32, Michael Torrie wrote: >> Are you saying that dealing with Unicode in Google Go, which >> uses UTF-8 in memory, is adding an extra layer of complexity >> and makes things worse than they might be in Python? > > I'm not familiar with Go. If th

Re: Grapheme clusters, a.k.a.real characters

2017-07-14 Thread Marko Rauhamaa
Steve D'Aprano : > On Fri, 14 Jul 2017 11:31 pm, Marko Rauhamaa wrote: >> Of course, UTF-8 in a bytes object doesn't make the situation any >> better, but does it make it any worse? > > Sure it does. You want the human reader to be able to predict the > number of graphemes ("characters") by sight.

Re: Grapheme clusters, a.k.a.real characters

2017-07-14 Thread Steve D'Aprano
On Fri, 14 Jul 2017 11:31 pm, Marko Rauhamaa wrote: > Steve D'Aprano : > >> These are only a *few* of the *easy* questions that need to be >> answered before we can even consider your question: >> >>> So the question is, should we have a third type for text. Or should >>> the semantics of strings

Re: PYTHON GDAL

2017-07-14 Thread Fabien
On 07/14/2017 03:57 PM, jorge.conr...@cptec.inpe.br wrote: Hi, I installed the GDAL 2.2.1 using conda. Then I did: import gdal and I had: Traceback (most recent call last): File "", line 1, in File "/home/conrado/miniconda2/lib/python2.7/site-packages/gdal.py", line 2, in fr

Re: Write this accumuator in a functional style

2017-07-14 Thread Steve D'Aprano
On Fri, 14 Jul 2017 09:06 am, Ned Batchelder wrote: > Steve's summary is qualitatively right, but a little off on the quantitative > details. Lists don't resize to 2*N, they resize to ~1.125*N: > > new_allocated = (size_t)newsize + (newsize >> 3) + (newsize < 9 ? 3 : 6); > > (https://github
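The growth rule quoted above can be transcribed into Python to see which capacities it produces. This is a sketch of the formula exactly as quoted in the message, not of the interpreter's full resize logic; `new_allocated` mirrors the C variable name and `capacities` is a hypothetical helper.

```python
def new_allocated(newsize):
    """Python transcription of the quoted C expression for the new capacity."""
    return newsize + (newsize >> 3) + (3 if newsize < 9 else 6)


def capacities(appends):
    """Return the successive capacities seen while appending `appends` items."""
    allocated = 0
    seen = []
    for size in range(1, appends + 1):
        if size > allocated:
            # The list is full, so grow it using the quoted rule.
            allocated = new_allocated(size)
            seen.append(allocated)
    return seen


print(capacities(73))
```

Under this rule each resize adds roughly one eighth of the current size plus a small constant, which is where the ~1.125*N figure comes from.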

Re: Write this accumuator in a functional style

2017-07-14 Thread Paul Rubin
Rustom Mody writes: > Yeah I know append method is supposedly O(1). It's amortized O(1). -- https://mail.python.org/mailman/listinfo/python-list
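"Amortized" can be made concrete by counting how many element copies a growing array performs in the worst case. The simulation below is a sketch with a hypothetical helper, `copies_per_append`; it assumes every resize copies all existing elements and that capacity grows by a constant factor, which is a simplification of any real allocator.

```python
def copies_per_append(n, factor):
    """Simulate n appends to a geometrically growing array.

    Returns the average number of element copies per append, assuming
    every resize copies all existing elements to a new block.
    """
    capacity = 0
    size = 0
    copied = 0
    for _ in range(n):
        if size == capacity:
            copied += size
            # Grow by the given factor, but always by at least one slot.
            capacity = max(capacity + 1, int(capacity * factor))
        size += 1
    return copied / n


print(copies_per_append(100000, 2))
print(copies_per_append(100000, 1.125))
```

An individual append may trigger a copy of the whole array, but the average stays bounded by a constant that depends only on the growth factor, not on n.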

Re: Grapheme clusters, a.k.a.real characters

2017-07-14 Thread Chris Angelico
On Sat, Jul 15, 2017 at 12:32 AM, Michael Torrie wrote: > On 07/14/2017 08:05 AM, Rhodri James wrote: >> On 14/07/17 14:31, Marko Rauhamaa wrote: >>> Of course, UTF-8 in a bytes object doesn't make the situation any >>> better, but does it make it any worse? >> >> Speaking as someone who has been

Re: Grapheme clusters, a.k.a.real characters

2017-07-14 Thread Rhodri James
On 14/07/17 15:14, Marko Rauhamaa wrote: Rhodri James : On 14/07/17 14:31, Marko Rauhamaa wrote: Of course, UTF-8 in a bytes object doesn't make the situation any better, but does it make it any worse? Speaking as someone who has been up to his elbows in this recently, I would say emphatical

Re: Grapheme clusters, a.k.a.real characters

2017-07-14 Thread Rhodri James
On 14/07/17 15:32, Michael Torrie wrote: On 07/14/2017 08:05 AM, Rhodri James wrote: On 14/07/17 14:31, Marko Rauhamaa wrote: Of course, UTF-8 in a bytes object doesn't make the situation any better, but does it make it any worse? Speaking as someone who has been up to his elbows in this rece

Re: Grapheme clusters, a.k.a.real characters

2017-07-14 Thread Michael Torrie
On 07/14/2017 08:05 AM, Rhodri James wrote: > On 14/07/17 14:31, Marko Rauhamaa wrote: >> Of course, UTF-8 in a bytes object doesn't make the situation any >> better, but does it make it any worse? > > Speaking as someone who has been up to his elbows in this recently, I > would say emphatically

Re: Grapheme clusters, a.k.a.real characters

2017-07-14 Thread Michael Torrie
On 07/14/2017 07:31 AM, Marko Rauhamaa wrote: > Of course, UTF-8 in a bytes object doesn't make the situation any > better, but does it make it any worse? > > As it stands, we have > >è --[encode>-- Unicode --[reencode>-- UTF-8 > > Why is one encoding format better than the other? This is

Re: Grapheme clusters, a.k.a.real characters

2017-07-14 Thread Marko Rauhamaa
Rhodri James : > On 14/07/17 14:31, Marko Rauhamaa wrote: >> Of course, UTF-8 in a bytes object doesn't make the situation any >> better, but does it make it any worse? > > Speaking as someone who has been up to his elbows in this recently, I > would say emphatically that it does make things worse

PYTHON GDAL

2017-07-14 Thread jorge . conrado
Hi, I installed the GDAL 2.2.1 using conda. Then I did: import gdal and I had: Traceback (most recent call last): File "", line 1, in File "/home/conrado/miniconda2/lib/python2.7/site-packages/gdal.py", line 2, in from osgeo.gdal import deprecation_warn File "/home/conrado/mi

Re: Grapheme clusters, a.k.a.real characters

2017-07-14 Thread Rhodri James
On 14/07/17 14:31, Marko Rauhamaa wrote: Of course, UTF-8 in a bytes object doesn't make the situation any better, but does it make it any worse? Speaking as someone who has been up to his elbows in this recently, I would say emphatically that it does make things worse. It adds an extra laye

Re: Grapheme clusters, a.k.a.real characters

2017-07-14 Thread Marko Rauhamaa
Steve D'Aprano : > These are only a *few* of the *easy* questions that need to be > answered before we can even consider your question: > >> So the question is, should we have a third type for text. Or should >> the semantics of strings be changed to be based on characters? Sure, but if they can'

Re: Grapheme clusters, a.k.a.real characters

2017-07-14 Thread Steve D'Aprano
On Fri, 14 Jul 2017 04:30 pm, Marko Rauhamaa wrote: > Unicode was supposed to get us out of the 8-bit locale hole. Which it has done. Apart from use for backwards compatibility, there is no good reason to use the masses of legacy extensions to ASCII or the technically fragile non-Unicode mul

Re: Grapheme clusters, a.k.a.real characters

2017-07-14 Thread Chris Angelico
On Fri, Jul 14, 2017 at 10:05 PM, Marko Rauhamaa wrote: > Marko Rauhamaa : > >> Chris Angelico : >>> If you're trying to use strings as identifiers in any way (say, file >>> names, or document lookup references), using the NFC/NFD normalized >>> form of the string should be sufficient. >> >> Show
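The normalization suggestion can be illustrated with the standard library's `unicodedata.normalize`. This is a minimal sketch; `as_key` is a hypothetical helper name, and the choice of NFC over NFD is arbitrary as long as it is applied consistently.

```python
import unicodedata

composed = "\u00e8"      # è as a single code point
decomposed = "e\u0300"   # e followed by COMBINING GRAVE ACCENT


def as_key(text):
    """Normalize text to NFC so equivalent spellings compare equal."""
    return unicodedata.normalize("NFC", text)


print(composed == decomposed)                  # the raw strings differ
print(as_key(composed) == as_key(decomposed))  # the normalized keys match
```

The point of the sketch is that the normalization has to be applied explicitly on every path that stores or looks up an identifier.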

Re: Grapheme clusters, a.k.a.real characters

2017-07-14 Thread Chris Angelico
On Fri, Jul 14, 2017 at 8:59 PM, Marko Rauhamaa wrote: > Chris Angelico : > >> On Fri, Jul 14, 2017 at 6:53 PM, Marko Rauhamaa wrote: >>> Chris Angelico : >>> Then, why bother with Unicode to begin with? Why not just use bytes? >>> After all, Python3's strings have the very same pitfalls: >>> >>>

Re: Grapheme clusters, a.k.a.real characters

2017-07-14 Thread Marko Rauhamaa
Marko Rauhamaa : > Chris Angelico : >> If you're trying to use strings as identifiers in any way (say, file >> names, or document lookup references), using the NFC/NFD normalized >> form of the string should be sufficient. > > Show me ten Python3 database applications, and I'll show you ten Python

Re: Grapheme clusters, a.k.a.real characters

2017-07-14 Thread Marko Rauhamaa
Chris Angelico : > On Fri, Jul 14, 2017 at 6:53 PM, Marko Rauhamaa wrote: >> Chris Angelico : >> Then, why bother with Unicode to begin with? Why not just use bytes? >> After all, Python3's strings have the very same pitfalls: >> >> - you don't know the length of a text in characters >> - chr

Re: Grapheme clusters, a.k.a.real characters

2017-07-14 Thread Chris Angelico
On Fri, Jul 14, 2017 at 6:53 PM, Marko Rauhamaa wrote: > Chris Angelico : > >> On Fri, Jul 14, 2017 at 6:15 PM, Marko Rauhamaa wrote: >>> Furthermore, you only dismissed my question about >>> >>>len(text) >>> >>> What about >>> >>>text[-1] >>>re.match("a.c", text) >> >> The considerat

Re: Grapheme clusters, a.k.a.real characters

2017-07-14 Thread Marko Rauhamaa
Chris Angelico : > On Fri, Jul 14, 2017 at 6:15 PM, Marko Rauhamaa wrote: >> Furthermore, you only dismissed my question about >> >>len(text) >> >> What about >> >>text[-1] >>re.match("a.c", text) > > The considerations and concerns in the second half of my paragraph - > the bit you d
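The `re.match("a.c", text)` question can be tried directly. The sketch below assumes the text is "a", "è", "c" in two equivalent spellings; `matches_a_dot_c` is a hypothetical helper name.

```python
import re

composed = "a\u00e8c"      # a, è as one code point, c
decomposed = "ae\u0300c"   # a, e, COMBINING GRAVE ACCENT, c


def matches_a_dot_c(text):
    """Report whether the pattern a.c matches at the start of text."""
    return re.match("a.c", text) is not None


print(matches_a_dot_c(composed))
print(matches_a_dot_c(decomposed))
```

In the `re` module, `.` consumes exactly one code point, so the two spellings behave differently even though they render the same.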

Re: Grapheme clusters, a.k.a.real characters

2017-07-14 Thread Chris Angelico
On Fri, Jul 14, 2017 at 6:15 PM, Marko Rauhamaa wrote: > Chris Angelico : > >> On Fri, Jul 14, 2017 at 4:30 PM, Marko Rauhamaa wrote: >>> When people use Unicode, they are expecting to be able to deal in real >>> characters. I would expect: >>> >>>len(text) to give me the length

Re: Grapheme clusters, a.k.a.real characters

2017-07-14 Thread Marko Rauhamaa
Chris Angelico : > On Fri, Jul 14, 2017 at 4:30 PM, Marko Rauhamaa wrote: >> When people use Unicode, they are expecting to be able to deal in real >> characters. I would expect: >> >>len(text) to give me the length in characters >>text[-1]to evaluate to the
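The gap between these expectations and what `str` provides can be shown with a decomposed accent. The helper below, `rough_clusters`, is hypothetical and only a rough approximation: it attaches combining marks to the preceding code point and does not implement the full Unicode grapheme cluster rules.

```python
import unicodedata

text = "cafe\u0301"   # "café" spelled with e + COMBINING ACUTE ACCENT


def rough_clusters(text):
    """Group each combining mark with the code point that precedes it."""
    clusters = []
    for ch in text:
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


print(len(text))                  # counts code points
print(len(rough_clusters(text)))  # counts approximate characters
print(rough_clusters(text)[-1] == "e\u0301")
```

Here `text[-1]` is the bare combining accent, while the last approximate cluster is the accented letter a reader would see.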

Read Application python logs

2017-07-14 Thread neel patel
Hi, I wrote one simple C code and integrated python interpreter. I am using Python C API to run the python command. Below code used Python C API inside .c file. PyObject* PyFileObject = PyFile_FromString("test.py", (char *)"r"); int ret = PyRun_SimpleFile(PyFile_AsFile(PyFileObject), "test.py

Re: Grapheme clusters, a.k.a.real characters

2017-07-14 Thread Chris Angelico
On Fri, Jul 14, 2017 at 4:30 PM, Marko Rauhamaa wrote: > Unicode was supposed to get us out of the 8-bit locale hole. Now it > seems the Unicode hole is far deeper and we haven't reached the bottom > of it yet. I wonder if the hole even has a bottom. > > We now have: > > - an encoding: a sequence