Python script for searching variable strings between two constant strings
import re infile = open('document.txt','r') outfile= open('output.txt','w') copy = False for line in infile: if line.strip() == "--operation():": bucket = [] copy = True elif line.strip() == "StartOperation": for strings in bucket: outfile.write( strings + ',') for strings in bucket: outfile.write('\n') copy = False elif copy: bucket.append(line.strip() -- CSV format is like this: id, name,poid, error 5896, AutoAuthOSUserSubmit, 900105270, 0x4002 My log file has several sections starting with START and ending with END . I want to extract the string between --operation(): and StartOperation. For example, AutoAuthOSUserSubmit. I also want to extract the poid value from line poid: 900105270, poidLen: 9. Finally, I want to extract the return value, e.g 0x4002 if Roll back all updates is found after it. I am not even able to extract point the original text if Start and End are not on the same line. How do I go about doing that? This is a sample LOG extract with two paragraphs: -- 08/24 02:07:56 [mds.ecas(5896) ECAS_CP1] ** START ** open file /ecas/public/onsite-be/config/timer.conf failed INFO 08/24/16 02:07:56 salt1be-d1-ap(**5896**/0) main.c(780*):--operation(): AutoAuthOSUserSubmit. StartOperation* INFO 08/24/16 02:07:56 salt1be-d1-ap(5896/0) main.c(784):--Client Information: Request from host 'malt-d1-wb' process id 12382. 
DEBUG 08/24/16 02:07:56 salt1be-d1-ap(5896/0) TOci.cc(571):FetchServiceObjects: ServiceCert.sql DEBUG 08/22/16 23:15:53 pepper1be-d1-ap(2680/0) vsserviceagent.cpp(517):Generate Certificate 2: c1cd00d5c3de082360a08730fef9cd1d DEBUG 08/22/16 23:15:53 pepper1be-d1-ap(2680/0) junk.c(1373):GenerateWebPin : poid: **900105270**, poidLen: 9 DEBUG 08/22/16 23:15:53 pepper1be-d1-ap(2680/0) junk.c(1408):GenerateWebPin : pinStr DEBUG 08/24/16 02:07:56 salt1be-d1-ap(5896/0) uaadapter_vasco_totp.c(275):UAVascoTOTPImpl.close() -- Releasing Adapter Context DEBUG 08/22/16 23:15:53 pepper1be-d1-ap(2680/0) vsenterprise.cpp(288):VSEnterprise::Engage returns 0x4002 - Unknown error code **(0x4002)** ERROR 08/22/16 23:15:53 pepper1be-d1-ap(2680/0) vsautoauth.cpp(696):OSAAEndUserEnroll: error occurred. **Roll back** all updates! INFO 08/24/16 02:07:56 salt1be-d1-ap(5896/0) uaotptokenstoreqmimpl.cpp(199):Close token store INFO 08/24/16 02:07:56 salt1be-d1-ap(5896/0) main.c(990):-- EndOperation -- 08/24 02:07:56 [mds.ecas(5896) ECAS_CP1] ** END ** OPERATION = AutoAuthOSUserSubmit, rc = 0x0 (0) SYSINFO Elapse = 0.687, Heap = 1334K, Stack = 64K -- https://mail.python.org/mailman/listinfo/python-list
Is duck-typing misnamed?
"If it walks like a duck, quacks like a duck,... " so there is indeed precedence for this so-called 'duck typing' but wouldn't it be more Pythonic to call this 'witch typing'? "How do you know she is a witch?" "She looks like one." etc. I do grant that ultimately, the duck does come into play, since the witch weighs the same as a duck. Roger Christman Electrical Engineering and Computer Science Pennsylvania State University -- https://mail.python.org/mailman/listinfo/python-list
Re: Is duck-typing misnamed?
This should go to Python ideas as it would involve a substantial change to the docs. Kindest regards. Mark Lawrence. -- https://mail.python.org/mailman/listinfo/python-list
Re: Is duck-typing misnamed?
On Saturday, August 27, 2016 at 5:50:30 AM UTC-4, ROGER GRAYDON CHRISTMAN wrote: > "If it walks like a duck, quacks like a duck,... " > > so there is indeed precedence for this so-called 'duck typing' > > > but wouldn't it be more Pythonic to call this 'witch typing'? > > "How do you know she is a witch?" > > "She looks like one." > > etc. > > > I do grant that ultimately, the duck does come into play, since the witch > weighs the same as a duck. +1 :) --Ned. -- https://mail.python.org/mailman/listinfo/python-list
Re: Python script for searching variable strings between two constant strings
Thanks for the lead. I have big log file nearly 2 GB. Lets say I just want to extract the ;name' field only eg. AutoAuthOSUserSubmit.The code is failing with errors. Can you just give a tested code only for the name field. Other fields I will try to work out. --- On Saturday, August 27, 2016 at 4:03:59 AM UTC+5:30, ddream.m...@gmail.com wrote: > import re > > infile = open('document.txt','r') > outfile= open('output.txt','w') > copy = False > for line in infile: > > if line.strip() == "--operation():": > bucket = [] > copy = True > > elif line.strip() == "StartOperation": > for strings in bucket: > outfile.write( strings + ',') > for strings in bucket: > outfile.write('\n') > copy = False > > elif copy: > bucket.append(line.strip() > -- > > CSV format is like this: > id, name,poid, error > 5896, AutoAuthOSUserSubmit, 900105270, 0x4002 > > My log file has several sections starting with START and ending > with END . I want to extract the string between --operation(): and > StartOperation. For example, AutoAuthOSUserSubmit. I also want to extract the > poid value from line poid: 900105270, poidLen: 9. Finally, I want to extract > the return value, e.g 0x4002 if Roll back all updates is found after it. > > I am not even able to extract point the original text if Start and End are > not on the same line. How do I go about doing that? > > This is a sample LOG extract with two paragraphs: > -- 08/24 02:07:56 [mds.ecas(5896) ECAS_CP1] ** START ** > open file /ecas/public/onsite-be/config/timer.conf failed > INFO 08/24/16 02:07:56 salt1be-d1-ap(**5896**/0) > main.c(780*):--operation(): AutoAuthOSUserSubmit. StartOperation* > INFO 08/24/16 02:07:56 salt1be-d1-ap(5896/0) main.c(784):--Client > Information: Request from host 'malt-d1-wb' process id 12382. 
> DEBUG 08/24/16 02:07:56 salt1be-d1-ap(5896/0) > TOci.cc(571):FetchServiceObjects: ServiceCert.sql > DEBUG 08/22/16 23:15:53 pepper1be-d1-ap(2680/0) > vsserviceagent.cpp(517):Generate Certificate 2: > c1cd00d5c3de082360a08730fef9cd1d > DEBUG 08/22/16 23:15:53 pepper1be-d1-ap(2680/0) junk.c(1373):GenerateWebPin > : poid: **900105270**, poidLen: 9 > DEBUG 08/22/16 23:15:53 pepper1be-d1-ap(2680/0) junk.c(1408):GenerateWebPin > : pinStr > DEBUG 08/24/16 02:07:56 salt1be-d1-ap(5896/0) > uaadapter_vasco_totp.c(275):UAVascoTOTPImpl.close() -- Releasing Adapter > Context > DEBUG 08/22/16 23:15:53 pepper1be-d1-ap(2680/0) > vsenterprise.cpp(288):VSEnterprise::Engage returns 0x4002 - Unknown error > code **(0x4002)** > ERROR 08/22/16 23:15:53 pepper1be-d1-ap(2680/0) > vsautoauth.cpp(696):OSAAEndUserEnroll: error occurred. **Roll back** all > updates! > INFO 08/24/16 02:07:56 salt1be-d1-ap(5896/0) > uaotptokenstoreqmimpl.cpp(199):Close token store > INFO 08/24/16 02:07:56 salt1be-d1-ap(5896/0) main.c(990):-- EndOperation > -- 08/24 02:07:56 [mds.ecas(5896) ECAS_CP1] ** END ** > OPERATION = AutoAuthOSUserSubmit, rc = 0x0 (0) > SYSINFO Elapse = 0.687, Heap = 1334K, Stack = 64K -- https://mail.python.org/mailman/listinfo/python-list
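For the name field alone, one possible approach is a single regular expression applied line by line, so the 2 GB file never has to fit in memory. This is only a sketch under assumptions: it supposes that in the real log the `--operation():` marker, the name, and `StartOperation` sit on one physical line (the wrapping and the `**` emphasis in the sample above look like mail formatting), and the names `OPERATION_RE`, `iter_operation_names` and `extract_names` are made up for illustration.

```python
import re

# Assumed shape of the interesting line (taken from the sample, emphasis removed):
#   INFO ... main.c(780):--operation(): AutoAuthOSUserSubmit. StartOperation
OPERATION_RE = re.compile(r"--operation\(\):\s*(\w+)\.?\s*StartOperation")


def iter_operation_names(lines):
    """Yield the operation name from every line containing both markers."""
    for line in lines:
        match = OPERATION_RE.search(line)
        if match:
            yield match.group(1)


def extract_names(in_path, out_path):
    """Stream the log and write one operation name per output line."""
    # Iterating over the file object reads one line at a time.
    with open(in_path, "r", errors="replace") as infile, \
            open(out_path, "w") as outfile:
        for name in iter_operation_names(infile):
            outfile.write(name + "\n")


# Small in-memory check using lines modelled on the posted sample.
sample = [
    "open file /ecas/public/onsite-be/config/timer.conf failed",
    "INFO 08/24/16 02:07:56 salt1be-d1-ap(5896/0) "
    "main.c(780):--operation(): AutoAuthOSUserSubmit. StartOperation",
    "INFO 08/24/16 02:07:56 salt1be-d1-ap(5896/0) main.c(990):-- EndOperation",
]
names = list(iter_operation_names(sample))
print(names)
```

If the real log does wrap the marker and the name across lines, the pattern would need to be applied to joined lines instead; the other fields (poid, return code) could follow the same one-pattern-per-field idea.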
Re: Is duck-typing misnamed?
On Fri, Aug 26, 2016 at 7:58 PM, ROGER GRAYDON CHRISTMAN wrote: > "If it walks like a duck, quacks like a duck,... " > > so there is indeed precedence for this so-called 'duck typing' > > > but wouldn't it be more Pythonic to call this 'witch typing'? > > "How do you know she is a witch?" > > "She looks like one." > > etc. > > > I do grant that ultimately, the duck does come into play, since the witch > weighs the same as a duck. Great idea, I love it. Now go and change your armor. -- https://mail.python.org/mailman/listinfo/python-list
Re: integer's methods
On 2016-08-18, ast wrote:
> Hello
>
> I wonder why calling a method on an integer
> doesn't work ?
> 123.bit_length()
> SyntaxError: invalid syntax

Because the parser thinks you've entered a floating point number with a fractional part of "bit_length".

You need to enter the integer such that it's identified by the parser as an integer rather than as a broken floating point number:

>>> (123).bit_length()
7
>>> 123 .bit_length()
7

--
Grant Edwards    grant.b.edwards at gmail.com
Yow! Look DEEP into the OPENINGS!! Do you see any ELVES or EDSELS ... or a HIGHBALL?? ...
Re: integer's methods
On Sat, Aug 27, 2016, at 13:24, Grant Edwards wrote: > Becuase the parser thinks you've entered a floating point number with > a fractional part of "bit_length". 123.+456 doesn't think that the fractional part is "+456". (Of course, the real reason is "because it would be even more annoying to get random errors only with attributes that start with "e" or "j") -- https://mail.python.org/mailman/listinfo/python-list
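The points made in these two messages can be checked directly. The following is a small sketch, not anything posted in the thread; the helper `is_syntax_error` is a made-up name. It shows the spellings that let the integer literal end before the dot, and the float and complex literals that a trailing dot would otherwise start.

```python
# Spellings that end the integer literal before the attribute dot.
parenthesized = (123).bit_length()
spaced = 123 .bit_length()
via_type = int.bit_length(123)

# A dot directly after the digits belongs to a float literal.
float_sum = 123.+456      # read as 123. + 456
exponent = 123.e2         # "e" continues the float literal
imaginary = 123.j         # "j" makes it a complex literal


def is_syntax_error(source):
    """Return True if the expression in `source` fails to compile."""
    try:
        compile(source, "<example>", "eval")
    except SyntaxError:
        return True
    return False


broken = is_syntax_error("123.bit_length()")
print(parenthesized, spaced, via_type, float_sum, exponent, imaginary, broken)
```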
Error numpy install
I have installed numpy using the command pip install numpy from command prompt and I am getting the following error: Traceback (most recent call last): File "", line 1, in import numpy File "C:\Users\GP\AppData\Local\Programs\Python\Python35\lib\site-packages\numpy\__init__.py", line 180, in from . import add_newdocs File "C:\Users\GP\AppData\Local\Programs\Python\Python35\lib\site-packages\numpy\add_newdocs.py", line 13, in from numpy.lib import add_newdoc File "C:\Users\GP\AppData\Local\Programs\Python\Python35\lib\site-packages\numpy\lib\__init__.py", line 8, in from .type_check import * File "C:\Users\GP\AppData\Local\Programs\Python\Python35\lib\site-packages\numpy\lib\type_check.py", line 11, in import numpy.core.numeric as _nx File "C:\Users\GP\AppData\Local\Programs\Python\Python35\lib\site-packages\numpy\core\__init__.py", line 14, in from . import multiarray ImportError: cannot import name 'multiarray' I have also tried to install multiarray but it says: "Could not find a program that satisfies the requirement multiarray(from versions) Any suggestions on how to install -- https://mail.python.org/mailman/listinfo/python-list
Re: PEP 492: isn't the "await" redundant?
Thank you for all your answers. After all, I am more confident with the current syntax.

The most important reason for 'await' to me now is the fact that you quite _often_ need to prepare the 'awaitable' object and wait for it later (like ChrisA's example with print()), i.e. split the expression into more lines:

fut = coro(x)
await fut

I supposed it to be only a minor use case (compared to 'await coro(x)'), but I learned it isn't. Every time you need to "wait for more than one thing" (more than one 'future'), you also need the split. Not only for parallel branching, but even for simple async operations combined with a timeout - asyncio.wait_for() etc.

And I prefer the explicit 'await' for simple waiting to special syntax for splitting (i.e. do simple waiting without 'await', as was the proposal at the top of this thread - and - introduce more complicated syntax for the split - something like functools.partial(coro, x)).

Kouli
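The cases described above can be put side by side. This is a minimal sketch assuming Python 3.7 or later (for `asyncio.run`); the coroutine `coro` and its doubling behaviour are placeholders invented for the example.

```python
import asyncio


async def coro(x):
    """Placeholder coroutine: pretend to do some I/O, then return 2 * x."""
    await asyncio.sleep(0.01)
    return x * 2


async def main():
    # Simple waiting: create the awaitable and await it in one expression.
    direct = await coro(1)

    # Split form: prepare the awaitable first, wait for it later.
    fut = coro(2)
    later = await fut

    # Waiting with a timeout: the un-awaited object is passed as an argument.
    timed = await asyncio.wait_for(coro(3), timeout=1.0)

    # Waiting for more than one thing at once.
    first, second = await asyncio.gather(coro(4), coro(5))

    return direct, later, timed, first, second


result = asyncio.run(main())
print(result)
```

In the last two cases the object returned by `coro(...)` has to exist as a value before anything waits on it, which is the split the message refers to.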
Multimeter USB output
Hi,

I'm using Python 3.5.1 with PyUSB 1.0 under Win 10 (64). We try to read the USB output of a DMM 'UT61B'.

import usb.core
import usb.util
import usb.backend.libusb1

def Gosub():
    dev = usb.core.find(idVendor=0x1a86, idProduct=0xe008) # Digital Multimeter UT61B
    if dev is None:
        print ('Multimeter not found')
    else:
        print ('Multimeter was found')
        dev.set_configuration()
        cfg = dev.get_active_configuration()
        intf = cfg[(0,0)]
        ep = usb.util.find_descriptor(
            intf,
            custom_match = \
            lambda e: \
                usb.util.endpoint_direction(e.bEndpointAddress) == \
                usb.util.ENDPOINT_IN)
        if ep is None:
            print ('ep is None')
        else:
            s = ep.read(64, 500)
            print ('Len s: ' + str(len(s)))

print ('Starting')
Gosub()
print ('Ready.-')

Result:

  File "d:\work-d\PythonProgs\ut61b.py", line 27, in <module>
    Gosub()
  File "d:\work-d\PythonProgs\ut61b.py", line 23, in Gosub
    s = ep.read(64, 500)
  File "D:\Python3\Lib\site-packages\usb\core.py", line 402, in read
    return self.device.read(self, size_or_buffer, timeout)
  File "D:\Python3\Lib\site-packages\usb\core.py", line 988, in read
    self.__get_timeout(timeout))
  File "D:\Python3\Lib\site-packages\usb\backend\libusb1.py", line 851, in intr_read
    timeout)
  File "D:\Python3\Lib\site-packages\usb\backend\libusb1.py", line 936, in __read
    _check(retval)
  File "D:\Python3\Lib\site-packages\usb\backend\libusb1.py", line 595, in _check
    raise USBError(_strerror(ret), ret, _libusb_errno[ret])
usb.core.USBError: [Errno 10060] Operation timed out

What's wrong? How to fix?

Regards -- Joe
Re: integer's methods
On 2016-08-27, Random832 wrote:
> On Sat, Aug 27, 2016, at 13:24, Grant Edwards wrote:
>> Because the parser thinks you've entered a floating point number with
>> a fractional part of "bit_length".
>
> 123.+456 doesn't think that the fractional part is "+456".

That's because the parser (or more traditionally the lexical analyzer) treats '+' differently than it does the characters [a-zA-Z].

> (Of course, the real reason is "because it would be even more
> annoying to get random errors only with attributes that start with
> "e" or "j")

--
Grant Edwards    grant.b.edwards at gmail.com
Yow! The entire CHINESE WOMEN'S VOLLEYBALL TEAM all share ONE personality -- and have since BIRTH!!
Re: Error numpy install
On Saturday, August 27, 2016 at 5:45:58 PM UTC+1, GP wrote: > I have installed numpy using the command pip install numpy from command > prompt and I am getting the following error: > Traceback (most recent call last): > File "", line 1, in > import numpy > File > "C:\Users\GP\AppData\Local\Programs\Python\Python35\lib\site-packages\numpy\__init__.py", > line 180, in > from . import add_newdocs > File > "C:\Users\GP\AppData\Local\Programs\Python\Python35\lib\site-packages\numpy\add_newdocs.py", > line 13, in > from numpy.lib import add_newdoc > File > "C:\Users\GP\AppData\Local\Programs\Python\Python35\lib\site-packages\numpy\lib\__init__.py", > line 8, in > from .type_check import * > File > "C:\Users\GP\AppData\Local\Programs\Python\Python35\lib\site-packages\numpy\lib\type_check.py", > line 11, in > import numpy.core.numeric as _nx > File > "C:\Users\GP\AppData\Local\Programs\Python\Python35\lib\site-packages\numpy\core\__init__.py", > line 14, in > from . import multiarray > ImportError: cannot import name 'multiarray' > > I have also tried to install multiarray but it says: "Could not find a > program that satisfies the requirement multiarray(from versions) > > Any suggestions on how to install This has been reported a lot over the last couple of years. I've always overcome any numpy problems by downloading the appropriate version from http://www.lfd.uci.edu/~gohlke/pythonlibs/#numpy and then using pip install against the local file name. HTH. Kindest regards. Mark Lawrence. -- https://mail.python.org/mailman/listinfo/python-list
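Picking "the appropriate version" of a prebuilt wheel usually comes down to matching the interpreter version and its 32/64-bit build. The snippet below is a rough sketch, assuming CPython (hence the `cp` prefix); the function name `interpreter_tags` is made up here, and the exact file names offered on the download page are not reproduced.

```python
import struct
import sys


def interpreter_tags():
    """Return (python_tag, pointer_bits) for the running interpreter."""
    # "cp35" would mean CPython 3.5; this prefix is only right for CPython.
    python_tag = "cp{}{}".format(sys.version_info.major, sys.version_info.minor)
    # The size of a C pointer tells a 32-bit build from a 64-bit one.
    pointer_bits = struct.calcsize("P") * 8
    return python_tag, pointer_bits


tag, bits = interpreter_tags()
print("Look for a wheel tagged", tag, "built for a", bits, "bit Python")
```

With the matching file downloaded, `pip install` followed by the path to that local file is the step described in the message above.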
Re: Is duck-typing misnamed?
On 8/26/2016 7:58 PM, ROGER GRAYDON CHRISTMAN wrote:
> "If it walks like a duck, quacks like a duck,... "
>
> so there is indeed precedence for this so-called 'duck typing'
>
> but wouldn't it be more Pythonic to call this 'witch typing'?
>
> "How do you know she is a witch?"
>
> "She looks like one."

Given that people were once burned to death for 'looking like a witch' (or sounding or acting), and can still suffer socially for such reasons, this is not funny to me. We should stick with ducks.

--
Terry Jan Reedy
Re: Multimeter USB output
On 8/27/2016 3:35 PM, Joe wrote: Hi, I'm using Python 3.5.1 with PyUSB 1.0 under Win 10 (64). We try to read the USB output of a DMM 'UT61B'. import usb.core import usb.util import usb.backend.libusb1 def Gosub(): dev = usb.core.find(idVendor=0x1a86, idProduct=0xe008) # Digital Multimeter UT61B if dev == None: print ('Multimeter not found') else: print ('Multimeter was found') dev.set_configuration() cfg = dev.get_active_configuration() intf = cfg[(0,0)] ep = usb.util.find_descriptor( intf, custom_match = \ lambda e: \ usb.util.endpoint_direction(e.bEndpointAddress) == \ usb.util.ENDPOINT_IN) if ep == None: print ('ep is None') else: s = ep.read(64, 500) print ('Len s: ' + len(s)) print ('Starting') Gosub() print ('Ready.-') Result: I presume you saw Starting Multimeter was found File "d:\work-d\PythonProgs\ut61b.py", line 27, in Gosub() File "d:\work-d\PythonProgs\ut61b.py", line 23, in Gosub s = ep.read(64, 500) File "D:\Python3\Lib\site-packages\usb\core.py", line 402, in read return self.device.read(self, size_or_buffer, timeout) File "D:\Python3\Lib\site-packages\usb\core.py", line 988, in read self.__get_timeout(timeout)) File "D:\Python3\Lib\site-packages\usb\backend\libusb1.py", line 851, in intr_read timeout) File "D:\Python3\Lib\site-packages\usb\backend\libusb1.py", line 936, in __read _check(retval) File "D:\Python3\Lib\site-packages\usb\backend\libusb1.py", line 595, in _check raise USBError(_strerror(ret), ret, _libusb_errno[ret]) usb.core.USBError: [Errno 10060] Operation timed out What's wrong? How to fix? Read (again?) the doc for the interface for the device. Because reading timed out, I suspect that it is waiting for a command for it to send something. -- Terry Jan Reedy -- https://mail.python.org/mailman/listinfo/python-list
Re: Is duck-typing misnamed?
On 26Aug2016 19:58, ROGER GRAYDON CHRISTMAN wrote: "If it walks like a duck, quacks like a duck,... " so there is indeed precedence for this so-called 'duck typing' but wouldn't it be more Pythonic to call this 'witch typing'? "How do you know she is a witch?" "She looks like one." etc. I do grant that ultimately, the duck does come into play, since the witch weighs the same as a duck. I disagree. They want to burn her because she's supposedly a witch, but the scientific test was that she weighed as much as a duck. So I think your second example is also duck typing: functioning like a duck. Cheers, Cameron Simpson -- https://mail.python.org/mailman/listinfo/python-list
Re: Is duck-typing misnamed?
On Sat, Aug 27, 2016 at 6:34 PM, Terry Reedy wrote: > On 8/26/2016 7:58 PM, ROGER GRAYDON CHRISTMAN wrote: >> >> "If it walks like a duck, quacks like a duck,... " >> >> so there is indeed precedence for this so-called 'duck typing' >> >> >> but wouldn't it be more Pythonic to call this 'witch typing'? >> >> "How do you know she is a witch?" >> >> "She looks like one." > > > Given that people were once burned to death for 'looking like a witch' (or > sounding or acting), and can still suffer socially for such reasons, this it > not funny to me. We should stick with ducks. > > -- > Terry Jan Reedy > > -- > https://mail.python.org/mailman/listinfo/python-list which ducks? -- Joel Goldstick http://joelgoldstick.com/blog http://cc-baseballstats.info/stats/birthdays -- https://mail.python.org/mailman/listinfo/python-list
Re: Is duck-typing misnamed?
Your response is appreciated. I just thought I'd comment a little more on the script: Woman: I'm not a witch! I'm not a witch! V: ehh... but you are dressed like one. W: They dressed me up like this! All: naah no we didn't... no. W: And this isn't my nose, it's a false one. (V lifts up carrot) V: Well? P1: Well we did do the nose V: The nose? P1: ...And the hat, but she is a witch! They took a woman who originally, I think we might agree, was not a witch, and they added features that were understood to be part of the protocol for witchiness. I think this is very much like me defining methods __iter__ and __next__ and voila, I've turned something into an iterator by witch -- er.. duck-typing! Perhaps she inherited her weight from her latent duckness. Thoughts? Roger Christman On Sat, Aug 27, 2016 06:27 PM, python-list@python.org wrote: > On 26Aug2016 19:58, ROGER GRAYDON CHRISTMAN wrote: >>"If it walks like a duck, quacks like a duck,... " >>so there is indeed precedence for this so-called 'duck typing' >> >>but wouldn't it be more Pythonic to call this 'witch typing'? >>"How do you know she is a witch?" >>"She looks like one." >>etc. >> >>I do grant that ultimately, the duck does come into play, since the witch >>weighs the same as a duck. > >I disagree. They want to burn her because she's supposedly a witch, but the >scientific test was that she weighed as much as a duck. So I think your second >example is also duck typing: functioning like a duck. > >Cheers, >Cameron Simpson > > > -- https://mail.python.org/mailman/listinfo/python-list
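The iterator remark above can be made concrete. In this small sketch the class `Countdown` is an invented example: it inherits from nothing special and only defines `__iter__` and `__next__`, yet `for` loops, `list()` and even the `collections.abc.Iterator` check accept it.

```python
from collections.abc import Iterator


class Countdown:
    """Plain class that gains iterator behaviour purely from its methods."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


values = list(Countdown(3))
looks_like_iterator = isinstance(Countdown(1), Iterator)
print(values, looks_like_iterator)
```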
Re: Is duck-typing misnamed?
c...@zip.com.au writes: > They want to burn her because she's supposedly a witch, but the > scientific test was that she weighed as much as a duck. So I think > your second example is also duck typing: functioning like a duck. Excellent reasoning! (Also, I agree that describing objects with “looks like a witch” brings repressive social context, both historical and present-day, that should not be encouraged. I'd prefer that the Python community refrain from that.) Let's stick to the term “duck typing”. -- \ “Any sufficiently advanced bug is indistinguishable from a | `\ feature.” —Rich Kulawiec | _o__) | Ben Finney -- https://mail.python.org/mailman/listinfo/python-list
Re: What's the best way to minimize the need of run time checks?
2016-08-14 7:29 GMT-07:00 Steven D'Aprano : > On Thu, 11 Aug 2016 06:33 am, Juan Pablo Romero Méndez wrote: > > > I've been trying to find (without success so far) an example of a > > situation where the dynamic features of a language like Python provides a > > clear advantage over languages with more than one type. > > Python has more than one type. Don't confuse dynamic typing with weak > typing > or untyped (typeless) languages. More on this below. > Sorry I was not clear, I was thinking in something along these lines: https://existentialtype.wordpress.com/2011/03/19/dynamic-languages-are-static-languages/ (Warning: his pov is very different from yours) > > I don't believe that you will find "an example of a situation..." as you > say > above. There is a clear situation in my mind: both Python and JavaScript (the dynamic langs I'm most familiar with) provide an easy to use `eval` function. None of the static langs I know provide such functionality (that I'm aware of). Personally I don't have any use for eval, so I was thinking in situations other than that. > It sounds like you are hope to find a clear example of "If you do > This, then dynamic languages are the Clear Winner". But I don't think you > will. Dynamic languages tend to produce clear productivity improvements > over statically typed languages, but of course this is only "typically" > true, not a guarantee that applies to every single programmer or project. > The very few research done on the subject seems to indicate otherwise (here's a link if you are interested in such topics https://www.functionalgeekery.com/episode-55-andreas-stefik/#t=18:14.448). 
> > Typically: > > - dynamic languages are less verbose; > - dynamic languages are faster to develop in; many organisations > prototype applications in Python (say) before re-writing it in > C++/Java/whatever; > - far less time spent fighting the compiler; > - dynamic languages often have fewer bugs, because it is easier to > reason about the code (no "undefined behaviour" like in C!) and > fewer lines of code to reason about; - but statically typed languages allow you to prove the absence > of certain types of bugs. > > The exception is if you try to write statically typed code in a dynamic > language. Then you get the worst of both styles of coding: the verbose, > heavyweight style of many static languages, but without the automated > correctness proofs, plus the performance costs of dynamic typing, but > without the rapid development. > > > > Regarding types and type systems, if you haven't already read this, you > should: > > https://cdsmith.wordpress.com/2011/01/09/an-old-article-i-wrote/ Thanks for the link. I don't have one to give back, but what I can suggest is this book: http://haskellbook.com/. It seems to me that many of your opinions come from using C++ / Java. If that's the case they are completely understandable. Despite their popularity they are by no means good representatives of languages with modern and powerful type systems. F# (dot.net), Haskell, Scala (JVM / Browser), or Elm (Browser) provide much better examples. > > > "Static typing" (e.g. Pascal, C, Java, Haskell) and "dynamic typing" (e.g. > Python, Javascript, Ruby, Lua) differ on when and how values are checked > for type-compatibility. > > "Strong" and "weak" typing are ends of a continuum. Nearly all languages > are > a little bit weak (they allow automatic coercions between numeric types) > but mostly strong (they don't automatically coerce integers to arrays). > Javascript, Perl and PHP are weaker than Python because they'll coerce > strings to numbers automatically and Python won't. 
> > I don't know many untyped languages apart from machine code or maybe > assembly. Perhaps Forth? (Maybe not -- some Forths include a separate > floating point stack as well as the usual stack.) Hypertalk treated > everything as strings. Tcl treats nearly everything as strings, although it > also has arrays. > > So, Python has types, and it is a mostly strong typed language. It will do > relatively few automatic coercions. > > > > -- > Steve > “Cheer up,” they said, “things could be worse.” So I cheered up, and sure > enough, things got worse. > > -- > https://mail.python.org/mailman/listinfo/python-list > -- https://mail.python.org/mailman/listinfo/python-list
Re: What's the best way to minimize the need of run time checks?
On Sun, 28 Aug 2016 12:31 pm, Juan Pablo Romero Méndez wrote:
> 2016-08-14 7:29 GMT-07:00 Steven D'Aprano :
>
>> On Thu, 11 Aug 2016 06:33 am, Juan Pablo Romero Méndez wrote:
>>
>> > I've been trying to find (without success so far) an example of a
>> > situation where the dynamic features of a language like Python provides
>> > a clear advantage over languages with more than one type.
>>
>> Python has more than one type. Don't confuse dynamic typing with weak
>> typing or untyped (typeless) languages. More on this below.
>>
>
> Sorry I was not clear, I was thinking in something along these lines:
>
> https://existentialtype.wordpress.com/2011/03/19/dynamic-languages-are-static-languages/
>
> (Warning: his pov is very different from yours)

It is a great example of somebody suffering from the problem that when the only tool he has is a hammer, everything looks like a nail.

He clearly is immersed in the world of formal type systems and understands their power. But now he sees everything from that single perspective.

Now it is true that speaking in full generality, classes and types refer to different things. Or to be perhaps more accurate, *subclassing* and *subtyping* are different things:

http://c2.com/cgi/wiki?SubTypingAndSubClassing

Many languages treat them the same, but fundamentally they are different.

(Note: for veteran Python programmers who remember the language before types and classes were unified in version 2.2, this is not the same thing! Prior to 2.2, both "types" and "classes" related to *subclassing*, albeit in a negative way for the built-in types: they couldn't be subclassed.)

But the author of this piece ignores that standard distinction and invents his own non-standard one: to him, classes are merely different representations of the same data. E.g. his example of complex numbers, shown as Cartesian (x, y) values or polar (r, θ) values. These aren't two different "kinds of things", but merely two different ways of representing the same entity.

That's not a good way to think about (say) Python lists and Python bools. Lists and bools are in no way the same kind of entity (except in the most general category of "they're both objects").

It's not even a very good way of thinking about complex numbers.

Viewed from his perspective of type systems, the author makes what I call the food processor error. Food processors, blenders and other similar kitchen appliances are often advertised as having "five speeds", or ten speeds, or however many the machine is capable of, usually labelled as "chop", "dice", "whip", "puree", etc. And, far too often: "OFF".

Since when is "off" a speed? If a blender that is turned off counts as a blending speed, then a simple bowl is a "one speed blender". You put food in the bowl, and it doesn't blend at all. That counts as "off" speed.

The author is making the same mistake. He thinks that a language which lacks static typing counts as static typing. "Off" is a speed! "Bald" is a hair colour! "Raw" is a way of cooking food!

In truth though, static and dynamic typing are very different. The author is fooled because you can emulate *one* part of dynamic typing in a statically typed system by adding one extra type, "Any", or "Duck Type", or whatever you want to call it. The static checker can then ignore anything declared as Any type.

But deferring type checks to runtime is only part of dynamic typing. The other fundamental difference is:

- statically typed languages associate types to variables;
- dynamically typed languages associate types to values.

This difference is more significant than the "run-time/compile-time" distinction, and it's one which often confuses people. That's not surprising: if I say "x is an int", that's ambiguous whether I'm referring to the variable x or the value currently assigned to x. If you don't see the ambiguity, then you're seeing it purely from the perspective of either static or dynamic typing.

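The "types belong to values" half of that distinction can be observed in Python itself. The snippet below is only an illustrative sketch with an invented variable name, `thing`: the same name is bound first to an int and then to a str, and the only type check happens at run time, against whatever value the name currently holds.

```python
thing = 1
first_type = type(thing).__name__       # the type comes from the value 1

try:
    thing.upper()                        # checked only when this line runs
except AttributeError as exc:
    runtime_error = str(exc)

thing = "hello"                          # rebinding the name is not an error
second_type = type(thing).__name__       # now the type comes from "hello"
shouted = thing.upper()

print(first_type, second_type, shouted)
print(runtime_error)
```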
In static typing, I somehow associate the name "x" with a tag that says "this may only be used with ints". Perhaps I have to declare it first, like in C or Pascal, or perhaps the compiler can infer the type, like in Haskell, but either way, "x" is now forever tagged as an int, so that the compiler can flag errors like: x = 1 # ... code can run here x.upper() The compiler knows that ints don't have a method "upper" and can flag this as an error. In such static languages, it is invariably an error to try to change the type of the variable (unless it has been tagged as "Anything" or "Duck Typed"). x = 1 x = "hello" # a type error, at compile time But in dynamic typing, the type information isn't associated with the name "x", but with the value 1 currently assigned to it. Change the assignment, and the type changes. As a consequence, it is necessary to move the type checks from compile time to runtime, but that's not the fundamental difference between the two. As further evidence that the author has missed the forest for all the tree
Re: What's the best way to minimize the need of run time checks?
On Sun, Aug 28, 2016 at 2:30 PM, Steve D'Aprano wrote:
> But the author of this piece ignores that standard distinction and
> invents his own non-standard one: to him, classes are merely different
> representations of the same data. E.g. his example of complex numbers,
> shown as Cartesian (x, y) values or polar (r, θ) values. These aren't
> two different "kinds of things", but merely two different ways of
> representing the same entity.
>
> That's not a good way to think about (say) Python lists and Python
> bools. Lists and bools are in no way the same kind of entity (except
> in the most general category of "they're both objects").
>
> It's not even a very good way of thinking about complex numbers.

It might be a good way of thinking about points on a Cartesian plane,
though. Rectangular and polar coordinates truly are just different ways
of expressing the same information. (How well 2D coordinates map to
complex numbers is a separate question.)

> In static typing, I somehow associate the name "x" with a tag that
> says "this may only be used with ints". Perhaps I have to declare it
> first, like in C or Pascal, or perhaps the compiler can infer the
> type, like in Haskell, but either way, "x" is now forever tagged as an
> int, so that the compiler can flag errors like:
>
>     x = 1
>     # ... code can run here
>     x.upper()
>
> The compiler knows that ints don't have a method "upper" and can flag
> this as an error. In such static languages, it is invariably an error
> to try to change the type of the variable (unless it has been tagged
> as "Anything" or "Duck Typed").

So far, I completely agree with you; whether you declare "x takes
integers only" or the compiler infers "x has been assigned to point to
an integer" or any other form of it, attempting to call .upper() on the
integer 1 is an error.

>     x = 1
>     x = "hello"  # a type error, at compile time
>
> But in dynamic typing, the type information isn't associated with the
> name "x", but with the value 1 currently assigned to it. Change the
> assignment, and the type changes. As a consequence, it is necessary to
> move the type checks from compile time to runtime, but that's not the
> fundamental difference between the two.

This is where I'm less sure. Sometimes a variable's type should be
broader than just one concrete type - for instance, a variable might
hold 1 over here, and 1.5 over there, and thus is storing either "int
or float" or "any number". If you have a complex hierarchy of types,
how do you know that this variable should be allowed to hold anything
up to a certain level in the hierarchy, and no further?

If what the compiler's doing is identifying what *is* assigned, then
it's easy. You've given it an int over here and a float over there, and
that's legal; from that point on, the compiler knows that this contains
either an int or a float. (Let's assume it can't know for sure which,
e.g. it has "if (cond) x=1; else x=1.5" where the condition can't be
known till run-time.) But for your example of x="hello" to be a
compilation error, it has to either assume that the first object given
determines the type completely, or be told what types are permitted.

So, for example, I could make a declaration in Pike that says:

    int|float x = 1;

and then x can have either an integer (which, like in Python, is a
bignum) or a float (IEEE 64-bit, again like Python), but not a string.
I could equally say:

    string(8bit)|int(0..) x = 12345;

which would allow x to store a byte-string (an eight-bit string, as
opposed to a Unicode string which stores text) or a non-negative
integer. A type inference system that can't handle variables like this
is limited; but if it _can_ handle something like this, how can it flag
an error at compile time? It'd just infer a more complicated type.

How is this resolved in type-inferring languages? (Genuine question,
not rhetorical. I haven't used type-inferring languages in this way.)

> As further evidence that the author has missed the forest for all the
> trees, consider languages which actually do have only a single type:
>
> - in assembly language, everything is just bytes or words;
>
> - in Forth, similarly, everything is just a 16-bit or 32-bit word;
>
> - in Hypertalk, every value is stored internally as a string;
>
> to say nothing of more esoteric languages like Ook, Whitespace and
> BrainF*ck.

Turing Tarpits (Ook, Brain*, etc) tend to be like assembly language,
treating everything as cells (assembly language might call those cells
either "bytes" or "words"). REXX is like Hypertalk - everything truly
is a string. Shell scripting languages generally treat everything as
strings, too (although bash has arrays too). I can't imagine any
untyped language using anything other than bytes/words or strings, but
I'm sure someone's done it somewhere.

ChrisA
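For anyone who wants to poke at the int|float idea in Python itself, here is a minimal sketch. It is only an approximation of the Pike declaration: Python does not enforce annotations at run time, so the hypothetical helper `check_number` (not part of any library) does the rejection with an explicit isinstance test. It does not show how a type-inferring compiler would decide anything.

```python
from typing import Union

# Rough analogue of Pike's "int|float": either kind of number, nothing else.
Number = Union[int, float]


def check_number(value: object) -> Number:
    """Hypothetical helper: accept an int or a float, reject everything else."""
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError(f"expected int or float, got {type(value).__name__}")
    return value


x: Number = check_number(1)   # an int is accepted
x = check_number(1.5)         # so is a float; the name x is not tied to int

try:
    x = check_number("hello")  # rejected, but only when this line runs
except TypeError as exc:
    rejected = str(exc)
```

The point of the sketch is the timing: the string is refused when the assignment executes, whereas the Pike declaration lets the compiler refuse it before the program runs.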
Re: What's the best way to minimize the need of run time checks?
2016-08-27 21:30 GMT-07:00 Steve D'Aprano:

> On Sun, 28 Aug 2016 12:31 pm, Juan Pablo Romero Méndez wrote:
>
> > 2016-08-14 7:29 GMT-07:00 Steven D'Aprano:
> >
> >> On Thu, 11 Aug 2016 06:33 am, Juan Pablo Romero Méndez wrote:
> >>
> >> > I've been trying to find (without success so far) an example of a
> >> > situation where the dynamic features of a language like Python
> >> > provide a clear advantage over languages with more than one type.
> >>
> >> Python has more than one type. Don't confuse dynamic typing with
> >> weak typing or untyped (typeless) languages. More on this below.
> >
> > Sorry I was not clear, I was thinking of something along these
> > lines:
> >
> > https://existentialtype.wordpress.com/2011/03/19/dynamic-languages-are-static-languages/
> >
> > (Warning: his pov is very different from yours)
>
> It is a great example of somebody suffering from the problem that when
> the only tool he has is a hammer, everything looks like a nail.
>
> He clearly is immersed in the world of formal type systems and
> understands their power. But now he sees everything from that single
> perspective.
>
> Now it is true that speaking in full generality, classes and types
> refer to different things. Or to be perhaps more accurate,
> *subclassing* and *subtyping* are different things:
>
> http://c2.com/cgi/wiki?SubTypingAndSubClassing
>
> Many languages treat them the same, but fundamentally they are
> different.

Oh, I don't think he is thinking in terms of OO "classes"; I think he
meant two different "kinds" or "varieties" of values (although "kind"
has a technical meaning).

In TypeScript terms, what he is saying can be described like this:

    type Complex = { real: number, i: number } | { r: number, φ: number }

    const c1: Complex = { real: 1, i: 1 }
    const c2: Complex = { r: 1, φ: 0.5 }

You have two values of the same type but different representations.
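Since this is python-list, here is a rough Python rendering of the TypeScript union above, as a sketch only. The class names `Cartesian` and `Polar` and the helper `to_builtin` are made up for illustration; the one library call relied on is `cmath.rect`, which converts polar coordinates to a built-in complex.

```python
import cmath
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Cartesian:
    real: float
    imag: float


@dataclass(frozen=True)
class Polar:
    r: float
    phi: float


# One type, two representations, mirroring the TypeScript union.
Complex = Union[Cartesian, Polar]


def to_builtin(z: Complex) -> complex:
    """Convert either representation to Python's built-in complex."""
    if isinstance(z, Cartesian):
        return complex(z.real, z.imag)
    return cmath.rect(z.r, z.phi)


c1 = Cartesian(real=1.0, imag=1.0)
c2 = Polar(r=1.0, phi=0.5)
```

Code that only goes through `to_builtin` never needs to know which representation it was handed, which is roughly the sense of "same type, different representation" meant here.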

> (Note: for veteran Python programmers who remember the language before
> types and classes were unified in version 2.2, this is not the same
> thing! Prior to 2.2, both "types" and "classes" related to
> *subclassing*, albeit in a negative way for the built-in types: they
> couldn't be subclassed.)
>
> But the author of this piece ignores that standard distinction and
> invents his own non-standard one: to him, classes are merely different
> representations of the same data. E.g. his example of complex numbers,
> shown as Cartesian (x, y) values or polar (r, θ) values. These aren't
> two different "kinds of things", but merely two different ways of
> representing the same entity.
>
> That's not a good way to think about (say) Python lists and Python
> bools. Lists and bools are in no way the same kind of entity (except
> in the most general category of "they're both objects").
>
> It's not even a very good way of thinking about complex numbers.
>
> Viewed from his perspective of type systems, the author makes what I
> call the food processor error. Food processors, blenders and other
> similar kitchen appliances are often advertised as having "five
> speeds", or ten speeds, or however many the machine is capable of,
> usually labelled as "chop", "dice", "whip", "puree", etc. And, far too
> often: "OFF".
>
> Since when is "off" a speed? If a blender that is turned off counts as
> a blending speed, then a simple bowl is a "one speed blender". You put
> food in the bowl, and it doesn't blend at all. That counts as "off"
> speed.
>
> The author is making the same mistake. He thinks that a language which
> lacks static typing counts as static typing. "Off" is a speed! "Bald"
> is a hair colour! "Raw" is a way of cooking food!
>
> In truth though, static and dynamic typing are very different. The
> author is fooled because you can emulate *one* part of dynamic typing
> in a statically typed system by adding one extra type, "Any", or "Duck
> Type", or whatever you want to call it. The static checker can then
> ignore anything declared as Any type.
>
> But deferring type checks to runtime is only part of dynamic typing.
> The other fundamental difference is:
>
> - statically typed languages associate types to variables;
> - dynamically typed languages associate types to values.
>
> This difference is more significant than the "run-time/compile-time"
> one, and it's one which often confuses people. That's not surprising:
> if I say "x is an int", that's ambiguous whether I'm referring to the
> variable x or the value currently assigned to x. If you don't see the
> ambiguity, then you're seeing it purely from the perspective of either
> static or dynamic typing.
>
> In static typing, I somehow associate the name "x" with a tag that
> says "this may only be used with ints". Perhaps I have to declare it
> first, like in C or Pascal, or perhaps the compiler can infer the
> type, like in Haskell, but either way, "x" is now forever tagged as an
> int, so that the compiler can flag errors like:
>
>     x = 1
>     # ... code can run here
>     x.upper()
>
> The compiler knows that ints don't have a method "upper" and can flag
> this as an error.
Re: What's the best way to minimize the need of run time checks?
Chris Angelico writes:

> On Sun, Aug 28, 2016 at 2:30 PM, Steve D'Aprano wrote:
>> But in dynamic typing, the type information isn't associated with the
>> name "x", but with the value 1 currently assigned to it. Change the
>> assignment, and the type changes. As a consequence, it is necessary
>> to move the type checks from compile time to runtime, but that's not
>> the fundamental difference between the two.
>
> This is where I'm less sure. Sometimes a variable's type should be
> broader than just one concrete type - for instance, a variable might
> hold 1 over here, and 1.5 over there, and thus is storing either "int
> or float" or "any number". If you have a complex hierarchy of types,
> how do you know that this variable should be allowed to hold anything
> up to a certain level in the hierarchy, and no further?

It's not just literal values that give potential type information in a
dynamically typed language. Another source is functions that the
compiler knows, and this information propagates back and forth in the
analysis of the control flow.

For example, below the compiler might infer that x must be a number but
not a complex number, then generate one type check (which it might be
able to prove redundant) and calls to specialized versions of ceiling
and floor.

    d = ceiling(x) - floor(x)

Also known is that the results of the calls are numbers and the
difference of numbers is a number, so d gets assigned a number. Perhaps
ceiling and floor in the language always return an int. Then d is known
to be an int. And so on.

I think "soft typing" refers to this kind of work.

Scheme implementations have specialized arithmetic operators that
require their arguments to be floats. The generic operators do the same
thing when their arguments are floats, but the specialized operators
provide this type information to an optimizing compiler. This is
related to specialized container types that can only contain floats.
This way it may be possible to arrange the code so that the compiler
knows the type of everything, or almost everything, which helps the
compiler in its art.
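The propagation described for `d = ceiling(x) - floor(x)` can be imitated with a toy in Python. This is only a sketch under loud assumptions: types are plain strings, the `SIGNATURES` table and both helper functions are invented for illustration, and no real compiler is claimed to work this way.

```python
# Hypothetical signature table: name -> (argument types accepted, result type).
# It assumes, as suggested above, that ceiling and floor always return an int.
SIGNATURES = {
    "ceiling": ({"int", "float"}, "int"),
    "floor": ({"int", "float"}, "int"),
}


def infer_call(func, arg_types):
    """Narrow the argument's possible types; return (narrowed, result type)."""
    accepted, result = SIGNATURES[func]
    narrowed = arg_types & accepted
    if not narrowed:
        raise TypeError(f"{func}() cannot accept any of {sorted(arg_types)}")
    return narrowed, result


def infer_difference(left, right):
    """Result type of left - right for the two numeric names used here."""
    return "int" if left == right == "int" else "float"


# Before the analysis, x could be anything this toy language has.
x_types = {"int", "float", "complex", "str"}

# Each call both yields a result type and tells us more about x.
x_types, ceil_type = infer_call("ceiling", x_types)
x_types, floor_type = infer_call("floor", x_types)
d_type = infer_difference(ceil_type, floor_type)
```

After the two calls, `x_types` has shrunk to int or float (a number, but not a complex one), and `d_type` is int, which matches the reasoning in the message.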
Re: What's the best way to minimize the need of run time checks?
On Sun, Aug 28, 2016 at 4:13 PM, Jussi Piitulainen wrote:
>> This is where I'm less sure. Sometimes a variable's type should be
>> broader than just one concrete type - for instance, a variable might
>> hold 1 over here, and 1.5 over there, and thus is storing either "int
>> or float" or "any number". If you have a complex hierarchy of types,
>> how do you know that this variable should be allowed to hold anything
>> up to a certain level in the hierarchy, and no further?
>
> It's not just literal values that give potential type information in a
> dynamically typed language. Another source is functions that the
> compiler knows, and this information propagates back and forth in the
> analysis of the control flow.
>
> For example, below the compiler might infer that x must be a number
> but not a complex number, then generate one type check (which it might
> be able to prove redundant) and calls to specialized versions of
> ceiling and floor.
>
>     d = ceiling(x) - floor(x)
>
> Also known is that the results of the calls are numbers and the
> difference of numbers is a number, so d gets assigned a number.
> Perhaps ceiling and floor in the language always return an int. Then d
> is known to be an int. And so on.

Right, and I understand this concept. Consider this code:

    x = 5;
    ...
    if (some_condition) x = "five";
    else x = [0, 0, 0, 0, 0];

(adjust syntax to whatever language you like)

Does this mean that the type of x is int|string|list, or will this be
an error? Assuming the condition can't be known until run time (e.g. it
involves user input), there's no way for a static analyzer to
differentiate between this code and the form that Steven put forward:

>     x = 1
>     x = "hello"  # a type error, at compile time

Simple type inference would either see this as meaning that x is
int|string, or possibly it'd say "x is an int up to that second line,
and a string thereafter" (which is basically like dynamic typing but
statically checked - it's the value, not the variable, that has a type,
and checks like x.upper() would take note of that). But if it flags it
as an error, that would basically mean that the type system is
(probably deliberately) simplistic and restrictive, requiring that x be
EITHER an integer variable OR a string variable, and not both. Which is
a perfectly viable stance, but I'm just not sure if it's (a) what is
done, or (b) ideal. Particularly since it'd end up requiring some
annoying rules, like "integers and floats are compatible, but nothing
else, including user-defined types" or "integers and floats are
fundamentally different things, and if you want your variable ever to
contain a float, you have to always use 1.0 instead of just 1", neither
of which I like.

ChrisA
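To make the two policies in this message concrete, here is a small Python sketch. It is purely illustrative: `infer_union`, `infer_strict` and `method_is_safe` are hypothetical names, and the "analysis" is just a list of the types assigned to x in the example, not a real checker.

```python
def infer_union(assigned_types):
    """Permissive policy: x's type is the union of everything assigned to it."""
    return frozenset(assigned_types)


def infer_strict(assigned_types):
    """Strict policy: the first assignment fixes the type; later ones must match."""
    first, *rest = assigned_types
    for later in rest:
        if later is not first:
            raise TypeError(
                f"variable is {first.__name__}, cannot assign {later.__name__}"
            )
    return first


def method_is_safe(var_types, method, methods_by_type):
    """A call is statically safe only if every possible type has the method."""
    return all(method in methods_by_type[t] for t in var_types)


# The three assignments to x in the example: 5, "five", [0, 0, 0, 0, 0].
assignments = [int, str, list]

union_type = infer_union(assignments)  # int|str|list
methods = {t: set(dir(t)) for t in (int, str, list)}
upper_ok = method_is_safe(union_type, "upper", methods)

try:
    infer_strict(assignments)
    strict_error = None
except TypeError as exc:
    strict_error = str(exc)
```

Under the permissive policy nothing is an error until something like `x.upper()` is attempted on the union; under the strict policy the second assignment is already rejected. Neither outcome says which policy a given language actually uses.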