On Sun, Feb 18, 2018 at 5:05 AM, Steven D'Aprano <steve+comp.lang.pyt...@pearwood.info> wrote:
> On Sat, 17 Feb 2018 15:25:15 +1100, Chris Angelico wrote:
>
>> 1) Type safety.
>>
>> This is often touted as a necessity for industrial-grade software. It
>> isn't. There are many things that a type system, no matter how
>> sophisticated, cannot catch;
>
> The usual response to that is to make ever-finer-grained types, until the
> type-system can prove the code is correct.
>
> integers
> positive integers
> positive integers greater than 10
> positive integers greater than 10 but less than 15003
> positive odd integers greater than 10 but less than 15003
> positive odd integers greater than 10 but less than 15003 divisible by 17
>
> Of course, this has a few minor (ha!) difficulties... starting with the
> hardest problem in computer science, naming things.
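For illustration only, here is a rough Python sketch of the last rung of that ladder. The class name SpamCount is hypothetical, and this is runtime validation in a constructor, not a static type system proving anything at compile time. It mostly shows how awkward the naming gets.

```python
# Hypothetical example: "SpamCount" is an invented name, and the checks
# happen at runtime, not in a static type checker.
class SpamCount(int):
    """A positive odd integer greater than 10, less than 15003, divisible by 17."""

    def __new__(cls, value):
        n = int(value)
        if not 10 < n < 15003:
            raise ValueError(f"{n} is not strictly between 10 and 15003")
        if n % 2 != 1:
            raise ValueError(f"{n} is not odd")
        if n % 17 != 0:
            raise ValueError(f"{n} is not divisible by 17")
        return super().__new__(cls, n)


print(SpamCount(51))  # 51 = 3 * 17: odd, in range, accepted

try:
    SpamCount(34)  # an even multiple of 17
except ValueError as exc:
    print("rejected:", exc)
```

Note that arithmetic on a SpamCount hands back a plain int, so the constraint is lost the moment the value is used in an expression. That is one of the "minor" difficulties.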

Naming things isn't a problem if we're working with a type inference
system. On the flip side, if your last example is purely type inference,
it's not really a type checking system - it's a holistic static analysis.
You can't say "TypeError: spamminess must be less than 15003" without
also saying "oh but that might be a bug in this function, since it's
meant to be able to take numbers >= 15003".

Some of the type names CAN be generated algebraically. For instance, Pike
lets you declare that something is an "int", or "array(int)" (an array of
integers), or "int(1..)" (integer, minimum of 1, no maximum - in other
words, positive integer), or "array(int(1..))" (yup, array of positive
integers). You could probably devise a system like
"int(11..15002|1%2|0%17)" to mean "must be between 11 and 15002, and must
equal 1, modulo 2, and must equal 0, modulo 17". I'm not sure how often
it would be of value, though, and it's pretty ugly.

> Even if you can come up with unique, concise names for these types that
> won't overwhelm the reader, it isn't clear that the type system will
> always be capable of representing such fine distinctions. How would you
> specify two string types, one for personal names and one for family
> names, so that the compiler can detect any attempt to assign a family
> name to a personal name, or vice versa?

That's where the type system breaks down and the variable naming system
shines. My favourite example here is of collections (which should be
named in the plural) and their elements (which usually won't be). For
instance:

for msg in msgs:
for person in people:
for character in disney_princesses:
for item in recipe.ingredients:

No type system can reliably figure out that "msg" is singular and "msgs"
is plural.
And the concrete data types might even be identical ("msgs" could be a
dict mapping message IDs to their actual messages, and then "msg" could
be a dict mapping headers to their values - "for msg in msgs.values():"
would thus iterate through one dict, yielding other dicts), even though
*to the programmer* they are completely different, so type inference
would need a lot of help.

>> for some reason, though, we don't hear
>> people saying "C is useless for industrial-grade software because it
>> doesn't have function contracts".
>
> You obviously don't speak to Eiffel programmers then :-)

True, I don't, but I'm not surprised there are people who think that way.
But how many people write blog posts like the one that sparked this
thread, clickbaitingly describing C as useless for serious work?

>> Anyway, if you want some system of type checking, you can use static
>> analysis (eg tools like MyPy) to go over your code the same way a
>> compiler might.
>
> Indeed. Despite our criticisms of the *attitude* that static typing is a
> panacea, it must be recognised that it is useful, and the bigger the
> project, the more useful it is. And some type checkers are *very*
> impressive. Google for "compiler found my infinite loop" for a classic
> example of a compiler detecting at compile-time that a while loop would
> never terminate.

Exactly. Though there's a bit of a blurring now between "type checking"
and "holistic static analysis". I've seen some incredible discoveries by
Coverity; extremely narrow situational bugs where, if this happens and
that fails and thingy was exactly 47, then the response message might use
one byte more space than its buffer. That's pretty useful and seriously
impressive, but I'm terrified of any sort of "type system" that could
actually give a NAME to the data type that shows up this bug.

> I can understand people saying that for sufficiently large projects, they
> consider it indispensable to have the assistance of a type checker.
> That in and of itself is no worse than saying that, as a writer, I find a
> spell checker to be indispensable.

Hmm, I do think there are a lot of people who take lessons learned on
gigantic projects with huge contributor teams, and then say "EVERY
program needs these tools". When you write a 200-page book, you might
find an automated table of contents to be, not simply a useful tool, but
an absolute necessity. Great! But when you write a one-screen README,
that TOC creator is useless, along with the boilerplate in your document
to tell it what to do. (Can you imagine adding a type checker to bash
scripting?)

>> "The first glaring issue is that I have no guarantee that is_valid()
>> returns a bool type." -- huh? It's being used in a boolean context, and
>> it has a name that starts "is_". How much guarantee are you looking for?
>> *ANY* object can be used in an 'if', so it doesn't even matter. This is
>> a stupidly contrived criticism.
>
> I don't think so -- I think a lot of people really have difficulty coming
> to terms with Python's duck-typing of bools. They just don't like, or
> possibly even grok, the idea of truthy and falsey values, and want the
> comfort of knowing that the value "really is" a True or False.

Okay, but even if you don't grok the truthy/falsey concept, it says
"is_valid". Unless you're expecting outright MALICIOUS code, you should
be able to assume that "is_valid" returns a boolean. Oh wait. We're
talking about programmers here. Malicious has nothing on the rampant
stupidity...

https://thedailywtf.com/articles/What_Is_Truth_0x3f_

Still, it's not a fault of *this* function if it expects the normal case.
It's no worse to expect is_valid to return a boolean (or something usable
in a boolean context) than to expect math.sqrt to return a non-negative
number. Sure, sqrt might have a bug in it so it returns a negative... but
that's not your problem.

> We can come up with some contrived justifications for this...
> what if is_valid() contains a bug:
>
>     def is_valid(arg):
>         if some condition:
>             return arg  # oops I meant True
>         return False
>
> then static analysis would detect this. With truthiness, you can't tell:
> what if *nearly* all the input args just happen to be truthy? Then the
> code will nearly always work, and the errors will be perplexing.

Right, and that's either a bug, or a design flaw (maybe the returning of
'arg' was intentional, because the author thought that ALL these args
were truthy). That's fine. The name implies that it's returning a
boolean, so you can look at this function *on its own* and pinpoint the
(potential) problem.

> But I consider that a fairly contrived scenario, and one with at least
> two alternate solutions: code review, and unit tests.

Exactly. Also, this sort of thing DOES happen, so if ever you add a
__bool__ method to an object, consider the implications.

> But still, I do see the point that a static analyser could have picked up
> that error even if you didn't have code review, even if the person
> writing the unit tests never imagined this failure mode.

Perhaps. On the flip side, unless you rigidly demand that "if" statements
must ONLY operate on the two values True and False, you still won't
detect these problems at the calling site. You might detect them inside
is_valid (if you declare that it'll return bool, and then return
something that's not a bool, poof, error), but in the original complaint,
type checking can't help without straitjacketing conditionals.

>> Totally not true. The GIL does not stop other threads from running.
>> Also, Python has existed for multiple CPU systems pretty much since its
>> inception, I believe. (Summoning the D'Aprano for history lesson?)
>
> If you're talking about common desktop computers, I think you're
> forgetting how recent multicore machines actually are.
> I'm having difficulty finding when multicore machines first hit the
> market, but it seems to have been well into the 21st century -- perhaps
> as late as 2006 with the AMD Athlon 64 X2:

No, I'm talking about big iron. Has Python been running on multi-CPU
supercomputers earlier than that?

> By the way, multiple CPU machines are different from CPUs with multiple
> cores:
>
> http://smallbusiness.chron.com/multiple-cpu-vs-multicore-33195.html

Yeah, it was always "multiple CPUs", not "multiple cores" when I was
growing up. And it was only ever in reference to the expensive hardware
that I could never even dream of working with. I was always on the
single-CPU home-grade systems.

> Certainly though there have been versions of Python without a GIL for a
> long time:
>
> Jython started as JPython, in 1997; IronPython was started around 2003 or
> so, and reached the 1.0 milestone in 2006.
>
> Fun fact: (then) Microsoft engineer Jim Hugunin created both JPython and
> IronPython!

Yes. I'm not sure how Jython handles concurrency; I'm totally in the dark
about IronPython. I suspect both of them let the underlying system handle
it, but that doesn't help me as I don't know that either. How well do
they handle the "two threads spinning, incrementing the same global"
stress test? I doubt they'll improve the efficiency.

ChrisA
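
For anyone wanting to try the "two threads spinning, incrementing the same global" stress test mentioned above, a minimal sketch might look like the following. The function names (spin, run) and the iteration count are assumptions made for illustration, not anything from the thread. It only counts lost updates; it says nothing about speed.

```python
import threading

counter = 0  # the shared global that every thread hammers on


def spin(iterations):
    """Increment the shared global in a tight loop, with no locking."""
    global counter
    for _ in range(iterations):
        counter += 1  # read-modify-write, not guaranteed atomic across threads


def run(iterations=100_000, workers=2):
    """Run the stress test and return (observed, expected) totals."""
    global counter
    counter = 0
    threads = [
        threading.Thread(target=spin, args=(iterations,))
        for _ in range(workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter, iterations * workers


got, expected = run()
print(f"got {got}, expected {expected}, lost {expected - got}")
```

Whether any increments are actually lost depends on the interpreter and its version, so treat a clean run as "didn't show up this time" rather than proof of safety. Wrapping the increment in a threading.Lock makes the total reliable at the cost of throughput.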