Re: Comparisons and sorting of a numeric class....

Andrew Robinson Thu, 15 Jan 2015 17:47:29 -0800


On 01/15/2015 12:41 AM, Steven D'Aprano wrote:

On Wed, 14 Jan 2015 23:23:54 -0800, Andrew Robinson wrote:


[...]

A subclass is generally backward compatible in any event -- as it is
built upon a class, so that one can almost always revert to the base
class's meaning when desired -- but subclassing allows extended meanings
to be carried.  eg: A subclass of bool is a bool -- but it can be MORE
than a bool in many ways.

You don't have to explain the benefits of subclassing here.

I'm still trying to understand why you think you *need* to use a bool
subclass. I can think of multiple alternatives:

- don't use True and False at all, create your own multi-valued
   truth values ReallyTrue, MaybeTrue, SittingOnTheFence, ProbablyFalse,
   CertainlyFalse (or whatever names you choose to give them);

- use delegation to proxy True and False;

- write a class to handle the PossiblyTrue and PossiblyFalse cases,
   and use True and False for the True and False cases;

There may be other alternatives, but what problem are you solving that
you think

     class MyBool(bool): ...

is the only solution?

That's an unfair question that has multiple overlapping answers.

Especially since I never said subclassing bool is the 'only' solution; I have indicated it's a far better solution than many.
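For readers following along, the restriction under discussion can be reproduced directly. This is a minimal sketch; the name MyBool is taken from the quoted question and the variable subclass_error is only illustrative:

```python
# CPython refuses bool as a base class, so the class statement itself fails.
try:
    class MyBool(bool):
        pass
except TypeError as exc:
    subclass_error = str(exc)
else:
    subclass_error = None

print(subclass_error)
```

The failure happens when the class is defined, not when it is instantiated, so no instance-level workaround applies.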

So -- I'll just walk you through my thought processes and you will see what I consider problems:

Start with the concept that as an engineer, I have spent well over twenty years on and off dealing with boolean values that are very often mixed indistinguishably with 'don't care' or 'tri-state' or 'metastable states'. A metastable state *is* going to be True or False once the metastability resolves by some condition of measurement/timing/etc.; but that value can not be known in advance. eg: similar to the idea that there is early and late binding in programming.... Sometimes there is a very good reason to delay making a final decision until the last possible moment; and it is good to have a default value defined if no decision is made at all.

So -- From my perspective, Guido making Python go from an open ended and permissive use of anything goes as a return value that can handle metastable states -- into a historical version of 'logic' having *only* two values in a very puritanical sense, is rather -- well -- disappointing. It makes me wonder -- what hit the fan?! Is it lemmings syndrome ? a fight ? no idea.... and is there any hope of recovery or a work around ?

eg: To me -- (as an engineer) undefined *IS* equivalent in usage to an actual logic value, just as infinity is a floating point value that is returned as a 'float'. You COULD/CAN separate the two values from each other -- but always with penalties. They generally share an OOP 'is' relationship with respect to how and when they are used. (inf) 'IS' a float value and -- uncertain -- 'IS' a logic value.

That is why I automatically thought before I ever started writing on this list (and you are challenging me to change...) -- that 'uncertain' should share the same type (or at least subtype) as Bool. Mathematicians can argue all they want that 'infinity' is not a float value, and uncertain is not a True or False. And they are/will be technically right -- But as a practical matter -- I think programmers have demonstrated over the years that good code can handle 'infinity' most efficiently by considering it a value rather than an exception. And I think the same kind of considerations very very likely apply to Truth values returned from comparisons found in statistics, quantum mechanics, computer logic design, and several other fields that I am less familiar with.
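The point about infinity being an ordinary float value can be illustrated with a short sketch. The function name smallest is hypothetical; it only shows that using inf as a starting value removes the need for a special "nothing seen yet" branch:

```python
import math

def smallest(values):
    """Return the smallest value, or inf when the input is empty."""
    best = math.inf          # inf is a float, so ordinary '<' works on it
    for v in values:
        if v < best:
            best = v
    return best

inf_is_float = isinstance(math.inf, float)
print(inf_is_float, smallest([4.5, 2.0, 9.1]), smallest([]))
```

The empty case simply returns inf instead of raising, which is the "value rather than exception" style described above.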


So -- let's look at the examples you gave:

- don't use True and False at all, create your own multi-valued
   truth values ReallyTrue, MaybeTrue, SittingOnTheFence, ProbablyFalse,
   CertainlyFalse (or whatever names you choose to give them);

OK.  So -- what do I think about when I see your suggestion:

First I need to note where my booleans come from -- although I've never called it multi-valued logic... so jargon drift is an issue... though you're not wrong, please note the idea of multi-value is mildly misleading.

The return values I'm concerned about come from a decimal value after a comparison with another decimal value.

eg:

a = magicFloat( '2.15623423423(1)' )
b = magicFloat('3()')

myTruthObject = a>b

Then I look at python development historically and look at the built in class's return values for compares; and I notice; they have over time become more and more tied to the 'type' bool. I expect sometime in the future that python may implement an actual type check on all comparison operators so they can not be used to return anything but a bool. (eg: I already noticed a type check on the return value of len() so that I can't return infinity, even when a method clearly is returning an infinitely long iterator -- such as a method computing PI dynamically. That suggests to me that there is significant risk in python of having type checking on all __xx__ methods in the future. ) This inspection is what points me foremost to saying that very likely, I am going to want a bool or subtype of it (if possible) as a return type as self defense against future changes in Python -- although at present, I can still get away with returning other types if bool turns out to be impossible.
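The contrast described above between len() and the comparison operators can be checked with a small sketch. The class name Endless and its return values are made up for illustration:

```python
import math

class Endless:
    """Hypothetical object that claims infinite length."""

    def __len__(self):
        return math.inf      # not an integer, so len() will reject it

    def __gt__(self, other):
        return "maybe"       # comparisons may currently return any object

len_error = None
try:
    len(Endless())
except TypeError as exc:
    len_error = exc

compare_result = Endless() > 3
print(type(len_error).__name__, compare_result)
```

So the type check on __len__ is real today, while rich comparison methods are still free to return non-bool objects.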

Next, I notice that for compatibility it *is* very desirable that I use the existing '>' operator, because programmers generally want to be able to use '>' when they are testing greater than -- and in legacy code I expect people have exclusively done so -- and I know from past experience that programmers in general will not be happy with typing 'a.greaterThan(b)' religiously. ( Extend my reasoning to all other comparison operators.)

It would be worse to use '>' and have it trigger an exception when a non-bool is encountered to force the programmer to attend to special metastable states differently; because then the programmer has to write a complete set of secondary handling routines or a 'try' statement around a very large number of lines of code and that makes for legacy code rewriting rather than minor upgrading...

It would also be bad to have my code have modal settings because I don't want to bother with thread information or have programmers consider thread issues unless it's a last resort; although that's the approach used by the Decimal class and other examples I have seen.
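The modal, per-thread behaviour of the decimal module mentioned above can be seen in a short sketch. The dictionary seen and the function worker are illustrative names; the result for the worker thread is what a default CPython build typically shows, since each new thread normally starts from decimal.DefaultContext:

```python
import decimal
import threading

seen = {}

def worker():
    # A new thread does not see precision changes made in the main thread.
    seen["worker_prec"] = decimal.getcontext().prec

with decimal.localcontext() as ctx:
    ctx.prec = 5                                   # modal setting, main thread only
    seen["main_prec"] = decimal.getcontext().prec
    t = threading.Thread(target=worker)
    t.start()
    t.join()

print(seen)
```

This is the kind of thread bookkeeping a library with modal comparison settings would push onto its users.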

So: In general, the most desirable return type is determined by what python actually returns for normal comparison operations; eg: apparently a bool -- but with some way of signaling (if the user cares) that more precise information is available as to why a value is False if it is False.

Unfortunately, the '>', '<', '==', and other operators have no way of returning additional information on their own; so again, a second (undesirable) function/method would need to be invoked to overcome the limitation if only a strict bool is allowed as a return type; and that means concomitant issues of storage and wasted re-computation and threads.

So, your first alternative is the most at risk of future problems due to constraints likely to be placed on comparison return types ; and as I don't want to do much maintenance on my library in the future -- I don't think that is a very good choice for making my library with.

- use delegation to proxy True and False;

That sounds like a far more likely to succeed alternative, and is one of a handful of alternatives I have been exploring on my own.

Proxies allow detection of an actual deterministic False vs. a default False. So a proxy's id() can signal to a user when it is possible to upgrade a False to True should they care. Therefore -- If Guido would see fit to permanently allow proxied True and False values to be returned in lieu OF an actual True and False value, then this would be a near ideal alternative. But Python does not implement a general purpose proxy that I know of ...

I have gotten a single instance of a class acting as a proxy to mostly work; and I have gotten isinstance( myTruthValue, bool ) to return True for the proxy object -- which is not a bool itself. However, when I attempt multiple instances of the proxy -- it becomes more difficult. I think a pure python implementation might be possible -- and I'll continue to try for a while -- but python may not be able to do it totally from the python side because there is a difference in how Python handles type() checks and isinstance() checks.
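One way to get the isinstance() behaviour described above is to override the __class__ attribute, which isinstance() consults when type() does not match. This is only a sketch under that assumption, not necessarily the approach used here; TruthProxy and its certain attribute are hypothetical names:

```python
class TruthProxy:
    """Hypothetical stand-in for True/False that also records certainty."""

    def __init__(self, value, certain=True):
        self._value = bool(value)
        self.certain = certain

    @property
    def __class__(self):
        # isinstance() falls back to __class__; type() ignores it.
        return bool

    def __bool__(self):
        return self._value

default_false = TruthProxy(False, certain=False)

print(isinstance(default_false, bool))   # consults __class__
print(type(default_false) is bool)       # looks at the real type
print(default_false is False)            # identity still differs
```

The last two lines show the difference between type() and isinstance() checks, and why code written as "flag is False" would not recognise such a proxy.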

- write a class to handle the PossiblyTrue and PossiblyFalse cases,
   and use True and False for the True and False cases;

I very much would want to do as you state here because it would preserve both True and False unaltered --- which would ALWAYS work in legacy code; but I don't know how to do it safely.

Although I can use True for absolute Truth -- I can not use False as absolute False without inviting confusion as to when to allow advanced compares.

When I do a comparison on any False < False, in legacy code -- it needs to return False.

But, when looking at uncertainty values, if totally False 'is' the same as base type False -- then the issue arises that a comparison False < AnyOtherPartTrueFalse needs to be False for legacy compares but True for advanced compares;

It's inconsistent and I have no way of detecting where the False I am comparing with came from, so I can't make a proper decision.

So: The only solution I see is to assume that whenever an uncertainty is compared against a legacy bool -- that the legacy style of comparison is absolutely required for safety; and a second version of False must be defined to detect when the compare needs to take uncertainty into account.

All of these issues are handled correctly in the example tuple class I already showed. So the tuple class I showed is presently the best solution with the most compatibility that I have found so far.
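The tuple class itself is not reproduced in this message. Purely as a guess at the general shape, the legacy-versus-advanced rule described above could be sketched as follows; the name Truth, the (value, certain) layout, and the choice that an uncertain False sorts below a certain False are all assumptions made for illustration:

```python
class Truth(tuple):
    """Hypothetical (value, certain) pair that is truthy like its first item."""

    def __new__(cls, value, certain=True):
        return super().__new__(cls, (bool(value), bool(certain)))

    def __bool__(self):
        return self[0]

    def __lt__(self, other):
        if isinstance(other, Truth):
            return tuple.__lt__(self, other)   # advanced compare: certainty counts
        return self[0] < bool(other)           # legacy compare against a plain bool

    def __gt__(self, other):
        if isinstance(other, Truth):
            return tuple.__gt__(self, other)
        return self[0] > bool(other)

maybe_false = Truth(False, certain=False)
sure_false = Truth(False)

print(maybe_false < sure_false)   # advanced compare between two Truth objects
print(maybe_false < False)        # legacy compare with the built-in False
print(False < maybe_false)        # reflected through Truth.__gt__
```

The built-in False never needs to change: the decision between the two comparison styles is made by checking whether the other operand is a Truth or a plain bool.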

One example: It can also be a union.

I don't understand what you think this means. I know what *I* think it
means, but "subclass = union" doesn't make sense to me, so wonder what
you think it means.

It's a fringe use-case in Python that I don't think many people use/know about. I was just being thorough in listing it.

I haven't seen it used in actual python code myself -- but I know from the literature I've read on Python that it is occasionally used in a manner analogous to that of C/C++.

In C/C++ unions are a datatype that allow two (or more) different types of data to be defined as held by one object, but only one of them is allowed to be initialized at a time because their locations in computer memory overlap. C places no restrictions on the compatibility of the datatypes -- which is different than Python, but Python has a similar ability.

In Python, when multiple inheritance is invoked -- it does some kind of check on the base types for compatibility; but still appears to be able to overlap, or simply does overlap, the allocated memory for the different base types; eg: according to several sources I have read (at least on C-python internals).

So one can semantically instantiate a subclass of one subtype without semantically instantiating the other.

All I know about it is what I have seen in Python literature -- and I tested the examples I was shown to see if they still work, and they do -- and noted that at least at one time Guido apparently thought it was a good idea ; but I haven't pursued it beyond that.
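The compatibility check on base types mentioned above is easy to observe in CPython. This sketch only demonstrates the check; it makes no claim about how memory is actually shared between bases. The class names IntStr, Labelled and Tagged are illustrative:

```python
# Two built-in bases with different C-level instance layouts are rejected.
layout_error = None
try:
    class IntStr(int, str):
        pass
except TypeError as exc:
    layout_error = str(exc)

# A built-in base combined with a plain Python mixin is accepted.
class Labelled:
    label = "unlabelled"

class Tagged(int, Labelled):
    pass

t = Tagged(7)
print(layout_error)
print(t + 1, t.label)
```

So multiple inheritance from built-in types works only when at most one base dictates the instance layout.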

So when Guido chose to cut off
subclassing -- his decision had a wider impact than just the one he
mentioned; eg: extra *instances* of True and False.... as if he were
trying to save memory or something.

*shrug* well maybe he was.

:) LOL. I don't have any real idea.... but it would be useful to know for sure.

The reason Guido's action puzzles me is twofold -- first it has been
standard industry practice to subclass singleton  (or n-ton) objects to
expand their meaning in new contexts,

I dispute that. I *strongly* dispute that.

Industry practice, in my experience, is that there is one and only one
case where you can subclass singleton classes: when you have a factory
which chooses at runtime which subclass to instantiate, after which you
can no longer instantiate any of the other subclasses.

OK.

Well, I'll just say that I believe you -- and I'm not really sure what you're objecting to in what I said -- but if a singleton subclass / factory existed for my purpose -- I would be happy to choose it at runtime just like your maze guys do...! If Guido would do that... he would give me a subtype of bool and that would be very nice indeed.

But dreams aside -- I still note your admission shows that industry does allow subclassing of singletons even if it requires the owner of the singleton (Guido) to allow the subtypes.

Cf: Design Patterns, Elements of Reusable Object Oriented Software (Erich Gamma, Richard Helm, Ralph Johnson, John Vlissides ) pp. 127 "Chapter, Singleton -- Object Creational"

--------------------------------------------------------

-- Applicability:
Use the singleton pattern when:

-- There must be exactly one instance of a class, and it must be accessible to clients from a well known access point.
-- When the sole instance should be extensible by subclassing, and clients should be able to use an extended instance without modifying their code.


...

Consequences:
The singleton pattern has several benefits:
...

4. Permits a variable number of instances. The pattern makes it easy to change your mind and allow more than one instance of the Singleton Class. Moreover, you can use the same approach to control the number of instances that the application uses. Only the operation that grants access to the singleton instance needs change.


...

Implementation:

2. Subclassing the Singleton Class: The main issue is not so much defining the subclass but installing its unique instance so that clients will be able to use it.

...

A more flexible approach uses a registry of singletons. Instead of having Instance define the set of possible Singleton Classes, The Singleton classes can register their singleton instance by name in a well known registry.

...

Of course, the constructor won't get called unless someone instantiates the class, which echoes the problem the Singleton is trying to solve.

---------------------------------------------------------
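The registry-of-singletons idea in the excerpt above might look roughly like this in Python. It is a loose sketch, not the book's C++ code; the names register, instance, Maze and EnchantedMaze are chosen for illustration:

```python
_registry = {}

def register(name):
    """Class decorator: create the sole instance and file it under a name."""
    def decorator(cls):
        _registry[name] = cls()
        return cls
    return decorator

def instance(name):
    """Well-known access point: look up a singleton by its registered name."""
    return _registry[name]

@register("plain")
class Maze:
    def describe(self):
        return "plain maze"

@register("enchanted")
class EnchantedMaze(Maze):
    def describe(self):
        return "enchanted maze"

chosen = instance("enchanted")    # chosen at runtime, by name
print(chosen.describe(), chosen is instance("enchanted"))
```

Clients ask for an instance by name and get the same object every time, while subclasses remain free to extend the base behaviour.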

Why are we limited to a single Maze? Making Maze a singleton in the first
place was a bad idea. The singleton design pattern is *highly* over-used
and abused. But that is another story. Let's just assume that the
designer has good reason to insist that there be only one Maze, perhaps
it uses some resource which truly is limited to there being one only. If
that is the case, then allowing the caller to break that invariant by
subclassing will just lead to horrible bugs.

Right -- when a user does not know the reason for a singleton ; breaking it is just ASKING for bugs. I agree. That's why I have been asking about why Guido did it... there are times to avoid breaking the rules, and times to crush them.

  By using a factory and
controlling access to the subclasses, Maze can keep the singleton
invariant and allow subclasses.

This is not relevant to bool, since True and False are already
instantiated.


There's nothing stopping Guido from making it relevant...


[...]

In general -- it's not the goal of subclassing to create more instances
of the base types

That might not be the goal, but it is the effect. When you instantiate
the subclass, by definition the instances are also instances of the base
classes.

All right -- I can agree to that and will concede that point -- as I don't see much purpose in pursuing it further as I suspect (without proof) that Guido might not like the extra instances of class definitions that I might use as a work-around... although I don't really know why it's so important to him.

-- but rather to refine meaning in a way that can be
automatically reverted to the base class value (when appropriate) and to
signal to users that the type can be passed to functions that require a
bool because of backward compatibility.

And I am wondering what you think you can extend bools to perform that is
completely backwards compatible to code that requires bools?

I've never said 'completely compatible', and have been very careful not to make extremist remarks. I want to get as close as I can to fully backward compatible -- and am willing to put some time into it rather than taking the first solution that vaguely works...

I don't think you can. I think you are engaged on a fool's errand, trying
to do something impossible *even if subclassing bool were allowed*. But I
could be wrong. I just don't think you can possibly write code which is
backwards-compatible with code that expects bools while still extending
it. People are so prone to write:

     if flag is True: ...
     if flag is False: ...

D'Aprano -- I think you're making what is known as a straw man argument.

Refer back to your earlier suggestion of re-using True and False to represent themselves, and some other type to represent the intermediate metastates; From your remark here, I surmise that you must have already figured out that the alternative you gave me was never meant to work -- otherwise it would solve the very problem you now present me with for any case where my numbers are identical in meaning with legacy numbers -- eg: it *would* work perfectly for any truly legacy application. The failures -- would show up with new applications or non legacy data which could erroneously trigger a legacy compare when it ought not do so because you can not get the new types returned unless non-legacy data has been encountered.

(which is naughty of them, but what are you going to do?)

Nothing, except hope that the people who wrote Python itself didn't do anything naughty in sort() min() and max() and friends. So far my tests show that they didn't. But -- you're right -- Non core language implementation programmers, are going to have occasional bugs that either they or I will have to hunt down, depending on who it is that needs their software to work with my library.
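A test along the lines mentioned above might look like the following sketch. The classes Verdict and Reading are hypothetical; the point is only that sorted(), min() and max() take the truth value of whatever '<' returns rather than insisting on an actual bool:

```python
class Verdict:
    """Hypothetical non-bool comparison result carrying a truth value."""

    def __init__(self, value):
        self.value = bool(value)

    def __bool__(self):
        return self.value

class Reading:
    def __init__(self, value):
        self.value = value

    def __lt__(self, other):
        return Verdict(self.value < other.value)   # deliberately not a bool

readings = [Reading(3), Reading(1), Reading(2)]
ordered = [r.value for r in sorted(readings)]
lowest = min(readings).value
highest = max(readings).value
print(ordered, lowest, highest)
```

Third-party code that tests "flag is True" would still misbehave with such results, which is the class of bug conceded above.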

C doesn't have instances because it doesn't have objects. I'm not
certain, but I don't think the other languages you refer to are object-
oriented either. Verilog is a structured programming language, Silos is a
Verilog simulator, and I think VHDL and HDL are versions of Verilog (that
is, I've only seen them written as "Verilog-VHDL" and "Verilog-HDL").

OOP programming in C is not done using formal class keywords, etc, but it is done by defining structs and compiler modules and pointers to functions; So -- C -- doesn't have the security measures that a C++ compiler implements for OOP ( 'private' ,'protected' ); but OOP can still be done in C including inheritance. 'C' most certainly does have instances and singletons. Several packages available under GPL, such as the GTK widget set, are implemented in strict C (not C++) and as full object oriented packages, then another optional package can be compiled if C++ bindings to the objects are desired.

Verilog is the originator of the language family I mentioned, yes; and they are all variations on a theme -- but there are versions of HDL's by other companies, and the US government; Most are based on C syntax, some are based on ADA, and other languages that engineers happen to like for various applications; etc. I mentioned only the most used versions.

In any case, Verilog *by design* uses four-state logic, modelling 1, 0,
floating, undefined. It is not a bool, since *by definition* bools model
only two states.

Not quite, verilog is meant to handle two state logic. AKA: Binary bit or Boolean, and to also work with metastable data; eg: In electronics, floating or unknown or oscillating or frozen between states for a period of time while settling are traditionally called metastable. I am not sure if this is a mathematician's definition, or if it's because these quasi-states were defined with/after (meta) the two stable ones. It's something I will have to check. But I remember from my early college courses that it is technically wrong to call them all states, even though 'don't care' is often referred to as the tri-state.

In any event -- Your comment about verilog still just demonstrates that Guido has downgraded python's return types into a more primitive system than is warranted by the history of the creation of the computer. Verilog (1984) existed before Python (1989) so *even* verilog's conventions predate python's. And I don't even remember when HiLo used to be around but I'm sure it's older than verilog. So -- from the very beginning of Python, design logic for boolean systems has *always* included meta-state information with boolean values.

The third value is usually called "TRI-state" or "don't care". (Though
it's sometimes a misnomer which means -- don't know, but do care.)

And SQL has NULL, which makes it an example of tri-state logic. (To be
precise, SQL uses a version of Kleene K3 logic.)

OK.  I agree -- it does.
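For reference, the three-valued (Kleene K3) connectives mentioned in the quoted text can be sketched in a few lines, using None to stand for "unknown" in the way SQL uses NULL. The function names k3_not, k3_and and k3_or are illustrative:

```python
def k3_not(a):
    return None if a is None else (not a)

def k3_and(a, b):
    if a is False or b is False:
        return False          # a definite False decides the result
    if a is None or b is None:
        return None           # otherwise any unknown keeps it unknown
    return True

def k3_or(a, b):
    if a is True or b is True:
        return True           # a definite True decides the result
    if a is None or b is None:
        return None
    return False

print(k3_and(False, None), k3_and(True, None), k3_or(True, None), k3_not(None))
```

An unknown operand only matters when the other operand does not already settle the answer.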


[snip description of modelling hardware circuits]
All very interesting, but completely irrelevant to the question of
subclassing bool.


No, not really -- but I'll respect your difference of opinion.

I'm getting the message that the reason Guido thought this was important was because the historical meaning of bool is more important than the idea that metastability is an issue to be dealt with at the same time as the data value -- like electrical engineers do regularly.

We've discovered that we live in a quantum-mechanical universe -- yet
people still don't grasp the pragmatic issue that basic logic can be
indeterminate at least some of the time ?!

Of course they do. My first post to you in this thread suggested that
before you start re-inventing the wheel you look at prior art in the
multi-value logic field.

Did you ? -- Did I reply to that e-mail?  I'm not sure I read it...

But the word is different from what I am used to -- eg: meta-stable logic 'states' ... ?

Now that I'm looking up words -- I see that wikipedia is calling the indeterminate states 'multi-value'; I'm getting old... I am used to the term metastable; not multi-value. Weird. Jargon problems...

Even so -- I seriously don't think of Quantum mechanics as multi-value ; it's uncertain and 'collapses' to a definite value when measured. I can understand your intention now... I'll have to go back and search for the old email. My apology.

The name 'boolean logic' has never been re-named in honor of the many
people who developed the advancements in computers -- including things
like data sheets for electronic parts,

Are you really suggesting that the name of Boolean Logic should be
renamed away from the person who invented the field and instead named
after the person who first wrote down a list of electronic part numbers
and their specifications?

Nope.

Though I DO want to point out that George Boole did not invent the computer, build the microprocessor, or any of the things which would give a logical reason why his more archaic *usage* is given preference over the usage preferred by the very people who DID invent the microprocessor, computer, and programming languages.

Name recognition is great for honoring a man -- but makes for a poor reason to choose a strict implementation of bool.

or the code base used for solving
large numbers of simultaneous logic equations with uncertainty included
-- which have universally refined the boolean logic meanings found in
"Truth" tables having clearly more than two values -- but don't take my
word for it -- look in any digital electronics data book, and there they
will be more than two states; marked with rising edges, falling edges,
X's for don't cares, and so forth.

And those truth tables are not part of Boolean algebra.

Oh wow!!!! I never expected to hear that -- But I guess you were never trained to do boolean algebra, formally ? Or did you mean something else ?


http://en.wikipedia.org/wiki/Truth_table

The truth tables on data sheets are VERY VERY much intended to be related to boolean logic. Electronic engineers routinely put the words "Truth table" on datasheets where the boolean information is recorded (and I'm sure even on relay logic prior to the vacuum tube) but still add x's because *as inventors* they knew Boole's usage wasn't enough to convey ideas efficiently and fully.

It's pragmatism over rigid formalism left over from an age where the computer as we have it was not even conceived.


http://www.eleccircuit.com/cd4027b_datasheet-of-dual-j-k-flip-flop/

[...]

As I said to D'Aprano -- even a *cursory* examination (eg: as in not
detailed) shows I could do things which he wasn't considering.

Andrew, I think you will be surprised at what I have considered. If you
search the archives, you will find that (by memory) a decade ago I had
considered using classes without instantiating them.

No surprise. I know from reading your work that you have been doing programming a long time, and have fairly well substantiated / reasonable opinions even if I disagree with some of them as trying to overemphasize to a fault a definition which has never been honored in the past by those who used it most.

The questions I have about your strategy is not what can be done in
Python, but how you think these things you want to do will solve the
problem you apparently have?

To give an analogy... I have no doubt that you can build a television.
But I question how building a television solves your problem of
transporting a goat, a wolf and a cabbage across the river.

Ask away. I already have one solution that works reasonably well, the tuple rich compare; So it's not like I don't have a solution -- it's just that I'm not sure that I can't do better.

If Python had never added the bool definition to the language, I wouldn't even have to bother with any of this supposed 'fool's errand' nuisance in the first place... but I'll make the best of it.

[...]

I don't have a setting on my email to turn off html.  Sorry. Can't help.

You are using Thunderbird. You certainly do have such a setting.

It's nice to know that you read and believe what you see in an email header.

Note: Headers are sometimes modified by sysadmins who actually care about security.

PPS: If there is a way to turn off HTML in this email program -- it is not obvious -- and I have looked.

I've done my best not to push any HTML enhancement buttons...

--
https://mail.python.org/mailman/listinfo/python-list
