In the previous thread (Custom C++ literals), ChrisA raised some good
questions, some of which I can actually answer :D
> Part of the problem here is that Python has to be many many things.
> Which set of units is appropriate? For instance, in a lot of contexts,
> it's fine to simply attach K to the end of something to mean "a
> thousand", while still keeping it unitless; but in other contexts,
> 273K clearly is a unit of temperature. (Although I think the solution
> there is to hard-disallow prefixes without units, as otherwise there'd
> be all manner of collisions.) Is it valid to refer to fifteen
> Angstroms as 15A, or do you have to say 15Å, or 15e-10m and accept
> that it's now a float not an int? Similarly, what if you want to write
> a Python script that works in natural units - the Planck length, mass,
> time, and temperature?
I think if you look into CGPM standards (they're the grand pooh bahs who decide
what SI units are) then you'd find that a lot of these potential collisions
have already been encountered and resolved. Under SI, there is no ambiguity
regarding K. K means Kelvin and only Kelvin, whereas k means 1000. Some units,
like Å, do pose challenges. We often substitute u for μ, which works fine
since there don't seem to be any SI units that start with u. But we can't do
likewise for Å, since A is already reserved for amperes. The easy way out is to
say Å is not SI, so it's out. But I would rather not see this feature limited
to SI units only (although SI should be preferred). A somewhat gentler approach
would be to let Å be Å. Unicode letters are allowed in Python these days. I use
theta, mu, lambda - the whole bunch of them, in my code all the time. If
someone wants to use Å badly enough, let them use the Unicode for it, otherwise
use nm.
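To make the prefix-versus-unit rule concrete, here is a minimal sketch of how
a suffix could be split. The tables are small illustrative subsets and the name
`split_suffix` is my own invention, not part of any proposal or library. A bare
unit always wins, and a prefix with no unit after it is rejected.

```python
from fractions import Fraction

# Illustrative subsets only; the real SI tables are much longer.
PREFIXES = {
    "k": Fraction(1000),
    "m": Fraction(1, 1000),
    "u": Fraction(1, 10**6),       # ASCII stand-in for micro
    "\u00b5": Fraction(1, 10**6),  # micro sign
    "\u03bc": Fraction(1, 10**6),  # Greek small letter mu
    "n": Fraction(1, 10**9),
}
UNITS = {"m", "s", "g", "A", "K", "\u00c5"}  # "\u00c5" is the letter used for angstrom


def split_suffix(suffix):
    """Return (scale, unit) for a suffix such as 'K', 'mm' or 'um'."""
    if suffix in UNITS:
        # A bare unit: 'm' is metre (not milli), 'K' is kelvin (not kilo).
        return Fraction(1), suffix
    prefix, unit = suffix[:1], suffix[1:]
    if prefix in PREFIXES and unit in UNITS:
        return PREFIXES[prefix], unit
    # Covers unknown suffixes and prefixes with no unit, e.g. a lone 'k'.
    raise ValueError(f"unknown unit suffix: {suffix!r}")
```

With these tables, "K" can only be kelvin, "km" is 1000 metres, and a lone "k"
is an error rather than a silent "thousand".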
Units like the Planck length are valid, and I don't see any reason to exclude
them. The problem is that neither CGPM nor anyone else, as far as I can tell,
has created an SI unit for the Planck length, or for other similar units, that
is lexicographically distinct from other units. And creating one would only be
worth the trouble if all of the physicists who might use it could immediately
recognize it. Not that I speak for them, but I'm guessing the folks who run
SciPy or astropy could be of help in answering these sorts of questions, rather
than trying to get a Python steering committee to work with a possibly more
bureaucratic organization like CGPM.
Regarding precision, this is something that many scientists and engineers do
not understand as well as computer scientists and computer engineers do. I'd
rather see units available for integers as well as floats. I think that as long
as a unit is defined, it makes sense to allow integer quantities of it. If they
are to be built-in types, as I would prefer, then I suppose unfortunately one
would not be able to define fractions of these units as new units. But again,
most of this work is done with floats anyway, so if units were only available
for floats, I would still see this as a big step forward.
Related to these questions, there is the question of what to do about mixed
systems? Should 2.54 cm / 1 in evaluate to 2.54 cm/in or should it evaluate to
1? I'd much rather it evaluate to 1, but if anyone else has a stronger opinion,
I would not let a dispute over such a thing stand in the way of getting units.
Regarding 1m / 1mm, though, I have a much stronger opinion. It should be 1000,
without any units.
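Here is a rough sketch of the behaviour I have in mind for ratios of lengths.
Since 1 in = 2.54 cm, the mixed-system ratio that comes out to exactly 1 is
2.54 cm / 1 in. The table and the function name `length_ratio` are assumptions
for illustration only. Scale factors are kept as exact fractions so that
integer results stay integers.

```python
from fractions import Fraction

# Metres per unit, kept exact (assumed table, lengths only).
METRES_PER = {
    "m": Fraction(1),
    "cm": Fraction(1, 100),
    "mm": Fraction(1, 1000),
    "in": Fraction(254, 10000),
}


def length_ratio(a_value, a_unit, b_value, b_unit):
    """Divide two lengths and return a plain, unitless number.

    Pass decimal values as strings ("2.54") so they convert exactly.
    """
    a = Fraction(a_value) * METRES_PER[a_unit]
    b = Fraction(b_value) * METRES_PER[b_unit]
    ratio = a / b
    # Hand back an int when the ratio is whole, otherwise a float.
    return int(ratio) if ratio.denominator == 1 else float(ratio)
```

So `length_ratio(1, "m", 1, "mm")` gives the integer 1000, and
`length_ratio("2.54", "cm", 1, "in")` gives 1, with no units attached in either
case.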
There is yet another question related to the interpretation of K as 1000 vs
Kelvin. As I said, SI is clear that K means Kelvin, but what about Python users
that are not familiar with SI? What about those in the financial industry? To
them, K means 1000, and they might not even know what Kelvin is. Now, unless adding
a suffix K to a number is supported later on, a financial person would have to
go pretty far out of their way, or be looking at the wrong code to be confused
by something referring to Kelvin. But it would indeed be a mistake to assume
that everyone who uses Python wants and can live with SI units, or even that
they would be using the same set of units! Which brings me to the next part of
ChrisA's reply...
>
> Purity and practicality are at odds here. Practicality says that you
> should be able to have "miles" as a unit, purity says that the only
> valid units are pure SI fundamentals and everything else is
> transformed into those. Leaving it to libraries would allow different
> Python programs to make different choices.
>
> But I would very much like to see a measure of language support for
> "number with alphabetic tag", without giving it any semantic meaning
> whatsoever. Python currently has precisely one such tag, and one
> conflicting piece of syntax: "10j" means "complex(imag=10)", and
> "10e1" means "100.0". (They can of course be combined, 10e1j does
> indeed mean 100*sqrt(-1).) This is what could be expanded.
>
As I mentioned above, I am not a purist. I keep a set of Thorlabs thread
adapters handy in my lab so that I can screw imperial cage plates onto metric
posts.
I think I diverge from (or perhaps just don't understand) the statement on
"semantic meaning". To me, the semantic meaning of the units seems pretty
essential. Wherever possible, units should be simplified in a prescribed
manner: 1W * 1s = 1J, 10km / 1cm = 1000000. The meaning of these suffixes
should be explicit, not implicit.
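As a sketch of what "simplified in a prescribed manner" could mean, one option
is to carry exponents of the base units along with each value and look up a
name for the result. Everything here (the `Quantity` class, the `NAMES` table,
the choice of metre/kilogram/second as the only bases) is an assumption made
for illustration, not a description of any existing library.

```python
from dataclasses import dataclass
from fractions import Fraction

# Exponents of (metre, kilogram, second) mapped to a display name.
NAMES = {(1, 0, 0): "m", (0, 0, 1): "s", (2, 1, -3): "W", (2, 1, -2): "J"}


@dataclass(frozen=True)
class Quantity:
    value: Fraction  # magnitude expressed in base SI units
    dims: tuple      # exponents of (m, kg, s)

    def __mul__(self, other):
        dims = tuple(a + b for a, b in zip(self.dims, other.dims))
        return simplify(self.value * other.value, dims)

    def __truediv__(self, other):
        dims = tuple(a - b for a, b in zip(self.dims, other.dims))
        return simplify(self.value / other.value, dims)

    def __str__(self):
        # Fall back to the raw exponents when no name is known.
        return f"{self.value} {NAMES.get(self.dims, self.dims)}"


def simplify(value, dims):
    """Collapse to a bare number when every exponent cancels."""
    if not any(dims):
        return value
    return Quantity(value, dims)


watt = Quantity(Fraction(1), (2, 1, -3))
second = Quantity(Fraction(1), (0, 0, 1))
ten_km = Quantity(Fraction(10_000), (1, 0, 0))
one_cm = Quantity(Fraction(1, 100), (1, 0, 0))
```

Here `str(watt * second)` is "1 J", and `ten_km / one_cm` is the plain number
1000000 with no unit attached.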
Also, see above about precision of unit-aware data types. Floating point only
would be fine, but I don't see why integers cannot be supported as well.
> C++ does things differently, since it can actually compile things in,
> and declarations earlier in the file can redefine how later parts of
> the file get parsed. In Python, I think it'd make sense to
> syntactically accept *any* suffix, and then have a run-time
> translation table that can have anything registered; if you use a
> suffix that isn't registered, it's a run-time error. Something like
> this:
>
> import sys
> # sys.register_numeric_suffix("j", lambda n: complex(imag=n))
> sys.register_numeric_suffix("m", lambda n: unit(n, "meter"))
> sys.register_numeric_suffix("mol", lambda n: unit(n, "mole"))
>
> (For backward compatibility, the "j" suffix probably still has to be
> handled at compilation time, which would mean you can't actually do
> that first one.)
>
> Using it would look something like this:
>
> def spread():
>     """Calculate the thickness of avocado when spread on
>     a single slice of bread"""
>     qty = 1.5mol
>     area = 200mm * 200mm
>     return qty / area
>
> Unfortunately, these would no longer be "literals" in the same way
> that imaginary numbers are, but let's call them "unit displays". To
> evaluate a unit display, you take the literal (1.5) and the unit
> (stored as a string, "mol"), and do a lookup into the core table
> (CPython would probably have an opcode for this, rather than doing it
> with a method that could be overridden, but it would basically be
> "sys.lookup_unit(1.5, 'mol')" or something). Whatever it gives back is
> the object you use.
>
> Does this seem like a plausible way to go about it?
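For anyone who wants to play with the idea, the run-time table in the quote can
be approximated in today's Python with ordinary functions. This is only a
stand-in: `sys.register_numeric_suffix` and `sys.lookup_unit` do not exist, the
module-level functions below are hypothetical, and raising NameError for an
unregistered suffix is my guess at what "a run-time error" might look like.

```python
# Hypothetical stand-in for the proposed suffix table; nothing here is in sys.
_SUFFIXES = {}


def register_numeric_suffix(suffix, factory):
    """Associate a suffix such as 'mol' with a one-argument factory."""
    _SUFFIXES[suffix] = factory


def lookup_unit(number, suffix):
    """Emulate evaluating a "unit display" such as 1.5mol."""
    try:
        factory = _SUFFIXES[suffix]
    except KeyError:
        raise NameError(f"numeric suffix {suffix!r} is not registered") from None
    return factory(number)


# A (number, name) tuple stands in for a real unit object.
register_numeric_suffix("m", lambda n: (n, "meter"))
register_numeric_suffix("mol", lambda n: (n, "mole"))
```

With that in place, `lookup_unit(1.5, "mol")` returns `(1.5, "mole")`, and an
unregistered suffix only fails when the line is actually executed.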
As far as registering units, I think registering individual units is a bit
much. Of course, several of these statements could be put inside a module or
package to make things easier. But I also don't like that it means the syntax
of the "literals" needs to be allowed during parsing, and left to the
interpreter to figure out if the unit was registered. I do think it is
reasonable to require programmers to "opt in" to using SI or other units, and
possibly even specify which set or sets of units they intend to use. But if
their constants are ill-formed, then that should still be caught during parsing
and throw a SyntaxError.
How that would be implemented behind the scenes, I don't know, but from a
syntax point of view, I am envisioning something like a namespace statement
with a new keyword (I propose `measure`). Here, I am referring to namespaces
like `local` and `global`, not something like `argparse.Namespace`. Consider
the following example as of today:
```
A = 1
global A
A = 2
```
This will generate a syntax error during parsing:
SyntaxError: name 'A' is assigned to before global declaration
Similarly, what I envision is something like this:
```
length = 12m
```
SyntaxError: invalid syntax
```
measure SI
length = 12m
width = 10mm
area = length * width
print(area)
```
... with no SyntaxErrors and a result of "0.12 m2"
After the "measure SI" statement, all literals that are formed with SI units
are considered valid syntax and are evaluated accordingly. Prior to "measure
SI", only the unitless primitives are allowed. Clearly this works differently
from the `global` and `nonlocal` statements, which modify a namespace.
Also, the choice of keyword matters, because making "measure" a keyword would
probably break a lot of existing code (3to4.py!!!). But it is dead simple, and
it does behave in a way that is actually quite similar to modifying the
existing namespace.
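To illustrate the opt-in rule (not the implementation, which would live in the
parser), here is a toy checker that scans source text line by line. The suffix
set, the regular expression and the name `check_source` are all assumptions,
and the scan is deliberately naive: it ignores strings and comments, and it
would also flag existing forms such as 10j or 10e1.

```python
import re

# Assumed suffix set enabled by the hypothetical "measure SI" statement.
SI_SUFFIXES = {"m", "mm", "cm", "km", "s", "K", "A"}

# A number immediately followed by letters, e.g. 12m or 1.5mm.
UNIT_LITERAL = re.compile(r"\b(\d+(?:\.\d+)?)([A-Za-z]+)\b")


def check_source(source):
    """Raise SyntaxError for unit literals that are not enabled yet."""
    enabled = set()
    for lineno, line in enumerate(source.splitlines(), start=1):
        if line.strip() == "measure SI":
            enabled |= SI_SUFFIXES
            continue
        for match in UNIT_LITERAL.finditer(line):
            if match.group(2) not in enabled:
                raise SyntaxError(
                    f"invalid syntax (line {lineno}): {match.group(0)}"
                )
```

Running it on `length = 12m` alone raises SyntaxError, while the same line
after `measure SI` passes, and an ill-formed constant such as `12zz` is still
rejected after the opt-in.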
This ended up being a much longer reply than I anticipated, but I hope it helps.