On 22Aug2019 11:12, Michael Torrie <torr...@gmail.com> wrote:
On 8/22/19 10:00 AM, Windson Yang wrote:
I can 'feel' that global variables are evil. I have also read lots of articles
arguing that (http://wiki.c2.com/?GlobalVariablesAreBad). However, I found that
the CPython Lib uses the `global` keyword quite a lot. So how should we use the
`global` keyword correctly? IIUC, it's fine to use the `global` keyword
inside the Lib, since most of the time the user just imports the lib and calls
the API. In any other situation, we should avoid using it. Am I right?

The "global" keyword only refers to the current module as far as I know.
Thus global variables are global only to the current python file, so
the damage, as it were, is limited in scope.

Aye.

And it's only required if
you plan to write to a global variable from another scope; you can
always read from a parent scope if the name hasn't been used by the
local scope.  I'm sure there are use cases for using the global keyword.
It's not evil.  It's just not necessary most of the time.  I don't
think I've ever used the "global" keyword.  If I need to share state, or
simple configuration information, between modules, I place those
variables in their own module file and import them where I need them.

I've used it a few times. Maybe a handful of times in thousands of lines of code.

As Michael says, "you can always read from a parent scope if the name hasn't been used by the local scope". What this means is this:

   _MODULE_LEVEL_CACHE = {}

   def factors_of(n):
       factors = _MODULE_LEVEL_CACHE.get(n)
       if factors is None:
           factors = factorise(n)
           _MODULE_LEVEL_CACHE[n] = factors
       return factors

   def factorise(n):
       ... expensive factorisation algorithm here ...

Here we access _MODULE_LEVEL_CACHE directly without bothering with the global keyword. Because the function "factors_of" does not _assign_ to the name _MODULE_LEVEL_CACHE, that name is not local to the function; the outer scopes will be searched in order to find the name.

Now, Python decides which variables are local to a function by statically inspecting the code and seeing which names are assigned to. So:

   x = 9
   y = 10
   z = 11

   def foo(x):
       y = 5
       print(x, y, z)

Within the "foo" function:

- x is local (it is assigned to by the function parameter when you call it)

- y is local (it is assigned to in the function body)

- z is not local (it is not assigned to); the namespace searching finds it in the module scope

Note that in the "factors_of" function we also do not _assign_ to _MODULE_LEVEL_CACHE. We do assign to one of its elements, but that is an access _via_ _MODULE_LEVEL_CACHE, not an assignment to the name itself. So it is nonlocal and found in the module namespace.
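
If you want to check which names the compiler treated as local, one way (in CPython; these code object attributes are an implementation detail I am assuming you have available) is to look at the function's code object. A small sketch, with "foo" returning its values instead of printing them so the result is easy to inspect:

```python
x = 9
y = 10
z = 11

def foo(x):
    y = 5
    return (x, y, z)

# Names the compiler decided are local: the parameter and the assigned name.
print(foo.__code__.co_varnames)   # ('x', 'y')
# Names looked up outside the function; z is among them.
print(foo.__code__.co_names)
print(foo(1))                     # (1, 5, 11)
print(y)                          # 10: the module-level y is untouched
```

Note that none of this needed the function to be called: the decision is made when the function is compiled.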

However, where you might want to use "global" (or its modern friend "nonlocal", which does the same job for names in an enclosing function) is to avoid accidents and to make the globalness obvious. The same example code:

   x = 9
   y = 10
   z = 11

   def foo(x):
       y = 5
       print(x, y, z)

When you use a global, that is usually a very deliberate decision on your part, because using globals is _usually_ undesirable. When all variables are local, side effects are contained within the function and some surprises (== bugs) are prevented.

Let's modify "foo":

   def foo(x):
       y = 5
       z = y * 2
       print(x, y, z)

Suddenly "z" is a local variable because it is assigned to.

In this function it is all very obvious because the function is very short. In a longer function it might not be so obvious.

So: was "z" still intended to be global?

If yes then you need the global keyword:

   def foo(x):
       global z
       y = 5
       z = y * 2
       print(x, y, z)
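
To see the difference in behaviour, here is a sketch that puts the two variants side by side. The names "foo_local" and "foo_global" are mine, just to tell them apart, and they return their values rather than printing them:

```python
z = 11

def foo_local(x):
    y = 5
    z = y * 2        # z is local here; the module-level z is not touched
    return (x, y, z)

def foo_global(x):
    global z
    y = 5
    z = y * 2        # rebinds the module-level z
    return (x, y, z)

print(foo_local(1), z)    # (1, 5, 10) 11
print(foo_global(1), z)   # (1, 5, 10) 10
```

Both functions return the same tuple; only the second one changes what the rest of the module sees in "z".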

And even if we were not assigning to "z", we might still use the "global" statement to make it obvious to the reader that "z" is a global; after all, it is not very visually distinctive: it looks a lot like "x" and "y".

So my advice after all of this is:

As you thought, globals are to be avoided most of the time. They invite unwanted side effects and also make it harder to write "pure functions", functions with no side effects. Pure functions (most Python functions) are much easier to reuse elsewhere.

However, if you have a good case for using a global, always use the "global" statement. It has the following benefits: it makes the globalness obvious to the person reading the code and it avoids a global variable suddenly becoming local if you assign to it. (NB: the "time" of that semantic change is when you change the code, _not_ when the assignment itself happens.)
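
A small sketch of that last point, using a hypothetical variant of "foo" that reads "z" before assigning to it. Because the assignment appears somewhere in the function body, "z" is local for the whole function, so the earlier read fails instead of finding the module-level value:

```python
z = 11

def foo(x):
    print(x, z)      # z is already local here, because of the assignment below
    z = x * 2
    return z

try:
    foo(1)
except UnboundLocalError as e:
    print("UnboundLocalError:", e)
```

Adding "global z" at the top of the function would make both the read and the assignment refer to the module-level name again.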

Cheers,
Cameron Simpson <c...@cskk.id.au>
--
https://mail.python.org/mailman/listinfo/python-list
