Statement coverage tools revisited

Edvard Majakari Fri, 11 Feb 2005 04:20:20 -0800

Yow,

Most of you are probably familiar with Gareth Rees' excellent Python code
coverage tool (a /statement/ coverage tool, to be more precise), but there
are probably many TDD fans out there who are not yet aware of this wonderful
program.


I use the tool a lot, even though it is good to be aware of common
misconceptions about tools that count visited and unvisited lines (100% line
coverage means exactly what it says and nothing more; see [1] for more
information).

However, statement coverage tools are not useless. They are very handy for
catching blocks of code that your tests never execute but, as mentioned
above, you shouldn't feel too secure just because your passing unit tests
reach 100% line coverage.
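
To make that caveat concrete, here is a minimal sketch. The function and the
test are made up for illustration and come from no real project. The single
test executes every statement, so a statement coverage tool would report full
coverage, yet the path where the condition is false is never exercised and
it fails.

```python
def reciprocal_scale(value, invert):
    """Hypothetical example function, not from any real code base."""
    factor = 0
    if invert:
        factor = 2
    # When invert is false, factor is still 0 and this divides by zero.
    return value / factor


def test_invert():
    # This one test runs every statement in reciprocal_scale, so a
    # statement coverage tool would report all lines as visited.
    assert reciprocal_scale(10, True) == 5


test_invert()

# The path that no test takes (invert false) is still broken.
try:
    reciprocal_scale(10, False)
    untested_path_ok = True
except ZeroDivisionError:
    untested_path_ok = False

print("untested path works: %s" % untested_path_ok)
```

Every line is marked as visited, but the branch that skips the 'if' body was
never tried on its own.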

I've only had one minor gripe with the tool, and that is the notation used
for marking visited ('>') and unvisited ('!') lines; it is not handy to grep
for '>' or '!' in the annotated version, because both characters occur often
in Python code. Of course you could avoid that using '^!', but it's two
keystrokes more :) and stands out less clearly in the code. That's why I made
a very simple patch for the tool, attached below [2] (it is a small diff,
don't fret :), so you can even add overly verbose line prefixes like this:

./pycoverage -a -u '>>> WARNING: NEVER VISITED' -v '>' xmlkit.py

resulting in

>class VLANRange(RangeEntry):
>    """Class for VLAN ID range nodes"""

>    def __init__(self, min_vlan_id, max_vlan_id):
...
>>> WARNING: NEVER VISITED        RangeEntry.__init__(self, ...
...
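
If you would rather not use grep, picking out the unvisited lines from an
annotated copy can be sketched in a few lines of Python. This is only an
illustration: the helper name unvisited_lines and the sample lines are made
up, and the prefix is simply whatever you passed with -u.

```python
def unvisited_lines(annotated_lines, prefix):
    """Return (line number, text) pairs for lines starting with prefix."""
    found = []
    for number, line in enumerate(annotated_lines, 1):
        if line.startswith(prefix):
            # Strip the prefix and the trailing newline, keep indentation.
            found.append((number, line[len(prefix):].rstrip("\n")))
    return found


# Made-up annotated lines in the style of the output shown above.
sample = [
    ">class VLANRange(RangeEntry):\n",
    ">    def __init__(self, min_vlan_id, max_vlan_id):\n",
    ">>> WARNING: NEVER VISITED        pass\n",
]

hits = unvisited_lines(sample, ">>> WARNING: NEVER VISITED")
for number, text in hits:
    print("%d: %s" % (number, text))
```

Note that with these two particular prefixes the visited marker '>' is also
the first character of the unvisited marker, so searching for '^>' alone
would match both kinds of line; searching for the longer prefix avoids that.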

By the way, there's an interesting comment block in the heading of the code,
saying

,----
| The coverage dictionary is called "c" and the trace function
| "t".  The reason for these short names is that Python looks up variables
| by name at runtime and so execution time depends on the length of
| variables!  In the bottleneck of this application it's appropriate to
| abbreviate names to increase speed.
`----

It was written when Python 2.1 was the most recent stable version. I wonder
if it still applies to 2.2 and later? According to my hasty tests it doesn't
seem to, although I didn't have very large unit test files at hand.
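
For anyone who wants to repeat the experiment, a rough sketch with the
standard timeit module could look like this. The names c and coverage_dict
mirror the patch; the helper functions, the key, and the loop counts are
arbitrary assumptions. Timings vary between machines and Python versions, so
no expected numbers are given.

```python
import timeit

# Two module-level dictionaries that differ only in the length of their name.
c = {}
coverage_dict = {}

# Stand-in for the (filename, line number) key the trace function stores.
KEY = ("some_file.py", 1)


def store_short():
    # Global lookup of the short name, then a dictionary store.
    c[KEY] = 1


def store_long():
    # Same operation through the long name.
    coverage_dict[KEY] = 1


def measure(func, number=100000, repeat=3):
    """Return the best time in seconds over a few repeats."""
    return min(timeit.repeat(func, number=number, repeat=repeat))


short_time = measure(store_short)
long_time = measure(store_long)

print("short name: %.4f s" % short_time)
print("long name:  %.4f s" % long_time)
```

Both functions do the same work apart from the name they look up, so any
consistent difference between the two timings would be down to the name
itself.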

Footnotes: 

[1] http://www.garethrees.org/2001/12/04/python-coverage/

[2]

--- pycoverage-orig     2005-02-11 13:41:38.000000000 +0200
+++ pycoverage  2005-02-11 13:00:32.000000000 +0200
@@ -33,14 +33,18 @@
     Report on the statement coverage for the given files.  With the -m
     option, show line numbers of the statements that weren't executed.
 
-coverage.py -a [-d dir] FILE1 FILE2 ...
+coverage.py -a [-v char] [-u char] [-d dir] FILE1 FILE2 ...
     Make annotated copies of the given files, marking statements that
     are executed with > and statements that are missed with !.  With
     the -d option, make the copies in that directory.  Without the -d
     option, make each copy in the same directory as the original.
 
+    -v and -u let you specify characters to use for marking visited and
+    unvisited lines instead of '>' and '!'.
+
 Coverage data is saved in the file .coverage by default.  Set the
-COVERAGE_FILE environment variable to save it somewhere else."""
+COVERAGE_FILE environment variable to save it somewhere else.
+"""
 
 import os
 import re
@@ -63,15 +67,16 @@
 # information so the data in the coverage dictionary is transferred to
 # the 'cexecuted' dictionary under the canonical filenames.
 #
-# The coverage dictionary is called "c" and the trace function "t".  The
-# reason for these short names is that Python looks up variables by name
-# at runtime and so execution time depends on the length of variables!
-# In the bottleneck of this application it's appropriate to abbreviate
-# names to increase speed.
+
+# The coverage dictionary is called "coverage_dict" and the trace function
+# "t".  The reason for these short names is that Python looks up variables
+# by name at runtime and so execution time depends on the length of
+# variables!  In the bottleneck of this application it's appropriate to
+# abbreviate names to increase speed.
 
 # A dictionary with an entry for (Python source file name, line number
 # in that file) if that line has been executed.
-c = {}
+coverage_dict = {}
 
 # t(f, x, y).  This method is passed to sys.settrace as a trace
 # function.  See [van Rossum 2001-07-20b, 9.2] for an explanation of
@@ -80,7 +85,7 @@
 # objects.
 
 def t(f, x, y):
-    c[(f.f_code.co_filename, f.f_lineno)] = 1
+    coverage_dict[(f.f_code.co_filename, f.f_lineno)] = 1
     return t
 
 the_coverage = None
@@ -133,6 +138,8 @@
             '-i': 'ignore-errors',
             '-m': 'show-missing',
             '-r': 'report',
+            '-v:': 'visited-prefix=',
+            '-u:': 'unvisited-prefix=',
             '-x': 'execute',
             }
         short_opts = string.join(map(lambda o: o[1:], optmap.keys()), '')
@@ -178,6 +185,9 @@
             execfile(sys.argv[0], __main__.__dict__)
         if not args:
             args = self.cexecuted.keys()
+
+        self.visited_pfx = settings.get('visited-prefix=', '>')
+        self.unvisited_pfx = settings.get('unvisited-prefix=', '!')
         ignore_errors = settings.get('ignore-errors')
         show_missing = settings.get('show-missing')
         directory = settings.get('directory=')
@@ -193,8 +203,8 @@
         sys.settrace(None)
 
     def erase(self):
-        global c
-        c = {}
+        global coverage_dict
+        coverage_dict = {}
         self.analysis_cache = {}
         self.cexecuted = {}
         if os.path.exists(self.cache):
@@ -213,8 +223,8 @@
     # exists).
 
     def restore(self):
-        global c
-        c = {}
+        global coverage_dict
+        coverage_dict = {}
         self.cexecuted = {}
         if not os.path.exists(self.cache):
             return
@@ -252,13 +262,13 @@
     # "executed" map.
 
     def canonicalize_filenames(self):
-        global c
-        for filename, lineno in c.keys():
+        global coverage_dict
+        for filename, lineno in coverage_dict.keys():
             f = self.canonical_filename(filename)
             if not self.cexecuted.has_key(f):
                 self.cexecuted[f] = {}
             self.cexecuted[f][lineno] = 1
-        c = {}
+        coverage_dict = {}
 
     # morf_filename(morf).  Return the filename for a module or file.
 
@@ -474,17 +484,17 @@
                         # Special logic for lines containing only
                         # 'else:'.  See [GDR 2001-12-04b, 3.2].
                         if i >= len(statements) and j >= len(missing):
-                            dest.write('! ')
+                            dest.write(self.unvisited_pfx)
                         elif i >= len(statements) or j >= len(missing):
-                            dest.write('> ')
+                            dest.write(self.visited_pfx)
                         elif statements[i] == missing[j]:
-                            dest.write('! ')
+                            dest.write(self.unvisited_pfx)
                         else:
-                            dest.write('> ')
+                            dest.write(self.visited_pfx)
                     elif covered:
-                        dest.write('> ')
+                        dest.write(self.visited_pfx)
                     else:
-                        dest.write('! ')
+                        dest.write(self.unvisited_pfx)
                     dest.write(line)
                 source.close()
                 dest.close()

-- 
# Edvard Majakari               Software Engineer
# PGP PUBLIC KEY available      Soli Deo Gloria!

"Debugging is twice as hard as writing the code in the first place. Therefore,
 if you write the code as cleverly as possible, you are, by definition,
 not smart enough to debug it."  -- Brian W. Kernighan

