As part of our effort to make programming in GCC easier, we would like
to improve the interface to bitmaps.
There are three bitmap types, each with disparate operations and
function names.  This disparity causes problems
* when changing a variable from one type to another,
* when moving one's attention from one type to another.
We would like to change the existing function names to be more uniform.

The most complicated operations
are those that produce a bitwise result
for one or more binary bitwise operations.
These do work as if written with one of the following forms:

  d = a OP b
  d = a OP b with reporting of a change to d
  a OP= b
  a OP= b with reporting of a change to a

  d = a OP (b OP c)
  d = a OP (b OP c) with reporting of a change to d
  a OP= (b OP c)
  a OP= (b OP c) with reporting of a change to a

where OP is one of

  OP  TAG  operation description
   &  and  intersection
   |  ior  union
   ^  xor  inequivalence
   -  dif  set difference (reverse relative complement) (a&~b)
   /  rdf  reverse set difference (relative complement) (~a&b)

One approach is to add operators for the bitmap operations.
Unfortunately, the straightforward way to do that would introduce
temporary variables where none exist now, which would result in a net
loss of performance.  For example,
  bitmap_a_and_b (dest, a, b);
written as
  dest = a & b;
would result in code that is effectively
  tmp = a & b;
  dest = tmp;
and would require allocation and copying of tmp.

We can avoid this run-time expense by using expression templates
http://en.wikipedia.org/wiki/Expression_templates.  However, that
change would require substantial implementation work and substantial
compile-time effort to untangle the templates.

Also, unfortunately, with C++2003 as the implementation language and
the existing use of unions in critical data structures like trees,
the overloading of assignment operators would either be prohibited or
require a significantly clumsier syntax than one would like.

Instead, I propose a more modest change, preserving the existing call
syntax, but normalizing it so that function names are predictable and
general.  By overloading these operations on the type of the bitmap,
we can use a single name for a single action, regardless of type.
Doing so eases the task of changing types, as may be desiriable to
obtain a different performance profile.

For the operations above, I suggest names as follows:

  assign_TAG (d = a OP b)
  assign_TAG_changed
  TAG_assign (a OP= b)
  TAG_assign_changed

  assign_TAG_TAG (d = a OP (b OP c))
  assign_TAG_TAG_changed
  TAG_assign_TAG (a OP= (b OP c))
  TAG_assign_TAG_changed

For example, sbitmap_a_or_b_cg (d, a, b)
becomes assign_ior_changed (d, a, b);
and bitmap_ior_and_compl_into (a, b, c)
becomes ior_assign_dif (a, b, c).

It is not surprising that some expressions using change-returning
functions do not use the function result.  However, it appears that
some such functions never have their result used.  I would propose
using void versions everwhere that the result is not used.  These can
be implemented with the non-void versions easily enough.

Testing for a change also applies to the single bit clear and set
routines for bitmap, but not ebitmap or sbitmap.  It would probably
be prudent to make names for both variants, and implement the testing
version only when needed.  The non-testing version can be implemented
via the testing version in the short term.

Other operations are simpler in structure, but still need common
names.  I propose the following:

  clear (bitmap) -> void
  clear_range (bitmap, unsigned, unsigned) -> void
  assign_not (bitmap, const_bitmap) -> void
  has_intersection (const_bitmap, const_bitmap) -> bool
  is_subset_of (const_bitmap, const_bitmap) -> bool
  cardinality (const_bitmap) -> unsigned
  cardinality_of_range (const_bitmap, unsigned, unsigned) -> unsigned
  is_empty (const_bitmap) -> bool
  is_singleton (const_bitmap) -> bool
  first_set_bit (const_bitmap) -> unsigned
  last_set_bit (const_bitmap) -> unsigned
  test_bit (const_bitmap, unsigned) -> bool

I propose to leave the allocation, iteration, vector, debug/dump/print
and functions alone for the time being.

Comments?

-- 
Lawrence Crowl

Reply via email to