Some questions on hash_set/hash_map traits

Erick Ochoa via Gcc Wed, 08 Sep 2021 13:49:41 -0700

Hello,

I am refactoring some old code that runs as an IPA_PASS. This code is
intended to run at link-time using the LTO framework and so I am
always testing it that way. At the moment, I am trying to use
hash-maps but having some difficulties. I am adding some patches that
will help illustrate along the way.


The first patch simply adds a link-time optimization pass that will be
used to demo the problem that I am having. One can run with -flto
-fipa-hash. During the first patch, the pass does not do any
meaningful work but it just helps to set up the stage.

For the second patch I create a wrapper around a symtab_node. This
wrapper is intended to add some other functionality to elements that I
might want to insert in hash-sets or hash-maps.

class symtab_wrapper
{
private:
  symtab_node *_symtab_node;
public:
  friend symtab_wrapper_traits;
  symtab_wrapper ()
    : _symtab_node (NULL)
  {}
  symtab_wrapper (symtab_node *s)
    : _symtab_node (s)
  { gcc_assert (s); }

  void print (void)
  {
    if (!dump_file) return;
    gcc_assert (_symtab_node);
    fprintf (dump_file, "%s\n", _symtab_node->name ());
  }
};

I also have a wrapper around a hash-set to add some things that might
be of interest to me in the future:

template <typename KeyId, bool Lazy = false, typename Traits =
default_hash_traits <KeyId> >
class my_set_t
{
private:
  hash_set <KeyId, Lazy, Traits> _impl;
public:
  my_set_t () {}

  void insert (const KeyId &k)
  {
    _impl.add (k);
  }

  bool contains (const KeyId &k)
  {
    return _impl.contains (k);
  }

  void print (void)
  {
    if (!dump_file) return;
    for (auto i = _impl.begin (), e = _impl.end (); i != e; ++i)
    {
      (*i).print ();
    }
  }
};

I also created a symtab_wrapper_traits because I want to store the
object in the hash-map itself, and so I need to specify how to hash
it.

class symtab_wrapper_traits
{
public:
  typedef symtab_wrapper value_type;
  typedef symtab_wrapper compare_type;
  static const bool empty_zero_p = true;

  static inline hashval_t hash (const value_type &v)
  {
    return pointer_hash <void>::hash (v._symtab_node);
  }

  static bool equal (const value_type &l, const value_type &r)
  {
    return l._symtab_node == r._symtab_node;
  }

  static void mark_deleted (value_type &v)
  {
    v._symtab_node = NULL;
  }

  static void mark_empty (value_type &v)
  {
    v._symtab_node = NULL;
  }

  static void remove (value_type &v)
  {
    v._symtab_node = NULL;
  }

  static bool is_deleted (const value_type &v)
  {
     return !v._symtab_node;
  }

  static bool is_empty (const value_type &v)
  {
     return !v._symtab_node;
  }
};

I then create a global set with my traits and populate it with some
information and later print. It seems to be working:

my_set_t <symtab_wrapper, false, symtab_wrapper_traits> my_set;

 static void
 ipa_hash_generate_summary (void)
 {
  cgraph_node *cnode = NULL;
  FOR_EACH_FUNCTION (cnode)
  {
    symtab_wrapper w (cnode);
    my_set.insert (w);
  }

  my_set.print ();

  return;
 }

Here, I already have some questions, but this is not the biggest issue
that I am having: I believe that the hash_set implementation manages
its own memory, but I am unclear about the semantics of "removed".

* Is "removed" supposed to, for example, call the destructor? Since,
in this particular case, it is only a wrapper and cgraph_node is not
intended to be destroyed, I just assign the pointer to NULL. Not sure
if that's the intention.
* I don't understand the semantics of empty_zero_p, I think it is used
to zero out deleted entries? If I mark something as empty.
* I am assuming that its memory will be re-used, is this the case?
* I see that there is a typed_delete_remove function that is sometimes
used in traits, but I think that it doesn't apply to this case because
I am storing the whole object. Is that the case?

I later did some refactoring (patch 3), which did not seem to impact
the code now, but will impact hash_map later. The refactoring is as
follows, I create an abstract "printable" class

class printable
{
public:
  virtual char* str () = 0;
  void print (void);
};

void
printable::print (void)
{
  if (!dump_file) return;
  char* _str = str ();
  fprintf (dump_file, "%s\n", _str);
  free (_str);
}

Make symtab_wrapper a publically derived class

-class symtab_wrapper
+class symtab_wrapper : public printable

and implemented the str function in symtab_wrapper:

  virtual char* str (void)
   {
    if (!dump_file) return;
    gcc_assert (_symtab_node);
    char *retval = NULL;
    asprintf (&retval, "%s", _symtab_node->name ());
    return retval;
  }

Again, this seems to be working correctly.

What is more interesting is moving from a hash-set to a hash-map. On
the fourth patch, I implemented the hash_map equivalent to the
hash_set that I am describing above:

template <typename KeyId, typename Value, typename Traits>
class my_map_t
{
private:
  hash_map <KeyId, Value, Traits> _impl;
public:
  my_map_t () {}

  void insert (const KeyId &k, const Value &v)
  {
    _impl.put (k, v);
  }

  Value *select (const KeyId &k)
  {
    return _impl.get (k);
  }

  void print (void)
  {
    if (!dump_file) return;
    for (auto i = _impl.begin (), e = _impl.end (); i != e; ++i)
    {
      (*i).first.print ();
    }
  }
};

my_map_t <symtab_wrapper, int, simple_hashmap_traits
<symtab_wrapper_traits, int>> my_map;

I fill the keys with the same data as the hash_set and for the
integers, just some incrementing values. I try to print the hash_map
after printing the hash_set, but it crashes at the call-site to "str
()"

void
printable::print (void)
{
  if (!dump_file) return;
  char* _str = str (); // <-- it crashes here
  fprintf (dump_file, "%s\n", _str);
  free (_str);
}

The interesting thing is that if I remove printable from
symtab_wrapper's class hierarchy and inline print (and devirtualize
str), then everything works correctly... (or at least to my
understanding). This is illustrated in the fifth patch, which
succeeds.

Now, I am not sure if I am doing something wrong and I am a bit lost
here. If anyone could help me, I'd appreciate it.

I have built these patches on top of trunk and also bootstrapped them.

Many thanks in advance!

0004-Runtime-error-when-printing.patch
Description: Binary data

main.c
Description: Binary data

0001-Initial-commit.patch
Description: Binary data

0003-Printing-better.patch
Description: Binary data

0002-Printing.patch
Description: Binary data

0005-Fixes-the-issue.patch
Description: Binary data

Some questions on hash_set/hash_map traits

Reply via email to