Jim Meyering <jim <at> meyering.net> writes:

> >> We should fix this, and decide whether shrinking the hash table when
> >> deletion frees up a bucket is always possible, or else deal with memory
> >> allocation failure here, too.
> >
> > Additionally, it looks like hash_rehash has a memory leak - if new_table
> > is allocated, but we later fail to allocate new_entry, the function
> > returns immediately without reclaiming new_table.  Which means that even
> > if an application is prepared to deal with a false return by trying other
> > means to reduce memory usage rather than just giving up with xalloc_die,
> > the leaked memory from the previous attempt will interfere.
>
> Good catch.
> That is definitely a bug.
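In miniature, the reported flaw looks like this (illustrative types and
names only, not lib/hash.c's real structures; the allocation pattern is
the point):

#include <stdbool.h>
#include <stdlib.h>

struct thing { int x; };

/* Leaky shape, as in today's hash_rehash: a late allocation failure
   returns without reclaiming the earlier allocation.  */
static bool
build_leaky (struct thing **out_a, struct thing **out_b)
{
  struct thing *a = malloc (sizeof *a);   /* plays the role of new_table */
  struct thing *b;

  if (a == NULL)
    return false;
  b = malloc (sizeof *b);                 /* plays the role of new_entry */
  if (b == NULL)
    return false;                         /* BUG: a leaks on this path */
  *out_a = a;
  *out_b = b;
  return true;
}

/* Fixed shape: release earlier acquisitions before reporting failure,
   so a caller that recovers from the false return has lost nothing.  */
static bool
build_fixed (struct thing **out_a, struct thing **out_b)
{
  struct thing *a = malloc (sizeof *a);
  struct thing *b;

  if (a == NULL)
    return false;
  b = malloc (sizeof *b);
  if (b == NULL)
    {
      free (a);                           /* reclaim before failing */
      return false;
    }
  *out_a = a;
  *out_b = b;
  return true;
}

int
main (void)
{
  struct thing *a, *b;
  if (build_leaky (&a, &b))
    {
      free (a);
      free (b);
    }
  if (build_fixed (&a, &b))
    {
      free (a);
      free (b);
    }
  return 0;
}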
How does this look as a patch?  Is my logic sound for a) my claim that
the 'goto fail' path in hash_rehash will not encounter subsequent memory
allocation failure, and b) my claim that a failed rehash when shrinking
the table in hash_delete can be ignored?

I'm debating just deleting the USE_OBSTACK code altogether; it's even
buggier at memory management, and probably not worth maintaining.

From: Eric Blake <e...@byu.net>
Date: Mon, 8 Jun 2009 05:56:37 -0600
Subject: [PATCH] hash: avoid memory leak on allocation failure

* lib/hash.c: Document USE_OBSTACK memory management flaws.
(hash_rehash): Avoid memory leak on allocation failure.
(hash_delete): React to hash_rehash return value.

Signed-off-by: Eric Blake <e...@byu.net>
---
 ChangeLog  |    7 ++++
 lib/hash.c |  116 +++++++++++++++++++++++++++++++++++++++++++++++++++++++----
 2 files changed, 116 insertions(+), 7 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index fa976eb..ec9f5be 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,12 @@
 2009-06-15  Eric Blake  <e...@byu.net>
 
+        hash: avoid memory leak on allocation failure
+        * lib/hash.c: Document USE_OBSTACK memory management flaws.
+        (hash_rehash): Avoid memory leak on allocation failure.
+        (hash_delete): React to hash_rehash return value.
+
+2009-06-15  Eric Blake  <e...@byu.net>
+
         memchr: add valgrind exception
         * lib/memchr.valgrind: New file.
         * modules/memchr (Files): Distribute it.
diff --git a/lib/hash.c b/lib/hash.c
index 7d76d45..e4a84fb 100644
--- a/lib/hash.c
+++ b/lib/hash.c
@@ -1,7 +1,7 @@
 /* hash - hashing table processing.
 
-   Copyright (C) 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2006, 2007 Free
-   Software Foundation, Inc.
+   Copyright (C) 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2006, 2007,
+   2009 Free Software Foundation, Inc.
 
    Written by Jim Meyering, 1992.
 
@@ -21,7 +21,8 @@
 /* A generic hash table package.  */
 
 /* Define USE_OBSTACK to 1 if you want the allocator to use obstacks instead
-   of malloc.  If you change USE_OBSTACK, you have to recompile!  */
+   of malloc.  If you change USE_OBSTACK, you have to recompile!  Also,
+   memory allocation failures in the obstack are not reliably tracked.  */
 
 #include <config.h>
 
@@ -697,7 +698,7 @@ hash_free (Hash_table *table)
 
 /* Insertion and deletion.  */
 
-/* Get a new hash entry for a bucket overflow, possibly by reclying a
+/* Get a new hash entry for a bucket overflow, possibly by recycling a
    previously freed one.  If this is not possible, allocate a new one.  */
 
 static struct hash_entry *
@@ -812,7 +813,7 @@ hash_find_entry (Hash_table *table, const void *entry,
    the table may receive at least CANDIDATE different user entries, including
    those already in the table, before any other growth of the hash table size
    occurs.  If TUNING->IS_N_BUCKETS is true, then CANDIDATE specifies the
-   exact number of buckets desired.  */
+   exact number of buckets desired.  Return true iff the rehash succeeded.  */
 
 bool
 hash_rehash (Hash_table *table, size_t candidate)
@@ -857,7 +858,7 @@ hash_rehash (Hash_table *table, size_t candidate)
           struct hash_entry *new_entry = allocate_entry (new_table);
 
           if (new_entry == NULL)
-            return false;
+            goto fail;
 
           new_entry->data = data;
           new_entry->next = new_bucket->next;
@@ -897,6 +898,89 @@ hash_rehash (Hash_table *table, size_t candidate)
 
   free (new_table);
   return true;
+
+ fail:
+  /* We've allocated new_table, and possibly moved some entries, but
+     could not move the remaining entries.  We must undo the partial
+     move before returning failure.  In order to get here, the free
+     list is currently empty, and we attempted to move at least one
+     original bucket head into a new_table bucket overflow.  The only
+     move that requires yet more memory is from a new_table head back
+     to an original table overflow; meanwhile, moves from new_table
+     overflow back to an original table head will repopulate the free
+     list.  Therefore, we make two passes: the first pass restores
+     overflows, reclaiming enough free entries for the second pass to
+     move all remaining new_table heads back to the original table.  */
+  for (bucket = new_table->bucket; bucket < new_table->bucket_limit; bucket++)
+    if (bucket->data && bucket->next)
+      for (cursor = bucket->next; cursor; cursor = next)
+        {
+          void *data = cursor->data;
+          struct hash_entry *orig_bucket
+            = (table->bucket
+               + table->hasher (data, table->n_buckets));
+
+          if (! (orig_bucket < table->bucket_limit))
+            abort ();
+
+          next = cursor->next;
+
+          if (orig_bucket->data)
+            {
+              /* Merely relink an existing entry, when moving from a
+                 bucket overflow into a bucket overflow.  */
+              cursor->next = orig_bucket->next;
+              orig_bucket->next = cursor;
+            }
+          else
+            {
+              /* Free an existing entry, when moving from a new_table
+                 bucket overflow into an original bucket header.  The
+                 freed entry replenishes the free list in time for the
+                 second pass.  */
+              orig_bucket->data = data;
+              table->n_buckets_used++;
+              free_entry (table, cursor);
+            }
+        }
+  for (bucket = new_table->bucket; bucket < new_table->bucket_limit; bucket++)
+    if (bucket->data)
+      {
+        void *data = bucket->data;
+        struct hash_entry *orig_bucket
+          = (table->bucket
+             + table->hasher (data, table->n_buckets));
+
+        if (! (orig_bucket < table->bucket_limit))
+          abort ();
+
+        if (orig_bucket->data)
+          {
+            /* Recycle an entry, when moving from a bucket header into
+               a bucket overflow.  Enough free entries now exist that
+               this will never encounter allocation failure.  */
+            struct hash_entry *orig_entry = allocate_entry (table);
+
+            if (orig_entry == NULL)
+              abort ();
+
+            orig_entry->data = data;
+            orig_entry->next = orig_bucket->next;
+            orig_bucket->next = orig_entry;
+          }
+        else
+          {
+            /* Move from a new_table bucket header back into an
+               original bucket header; this simple case needs no
+               extra entry, so it cannot encounter an allocation
+               failure.  */
+            orig_bucket->data = data;
+            table->n_buckets_used++;
+          }
+      }
+
+  free (new_table);
+  return false;
 }
 
 /* If ENTRY matches an entry already in the hash table, return the pointer
@@ -1012,7 +1096,25 @@ hash_delete (Hash_table *table, const void *entry)
                     : (table->n_buckets
                        * tuning->shrink_factor * tuning->growth_threshold));
 
-              hash_rehash (table, candidate);
+              if (!hash_rehash (table, candidate))
+                {
+                  /* Failure to allocate memory in an attempt to
+                     shrink the table is not fatal.  But since memory
+                     is low, we can at least be kind and free any
+                     spare entries, rather than keeping them tied up
+                     in the free entry list.  */
+#if ! USE_OBSTACK
+                  struct hash_entry *cursor = table->free_entry_list;
+                  struct hash_entry *next;
+                  while (cursor)
+                    {
+                      next = cursor->next;
+                      free (cursor);
+                      cursor = next;
+                    }
+                  table->free_entry_list = NULL;
+#endif
+                }
             }
         }
     }
-- 
1.6.3.1
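As a postscript on claim b): the callers this is meant to protect look
roughly like the sketch below.  It is written against gnulib's <hash.h>
interface (hash_initialize, hash_insert, hash_delete, hash_get_n_entries,
hash_free); the tuning values and the string hasher are invented for
illustration and are not part of the patch.

#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include "hash.h"

/* Toy string hasher and comparator, for illustration only.  */
static size_t
str_hasher (const void *entry, size_t n_buckets)
{
  size_t h = 0;
  const char *p;
  for (p = entry; *p; p++)
    h = h * 31 + (unsigned char) *p;
  return h % n_buckets;
}

static bool
str_comparator (const void *a, const void *b)
{
  return strcmp (a, b) == 0;
}

int
main (void)
{
  /* A nonzero shrink_threshold makes hash_delete try to shrink the
     table (via hash_rehash) as entries are removed; the default
     tuning never shrinks.  These values are made up.  */
  static Hash_tuning const tuning = { 0.3f, 0.6f, 0.8f, 1.414f, false };
  static const char *words[] = { "alpha", "beta", "gamma", "delta" };
  Hash_table *t = hash_initialize (16, &tuning, str_hasher,
                                   str_comparator, NULL);
  size_t i;

  if (!t)
    return 1;
  for (i = 0; i < 4; i++)
    if (hash_insert (t, words[i]) == NULL)
      return 1;              /* insertion can fail for lack of memory */

  /* Deletion, by contrast, never fails for lack of memory: if the
     internal shrink rehash cannot allocate, the entry is still
     removed and the old table geometry is kept.  */
  for (i = 0; i < 4; i++)
    hash_delete (t, words[i]);

  printf ("%zu entries left\n", hash_get_n_entries (t));
  hash_free (t);
  return 0;
}

If hash_rehash's false return were treated as fatal inside hash_delete,
every caller like this would have to treat deletion as an operation that
can fail for lack of memory, which is exactly what the patch avoids.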