Hi Sandeep,
On 2024/5/24 05:01, Sandeep Dhavale wrote:
Depending on the size of the filesystem being built, there can be a
huge number of elements in the hashmap. Currently we call
hashmap_iter_first() in a while loop to iterate over and free the
elements. While technically correct, this is inefficient in two
respects:
- As we are iterating over the elements only to remove them, we do not
need the overhead of rehashing.
- The part that contributes most to the cost is hashmap_iter_first():
it starts scanning from index 0 every time, throwing away the progress
of the previous successful scan. For a sparsely populated hashmap this
becomes O(n^2) in the worst case (see the sketch right after this
list).
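To make the cost concrete, the teardown being replaced is roughly the
following (a minimal sketch against the git-style hashmap API bundled
with erofs-utils in lib/hashmap.c; the three-argument hashmap_remove()
and the flat free() of the entry are assumptions about that API, not a
verbatim copy of erofs_blob_exit()):

#include <stdlib.h>
#include "hashmap.h"	/* git-style hashmap bundled with erofs-utils */

static void blob_hashmap_teardown_slow(struct hashmap *map)
{
	struct hashmap_iter iter;
	void *e;

	/*
	 * hashmap_iter_first() reinitializes the iterator, so every
	 * call rescans the table from bucket 0, walking across all the
	 * buckets already emptied by earlier removals; freeing n
	 * entries of a sparse table this way degrades to O(n^2).
	 */
	while ((e = hashmap_iter_first(map, &iter))) {
		hashmap_remove(map, e, NULL);
		free(e);	/* entry embeds its hashmap_entry first */
	}
}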
Let's fix this by disabling hashmap shrinking, which avoids rehashing,
and by using hashmap_iter_next(), which is then guaranteed to visit
every element during removal while avoiding the performance pitfalls
of hashmap_iter_first().
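Concretely, the fixed teardown looks roughly like this (again a sketch
rather than the verbatim patch; zeroing shrink_at assumes the
git-style hashmap implementation, where hashmap_remove() only rehashes
the table once map->size drops below map->shrink_at):

static void blob_hashmap_teardown_fast(struct hashmap *map)
{
	struct hashmap_iter iter;
	void *e;

	/*
	 * With shrinking disabled, removals can never trigger a
	 * rehash, so the iterator's position in the table stays valid
	 * while we delete entries behind it.
	 */
	map->shrink_at = 0;

	/* One linear scan: hashmap_iter_next() resumes from the
	 * current bucket instead of restarting at index 0. */
	e = hashmap_iter_first(map, &iter);
	while (e) {
		void *victim = e;

		e = hashmap_iter_next(&iter);	/* advance before freeing */
		hashmap_remove(map, victim, NULL);
		free(victim);
	}
}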
A test with random data shows the following performance improvement:

fs_size    Before    After
1G         23s       7s
2G         81s       15s
4G         272s      31s
8G         1252s     61s
Sigh... BTW, in the long term, I guess we might need to find a better
hashmap implementation (with an MIT or BSD license) instead of this
one.
Signed-off-by: Sandeep Dhavale <dhav...@google.com>
---
lib/blobchunk.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/lib/blobchunk.c b/lib/blobchunk.c
index 645bcc1..8082aa4 100644
--- a/lib/blobchunk.c
+++ b/lib/blobchunk.c
@@ -548,11 +548,17 @@ void erofs_blob_exit(void)
if (blobfile)
fclose(blobfile);
- while ((e = hashmap_iter_first(&blob_hashmap, &iter))) {
+ /* Disable hashmap shrink, effectively disabling rehash.
+ * This way we can iterate over entire hashmap efficiently
+ * and safely by using hashmap_iter_next() */
I will fix up the comment style manually; otherwise it looks good to
me...
Reviewed-by: Gao Xiang <hsiang...@linux.alibaba.com>
Thanks,
Gao Xiang