The do-while loop in the GC thread function could become infinite if
the caching device fails completely. I reproduced this three times by
simulating the disappearance of the caching device via
'echo 1 > /sys/block/<dev>/device/delete'. In that case btree_root
starts returning a result that is neither zero nor -EAGAIN, 'gc failed'
messages fill the kernel log, and the do-while loop never terminates,
keeping a single CPU core at 100%.

There is already logic that unregisters the cache_set (or panics) on
I/O errors, so exit the loop if the unregistering procedure has already
started.

Signed-off-by: Pavel Vazharov <[email protected]>
---
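Note (not for the commit log): the control flow can be modelled in user
space roughly as below. This is only a sketch under stated assumptions.
struct fake_cache_set, fake_gc_pass(), fake_btree_gc() and the
error_limit threshold are hypothetical stand-ins invented for
illustration; they are not bcache APIs, and the real error accounting
that sets CACHE_SET_UNREGISTERING lives elsewhere in the driver.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-in for struct cache_set. */
struct fake_cache_set {
	bool unregistering;	/* models the CACHE_SET_UNREGISTERING bit */
	int errors_seen;	/* I/O errors reported so far */
	int error_limit;	/* assumed threshold that starts unregistering */
};

/* Models one GC pass against a vanished device: it always fails. */
static int fake_gc_pass(struct fake_cache_set *c)
{
	if (++c->errors_seen >= c->error_limit)
		c->unregistering = true;	/* error handling kicks in */
	return -EIO;
}

/* Returns the number of passes run before the loop exits. */
static int fake_btree_gc(struct fake_cache_set *c)
{
	int ret, passes = 0;

	assert(c != NULL);
	do {
		ret = fake_gc_pass(c);
		passes++;

		if (ret && ret != -EAGAIN) {
			/* Without this check the loop would never end. */
			if (c->unregistering)
				break;
			/* The kernel code logs "gc failed!" here. */
		}
	} while (ret);

	return passes;
}
```

With error_limit set to 3, the modelled loop runs three failing passes
and then exits once the flag is observed, instead of retrying forever.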
 drivers/md/bcache/btree.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
index 81e8dc3..a672081 100644
--- a/drivers/md/bcache/btree.c
+++ b/drivers/md/bcache/btree.c
@@ -1748,8 +1748,12 @@ static void bch_btree_gc(struct cache_set *c)
                closure_sync(&writes);
                cond_resched();
 
-               if (ret && ret != -EAGAIN)
-                       pr_warn("gc failed!");
+               if (ret && ret != -EAGAIN) {
+                       if (test_bit(CACHE_SET_UNREGISTERING, &c->flags))
+                               break;
+                       else
+                               pr_warn("gc failed!");
+               }
        } while (ret);
 
        bch_btree_gc_finish(c);
-- 
2.7.4
