On 2022/1/17 10:19, Peter Xu wrote:
On Wed, Jan 05, 2022 at 01:14:06AM +0800, huang...@chinatelecom.cn wrote:
From: Hyman Huang(黄勇) <huang...@chinatelecom.cn>
+
+static void vcpu_dirty_stat_collect(VcpuStat *stat,
+                                    DirtyPageRecord *records,
+                                    bool start)
+{
+    CPUState *cpu;
+
+    CPU_FOREACH(cpu) {
+        if (!start && cpu->cpu_index >= stat->nvcpu) {
+            /*
+             * Never go there unless cpu is hot-plugged,
+             * just ignore in this case.
+             */
+            continue;
+        }
As commented before, I think the only way to do this right is to not allow cpu plug/unplug during the measurement.

Say, even if the index didn't go out of range, an unplug event would generate very strange output for the unplugged cpu. Please see more below.
+        record_dirtypages(records, cpu, start);
+    }
+}
+
+int64_t vcpu_calculate_dirtyrate(int64_t calc_time_ms,
+                                 int64_t init_time_ms,
+                                 VcpuStat *stat,
+                                 unsigned int flag,
+                                 bool one_shot)
+{
+    DirtyPageRecord *records;
+    int64_t duration;
+    int64_t dirtyrate;
+    int i = 0;
+
+    cpu_list_lock();
+    records = vcpu_dirty_stat_alloc(stat);
+    vcpu_dirty_stat_collect(stat, records, true);
+    cpu_list_unlock();
Continuing from the above - then I'm wondering whether we should just keep holding the lock until the second vcpu_dirty_stat_collect().

Yes, we could be holding the lock for a long time because of the sleep, but the plug path in the main thread will just wait for it to complete, and it is at least not, e.g., a deadlock.
The other solution is that we do cpu_list_unlock() like this, but introduce another cpu_list_generation_id and bump it after any plug/unplug of a cpu, i.e. whenever the cpu list changes.

Then we record the cpu generation ID at the entry of this function and retry the whole measurement if at some point we find the generation ID has changed (we need to fetch the gen ID after taking the lock, of course). That would avoid taking the cpu list lock during dirty_stat_wait(), but it would start to complicate the cpu list locking rules.

The simpler way is still just to take the lock, imho.
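
A minimal sketch of that retry idea against the function quoted above, assuming a hypothetical cpu_list_generation_id counter that is bumped on every plug/unplug and a cpu_list_generation_id_get() accessor read under the lock (neither exists in the tree at this point; the names are illustrative):

    unsigned int gen_id;

retry:
    cpu_list_lock();
    gen_id = cpu_list_generation_id_get();    /* snapshot under the lock */
    records = vcpu_dirty_stat_alloc(stat);
    vcpu_dirty_stat_collect(stat, records, true);
    cpu_list_unlock();

    duration = dirty_stat_wait(calc_time_ms, init_time_ms);

    global_dirty_log_sync(flag, one_shot);

    cpu_list_lock();
    if (gen_id != cpu_list_generation_id_get()) {
        /* cpu list changed during the wait: drop this round and retry */
        cpu_list_unlock();
        g_free(records);
        goto retry;
    }
    vcpu_dirty_stat_collect(stat, records, false);
    cpu_list_unlock();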
Hi Peter, I'm working on this per your suggestion, keeping qemu_cpu_list_lock held across the dirty page rate calculation. I found a deadlock when testing the hotplug scenario; the interleaving is as follows:

    calc thread                        qemu main thread
    -----------                        ----------------
    1. take qemu_cpu_list_lock
                                       1. take the BQL
    2. collect dirty pages and wait    2. cpu hotplug
                                       3. take qemu_cpu_list_lock
    3. take the BQL
    4. sync dirty log
    5. release the BQL

The main thread blocks at its step 3 because the calc thread holds qemu_cpu_list_lock across the wait, and the calc thread blocks at its step 3 (syncing the dirty log needs the BQL) because the main thread still holds the BQL: a classic ABBA deadlock.
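
In code terms, the two paths look roughly like this (a simplified sketch of the ordering above, not the exact call chains):

    /* calc thread: sleeps while holding qemu_cpu_list_lock */
    cpu_list_lock();
    vcpu_dirty_stat_collect(stat, records, true);
    dirty_stat_wait(calc_time_ms, init_time_ms);
    qemu_mutex_lock_iothread();     /* step 3: blocks, main thread holds the BQL */

    /* qemu main thread: hotplug path, entered with the BQL held */
    qemu_mutex_lock_iothread();
    /* ... vcpu hotplug eventually reaches cpu_list_add() ... */
    cpu_list_lock();                /* step 3: blocks, calc thread holds it */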
I just recalled that this is one of the reasons why I handle the plug/unplug scenario (another is that cpu plug may wait quite a long time when dirtylimit is in service).

It seems we have two strategies: one is to keep this logic untouched in v12 and put the "cpu_list_generation_id" implementation on the TODO list (once this patchset is merged, I'll try that out); the other is to introduce "cpu_list_generation_id" right now.

Which strategy do you prefer?
Uh... I think the "unmatched_cnt" is kind of like this too, because once we remove the "unmatched count" logic the throttle algorithm is more likely to oscillate, so I would prefer to put "unmatched_cnt" on the TODO list as above.
The rest looks good, thanks.
+
+    duration = dirty_stat_wait(calc_time_ms, init_time_ms);
+
+    global_dirty_log_sync(flag, one_shot);
+
+    cpu_list_lock();
+    vcpu_dirty_stat_collect(stat, records, false);
+    cpu_list_unlock();
+
+    for (i = 0; i < stat->nvcpu; i++) {
+        dirtyrate = do_calculate_dirtyrate(records[i], duration);
+
+        stat->rates[i].id = i;
+        stat->rates[i].dirty_rate = dirtyrate;
+
+        trace_dirtyrate_do_calculate_vcpu(i, dirtyrate);
+    }
+
+    g_free(records);
+
+    return duration;
+}
+
static bool is_sample_period_valid(int64_t sec)
{
    if (sec < MIN_FETCH_DIRTYRATE_TIME_SEC ||
@@ -396,44 +518,6 @@ static bool compare_page_hash_info(struct RamblockDirtyInfo *info,
    return true;
}
-static inline void record_dirtypages(DirtyPageRecord *dirty_pages,
-                                     CPUState *cpu, bool start)
-{
-    if (start) {
-        dirty_pages[cpu->cpu_index].start_pages = cpu->dirty_pages;
-    } else {
-        dirty_pages[cpu->cpu_index].end_pages = cpu->dirty_pages;
-    }
-}
-
-static void dirtyrate_global_dirty_log_start(void)
-{
-    qemu_mutex_lock_iothread();
-    memory_global_dirty_log_start(GLOBAL_DIRTY_DIRTY_RATE);
-    qemu_mutex_unlock_iothread();
-}
-
-static void dirtyrate_global_dirty_log_stop(void)
-{
-    qemu_mutex_lock_iothread();
-    memory_global_dirty_log_sync();
-    memory_global_dirty_log_stop(GLOBAL_DIRTY_DIRTY_RATE);
-    qemu_mutex_unlock_iothread();
-}
-
-static int64_t do_calculate_dirtyrate_vcpu(DirtyPageRecord dirty_pages)
-{
-    uint64_t memory_size_MB;
-    int64_t time_s;
-    uint64_t increased_dirty_pages =
-        dirty_pages.end_pages - dirty_pages.start_pages;
-
-    memory_size_MB = (increased_dirty_pages * TARGET_PAGE_SIZE) >> 20;
-    time_s = DirtyStat.calc_time;
-
-    return memory_size_MB / time_s;
-}
-
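
For reference, the arithmetic in the removed helper above (carried over into its do_calculate_dirtyrate() replacement) is just dirtied bytes scaled to MB, divided by the sample time. A worked example with illustrative values, assuming TARGET_PAGE_SIZE is 4 KiB:

    /* illustrative values only, assuming TARGET_PAGE_SIZE == 4096 */
    uint64_t increased_dirty_pages = 51200;           /* end_pages - start_pages */
    uint64_t memory_size_MB = (51200 * 4096) >> 20;   /* = 200 MB dirtied        */
    int64_t dirty_rate = memory_size_MB / 2;          /* 2s window -> 100 MB/s   */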
static inline void record_dirtypages_bitmap(DirtyPageRecord *dirty_pages,
                                            bool start)
{
@@ -444,11 +528,6 @@ static inline void record_dirtypages_bitmap(DirtyPageRecord *dirty_pages,
    }
}
-static void do_calculate_dirtyrate_bitmap(DirtyPageRecord dirty_pages)
-{
-    DirtyStat.dirty_rate = do_calculate_dirtyrate_vcpu(dirty_pages);
-}
-
static inline void dirtyrate_manual_reset_protect(void)
{
    RAMBlock *block = NULL;
@@ -492,71 +571,52 @@ static void calculate_dirtyrate_dirty_bitmap(struct DirtyRateConfig config)
    DirtyStat.start_time = start_time / 1000;
    msec = config.sample_period_seconds * 1000;
-    msec = set_sample_page_period(msec, start_time);
+    msec = dirty_stat_wait(msec, start_time);
    DirtyStat.calc_time = msec / 1000;
    /*
-     * dirtyrate_global_dirty_log_stop do two things.
+     * do two things.
     * 1. fetch dirty bitmap from kvm
     * 2. stop dirty tracking
     */
-    dirtyrate_global_dirty_log_stop();
+    global_dirty_log_sync(GLOBAL_DIRTY_DIRTY_RATE, true);
    record_dirtypages_bitmap(&dirty_pages, false);
-    do_calculate_dirtyrate_bitmap(dirty_pages);
+    DirtyStat.dirty_rate = do_calculate_dirtyrate(dirty_pages, msec);
}
--
Best regards
Hyman Huang(黄勇)