** Also affects: linux-aws (Ubuntu Groovy)
   Importance: Undecided
       Status: New

** Also affects: linux-aws (Ubuntu Hirsute)
   Importance: Undecided
       Status: New

** Also affects: linux (Ubuntu)
   Importance: Undecided
       Status: New

** Description changed:

  [Impact]
  
  In LP: #1918694 we applied a fix and a workaround to solve the
  hibernation issues on c5.18xlarge. The workaround was in the form of a
  SAUCE patch:
  
    "UBUNTU: SAUCE: aws: kvm: double the size of hv_clock_boot"
  
  It looks like we can replace this workaround with a proper fix, by
  applying this patch:
  
  http://next.patchew.org/Linux/20210414123544.1060604-1-vkuzn...@redhat.com/
  
+ This is required because various PV features (Async PF, PV EOI, steal
+ time) work through memory shared with the hypervisor. When we restore
+ from hibernation we must properly tear down all of these features, so
+ that the hypervisor doesn't write to stale locations after we jump to
+ the previously hibernated kernel.
+ 
+ Since this is not specific to AWS, it is safe to apply this patch set
+ to the other generic kernels as well.
+ 
  [Test plan]
  
- Create a c5.18xlarge instance, run the memory stress test script (the
- same test script that we are using to stress test hibernation), trigger
- the hibernate event, trigger the resume event. Repeat a couple of times
- and the problem is very likely to happen.
+ This can be easily tested on AWS (but it should be reproducible by
+ hibernating any KVM instance with multiple CPUs). Create a c5.18xlarge
+ instance, run the memory stress test script (the same script that we
+ use to stress test hibernation), trigger the hibernate event, then
+ trigger the resume event. After a few repetitions the problem is very
+ likely to occur.
  
  [Fix]
  
- Replace "UBUNTU: SAUCE: aws: kvm: double the size of hv_clock_boot"
- with:
+ On the AWS kernel replace "UBUNTU: SAUCE: aws: kvm: double the size of
+ hv_clock_boot" with:
  
  http://next.patchew.org/Linux/20210414123544.1060604-1-vkuzn...@redhat.com/
+ 
+ For the other kernels, simply apply this patch set.
  
  The fix has been tested extensively in the AWS infrastructure with
  positive results.
  
  [Regression potential]
  
  The new code introduced by the fix also runs when a CPU is taken
  offline, so we may see regressions in KVM CPU hotplug.

** Summary changed:

- aws: proper fix for c5.18xlarge hibernation issues
+ properly tear down KVM PV features on hibernate

-- 
You received this bug notification because you are a member of Kernel
Packages, which is subscribed to linux-aws in Ubuntu.
https://bugs.launchpad.net/bugs/1920944

Title:
  kvm: properly tear down PV features on hibernate

Status in linux package in Ubuntu:
  Incomplete
Status in linux-aws package in Ubuntu:
  New
Status in linux source package in Focal:
  Incomplete
Status in linux-aws source package in Focal:
  New
Status in linux source package in Groovy:
  Incomplete
Status in linux-aws source package in Groovy:
  New
Status in linux source package in Hirsute:
  Incomplete
Status in linux-aws source package in Hirsute:
  New

Bug description:
  [Impact]

  In LP: #1918694 we applied a fix and a workaround to solve the
  hibernation issues on c5.18xlarge. The workaround was in the form of a
  SAUCE patch:

    "UBUNTU: SAUCE: aws: kvm: double the size of hv_clock_boot"

  It looks like we can replace this workaround with a proper fix, by
  applying this patch:

  http://next.patchew.org/Linux/20210414123544.1060604-1-vkuzn...@redhat.com/

  This is required because various PV features (Async PF, PV EOI, steal
  time) work through memory shared with the hypervisor. When we restore
  from hibernation we must properly tear down all of these features, so
  that the hypervisor doesn't write to stale locations after we jump to
  the previously hibernated kernel.

  Since this is not specific to AWS, it is safe to apply this patch set
  to the other generic kernels as well.
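
  The reasoning above can be illustrated with a small toy model. This is
  not the kernel patch: the Hypervisor class, the feature names and the
  integer "addresses" are all hypothetical, chosen only to show why a
  registration left behind by the restore kernel ends up corrupting the
  resumed image.

```python
# Toy model only: every name here is hypothetical, nothing is kernel code.
FEATURES = ("async_pf", "pv_eoi", "steal_time")


class Hypervisor:
    """Remembers the guest address each PV feature was told to write to."""

    def __init__(self):
        self.registered = {}  # feature name -> guest address

    def enable(self, feature, addr):
        self.registered[feature] = addr

    def disable(self, feature):
        self.registered.pop(feature, None)

    def tick(self, memory):
        # The hypervisor writes to every address it still has registered,
        # whether or not the running kernel expects it.
        for feature, addr in self.registered.items():
            memory[addr] = "hv:" + feature


def resume_from_hibernation(teardown):
    """Return the addresses of the resumed image that got overwritten."""
    hv = Hypervisor()

    # The restore kernel registers its shared areas at addresses 0..2.
    for addr, feature in enumerate(FEATURES):
        hv.enable(feature, addr)

    # With the fix, every feature is torn down before the jump.
    if teardown:
        for feature in FEATURES:
            hv.disable(feature)

    # Jump to the hibernated kernel: the same addresses now hold its data.
    memory = ["image-data"] * 8

    # The hypervisor gets to run before the resumed kernel re-registers.
    hv.tick(memory)
    return [addr for addr, value in enumerate(memory) if value != "image-data"]
```

  In this model, calling resume_from_hibernation(False) reports the
  three stale locations as overwritten, while resume_from_hibernation(True)
  reports none.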

  [Test plan]

  This can be easily tested on AWS (but it should be reproducible by
  hibernating any KVM instance with multiple CPUs). Create a c5.18xlarge
  instance, run the memory stress test script (the same script that we
  use to stress test hibernation), trigger the hibernate event, then
  trigger the resume event. After a few repetitions the problem is very
  likely to occur.

  [Fix]

  On the AWS kernel replace "UBUNTU: SAUCE: aws: kvm: double the size of
  hv_clock_boot" with:

  http://next.patchew.org/Linux/20210414123544.1060604-1-vkuzn...@redhat.com/

  For the other kernels, simply apply this patch set.

  The fix has been tested extensively in the AWS infrastructure with
  positive results.

  [Regression potential]

  The new code introduced by the fix also runs when a CPU is taken
  offline, so we may see regressions in KVM CPU hotplug.
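
  One way to exercise the CPU offline path mentioned above is to cycle
  CPUs through the standard sysfs control file
  /sys/devices/system/cpu/cpuN/online from inside a KVM guest. The
  following is a minimal sketch, assuming root privileges; the function
  names and the sysfs_root and settle parameters are illustrative and
  not part of any existing test suite.

```python
import time
from pathlib import Path

SYSFS_CPU = "/sys/devices/system/cpu"


def hotplug_candidates(sysfs_root=SYSFS_CPU):
    """List CPU numbers that expose an 'online' control file.

    cpu0 frequently has no such file and is skipped automatically.
    """
    cpus = []
    for entry in Path(sysfs_root).glob("cpu[0-9]*"):
        if (entry / "online").exists():
            cpus.append(int(entry.name[3:]))
    return sorted(cpus)


def cycle_cpu(cpu, sysfs_root=SYSFS_CPU, settle=0.0):
    """Take one CPU offline, bring it back, and return its final state."""
    online = Path(sysfs_root) / f"cpu{cpu}" / "online"
    online.write_text("0")  # offline: this is the path the fix touches
    time.sleep(settle)
    online.write_text("1")  # back online
    time.sleep(settle)
    return online.read_text().strip()
```

  A test run might loop over hotplug_candidates() and call cycle_cpu()
  on each CPU several times, checking that it returns "1" each time and
  watching dmesg for errors.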

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1920944/+subscriptions

-- 
Mailing list: https://launchpad.net/~kernel-packages
Post to     : kernel-packages@lists.launchpad.net
Unsubscribe : https://launchpad.net/~kernel-packages
More help   : https://help.launchpad.net/ListHelp
