** Description changed:

  [Impact]
  
  The option CONFIG_DMA_CMA seems to cause resume problems on the t2.*
  instance types (Xen).
  
- With this option enabled device drivers are allowed to use the Contiguous Memory Allocator (CMA) for DMA operations. So, drivers can allocate large physically-contiguous blocks of memory, instead of relying on the I/O map or scatter-gather support.
- 
- However, on resume, the memory used by DMA needs to be re-initialized / re-allocated, but it may fail to allocate large chunks of contiguous memory due to the fact that we also need to restore the hibernation image, using more memory and causing a system hang during the resume process.
+ With this option enabled device drivers are allowed to use the
+ Contiguous Memory Allocator (CMA) for DMA operations. So, drivers can
+ allocate large physically-contiguous blocks of memory, instead of
+ relying on the I/O map or scatter-gather support.
+ 
+ However, on resume, the memory used by DMA needs to be re-initialized /
+ re-allocated, but it may fail to allocate large chunks of contiguous
+ memory due to the fact that we also need to restore the hibernation
+ image, using more memory and causing a system hang during the resume
+ process.
  
  [Test case]
  
  Hibernate / resume any t2.* instance (especially t2.nano, where the
  problem seems to happen 100% of the times after 2 consecutive
  hibernate/resume cycles).
  
  [Fix]
  
  Disable CONFIG_DMA_CMA.
  
  NOTE: this option is already disabled in the generic kernel (see LP:
  #1362261).
  
+ With this option disabled the success rate of hibernation on the t2.*
+ instance types during our tests jumped to 100%.
+ 
  [Regression potential]
  
  It is a .config change, no regression potential except with the fact
  that disabling this option also disables the module 'etnaviv' (Vivante
  graphic card), that is not really needed in the aws kernel.
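
For anyone who wants to confirm whether a given kernel build still has
the option enabled before running the hibernate/resume test, here is a
minimal sketch in Python. It is not part of the fix. The function names
(kconfig_option_state, running_kernel_config_path) are hypothetical
helpers, the sample config strings are made up for illustration, and the
/boot/config-<release> location is assumed from the usual Ubuntu layout.

```python
import platform
import re


def kconfig_option_state(config_text: str, option: str) -> str:
    """Report how `option` appears in kernel .config text.

    Returns the assigned value (for example "y" or "m"), "not set" if the
    standard '# OPTION is not set' marker is present, or "absent" if the
    option is not mentioned at all.
    """
    set_re = re.compile(rf"^{re.escape(option)}=(.*)$")
    unset_marker = f"# {option} is not set"
    for raw_line in config_text.splitlines():
        line = raw_line.strip()
        if line == unset_marker:
            return "not set"
        match = set_re.match(line)
        if match:
            return match.group(1)
    return "absent"


def running_kernel_config_path() -> str:
    """Assumed location of the running kernel's config on Ubuntu."""
    return "/boot/config-" + platform.release()


# Illustrative inputs only; these are not taken from a real aws kernel.
sample_enabled = "CONFIG_DMA_CMA=y\n"
sample_disabled = "# CONFIG_DMA_CMA is not set\n"

print(kconfig_option_state(sample_enabled, "CONFIG_DMA_CMA"))
print(kconfig_option_state(sample_disabled, "CONFIG_DMA_CMA"))
```

On an instance, the text could be read from the path returned by
running_kernel_config_path() and passed to kconfig_option_state(); a
result of "not set" or "absent" would be consistent with the option
being disabled as proposed in [Fix].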
** Description changed:

  [Impact]
  
  The option CONFIG_DMA_CMA seems to cause resume problems on the t2.*
  instance types (Xen).
  
  With this option enabled device drivers are allowed to use the
  Contiguous Memory Allocator (CMA) for DMA operations. So, drivers can
  allocate large physically-contiguous blocks of memory, instead of
  relying on the I/O map or scatter-gather support.
  
  However, on resume, the memory used by DMA needs to be re-initialized /
  re-allocated, but it may fail to allocate large chunks of contiguous
  memory due to the fact that we also need to restore the hibernation
  image, using more memory and causing a system hang during the resume
  process.
  
  [Test case]
  
  Hibernate / resume any t2.* instance (especially t2.nano, where the
  problem seems to happen 100% of the times after 2 consecutive
  hibernate/resume cycles).
  
  [Fix]
  
  Disable CONFIG_DMA_CMA.
  
  NOTE: this option is already disabled in the generic kernel (see LP:
  #1362261).
  
  With this option disabled the success rate of hibernation on the t2.*
  instance types during our tests jumped to 100%.
  
  [Regression potential]
  
- It is a .config change, no regression potential except with the fact
- that disabling this option also disables the module 'etnaviv' (Vivante
+ It is a .config change, no regression potential except for the fact that
+ disabling this option also disables the module 'etnaviv' (Vivante
  graphic card), that is not really needed in the aws kernel.

** Description changed:

  [Impact]
  
- The option CONFIG_DMA_CMA seems to cause resume problems on the t2.*
- instance types (Xen).
+ The option CONFIG_DMA_CMA seems to cause hibernation failures on the
+ t2.* instance types (Xen).
  
  With this option enabled device drivers are allowed to use the
  Contiguous Memory Allocator (CMA) for DMA operations. So, drivers can
  allocate large physically-contiguous blocks of memory, instead of
  relying on the I/O map or scatter-gather support.
  
  However, on resume, the memory used by DMA needs to be re-initialized /
  re-allocated, but it may fail to allocate large chunks of contiguous
  memory due to the fact that we also need to restore the hibernation
  image, using more memory and causing a system hang during the resume
  process.
  
  [Test case]
  
  Hibernate / resume any t2.* instance (especially t2.nano, where the
  problem seems to happen 100% of the times after 2 consecutive
  hibernate/resume cycles).
  
  [Fix]
  
  Disable CONFIG_DMA_CMA.
  
  NOTE: this option is already disabled in the generic kernel (see LP:
  #1362261).
  
  With this option disabled the success rate of hibernation on the t2.*
  instance types during our tests jumped to 100%.
  
  [Regression potential]
  
  It is a .config change, no regression potential except for the fact that
  disabling this option also disables the module 'etnaviv' (Vivante
  graphic card), that is not really needed in the aws kernel.

-- 
You received this bug notification because you are a member of Ubuntu
Bugs, which is subscribed to Ubuntu.
https://bugs.launchpad.net/bugs/1879711

Title:
  aws: disable CONFIG_DMA_CMA

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux-aws/+bug/1879711/+subscriptions

-- 
ubuntu-bugs mailing list
ubuntu-bugs@lists.ubuntu.com
https://lists.ubuntu.com/mailman/listinfo/ubuntu-bugs