> On Tue, May 06, 2025 at 04:23:07AM GMT, Jack Ng wrote:
> Thanks Dmitry. Right, the coordination mechanism in v4-0006 works as expected
> in various tests (sorry, I misunderstood some details initially).

Great, thanks for checking.

> I also want to report a couple of minor issues found during testing (which
> you may be aware of already):
>
> 1. For memory segments other than the first one ('main'), the start address
> passed to mmap may not be aligned to 4KB or huge page size (since
> reserved_offset may not be aligned) and cause mmap to fail.
>
> 2. Since the ratio for main/desc/iocv/checkpt/strategy in SHMEM_RESIZE_RATIO
> are relatively small, I think we need to guard against the case where
> 'max_available_memory' is too small for the required sizes of these segments
> (from CalculateShmemSize).
> Like when max_available_memory=default and shared_numbers=128kB, 'main' still
> needs ~109MB, but since only 10% of max_available_memory is reserved for it
> (~102MB) and start address of the next segment is calculated based on
> reserved_offset, this would cause the mappings to overlap and memory problems
> later (I hit this after fixing 1.)
> I suppose we can change the minimum value of max_available_memory to be large
> enough, and may also adjust the ratios in SHMEM_RESIZE_RATIO to ensure the
> reserved space of those segments are sufficient.

Yeah, good points. I've introduced max_available_memory expecting some heated
discussions about it, and thus didn't put much effort into covering all the
possible scenarios. But now I'm reworking it along the lines suggested by
Thomas, and will address those as well. Thanks!
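
For illustration only, the two issues above (unaligned start addresses and
reserved space smaller than the required size) could be guarded against
roughly as follows. This is a minimal sketch, not code from the patch: the
names SegmentPlan, align_up, align_down and plan_segments are hypothetical,
the ratios are plain integer percentages, and the alignment is assumed to be
a power of two (4KB or the huge page size).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Round value down to a multiple of alignment (must be a power of two). */
static size_t
align_down(size_t value, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return value & ~(alignment - 1);
}

/* Round value up to a multiple of alignment (must be a power of two). */
static size_t
align_up(size_t value, size_t alignment)
{
    return align_down(value + alignment - 1, alignment);
}

/* Hypothetical description of one segment; not a structure from the patch. */
typedef struct SegmentPlan
{
    const char *name;       /* e.g. "main", "desc" */
    size_t      percent;    /* share of the total reserved space, in % */
    size_t      required;   /* size the segment actually needs */
} SegmentPlan;

/*
 * Lay the segments out back to back inside `total` bytes of reserved
 * address space.  Every start offset is a multiple of `alignment`, and each
 * reservation is rounded down so the sum never exceeds `total`.
 *
 * Returns false as soon as a segment's reservation is smaller than what it
 * requires, which is the case that would otherwise lead to overlapping
 * mappings.  On success, offsets[i] holds the start offset of segment i.
 */
static bool
plan_segments(const SegmentPlan *segs, int nsegs,
              size_t total, size_t alignment, size_t *offsets)
{
    size_t      offset = 0;

    for (int i = 0; i < nsegs; i++)
    {
        size_t      reserved;

        reserved = align_down(total / 100 * segs[i].percent, alignment);
        if (reserved < segs[i].required)
            return false;       /* reserved space too small for segment */

        offsets[i] = offset;    /* multiple of alignment by construction */
        offset += reserved;
    }
    return true;
}
```

With a 1GB reservation and 10% for 'main', a 109MB requirement is rejected by
the check instead of silently overlapping the next segment. Passing the huge
page size as the alignment keeps the start offsets usable in that case too,
assuming the base address of the whole reservation is aligned the same way.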