On Tue, May 20, 2014 at 10:47 AM, Cyrill Gorcunov <gorcu...@gmail.com> wrote:
> On Tue, May 20, 2014 at 10:24:49AM -0700, Andy Lutomirski wrote:
>> On Tue, May 20, 2014 at 10:21 AM, Cyrill Gorcunov <gorcu...@gmail.com> wrote:
>> > On Mon, May 19, 2014 at 03:58:33PM -0700, Andy Lutomirski wrote:
>> >> Using arch_vma_name to give special mappings a name is awkward. x86
>> >> currently implements it by comparing the start address of the vma to
>> >> the expected address of the vdso. This requires tracking the start
>> >> address of special mappings and is probably buggy if a special vma
>> >> is split or moved.
>> >>
>> >> Improve _install_special_mapping to just name the vma directly. Use
>> >> it to give the x86 vvar area a name, which should make CRIU's life
>> >> easier.
>> >>
>> >> As a side effect, the vvar area will show up in core dumps. This
>> >> could be considered weird and is fixable. Thoughts?
>> >>
>> >> Cc: Cyrill Gorcunov <gorcu...@openvz.org>
>> >> Cc: Pavel Emelyanov <xe...@parallels.com>
>> >> Signed-off-by: Andy Lutomirski <l...@amacapital.net>
>> >
>> > Hi Andy, thanks a lot for this! I must confess I don't yet know how
>> > would we deal with compat tasks but this is 'must have' mark which
>> > allow us to detect vvar area!
>>
>> Out of curiosity, how does CRIU currently handle checkpointing a
>> restored task? In current kernels, the "[vdso]" name in maps goes
>> away after mremapping the vdso.
>
> We use not only [vdso] mark to detect vdso area but also page frame
> number of the living vdso. If mark is not present in procfs output
> we examine executable areas and check if pfn == vdso_pfn, it's
> a slow path because there might be a bunch of executable areas and
> touching every of it is not that fast thing, but we simply have no
> choice.

This patch should fix this issue, at least. If there's still a way to
get a native vdso that doesn't say "[vdso]", please let me know.

>
> The situation gets worse when task was dumped on one kernel and
> then restored on another kernel where vdso content is different
> from one saved in image -- in such case as I mentioned we need
> that named vdso proxy which redirect calls to vdso of the system
> where task is restoring. And when such "restored" task get checkpointed
> second time we don't dump new living vdso but save only old vdso
> proxy on disk (detecting it is a different story, in short we
> inject a unique mark into elf header).

Yuck. But I don't know whether the kernel can help much here.

>
>>
>> I suspect that you'll need kernel changes for compat tasks, since I
>> think that mremapping the vdso on any reasonably modern hardware in a
>> 32-bit task will cause sigreturn to blow up. This could be fixed by
>> making mremap magical, although adding a new prctl or arch_prctl to
>> reliably move the vdso might be a better bet.
>
> Well, as far as I understand compat code uses abs addressing for
> vvar data and if vvar data position doesn't change we're safe,
> but same time because vvar addresses are not abi I fear one day
> we indeed hit the problems and the only solution would be
> to use kernel's help. But again, Andy, I didn't think much
> about implementing compat mode in criu yet so i might be
> missing some details.

Prior to 3.15, the compat code didn't have vvar data at all. In 3.15
and up, the vvar data is accessed using PC-relative addressing, even in
compat mode (using the usual call; mov trick to read EIP).

--Andy
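
For illustration, a minimal sketch of the name-based fast path discussed
in this thread: checking whether a line from /proc/<pid>/maps carries a
given mapping name such as "[vdso]". This is not CRIU's code; the helper
name maps_line_matches is hypothetical, the pfn-based slow path is not
shown, and it is an assumption that the newly named vvar area would
appear in maps as "[vvar]".

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not taken from CRIU or the kernel): parse one line
 * of /proc/<pid>/maps and report whether its name field equals `name`,
 * e.g. "[vdso]" or, assuming that is how the vvar area is labelled,
 * "[vvar]".
 *
 * Returns 1 and fills *start / *end on a match, 0 otherwise.
 */
static int maps_line_matches(const char *line, const char *name,
                             unsigned long *start, unsigned long *end)
{
    unsigned long s, e;
    int consumed = 0;
    const char *p;
    size_t len;

    /*
     * Fields: start-end perms offset major:minor inode [name]
     * Only start and end are stored; %n records where the name begins.
     */
    if (sscanf(line, "%lx-%lx %*4s %*lx %*x:%*x %*lu %n",
               &s, &e, &consumed) < 2 || consumed == 0)
        return 0;

    /* The name, if any, runs from here to the end of the line. */
    p = line + consumed;
    len = strcspn(p, "\n");
    if (len != strlen(name) || strncmp(p, name, len) != 0)
        return 0;

    *start = s;
    *end = e;
    return 1;
}
```

A caller would read /proc/<pid>/maps line by line (for example with
fgets) and pass each line to the helper; only when no line matches would
it need to fall back to something like the pfn comparison Cyrill
describes.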