From: Cyrill Gorcunov <gorcu...@virtuozzo.com>

These two members represent the monotonic and bootbased clocks for the
container's uptime. When a container is suspended (or is being moved to
another node) we treat the monotonic and bootbased clocks as stopped, so
on restore we need to account for the elapsed delta and adjust the
members in question.

Moreover, these timestamps are involved in posix-timers setup, so when an
application sets up monotonic clocks after the restore (with an absolute
time specification) we adjust the values as well.

The application which migrates a container must fetch the current
settings from /sys/fs/cgroup/ve/$VE/ve.real_start_timespec and
/sys/fs/cgroup/ve/$VE/ve.start_timespec, then write them back on
restore.

https://jira.sw.ru/browse/PSBM-41311
https://jira.sw.ru/browse/PSBM-41406

v2:
 - use clock_[monotonic|bootbased] for cgroup entry names instead

Original-by: Andrew Vagin <ava...@openvz.org>
Signed-off-by: Cyrill Gorcunov <gorcu...@virtuozzo.com>
Reviewed-by: Vladimir Davydov <vdavy...@virtuozzo.com>

(cherry picked from vz7 commit 43f4b0c752abd84aa1b346373d152941123d2446
("ve: Add interface for @start_timespec and @real_start_timespec adjustmen"))

Signed-off-by: Konstantin Khorenko <khore...@virtuozzo.com>

+++
ve/time: Limit values to write in ve::clock_[monotonic|bootbased]

What do we mean when we write a value XXX into, say,
ve::ve.clock_bootbased? We mean that "up to now the CT has already
worked for XXX secs/nsecs". And we store the delta between the Node's
"now" and XXX in ve->start_time_real.

If the CT has worked for less time than the current Node,
ve->start_time_real will contain a positive value and we'll subtract it
from the Node's "now" each time we need to get the time since the CT
start.

If the CT has worked longer than the current Node (say, the CT has been
migrated from another HN), the stored delta will be negative and thus
we'll "add" more time to the Node's "now".

So then what do we want to limit?

1. Negative values written to ve::clock_[monotonic|bootbased].
   Indeed, we can hardly imagine that the CT has been started but the
   time since its start is negative.

2. A big positive value, such that some time later a read from
   ve::clock_[monotonic|bootbased] returns an overflowed value.

Both these checks are performed by timespec_valid_strict().
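
The delta bookkeeping described above can be sketched in plain userspace
C. This is only an illustration, not the kernel code: the helper names
ve_start_from_uptime(), ve_uptime_at() and ve_clock_selfcheck() are
hypothetical, and it assumes the stored start value is an unsigned 64-bit
nanosecond count, so a "negative" delta is represented by wraparound
modulo 2^64.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Write path (hypothetical helper): store "node now" minus the uptime
 * the CT claims. If the CT is older than the node, this wraps. */
static uint64_t ve_start_from_uptime(uint64_t node_now_ns, uint64_t ct_uptime_ns)
{
	return node_now_ns - ct_uptime_ns;
}

/* Read path (hypothetical helper): CT uptime is "node now" minus the
 * stored start value; the unsigned subtraction undoes any wraparound. */
static uint64_t ve_uptime_at(uint64_t node_now_ns, uint64_t start_ns)
{
	return node_now_ns - start_ns;
}

/* Small self-check covering both cases from the description. */
static void ve_clock_selfcheck(void)
{
	const uint64_t s = NSEC_PER_SEC;
	uint64_t start;

	/* CT younger than the node: start is an ordinary positive offset. */
	start = ve_start_from_uptime(1000 * s, 200 * s);
	assert(start == 800 * s);
	assert(ve_uptime_at(1010 * s, start) == 210 * s);

	/* CT older than the node (migrated in): start wraps, reads still work. */
	start = ve_start_from_uptime(100 * s, 500 * s);
	assert(ve_uptime_at(130 * s, start) == 530 * s);
}
```

Under these assumptions the unsigned subtraction gives the expected
uptime in both cases, which is why only the written value itself needs
to be range-checked.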

Fixes: 25cab3041305 ("ve: Add interface for ve::clock_[monotonic|bootbased] adjustment")
Signed-off-by: Konstantin Khorenko <khore...@virtuozzo.com>
Reviewed-by: Kirill Tkhai <ktk...@virtuozzo.com>

Cherry-picked from vz8 commit ad5d9cc5fd62 ("ve: Add interface for
ve::clock_[monotonic|bootbased] adjustment").

Ported to timespec64.
Followed ve->real_start_time -> ve->start_boottime rename.
Followed ktime_get_boot_ns() -> ktime_get_boottime_ns() rename.

Signed-off-by: Nikita Yushchenko <nikita.yushche...@virtuozzo.com>
---
 kernel/ve/ve.c | 76 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 76 insertions(+)

diff --git a/kernel/ve/ve.c b/kernel/ve/ve.c
index e3a07d4c9fe4..f3df12f8638b 100644
--- a/kernel/ve/ve.c
+++ b/kernel/ve/ve.c
@@ -955,6 +955,68 @@ static ssize_t ve_os_release_write(struct kernfs_open_file *of, char *buf,
 	return ret ? ret : nbytes;
 }
 
+enum {
+	VE_CF_CLOCK_MONOTONIC,
+	VE_CF_CLOCK_BOOTBASED,
+};
+
+static int ve_ts_read(struct seq_file *sf, void *v)
+{
+	struct ve_struct *ve = css_to_ve(seq_css(sf));
+	struct timespec64 ts;
+	u64 now, delta;
+
+	switch (seq_cft(sf)->private) {
+	case VE_CF_CLOCK_MONOTONIC:
+		now = ktime_get_ns();
+		delta = ve->start_time;
+		break;
+	case VE_CF_CLOCK_BOOTBASED:
+		now = ktime_get_boottime_ns();
+		delta = ve->start_boottime;
+		break;
+	default:
+		now = delta = 0;
+		WARN_ON_ONCE(1);
+		break;
+	}
+
+	ts = ns_to_timespec64(now - delta);
+	seq_printf(sf, "%lld %ld", ts.tv_sec, ts.tv_nsec);
+	return 0;
+}
+
+static ssize_t ve_ts_write(struct kernfs_open_file *of, char *buf,
+			   size_t nbytes, loff_t off)
+{
+	struct ve_struct *ve = css_to_ve(of_css(of));
+	struct timespec64 delta;
+	u64 delta_ns, now, *target;
+
+	if (sscanf(buf, "%lld %ld", &delta.tv_sec, &delta.tv_nsec) != 2)
+		return -EINVAL;
+	if (!timespec64_valid_strict(&delta))
+		return -EINVAL;
+	delta_ns = timespec64_to_ns(&delta);
+
+	switch (of_cft(of)->private) {
+	case VE_CF_CLOCK_MONOTONIC:
+		now = ktime_get_ns();
+		target = &ve->start_time;
+		break;
+	case VE_CF_CLOCK_BOOTBASED:
+		now = ktime_get_boottime_ns();
+		target = &ve->start_boottime;
+		break;
+	default:
+		WARN_ON_ONCE(1);
+		return -EINVAL;
+	}
+
+	*target = now - delta_ns;
+	return nbytes;
+}
+
 static struct cftype ve_cftypes[] = {
 
 	{
@@ -981,6 +1043,20 @@ static struct cftype ve_cftypes[] = {
 		.read_u64 = ve_reatures_read,
 		.write_u64 = ve_reatures_write,
 	},
+	{
+		.name = "clock_monotonic",
+		.flags = CFTYPE_NOT_ON_ROOT,
+		.seq_show = ve_ts_read,
+		.write = ve_ts_write,
+		.private = VE_CF_CLOCK_MONOTONIC,
+	},
+	{
+		.name = "clock_bootbased",
+		.flags = CFTYPE_NOT_ON_ROOT,
+		.seq_show = ve_ts_read,
+		.write = ve_ts_write,
+		.private = VE_CF_CLOCK_BOOTBASED,
+	},
 	{
 		.name = "netns_max_nr",
 		.flags = CFTYPE_NOT_ON_ROOT,
-- 
2.30.2