| .\" Copyright (c) 2020 by Michael Kerrisk <mtk.manpages@gmail.com> |
| .\" |
| .\" %%%LICENSE_START(VERBATIM) |
| .\" Permission is granted to make and distribute verbatim copies of this |
| .\" manual provided the copyright notice and this permission notice are |
| .\" preserved on all copies. |
| .\" |
| .\" Permission is granted to copy and distribute modified versions of this |
| .\" manual under the conditions for verbatim copying, provided that the |
| .\" entire resulting derived work is distributed under the terms of a |
| .\" permission notice identical to this one. |
| .\" |
| .\" Since the Linux kernel and libraries are constantly changing, this |
| .\" manual page may be incorrect or out-of-date. The author(s) assume no |
| .\" responsibility for errors or omissions, or for damages resulting from |
| .\" the use of the information contained herein. The author(s) may not |
| .\" have taken the same level of care in the production of this manual, |
| .\" which is licensed free of charge, as they might when working |
| .\" professionally. |
| .\" |
| .\" Formatted or processed versions of this manual, if unaccompanied by |
| .\" the source, must acknowledge the copyright and authors of this work. |
| .\" %%%LICENSE_END |
| .\" |
| .\" |
| .TH TIME_NAMESPACES 7 2020-06-09 "Linux" "Linux Programmer's Manual" |
| .SH NAME |
| time_namespaces \- overview of Linux time namespaces |
| .SH DESCRIPTION |
| Time namespaces virtualize the values of two system clocks: |
| .IP \(bu 2 |
| .BR CLOCK_MONOTONIC |
| (and likewise |
| .BR CLOCK_MONOTONIC_COARSE |
| and |
| .BR CLOCK_MONOTONIC_RAW ), |
| a nonsettable clock that represents monotonic time since\(emas |
| described by POSIX\(em"some unspecified point in the past". |
| .IP \(bu |
| .BR CLOCK_BOOTTIME |
| (and likewise |
| .BR CLOCK_BOOTTIME_ALARM ), |
| a nonsettable clock that is identical to |
| .BR CLOCK_MONOTONIC , |
| except that it also includes any time that the system is suspended. |
| .PP |
| Thus, the processes in a time namespace share per-namespace values |
| for these clocks. |
| This affects various APIs that measure against these clocks, including: |
| .BR clock_gettime (2), |
| .BR clock_nanosleep (2), |
| .BR nanosleep (2), |
| .BR timer_settime (2), |
| .BR timerfd_settime (2), |
| and |
| .IR /proc/uptime . |
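| .PP |
| As an illustration, the following Python fragment is a minimal sketch |
| that reads the virtualized clocks through the |
| .BR clock_gettime (2) |
| wrapper in Python's |
| .I time |
| module. |
| The helper name |
| .I read_namespace_clocks |
| is hypothetical and not part of any API. |
| Run by a process inside a time namespace, |
| it should report the per-namespace values of these clocks. |
| .PP |
| .in +4n |
| .EX |
```python
import time

# Clock names taken from this page. A clock that the local Python
# build or kernel does not provide is skipped, not treated as fatal.
CLOCK_NAMES = (
    "CLOCK_MONOTONIC",
    "CLOCK_MONOTONIC_RAW",
    "CLOCK_BOOTTIME",
)


def read_namespace_clocks():
    """Return {clock name: seconds} as seen by the calling process."""
    readings = {}
    for name in CLOCK_NAMES:
        clock_id = getattr(time, name, None)
        if clock_id is None:
            continue
        try:
            readings[name] = time.clock_gettime(clock_id)
        except OSError:
            continue
    return readings


if __name__ == "__main__":
    for name, seconds in read_namespace_clocks().items():
        print(f"{name}: {seconds:.3f}")
```
| .EE |
| .in |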
| .PP |
| Currently, the only way to create a time namespace is by calling |
| .BR unshare (2) |
| with the |
| .BR CLONE_NEWTIME |
| flag. |
| This call creates a new time namespace but does |
| .I not |
| place the calling process in the new namespace. |
| Instead, the calling process's |
| subsequently created children are placed in the new namespace. |
| This allows clock offsets (see below) for the new namespace |
| to be set before the first process is placed in the namespace. |
| The |
| .IR /proc/[pid]/ns/time_for_children |
| symbolic link shows the time namespace in which |
| the children of a process will be created. |
| (A process can use a file descriptor opened on |
| this symbolic link in a call to |
| .BR setns (2) |
| in order to move into the namespace.) |
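| .PP |
| The behavior described above can be sketched in Python as follows. |
| This is an illustration only: it assumes a Linux system, calls |
| .BR unshare (2) |
| through |
| .IR ctypes , |
| and uses the hypothetical helper names |
| .I unshare_time_namespace |
| and |
| .IR time_namespace_links . |
| The call normally requires privilege, so failure is reported rather than |
| treated as a bug. |
| .PP |
| .in +4n |
| .EX |
```python
import ctypes
import os

CLONE_NEWTIME = 0x00000080  # value defined in <linux/sched.h>


def unshare_time_namespace():
    """Create a new time namespace for future children of the caller."""
    libc = ctypes.CDLL(None, use_errno=True)
    if libc.unshare(CLONE_NEWTIME) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def time_namespace_links(pid="self"):
    """Return the (time, time_for_children) link targets for pid."""
    base = f"/proc/{pid}/ns"
    return (
        os.readlink(f"{base}/time"),
        os.readlink(f"{base}/time_for_children"),
    )


if __name__ == "__main__":
    try:
        unshare_time_namespace()
        own, for_children = time_namespace_links()
        # The caller stays where it was; only its children move.
        print("caller:  ", own)
        print("children:", for_children)
    except OSError as exc:
        print("time namespaces unavailable:", exc)
```
| .EE |
| .in |
| .PP |
| After a successful call, the two links are expected to differ, |
| because only subsequently created children are placed in the new namespace. |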
| .\" |
| .SS /proc/PID/timens_offsets |
| Associated with each time namespace are offsets, |
| expressed with respect to the initial time namespace, |
| that define the values of the monotonic and |
| boot-time clocks in that namespace. |
| These offsets are exposed via the file |
| .IR /proc/PID/timens_offsets . |
| Within this file, |
| the offsets are expressed as lines consisting of |
| three space-delimited fields: |
| .PP |
| .in +4n |
| .EX |
| <clock-id> <offset-secs> <offset-nanosecs> |
| .EE |
| .in |
| .PP |
| The |
| .I clock-id |
| is a string that identifies the clock whose offsets are being shown. |
| This field is either |
| .IR monotonic , |
| for |
| .BR CLOCK_MONOTONIC , |
| or |
| .IR boottime , |
| for |
| .BR CLOCK_BOOTTIME . |
| The remaining fields express the offset (seconds plus nanoseconds) for the |
| clock in this time namespace. |
| These offsets are expressed relative to the clock values in |
| the initial time namespace. |
| The |
| .I offset-secs |
| value can be negative, subject to restrictions noted below; |
| .I offset-nanosecs |
| is an unsigned value. |
| .PP |
| In the initial time namespace, the contents of the |
| .I timens_offsets |
| file are as follows: |
| .PP |
| .in +4n |
| .EX |
| $ \fBcat /proc/self/timens_offsets\fP |
| monotonic 0 0 |
| boottime 0 0 |
| .EE |
| .in |
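| .PP |
| The record format described above is simple to parse. |
| The following Python fragment is a minimal sketch of such a parser; |
| the function names |
| .I parse_timens_offsets |
| and |
| .I read_timens_offsets |
| are hypothetical. |
| It splits on any run of white space, |
| so it does not depend on how the fields are padded. |
| .PP |
| .in +4n |
| .EX |
```python
from typing import Dict, Tuple


def parse_timens_offsets(text: str) -> Dict[str, Tuple[int, int]]:
    """Map each clock-id to its (offset-secs, offset-nanosecs) pair."""
    offsets = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        clock_id, secs, nsecs = line.split()
        offsets[clock_id] = (int(secs), int(nsecs))
    return offsets


def read_timens_offsets(pid="self"):
    """Read and parse /proc/PID/timens_offsets for the given PID."""
    with open(f"/proc/{pid}/timens_offsets") as f:
        return parse_timens_offsets(f.read())
```
| .EE |
| .in |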
| .PP |
| In a new time namespace that has had no member processes, |
| the clock offsets can be modified by writing newline-terminated |
| records of the same form to the |
| .I timens_offsets |
| file. |
| The file can be written to multiple times, |
| but after the first process has been created in or has entered the namespace, |
| .BR write (2)s |
| on this file fail with the error |
| .BR EACCES . |
| In order to write to the |
| .IR timens_offsets |
| file, a process must have the |
| .BR CAP_SYS_TIME |
| capability in the user namespace that owns the time namespace. |
| .PP |
| Writes to the |
| .I timens_offsets |
| file can fail with the following errors: |
| .TP |
| .B EINVAL |
| An |
| .I offset-nanosecs |
| value is greater than 999,999,999. |
| .TP |
| .B EINVAL |
| A |
| .I clock-id |
| value is not valid. |
| .TP |
| .B EPERM |
| The caller does not have the |
| .BR CAP_SYS_TIME |
| capability. |
| .TP |
| .B ERANGE |
| An |
| .I offset-secs |
| value is out of range. |
| In particular: |
| .RS |
| .IP \(bu 2 |
| .I offset-secs |
| can't be set to a value which would make the current |
| time on the corresponding clock inside the namespace a negative value; and |
| .IP \(bu |
| .I offset-secs |
| can't be set to a value such that the time on the corresponding clock |
| inside the namespace would exceed half of the value of the kernel constant |
| .BR KTIME_SEC_MAX |
| (this limits the clock value to a maximum of approximately 146 years). |
| .RE |
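| .PP |
| The value checks listed above can be approximated in user space. |
| The following Python fragment is a sketch under stated assumptions, |
| not the kernel's implementation: |
| the helper name |
| .I check_offset |
| is hypothetical, |
| the value of |
| .B KTIME_SEC_MAX |
| is assumed from the kernel sources, |
| and the permission-related errors |
| .RB ( EPERM , |
| .BR EACCES ) |
| are not modeled. |
| .PP |
| .in +4n |
| .EX |
```python
NSEC_PER_SEC = 1_000_000_000
# Assumption taken from the kernel sources: KTIME_MAX is 2**63 - 1
# nanoseconds, and KTIME_SEC_MAX is that value in whole seconds.
KTIME_SEC_MAX = (2**63 - 1) // NSEC_PER_SEC
VALID_CLOCK_IDS = {"monotonic", "boottime"}


def check_offset(clock_id, offset_secs, offset_nsecs, clock_now_ns):
    """Return None if the record looks acceptable, else an errno name.

    clock_now_ns is the current value, in nanoseconds, of the
    corresponding clock in the initial time namespace.
    """
    if clock_id not in VALID_CLOCK_IDS:
        return "EINVAL"
    # offset-nanosecs is unsigned and at most 999,999,999.
    if not 0 <= offset_nsecs <= 999_999_999:
        return "EINVAL"
    total_ns = clock_now_ns + offset_secs * NSEC_PER_SEC + offset_nsecs
    if total_ns < 0:
        return "ERANGE"  # clock in the namespace would be negative
    if total_ns // NSEC_PER_SEC > KTIME_SEC_MAX // 2:
        return "ERANGE"  # clock would exceed about 146 years
    return None
```
| .EE |
| .in |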
| .PP |
| In a new time namespace created by |
| .BR unshare (2), |
| the contents of the |
| .I timens_offsets |
| file are inherited from the time namespace of the creating process. |
| .SH NOTES |
| .PP |
| Use of time namespaces requires a kernel that is configured with the |
| .B CONFIG_TIME_NS |
| option. |
| .PP |
| Note that time namespaces do not virtualize the |
| .BR CLOCK_REALTIME |
| clock. |
| Virtualization of this clock was avoided for reasons of complexity |
| and overhead within the kernel. |
| .PP |
| For compatibility with the initial implementation, when writing a |
| .I clock-id |
| to the |
| .IR /proc/[pid]/timens_offsets |
| file, the numerical values of the IDs can be written |
| instead of the symbolic names shown above; i.e., 1 instead of |
| .IR monotonic , |
| and 7 instead of |
| .IR boottime . |
| For readability, the use of the symbolic names over the numbers is preferred. |
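| .PP |
| A tool that accepts either form might normalize to the symbolic names |
| before writing a record. |
| The following Python fragment is a small sketch of that idea; |
| the helper name |
| .I symbolic_clock_id |
| is hypothetical. |
| .PP |
| .in +4n |
| .EX |
```python
# Numeric IDs accepted for compatibility, as described above:
# 1 is CLOCK_MONOTONIC and 7 is CLOCK_BOOTTIME.
NUMERIC_CLOCK_IDS = {"1": "monotonic", "7": "boottime"}


def symbolic_clock_id(clock_id):
    """Return the preferred symbolic name for a clock-id field."""
    clock_id = str(clock_id)
    return NUMERIC_CLOCK_IDS.get(clock_id, clock_id)
```
| .EE |
| .in |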
| .PP |
| The motivation for adding time namespaces was to allow |
| the monotonic and boot-time clocks to maintain consistent values |
| during container migration and checkpoint/restore. |
| .SH EXAMPLES |
| .PP |
| The following shell session demonstrates the operation of time namespaces. |
| We begin by displaying the inode number of the time namespace |
| of a shell in the initial time namespace: |
| .PP |
| .in +4n |
| .EX |
| $ \fBreadlink /proc/$$/ns/time\fP |
| time:[4026531834] |
| .EE |
| .in |
| .PP |
| Continuing in the initial time namespace, we display the system uptime using |
| .BR uptime (1) |
| and use the |
| .I clock_times |
| example program shown in |
| .BR clock_getres (2) |
| to display the values of various clocks: |
| .PP |
| .in +4n |
| .EX |
| $ \fBuptime \-\-pretty\fP |
| up 21 hours, 17 minutes |
| $ \fB./clock_times\fP |
| CLOCK_REALTIME : 1585989401.971 (18356 days + 8h 36m 41s) |
| CLOCK_TAI : 1585989438.972 (18356 days + 8h 37m 18s) |
| CLOCK_MONOTONIC: 56338.247 (15h 38m 58s) |
| CLOCK_BOOTTIME : 76633.544 (21h 17m 13s) |
| .EE |
| .in |
| .PP |
| We then use |
| .BR unshare (1) |
| to create a time namespace and execute a |
| .BR bash (1) |
| shell. |
| From the new shell, we use the built-in |
| .B echo |
| command to write records to the |
| .I timens_offsets |
| file adjusting the offset for the |
| .B CLOCK_MONOTONIC |
| clock forward 2 days |
| and the offset for the |
| .B CLOCK_BOOTTIME |
| clock forward 7 days: |
| .PP |
| .in +4n |
| .EX |
| $ \fBPS1="ns2# " sudo unshare \-T \-\- bash \-\-norc\fP |
| ns2# \fBecho "monotonic $((2*24*60*60)) 0" > /proc/$$/timens_offsets\fP |
| ns2# \fBecho "boottime $((7*24*60*60)) 0" > /proc/$$/timens_offsets\fP |
| .EE |
| .in |
| .PP |
| Above, we started the |
| .BR bash (1) |
| shell with the |
| .B \-\-norc |
| option so that no start-up scripts were executed. |
| This ensures that no child processes are created from the |
| shell before we have a chance to update the |
| .I timens_offsets |
| file. |
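| .PP |
| The same records could be produced from a program rather than from the shell. |
| The following Python fragment is a minimal sketch; |
| the helper name |
| .I write_offsets |
| is hypothetical, |
| and the target path is a parameter so that the function can be |
| tried against an ordinary file. |
| .PP |
| .in +4n |
| .EX |
```python
SECONDS_PER_DAY = 24 * 60 * 60


def write_offsets(path, offsets_in_days):
    """Write one timens_offsets record per (clock-id, days) pair."""
    with open(path, "w") as f:
        for clock_id, days in offsets_in_days:
            # print() appends the newline that terminates the record;
            # flush=True pushes each record out before the next one.
            print(clock_id, days * SECONDS_PER_DAY, 0, file=f, flush=True)
```
| .EE |
| .in |
| .PP |
| To affect a real namespace, this would have to be called with the path of the |
| .I timens_offsets |
| file by the process that created the namespace, |
| before that process creates any children. |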
| .PP |
| We then use |
| .BR cat (1) |
| to display the contents of the |
| .I timens_offsets |
| file. |
| The execution of |
| .BR cat (1) |
| creates the first process in the new time namespace, |
| after which further attempts to update the |
| .I timens_offsets |
| file produce an error. |
| .PP |
| .in +4n |
| .EX |
| ns2# \fBcat /proc/$$/timens_offsets\fP |
| monotonic 172800 0 |
| boottime 604800 0 |
| ns2# \fBecho "boottime $((9*24*60*60)) 0" > /proc/$$/timens_offsets\fP |
| bash: echo: write error: Permission denied |
| .EE |
| .in |
| .PP |
| Continuing in the new namespace, we execute |
| .BR uptime (1) |
| and the |
| .I clock_times |
| example program: |
| .PP |
| .in +4n |
| .EX |
| ns2# \fBuptime \-\-pretty\fP |
| up 1 week, 21 hours, 18 minutes |
| ns2# \fB./clock_times\fP |
| CLOCK_REALTIME : 1585989457.056 (18356 days + 8h 37m 37s) |
| CLOCK_TAI : 1585989494.057 (18356 days + 8h 38m 14s) |
| CLOCK_MONOTONIC: 229193.332 (2 days + 15h 39m 53s) |
| CLOCK_BOOTTIME : 681488.629 (7 days + 21h 18m 8s) |
| .EE |
| .in |
| .PP |
| From the above output, we can see that the monotonic |
| and boot-time clocks in the new time namespace are ahead of those in the |
| initial time namespace by the configured offsets of 2 days and 7 days. |
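| .PP |
| The relationship between the two sets of readings can be checked by hand. |
| The following Python fragment is a worked example using only the values |
| printed above; it assumes that the system was not suspended and that |
| .B CLOCK_REALTIME |
| was not adjusted between the two runs. |
| The function name |
| .I expected_namespace_clock |
| is hypothetical. |
| .PP |
| .in +4n |
| .EX |
```python
from decimal import Decimal


def expected_namespace_clock(initial_value, offset_secs, elapsed):
    """Clock value expected in the namespace after `elapsed` seconds."""
    return initial_value + offset_secs + elapsed


# Readings copied from the two clock_times runs shown above.
# CLOCK_REALTIME is not virtualized, so it gives the elapsed time.
elapsed = Decimal("1585989457.056") - Decimal("1585989401.971")
monotonic = expected_namespace_clock(Decimal("56338.247"), 172800, elapsed)
boottime = expected_namespace_clock(Decimal("76633.544"), 604800, elapsed)

print(monotonic)  # 229193.332
print(boottime)   # 681488.629
```
| .EE |
| .in |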
| .PP |
| Examining the |
| .I /proc/[pid]/ns/time |
| and |
| .I /proc/[pid]/ns/time_for_children |
| symbolic links, we see that the shell is a member of the initial time |
| namespace, but its children are created in the new namespace. |
| .PP |
| .in +4n |
| .EX |
| ns2# \fBreadlink /proc/$$/ns/time\fP |
| time:[4026531834] |
| ns2# \fBreadlink /proc/$$/ns/time_for_children\fP |
| time:[4026532900] |
| ns2# \fBreadlink /proc/self/ns/time\fP # Creates a child process |
| time:[4026532900] |
| .EE |
| .in |
| .PP |
| Returning to the shell in the initial time namespace, |
| we see that the monotonic and boot-time clocks |
| are unaffected by the |
| .I timens_offsets |
| changes that were made in the other time namespace: |
| .PP |
| .in +4n |
| .EX |
| $ \fBuptime \-\-pretty\fP |
| up 21 hours, 19 minutes |
| $ \fB./clock_times\fP |
| CLOCK_REALTIME : 1585989401.971 (18356 days + 8h 38m 51s) |
| CLOCK_TAI : 1585989438.972 (18356 days + 8h 39m 28s) |
| CLOCK_MONOTONIC: 56338.247 (15h 41m 8s) |
| CLOCK_BOOTTIME : 76633.544 (21h 19m 23s) |
| .EE |
| .in |
| .SH SEE ALSO |
| .BR nsenter (1), |
| .BR unshare (1), |
| .BR clock_settime (2), |
| .\" clone3() support for time namespaces is a work in progress |
| .\" .BR clone3 (2), |
| .BR setns (2), |
| .BR unshare (2), |
| .BR namespaces (7), |
| .BR time (7) |