| .\" Copyright (c) 1993 Michael Haardt <michael@moria.de> |
| .\" Fri Apr 2 11:32:09 MET DST 1993 |
| .\" |
| .\" and changes Copyright (C) 1999 Mike Coleman (mkc@acm.org) |
| .\" -- major revision to fully document ptrace semantics per recent Linux |
| .\" kernel (2.2.10) and glibc (2.1.2) |
| .\" Sun Nov 7 03:18:35 CST 1999 |
| .\" |
| .\" and Copyright (c) 2011, Denys Vlasenko <vda.linux@googlemail.com> |
| .\" and Copyright (c) 2015, 2016, Michael Kerrisk <mtk.manpages@gmail.com> |
| .\" |
| .\" %%%LICENSE_START(GPLv2+_DOC_FULL) |
| .\" This is free documentation; you can redistribute it and/or |
| .\" modify it under the terms of the GNU General Public License as |
| .\" published by the Free Software Foundation; either version 2 of |
| .\" the License, or (at your option) any later version. |
| .\" |
| .\" The GNU General Public License's references to "object code" |
| .\" and "executables" are to be interpreted as the output of any |
| .\" document formatting or typesetting system, including |
| .\" intermediate and printed output. |
| .\" |
| .\" This manual is distributed in the hope that it will be useful, |
| .\" but WITHOUT ANY WARRANTY; without even the implied warranty of |
| .\" MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the |
| .\" GNU General Public License for more details. |
| .\" |
| .\" You should have received a copy of the GNU General Public |
| .\" License along with this manual; if not, see |
| .\" <http://www.gnu.org/licenses/>. |
| .\" %%%LICENSE_END |
| .\" |
| .\" Modified Fri Jul 23 23:47:18 1993 by Rik Faith <faith@cs.unc.edu> |
| .\" Modified Fri Jan 31 16:46:30 1997 by Eric S. Raymond <esr@thyrsus.com> |
| .\" Modified Thu Oct 7 17:28:49 1999 by Andries Brouwer <aeb@cwi.nl> |
| .\" Modified, 27 May 2004, Michael Kerrisk <mtk.manpages@gmail.com> |
| .\" Added notes on capability requirements |
| .\" |
| .\" 2006-03-24, Chuck Ebbert <76306.1226@compuserve.com> |
| .\" Added PTRACE_SETOPTIONS, PTRACE_GETEVENTMSG, PTRACE_GETSIGINFO, |
| .\" PTRACE_SETSIGINFO, PTRACE_SYSEMU, PTRACE_SYSEMU_SINGLESTEP |
| .\" (Thanks to Blaisorblade, Daniel Jacobowitz and others who helped.) |
| .\" 2011-09, major update by Denys Vlasenko <vda.linux@googlemail.com> |
| .\" 2015-01, Kees Cook <keescook@chromium.org> |
| .\" Added PTRACE_O_TRACESECCOMP, PTRACE_EVENT_SECCOMP |
| .\" |
| .\" FIXME The following are undocumented: |
| .\" |
| .\" PTRACE_GETWMMXREGS |
| .\" PTRACE_SETWMMXREGS |
| .\" ARM |
| .\" Linux 2.6.12 |
| .\" |
| .\" PTRACE_SET_SYSCALL |
| .\" ARM and ARM64 |
| .\" Linux 2.6.16 |
| .\" commit 3f471126ee53feb5e9b210ea2f525ed3bb9b7a7f |
| .\" Author: Nicolas Pitre <nico@cam.org> |
| .\" Date: Sat Jan 14 19:30:04 2006 +0000 |
| .\" |
| .\" PTRACE_GETCRUNCHREGS |
| .\" PTRACE_SETCRUNCHREGS |
| .\" ARM |
| .\" Linux 2.6.18 |
| .\" commit 3bec6ded282b331552587267d67a06ed7fd95ddd |
| .\" Author: Lennert Buytenhek <buytenh@wantstofly.org> |
| .\" Date: Tue Jun 27 22:56:18 2006 +0100 |
| .\" |
| .\" PTRACE_GETVFPREGS |
| .\" PTRACE_SETVFPREGS |
| .\" ARM and ARM64 |
| .\" Linux 2.6.30 |
| .\" commit 3d1228ead618b88e8606015cbabc49019981805d |
| .\" Author: Catalin Marinas <catalin.marinas@arm.com> |
| .\" Date: Wed Feb 11 13:12:56 2009 +0100 |
| .\" |
| .\" PTRACE_GETHBPREGS |
| .\" PTRACE_SETHBPREGS |
| .\" ARM and ARM64 |
| .\" Linux 2.6.37 |
| .\" commit 864232fa1a2f8dfe003438ef0851a56722740f3e |
| .\" Author: Will Deacon <will.deacon@arm.com> |
| .\" Date: Fri Sep 3 10:42:55 2010 +0100 |
| .\" |
| .\" PTRACE_SINGLEBLOCK |
| .\" Since at least Linux 2.4.0 on various architectures |
| .\" Since Linux 2.6.25 on x86 (and others?) |
| .\" commit 5b88abbf770a0e1975c668743100f42934f385e8 |
| .\" Author: Roland McGrath <roland@redhat.com> |
| .\" Date: Wed Jan 30 13:30:53 2008 +0100 |
| .\" ptrace: generic PTRACE_SINGLEBLOCK |
| .\" |
| .\" PTRACE_GETFPXREGS |
| .\" PTRACE_SETFPXREGS |
| .\" Since at least Linux 2.4.0 on various architectures |
| .\" |
| .\" PTRACE_GETFDPIC |
| .\" PTRACE_GETFDPIC_EXEC |
| .\" PTRACE_GETFDPIC_INTERP |
| .\" blackfin, c6x, frv, sh |
| .\" First appearance in Linux 2.6.11 on frv |
| .\" |
| .\" and others that can be found in the arch/*/include/uapi/asm/ptrace files |
| .\" |
| .TH PTRACE 2 2020-02-09 "Linux" "Linux Programmer's Manual" |
| .SH NAME |
| ptrace \- process trace |
| .SH SYNOPSIS |
| .nf |
| .B #include <sys/ptrace.h> |
| .PP |
| .BI "long ptrace(enum __ptrace_request " request ", pid_t " pid ", " |
| .BI " void *" addr ", void *" data ); |
| .fi |
| .SH DESCRIPTION |
| The |
| .BR ptrace () |
| system call provides a means by which one process (the "tracer") |
| may observe and control the execution of another process (the "tracee"), |
| and examine and change the tracee's memory and registers. |
| It is primarily used to implement breakpoint debugging and system |
| call tracing. |
| .PP |
| A tracee first needs to be attached to the tracer. |
| Attachment and subsequent commands are per thread: |
| in a multithreaded process, |
| every thread can be individually attached to a |
| (potentially different) tracer, |
| or left not attached and thus not debugged. |
| Therefore, "tracee" always means "(one) thread", |
| never "a (possibly multithreaded) process". |
| Ptrace commands are always sent to |
| a specific tracee using a call of the form |
| .PP |
| ptrace(PTRACE_foo, pid, ...) |
| .PP |
| where |
| .I pid |
| is the thread ID of the corresponding Linux thread. |
| .PP |
| (Note that in this page, a "multithreaded process" |
| means a thread group consisting of threads created using the |
| .BR clone (2) |
| .B CLONE_THREAD |
| flag.) |
| .PP |
| A process can initiate a trace by calling |
| .BR fork (2) |
| and having the resulting child do a |
| .BR PTRACE_TRACEME , |
| followed (typically) by an |
| .BR execve (2). |
| Alternatively, one process may commence tracing another process using |
| .B PTRACE_ATTACH |
| or |
| .BR PTRACE_SEIZE . |
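| .PP |
| As a minimal sketch of the first approach, and not a definitive implementation, |
| the following C fragment forks a child that issues |
| .B PTRACE_TRACEME |
| and then stops itself so that the parent can take control. |
| The helper name |
| .IR run_traced_child () |
| is hypothetical, and the child simply exits where a real program |
| would typically call |
| .BR execve (2). |

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run a child under ptrace and return its exit
   code, or -1 if tracing could not be set up. */
static int run_traced_child(int exit_code)
{
    int status;
    pid_t child = fork();

    if (child == -1)
        return -1;

    if (child == 0) {
        /* Tracee: ask to be traced by the parent, then stop so that
           the tracer can take control. */
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(127);
        raise(SIGSTOP);
        _exit(exit_code);
    }

    /* Tracer: the first report is the stop caused by SIGSTOP. */
    if (waitpid(child, &status, 0) == -1 || !WIFSTOPPED(status))
        return -1;

    /* Resume the tracee; data == 0 means SIGSTOP is not delivered. */
    if (ptrace(PTRACE_CONT, child, NULL, NULL) == -1) {
        kill(child, SIGKILL);
        waitpid(child, &status, 0);
        return -1;
    }

    /* The next report is the normal exit notification. */
    if (waitpid(child, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```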
| .PP |
| While being traced, the tracee will stop each time a signal is delivered, |
| even if the signal is being ignored. |
| (An exception is |
| .BR SIGKILL , |
| which has its usual effect.) |
| The tracer will be notified at its next call to |
| .BR waitpid (2) |
| (or one of the related "wait" system calls); that call will return a |
| .I status |
| value containing information that indicates |
| the cause of the stop in the tracee. |
| While the tracee is stopped, |
| the tracer can use various ptrace requests to inspect and modify the tracee. |
| The tracer then causes the tracee to continue, |
| optionally ignoring the delivered signal |
| (or even delivering a different signal instead). |
| .PP |
| If the |
| .B PTRACE_O_TRACEEXEC |
| option is not in effect, all successful calls to |
| .BR execve (2) |
| by the traced process will cause it to be sent a |
| .B SIGTRAP |
| signal, |
| giving the parent a chance to gain control before the new program |
| begins execution. |
| .PP |
| When the tracer is finished tracing, it can cause the tracee to continue |
| executing in a normal, untraced mode via |
| .BR PTRACE_DETACH . |
| .PP |
| The value of |
| .I request |
| determines the action to be performed: |
| .TP |
| .B PTRACE_TRACEME |
| Indicate that this process is to be traced by its parent. |
| A process probably shouldn't make this request if its parent |
| isn't expecting to trace it. |
| .RI ( pid , |
| .IR addr , |
| and |
| .IR data |
| are ignored.) |
| .IP |
| The |
| .B PTRACE_TRACEME |
| request is used only by the tracee; |
| the remaining requests are used only by the tracer. |
| In the following requests, |
| .I pid |
| specifies the thread ID of the tracee to be acted on. |
| For requests other than |
| .BR PTRACE_ATTACH , |
| .BR PTRACE_SEIZE , |
| .BR PTRACE_INTERRUPT , |
| and |
| .BR PTRACE_KILL , |
| the tracee must be stopped. |
| .TP |
| .BR PTRACE_PEEKTEXT ", " PTRACE_PEEKDATA |
| Read a word at the address |
| .I addr |
| in the tracee's memory, returning the word as the result of the |
| .BR ptrace () |
| call. |
| Linux does not have separate text and data address spaces, |
| so these two requests are currently equivalent. |
| .RI ( data |
| is ignored; but see NOTES.) |
| .TP |
| .B PTRACE_PEEKUSER |
| .\" PTRACE_PEEKUSR in kernel source, but glibc uses PTRACE_PEEKUSER, |
| .\" and that is the name that seems common on other systems. |
| Read a word at offset |
| .I addr |
| in the tracee's USER area, |
| which holds the registers and other information about the process |
| (see |
| .IR <sys/user.h> ). |
| The word is returned as the result of the |
| .BR ptrace () |
| call. |
| Typically, the offset must be word-aligned, though this might vary by |
| architecture. |
| See NOTES. |
| .RI ( data |
| is ignored; but see NOTES.) |
| .TP |
| .BR PTRACE_POKETEXT ", " PTRACE_POKEDATA |
| Copy the word |
| .I data |
| to the address |
| .I addr |
| in the tracee's memory. |
| As for |
| .BR PTRACE_PEEKTEXT |
| and |
| .BR PTRACE_PEEKDATA , |
| these two requests are currently equivalent. |
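| .IP |
| The peek and poke requests can be sketched as follows, |
| under the assumption that tracer and tracee are related by |
| .BR fork (2), |
| so that a global variable has the same address in both. |
| The names |
| .I marker |
| and |
| .IR peek_then_poke () |
| are hypothetical. |
| Because a peeked word of \-1 is valid data, the sketch clears |
| .I errno |
| before the call and checks it afterward. |

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* After fork(2), this variable has the same address in both processes. */
static volatile long marker = 1;

/* Hypothetical helper: read the child's copy of marker into *seen,
   overwrite it with new_value, and return the child's exit code
   (which the child takes from marker), or -1 on failure. */
static int peek_then_poke(long new_value, long *seen)
{
    int status;
    long word;
    pid_t child = fork();

    if (child == -1)
        return -1;

    if (child == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(127);
        marker = 42;            /* changes only the child's copy */
        raise(SIGSTOP);
        _exit((int) marker);    /* reports whatever the tracer poked */
    }

    if (waitpid(child, &status, 0) == -1 || !WIFSTOPPED(status))
        return -1;

    /* The word is returned as the function result, so -1 is ambiguous
       unless errno is inspected as well. */
    errno = 0;
    word = ptrace(PTRACE_PEEKDATA, child, (void *) &marker, NULL);
    if (word == -1 && errno != 0)
        goto fail;
    *seen = word;

    /* For a poke, data carries the word itself, not a pointer to it. */
    if (ptrace(PTRACE_POKEDATA, child, (void *) &marker,
               (void *) new_value) == -1)
        goto fail;
    if (ptrace(PTRACE_CONT, child, NULL, NULL) == -1)
        goto fail;
    if (waitpid(child, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);

fail:
    kill(child, SIGKILL);
    waitpid(child, &status, 0);
    return -1;
}
```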
| .TP |
| .B PTRACE_POKEUSER |
| .\" PTRACE_POKEUSR in kernel source, but glibc uses PTRACE_POKEUSER, |
| .\" and that is the name that seems common on other systems. |
| Copy the word |
| .I data |
| to offset |
| .I addr |
| in the tracee's USER area. |
| As for |
| .BR PTRACE_PEEKUSER , |
| the offset must typically be word-aligned. |
| In order to maintain the integrity of the kernel, |
| some modifications to the USER area are disallowed. |
| .\" FIXME In the preceding sentence, which modifications are disallowed, |
| .\" and when they are disallowed, how does user space discover that fact? |
| .TP |
| .BR PTRACE_GETREGS ", " PTRACE_GETFPREGS |
| Copy the tracee's general-purpose or floating-point registers, |
| respectively, to the address |
| .I data |
| in the tracer. |
| See |
| .I <sys/user.h> |
| for information on the format of this data. |
| .RI ( addr |
| is ignored.) |
| Note that SPARC systems have the meaning of |
| .I data |
| and |
| .I addr |
| reversed; that is, |
| .I data |
| is ignored and the registers are copied to the address |
| .IR addr . |
| .B PTRACE_GETREGS |
| and |
| .B PTRACE_GETFPREGS |
| are not present on all architectures. |
| .TP |
| .BR PTRACE_GETREGSET " (since Linux 2.6.34)" |
| Read the tracee's registers. |
| .I addr |
| specifies, in an architecture-dependent way, the type of registers to be read. |
| .B NT_PRSTATUS |
| (with numerical value 1) |
| usually results in reading of general-purpose registers. |
| If the CPU has, for example, |
| floating-point and/or vector registers, they can be retrieved by setting |
| .I addr |
| to the corresponding |
| .B NT_foo |
| constant. |
| .I data |
| points to a |
| .BR "struct iovec" , |
| which describes the destination buffer's location and length. |
| On return, the kernel modifies the |
| .I iov_len |
| field of that structure to indicate the actual number of bytes returned. |
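| .IP |
| A hedged sketch of reading the general-purpose registers this way follows. |
| It assumes an architecture on which |
| .I <sys/user.h> |
| defines |
| .I struct user_regs_struct |
| (for example, x86-64 or AArch64 with glibc); |
| the helper names |
| .IR read_gp_regs () |
| and |
| .IR regs_of_stopped_child () |
| are hypothetical. |

```c
#define _GNU_SOURCE
#include <elf.h>        /* NT_PRSTATUS */
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/uio.h>    /* struct iovec */
#include <sys/user.h>   /* struct user_regs_struct (assumed to exist) */
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fetch the general-purpose registers of a stopped
   tracee. Returns the number of bytes the kernel wrote, or -1. */
static long read_gp_regs(pid_t tracee, struct user_regs_struct *regs)
{
    struct iovec iov;

    iov.iov_base = regs;
    iov.iov_len = sizeof(*regs);

    /* addr selects the register set; data points to the iovec. */
    if (ptrace(PTRACE_GETREGSET, tracee,
               (void *) (long) NT_PRSTATUS, &iov) == -1)
        return -1;

    /* The kernel has updated iov_len to the number of bytes returned. */
    return (long) iov.iov_len;
}

/* Hypothetical driver: stop a traced child and read its registers. */
static long regs_of_stopped_child(void)
{
    struct user_regs_struct regs;
    int status;
    long len;
    pid_t child = fork();

    if (child == -1)
        return -1;

    if (child == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(127);
        raise(SIGSTOP);
        _exit(0);
    }

    if (waitpid(child, &status, 0) == -1 || !WIFSTOPPED(status))
        return -1;

    len = read_gp_regs(child, &regs);

    kill(child, SIGKILL);
    waitpid(child, &status, 0);
    return len;
}
```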
| .TP |
| .BR PTRACE_SETREGS ", " PTRACE_SETFPREGS |
| Modify the tracee's general-purpose or floating-point registers, |
| respectively, from the address |
| .I data |
| in the tracer. |
| As for |
| .BR PTRACE_POKEUSER , |
| some general-purpose register modifications may be disallowed. |
| .\" FIXME . In the preceding sentence, which modifications are disallowed, |
| .\" and when they are disallowed, how does user space discover that fact? |
| .RI ( addr |
| is ignored.) |
| Note that SPARC systems have the meaning of |
| .I data |
| and |
| .I addr |
| reversed; that is, |
| .I data |
| is ignored and the registers are copied from the address |
| .IR addr . |
| .B PTRACE_SETREGS |
| and |
| .B PTRACE_SETFPREGS |
| are not present on all architectures. |
| .TP |
| .BR PTRACE_SETREGSET " (since Linux 2.6.34)" |
| Modify the tracee's registers. |
| The meaning of |
| .I addr |
| and |
| .I data |
| is analogous to |
| .BR PTRACE_GETREGSET . |
| .TP |
| .BR PTRACE_GETSIGINFO " (since Linux 2.3.99-pre6)" |
| Retrieve information about the signal that caused the stop. |
| Copy a |
| .I siginfo_t |
| structure (see |
| .BR sigaction (2)) |
| from the tracee to the address |
| .I data |
| in the tracer. |
| .RI ( addr |
| is ignored.) |
| .TP |
| .BR PTRACE_SETSIGINFO " (since Linux 2.3.99-pre6)" |
| Set signal information: |
| copy a |
| .I siginfo_t |
| structure from the address |
| .I data |
| in the tracer to the tracee. |
| This will affect only signals that would normally be delivered to |
| the tracee and were caught by the tracer. |
| It may be difficult to tell |
| these normal signals from synthetic signals generated by |
| .BR ptrace () |
| itself. |
| .RI ( addr |
| is ignored.) |
| .TP |
| .BR PTRACE_PEEKSIGINFO " (since Linux 3.10)" |
| .\" commit 84c751bd4aebbaae995fe32279d3dba48327bad4 |
| Retrieve |
| .I siginfo_t |
| structures without removing signals from a queue. |
| .I addr |
| points to a |
| .I ptrace_peeksiginfo_args |
| structure that specifies the ordinal position from which |
| copying of signals should start, |
| and the number of signals to copy. |
| .I siginfo_t |
| structures are copied into the buffer pointed to by |
| .IR data . |
| The return value contains the number of copied signals (zero indicates |
| that there is no signal corresponding to the specified ordinal position). |
| Within the returned |
| .I siginfo_t |
| structures, |
| the |
| .IR si_code |
| field includes information |
| .RB ( __SI_CHLD , |
| .BR __SI_FAULT , |
| etc.) that is not otherwise exposed to user space. |
| .PP |
| .in +4n |
| .EX |
| struct ptrace_peeksiginfo_args { |
| u64 off; /* Ordinal position in queue at which |
| to start copying signals */ |
| u32 flags; /* PTRACE_PEEKSIGINFO_SHARED or 0 */ |
| s32 nr; /* Number of signals to copy */ |
| }; |
| .EE |
| .in |
| .IP |
| Currently, there is only one flag, |
| .BR PTRACE_PEEKSIGINFO_SHARED , |
| for dumping signals from the process-wide signal queue. |
| If this flag is not set, |
| signals are read from the per-thread queue of the specified thread. |
| .in |
| .PP |
| .TP |
| .BR PTRACE_GETSIGMASK " (since Linux 3.11)" |
| .\" commit 29000caecbe87b6b66f144f72111f0d02fbbf0c1 |
| Place a copy of the mask of blocked signals (see |
| .BR sigprocmask (2)) |
| in the buffer pointed to by |
| .IR data , |
| which should be a pointer to a buffer of type |
| .IR sigset_t . |
| The |
| .I addr |
| argument contains the size of the buffer pointed to by |
| .IR data |
| (i.e., |
| .IR sizeof(sigset_t) ). |
| .TP |
| .BR PTRACE_SETSIGMASK " (since Linux 3.11)" |
| Change the mask of blocked signals (see |
| .BR sigprocmask (2)) |
| to the value specified in the buffer pointed to by |
| .IR data , |
| which should be a pointer to a buffer of type |
| .IR sigset_t . |
| The |
| .I addr |
| argument contains the size of the buffer pointed to by |
| .IR data |
| (i.e., |
| .IR sizeof(sigset_t) ). |
| .TP |
| .BR PTRACE_SETOPTIONS " (since Linux 2.4.6; see BUGS for caveats)" |
| Set ptrace options from |
| .IR data . |
| .RI ( addr |
| is ignored.) |
| .IR data |
| is interpreted as a bit mask of options, |
| which are specified by the following flags: |
| .RS |
| .TP |
| .BR PTRACE_O_EXITKILL " (since Linux 3.8)" |
| .\" commit 992fb6e170639b0849bace8e49bf31bd37c4123 |
| Send a |
| .B SIGKILL |
| signal to the tracee if the tracer exits. |
| This option is useful for ptrace jailers that |
| want to ensure that tracees can never escape the tracer's control. |
| .TP |
| .BR PTRACE_O_TRACECLONE " (since Linux 2.5.46)" |
| Stop the tracee at the next |
| .BR clone (2) |
| and automatically start tracing the newly cloned process, |
| which will start with a |
| .BR SIGSTOP , |
| or |
| .B PTRACE_EVENT_STOP |
| if |
| .B PTRACE_SEIZE |
| was used. |
| A |
| .BR waitpid (2) |
| by the tracer will return a |
| .I status |
| value such that |
| .IP |
| .nf |
| status>>8 == (SIGTRAP | (PTRACE_EVENT_CLONE<<8)) |
| .fi |
| .IP |
| The PID of the new process can be retrieved with |
| .BR PTRACE_GETEVENTMSG . |
| .IP |
| This option may not catch |
| .BR clone (2) |
| calls in all cases. |
| If the tracee calls |
| .BR clone (2) |
| with the |
| .B CLONE_VFORK |
| flag, |
| .B PTRACE_EVENT_VFORK |
| will be delivered instead |
| if |
| .B PTRACE_O_TRACEVFORK |
| is set; otherwise if the tracee calls |
| .BR clone (2) |
| with the exit signal set to |
| .BR SIGCHLD , |
| .B PTRACE_EVENT_FORK |
| will be delivered if |
| .B PTRACE_O_TRACEFORK |
| is set. |
| .TP |
| .BR PTRACE_O_TRACEEXEC " (since Linux 2.5.46)" |
| Stop the tracee at the next |
| .BR execve (2). |
| A |
| .BR waitpid (2) |
| by the tracer will return a |
| .I status |
| value such that |
| .IP |
| .nf |
| status>>8 == (SIGTRAP | (PTRACE_EVENT_EXEC<<8)) |
| .fi |
| .IP |
| If the execing thread is not a thread group leader, |
| the thread ID is reset to the thread group leader's ID before this stop. |
| Since Linux 3.0, the former thread ID can be retrieved with |
| .BR PTRACE_GETEVENTMSG . |
| .TP |
| .BR PTRACE_O_TRACEEXIT " (since Linux 2.5.60)" |
| Stop the tracee at exit. |
| A |
| .BR waitpid (2) |
| by the tracer will return a |
| .I status |
| value such that |
| .IP |
| .nf |
| status>>8 == (SIGTRAP | (PTRACE_EVENT_EXIT<<8)) |
| .fi |
| .IP |
| The tracee's exit status can be retrieved with |
| .BR PTRACE_GETEVENTMSG . |
| .IP |
| The tracee is stopped early during process exit, |
| when registers are still available, |
| allowing the tracer to see where the exit occurred, |
| whereas the normal exit notification is done after the process |
| is finished exiting. |
| Even though context is available, |
| the tracer cannot prevent the exit from happening at this point. |
| .TP |
| .BR PTRACE_O_TRACEFORK " (since Linux 2.5.46)" |
| Stop the tracee at the next |
| .BR fork (2) |
| and automatically start tracing the newly forked process, |
| which will start with a |
| .BR SIGSTOP , |
| or |
| .B PTRACE_EVENT_STOP |
| if |
| .B PTRACE_SEIZE |
| was used. |
| A |
| .BR waitpid (2) |
| by the tracer will return a |
| .I status |
| value such that |
| .IP |
| .nf |
| status>>8 == (SIGTRAP | (PTRACE_EVENT_FORK<<8)) |
| .fi |
| .IP |
| The PID of the new process can be retrieved with |
| .BR PTRACE_GETEVENTMSG . |
| .TP |
| .BR PTRACE_O_TRACESYSGOOD " (since Linux 2.4.6)" |
| When delivering system call traps, set bit 7 in the signal number |
| (i.e., deliver |
| .IR "SIGTRAP|0x80" ). |
| This makes it easy for the tracer to distinguish |
| normal traps from those caused by a system call. |
| .TP |
| .BR PTRACE_O_TRACEVFORK " (since Linux 2.5.46)" |
| Stop the tracee at the next |
| .BR vfork (2) |
| and automatically start tracing the newly vforked process, |
| which will start with a |
| .BR SIGSTOP , |
| or |
| .B PTRACE_EVENT_STOP |
| if |
| .B PTRACE_SEIZE |
| was used. |
| A |
| .BR waitpid (2) |
| by the tracer will return a |
| .I status |
| value such that |
| .IP |
| .nf |
| status>>8 == (SIGTRAP | (PTRACE_EVENT_VFORK<<8)) |
| .fi |
| .IP |
| The PID of the new process can be retrieved with |
| .BR PTRACE_GETEVENTMSG . |
| .TP |
| .BR PTRACE_O_TRACEVFORKDONE " (since Linux 2.5.60)" |
| Stop the tracee at the completion of the next |
| .BR vfork (2). |
| A |
| .BR waitpid (2) |
| by the tracer will return a |
| .I status |
| value such that |
| .IP |
| .nf |
| status>>8 == (SIGTRAP | (PTRACE_EVENT_VFORK_DONE<<8)) |
| .fi |
| .IP |
| The PID of the new process can (since Linux 2.6.18) be retrieved with |
| .BR PTRACE_GETEVENTMSG . |
| .TP |
| .BR PTRACE_O_TRACESECCOMP " (since Linux 3.5)" |
| Stop the tracee when a |
| .BR seccomp (2) |
| .BR SECCOMP_RET_TRACE |
| rule is triggered. |
| A |
| .BR waitpid (2) |
| by the tracer will return a |
| .I status |
| value such that |
| .IP |
| .nf |
| status>>8 == (SIGTRAP | (PTRACE_EVENT_SECCOMP<<8)) |
| .fi |
| .IP |
| While this triggers a |
| .BR PTRACE_EVENT |
| stop, it is similar to a syscall-enter-stop. |
| For details, see the note on |
| .B PTRACE_EVENT_SECCOMP |
| below. |
| The seccomp event message data (from the |
| .BR SECCOMP_RET_DATA |
| portion of the seccomp filter rule) can be retrieved with |
| .BR PTRACE_GETEVENTMSG . |
| .TP |
| .BR PTRACE_O_SUSPEND_SECCOMP " (since Linux 4.3)" |
| .\" commit 13c4a90119d28cfcb6b5bdd820c233b86c2b0237 |
| Suspend the tracee's seccomp protections. |
| This applies regardless of mode, and |
| can be used when the tracee has not yet installed seccomp filters. |
| That is, a valid use case is to suspend a tracee's seccomp protections |
| before they are installed by the tracee, |
| let the tracee install the filters, |
| and then clear this flag when the filters should be resumed. |
| Setting this option requires that the tracer have the |
| .BR CAP_SYS_ADMIN |
| capability, |
| not have any seccomp protections installed, and not have |
| .BR PTRACE_O_SUSPEND_SECCOMP |
| set on itself. |
| .RE |
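| .IP |
| The status comparisons shown for the options above can be wrapped in |
| small predicates. |
| The following is only a sketch; the function names |
| .IR is_ptrace_event () |
| and |
| .IR is_syscall_stop () |
| are hypothetical, and the second predicate is meaningful only when |
| .B PTRACE_O_TRACESYSGOOD |
| has been set. |

```c
#define _GNU_SOURCE
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/wait.h>

/* Hypothetical predicate: does this wait status report the given
   PTRACE_EVENT_* stop, using the comparison
   status>>8 == (SIGTRAP | (event<<8))? */
static int is_ptrace_event(int status, int event)
{
    return WIFSTOPPED(status) &&
           (status >> 8) == (SIGTRAP | (event << 8));
}

/* Hypothetical predicate: with PTRACE_O_TRACESYSGOOD in effect, a
   system call trap is reported with bit 7 set in the signal number. */
static int is_syscall_stop(int status)
{
    return WIFSTOPPED(status) && WSTOPSIG(status) == (SIGTRAP | 0x80);
}
```

| .IP |
| A tracer might call these predicates on each status returned by |
| .BR waitpid (2) |
| before deciding which request to issue next. |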
| .TP |
| .BR PTRACE_GETEVENTMSG " (since Linux 2.5.46)" |
| Retrieve a message (as an |
| .IR "unsigned long" ) |
| about the ptrace event |
| that just happened, placing it at the address |
| .I data |
| in the tracer. |
| For |
| .BR PTRACE_EVENT_EXIT , |
| this is the tracee's exit status. |
| For |
| .BR PTRACE_EVENT_FORK , |
| .BR PTRACE_EVENT_VFORK , |
| .BR PTRACE_EVENT_VFORK_DONE , |
| and |
| .BR PTRACE_EVENT_CLONE , |
| this is the PID of the new process. |
| For |
| .BR PTRACE_EVENT_SECCOMP , |
| this is the |
| .BR seccomp (2) |
| filter's |
| .BR SECCOMP_RET_DATA |
| associated with the triggered rule. |
| .RI ( addr |
| is ignored.) |
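| .IP |
| As an illustration for the |
| .B PTRACE_EVENT_EXIT |
| case, the following sketch sets |
| .BR PTRACE_O_TRACEEXIT , |
| waits for the exit event, and fetches the event message. |
| The helper name |
| .IR traced_exit_status () |
| is hypothetical, and the sketch assumes that the message uses the same |
| encoding as a |
| .BR waitpid (2) |
| status, so that |
| .BR WEXITSTATUS () |
| can decode it. |

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: return the exit code that the tracee reports
   at its PTRACE_EVENT_EXIT stop, or -1 on failure. */
static int traced_exit_status(int exit_code)
{
    unsigned long msg = 0;
    int status;
    pid_t child = fork();

    if (child == -1)
        return -1;

    if (child == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(127);
        raise(SIGSTOP);
        _exit(exit_code);
    }

    if (waitpid(child, &status, 0) == -1 || !WIFSTOPPED(status))
        return -1;

    /* Options can be set only while the tracee is stopped. */
    if (ptrace(PTRACE_SETOPTIONS, child, NULL,
               (void *) (long) PTRACE_O_TRACEEXIT) == -1)
        goto fail;
    if (ptrace(PTRACE_CONT, child, NULL, NULL) == -1)
        goto fail;

    if (waitpid(child, &status, 0) == -1 || !WIFSTOPPED(status))
        return -1;
    if ((status >> 8) != (SIGTRAP | (PTRACE_EVENT_EXIT << 8)))
        goto fail;

    /* The message is stored as an unsigned long at the address data. */
    if (ptrace(PTRACE_GETEVENTMSG, child, NULL, &msg) == -1)
        goto fail;

    /* Let the exit proceed, then reap the child. */
    ptrace(PTRACE_CONT, child, NULL, NULL);
    waitpid(child, &status, 0);
    return WEXITSTATUS((int) msg);

fail:
    kill(child, SIGKILL);
    waitpid(child, &status, 0);
    return -1;
}
```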
| .TP |
| .B PTRACE_CONT |
| Restart the stopped tracee process. |
| If |
| .I data |
| is nonzero, |
| it is interpreted as the number of a signal to be delivered to the tracee; |
| otherwise, no signal is delivered. |
| Thus, for example, the tracer can control |
| whether a signal sent to the tracee is delivered or not. |
| .RI ( addr |
| is ignored.) |
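| .IP |
| The effect of the |
| .I data |
| argument can be sketched as follows. |
| The helper name |
| .IR run_with_signal () |
| is hypothetical; the sketch assumes a signal whose default action |
| terminates the process, such as |
| .BR SIGUSR1 , |
| and that the signal is not blocked in the child. |

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: the tracee raises sig; the tracer either
   delivers it (deliver != 0) or suppresses it (deliver == 0).
   Returns 1 if the tracee exited normally, 0 if it was killed by a
   signal, and -1 on failure. */
static int run_with_signal(int sig, int deliver)
{
    int status;
    pid_t child = fork();

    if (child == -1)
        return -1;

    if (child == 0) {
        signal(sig, SIG_DFL);
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(127);
        raise(SIGSTOP);
        raise(sig);
        _exit(0);
    }

    /* Initial stop caused by SIGSTOP; resume without delivering it. */
    if (waitpid(child, &status, 0) == -1 || !WIFSTOPPED(status))
        return -1;
    if (ptrace(PTRACE_CONT, child, NULL, NULL) == -1)
        goto fail;

    /* The tracee stops again when sig is about to be delivered. */
    if (waitpid(child, &status, 0) == -1 || !WIFSTOPPED(status))
        return -1;
    if (WSTOPSIG(status) != sig)
        goto fail;

    /* Nonzero data delivers that signal; zero suppresses it. */
    if (ptrace(PTRACE_CONT, child, NULL,
               (void *) (long) (deliver ? sig : 0)) == -1)
        goto fail;

    if (waitpid(child, &status, 0) == -1)
        return -1;
    if (WIFEXITED(status))
        return 1;
    if (WIFSIGNALED(status))
        return 0;

fail:
    kill(child, SIGKILL);
    waitpid(child, &status, 0);
    return -1;
}
```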
| .TP |
| .BR PTRACE_SYSCALL ", " PTRACE_SINGLESTEP |
| Restart the stopped tracee as for |
| .BR PTRACE_CONT , |
| but arrange for the tracee to be stopped at |
| the next entry to or exit from a system call, |
| or after execution of a single instruction, respectively. |
| (The tracee will also, as usual, be stopped upon receipt of a signal.) |
| From the tracer's perspective, the tracee will appear to have been |
| stopped by receipt of a |
| .BR SIGTRAP . |
| So, for |
| .BR PTRACE_SYSCALL , |
| for example, the idea is to inspect |
| the arguments to the system call at the first stop, |
| then do another |
| .B PTRACE_SYSCALL |
| and inspect the return value of the system call at the second stop. |
| The |
| .I data |
| argument is treated as for |
| .BR PTRACE_CONT . |
| .RI ( addr |
| is ignored.) |
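| .IP |
| A minimal system call tracing loop built on |
| .B PTRACE_SYSCALL |
| might look like the following sketch. |
| It sets |
| .B PTRACE_O_TRACESYSGOOD |
| so that system call stops can be told apart from other stops; |
| the helper name |
| .IR count_syscall_stops () |
| is hypothetical. |

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: trace a child that makes one getpid system
   call and then exits; return the number of system call stops
   (entries plus exits) that were observed, or -1 on failure. */
static int count_syscall_stops(void)
{
    int status;
    int count = 0;
    long sig = 0;
    pid_t child = fork();

    if (child == -1)
        return -1;

    if (child == 0) {
        if (ptrace(PTRACE_TRACEME, 0, NULL, NULL) == -1)
            _exit(127);
        raise(SIGSTOP);
        syscall(SYS_getpid);
        _exit(0);
    }

    if (waitpid(child, &status, 0) == -1 || !WIFSTOPPED(status))
        return -1;
    if (ptrace(PTRACE_SETOPTIONS, child, NULL,
               (void *) (long) PTRACE_O_TRACESYSGOOD) == -1)
        goto fail;

    for (;;) {
        /* Run to the next system call entry or exit (or signal). */
        if (ptrace(PTRACE_SYSCALL, child, NULL, (void *) sig) == -1)
            goto fail;
        if (waitpid(child, &status, 0) == -1)
            return -1;
        if (WIFEXITED(status) || WIFSIGNALED(status))
            break;

        if (WSTOPSIG(status) == (SIGTRAP | 0x80)) {
            count++;                  /* system call stop */
            sig = 0;
        } else {
            sig = WSTOPSIG(status);   /* pass other signals through */
        }
    }
    return count;

fail:
    kill(child, SIGKILL);
    waitpid(child, &status, 0);
    return -1;
}
```

| .IP |
| In this sketch at least three stops are expected: |
| entry to and exit from the getpid system call, |
| and entry to the system call that terminates the child. |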
| .TP |
| .BR PTRACE_SYSEMU ", " PTRACE_SYSEMU_SINGLESTEP " (since Linux 2.6.14)" |
| For |
| .BR PTRACE_SYSEMU , |
| continue and stop on entry to the next system call, |
| which will not be executed. |
| See the documentation on syscall-stops below. |
| For |
| .BR PTRACE_SYSEMU_SINGLESTEP , |
| do the same, but also single-step if the next instruction is not a system call. |
| This call is used by programs like |
| User Mode Linux that want to emulate all the tracee's system calls. |
| The |
| .I data |
| argument is treated as for |
| .BR PTRACE_CONT . |
| The |
| .I addr |
| argument is ignored. |
| These requests are currently |
| .\" As at 3.7 |
| supported only on x86. |
| .TP |
| .BR PTRACE_LISTEN " (since Linux 3.4)" |
| Restart the stopped tracee, but prevent it from executing. |
| The resulting state of the tracee is similar to a process which |
| has been stopped by a |
| .B SIGSTOP |
| (or other stopping signal). |
| See the "group-stop" subsection for additional information. |
| .B PTRACE_LISTEN |
| works only on tracees attached by |
| .BR PTRACE_SEIZE . |
| .TP |
| .B PTRACE_KILL |
| Send the tracee a |
| .B SIGKILL |
| to terminate it. |
| .RI ( addr |
| and |
| .I data |
| are ignored.) |
| .IP |
| .I This operation is deprecated; do not use it! |
| Instead, send a |
| .BR SIGKILL |
| directly using |
| .BR kill (2) |
| or |
| .BR tgkill (2). |
| The problem with |
| .B PTRACE_KILL |
| is that it requires the tracee to be in signal-delivery-stop, |
| otherwise it may not work |
| (i.e., may complete successfully but won't kill the tracee). |
| By contrast, sending a |
| .B SIGKILL |
| directly has no such limitation. |
| .\" [Note from Denys Vlasenko: |
| .\" deprecation suggested by Oleg Nesterov. He prefers to deprecate it |
| .\" instead of describing (and needing to support) PTRACE_KILL's quirks.] |
| .TP |
| .BR PTRACE_INTERRUPT " (since Linux 3.4)" |
| Stop a tracee. |
| If the tracee is running or sleeping in kernel space and |
| .B PTRACE_SYSCALL |
| is in effect, |
| the system call is interrupted and syscall-exit-stop is reported. |
| (The interrupted system call is restarted when the tracee is restarted.) |
| If the tracee was already stopped by a signal and |
| .B PTRACE_LISTEN |
| was sent to it, |
| the tracee stops with |
| .B PTRACE_EVENT_STOP |
| and |
| .I WSTOPSIG(status) |
| returns the stop signal. |
| If any other ptrace-stop is generated at the same time (for example, |
| if a signal is sent to the tracee), this ptrace-stop happens. |
| If none of the above applies (for example, if the tracee is running in user |
| space), it stops with |
| .B PTRACE_EVENT_STOP |
| with |
| .I WSTOPSIG(status) |
| == |
| .BR SIGTRAP . |
| .B PTRACE_INTERRUPT |
| works only on tracees attached by |
| .BR PTRACE_SEIZE . |
| .TP |
| .B PTRACE_ATTACH |
| Attach to the process specified in |
| .IR pid , |
| making it a tracee of the calling process. |
| .\" No longer true (removed by Denys Vlasenko, 2011, who remarks: |
| .\" "I think it isn't true in non-ancient 2.4 and in 2.6/3.x. |
| .\" Basically, it's not true for any Linux in practical use. |
| .\" ; the behavior of the tracee is as if it had done a |
| .\" .BR PTRACE_TRACEME . |
| .\" The calling process actually becomes the parent of the tracee |
| .\" process for most purposes (e.g., it will receive |
| .\" notification of tracee events and appears in |
| .\" .BR ps (1) |
| .\" output as the tracee's parent), but a |
| .\" .BR getppid (2) |
| .\" by the tracee will still return the PID of the original parent. |
| The tracee is sent a |
| .BR SIGSTOP , |
| but will not necessarily have stopped |
| by the completion of this call; use |
| .BR waitpid (2) |
| to wait for the tracee to stop. |
| See the "Attaching and detaching" subsection for additional information. |
| .RI ( addr |
| and |
| .I data |
| are ignored.) |
| .IP |
| Permission to perform a |
| .BR PTRACE_ATTACH |
| is governed by a ptrace access mode |
| .B PTRACE_MODE_ATTACH_REALCREDS |
| check; see below. |
| .TP |
| .BR PTRACE_SEIZE " (since Linux 3.4)" |
| .\" |
| .\" Noted by Dmitry Levin: |
| .\" |
| .\" PTRACE_SEIZE was introduced by commit v3.1-rc1~308^2~28, but |
| .\" it had to be used along with a temporary flag PTRACE_SEIZE_DEVEL, |
| .\" which was removed later by commit v3.4-rc1~109^2~20. |
| .\" |
| .\" That is, [before] v3.4 we had a test mode of PTRACE_SEIZE API, |
| .\" which was not compatible with the current PTRACE_SEIZE API introduced |
| .\" in Linux 3.4. |
| .\" |
| Attach to the process specified in |
| .IR pid , |
| making it a tracee of the calling process. |
| Unlike |
| .BR PTRACE_ATTACH , |
| .B PTRACE_SEIZE |
| does not stop the process. |
| Group-stops are reported as |
| .B PTRACE_EVENT_STOP |
| and |
| .I WSTOPSIG(status) |
| returns the stop signal. |
| Automatically attached children stop with |
| .B PTRACE_EVENT_STOP |
| and |
| .I WSTOPSIG(status) |
| returns |
| .B SIGTRAP |
| instead of having |
| .B SIGSTOP |
| signal delivered to them. |
| .BR execve (2) |
| does not deliver an extra |
| .BR SIGTRAP . |
| Only a |
| .BR PTRACE_SEIZE d |
| process can accept |
| .B PTRACE_INTERRUPT |
| and |
| .B PTRACE_LISTEN |
| commands. |
| The "seized" behavior just described is inherited by |
| children that are automatically attached using |
| .BR PTRACE_O_TRACEFORK , |
| .BR PTRACE_O_TRACEVFORK , |
| and |
| .BR PTRACE_O_TRACECLONE . |
| .I addr |
| must be zero. |
| .I data |
| contains a bit mask of ptrace options to activate immediately. |
| .IP |
| Permission to perform a |
| .BR PTRACE_SEIZE |
| is governed by a ptrace access mode |
| .B PTRACE_MODE_ATTACH_REALCREDS |
| check; see below. |
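| .IP |
| The following sketch seizes a child and then stops it with |
| .BR PTRACE_INTERRUPT . |
| It assumes that the caller is permitted to trace its own child |
| (which depends on the access mode check mentioned above) |
| and that |
| .I <sys/ptrace.h> |
| defines |
| .BR PTRACE_EVENT_STOP ; |
| the helper name |
| .IR seize_and_interrupt () |
| is hypothetical. |

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: seize a running child, interrupt it, and
   report whether the resulting stop is a PTRACE_EVENT_STOP.
   Returns 1 if so, 0 if not, and -1 on failure. */
static int seize_and_interrupt(void)
{
    int status;
    int ok = -1;
    pid_t child = fork();

    if (child == -1)
        return -1;

    if (child == 0) {
        /* The tracee never asks to be traced; it just waits. */
        for (;;)
            pause();
    }

    /* addr must be zero; data holds the options to set (none here).
       Unlike PTRACE_ATTACH, this does not stop the tracee. */
    if (ptrace(PTRACE_SEIZE, child, NULL, NULL) != -1 &&
        ptrace(PTRACE_INTERRUPT, child, NULL, NULL) != -1 &&
        waitpid(child, &status, 0) != -1) {
        /* The event code is carried in bits 16 and up of the status. */
        ok = WIFSTOPPED(status) &&
             (status >> 16) == PTRACE_EVENT_STOP;
    }

    kill(child, SIGKILL);
    waitpid(child, &status, 0);
    return ok;
}
```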
| .\" |
| .TP |
| .BR PTRACE_SECCOMP_GET_FILTER " (since Linux 4.4)" |
| .\" commit f8e529ed941ba2bbcbf310b575d968159ce7e895 |
| This operation allows the tracer to dump the tracee's |
| classic BPF filters. |
| .IP |
| .I addr |
| is an integer specifying the index of the filter to be dumped. |
| The most recently installed filter has the index 0. |
| If |
| .I addr |
| is greater than the number of installed filters, |
| the operation fails with the error |
| .BR ENOENT . |
| .IP |
| .I data |
| is either a pointer to a |
| .IR "struct sock_filter" |
| array that is large enough to store the BPF program, |
| or NULL if the program is not to be stored. |
| .IP |
| Upon success, |
| the return value is the number of instructions in the BPF program. |
| If |
| .I data |
| was NULL, then this return value can be used to correctly size the |
| .IR "struct sock_filter" |
| array passed in a subsequent call. |
| .IP |
| This operation fails with the error |
| .B EACCES |
| if the caller does not have the |
| .B CAP_SYS_ADMIN |
| capability or if the caller is in strict or filter seccomp mode. |
| If the filter referred to by |
| .I addr |
| is not a classic BPF filter, the operation fails with the error |
| .BR EMEDIUMTYPE . |
| .IP |
| This operation is available if the kernel was configured with both the |
| .B CONFIG_SECCOMP_FILTER |
| and the |
| .B CONFIG_CHECKPOINT_RESTORE |
| options. |
| .TP |
| .B PTRACE_DETACH |
| Restart the stopped tracee as for |
| .BR PTRACE_CONT , |
| but first detach from it. |
| Under Linux, a tracee can be detached in this way regardless |
| of which method was used to initiate tracing. |
| .RI ( addr |
| is ignored.) |
| .\" |
| .TP |
| .BR PTRACE_GET_THREAD_AREA " (since Linux 2.6.0)" |
| This operation performs a similar task to |
| .BR get_thread_area (2). |
| It reads the TLS entry in the GDT whose index is given in |
| .IR addr , |
| placing a copy of the entry into the |
| .IR "struct user_desc" |
| pointed to by |
| .IR data . |
| (By contrast with |
| .BR get_thread_area (2), |
| the |
| .I entry_number |
| of the |
| .IR "struct user_desc" |
| is ignored.) |
| .TP |
| .BR PTRACE_SET_THREAD_AREA " (since Linux 2.6.0)" |
| This operation performs a similar task to |
| .BR set_thread_area (2). |
| It sets the TLS entry in the GDT whose index is given in |
| .IR addr , |
| assigning it the data supplied in the |
| .IR "struct user_desc" |
| pointed to by |
| .IR data . |
| (By contrast with |
| .BR set_thread_area (2), |
| the |
| .I entry_number |
| of the |
| .IR "struct user_desc" |
| is ignored; in other words, |
| this ptrace operation can't be used to allocate a free TLS entry.) |
| .TP |
| .BR PTRACE_GET_SYSCALL_INFO " (since Linux 5.3)" |
| .\" commit 201766a20e30f982ccfe36bebfad9602c3ff574a |
| Retrieve information about the system call that caused the stop. |
| The information is placed into the buffer pointed to by the |
| .I data |
| argument, which should be a pointer to a buffer of type |
| .IR "struct ptrace_syscall_info" . |
| The |
| .I addr |
| argument contains the size of the buffer pointed to |
| by the |
| .I data |
| argument (i.e., |
| .IR "sizeof(struct ptrace_syscall_info)" ). |
| The return value contains the number of bytes available |
| to be written by the kernel. |
| If the size of the data to be written by the kernel exceeds the size |
| specified by the |
| .I addr |
| argument, the output data is truncated. |
| .IP |
| The |
| .I ptrace_syscall_info |
| structure contains the following fields: |
| .IP |
| .in +2n |
| .EX |
| struct ptrace_syscall_info { |
| __u8 op; /* Type of system call stop */ |
| __u32 arch; /* AUDIT_ARCH_* value; see seccomp(2) */ |
| __u64 instruction_pointer; /* CPU instruction pointer */ |
| __u64 stack_pointer; /* CPU stack pointer */ |
| union { |
| struct { /* op == PTRACE_SYSCALL_INFO_ENTRY */ |
| __u64 nr; /* System call number */ |
| __u64 args[6]; /* System call arguments */ |
| } entry; |
| struct { /* op == PTRACE_SYSCALL_INFO_EXIT */ |
| __s64 rval; /* System call return value */ |
| __u8 is_error; /* System call error flag; |
| Boolean: does rval contain |
| an error value (\-ERRCODE) or |
| a nonerror return value? */ |
| } exit; |
| struct { /* op == PTRACE_SYSCALL_INFO_SECCOMP */ |
| __u64 nr; /* System call number */ |
| __u64 args[6]; /* System call arguments */ |
| __u32 ret_data; /* SECCOMP_RET_DATA portion |
| of SECCOMP_RET_TRACE |
| return value */ |
| } seccomp; |
| }; |
| }; |
| .EE |
| .in |
| .IP |
| The |
| .IR op , |
| .IR arch , |
| .IR instruction_pointer , |
| and |
| .I stack_pointer |
| fields are defined for all kinds of ptrace system call stops. |
| The rest of the structure is a union; one should read only those fields |
| that are meaningful for the kind of system call stop specified by the |
| .IR op |
| field. |
| .IP |
| The |
| .I op |
| field has one of the following values (defined in |
| .IR <linux/ptrace.h> ) |
| indicating what type of stop occurred and |
| which part of the union is filled: |
| .RS |
| .TP |
| .BR PTRACE_SYSCALL_INFO_ENTRY |
| The |
| .I entry |
| component of the union contains information relating to a |
| system call entry stop. |
| .TP |
| .BR PTRACE_SYSCALL_INFO_EXIT |
| The |
| .I exit |
| component of the union contains information relating to a |
| system call exit stop. |
| .TP |
| .BR PTRACE_SYSCALL_INFO_SECCOMP |
| The |
| .I seccomp |
| component of the union contains information relating to a |
| .B PTRACE_EVENT_SECCOMP |
| stop. |
| .TP |
| .BR PTRACE_SYSCALL_INFO_NONE |
| No component of the union contains relevant information. |
| .RE |
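| .IP |
| As a rough illustration of reading only the union member selected by |
| .IR op , |
| the sketch below formats an already filled-in structure as text. |
| The helper name |
| .I describe_syscall_info |
| is hypothetical and the output format is arbitrary; |
| the code assumes kernel headers from Linux 5.3 or later. |

```c
#include <linux/ptrace.h>
#include <stdio.h>

/* Hypothetical helper: format the part of the structure that is
   meaningful for the stop type recorded in info->op.
   Returns the value returned by snprintf(). */
int describe_syscall_info(const struct ptrace_syscall_info *info,
                          char *buf, size_t len)
{
    switch (info->op) {
    case PTRACE_SYSCALL_INFO_ENTRY:
        return snprintf(buf, len, "entry nr=%llu arg0=%llu",
                        (unsigned long long) info->entry.nr,
                        (unsigned long long) info->entry.args[0]);
    case PTRACE_SYSCALL_INFO_EXIT:
        return snprintf(buf, len, "exit rval=%lld is_error=%d",
                        (long long) info->exit.rval,
                        (int) info->exit.is_error);
    case PTRACE_SYSCALL_INFO_SECCOMP:
        return snprintf(buf, len, "seccomp nr=%llu ret_data=%u",
                        (unsigned long long) info->seccomp.nr,
                        (unsigned int) info->seccomp.ret_data);
    default:    /* PTRACE_SYSCALL_INFO_NONE: no union member is valid */
        return snprintf(buf, len, "none");
    }
}
```

| .IP |
| A tracer would typically fill the structure first with a call of the form |
| .IR "ptrace(PTRACE_GET_SYSCALL_INFO, pid, sizeof(info), &info)" . |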
| .\" |
| .SS Death under ptrace |
| When a (possibly multithreaded) process receives a killing signal |
| (one whose disposition is set to |
| .B SIG_DFL |
| and whose default action is to kill the process), |
| all threads exit. |
| Tracees report their death to their tracer(s). |
| Notification of this event is delivered via |
| .BR waitpid (2). |
| .PP |
| Note that the killing signal will first cause signal-delivery-stop |
| (on one tracee only), |
| and only after it is injected by the tracer |
| (or after it was dispatched to a thread which isn't traced), |
| will death from the signal happen on |
| .I all |
| tracees within a multithreaded process. |
| (The term "signal-delivery-stop" is explained below.) |
| .PP |
| .B SIGKILL |
| does not generate signal-delivery-stop and |
| therefore the tracer can't suppress it. |
| .B SIGKILL |
| kills even within system calls |
| (syscall-exit-stop is not generated prior to death by |
| .BR SIGKILL ). |
| The net effect is that |
| .B SIGKILL |
| always kills the process (all its threads), |
| even if some threads of the process are ptraced. |
| .PP |
| When the tracee calls |
| .BR _exit (2), |
| it reports its death to its tracer. |
| Other threads are not affected. |
| .PP |
| When any thread executes |
| .BR exit_group (2), |
| every tracee in its thread group reports its death to its tracer. |
| .PP |
| If the |
| .B PTRACE_O_TRACEEXIT |
| option is on, |
| .B PTRACE_EVENT_EXIT |
| will happen before actual death. |
| This applies to exits via |
| .BR exit (2), |
| .BR exit_group (2), |
| and signal deaths (except |
| .BR SIGKILL , |
| depending on the kernel version; see BUGS below), |
| and when threads are torn down on |
| .BR execve (2) |
| in a multithreaded process. |
| .PP |
| The tracer cannot assume that the ptrace-stopped tracee exists. |
| There are many scenarios when the tracee may die while stopped (such as |
| .BR SIGKILL ). |
| Therefore, the tracer must be prepared to handle an |
| .B ESRCH |
| error on any ptrace operation. |
| Unfortunately, the same error is returned if the tracee |
| exists but is not ptrace-stopped |
| (for commands which require a stopped tracee), |
| or if it is not traced by the process which issued the ptrace call. |
| The tracer needs to keep track of the stopped/running state of the tracee, |
| and interpret |
| .B ESRCH |
| as "tracee died unexpectedly" only if it knows that the tracee has |
| been observed to enter ptrace-stop. |
| Note that there is no guarantee that |
| .I waitpid(WNOHANG) |
| will reliably report the tracee's death status if a |
| ptrace operation returned |
| .BR ESRCH . |
| .I waitpid(WNOHANG) |
| may return 0 instead. |
| In other words, the tracee may be "not yet fully dead", |
| but already refusing ptrace requests. |
| .PP |
| The tracer can't assume that the tracee |
| .I always |
| ends its life by reporting |
| .I WIFEXITED(status) |
| or |
| .IR WIFSIGNALED(status) ; |
| there are cases where this does not occur. |
| For example, if a thread other than the thread group leader does an |
| .BR execve (2), |
| it disappears; |
| its PID will never be seen again, |
| and any subsequent ptrace stops will be reported under |
| the thread group leader's PID. |
| .SS Stopped states |
| A tracee can be in two states: running or stopped. |
| For the purposes of ptrace, a tracee which is blocked in a system call |
| (such as |
| .BR read (2), |
| .BR pause (2), |
| etc.) |
| is nevertheless considered to be running, even if the tracee is blocked |
| for a long time. |
| The state of the tracee after |
| .BR PTRACE_LISTEN |
| is somewhat of a gray area: it is not in any ptrace-stop (ptrace commands |
| won't work on it, and it will deliver |
| .BR waitpid (2) |
| notifications), |
| but it also may be considered "stopped" because |
| it is not executing instructions (is not scheduled), and if it was |
| in group-stop before |
| .BR PTRACE_LISTEN , |
| it will not respond to signals until |
| .B SIGCONT |
| is received. |
| .PP |
| There are many kinds of states when the tracee is stopped, and in ptrace |
| discussions they are often conflated. |
| Therefore, it is important to use precise terms. |
| .PP |
| In this manual page, any stopped state in which the tracee is ready |
| to accept ptrace commands from the tracer is called |
| .IR ptrace-stop . |
| Ptrace-stops can |
| be further subdivided into |
| .IR signal-delivery-stop , |
| .IR group-stop , |
| .IR syscall-stop , |
| .IR "PTRACE_EVENT stops" , |
| and so on. |
| These stopped states are described in detail below. |
| .PP |
| When the running tracee enters ptrace-stop, it notifies its tracer using |
| .BR waitpid (2) |
| (or one of the other "wait" system calls). |
| Most of this manual page assumes that the tracer waits with: |
| .PP |
| pid = waitpid(pid_or_minus_1, &status, __WALL); |
| .PP |
| Ptrace-stopped tracees are reported as returns with |
| .I pid |
| greater than 0 and |
| .I WIFSTOPPED(status) |
| true. |
| .\" Denys Vlasenko: |
| .\" Do we require __WALL usage, or will just using 0 be ok? (With 0, |
| .\" I am not 100% sure there aren't ugly corner cases.) Are the |
| .\" rules different if user wants to use waitid? Will waitid require |
| .\" WEXITED? |
| .\" |
| .PP |
| The |
| .B __WALL |
| flag does not include the |
| .B WSTOPPED |
| and |
| .B WEXITED |
| flags, but implies their functionality. |
| .PP |
| Setting the |
| .B WCONTINUED |
| flag when calling |
| .BR waitpid (2) |
| is not recommended: the "continued" state is per-process and |
| consuming it can confuse the real parent of the tracee. |
| .PP |
| Use of the |
| .B WNOHANG |
| flag may cause |
| .BR waitpid (2) |
| to return 0 ("no wait results available yet") |
| even if the tracer knows there should be a notification. |
| Example: |
| .PP |
| .in +4n |
| .EX |
| errno = 0; |
| ptrace(PTRACE_CONT, pid, 0L, 0L); |
| if (errno == ESRCH) { |
| /* tracee is dead */ |
| r = waitpid(pid, &status, __WALL | WNOHANG); |
| /* r can still be 0 here! */ |
| } |
| .EE |
| .in |
| .\" FIXME . |
| .\" waitid usage? WNOWAIT? |
| .\" describe how wait notifications queue (or not queue) |
| .PP |
| The following kinds of ptrace-stops exist: signal-delivery-stops, |
| group-stops, |
| .B PTRACE_EVENT |
| stops, syscall-stops. |
| They all are reported by |
| .BR waitpid (2) |
| with |
| .I WIFSTOPPED(status) |
| true. |
| They may be differentiated by examining the value |
| .IR status>>8 , |
| and if there is ambiguity in that value, by querying |
| .BR PTRACE_GETSIGINFO . |
| (Note: the |
| .I WSTOPSIG(status) |
| macro can't be used to perform this examination, |
| because it returns the value |
| .IR "(status>>8)\ &\ 0xff" .) |
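| .PP |
| The examination described above might be sketched as follows. |
| This is an illustration only: the enumeration and the function name |
| .I classify_stop |
| are hypothetical, and the code assumes that the tracer has set |
| .BR PTRACE_O_TRACESYSGOOD , |
| so that syscall-stops can be told apart from a plain |
| .BR SIGTRAP . |

```c
#include <signal.h>
#include <sys/wait.h>

/* Hypothetical classification of a status word returned by waitpid(2). */
enum stop_kind {
    NOT_A_STOP,             /* WIFSTOPPED(status) is false */
    EVENT_STOP,             /* PTRACE_EVENT stop */
    SYSCALL_STOP,           /* requires PTRACE_O_TRACESYSGOOD */
    STOPPING_SIGNAL_STOP,   /* ambiguous; query PTRACE_GETSIGINFO */
    SIGNAL_DELIVERY_STOP
};

enum stop_kind classify_stop(int status)
{
    unsigned int code;
    int event, sig;

    if (!WIFSTOPPED(status))
        return NOT_A_STOP;

    code = (unsigned int) status >> 8;  /* the value examined in the text */
    event = (int) (code >> 8);          /* nonzero for PTRACE_EVENT stops */
    sig = (int) (code & 0xff);

    if (event != 0)
        return EVENT_STOP;
    if (sig == (SIGTRAP | 0x80))
        return SYSCALL_STOP;
    if (sig == SIGSTOP || sig == SIGTSTP || sig == SIGTTIN || sig == SIGTTOU)
        return STOPPING_SIGNAL_STOP;    /* signal-delivery-stop or group-stop */
    return SIGNAL_DELIVERY_STOP;
}
```

| .PP |
| Without |
| .BR PTRACE_O_TRACESYSGOOD , |
| this sketch would report a syscall-stop as a signal-delivery-stop with |
| .BR SIGTRAP . |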
| .SS Signal-delivery-stop |
| When a (possibly multithreaded) process receives any signal except |
| .BR SIGKILL , |
| the kernel selects an arbitrary thread which handles the signal. |
| (If the signal is generated with |
| .BR tgkill (2), |
| the target thread can be explicitly selected by the caller.) |
| If the selected thread is traced, it enters signal-delivery-stop. |
| At this point, the signal is not yet delivered to the process, |
| and can be suppressed by the tracer. |
| If the tracer doesn't suppress the signal, |
| it passes the signal to the tracee in the next ptrace restart request. |
| This second step of signal delivery is called |
| .I "signal injection" |
| in this manual page. |
| Note that if the signal is blocked, |
| signal-delivery-stop doesn't happen until the signal is unblocked, |
| with the usual exception that |
| .B SIGSTOP |
| can't be blocked. |
| .PP |
| Signal-delivery-stop is observed by the tracer as |
| .BR waitpid (2) |
| returning with |
| .I WIFSTOPPED(status) |
| true, with the signal returned by |
| .IR WSTOPSIG(status) . |
| If the signal is |
| .BR SIGTRAP , |
| this may be a different kind of ptrace-stop; |
| see the "Syscall-stops" and "execve" sections below for details. |
| If |
| .I WSTOPSIG(status) |
| returns a stopping signal, this may be a group-stop; see below. |
| .SS Signal injection and suppression |
| After signal-delivery-stop is observed by the tracer, |
| the tracer should restart the tracee with the call |
| .PP |
| ptrace(PTRACE_restart, pid, 0, sig) |
| .PP |
| where |
| .B PTRACE_restart |
| is one of the restarting ptrace requests. |
| If |
| .I sig |
| is 0, then a signal is not delivered. |
| Otherwise, the signal |
| .I sig |
| is delivered. |
| This operation is called |
| .I "signal injection" |
| in this manual page, to distinguish it from signal-delivery-stop. |
| .PP |
| The |
| .I sig |
| value may be different from the |
| .I WSTOPSIG(status) |
| value: the tracer can cause a different signal to be injected. |
| .PP |
| Note that a suppressed signal still causes system calls to return |
| prematurely. |
| In this case, system calls will be restarted: the tracer will |
| observe the tracee to reexecute the interrupted system call (or |
| .BR restart_syscall (2) |
| system call for a few system calls which use a different mechanism |
| for restarting) if the tracer uses |
| .BR PTRACE_SYSCALL . |
| Even system calls (such as |
| .BR poll (2)) |
| which are not restartable after a signal are restarted after |
| the signal is suppressed; |
| however, kernel bugs exist which cause some system calls to fail with |
| .B EINTR |
| even though no observable signal is injected to the tracee. |
| .PP |
| Restarting ptrace commands issued in ptrace-stops other than |
| signal-delivery-stop are not guaranteed to inject a signal, even if |
| .I sig |
| is nonzero. |
| No error is reported; a nonzero |
| .I sig |
| may simply be ignored. |
| Ptrace users should not try to "create a new signal" this way: use |
| .BR tgkill (2) |
| instead. |
| .PP |
| The fact that signal injection requests may be ignored |
| when restarting the tracee after |
| ptrace stops that are not signal-delivery-stops |
| is a cause of confusion among ptrace users. |
| One typical scenario is that the tracer observes group-stop, |
| mistakes it for signal-delivery-stop, restarts the tracee with |
| .PP |
| ptrace(PTRACE_restart, pid, 0, stopsig) |
| .PP |
| with the intention of injecting |
| .IR stopsig , |
| but |
| .I stopsig |
| gets ignored and the tracee continues to run. |
| .PP |
| The |
| .B SIGCONT |
| signal has a side effect of waking up (all threads of) |
| a group-stopped process. |
| This side effect happens before signal-delivery-stop. |
| The tracer can't suppress this side effect (it can |
| only suppress signal injection, which only causes the |
| .BR SIGCONT |
| handler to not be executed in the tracee, if such a handler is installed). |
| In fact, waking up from group-stop may be followed by |
| signal-delivery-stop for signal(s) |
| .I other than |
| .BR SIGCONT , |
| if they were pending when |
| .B SIGCONT |
| was delivered. |
| In other words, |
| .B SIGCONT |
| may not be the first signal observed by the tracee after it was sent. |
| .PP |
| Stopping signals cause (all threads of) a process to enter group-stop. |
| This side effect happens after signal injection, and therefore can be |
| suppressed by the tracer. |
| .PP |
| In Linux 2.4 and earlier, the |
| .B SIGSTOP |
| signal can't be injected. |
| .\" In the Linux 2.4 sources, in arch/i386/kernel/signal.c::do_signal(), |
| .\" there is: |
| .\" |
| .\" /* The debugger continued. Ignore SIGSTOP. */ |
| .\" if (signr == SIGSTOP) |
| .\" continue; |
| .PP |
| .B PTRACE_GETSIGINFO |
| can be used to retrieve a |
| .I siginfo_t |
| structure which corresponds to the delivered signal. |
| .B PTRACE_SETSIGINFO |
| may be used to modify it. |
| If |
| .B PTRACE_SETSIGINFO |
| has been used to alter |
| .IR siginfo_t , |
| the |
| .I si_signo |
| field and the |
| .I sig |
| parameter in the restarting command must match, |
| otherwise the result is undefined. |
| .SS Group-stop |
| When a (possibly multithreaded) process receives a stopping signal, |
| all threads stop. |
| If some threads are traced, they enter a group-stop. |
| Note that the stopping signal will first cause signal-delivery-stop |
| (on one tracee only), and only after it is injected by the tracer |
| (or after it was dispatched to a thread which isn't traced), |
| will group-stop be initiated on |
| .I all |
| tracees within the multithreaded process. |
| As usual, every tracee reports its group-stop separately |
| to the corresponding tracer. |
| .PP |
| Group-stop is observed by the tracer as |
| .BR waitpid (2) |
| returning with |
| .I WIFSTOPPED(status) |
| true, with the stopping signal available via |
| .IR WSTOPSIG(status) . |
| The same result is returned by some other classes of ptrace-stops, |
| therefore the recommended practice is to perform the call |
| .PP |
| ptrace(PTRACE_GETSIGINFO, pid, 0, &siginfo) |
| .PP |
| The call can be avoided if the signal is not |
| .BR SIGSTOP , |
| .BR SIGTSTP , |
| .BR SIGTTIN , |
| or |
| .BR SIGTTOU ; |
| only these four signals are stopping signals. |
| If the tracer sees something else, it can't be a group-stop. |
| Otherwise, the tracer needs to call |
| .BR PTRACE_GETSIGINFO . |
| If |
| .B PTRACE_GETSIGINFO |
| fails with |
| .BR EINVAL , |
| then it is definitely a group-stop. |
| (Other failure codes are possible, such as |
| .B ESRCH |
| ("no such process") if a |
| .B SIGKILL |
| killed the tracee.) |
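| .PP |
| A minimal sketch of this check, for a tracee attached with |
| .BR PTRACE_ATTACH , |
| could look like the following. |
| The function name |
| .I is_group_stop |
| and its return convention are hypothetical. |

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical helper.  Returns 1 if the stop reported in status is a
   group-stop, 0 if it is some other stop (or not a stop at all), and
   -1 if PTRACE_GETSIGINFO failed for another reason (errno is set,
   for example to ESRCH if the tracee was killed). */
int is_group_stop(pid_t pid, int status)
{
    siginfo_t si;
    int sig;

    if (!WIFSTOPPED(status))
        return 0;

    sig = WSTOPSIG(status);
    if (sig != SIGSTOP && sig != SIGTSTP && sig != SIGTTIN && sig != SIGTTOU)
        return 0;   /* not a stopping signal: cannot be a group-stop */

    if (ptrace(PTRACE_GETSIGINFO, pid, (void *) 0, &si) == 0)
        return 0;   /* signal information exists: signal-delivery-stop */

    return errno == EINVAL ? 1 : -1;
}
```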
| .PP |
| If the tracee was attached using |
| .BR PTRACE_SEIZE , |
| group-stop is indicated by |
| .BR PTRACE_EVENT_STOP : |
| .IR "status>>16 == PTRACE_EVENT_STOP" . |
| This allows detection of group-stops |
| without requiring an extra |
| .B PTRACE_GETSIGINFO |
| call. |
| .PP |
| As of Linux 2.6.38, |
| after the tracer sees the tracee ptrace-stop and until it |
| restarts or kills it, the tracee will not run, |
| and will not send notifications (except |
| .B SIGKILL |
| death) to the tracer, even if the tracer enters into another |
| .BR waitpid (2) |
| call. |
| .PP |
| The kernel behavior described in the previous paragraph |
| causes a problem with transparent handling of stopping signals. |
| If the tracer restarts the tracee after group-stop, |
| the stopping signal |
| is effectively ignored\(emthe tracee doesn't remain stopped, it runs. |
| If the tracer doesn't restart the tracee before entering into the next |
| .BR waitpid (2), |
| future |
| .B SIGCONT |
| signals will not be reported to the tracer; |
| this would cause the |
| .B SIGCONT |
| signals to have no effect on the tracee. |
| .PP |
| Since Linux 3.4, there is a method to overcome this problem: instead of |
| .BR PTRACE_CONT , |
| a |
| .B PTRACE_LISTEN |
| command can be used to restart a tracee in a way where it does not execute, |
| but waits for a new event which it can report via |
| .BR waitpid (2) |
| (such as when |
| it is restarted by a |
| .BR SIGCONT ). |
| .SS PTRACE_EVENT stops |
| If the tracer sets |
| .B PTRACE_O_TRACE_* |
| options, the tracee will enter ptrace-stops called |
| .B PTRACE_EVENT |
| stops. |
| .PP |
| .B PTRACE_EVENT |
| stops are observed by the tracer as |
| .BR waitpid (2) |
| returning with |
| .IR WIFSTOPPED(status) , |
| and |
| .I WSTOPSIG(status) |
| returns |
| .BR SIGTRAP |
| (or for |
| .BR PTRACE_EVENT_STOP , |
| returns the stopping signal if the tracee is in a group-stop). |
| An additional bit is set in the higher byte of the status word: |
| the value |
| .I status>>8 |
| will be |
| .PP |
| ((PTRACE_EVENT_foo<<8) | SIGTRAP). |
| .PP |
| The following events exist: |
| .TP |
| .B PTRACE_EVENT_VFORK |
| Stop before return from |
| .BR vfork (2) |
| or |
| .BR clone (2) |
| with the |
| .B CLONE_VFORK |
| flag. |
| When the tracee is continued after this stop, it will wait for the child to |
| exit/exec before continuing its execution |
| (in other words, the usual behavior on |
| .BR vfork (2)). |
| .TP |
| .B PTRACE_EVENT_FORK |
| Stop before return from |
| .BR fork (2) |
| or |
| .BR clone (2) |
| with the exit signal set to |
| .BR SIGCHLD . |
| .TP |
| .B PTRACE_EVENT_CLONE |
| Stop before return from |
| .BR clone (2). |
| .TP |
| .B PTRACE_EVENT_VFORK_DONE |
| Stop before return from |
| .BR vfork (2) |
| or |
| .BR clone (2) |
| with the |
| .B CLONE_VFORK |
| flag, |
| but after the child unblocked this tracee by exiting or execing. |
| .PP |
| For all four stops described above, |
| the stop occurs in the parent (i.e., the tracee), |
| not in the newly created thread. |
| .BR PTRACE_GETEVENTMSG |
| can be used to retrieve the new thread's ID. |
| .TP |
| .B PTRACE_EVENT_EXEC |
| Stop before return from |
| .BR execve (2). |
| Since Linux 3.0, |
| .BR PTRACE_GETEVENTMSG |
| returns the former thread ID. |
| .TP |
| .B PTRACE_EVENT_EXIT |
| Stop before exit (including death from |
| .BR exit_group (2)), |
| signal death, or exit caused by |
| .BR execve (2) |
| in a multithreaded process. |
| .B PTRACE_GETEVENTMSG |
| returns the exit status. |
| Registers can be examined |
| (unlike when "real" exit happens). |
| The tracee is still alive; it needs to be |
| .BR PTRACE_CONT ed |
| or |
| .BR PTRACE_DETACH ed |
| to finish exiting. |
| .TP |
| .B PTRACE_EVENT_STOP |
| Stop induced by |
| .B PTRACE_INTERRUPT |
| command, or group-stop, or initial ptrace-stop when a new child is attached |
| (only if attached using |
| .BR PTRACE_SEIZE ). |
| .TP |
| .B PTRACE_EVENT_SECCOMP |
| Stop triggered by a |
| .BR seccomp (2) |
| rule on tracee syscall entry when |
| .BR PTRACE_O_TRACESECCOMP |
| has been set by the tracer. |
| The seccomp event message data (from the |
| .BR SECCOMP_RET_DATA |
| portion of the seccomp filter rule) can be retrieved with |
| .BR PTRACE_GETEVENTMSG . |
| The semantics of this stop are described in |
| detail in a separate section below. |
| .PP |
| .B PTRACE_GETSIGINFO |
| on |
| .B PTRACE_EVENT |
| stops returns |
| .B SIGTRAP |
| in |
| .IR si_signo , |
| with |
| .I si_code |
| set to |
| .IR "(event<<8)\ |\ SIGTRAP" . |
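| .PP |
| As an illustration, the event number can be extracted from the status word |
| and mapped to a name as sketched below. |
| The function names |
| .I ptrace_event_of |
| and |
| .I ptrace_event_name |
| are hypothetical, and the fallback definition of |
| .B PTRACE_EVENT_STOP |
| is only an assumption about older C library headers that may lack it. |

```c
#include <stddef.h>
#include <sys/ptrace.h>
#include <sys/wait.h>

#ifndef PTRACE_EVENT_STOP
#define PTRACE_EVENT_STOP 128   /* fallback for older C library headers */
#endif

/* Hypothetical helper: return the PTRACE_EVENT_* number carried in a
   waitpid(2) status word, or 0 if the status is not an event stop. */
int ptrace_event_of(int status)
{
    if (!WIFSTOPPED(status))
        return 0;
    return (int) ((unsigned int) status >> 16);
}

/* Hypothetical helper: map an event number to a short name. */
const char *ptrace_event_name(int event)
{
    switch (event) {
    case PTRACE_EVENT_FORK:       return "fork";
    case PTRACE_EVENT_VFORK:      return "vfork";
    case PTRACE_EVENT_CLONE:      return "clone";
    case PTRACE_EVENT_EXEC:       return "exec";
    case PTRACE_EVENT_VFORK_DONE: return "vfork_done";
    case PTRACE_EVENT_EXIT:       return "exit";
    case PTRACE_EVENT_SECCOMP:    return "seccomp";
    case PTRACE_EVENT_STOP:       return "stop";
    default:                      return "unknown";
    }
}
```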
| .SS Syscall-stops |
| If the tracee was restarted by |
| .BR PTRACE_SYSCALL |
| or |
| .BR PTRACE_SYSEMU , |
| the tracee enters |
| syscall-enter-stop just prior to entering any system call (which |
| will not be executed if the restart was using |
| .BR PTRACE_SYSEMU , |
| regardless of any change made to registers at this point or how the |
| tracee is restarted after this stop). |
| No matter which method caused the syscall-entry-stop, |
| if the tracer restarts the tracee with |
| .BR PTRACE_SYSCALL , |
| the tracee enters syscall-exit-stop when the system call is finished, |
| or if it is interrupted by a signal. |
| (That is, signal-delivery-stop never happens between syscall-enter-stop |
| and syscall-exit-stop; it happens |
| .I after |
| syscall-exit-stop.) |
| If the tracee is continued using any other method (including |
| .BR PTRACE_SYSEMU ), |
| no syscall-exit-stop occurs. |
| Note that all mentions of |
| .BR PTRACE_SYSEMU |
| apply equally to |
| .BR PTRACE_SYSEMU_SINGLESTEP . |
| .PP |
| However, even if the tracee was continued using |
| .BR PTRACE_SYSCALL , |
| it is not guaranteed that the next stop will be a syscall-exit-stop. |
| Other possibilities are that the tracee may stop in a |
| .B PTRACE_EVENT |
| stop (including seccomp stops), exit (if it entered |
| .BR _exit (2) |
| or |
| .BR exit_group (2)), |
| be killed by |
| .BR SIGKILL , |
| or die silently (if it is a thread group leader, the |
| .BR execve (2) |
| happened in another thread, |
| and that thread is not traced by the same tracer; |
| this situation is discussed later). |
| .PP |
| Syscall-enter-stop and syscall-exit-stop are observed by the tracer as |
| .BR waitpid (2) |
| returning with |
| .I WIFSTOPPED(status) |
| true, and |
| .I WSTOPSIG(status) |
| giving |
| .BR SIGTRAP . |
| If the |
| .B PTRACE_O_TRACESYSGOOD |
| option was set by the tracer, then |
| .I WSTOPSIG(status) |
| will give the value |
| .IR "(SIGTRAP\ |\ 0x80)" . |
| .PP |
| Syscall-stops can be distinguished from signal-delivery-stop with |
| .B SIGTRAP |
| by querying |
| .BR PTRACE_GETSIGINFO |
| for the following cases: |
| .TP |
| .IR si_code " <= 0" |
| .B SIGTRAP |
| was delivered as a result of a user-space action, |
| for example, a system call |
| .RB ( tgkill (2), |
| .BR kill (2), |
| .BR sigqueue (3), |
| etc.), |
| expiration of a POSIX timer, |
| change of state on a POSIX message queue, |
| or completion of an asynchronous I/O request. |
| .TP |
| .IR si_code " == SI_KERNEL (0x80)" |
| .B SIGTRAP |
| was sent by the kernel. |
| .TP |
| .IR si_code " == SIGTRAP or " si_code " == (SIGTRAP|0x80)" |
| This is a syscall-stop. |
| .PP |
| However, syscall-stops happen very often (twice per system call), |
| and performing |
| .B PTRACE_GETSIGINFO |
| for every syscall-stop may be somewhat expensive. |
| .PP |
| Some architectures allow the cases to be distinguished |
| by examining registers. |
| For example, on x86, |
| .I rax |
| == |
| .RB - ENOSYS |
| in syscall-enter-stop. |
| Since |
| .B SIGTRAP |
| (like any other signal) always happens |
| .I after |
| syscall-exit-stop, |
| and at this point |
| .I rax |
| almost never contains |
| .RB - ENOSYS , |
| the |
| .B SIGTRAP |
| looks like "syscall-stop which is not syscall-enter-stop"; |
| in other words, it looks like a |
| "stray syscall-exit-stop" and can be detected this way. |
| But such detection is fragile and is best avoided. |
| .PP |
| Using the |
| .B PTRACE_O_TRACESYSGOOD |
| option is the recommended method to distinguish syscall-stops |
| from other kinds of ptrace-stops, |
| since it is reliable and does not incur a performance penalty. |
| .PP |
| Syscall-enter-stop and syscall-exit-stop are |
| indistinguishable from each other by the tracer. |
| The tracer needs to keep track of the sequence of |
| ptrace-stops in order to not misinterpret syscall-enter-stop as |
| syscall-exit-stop or vice versa. |
| In general, a syscall-enter-stop is |
| always followed by syscall-exit-stop, |
| .B PTRACE_EVENT |
| stop, or the tracee's death; |
| no other kinds of ptrace-stop can occur in between. |
| However, note that seccomp stops (see below) can cause syscall-exit-stops, |
| without preceding syscall-entry-stops. |
| If seccomp is in use, care needs |
| to be taken not to misinterpret such stops as syscall-entry-stops. |
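| .PP |
| One possible way to keep track of the sequence is a per-tracee flag that |
| is toggled at each syscall-stop, as in the sketch below. |
| All names are hypothetical, and the seccomp handling assumes the |
| Linux 4.8 and later ordering, in which a |
| .B PTRACE_EVENT_SECCOMP |
| stop continued with |
| .B PTRACE_SYSCALL |
| is followed by a syscall-exit-stop. |

```c
#include <stdbool.h>

/* Hypothetical per-tracee bookkeeping. */
struct syscall_tracker {
    bool in_syscall;    /* true between an entry stop and its exit stop */
};

enum syscall_stop { SYSCALL_ENTER, SYSCALL_EXIT };

/* Call once for every syscall-stop reported by waitpid(2). */
enum syscall_stop note_syscall_stop(struct syscall_tracker *t)
{
    t->in_syscall = !t->in_syscall;
    return t->in_syscall ? SYSCALL_ENTER : SYSCALL_EXIT;
}

/* Call on a PTRACE_EVENT_SECCOMP stop (Linux 4.8 and later): the next
   syscall-stop, if any, is an exit stop even when no entry stop was seen. */
void note_seccomp_stop(struct syscall_tracker *t)
{
    t->in_syscall = true;
}

/* Call when the tracee is restarted from a syscall-enter-stop or seccomp
   stop with a command other than PTRACE_SYSCALL: no exit stop follows. */
void note_restart_without_syscall(struct syscall_tracker *t)
{
    t->in_syscall = false;
}
```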
| .PP |
| If after syscall-enter-stop, |
| the tracer uses a restarting command other than |
| .BR PTRACE_SYSCALL , |
| syscall-exit-stop is not generated. |
| .PP |
| .B PTRACE_GETSIGINFO |
| on syscall-stops returns |
| .B SIGTRAP |
| in |
| .IR si_signo , |
| with |
| .I si_code |
| set to |
| .B SIGTRAP |
| or |
| .IR (SIGTRAP|0x80) . |
| .\" |
| .SS PTRACE_EVENT_SECCOMP stops (Linux 3.5 to 4.7) |
| The behavior of |
| .BR PTRACE_EVENT_SECCOMP |
| stops and their interaction with other kinds |
| of ptrace stops has changed between kernel versions. |
| This documents the behavior |
| from their introduction until Linux 4.7 (inclusive). |
| The behavior in later kernel versions is documented in the next section. |
| .PP |
| A |
| .BR PTRACE_EVENT_SECCOMP |
| stop occurs whenever a |
| .BR SECCOMP_RET_TRACE |
| rule is triggered. |
| This is independent of which method was used to restart the system call. |
| Notably, seccomp still runs even if the tracee was restarted using |
| .BR PTRACE_SYSEMU |
| and this system call is unconditionally skipped. |
| .PP |
| Restarts from this stop will behave as if the stop had occurred right |
| before the system call in question. |
| In particular, both |
| .BR PTRACE_SYSCALL |
| and |
| .BR PTRACE_SYSEMU |
| will normally cause a subsequent syscall-entry-stop. |
| However, if after the |
| .BR PTRACE_EVENT_SECCOMP |
| the system call number is negative, |
| both the syscall-entry-stop and the system call itself will be skipped. |
| This means that if the system call number is negative after a |
| .BR PTRACE_EVENT_SECCOMP |
| and the tracee is restarted using |
| .BR PTRACE_SYSCALL , |
| the next observed stop will be a syscall-exit-stop, |
| rather than the syscall-entry-stop that might have been expected. |
| .\" |
| .SS PTRACE_EVENT_SECCOMP stops (since Linux 4.8) |
| Starting with Linux 4.8, |
| .\" commit 93e35efb8de45393cf61ed07f7b407629bf698ea |
| the |
| .BR PTRACE_EVENT_SECCOMP |
| stop was reordered to occur between syscall-entry-stop and |
| syscall-exit-stop. |
| Note that seccomp no longer runs (and no |
| .B PTRACE_EVENT_SECCOMP |
| will be reported) if the system call is skipped due to |
| .BR PTRACE_SYSEMU . |
| .PP |
| Functionally, a |
| .B PTRACE_EVENT_SECCOMP |
| stop behaves comparably |
| to a syscall-entry-stop (i.e., continuations using |
| .BR PTRACE_SYSCALL |
| will cause syscall-exit-stops, |
| the system call number may be changed and any other modified registers |
| are visible to the to-be-executed system call as well). |
| Note that there may have been, |
| but need not have been, a preceding syscall-entry-stop. |
| .PP |
| After a |
| .BR PTRACE_EVENT_SECCOMP |
| stop, seccomp will be rerun, with a |
| .BR SECCOMP_RET_TRACE |
| rule now functioning the same as a |
| .BR SECCOMP_RET_ALLOW . |
| Specifically, this means that if registers are not modified during the |
| .BR PTRACE_EVENT_SECCOMP |
| stop, the system call will then be allowed. |
| .\" |
| .SS PTRACE_SINGLESTEP stops |
| [Details of these kinds of stops are yet to be documented.] |
| .\" |
| .\" FIXME . |
| .\" document stops occurring with PTRACE_SINGLESTEP |
| .\" |
| .SS Informational and restarting ptrace commands |
| Most ptrace commands (all except |
| .BR PTRACE_ATTACH , |
| .BR PTRACE_SEIZE , |
| .BR PTRACE_TRACEME , |
| .BR PTRACE_INTERRUPT , |
| and |
| .BR PTRACE_KILL ) |
| require the tracee to be in a ptrace-stop, otherwise they fail with |
| .BR ESRCH . |
| .PP |
| When the tracee is in ptrace-stop, |
| the tracer can read and write data to |
| the tracee using informational commands. |
| These commands leave the tracee in ptrace-stopped state: |
| .PP |
| .in +4n |
| .EX |
| ptrace(PTRACE_PEEKTEXT/PEEKDATA/PEEKUSER, pid, addr, 0); |
| ptrace(PTRACE_POKETEXT/POKEDATA/POKEUSER, pid, addr, long_val); |
| ptrace(PTRACE_GETREGS/GETFPREGS, pid, 0, &struct); |
| ptrace(PTRACE_SETREGS/SETFPREGS, pid, 0, &struct); |
| ptrace(PTRACE_GETREGSET, pid, NT_foo, &iov); |
| ptrace(PTRACE_SETREGSET, pid, NT_foo, &iov); |
| ptrace(PTRACE_GETSIGINFO, pid, 0, &siginfo); |
| ptrace(PTRACE_SETSIGINFO, pid, 0, &siginfo); |
| ptrace(PTRACE_GETEVENTMSG, pid, 0, &long_var); |
| ptrace(PTRACE_SETOPTIONS, pid, 0, PTRACE_O_flags); |
| .EE |
| .in |
| .PP |
| Note that some errors are not reported. |
| For example, setting signal information |
| .RI ( siginfo ) |
| may have no effect in some ptrace-stops, yet the call may succeed |
| (return 0 and not set |
| .IR errno ); |
| querying |
| .B PTRACE_GETEVENTMSG |
| may succeed and return some random value if the current ptrace-stop |
| is not documented as returning a meaningful event message. |
| .PP |
| The call |
| .PP |
| ptrace(PTRACE_SETOPTIONS, pid, 0, PTRACE_O_flags); |
| .PP |
| affects one tracee. |
| The tracee's current flags are replaced. |
| Flags are inherited by new tracees created and "auto-attached" via active |
| .BR PTRACE_O_TRACEFORK , |
| .BR PTRACE_O_TRACEVFORK , |
| or |
| .BR PTRACE_O_TRACECLONE |
| options. |
| .PP |
| Another group of commands makes the ptrace-stopped tracee run. |
| They have the form: |
| .PP |
| ptrace(cmd, pid, 0, sig); |
| .PP |
| where |
| .I cmd |
| is |
| .BR PTRACE_CONT , |
| .BR PTRACE_LISTEN , |
| .BR PTRACE_DETACH , |
| .BR PTRACE_SYSCALL , |
| .BR PTRACE_SINGLESTEP , |
| .BR PTRACE_SYSEMU , |
| or |
| .BR PTRACE_SYSEMU_SINGLESTEP . |
| If the tracee is in signal-delivery-stop, |
| .I sig |
| is the signal to be injected (if it is nonzero). |
| Otherwise, |
| .I sig |
| may be ignored. |
| (When restarting a tracee from a ptrace-stop other than signal-delivery-stop, |
| recommended practice is to always pass 0 in |
| .IR sig .) |
| .SS Attaching and detaching |
| A thread can be attached to the tracer using the call |
| .PP |
| ptrace(PTRACE_ATTACH, pid, 0, 0); |
| .PP |
| or |
| .PP |
| ptrace(PTRACE_SEIZE, pid, 0, PTRACE_O_flags); |
| .PP |
| .B PTRACE_ATTACH |
| sends |
| .B SIGSTOP |
| to this thread. |
| If the tracer wants this |
| .B SIGSTOP |
| to have no effect, it needs to suppress it. |
| Note that if other signals are concurrently sent to |
| this thread during attach, |
| the tracer may see the tracee enter signal-delivery-stop |
| with other signal(s) first! |
| The usual practice is to reinject these signals until |
| .B SIGSTOP |
| is seen, then suppress |
| .B SIGSTOP |
| injection. |
| The design bug here is that a ptrace attach and a concurrently delivered |
| .B SIGSTOP |
| may race and the concurrent |
| .B SIGSTOP |
| may be lost. |
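| .PP |
| The usual practice described above might be sketched as follows. |
| The function name |
| .I attach_and_wait_for_sigstop |
| is hypothetical and error handling is minimal; |
| the sketch does not address the race with a concurrently delivered |
| .BR SIGSTOP . |

```c
#define _GNU_SOURCE
#include <errno.h>
#include <signal.h>
#include <stdint.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Hypothetical helper.  Returns 0 once the tracee is stopped in the
   signal-delivery-stop for SIGSTOP, or -1 on error or if the tracee
   terminated first.  On success, the caller suppresses SIGSTOP by
   passing 0 as the signal in the next restarting request. */
int attach_and_wait_for_sigstop(pid_t tid)
{
    int status, sig;

    if (ptrace(PTRACE_ATTACH, tid, (void *) 0, (void *) 0) == -1)
        return -1;

    for (;;) {
        if (waitpid(tid, &status, __WALL) == -1) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (!WIFSTOPPED(status))
            return -1;          /* tracee exited or was killed */

        sig = WSTOPSIG(status);
        if (sig == SIGSTOP)
            return 0;

        /* Another signal arrived first: reinject it and wait again. */
        if (ptrace(PTRACE_CONT, tid, (void *) 0,
                   (void *) (intptr_t) sig) == -1)
            return -1;
    }
}
```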
| .\" |
| .\" FIXME Describe how to attach to a thread which is already group-stopped. |
| .PP |
| Since attaching sends |
| .B SIGSTOP |
| and the tracer usually suppresses it, this may cause a stray |
| .B EINTR |
| return from the currently executing system call in the tracee, |
| as described in the "Signal injection and suppression" section. |
| .PP |
| Since Linux 3.4, |
| .B PTRACE_SEIZE |
| can be used instead of |
| .BR PTRACE_ATTACH . |
| .B PTRACE_SEIZE |
| does not stop the attached process. |
| If you need to stop |
| it after attach (or at any other time) without sending it any signals, |
| use the |
| .B PTRACE_INTERRUPT |
| command. |
| .PP |
| The request |
| .PP |
| ptrace(PTRACE_TRACEME, 0, 0, 0); |
| .PP |
| turns the calling thread into a tracee. |
| The thread continues to run (doesn't enter ptrace-stop). |
| A common practice is to follow the |
| .B PTRACE_TRACEME |
| with |
| .PP |
| raise(SIGSTOP); |
| .PP |
| and allow the parent (which is our tracer now) to observe our |
| signal-delivery-stop. |
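| .PP |
| A minimal sketch of this pattern is shown below. |
| The function name |
| .I observe_traceme_stop |
| is hypothetical; for brevity, the parent kills the child |
| as soon as it has observed the stop instead of continuing to trace it. |

```c
#define _GNU_SOURCE
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demonstration.  Returns the signal reported for the
   child's first ptrace-stop (SIGSTOP is expected), or -1 on failure. */
int observe_traceme_stop(void)
{
    int status, sig;
    pid_t child = fork();

    if (child == -1)
        return -1;

    if (child == 0) {
        /* Child: become a tracee of the parent, then stop. */
        if (ptrace(PTRACE_TRACEME, 0, (void *) 0, (void *) 0) == -1)
            _exit(127);
        raise(SIGSTOP);
        _exit(0);
    }

    /* Parent (now the tracer): wait for the signal-delivery-stop. */
    if (waitpid(child, &status, __WALL) == -1) {
        kill(child, SIGKILL);
        waitpid(child, &status, __WALL);
        return -1;
    }
    if (!WIFSTOPPED(status))
        return -1;              /* child exited without stopping */

    sig = WSTOPSIG(status);

    /* End the demonstration: SIGKILL terminates even a stopped tracee. */
    kill(child, SIGKILL);
    waitpid(child, &status, __WALL);
    return sig;
}
```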
| .PP |
| If the |
| .BR PTRACE_O_TRACEFORK , |
| .BR PTRACE_O_TRACEVFORK , |
| or |
| .BR PTRACE_O_TRACECLONE |
| options are in effect, then children created by, respectively, |
| .BR vfork (2) |
| or |
| .BR clone (2) |
| with the |
| .B CLONE_VFORK |
| flag, |
| .BR fork (2) |
| or |
| .BR clone (2) |
| with the exit signal set to |
| .BR SIGCHLD , |
| and other kinds of |
| .BR clone (2), |
| are automatically attached to the same tracer which traced their parent. |
| .B SIGSTOP |
| is delivered to the children, causing them to enter |
| signal-delivery-stop after they exit the system call which created them. |
| .PP |
| Detaching of the tracee is performed by: |
| .PP |
| ptrace(PTRACE_DETACH, pid, 0, sig); |
| .PP |
| .B PTRACE_DETACH |
| is a restarting operation; |
| therefore it requires the tracee to be in ptrace-stop. |
| If the tracee is in signal-delivery-stop, a signal can be injected. |
| Otherwise, the |
| .I sig |
| parameter may be silently ignored. |
| .PP |
| If the tracee is running when the tracer wants to detach it, |
| the usual solution is to send |
| .B SIGSTOP |
| (using |
| .BR tgkill (2), |
| to make sure it goes to the correct thread), |
| wait for the tracee to stop in signal-delivery-stop for |
| .B SIGSTOP |
| and then detach it (suppressing |
| .B SIGSTOP |
| injection). |
| A design bug is that this can race with concurrent |
| .BR SIGSTOP s. |
| Another complication is that the tracee may enter other ptrace-stops |
| and needs to be restarted and waited for again, until |
| .B SIGSTOP |
| is seen. |
| Yet another complication is to be sure that |
| the tracee is not already ptrace-stopped, |
| because no signal delivery happens while it is\(emnot even |
| .BR SIGSTOP . |
| .\" FIXME Describe how to detach from a group-stopped tracee so that it |
| .\" doesn't run, but continues to wait for SIGCONT. |
| .PP |
| If the tracer dies, all tracees are automatically detached and restarted, |
| unless they were in group-stop. |
| Handling of restart from group-stop is currently buggy, |
| but the "as planned" behavior is to leave the tracee stopped and waiting for |
| .BR SIGCONT . |
| If the tracee is restarted from signal-delivery-stop, |
| the pending signal is injected. |
| .SS execve(2) under ptrace |
| .\" clone(2) CLONE_THREAD says: |
| .\" If any of the threads in a thread group performs an execve(2), |
| .\" then all threads other than the thread group leader are terminated, |
| .\" and the new program is executed in the thread group leader. |
| .\" |
| When one thread in a multithreaded process calls |
| .BR execve (2), |
| the kernel destroys all other threads in the process, |
| .\" In kernel 3.1 sources, see fs/exec.c::de_thread() |
| and resets the thread ID of the execing thread to the |
| thread group ID (process ID). |
| (Or, to put things another way, when a multithreaded process does an |
| .BR execve (2), |
| at completion of the call, it appears as though the |
| .BR execve (2) |
| occurred in the thread group leader, regardless of which thread did the |
| .BR execve (2).) |
| This resetting of the thread ID looks very confusing to tracers: |
| .IP * 3 |
| All other threads stop in |
| .B PTRACE_EVENT_EXIT |
| stop, if the |
| .BR PTRACE_O_TRACEEXIT |
| option was turned on. |
| Then all other threads except the thread group leader report |
| death as if they exited via |
| .BR _exit (2) |
| with exit code 0. |
| .IP * |
| The execing tracee changes its thread ID while it is in the |
| .BR execve (2). |
| (Remember, under ptrace, the "pid" returned from |
| .BR waitpid (2), |
| or fed into ptrace calls, is the tracee's thread ID.) |
| That is, the tracee's thread ID is reset to be the same as its process ID, |
| which is the same as the thread group leader's thread ID. |
| .IP * |
| Then a |
| .B PTRACE_EVENT_EXEC |
| stop happens, if the |
| .BR PTRACE_O_TRACEEXEC |
| option was turned on. |
| .IP * |
| If the thread group leader has reported its |
| .B PTRACE_EVENT_EXIT |
| stop by this time, |
| it appears to the tracer that |
| the dead thread leader "reappears from nowhere". |
| (Note: the thread group leader does not report death via |
| .I WIFEXITED(status) |
| as long as there is at least one other live thread. |
| This eliminates the possibility that the tracer will see |
| it dying and then reappearing.) |
| If the thread group leader was still alive, |
| for the tracer this may look as if the thread group leader |
| returns from a different system call than it entered, |
| or even "returned from a system call even though |
| it was not in any system call". |
| If the thread group leader was not traced |
| (or was traced by a different tracer), then during |
| .BR execve (2) |
| it will appear as if it has become a tracee of |
| the tracer of the execing tracee. |
| .PP |
| All of the above effects are artifacts of |
| the thread ID change in the tracee. |
| .PP |
| The |
| .B PTRACE_O_TRACEEXEC |
| option is the recommended tool for dealing with this situation. |
| First, it enables |
| .BR PTRACE_EVENT_EXEC |
| stop, |
| which occurs before |
| .BR execve (2) |
| returns. |
| In this stop, the tracer can use |
| .B PTRACE_GETEVENTMSG |
| to retrieve the tracee's former thread ID. |
| (This feature was introduced in Linux 3.0.) |
| Second, the |
| .B PTRACE_O_TRACEEXEC |
| option disables legacy |
| .B SIGTRAP |
| generation on |
| .BR execve (2). |
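| .PP |
| As an illustration, a tracer might recognize the stop and fetch the |
| former thread ID along the following lines. |
| The helper names are hypothetical. |
| The status test relies on the usual encoding of ptrace event stops in the |
| .BR waitpid (2) |
| status, where |
| .I status>>8 |
| equals |
| .IR "(SIGTRAP | (PTRACE_EVENT_EXEC<<8))" . |

```c
#include <signal.h>
#include <stdbool.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>

/* True if `status` (as returned by waitpid(2)) reports a
 * PTRACE_EVENT_EXEC stop. */
bool is_exec_event_stop(int status)
{
    return WIFSTOPPED(status) &&
           (status >> 8) == (SIGTRAP | (PTRACE_EVENT_EXEC << 8));
}

/* Hypothetical helper: while the tracee `pid` is in a PTRACE_EVENT_EXEC
 * stop, fetch the thread ID it had before execve(2).
 * Returns -1 with errno set on error. */
pid_t former_thread_id(pid_t pid)
{
    unsigned long msg = 0;

    if (ptrace(PTRACE_GETEVENTMSG, pid, 0L, &msg) == -1)
        return -1;
    return (pid_t)msg;
}
```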
| .PP |
| When the tracer receives |
| .B PTRACE_EVENT_EXEC |
| stop notification, |
| it is guaranteed that, apart from this tracee and the thread group leader, |
| no other threads from the process are alive. |
| .PP |
| On receiving the |
| .B PTRACE_EVENT_EXEC |
| stop notification, |
| the tracer should clean up all its internal |
| data structures describing the threads of this process, |
| and retain only one data structure\(emone which |
| describes the single still running tracee, with |
| .PP |
| thread ID == thread group ID == process ID. |
| .PP |
| Example: two threads call |
| .BR execve (2) |
| at the same time: |
| .PP |
| .nf |
| *** we get syscall-enter-stop in thread 1: ** |
| PID1 execve("/bin/foo", "foo" <unfinished ...> |
| *** we issue PTRACE_SYSCALL for thread 1 ** |
| *** we get syscall-enter-stop in thread 2: ** |
| PID2 execve("/bin/bar", "bar" <unfinished ...> |
| *** we issue PTRACE_SYSCALL for thread 2 ** |
| *** we get PTRACE_EVENT_EXEC for PID0, we issue PTRACE_SYSCALL ** |
| *** we get syscall-exit-stop for PID0: ** |
| PID0 <... execve resumed> ) = 0 |
| .fi |
| .PP |
| If the |
| .B PTRACE_O_TRACEEXEC |
| option is |
| .I not |
| in effect for the execing tracee, |
| and if the tracee was |
| .BR PTRACE_ATTACH ed |
| rather than |
| .BR PTRACE_SEIZE d, |
| the kernel delivers an extra |
| .B SIGTRAP |
| to the tracee after |
| .BR execve (2) |
| returns. |
| This is an ordinary signal (similar to one which can be |
| generated by |
| .IR "kill -TRAP" ), |
| not a special kind of ptrace-stop. |
| Employing |
| .B PTRACE_GETSIGINFO |
| for this signal returns |
| .I si_code |
| set to 0 |
| .RI ( SI_USER ). |
| This signal may be blocked by the signal mask, |
| and thus may be delivered (much) later. |
| .PP |
| Usually, the tracer (for example, |
| .BR strace (1)) |
| would not want to show this extra post-execve |
| .B SIGTRAP |
| signal to the user, and would suppress its delivery to the tracee (if |
| .B SIGTRAP |
| is set to |
| .BR SIG_DFL , |
| it is a killing signal). |
| However, determining |
| .I which |
| .B SIGTRAP |
| to suppress is not easy. |
| Setting the |
| .B PTRACE_O_TRACEEXEC |
| option or using |
| .B PTRACE_SEIZE |
| and thus suppressing this extra |
| .B SIGTRAP |
| is the recommended approach. |
| .SS Real parent |
| The ptrace API (ab)uses the standard UNIX parent/child signaling over |
| .BR waitpid (2). |
| This used to cause the real parent of the process to stop receiving |
| several kinds of |
| .BR waitpid (2) |
| notifications when the child process is traced by some other process. |
| .PP |
| Many of these bugs have been fixed, but as of Linux 2.6.38 several still |
| exist; see BUGS below. |
| .PP |
| As of Linux 2.6.38, the following is believed to work correctly: |
| .IP * 3 |
| exit/death by signal is reported first to the tracer, then, |
| when the tracer consumes the |
| .BR waitpid (2) |
| result, to the real parent (to the real parent only when the |
| whole multithreaded process exits). |
| If the tracer and the real parent are the same process, |
| the report is sent only once. |
| .SH RETURN VALUE |
| On success, the |
| .B PTRACE_PEEK* |
| requests return the requested data (but see NOTES), |
| the |
| .B PTRACE_SECCOMP_GET_FILTER |
| request returns the number of instructions in the BPF program, and |
| other requests return zero. |
| .PP |
| On error, all requests return \-1, and |
| .I errno |
| is set appropriately. |
| Since the value returned by a successful |
| .B PTRACE_PEEK* |
| request may be \-1, the caller must clear |
| .I errno |
| before the call, and then check it afterward |
| to determine whether or not an error occurred. |
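| .PP |
| The following sketch shows one way to apply this rule. |
| The wrapper name |
| .IR peek_word () |
| and its calling convention are hypothetical; only the |
| .I errno |
| handling is taken from the text above. |

```c
#include <errno.h>
#include <sys/ptrace.h>
#include <sys/types.h>

/* Hypothetical helper: read one word from the tracee's memory at `addr`.
 * Returns 0 and stores the word in *out on success; returns -1 with
 * errno set on failure. */
int peek_word(pid_t pid, void *addr, long *out)
{
    errno = 0;                          /* clear before the call */
    long word = ptrace(PTRACE_PEEKDATA, pid, addr, 0L);

    if (word == -1 && errno != 0)
        return -1;                      /* a genuine error */

    *out = word;                        /* may legitimately be -1 */
    return 0;
}
```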
| .SH ERRORS |
| .TP |
| .B EBUSY |
| (i386 only) There was an error with allocating or freeing a debug register. |
| .TP |
| .B EFAULT |
| There was an attempt to read from or write to an invalid area in |
| the tracer's or the tracee's memory, |
| probably because the area wasn't mapped or accessible. |
| Unfortunately, under Linux, different variations of this fault |
| will return |
| .B EIO |
| or |
| .B EFAULT |
| more or less arbitrarily. |
| .TP |
| .B EINVAL |
| An attempt was made to set an invalid option. |
| .TP |
| .B EIO |
| .I request |
| is invalid, or an attempt was made to read from or |
| write to an invalid area in the tracer's or the tracee's memory, |
| or there was a word-alignment violation, |
| or an invalid signal was specified during a restart request. |
| .TP |
| .B EPERM |
| The specified process cannot be traced. |
| This could be because the |
| tracer has insufficient privileges (the required capability is |
| .BR CAP_SYS_PTRACE ); |
| unprivileged processes cannot trace processes that they |
| cannot send signals to or those running |
| set-user-ID/set-group-ID programs, for obvious reasons. |
| Alternatively, the process may already be being traced, |
| or (on kernels before 2.6.26) be |
| .BR init (1) |
| (PID 1). |
| .TP |
| .B ESRCH |
| The specified process does not exist, or is not currently being traced |
| by the caller, or is not stopped |
| (for requests that require a stopped tracee). |
| .SH CONFORMING TO |
| SVr4, 4.3BSD. |
| .SH NOTES |
| Although arguments to |
| .BR ptrace () |
| are interpreted according to the prototype given, |
| glibc currently declares |
| .BR ptrace () |
| as a variadic function with only the |
| .I request |
| argument fixed. |
| It is recommended to always supply four arguments, |
| even if the requested operation does not use them, |
| setting unused/ignored arguments to |
| .I 0L |
| or |
| .IR "(void\ *)\ 0". |
| .PP |
| In Linux kernels before 2.6.26, |
| .\" See commit 00cd5c37afd5f431ac186dd131705048c0a11fdb |
| .BR init (1), |
| the process with PID 1, may not be traced. |
| .PP |
| A tracee's parent continues to be the tracer even if that tracer calls |
| .BR execve (2). |
| .PP |
| The layout of the contents of memory and the USER area are |
| quite operating-system- and architecture-specific. |
| The offset supplied, and the data returned, |
| might not entirely match with the definition of |
| .IR "struct user" . |
| .\" See http://lkml.org/lkml/2008/5/8/375 |
| .PP |
| The size of a "word" is determined by the operating-system variant |
| (e.g., for 32-bit Linux it is 32 bits). |
| .PP |
| This page documents the way the |
| .BR ptrace () |
| call works currently in Linux. |
| Its behavior differs significantly on other flavors of UNIX. |
| In any case, use of |
| .BR ptrace () |
| is highly specific to the operating system and architecture. |
| .\" |
| .\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""" |
| .\" |
| .SS Ptrace access mode checking |
| Various parts of the kernel-user-space API (not just |
| .BR ptrace () |
| operations) require so-called "ptrace access mode" checks, |
| whose outcome determines whether an operation is permitted |
| (or, in a few cases, causes a "read" operation to return sanitized data). |
| These checks are performed in cases where one process can |
| inspect sensitive information about, |
| or in some cases modify the state of, another process. |
| The checks are based on factors such as the credentials and capabilities |
| of the two processes, |
| whether or not the "target" process is dumpable, |
| and the results of checks performed by any enabled Linux Security Module |
| (LSM)\(emfor example, SELinux, Yama, or Smack\(emand by the commoncap LSM |
| (which is always invoked). |
| .PP |
| Prior to Linux 2.6.27, all access checks were of a single type. |
| Since Linux 2.6.27, |
| .\" commit 006ebb40d3d65338bd74abb03b945f8d60e362bd |
| two access mode levels are distinguished: |
| .TP |
| .BR PTRACE_MODE_READ |
| For "read" operations or other operations that are less dangerous, |
| such as: |
| .BR get_robust_list (2); |
| .BR kcmp (2); |
| reading |
| .IR /proc/[pid]/auxv , |
| .IR /proc/[pid]/environ , |
| or |
| .IR /proc/[pid]/stat ; |
| or |
| .BR readlink (2) |
| of a |
| .IR /proc/[pid]/ns/* |
| file. |
| .TP |
| .BR PTRACE_MODE_ATTACH |
| For "write" operations, or other operations that are more dangerous, |
| such as: ptrace attaching |
| .RB ( PTRACE_ATTACH ) |
| to another process |
| or calling |
| .BR process_vm_writev (2). |
| .RB ( PTRACE_MODE_ATTACH |
| was effectively the default before Linux 2.6.27.) |
| .\" |
| .\" Regarding the above description of the distinction between |
| .\" PTRACE_MODE_READ and PTRACE_MODE_ATTACH, Stephen Smalley notes: |
| .\" |
| .\" That was the intent when the distinction was introduced, but it doesn't |
| .\" appear to have been properly maintained, e.g. there is now a common |
| .\" helper lock_trace() that is used for |
| .\" /proc/pid/{stack,syscall,personality} but checks PTRACE_MODE_ATTACH, and |
| .\" PTRACE_MODE_ATTACH is also used in timerslack_ns_write/show(). Likely |
| .\" should review and make them consistent. There was also some debate |
| .\" about proper handling of /proc/pid/fd. Arguably that one might belong |
| .\" back in the _ATTACH camp. |
| .\" |
| .PP |
| Since Linux 4.5, |
| .\" commit caaee6234d05a58c5b4d05e7bf766131b810a657 |
| the above access mode checks are combined (ORed) with |
| one of the following modifiers: |
| .TP |
| .B PTRACE_MODE_FSCREDS |
| Use the caller's filesystem UID and GID (see |
| .BR credentials (7)) |
| or effective capabilities for LSM checks. |
| .TP |
| .B PTRACE_MODE_REALCREDS |
| Use the caller's real UID and GID or permitted capabilities for LSM checks. |
| This was effectively the default before Linux 4.5. |
| .PP |
| Because combining one of the credential modifiers with one of |
| the aforementioned access modes is typical, |
| some macros are defined in the kernel sources for the combinations: |
| .TP |
| .B PTRACE_MODE_READ_FSCREDS |
| Defined as |
| .BR "PTRACE_MODE_READ | PTRACE_MODE_FSCREDS" . |
| .TP |
| .B PTRACE_MODE_READ_REALCREDS |
| Defined as |
| .BR "PTRACE_MODE_READ | PTRACE_MODE_REALCREDS" . |
| .TP |
| .B PTRACE_MODE_ATTACH_FSCREDS |
| Defined as |
| .BR "PTRACE_MODE_ATTACH | PTRACE_MODE_FSCREDS" . |
| .TP |
| .B PTRACE_MODE_ATTACH_REALCREDS |
| Defined as |
| .BR "PTRACE_MODE_ATTACH | PTRACE_MODE_REALCREDS" . |
| .PP |
| One further modifier can be ORed with the access mode: |
| .TP |
| .BR PTRACE_MODE_NOAUDIT " (since Linux 3.3)" |
| .\" commit 69f594a38967f4540ce7a29b3fd214e68a8330bd |
| .\" Just for /proc/pid/stat |
| Don't audit this access mode check. |
| This modifier is employed for ptrace access mode checks |
| (such as checks when reading |
| .IR /proc/[pid]/stat ) |
| that merely cause the output to be filtered or sanitized, |
| rather than causing an error to be returned to the caller. |
| In these cases, accessing the file is not a security violation and |
| there is no reason to generate a security audit record. |
| This modifier suppresses the generation of |
| such an audit record for the particular access check. |
| .PP |
| Note that all of the |
| .BR PTRACE_MODE_* |
| constants described in this subsection are kernel-internal, |
| and not visible to user space. |
| The constant names are mentioned here in order to label the various kinds of |
| ptrace access mode checks that are performed for various system calls |
| and accesses to various pseudofiles (e.g., under |
| .IR /proc ). |
| These names are used in other manual pages to provide a simple |
| shorthand for labeling the different kernel checks. |
| .PP |
| The algorithm employed for ptrace access mode checking determines whether |
| the calling process is allowed to perform the corresponding action |
| on the target process. |
| (In the case of opening |
| .IR /proc/[pid] |
| files, the "calling process" is the one opening the file, |
| and the process with the corresponding PID is the "target process".) |
| The algorithm is as follows: |
| .IP 1. 3 |
| If the calling thread and the target thread are in the same |
| thread group, access is always allowed. |
| .IP 2. |
| If the access mode specifies |
| .BR PTRACE_MODE_FSCREDS , |
| then, for the check in the next step, |
| employ the caller's filesystem UID and GID. |
| (As noted in |
| .BR credentials (7), |
| the filesystem UID and GID almost always have the same values |
| as the corresponding effective IDs.) |
| .IP |
| Otherwise, the access mode specifies |
| .BR PTRACE_MODE_REALCREDS , |
| so use the caller's real UID and GID for the checks in the next step. |
| (Most APIs that check the caller's UID and GID use the effective IDs. |
| For historical reasons, the |
| .BR PTRACE_MODE_REALCREDS |
| check uses the real IDs instead.) |
| .IP 3. |
| Deny access if |
| .I neither |
| of the following is true: |
| .RS |
| .IP \(bu 2 |
| The real, effective, and saved-set user IDs of the target |
| match the caller's user ID, |
| .IR and |
| the real, effective, and saved-set group IDs of the target |
| match the caller's group ID. |
| .IP \(bu |
| The caller has the |
| .B CAP_SYS_PTRACE |
| capability in the user namespace of the target. |
| .RE |
| .IP 4. |
| Deny access if the target process's "dumpable" attribute has a value other than 1 |
| .RB ( SUID_DUMP_USER ; |
| see the discussion of |
| .BR PR_SET_DUMPABLE |
| in |
| .BR prctl (2)), |
| and the caller does not have the |
| .BR CAP_SYS_PTRACE |
| capability in the user namespace of the target process. |
| .IP 5. |
| The kernel LSM |
| .IR security_ptrace_access_check () |
| interface is invoked to see if ptrace access is permitted. |
| The results depend on the LSM(s). |
| The implementation of this interface in the commoncap LSM performs |
| the following steps: |
| .\" (in cap_ptrace_access_check()): |
| .RS |
| .IP a) 3 |
| If the access mode includes |
| .BR PTRACE_MODE_FSCREDS , |
| then use the caller's |
| .I effective |
| capability set |
| in the following check; |
| otherwise (the access mode specifies |
| .BR PTRACE_MODE_REALCREDS , |
| so) use the caller's |
| .I permitted |
| capability set. |
| .IP b) |
| Deny access if |
| .I neither |
| of the following is true: |
| .RS |
| .IP \(bu 2 |
| The caller and the target process are in the same user namespace, |
| and the caller's capabilities are a superset of the target process's |
| .I permitted |
| capabilities. |
| .IP \(bu |
| The caller has the |
| .B CAP_SYS_PTRACE |
| capability in the target process's user namespace. |
| .RE |
| .IP |
| Note that the commoncap LSM does not distinguish between |
| .B PTRACE_MODE_READ |
| and |
| .BR PTRACE_MODE_ATTACH . |
| .RE |
| .IP 6. |
| If access has not been denied by any of the preceding steps, |
| then access is allowed. |
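| .PP |
| The decision flow of steps 1 to 4 and 6 can be modeled in user space |
| as shown below. |
| This is an illustrative sketch only, not kernel code: |
| .I struct task_model |
| and |
| .IR may_access_model () |
| are hypothetical names, the caller's capability is reduced to a single |
| flag, and the LSM checks of step 5 are not modeled. |

```c
#include <stdbool.h>
#include <sys/types.h>

/* Simplified, hypothetical model of a task; not a kernel structure. */
struct task_model {
    pid_t tgid;                     /* thread group ID */
    uid_t ruid, euid, suid, fsuid;
    gid_t rgid, egid, sgid, fsgid;
    int   dumpable;                 /* 1 corresponds to SUID_DUMP_USER */
};

/* `use_fscreds` selects PTRACE_MODE_FSCREDS rather than
 * PTRACE_MODE_REALCREDS; `has_cap` says whether the caller has
 * CAP_SYS_PTRACE in the user namespace of the target. */
bool may_access_model(const struct task_model *caller,
                      const struct task_model *target,
                      bool use_fscreds, bool has_cap)
{
    /* Step 1: same thread group, always allowed. */
    if (caller->tgid == target->tgid)
        return true;

    /* Step 2: choose which of the caller's IDs to compare. */
    uid_t uid = use_fscreds ? caller->fsuid : caller->ruid;
    gid_t gid = use_fscreds ? caller->fsgid : caller->rgid;

    /* Step 3: all target IDs must match, unless the caller is capable. */
    bool ids_match =
        target->ruid == uid && target->euid == uid && target->suid == uid &&
        target->rgid == gid && target->egid == gid && target->sgid == gid;
    if (!ids_match && !has_cap)
        return false;

    /* Step 4: a non-dumpable target requires the capability. */
    if (target->dumpable != 1 && !has_cap)
        return false;

    /* Step 6 (step 5 is not modeled): nothing has denied access. */
    return true;
}
```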
| .\" |
| .\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""" |
| .\" |
| .SS /proc/sys/kernel/yama/ptrace_scope |
| On systems with the Yama Linux Security Module (LSM) installed |
| (i.e., the kernel was configured with |
| .BR CONFIG_SECURITY_YAMA ), |
| the |
| .I /proc/sys/kernel/yama/ptrace_scope |
| file (available since Linux 3.4) |
| .\" commit 2d514487faf188938a4ee4fb3464eeecfbdcf8eb |
| can be used to restrict the ability to trace a process with |
| .BR ptrace () |
| (and thus also the ability to use tools such as |
| .BR strace (1) |
| and |
| .BR gdb (1)). |
| The goal of such restrictions is to prevent attack escalation whereby |
| a compromised process can ptrace-attach to other sensitive processes |
| (e.g., a GPG agent or an SSH session) owned by the user in order |
| to gain additional credentials that may exist in memory |
| and thus expand the scope of the attack. |
| .PP |
| More precisely, the Yama LSM limits two types of operations: |
| .IP * 3 |
| Any operation that performs a ptrace access mode |
| .BR PTRACE_MODE_ATTACH |
| check\(emfor example, |
| .BR ptrace () |
| .BR PTRACE_ATTACH . |
| (See the "Ptrace access mode checking" discussion above.) |
| .IP * |
| .BR ptrace () |
| .BR PTRACE_TRACEME . |
| .PP |
| A process that has the |
| .B CAP_SYS_PTRACE |
| capability can update the |
| .IR /proc/sys/kernel/yama/ptrace_scope |
| file with one of the following values: |
| .TP |
| 0 ("classic ptrace permissions") |
| No additional restrictions on operations that perform |
| .BR PTRACE_MODE_ATTACH |
| checks (beyond those imposed by the commoncap and other LSMs). |
| .IP |
| The use of |
| .BR PTRACE_TRACEME |
| is unchanged. |
| .TP |
| 1 ("restricted ptrace") [default value] |
| When performing an operation that requires a |
| .BR PTRACE_MODE_ATTACH |
| check, the calling process must either have the |
| .B CAP_SYS_PTRACE |
| capability in the user namespace of the target process or |
| it must have a predefined relationship with the target process. |
| By default, |
| the predefined relationship is that the target process |
| must be a descendant of the caller. |
| .IP |
| A target process can employ the |
| .BR prctl (2) |
| .B PR_SET_PTRACER |
| operation to declare an additional PID that is allowed to perform |
| .BR PTRACE_MODE_ATTACH |
| operations on the target. |
| See the kernel source file |
| .IR Documentation/admin\-guide/LSM/Yama.rst |
| .\" commit 90bb766440f2147486a2acc3e793d7b8348b0c22 |
| (or |
| .IR Documentation/security/Yama.txt |
| before Linux 4.13) |
| for further details. |
| .IP |
| The use of |
| .BR PTRACE_TRACEME |
| is unchanged. |
| .TP |
| 2 ("admin-only attach") |
| Only processes with the |
| .B CAP_SYS_PTRACE |
| capability in the user namespace of the target process may perform |
| .BR PTRACE_MODE_ATTACH |
| operations or trace children that employ |
| .BR PTRACE_TRACEME . |
| .TP |
| 3 ("no attach") |
| No process may perform |
| .BR PTRACE_MODE_ATTACH |
| operations or trace children that employ |
| .BR PTRACE_TRACEME . |
| .IP |
| Once this value has been written to the file, it cannot be changed. |
| .PP |
| With respect to values 1 and 2, |
| note that creating a new user namespace effectively removes the |
| protection offered by Yama. |
| This is because a process in the parent user namespace whose effective |
| UID matches the UID of the creator of a child namespace |
| has all capabilities (including |
| .BR CAP_SYS_PTRACE ) |
| when performing operations within the child user namespace |
| (and further-removed descendants of that namespace). |
| Consequently, when a process tries to use user namespaces to sandbox itself, |
| it inadvertently weakens the protections offered by the Yama LSM. |
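| .PP |
| A program can inspect the current setting by reading the file. |
| The sketch below is illustrative: the function names are hypothetical, |
| and the labels are simply those used in the list of values above. |

```c
#include <stdio.h>

/* Map a ptrace_scope value to the label used for it in the list above. */
const char *yama_scope_label(int scope)
{
    switch (scope) {
    case 0:  return "classic ptrace permissions";
    case 1:  return "restricted ptrace";
    case 2:  return "admin-only attach";
    case 3:  return "no attach";
    default: return "unknown";
    }
}

/* Read the current setting. Returns -1 if the file cannot be read or
 * parsed (for example, because Yama is not enabled in this kernel). */
int read_yama_scope(void)
{
    FILE *f = fopen("/proc/sys/kernel/yama/ptrace_scope", "r");
    if (f == NULL)
        return -1;

    int scope = -1;
    if (fscanf(f, "%d", &scope) != 1)
        scope = -1;
    fclose(f);
    return scope;
}
```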
| .\" |
| .\""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""" |
| .\" |
| .SS C library/kernel differences |
| At the system call level, the |
| .BR PTRACE_PEEKTEXT , |
| .BR PTRACE_PEEKDATA , |
| and |
| .BR PTRACE_PEEKUSER |
| requests have a different API: they store the result |
| at the address specified by the |
| .I data |
| parameter, and the return value is the error flag. |
| The glibc wrapper function provides the API given in DESCRIPTION above, |
| with the result being returned via the function return value. |
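| .PP |
| The difference can be seen by invoking the system call directly with |
| .BR syscall (2). |
| The helper name |
| .IR raw_peek_word () |
| is hypothetical, and the sketch assumes an architecture, such as x86-64, |
| on which the kernel follows the convention described above. |

```c
#include <sys/ptrace.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: PTRACE_PEEKDATA via the raw system call.
 * The kernel stores the word at the address passed as the `data`
 * argument (here `out`); the return value only reports success (0)
 * or failure (-1, with errno set). */
int raw_peek_word(pid_t pid, void *addr, unsigned long *out)
{
    return (int)syscall(SYS_ptrace, (long)PTRACE_PEEKDATA, (long)pid,
                        addr, out);
}
```

| .PP |
| Because the word and the error indication travel separately here, |
| a stored value of \-1 is not ambiguous at this level. |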
| .SH BUGS |
| On hosts with 2.6 kernel headers, |
| .B PTRACE_SETOPTIONS |
| is declared with a different value than the one for 2.4. |
| This leads to applications compiled with 2.6 kernel |
| headers failing when run on 2.4 kernels. |
| This can be worked around by redefining |
| .B PTRACE_SETOPTIONS |
| to |
| .BR PTRACE_OLDSETOPTIONS , |
| if that is defined. |
| .PP |
| Group-stop notifications are sent to the tracer, but not to the real parent. |
| Last confirmed on 2.6.38.6. |
| .PP |
| If a thread group leader is traced and exits by calling |
| .BR _exit (2), |
| .\" Note from Denys Vlasenko: |
| .\" Here "exits" means any kind of death - _exit, exit_group, |
| .\" signal death. Signal death and exit_group cases are trivial, |
| .\" though: since signal death and exit_group kill all other threads |
| .\" too, "until all other threads exit" thing happens rather soon |
| .\" in these cases. Therefore, only _exit presents observably |
| .\" puzzling behavior to ptrace users: thread leader _exit's, |
| .\" but WIFEXITED isn't reported! We are trying to explain here |
| .\" why it is so. |
| a |
| .B PTRACE_EVENT_EXIT |
| stop will happen for it (if requested), but the subsequent |
| .B WIFEXITED |
| notification will not be delivered until all other threads exit. |
| As explained above, if one of the other threads calls |
| .BR execve (2), |
| the death of the thread group leader will |
| .I never |
| be reported. |
| If the execed thread is not traced by this tracer, |
| the tracer will never know that |
| .BR execve (2) |
| happened. |
| One possible workaround is to |
| .B PTRACE_DETACH |
| the thread group leader instead of restarting it in this case. |
| Last confirmed on 2.6.38.6. |
| .\" FIXME . need to test/verify this scenario |
| .PP |
| A |
| .B SIGKILL |
| signal may still cause a |
| .B PTRACE_EVENT_EXIT |
| stop before actual signal death. |
| This may be changed in the future; |
| .B SIGKILL |
| is meant to always immediately kill tasks even under ptrace. |
| Last confirmed on Linux 3.13. |
| .PP |
| Some system calls return with |
| .B EINTR |
| if a signal was sent to a tracee, but delivery was suppressed by the tracer. |
| (This is a very typical operation: it is usually |
| done by debuggers on every attach, in order to not introduce |
| a bogus |
| .BR SIGSTOP ). |
| As of Linux 3.2.9, the following system calls are affected |
| (this list is likely incomplete): |
| .BR epoll_wait (2), |
| and |
| .BR read (2) |
| from an |
| .BR inotify (7) |
| file descriptor. |
| The usual symptom of this bug is that when you attach to |
| a quiescent process with the command |
| .PP |
| .in +4n |
| .EX |
| strace \-p <process-ID> |
| .EE |
| .in |
| .PP |
| then, instead of the usual |
| and expected one-line output such as |
| .PP |
| .in +4n |
| .EX |
| restart_syscall(<... resuming interrupted call ...>_ |
| .EE |
| .in |
| .PP |
| or |
| .PP |
| .in +4n |
| .EX |
| select(6, [5], NULL, [5], NULL_ |
| .EE |
| .in |
| .PP |
| ('_' denotes the cursor position), you observe more than one line. |
| For example: |
| .PP |
| .in +4n |
| .EX |
| clock_gettime(CLOCK_MONOTONIC, {15370, 690928118}) = 0 |
| epoll_wait(4,_ |
| .EE |
| .in |
| .PP |
| What is not visible here is that the process was blocked in |
| .BR epoll_wait (2) |
| before |
| .BR strace (1) |
| attached to it. |
| Attaching caused |
| .BR epoll_wait (2) |
| to return to user space with the error |
| .BR EINTR . |
| In this particular case, the program reacted to |
| .B EINTR |
| by checking the current time, and then executing |
| .BR epoll_wait (2) |
| again. |
| (Programs which do not expect such "stray" |
| .BR EINTR |
| errors may behave in an unintended way upon an |
| .BR strace (1) |
| attach.) |
| .PP |
| Contrary to the normal rules, the glibc wrapper for |
| .BR ptrace () |
| can set |
| .I errno |
| to zero. |
| .SH SEE ALSO |
| .BR gdb (1), |
| .BR ltrace (1), |
| .BR strace (1), |
| .BR clone (2), |
| .BR execve (2), |
| .BR fork (2), |
| .BR gettid (2), |
| .BR prctl (2), |
| .BR seccomp (2), |
| .BR sigaction (2), |
| .BR tgkill (2), |
| .BR vfork (2), |
| .BR waitpid (2), |
| .BR exec (3), |
| .BR capabilities (7), |
| .BR signal (7) |