| TRACE-CMD-RECORD(1) |
| =================== |
| |
| NAME |
| ---- |
| trace-cmd-record - record a trace from the Ftrace Linux internal tracer |
| |
| SYNOPSIS |
| -------- |
| *trace-cmd record* ['OPTIONS'] ['command'] |
| |
| DESCRIPTION |
| ----------- |
| The trace-cmd(1) record command will set up the Ftrace Linux kernel tracer to |
| record the specified plugins or events that happen while the 'command' |
| executes. If no command is given, then it will record until the user hits |
| Ctrl-C. |
| |
| The record command of trace-cmd will set up the Ftrace tracer to start tracing |
| the various events or plugins that are given on the command line. It will then |
| create a number of tracing processes (one per CPU) that will start recording |
| from the kernel ring buffer straight into temporary files. When the command is |
| complete (or Ctrl-C is hit) all the files will be combined into a trace.dat |
| file that can later be read (see trace-cmd-report(1)). |
| |
| OPTIONS |
| ------- |
| *-p* 'plugin':: |
| Specify a trace plugin. Plugins are special Ftrace tracers that usually do |
| more than just trace an event. Common plugins are *function*, |
| *function_graph*, *preemptirqsoff*, *irqsoff*, *preemptoff*, and *wakeup*. |
| A plugin must be supported by the running kernel. To see a list of |
| available plugins, see trace-cmd-list(1). |
| |
| *-e* 'event':: |
| Specify an event to trace. Various static trace points have been added to |
| the Linux kernel. They are grouped by subsystem where you can enable all |
| events of a given subsystem or specify specific events to be enabled. The |
| 'event' is of the format "subsystem:event-name". You can also just specify |
| the subsystem without the ':event-name' or the event-name without the |
| "subsystem:". Using "-e sched_switch" will enable the "sched_switch" event |
| where as, "-e sched" will enable all events under the "sched" subsystem. |
| |
| The 'event' can also contain glob expressions. That is, "*stat*" will |
| select all events (or subsystems) that have the characters "stat" in their |
| names. |
| |
| The keyword 'all' can be used to enable all events. |
| |
| *-a*:: |
| Every event that is being recorded has its output format file saved |
| in the output file to be able to display it later. But if other |
| events are enabled in the trace without trace-cmd's knowledge, the |
| formats of those events will not be recorded and trace-cmd report will |
| not be able to display them. If this is the case, then specify the |
| *-a* option and the format for all events in the system will be saved. |
| |
| *-T*:: |
| Enable a stacktrace on each event. For example: |
| |
| <idle>-0 [003] 58549.289091: sched_switch: kworker/0:1:0 [120] R ==> trace-cmd:2603 [120] |
| <idle>-0 [003] 58549.289092: kernel_stack: <stack trace> |
| => schedule (ffffffff814b260e) |
| => cpu_idle (ffffffff8100a38c) |
| => start_secondary (ffffffff814ab828) |
| |
| *--func-stack*:: |
| Enable a stack trace on all functions. Note this is only applicable |
| for the "function" plugin tracer, and will only take effect if the |
| -l option is used and succeeds in limiting functions. If the function |
| tracer is not filtered, and the stack trace is enabled, you can live |
| lock the machine. |
| |
| *-f* 'filter':: |
| Specify a filter for the previous event. This must come after a *-e*. This |
| will filter what events get recorded based on the content of the event. |
| Filtering is passed to the kernel directly so what filtering is allowed |
| may depend on what version of the kernel you have. Basically, it will |
| let you use C notation to check if an event should be processed or not. |
| |
| ---------------------------------------- |
| ==, >=, <=, >, <, &, |, && and || |
| ---------------------------------------- |
| |
| The above are usually safe to use to compare fields. |
| |
| *-v*:: |
| This will cause all events specified after it on the command line to not |
| be traced. This is useful for selecting a subsystem to be traced but to |
| leave out various events. For Example: "-e sched -v -e "\*stat\*"" will |
| enable all events in the sched subsystem except those that have "stat" in |
| their names. |
| |
| Note: the *-v* option was taken from the way grep(1) inverts the following |
| matches. |
| |
| *-F*:: |
| This will filter only the executable that is given on the command line. If |
| no command is given, then it will filter itself (pretty pointless). |
| Using *-F* will let you trace only events that are caused by the given |
| command. |
| |
| *-P* 'pid':: |
| Similar to *-F* but lets you specify a process ID to trace. |
| |
| *-c*:: |
| Used with either *-F* to trace the process' children too. |
| |
| *-o* 'output-file':: |
| By default, trace-cmd report will create a 'trace.dat' file. You can |
| specify a different file to write to with the *-o* option. |
| |
| *-l* 'function-name':: |
| This will limit the 'function' and 'function_graph' tracers to only trace |
| the given function name. More than one *-l* may be specified on the |
| command line to trace more than one function. The limited use of glob |
| expressions are also allowed. These are 'match\*' to only filter functions |
| that start with 'match'. '\*match' to only filter functions that end with |
| 'match'. '\*match\*' to only filter on functions that contain 'match'. |
| |
| *-g* 'function-name':: |
| This option is for the function_graph plugin. It will graph the given |
| function. That is, it will only trace the function and all functions that |
| it calls. You can have more than one *-g* on the command line. |
| |
| *-n* 'function-name':: |
| This has the opposite effect of *-l*. The function given with the *-n* |
| option will not be traced. This takes precedence, that is, if you include |
| the same function for both *-n* and *-l*, it will not be traced. |
| |
| *-d*:: |
| Some tracer plugins enable the function tracer by default. Like the |
| latency tracers. This option prevents the function tracer from being |
| enabled at start up. |
| |
| *-D*:: |
| The option *-d* will try to use the function-trace option to disable the |
| function tracer (if available), otherwise it defaults to the proc file: |
| /proc/sys/kernel/ftrace_enabled, but will not touch it if the function-trace |
| option is available. The *-D* option will disable both the ftrace_enabled |
| proc file as well as the function-trace option if it exists. |
| |
| Note, this disable function tracing for all users, which includes users |
| outside of ftrace tracers (stack_tracer, perf, etc). |
| |
| *-O* 'option':: |
| Ftrace has various options that can be enabled or disabled. This allows |
| you to set them. Appending the text 'no' to an option disables it. |
| For example: "-O nograph-time" will disable the "graph-time" Ftrace |
| option. |
| |
| *-s* 'interval':: |
| The processes that trace-cmd creates to record from the ring buffer need |
| to wake up to do the recording. Setting the 'interval' to zero will cause |
| the processes to wakeup every time new data is written into the buffer. |
| But since Ftrace is recording kernel activity, the act of this processes |
| going back to sleep may cause new events into the ring buffer which will |
| wake the process back up. This will needlessly add extra data into the |
| ring buffer. |
| |
| The 'interval' metric is microseconds. The default is set to 1000 (1 ms). |
| This is the time each recording process will sleep before waking up to |
| record any new data that was written to the ring buffer. |
| |
| *-r* 'priority':: |
| The priority to run the capture threads at. In a busy system the trace |
| capturing threads may be staved and events can be lost. This increases |
| the priority of those threads to the real time (FIFO) priority. |
| But use this option with care, it can also change the behaviour of |
| the system being traced. |
| |
| *-b* 'size':: |
| This sets the ring buffer size to 'size' kilobytes. Because the Ftrace |
| ring buffer is per CPU, this size is the size of each per CPU ring buffer |
| inside the kernel. Using "-b 10000" on a machine with 4 CPUs will make |
| Ftrace have a total buffer size of 40 Megs. |
| |
| *-B* 'buffer-name':: |
| If the kernel supports multiple buffers, this will add a buffer with |
| the given name. If the buffer name already exists, that buffer is just |
| reset and will not be deleted at the end of record execution. If the |
| buffer is created, it will be removed at the end of execution (unless |
| the *-k* is set, or 'start' command was used). |
| |
| After a buffer name is stated, all events added after that will be |
| associated with that buffer. If no buffer is specified, or an event |
| is specified before a buffer name, it will be associated with the |
| main (toplevel) buffer. |
| |
| trace-cmd record -e sched -B block -e block -B time -e timer sleep 1 |
| |
| The above is will enable all sched events in the main buffer. It will |
| then create a 'block' buffer instance and enable all block events within |
| that buffer. A 'time' buffer instance is created and all timer events |
| will be enabled for that event. |
| |
| *-m* 'size':: |
| The max size in kilobytes that a per cpu buffer should be. Note, due |
| to rounding to page size, the number may not be totally correct. |
| Also, this is performed by switching between two buffers that are half |
| the given size thus the output may not be of the given size even if |
| much more was written. |
| |
| Use this to prevent running out of diskspace for long runs. |
| |
| *-M* 'cpumask':: |
| Set the cpumask for to trace. It only affects the last buffer instance |
| given. If supplied before any buffer instance, then it affects the |
| main buffer. The value supplied must be a hex number. |
| |
| trace-cmd record -p function -M c -B events13 -e all -M 5 |
| |
| If the -M is left out, then the mask stays the same. To enable all |
| CPUs, pass in a value of '-1'. |
| |
| *-k*:: |
| By default, when trace-cmd is finished tracing, it will reset the buffers |
| and disable all the tracing that it enabled. This option keeps trace-cmd |
| from disabling the tracer and reseting the buffer. This option is useful for |
| debugging trace-cmd. |
| |
| Note: usually trace-cmd will set the "tracing_on" file back to what it |
| was before it was called. This option will leave that file set to zero. |
| |
| *-i*:: |
| By default, if an event is listed that trace-cmd does not find, it |
| will exit with an error. This option will just ignore events that are |
| listed on the command line but are not found on the system. |
| |
| *-N* 'host:port':: |
| If another machine is running "trace-cmd listen", this option is used to |
| have the data sent to that machine with UDP packets. Instead of writing |
| to an output file, the data is sent off to a remote box. This is ideal for |
| embedded machines with little storage, or having a single machine that |
| will keep all the data in a single repository. |
| |
| Note: This option is not supported with latency tracer plugins: |
| wakeup, wakeup_rt, irqsoff, preemptoff and preemptirqsoff |
| |
| *-t*:: |
| This option is used with *-N*, when there's a need to send the live data |
| with TCP packets instead of UDP. Although TCP is not nearly as fast as |
| sending the UDP packets, but it may be needed if the network is not that |
| reliable, the amount of data is not that intensive, and a guarantee is |
| needed that all traced information is transfered successfully. |
| |
| *--date*:: |
| With the *--date* option, "trace-cmd" will write timestamps into the |
| trace buffer after it has finished recording. It will then map the |
| timestamp to gettimeofday which will allow wall time output from the |
| timestamps reading the created 'trace.dat' file. |
| |
| EXAMPLES |
| -------- |
| |
| The basic way to trace all events: |
| |
| ------------------------------ |
| # trace-cmd record -e all ls > /dev/null |
| # trace-cmd report |
| trace-cmd-13541 [003] 106260.693809: filemap_fault: address=0x128122 offset=0xce |
| trace-cmd-13543 [001] 106260.693809: kmalloc: call_site=81128dd4 ptr=0xffff88003dd83800 bytes_req=768 bytes_alloc=1024 gfp_flags=GFP_KERNEL|GFP_ZERO |
| ls-13545 [002] 106260.693809: kfree: call_site=810a7abb ptr=0x0 |
| ls-13545 [002] 106260.693818: sys_exit_write: 0x1 |
| |
| |
| ------------------------------ |
| |
| |
| |
| To use the function tracer with sched switch tracing: |
| |
| ------------------------------ |
| # trace-cmd record -p function -e sched_switch ls > /dev/null |
| # trace-cmd report |
| ls-13587 [002] 106467.860310: function: hrtick_start_fair <-- pick_next_task_fair |
| ls-13587 [002] 106467.860313: sched_switch: prev_comm=trace-cmd prev_pid=13587 prev_prio=120 prev_state=R ==> next_comm=trace-cmd next_pid=13583 next_prio=120 |
| trace-cmd-13585 [001] 106467.860314: function: native_set_pte_at <-- __do_fault |
| trace-cmd-13586 [003] 106467.860314: function: up_read <-- do_page_fault |
| ls-13587 [002] 106467.860317: function: __phys_addr <-- schedule |
| trace-cmd-13585 [001] 106467.860318: function: _raw_spin_unlock <-- __do_fault |
| ls-13587 [002] 106467.860320: function: native_load_sp0 <-- __switch_to |
| trace-cmd-13586 [003] 106467.860322: function: down_read_trylock <-- do_page_fault |
| |
| |
| ------------------------------ |
| |
| Here is a nice way to find what interrupts have the highest latency: |
| ------------------------------------------ |
| # trace-cmd record -p function_graph -e irq_handler_entry -l do_IRQ sleep 10 |
| # trace-cmd report |
| <idle>-0 [000] 157412.933969: funcgraph_entry: | do_IRQ() { |
| <idle>-0 [000] 157412.933974: irq_handler_entry: irq=48 name=eth0 |
| <idle>-0 [000] 157412.934004: funcgraph_exit: + 36.358 us | } |
| <idle>-0 [000] 157413.895004: funcgraph_entry: | do_IRQ() { |
| <idle>-0 [000] 157413.895011: irq_handler_entry: irq=48 name=eth0 |
| <idle>-0 [000] 157413.895026: funcgraph_exit: + 24.014 us | } |
| <idle>-0 [000] 157415.891762: funcgraph_entry: | do_IRQ() { |
| <idle>-0 [000] 157415.891769: irq_handler_entry: irq=48 name=eth0 |
| <idle>-0 [000] 157415.891784: funcgraph_exit: + 22.928 us | } |
| <idle>-0 [000] 157415.934869: funcgraph_entry: | do_IRQ() { |
| <idle>-0 [000] 157415.934874: irq_handler_entry: irq=48 name=eth0 |
| <idle>-0 [000] 157415.934906: funcgraph_exit: + 37.512 us | } |
| <idle>-0 [000] 157417.888373: funcgraph_entry: | do_IRQ() { |
| <idle>-0 [000] 157417.888381: irq_handler_entry: irq=48 name=eth0 |
| <idle>-0 [000] 157417.888398: funcgraph_exit: + 25.943 us | } |
| |
| |
| ------------------------------------------ |
| |
| SEE ALSO |
| -------- |
| trace-cmd(1), trace-cmd-report(1), trace-cmd-start(1), trace-cmd-stop(1), |
| trace-cmd-extract(1), trace-cmd-reset(1), trace-cmd-split(1), |
| trace-cmd-list(1), trace-cmd-listen(1) |
| |
| AUTHOR |
| ------ |
| Written by Steven Rostedt, <rostedt@goodmis.org> |
| |
| RESOURCES |
| --------- |
| git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/trace-cmd.git |
| |
| COPYING |
| ------- |
| Copyright \(C) 2010 Red Hat, Inc. Free use of this software is granted under |
| the terms of the GNU Public License (GPL). |
| |