| '\" et |
| .TH PAX "1P" 2017 "IEEE/The Open Group" "POSIX Programmer's Manual" |
| .\" |
| .SH PROLOG |
| This manual page is part of the POSIX Programmer's Manual. |
| The Linux implementation of this interface may differ (consult |
| the corresponding Linux manual page for details of Linux behavior), |
| or the interface may not be implemented on Linux. |
| .\" |
| .SH NAME |
| pax |
| \(em portable archive interchange |
| .SH SYNOPSIS |
| .LP |
| .nf |
| pax \fB[\fR-dv\fB] [\fR-c|-n\fB] [\fR-H|-L\fB] [\fR-o \fIoptions\fB] [\fR-f \fIarchive\fB] [\fR-s \fIreplstr\fB]\fR... |
| \fB[\fIpattern\fR...\fB]\fR |
| .P |
| pax -r \fB[\fR-c|-n\fB] [\fR-dikuv\fB] [\fR-H|-L\fB] [\fR-f \fIarchive\fB] [\fR-o \fIoptions\fB]\fR... \fB[\fR-p \fIstring\fB]\fR... |
| \fB[\fR-s \fIreplstr\fB]\fR... \fB[\fIpattern\fR...\fB]\fR |
| .P |
| pax -w \fB[\fR-dituvX\fB] [\fR-H|-L\fB] [\fR-b \fIblocksize\fB] [[\fR-a\fB] [\fR-f \fIarchive\fB]] [\fR-o \fIoptions\fB]\fR... |
| \fB[\fR-s \fIreplstr\fB]\fR... \fB[\fR-x \fIformat\fB] [\fIfile\fR...\fB]\fR |
| .P |
| pax -r -w \fB[\fR-diklntuvX\fB] [\fR-H|-L\fB] [\fR-o \fIoptions\fB]\fR... \fB[\fR-p \fIstring\fB]\fR... |
| \fB[\fR-s \fIreplstr\fB]\fR... \fB[\fIfile\fR...\fB] \fIdirectory\fR |
| .fi |
| .SH DESCRIPTION |
| The |
| .IR pax |
| utility shall read, write, and write lists of the members of archive |
| files and copy directory hierarchies. A variety of archive formats |
| shall be supported; see the |
| .BR \-x |
| .IR format |
| option. |
| .P |
| The action to be taken depends on the presence of the |
| .BR \-r |
| and |
| .BR \-w |
| options. The four combinations of |
| .BR \-r |
| and |
| .BR \-w |
| are referred to as the four modes of operation: |
| .BR list , |
| .BR read , |
| .BR write , |
| and |
| .BR copy |
| modes, corresponding respectively to the four forms shown in the |
| SYNOPSIS section. |
| .IP "\fBlist\fP" 10 |
| In |
| .BR list |
| mode (when neither |
| .BR \-r |
| nor |
| .BR \-w |
| is specified), |
| .IR pax |
| shall write the names of the members of the archive file read from the |
| standard input, with pathnames matching the specified patterns, to |
| standard output. If a named file is of type directory, the file |
| hierarchy rooted at that file shall be listed as well. |
| .IP "\fBread\fP" 10 |
| In |
| .BR read |
| mode (when |
| .BR \-r |
| is specified, but |
| .BR \-w |
| is not), |
| .IR pax |
| shall extract the members of the archive file read from the standard |
| input, with pathnames matching the specified patterns. If an extracted |
| file is of type directory, the file hierarchy rooted at that file shall |
| be extracted as well. The extracted files shall be created performing |
| pathname resolution with the directory in which |
| .IR pax |
| was invoked as the current working directory. |
| .RS 10 |
| .P |
| If an attempt is made to extract a directory when the directory |
| already exists, this shall not be considered an error. If |
| an attempt is made to extract a FIFO when the FIFO already exists, |
| this shall not be considered an error. |
| .P |
| The ownership, access, and modification times, and file mode of the |
| restored files are discussed under the |
| .BR \-p |
| option. |
| .RE |
| .IP "\fBwrite\fP" 10 |
| In |
| .BR write |
| mode (when |
| .BR \-w |
| is specified, but |
| .BR \-r |
| is not), |
| .IR pax |
| shall write the contents of the |
| .IR file |
| operands to the standard output in an archive format. If no |
| .IR file |
| operands are specified, a list of files to copy, one per line, shall be |
| read from the standard input and each entry in this list shall be |
| processed as if it had been a |
| .IR file |
| operand on the command line. A file of type directory shall include |
| all of the files in the file hierarchy rooted at the file. |
| .IP "\fBcopy\fP" 10 |
| In |
| .BR copy |
| mode (when both |
| .BR \-r |
| and |
| .BR \-w |
| are specified), |
| .IR pax |
| shall copy the |
| .IR file |
| operands to the destination directory. |
| .RS 10 |
| .P |
| If no |
| .IR file |
| operands are specified, a list of files to copy, one per line, shall be |
| read from the standard input. A file of type directory shall include |
| all of the files in the file hierarchy rooted at the file. |
| .P |
| The effect of the |
| .BR copy |
| shall be as if the copied files were written to a |
| .IR pax |
| format archive file and then subsequently extracted, except that |
| copying of sockets may be supported even if archiving them in write |
| mode is not supported, and that there may be hard links between the |
| original and the copied files. If the destination directory is a |
| subdirectory of one of the files to be copied, the results |
| are unspecified. If the destination directory is a file of a |
| type not defined by the System Interfaces volume of POSIX.1\(hy2017, the results are implementation-defined; |
| otherwise, it shall be an error for the file named by the |
| .IR directory |
| operand not to exist, not be writable by the user, or not be a file of |
| type directory. |
| .RE |
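| .P |
| As an illustration of the four modes described above, the following |
| sketch archives, lists, extracts, and copies a small scratch tree. It is |
| not part of the standard: the function name, the file names, and the use |
| of the |
| .IR mktemp |
| utility (which is not specified by POSIX.1\(hy2017) are assumptions made |
| for this example, and |
| .BR "\-x ustar" |
| is chosen only because the default output format is |
| implementation-defined. |
| .sp |
| .RS 4 |
| .nf |
| |
```shell
# Hypothetical helper: exercise write, list, read, and copy modes on a
# scratch tree. Returns 0 if the extracted and copied files match the source.
pax_modes_demo() {
    work=$(mktemp -d) || return 1
    (
        cd "$work" || exit 1
        mkdir src out dest || exit 1
        echo hello > src/greeting.txt
        pax -w -x ustar -f tree.tar src || exit 1    # write: archive src/
        pax -f tree.tar || exit 1                    # list: member names
        (cd out && pax -r -f ../tree.tar) || exit 1  # read: extract into out/
        pax -r -w src dest || exit 1                 # copy: creates dest/src/
        cmp -s src/greeting.txt out/src/greeting.txt &&
            cmp -s src/greeting.txt dest/src/greeting.txt
    )
    status=$?
    rm -rf "$work"
    return "$status"
}
```
| .fi |
| .P |
| .RE |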
| .P |
| In |
| .BR read |
| or |
| .BR copy |
| modes, if intermediate directories are necessary to extract an archive |
| member, |
| .IR pax |
| shall perform actions equivalent to the |
| \fImkdir\fR() |
| function defined in the System Interfaces volume of POSIX.1\(hy2017, called with the following arguments: |
| .IP " *" 4 |
| The intermediate directory used as the |
| .IR path |
| argument |
| .IP " *" 4 |
| The value of the bitwise-inclusive OR of S_IRWXU, S_IRWXG, and S_IRWXO |
| as the |
| .IR mode |
| argument |
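| .P |
| Since |
| \fImkdir\fR() |
| modifies the file permission bits of its |
| .IR mode |
| argument by the file mode creation mask of the process, the permissions |
| of an intermediate directory can be sketched as follows. The function |
| name is hypothetical, and the mask is passed as an argument rather than |
| read from the process. |
| .sp |
| .RS 4 |
| .nf |
| |
```shell
# Hypothetical helper: print the permission bits that mkdir() would give a
# directory when called with S_IRWXU|S_IRWXG|S_IRWXO under the given mask.
intermediate_dir_mode() {
    mask=$1    # octal file mode creation mask, for example 022
    printf '%03o' "$(( (0700 | 0070 | 0007) & ~0$mask ))"
    echo
}
```
| .fi |
| .P |
| .RE |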
| .P |
| If any specified |
| .IR pattern |
| or |
| .IR file |
| operands are not matched by at least one file or archive member, |
| .IR pax |
| shall write a diagnostic message to standard error for each one that |
| did not match and exit with a non-zero exit status. |
| .P |
| The archive formats described in the EXTENDED DESCRIPTION section shall |
| be automatically detected on input. The default output archive format |
| shall be implementation-defined. |
| .P |
| A single archive can span multiple files. The |
| .IR pax |
| utility shall determine, in an implementation-defined manner, what |
| file to read or write as the next file. |
| .P |
| If the selected archive format supports the specification of linked files, |
| it shall be an error if these files cannot be linked when the archive |
| is extracted. For archive formats that do not store file contents with |
| each name that causes a hard link, if the file that contains the data |
| is not extracted during this |
| .IR pax |
| session, either the data shall be restored from the original file, or a |
| diagnostic message shall be displayed with the name of a file that can |
| be used to extract the data. In traversing directories, |
| .IR pax |
| shall detect infinite loops; that is, entering a previously visited |
| directory that is an ancestor of the last file visited. When it detects |
| an infinite loop, |
| .IR pax |
| shall write a diagnostic message to standard error and shall |
| terminate. |
| .SH OPTIONS |
| The |
| .IR pax |
| utility shall conform to the Base Definitions volume of POSIX.1\(hy2017, |
| .IR "Section 12.2" ", " "Utility Syntax Guidelines", |
| except that the order of presentation of the |
| .BR \-o , |
| .BR \-p , |
| and |
| .BR \-s |
| options is significant. |
| .P |
| The following options shall be supported: |
| .IP "\fB\-r\fP" 10 |
| Read an archive file from standard input. |
| .IP "\fB\-w\fP" 10 |
| Write files to the standard output in the specified archive format. |
| .IP "\fB\-a\fP" 10 |
| Append files to the end of the archive. It is implementation-defined |
| which devices on the system support appending. Additional file formats |
| unspecified by this volume of POSIX.1\(hy2017 may impose restrictions on appending. |
| .IP "\fB\-b\ \fIblocksize\fR" 10 |
| Block the output at a positive decimal integer number of bytes per |
| write to the archive file. Devices and archive formats may impose |
| restrictions on blocking. Blocking shall be automatically determined on |
| input. Conforming applications shall not specify a |
| .IR blocksize |
| value larger than 32\|256. Default blocking when creating archives |
| depends on the archive format. (See the |
| .BR \-x |
| option below.) |
| .IP "\fB\-c\fP" 10 |
| Match all file or archive members except those specified by the |
| .IR pattern |
| or |
| .IR file |
| operands. |
| .IP "\fB\-d\fP" 10 |
| Cause files of type directory being copied or archived or archive |
| members of type directory being extracted or listed to match only the |
| file or archive member itself and not the file hierarchy rooted at the |
| file. |
| .IP "\fB\-f\ \fIarchive\fR" 10 |
| Specify the pathname of the input or output archive, overriding the |
| default standard input (in |
| .BR list |
| or |
| .BR read |
| modes) or standard output (\c |
| .BR write |
| mode). |
| .IP "\fB\-H\fP" 10 |
| If a symbolic link referencing a file of type directory is specified on |
| the command line, |
| .IR pax |
| shall archive the file hierarchy rooted in the file referenced by the |
| link, using the name of the link as the root of the file hierarchy. |
| Otherwise, if a symbolic link referencing a file of any other file type |
| which |
| .IR pax |
| can normally archive is specified on the command line, then |
| .IR pax |
| shall archive the file referenced by the link, using the name of the |
| link. The default behavior, when neither |
| .BR \-H |
| nor |
| .BR \-L |
| is specified, shall be to archive the symbolic link itself. |
| .IP "\fB\-i\fP" 10 |
| Interactively rename files or archive members. For each archive member |
| matching a |
| .IR pattern |
| operand or file matching a |
| .IR file |
| operand, a prompt shall be written to the file |
| .BR /dev/tty . |
| The prompt shall contain the name of the file or archive member, but |
| the format is otherwise unspecified. A line shall then be read from |
| .BR /dev/tty . |
| If this line is blank, the file or archive member shall be skipped. If |
| this line consists of a single period, the file or archive member shall |
| be processed with no modification to its name. Otherwise, its name |
| shall be replaced with the contents of the line. The |
| .IR pax |
| utility shall immediately exit with a non-zero exit status if |
| end-of-file is encountered when reading a response or if |
| .BR /dev/tty |
| cannot be opened for reading and writing. |
| .RS 10 |
| .P |
| The results of extracting a hard link to a file that has been renamed |
| during extraction are unspecified. |
| .RE |
| .IP "\fB\-k\fP" 10 |
| Prevent the overwriting of existing files. |
| .IP "\fB\-l\fP" 10 |
| (The letter ell.) In |
| .BR copy |
| mode, hard links shall be made between the source and destination file |
| hierarchies whenever possible. If specified in conjunction with |
| .BR \-H |
| or |
| .BR \-L , |
| when a symbolic link is encountered, the hard link created in the |
| destination file hierarchy shall be to the file referenced by the |
| symbolic link. If specified when neither |
| .BR \-H |
| nor |
| .BR \-L |
| is specified, when a symbolic link is encountered, the implementation |
| shall create a hard link to the symbolic link in the source file |
| hierarchy or copy the symbolic link to the destination. |
| .IP "\fB\-L\fP" 10 |
| If a symbolic link referencing a file of type directory is specified on |
| the command line or encountered during the traversal of a file |
| hierarchy, |
| .IR pax |
| shall archive the file hierarchy rooted in the file referenced by the |
| link, using the name of the link as the root of the file hierarchy. |
| Otherwise, if a symbolic link referencing a file of any other file type |
| which |
| .IR pax |
| can normally archive is specified on the command line or encountered |
| during the traversal of a file hierarchy, |
| .IR pax |
| shall archive the file referenced by the link, using the name of the |
| link. The default behavior, when neither |
| .BR \-H |
| nor |
| .BR \-L |
| is specified, shall be to archive the symbolic link itself. |
| .IP "\fB\-n\fP" 10 |
| Select the first archive member that matches each |
| .IR pattern |
| operand. No more than one archive member shall be matched for each |
| pattern (although members of type directory shall still match the file |
| hierarchy rooted at that file). |
| .IP "\fB\-o\ \fIoptions\fR" 10 |
| Provide information to the implementation to modify the algorithm for |
| extracting or writing files. The value of |
| .IR options |
| shall consist of one or more |
| <comma>-separated |
| keywords of the form: |
| .RS 10 |
| .sp |
| .RS 4 |
| .nf |
| |
| \fIkeyword\fB[[\fR:\fB]\fR=\fIvalue\fB][\fR,\fIkeyword\fB[[\fR:\fB]\fR=\fIvalue\fB]\fR, ...\fB]\fR |
| .fi |
| .P |
| .RE |
| .P |
| Some keywords apply only to certain file formats, as indicated with |
| each description. Use of keywords that are inapplicable to the file |
| format being processed produces undefined results. |
| .P |
| Keywords in the |
| .IR options |
| argument shall be a string that would be a valid portable filename as |
| described in the Base Definitions volume of POSIX.1\(hy2017, |
| .IR "Section 3.282" ", " "Portable Filename Character Set". |
| .TP 10 |
| .BR Note: |
| Keywords are not expected to be filenames, merely to follow the same |
| character composition rules as portable filenames. |
| .P |
| .P |
| Keywords can be preceded with white space. The |
| .IR value |
| field shall consist of zero or more characters; within |
| .IR value , |
| the application shall precede any literal |
| <comma> |
| with a |
| <backslash>, |
| which shall be ignored, but preserves the |
| <comma> |
| as part of |
| .IR value . |
| A |
| <comma> |
| as the final character, or a |
| <comma> |
| followed solely by white space as the final characters, in |
| .IR options |
| shall be ignored. Multiple |
| .BR \-o |
| options can be specified; if keywords given to these multiple |
| .BR \-o |
| options conflict, the keywords and values appearing later in command |
| line sequence shall take precedence and the earlier shall be silently |
| ignored. The following keyword values of |
| .IR options |
| shall be supported for the file formats as indicated: |
| .IP "\fBdelete\fR=\fIpattern\fR" 6 |
| .br |
| (Applicable only to the |
| .BR \-x |
| .BR pax |
| format.) When used in |
| .BR write |
| or |
| .BR copy |
| mode, |
| .IR pax |
| shall omit from extended header records that it produces any keywords |
| matching the string pattern. When used in |
| .BR read |
| or |
| .BR list |
| mode, |
| .IR pax |
| shall ignore any keywords matching the string pattern in the extended |
| header records. In both cases, matching shall be performed using the |
| pattern matching notation described in |
| .IR "Section 2.13.1" ", " "Patterns Matching a Single Character" |
| and |
| .IR "Section 2.13.2" ", " "Patterns Matching Multiple Characters". |
| For example: |
| .RS 6 |
| .sp |
| .RS 4 |
| .nf |
| |
| -o \fBdelete\fR=\fIsecurity\fR.* |
| .fi |
| .P |
| .RE |
| .P |
| would suppress security-related information. See |
| .IR "pax Extended Header" |
| for extended header record keyword usage. |
| .P |
| When multiple |
| .BR \-o \c |
| .BR delete=pattern |
| options are specified, the patterns shall be additive; all keywords |
| matching the specified string patterns shall be omitted from extended |
| header records that |
| .IR pax |
| produces. |
| .RE |
| .IP "\fBexthdr.name\fR=\fIstring\fR" 6 |
| .br |
| (Applicable only to the |
| .BR \-x |
| .BR pax |
| format.) This keyword allows user control over the name that is written |
| into the |
| .BR ustar |
| header blocks for the extended header produced under the circumstances |
| described in |
| .IR "pax Header Block". |
| The name shall be the contents of |
| .IR string , |
| after the following character substitutions have been made: |
| .TS |
| center box tab(!); |
| cB | cB |
| cB | cB |
| lf5 | lw(3.8i). |
| \fIstring\fP |
| Includes:!Replaced by: |
| _ |
| %d!T{ |
| The directory name of the file, equivalent to the result of the |
| .IR dirname |
| utility on the translated pathname. |
| T} |
| %f!T{ |
| The filename of the file, equivalent to the result of the |
| .IR basename |
| utility on the translated pathname. |
| T} |
| %p!T{ |
| The process ID of the |
| .IR pax |
| process. |
| T} |
| %%!T{ |
| A |
| .BR '%' |
| character. |
| T} |
| .TE |
| .RS 6 |
| .P |
| Any other |
| .BR '%' |
| characters in |
| .IR string |
| produce undefined results. |
| .P |
| If no |
| .BR \-o |
| .BR exthdr.name=string |
| is specified, |
| .IR pax |
| shall use the following default value: |
| .sp |
| .RS 4 |
| .nf |
| |
| %d/PaxHeaders.%p/%f |
| .fi |
| .P |
| .RE |
| .RE |
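| .RS 6 |
| .P |
| The substitutions listed above can be modeled with a short shell |
| function. This is only a sketch: the function name and its three |
| arguments are hypothetical, the process ID is passed in explicitly |
| rather than taken from a running |
| .IR pax |
| process, and any other |
| .BR '%' |
| sequence, whose result is undefined, is simply copied unchanged. |
| .sp |
| .RS 4 |
| .nf |
| |
```shell
# Hypothetical model of the exthdr.name substitutions; not how any
# particular pax implementation builds the name.
expand_exthdr_name() {
    template=$1    # for example: %d/PaxHeaders.%p/%f
    path=$2        # translated pathname of the archive member
    pid=$3         # process ID to substitute for %p
    dir=$(dirname "$path")
    base=$(basename "$path")
    out=
    while [ -n "$template" ]; do
        case $template in
            %d*) out=$out$dir;  template=${template#"%d"} ;;
            %f*) out=$out$base; template=${template#"%f"} ;;
            %p*) out=$out$pid;  template=${template#"%p"} ;;
            %%*) out=$out%;     template=${template#"%%"} ;;
            *)   rest=${template#?}    # copy one ordinary character
                 out=$out${template%"$rest"}
                 template=$rest ;;
        esac
    done
    printf '%s' "$out"
    echo
}
```
| .fi |
| .P |
| .RE |
| .P |
| With the default template, a pathname of |
| .BR src/lib/util.c , |
| and a process ID of 1234, the function writes |
| .BR src/lib/PaxHeaders.1234/util.c . |
| .RE |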
| .IP "\fBglobexthdr.name\fR=\fIstring\fR" 6 |
| .br |
| (Applicable only to the |
| .BR \-x |
| .BR pax |
| format.) When used in |
| .BR write |
| or |
| .BR copy |
| mode with the appropriate options, |
| .IR pax |
| shall create global extended header records with |
| .BR ustar |
| header blocks that will be treated as regular files by previous |
| versions of |
| .IR pax . |
| This keyword allows user control over the name that is written into the |
| .BR ustar |
| header blocks for global extended header records. The name shall be the |
| contents of |
| .IR string , |
| after the following character substitutions have been made: |
| .TS |
| center box tab(!); |
| cB | cB |
| cB | cB |
| lf5 | lw(3.8i). |
| \fIstring\fP |
| Includes:!Replaced by: |
| _ |
| %n!T{ |
| An integer that represents the sequence number of the global extended |
| header record in the archive, starting at 1. |
| T} |
| %p!T{ |
| The process ID of the |
| .IR pax |
| process. |
| T} |
| %%!T{ |
| A |
| .BR '%' |
| character. |
| T} |
| .TE |
| .RS 6 |
| .P |
| Any other |
| .BR '%' |
| characters in |
| .IR string |
| produce undefined results. |
| .P |
| If no |
| .BR \-o |
| .BR globexthdr.name=string |
| is specified, |
| .IR pax |
| shall use the following default value: |
| .sp |
| .RS 4 |
| .nf |
| |
| $TMPDIR/GlobalHead.%p.%n |
| .fi |
| .P |
| .RE |
| .P |
| where $\c |
| .IR TMPDIR |
| represents the value of the |
| .IR TMPDIR |
| environment variable. If |
| .IR TMPDIR |
| is not set, |
| .IR pax |
| shall use |
| .BR /tmp . |
| .RE |
| .IP "\fBinvalid\fR=\fIaction\fR" 6 |
| .br |
| (Applicable only to the |
| .BR \-x |
| .BR pax |
| format.) This keyword allows user control over the action |
| .IR pax |
| takes upon encountering values in an extended header record that, in |
| .BR read |
| or |
| .BR copy |
| mode, are invalid in the destination hierarchy or, in |
| .BR list |
| mode, cannot be written in the codeset and current locale of the |
| implementation. The following are invalid values that shall be |
| recognized by |
| .IR pax : |
| .RS 6 |
| .IP -- 4 |
| In |
| .BR read |
| or |
| .BR copy |
| mode, a filename or link name that contains character encodings |
| invalid in the destination hierarchy. (For example, the name may |
| contain embedded NULs.) |
| .IP -- 4 |
| In |
| .BR read |
| or |
| .BR copy |
| mode, a filename or link name that is longer than the maximum allowed |
| in the destination hierarchy (for either a pathname component or the |
| entire pathname). |
| .IP -- 4 |
| In |
| .BR list |
| mode, any character string value (filename, link name, user name, and |
| so on) that cannot be written in the codeset and current locale of the |
| implementation. |
| .P |
| The following mutually-exclusive values of the |
| .IR action |
| argument are supported: |
| .IP "\fBbinary\fR" 10 |
| In |
| .BR write |
| mode, |
| .IR pax |
| shall generate a |
| .BR hdrcharset = BINARY |
| extended header record for each file with a filename, link name, group |
| name, owner name, or any other field in an extended header record that |
| cannot be translated to the UTF\(hy8 codeset, allowing the archive to |
| contain the files with unencoded extended header record values. In |
| .BR read |
| or |
| .BR copy |
| mode, |
| .IR pax |
| shall use the values specified in the header without translation, |
| regardless of whether this may overwrite an existing file with a valid |
| name. In |
| .BR list |
| mode, |
| .IR pax |
| shall behave identically to the |
| .BR bypass |
| action. |
| .IP "\fBbypass\fR" 10 |
| In |
| .BR read |
| or |
| .BR copy |
| mode, |
| .IR pax |
| shall bypass the file, causing no change to the destination hierarchy. |
| In |
| .BR list |
| mode, |
| .IR pax |
| shall write all requested valid values for the file, but its method for |
| writing invalid values is unspecified. |
| .IP "\fBrename\fR" 10 |
| In |
| .BR read |
| or |
| .BR copy |
| mode, |
| .IR pax |
| shall act as if the |
| .BR \-i |
| option were in effect for each file with invalid filename or link name |
| values, allowing the user to provide a replacement name interactively. |
| In |
| .BR list |
| mode, |
| .IR pax |
| shall behave identically to the |
| .BR bypass |
| action. |
| .IP "\fBUTF\(hy8\fR" 10 |
| When used in |
| .BR read , |
| .BR copy , |
| or |
| .BR list |
| mode and a filename, link name, owner name, or any other field in an |
| extended header record cannot be translated from the |
| .BR pax |
| UTF\(hy8 codeset format to the codeset and current locale of the |
| implementation, |
| .IR pax |
| shall use the actual UTF\(hy8 encoding for the name. If a |
| .BR hdrcharset |
| extended header record is in effect for this file, the character set |
| specified by that record shall be used instead of UTF\(hy8. If a |
| .BR hdrcharset = BINARY |
| extended header record is in effect for this file, no translation shall |
| be performed. |
| .IP "\fBwrite\fR" 10 |
| In |
| .BR read |
| or |
| .BR copy |
| mode, |
| .IR pax |
| shall write the file, translating the name, regardless of whether this |
| may overwrite an existing file with a valid name. In |
| .BR list |
| mode, |
| .IR pax |
| shall behave identically to the |
| .BR bypass |
| action. |
| .P |
| If no |
| .BR \-o |
| .BR invalid=action |
| is specified, |
| .IR pax |
| shall act as if |
| .BR \-o \c |
| .BR invalid=bypass |
| were specified. Any overwriting of existing files that may be allowed |
| by the |
| .BR \-o \c |
| .BR invalid= |
| actions shall be subject to permission (\c |
| .BR \-p ) |
| and modification time (\c |
| .BR \-u ) |
| restrictions, and shall be suppressed if the |
| .BR \-k |
| option is also specified. |
| .RE |
| .IP "\fBlinkdata\fP" 6 |
| .br |
| (Applicable only to the |
| .BR \-x |
| .BR pax |
| format.) In |
| .BR write |
| mode, |
| .IR pax |
| shall write the contents of a file to the archive even when that file |
| is merely a hard link to a file whose contents have already been |
| written to the archive. |
| .IP "\fBlistopt\fR=\fIformat\fP" 6 |
| .br |
| This keyword specifies the output format of the table of contents |
| produced when the |
| .BR \-v |
| option is specified in |
| .BR list |
| mode. See |
| .IR "List Mode Format Specifications". |
| To avoid ambiguity, the |
| .BR listopt=format |
| shall be the only or final |
| .BR keyword=value |
| pair in a |
| .BR \-o |
| option-argument; all characters in the remainder of the option-argument |
| shall be considered part of the format string. When multiple |
| .BR \-o \c |
| .BR listopt=format |
| options are specified, the format strings shall be considered a single, |
| concatenated string, evaluated in command line order. |
| .IP "\fBtimes\fR" 6 |
| .br |
| (Applicable only to the |
| .BR \-x |
| .BR pax |
| format.) When used in |
| .BR write |
| or |
| .BR copy |
| mode, |
| .IR pax |
| shall include |
| .BR atime |
| and |
| .BR mtime |
| extended header records for each file. See |
| .IR "pax Extended Header File Times". |
| .P |
| In addition to these keywords, if the |
| .BR \-x |
| .BR pax |
| format is specified, any of the keywords and values defined in |
| .IR "pax Extended Header", |
| including implementation extensions, can be used in |
| .BR \-o |
| option-arguments, in either of two modes: |
| .IP "\fBkeyword\fR=\fIvalue\fR" 6 |
| .br |
| When used in |
| .BR write |
| or |
| .BR copy |
| mode, these keyword/value pairs shall be included at the beginning of |
| the archive as |
| .BR typeflag |
| .BR g |
| global extended header records. When used in |
| .BR read |
| or |
| .BR list |
| mode, these keyword/value pairs shall act as if they had been at the |
| beginning of the archive as |
| .BR typeflag |
| .BR g |
| global extended header records. |
| .IP "\fBkeyword\fR:=\fIvalue\fR" 6 |
| .br |
| When used in |
| .BR write |
| or |
| .BR copy |
| mode, these keyword/value pairs shall be included as records at the |
| beginning of a |
| .BR typeflag |
| .BR x |
| extended header for each file. (This shall be equivalent to the |
| <equals-sign> |
| form except that it creates no |
| .BR typeflag |
| .BR g |
| global extended header records.) When used in |
| .BR read |
| or |
| .BR list |
| mode, these keyword/value pairs shall act as if they were included as |
| records at the end of each extended header; thus, they shall override |
| any global or file-specific extended header record keywords of the same |
| names. For example, in the command: |
| .RS 6 |
| .sp |
| .RS 4 |
| .nf |
| |
| pax -r -o " |
| gname:=mygroup, |
| " <archive |
| .fi |
| .P |
| .RE |
| .P |
| the group name will be forced to a new value for all files read from |
| the archive. |
| .RE |
| .P |
| The precedence of |
| .BR \-o |
| keywords over various fields in the archive is described in |
| .IR "pax Extended Header Keyword Precedence". |
| If the |
| .BR \-o |
| .BR delete =\c |
| .IR pattern , |
| .BR \-o |
| .BR keyword =\c |
| .IR value , |
| or |
| .BR \-o |
| .BR keyword :=\c |
| .IR value |
| options are used to override or remove any extended header data needed |
| to find files in an archive (e.g., |
| .BR "-o delete=size" |
| for a file whose size cannot be represented in a |
| .BR ustar |
| header or |
| .BR "-o size=100" |
| for a file whose size is not 100 bytes), the behavior is undefined. |
| .RE |
| .IP "\fB\-p\ \fIstring\fR" 10 |
| Specify one or more file characteristic options (privileges). The |
| .IR string |
| option-argument shall be a string specifying file characteristics to be |
| retained or discarded on extraction. The string shall consist of the |
| specification characters |
| .BR a , |
| .BR e , |
| .BR m , |
| .BR o , |
| and |
| .BR p . |
| Other implementation-defined characters can be included. Multiple |
| characteristics can be concatenated within the same string and multiple |
| .BR \-p |
| options can be specified. The meanings of the specification characters |
| are as follows: |
| .RS 10 |
| .IP "\fRa\fP" 6 |
| Do not preserve file access times. |
| .IP "\fRe\fP" 6 |
| Preserve the user ID, group ID, file mode bits (see the Base Definitions volume of POSIX.1\(hy2017, |
| .IR "Section 3.169" ", " "File Mode Bits"), |
| access time, modification time, and any other implementation-defined |
| file characteristics. |
| .IP "\fRm\fP" 6 |
| Do not preserve file modification times. |
| .IP "\fRo\fP" 6 |
| Preserve the user ID and group ID. |
| .IP "\fRp\fP" 6 |
| Preserve the file mode bits. Other implementation-defined file mode |
| attributes may be preserved. |
| .P |
| In the preceding list, ``preserve'' indicates that an attribute stored |
| in the archive shall be given to the extracted file, subject to the |
| permissions of the invoking process. The access and modification times |
| of the file shall be preserved unless otherwise specified with the |
| .BR \-p |
| option or not stored in the archive. All attributes that are not |
| preserved shall be determined as part of the normal file creation |
| action (see |
| .IR "Section 1.1.1.4" ", " "File Read" ", " "Write" ", " "and Creation"). |
| .P |
| If neither the |
| .BR e |
| nor the |
| .BR o |
| specification character is specified, or the user ID and group ID are |
| not preserved for any reason, |
| .IR pax |
| shall not set the S_ISUID and S_ISGID bits of the file mode. |
| .P |
| If the preservation of any of these items fails for any reason, |
| .IR pax |
| shall write a diagnostic message to standard error. Failure to preserve |
| these items shall affect the final exit status, but shall not cause the |
| extracted file to be deleted. |
| .P |
| If file characteristic letters in any of the |
| .IR string |
| option-arguments are duplicated or conflict with each other, the ones |
| given last shall take precedence. For example, if |
| .BR \-p |
| .BR eme |
| is specified, file modification times are preserved. |
| .RE |
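| .RS 10 |
| .P |
| The precedence rule above can be sketched as a left-to-right scan in |
| which each specification character overrides the effect of earlier ones. |
| The function name and the yes/no output format are hypothetical, and |
| implementation-defined characters are ignored in this model. |
| .sp |
| .RS 4 |
| .nf |
| |
```shell
# Hypothetical model: report which characteristics a -p string preserves.
# Defaults follow the text above: times are preserved, owner and mode are not.
resolve_p_string() {
    spec=$1
    atime=yes mtime=yes owner=no mode=no
    while [ -n "$spec" ]; do
        rest=${spec#?}
        c=${spec%"$rest"}    # first character of the remaining string
        case $c in
            a) atime=no ;;
            m) mtime=no ;;
            o) owner=yes ;;
            p) mode=yes ;;
            e) atime=yes mtime=yes owner=yes mode=yes ;;
        esac
        spec=$rest
    done
    echo "atime=$atime mtime=$mtime owner=$owner mode=$mode"
}
```
| .fi |
| .P |
| .RE |
| .P |
| For the string |
| .BR eme , |
| the function reports |
| .BR mtime=yes , |
| in agreement with the example above. Several |
| .BR \-p |
| option-arguments could be modeled by passing them concatenated in |
| command line order. |
| .RE |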
| .IP "\fB\-s\ \fIreplstr\fR" 10 |
| Modify file or archive member names named by |
| .IR pattern |
| or |
| .IR file |
| operands according to the substitution expression |
| .IR replstr , |
| using the syntax of the |
| .IR ed |
| utility. The concepts of ``address'' and ``line'' are meaningless in |
| the context of the |
| .IR pax |
| utility, and shall not be supplied. The format shall be: |
| .RS 10 |
| .sp |
| .RS 4 |
| .nf |
| |
| -s /\fIold\fR/\fInew\fR/\fB[\fRgp\fB]\fR |
| .fi |
| .P |
| .RE |
| .P |
| where, as in |
| .IR ed , |
| .IR old |
| is a basic regular expression and |
| .IR new |
| can contain an |
| <ampersand>, |
| .BR '\en' |
| (where |
| .IR n |
| is a digit) back-references, or subexpression matching. The |
| .IR old |
| string shall also be permitted to contain |
| <newline> |
| characters. |
| .P |
| Any non-null character can be used as a delimiter (\c |
| .BR '/' |
| shown here). Multiple |
| .BR \-s |
| expressions can be specified; the expressions shall be applied in the |
| order specified, terminating with the first successful substitution. |
| The optional trailing |
| .BR 'g' |
| is as defined in the |
| .IR ed |
| utility. The optional trailing |
| .BR 'p' |
| shall cause successful substitutions to be written to standard error. |
| File or archive member names that substitute to the empty string shall |
| be ignored when reading and writing archives. |
| .RE |
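| .RS 10 |
| .P |
| The rule that expressions are tried in order, stopping at the first |
| successful substitution, can be sketched with the |
| .IR sed |
| utility, whose |
| .BR s |
| command accepts a similar syntax based on basic regular expressions. |
| This is a model under stated assumptions, not a description of any |
| implementation: the function name is hypothetical, each expression is |
| given without the leading |
| .BR \-s , |
| the trailing |
| .BR 'p' |
| is not handled, and names containing |
| <newline> |
| or |
| <backslash> |
| characters are not supported. |
| .sp |
| .RS 4 |
| .nf |
| |
```shell
# Hypothetical model: apply replstr expressions such as /old/new/ to a name,
# in order, and stop after the first expression that substitutes.
rename_member() {
    name=$1
    shift
    for expr in "$@"; do
        # sed prints the line only when the substitution succeeds; the
        # second sed marks that output so an empty result is still detected.
        out=$(echo "$name" | sed -n "s${expr}p" | sed 's/^/=/')
        if [ -n "$out" ]; then
            echo "${out#=}"
            return 0
        fi
    done
    echo "$name"    # no expression matched: the name is unchanged
}
```
| .fi |
| .P |
| .RE |
| .RE |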
| .IP "\fB\-t\fP" 10 |
| When reading files from the file system, and if the user has the |
| permissions required by |
| \fIutime\fR() |
| to do so, set the access time of each file read to the access time that |
| it had before being read by |
| .IR pax . |
| .IP "\fB\-u\fP" 10 |
| Ignore files that are older (having a less recent file modification |
| time) than a pre-existing file or archive member with the same name. |
| In |
| .BR read |
| mode, an archive member with the same name as a file in the file system |
| shall be extracted if the archive member is newer than the file. In |
| .BR write |
| mode, an archive file member with the same name as a file in the file |
| system shall be superseded if the file is newer than the archive |
| member. If |
| .BR \-a |
| is also specified, this is accomplished by appending to the archive; |
| otherwise, it is unspecified whether this is accomplished by actual |
| replacement in the archive or by appending to the archive. In |
| .BR copy |
| mode, the file in the destination hierarchy shall be replaced by the |
| file in the source hierarchy or by a link to the file in the source |
| hierarchy if the file in the source hierarchy is newer. |
| .IP "\fB\-v\fP" 10 |
| In |
| .BR list |
| mode, produce a verbose table of contents (see the STDOUT section). |
| Otherwise, write archive member pathnames to standard error (see the |
| STDERR section). |
| .IP "\fB\-x\ \fIformat\fR" 10 |
| Specify the output archive format. The |
| .IR pax |
| utility shall support the following formats: |
| .RS 10 |
| .IP "\fBcpio\fR" 10 |
| The |
| .BR cpio |
| interchange format; see the EXTENDED DESCRIPTION section. The default |
| .IR blocksize |
| for this format for character special archive files shall be 5\|120. |
| Implementations shall support all |
| .IR blocksize |
| values less than or equal to 32\|256 that are multiples of 512. |
| .IP "\fBpax\fR" 10 |
| The |
| .BR pax |
| interchange format; see the EXTENDED DESCRIPTION section. The default |
| .IR blocksize |
| for this format for character special archive files shall be 5\|120. |
| Implementations shall support all |
| .IR blocksize |
| values less than or equal to 32\|256 that are multiples of 512. |
| .IP "\fBustar\fR" 10 |
| The |
| .BR tar |
| interchange format; see the EXTENDED DESCRIPTION section. The default |
| .IR blocksize |
| for this format for character special archive files shall be 10\|240. |
| Implementations shall support all |
| .IR blocksize |
| values less than or equal to 32\|256 that are multiples of 512. |
| .P |
| Implementation-defined formats shall specify a default block size as |
| well as any other block sizes supported for character special archive |
| files. |
| .P |
| Any attempt to append to an archive file in a format different from the |
| existing archive format shall cause |
| .IR pax |
| to exit immediately with a non-zero exit status. |
| .RE |
| .IP "\fB\-X\fP" 10 |
| When traversing the file hierarchy specified by a pathname, |
| .IR pax |
| shall not descend into directories that have a different device ID (\c |
| .IR st_dev ; |
| see the System Interfaces volume of POSIX.1\(hy2017, |
| \fIstat\fR()). |
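| .P |
| The block size rules given above for the |
| .BR \-x |
| formats can be illustrated with a short, non-normative Python sketch. |
| The names |
| .BR DEFAULT_BLOCKSIZE |
| and |
| .BR must_support_blocksize |
| are hypothetical and do not belong to any |
| .IR pax |
| implementation; the sketch also assumes that only positive values are |
| meaningful. |
| .sp |
| .RS 4 |
| .nf |
```python
# Default blocksize for character special archive files, per format.
DEFAULT_BLOCKSIZE = {"cpio": 5120, "pax": 5120, "ustar": 10240}

# Largest blocksize that every implementation is required to accept.
MAX_REQUIRED_BLOCKSIZE = 32256


def must_support_blocksize(blocksize):
    """True if every conforming implementation has to accept this value."""
    return (
        0 < blocksize <= MAX_REQUIRED_BLOCKSIZE
        and blocksize % 512 == 0
    )
```
| .fi |
| .P |
| .RE |
| .P |
| An implementation may accept other values as well; the function only |
| reports the values that are guaranteed to be portable. |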
| .P |
| Specifying more than one of the mutually-exclusive options |
| .BR \-H |
| and |
| .BR \-L |
| shall not be considered an error and the last option specified shall |
| determine the behavior of the utility. |
| .P |
| The options that operate on the names of files or archive members (\c |
| .BR \-c , |
| .BR \-i , |
| .BR \-n , |
| .BR \-s , |
| .BR \-u , |
| and |
| .BR \-v ) |
| shall interact as follows. In |
| .BR read |
| mode, the archive members shall be selected based on the user-specified |
| .IR pattern |
| operands as modified by the |
| .BR \-c , |
| .BR \-n , |
| and |
| .BR \-u |
| options. Then, any |
| .BR \-s |
| and |
| .BR \-i |
| options shall modify, in that order, the names of the selected files. |
| The |
| .BR \-v |
| option shall write names resulting from these modifications. |
| .P |
| In |
| .BR write |
| mode, the files shall be selected based on the user-specified |
| pathnames as modified by the |
| .BR \-n |
| and |
| .BR \-u |
| options. Then, any |
| .BR \-s |
| and |
| .BR \-i |
| options shall modify, in that order, the names of these selected files. |
| The |
| .BR \-v |
| option shall write names resulting from these modifications. |
| .P |
| If both the |
| .BR \-u |
| and |
| .BR \-n |
| options are specified, |
| .IR pax |
| shall not consider a file selected unless it is newer than the file to |
| which it is compared. |
| .SS "List Mode Format Specifications" |
| .P |
| In |
| .BR list |
| mode with the |
| .BR \-o |
| .BR listopt=format |
| option, the |
| .IR format |
| argument shall be applied for each selected file. The |
| .IR pax |
| utility shall append a |
| <newline> |
| to the |
| .BR listopt |
| output for each selected file. The |
| .IR format |
| argument shall be used as the |
| .IR format |
| string described in the Base Definitions volume of POSIX.1\(hy2017, |
| .IR "Chapter 5" ", " "File Format Notation", |
| with the exceptions 1. through 6. defined in the EXTENDED DESCRIPTION |
| section of |
| .IR printf , |
| plus the following exceptions: |
| .IP 7. 6 |
| The sequence (\c |
| .IR keyword ) |
| can occur before a format conversion specifier. The conversion |
| argument is defined by the value of |
| .IR keyword . |
| The implementation shall support the following keywords: |
| .RS 6 |
| .IP -- 4 |
| Any of the Field Name entries in |
| .IR "Table 4-14, ustar Header Block" |
| and |
| .IR "Table 4-16, Octet-Oriented cpio Archive Entry". |
| The implementation may support the |
| .IR cpio |
| keywords without the leading |
| .BR c_ |
| in addition to the form required by |
| .IR "Table 4-16, Octet-Oriented cpio Archive Entry". |
| .IP -- 4 |
| Any keyword defined for the extended header in |
| .IR "pax Extended Header". |
| .IP -- 4 |
| Any keyword provided as an implementation-defined extension within |
| the extended header defined in |
| .IR "pax Extended Header". |
| .P |
| For example, the sequence |
| .BR \(dq%(charset)s\(dq |
| is the string value of the name of the character set in the extended |
| header. |
| .P |
| The result of the keyword conversion argument shall be the value from |
| the applicable header field or extended header, without any trailing |
| NULs. |
| .P |
| All keyword values used as conversion arguments shall be translated |
| from the UTF\(hy8 encoding (or alternative encoding specified by any |
| .BR hdrcharset |
| extended header record) to the character set appropriate for the local |
| file system, user database, and so on, as applicable. |
| .RE |
| .IP 8. 6 |
| An additional conversion specifier character, |
| .BR T , |
| shall be used to specify time formats. The |
| .BR T |
| conversion specifier character can be preceded by the sequence (\c |
| .IR keyword= \c |
| .IR subformat ), |
| where |
| .IR subformat |
| is a date format as defined by |
| .IR date |
| operands. The default |
| .IR keyword |
| shall be |
| .BR mtime |
| and the default subformat shall be: |
| .RS 6 |
| .sp |
| .RS 4 |
| .nf |
| |
| %b %e %H:%M %Y |
| .fi |
| .P |
| .RE |
| .RE |
| .IP 9. 6 |
| An additional conversion specifier character, |
| .BR M , |
| shall be used to specify the file mode string as defined in |
| .IR ls |
| Standard Output. If (\c |
| .IR keyword ) |
| is omitted, the |
| .BR mode |
| keyword shall be used. For example, |
| .BR %.1M |
| writes the single character corresponding to the <\fIentry\ type\fP> |
| field of the |
| .IR ls |
| .BR \-l |
| command. |
| .IP 10. 6 |
| An additional conversion specifier character, |
| .BR D , |
| shall be used to specify the device for block or special files, if |
| applicable, in an implementation-defined format. If not applicable, |
| and (\c |
| .IR keyword ) |
| is specified, then this conversion shall be equivalent to |
| \fR%(\fIkeyword\fR)u\fR. If not applicable, and (\c |
| .IR keyword ) |
| is omitted, then this conversion shall be equivalent to |
| <space>. |
| .IP 11. 6 |
| An additional conversion specifier character, |
| .BR F , |
| shall be used to specify a pathname. The |
| .BR F |
| conversion character can be preceded by a sequence of |
| <comma>-separated |
| keywords: |
| .RS 6 |
| .sp |
| .RS 4 |
| .nf |
| |
| (\fIkeyword\fB[\fR,\fIkeyword\fB]\fR ... ) |
| .fi |
| .P |
| .RE |
| .P |
| The values for all the keywords that are non-null shall be concatenated |
| together, each separated by a |
| .BR '/' . |
| The default shall be (\c |
| .BR path ) |
| if the keyword |
| .BR path |
| is defined; otherwise, the default shall be (\c |
| .BR prefix ,\c |
| .BR name ). |
| .RE |
| .IP 12. 6 |
| An additional conversion specifier character, |
| .BR L , |
| shall be used to specify a symbolic link expansion. If the current |
| file is a symbolic link, then |
| .BR %L |
| shall expand to: |
| .RS 6 |
| .sp |
| .RS 4 |
| .nf |
| |
| "%s -> %s", <\fIvalue of keyword\fR>, <\fIcontents of link\fR> |
| .fi |
| .P |
| .RE |
| .P |
| Otherwise, the |
| .BR %L |
| conversion specification shall be the equivalent of |
| .BR %F . |
| .RE |
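| .P |
| The |
| .BR F |
| and |
| .BR L |
| conversions described above can be sketched in Python as follows. |
| This is an illustration only: the function names and the dictionary of |
| keyword values are hypothetical, and the sketch assumes that the link |
| contents are available under the |
| .BR linkpath |
| or |
| .BR linkname |
| keyword. |
| .sp |
| .RS 4 |
| .nf |
```python
def expand_F(values, keywords=None):
    """Expand %F: join the non-null keyword values with a slash."""
    if keywords is None:
        # Default is (path) if path is defined, otherwise (prefix,name).
        keywords = ["path"] if "path" in values else ["prefix", "name"]
    parts = [values[k] for k in keywords if values.get(k)]
    return "/".join(parts)


def expand_L(values, keywords=None):
    """Expand %L: 'name -> contents' for a symbolic link, else like %F."""
    name = expand_F(values, keywords)
    if values.get("typeflag") == "2":  # typeflag 2 marks a symbolic link
        target = values.get("linkpath") or values.get("linkname", "")
        return "%s -> %s" % (name, target)
    return name
```
| .fi |
| .P |
| .RE |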
| .SH OPERANDS |
| The following operands shall be supported: |
| .IP "\fIdirectory\fR" 10 |
| The destination directory pathname for |
| .BR copy |
| mode. |
| .IP "\fIfile\fR" 10 |
| A pathname of a file to be copied or archived. |
| .IP "\fIpattern\fR" 10 |
| A pattern matching one or more pathnames of archive members. A pattern |
| must be given in the name-generating notation of the pattern matching |
| notation in |
| .IR "Section 2.13" ", " "Pattern Matching Notation", |
| including the filename expansion rules in |
| .IR "Section 2.13.3" ", " "Patterns Used for Filename Expansion". |
| The default, if no |
| .IR pattern |
| is specified, is to select all members in the archive. |
| .SH STDIN |
| In |
| .BR write |
| mode, the standard input shall be used only if no |
| .IR file |
| operands are specified. It shall be a file containing a list of |
| pathnames, each terminated by a |
| <newline> |
| character. |
| .P |
| In |
| .BR list |
| and |
| .BR read |
| modes, if |
| .BR \-f |
| is not specified, the standard input shall be an archive file. |
| .P |
| Otherwise, the standard input shall not be used. |
| .SH "INPUT FILES" |
| The input file named by the |
| .IR archive |
| option-argument, or standard input when the archive is read from there, |
| shall be a file formatted according to one of the specifications in the |
| EXTENDED DESCRIPTION section or some other implementation-defined |
| format. |
| .P |
| The file |
| .BR /dev/tty |
| shall be used to write prompts and read responses. |
| .SH "ENVIRONMENT VARIABLES" |
| The following environment variables shall affect the execution of |
| .IR pax : |
| .IP "\fILANG\fP" 10 |
| Provide a default value for the internationalization variables that are |
| unset or null. (See the Base Definitions volume of POSIX.1\(hy2017, |
| .IR "Section 8.2" ", " "Internationalization Variables" |
| for the precedence of internationalization variables used to determine the |
| values of locale categories.) |
| .IP "\fILC_ALL\fP" 10 |
| If set to a non-empty string value, override the values of all the |
| other internationalization variables. |
| .IP "\fILC_COLLATE\fP" 10 |
| .br |
| Determine the locale for the behavior of ranges, equivalence classes, |
| and multi-character collating elements used in the pattern matching |
| expressions for the |
| .IR pattern |
| operand, the basic regular expression for the |
| .BR \-s |
| option, and the extended regular expression defined for the |
| .BR yesexpr |
| locale keyword in the |
| .IR LC_MESSAGES |
| category. |
| .IP "\fILC_CTYPE\fP" 10 |
| Determine the locale for the interpretation of sequences of bytes of |
| text data as characters (for example, single-byte as opposed to |
| multi-byte characters in arguments and input files), the behavior of |
| character classes used in the extended regular expression defined for |
| the |
| .BR yesexpr |
| locale keyword in the |
| .IR LC_MESSAGES |
| category, and pattern matching. |
| .IP "\fILC_MESSAGES\fP" 10 |
| .br |
| Determine the locale used to process affirmative responses, and the |
| locale used to affect the format and contents of diagnostic messages |
| and prompts written to standard error. |
| .IP "\fILC_TIME\fP" 10 |
| Determine the format and contents of date and time strings when the |
| .BR \-v |
| option is specified. |
| .IP "\fINLSPATH\fP" 10 |
| Determine the location of message catalogs for the processing of |
| .IR LC_MESSAGES . |
| .IP "\fITMPDIR\fP" 10 |
| Determine the pathname that provides part of the default global |
| extended header record file, as described for the |
| .BR \-o |
| .BR globexthdr= |
| keyword in the OPTIONS section. |
| .IP "\fITZ\fP" 10 |
| Determine the timezone used to calculate date and time strings when the |
| .BR \-v |
| option is specified. If |
| .IR TZ |
| is unset or null, an unspecified default timezone shall be used. |
| .SH "ASYNCHRONOUS EVENTS" |
| Default. |
| .SH STDOUT |
| In |
| .BR write |
| mode, if |
| .BR \-f |
| is not specified, the standard output shall be the archive formatted |
| according to one of the specifications in the EXTENDED DESCRIPTION |
| section, or some other implementation-defined format (see |
| .BR \-x |
| .IR format ). |
| .P |
| In |
| .BR list |
| mode, when the |
| .BR \-o \c |
| .BR listopt =\c |
| .IR format |
| has been specified, the selected archive members shall be written to |
| standard output using the format described under |
| .IR "List Mode Format Specifications". |
| In |
| .BR list |
| mode without the |
| .BR \-o \c |
| .BR listopt =\c |
| .IR format |
| option, the table of contents of the selected archive members shall |
| be written to standard output using the following format: |
| .sp |
| .RS 4 |
| .nf |
| |
| "%s\en", <\fIpathname\fR> |
| .fi |
| .P |
| .RE |
| .P |
| If the |
| .BR \-v |
| option is specified in |
| .BR list |
| mode, the table of contents of the selected archive members shall be |
| written to standard output using the following formats. |
| .P |
| For pathnames representing hard links to previous members of the |
| archive: |
| .sp |
| .RS 4 |
| .nf |
| |
| "%s == %s\en", <\fIls\fR -l \fIlisting\fR>, <\fIlinkname\fR> |
| .fi |
| .P |
| .RE |
| .P |
| For all other pathnames: |
| .sp |
| .RS 4 |
| .nf |
| |
| "%s\en", <\fIls\fR -l \fIlisting\fR> |
| .fi |
| .P |
| .RE |
| .P |
| where <\fIls\ \fR\-l\ \fIlisting\fR> shall be the format specified by |
| the |
| .IR ls |
| utility with the |
| .BR \-l |
| option. When writing pathnames in this format, it is unspecified what |
| is written for fields for which the underlying archive format does not |
| have the correct information, although the correct number of |
| <blank>-separated |
| fields shall be written. |
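| .P |
| The three list mode output formats above can be summarized in a small |
| Python sketch. The function |
| .BR format_list_entry |
| is hypothetical; it assumes the caller has already produced the |
| .IR ls |
| .BR \-l |
| style listing and leaves the trailing |
| <newline> |
| to the caller. |
| .sp |
| .RS 4 |
| .nf |
```python
def format_list_entry(pathname, ls_listing=None, linkname=None):
    """Return one list-mode line, without the trailing <newline>."""
    if ls_listing is None:
        # No -v: only the pathname is written.
        return pathname
    if linkname is not None:
        # -v, hard link to a previous member of the archive.
        return "%s == %s" % (ls_listing, linkname)
    # -v, any other pathname.
    return ls_listing
```
| .fi |
| .P |
| .RE |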
| .P |
| In |
| .BR list |
| mode, standard output shall not be buffered more than a pathname |
| (plus any associated information and a |
| <newline> |
| terminator) at a time. |
| .SH STDERR |
| If |
| .BR \-v |
| is specified in |
| .BR read , |
| .BR write , |
| or |
| .BR copy |
| modes, |
| .IR pax |
| shall write the pathnames it processes to the standard error output |
| using the following format: |
| .sp |
| .RS 4 |
| .nf |
| |
| "%s\en", <\fIpathname\fR> |
| .fi |
| .P |
| .RE |
| .P |
| These pathnames shall be written as soon as processing is begun on the |
| file or archive member, and shall be flushed to standard error. The |
| trailing |
| <newline>, |
| which shall not be buffered, is written when the file has been read or |
| written. |
| .P |
| If the |
| .BR \-s |
| option is specified, and the replacement string has a trailing |
| .BR 'p' , |
| substitutions shall be written to standard error in the following |
| format: |
| .sp |
| .RS 4 |
| .nf |
| |
| "%s >> %s\en", <\fIoriginal pathname\fR>, <\fInew pathname\fR> |
| .fi |
| .P |
| .RE |
| .P |
| In all operating modes of |
| .IR pax , |
| optional messages of unspecified format concerning the input archive |
| format and volume number, the number of files, blocks, volumes, and |
| media parts as well as other diagnostic messages may be written to |
| standard error. |
| .P |
| In all formats, for both standard output and standard error, it is |
| unspecified how non-printable characters in pathnames or link names |
| are written. |
| .P |
| When using the |
| .BR \-x \c |
| .BR pax |
| archive format, if a filename, link name, group name, owner name, or |
| any other field in an extended header record cannot be translated |
| between the codeset in use for that extended header record and the |
| character set of the current locale, |
| .IR pax |
| shall write a diagnostic message to standard error, shall process the |
| file as described for the |
| .BR \-o |
| .BR invalid= |
| option, and then shall continue processing with the next file. |
| .SH "OUTPUT FILES" |
| In |
| .BR read |
| mode, the extracted output files shall be of the archived file type. |
| In |
| .BR copy |
| mode, the copied output files shall be the type of the file being |
| copied. In either mode, existing files in the destination hierarchy |
| shall be overwritten only when all permission (\c |
| .BR \-p ), |
| modification time (\c |
| .BR \-u ), |
| and invalid-value (\c |
| .BR \-o \c |
| .BR invalid= ) |
| tests allow it. |
| .P |
| In |
| .BR write |
| mode, the output file named by the |
| .BR \-f |
| option-argument shall be a file formatted according to one of the |
| specifications in the EXTENDED DESCRIPTION section, or some other |
| implementation-defined format. |
| .SH "EXTENDED DESCRIPTION" |
| .SS "pax Interchange Format" |
| .P |
| A |
| .IR pax |
| archive tape or file produced in the |
| .BR \-x \c |
| .BR pax |
| format shall contain a series of blocks. The physical layout of the |
| archive shall be identical to the |
| .BR ustar |
| format described in |
| .IR "ustar Interchange Format". |
| Each file archived shall be represented by the following sequence: |
| .IP " *" 4 |
| An optional header block with extended header records. This header |
| block is of the form described in |
| .IR "pax Header Block", |
| with a |
| .IR typeflag |
| value of |
| .BR x |
| or |
| .BR g . |
| The extended header records, described in |
| .IR "pax Extended Header", |
| shall be included as the data for this header block. |
| .IP " *" 4 |
| A header block that describes the file. Any fields in the preceding |
| optional extended header shall override the associated fields in |
| this header block for this file. |
| .IP " *" 4 |
| Zero or more blocks that contain the contents of the file. |
| .P |
| At the end of the archive file there shall be two 512-byte blocks |
| filled with binary zeros, interpreted as an end-of-archive indicator. |
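| .P |
| A reader might recognize the end-of-archive indicator as in the |
| following Python sketch. The function name and its arguments are |
| hypothetical, and the sketch assumes the archive data is already held |
| in memory as a bytes object. |
| .sp |
| .RS 4 |
| .nf |
```python
BLOCK_SIZE = 512


def is_end_of_archive(data, offset):
    """True if two consecutive all-zero 512-byte blocks start at offset."""
    marker = data[offset:offset + 2 * BLOCK_SIZE]
    # A short read cannot be a complete end-of-archive indicator.
    return len(marker) == 2 * BLOCK_SIZE and not any(marker)
```
| .fi |
| .P |
| .RE |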
| .P |
| A schematic of an example archive with global extended header records |
| and two actual files is shown in |
| .IR "Figure 4-1, pax Format Archive Example". |
| In the example, the second file in the archive has no extended header |
| preceding it, presumably because it has no need for extended |
| attributes. |
| .sp |
| .ce 1 |
| \fBFigure 4-1: pax Format Archive Example\fR |
| .SS "pax Header Block" |
| .P |
| The |
| .BR pax |
| header block shall be identical to the |
| .BR ustar |
| header block described in |
| .IR "ustar Interchange Format", |
| except that two additional |
| .IR typeflag |
| values are defined: |
| .IP "\fRx\fP" 6 |
| Represents extended header records for the following file in the |
| archive (which shall have its own |
| .BR ustar |
| header block). The format of these extended header records shall be as |
| described in |
| .IR "pax Extended Header". |
| .IP "\fRg\fR" 6 |
| Represents global extended header records for the following files in |
| the archive. The format of these extended header records shall be as |
| described in |
| .IR "pax Extended Header". |
| Each value shall affect all subsequent files that do not override that |
| value in their own extended header record and until another global |
| extended header record is reached that provides another value for the |
| same field. The |
| .IR typeflag |
| .BR g |
| global headers should not be used with interchange media that could |
| suffer partial data loss in transporting the archive. |
| .P |
| For both of these types, the |
| .IR size |
| field shall be the size of the extended header records in octets. The |
| other fields in the header block are not meaningful to this version of |
| the |
| .IR pax |
| utility. However, if this archive is read by a |
| .IR pax |
| utility conforming to the ISO\ POSIX\(hy2:\|1993 standard, the header block fields are used to |
| create a regular file that contains the extended header records as |
| data. Therefore, header block field values should be selected to |
| provide reasonable file access to this regular file. |
| .P |
| A further difference from the |
| .BR ustar |
| header block is that data blocks for files of |
| .IR typeflag |
| 1 (the digit one) (hard link) may be included, which means that the |
| size field may be greater than zero. Archives created by |
| .IR pax |
| .BR \-o |
| .BR linkdata |
| shall include these data blocks with the hard links. |
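| .P |
| As an illustration, the following Python sketch classifies a header |
| block by its |
| .IR typeflag |
| and decodes its |
| .IR size |
| field. The field offsets (octets 124 to 135 for |
| .IR size , |
| octet 156 for |
| .IR typeflag ) |
| are taken from the |
| .BR ustar |
| header block layout; the function name and the returned labels are |
| hypothetical. |
| .sp |
| .RS 4 |
| .nf |
```python
BLOCK_SIZE = 512
SIZE_FIELD = slice(124, 136)   # size field position in the header block
TYPEFLAG_OFFSET = 156          # typeflag position in the header block
NUL = bytes(1)

KINDS = {
    b"x": "extended",   # records for the next file only
    b"g": "global",     # records for all following files
    b"1": "hard link",  # may carry data blocks in the pax format
}


def describe_header_block(block):
    """Return (kind, size) for one 512-byte header block."""
    if len(block) != BLOCK_SIZE:
        raise ValueError("a header block is exactly 512 octets")
    typeflag = bytes(block[TYPEFLAG_OFFSET:TYPEFLAG_OFFSET + 1])
    # The size field holds octal digits padded with NUL or space.
    digits = bytes(block[SIZE_FIELD]).strip(NUL + b" ")
    size = int(digits.decode("ascii"), 8) if digits else 0
    return KINDS.get(typeflag, "other"), size
```
| .fi |
| .P |
| .RE |
| .P |
| For the |
| .BR x |
| and |
| .BR g |
| types, the returned size is the number of octets of extended header |
| records that follow the block. |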
| .SS "pax Extended Header" |
| .P |
| A |
| .BR pax |
| extended header contains values that are inappropriate for the |
| .BR ustar |
| header block because of limitations in that format: fields requiring a |
| character encoding other than that described in the ISO/IEC\ 646:\|1991 standard, fields |
| representing file attributes not described in the |
| .BR ustar |
| header, and fields whose format or length do not fit the requirements |
| of the |
| .BR ustar |
| header. The values in an extended header add attributes to the |
| following file (or files; see the description of the |
| .IR typeflag |
| .BR g |
| header block) or override values in the following header block(s), as |
| indicated in the following list of keywords. |
| .P |
| An extended header shall consist of one or more records, each |
| constructed as follows: |
| .sp |
| .RS 4 |
| .nf |
| |
| "%d %s=%s\en", <\fIlength\fR>, <\fIkeyword\fR>, <\fIvalue\fR> |
| .fi |
| .P |
| .RE |
| .P |
| The extended header records shall be encoded according to the ISO/IEC\ 10646\(hy1:\|2000 standard |
| UTF\(hy8 encoding. The <\fIlength\fP> field, |
| <blank>, |
| <equals-sign>, |
| and |
| <newline> |
| shown shall be limited to the portable character set, as encoded in |
| UTF\(hy8. The <\fIkeyword\fP> fields can be any UTF\(hy8 characters. |
| The <\fIlength\fP> field shall be the decimal length of the extended |
| header record in octets, including the trailing |
| <newline>. |
| If there is a |
| .BR hdrcharset |
| extended header in effect for a file, the |
| .IR value |
| field for any |
| .BR gname , |
| .BR linkpath , |
| .BR path , |
| and |
| .BR uname |
| extended header records shall be encoded using the character set |
| specified by the |
| .BR hdrcharset |
| extended header record; otherwise, the |
| .IR value |
| field shall be encoded using UTF\(hy8. The |
| .IR value |
| field for all other keywords specified by POSIX.1\(hy2008 shall be |
| encoded using UTF\(hy8. |
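| .P |
| Because the <\fIlength\fP> field counts its own digits, building a |
| record takes a little care. The following Python sketch shows one way |
| to build and to split records. The function names are hypothetical, |
| and the sketch assumes that every value is UTF\(hy8 text (that is, no |
| .BR hdrcharset |
| record selecting |
| .BR BINARY |
| is in effect). |
| .sp |
| .RS 4 |
| .nf |
```python
NEWLINE = bytes([0x0A])  # the <newline> octet


def make_record(keyword, value):
    """Build one '<length> <keyword>=<value><newline>' record."""
    body = (b" " + keyword.encode("utf-8") + b"="
            + value.encode("utf-8") + NEWLINE)
    # The length covers the whole record, including its own digits,
    # so iterate until the digit count stops changing.
    length = len(body)
    while True:
        total = len(body) + len(str(length))
        if total == length:
            break
        length = total
    return str(length).encode("ascii") + body


def parse_records(data):
    """Split extended header data into (keyword, value) pairs, in order."""
    records = []
    pos = 0
    while pos < len(data):
        space = data.index(b" ", pos)
        length = int(data[pos:space].decode("ascii"))
        record = data[space + 1:pos + length]
        if not record.endswith(NEWLINE):
            raise ValueError("malformed extended header record")
        # A keyword never contains '=', so split at the first one.
        keyword, _, value = record[:-1].partition(b"=")
        records.append((keyword.decode("utf-8"), value.decode("utf-8")))
        pos += length
    return records
```
| .fi |
| .P |
| .RE |
| .P |
| For instance, the keyword |
| .BR path |
| with the value |
| .BR foo |
| yields a record of 12 octets that begins with the digits 12. |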
| .P |
| The <\fIkeyword\fP> field shall be one of the entries from the |
| following list or a keyword provided as an implementation extension. |
| Keywords consisting entirely of lowercase letters, digits, and periods |
| are reserved for future standardization. A keyword shall not include an |
| <equals-sign>. |
| (In the following list, the notation ``file(s)'' or ``block(s)'' is used |
| to acknowledge that a keyword affects the following single file after a |
| .IR typeflag |
| .BR x |
| extended header, but possibly multiple files after |
| .IR typeflag |
| .BR g . |
| Any requirements in the list for |
| .IR pax |
| to include a record when in |
| .BR write |
| or |
| .BR copy |
| mode shall apply only when such a record has not already been provided |
| through the use of the |
| .BR \-o |
| option. When used in |
| .BR copy |
| mode, |
| .IR pax |
| shall behave as if an archive had been created with applicable extended |
| header records and then extracted.) |
| .IP "\fBatime\fP" 10 |
| The file access time for the following file(s), equivalent to the value |
| of the |
| .IR st_atime |
| member of the |
| .BR stat |
| structure for a file, as described by the |
| \fIstat\fR() |
| function. The access time shall be restored if the process has |
| appropriate privileges required to do so. The format of the |
| <\fIvalue\fP> shall be as described in |
| .IR "pax Extended Header File Times". |
| .IP "\fBcharset\fP" 10 |
| The name of the character set used to encode the data in the following |
| file(s). The entries in the following table are defined to refer to |
| known standards; additional names may be agreed on between the |
| originator and recipient. |
| .TS |
| center box tab(!); |
| cB | cB |
| lf5 | l. |
| <value>!Formal Standard |
| _ |
| ISO-IR 646 1990!ISO/IEC 646:\|1990 |
| ISO-IR 8859 1 1998!ISO/IEC 8859\(hy1:\|1998 |
| ISO-IR 8859 2 1999!ISO/IEC 8859\(hy2:\|1999 |
| ISO-IR 8859 3 1999!ISO/IEC 8859\(hy3:\|1999 |
| ISO-IR 8859 4 1998!ISO/IEC 8859\(hy4:\|1998 |
| ISO-IR 8859 5 1999!ISO/IEC 8859\(hy5:\|1999 |
| ISO-IR 8859 6 1999!ISO/IEC 8859\(hy6:\|1999 |
| ISO-IR 8859 7 1987!ISO/IEC 8859\(hy7:\|1987 |
| ISO-IR 8859 8 1999!ISO/IEC 8859\(hy8:\|1999 |
| ISO-IR 8859 9 1999!ISO/IEC 8859\(hy9:\|1999 |
| ISO-IR 8859 10 1998!ISO/IEC 8859\(hy10:\|1998 |
| ISO-IR 8859 13 1998!ISO/IEC 8859\(hy13:\|1998 |
| ISO-IR 8859 14 1998!ISO/IEC 8859\(hy14:\|1998 |
| ISO-IR 8859 15 1999!ISO/IEC 8859\(hy15:\|1999 |
| ISO-IR 10646 2000!ISO/IEC 10646:\|2000 |
| ISO-IR 10646 2000 UTF-8!ISO/IEC 10646, UTF-8 encoding |
| BINARY!None. |
| .TE |
| .RS 10 |
| .P |
| The encoding is included in an extended header for information only; |
| when |
| .IR pax |
| is used as described in POSIX.1\(hy2008, it shall not translate the file data |
| into any other encoding. The |
| .BR BINARY |
| entry indicates unencoded binary data. |
| .P |
| When used in |
| .BR write |
| or |
| .BR copy |
| mode, it is implementation-defined whether |
| .IR pax |
| includes a |
| .BR charset |
| extended header record for a file. |
| .RE |
| .IP "\fBcomment\fP" 10 |
| A series of characters used as a comment. All characters in the |
| <\fIvalue\fP> field shall be ignored by |
| .IR pax . |
| .IP "\fBgid\fP" 10 |
| The group ID of the group that owns the file, expressed as a decimal |
| number using digits from the ISO/IEC\ 646:\|1991 standard. This record shall override the |
| .IR gid |
| field in the following header block(s). When used in |
| .BR write |
| or |
| .BR copy |
| mode, |
| .IR pax |
| shall include a |
| .IR gid |
| extended header record for each file whose group ID is greater than |
| 2\|097\|151 (octal 7\|777\|777). |
| .IP "\fBgname\fP" 10 |
| The group of the file(s), formatted as a group name in the group |
| database. This record shall override the |
| .IR gid |
| and |
| .IR gname |
| fields in the following header block(s), and any |
| .IR gid |
| extended header record. When used in |
| .BR read , |
| .BR copy , |
| or |
| .BR list |
| mode, |
| .IR pax |
| shall translate the name from the encoding in the header record to |
| the character set appropriate for the group database on the |
| receiving system. If any of the characters cannot be |
| translated, and if neither the |
| .BR \-o \c |
| .BR invalid=UTF\(hy8 |
| option nor the |
| .BR \-o \c |
| .BR invalid=binary |
| option is specified, the results are implementation-defined. |
| When used in |
| .BR write |
| or |
| .BR copy |
| mode, |
| .IR pax |
| shall include a |
| .BR gname |
| extended header record for each file whose group name cannot be |
| represented entirely with the letters and digits of the portable |
| character set. |
| .IP "\fBhdrcharset\fR" 10 |
| The name of the character set used to encode the value field of the |
| .BR gname , |
| .BR linkpath , |
| .BR path , |
| and |
| .BR uname |
| .IR pax |
| extended header records. The entries in the following table are defined |
| to refer to known standards; additional names may be agreed between the |
| originator and the recipient. |
| .br |
| .TS |
| center box tab(!); |
| cB | cB |
| lf5 | l. |
| <value>!Formal Standard |
| _ |
| ISO-IR 10646 2000 UTF-8!ISO/IEC 10646, UTF-8 encoding |
| BINARY!None. |
| .TE |
| .RS 10 |
| .P |
| If no |
| .BR hdrcharset |
| extended header record is specified, the default character set used to |
| encode all values in extended header records shall be the ISO/IEC\ 10646\(hy1:\|2000 standard |
| UTF\(hy8 encoding. |
| .P |
| The |
| .BR BINARY |
| entry indicates that all values recorded in extended headers for |
| affected files are unencoded binary data from the underlying system. |
| .RE |
| .IP "\fBlinkpath\fP" 10 |
| The pathname of a link being created to another file, of any type, |
| previously archived. This record shall override the |
| .IR linkname |
| field in the following |
| .BR ustar |
| header block(s). The following |
| .BR ustar |
| header block shall determine the type of link created. If |
| .IR typeflag |
| of the following header block is 1, it shall be a hard link. If |
| .IR typeflag |
| is 2, it shall be a symbolic link and the |
| .BR linkpath |
| value shall be the contents of the symbolic link. The |
| .IR pax |
| utility shall translate the name of the link (contents of the symbolic |
| link) from the encoding in the header to the character set appropriate |
| for the local file system. When used in |
| .BR write |
| or |
| .BR copy |
| mode, |
| .IR pax |
| shall include a |
| .BR linkpath |
| extended header record for each link whose pathname cannot be |
| represented entirely with the members of the portable character set |
| other than NUL. |
| .IP "\fBmtime\fP" 10 |
| The file modification time of the following file(s), equivalent to the |
| value of the |
| .IR st_mtime |
| member of the |
| .BR stat |
| structure for a file, as described in the |
| \fIstat\fR() |
| function. This record shall override the |
| .IR mtime |
| field in the following header block(s). The modification time shall be |
| restored if the process has appropriate privileges required to do |
| so. The format of the <\fIvalue\fP> shall be as described in |
| .IR "pax Extended Header File Times". |
| .IP "\fBpath\fP" 10 |
| The pathname of the following file(s). This record shall override the |
| .IR name |
| and |
| .IR prefix |
| fields in the following header block(s). The |
| .IR pax |
| utility shall translate the pathname of the file from the encoding in |
| the header to the character set appropriate for the local file system. |
| .RS 10 |
| .P |
| When used in |
| .BR write |
| or |
| .BR copy |
| mode, |
| .IR pax |
| shall include a |
| .IR path |
| extended header record for each file whose pathname cannot be |
| represented entirely with the members of the portable character set |
| other than NUL. |
| .RE |
| .IP "\fBrealtime.\fIany\fR" 10 |
| The keywords prefixed by ``realtime.'' are reserved for future |
| standardization. |
| .IP "\fBsecurity.\fIany\fR" 10 |
| The keywords prefixed by ``security.'' are reserved for future |
| standardization. |
| .IP "\fBsize\fP" 10 |
| The size of the file in octets, expressed as a decimal number using |
| digits from the ISO/IEC\ 646:\|1991 standard. This record shall override the |
| .IR size |
| field in the following header block(s). When used in |
| .BR write |
| or |
| .BR copy |
| mode, |
| .IR pax |
| shall include a |
| .IR size |
| extended header record for each file with a size value greater than |
| 8\|589\|934\|591 (octal 77\|777\|777\|777). |
| .IP "\fBuid\fP" 10 |
| The user ID of the file owner, expressed as a decimal number using |
| digits from the ISO/IEC\ 646:\|1991 standard. This record shall override the |
| .IR uid |
| field in the following header block(s). When used in |
| .BR write |
| or |
| .BR copy |
| mode, |
| .IR pax |
| shall include a |
| .IR uid |
| extended header record for each file whose owner ID is greater than |
| 2\|097\|151 (octal 7\|777\|777). |
| .IP "\fBuname\fP" 10 |
| The owner of the following file(s), formatted as a user name in the |
| user database. This record shall override the |
| .IR uid |
| and |
| .IR uname |
| fields in the following header block(s), and any |
| .IR uid |
| extended header record. When used in |
| .BR read , |
| .BR copy , |
| or |
| .BR list |
| mode, |
| .IR pax |
| shall translate the name from the encoding in the header record to the |
| character set appropriate for the user database on the receiving |
| system. If any of the characters cannot be translated, and if neither |
| the |
| .BR \-o \c |
| .BR invalid=UTF\(hy8 |
| option nor the |
| .BR \-o \c |
| .BR invalid=binary |
| option is specified, the results are implementation-defined. |
| When used in |
| .BR write |
| or |
| .BR copy |
| mode, |
| .IR pax |
| shall include a |
| .BR uname |
| extended header record for each file whose user name cannot be |
| represented entirely with the letters and digits of the portable |
| character set. |
| .P |
| If the <\fIvalue\fP> field is zero length, it shall delete any header |
| block field, previously entered extended header value, or global |
| extended header value of the same name. |
| .P |
| If a keyword in an extended header record (or in a |
| .BR \-o |
| option-argument) overrides or deletes a corresponding field in the |
| .BR ustar |
| header block, |
| .IR pax |
| shall ignore the contents of that header block field. |
| .P |
| Unlike the |
| .BR ustar |
| header block fields, NULs shall not delimit <\fIvalue\fP>s; all |
| characters within the <\fIvalue\fP> field shall be considered data for |
| the field. None of the length limitations of the |
| .BR ustar |
| header block fields in |
| .IR "Table 4-14, ustar Header Block" |
| shall apply to the extended header records. |
| .SS "pax Extended Header Keyword Precedence" |
| .P |
| This section describes the precedence in which the various header |
| records and fields and command line options are selected to apply to a |
| file in the archive. When |
| .IR pax |
| is used in |
| .BR read |
| or |
| .BR list |
| modes, it shall determine a file attribute in the following sequence: |
| .IP " 1." 4 |
| If |
| .BR \-o \c |
| .BR delete=keyword-prefix |
| is used, the affected attributes shall be determined from step 7, if |
| applicable, or ignored otherwise. |
| .IP " 2." 4 |
| If |
| .BR \-o \c |
| .IR keyword := |
| is used, the affected attributes shall be ignored. |
| .IP " 3." 4 |
| If |
| .BR \-o \c |
| .BR keyword:=value |
| is used, the affected attribute shall be assigned the value. |
| .IP " 4." 4 |
| If there is a |
| .IR typeflag |
| .BR x |
| extended header record, the affected attribute shall be assigned the |
| <\fIvalue\fP>. When extended header records conflict, the last one |
| given in the header shall take precedence. |
| .IP " 5." 4 |
| If |
| .BR \-o \c |
| .BR keyword=value |
| is used, the affected attribute shall be assigned the value. |
| .IP " 6." 4 |
| If there is a |
| .IR typeflag |
| .BR g |
| global extended header record, the affected attribute shall be assigned |
| the <\fIvalue\fP>. When global extended header records conflict, the |
| last one given in the global header shall take precedence. |
| .IP " 7." 4 |
| Otherwise, the attribute shall be determined from the |
| .BR ustar |
| header block. |
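| .P |
| The seven steps above can be sketched as a single lookup. This is a |
| minimal Python illustration under stated assumptions, not a definitive |
| implementation: the function and parameter names are hypothetical, the |
| \fB\-o\fP options are assumed to be already parsed into dictionaries, |
| delete patterns are matched with the Python fnmatch module, and |
| zero-length record values are not modelled. |
| .sp |
| .RS 4 |
| .nf |
| |
```python
from fnmatch import fnmatchcase

def resolve_attribute(keyword, ustar, ext=None, glob=None,
                      opt_assign=None, opt_override=None, opt_delete=()):
    """Pick the value of one attribute in read or list mode.

    ustar: fields of the ustar header block
    ext, glob: typeflag x and typeflag g records (last record already wins)
    opt_assign: from -o keyword=value
    opt_override: from -o keyword:=value; an empty string means -o keyword:=
    opt_delete: patterns from -o delete=
    """
    ext, glob = ext or {}, glob or {}
    opt_assign, opt_override = opt_assign or {}, opt_override or {}
    # Step 1: deleted keywords fall back to the ustar header block, if any.
    if any(fnmatchcase(keyword, pattern) for pattern in opt_delete):
        return ustar.get(keyword)
    # Steps 2 and 3: -o keyword:= ignores, -o keyword:=value assigns.
    if keyword in opt_override:
        return opt_override[keyword] or None
    # Step 4: typeflag x extended header record.
    if keyword in ext:
        return ext[keyword]
    # Step 5: -o keyword=value.
    if keyword in opt_assign:
        return opt_assign[keyword]
    # Step 6: typeflag g global extended header record.
    if keyword in glob:
        return glob[keyword]
    # Step 7: the ustar header block.
    return ustar.get(keyword)
```
| .fi |
| .P |
| .RE |
| .P |
| A return value of None stands for an attribute that is ignored. |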
| .SS "pax Extended Header File Times" |
| .P |
| The |
| .IR pax |
| utility shall write an |
| .BR mtime |
| record for each file in |
| .BR write |
| or |
| .BR copy |
| modes if the file's modification time cannot be represented exactly in |
| the |
| .BR ustar |
| header logical record described in |
| .IR "ustar Interchange Format" . |
| This can occur if the time is out of |
| .BR ustar |
| range, or if the file system of the underlying implementation supports |
| non-integer time granularities and the time is not an integer. All of |
| these time records shall be formatted as a decimal representation of |
| the time in seconds since the Epoch. If a |
| <period> |
| (\c |
| .BR '.' ) |
| decimal point character is present, the digits to the right of the |
| point shall represent the units of a subsecond timing granularity, |
| where the first digit is tenths of a second and each subsequent digit |
| is a tenth of the previous digit. In |
| .BR read |
| or |
| .BR copy |
| mode, the |
| .IR pax |
| utility shall truncate the time of a file to the greatest value that is |
| not greater than the input header file time. In |
| .BR write |
| or |
| .BR copy |
| mode, the |
| .IR pax |
| utility shall output a time exactly if it can be represented exactly as |
| a decimal number, and otherwise shall generate only enough digits so |
| that the same time shall be recovered if the file is extracted on a |
| system whose underlying implementation supports the same time |
| granularity. |
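| .P |
| The formatting and truncation rules can be illustrated with a short |
| Python sketch. It assumes non-negative times held as whole seconds |
| plus nanoseconds, and a receiving system that keeps a fixed number of |
| fractional digits; both function names are hypothetical. |
| .sp |
| .RS 4 |
| .nf |
| |
```python
from decimal import Decimal, ROUND_FLOOR

def format_time(seconds, nanoseconds):
    """Format a time as decimal seconds since the Epoch, with no excess digits."""
    if nanoseconds == 0:
        return str(seconds)
    fraction = ("%09d" % nanoseconds).rstrip("0")
    return "%d.%s" % (seconds, fraction)

def truncate_time(text, digits):
    """Truncate a time record to the greatest value not greater than it
    that has at most `digits` fractional digits."""
    quantum = Decimal(1).scaleb(-digits)
    return Decimal(text).quantize(quantum, rounding=ROUND_FLOOR)
```
| .fi |
| .P |
| .RE |
| .P |
| Rounding toward negative infinity, rather than toward zero, is what |
| keeps the result from exceeding the input for times before the Epoch. |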
| .SS "ustar Interchange Format" |
| .P |
| A |
| .BR ustar |
| archive tape or file shall contain a series of logical records. Each |
| logical record shall be a fixed-size logical record of 512 octets (see |
| below). Although this format may be thought of as being stored on |
| 9-track industry-standard 12.7 mm (0.5 in) magnetic tape, other types of |
| transportable media are not excluded. Each file archived shall be |
| represented by a header logical record that describes the file, |
| followed by zero or more logical records that give the contents of the |
| file. At the end of the archive file there shall be two 512-octet |
| logical records filled with binary zeros, interpreted as an |
| end-of-archive indicator. |
| .P |
| The logical records may be grouped for physical I/O operations, as |
| described under the |
| .BR \-b \c |
| .IR blocksize |
| and |
| .BR \-x |
| .BR ustar |
| options. Each group of logical records may be written with a single |
| operation equivalent to the |
| \fIwrite\fR() |
| function. On magnetic tape, the result of this write shall be a single |
| tape physical block. The last physical block shall always be the full |
| size, so logical records after the two zero logical records may contain |
| undefined data. |
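| .P |
| As a rough illustration of the end-of-archive layout, the following |
| Python sketch computes the octets that close an archive. It assumes |
| the blocksize is a multiple of 512 and fills the remainder of the last |
| physical block with zeros, although the text above permits undefined |
| data there. The function name is hypothetical. |
| .sp |
| .RS 4 |
| .nf |
| |
```python
RECORD = 512  # size of one logical record in octets

def archive_tail(records_written, blocksize):
    """Return two zero records plus padding up to a full physical block."""
    used = (records_written + 2) * RECORD
    padding = -used % blocksize  # octets missing from the last block
    return bytes(2 * RECORD + padding)
```
| .fi |
| .P |
| .RE |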
| .P |
| The header logical record shall be structured as shown in the following |
| table. All lengths and offsets are in decimal. |
| .br |
| .sp |
| .ce 1 |
| \fBTable 4-14: ustar Header Block\fR |
| .TS |
| center box tab(@); |
| cB | cB | cB |
| lI | n | n. |
| Field Name@Octet Offset@Length (in Octets) |
| _ |
| name@0@100 |
| mode@100@8 |
| uid@108@8 |
| gid@116@8 |
| size@124@12 |
| mtime@136@12 |
| chksum@148@8 |
| typeflag@156@1 |
| linkname@157@100 |
| magic@257@6 |
| version@263@2 |
| uname@265@32 |
| gname@297@32 |
| devmajor@329@8 |
| devminor@337@8 |
| prefix@345@155 |
| .TE |
| .P |
| All characters in the header logical record shall be represented in the |
| coded character set of the ISO/IEC\ 646:\|1991 standard. For maximum portability between |
| implementations, names should be selected from characters represented |
| by the portable filename character set as octets with the most |
| significant bit zero. If an implementation supports the use of |
| characters outside of |
| <slash> |
| and the portable filename character set in names for files, users, and |
| groups, one or more implementation-defined encodings of these characters |
| shall be provided for interchange purposes. |
| .P |
| However, the |
| .IR pax |
| utility shall never create filenames on the local system that cannot |
| be accessed via the procedures described in POSIX.1\(hy2017. If a filename is |
| found on the medium that would create an invalid filename, it is |
| implementation-defined whether the data from the file is stored on the |
| file hierarchy and under what name it is stored. The |
| .IR pax |
| utility may choose to ignore these files as long as it produces an |
| error indicating that the file is being ignored. |
| .P |
| Each field within the header logical record is contiguous; that is, |
| there is no padding used. Each character on the archive medium shall be |
| stored contiguously. |
| .P |
| The fields |
| .IR magic , |
| .IR uname , |
| and |
| .IR gname |
| are character strings each terminated by a NUL character. The fields |
| .IR name , |
| .IR linkname , |
| and |
| .IR prefix |
| are NUL-terminated character strings except when every character in the |
| array, including the last, is non-NUL. The |
| .IR version |
| field is two octets containing the characters |
| .BR \(dq00\(dq |
| (zero-zero). The |
| .IR typeflag |
| contains a single character. All other fields are leading zero-filled |
| octal numbers using digits from the ISO/IEC\ 646:\|1991 standard IRV. Each numeric field is |
| terminated by one or more |
| <space> |
| or NUL characters. |
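| .P |
| The layout in Table 4-14 and the field encodings just described can be |
| sketched as a small decoder. This Python fragment is illustrative |
| only; the function name is hypothetical, and names are assumed to be |
| plain ISO/IEC 646 (ASCII) text. |
| .sp |
| .RS 4 |
| .nf |
| |
```python
FIELDS = [  # (name, octet offset, length) from Table 4-14
    ("name", 0, 100), ("mode", 100, 8), ("uid", 108, 8), ("gid", 116, 8),
    ("size", 124, 12), ("mtime", 136, 12), ("chksum", 148, 8),
    ("typeflag", 156, 1), ("linkname", 157, 100), ("magic", 257, 6),
    ("version", 263, 2), ("uname", 265, 32), ("gname", 297, 32),
    ("devmajor", 329, 8), ("devminor", 337, 8), ("prefix", 345, 155),
]
STRINGS = {"name", "linkname", "magic", "uname", "gname", "prefix"}
NUL = bytes(1)  # a single NUL octet

def parse_header(record):
    """Split one 512-octet header record into a dict of decoded fields."""
    header = {}
    for field, offset, length in FIELDS:
        raw = record[offset:offset + length]
        if field in STRINGS:
            # Stop at the first NUL; a completely full field has none.
            header[field] = raw.split(NUL, 1)[0].decode("ascii")
        elif field in ("version", "typeflag"):
            header[field] = raw.decode("ascii")
        else:
            # Octal digits terminated by <space> or NUL characters.
            digits = raw.strip(b" " + NUL)
            header[field] = int(digits, 8) if digits else 0
    return header
```
| .fi |
| .P |
| .RE |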
| .P |
| The |
| .IR name |
| and the |
| .IR prefix |
| fields shall produce the pathname of the file. A new pathname shall |
| be formed, if |
| .IR prefix |
| is not an empty string (its first character is not NUL), by |
| concatenating |
| .IR prefix |
| (up to the first NUL character), a |
| <slash> |
| character, and |
| .IR name ; |
| otherwise, |
| .IR name |
| is used alone. In either case, |
| .IR name |
| is terminated at the first NUL character. If |
| .IR prefix |
| begins with a NUL character, it shall be ignored. In this manner, |
| pathnames of at most 256 characters can be supported. If a pathname |
| does not fit in the space provided, |
| .IR pax |
| shall notify the user of the error, and shall not store any part of the |
| file\(emheader or data\(emon the medium. |
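| .P |
| The joining rule, and one possible way of choosing a split point when |
| writing, can be sketched in Python. The function names are |
| hypothetical, and the split strategy (first <slash> that yields a |
| fitting pair) is an assumption; this standard does not prescribe one. |
| .sp |
| .RS 4 |
| .nf |
| |
```python
NUL = bytes(1)  # a single NUL octet

def join_path(prefix, name):
    """Rebuild a pathname from the raw prefix and name fields."""
    prefix = prefix.split(NUL, 1)[0]
    name = name.split(NUL, 1)[0]
    return prefix + b"/" + name if prefix else name

def split_path(path):
    """Split a pathname into (prefix, name) values that fit a ustar header."""
    if len(path) <= 100:
        return b"", path
    # Try each <slash> as the split point; the slash itself is not stored.
    for i in range(1, min(len(path), 156)):
        if path[i:i + 1] == b"/" and 0 < len(path) - i - 1 <= 100:
            return path[:i], path[i + 1:]
    raise ValueError("pathname does not fit in the name and prefix fields")
```
| .fi |
| .P |
| .RE |
| .P |
| The limit of 256 characters is the sum of 155 for the prefix, one for |
| the <slash>, and 100 for the name. |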
| .P |
| The |
| .IR linkname |
| field, described below, shall not use the |
| .IR prefix |
| to produce a pathname. As such, a |
| .IR linkname |
| is limited to 100 characters. If the name does not fit in the space |
| provided, |
| .IR pax |
| shall notify the user of the error, and shall not attempt to store the |
| link on the medium. |
| .P |
| The |
| .IR mode |
| field provides 12 bits encoded in the ISO/IEC\ 646:\|1991 standard octal digit representation. |
| The encoded bits shall represent the following values: |
| .br |
| .sp |
| .ce 1 |
| \fBTable 4-15: ustar \fImode\fP Field\fR |
| .TS |
| tab(!) center box; |
| cB | cB | cB |
| n | l | l. |
| Bit Value!POSIX.1\(hy2017 Bit!Description |
| _ |
| 04\|000!S_ISUID!Set UID on execution. |
| 02\|000!S_ISGID!Set GID on execution. |
| 01\|000!<reserved>!Reserved for future standardization. |
| 00\|400!S_IRUSR!Read permission for file owner class. |
| 00\|200!S_IWUSR!Write permission for file owner class. |
| 00\|100!S_IXUSR!Execute/search permission for file owner class. |
| 00\|040!S_IRGRP!Read permission for file group class. |
| 00\|020!S_IWGRP!Write permission for file group class. |
| 00\|010!S_IXGRP!Execute/search permission for file group class. |
| 00\|004!S_IROTH!Read permission for file other class. |
| 00\|002!S_IWOTH!Write permission for file other class. |
| 00\|001!S_IXOTH!Execute/search permission for file other class. |
| .TE |
| .P |
| When appropriate privileges are required to set one of these mode bits, |
| and the user restoring the files from the archive does not have |
| appropriate privileges, the mode bits for which the user does not have |
| appropriate privileges shall be ignored. Some of the mode bits in the |
| archive format are not mentioned elsewhere in this volume of POSIX.1\(hy2017. If the |
| implementation does not support those bits, they may be ignored. |
| .P |
| The |
| .IR uid |
| and |
| .IR gid |
| fields are the user and group ID of the owner and group of the file, |
| respectively. |
| .P |
| The |
| .IR size |
| field is the size of the file in octets. If the |
| .IR typeflag |
| field is set to specify a file to be of type 1 (a link) or 2 (a |
| symbolic link), the |
| .IR size |
| field shall be specified as zero. If the |
| .IR typeflag |
| field is set to specify a file of type 5 (directory), the |
| .IR size |
| field shall be interpreted as described under the definition of that |
| record type. No data logical records are stored for types 1, 2, or 5. |
| If the |
| .IR typeflag |
| field is set to 3 (character special file), 4 (block special file), or |
| 6 (FIFO), the meaning of the |
| .IR size |
| field is unspecified by this volume of POSIX.1\(hy2017, and no data logical records shall be |
| stored on the medium. Additionally, for type 6, the |
| .IR size |
| field shall be ignored when reading. If the |
| .IR typeflag |
| field is set to any other value, the number of logical records written |
| following the header shall be (\c |
| .IR size +511)/512, |
| ignoring any fraction in the result of the division. |
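| .P |
| The record count can be written out as a short Python sketch; the |
| function name is hypothetical and the typeflag is assumed to be given |
| as a one-character string. |
| .sp |
| .RS 4 |
| .nf |
| |
```python
NO_DATA_TYPES = {"1", "2", "3", "4", "5", "6"}

def data_records(typeflag, size):
    """Number of 512-octet data records that follow a header record."""
    if typeflag in NO_DATA_TYPES:
        return 0  # links, special files, directories, and FIFOs store no data
    return (size + 511) // 512  # integer division drops the fraction
```
| .fi |
| .P |
| .RE |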
| .P |
| The |
| .IR mtime |
| field shall be the modification time of the file at the time it was |
| archived. It is the ISO/IEC\ 646:\|1991 standard representation of the octal value of the |
| modification time obtained from the |
| \fIstat\fR() |
| function. |
| .P |
| The |
| .IR chksum |
| field shall be the ISO/IEC\ 646:\|1991 standard IRV representation of the octal value of the |
| simple sum of all octets in the header logical record. Each octet in |
| the header shall be treated as an unsigned value. These values shall be |
| added to an unsigned integer, initialized to zero, the precision of |
| which is not less than 17 bits. When calculating the checksum, the |
| .IR chksum |
| field is treated as if it were all |
| <space> |
| characters. |
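| .P |
| A minimal Python sketch of the checksum rule follows; the function |
| names are hypothetical. Since 512 octets of value 255 sum to |
| 130\|560, which is less than 2^17, 17 bits always suffice. |
| .sp |
| .RS 4 |
| .nf |
| |
```python
def ustar_checksum(record):
    """Sum all 512 octets, counting the chksum field (148 to 155) as spaces."""
    return sum(record[:148]) + 8 * 0x20 + sum(record[156:512])

def checksum_ok(record):
    """Compare the stored octal chksum value with the computed sum."""
    stored = record[148:156].strip(b" " + bytes(1))
    return bool(stored) and int(stored, 8) == ustar_checksum(record)
```
| .fi |
| .P |
| .RE |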
| .P |
| The |
| .IR typeflag |
| field specifies the type of file archived. If a particular |
| implementation does not recognize the type, or the user does not have |
| appropriate privileges to create that type, the file shall be extracted |
| as if it were a regular file if the file type is defined to have a |
| meaning for the |
| .IR size |
| field that could cause data logical records to be written on the medium |
| (see the previous description for |
| .IR size ). |
| If conversion to a regular file occurs, the |
| .IR pax |
| utility shall produce an error indicating that the conversion took |
| place. All of the |
| .IR typeflag |
| fields shall be coded in the ISO/IEC\ 646:\|1991 standard IRV: |
| .IP "\fR0\fR" 8 |
| Represents a regular file. For backwards-compatibility, a |
| .IR typeflag |
| value of binary zero (\c |
| .BR '\e0' ) |
| should be recognized as meaning a regular file when extracting files |
| from the archive. Archives written with this version of the archive |
| file format create regular files with a |
| .IR typeflag |
| value of the ISO/IEC\ 646:\|1991 standard IRV |
| .BR '0' . |
| .IP "\fR1\fR" 8 |
| Represents a file linked to another file, of any type, previously |
| archived. Such files are identified by having the same device |
| and file serial numbers, and pathnames that refer to different |
| directory entries. All such files shall be archived as linked files. |
| The linked-to name is specified in the |
| .IR linkname |
| field with a NUL-character terminator if it is less than 100 octets in |
| length. |
| .IP "\fR2\fR" 8 |
| Represents a symbolic link. The contents of the symbolic link shall be |
| stored in the |
| .IR linkname |
| field. |
| .IP "\fR3,4\fR" 8 |
| Represent character special files and block special files respectively. |
| In this case the |
| .IR devmajor |
| and |
| .IR devminor |
| fields shall contain information defining the device, the format of |
| which is unspecified by this volume of POSIX.1\(hy2017. Implementations may map the device |
| specifications to their own local specification or may ignore the |
| entry. |
| .IP "\fR5\fR" 8 |
| Specifies a directory or subdirectory. On systems where disk allocation |
| is performed on a directory basis, the |
| .IR size |
| field shall contain the maximum number of octets (which may be rounded |
| to the nearest disk block allocation unit) that the directory may hold. |
| A |
| .IR size |
| field of zero indicates no such limiting. Systems that do not support |
| limiting in this manner should ignore the |
| .IR size |
| field. |
| .IP "\fR6\fR" 8 |
| Specifies a FIFO special file. Note that the archiving of a FIFO file |
| archives the existence of this file and not its contents. |
| .IP "\fR7\fR" 8 |
| Reserved to represent a file to which an implementation has associated |
| some high-performance attribute. Implementations without such |
| extensions should treat this file as a regular file (type 0). |
| .IP "\fRA\(hyZ\fR" 8 |
| The letters |
| .BR 'A' |
| to |
| .BR 'Z' , |
| inclusive, are reserved for custom implementations. All other values |
| are reserved for future versions of this standard. |
| .P |
| It is unspecified whether files with pathnames that refer to the same |
| directory entry are archived as linked files or as separate files. If |
| they are archived as linked files, this means that attempting to |
| extract both pathnames from the resulting archive will always cause an |
| error (unless the |
| .BR \-u |
| option is used) because the link cannot be created. |
| .P |
| It is unspecified whether files with the same device and file serial |
| numbers being appended to an archive are treated as linked files to |
| members that were in the archive before the append. |
| .P |
| Attempts to archive a socket shall produce a diagnostic message when |
| .BR ustar |
| interchange format is used, but may be allowed when |
| .BR pax |
| interchange format is used. Handling of other file types is |
| implementation-defined. |
| .P |
| The |
| .IR magic |
| field is the specification that this archive was output in this archive |
| format. If this field contains |
| .BR ustar |
| (the five characters from the ISO/IEC\ 646:\|1991 standard IRV shown followed by NUL), the |
| .IR uname |
| and |
| .IR gname |
| fields shall contain the ISO/IEC\ 646:\|1991 standard IRV representation of the owner and |
| group of the file, respectively (truncated to fit, if necessary). When |
| the file is restored by a privileged, protection-preserving version of |
| the utility, the user and group databases shall be scanned for these |
| names. If found, the user and group IDs contained within these files |
| shall be used rather than the values contained within the |
| .IR uid |
| and |
| .IR gid |
| fields. |
| .SS "cpio Interchange Format" |
| .P |
| The octet-oriented |
| .BR cpio |
| archive format shall be a series of entries, each comprising a header |
| that describes the file, the name of the file, and then the contents of |
| the file. |
| .P |
| An archive may be recorded as a series of fixed-size blocks of octets. |
| This blocking shall be used only to make physical I/O more efficient. |
| The last group of blocks shall always be at the full size. |
| .P |
| For the octet-oriented |
| .BR cpio |
| archive format, the individual entry information shall be in the order |
| indicated and described by the following table; see also the |
| .IR <cpio.h> |
| header. |
| .br |
| .sp |
| .ce 1 |
| \fBTable 4-16: Octet-Oriented cpio Archive Entry\fR |
| .TS |
| center box tab(!); |
| cB | cB | cB |
| lI | n | l. |
| Header Field Name!Length (in Octets)!Interpreted as |
| _ |
| c_magic!6!Octal number |
| c_dev!6!Octal number |
| c_ino!6!Octal number |
| c_mode!6!Octal number |
| c_uid!6!Octal number |
| c_gid!6!Octal number |
| c_nlink!6!Octal number |
| c_rdev!6!Octal number |
| c_mtime!11!Octal number |
| c_namesize!6!Octal number |
| c_filesize!11!Octal number |
| _ |
| .T& |
| cB | cB | cB |
| lI lI l. |
| Filename Field Name!Length!Interpreted as |
| _ |
| c_name!c_namesize!Pathname string |
| _ |
| .T& |
| cB | cB | cB |
| lI lI l. |
| File Data Field Name!Length!Interpreted as |
| _ |
| c_filedata!c_filesize!Data |
| .TE |
| .SS "cpio Header" |
| .P |
| For each file in the archive, a header as defined previously shall be |
| written. The information in the header fields is written as streams of |
| the ISO/IEC\ 646:\|1991 standard characters interpreted as octal numbers. The octal numbers |
| shall be extended to the necessary length by appending the ISO/IEC\ 646:\|1991 standard IRV |
| zeros at the most-significant-digit end of the number; the result is |
| written to the most-significant digit of the stream of octets first. |
| The fields shall be interpreted as follows: |
| .IP "\fIc_magic\fR" 10 |
| Identifies the archive as being a transportable archive by containing the |
| identifying value |
| .BR \(dq070707\(dq . |
| .IP "\fIc_dev\fR,\ \fIc_ino\fR" 10 |
| Contains values that uniquely identify the file within the archive |
| (that is, no files contain the same pair of |
| .IR c_dev |
| and |
| .IR c_ino |
| values unless they are links to the same file). The values shall be |
| determined in an unspecified manner. |
| .IP "\fIc_mode\fR" 10 |
| Contains the file type and access permissions as defined in the |
| following table. |
| .br |
| .sp |
| .ce 1 |
| \fBTable 4-17: Values for cpio c_mode Field\fR |
| .TS |
| center box tab(@); |
| cB | cB | cB |
| l | n | l. |
| File Permissions Name@Value@Indicates |
| _ |
| C_IRUSR@000\|400@Read by owner |
| C_IWUSR@000\|200@Write by owner |
| C_IXUSR@000\|100@Execute by owner |
| C_IRGRP@000\|040@Read by group |
| C_IWGRP@000\|020@Write by group |
| C_IXGRP@000\|010@Execute by group |
| C_IROTH@000\|004@Read by others |
| C_IWOTH@000\|002@Write by others |
| C_IXOTH@000\|001@Execute by others |
| C_ISUID@004\|000@Set \fIuid\fP |
| C_ISGID@002\|000@Set \fIgid\fP |
| C_ISVTX@001\|000@Reserved |
| _ |
| .T& |
| cB | cB | cB |
| l | n | l. |
| File Type Name@Value@Indicates |
| _ |
| C_ISDIR@040\|000@Directory |
| C_ISFIFO@010\|000@FIFO |
| C_ISREG@0100\|000@Regular file |
| C_ISLNK@0120\|000@Symbolic link |
| _ |
| C_ISBLK@060\|000@Block special file |
| C_ISCHR@020\|000@Character special file |
| C_ISSOCK@0140\|000@Socket |
| _ |
| C_ISCTG@0110\|000@Reserved |
| .TE |
| .RS 10 |
| .P |
| Directories, FIFOs, symbolic links, and regular files shall be |
| supported on a system conforming to this volume of POSIX.1\(hy2017; additional values defined |
| previously are reserved for compatibility with existing systems. |
| Additional file types may be supported; however, such files should not |
| be written to archives intended to be transported to other systems. |
| .RE |
| .IP "\fIc_uid\fR" 10 |
| Contains the user ID of the owner. |
| .IP "\fIc_gid\fR" 10 |
| Contains the group ID of the group. |
| .IP "\fIc_nlink\fR" 10 |
| Contains a number greater than or equal to the number of links in the |
| archive referencing the file. If the |
| .BR \-a |
| option is used to append to a |
| .IR cpio |
| archive, then the |
| .IR pax |
| utility need not account for the files in the existing part of the |
| archive when calculating the |
| .IR c_nlink |
| values for the appended part of the archive, and need not alter the |
| .IR c_nlink |
| values in the existing part of the archive if additional files with the |
| same |
| .IR c_dev |
| and |
| .IR c_ino |
| values are appended to the archive. |
| .IP "\fIc_rdev\fR" 10 |
| Contains implementation-defined information for character or block |
| special files. |
| .IP "\fIc_mtime\fR" 10 |
| Contains the latest time of modification of the file at the time the |
| archive was created. |
| .IP "\fIc_namesize\fR" 10 |
| Contains the length of the pathname, including the terminating NUL |
| character. |
| .IP "\fIc_filesize\fR" 10 |
| Contains the length in octets of the data section following the |
| header structure. |
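| .P |
| The header fields described above can be decoded with a short Python |
| sketch. It is illustrative only: the function names are hypothetical, |
| and the file type mask 0170000 is an assumption derived as the union |
| of the file type values in Table 4-17. |
| .sp |
| .RS 4 |
| .nf |
| |
```python
CPIO_FIELDS = [  # (name, length in octets) in archive order, from Table 4-16
    ("c_magic", 6), ("c_dev", 6), ("c_ino", 6), ("c_mode", 6),
    ("c_uid", 6), ("c_gid", 6), ("c_nlink", 6), ("c_rdev", 6),
    ("c_mtime", 11), ("c_namesize", 6), ("c_filesize", 11),
]
HEADER_LEN = sum(length for _, length in CPIO_FIELDS)  # 76 octets
FILE_TYPES = {0o040000: "directory", 0o010000: "FIFO",
              0o100000: "regular file", 0o120000: "symbolic link"}

def parse_cpio_header(data):
    """Decode the fixed-length header at the start of a cpio entry."""
    header, pos = {}, 0
    for field, length in CPIO_FIELDS:
        header[field] = int(data[pos:pos + length], 8)
        pos += length
    if header["c_magic"] != 0o070707:
        raise ValueError("not an octet-oriented cpio header")
    return header

def file_type(c_mode):
    """Name the file type in c_mode if it is one of the four required types."""
    return FILE_TYPES.get(c_mode & 0o170000, "other")
```
| .fi |
| .P |
| .RE |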
| .SS "cpio Filename" |
| .P |
| The |
| .IR c_name |
| field shall contain the pathname of the file. The length of this field |
| in octets is the value of |
| .IR c_namesize . |
| .P |
| If a filename is found on the medium that would create an invalid |
| pathname, it is implementation-defined whether the data from the file |
| is stored on the file hierarchy and under what name it is stored. |
| .P |
| All characters shall be represented in the ISO/IEC\ 646:\|1991 standard IRV. For maximum |
| portability between implementations, names should be selected from |
| characters represented by the portable filename character set as |
| octets with the most significant bit zero. If an implementation |
| supports the use of characters outside the portable filename character |
| set in names for files, users, and groups, one or more |
| implementation-defined encodings of these characters shall be provided |
| for interchange purposes. However, the |
| .IR pax |
| utility shall never create filenames on the local system that cannot |
| be accessed via the procedures described previously in this volume of POSIX.1\(hy2017. If a |
| filename is found on the medium that would create an invalid filename, |
| it is implementation-defined whether the data from the file is stored on |
| the local file system and under what name it is stored. The |
| .IR pax |
| utility may choose to ignore these files as long as it produces an |
| error indicating that the file is being ignored. |
| .SS "cpio File Data" |
| .P |
| Following |
| .IR c_name , |
| there shall be |
| .IR c_filesize |
| octets of data. Interpretation of such data occurs in a manner |
| dependent on the file. For regular files, the data shall consist |
| of the contents of the file. For symbolic links, the data shall |
| consist of the contents of the symbolic link. If |
| .IR c_filesize |
| is zero, no data shall be contained in |
| .IR c_filedata . |
| .P |
| When restoring from an archive: |
| .IP " *" 4 |
| If the user does not have appropriate privileges to create a file of |
| the specified type, |
| .IR pax |
| shall ignore the entry and write an error message to standard error. |
| .IP " *" 4 |
| Only regular files and symbolic links have data to be restored. Presuming |
| a regular file meets any selection criteria that might be imposed on |
| the format-reading utility by the user, such data shall be restored. |
| .IP " *" 4 |
| If a user does not have appropriate privileges to set a particular mode |
| flag, the flag shall be ignored. Some of the mode flags in the archive |
| format are not mentioned elsewhere in this volume of POSIX.1\(hy2017. If the implementation does |
| not support those flags, they may be ignored. |
| .SS "cpio Special Entries" |
| .P |
| FIFO special files, directories, and the trailer shall be recorded with |
| .IR c_filesize |
| equal to zero. Symbolic links shall be recorded with |
| .IR c_filesize |
| equal to the length of the contents of the symbolic link. |
| For other special files, |
| .IR c_filesize |
| is unspecified by this volume of POSIX.1\(hy2017. The header for the next file entry in the |
| archive shall be written directly after the last octet of the file |
| entry preceding it. A header denoting the filename |
| .BR TRAILER!!! |
| shall indicate the end of the archive; the contents of octets in the |
| last block of the archive following such a header are undefined. |
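| .P |
| How entries follow one another up to the trailer can be sketched in |
| Python. Both function names are hypothetical, and header fields that |
| do not matter for the illustration are simply written as zero. |
| .sp |
| .RS 4 |
| .nf |
| |
```python
def cpio_entry(name, data=b"", mode=0o100644):
    """Build one cpio entry: header, NUL-terminated name, then data."""
    c_name = name + bytes(1)  # c_namesize counts the terminating NUL
    fields = [(0o070707, 6), (0, 6), (0, 6), (mode, 6), (0, 6), (0, 6),
              (1, 6), (0, 6), (0, 11), (len(c_name), 6), (len(data), 11)]
    header = b"".join(format(value, "o").zfill(width).encode("ascii")
                      for value, width in fields)
    return header + c_name + data

def read_names(archive):
    """Return member names up to, but not including, the trailer."""
    names, pos = [], 0
    while True:
        namesize = int(archive[pos + 59:pos + 65], 8)
        filesize = int(archive[pos + 65:pos + 76], 8)
        name = archive[pos + 76:pos + 76 + namesize - 1]
        if name == b"TRAILER!!!":
            return names  # octets after the trailer are undefined
        names.append(name)
        pos += 76 + namesize + filesize  # next header follows directly
```
| .fi |
| .P |
| .RE |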
| .SH "EXIT STATUS" |
| The following exit values shall be returned: |
| .IP "\00" 6 |
| All files were processed successfully. |
| .IP >0 6 |
| An error occurred. |
| .SH "CONSEQUENCES OF ERRORS" |
| If |
| .IR pax |
| cannot create a file or a link when reading an archive or cannot find a |
| file when writing an archive, or cannot preserve the user ID, group ID, |
| or file mode when the |
| .BR \-p |
| option is specified, a diagnostic message shall be written to standard |
| error and a non-zero exit status shall be returned, but processing |
| shall continue. In the case where |
| .IR pax |
| cannot create a link to a file, |
| .IR pax |
| shall not, by default, create a second copy of the file. |
| .P |
| If the extraction of a file from an archive is prematurely terminated |
| by a signal or error, |
| .IR pax |
| may have only partially extracted the file or (if the |
| .BR \-n |
| option was not specified) may have extracted a file of the same name as |
| that specified by the user, but which is not the file the user wanted. |
| Additionally, the file modes of extracted directories may have |
| additional bits from the S_IRWXU mask set as well as incorrect |
| modification and access times. |
| .LP |
| .IR "The following sections are informative." |
| .SH "APPLICATION USAGE" |
| Caution is advised when using the |
| .BR \-a |
| option to append to a |
| .IR cpio |
| format archive. If any of the files being appended happen to be given |
| the same |
| .IR c_dev |
| and |
| .IR c_ino |
| values as a file in the existing part of the archive, then they may be |
| treated as links to that file on extraction. Thus, it is risky to use |
| .BR \-a |
| with |
| .IR cpio |
| format except when it is done on the same system that the original |
| archive was created on, and with the same |
| .IR pax |
| utility, and in the knowledge that there has been little or no file |
| system activity since the original archive was created that could lead |
| to any of the files appended being given the same |
| .IR c_dev |
| and |
| .IR c_ino |
| values as an unrelated file in the existing part of the archive. Also, |
| when (intentionally) appending additional links to a file in the |
| existing part of the archive, the |
| .IR c_nlink |
| values in the modified archive can be smaller than the number of links |
| to the file in the archive, which may mean that the links are not |
| preserved on extraction. |
| .P |
| The |
| .BR \-p |
| (privileges) option was invented to reconcile differences between |
| historical |
| .IR tar |
| and |
| .IR cpio |
| implementations. In particular, the two utilities use |
| .BR \-m |
| in diametrically opposed ways. The |
| .BR \-p |
| option also provides a consistent means of extending the ways in which |
| future file attributes can be addressed, such as for enhanced security |
| systems or high-performance files. Although it may seem complex, there |
| are really two modes that are most commonly used: |
| .IP "\fB\-p\ e\fR" 8 |
| ``Preserve everything''. This would be used by the historical |
| superuser, someone with all appropriate privileges, to preserve all |
| aspects of the files as they are recorded in the archive. The |
| .BR e |
| flag is the sum of |
| .BR o |
| and |
| .BR p , |
| and other implementation-defined attributes. |
| .IP "\fB\-p\ p\fR" 8 |
| ``Preserve'' the file mode bits. This would be used by the user with |
| regular privileges who wished to preserve aspects of the file other |
| than the ownership. The file times are preserved by default, but two |
| other flags are offered to disable these and use the time of |
| extraction. |
| .P |
| The one pathname per line format of standard input precludes |
| pathnames containing |
| <newline> |
| characters. Although such pathnames violate the portable filename |
| guidelines, they may exist and their presence may inhibit usage of |
| .IR pax |
| within shell scripts. This problem is inherited from historical archive |
| programs. The problem can be avoided by listing filename arguments on |
| the command line instead of on standard input. |
| .P |
| It is almost certain that appropriate privileges are required for |
| .IR pax |
| to accomplish parts of this volume of POSIX.1\(hy2017. Specifically, creating files of type |
| block special or character special, restoring file access times unless |
| the files are owned by the user (the |
| .BR \-t |
| option), or preserving file owner, group, and mode (the |
| .BR \-p |
| option) all probably require appropriate privileges. |
| .P |
| In |
| .BR read |
| mode, implementations are permitted to overwrite files when the archive |
| has multiple members with the same name. This may fail if permissions |
| on the first version of the file do not permit it to be overwritten. |
| .P |
| The |
| .BR cpio |
| and |
| .BR ustar |
| formats can only support files up to 8\|589\|934\|592 bytes |
| (8 \(** 2^30) in size. |
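| .P |
| The figure follows from the 11 octal digits available for the file |
| size, as this Python sketch shows. It assumes that the 12-octet |
| \fBustar\fP size field gives up one octet to its terminator, as the |
| 11-octet \fBcpio\fP field suggests; under that assumption the largest |
| value that can actually be encoded is one less than the figure quoted. |
| The function name is hypothetical. |
| .sp |
| .RS 4 |
| .nf |
| |
```python
SIZE_DIGITS = 11  # octal digits assumed available for the file size

def fits_size_field(size):
    """True if size can be written with SIZE_DIGITS octal digits."""
    return 0 <= size < 8 ** SIZE_DIGITS
```
| .fi |
| .P |
| .RE |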
| .P |
| When archives containing binary header information are listed, the |
| filenames printed may cause strange behavior on some terminals. |
| .P |
| When all of the following are true: |
| .IP " 1." 4 |
| A file of type directory is being placed into an archive. |
| .IP " 2." 4 |
| The |
| .BR ustar |
| archive format is being used. |
| .IP " 3." 4 |
| The pathname of the directory is less than or equal to 155 bytes long |
| (it will fit in the |
| .IR prefix |
| field in the |
| .BR ustar |
| header block). |
| .IP " 4." 4 |
| The last component of the pathname of the directory is longer than 100 |
| bytes (it will not fit in the |
| .IR name |
| field in the |
| .BR ustar |
| header block). |
| .P |
| some implementations of the |
| .IR pax |
| utility will place the entire directory pathname in the |
| .IR prefix |
| field, set the |
| .IR name |
| field to an empty string, and place the directory in the archive. |
| Other implementations of the |
| .IR pax |
| utility will give an error under these conditions because the |
| .IR name |
| field is not large enough to hold the last component of the directory name. |
| This standard allows either behavior. However, when extracting a directory |
| from a |
| .BR ustar |
| format archive, this standard requires that all implementations be able |
| to extract a directory even if the |
| .IR name |
| field contains an empty string as long as the |
| .IR prefix |
| field does not also contain an empty string. |
| .SH EXAMPLES |
| The following command: |
| .sp |
| .RS 4 |
| .nf |
| |
| pax -w -f /dev/rmt/1m . |
| .fi |
| .P |
| .RE |
| .P |
| copies the contents of the current directory to tape drive 1, medium |
| density (assuming historical System V device naming procedures\(emthe |
| historical BSD device name would be |
| .BR /dev/rmt9 ). |
| .P |
| The following commands: |
| .sp |
| .RS 4 |
| .nf |
| |
| mkdir \fInewdir\fR |
| pax -rw \fIolddir newdir\fR |
| .fi |
| .P |
| .RE |
| .P |
| copy the |
| .IR olddir |
| directory hierarchy to |
| .IR newdir . |
| .P |
| The following command: |
| .sp |
| .RS 4 |
| .nf |
| |
| pax -r -s \(aq,\(ha//*usr//*,,\(aq -f a.pax |
| .fi |
| .P |
| .RE |
| .P |
| reads the archive |
| .BR a.pax , |
| with all files rooted in |
| .BR /usr |
| in the archive extracted relative to the current directory. |
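| .P |
| The effect of that substitution expression on individual member names |
| can be previewed without an archive. The sketch below feeds a few |
| hypothetical pathnames through |
| .IR sed , |
| on the assumption that its |
| .BR s |
| command interprets the expression the same way; the function name is |
| illustrative. |

```shell
# Apply the substitution from the example above to names read from stdin.
strip_usr() {
    sed 's,^//*usr//*,,'
}

# Names rooted in /usr lose that prefix; other names pass through unchanged.
printf '%s\n' /usr/bin/ls //usr//lib/libc.a /etc/passwd | strip_usr
```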
| .P |
| Using the option: |
| .sp |
| .RS 4 |
| .nf |
| |
| -o listopt="%M %(atime)T %(size)D %(name)s" |
| .fi |
| .P |
| .RE |
| .P |
| overrides the default output description in Standard Output and instead |
| writes: |
| .sp |
| .RS 4 |
| .nf |
| |
| -rw-rw--- Jan 12 15:53 2003 1492 /usr/foo/bar |
| .fi |
| .P |
| .RE |
| .P |
| Using the options: |
| .sp |
| .RS 4 |
| .nf |
| |
| -o listopt=\(aq%L\et%(size)D\en%.7\(aq \e |
| -o listopt=\(aq(name)s\en%(atime)T\en%T\(aq |
| .fi |
| .P |
| .RE |
| .P |
| overrides the default output description in Standard Output and instead |
| writes: |
| .sp |
| .RS 4 |
| .nf |
| |
| /usr/foo/bar -> /tmp 1492 |
| /usr/fo |
| Jan 12 15:53 1991 |
| Jan 31 15:53 2003 |
| .fi |
| .P |
| .RE |
| .SH RATIONALE |
| The |
| .IR pax |
| utility was new for the ISO\ POSIX\(hy2:\|1993 standard. It represents a peaceful |
| compromise between advocates of the historical |
| .IR tar |
| and |
| .IR cpio |
| utilities. |
| .P |
| A fundamental difference between |
| .IR cpio |
| and |
| .IR tar |
| was in the way directories were treated. The |
| .IR cpio |
| utility did not treat directories differently from other files, and to |
| select a directory and its contents required that each file in the |
| hierarchy be explicitly specified. For |
| .IR tar , |
| a directory matched every file in the file hierarchy it rooted. |
| .P |
| The |
| .IR pax |
| utility offers both interfaces; by default, directories map into the |
| file hierarchy they root. The |
| .BR \-d |
| option causes |
| .IR pax |
| to skip any file not explicitly referenced, as |
| .IR cpio |
| historically did. The |
| .IR tar |
| .BR \- \c |
| .IR style |
| behavior was chosen as the default because it was believed that this |
| was the more common usage and because |
| .IR tar |
| is the more commonly available interface, as it was historically |
| provided on both System V and BSD implementations. |
| .P |
| The data interchange format specification in this volume of POSIX.1\(hy2017 requires that |
| processes with ``appropriate privileges'' shall always restore the |
| ownership and permissions of extracted files exactly as archived. If |
| viewed from the historic equivalence between superuser and |
| ``appropriate privileges'', there are two problems |
| with this requirement. First, users running as superusers may |
| unknowingly set dangerous permissions on extracted files. Second, it is |
| needlessly limiting, in that superusers cannot extract files and own |
| them as superuser unless the archive was created by the superuser. (It |
| should be noted that restoration of ownerships and permissions for the |
| superuser, by default, is historical practice in |
| .IR cpio , |
| but not in |
| .IR tar .) |
| In order to avoid these two problems, the |
| .IR pax |
| specification has an additional ``privilege'' mechanism, the |
| .BR \-p |
| option. Only a |
| .IR pax |
| invocation with the privileges needed, and which has the |
| .BR \-p |
| option set using the |
| .BR e |
| specification character, has appropriate privileges to restore |
| full ownership and permission information. |
| .P |
| Note also that this volume of POSIX.1\(hy2017 requires that the file ownership and access |
| permissions shall be set, on extraction, in the same fashion as the |
| \fIcreat\fR() |
| function when provided with the mode stored in the archive. This means |
| that the file creation mask of the user is applied to the file |
| permissions. |
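| .P |
| As a sketch of that rule, the permission bits that result are the |
| archived mode with the bits of the file creation mask cleared. The |
| function name and the sample values below are hypothetical. |

```shell
# Permission bits left after applying a file creation mask to an
# archived mode. Both arguments are octal strings such as 0666 and 0027.
effective_mode() {
    printf '%03o\n' $(( $1 & ~$2 ))
}

effective_mode 0666 0027    # archived mode 0666, mask 0027
```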
| .P |
| Users should note that directories may be created by |
| .IR pax |
| while extracting files with permissions that are different from those |
| that existed at the time the archive was created. When extracting |
| sensitive information into a directory hierarchy that no longer exists, |
| users are encouraged to set their file creation mask appropriately to |
| protect these files during extraction. |
| .P |
| The table of contents output is written to standard output to |
| facilitate pipeline processing. |
| .P |
| An early proposal had hard links displayed for all pathnames. This |
| was removed because it complicates the output of the case where |
| .BR \-v |
| is not specified and does not match historical |
| .IR cpio |
| usage. The hard-link information is available in the |
| .BR \-v |
| display. |
| .P |
| The description of the |
| .BR \-l |
| option allows implementations to make hard links to symbolic links. |
| Earlier versions of this standard did not specify any way to create a |
| hard link to a symbolic link, but many implementations provided this |
| capability as an extension. If there are hard links to symbolic links |
| when an archive is created, the implementation is required to archive |
| the hard link in the archive (unless |
| .BR \-H |
| or |
| .BR \-L |
| is specified). When in |
| .BR read |
| mode and in |
| .BR copy |
| mode, implementations supporting hard links to symbolic links should |
| use them when appropriate. |
| .P |
| The archive formats inherited from the POSIX.1\(hy1990 standard have certain restrictions |
| that have been brought along from historical usage. For example, there |
| are restrictions on the length of pathnames stored in the archive. |
| When |
| .IR pax |
| is used in |
| .BR copy (\c |
| .BR \-rw ) |
| mode (copying directory hierarchies), the ability to use extensions |
| from the |
| .BR \-x \c |
| .BR pax |
| format overcomes these restrictions. |
| .P |
| The default |
| .IR blocksize |
| value of 5\|120 bytes for |
| .IR cpio |
| was selected because it is one of the standard block-size values for |
| .IR cpio , |
| set when the |
| .BR \-B |
| option is specified. (The other default block-size value for |
| .IR cpio |
| is 512 bytes, and this was considered to be too small.) The default |
| block value of 10\|240 bytes for |
| .IR tar |
| was selected because that is the standard block-size value for BSD |
| .IR tar . |
| The maximum block size of 32\|256 bytes (2**15\-512 bytes) |
| is the largest multiple of 512 bytes that fits into a signed 16-bit |
| tape controller transfer register. There are known limitations in some |
| historical systems that would prevent larger blocks from being |
| accepted. Historical values were chosen to improve compatibility with |
| historical scripts using |
| .IR dd |
| or similar utilities to manipulate archives. Also, default block sizes |
| for any file type other than character special file have been deleted |
| from this volume of POSIX.1\(hy2017 as unimportant and not likely to affect the structure of the |
| resulting archive. |
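| .P |
| The block-size figures quoted above can be checked with shell |
| arithmetic. The sketch assumes a signed 16-bit register holds at most |
| 32\|767; the variable names are illustrative. |

```shell
# Largest value a signed 16-bit register can hold
max_signed16=$(( (1 << 15) - 1 ))

# Round down to a multiple of 512 to get the maximum block size
max_block=$(( max_signed16 / 512 * 512 ))
echo "maximum block size: $max_block"

# The defaults expressed in 512-byte units
echo "cpio default: $(( 5120 / 512 )) units"
echo "tar default:  $(( 10240 / 512 )) units"
```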
| .P |
| Implementations are permitted to modify the block-size value based on |
| the archive format or the device to which the archive is being |
| written. This is to provide implementations with the opportunity to |
| take advantage of special types of devices, and it should not be used |
| without a great deal of consideration as it almost certainly decreases |
| archive portability. |
| .P |
| The intended use of the |
| .BR \-n |
| option was to permit extraction of one or more files from the archive |
| without processing the entire archive. This was viewed by the standard |
| developers as offering significant performance advantages over |
| historical implementations. The |
| .BR \-n |
| option in early proposals had three effects; the first was to cause |
| special characters in patterns to not be treated specially. The second |
| was to cause only the first file that matched a pattern to be |
| extracted. The third was to cause |
| .IR pax |
| to write a diagnostic message to standard error when no file was found |
| matching a specified pattern. Only the second behavior is retained by |
| this volume of POSIX.1\(hy2017, for many reasons. First, it is in general not acceptable for a |
| single option to have multiple effects. Second, the ability to make |
| pattern matching characters act as normal characters |
| is useful for parts of |
| .IR pax |
| other than file extraction. Third, a finer degree of control over the |
| special characters is useful because users may wish to normalize only a |
| single special character in a single filename. Fourth, given a more |
| general escape mechanism, the previous behavior of the |
| .BR \-n |
| option can be easily obtained using the |
| .BR \-s |
| option or a |
| .IR sed |
| script. Finally, writing a diagnostic message when a pattern specified |
| by the user is unmatched by any file is useful behavior in all cases. |
| .P |
| In this version, the |
| .BR \-n |
| was removed from the |
| .BR copy |
| mode synopsis of |
| .IR pax ; |
| it is inapplicable because there are no pattern operands specified in |
| this mode. |
| .P |
| Besides |
| .IR pax , |
| POSIX.1\(hy2008 describes another method for copying subtrees as part of the |
| .IR cp |
| utility. Both methods are historical practice: |
| .IR cp |
| provides a simpler, more intuitive interface, while |
| .IR pax |
| offers a finer granularity of control. Each provides functionality |
| that the other lacks; in particular, |
| .IR pax |
| maintains the hard-link structure of the hierarchy while |
| .IR cp |
| does not. It is the intention of the standard developers that the |
| results be similar (using appropriate option combinations in both |
| utilities). The results are not required to be identical; there seemed |
| insufficient gain to applications to balance the difficulty of |
| implementations having to guarantee that the results would be exactly |
| identical. |
| .P |
| A single archive may span more than one file. It is suggested that |
| implementations provide informative messages to the user on standard |
| error whenever the archive file is changed. |
| .P |
| The |
| .BR \-d |
| option (do not create intermediate directories not listed in the |
| archive) found in early proposals was originally provided as a |
| complement to the historic |
| .BR \-d |
| option of |
| .IR cpio . |
| It has been deleted. |
| .P |
| The |
| .BR \-s |
| option in early proposals specified a subset of the substitution |
| command from the |
| .IR ed |
| utility. As there was no reason for only a subset to be supported, the |
| .BR \-s |
| option is now compatible with the current |
| .IR ed |
| specification. Since the delimiter can be any non-null character, the |
| following usage with single |
| <space> |
| characters is valid: |
| .sp |
| .RS 4 |
| .nf |
| |
| pax -s " foo bar " ... |
| .fi |
| .P |
| .RE |
| .P |
| The |
| .BR \-t |
| description is worded so as to note that this may cause the access time |
| update caused by some other activity (which occurs while the file is |
| being read) to be overwritten. |
| .P |
| The default behavior of |
| .IR pax |
| with regard to file modification times is the same as historical |
| implementations of |
| .IR tar . |
| It is not the historical behavior of |
| .IR cpio . |
| .P |
| Because the |
| .BR \-i |
| option uses |
| .BR /dev/tty , |
| utilities without a controlling terminal are not able to use this |
| option. |
| .P |
| The |
| .BR \-y |
| option, found in early proposals, has been deleted because a line |
| containing a single |
| <period> |
| for the |
| .BR \-i |
| option has equivalent functionality. The special lines for the |
| .BR \-i |
| option (a single |
| <period> |
| and the empty line) are historical practice in |
| .IR cpio . |
| .P |
| In early drafts, a |
| .BR \-e \c |
| .IR charmap |
| option was included to increase portability of files between systems |
| using different coded character sets. This option was omitted because |
| it was apparent that consensus could not be formed for it. In this |
| version, the use of UTF\(hy8 should be an adequate substitute. |
| .P |
| The ISO\ POSIX\(hy2:\|1993 standard and ISO\ POSIX\(hy1 standard requirements for |
| .IR pax , |
| however, made it very difficult to create a single archive containing |
| files created using extended characters provided by different locales. |
| This version adds the |
| .BR hdrcharset |
| keyword to make it possible to archive files in these cases without |
| dropping files due to translation errors. |
| .P |
| Translating filenames and other attributes from a locale's encoding to |
| UTF\(hy8 and then back again can lose information, as the resulting |
| filename might not be byte-for-byte equivalent to the original. To |
| avoid this problem, users can specify the |
| .BR \-o |
| .BR hdrcharset=binary |
| option, which will cause the resulting archive to use binary |
| format for all names and attributes. Such archives are not portable |
| among hosts that use different native encodings (e.g., EBCDIC |
| \fIversus\fR ASCII-based encodings), but they will allow interchange |
| among the vast majority of POSIX file systems in practical use. Also, |
| the |
| .BR \-o |
| .BR hdrcharset=binary |
| option will cause |
| .IR pax |
| in |
| .BR copy |
| mode to behave more like other standard utilities such as |
| .IR cp . |
| .P |
| If the values specified by the |
| .BR \-o |
| .BR exthdr.name=value , |
| .BR \-o |
| .BR globexthdr.name=value , |
| or by |
| .BR $TMPDIR |
| (if |
| .BR \-o |
| .BR globexthdr.name |
| is not specified) require a character encoding other than that |
| described in the ISO/IEC\ 646:\|1991 standard, a |
| .BR path |
| extended header record will have to be created for the file. If a |
| .BR hdrcharset |
| extended header record is active for such headers, it will determine |
| the codeset used for the value field in these extended |
| .BR path |
| header records. These |
| .BR path |
| extended header records always need to be created when writing an |
| archive even if |
| .BR hdrcharset=binary |
| has been specified and would contain the same (binary) data that |
| appears in the |
| .BR ustar |
| header record prefix and |
| .IR name |
| fields. (In other words, an extended header |
| .BR path |
| record is always required to be generated if the |
| .IR prefix |
| or |
| .IR name |
| fields contain non-ASCII characters even when |
| .BR hdrcharset=binary |
| is also in effect for that file.) |
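| .P |
| One way to picture the condition in the parenthetical is a test for |
| bytes outside 7-bit ASCII in a name. The shell sketch below is an |
| illustration under that reading, not a description of how any |
| .IR pax |
| implementation decides; the function name is hypothetical. |

```shell
# Succeed if the argument contains any byte outside 7-bit ASCII.
has_non_ascii() {
    # Delete every ASCII byte, then count what remains.
    rest=$(( $(printf '%s' "$1" | LC_ALL=C tr -d '\000-\177' | wc -c) ))
    [ "$rest" -gt 0 ]
}

name=$(printf 'caf\303\251.txt')    # UTF-8 encoded e-acute in the name
if has_non_ascii "$name"; then
    echo "non-ASCII name"
fi
```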
| .P |
| The |
| .BR \-k |
| option was added to address international concerns about the dangers |
| involved in the character set transformations of |
| .BR \-e |
| (if the target character set were different from the source, the |
| filenames might be transformed into names matching existing files) and |
| also was made more general to protect files transferred between file |
| systems with different |
| {NAME_MAX} |
| values (truncating a filename on a smaller system might also |
| inadvertently overwrite existing files). As stated, it prevents any |
| overwriting, even if the target file is older than the source. This |
| version adds more granularity of options to solve this problem by |
| introducing the |
| .BR \-o \c |
| .BR invalid=option \c |
| \(emspecifically the |
| .BR UTF\(hy8 |
| and |
| .BR binary |
| actions. (Note that an existing file is still subject to overwriting in |
| this case. The |
| .BR \-k |
| option closes that loophole.) |
| .P |
| Some of the file characteristics referenced in this volume of POSIX.1\(hy2017 might not be |
| supported by some archive formats. For example, neither the |
| .BR tar |
| nor |
| .BR cpio |
| formats contain the file access time. For this reason, the |
| .BR e |
| specification character has been provided, intended to cause all file |
| characteristics specified in the archive to be retained. |
| .P |
| It is required that extracted directories, by default, have their |
| access and modification times and permissions set to the values |
| specified in the archive. This has obvious problems in that the |
| directories are almost certainly modified after being extracted and |
| that directory permissions may not permit file creation. One possible |
| solution is to create directories with the mode specified in the |
| archive, as modified by the |
| .IR umask |
| of the user, with sufficient permissions to allow file creation. After |
| all files have been extracted, |
| .IR pax |
| would then reset the access and modification times and permissions as |
| necessary. |
| .P |
| The list-mode formatting description borrows heavily from the one |
| defined by the |
| .IR printf |
| utility. However, since there is no separate operand list to get |
| conversion arguments, the format was extended to allow specifying the |
| name of the conversion argument as part of the conversion |
| specification. |
| .P |
| The |
| .BR T |
| conversion specifier allows time fields to be displayed in any of |
| the date formats. Unlike the |
| .IR ls |
| utility, |
| .IR pax |
| does not adjust the format when the date is less than six months in the |
| past. This makes parsing the output more predictable. |
| .P |
| The |
| .BR D |
| conversion specifier handles the ability to display the major/minor |
| or file size, as with |
| .IR ls , |
| by using \fR%\-8(\fIsize\fR)D\fR. |
| .P |
| The |
| .BR L |
| conversion specifier handles the |
| .IR ls |
| display for symbolic links. |
| .P |
| Conversion specifiers were added to generate existing known types used |
| for |
| .IR ls . |
| .SS "pax Interchange Format" |
| .P |
| The new POSIX data interchange format was developed primarily to |
| satisfy international concerns that the |
| .BR ustar |
| and |
| .BR cpio |
| formats did not provide for file, user, and group names encoded in |
| characters outside a subset of the ISO/IEC\ 646:\|1991 standard. The standard developers |
| realized that this new POSIX data interchange format should be very |
| extensible because there were other requirements they foresaw in the |
| near future: |
| .IP " *" 4 |
| Support international character encodings and locale information |
| .IP " *" 4 |
| Support security information (ACLs, and so on) |
| .IP " *" 4 |
| Support future file types, such as realtime or contiguous files |
| .IP " *" 4 |
| Include data areas for implementation use |
| .IP " *" 4 |
| Support systems with words larger than 32 bits and timers with |
| subsecond granularity |
| .P |
| The following were not goals for this format because these are better |
| handled by separate utilities or are inappropriate for a portable |
| format: |
| .IP " *" 4 |
| Encryption |
| .IP " *" 4 |
| Compression |
| .IP " *" 4 |
| Data translation between locales and codesets |
| .IP " *" 4 |
| .IR inode |
| storage |
| .P |
| The format chosen to support the goals is an extension of the |
| .BR ustar |
| format. Of the two formats previously available, only the |
| .BR ustar |
| format was selected for extensions because: |
| .IP " *" 4 |
| It was easier to extend in an upwards-compatible way. It offered version |
| flags and header block type fields with room for future |
| standardization. The |
| .BR cpio |
| format, while possessing a more flexible file naming methodology, could |
| not be extended without breaking some theoretical implementation |
| or using a dummy filename that could be a legitimate filename. |
| .IP " *" 4 |
| Industry experience since the original ``\c |
| .IR tar |
| wars'' fought in developing the ISO\ POSIX\(hy1 standard has clearly been in favor of the |
| .BR ustar |
| format, which is generally the default output format selected for |
| .IR pax |
| implementations on new systems. |
| .P |
| The new format was designed with one additional goal in mind: |
| reasonable behavior when an older |
| .IR tar |
| or |
| .IR pax |
| utility happened to read an archive. Since the POSIX.1\(hy1990 standard mandated that a |
| ``format-reading utility'' had to treat unrecognized |
| .IR typeflag |
| values as regular files, this allowed the format to include all the |
| extended information in a pseudo-regular file that preceded each real |
| file. An option is given that allows the archive creator to set up |
| reasonable names for these files on the older systems. Also, the |
| normative text suggests that reasonable file access values be used for |
| this |
| .BR ustar |
| header block. Making these header files inaccessible for convenient |
| reading and deleting would not be reasonable. File permissions of 600 |
| or 700 are suggested. |
| .P |
| The |
| .BR ustar |
| .IR typeflag |
| field was used to accommodate the additional functionality of the new |
| format rather than magic or version because the POSIX.1\(hy1990 standard (and, by |
| reference, the previous version of |
| .IR pax ), |
| mandated the behavior of the format-reading utility when it encountered |
| an unknown |
| .IR typeflag , |
| but was silent about the other two fields. |
| .P |
| Early proposals for the first version of this standard contained a proposed |
| archive format that was based on compatibility with the standard for |
| tape files (ISO\ 1001, similar to the format used historically on many |
| mainframes and minicomputers). This format was overly complex and required |
| considerable overhead in volume and header records. Furthermore, the |
| standard developers felt that it would not be acceptable to the community |
| of POSIX developers, so it was later changed to be a format more closely |
| related to historical practice on POSIX systems. |
| .P |
| The prefix and name split of pathnames in |
| .BR ustar |
| was replaced by the single path extended header record for simplicity. |
| .P |
| The concept of a global extended header (\c |
| .IR typeflag \c |
| .BR g ) |
| was controversial. If this were applied to an archive being recorded on |
| magnetic tape, a few unreadable blocks at the beginning of the tape |
| could be a serious problem; a utility attempting to extract as many |
| files as possible from a damaged archive could lose a large percentage |
| of file header information in this case. However, if the archive were |
| on a reliable medium, such as a CD\(hyROM, the global extended header |
| offers considerable potential size reductions by eliminating redundant |
| information. Thus, the text warns against using the global method for |
| unreliable media and provides a method for implanting global |
| information in the extended header for each file, rather than in the |
| .IR typeflag |
| .BR g |
| records. |
| .P |
| No facility for data translation or filtering on a per-file basis is |
| included because the standard developers could not invent an interface |
| that would allow this in an efficient manner. If a filter, such as |
| encryption or compression, is to be applied to all the files, it is |
| more efficient to apply the filter to the entire archive as a single |
| file. The standard developers considered interfaces that would invoke a |
| shell script for each file going into or out of the archive, but the |
| system overhead in this approach was considered to be too high. |
| .P |
| One such approach would be to have |
| .BR filter= |
| records that give a pathname for an executable. When the program is |
| invoked, the file and archive would be open for standard input/output |
| and all the header fields would be available as environment variables |
| or command-line arguments. The standard developers did discuss such |
| schemes, but they were omitted from POSIX.1\(hy2008 due to concerns about |
| excessive overhead. Also, the program itself would need to be in the |
| archive if it were to be used portably. |
| .P |
| There is currently no portable means of identifying the character |
| set(s) used for a file in the file system. Therefore, |
| .IR pax |
| has not been given a mechanism to generate charset records |
| automatically. The only portable means of doing this is for the user to |
| write the archive using the |
| .BR \-o \c |
| .BR charset=string |
| command line option. This assumes that all of the files in the archive |
| use the same encoding. The ``implementation-defined'' text is |
| included to allow for a system that can identify the encodings used for |
| each of its files. |
| .P |
| The table of standards that accompanies the charset record description |
| is acknowledged to be very limited. Only a limited number of character |
| set standards is reasonable for maximal interchange. Any character set |
| is, of course, possible by prior agreement. It was suggested that |
| EBCDIC be listed, but it was omitted because it is not defined by a |
| formal standard. Formal standards, and then only those with reasonably |
| large followings, can be included here, simply as a matter of |
| practicality. The <\fIvalue\fP>s represent names of officially |
| registered character sets in the format required by the ISO\ 2375:\|1985 standard. |
| .P |
| The normal |
| <comma> |
| or |
| <blank>-separated |
| list rules are not followed in the case of keyword options to allow |
| ease of argument parsing for |
| .IR getopts . |
| .P |
| Further information on character encodings is in |
| .IR "pax Archive Character Set Encoding/Decoding" . |
| .P |
| The standard developers have reserved keyword name space for vendor |
| extensions. It is suggested that the format to be used is: |
| .sp |
| .RS 4 |
| .nf |
| |
| \fIVENDOR.keyword\fR |
| .fi |
| .P |
| .RE |
| .P |
| where |
| .IR VENDOR |
| is the name of the vendor or organization in all uppercase letters. It |
| is further suggested that the keyword following the |
| <period> |
| be named differently than any of the standard keywords so that it could |
| be used for future standardization, if appropriate, by omitting the |
| .IR VENDOR |
| prefix. |
| .P |
| The <\fIlength\fP> field in the extended header record was included to |
| make it simpler to step through the records, even if a record contains |
| an unknown format (to a particular |
| .IR pax ) |
| with complex interactions of special characters. It also provides a |
| minor integrity checkpoint within the records to aid a program |
| attempting to recover files from a damaged archive. |
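| .P |
| Because the length counts the whole record, including its own digits, |
| a writer has to settle on a value that is consistent with itself. The |
| shell sketch below assumes the record layout is the decimal length, a |
| space, the keyword, an equals sign, the value, and a newline; the |
| function name is hypothetical. |

```shell
# Build one extended header record of the assumed form
#   "<length> <keyword>=<value>\n"
# where <length> is the byte count of the entire record.
pax_record() {
    body=" $1=$2"
    # Bytes in everything after the length digits, plus the newline
    base=$(( $(printf '%s' "$body" | wc -c) + 1 ))
    len=$base
    # Adding the digits can itself add a digit, so repeat until stable.
    while :; do
        total=$(( base + ${#len} ))
        [ "$total" -eq "$len" ] && break
        len=$total
    done
    printf '%d%s\n' "$len" "$body"
}

pax_record path /etc/hosts
```

| .P |
| A reader can then skip a record it does not understand by consuming |
| exactly that many bytes. |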
| .P |
| There are no extended header versions of the |
| .IR devmajor |
| and |
| .IR devminor |
| fields because the unspecified format |
| .BR ustar |
| header field should be sufficient. If they are not, vendor-specific |
| extended keywords (such as |
| .IR VENDOR.devmajor ) |
| should be used. |
| .P |
| Device and |
| .IR i -number |
| labeling of files was not adopted from |
| .IR cpio ; |
| files are interchanged strictly on a symbolic name basis, as in |
| .BR ustar . |
| .P |
| Just as with the |
| .BR ustar |
| format descriptions, the new format makes no special arrangements for |
| multi-volume archives. Each of the |
| .IR pax |
| archive types is assumed to be inside a single POSIX file and splitting |
| that file over multiple volumes (diskettes, tape cartridges, and so |
| on), processing their labels, and mounting each in the proper sequence |
| are considered to be implementation details that cannot be described |
| portably. |
| .P |
| The |
| .BR pax |
| format is intended for interchange, not only for backup on a single |
| (family of) systems. It is not as densely packed as might be possible |
| for backup: |
| .IP " *" 4 |
| It contains information as coded characters that could be coded in |
| binary. |
| .IP " *" 4 |
| It identifies extended records with name fields that could be omitted |
| in favor of a fixed-field layout. |
| .IP " *" 4 |
| It translates names into a portable character set and identifies |
| locale-related information, both of which are probably unnecessary for |
| backup. |
| .P |
| The requirements on restoring from an archive are slightly different |
| from the historical wording, allowing for non-monolithic privilege to |
| bring forward as much as possible. In particular, attributes such as |
| ``high performance file'' might be broadly but not universally granted |
| while set-user-ID or |
| \fIchown\fR() |
| might be much more restricted. There is no implication in POSIX.1\(hy2008 that |
| the security information be honored after it is restored to the file |
| hierarchy, in spite of what might be improperly inferred by the silence |
| on that topic. That is a topic for another standard. |
| .P |
| Links are recorded in the fashion described here because a link can be |
| to any file type. It is desirable in general to be able to restore part |
| of an archive selectively and restore all of those files completely. If |
| the data is not associated with each link, it is not possible to do |
| this. However, the data associated with a file can be large, and when |
| selective restoration is not needed, this can be a significant burden. |
| The archive is structured so that files that have no associated data |
| can always be restored by the name of any link, and |
| the user may choose whether data is recorded with each instance of a |
| file that contains data. The format permits mixing of both types of |
| links in a single archive; this can be done for special needs, and |
| .IR pax |
| is expected to interpret such archives on input properly, despite the |
| fact that there is no |
| .IR pax |
| option that would force this mixed case on output. (When |
| .BR \-o |
| .BR linkdata |
| is used, the output must contain the duplicate data, but the |
| implementation is free to include it or omit it when |
| .BR \-o |
| .BR linkdata |
| is not used.) |
| .P |
| The time values are included as extended header records for those |
| implementations needing more than the eleven octal digits allowed by |
| the |
| .BR ustar |
| format. Portable file timestamps cannot be negative. If |
| .IR pax |
| encounters a file with a negative timestamp in |
| .BR copy |
| or |
| .BR write |
| mode, it can reject the file, substitute a non-negative timestamp, or |
| generate a non-portable timestamp with a leading |
| .BR '\-' . |
| Even though some implementations can support finer file-time |
| granularities than seconds, the normative text requires support only |
| for seconds since the Epoch because the ISO\ POSIX\(hy1 standard states them that way. The |
| .BR ustar |
| format includes only |
| .IR mtime ; |
| the new format adds |
| .IR atime |
| and |
| .IR ctime |
| for symmetry. The |
| .IR atime |
| access time restored to the file system will be affected by the |
| .BR \-p |
| .BR a |
| and |
| .BR \-p |
| .BR e |
| options. The |
| .IR ctime |
| creation time (actually |
| .IR inode |
| modification time) is described with appropriate privileges so that |
| it can be ignored when writing to the file system. POSIX does not |
| provide a portable means to change file creation time. Nothing is |
| intended to prevent a non-portable implementation of |
| .IR pax |
| from restoring the value. |
| .P |
| The |
| .IR gid , |
| .IR size , |
| and |
| .IR uid |
| extended header records were included to allow expansion beyond the |
| sizes specified in the regular |
| .IR tar |
| header. New file system architectures are emerging that will exhaust |
| the 12-digit size field. There are probably not many systems requiring |
| more than 8 digits for user and group IDs, but the extended header |
| values were included for completeness, allowing overrides for all of |
| the decimal values in the |
| .IR tar |
| header. |
| .P |
| The standard developers intended to describe the effective results of |
| .IR pax |
| with regard to file ownerships and permissions; implementations are not |
| restricted in timing or sequencing the restoration of such, provided |
| the results are as specified. |
| .P |
| Much of the text describing the extended headers refers to use in ``\c |
| .BR write |
| or |
| .BR copy |
| modes''. The |
| .BR copy |
| mode references are due to the normative text: ``The effect of the |
| copy shall be as if the copied files were written to an archive file |
| and then subsequently extracted .\|.\|.''. There is certainly no way to |
| test whether |
| .IR pax |
| is actually generating the extended headers in |
| .BR copy |
| mode, but the effects must be as if it had. |
| .SS "pax Archive Character Set Encoding/Decoding" |
| .P |
| There is a need to exchange archives of files between systems of |
| different native codesets. Filenames, group names, and user names must |
| be preserved to the fullest extent possible when an archive is read on |
| the receiving platform. Translation of the contents of files is not |
| within the scope of the |
| .IR pax |
| utility. |
| .P |
| There is also a need to represent characters that are not |
| available on the receiving platform. These unsupported characters |
| cannot be automatically folded to the local set of characters due to |
| the chance of collisions. Such collisions could overwrite files |
| previously extracted from the archive or pre-existing files on the system. |
| .P |
| For these reasons, the codeset used to represent characters within the |
| extended header records of the |
| .IR pax |
| archive must be sufficiently rich to handle all commonly used character |
| sets. The fields requiring translation include, at a minimum, |
| filenames, user names, group names, and link pathnames. Implementations |
| may wish to have localized extended keywords that use non-portable |
| characters. |
| .P |
| The standard developers considered the following options: |
| .IP " *" 4 |
| The archive creator specifies the well-defined name of the source |
| codeset. The receiver must then recognize the codeset name and perform |
| the appropriate translations to the destination codeset. |
| .IP " *" 4 |
| The archive creator includes within the archive the character mapping |
| table for the source codeset used to encode extended header records. |
| The receiver must then read the character mapping table and perform the |
| appropriate translations to the destination codeset. |
| .IP " *" 4 |
| The archive creator translates the extended header records in the |
| source codeset into a canonical form. The receiver must then perform |
| the appropriate translations to the destination codeset. |
| .P |
| The approach that incorporates the name of the source codeset poses the |
| problem of codeset name registration, and makes the archive useless to |
| .IR pax |
| archive decoders that do not recognize that codeset. |
| .P |
| Because parts of an archive may be corrupted, the standard developers |
| felt that including the character map of the source codeset was too |
| fragile. The loss of this one key component could result in making the |
| entire archive useless. (The difference between this and the global |
| extended header decision was that the latter has a |
| workaround\(emduplicating extended header records on unreliable |
| media\(embut this would be too burdensome for large character set |
| maps.) |
| .P |
| Both of the above approaches also put an undue burden on the |
| .IR pax |
| archive receiver to handle the cross-product of all source and |
| destination codesets. |
| .P |
| To simplify the translation from the source codeset to the canonical |
| form and from the canonical form to the destination codeset, the |
| standard developers decided that the internal representation should be |
| a stateless encoding. A stateless encoding is one where each codepoint |
| has the same meaning, without regard to the decoder being in a specific |
| state. An example of a stateful encoding would be the Japanese |
| Shift-JIS; an example of a stateless encoding would be the ISO/IEC\ 646:\|1991 standard |
| (equivalent to 7-bit ASCII). |
| .P |
| For these reasons, the standard developers decided to adopt a canonical |
| format for the representation of file information strings. The obvious, |
| well-endorsed candidate is the ISO/IEC\ 10646\(hy1:\|2000 standard (based in part on Unicode), which |
| can be used to represent the characters of virtually all standardized |
| character sets. The standard developers initially agreed upon using |
| UCS2 (16-bit Unicode) as the internal representation. This repertoire |
| of characters provides a sufficiently rich set to represent all |
| commonly-used codesets. |
| .P |
| However, the standard developers found that the 16-bit Unicode |
| representation had some problems. It forced the issue of standardizing |
| byte ordering. The 2-byte length of each character made the extended |
| header records twice as long for the case of strings coded entirely |
| from historical 7-bit ASCII. For these reasons, the standard developers |
| chose the UTF\(hy8 defined in the ISO/IEC\ 10646\(hy1:\|2000 standard. This multi-byte representation |
| encodes UCS2 or UCS4 characters reliably and deterministically, |
| eliminating the need for a canonical byte ordering. In addition, NUL |
| octets and other characters possibly confusing to POSIX file systems do |
| not appear, except to represent themselves. It was realized that |
| certain national codesets take up more space after the encoding, due to |
| their placement within the UCS range; it was felt that the usefulness |
| of the encoding of the names outweighs the disadvantage of size |
| increase for file, user, and group names. |
| .P |
| The encoding of UTF\(hy8 is as follows: |
| .sp |
| .RS 4 |
| .nf |
| |
| UCS4 Hex Encoding    UTF-8 Binary Encoding |
| .P |
| 00000000-0000007F    0xxxxxxx |
| 00000080-000007FF    110xxxxx 10xxxxxx |
| 00000800-0000FFFF    1110xxxx 10xxxxxx 10xxxxxx |
| 00010000-001FFFFF    11110xxx 10xxxxxx 10xxxxxx 10xxxxxx |
| 00200000-03FFFFFF    111110xx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx |
| 04000000-7FFFFFFF    1111110x 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx 10xxxxxx |
| .fi |
| .P |
| .RE |
| .P |
| where each |
| .BR 'x' |
| represents a bit value from the character being translated. |
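| .P |
| The table can be read as an algorithm: pick the row whose range |
| contains the value, then fill the |
| .BR 'x' |
| positions from the most significant bit downwards. The following |
| Python sketch illustrates this; the function name is hypothetical. It |
| follows the table as given here, including the five-octet and six-octet |
| forms, whereas later definitions of UTF\(hy8 (RFC 3629) stop at four |
| octets. |

```python
def utf8_encode(cp):
    """Encode one UCS4 value using the bit layout shown in the table."""
    if cp < 0 or cp > 0x7FFFFFFF:
        raise ValueError("value outside the UCS4 range of the table")
    if cp <= 0x7F:
        return bytes([cp])
    # (upper bound of row, lead-octet marker, number of continuation octets)
    rows = (
        (0x7FF, 0xC0, 1),
        (0xFFFF, 0xE0, 2),
        (0x1FFFFF, 0xF0, 3),
        (0x3FFFFFF, 0xF8, 4),
        (0x7FFFFFFF, 0xFC, 5),
    )
    for limit, lead, extra in rows:
        if cp <= limit:
            break
    # The lead octet carries the most significant bits.
    out = [lead | (cp >> (6 * extra))]
    # Each continuation octet carries the next six bits, prefixed by 10.
    for shift in range(6 * (extra - 1), -1, -6):
        out.append(0x80 | ((cp >> shift) & 0x3F))
    return bytes(out)
```

| .P |
| For values up to 0010FFFF (surrogates excluded) the result should match |
| what a standard UTF\(hy8 encoder produces. |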
| .SS "ustar Interchange Format" |
| .P |
| The description of the |
| .BR ustar |
| format reflects numerous enhancements over pre-1988 versions of the |
| historical |
| .IR tar |
| utility. The goal of these changes was not only to provide the |
| functional enhancements desired, but also to retain compatibility |
| between new and old versions. This compatibility has been retained. |
| Archives written using the old archive format are compatible with the |
| new format. |
| .P |
| Implementors should be aware that the previous file format did not |
| include a mechanism to archive directory type files. For this reason, |
| the convention of using a filename ending with |
| <slash> |
| was adopted to specify a directory on the archive. |
| .P |
| The total size of the |
| .IR name |
| and |
| .IR prefix |
| fields has been set to meet the minimum requirements for |
| {PATH_MAX}. |
| If a pathname will fit within the |
| .IR name |
| field, it is recommended that the pathname be stored there without the |
| use of the |
| .IR prefix |
| field. Although the name field is known to be too small to contain |
| {PATH_MAX} |
| characters, the value was not changed in this version of the archive |
| file format to retain backwards-compatibility, and instead the prefix |
| was introduced. Also, because of the earlier version of the format, |
| there is no way to remove the restriction on the |
| .IR linkname |
| field being limited in size to just that of the |
| .IR name |
| field. |
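| .P |
| The recommendation above can be sketched as follows. This Python |
| fragment is an illustration only: the function name is hypothetical, |
| and the field widths of 100 octets for |
| .IR name |
| and 155 octets for |
| .IR prefix |
| are assumed from the conventional header layout. The pathname is kept |
| whole in |
| .IR name |
| when it fits; otherwise it is split at a |
| <slash> |
| so that both parts fit. |

```python
NAME_SIZE = 100    # assumed width of the name field, in octets
PREFIX_SIZE = 155  # assumed width of the prefix field, in octets


def split_ustar_path(path):
    """Return (prefix, name) as bytes, preferring an empty prefix."""
    raw = path.encode("utf-8")
    if len(raw) <= NAME_SIZE:
        # Fits in name alone, so the prefix field is left unused.
        return b"", raw
    # Find the rightmost slash that leaves a non-empty prefix of at most
    # PREFIX_SIZE octets and a non-empty name after it.
    end = min(PREFIX_SIZE + 1, len(raw) - 1)
    i = raw.rfind(b"/", 1, end)
    if i == -1 or len(raw) - i - 1 > NAME_SIZE:
        raise ValueError("pathname cannot be split into prefix and name")
    # The separating slash itself is not stored in either field.
    return raw[:i], raw[i + 1:]
```

| .P |
| A pathname for which no such split exists cannot be represented in |
| these two fields, which is why the sketch raises an error in that case. |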
| .P |
| The |
| .IR size |
| field is required to be meaningful in all implementation extensions, |
| although it could be zero. This is required so that the data blocks can |
| always be properly counted. |
| .P |
| If device special files need to be archived that cannot be represented |
| in the standard format, it is suggested that one of the |
| extension types (\c |
| .BR A \(hy\c |
| .BR Z ) |
| be used, and that the additional information for the special file be |
| represented as data and be reflected in the |
| .IR size |
| field. |
| .P |
| Attempting to restore a special file type, where it is converted to |
| ordinary data and conflicts with an existing filename, need not be |
| specially detected by the utility. If run as an ordinary user, |
| .IR pax |
| should not be able to overwrite the entries in, for example, |
| .BR /dev |
| in any case (whether the file is converted to another type or not). If |
| run as a privileged user, it should be able to do so, and it would be |
| considered a bug if it did not. The same is true of ordinary data files |
| and similarly named special files; it is impossible to anticipate the |
| needs of the user (who could really intend to overwrite the file), so |
| the behavior should be predictable (and thus regular) and rely on the |
| protection system as required. |
| .P |
| The value 7 in the |
| .IR typeflag |
| field is intended to define how contiguous files can be stored in a |
| .BR ustar |
| archive. POSIX.1\(hy2008 does not require the contiguous file extension, but does |
| define a standard way of archiving such files so that all conforming |
| systems can interpret these file types in a meaningful and consistent |
| manner. On a system that does not support extended file types, the |
| .IR pax |
| utility should do the best it can with the file and go on to the next. |
| .P |
| The file protection modes are those conventionally used by the |
| .IR ls |
| utility. This is extended beyond the usage in the ISO\ POSIX\(hy2 standard to support the |
| ``shared text'' or ``sticky'' bit. It is intended that the conformance |
| document should not document anything beyond the existence of and |
| support of such a mode. Further extensions are expected to these bits, |
| particularly with overloading the set-user-ID and set-group-ID flags. |
| .SS "cpio Interchange Format" |
| .P |
| The reference to appropriate privileges in the |
| .BR cpio |
| format refers to an error on standard output; the |
| .BR ustar |
| format does not make comparable statements. |
| .P |
| The model for this format was the historical System V |
| .IR cpio \c |
| .BR \-c |
| data interchange format. This model documents the portable version of |
| the |
| .BR cpio |
| format and not the binary version. It has the flexibility to transfer |
| data of any type described within POSIX.1\(hy2008, yet is extensible to transfer |
| data types specific to extensions beyond POSIX.1\(hy2008 (for example, contiguous |
| files). Because it describes existing practice, there is no question of |
| maintaining upwards-compatibility. |
| .SS "cpio Header" |
| .P |
| There has been some concern that the size of the |
| .IR c_ino |
| field of the header is too small to handle those systems that have very |
| large |
| .IR inode |
| numbers. However, the |
| .IR c_ino |
| field in the header is used strictly as a hard-link resolution |
| mechanism for archives. It is not necessarily the same value as the |
| .IR inode |
| number of the file in the location from which that file is extracted. |
| .P |
| The name |
| .IR c_magic |
| is based on historical usage. |
| .SS "cpio Filename" |
| .P |
| For most historical implementations of the |
| .IR cpio |
| utility, |
| {PATH_MAX} |
| octets can be used to describe the pathname without the addition of |
| any other header fields (the NUL character would be included in this |
| count). |
| {PATH_MAX} |
| is the minimum value for pathname size, documented as 256 bytes. |
| However, an implementation may use |
| .IR c_namesize |
| to determine the exact length of the pathname. With the current |
| description of the |
| .IR <cpio.h> |
| header, this pathname size can be as large as a number that is |
| described in six octal digits. |
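| .P |
| As a small illustration of that limit, the following Python sketch |
| formats a |
| .IR c_namesize |
| value as six octal digits and rejects pathnames that do not fit. The |
| function name is hypothetical; the count includes the terminating NUL |
| character, as noted above. |

```python
MAX_NAMESIZE = 0o777777  # largest value that fits in six octal digits


def format_namesize(pathname):
    """Return the six-octal-digit c_namesize field for a pathname."""
    size = len(pathname) + 1  # the count includes the terminating NUL
    if size > MAX_NAMESIZE:
        raise ValueError("pathname too long for a six-digit c_namesize")
    return format(size, "06o").encode("ascii")
```

| .P |
| Six octal digits allow a count of up to 262143 octets. |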
| .P |
| One value is documented under the |
| .IR c_mode |
| field values to provide for extensibility for known file types: |
| .IP "\fB0110\ 000\fP" 10 |
| Reserved for contiguous files. The implementation may treat the rest of |
| the information for this archive like a regular file. If this file type |
| is undefined, the implementation may create the file as a regular |
| file. |
| .P |
| This provides for extensibility of the |
| .BR cpio |
| format while allowing for the ability to read old archives. Files of an |
| unknown type may be read as ``regular files'' on some implementations. |
| On a system that does not support extended file types, the |
| .IR pax |
| utility should do the best it can with the file and go on to the next. |
| .SH "FUTURE DIRECTIONS" |
| None. |
| .SH "SEE ALSO" |
| .IR "Chapter 2" ", " "Shell Command Language", |
| .IR "\fIcp\fR\^", |
| .IR "\fIed\fR\^", |
| .IR "\fIgetopts\fR\^", |
| .IR "\fIls\fR\^", |
| .IR "\fIprintf\fR\^" |
| .P |
| The Base Definitions volume of POSIX.1\(hy2017, |
| .IR "Section 3.169" ", " "File Mode Bits", |
| .IR "Chapter 5" ", " "File Format Notation", |
| .IR "Chapter 8" ", " "Environment Variables", |
| .IR "Section 12.2" ", " "Utility Syntax Guidelines", |
| .IR "\fB<cpio.h>\fP", |
| .IR "\fB<tar.h>\fP" |
| .P |
| The System Interfaces volume of POSIX.1\(hy2017, |
| .IR "\fIchown\fR\^(\|)", |
| .IR "\fIcreat\fR\^(\|)", |
| .IR "\fIfstatat\fR\^(\|)", |
| .IR "\fImkdir\fR\^(\|)", |
| .IR "\fImkfifo\fR\^(\|)", |
| .IR "\fIutime\fR\^(\|)", |
| .IR "\fIwrite\fR\^(\|)" |
| .\" |
| .SH COPYRIGHT |
| Portions of this text are reprinted and reproduced in electronic form |
| from IEEE Std 1003.1-2017, Standard for Information Technology |
| -- Portable Operating System Interface (POSIX), The Open Group Base |
| Specifications Issue 7, 2018 Edition, |
| Copyright (C) 2018 by the Institute of |
| Electrical and Electronics Engineers, Inc and The Open Group. |
| In the event of any discrepancy between this version and the original IEEE and |
| The Open Group Standard, the original IEEE and The Open Group Standard |
| is the referee document. The original Standard can be obtained online at |
| http://www.opengroup.org/unix/online.html . |
| .PP |
| Any typographical or formatting errors that appear |
| in this page are most likely |
| to have been introduced during the conversion of the source files to |
| man page format. To report such errors, see |
| https://www.kernel.org/doc/man-pages/reporting_bugs.html . |