| Conversion of Open Group's troff sources to POSIX man pages |
| =========================================================== |
| |
| 1. Necessary data: |
| ================== |
| |
| - obtainable from The Open Group |
| - directory with the troff sources |
| - file ,xref.5 containing cross-reference information |
| - file _strings.def containing string definitions for references to |
| other standards |
| - obtainable online |
| - the HTML version of the standard |
| |
| |
| The directory of troff sources contains four directories: "Builtins", |
| "Commands", "Functions", "Headers". (Some of these contain |
| subdirectories with "LEGACY" interfaces.) The directories contain .mm |
| and .h files written in groff_mm with extensions by The Open |
| Group. A file defining these custom macros is also available on |
| request, but the scripts do not need it. |
| |
| A relevant line in ,xref.5 could look like |
| |
| gropdf-info:href workdir page 104 Section 3.441 |
| |
| It contains a label ("workdir"), the page number and the |
| section number. |
| |
| A line in _strings.def might look like |
| |
| .ds Z5 ISO\ POSIX\(hy1 standard |
| |
| This tells us how to translate the escape sequence \*(Z5 . |
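As a rough illustration of what "translating" such an escape sequence means, the sketch below expands \*(XX references from a lookup table. It is not taken from posix.py; the function name `expand` and the hand-written table are assumptions made for this example only.

```python
import re

# Hypothetical lookup table; in practice it would be built from _strings.def.
STRINGS = {"Z5": r"ISO\ POSIX\(hy1 standard"}

# Matches a backslash, "*", "(" and a two-character string name.
STRING_REF = re.compile(r"\\\*\((..)")

def expand(text, strings):
    # Replace every known string reference; leave unknown ones untouched.
    return STRING_REF.sub(lambda m: strings.get(m.group(1), m.group(0)), text)

print(expand(r"See the \*(Z5 for details.", STRINGS))
```

The replacement value still contains troff escapes such as "\ " and "\(hy", so any real conversion has to handle those in a later step.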
| |
| The HTML version of the standard can be obtained at |
| |
| http://pubs.opengroup.org/onlinepubs/9699919799/download/index.html |
| |
| The relevant files for the scripts are basedefs/V1_chap*.html, |
| functions/V2_chap*.html, utilities/V3_chap*.html and |
| xrat/V4_*_chap*.html. These are parts of the standard we do not |
| have the sources for. |
| |
| 2. Procedure to generate the man pages |
| ====================================== |
| |
| Change your directory to the directory containing the conversion |
| scripts. Type |
| |
| ./,xref.1.awk < ,xref.5 > ,xref.1 |
| ./,xref.py /path/to/HTML_version_of_standard > ,xref |
| |
| to generate ,xref and |
| |
| sed -f _strings.sed _strings.def > _strings |
| |
| to generate _strings. With this done you can start generating |
| individual man pages. To generate all pages use: |
| |
| ./posix.py 0p /path/to/troff_sources/Headers/*.h |
| ./posix.py 1p /path/to/troff_sources/Built-Ins/*.mm |
| ./posix.py 1p /path/to/troff_sources/Commands/*.mm |
| ./posix.py 3p /path/to/troff_sources/Functions/*.mm |
| |
| You can now find the converted pages in your current working |
| directory. |
| |
| 3. Description of the included scripts |
| ====================================== |
| |
| ,xref.1.awk takes ,xref.5 from its standard input, strips |
| irrelevant lines and transforms lines of the form |
| |
| gropdf-info:href whitespace page 103 Section 3.436 |
| |
| to |
| |
| whitespace Section 3.436 |
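The same transformation can be sketched in Python to make the rule explicit. This is not the awk script itself: the function name `reduce_xref_line`, the exact pattern, and the single-space output separator are assumptions based only on the example line shown here.

```python
import re

# Assumed line shape: gropdf-info:href <label> page <number> <target...>
HREF = re.compile(r"^gropdf-info:href\s+(\S+)\s+page\s+\d+\s+(.+)$")

def reduce_xref_line(line):
    # Return "<label> <target>" for a matching line, None for irrelevant lines.
    m = HREF.match(line.rstrip("\n"))
    if m is None:
        return None
    label, target = m.groups()
    return f"{label} {target}"

print(reduce_xref_line("gropdf-info:href whitespace page 103 Section 3.436"))
```

Lines that do not match the pattern yield None, which corresponds to the "strips irrelevant lines" step.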
| |
| ,xref.1.py expects ,xref.1 generated by ,xref.1.awk in the |
| current working directory and the path to the HTML version of |
| the standard as its first argument. It extracts section, table |
| and figure names for parts of the standard we do not have sources |
| for, adds them to the xrefs and writes them to standard output. |
| For example, inside |
| |
| /path/to/HTML_version_of_standard/basedefs/V1_chap03.html |
| |
| it finds a line |
| |
| class; see also <a href="#tag_03_436">White Space</a>.</p> |
| |
| and therefore outputs |
| |
| whitespace Section 3.436, White Space |
| |
| to ,xref. |
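The lookup described above might be sketched as follows. The mapping from a section number to an anchor id ("3.436" to "tag_03_436", with two-digit zero padding) is inferred from this single example, and the helper names `anchor_id`, `find_title` and `extend_xref` are hypothetical; the actual script may work differently.

```python
import re

def anchor_id(section):
    # "3.436" -> "tag_03_436"; the zero padding is an assumption.
    return "tag_" + "_".join(part.zfill(2) for part in section.split("."))

def find_title(html, section):
    # Look for a link to the section's anchor and return its link text.
    pattern = re.compile(r'<a href="#%s">([^<]+)</a>' % re.escape(anchor_id(section)))
    m = pattern.search(html)
    return m.group(1) if m else None

def extend_xref(entry, html):
    # entry is assumed to have exactly three fields: "<label> <kind> <number>".
    label, kind, number = entry.split()
    title = find_title(html, number)
    return entry if title is None else f"{entry}, {title}"

html = 'class; see also <a href="#tag_03_436">White Space</a>.</p>'
print(extend_xref("whitespace Section 3.436", html))
```

A regular expression is enough for this one-line example; scanning complete HTML files would be more robust with a real HTML parser.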
| |
| The sed script _strings.sed does a simple conversion of lines of |
| the form |
| |
| .ds Z5 ISO\ POSIX\(hy1 standard |
| |
| to |
| |
| \*(Z5 ISO\ POSIX\(hy1 standard |
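For readers who do not use sed, the conversion can be sketched in Python. This is an illustration rather than a port of _strings.sed: the helper name `convert_ds` and the pattern are assumptions, and the sketch does not check that the string name has exactly two characters, as the \*( form requires.

```python
import re

# A string definition: ".ds", a name, then the value up to the end of the line.
DS_LINE = re.compile(r"^\.ds\s+(\S+)\s+(.*)$")

def convert_ds(line):
    # Hypothetical helper: ".ds NAME VALUE" becomes "\*(NAME VALUE".
    m = DS_LINE.match(line)
    if m is None:
        return None
    name, value = m.groups()
    return "\\*(" + name + " " + value

print(convert_ds(r".ds Z5 ISO\ POSIX\(hy1 standard"))
```

In the converted form each line starts with the escape sequence exactly as it appears in the troff sources, followed by its replacement text.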
| |
| The main script is posix.py. It takes the name of the man section |
| as its first argument and the names of the pages to be converted |
| as its other arguments. Furthermore, it expects the data files |
| ,xref and _strings in its current working directory. It outputs |
| converted man pages to its current working directory. |
| |
| Notes: |
| |
| posix.py does the final processing of the xrefs: it adds the |
| section names for cross-references internal to the current page |
| and formats the references to other man pages. The order of the |
| entries in ,xref is used to deduce the right section number. This |
| could also be achieved by carefully examining the source directory. |
| |
| The code in posix.py that gets the indentation right by inserting |
| ".RS ..." and ".RE" in the right places is very hacky and might |
| fail on pages with a slightly more complex structure than the |
| current ones. |