| % howto/howto.tex |
| |
| \QuickQuizChapter{chp:How To Use This Book}{How To Use This Book} |
| |
| The purpose of this book is to help you program |
| shared-memory parallel machines without risking your sanity.\footnote{ |
| Or, perhaps more accurately, without much greater risk to your |
| sanity than that incurred by non-parallel programming. |
| Which, come to think of it, might not be saying all that much.} |
| We hope that this book's design principles will help you avoid at least some |
| parallel-programming pitfalls. |
| That said, you should think of this book as a foundation on which to build, |
| rather than as a completed cathedral. |
| Your mission, if you choose to accept, is to help make further progress |
| in the exciting field of parallel programming---progress that will |
| in time render this book obsolete. |
| Parallel programming is not as hard as some say, and we hope |
| that this book makes your parallel-programming projects easier and |
| more fun. |
| |
| In short, where parallel programming once focused on science, research, |
| and grand-challenge projects, it is quickly becoming an engineering |
| discipline. |
| We therefore examine specific parallel-programming tasks |
| and describe how to approach them. |
| In some surprisingly common cases, they can even be automated. |
| |
| This book is written in the hope that presenting the engineering |
| discipline underlying successful |
| parallel-programming projects will free a new generation of parallel hackers |
| from the need to slowly and painstakingly reinvent old wheels, enabling |
| them to instead focus their energy and creativity on new frontiers. |
| We sincerely hope that parallel programming brings you at least as |
| much fun, excitement, and challenge that it has brought to us! |
| |
| \section{Roadmap} |
| \label{sec:howto:Roadmap} |
| |
| This book is a handbook of widely applicable and heavily |
| used design techniques, rather than |
| a collection of optimal algorithms with tiny areas of applicability. |
| You are currently reading Chapter~\ref{chp:How To Use This Book}, but |
| you knew that already. |
| Chapter~\ref{chp:Introduction} gives a high-level overview of parallel |
| programming. |
| |
| Chapter~\ref{chp:Hardware and its Habits} introduces shared-memory |
| parallel hardware. |
| After all, it is difficult to write good parallel code unless you |
| understand the underlying hardware. |
| Because hardware constantly evolves, this chapter will always be |
| out of date. |
| We will nevertheless do our best to keep up. |
| Chapter~\ref{chp:Tools of the Trade} then provides a very brief overview |
| of common shared-memory parallel-programming primitives. |
| |
| Chapter~\ref{chp:Counting} takes an in-depth look at parallelizing |
| one of the simplest problems imaginable, namely counting. |
| Because almost everyone has an excellent grasp of counting, this chapter |
| is able to delve into many important parallel-programming issues without |
| the distractions of more-typical computer-science problems. |
| My impression is that this chapter has seen the greatest use in |
| parallel-programming coursework. |
| |
| Chapter~\ref{cha:Partitioning and Synchronization Design} |
| introduces a number of design-level methods of addressing the issues |
| identified in Chapter~\ref{chp:Counting}. |
| It turns out that it is important to address parallelism at |
| the design level when feasible: |
| To paraphrase Dijkstra~\cite{Dijkstra:1968:LEG:362929.362947}, |
| ``retrofitted parallelism considered grossly |
| suboptimal''~\cite{PaulEMcKenney2012HOTPARsuboptimal}. |
| |
| The next three chapters examine three important approaches to |
| synchronization. |
| Chapter~\ref{chp:Locking} covers locking, which in 2014 is not only the |
| workhorse of production-quality parallel programming, but is also widely |
| considered to be parallel programming's worst villain. |
| Chapter~\ref{chp:Data Ownership} gives a brief overview of data ownership, |
| an often overlooked but remarkably pervasive and powerful approach. |
| Finally, Chapter~\ref{chp:Deferred Processing} introduces a number of |
| deferred-processing mechanisms, including reference counting, |
| hazard pointers, sequence locking, and RCU. |
| |
| Chapter~\ref{chp:Data Structures} applies the lessons of previous |
| chapters to hash tables, which are heavily used due |
| to their excellent partitionability, which (usually) leads to excellent |
| performance and scalability. |
| |
| As many have learned to their sorrow, parallel programming without |
| validation is a sure path to abject failure. |
| Chapter~\ref{chp:Validation} covers various forms of testing. |
| It is of course impossible to test reliability into your program |
| after the fact, so Chapter~\ref{chp:Formal Verification} |
| follows up with a brief overview of a couple of practical approaches to |
| formal verification. |
| |
| Chapter~\ref{chp:Putting It All Together} |
| contains a series of moderate-sized parallel programming problems. |
| The difficulty of these problems vary, but should be appropriate for |
| someone who has mastered the material in the previous chapters. |
| |
| Chapter~\ref{sec:advsync:Advanced Synchronization} |
| looks at advanced synchronization methods, including memory barriers |
| and non-blocking synchronization, while |
| Chapter~\ref{chp:Parallel Real-Time Computing} |
| looks at the nascent field of parallel real-time computing. |
| Chapter~\ref{chp:Ease of Use} follows up with some ease-of-use advice. |
| Finally, Chapter~\ref{chp:Conflicting Visions of the Future} |
| looks at a few possible future directions, including |
| shared-memory parallel system design, software and hardware transactional |
| memory, and functional programming for parallelism. |
| |
| This chapter is followed by a number of appendices. |
| The most popular of these appears to be |
| Appendix~\ref{chp:app:whymb:Why Memory Barriers?}, |
| which covers memory barriers. |
| Appendix~\ref{chp:Answers to Quick Quizzes} |
| contains the answers to the infamous Quick Quizzes, which are discussed in |
| the next section. |
| |
| \section{Quick Quizzes} |
| \label{sec:howto:Quick Quizzes} |
| |
| ``Quick quizzes'' appear throughout this book, and the answers may |
| be found in |
| Appendix~\ref{chp:Answers to Quick Quizzes} starting on |
| page~\pageref{chp:Answers to Quick Quizzes}. |
| Some of them are based on material in which that quick quiz |
| appears, but others require you to think beyond that section, and, |
| in some cases, beyond the realm of current knowledge. |
| As with most endeavors, what you get out of this book is largely |
| determined by what you are willing to put into it. |
| Therefore, readers who make a genuine effort to solve a quiz before |
| looking at the answer |
| find their effort repaid handsomely with increased understanding |
| of parallel programming. |
| |
| \QuickQuiz{} |
| Where are the answers to the Quick Quizzes found? |
| \QuickQuizAnswer{ |
| In Appendix~\ref{chp:Answers to Quick Quizzes} starting on |
| page~\pageref{chp:Answers to Quick Quizzes}. |
| |
| Hey, I thought I owed you an easy one! |
| } \QuickQuizEnd |
| |
| \QuickQuiz{} |
| Some of the Quick Quiz questions seem to be from the viewpoint |
| of the reader rather than the author. |
| Is that really the intent? |
| \QuickQuizAnswer{ |
| Indeed it is! |
| Many are questions that Paul E. McKenney would probably have |
| asked if he was a novice student in a class covering this material. |
| It is worth noting that Paul was taught most of this material by |
| parallel hardware and software, not by professors. |
| In Paul's experience, professors are much more likely to provide |
| answers to verbal questions than are parallel systems, |
| Watson notwithstanding. |
| Of course, we could have a lengthy debate over which of professors |
| or parallel systems provide the most useful answers to these sorts |
| of questions, |
| but for the time being let's just agree that usefulness of |
| answers varies widely across the population both of professors |
| and of parallel systems. |
| |
| Other quizzes are quite similar to actual questions that have been |
| asked during conference presentations and lectures covering the |
| material in this book. |
| A few others are from the viewpoint of the author. |
| } \QuickQuizEnd |
| |
| \QuickQuiz{} |
| These Quick Quizzes are just not my cup of tea. |
| What can I do about it? |
| \QuickQuizAnswer{ |
| Here are a few possible strategies: |
| |
| \begin{enumerate} |
| \item Just ignore the Quick Quizzes and read the rest of |
| the book. |
| You might miss out on the interesting material in |
| some of the Quick Quizzes, but the rest of the book |
| has lots of good material as well. |
| This is an eminently reasonable approach if your main |
| goal is to gain a general understanding of the material |
| or if you are skimming through to book to find a |
| solution to a specific problem. |
| \item If you find the Quick Quizzes distracting but impossible |
| to ignore, you can always clone the \LaTeX{} source for |
| this book from the git archive. |
| You can then modify \path{Makefile} and \path{qqz.sty} to eliminate |
| the Quick Quizzes from the PDF output. |
| Alternatively, you could modify these two files so as |
| to pull the answers inline, immediately following |
| the questions. |
| \item Look at the answer immediately rather than investing |
| a large amount of time in coming up with your own |
| answer. |
| This approach is reasonable when a given Quick Quiz's |
| answer holds the key to a specific problem you are |
| trying to solve. |
| This approach is also reasonable if you want a somewhat |
| deeper understanding of the material, but when you do not |
| expect to be called upon to generate parallel solutions given |
| only a blank sheet of paper. |
| \end{enumerate} |
| |
| Note that as of mid-2016 the quick quizzes are hyperlinked |
| to the answers and vice versa. |
| Click either the ``Quick Quiz'' heading or the small black square |
| to move to the beginning of the answer. |
| From the answer, click on the heading or the small black square to |
| move to the beginning of the quiz, or, alternatively, click on the |
| small white square at the end of the answer to move to the end of the |
| corresponding quiz. |
| } \QuickQuizEnd |
| |
| In short, if you need a deep |
| understanding of the material, then you should invest some time |
| into answering the Quick Quizzes. |
| Don't get me wrong, passively reading the material can be quite |
| valuable, but gaining full problem-solving capability really |
| does require that you practice solving problems. |
| |
| I learned this the hard way during coursework for my late-in-life |
| Ph.D. |
| I was studying a familiar topic, and was surprised at how few of |
| the chapter's exercises I could answer off the top of my head.\footnote{ |
| So I suppose that it was just as well that my professors refused |
| to let me waive that class!} |
| Forcing myself to answer the questions greatly increased my |
| retention of the material. |
| So with these Quick Quizzes I am not asking you to do anything |
| that I have not been doing myself! |
| |
| Finally, the most common learning disability is thinking that |
| you already know. |
| The quick quizzes can be an extremely effective cure. |
| |
| \section{Alternatives to This Book} |
| \label{sec:Alternatives to This Book} |
| |
| As Knuth learned, if you want your book to be finite, it must be focused. |
| This book focuses on shared-memory parallel programming, with an |
| emphasis on software that lives near the bottom of the software stack, |
| such as operating-system kernels, parallel data-management systems, |
| low-level libraries, and the like. |
| The programming language used by this book is C. |
| |
| If you are interested in other aspects of parallelism, you might well |
| be better served by some other book. |
| Fortunately, there are many alternatives available to you: |
| |
| \begin{enumerate} |
| \item If you prefer a more academic and rigorous treatment of |
| parallel programming, |
| you might like Herlihy's and Shavit's |
| textbook~\cite{HerlihyShavit2008Textbook}. |
| This book starts with an interesting combination |
| of low-level primitives at high levels of abstraction |
| from the hardware, and works its way through locking |
| and simple data structures including lists, queues, |
| hash tables, and counters, culminating with transactional |
| memory. |
| Michael Scott's textbook~\cite{MichaelScott2013Textbook} |
| approaches similar material with more of a |
| software-engineering focus, and, as far as I know, is |
| the first formally published academic textbook to include a |
| section devoted to RCU. |
| \item If you would like an academic treatment of parallel |
| programming from a programming\-/language\-/pragmatics viewpoint, |
| you might be interested in the concurrency chapter from Scott's |
| textbook~\cite{MichaelScott2006Textbook} |
| on programming-language pragmatics. |
| \item If you are interested in an object-oriented patternist |
| treatment of parallel programming focussing on C++, |
| you might try Volumes~2 and~4 of Schmidt's POSA |
| series~\cite{SchmidtStalRohnertBuschmann2000v2Textbook, |
| BuschmannHenneySchmidt2007v4Textbook}. |
| Volume~4 in particular has some interesting chapters |
| applying this work to a warehouse application. |
| The realism of this example is attested to by |
| the section entitled ``Partitioning the Big Ball of Mud'', |
| wherein the problems inherent in parallelism often |
| take a back seat to the problems inherent in getting |
| one's head around a real-world application. |
| \item If you want to work with Linux-kernel device drivers, |
| then Corbet's, Rubini's, and Kroah-Hartman's |
| ``Linux Device Drivers''~\cite{CorbetRubiniKroahHartman} |
| is indispensable, as is the Linux Weekly News web site |
| (\url{http://lwn.net/}). |
| There is a large number of books and resources on |
| the more general topic of Linux kernel internals. |
| \item If your primary focus is scientific and technical computing, |
| and you prefer a patternist approach, |
| you might try Mattson et al.'s |
| textbook~\cite{Mattson2005Textbook}. |
| It covers Java, C/C++, OpenMP, and MPI. |
| Its patterns are admirably focused first on design, |
| then on implementation. |
| \item If your primary focus is scientific and technical computing, |
| and you are interested in GPUs, CUDA, and MPI, you |
| might check out Norm Matloff's ``Programming on |
| Parallel Machines''~\cite{NormMatloff2013ParProcBook}. |
| Of course, the GPU vendors have quite a bit of additional |
| information~\cite{AMD2017OpenCL,NVidia2017GPGPU,NVidia2017GPGPU-university}. |
| \item If you are interested in POSIX Threads, you might take |
| a look at David R.~Butenhof's book~\cite{Butenhof1997pthreads}. |
| In addition, |
| W.~Richard Stevens's book~\cite{WRichardStevens1992} |
| covers UNIX and POSIX, and Stewart Weiss's lecture |
| notes~\cite{StewartWeiss2013UNIX} provide an |
| thorough and accessible introduction with a good set of |
| examples. |
| \item If you are interested in C++11, you might like |
| Anthony Williams's ``C++ Concurrency in Action: Practical |
| Multithreading''~\cite{AnthonyWilliams2012}. |
| \item If you are interested in C++, but in a Windows environment, |
| you might try Herb Sutter's ``Effective Concurrency'' |
| series in |
| Dr. Dobbs Journal~\cite{HerbSutter2008EffectiveConcurrency}. |
| This series does a reasonable job of presenting a |
| commonsense approach to parallelism. |
| \item If you want to try out Intel Threading Building Blocks, |
| then perhaps James Reinders's book~\cite{Reinders2007Textbook} |
| is what you are looking for. |
| \item Those interested in learning how various types of multi-processor |
| hardware |
| cache organizations affect the implementation of kernel |
| internals should take a look at Curt Schimmel's classic |
| treatment of this subject~\cite{Schimmel:1994:USM:175689}. |
| \item Finally, those using Java might be well-served by Doug Lea's |
| textbooks~\cite{DougLea1997Textbook,Goetz2007Textbook}. |
| \end{enumerate} |
| |
| However, if you are interested in principles of parallel design |
| for low-level software, especially software written in C, read on! |
| |
| \section{Sample Source Code} |
| \label{sec:howto:Sample Source Code} |
| |
| This book discusses its fair share of source code, and in many cases |
| this source code may be found in the \path{CodeSamples} directory |
| of this book's git tree. |
| For example, on UNIX systems, you should be able to type the following: |
| |
| \begin{quote} |
| {\scriptsize |
| \begin{verbatim} |
| find CodeSamples -name rcu_rcpls.c -print |
| \end{verbatim} |
| } |
| \end{quote} |
| |
| This command will locate the file \path{rcu_rcpls.c}, which is called out in |
| Appendix~\ref{chp:app:``Toy'' RCU Implementations}. |
| Other types of systems have well-known ways of locating files by filename. |
| |
| \section{Whose Book Is This?} |
| \label{sec:howto:Whose Book Is This?} |
| |
| \begin{figure*}[tbp] |
| { |
| \scriptsize |
| \begin{verbbox} |
| 1 git clone git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/perfbook.git |
| 2 cd perfbook |
| 3 # You may need to install a font here. See item 1 in FAQ.txt. |
| 4 make |
| 5 evince perfbook.pdf & # Two-column version |
| 6 make perfbook-1c.pdf |
| 7 evince perfbook-1c.pdf & # One-column version for e-readers |
| \end{verbbox} |
| } |
| \hspace*{1in}\OneColumnHSpace{-0.5in}\theverbbox |
| \caption{Creating an Up-To-Date PDF} |
| \label{fig:howto:Creating a Up-To-Date PDF} |
| \end{figure*} |
| |
| \begin{figure*}[tbp] |
| { |
| \scriptsize |
| \begin{verbbox} |
| 1 git remote update |
| 2 git checkout origin/master |
| 3 make |
| 4 evince perfbook.pdf & # Two-column version |
| 5 make perfbook-1c.pdf |
| 6 evince perfbook-1c.pdf & # One-column version for e-readers |
| \end{verbbox} |
| } |
| \hspace*{1in}\OneColumnHSpace{-0.5in}\theverbbox |
| \caption{Generating an Updated PDF} |
| \label{fig:howto:Generating an Updated PDF} |
| \end{figure*} |
| |
| As the cover says, the editor is one Paul E.~McKenney. |
| However, the editor does accept contributions via the |
| \href{mailto:perfbook@vger.kernel.org} |
| {\nolinkurl{perfbook@vger.kernel.org}} email list. |
| These contributions can be in pretty much any form, with popular |
| approaches including text emails, |
| patches against the book's \LaTeX{} source, and even \co{git pull} requests. |
| Use whatever form works best for you. |
| |
| To create patches or \co{git pull} requests, you will need the |
| \LaTeX{} source to the book, which is at |
| \url{git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/perfbook.git}. |
| You will of course also need \co{git} and \LaTeX{}, which are |
| available as part of most mainstream Linux distributions. |
| Other packages may be required, depending on the distribution you use. |
| The required list of packages for a few popular distributions is listed |
| in the file \path{FAQ-BUILD.txt} in the \LaTeX{} source to the book. |
| |
| To create and display a current \LaTeX{} source tree of this book, |
| use the list of Linux commands shown in |
| Figure~\ref{fig:howto:Creating a Up-To-Date PDF}. |
| In some environments, the \co{evince} command that displays \path{perfbook.pdf} |
| may need to be replaced, for example, with \co{acroread}. |
| The \co{git clone} command need only be used the first time you |
| create a PDF, subsequently, you can run the commands shown in |
| Figure~\ref{fig:howto:Generating an Updated PDF} to pull in any updates |
| and generate an updated PDF. |
| The commands in |
| Figure~\ref{fig:howto:Generating an Updated PDF} |
| must be run within the \path{perfbook} directory created by the commands |
| shown in |
| Figure~\ref{fig:howto:Creating a Up-To-Date PDF}. |
| |
| PDFs of this book are sporadically posted at |
| \url{http://kernel.org/pub/linux/kernel/people/paulmck/perfbook/perfbook.html} |
| and at |
| \url{http://www.rdrop.com/users/paulmck/perfbook/}. |
| |
| The actual process of contributing patches and sending \co{git pull} |
| requests is similar to that of the Linux kernel, which is documented |
| in the \path{Documentation/SubmittingPatches} file in the Linux source tree. |
| One important requirement is that each patch (or commit, in the case |
| of a \co{git pull} request) must contain a valid \co{Signed-off-by:} line, |
| which has the following format: |
| |
| \begin{quote} |
| { \scriptsize |
| \co{Signed-off-by: My Name <myname@example.org>} |
| } |
| \end{quote} |
| |
| Please see \url{http://lkml.org/lkml/2007/1/15/219} for an example |
| patch containing a \co{Signed-off-by:} line. |
| |
| It is important to note that the \co{Signed-off-by:} line has |
| a very specific meaning, namely that you are certifying that: |
| |
| \begin{enumerate}[label={(\alph*)}] |
| \item The contribution was created in whole or in part |
| by me and I have the right to submit it under |
| the open source license indicated in the file; or |
| \item The contribution is based upon previous work |
| that, to the best of my knowledge, is covered |
| under an appropriate open source License and I |
| have the right under that license to submit that |
| work with modifications, whether created in whole |
| or in part by me, under the same open source |
| license (unless I am permitted to submit under |
| a different license), as indicated in the file; or |
| \item The contribution was provided directly to me by |
| some other person who certified (a), (b) or (c) |
| and I have not modified it. |
| \item I understand and agree that this project and the |
| contribution are public and that a record of the |
| contribution (including all personal information |
| I submit with it, including my sign-off) is |
| maintained indefinitely and may be redistributed |
| consistent with this project or the open source |
| license(s) involved. |
| \end{enumerate} |
| |
| This is quite similar to the Developer's Certificate of Origin (DCO) |
| 1.1 used by the Linux kernel. |
| You must use your real name: I unfortunately cannot accept pseudonymous or |
| anonymous contributions. |
| |
| The language of this book is American English, however, the open-source |
| nature of this book permits translations, and I personally encourage them. |
| The open-source licenses covering this book additionally allow you |
| to sell your translation, if you wish. |
| I do request that you send me a copy of the translation (hardcopy if |
| available), but this is a request made as a professional courtesy, |
| and is not in any way a prerequisite to the permission that you already |
| have under the Creative Commons and GPL licenses. |
| Please see the \co{FAQ.txt} file in the source tree for a list of |
| translations currently in progress. |
| I consider a translation effort to be ``in progress'' once at least one |
| chapter has been fully translated. |
| |
| As noted at the beginning of this section, I am this book's editor. |
| However, if you choose to contribute, it will be your book as well. |
| With that, I offer you Chapter~\ref{chp:Introduction}, our introduction. |