| % cpu/cpu.tex |
| % mainfile: ../perfbook.tex |
| % SPDX-License-Identifier: CC-BY-SA-3.0 |
| |
| \QuickQuizChapter{chp:Hardware and its Habits}{Hardware and its Habits}{qqzcpu} |
| % |
| \Epigraph{Premature abstraction is the root of all evil.} |
| {A cast of thousands} |
| |
| Most people intuitively understand that passing messages between systems |
| is more expensive than performing simple calculations within the confines |
| of a single system. |
| But communicating among threads within the confines of a single |
| shared-memory system can also be quite expensive. |
| This chapter therefore looks at the cost of synchronization and communication |
| within a shared-memory system. |
| These few pages can do no more than scratch the surface of shared-memory |
| parallel hardware design; readers desiring more detail would do well |
| to start with a recent edition of \pplsur{John L.}{Hennessy}'s and |
| \pplsur{David A.}{Patterson}'s classic |
| text~\cite{Hennessy2017}. |
| |
| \QuickQuiz{ |
| Why should parallel programmers bother learning low-level |
| properties of the hardware? |
| Wouldn't it be easier, better, more elegant, and more productive |
| to remain at a higher level of abstraction? |
| }\QuickQuizAnswer{ |
| It is often easier to ignore the detailed properties of the |
| hardware, and higher levels of abstraction can greatly improve |
| productivity. |
| But those of us working with low-level concurrent software would |
| be quite foolish to ignore those properties. |
| |
| After all, if you accept that the only purpose of parallelism |
| is to increase performance, and if you further accept that |
| performance depends on detailed properties of the hardware, |
| then it logically follows that parallel programmers are going |
| to need to know at least a few hardware properties. |
| |
| This is also the case in most engineering disciplines. |
| Would \emph{you} want to use a bridge designed by an |
| engineer who did not understand the properties of |
| the concrete and steel making up that bridge? |
| If not, why would you expect a parallel programmer to be |
| able to develop competent parallel software without at least |
| \emph{some} understanding of the underlying hardware? |
| |
| In short, you might not care about the laws of physics, but the |
| laws of physics care deeply about your code! |
| }\QuickQuizEnd |
| |
| \input{cpu/overview} |
| \input{cpu/overheads} |
| \input{cpu/hwfreelunch} |
| \input{cpu/swdesign} |
| |
| So, to sum up: |
| |
| \begin{enumerate} |
| \item The good news is that multicore systems are inexpensive and |
| readily available. |
| \item More good news: |
| The overhead of many synchronization operations is much lower |
| than it was on parallel systems from the early 2000s. |
| \item The bad news is that the overhead of cache misses is still high, |
| especially on large systems. |
| \end{enumerate} |
| |
| The remainder of this book describes ways of handling this bad news. |
| |
| In particular, |
| \cref{chp:Tools of the Trade} will cover some of the low-level |
| tools used for parallel programming, |
| \cref{chp:Counting} will investigate problems with, and solutions to, |
| parallel counting, and |
| \cref{chp:Partitioning and Synchronization Design} |
| will discuss design disciplines that promote performance and scalability. |
| |
| \QuickQuizAnswersChp{qqzcpu} |