% cpu/cpu.tex
% SPDX-License-Identifier: CC-BY-SA-3.0
\QuickQuizChapter{chp:Hardware and its Habits}{Hardware and its Habits}
%
\Epigraph{Premature abstraction is the root of all evil.}
{\emph{A cast of thousands}}
Most people have an intuitive understanding that passing messages between
systems is considerably more expensive than performing simple calculations
within the confines of a single system.
However, it is not always so clear that communicating among threads within
the confines of a single shared-memory system can also be quite expensive.
This chapter therefore looks at the cost of synchronization and communication
within a shared-memory system.
These few pages can do no more than scratch the surface of shared-memory
parallel hardware design; readers desiring more detail would do well
to start with a recent edition of Hennessy and Patterson's classic
text~\cite{Hennessy2011,Hennessy95a}.
\QuickQuiz{}
Why should parallel programmers bother learning low-level
properties of the hardware?
Wouldn't it be easier, better, and more general to remain at
a higher level of abstraction?
\QuickQuizAnswer{
It might well be easier to ignore the detailed properties of
the hardware, but in most cases it would be quite foolish
to do so.
If you accept that the only purpose of parallelism is to
increase performance, and if you further accept that
performance depends on detailed properties of the hardware,
then it logically follows that parallel programmers are going
to need to know at least a few hardware properties.
This is the case in most engineering disciplines.
Would \emph{you} want to use a bridge designed by an
engineer who did not understand the properties of
the concrete and steel making up that bridge?
If not, why would you expect a parallel programmer to be
able to develop competent parallel software without at least
\emph{some} understanding of the underlying hardware?
} \QuickQuizEnd
\input{cpu/overview}
\input{cpu/overheads}
\input{cpu/hwfreelunch}
\input{cpu/swdesign}
So, to sum up:
\begin{enumerate}
\item The good news is that multicore systems are inexpensive and
readily available.
\item More good news: The overhead of many synchronization operations
is much lower than it was on parallel systems from the early 2000s.
\item The bad news is that the overhead of cache misses is still high,
especially on large systems.
\end{enumerate}
The remainder of this book describes ways of handling this bad news.
In particular,
Chapter~\ref{chp:Tools of the Trade} will cover some of the low-level
tools used for parallel programming,
Chapter~\ref{chp:Counting} will investigate problems and solutions to
parallel counting, and
Chapter~\ref{cha:Partitioning and Synchronization Design}
will discuss design disciplines that promote performance and scalability.