locking/locking-existence.tex - pub/scm/linux/kernel/git/paulmck/perfbook - Git at Google

 % locking/locking-existence.tex
 % mainfile: ../perfbook.tex
 % SPDX-License-Identifier: CC-BY-SA-3.0

 \section{Lock-Based Existence Guarantees}
 \label{sec:locking:Lock-Based Existence Guarantees}
 %
 \epigraph{Existence precedes and rules essence.}{Jean-Paul Sartre}

 A key challenge in parallel programming is to provide
 \emph{\IXpl{existence guarantee}}~\cite{Gamsa99},
 so that attempts to access a given object can rely on that object
 being in existence throughout a given access attempt.

 \begin{listing}
 \begin{fcvlabel}[ln:locking:Per-Element Locking Without Existence Guarantees]
 \begin{VerbatimL}[commandchars=\\\@\$]
 int delete(int key)
 {
 	int b;
 	struct element *p;

 	b = hashfunction(key);
 	p = hashtable[b];
 	if (p == NULL || p->key != key)		\lnlbl@chk_first$
 		return 0;
 	spin_lock(&p->lock);			\lnlbl@acq$
 	hashtable[b] = NULL;			\lnlbl@NULL$
 	spin_unlock(&p->lock);
 	kfree(p);
 	return 1;				\lnlbl@return1$
 }
 \end{VerbatimL}
 \end{fcvlabel}
 \caption{Per-Element Locking Without Existence Guarantees (Buggy!)}
 \label{lst:locking:Per-Element Locking Without Existence Guarantees (Buggy!)}
 \end{listing}

 In some cases, existence guarantees are implicit:

 \begin{enumerate}
 \item	Global variables and static local variables in the base module
 	will exist as long as the application is running.
 \item	Global variables and static local variables in a loaded module
 	will exist as long as that module remains loaded.
 \item	A module will remain loaded as long as at least one of its functions
 	has an active instance.
 \item	A given function instance's on-stack variables will exist until
 	that instance returns.
 \item	If you are executing within a given function or have been
 	called (directly or indirectly) from that function,
 	then the given function has an active instance.
 \end{enumerate}

 These implicit existence guarantees are straightforward, though
 bugs involving implicit existence guarantees really can happen.

 \QuickQuiz{
 	How can relying on implicit existence guarantees result in
 	a bug?
 }\QuickQuizAnswer{
 	Here are some bugs resulting from improper use of implicit
 	existence guarantees:
 	\begin{enumerate}
 	\item	A program writes the address of a global variable to
 		a file, then a later instance of that same program
 		reads that address and attempts to dereference it.
 		This can fail due to address-space randomization,
 		to say nothing of recompilation of the program.
 	\item	A module can record the address of one of its variables
 		in a pointer located in some other module, then attempt
 		to dereference that pointer after the module has
 		been unloaded.
 	\item	A function can record the address of one of its on-stack
 		variables into a global pointer, which some other
 		function might attempt to dereference after that function
 		has returned.
 	\end{enumerate}
 	I am sure that you can come up with additional possibilities.
 }\QuickQuizEnd

 But the more interesting---and troublesome---guarantee involves
 heap memory:
 A dynamically allocated data structure will exist until it is freed.
 The problem to be solved is to synchronize the freeing of the structure
 with concurrent accesses to that same structure.
 One way to do this is with \emph{explicit guarantees}, such as locking.
 If a given structure may only be freed while holding a given lock, then holding
 that lock guarantees that structure's existence.

 But this guarantee depends on the existence of the lock itself.
 One straightforward way to guarantee the lock's existence is to
 place the lock in a global variable, but global locking has the disadvantage
 of limiting scalability.
 One way of providing scalability that improves as the size of the
 data structure increases is to place a lock in each element of the
 structure.
 Unfortunately, putting the lock that is to protect a data element
 in the data element itself is subject to subtle \IXpl{race condition},
 as shown in
 \cref{lst:locking:Per-Element Locking Without Existence Guarantees (Buggy!)}.

 \QuickQuiz{
 	\begin{fcvref}[ln:locking:Per-Element Locking Without Existence Guarantees]
 	What if the element we need to delete is not the first element
 	of the list on \clnref{chk_first} of
 	\cref{lst:locking:Per-Element Locking Without Existence Guarantees (Buggy!)}?
 	\end{fcvref}
 }\QuickQuizAnswer{
 	This is a very simple hash table with no chaining, so the only
 	element in a given bucket is the first element.
 	The reader is invited to adapt this example to a hash table with
 	full chaining.
 }\QuickQuizEnd


 \begin{fcvref}[ln:locking:Per-Element Locking Without Existence Guarantees]
 To see one of these race conditions, consider the following sequence
 of events:
 \begin{enumerate}
 \item	Thread~0 invokes \co{delete(0)}, and reaches \clnref{acq} of
 	the listing, acquiring the lock.
 \item	Thread~1 concurrently invokes \co{delete(0)}, reaching
 	\clnref{acq}, but spins on the lock because Thread~0 holds it.
 \item	Thread~0 executes \clnrefrange{NULL}{return1}, removing the element from
 	the hashtable, releasing the lock, and then freeing the
 	element.
 \item	Thread~0 continues execution, and allocates memory, getting
 	the exact block of memory that it just freed.
 \item	Thread~0 then initializes this block of memory as some
 	other type of structure.
 \item	Thread~1's \co{spin_lock()} operation fails due to the
 	fact that what it believes to be \co{p->lock} is no longer
 	a spinlock.
 \end{enumerate}
 Because there is no existence guarantee, the identity of the
 data element can change while a thread is attempting to acquire
 that element's lock on \clnref{acq}!
 \end{fcvref}

 \begin{listing}
 \begin{fcvlabel}[ln:locking:Per-Element Locking With Lock-Based Existence Guarantees]
 \begin{VerbatimL}[commandchars=\\\@\$]
 int delete(int key)
 {
 	int b;
 	struct element *p;
 	spinlock_t *sp;

 	b = hashfunction(key);
 	sp = &locktable[b];
 	spin_lock(sp);				\lnlbl@acq$
 	p = hashtable[b];			\lnlbl@getp$
 	if (p == NULL || p->key != key) {
 		spin_unlock(sp);
 		return 0;
 	}
 	hashtable[b] = NULL;
 	spin_unlock(sp);
 	kfree(p);
 	return 1;
 }
 \end{VerbatimL}
 \end{fcvlabel}
 \caption{Per-Element Locking With Lock-Based Existence Guarantees}
 \label{lst:locking:Per-Element Locking With Lock-Based Existence Guarantees}
 \end{listing}

 \begin{fcvref}[ln:locking:Per-Element Locking With Lock-Based Existence Guarantees]
 One way to fix this example is to use a hashed set of global locks, so
 that each hash bucket has its own lock, as shown in
 \cref{lst:locking:Per-Element Locking With Lock-Based Existence Guarantees}.
 This approach allows acquiring the proper lock (on \clnref{acq}) before
 gaining a pointer to the data element (on \clnref{getp}).
 Although this approach works quite well for elements contained in a
 single partitionable data structure such as the hash table shown in the
 listing, it can be problematic if a given data element can be a member
 of multiple hash tables or given more-complex data structures such
 as trees or graphs.
 Not only can these problems be solved, but the solutions also form
 the basis of lock-based software transactional memory
 implementations~\cite{Shavit95,DaveDice2006DISC}.
 However,
 \cref{chp:Deferred Processing}
 describes simpler---and faster---ways of providing existence guarantees.
 \end{fcvref}

 % @@@ Optimized sharded locking from P1726R5, and pointer-zap implications.
	% locking/locking-existence.tex
	% mainfile: ../perfbook.tex
	% SPDX-License-Identifier: CC-BY-SA-3.0

	\section{Lock-Based Existence Guarantees}
	\label{sec:locking:Lock-Based Existence Guarantees}
	%
	\epigraph{Existence precedes and rules essence.}{Jean-Paul Sartre}

	A key challenge in parallel programming is to provide
	\emph{\IXpl{existence guarantee}}~\cite{Gamsa99},
	so that attempts to access a given object can rely on that object
	being in existence throughout a given access attempt.

	\begin{listing}
	\begin{fcvlabel}[ln:locking:Per-Element Locking Without Existence Guarantees]
	\begin{VerbatimL}[commandchars=\\\@\$]
	int delete(int key)
	{
	int b;
	struct element *p;

	b = hashfunction(key);
	p = hashtable[b];
	if (p == NULL \|\| p->key != key) \lnlbl@chk_first$
	return 0;
	spin_lock(&p->lock); \lnlbl@acq$
	hashtable[b] = NULL; \lnlbl@NULL$
	spin_unlock(&p->lock);
	kfree(p);
	return 1; \lnlbl@return1$
	}
	\end{VerbatimL}
	\end{fcvlabel}
	\caption{Per-Element Locking Without Existence Guarantees (Buggy!)}
	\label{lst:locking:Per-Element Locking Without Existence Guarantees (Buggy!)}
	\end{listing}

	In some cases, existence guarantees are implicit:

	\begin{enumerate}
	\item Global variables and static local variables in the base module
	will exist as long as the application is running.
	\item Global variables and static local variables in a loaded module
	will exist as long as that module remains loaded.
	\item A module will remain loaded as long as at least one of its functions
	has an active instance.
	\item A given function instance's on-stack variables will exist until
	that instance returns.
	\item If you are executing within a given function or have been
	called (directly or indirectly) from that function,
	then the given function has an active instance.
	\end{enumerate}

	These implicit existence guarantees are straightforward, though
	bugs involving implicit existence guarantees really can happen.

	\QuickQuiz{
	How can relying on implicit existence guarantees result in
	a bug?
	}\QuickQuizAnswer{
	Here are some bugs resulting from improper use of implicit
	existence guarantees:
	\begin{enumerate}
	\item A program writes the address of a global variable to
	a file, then a later instance of that same program
	reads that address and attempts to dereference it.
	This can fail due to address-space randomization,
	to say nothing of recompilation of the program.
	\item A module can record the address of one of its variables
	in a pointer located in some other module, then attempt
	to dereference that pointer after the module has
	been unloaded.
	\item A function can record the address of one of its on-stack
	variables into a global pointer, which some other
	function might attempt to dereference after that function
	has returned.
	\end{enumerate}
	I am sure that you can come up with additional possibilities.
	}\QuickQuizEnd

	But the more interesting---and troublesome---guarantee involves
	heap memory:
	A dynamically allocated data structure will exist until it is freed.
	The problem to be solved is to synchronize the freeing of the structure
	with concurrent accesses to that same structure.
	One way to do this is with \emph{explicit guarantees}, such as locking.
	If a given structure may only be freed while holding a given lock, then holding
	that lock guarantees that structure's existence.

	But this guarantee depends on the existence of the lock itself.
	One straightforward way to guarantee the lock's existence is to
	place the lock in a global variable, but global locking has the disadvantage
	of limiting scalability.
	One way of providing scalability that improves as the size of the
	data structure increases is to place a lock in each element of the
	structure.
	Unfortunately, putting the lock that is to protect a data element
	in the data element itself is subject to subtle \IXpl{race condition},
	as shown in
	\cref{lst:locking:Per-Element Locking Without Existence Guarantees (Buggy!)}.

	\QuickQuiz{
	\begin{fcvref}[ln:locking:Per-Element Locking Without Existence Guarantees]
	What if the element we need to delete is not the first element
	of the list on \clnref{chk_first} of
	\cref{lst:locking:Per-Element Locking Without Existence Guarantees (Buggy!)}?
	\end{fcvref}
	}\QuickQuizAnswer{
	This is a very simple hash table with no chaining, so the only
	element in a given bucket is the first element.
	The reader is invited to adapt this example to a hash table with
	full chaining.
	}\QuickQuizEnd


	\begin{fcvref}[ln:locking:Per-Element Locking Without Existence Guarantees]
	To see one of these race conditions, consider the following sequence
	of events:
	\begin{enumerate}
	\item Thread~0 invokes \co{delete(0)}, and reaches \clnref{acq} of
	the listing, acquiring the lock.
	\item Thread~1 concurrently invokes \co{delete(0)}, reaching
	\clnref{acq}, but spins on the lock because Thread~0 holds it.
	\item Thread~0 executes \clnrefrange{NULL}{return1}, removing the element from
	the hashtable, releasing the lock, and then freeing the
	element.
	\item Thread~0 continues execution, and allocates memory, getting
	the exact block of memory that it just freed.
	\item Thread~0 then initializes this block of memory as some
	other type of structure.
	\item Thread~1's \co{spin_lock()} operation fails due to the
	fact that what it believes to be \co{p->lock} is no longer
	a spinlock.
	\end{enumerate}
	Because there is no existence guarantee, the identity of the
	data element can change while a thread is attempting to acquire
	that element's lock on \clnref{acq}!
	\end{fcvref}

	\begin{listing}
	\begin{fcvlabel}[ln:locking:Per-Element Locking With Lock-Based Existence Guarantees]
	\begin{VerbatimL}[commandchars=\\\@\$]
	int delete(int key)
	{
	int b;
	struct element *p;
	spinlock_t *sp;

	b = hashfunction(key);
	sp = &locktable[b];
	spin_lock(sp); \lnlbl@acq$
	p = hashtable[b]; \lnlbl@getp$
	if (p == NULL \|\| p->key != key) {
	spin_unlock(sp);
	return 0;
	}
	hashtable[b] = NULL;
	spin_unlock(sp);
	kfree(p);
	return 1;
	}
	\end{VerbatimL}
	\end{fcvlabel}
	\caption{Per-Element Locking With Lock-Based Existence Guarantees}
	\label{lst:locking:Per-Element Locking With Lock-Based Existence Guarantees}
	\end{listing}

	\begin{fcvref}[ln:locking:Per-Element Locking With Lock-Based Existence Guarantees]
	One way to fix this example is to use a hashed set of global locks, so
	that each hash bucket has its own lock, as shown in
	\cref{lst:locking:Per-Element Locking With Lock-Based Existence Guarantees}.
	This approach allows acquiring the proper lock (on \clnref{acq}) before
	gaining a pointer to the data element (on \clnref{getp}).
	Although this approach works quite well for elements contained in a
	single partitionable data structure such as the hash table shown in the
	listing, it can be problematic if a given data element can be a member
	of multiple hash tables or given more-complex data structures such
	as trees or graphs.
	Not only can these problems be solved, but the solutions also form
	the basis of lock-based software transactional memory
	implementations~\cite{Shavit95,DaveDice2006DISC}.
	However,
	\cref{chp:Deferred Processing}
	describes simpler---and faster---ways of providing existence guarantees.
	\end{fcvref}

	% @@@ Optimized sharded locking from P1726R5, and pointer-zap implications.