Merge tag 'xarray-4.20-rc4' of git://git.infradead.org/users/willy/linux-dax

Pull XArray updates from Matthew Wilcox:
 "We found some bugs in the DAX conversion to XArray (and one bug which
  predated the XArray conversion). There were a couple of bugs in some
  of the higher-level functions, which aren't actually being called in
  today's kernel, but surfaced as a result of converting existing radix
  tree & IDR users over to the XArray.

  Some of the other changes to how the higher-level APIs work were also
  motivated by converting various users; again, they're not in use in
  today's kernel, so changing them has a low probability of introducing
  a bug.

  Dan can still trigger a bug in the DAX code with hot-offline/online,
  and we're working on tracking that down"

* tag 'xarray-4.20-rc4' of git://git.infradead.org/users/willy/linux-dax:
  XArray tests: Add missing locking
  dax: Avoid losing wakeup in dax_lock_mapping_entry
  dax: Fix huge page faults
  dax: Fix dax_unlock_mapping_entry for PMD pages
  dax: Reinstate RCU protection of inode
  dax: Make sure the unlocking entry isn't locked
  dax: Remove optimisation from dax_lock_mapping_entry
  XArray tests: Correct some 64-bit assumptions
  XArray: Correct xa_store_range
  XArray: Fix Documentation
  XArray: Handle NULL pointers differently for allocation
  XArray: Unify xa_store and __xa_store
  XArray: Add xa_store_bh() and xa_store_irq()
  XArray: Turn xa_erase into an exported function
  XArray: Unify xa_cmpxchg and __xa_cmpxchg
  XArray: Regularise xa_reserve
  nilfs2: Use xa_erase_irq
  XArray: Export __xa_foo to non-GPL modules
  XArray: Fix xa_for_each with a single element at 0
diff --git a/Documentation/core-api/xarray.rst b/Documentation/core-api/xarray.rst
index a4e7051..dbe96cb 100644
--- a/Documentation/core-api/xarray.rst
+++ b/Documentation/core-api/xarray.rst
@@ -74,7 +74,8 @@
 new entry and return the previous entry stored at that index.  You can
 use :c:func:`xa_erase` instead of calling :c:func:`xa_store` with a
 ``NULL`` entry.  There is no difference between an entry that has never
-been stored to and one that has most recently had ``NULL`` stored to it.
+been stored to, one that has been erased and one that has most recently
+had ``NULL`` stored to it.
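
A quick sketch of the semantics described above (a minimal example assuming
<linux/xarray.h>; the array name and the stored value are illustrative):

    DEFINE_XARRAY(array);

    void store_erase_example(void)
    {
            /* Store a small integer, encoded as a value entry, at index 3 */
            xa_store(&array, 3, xa_mk_value(42), GFP_KERNEL);

            /* Erasing it (or storing NULL) ... */
            xa_erase(&array, 3);

            /* ... leaves index 3 indistinguishable from one never stored to */
            WARN_ON(xa_load(&array, 3) != NULL);
    }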
 
 You can conditionally replace an entry at an index by using
 :c:func:`xa_cmpxchg`.  Like :c:func:`cmpxchg`, it will only succeed if
@@ -105,23 +106,44 @@
 indices.  Storing into one index may result in the entry retrieved by
 some, but not all of the other indices changing.
 
+Sometimes you need to ensure that a subsequent call to :c:func:`xa_store`
+will not need to allocate memory.  The :c:func:`xa_reserve` function
+will store a reserved entry at the indicated index.  Users of the normal
+API will see this entry as containing ``NULL``.  If you do not need to
+use the reserved entry, you can call :c:func:`xa_release` to remove the
+unused entry.  If another user has stored to the entry in the meantime,
+:c:func:`xa_release` will do nothing; if instead you want the entry to
+become ``NULL``, you should use :c:func:`xa_erase`.
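
A hedged sketch of that reservation flow (prepare_slot() is a made-up caller):

    int prepare_slot(struct xarray *xa, unsigned long index)
    {
            int err = xa_reserve(xa, index, GFP_KERNEL);

            if (err)
                    return err;     /* -ENOMEM */

            /*
             * Readers see NULL at @index until a real entry is stored.
             * If the slot turns out not to be needed, hand it back:
             */
            xa_release(xa, index);
            return 0;
    }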
+
+If all entries in the array are ``NULL``, the :c:func:`xa_empty` function
+will return ``true``.
+
 Finally, you can remove all entries from an XArray by calling
 :c:func:`xa_destroy`.  If the XArray entries are pointers, you may wish
 to free the entries first.  You can do this by iterating over all present
 entries in the XArray using the :c:func:`xa_for_each` iterator.
 
-ID assignment
--------------
+Allocating XArrays
+------------------
+
+If you use :c:func:`DEFINE_XARRAY_ALLOC` to define the XArray, or
+initialise it by passing ``XA_FLAGS_ALLOC`` to :c:func:`xa_init_flags`,
+the XArray changes to track whether entries are in use or not.
 
 You can call :c:func:`xa_alloc` to store the entry at any unused index
 in the XArray.  If you need to modify the array from interrupt context,
 you can use :c:func:`xa_alloc_bh` or :c:func:`xa_alloc_irq` to disable
-interrupts while allocating the ID.  Unlike :c:func:`xa_store`, allocating
-a ``NULL`` pointer does not delete an entry.  Instead it reserves an
-entry like :c:func:`xa_reserve` and you can release it using either
-:c:func:`xa_erase` or :c:func:`xa_release`.  To use ID assignment, the
-XArray must be defined with :c:func:`DEFINE_XARRAY_ALLOC`, or initialised
-by passing ``XA_FLAGS_ALLOC`` to :c:func:`xa_init_flags`,
+interrupts while allocating the ID.
+
+Using :c:func:`xa_store`, :c:func:`xa_cmpxchg` or :c:func:`xa_insert`
+will mark the entry as being allocated.  Unlike a normal XArray, storing
+``NULL`` will mark the entry as being in use, like :c:func:`xa_reserve`.
+To free an entry, use :c:func:`xa_erase` (or :c:func:`xa_release` if
+you only want to free the entry if it's ``NULL``).
+
+You cannot use ``XA_MARK_0`` with an allocating XArray as this mark
+is used to track whether an entry is free or not.  The other marks are
+available for your use.
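
A sketch of the allocating API described here (hedged: DEFINE_XARRAY_ALLOC and
xa_alloc() are used as in this series; the session pointer and the upper bound
are placeholders):

    DEFINE_XARRAY_ALLOC(sessions);

    int new_session(void *session, u32 *id)
    {
            /* Pick an unused index, store @session there, report it in @id */
            int err = xa_alloc(&sessions, id, UINT_MAX, session, GFP_KERNEL);

            /* Freeing the index later is a plain xa_erase(&sessions, *id) */
            return err;     /* 0, -ENOMEM or -ENOSPC */
    }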
 
 Memory allocation
 -----------------
@@ -158,6 +180,8 @@
 
 Takes xa_lock internally:
  * :c:func:`xa_store`
+ * :c:func:`xa_store_bh`
+ * :c:func:`xa_store_irq`
  * :c:func:`xa_insert`
  * :c:func:`xa_erase`
  * :c:func:`xa_erase_bh`
@@ -167,6 +191,9 @@
  * :c:func:`xa_alloc`
  * :c:func:`xa_alloc_bh`
  * :c:func:`xa_alloc_irq`
+ * :c:func:`xa_reserve`
+ * :c:func:`xa_reserve_bh`
+ * :c:func:`xa_reserve_irq`
  * :c:func:`xa_destroy`
  * :c:func:`xa_set_mark`
  * :c:func:`xa_clear_mark`
@@ -177,6 +204,7 @@
  * :c:func:`__xa_erase`
  * :c:func:`__xa_cmpxchg`
  * :c:func:`__xa_alloc`
+ * :c:func:`__xa_reserve`
  * :c:func:`__xa_set_mark`
  * :c:func:`__xa_clear_mark`
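
To illustrate the split between the two lists, a hedged sketch of taking the
lock yourself so that several __xa_* calls happen under a single acquisition
(erase_run() is a hypothetical helper):

    /* Remove a contiguous run of indices without dropping the lock */
    void erase_run(struct xarray *xa, unsigned long first, unsigned long last)
    {
            unsigned long index;

            xa_lock(xa);
            for (index = first; index <= last; index++)
                    __xa_erase(xa, index);
            xa_unlock(xa);
    }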
 
@@ -234,7 +262,8 @@
 using :c:func:`xa_lock_irqsave` in both the interrupt handler and process
 context, or :c:func:`xa_lock_irq` in process context and :c:func:`xa_lock`
 in the interrupt handler.  Some of the more common patterns have helper
-functions such as :c:func:`xa_erase_bh` and :c:func:`xa_erase_irq`.
+functions such as :c:func:`xa_store_bh`, :c:func:`xa_store_irq`,
+:c:func:`xa_erase_bh` and :c:func:`xa_erase_irq`.
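
A hedged sketch of that pattern (the function names are made up; GFP_ATOMIC is
used because the array is also updated with interrupts disabled):

    DEFINE_XARRAY(irq_shared);

    /* Process context: keep interrupts off while holding the lock */
    void record_from_task(unsigned long index, void *entry)
    {
            xa_store_irq(&irq_shared, index, entry, GFP_ATOMIC);
    }

    /* Interrupt handler: interrupts are already off, the plain lock suffices */
    void record_from_irq(unsigned long index, void *entry)
    {
            xa_store(&irq_shared, index, entry, GFP_ATOMIC);
    }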
 
 Sometimes you need to protect access to the XArray with a mutex because
 that lock sits above another mutex in the locking hierarchy.  That does
@@ -322,7 +351,8 @@
      - :c:func:`xa_is_zero`
      - Zero entries appear as ``NULL`` through the Normal API, but occupy
        an entry in the XArray which can be used to reserve the index for
-       future use.
+       future use.  This is used by allocating XArrays for allocated entries
+       which are ``NULL``.
 
 Other internal entries may be added in the future.  As far as possible, they
 will be handled by :c:func:`xas_retry`.
diff --git a/fs/dax.c b/fs/dax.c
index 616e36e..9bcce89 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -98,12 +98,6 @@
 	return xa_mk_value(flags | (pfn_t_to_pfn(pfn) << DAX_SHIFT));
 }
 
-static void *dax_make_page_entry(struct page *page)
-{
-	pfn_t pfn = page_to_pfn_t(page);
-	return dax_make_entry(pfn, PageHead(page) ? DAX_PMD : 0);
-}
-
 static bool dax_is_locked(void *entry)
 {
 	return xa_to_value(entry) & DAX_LOCKED;
@@ -116,12 +110,12 @@
 	return 0;
 }
 
-static int dax_is_pmd_entry(void *entry)
+static unsigned long dax_is_pmd_entry(void *entry)
 {
 	return xa_to_value(entry) & DAX_PMD;
 }
 
-static int dax_is_pte_entry(void *entry)
+static bool dax_is_pte_entry(void *entry)
 {
 	return !(xa_to_value(entry) & DAX_PMD);
 }
@@ -222,9 +216,8 @@
 	ewait.wait.func = wake_exceptional_entry_func;
 
 	for (;;) {
-		entry = xas_load(xas);
-		if (!entry || xa_is_internal(entry) ||
-				WARN_ON_ONCE(!xa_is_value(entry)) ||
+		entry = xas_find_conflict(xas);
+		if (!entry || WARN_ON_ONCE(!xa_is_value(entry)) ||
 				!dax_is_locked(entry))
 			return entry;
 
@@ -255,6 +248,7 @@
 {
 	void *old;
 
+	BUG_ON(dax_is_locked(entry));
 	xas_reset(xas);
 	xas_lock_irq(xas);
 	old = xas_store(xas, entry);
@@ -352,16 +346,27 @@
 	return NULL;
 }
 
+/*
+ * dax_lock_mapping_entry - Lock the DAX entry corresponding to a page
+ * @page: The page whose entry we want to lock
+ *
+ * Context: Process context.
+ * Return: %true if the entry was locked or does not need to be locked.
+ */
 bool dax_lock_mapping_entry(struct page *page)
 {
 	XA_STATE(xas, NULL, 0);
 	void *entry;
+	bool locked;
 
+	/* Ensure page->mapping isn't freed while we look at it */
+	rcu_read_lock();
 	for (;;) {
 		struct address_space *mapping = READ_ONCE(page->mapping);
 
+		locked = false;
 		if (!dax_mapping(mapping))
-			return false;
+			break;
 
 		/*
 		 * In the device-dax case there's no need to lock, a
@@ -370,8 +375,9 @@
 		 * otherwise we would not have a valid pfn_to_page()
 		 * translation.
 		 */
+		locked = true;
 		if (S_ISCHR(mapping->host->i_mode))
-			return true;
+			break;
 
 		xas.xa = &mapping->i_pages;
 		xas_lock_irq(&xas);
@@ -382,28 +388,35 @@
 		xas_set(&xas, page->index);
 		entry = xas_load(&xas);
 		if (dax_is_locked(entry)) {
+			rcu_read_unlock();
 			entry = get_unlocked_entry(&xas);
-			/* Did the page move while we slept? */
-			if (dax_to_pfn(entry) != page_to_pfn(page)) {
-				xas_unlock_irq(&xas);
-				continue;
-			}
+			xas_unlock_irq(&xas);
+			put_unlocked_entry(&xas, entry);
+			rcu_read_lock();
+			continue;
 		}
 		dax_lock_entry(&xas, entry);
 		xas_unlock_irq(&xas);
-		return true;
+		break;
 	}
+	rcu_read_unlock();
+	return locked;
 }
 
 void dax_unlock_mapping_entry(struct page *page)
 {
 	struct address_space *mapping = page->mapping;
 	XA_STATE(xas, &mapping->i_pages, page->index);
+	void *entry;
 
 	if (S_ISCHR(mapping->host->i_mode))
 		return;
 
-	dax_unlock_entry(&xas, dax_make_page_entry(page));
+	rcu_read_lock();
+	entry = xas_load(&xas);
+	rcu_read_unlock();
+	entry = dax_make_entry(page_to_pfn_t(page), dax_is_pmd_entry(entry));
+	dax_unlock_entry(&xas, entry);
 }
 
 /*
@@ -445,11 +458,9 @@
 retry:
 	xas_lock_irq(xas);
 	entry = get_unlocked_entry(xas);
-	if (xa_is_internal(entry))
-		goto fallback;
 
 	if (entry) {
-		if (WARN_ON_ONCE(!xa_is_value(entry))) {
+		if (!xa_is_value(entry)) {
 			xas_set_err(xas, EIO);
 			goto out_unlock;
 		}
@@ -1628,8 +1639,7 @@
 	/* Did we race with someone splitting entry or so? */
 	if (!entry ||
 	    (order == 0 && !dax_is_pte_entry(entry)) ||
-	    (order == PMD_ORDER && (xa_is_internal(entry) ||
-				    !dax_is_pmd_entry(entry)))) {
+	    (order == PMD_ORDER && !dax_is_pmd_entry(entry))) {
 		put_unlocked_entry(&xas, entry);
 		xas_unlock_irq(&xas);
 		trace_dax_insert_pfn_mkwrite_no_entry(mapping->host, vmf,
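
For reference, a minimal usage sketch of this locking pair in the style of its
memory-failure caller (hedged: the function name and the elided poison handling
are illustrative, not the actual mm/ code):

    static int handle_dax_poison(struct page *page)
    {
            /* Pin page->mapping and lock the DAX entry before inspecting them */
            if (!dax_lock_mapping_entry(page))
                    return -EBUSY;

            /* ... unmap the poisoned pfn from its mappings here ... */

            dax_unlock_mapping_entry(page);
            return 0;
    }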
diff --git a/fs/nilfs2/btnode.c b/fs/nilfs2/btnode.c
index de99db5..f2129a5 100644
--- a/fs/nilfs2/btnode.c
+++ b/fs/nilfs2/btnode.c
@@ -266,9 +266,7 @@
 		return;
 
 	if (nbh == NULL) {	/* blocksize == pagesize */
-		xa_lock_irq(&btnc->i_pages);
-		__xa_erase(&btnc->i_pages, newkey);
-		xa_unlock_irq(&btnc->i_pages);
+		xa_erase_irq(&btnc->i_pages, newkey);
 		unlock_page(ctxt->bh->b_page);
 	} else
 		brelse(nbh);
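
The helper used here folds the open-coded lock/erase/unlock sequence into one
call; its definition mirrors the xa_erase_bh() pattern documented elsewhere in
this series (a sketch, not copied verbatim):

    static inline void *xa_erase_irq(struct xarray *xa, unsigned long index)
    {
            void *entry;

            xa_lock_irq(xa);
            entry = __xa_erase(xa, index);
            xa_unlock_irq(xa);

            return entry;
    }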
diff --git a/include/linux/xarray.h b/include/linux/xarray.h
index d951492..564892e 100644
--- a/include/linux/xarray.h
+++ b/include/linux/xarray.h
@@ -289,9 +289,7 @@
 void xa_init_flags(struct xarray *, gfp_t flags);
 void *xa_load(struct xarray *, unsigned long index);
 void *xa_store(struct xarray *, unsigned long index, void *entry, gfp_t);
-void *xa_cmpxchg(struct xarray *, unsigned long index,
-			void *old, void *entry, gfp_t);
-int xa_reserve(struct xarray *, unsigned long index, gfp_t);
+void *xa_erase(struct xarray *, unsigned long index);
 void *xa_store_range(struct xarray *, unsigned long first, unsigned long last,
 			void *entry, gfp_t);
 bool xa_get_mark(struct xarray *, unsigned long index, xa_mark_t);
@@ -344,65 +342,6 @@
 }
 
 /**
- * xa_erase() - Erase this entry from the XArray.
- * @xa: XArray.
- * @index: Index of entry.
- *
- * This function is the equivalent of calling xa_store() with %NULL as
- * the third argument.  The XArray does not need to allocate memory, so
- * the user does not need to provide GFP flags.
- *
- * Context: Process context.  Takes and releases the xa_lock.
- * Return: The entry which used to be at this index.
- */
-static inline void *xa_erase(struct xarray *xa, unsigned long index)
-{
-	return xa_store(xa, index, NULL, 0);
-}
-
-/**
- * xa_insert() - Store this entry in the XArray unless another entry is
- *			already present.
- * @xa: XArray.
- * @index: Index into array.
- * @entry: New entry.
- * @gfp: Memory allocation flags.
- *
- * If you would rather see the existing entry in the array, use xa_cmpxchg().
- * This function is for users who don't care what the entry is, only that
- * one is present.
- *
- * Context: Process context.  Takes and releases the xa_lock.
- *	    May sleep if the @gfp flags permit.
- * Return: 0 if the store succeeded.  -EEXIST if another entry was present.
- * -ENOMEM if memory could not be allocated.
- */
-static inline int xa_insert(struct xarray *xa, unsigned long index,
-		void *entry, gfp_t gfp)
-{
-	void *curr = xa_cmpxchg(xa, index, NULL, entry, gfp);
-	if (!curr)
-		return 0;
-	if (xa_is_err(curr))
-		return xa_err(curr);
-	return -EEXIST;
-}
-
-/**
- * xa_release() - Release a reserved entry.
- * @xa: XArray.
- * @index: Index of entry.
- *
- * After calling xa_reserve(), you can call this function to release the
- * reservation.  If the entry at @index has been stored to, this function
- * will do nothing.
- */
-static inline void xa_release(struct xarray *xa, unsigned long index)
-{
-	xa_cmpxchg(xa, index, NULL, NULL, 0);
-}
-
-/**
  * xa_for_each() - Iterate over a portion of an XArray.
  * @xa: XArray.
  * @entry: Entry retrieved from array.
@@ -455,6 +394,7 @@
 void *__xa_cmpxchg(struct xarray *, unsigned long index, void *old,
 		void *entry, gfp_t);
 int __xa_alloc(struct xarray *, u32 *id, u32 max, void *entry, gfp_t);
+int __xa_reserve(struct xarray *, unsigned long index, gfp_t);
 void __xa_set_mark(struct xarray *, unsigned long index, xa_mark_t);
 void __xa_clear_mark(struct xarray *, unsigned long index, xa_mark_t);
 
@@ -487,6 +427,58 @@
 }
 
 /**
+ * xa_store_bh() - Store this entry in the XArray.
+ * @xa: XArray.
+ * @index: Index into array.
+ * @entry: New entry.
+ * @gfp: Memory allocation flags.
+ *
+ * This function is like calling xa_store() except it disables softirqs
+ * while holding the array lock.
+ *
+ * Context: Any context.  Takes and releases the xa_lock while
+ * disabling softirqs.
+ * Return: The entry which used to be at this index.
+ */
+static inline void *xa_store_bh(struct xarray *xa, unsigned long index,
+		void *entry, gfp_t gfp)
+{
+	void *curr;
+
+	xa_lock_bh(xa);
+	curr = __xa_store(xa, index, entry, gfp);
+	xa_unlock_bh(xa);
+
+	return curr;
+}
+
+/**
+ * xa_store_irq() - Store this entry in the XArray.
+ * @xa: XArray.
+ * @index: Index into array.
+ * @entry: New entry.
+ * @gfp: Memory allocation flags.
+ *
+ * This function is like calling xa_store() except it disables interrupts
+ * while holding the array lock.
+ *
+ * Context: Process context.  Takes and releases the xa_lock while
+ * disabling interrupts.
+ * Return: The entry which used to be at this index.
+ */
+static inline void *xa_store_irq(struct xarray *xa, unsigned long index,
+		void *entry, gfp_t gfp)
+{
+	void *curr;
+
+	xa_lock_irq(xa);
+	curr = __xa_store(xa, index, entry, gfp);
+	xa_unlock_irq(xa);
+
+	return curr;
+}
+
+/**
  * xa_erase_bh() - Erase this entry from the XArray.
  * @xa: XArray.
  * @index: Index of entry.
@@ -495,7 +487,7 @@
  * the third argument.  The XArray does not need to allocate memory, so
  * the user does not need to provide GFP flags.
  *
- * Context: Process context.  Takes and releases the xa_lock while
+ * Context: Any context.  Takes and releases the xa_lock while
  * disabling softirqs.
  * Return: The entry which used to be at this index.
  */
@@ -535,6 +527,61 @@
 }
 
 /**
+ * xa_cmpxchg() - Conditionally replace an entry in the XArray.
+ * @xa: XArray.
+ * @index: Index into array.
+ * @old: Old value to test against.
+ * @entry: New value to place in array.
+ * @gfp: Memory allocation flags.
+ *
+ * If the entry at @index is the same as @old, replace it with @entry.
+ * If the return value is equal to @old, then the exchange was successful.
+ *
+ * Context: Any context.  Takes and releases the xa_lock.  May sleep
+ * if the @gfp flags permit.
+ * Return: The old value at this index or xa_err() if an error happened.
+ */
+static inline void *xa_cmpxchg(struct xarray *xa, unsigned long index,
+			void *old, void *entry, gfp_t gfp)
+{
+	void *curr;
+
+	xa_lock(xa);
+	curr = __xa_cmpxchg(xa, index, old, entry, gfp);
+	xa_unlock(xa);
+
+	return curr;
+}
+
+/**
+ * xa_insert() - Store this entry in the XArray unless another entry is
+ *			already present.
+ * @xa: XArray.
+ * @index: Index into array.
+ * @entry: New entry.
+ * @gfp: Memory allocation flags.
+ *
+ * If you would rather see the existing entry in the array, use xa_cmpxchg().
+ * This function is for users who don't care what the entry is, only that
+ * one is present.
+ *
+ * Context: Process context.  Takes and releases the xa_lock.
+ *	    May sleep if the @gfp flags permit.
+ * Return: 0 if the store succeeded.  -EEXIST if another entry was present.
+ * -ENOMEM if memory could not be allocated.
+ */
+static inline int xa_insert(struct xarray *xa, unsigned long index,
+		void *entry, gfp_t gfp)
+{
+	void *curr = xa_cmpxchg(xa, index, NULL, entry, gfp);
+	if (!curr)
+		return 0;
+	if (xa_is_err(curr))
+		return xa_err(curr);
+	return -EEXIST;
+}
+
+/**
  * xa_alloc() - Find somewhere to store this entry in the XArray.
  * @xa: XArray.
  * @id: Pointer to ID.
@@ -575,7 +622,7 @@
  * Updates the @id pointer with the index, then stores the entry at that
  * index.  A concurrent lookup will not see an uninitialised @id.
  *
- * Context: Process context.  Takes and releases the xa_lock while
+ * Context: Any context.  Takes and releases the xa_lock while
  * disabling softirqs.  May sleep if the @gfp flags permit.
  * Return: 0 on success, -ENOMEM if memory allocation fails or -ENOSPC if
  * there is no more space in the XArray.
@@ -621,6 +668,98 @@
 	return err;
 }
 
+/**
+ * xa_reserve() - Reserve this index in the XArray.
+ * @xa: XArray.
+ * @index: Index into array.
+ * @gfp: Memory allocation flags.
+ *
+ * Ensures there is somewhere to store an entry at @index in the array.
+ * If there is already something stored at @index, this function does
+ * nothing.  If there was nothing there, the entry is marked as reserved.
+ * Loading from a reserved entry returns a %NULL pointer.
+ *
+ * If you do not use the entry that you have reserved, call xa_release()
+ * or xa_erase() to free any unnecessary memory.
+ *
+ * Context: Any context.  Takes and releases the xa_lock.
+ * May sleep if the @gfp flags permit.
+ * Return: 0 if the reservation succeeded or -ENOMEM if it failed.
+ */
+static inline
+int xa_reserve(struct xarray *xa, unsigned long index, gfp_t gfp)
+{
+	int ret;
+
+	xa_lock(xa);
+	ret = __xa_reserve(xa, index, gfp);
+	xa_unlock(xa);
+
+	return ret;
+}
+
+/**
+ * xa_reserve_bh() - Reserve this index in the XArray.
+ * @xa: XArray.
+ * @index: Index into array.
+ * @gfp: Memory allocation flags.
+ *
+ * A softirq-disabling version of xa_reserve().
+ *
+ * Context: Any context.  Takes and releases the xa_lock while
+ * disabling softirqs.
+ * Return: 0 if the reservation succeeded or -ENOMEM if it failed.
+ */
+static inline
+int xa_reserve_bh(struct xarray *xa, unsigned long index, gfp_t gfp)
+{
+	int ret;
+
+	xa_lock_bh(xa);
+	ret = __xa_reserve(xa, index, gfp);
+	xa_unlock_bh(xa);
+
+	return ret;
+}
+
+/**
+ * xa_reserve_irq() - Reserve this index in the XArray.
+ * @xa: XArray.
+ * @index: Index into array.
+ * @gfp: Memory allocation flags.
+ *
+ * An interrupt-disabling version of xa_reserve().
+ *
+ * Context: Process context.  Takes and releases the xa_lock while
+ * disabling interrupts.
+ * Return: 0 if the reservation succeeded or -ENOMEM if it failed.
+ */
+static inline
+int xa_reserve_irq(struct xarray *xa, unsigned long index, gfp_t gfp)
+{
+	int ret;
+
+	xa_lock_irq(xa);
+	ret = __xa_reserve(xa, index, gfp);
+	xa_unlock_irq(xa);
+
+	return ret;
+}
+
+/**
+ * xa_release() - Release a reserved entry.
+ * @xa: XArray.
+ * @index: Index of entry.
+ *
+ * After calling xa_reserve(), you can call this function to release the
+ * reservation.  If the entry at @index has been stored to, this function
+ * will do nothing.
+ */
+static inline void xa_release(struct xarray *xa, unsigned long index)
+{
+	xa_cmpxchg(xa, index, NULL, NULL, 0);
+}
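
A brief usage sketch for the relocated xa_insert() (hedged: add_object() and
its error mapping are illustrative):

    int add_object(struct xarray *xa, unsigned long index, void *obj)
    {
            int err = xa_insert(xa, index, obj, GFP_KERNEL);

            if (err == -EEXIST)     /* somebody beat us to this index */
                    return -EBUSY;
            return err;             /* 0 on success, or -ENOMEM */
    }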
+
 /* Everything below here is the Advanced API.  Proceed with caution. */
 
 /*
diff --git a/lib/test_xarray.c b/lib/test_xarray.c
index aa47754..0598e86 100644
--- a/lib/test_xarray.c
+++ b/lib/test_xarray.c
@@ -208,15 +208,19 @@
 			XA_BUG_ON(xa, xa_get_mark(xa, i, XA_MARK_2));
 
 			/* We should see two elements in the array */
+			rcu_read_lock();
 			xas_for_each(&xas, entry, ULONG_MAX)
 				seen++;
+			rcu_read_unlock();
 			XA_BUG_ON(xa, seen != 2);
 
 			/* One of which is marked */
 			xas_set(&xas, 0);
 			seen = 0;
+			rcu_read_lock();
 			xas_for_each_marked(&xas, entry, ULONG_MAX, XA_MARK_0)
 				seen++;
+			rcu_read_unlock();
 			XA_BUG_ON(xa, seen != 1);
 		}
 		XA_BUG_ON(xa, xa_get_mark(xa, next, XA_MARK_0));
@@ -373,6 +377,12 @@
 	xa_erase_index(xa, 12345678);
 	XA_BUG_ON(xa, !xa_empty(xa));
 
+	/* And so does xa_insert */
+	xa_reserve(xa, 12345678, GFP_KERNEL);
+	XA_BUG_ON(xa, xa_insert(xa, 12345678, xa_mk_value(12345678), 0) != 0);
+	xa_erase_index(xa, 12345678);
+	XA_BUG_ON(xa, !xa_empty(xa));
+
 	/* Can iterate through a reserved entry */
 	xa_store_index(xa, 5, GFP_KERNEL);
 	xa_reserve(xa, 6, GFP_KERNEL);
@@ -436,7 +446,9 @@
 	XA_BUG_ON(xa, xa_load(xa, max) != NULL);
 	XA_BUG_ON(xa, xa_load(xa, min - 1) != NULL);
 
+	xas_lock(&xas);
 	XA_BUG_ON(xa, xas_store(&xas, xa_mk_value(min)) != xa_mk_value(index));
+	xas_unlock(&xas);
 	XA_BUG_ON(xa, xa_load(xa, min) != xa_mk_value(min));
 	XA_BUG_ON(xa, xa_load(xa, max - 1) != xa_mk_value(min));
 	XA_BUG_ON(xa, xa_load(xa, max) != NULL);
@@ -452,9 +464,11 @@
 	XA_STATE(xas, xa, index);
 	xa_store_order(xa, index, order, xa_mk_value(0), GFP_KERNEL);
 
+	xas_lock(&xas);
 	XA_BUG_ON(xa, xas_store(&xas, xa_mk_value(1)) != xa_mk_value(0));
 	XA_BUG_ON(xa, xas.xa_index != index);
 	XA_BUG_ON(xa, xas_store(&xas, NULL) != xa_mk_value(1));
+	xas_unlock(&xas);
 	XA_BUG_ON(xa, !xa_empty(xa));
 }
 #endif
@@ -498,7 +512,7 @@
 	rcu_read_unlock();
 
 	/* We can erase multiple values with a single store */
-	xa_store_order(xa, 0, 63, NULL, GFP_KERNEL);
+	xa_store_order(xa, 0, BITS_PER_LONG - 1, NULL, GFP_KERNEL);
 	XA_BUG_ON(xa, !xa_empty(xa));
 
 	/* Even when the first slot is empty but the others aren't */
@@ -702,7 +716,7 @@
 	}
 }
 
-static noinline void check_find(struct xarray *xa)
+static noinline void check_find_1(struct xarray *xa)
 {
 	unsigned long i, j, k;
 
@@ -748,6 +762,34 @@
 		XA_BUG_ON(xa, xa_get_mark(xa, i, XA_MARK_0));
 	}
 	XA_BUG_ON(xa, !xa_empty(xa));
+}
+
+static noinline void check_find_2(struct xarray *xa)
+{
+	void *entry;
+	unsigned long i, j, index = 0;
+
+	xa_for_each(xa, entry, index, ULONG_MAX, XA_PRESENT) {
+		XA_BUG_ON(xa, true);
+	}
+
+	for (i = 0; i < 1024; i++) {
+		xa_store_index(xa, index, GFP_KERNEL);
+		j = 0;
+		index = 0;
+		xa_for_each(xa, entry, index, ULONG_MAX, XA_PRESENT) {
+			XA_BUG_ON(xa, xa_mk_value(index) != entry);
+			XA_BUG_ON(xa, index != j++);
+		}
+	}
+
+	xa_destroy(xa);
+}
+
+static noinline void check_find(struct xarray *xa)
+{
+	check_find_1(xa);
+	check_find_2(xa);
 	check_multi_find(xa);
 	check_multi_find_2(xa);
 }
@@ -1067,7 +1109,7 @@
 			__check_store_range(xa, 4095 + i, 4095 + j);
 			__check_store_range(xa, 4096 + i, 4096 + j);
 			__check_store_range(xa, 123456 + i, 123456 + j);
-			__check_store_range(xa, UINT_MAX + i, UINT_MAX + j);
+			__check_store_range(xa, (1 << 24) + i, (1 << 24) + j);
 		}
 	}
 }
@@ -1146,10 +1188,12 @@
 		XA_STATE(xas, xa, 1 << order);
 
 		xa_store_order(xa, 0, order, xa, GFP_KERNEL);
+		rcu_read_lock();
 		xas_load(&xas);
 		XA_BUG_ON(xa, xas.xa_node->count == 0);
 		XA_BUG_ON(xa, xas.xa_node->count > (1 << order));
 		XA_BUG_ON(xa, xas.xa_node->nr_values != 0);
+		rcu_read_unlock();
 
 		xa_store_order(xa, 1 << order, order, xa_mk_value(1 << order),
 				GFP_KERNEL);
diff --git a/lib/xarray.c b/lib/xarray.c
index 8b176f0..bbacca5 100644
--- a/lib/xarray.c
+++ b/lib/xarray.c
@@ -610,8 +610,8 @@
  * (see the xa_cmpxchg() implementation for an example).
  *
  * Return: If the slot already existed, returns the contents of this slot.
- * If the slot was newly created, returns NULL.  If it failed to create the
- * slot, returns NULL and indicates the error in @xas.
+ * If the slot was newly created, returns %NULL.  If it failed to create the
+ * slot, returns %NULL and indicates the error in @xas.
  */
 static void *xas_create(struct xa_state *xas)
 {
@@ -1334,44 +1334,31 @@
 	XA_STATE(xas, xa, index);
 	return xas_result(&xas, xas_store(&xas, NULL));
 }
-EXPORT_SYMBOL_GPL(__xa_erase);
+EXPORT_SYMBOL(__xa_erase);
 
 /**
- * xa_store() - Store this entry in the XArray.
+ * xa_erase() - Erase this entry from the XArray.
  * @xa: XArray.
- * @index: Index into array.
- * @entry: New entry.
- * @gfp: Memory allocation flags.
+ * @index: Index of entry.
  *
- * After this function returns, loads from this index will return @entry.
- * Storing into an existing multislot entry updates the entry of every index.
- * The marks associated with @index are unaffected unless @entry is %NULL.
+ * This function is the equivalent of calling xa_store() with %NULL as
+ * the third argument.  The XArray does not need to allocate memory, so
+ * the user does not need to provide GFP flags.
  *
- * Context: Process context.  Takes and releases the xa_lock.  May sleep
- * if the @gfp flags permit.
- * Return: The old entry at this index on success, xa_err(-EINVAL) if @entry
- * cannot be stored in an XArray, or xa_err(-ENOMEM) if memory allocation
- * failed.
+ * Context: Any context.  Takes and releases the xa_lock.
+ * Return: The entry which used to be at this index.
  */
-void *xa_store(struct xarray *xa, unsigned long index, void *entry, gfp_t gfp)
+void *xa_erase(struct xarray *xa, unsigned long index)
 {
-	XA_STATE(xas, xa, index);
-	void *curr;
+	void *entry;
 
-	if (WARN_ON_ONCE(xa_is_internal(entry)))
-		return XA_ERROR(-EINVAL);
+	xa_lock(xa);
+	entry = __xa_erase(xa, index);
+	xa_unlock(xa);
 
-	do {
-		xas_lock(&xas);
-		curr = xas_store(&xas, entry);
-		if (xa_track_free(xa) && entry)
-			xas_clear_mark(&xas, XA_FREE_MARK);
-		xas_unlock(&xas);
-	} while (xas_nomem(&xas, gfp));
-
-	return xas_result(&xas, curr);
+	return entry;
 }
-EXPORT_SYMBOL(xa_store);
+EXPORT_SYMBOL(xa_erase);
 
 /**
  * __xa_store() - Store this entry in the XArray.
@@ -1395,10 +1382,12 @@
 
 	if (WARN_ON_ONCE(xa_is_internal(entry)))
 		return XA_ERROR(-EINVAL);
+	if (xa_track_free(xa) && !entry)
+		entry = XA_ZERO_ENTRY;
 
 	do {
 		curr = xas_store(&xas, entry);
-		if (xa_track_free(xa) && entry)
+		if (xa_track_free(xa))
 			xas_clear_mark(&xas, XA_FREE_MARK);
 	} while (__xas_nomem(&xas, gfp));
 
@@ -1407,45 +1396,33 @@
 EXPORT_SYMBOL(__xa_store);
 
 /**
- * xa_cmpxchg() - Conditionally replace an entry in the XArray.
+ * xa_store() - Store this entry in the XArray.
  * @xa: XArray.
  * @index: Index into array.
- * @old: Old value to test against.
- * @entry: New value to place in array.
+ * @entry: New entry.
  * @gfp: Memory allocation flags.
  *
- * If the entry at @index is the same as @old, replace it with @entry.
- * If the return value is equal to @old, then the exchange was successful.
+ * After this function returns, loads from this index will return @entry.
+ * Storing into an existing multislot entry updates the entry of every index.
+ * The marks associated with @index are unaffected unless @entry is %NULL.
  *
- * Context: Process context.  Takes and releases the xa_lock.  May sleep
- * if the @gfp flags permit.
- * Return: The old value at this index or xa_err() if an error happened.
+ * Context: Any context.  Takes and releases the xa_lock.
+ * May sleep if the @gfp flags permit.
+ * Return: The old entry at this index on success, xa_err(-EINVAL) if @entry
+ * cannot be stored in an XArray, or xa_err(-ENOMEM) if memory allocation
+ * failed.
  */
-void *xa_cmpxchg(struct xarray *xa, unsigned long index,
-			void *old, void *entry, gfp_t gfp)
+void *xa_store(struct xarray *xa, unsigned long index, void *entry, gfp_t gfp)
 {
-	XA_STATE(xas, xa, index);
 	void *curr;
 
-	if (WARN_ON_ONCE(xa_is_internal(entry)))
-		return XA_ERROR(-EINVAL);
+	xa_lock(xa);
+	curr = __xa_store(xa, index, entry, gfp);
+	xa_unlock(xa);
 
-	do {
-		xas_lock(&xas);
-		curr = xas_load(&xas);
-		if (curr == XA_ZERO_ENTRY)
-			curr = NULL;
-		if (curr == old) {
-			xas_store(&xas, entry);
-			if (xa_track_free(xa) && entry)
-				xas_clear_mark(&xas, XA_FREE_MARK);
-		}
-		xas_unlock(&xas);
-	} while (xas_nomem(&xas, gfp));
-
-	return xas_result(&xas, curr);
+	return curr;
 }
-EXPORT_SYMBOL(xa_cmpxchg);
+EXPORT_SYMBOL(xa_store);
 
 /**
  * __xa_cmpxchg() - Store this entry in the XArray.
@@ -1471,6 +1448,8 @@
 
 	if (WARN_ON_ONCE(xa_is_internal(entry)))
 		return XA_ERROR(-EINVAL);
+	if (xa_track_free(xa) && !entry)
+		entry = XA_ZERO_ENTRY;
 
 	do {
 		curr = xas_load(&xas);
@@ -1478,7 +1457,7 @@
 			curr = NULL;
 		if (curr == old) {
 			xas_store(&xas, entry);
-			if (xa_track_free(xa) && entry)
+			if (xa_track_free(xa))
 				xas_clear_mark(&xas, XA_FREE_MARK);
 		}
 	} while (__xas_nomem(&xas, gfp));
@@ -1488,7 +1467,7 @@
 EXPORT_SYMBOL(__xa_cmpxchg);
 
 /**
- * xa_reserve() - Reserve this index in the XArray.
+ * __xa_reserve() - Reserve this index in the XArray.
  * @xa: XArray.
  * @index: Index into array.
  * @gfp: Memory allocation flags.
@@ -1496,33 +1475,32 @@
  * Ensures there is somewhere to store an entry at @index in the array.
  * If there is already something stored at @index, this function does
  * nothing.  If there was nothing there, the entry is marked as reserved.
- * Loads from @index will continue to see a %NULL pointer until a
- * subsequent store to @index.
+ * Loading from a reserved entry returns a %NULL pointer.
  *
  * If you do not use the entry that you have reserved, call xa_release()
  * or xa_erase() to free any unnecessary memory.
  *
- * Context: Process context.  Takes and releases the xa_lock, IRQ or BH safe
- * if specified in XArray flags.  May sleep if the @gfp flags permit.
+ * Context: Any context.  Expects the xa_lock to be held on entry.  May
+ * release the lock, sleep and reacquire the lock if the @gfp flags permit.
  * Return: 0 if the reservation succeeded or -ENOMEM if it failed.
  */
-int xa_reserve(struct xarray *xa, unsigned long index, gfp_t gfp)
+int __xa_reserve(struct xarray *xa, unsigned long index, gfp_t gfp)
 {
 	XA_STATE(xas, xa, index);
-	unsigned int lock_type = xa_lock_type(xa);
 	void *curr;
 
 	do {
-		xas_lock_type(&xas, lock_type);
 		curr = xas_load(&xas);
-		if (!curr)
+		if (!curr) {
 			xas_store(&xas, XA_ZERO_ENTRY);
-		xas_unlock_type(&xas, lock_type);
-	} while (xas_nomem(&xas, gfp));
+			if (xa_track_free(xa))
+				xas_clear_mark(&xas, XA_FREE_MARK);
+		}
+	} while (__xas_nomem(&xas, gfp));
 
 	return xas_error(&xas);
 }
-EXPORT_SYMBOL(xa_reserve);
+EXPORT_SYMBOL(__xa_reserve);
 
 #ifdef CONFIG_XARRAY_MULTI
 static void xas_set_range(struct xa_state *xas, unsigned long first,
@@ -1587,8 +1565,9 @@
 	do {
 		xas_lock(&xas);
 		if (entry) {
-			unsigned int order = (last == ~0UL) ? 64 :
-						ilog2(last + 1);
+			unsigned int order = BITS_PER_LONG;
+			if (last + 1)
+				order = __ffs(last + 1);
 			xas_set_order(&xas, last, order);
 			xas_create(&xas);
 			if (xas_error(&xas))
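
A worked check of the new order computation, assuming __ffs() returns the
number of trailing zero bits for a non-zero argument:

    /*
     * last = 0xffff:  __ffs(0x10000) = 16 -> block 0x0000-0xffff, aligned
     * last = 0xbfff:  __ffs(0x0c000) = 14 -> block 0x8000-0xbfff, aligned
     *                 (ilog2(last + 1) would have claimed order 15, but
     *                  0x4000-0xbfff is not a naturally aligned block)
     * last = ~0UL:    last + 1 overflows to 0, so the BITS_PER_LONG
     *                 default is kept
     */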
@@ -1662,7 +1641,7 @@
  * @index: Index of entry.
  * @mark: Mark number.
  *
- * Attempting to set a mark on a NULL entry does not succeed.
+ * Attempting to set a mark on a %NULL entry does not succeed.
  *
  * Context: Any context.  Expects xa_lock to be held on entry.
  */
@@ -1674,7 +1653,7 @@
 	if (entry)
 		xas_set_mark(&xas, mark);
 }
-EXPORT_SYMBOL_GPL(__xa_set_mark);
+EXPORT_SYMBOL(__xa_set_mark);
 
 /**
  * __xa_clear_mark() - Clear this mark on this entry while locked.
@@ -1692,7 +1671,7 @@
 	if (entry)
 		xas_clear_mark(&xas, mark);
 }
-EXPORT_SYMBOL_GPL(__xa_clear_mark);
+EXPORT_SYMBOL(__xa_clear_mark);
 
 /**
  * xa_get_mark() - Inquire whether this mark is set on this entry.
@@ -1732,7 +1711,7 @@
  * @index: Index of entry.
  * @mark: Mark number.
  *
- * Attempting to set a mark on a NULL entry does not succeed.
+ * Attempting to set a mark on a %NULL entry does not succeed.
  *
  * Context: Process context.  Takes and releases the xa_lock.
  */
@@ -1829,6 +1808,8 @@
 			entry = xas_find_marked(&xas, max, filter);
 		else
 			entry = xas_find(&xas, max);
+		if (xas.xa_node == XAS_BOUNDS)
+			break;
 		if (xas.xa_shift) {
 			if (xas.xa_index & ((1UL << xas.xa_shift) - 1))
 				continue;
@@ -1899,7 +1880,7 @@
  *
  * The @filter may be an XArray mark value, in which case entries which are
  * marked with that mark will be copied.  It may also be %XA_PRESENT, in
- * which case all entries which are not NULL will be copied.
+ * which case all entries which are not %NULL will be copied.
  *
  * The entries returned may not represent a snapshot of the XArray at a
  * moment in time.  For example, if another thread stores to index 5, then