rcu: Use for_each_leaf_node_cpu() in online CPU iteration

Though mostly identical, ->qsmaskinit (a.k.a. rcu_rnp_online_cpus()) is
sometimes sparser than the corresponding part of cpu_possible_mask for
an RCU leaf node. So use for_each_leaf_node_cpu() in
rcu_boost_kthread_setaffinity() to iterate over the online mask
directly, which saves the per-CPU mask check.
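
For illustration only, the difference between the two iteration styles
can be modelled in userspace C as below. This is a sketch, not the
kernel code: struct leaf_node, affinity_by_range() and
affinity_by_mask() are hypothetical names, the real macros operate on
struct rcu_node and cpumasks, and __builtin_ctzl() is a GCC/Clang
builtin assumed here for finding the lowest set bit.

```c
/* Hypothetical stand-in for the leaf rcu_node fields that matter here. */
struct leaf_node {
	int grplo;	/* lowest CPU number covered by this leaf */
	int grphi;	/* highest CPU number covered by this leaf */
};

/*
 * Old style: walk every CPU in the leaf's range and test its bit in
 * mask (bit 0 corresponds to CPU grplo).
 */
static unsigned long affinity_by_range(const struct leaf_node *rnp,
				       unsigned long mask, int outgoingcpu)
{
	unsigned long cm = 0;
	int cpu;

	for (cpu = rnp->grplo; cpu <= rnp->grphi; cpu++)
		if ((mask & (1UL << (cpu - rnp->grplo))) &&
		    cpu != outgoingcpu)
			cm |= 1UL << cpu;
	return cm;
}

/*
 * New style: walk only the bits that are set in mask, so no per-CPU
 * mask test is needed inside the loop body.
 */
static unsigned long affinity_by_mask(const struct leaf_node *rnp,
				      unsigned long mask, int outgoingcpu)
{
	unsigned long cm = 0;
	unsigned long bits;

	/* bits &= bits - 1 clears the lowest set bit on each pass. */
	for (bits = mask; bits; bits &= bits - 1) {
		int cpu = rnp->grplo + __builtin_ctzl(bits);

		if (cpu != outgoingcpu)
			cm |= 1UL << cpu;
	}
	return cm;
}
```

In this model both helpers build the same affinity set for any mask
confined to the leaf's range; the second one simply visits fewer
candidates when the mask is sparse.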

Signed-off-by: Boqun Feng <boqun.feng@gmail.com>
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 69c6eb2..93cebd9 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -1175,10 +1175,11 @@ static void rcu_boost_kthread_setaffinity(struct rcu_node *rnp, int outgoingcpu)
 		return;
 	if (!zalloc_cpumask_var(&cm, GFP_KERNEL))
 		return;
-	for_each_leaf_node_possible_cpu(rnp, cpu)
-		if ((mask & leaf_node_cpu_bit(rnp, cpu)) &&
-		    cpu != outgoingcpu)
+
+	for_each_leaf_node_cpu(rnp, mask, cpu)
+		if (cpu != outgoingcpu)
 			cpumask_set_cpu(cpu, cm);
+
 	if (cpumask_weight(cm) == 0)
 		cpumask_setall(cm);
 	set_cpus_allowed_ptr(t, cm);