rcu: Merge rcu_seq_done_exact() logic into rcu_seq_done()

The rcu_seq_done() API has a large "false-negative" window of size
ULONG_MAX/2: if the sequence counter advances more than ULONG_MAX/2 past
the requested value, which is possible after a wraparound, it reports
that the GP has not completed even though it has.

One place where this might cause a problem is SRCU:

poll_state_synchronize_srcu() uses rcu_seq_done(), unlike
poll_state_synchronize_rcu(), which uses rcu_seq_done_exact().

rcu_seq_done_exact() makes more sense for the polling APIs, as there is
a higher chance of a significant delay between the get_state..() and
poll_state..() calls.

Another place where this seems scary is a wakeup condition, where a
false negative causes a missed wakeup. An example from tree-nocb:

        swait_event_interruptible_exclusive(
            rnp->nocb_gp_wq[rcu_seq_ctr(wait_gp_seq) & 0x1],
            rcu_seq_done(&rnp->gp_seq, wait_gp_seq) ||
            !READ_ONCE(my_rdp->nocb_gp_sleep));

The shorter false-negative window of rcu_seq_done_exact() improves
robustness: it narrows the window to only ~2-3 GPs, versus ULONG_MAX/2.
The change also results in a negative code delta and could avoid future
issues where rcu_seq_done() reports false negatives for too long.
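
The difference between the two checks can be sketched in a small
userspace model. This is not the kernel code: seq_done_old(),
seq_done_exact() and seq_done_demo() are hypothetical names, the two-bit
state field is an assumed layout, and the guard band of
2 * SEQ_STATE_MASK + 1 is an assumption about the existing
rcu_seq_done_exact() helper that should be checked against the tree.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Assumed layout: the low two bits hold GP state, the rest is the counter. */
#define SEQ_CTR_SHIFT   2
#define SEQ_STATE_MASK  ((1UL << SEQ_CTR_SHIFT) - 1)

/* Wrap-aware unsigned comparisons, modeled on the kernel's macros. */
#define ULONG_CMP_GE(a, b) (ULONG_MAX / 2 >= (a) - (b))
#define ULONG_CMP_LT(a, b) (ULONG_MAX / 2 < (a) - (b))

/* Model of the old check: false whenever cur is more than ULONG_MAX/2 past s. */
static bool seq_done_old(unsigned long cur, unsigned long s)
{
	return ULONG_CMP_GE(cur, s);
}

/*
 * Model of the exact check: additionally report "done" when cur is
 * outside a small guard band below s, which can only happen if the
 * counter went far past s and wrapped.
 */
static bool seq_done_exact(unsigned long cur, unsigned long s)
{
	return ULONG_CMP_GE(cur, s) ||
	       ULONG_CMP_LT(cur, s - (2 * SEQ_STATE_MASK + 1));
}

static void seq_done_demo(void)
{
	unsigned long s = 104;                     /* snapshot from "get state" */
	unsigned long far = s + ULONG_MAX / 2 + 1; /* counter long after the snapshot */

	/* Counter slightly behind the snapshot: both report "not done". */
	assert(!seq_done_old(100, s));
	assert(!seq_done_exact(100, s));

	/* Counter reached the snapshot: both report "done". */
	assert(seq_done_old(s, s));
	assert(seq_done_exact(s, s));

	/* Very stale snapshot: the old check gives a false negative. */
	assert(!seq_done_old(far, s));
	assert(seq_done_exact(far, s));
}
```

In this model the exact check can only return a false negative while
the counter is within 2 * SEQ_STATE_MASK + 1 values below the snapshot,
instead of anywhere in a ULONG_MAX/2 range.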

One downside of this change is slightly more computation, but it is
trivial and I think it is worth it.

rcutorture runs of all scenarios for 15 minutes passed. All users were
inspected thoroughly to confirm that the change is safe for them. The
inspection also shows that the result is more robust, so this is more
than a cleanup.

Signed-off-by: Joel Fernandes <joelagnelf@nvidia.com>
2 files changed