file_ref_t: don't accidently mark recycled files dead

When a file is recycled the following race exists:

CPU1                                                    CPU2
file_ref_get(file)
// the result is negative >= FILE_REF_RELEASED
atomic_long_add_negative(1, &ref->refcnt)
-> __file_ref_get()

   // Sees cnt >> FILE_REF_RELEASED
   cnt = atomic_long_read(&ref->refcnt);

                                                        kmem_cache_free()
                                                        file = kmem_cache_alloc()
                                                        file_ref_init(&ref->refcnt, 1);

   // Here we mark someone else's file dead...
   atomic_long_set(&ref->refcnt, FILE_REF_DEAD);

                                                        close(fd)
                                                        file = file_close_fd_locked()
                                                        filp_flush()
                                                        // splats the first time
                                                        CHECK_DATA_CORRUPTION(file_count(file) == 0)

                                                        // splats a second time becaues it sees FILE_REF_DEAD
                                                        file_ref_put(&ref->refcnt);

Quoting Linus:

I think we should just make file_ref_get() do a simple
>
>        return !atomic_long_add_negative(1, &ref->refcnt));
>
> and nothing else. Yes, multiple CPU's can race, and you can increment
> more than once, but the gap - even on 32-bit - between DEAD and
> becoming close to REF_RELEASED is so big that we simply don't care.
> That's the point of having a gap.

Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202410151043.5d224a27-oliver.sang@intel.com
Closes: https://lore.kernel.org/all/202410151611.f4cd71f2-oliver.sang@intel.com
Signed-off-by: Christian Brauner <brauner@kernel.org>
2 files changed