MCE: Add Action-Required support
Implement core MCA recovery. This is used for errors
that happen in the current execution context.
The kernel has to first pass the error information
to a function running on the current process stack.
This is done using a new work flag and then executing
the code after the exception through do_notify_resume.
There hwpoison is allowed to sleep and can try to recover
from the error.
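
The flag handoff described above can be modelled in user space
roughly as follows. This is only a sketch: every name ending in
_sketch is hypothetical and is not the identifier used by the
patch, and the "recovery" is a counter standing in for the real
hwpoison call.

```c
#include <stdint.h>

/* Hypothetical pending-work bit, standing in for the new work flag. */
#define TIF_MCE_NOTIFY_SKETCH (1u << 0)

/* Hypothetical, minimal stand-in for per-task state. */
struct task_sketch {
    unsigned int flags;  /* pending-work bits checked on return to user mode */
    int recovered;       /* number of errors handled in process context */
};

/* Exception context: must not sleep, so it only marks work as pending. */
static void machine_check_sketch(struct task_sketch *t)
{
    t->flags |= TIF_MCE_NOTIFY_SKETCH;
}

/*
 * Process context (the role do_notify_resume plays): runs on the
 * task's own stack after the exception, where sleeping is allowed.
 */
static void notify_resume_sketch(struct task_sketch *t)
{
    if (t->flags & TIF_MCE_NOTIFY_SKETCH) {
        t->flags &= ~TIF_MCE_NOTIFY_SKETCH;
        t->recovered++;  /* stand-in for the hwpoison recovery attempt */
    }
}
```

Calling notify_resume_sketch() without a pending flag does
nothing; after machine_check_sketch() it runs the recovery once
and clears the flag.
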
To pass the error information around we need to use a
field in the current process. The old way of handling
this (a per-CPU buffer) does not work because the task
could be switched to another CPU before reaching the
handler code.
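
A small user-space model of why the per-CPU buffer breaks while
a per-task field does not. All names here are hypothetical and
chosen for illustration; the field name mce_addr in particular
is an assumption, not necessarily what the patch adds.

```c
#include <stdint.h>

#define NR_CPUS_SKETCH 2

/* Hypothetical task: tracks which CPU it runs on and its own error slot. */
struct mce_task_sketch {
    int cpu;
    uint64_t mce_addr;   /* per-task field: travels with the task */
};

/* Old scheme: one error slot per CPU. */
static uint64_t percpu_mce_addr_sketch[NR_CPUS_SKETCH];

/* Exception time: record the error in both places for comparison. */
static void record_error_sketch(struct mce_task_sketch *t, uint64_t addr)
{
    percpu_mce_addr_sketch[t->cpu] = addr;
    t->mce_addr = addr;
}

/* The scheduler moves the task before the handler code runs. */
static void migrate_sketch(struct mce_task_sketch *t, int cpu)
{
    t->cpu = cpu;
}

/* Handler time: the per-CPU lookup uses whatever CPU we are on now. */
static uint64_t read_percpu_sketch(const struct mce_task_sketch *t)
{
    return percpu_mce_addr_sketch[t->cpu];
}

static uint64_t read_task_sketch(const struct mce_task_sketch *t)
{
    return t->mce_addr;
}
```

If an error is recorded on CPU 0 and the task then migrates to
CPU 1, read_percpu_sketch() looks at the wrong slot, while
read_task_sketch() still returns the recorded address.
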
For kernel recovery we only handle errors that happen in
code covered by the copy_*_user() exception tables, and
inject EFAULT.
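
The exception table lookup can be sketched like this. It is a
simplified model, not the kernel's implementation: the struct
layouts, the linear search, and setting the return register
directly to -EFAULT are all assumptions made for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: faulting instruction and its fixup address. */
struct extable_entry_sketch {
    uintptr_t insn;
    uintptr_t fixup;
};

/* Hypothetical register snapshot taken at exception time. */
struct regs_sketch {
    uintptr_t ip;   /* instruction pointer */
    long ax;        /* return-value register */
};

/* Linear search for brevity; returns NULL when ip is not covered. */
static const struct extable_entry_sketch *
search_extable_sketch(const struct extable_entry_sketch *tbl, size_t n,
                      uintptr_t ip)
{
    for (size_t i = 0; i < n; i++)
        if (tbl[i].insn == ip)
            return &tbl[i];
    return NULL;
}

/*
 * Returns 1 if the error hit a covered instruction and execution was
 * redirected to the fixup with -EFAULT, 0 if it cannot be recovered
 * this way (regs are left untouched in that case).
 */
static int mce_fixup_sketch(struct regs_sketch *regs,
                            const struct extable_entry_sketch *tbl, size_t n)
{
    const struct extable_entry_sketch *e =
        search_extable_sketch(tbl, n, regs->ip);

    if (!e)
        return 0;
    regs->ip = e->fixup;
    regs->ax = -EFAULT;
    return 1;
}
```

An error at an address with no table entry returns 0 here, which
corresponds to the case that is not handled by this path.
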
When the tolerance level is sufficiently high also
an unsafe oops-like do_exit() killing, which has some
FIXME: fix 386 handling of mce notify bit in entry_32.S after mce
Signed-off-by: Andi Kleen <firstname.lastname@example.org>
4 files changed