.. SPDX-License-Identifier: GPL-2.0

==========================
The Linux Microcode Loader
==========================

:Authors: - Fenghua Yu <fenghua.yu@intel.com>
          - Borislav Petkov <bp@suse.de>
          - Ashok Raj <ashok.raj@intel.com>

The kernel has an x86 microcode loading facility which provides
microcode loading methods in the OS. Potential use cases are
updating the microcode on platforms beyond the OEM End-Of-Life support,
and updating the microcode on long-running systems without rebooting.

The loader supports three loading methods: early loading from the
initrd, late loading at runtime and microcode built into the kernel
image.

Early load microcode
====================

The kernel can update microcode very early during boot. Loading
microcode early can fix CPU issues before they are observed during
kernel boot time.

The microcode is stored in an initrd file. During boot, it is read from
the initrd and loaded into the CPU cores.

The format of the combined initrd image is microcode in (uncompressed)
cpio format followed by the (possibly compressed) initrd image. The
loader parses the combined initrd image during boot.

The microcode files in cpio name space are:

on Intel:
  kernel/x86/microcode/GenuineIntel.bin
on AMD  :
  kernel/x86/microcode/AuthenticAMD.bin
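
To check whether an existing initrd already carries early-load microcode,
the leading cpio archive can be listed and filtered for the two file names
above. The following is a minimal sketch and not part of the loader: it
assumes GNU cpio is installed, and ucode_entries() is a hypothetical helper
name. ::

  #!/bin/bash

  # ucode_entries (hypothetical helper): read a cpio listing on stdin and
  # print only the microcode file names the early loader looks for.
  # An optional leading "./" is tolerated.
  ucode_entries() {
      grep -E '^(\./)?kernel/x86/microcode/(GenuineIntel|AuthenticAMD)\.bin$'
  }

  # Usage, assuming GNU cpio: list the first (uncompressed) archive of
  # the initrd given as $1 and filter the listing.
  if [ -n "${1:-}" ]; then
      cpio -it < "$1" 2>/dev/null | ucode_entries
  fi

Empty output suggests that no microcode archive was found at the start of
the image.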
|  |  | 
During BSP (BootStrapping Processor) boot (pre-SMP), the kernel
scans the initrd for the microcode file. If microcode matching the
CPU is found, it is applied on the BSP and later on all APs
(Application Processors).

The loader also saves the matching microcode for the CPU in memory.
Thus, the cached microcode patch is applied when CPUs resume from a
sleep state.
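
Whether an update took effect can be checked from user space: on x86,
/proc/cpuinfo reports the running revision in a "microcode" field for each
logical CPU. A minimal sketch, where ucode_revisions() is a hypothetical
helper name and the optional file argument exists only so that the function
can be tried on saved output::

  #!/bin/bash

  # ucode_revisions (hypothetical helper): print the distinct microcode
  # revisions found in a cpuinfo-style file, default /proc/cpuinfo.
  ucode_revisions() {
      awk -F': *' '/^microcode/ { print $2 }' "${1:-/proc/cpuinfo}" | sort -u
  }

  if [ -r /proc/cpuinfo ]; then
      ucode_revisions
  fi

More than one line of output means that the logical CPUs are not all
running the same revision.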
|  |  | 
Here's a crude example of how to prepare an initrd with microcode. This
is normally done automatically by the distribution when recreating the
initrd, so you don't really have to do it yourself; it is documented
here for future reference only.
::

  #!/bin/bash

  if [ -z "$1" ]; then
      echo "You need to supply an initrd file"
      exit 1
  fi

  # Resolve to an absolute path: the script changes directory below.
  INITRD="$(readlink -f "$1")"

  DSTDIR=kernel/x86/microcode
  TMPDIR=/tmp/initrd

  rm -rf "$TMPDIR"

  mkdir "$TMPDIR"
  cd "$TMPDIR" || exit 1
  mkdir -p "$DSTDIR"

  # Concatenate the vendor blobs into the file names the early loader expects.
  if [ -d /lib/firmware/amd-ucode ]; then
      cat /lib/firmware/amd-ucode/microcode_amd*.bin > "$DSTDIR/AuthenticAMD.bin"
  fi

  if [ -d /lib/firmware/intel-ucode ]; then
      cat /lib/firmware/intel-ucode/* > "$DSTDIR/GenuineIntel.bin"
  fi

  # Uncompressed cpio archive first, original initrd appended after it.
  find . | cpio -o -H newc > ../ucode.cpio
  cd ..
  mv "$INITRD" "$INITRD.orig"
  cat ucode.cpio "$INITRD.orig" > "$INITRD"

  rm -rf "$TMPDIR"


The system needs to have the microcode packages installed into
/lib/firmware, or you need to fix up the paths above if yours are
somewhere else and/or you've downloaded them directly from the processor
vendor's site.
|  |  | 
Late loading
============

You simply install the microcode packages your distro supplies and
run::

  # echo 1 > /sys/devices/system/cpu/microcode/reload

as root.

The loading mechanism looks for microcode blobs in
/lib/firmware/{intel-ucode,amd-ucode}. The default distro installation
packages already put them there.
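
On Intel, the blobs in intel-ucode are named after the CPU signature as
family-model-stepping, each printed as a two-digit hexadecimal number (for
example 06-3a-09). A minimal sketch that derives the expected file name from
the decimal values shown in /proc/cpuinfo follows; intel_ucode_name() is a
hypothetical helper name, and AMD uses a different, per-family naming scheme
which is not covered here::

  #!/bin/bash

  # intel_ucode_name (hypothetical helper): print the intel-ucode file
  # name for a CPU given its family, model and stepping in decimal.
  intel_ucode_name() {
      printf '%02x-%02x-%02x\n' "$1" "$2" "$3"
  }

  # Family 6, model 58, stepping 9 prints 06-3a-09
  intel_ucode_name 6 58 9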
|  |  | 
Since kernel 5.19, late loading is not enabled by default.

The /dev/cpu/microcode method has been removed in 5.19.
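
Because late loading may be disabled in the running kernel, a script should
not assume that the reload file can be written. A minimal sketch of a
guarded reload follows. late_reload() is a hypothetical helper name, and the
optional path argument exists only so that the function can be exercised
against an ordinary file. That the sysfs file is absent when late loading is
not built in is an assumption which should be checked against the kernel in
use::

  #!/bin/bash

  # late_reload (hypothetical helper): trigger a late load only if the
  # reload file is present and writable. Returns non-zero otherwise.
  late_reload() {
      local reload="${1:-/sys/devices/system/cpu/microcode/reload}"

      if [ ! -w "$reload" ]; then
          echo "cannot write $reload: late loading unavailable or not root" >&2
          return 1
      fi

      echo 1 > "$reload"
  }

Calling late_reload without arguments as root performs the same write as
the echo command in the late loading section.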
|  |  | 
Why is late loading dangerous?
==============================

Synchronizing all CPUs
----------------------

The microcode engine which receives the microcode update is shared
between the two logical threads in an SMT system. Therefore, when
the update is executed on one SMT thread of the core, the sibling
"automatically" gets the update.

Since the microcode can "simulate" MSRs too, while the microcode update
is in progress, those simulated MSRs transiently cease to exist. This
can lead to unpredictable results if the SMT sibling thread happens to
be in the middle of an access to such an MSR. The usual observation is
that such MSR accesses cause #GPs to be raised to signal that the former
are not present.

The disappearing MSRs are just one common issue which is being observed.
Any other instruction that's being patched and gets concurrently
executed by the other SMT sibling can also result in similar,
unpredictable behavior.

To eliminate this case, a stop_machine()-based CPU synchronization was
introduced as a way to guarantee that all logical CPUs will not execute
any code but just wait in a spin loop, polling an atomic variable.

While this took care of device or external interrupts, IPIs including
LVT ones, such as CMCI etc, it cannot address other special interrupts
that can't be shut off. Those are Machine Check (#MC), System Management
(#SMI) and Non-Maskable interrupts (#NMI).

Machine Checks
--------------

Machine Checks (#MC) are non-maskable. There are two kinds of MCEs:
fatal, unrecoverable MCEs and recoverable MCEs. While unrecoverable
errors are always fatal, recoverable errors which happen in kernel
context are also treated as fatal by the kernel.

On certain Intel machines, MCEs are also broadcast to all threads in a
system. If one thread is in the middle of executing WRMSR, an MCE will be
taken at the end of the flow. Either way, the other threads will wait for
the thread performing the wrmsr(0x79) to rendezvous in the MCE handler,
and the system will eventually shut down if any of the threads fails to
check in to the MCE rendezvous.

To be paranoid and get predictable behavior, the OS can choose to set
MCG_STATUS.MCIP. Since at most one MCE can be in progress in a system,
if an MCE is signaled, the above condition will be promoted to a system
reset automatically. The OS can turn off MCIP at the end of the update
for that core.
|  |  | 
System Management Interrupt
---------------------------

SMIs are also broadcast to all CPUs in the platform. Microcode update
requests exclusive access to the core before writing to MSR 0x79. So if
it does happen that one thread is in the WRMSR flow and the second gets
an SMI, that thread will be stopped at the first instruction in the SMI
handler.

Since the secondary thread is stopped at the first instruction in the SMI
handler, there is very little chance that it would be in the middle of
executing an instruction being patched. Plus, the OS has no way to stop
SMIs from happening.

Non-Maskable Interrupts
-----------------------

When thread0 of a core is doing the microcode update, if thread1 is
pulled into an NMI, that can cause unpredictable behavior due to the
reasons above.

The OS can choose a variety of methods to avoid running into this
situation.


Is the microcode suitable for late loading?
-------------------------------------------

Late loading is done when the system is fully operational and running
real workloads. Late loading behavior depends on what the base patch on
the CPU is before upgrading to the new patch.

This is true for Intel CPUs.

Consider, for example, a CPU which has patch level 1 and the update is
to patch level 3.

Between patch1 and patch3, patch2 might have deprecated a software-visible
feature.

This is unacceptable if software is even potentially using that feature.
For instance, say MSR_X is no longer available after an update:
accessing that MSR will then cause a #GP fault.

Basically there is no way to declare a new microcode update suitable
for late-loading. This is another one of the problems that caused late
loading to not be enabled by default.
|  |  | 
Builtin microcode
=================

The loader also supports loading of builtin microcode supplied through
the regular builtin firmware method CONFIG_EXTRA_FIRMWARE. Only 64-bit is
currently supported.

Here's an example::

  CONFIG_EXTRA_FIRMWARE="intel-ucode/06-3a-09 amd-ucode/microcode_amd_fam15h.bin"
  CONFIG_EXTRA_FIRMWARE_DIR="/lib/firmware"

This basically means, you have the following tree structure locally::

  /lib/firmware/
  |-- amd-ucode
  ...
  |   |-- microcode_amd_fam15h.bin
  ...
  |-- intel-ucode
  ...
  |   |-- 06-3a-09
  ...

so that the build system can find those files and integrate them into
the final kernel image. The early loader finds them and applies them.

Needless to say, this method is not the most flexible one because it
requires rebuilding the kernel each time updated microcode from the CPU
vendor is available.
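
Before rebuilding the kernel, it can help to check that every file named in
CONFIG_EXTRA_FIRMWARE exists below CONFIG_EXTRA_FIRMWARE_DIR. A minimal
sketch follows; check_extra_firmware() is a hypothetical helper name, and
the two values are passed in by hand rather than parsed from a .config::

  #!/bin/bash

  # check_extra_firmware (hypothetical helper): $1 is the firmware
  # directory, $2 the space-separated list of blobs. Prints every blob
  # that is missing and returns non-zero if any is.
  check_extra_firmware() {
      local dir="$1" blob missing=0

      # $2 is left unquoted on purpose so that it splits on spaces.
      for blob in $2; do
          if [ ! -f "$dir/$blob" ]; then
              echo "missing: $dir/$blob"
              missing=1
          fi
      done

      return "$missing"
  }

For the example configuration above, the call would be
check_extra_firmware /lib/firmware "intel-ucode/06-3a-09 amd-ucode/microcode_amd_fam15h.bin".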