| From foo@baz Tue Aug 14 16:14:56 CEST 2018 |
| From: Thomas Gleixner <tglx@linutronix.de> |
| Date: Tue, 5 Jun 2018 14:00:11 +0200 |
| Subject: x86/apic: Ignore secondary threads if nosmt=force |
| |
| From: Thomas Gleixner <tglx@linutronix.de> |
| |
| commit 2207def700f902f169fc237b717252c326f9e464 upstream |
| |
| nosmt on the kernel command line merely prevents the onlining of the |
| secondary SMT siblings. |
| |
| nosmt=force makes the APIC detection code ignore the secondary SMT siblings |
| completely, so they do not even show up as possible CPUs. That reduces the |
| amount of memory allocated for per-CPU variables and keeps other resources |
| from being sized larger than necessary. |
| |
| This is not fully equivalent to disabling SMT in the BIOS because the low |
| level SMT enabling in the BIOS can result in partitioning of resources |
| between the siblings, which is not undone by just ignoring them. Some CPUs |
| can use the full resources when their sibling is not onlined, but this |
| depends on the CPU family and model, and it's not well documented whether |
| this applies to all partitioned resources. That means that, depending on |
| the workload, disabling SMT in the BIOS might result in better performance. |
| |
| Linus' analysis of the Intel manual: |
| |
| The Intel optimization manual is not very clear on what the partitioning |
| rules are. |
| |
| I find: |
| |
| "In general, the buffers for staging instructions between major pipe |
| stages are partitioned. These buffers include µop queues after the |
| execution trace cache, the queues after the register rename stage, the |
| reorder buffer which stages instructions for retirement, and the load |
| and store buffers. |
| |
| In the case of load and store buffers, partitioning also provided an |
| easier implementation to maintain memory ordering for each logical |
| processor and detect memory ordering violations" |
| |
| but some of that partitioning may be relaxed if the HT thread is "not |
| active": |
| |
| "In Intel microarchitecture code name Sandy Bridge, the micro-op queue |
| is statically partitioned to provide 28 entries for each logical |
| processor, irrespective of software executing in single thread or |
| multiple threads. If one logical processor is not active in Intel |
| microarchitecture code name Ivy Bridge, then a single thread executing |
| on that processor core can use the 56 entries in the micro-op queue" |
| |
| but I do not know what "not active" means, and how dynamic it is. Some of |
| that partitioning may be entirely static and depend on the early BIOS |
| disabling of HT, and even if we park the cores, the resources will just be |
| wasted. |
| |
| Signed-off-by: Thomas Gleixner <tglx@linutronix.de> |
| Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> |
| Acked-by: Ingo Molnar <mingo@kernel.org> |
| Signed-off-by: David Woodhouse <dwmw@amazon.co.uk> |
| Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> |
| --- |
| arch/x86/include/asm/apic.h | 2 ++ |
| arch/x86/kernel/acpi/boot.c | 3 ++- |
| arch/x86/kernel/apic/apic.c | 19 +++++++++++++++++++ |
| 3 files changed, 23 insertions(+), 1 deletion(-) |
| |
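The check described in the commit message can be modelled in user space. What follows is a minimal sketch, not the kernel code: the enum, highest_bit(), is_primary_thread(), id_disabled() and count_possible() are hypothetical names introduced for illustration. The mask computation is also an assumption (the diff only shows the final `return !(apicid & mask);`); it supposes that the SMT thread number sits in the low bits of the APIC ID.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's SMT control states. */
enum smt_control_sketch {
	SMT_ENABLED,
	SMT_DISABLED,		/* "nosmt": siblings are enumerated but kept offline */
	SMT_FORCE_DISABLED,	/* "nosmt=force": siblings are ignored at enumeration */
};

/* Index of the highest set bit, i.e. floor(log2(v)); v must be non-zero. */
static unsigned int highest_bit(unsigned int v)
{
	unsigned int bit = 0;

	while ((v >>= 1) != 0)
		bit++;
	return bit;
}

/*
 * Assumption: the low bits of the APIC ID hold the SMT thread number,
 * so a primary thread has all of those bits clear.
 */
static bool is_primary_thread(unsigned int apicid, unsigned int siblings)
{
	unsigned int mask;

	assert(siblings >= 1);
	if (siblings == 1)
		return true;
	mask = (1U << highest_bit(siblings)) - 1;
	return !(apicid & mask);
}

/* Mirrors the shape of apic_id_disabled() in the patch. */
static bool id_disabled(enum smt_control_sketch ctl, unsigned int apicid,
			unsigned int siblings)
{
	return ctl == SMT_FORCE_DISABLED &&
	       !is_primary_thread(apicid, siblings);
}

/* Count the APIC IDs that would survive enumeration in this model. */
static size_t count_possible(enum smt_control_sketch ctl,
			     const unsigned int *ids, size_t n,
			     unsigned int siblings)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++)
		if (!id_disabled(ctl, ids[i], siblings))
			count++;
	return count;
}
```

In this model, plain nosmt (SMT_DISABLED) leaves every APIC ID enumerated, while nosmt=force (SMT_FORCE_DISABLED) drops the secondary siblings before they can be counted as possible CPUs.
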
| --- a/arch/x86/include/asm/apic.h |
| +++ b/arch/x86/include/asm/apic.h |
| @@ -636,8 +636,10 @@ extern int default_check_phys_apicid_pre |
| |
| #ifdef CONFIG_SMP |
| bool apic_id_is_primary_thread(unsigned int id); |
| +bool apic_id_disabled(unsigned int id); |
| #else |
| static inline bool apic_id_is_primary_thread(unsigned int id) { return false; } |
| +static inline bool apic_id_disabled(unsigned int id) { return false; } |
| #endif |
| |
| extern void irq_enter(void); |
| --- a/arch/x86/kernel/acpi/boot.c |
| +++ b/arch/x86/kernel/acpi/boot.c |
| @@ -177,7 +177,8 @@ static int acpi_register_lapic(int id, u |
| } |
| |
| if (!enabled) { |
| - ++disabled_cpus; |
| + if (!apic_id_disabled(id)) |
| + ++disabled_cpus; |
| return -EINVAL; |
| } |
| |
| --- a/arch/x86/kernel/apic/apic.c |
| +++ b/arch/x86/kernel/apic/apic.c |
| @@ -2056,6 +2056,16 @@ bool apic_id_is_primary_thread(unsigned |
| return !(apicid & mask); |
| } |
| |
| +/** |
| + * apic_id_disabled - Check whether APIC ID is disabled via SMT control |
| + * @id: APIC ID to check |
| + */ |
| +bool apic_id_disabled(unsigned int id) |
| +{ |
| + return (cpu_smt_control == CPU_SMT_FORCE_DISABLED && |
| + !apic_id_is_primary_thread(id)); |
| +} |
| + |
| /* |
| * Should use this API to allocate logical CPU IDs to keep nr_logical_cpuids |
| * and cpuid_to_apicid[] synchronized. |
| @@ -2151,6 +2161,15 @@ int generic_processor_info(int apicid, i |
| return -EINVAL; |
| } |
| |
| + /* |
| + * If SMT is force disabled and the APIC ID belongs to |
| + * a secondary thread, ignore it. |
| + */ |
| + if (apic_id_disabled(apicid)) { |
| + pr_info_once("Ignoring secondary SMT threads\n"); |
| + return -EINVAL; |
| + } |
| + |
| if (apicid == boot_cpu_physical_apicid) { |
| /* |
| * x86_bios_cpu_apicid is required to have processors listed |