Merge branch 'iter-ubuf.2' into for-next
* iter-ubuf.2:
iov_iter: Mark copy_compat_iovec_from_user() noinline
diff --git a/Documentation/ABI/stable/sysfs-block b/Documentation/ABI/stable/sysfs-block
index 282de36..c57e5b7 100644
--- a/Documentation/ABI/stable/sysfs-block
+++ b/Documentation/ABI/stable/sysfs-block
@@ -336,18 +336,11 @@
Date: November 2016
Contact: linux-block@vger.kernel.org
Description:
- [RW] If polling is enabled, this controls what kind of polling
- will be performed. It defaults to -1, which is classic polling.
+ [RW] This was used to control what kind of polling will be
+ performed. It is now fixed to -1, which is classic polling.
In this mode, the CPU will repeatedly ask for completions
- without giving up any time. If set to 0, a hybrid polling mode
- is used, where the kernel will attempt to make an educated guess
- at when the IO will complete. Based on this guess, the kernel
- will put the process issuing IO to sleep for an amount of time,
- before entering a classic poll loop. This mode might be a little
- slower than pure classic polling, but it will be more efficient.
- If set to a value larger than 0, the kernel will put the process
- issuing IO to sleep for this amount of microseconds before
- entering classic polling.
+ without giving up any time.
+ <deprecated>
What: /sys/block/<disk>/queue/io_timeout
diff --git a/Documentation/admin-guide/index.rst b/Documentation/admin-guide/index.rst
index 0ad7e7e..09a563b 100644
--- a/Documentation/admin-guide/index.rst
+++ b/Documentation/admin-guide/index.rst
@@ -36,7 +36,6 @@
reporting-issues
reporting-regressions
- security-bugs
bug-hunting
bug-bisect
tainted-kernels
diff --git a/Documentation/admin-guide/reporting-issues.rst b/Documentation/admin-guide/reporting-issues.rst
index ec62151..2fd5a03 100644
--- a/Documentation/admin-guide/reporting-issues.rst
+++ b/Documentation/admin-guide/reporting-issues.rst
@@ -395,7 +395,7 @@
list of tracked regressions, to ensure it won't fall through the cracks.
What qualifies as security issue is left to your judgment. Consider reading
-Documentation/admin-guide/security-bugs.rst before proceeding, as it
+Documentation/process/security-bugs.rst before proceeding, as it
provides additional details how to best handle security issues.
An issue is a 'really severe problem' when something totally unacceptably bad
@@ -1269,7 +1269,7 @@
the report's text to these addresses; but on top of it put a small note where
you mention that you filed it with a link to the ticket.
-See Documentation/admin-guide/security-bugs.rst for more information.
+See Documentation/process/security-bugs.rst for more information.
Duties after the report went out
diff --git a/Documentation/block/inline-encryption.rst b/Documentation/block/inline-encryption.rst
index f9bf18e..90b7334 100644
--- a/Documentation/block/inline-encryption.rst
+++ b/Documentation/block/inline-encryption.rst
@@ -270,8 +270,7 @@
encryption need to create their own blk_crypto_profile for their request_queue,
and expose whatever functionality they choose. When a layered device wants to
pass a clone of that request to another request_queue, blk-crypto will
-initialize and prepare the clone as necessary; see
-``blk_crypto_insert_cloned_request()``.
+initialize and prepare the clone as necessary.
Interaction between inline encryption and blk integrity
=======================================================
diff --git a/Documentation/devicetree/bindings/pinctrl/qcom,sm8550-lpass-lpi-pinctrl.yaml b/Documentation/devicetree/bindings/pinctrl/qcom,sm8550-lpass-lpi-pinctrl.yaml
index 5e90051..8f60a91 100644
--- a/Documentation/devicetree/bindings/pinctrl/qcom,sm8550-lpass-lpi-pinctrl.yaml
+++ b/Documentation/devicetree/bindings/pinctrl/qcom,sm8550-lpass-lpi-pinctrl.yaml
@@ -96,9 +96,11 @@
2: Lower Slew rate (slower edges)
3: Reserved (No adjustments)
+ bias-bus-hold: true
bias-pull-down: true
bias-pull-up: true
bias-disable: true
+ input-enable: true
output-high: true
output-low: true
diff --git a/Documentation/process/howto.rst b/Documentation/process/howto.rst
index cb6abcb..deb8235 100644
--- a/Documentation/process/howto.rst
+++ b/Documentation/process/howto.rst
@@ -138,7 +138,7 @@
philosophy and is very important for people moving to Linux from
development on other Operating Systems.
- :ref:`Documentation/admin-guide/security-bugs.rst <securitybugs>`
+ :ref:`Documentation/process/security-bugs.rst <securitybugs>`
If you feel you have found a security problem in the Linux kernel,
please follow the steps in this document to help notify the kernel
developers, and help solve the issue.
diff --git a/Documentation/process/index.rst b/Documentation/process/index.rst
index d4b6217..565df59 100644
--- a/Documentation/process/index.rst
+++ b/Documentation/process/index.rst
@@ -35,6 +35,14 @@
kernel-enforcement-statement
kernel-driver-statement
+For security issues, see:
+
+.. toctree::
+ :maxdepth: 1
+
+ security-bugs
+ embargoed-hardware-issues
+
Other guides to the community that are of interest to most developers are:
.. toctree::
@@ -47,7 +55,6 @@
submit-checklist
kernel-docs
deprecated
- embargoed-hardware-issues
maintainers
researcher-guidelines
diff --git a/Documentation/process/researcher-guidelines.rst b/Documentation/process/researcher-guidelines.rst
index afc944e..9fcfed3 100644
--- a/Documentation/process/researcher-guidelines.rst
+++ b/Documentation/process/researcher-guidelines.rst
@@ -68,7 +68,7 @@
* Documentation/process/development-process.rst
* Documentation/process/submitting-patches.rst
* Documentation/admin-guide/reporting-issues.rst
-* Documentation/admin-guide/security-bugs.rst
+* Documentation/process/security-bugs.rst
Then send a patch (including a commit log with all the details listed
below) and follow up on any feedback from other developers.
diff --git a/Documentation/admin-guide/security-bugs.rst b/Documentation/process/security-bugs.rst
similarity index 100%
rename from Documentation/admin-guide/security-bugs.rst
rename to Documentation/process/security-bugs.rst
diff --git a/Documentation/process/stable-kernel-rules.rst b/Documentation/process/stable-kernel-rules.rst
index 2fd8aa5..51df119 100644
--- a/Documentation/process/stable-kernel-rules.rst
+++ b/Documentation/process/stable-kernel-rules.rst
@@ -39,7 +39,7 @@
Security patches should not be handled (solely) by the -stable review
process but should follow the procedures in
- :ref:`Documentation/admin-guide/security-bugs.rst <securitybugs>`.
+ :ref:`Documentation/process/security-bugs.rst <securitybugs>`.
For all other submissions, choose one of the following procedures
-----------------------------------------------------------------
diff --git a/Documentation/process/submitting-patches.rst b/Documentation/process/submitting-patches.rst
index 69ce64e..828997b 100644
--- a/Documentation/process/submitting-patches.rst
+++ b/Documentation/process/submitting-patches.rst
@@ -254,7 +254,7 @@
to security@kernel.org. For severe bugs, a short embargo may be considered
to allow distributors to get the patch out to users; in such cases,
obviously, the patch should not be sent to any public lists. See also
-Documentation/admin-guide/security-bugs.rst.
+Documentation/process/security-bugs.rst.
Patches that fix a severe bug in a released kernel should be directed
toward the stable maintainers by putting a line like this::
diff --git a/Documentation/translations/it_IT/admin-guide/security-bugs.rst b/Documentation/translations/it_IT/admin-guide/security-bugs.rst
index 18a5822..20994f4 100644
--- a/Documentation/translations/it_IT/admin-guide/security-bugs.rst
+++ b/Documentation/translations/it_IT/admin-guide/security-bugs.rst
@@ -1,6 +1,6 @@
.. include:: ../disclaimer-ita.rst
-:Original: :ref:`Documentation/admin-guide/security-bugs.rst <securitybugs>`
+:Original: :ref:`Documentation/process/security-bugs.rst <securitybugs>`
.. _it_securitybugs:
diff --git a/Documentation/translations/it_IT/process/submitting-patches.rst b/Documentation/translations/it_IT/process/submitting-patches.rst
index c2cfa09..167fce8 100644
--- a/Documentation/translations/it_IT/process/submitting-patches.rst
+++ b/Documentation/translations/it_IT/process/submitting-patches.rst
@@ -272,7 +272,7 @@
distribuzioni di prendere la patch e renderla disponibile ai loro utenti;
in questo caso, ovviamente, la patch non dovrebbe essere inviata su alcuna
lista di discussione pubblica. Leggete anche
-Documentation/admin-guide/security-bugs.rst.
+Documentation/process/security-bugs.rst.
Patch che correggono bachi importanti su un kernel già rilasciato, dovrebbero
essere inviate ai manutentori dei kernel stabili aggiungendo la seguente riga::
diff --git a/Documentation/translations/ja_JP/howto.rst b/Documentation/translations/ja_JP/howto.rst
index 9b0b343..8d856eb 100644
--- a/Documentation/translations/ja_JP/howto.rst
+++ b/Documentation/translations/ja_JP/howto.rst
@@ -167,7 +167,7 @@
このドキュメントは Linux 開発の思想を理解するのに非常に重要です。
そして、他のOSでの開発者が Linux に移る時にとても重要です。
- :ref:`Documentation/admin-guide/security-bugs.rst <securitybugs>`
+ :ref:`Documentation/process/security-bugs.rst <securitybugs>`
もし Linux カーネルでセキュリティ問題を発見したように思ったら、こ
のドキュメントのステップに従ってカーネル開発者に連絡し、問題解決を
支援してください。
diff --git a/Documentation/translations/ko_KR/howto.rst b/Documentation/translations/ko_KR/howto.rst
index 969e91a..34f1489 100644
--- a/Documentation/translations/ko_KR/howto.rst
+++ b/Documentation/translations/ko_KR/howto.rst
@@ -157,7 +157,7 @@
리눅스로 전향하는 사람들에게는 매우 중요하다.
- :ref:`Documentation/admin-guide/security-bugs.rst <securitybugs>`
+ :ref:`Documentation/process/security-bugs.rst <securitybugs>`
여러분들이 리눅스 커널의 보안 문제를 발견했다고 생각한다면 이 문서에
나온 단계에 따라서 커널 개발자들에게 알리고 그 문제를 해결할 수 있도록
도와 달라.
diff --git a/Documentation/translations/sp_SP/howto.rst b/Documentation/translations/sp_SP/howto.rst
index f9818d6..f162973 100644
--- a/Documentation/translations/sp_SP/howto.rst
+++ b/Documentation/translations/sp_SP/howto.rst
@@ -135,7 +135,7 @@
de Linux y es muy importante para las personas que se mudan a Linux
tras desarrollar otros sistemas operativos.
- :ref:`Documentation/admin-guide/security-bugs.rst <securitybugs>`
+ :ref:`Documentation/process/security-bugs.rst <securitybugs>`
Si cree que ha encontrado un problema de seguridad en el kernel de
Linux, siga los pasos de este documento para ayudar a notificar a los
desarrolladores del kernel y ayudar a resolver el problema.
diff --git a/Documentation/translations/sp_SP/process/submitting-patches.rst b/Documentation/translations/sp_SP/process/submitting-patches.rst
index bf95ceb..c2757d9 100644
--- a/Documentation/translations/sp_SP/process/submitting-patches.rst
+++ b/Documentation/translations/sp_SP/process/submitting-patches.rst
@@ -276,7 +276,7 @@
poco de discreción y permitir que los distribuidores entreguen el parche a
los usuarios; en esos casos, obviamente, el parche no debe enviarse a
ninguna lista pública. Revise también
-Documentation/admin-guide/security-bugs.rst.
+Documentation/process/security-bugs.rst.
Los parches que corrigen un error grave en un kernel en uso deben dirigirse
hacia los maintainers estables poniendo una línea como esta::
diff --git a/Documentation/translations/zh_CN/admin-guide/security-bugs.rst b/Documentation/translations/zh_CN/admin-guide/security-bugs.rst
index b812039..d6b8f8a 100644
--- a/Documentation/translations/zh_CN/admin-guide/security-bugs.rst
+++ b/Documentation/translations/zh_CN/admin-guide/security-bugs.rst
@@ -1,6 +1,6 @@
.. include:: ../disclaimer-zh_CN.rst
-:Original: :doc:`../../../admin-guide/security-bugs`
+:Original: :doc:`../../../process/security-bugs`
:译者:
diff --git a/Documentation/translations/zh_CN/process/howto.rst b/Documentation/translations/zh_CN/process/howto.rst
index 1025475..cc47be3 100644
--- a/Documentation/translations/zh_CN/process/howto.rst
+++ b/Documentation/translations/zh_CN/process/howto.rst
@@ -125,7 +125,7 @@
这篇文档对于理解Linux的开发哲学至关重要。对于将开发平台从其他操作系
统转移到Linux的人来说也很重要。
- :ref:`Documentation/admin-guide/security-bugs.rst <securitybugs>`
+ :ref:`Documentation/process/security-bugs.rst <securitybugs>`
如果你认为自己发现了Linux内核的安全性问题,请根据这篇文档中的步骤来
提醒其他内核开发者并帮助解决这个问题。
diff --git a/Documentation/translations/zh_TW/admin-guide/security-bugs.rst b/Documentation/translations/zh_TW/admin-guide/security-bugs.rst
index eed260e..15f8e90 100644
--- a/Documentation/translations/zh_TW/admin-guide/security-bugs.rst
+++ b/Documentation/translations/zh_TW/admin-guide/security-bugs.rst
@@ -2,7 +2,7 @@
.. include:: ../disclaimer-zh_TW.rst
-:Original: :doc:`../../../admin-guide/security-bugs`
+:Original: :doc:`../../../process/security-bugs`
:譯者:
diff --git a/Documentation/translations/zh_TW/process/howto.rst b/Documentation/translations/zh_TW/process/howto.rst
index 8fb8edca..ea2f468 100644
--- a/Documentation/translations/zh_TW/process/howto.rst
+++ b/Documentation/translations/zh_TW/process/howto.rst
@@ -128,7 +128,7 @@
這篇文檔對於理解Linux的開發哲學至關重要。對於將開發平台從其他操作系
統轉移到Linux的人來說也很重要。
- :ref:`Documentation/admin-guide/security-bugs.rst <securitybugs>`
+ :ref:`Documentation/process/security-bugs.rst <securitybugs>`
如果你認爲自己發現了Linux內核的安全性問題,請根據這篇文檔中的步驟來
提醒其他內核開發者並幫助解決這個問題。
diff --git a/MAINTAINERS b/MAINTAINERS
index 1dc8bd2..90abe83 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -73,7 +73,7 @@
and ideally, should come with a patch proposal. Please do not send
automated reports to this list either. Such bugs will be handled
better and faster in the usual public places. See
- Documentation/admin-guide/security-bugs.rst for details.
+ Documentation/process/security-bugs.rst for details.
8. Happy hacking.
@@ -8216,6 +8216,7 @@
FREESCALE QORIQ DPAA FMAN DRIVER
M: Madalin Bucur <madalin.bucur@nxp.com>
+R: Sean Anderson <sean.anderson@seco.com>
L: netdev@vger.kernel.org
S: Maintained
F: Documentation/devicetree/bindings/net/fsl-fman.txt
@@ -14656,10 +14657,8 @@
NFC SUBSYSTEM
M: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
-L: linux-nfc@lists.01.org (subscribers-only)
L: netdev@vger.kernel.org
S: Maintained
-B: mailto:linux-nfc@lists.01.org
F: Documentation/devicetree/bindings/net/nfc/
F: drivers/nfc/
F: include/linux/platform_data/nfcmrvl.h
@@ -14670,7 +14669,6 @@
NFC VIRTUAL NCI DEVICE DRIVER
M: Bongsu Jeon <bongsu.jeon@samsung.com>
L: netdev@vger.kernel.org
-L: linux-nfc@lists.01.org (subscribers-only)
S: Supported
F: drivers/nfc/virtual_ncidev.c
F: tools/testing/selftests/nci/
@@ -15042,7 +15040,6 @@
F: sound/soc/codecs/tfa989x.c
NXP-NCI NFC DRIVER
-L: linux-nfc@lists.01.org (subscribers-only)
S: Orphan
F: Documentation/devicetree/bindings/net/nfc/nxp,nci.yaml
F: drivers/nfc/nxp-nci
@@ -18291,8 +18288,9 @@
F: include/linux/dasd_mod.h
S390 IOMMU (PCI)
+M: Niklas Schnelle <schnelle@linux.ibm.com>
M: Matthew Rosato <mjrosato@linux.ibm.com>
-M: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
+R: Gerald Schaefer <gerald.schaefer@linux.ibm.com>
L: linux-s390@vger.kernel.org
S: Supported
F: drivers/iommu/s390-iommu.c
@@ -18487,7 +18485,6 @@
SAMSUNG S3FWRN5 NFC DRIVER
M: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
-L: linux-nfc@lists.01.org (subscribers-only)
S: Maintained
F: Documentation/devicetree/bindings/net/nfc/samsung,s3fwrn5.yaml
F: drivers/nfc/s3fwrn5
@@ -18802,7 +18799,7 @@
SECURITY CONTACT
M: Security Officers <security@kernel.org>
S: Supported
-F: Documentation/admin-guide/security-bugs.rst
+F: Documentation/process/security-bugs.rst
SECURITY SUBSYSTEM
M: Paul Moore <paul@paul-moore.com>
@@ -20645,7 +20642,6 @@
TENSILICA XTENSA PORT (xtensa)
M: Chris Zankel <chris@zankel.net>
M: Max Filippov <jcmvbkbc@gmail.com>
-L: linux-xtensa@linux-xtensa.org
S: Maintained
T: git https://github.com/jcmvbkbc/linux-xtensa.git
F: arch/xtensa/
@@ -20981,7 +20977,6 @@
TI TRF7970A NFC DRIVER
M: Mark Greer <mgreer@animalcreek.com>
L: linux-wireless@vger.kernel.org
-L: linux-nfc@lists.01.org (subscribers-only)
S: Supported
F: Documentation/devicetree/bindings/net/nfc/ti,trf7970a.yaml
F: drivers/nfc/trf7970a.c
@@ -23038,7 +23033,6 @@
XTENSA XTFPGA PLATFORM SUPPORT
M: Max Filippov <jcmvbkbc@gmail.com>
-L: linux-xtensa@linux-xtensa.org
S: Maintained
F: drivers/spi/spi-xtensa-xtfpga.c
F: sound/soc/xtensa/xtfpga-i2s.c
diff --git a/Makefile b/Makefile
index da2586d..ef4e96b 100644
--- a/Makefile
+++ b/Makefile
@@ -2,7 +2,7 @@
VERSION = 6
PATCHLEVEL = 3
SUBLEVEL = 0
-EXTRAVERSION = -rc4
+EXTRAVERSION = -rc5
NAME = Hurr durr I'ma ninja sloth
# *DOCUMENTATION*
diff --git a/arch/mips/bmips/dma.c b/arch/mips/bmips/dma.c
index 3378866..3779e78 100644
--- a/arch/mips/bmips/dma.c
+++ b/arch/mips/bmips/dma.c
@@ -5,6 +5,8 @@
#include <asm/bmips.h>
#include <asm/io.h>
+bool bmips_rac_flush_disable;
+
void arch_sync_dma_for_cpu_all(void)
{
void __iomem *cbr = BMIPS_GET_CBR();
@@ -15,6 +17,9 @@ void arch_sync_dma_for_cpu_all(void)
boot_cpu_type() != CPU_BMIPS4380)
return;
+ if (unlikely(bmips_rac_flush_disable))
+ return;
+
/* Flush stale data out of the readahead cache */
cfg = __raw_readl(cbr + BMIPS_RAC_CONFIG);
__raw_writel(cfg | 0x100, cbr + BMIPS_RAC_CONFIG);
diff --git a/arch/mips/bmips/setup.c b/arch/mips/bmips/setup.c
index e95b3f7..549a639 100644
--- a/arch/mips/bmips/setup.c
+++ b/arch/mips/bmips/setup.c
@@ -35,6 +35,8 @@
#define REG_BCM6328_OTP ((void __iomem *)CKSEG1ADDR(0x1000062c))
#define BCM6328_TP1_DISABLED BIT(9)
+extern bool bmips_rac_flush_disable;
+
static const unsigned long kbase = VMLINUX_LOAD_ADDRESS & 0xfff00000;
struct bmips_quirk {
@@ -104,6 +106,12 @@ static void bcm6358_quirks(void)
* disable SMP for now
*/
bmips_smp_enabled = 0;
+
+ /*
+ * RAC flush causes kernel panics on BCM6358 when booting from TP1
+ * because the bootloader is not initializing it properly.
+ */
+ bmips_rac_flush_disable = !!(read_c0_brcm_cmt_local() & (1 << 31));
}
static void bcm6368_quirks(void)
diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush.h b/arch/powerpc/include/asm/book3s/64/tlbflush.h
index 2bbc0fc..5e26c7f 100644
--- a/arch/powerpc/include/asm/book3s/64/tlbflush.h
+++ b/arch/powerpc/include/asm/book3s/64/tlbflush.h
@@ -148,6 +148,11 @@ static inline void flush_tlb_fix_spurious_fault(struct vm_area_struct *vma,
*/
}
+static inline bool __pte_protnone(unsigned long pte)
+{
+ return (pte & (pgprot_val(PAGE_NONE) | _PAGE_RWX)) == pgprot_val(PAGE_NONE);
+}
+
static inline bool __pte_flags_need_flush(unsigned long oldval,
unsigned long newval)
{
@@ -164,8 +169,8 @@ static inline bool __pte_flags_need_flush(unsigned long oldval,
/*
* We do not expect kernel mappings or non-PTEs or not-present PTEs.
*/
- VM_WARN_ON_ONCE(oldval & _PAGE_PRIVILEGED);
- VM_WARN_ON_ONCE(newval & _PAGE_PRIVILEGED);
+ VM_WARN_ON_ONCE(!__pte_protnone(oldval) && oldval & _PAGE_PRIVILEGED);
+ VM_WARN_ON_ONCE(!__pte_protnone(newval) && newval & _PAGE_PRIVILEGED);
VM_WARN_ON_ONCE(!(oldval & _PAGE_PTE));
VM_WARN_ON_ONCE(!(newval & _PAGE_PTE));
VM_WARN_ON_ONCE(!(oldval & _PAGE_PRESENT));
diff --git a/arch/powerpc/kernel/ptrace/ptrace-view.c b/arch/powerpc/kernel/ptrace/ptrace-view.c
index 2087a78..5fff0d0 100644
--- a/arch/powerpc/kernel/ptrace/ptrace-view.c
+++ b/arch/powerpc/kernel/ptrace/ptrace-view.c
@@ -290,6 +290,9 @@ static int gpr_set(struct task_struct *target, const struct user_regset *regset,
static int ppr_get(struct task_struct *target, const struct user_regset *regset,
struct membuf to)
{
+ if (!target->thread.regs)
+ return -EINVAL;
+
return membuf_write(&to, &target->thread.regs->ppr, sizeof(u64));
}
@@ -297,6 +300,9 @@ static int ppr_set(struct task_struct *target, const struct user_regset *regset,
unsigned int pos, unsigned int count, const void *kbuf,
const void __user *ubuf)
{
+ if (!target->thread.regs)
+ return -EINVAL;
+
return user_regset_copyin(&pos, &count, &kbuf, &ubuf,
&target->thread.regs->ppr, 0, sizeof(u64));
}
diff --git a/arch/powerpc/platforms/pseries/vas.c b/arch/powerpc/platforms/pseries/vas.c
index 5591123..5131804 100644
--- a/arch/powerpc/platforms/pseries/vas.c
+++ b/arch/powerpc/platforms/pseries/vas.c
@@ -856,6 +856,13 @@ int pseries_vas_dlpar_cpu(void)
{
int new_nr_creds, rc;
+ /*
+ * NX-GZIP is not enabled. Nothing to do for DLPAR event
+ */
+ if (!copypaste_feat)
+ return 0;
+
+
rc = h_query_vas_capabilities(H_QUERY_VAS_CAPABILITIES,
vascaps[VAS_GZIP_DEF_FEAT_TYPE].feat,
(u64)virt_to_phys(&hv_cop_caps));
@@ -1012,6 +1019,7 @@ static int __init pseries_vas_init(void)
* Linux supports user space COPY/PASTE only with Radix
*/
if (!radix_enabled()) {
+ copypaste_feat = false;
pr_err("API is supported only with radix page tables\n");
return -ENOTSUPP;
}
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 5b182d1..eb7f29a 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -126,6 +126,7 @@
select OF_IRQ
select PCI_DOMAINS_GENERIC if PCI
select PCI_MSI if PCI
+ select RISCV_ALTERNATIVE if !XIP_KERNEL
select RISCV_INTC
select RISCV_TIMER if RISCV_SBI
select SIFIVE_PLIC
@@ -401,9 +402,8 @@
config RISCV_ISA_SVPBMT
bool "SVPBMT extension support"
depends on 64BIT && MMU
- depends on !XIP_KERNEL
+ depends on RISCV_ALTERNATIVE
default y
- select RISCV_ALTERNATIVE
help
Adds support to dynamically detect the presence of the SVPBMT
ISA-extension (Supervisor-mode: page-based memory types) and
@@ -428,8 +428,8 @@
config RISCV_ISA_ZBB
bool "Zbb extension support for bit manipulation instructions"
depends on TOOLCHAIN_HAS_ZBB
- depends on !XIP_KERNEL && MMU
- select RISCV_ALTERNATIVE
+ depends on MMU
+ depends on RISCV_ALTERNATIVE
default y
help
Adds support to dynamically detect the presence of the ZBB
@@ -443,9 +443,9 @@
config RISCV_ISA_ZICBOM
bool "Zicbom extension support for non-coherent DMA operation"
- depends on !XIP_KERNEL && MMU
+ depends on MMU
+ depends on RISCV_ALTERNATIVE
default y
- select RISCV_ALTERNATIVE
select RISCV_DMA_NONCOHERENT
help
Adds support to dynamically detect the presence of the ZICBOM
diff --git a/arch/riscv/Kconfig.erratas b/arch/riscv/Kconfig.erratas
index 69621ae..0c8f4652 100644
--- a/arch/riscv/Kconfig.erratas
+++ b/arch/riscv/Kconfig.erratas
@@ -2,8 +2,7 @@
config ERRATA_SIFIVE
bool "SiFive errata"
- depends on !XIP_KERNEL
- select RISCV_ALTERNATIVE
+ depends on RISCV_ALTERNATIVE
help
All SiFive errata Kconfig depend on this Kconfig. Disabling
this Kconfig will disable all SiFive errata. Please say "Y"
@@ -35,8 +34,7 @@
config ERRATA_THEAD
bool "T-HEAD errata"
- depends on !XIP_KERNEL
- select RISCV_ALTERNATIVE
+ depends on RISCV_ALTERNATIVE
help
All T-HEAD errata Kconfig depend on this Kconfig. Disabling
this Kconfig will disable all T-HEAD errata. Please say "Y"
diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h
index e3021b2..6263a0d 100644
--- a/arch/riscv/include/asm/hwcap.h
+++ b/arch/riscv/include/asm/hwcap.h
@@ -57,18 +57,31 @@ struct riscv_isa_ext_data {
unsigned int isa_ext_id;
};
+unsigned long riscv_isa_extension_base(const unsigned long *isa_bitmap);
+
+#define riscv_isa_extension_mask(ext) BIT_MASK(RISCV_ISA_EXT_##ext)
+
+bool __riscv_isa_extension_available(const unsigned long *isa_bitmap, int bit);
+#define riscv_isa_extension_available(isa_bitmap, ext) \
+ __riscv_isa_extension_available(isa_bitmap, RISCV_ISA_EXT_##ext)
+
static __always_inline bool
riscv_has_extension_likely(const unsigned long ext)
{
compiletime_assert(ext < RISCV_ISA_EXT_MAX,
"ext must be < RISCV_ISA_EXT_MAX");
- asm_volatile_goto(
- ALTERNATIVE("j %l[l_no]", "nop", 0, %[ext], 1)
- :
- : [ext] "i" (ext)
- :
- : l_no);
+ if (IS_ENABLED(CONFIG_RISCV_ALTERNATIVE)) {
+ asm_volatile_goto(
+ ALTERNATIVE("j %l[l_no]", "nop", 0, %[ext], 1)
+ :
+ : [ext] "i" (ext)
+ :
+ : l_no);
+ } else {
+ if (!__riscv_isa_extension_available(NULL, ext))
+ goto l_no;
+ }
return true;
l_no:
@@ -81,26 +94,23 @@ riscv_has_extension_unlikely(const unsigned long ext)
compiletime_assert(ext < RISCV_ISA_EXT_MAX,
"ext must be < RISCV_ISA_EXT_MAX");
- asm_volatile_goto(
- ALTERNATIVE("nop", "j %l[l_yes]", 0, %[ext], 1)
- :
- : [ext] "i" (ext)
- :
- : l_yes);
+ if (IS_ENABLED(CONFIG_RISCV_ALTERNATIVE)) {
+ asm_volatile_goto(
+ ALTERNATIVE("nop", "j %l[l_yes]", 0, %[ext], 1)
+ :
+ : [ext] "i" (ext)
+ :
+ : l_yes);
+ } else {
+ if (__riscv_isa_extension_available(NULL, ext))
+ goto l_yes;
+ }
return false;
l_yes:
return true;
}
-unsigned long riscv_isa_extension_base(const unsigned long *isa_bitmap);
-
-#define riscv_isa_extension_mask(ext) BIT_MASK(RISCV_ISA_EXT_##ext)
-
-bool __riscv_isa_extension_available(const unsigned long *isa_bitmap, int bit);
-#define riscv_isa_extension_available(isa_bitmap, ext) \
- __riscv_isa_extension_available(isa_bitmap, RISCV_ISA_EXT_##ext)
-
#endif
#endif /* _ASM_RISCV_HWCAP_H */
diff --git a/arch/s390/Makefile b/arch/s390/Makefile
index b3235ab..ed646c5 100644
--- a/arch/s390/Makefile
+++ b/arch/s390/Makefile
@@ -162,7 +162,7 @@
ifdef CONFIG_EXPOLINE_EXTERN
modules_prepare: expoline_prepare
-expoline_prepare:
+expoline_prepare: scripts
$(Q)$(MAKE) $(build)=arch/s390/lib/expoline arch/s390/lib/expoline/expoline.o
endif
endif
diff --git a/arch/s390/include/uapi/asm/dasd.h b/arch/s390/include/uapi/asm/dasd.h
index 93d1ccd..9c49c3d 100644
--- a/arch/s390/include/uapi/asm/dasd.h
+++ b/arch/s390/include/uapi/asm/dasd.h
@@ -78,6 +78,7 @@ typedef struct dasd_information2_t {
* 0x040: give access to raw eckd data
* 0x080: enable discard support
* 0x100: enable autodisable for IFCC errors (default)
+ * 0x200: enable requeue of all requests on autoquiesce
*/
#define DASD_FEATURE_READONLY 0x001
#define DASD_FEATURE_USEDIAG 0x002
@@ -88,6 +89,7 @@ typedef struct dasd_information2_t {
#define DASD_FEATURE_USERAW 0x040
#define DASD_FEATURE_DISCARD 0x080
#define DASD_FEATURE_PATH_AUTODISABLE 0x100
+#define DASD_FEATURE_REQUEUEQUIESCE 0x200
#define DASD_FEATURE_DEFAULT DASD_FEATURE_PATH_AUTODISABLE
#define DASD_PARTN_BITS 2
diff --git a/arch/s390/kernel/ptrace.c b/arch/s390/kernel/ptrace.c
index cf9659e..ea244a7 100644
--- a/arch/s390/kernel/ptrace.c
+++ b/arch/s390/kernel/ptrace.c
@@ -474,9 +474,7 @@ long arch_ptrace(struct task_struct *child, long request,
}
return 0;
case PTRACE_GET_LAST_BREAK:
- put_user(child->thread.last_break,
- (unsigned long __user *) data);
- return 0;
+ return put_user(child->thread.last_break, (unsigned long __user *)data);
case PTRACE_ENABLE_TE:
if (!MACHINE_HAS_TE)
return -EIO;
@@ -824,9 +822,7 @@ long compat_arch_ptrace(struct task_struct *child, compat_long_t request,
}
return 0;
case PTRACE_GET_LAST_BREAK:
- put_user(child->thread.last_break,
- (unsigned int __user *) data);
- return 0;
+ return put_user(child->thread.last_break, (unsigned int __user *)data);
}
return compat_ptrace_request(child, request, addr, data);
}
diff --git a/arch/s390/lib/uaccess.c b/arch/s390/lib/uaccess.c
index 720036fb..d442140 100644
--- a/arch/s390/lib/uaccess.c
+++ b/arch/s390/lib/uaccess.c
@@ -172,7 +172,7 @@ unsigned long __clear_user(void __user *to, unsigned long size)
"4: slgr %0,%0\n"
"5:\n"
EX_TABLE(0b,2b) EX_TABLE(6b,2b) EX_TABLE(3b,5b) EX_TABLE(7b,5b)
- : "+a" (size), "+a" (to), "+a" (tmp1), "=a" (tmp2)
+ : "+&a" (size), "+&a" (to), "+a" (tmp1), "=&a" (tmp2)
: "a" (empty_zero_page), [spec] "d" (spec.val)
: "cc", "memory", "0");
return size;
diff --git a/arch/xtensa/kernel/traps.c b/arch/xtensa/kernel/traps.c
index cd98366..f0a7d1c 100644
--- a/arch/xtensa/kernel/traps.c
+++ b/arch/xtensa/kernel/traps.c
@@ -539,7 +539,7 @@ static size_t kstack_depth_to_print = CONFIG_PRINT_STACK_DEPTH;
void show_stack(struct task_struct *task, unsigned long *sp, const char *loglvl)
{
- size_t len;
+ size_t len, off = 0;
if (!sp)
sp = stack_pointer(task);
@@ -548,9 +548,17 @@ void show_stack(struct task_struct *task, unsigned long *sp, const char *loglvl)
kstack_depth_to_print * STACK_DUMP_ENTRY_SIZE);
printk("%sStack:\n", loglvl);
- print_hex_dump(loglvl, " ", DUMP_PREFIX_NONE,
- STACK_DUMP_LINE_SIZE, STACK_DUMP_ENTRY_SIZE,
- sp, len, false);
+ while (off < len) {
+ u8 line[STACK_DUMP_LINE_SIZE];
+ size_t line_len = len - off > STACK_DUMP_LINE_SIZE ?
+ STACK_DUMP_LINE_SIZE : len - off;
+
+ __memcpy(line, (u8 *)sp + off, line_len);
+ print_hex_dump(loglvl, " ", DUMP_PREFIX_NONE,
+ STACK_DUMP_LINE_SIZE, STACK_DUMP_ENTRY_SIZE,
+ line, line_len, false);
+ off += STACK_DUMP_LINE_SIZE;
+ }
show_trace(task, sp, loglvl);
}
diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c
index 89ffb3a..74f7d05 100644
--- a/block/bfq-cgroup.c
+++ b/block/bfq-cgroup.c
@@ -497,17 +497,11 @@ static struct blkcg_policy_data *bfq_cpd_alloc(gfp_t gfp)
bgd = kzalloc(sizeof(*bgd), gfp);
if (!bgd)
return NULL;
+
+ bgd->weight = CGROUP_WEIGHT_DFL;
return &bgd->pd;
}
-static void bfq_cpd_init(struct blkcg_policy_data *cpd)
-{
- struct bfq_group_data *d = cpd_to_bfqgd(cpd);
-
- d->weight = cgroup_subsys_on_dfl(io_cgrp_subsys) ?
- CGROUP_WEIGHT_DFL : BFQ_WEIGHT_LEGACY_DFL;
-}
-
static void bfq_cpd_free(struct blkcg_policy_data *cpd)
{
kfree(cpd_to_bfqgd(cpd));
@@ -1301,8 +1295,6 @@ struct blkcg_policy blkcg_policy_bfq = {
.legacy_cftypes = bfq_blkcg_legacy_files,
.cpd_alloc_fn = bfq_cpd_alloc,
- .cpd_init_fn = bfq_cpd_init,
- .cpd_bind_fn = bfq_cpd_init,
.cpd_free_fn = bfq_cpd_free,
.pd_alloc_fn = bfq_pd_alloc,
diff --git a/block/bfq-iosched.h b/block/bfq-iosched.h
index 69aaee5..467e8cf 100644
--- a/block/bfq-iosched.h
+++ b/block/bfq-iosched.h
@@ -20,7 +20,6 @@
#define BFQ_DEFAULT_QUEUE_IOPRIO 4
-#define BFQ_WEIGHT_LEGACY_DFL 100
#define BFQ_DEFAULT_GRP_IOPRIO 0
#define BFQ_DEFAULT_GRP_CLASS IOPRIO_CLASS_BE
diff --git a/block/bio.c b/block/bio.c
index fd11614..fc98c1c72 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1168,7 +1168,7 @@ void __bio_release_pages(struct bio *bio, bool mark_dirty)
bio_for_each_segment_all(bvec, bio, iter_all) {
if (mark_dirty && !PageCompound(bvec->bv_page))
set_page_dirty_lock(bvec->bv_page);
- put_page(bvec->bv_page);
+ bio_release_page(bio, bvec->bv_page);
}
}
EXPORT_SYMBOL_GPL(__bio_release_pages);
@@ -1190,7 +1190,6 @@ void bio_iov_bvec_set(struct bio *bio, struct iov_iter *iter)
bio->bi_io_vec = (struct bio_vec *)iter->bvec;
bio->bi_iter.bi_bvec_done = iter->iov_offset;
bio->bi_iter.bi_size = size;
- bio_set_flag(bio, BIO_NO_PAGE_REF);
bio_set_flag(bio, BIO_CLONED);
}
@@ -1205,7 +1204,7 @@ static int bio_iov_add_page(struct bio *bio, struct page *page,
}
if (same_page)
- put_page(page);
+ bio_release_page(bio, page);
return 0;
}
@@ -1219,7 +1218,7 @@ static int bio_iov_add_zone_append_page(struct bio *bio, struct page *page,
queue_max_zone_append_sectors(q), &same_page) != len)
return -EINVAL;
if (same_page)
- put_page(page);
+ bio_release_page(bio, page);
return 0;
}
@@ -1230,10 +1229,10 @@ static int bio_iov_add_zone_append_page(struct bio *bio, struct page *page,
* @bio: bio to add pages to
* @iter: iov iterator describing the region to be mapped
*
- * Pins pages from *iter and appends them to @bio's bvec array. The
- * pages will have to be released using put_page() when done.
- * For multi-segment *iter, this function only adds pages from the
- * next non-empty segment of the iov iterator.
+ * Extracts pages from *iter and appends them to @bio's bvec array. The pages
+ * will have to be cleaned up in the way indicated by the BIO_PAGE_PINNED flag.
+ * For a multi-segment *iter, this function only adds pages from the next
+ * non-empty segment of the iov iterator.
*/
static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
{
@@ -1265,9 +1264,9 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
* result to ensure the bio's total size is correct. The remainder of
* the iov data will be picked up in the next bio iteration.
*/
- size = iov_iter_get_pages(iter, pages,
- UINT_MAX - bio->bi_iter.bi_size,
- nr_pages, &offset, extraction_flags);
+ size = iov_iter_extract_pages(iter, &pages,
+ UINT_MAX - bio->bi_iter.bi_size,
+ nr_pages, extraction_flags, &offset);
if (unlikely(size <= 0))
return size ? size : -EFAULT;
@@ -1300,7 +1299,7 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
iov_iter_revert(iter, left);
out:
while (i < nr_pages)
- put_page(pages[i++]);
+ bio_release_page(bio, pages[i++]);
return ret;
}
@@ -1335,6 +1334,8 @@ int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
return 0;
}
+ if (iov_iter_extract_will_pin(iter))
+ bio_set_flag(bio, BIO_PAGE_PINNED);
do {
ret = __bio_iov_iter_get_pages(bio, iter);
} while (!ret && iov_iter_count(iter) && !bio_full(bio, 0));
@@ -1488,8 +1489,8 @@ void bio_set_pages_dirty(struct bio *bio)
* the BIO and re-dirty the pages in process context.
*
* It is expected that bio_check_pages_dirty() will wholly own the BIO from
- * here on. It will run one put_page() against each page and will run one
- * bio_put() against the BIO.
+ * here on. It will unpin each page and will run one bio_put() against the
+ * BIO.
*/
static void bio_dirty_fn(struct work_struct *work);
diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index bd50b55..18331cb 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -1249,8 +1249,6 @@ blkcg_css_alloc(struct cgroup_subsys_state *parent_css)
blkcg->cpd[i] = cpd;
cpd->blkcg = blkcg;
cpd->plid = i;
- if (pol->cpd_init_fn)
- pol->cpd_init_fn(cpd);
}
spin_lock_init(&blkcg->lock);
@@ -1355,26 +1353,6 @@ void blkcg_exit_disk(struct gendisk *disk)
blk_throtl_exit(disk);
}
-static void blkcg_bind(struct cgroup_subsys_state *root_css)
-{
- int i;
-
- mutex_lock(&blkcg_pol_mutex);
-
- for (i = 0; i < BLKCG_MAX_POLS; i++) {
- struct blkcg_policy *pol = blkcg_policy[i];
- struct blkcg *blkcg;
-
- if (!pol || !pol->cpd_bind_fn)
- continue;
-
- list_for_each_entry(blkcg, &all_blkcgs, all_blkcgs_node)
- if (blkcg->cpd[pol->plid])
- pol->cpd_bind_fn(blkcg->cpd[pol->plid]);
- }
- mutex_unlock(&blkcg_pol_mutex);
-}
-
static void blkcg_exit(struct task_struct *tsk)
{
if (tsk->throttle_disk)
@@ -1388,7 +1366,6 @@ struct cgroup_subsys io_cgrp_subsys = {
.css_offline = blkcg_css_offline,
.css_free = blkcg_css_free,
.css_rstat_flush = blkcg_rstat_flush,
- .bind = blkcg_bind,
.dfl_cftypes = blkcg_files,
.legacy_cftypes = blkcg_legacy_files,
.legacy_name = "blkio",
@@ -1626,8 +1603,6 @@ int blkcg_policy_register(struct blkcg_policy *pol)
blkcg->cpd[pol->plid] = cpd;
cpd->blkcg = blkcg;
cpd->plid = pol->plid;
- if (pol->cpd_init_fn)
- pol->cpd_init_fn(cpd);
}
}
diff --git a/block/blk-cgroup.h b/block/blk-cgroup.h
index 9c50787..2c67886 100644
--- a/block/blk-cgroup.h
+++ b/block/blk-cgroup.h
@@ -173,9 +173,7 @@ struct blkcg_policy {
/* operations */
blkcg_pol_alloc_cpd_fn *cpd_alloc_fn;
- blkcg_pol_init_cpd_fn *cpd_init_fn;
blkcg_pol_free_cpd_fn *cpd_free_fn;
- blkcg_pol_bind_cpd_fn *cpd_bind_fn;
blkcg_pol_alloc_pd_fn *pd_alloc_fn;
blkcg_pol_init_pd_fn *pd_init_fn;
diff --git a/block/blk-core.c b/block/blk-core.c
index 42926e6..80deb10 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -263,13 +263,7 @@ static void blk_free_queue_rcu(struct rcu_head *rcu_head)
static void blk_free_queue(struct request_queue *q)
{
- if (q->poll_stat)
- blk_stat_remove_callback(q, q->poll_cb);
- blk_stat_free_callback(q->poll_cb);
-
blk_free_queue_stats(q->stats);
- kfree(q->poll_stat);
-
if (queue_is_mq(q))
blk_mq_release(q);
diff --git a/block/blk-crypto-internal.h b/block/blk-crypto-internal.h
index a8cdaf2..93a1419 100644
--- a/block/blk-crypto-internal.h
+++ b/block/blk-crypto-internal.h
@@ -65,6 +65,11 @@ static inline bool blk_crypto_rq_is_encrypted(struct request *rq)
return rq->crypt_ctx;
}
+static inline bool blk_crypto_rq_has_keyslot(struct request *rq)
+{
+ return rq->crypt_keyslot;
+}
+
blk_status_t blk_crypto_get_keyslot(struct blk_crypto_profile *profile,
const struct blk_crypto_key *key,
struct blk_crypto_keyslot **slot_ptr);
@@ -119,6 +124,11 @@ static inline bool blk_crypto_rq_is_encrypted(struct request *rq)
return false;
}
+static inline bool blk_crypto_rq_has_keyslot(struct request *rq)
+{
+ return false;
+}
+
#endif /* CONFIG_BLK_INLINE_ENCRYPTION */
void __bio_crypt_advance(struct bio *bio, unsigned int bytes);
@@ -153,14 +163,21 @@ static inline bool blk_crypto_bio_prep(struct bio **bio_ptr)
return true;
}
-blk_status_t __blk_crypto_init_request(struct request *rq);
-static inline blk_status_t blk_crypto_init_request(struct request *rq)
+blk_status_t __blk_crypto_rq_get_keyslot(struct request *rq);
+static inline blk_status_t blk_crypto_rq_get_keyslot(struct request *rq)
{
if (blk_crypto_rq_is_encrypted(rq))
- return __blk_crypto_init_request(rq);
+ return __blk_crypto_rq_get_keyslot(rq);
return BLK_STS_OK;
}
+void __blk_crypto_rq_put_keyslot(struct request *rq);
+static inline void blk_crypto_rq_put_keyslot(struct request *rq)
+{
+ if (blk_crypto_rq_has_keyslot(rq))
+ __blk_crypto_rq_put_keyslot(rq);
+}
+
void __blk_crypto_free_request(struct request *rq);
static inline void blk_crypto_free_request(struct request *rq)
{
@@ -188,21 +205,6 @@ static inline int blk_crypto_rq_bio_prep(struct request *rq, struct bio *bio,
return 0;
}
-/**
- * blk_crypto_insert_cloned_request - Prepare a cloned request to be inserted
- * into a request queue.
- * @rq: the request being queued
- *
- * Return: BLK_STS_OK on success, nonzero on error.
- */
-static inline blk_status_t blk_crypto_insert_cloned_request(struct request *rq)
-{
-
- if (blk_crypto_rq_is_encrypted(rq))
- return blk_crypto_init_request(rq);
- return BLK_STS_OK;
-}
-
#ifdef CONFIG_BLK_INLINE_ENCRYPTION_FALLBACK
int blk_crypto_fallback_start_using_mode(enum blk_crypto_mode_num mode_num);
diff --git a/block/blk-crypto-profile.c b/block/blk-crypto-profile.c
index 0307fb0..2a67d3f 100644
--- a/block/blk-crypto-profile.c
+++ b/block/blk-crypto-profile.c
@@ -227,14 +227,13 @@ EXPORT_SYMBOL_GPL(blk_crypto_keyslot_index);
* @profile: the crypto profile of the device the key will be used on
* @key: the key that will be used
* @slot_ptr: If a keyslot is allocated, an opaque pointer to the keyslot struct
- * will be stored here; otherwise NULL will be stored here.
+ * will be stored here. blk_crypto_put_keyslot() must be called
+ * later to release it. Otherwise, NULL will be stored here.
*
* If the device has keyslots, this gets a keyslot that's been programmed with
* the specified key. If the key is already in a slot, this reuses it;
* otherwise this waits for a slot to become idle and programs the key into it.
*
- * This must be paired with a call to blk_crypto_put_keyslot().
- *
* Context: Process context. Takes and releases profile->lock.
* Return: BLK_STS_OK on success, meaning that either a keyslot was allocated or
* one wasn't needed; or a blk_status_t error on failure.
@@ -312,20 +311,15 @@ blk_status_t blk_crypto_get_keyslot(struct blk_crypto_profile *profile,
/**
* blk_crypto_put_keyslot() - Release a reference to a keyslot
- * @slot: The keyslot to release the reference of (may be NULL).
+ * @slot: The keyslot to release the reference of
*
* Context: Any context.
*/
void blk_crypto_put_keyslot(struct blk_crypto_keyslot *slot)
{
- struct blk_crypto_profile *profile;
+ struct blk_crypto_profile *profile = slot->profile;
unsigned long flags;
- if (!slot)
- return;
-
- profile = slot->profile;
-
if (atomic_dec_and_lock_irqsave(&slot->slot_refs,
&profile->idle_slots_lock, flags)) {
list_add_tail(&slot->idle_slot_node, &profile->idle_slots);
@@ -354,28 +348,16 @@ bool __blk_crypto_cfg_supported(struct blk_crypto_profile *profile,
return true;
}
-/**
- * __blk_crypto_evict_key() - Evict a key from a device.
- * @profile: the crypto profile of the device
- * @key: the key to evict. It must not still be used in any I/O.
- *
- * If the device has keyslots, this finds the keyslot (if any) that contains the
- * specified key and calls the driver's keyslot_evict function to evict it.
- *
- * Otherwise, this just calls the driver's keyslot_evict function if it is
- * implemented, passing just the key (without any particular keyslot). This
- * allows layered devices to evict the key from their underlying devices.
- *
- * Context: Process context. Takes and releases profile->lock.
- * Return: 0 on success or if there's no keyslot with the specified key, -EBUSY
- * if the keyslot is still in use, or another -errno value on other
- * error.
+/*
+ * This is an internal function that evicts a key from an inline encryption
+ * device that can be either a real device or the blk-crypto-fallback "device".
+ * It is used only by blk_crypto_evict_key(); see that function for details.
*/
int __blk_crypto_evict_key(struct blk_crypto_profile *profile,
const struct blk_crypto_key *key)
{
struct blk_crypto_keyslot *slot;
- int err = 0;
+ int err;
if (profile->num_slots == 0) {
if (profile->ll_ops.keyslot_evict) {
@@ -389,22 +371,30 @@ int __blk_crypto_evict_key(struct blk_crypto_profile *profile,
blk_crypto_hw_enter(profile);
slot = blk_crypto_find_keyslot(profile, key);
- if (!slot)
- goto out_unlock;
+ if (!slot) {
+ /*
+ * Not an error, since a key not in use by I/O is not guaranteed
+ * to be in a keyslot. There can be more keys than keyslots.
+ */
+ err = 0;
+ goto out;
+ }
if (WARN_ON_ONCE(atomic_read(&slot->slot_refs) != 0)) {
+ /* BUG: key is still in use by I/O */
err = -EBUSY;
- goto out_unlock;
+ goto out_remove;
}
err = profile->ll_ops.keyslot_evict(profile, key,
blk_crypto_keyslot_index(slot));
- if (err)
- goto out_unlock;
-
+out_remove:
+ /*
+ * Callers free the key even on error, so unlink the key from the hash
+ * table and clear slot->key even on error.
+ */
hlist_del(&slot->hash_node);
slot->key = NULL;
- err = 0;
-out_unlock:
+out:
blk_crypto_hw_exit(profile);
return err;
}
diff --git a/block/blk-crypto.c b/block/blk-crypto.c
index 4537858..4d760b0 100644
--- a/block/blk-crypto.c
+++ b/block/blk-crypto.c
@@ -13,6 +13,7 @@
#include <linux/blkdev.h>
#include <linux/blk-crypto-profile.h>
#include <linux/module.h>
+#include <linux/ratelimit.h>
#include <linux/slab.h>
#include "blk-crypto-internal.h"
@@ -224,27 +225,27 @@ static bool bio_crypt_check_alignment(struct bio *bio)
return true;
}
-blk_status_t __blk_crypto_init_request(struct request *rq)
+blk_status_t __blk_crypto_rq_get_keyslot(struct request *rq)
{
return blk_crypto_get_keyslot(rq->q->crypto_profile,
rq->crypt_ctx->bc_key,
&rq->crypt_keyslot);
}
-/**
- * __blk_crypto_free_request - Uninitialize the crypto fields of a request.
- *
- * @rq: The request whose crypto fields to uninitialize.
- *
- * Completely uninitializes the crypto fields of a request. If a keyslot has
- * been programmed into some inline encryption hardware, that keyslot is
- * released. The rq->crypt_ctx is also freed.
- */
-void __blk_crypto_free_request(struct request *rq)
+void __blk_crypto_rq_put_keyslot(struct request *rq)
{
blk_crypto_put_keyslot(rq->crypt_keyslot);
+ rq->crypt_keyslot = NULL;
+}
+
+void __blk_crypto_free_request(struct request *rq)
+{
+ /* The keyslot, if one was needed, should have been released earlier. */
+ if (WARN_ON_ONCE(rq->crypt_keyslot))
+ __blk_crypto_rq_put_keyslot(rq);
+
mempool_free(rq->crypt_ctx, bio_crypt_ctx_pool);
- blk_crypto_rq_set_defaults(rq);
+ rq->crypt_ctx = NULL;
}
/**
@@ -399,30 +400,39 @@ int blk_crypto_start_using_key(struct block_device *bdev,
}
/**
- * blk_crypto_evict_key() - Evict a key from any inline encryption hardware
- * it may have been programmed into
- * @bdev: The block_device who's associated inline encryption hardware this key
- * might have been programmed into
- * @key: The key to evict
+ * blk_crypto_evict_key() - Evict a blk_crypto_key from a block_device
+ * @bdev: a block_device on which I/O using the key may have been done
+ * @key: the key to evict
*
- * Upper layers (filesystems) must call this function to ensure that a key is
- * evicted from any hardware that it might have been programmed into. The key
- * must not be in use by any in-flight IO when this function is called.
+ * For a given block_device, this function removes the given blk_crypto_key from
+ * the keyslot management structures and evicts it from any underlying hardware
+ * keyslot(s) or blk-crypto-fallback keyslot it may have been programmed into.
*
- * Return: 0 on success or if the key wasn't in any keyslot; -errno on error.
+ * Upper layers must call this before freeing the blk_crypto_key. It must be
+ * called for every block_device the key may have been used on. The key must no
+ * longer be in use by any I/O when this function is called.
+ *
+ * Context: May sleep.
*/
-int blk_crypto_evict_key(struct block_device *bdev,
- const struct blk_crypto_key *key)
+void blk_crypto_evict_key(struct block_device *bdev,
+ const struct blk_crypto_key *key)
{
struct request_queue *q = bdev_get_queue(bdev);
+ int err;
if (blk_crypto_config_supported_natively(bdev, &key->crypto_cfg))
- return __blk_crypto_evict_key(q->crypto_profile, key);
-
+ err = __blk_crypto_evict_key(q->crypto_profile, key);
+ else
+ err = blk_crypto_fallback_evict_key(key);
/*
- * If the block_device didn't support the key, then blk-crypto-fallback
- * may have been used, so try to evict the key from blk-crypto-fallback.
+ * An error can only occur here if the key failed to be evicted from a
+ * keyslot (due to a hardware or driver issue) or is allegedly still in
+ * use by I/O (due to a kernel bug). Even in these cases, the key is
+ * still unlinked from the keyslot management structures, and the caller
+ * is allowed and expected to free it right away. There's nothing
+ * callers can do to handle errors, so just log them and return void.
*/
- return blk_crypto_fallback_evict_key(key);
+ if (err)
+ pr_warn_ratelimited("%pg: error %d evicting key\n", bdev, err);
}
EXPORT_SYMBOL_GPL(blk_crypto_evict_key);
diff --git a/block/blk-map.c b/block/blk-map.c
index 04c55f1..3551c3f 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -281,21 +281,21 @@ static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
if (blk_queue_pci_p2pdma(rq->q))
extraction_flags |= ITER_ALLOW_P2PDMA;
+ if (iov_iter_extract_will_pin(iter))
+ bio_set_flag(bio, BIO_PAGE_PINNED);
while (iov_iter_count(iter)) {
- struct page **pages, *stack_pages[UIO_FASTIOV];
+ struct page *stack_pages[UIO_FASTIOV];
+ struct page **pages = stack_pages;
ssize_t bytes;
size_t offs;
int npages;
- if (nr_vecs <= ARRAY_SIZE(stack_pages)) {
- pages = stack_pages;
- bytes = iov_iter_get_pages(iter, pages, LONG_MAX,
- nr_vecs, &offs, extraction_flags);
- } else {
- bytes = iov_iter_get_pages_alloc(iter, &pages,
- LONG_MAX, &offs, extraction_flags);
- }
+ if (nr_vecs > ARRAY_SIZE(stack_pages))
+ pages = NULL;
+
+ bytes = iov_iter_extract_pages(iter, &pages, LONG_MAX,
+ nr_vecs, extraction_flags, &offs);
if (unlikely(bytes <= 0)) {
ret = bytes ? bytes : -EFAULT;
goto out_unmap;
@@ -317,7 +317,7 @@ static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
if (!bio_add_hw_page(rq->q, bio, page, n, offs,
max_sectors, &same_page)) {
if (same_page)
- put_page(page);
+ bio_release_page(bio, page);
break;
}
@@ -329,7 +329,7 @@ static int bio_map_user_iov(struct request *rq, struct iov_iter *iter,
* release the pages we didn't map into the bio, if any
*/
while (j < npages)
- put_page(pages[j++]);
+ bio_release_page(bio, pages[j++]);
if (pages != stack_pages)
kvfree(pages);
/* couldn't stuff something into bio? */
diff --git a/block/blk-merge.c b/block/blk-merge.c
index 6460abd..65e75ef 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -867,6 +867,8 @@ static struct request *attempt_merge(struct request_queue *q,
if (!blk_discard_mergable(req))
elv_merge_requests(q, req, next);
+ blk_crypto_rq_put_keyslot(next);
+
/*
* 'next' is going away, so update stats accordingly
*/
diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
index b01818f8..212a7f3 100644
--- a/block/blk-mq-debugfs.c
+++ b/block/blk-mq-debugfs.c
@@ -15,33 +15,8 @@
#include "blk-mq-tag.h"
#include "blk-rq-qos.h"
-static void print_stat(struct seq_file *m, struct blk_rq_stat *stat)
-{
- if (stat->nr_samples) {
- seq_printf(m, "samples=%d, mean=%llu, min=%llu, max=%llu",
- stat->nr_samples, stat->mean, stat->min, stat->max);
- } else {
- seq_puts(m, "samples=0");
- }
-}
-
static int queue_poll_stat_show(void *data, struct seq_file *m)
{
- struct request_queue *q = data;
- int bucket;
-
- if (!q->poll_stat)
- return 0;
-
- for (bucket = 0; bucket < (BLK_MQ_POLL_STATS_BKTS / 2); bucket++) {
- seq_printf(m, "read (%d Bytes): ", 1 << (9 + bucket));
- print_stat(m, &q->poll_stat[2 * bucket]);
- seq_puts(m, "\n");
-
- seq_printf(m, "write (%d Bytes): ", 1 << (9 + bucket));
- print_stat(m, &q->poll_stat[2 * bucket + 1]);
- seq_puts(m, "\n");
- }
return 0;
}
@@ -282,7 +257,6 @@ static const char *const rqf_name[] = {
RQF_NAME(STATS),
RQF_NAME(SPECIAL_PAYLOAD),
RQF_NAME(ZONE_WRITE_LOCKED),
- RQF_NAME(MQ_POLL_SLEPT),
RQF_NAME(TIMED_OUT),
RQF_NAME(ELV),
RQF_NAME(RESV),
diff --git a/block/blk-mq.c b/block/blk-mq.c
index cf1a39a..1b304f6 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -46,51 +46,15 @@
static DEFINE_PER_CPU(struct llist_head, blk_cpu_done);
-static void blk_mq_poll_stats_start(struct request_queue *q);
-static void blk_mq_poll_stats_fn(struct blk_stat_callback *cb);
-
-static int blk_mq_poll_stats_bkt(const struct request *rq)
-{
- int ddir, sectors, bucket;
-
- ddir = rq_data_dir(rq);
- sectors = blk_rq_stats_sectors(rq);
-
- bucket = ddir + 2 * ilog2(sectors);
-
- if (bucket < 0)
- return -1;
- else if (bucket >= BLK_MQ_POLL_STATS_BKTS)
- return ddir + BLK_MQ_POLL_STATS_BKTS - 2;
-
- return bucket;
-}
-
-#define BLK_QC_T_SHIFT 16
-#define BLK_QC_T_INTERNAL (1U << 31)
-
static inline struct blk_mq_hw_ctx *blk_qc_to_hctx(struct request_queue *q,
blk_qc_t qc)
{
- return xa_load(&q->hctx_table,
- (qc & ~BLK_QC_T_INTERNAL) >> BLK_QC_T_SHIFT);
-}
-
-static inline struct request *blk_qc_to_rq(struct blk_mq_hw_ctx *hctx,
- blk_qc_t qc)
-{
- unsigned int tag = qc & ((1U << BLK_QC_T_SHIFT) - 1);
-
- if (qc & BLK_QC_T_INTERNAL)
- return blk_mq_tag_to_rq(hctx->sched_tags, tag);
- return blk_mq_tag_to_rq(hctx->tags, tag);
+ return xa_load(&q->hctx_table, qc);
}
static inline blk_qc_t blk_rq_to_qc(struct request *rq)
{
- return (rq->mq_hctx->queue_num << BLK_QC_T_SHIFT) |
- (rq->tag != -1 ?
- rq->tag : (rq->internal_tag | BLK_QC_T_INTERNAL));
+ return rq->mq_hctx->queue_num;
}
/*
@@ -840,6 +804,12 @@ static void blk_complete_request(struct request *req)
req->q->integrity.profile->complete_fn(req, total_bytes);
#endif
+ /*
+ * Upper layers may call blk_crypto_evict_key() anytime after the last
+ * bio_endio(). Therefore, the keyslot must be released before that.
+ */
+ blk_crypto_rq_put_keyslot(req);
+
blk_account_io_completion(req, total_bytes);
do {
@@ -905,6 +875,13 @@ bool blk_update_request(struct request *req, blk_status_t error,
req->q->integrity.profile->complete_fn(req, nr_bytes);
#endif
+ /*
+ * Upper layers may call blk_crypto_evict_key() anytime after the last
+ * bio_endio(). Therefore, the keyslot must be released before that.
+ */
+ if (blk_crypto_rq_has_keyslot(req) && nr_bytes >= blk_rq_bytes(req))
+ __blk_crypto_rq_put_keyslot(req);
+
if (unlikely(error && !blk_rq_is_passthrough(req) &&
!(req->rq_flags & RQF_QUIET)) &&
!test_bit(GD_DEAD, &req->q->disk->state)) {
@@ -976,17 +953,6 @@ bool blk_update_request(struct request *req, blk_status_t error,
}
EXPORT_SYMBOL_GPL(blk_update_request);
-static void __blk_account_io_done(struct request *req, u64 now)
-{
- const int sgrp = op_stat_group(req_op(req));
-
- part_stat_lock();
- update_io_ticks(req->part, jiffies, true);
- part_stat_inc(req->part, ios[sgrp]);
- part_stat_add(req->part, nsecs[sgrp], now - req->start_time_ns);
- part_stat_unlock();
-}
-
static inline void blk_account_io_done(struct request *req, u64 now)
{
/*
@@ -995,40 +961,41 @@ static inline void blk_account_io_done(struct request *req, u64 now)
* containing request is enough.
*/
if (blk_do_io_stat(req) && req->part &&
- !(req->rq_flags & RQF_FLUSH_SEQ))
- __blk_account_io_done(req, now);
-}
+ !(req->rq_flags & RQF_FLUSH_SEQ)) {
+ const int sgrp = op_stat_group(req_op(req));
-static void __blk_account_io_start(struct request *rq)
-{
- /*
- * All non-passthrough requests are created from a bio with one
- * exception: when a flush command that is part of a flush sequence
- * generated by the state machine in blk-flush.c is cloned onto the
- * lower device by dm-multipath we can get here without a bio.
- */
- if (rq->bio)
- rq->part = rq->bio->bi_bdev;
- else
- rq->part = rq->q->disk->part0;
-
- part_stat_lock();
- update_io_ticks(rq->part, jiffies, false);
- part_stat_unlock();
+ part_stat_lock();
+ update_io_ticks(req->part, jiffies, true);
+ part_stat_inc(req->part, ios[sgrp]);
+ part_stat_add(req->part, nsecs[sgrp], now - req->start_time_ns);
+ part_stat_unlock();
+ }
}
static inline void blk_account_io_start(struct request *req)
{
- if (blk_do_io_stat(req))
- __blk_account_io_start(req);
+ if (blk_do_io_stat(req)) {
+ /*
+ * All non-passthrough requests are created from a bio with one
+ * exception: when a flush command that is part of a flush sequence
+ * generated by the state machine in blk-flush.c is cloned onto the
+ * lower device by dm-multipath we can get here without a bio.
+ */
+ if (req->bio)
+ req->part = req->bio->bi_bdev;
+ else
+ req->part = req->q->disk->part0;
+
+ part_stat_lock();
+ update_io_ticks(req->part, jiffies, false);
+ part_stat_unlock();
+ }
}
static inline void __blk_mq_end_request_acct(struct request *rq, u64 now)
{
- if (rq->rq_flags & RQF_STATS) {
- blk_mq_poll_stats_start(rq->q);
+ if (rq->rq_flags & RQF_STATS)
blk_stat_add(rq, now);
- }
blk_mq_sched_completed_request(rq, now);
blk_account_io_done(rq, now);
@@ -2968,7 +2935,7 @@ void blk_mq_submit_bio(struct bio *bio)
blk_mq_bio_to_request(rq, bio, nr_segs);
- ret = blk_crypto_init_request(rq);
+ ret = blk_crypto_rq_get_keyslot(rq);
if (ret != BLK_STS_OK) {
bio->bi_status = ret;
bio_endio(bio);
@@ -3037,8 +3004,9 @@ blk_status_t blk_insert_cloned_request(struct request *rq)
if (q->disk && should_fail_request(q->disk->part0, blk_rq_bytes(rq)))
return BLK_STS_IOERR;
- if (blk_crypto_insert_cloned_request(rq))
- return BLK_STS_IOERR;
+ ret = blk_crypto_rq_get_keyslot(rq);
+ if (ret != BLK_STS_OK)
+ return ret;
blk_account_io_start(rq);
@@ -4209,14 +4177,8 @@ int blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
/* mark the queue as mq asap */
q->mq_ops = set->ops;
- q->poll_cb = blk_stat_alloc_callback(blk_mq_poll_stats_fn,
- blk_mq_poll_stats_bkt,
- BLK_MQ_POLL_STATS_BKTS, q);
- if (!q->poll_cb)
- goto err_exit;
-
if (blk_mq_alloc_ctxs(q))
- goto err_poll;
+ goto err_exit;
/* init q->mq_kobj and sw queues' kobjects */
blk_mq_sysfs_init(q);
@@ -4244,11 +4206,6 @@ int blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
q->nr_requests = set->queue_depth;
- /*
- * Default to classic polling
- */
- q->poll_nsec = BLK_MQ_POLL_CLASSIC;
-
blk_mq_init_cpu_queues(q, set->nr_hw_queues);
blk_mq_add_queue_tag_set(set, q);
blk_mq_map_swqueue(q);
@@ -4256,9 +4213,6 @@ int blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
err_hctxs:
blk_mq_release(q);
-err_poll:
- blk_stat_free_callback(q->poll_cb);
- q->poll_cb = NULL;
err_exit:
q->mq_ops = NULL;
return -ENOMEM;
@@ -4755,138 +4709,8 @@ void blk_mq_update_nr_hw_queues(struct blk_mq_tag_set *set, int nr_hw_queues)
}
EXPORT_SYMBOL_GPL(blk_mq_update_nr_hw_queues);
-/* Enable polling stats and return whether they were already enabled. */
-static bool blk_poll_stats_enable(struct request_queue *q)
-{
- if (q->poll_stat)
- return true;
-
- return blk_stats_alloc_enable(q);
-}
-
-static void blk_mq_poll_stats_start(struct request_queue *q)
-{
- /*
- * We don't arm the callback if polling stats are not enabled or the
- * callback is already active.
- */
- if (!q->poll_stat || blk_stat_is_active(q->poll_cb))
- return;
-
- blk_stat_activate_msecs(q->poll_cb, 100);
-}
-
-static void blk_mq_poll_stats_fn(struct blk_stat_callback *cb)
-{
- struct request_queue *q = cb->data;
- int bucket;
-
- for (bucket = 0; bucket < BLK_MQ_POLL_STATS_BKTS; bucket++) {
- if (cb->stat[bucket].nr_samples)
- q->poll_stat[bucket] = cb->stat[bucket];
- }
-}
-
-static unsigned long blk_mq_poll_nsecs(struct request_queue *q,
- struct request *rq)
-{
- unsigned long ret = 0;
- int bucket;
-
- /*
- * If stats collection isn't on, don't sleep but turn it on for
- * future users
- */
- if (!blk_poll_stats_enable(q))
- return 0;
-
- /*
- * As an optimistic guess, use half of the mean service time
- * for this type of request. We can (and should) make this smarter.
- * For instance, if the completion latencies are tight, we can
- * get closer than just half the mean. This is especially
- * important on devices where the completion latencies are longer
- * than ~10 usec. We do use the stats for the relevant IO size
- * if available which does lead to better estimates.
- */
- bucket = blk_mq_poll_stats_bkt(rq);
- if (bucket < 0)
- return ret;
-
- if (q->poll_stat[bucket].nr_samples)
- ret = (q->poll_stat[bucket].mean + 1) / 2;
-
- return ret;
-}
-
-static bool blk_mq_poll_hybrid(struct request_queue *q, blk_qc_t qc)
-{
- struct blk_mq_hw_ctx *hctx = blk_qc_to_hctx(q, qc);
- struct request *rq = blk_qc_to_rq(hctx, qc);
- struct hrtimer_sleeper hs;
- enum hrtimer_mode mode;
- unsigned int nsecs;
- ktime_t kt;
-
- /*
- * If a request has completed on queue that uses an I/O scheduler, we
- * won't get back a request from blk_qc_to_rq.
- */
- if (!rq || (rq->rq_flags & RQF_MQ_POLL_SLEPT))
- return false;
-
- /*
- * If we get here, hybrid polling is enabled. Hence poll_nsec can be:
- *
- * 0: use half of prev avg
- * >0: use this specific value
- */
- if (q->poll_nsec > 0)
- nsecs = q->poll_nsec;
- else
- nsecs = blk_mq_poll_nsecs(q, rq);
-
- if (!nsecs)
- return false;
-
- rq->rq_flags |= RQF_MQ_POLL_SLEPT;
-
- /*
- * This will be replaced with the stats tracking code, using
- * 'avg_completion_time / 2' as the pre-sleep target.
- */
- kt = nsecs;
-
- mode = HRTIMER_MODE_REL;
- hrtimer_init_sleeper_on_stack(&hs, CLOCK_MONOTONIC, mode);
- hrtimer_set_expires(&hs.timer, kt);
-
- do {
- if (blk_mq_rq_state(rq) == MQ_RQ_COMPLETE)
- break;
- set_current_state(TASK_UNINTERRUPTIBLE);
- hrtimer_sleeper_start_expires(&hs, mode);
- if (hs.task)
- io_schedule();
- hrtimer_cancel(&hs.timer);
- mode = HRTIMER_MODE_ABS;
- } while (hs.task && !signal_pending(current));
-
- __set_current_state(TASK_RUNNING);
- destroy_hrtimer_on_stack(&hs.timer);
-
- /*
- * If we sleep, have the caller restart the poll loop to reset the
- * state. Like for the other success return cases, the caller is
- * responsible for checking if the IO completed. If the IO isn't
- * complete, we'll get called again and will go straight to the busy
- * poll loop.
- */
- return true;
-}
-
-static int blk_mq_poll_classic(struct request_queue *q, blk_qc_t cookie,
- struct io_comp_batch *iob, unsigned int flags)
+int blk_mq_poll(struct request_queue *q, blk_qc_t cookie, struct io_comp_batch *iob,
+ unsigned int flags)
{
struct blk_mq_hw_ctx *hctx = blk_qc_to_hctx(q, cookie);
long state = get_current_state();
@@ -4913,17 +4737,6 @@ static int blk_mq_poll_classic(struct request_queue *q, blk_qc_t cookie,
return 0;
}
-int blk_mq_poll(struct request_queue *q, blk_qc_t cookie, struct io_comp_batch *iob,
- unsigned int flags)
-{
- if (!(flags & BLK_POLL_NOSLEEP) &&
- q->poll_nsec != BLK_MQ_POLL_CLASSIC) {
- if (blk_mq_poll_hybrid(q, cookie))
- return 1;
- }
- return blk_mq_poll_classic(q, cookie, iob, flags);
-}
-
unsigned int blk_mq_rq_cpu(struct request *rq)
{
return rq->mq_ctx->cpu;
diff --git a/block/blk-stat.c b/block/blk-stat.c
index c6ca16ab..74a1a8c 100644
--- a/block/blk-stat.c
+++ b/block/blk-stat.c
@@ -231,21 +231,3 @@ void blk_free_queue_stats(struct blk_queue_stats *stats)
kfree(stats);
}
-
-bool blk_stats_alloc_enable(struct request_queue *q)
-{
- struct blk_rq_stat *poll_stat;
-
- poll_stat = kcalloc(BLK_MQ_POLL_STATS_BKTS, sizeof(*poll_stat),
- GFP_ATOMIC);
- if (!poll_stat)
- return false;
-
- if (cmpxchg(&q->poll_stat, NULL, poll_stat) != NULL) {
- kfree(poll_stat);
- return true;
- }
-
- blk_stat_add_callback(q, q->poll_cb);
- return false;
-}
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index f1fce1c7..1a743b4 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -408,35 +408,12 @@ queue_rq_affinity_store(struct request_queue *q, const char *page, size_t count)
static ssize_t queue_poll_delay_show(struct request_queue *q, char *page)
{
- int val;
-
- if (q->poll_nsec == BLK_MQ_POLL_CLASSIC)
- val = BLK_MQ_POLL_CLASSIC;
- else
- val = q->poll_nsec / 1000;
-
- return sprintf(page, "%d\n", val);
+ return sprintf(page, "%d\n", -1);
}
static ssize_t queue_poll_delay_store(struct request_queue *q, const char *page,
size_t count)
{
- int err, val;
-
- if (!q->mq_ops || !q->mq_ops->poll)
- return -EINVAL;
-
- err = kstrtoint(page, 10, &val);
- if (err < 0)
- return err;
-
- if (val == BLK_MQ_POLL_CLASSIC)
- q->poll_nsec = BLK_MQ_POLL_CLASSIC;
- else if (val >= 0)
- q->poll_nsec = val * 1000;
- else
- return -EINVAL;
-
return count;
}
diff --git a/block/blk.h b/block/blk.h
index cc4e887..d65d969 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -432,6 +432,18 @@ int bio_add_hw_page(struct request_queue *q, struct bio *bio,
struct page *page, unsigned int len, unsigned int offset,
unsigned int max_sectors, bool *same_page);
+/*
+ * Clean up a page appropriately, where the page may be pinned, may have a
+ * ref taken on it or neither.
+ */
+static inline void bio_release_page(struct bio *bio, struct page *page)
+{
+ if (bio_flagged(bio, BIO_PAGE_PINNED))
+ unpin_user_page(page);
+ else if (bio_flagged(bio, BIO_PAGE_REFFED))
+ put_page(page);
+}
+
struct request_queue *blk_alloc_queue(int node_id);
int disk_scan_partitions(struct gendisk *disk, fmode_t mode);
diff --git a/block/opal_proto.h b/block/opal_proto.h
index 7152aa1..a4e5684 100644
--- a/block/opal_proto.h
+++ b/block/opal_proto.h
@@ -86,6 +86,15 @@ enum opal_response_token {
#define OPAL_MSID_KEYLEN 15
#define OPAL_UID_LENGTH_HALF 4
+/*
+ * Boolean operators from TCG Core spec 2.01 Section:
+ * 5.1.3.11
+ * Table 61
+ */
+#define OPAL_BOOLEAN_AND 0
+#define OPAL_BOOLEAN_OR 1
+#define OPAL_BOOLEAN_NOT 2
+
/* Enum to index OPALUID array */
enum opal_uid {
/* users */
@@ -105,6 +114,7 @@ enum opal_uid {
/* tables */
OPAL_TABLE_TABLE,
OPAL_LOCKINGRANGE_GLOBAL,
+ OPAL_LOCKINGRANGE_ACE_START_TO_KEY,
OPAL_LOCKINGRANGE_ACE_RDLOCKED,
OPAL_LOCKINGRANGE_ACE_WRLOCKED,
OPAL_MBRCONTROL,
diff --git a/block/sed-opal.c b/block/sed-opal.c
index c320093..3fc4e65 100644
--- a/block/sed-opal.c
+++ b/block/sed-opal.c
@@ -132,6 +132,8 @@ static const u8 opaluid[][OPAL_UID_LENGTH] = {
{ 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x01 },
[OPAL_LOCKINGRANGE_GLOBAL] =
{ 0x00, 0x00, 0x08, 0x02, 0x00, 0x00, 0x00, 0x01 },
+ [OPAL_LOCKINGRANGE_ACE_START_TO_KEY] =
+ { 0x00, 0x00, 0x00, 0x08, 0x00, 0x03, 0xD0, 0x01 },
[OPAL_LOCKINGRANGE_ACE_RDLOCKED] =
{ 0x00, 0x00, 0x00, 0x08, 0x00, 0x03, 0xE0, 0x01 },
[OPAL_LOCKINGRANGE_ACE_WRLOCKED] =
@@ -1147,12 +1149,8 @@ static int finalize_and_send(struct opal_dev *dev, cont_fn cont)
return opal_send_recv(dev, cont);
}
-/*
- * request @column from table @table on device @dev. On success, the column
- * data will be available in dev->resp->tok[4]
- */
-static int generic_get_column(struct opal_dev *dev, const u8 *table,
- u64 column)
+static int generic_get_columns(struct opal_dev *dev, const u8 *table,
+ u64 start_column, u64 end_column)
{
int err;
@@ -1162,12 +1160,12 @@ static int generic_get_column(struct opal_dev *dev, const u8 *table,
add_token_u8(&err, dev, OPAL_STARTNAME);
add_token_u8(&err, dev, OPAL_STARTCOLUMN);
- add_token_u64(&err, dev, column);
+ add_token_u64(&err, dev, start_column);
add_token_u8(&err, dev, OPAL_ENDNAME);
add_token_u8(&err, dev, OPAL_STARTNAME);
add_token_u8(&err, dev, OPAL_ENDCOLUMN);
- add_token_u64(&err, dev, column);
+ add_token_u64(&err, dev, end_column);
add_token_u8(&err, dev, OPAL_ENDNAME);
add_token_u8(&err, dev, OPAL_ENDLIST);
@@ -1179,6 +1177,16 @@ static int generic_get_column(struct opal_dev *dev, const u8 *table,
}
/*
+ * request @column from table @table on device @dev. On success, the column
+ * data will be available in dev->resp->tok[4]
+ */
+static int generic_get_column(struct opal_dev *dev, const u8 *table,
+ u64 column)
+{
+ return generic_get_columns(dev, table, column, column);
+}
+
+/*
* see TCG SAS 5.3.2.3 for a description of the available columns
*
* the result is provided in dev->resp->tok[4]
@@ -1437,6 +1445,129 @@ static int setup_locking_range(struct opal_dev *dev, void *data)
return finalize_and_send(dev, parse_and_check_status);
}
+static int response_get_column(const struct parsed_resp *resp,
+ int *iter,
+ u8 column,
+ u64 *value)
+{
+ const struct opal_resp_tok *tok;
+ int n = *iter;
+ u64 val;
+
+ tok = response_get_token(resp, n);
+ if (IS_ERR(tok))
+ return PTR_ERR(tok);
+
+ if (!response_token_matches(tok, OPAL_STARTNAME)) {
+ pr_debug("Unexpected response token type %d.\n", n);
+ return OPAL_INVAL_PARAM;
+ }
+ n++;
+
+ if (response_get_u64(resp, n) != column) {
+ pr_debug("Token %d does not match expected column %u.\n",
+ n, column);
+ return OPAL_INVAL_PARAM;
+ }
+ n++;
+
+ val = response_get_u64(resp, n);
+ n++;
+
+ tok = response_get_token(resp, n);
+ if (IS_ERR(tok))
+ return PTR_ERR(tok);
+
+ if (!response_token_matches(tok, OPAL_ENDNAME)) {
+ pr_debug("Unexpected response token type %d.\n", n);
+ return OPAL_INVAL_PARAM;
+ }
+ n++;
+
+ *value = val;
+ *iter = n;
+
+ return 0;
+}
+
+static int locking_range_status(struct opal_dev *dev, void *data)
+{
+ u8 lr_buffer[OPAL_UID_LENGTH];
+ u64 resp;
+ bool rlocked, wlocked;
+ int err, tok_n = 2;
+ struct opal_lr_status *lrst = data;
+
+ err = build_locking_range(lr_buffer, sizeof(lr_buffer),
+ lrst->session.opal_key.lr);
+ if (err)
+ return err;
+
+ err = generic_get_columns(dev, lr_buffer, OPAL_RANGESTART,
+ OPAL_WRITELOCKED);
+ if (err) {
+ pr_debug("Couldn't get lr %u table columns %d to %d.\n",
+ lrst->session.opal_key.lr, OPAL_RANGESTART,
+ OPAL_WRITELOCKED);
+ return err;
+ }
+
+ /* range start */
+ err = response_get_column(&dev->parsed, &tok_n, OPAL_RANGESTART,
+ &lrst->range_start);
+ if (err)
+ return err;
+
+ /* range length */
+ err = response_get_column(&dev->parsed, &tok_n, OPAL_RANGELENGTH,
+ &lrst->range_length);
+ if (err)
+ return err;
+
+ /* RLE */
+ err = response_get_column(&dev->parsed, &tok_n, OPAL_READLOCKENABLED,
+ &resp);
+ if (err)
+ return err;
+
+ lrst->RLE = !!resp;
+
+ /* WLE */
+ err = response_get_column(&dev->parsed, &tok_n, OPAL_WRITELOCKENABLED,
+ &resp);
+ if (err)
+ return err;
+
+ lrst->WLE = !!resp;
+
+ /* read locked */
+ err = response_get_column(&dev->parsed, &tok_n, OPAL_READLOCKED, &resp);
+ if (err)
+ return err;
+
+ rlocked = !!resp;
+
+ /* write locked */
+ err = response_get_column(&dev->parsed, &tok_n, OPAL_WRITELOCKED, &resp);
+ if (err)
+ return err;
+
+ wlocked = !!resp;
+
+ /* opal_lock_state can not map 'read locked' only state. */
+ lrst->l_state = OPAL_RW;
+ if (rlocked && wlocked)
+ lrst->l_state = OPAL_LK;
+ else if (wlocked)
+ lrst->l_state = OPAL_RO;
+ else if (rlocked) {
+ pr_debug("Can not report read locked only state.\n");
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
static int start_generic_opal_session(struct opal_dev *dev,
enum opal_uid auth,
enum opal_uid sp_type,
@@ -1759,25 +1890,43 @@ static int set_sid_cpin_pin(struct opal_dev *dev, void *data)
return finalize_and_send(dev, parse_and_check_status);
}
-static int add_user_to_lr(struct opal_dev *dev, void *data)
+static void add_authority_object_ref(int *err,
+ struct opal_dev *dev,
+ const u8 *uid,
+ size_t uid_len)
+{
+ add_token_u8(err, dev, OPAL_STARTNAME);
+ add_token_bytestring(err, dev,
+ opaluid[OPAL_HALF_UID_AUTHORITY_OBJ_REF],
+ OPAL_UID_LENGTH/2);
+ add_token_bytestring(err, dev, uid, uid_len);
+ add_token_u8(err, dev, OPAL_ENDNAME);
+}
+
+static void add_boolean_object_ref(int *err,
+ struct opal_dev *dev,
+ u8 boolean_op)
+{
+ add_token_u8(err, dev, OPAL_STARTNAME);
+ add_token_bytestring(err, dev, opaluid[OPAL_HALF_UID_BOOLEAN_ACE],
+ OPAL_UID_LENGTH/2);
+ add_token_u8(err, dev, boolean_op);
+ add_token_u8(err, dev, OPAL_ENDNAME);
+}
+
+static int set_lr_boolean_ace(struct opal_dev *dev,
+ unsigned int opal_uid,
+ u8 lr,
+ const u8 *users,
+ size_t users_len)
{
u8 lr_buffer[OPAL_UID_LENGTH];
u8 user_uid[OPAL_UID_LENGTH];
- struct opal_lock_unlock *lkul = data;
+ u8 u;
int err;
- memcpy(lr_buffer, opaluid[OPAL_LOCKINGRANGE_ACE_RDLOCKED],
- OPAL_UID_LENGTH);
-
- if (lkul->l_state == OPAL_RW)
- memcpy(lr_buffer, opaluid[OPAL_LOCKINGRANGE_ACE_WRLOCKED],
- OPAL_UID_LENGTH);
-
- lr_buffer[7] = lkul->session.opal_key.lr;
-
- memcpy(user_uid, opaluid[OPAL_USER1_UID], OPAL_UID_LENGTH);
-
- user_uid[7] = lkul->session.who;
+ memcpy(lr_buffer, opaluid[opal_uid], OPAL_UID_LENGTH);
+ lr_buffer[7] = lr;
err = cmd_start(dev, lr_buffer, opalmethod[OPAL_SET]);
@@ -1790,35 +1939,49 @@ static int add_user_to_lr(struct opal_dev *dev, void *data)
add_token_u8(&err, dev, OPAL_STARTLIST);
+ for (u = 0; u < users_len; u++) {
+ if (users[u] == OPAL_ADMIN1)
+ memcpy(user_uid, opaluid[OPAL_ADMIN1_UID],
+ OPAL_UID_LENGTH);
+ else {
+ memcpy(user_uid, opaluid[OPAL_USER1_UID],
+ OPAL_UID_LENGTH);
+ user_uid[7] = users[u];
+ }
- add_token_u8(&err, dev, OPAL_STARTNAME);
- add_token_bytestring(&err, dev,
- opaluid[OPAL_HALF_UID_AUTHORITY_OBJ_REF],
- OPAL_UID_LENGTH/2);
- add_token_bytestring(&err, dev, user_uid, OPAL_UID_LENGTH);
- add_token_u8(&err, dev, OPAL_ENDNAME);
+ add_authority_object_ref(&err, dev, user_uid, sizeof(user_uid));
-
- add_token_u8(&err, dev, OPAL_STARTNAME);
- add_token_bytestring(&err, dev,
- opaluid[OPAL_HALF_UID_AUTHORITY_OBJ_REF],
- OPAL_UID_LENGTH/2);
- add_token_bytestring(&err, dev, user_uid, OPAL_UID_LENGTH);
- add_token_u8(&err, dev, OPAL_ENDNAME);
-
-
- add_token_u8(&err, dev, OPAL_STARTNAME);
- add_token_bytestring(&err, dev, opaluid[OPAL_HALF_UID_BOOLEAN_ACE],
- OPAL_UID_LENGTH/2);
- add_token_u8(&err, dev, 1);
- add_token_u8(&err, dev, OPAL_ENDNAME);
-
+ /*
+ * Add boolean operator in postfix only with
+ * two or more authorities being added in ACE
+ * expresion.
+ * */
+ if (u > 0)
+ add_boolean_object_ref(&err, dev, OPAL_BOOLEAN_OR);
+ }
add_token_u8(&err, dev, OPAL_ENDLIST);
add_token_u8(&err, dev, OPAL_ENDNAME);
add_token_u8(&err, dev, OPAL_ENDLIST);
add_token_u8(&err, dev, OPAL_ENDNAME);
+ return err;
+}
+
+static int add_user_to_lr(struct opal_dev *dev, void *data)
+{
+ int err;
+ struct opal_lock_unlock *lkul = data;
+ const u8 users[] = {
+ lkul->session.who
+ };
+
+ err = set_lr_boolean_ace(dev,
+ lkul->l_state == OPAL_RW ?
+ OPAL_LOCKINGRANGE_ACE_WRLOCKED :
+ OPAL_LOCKINGRANGE_ACE_RDLOCKED,
+ lkul->session.opal_key.lr, users,
+ ARRAY_SIZE(users));
if (err) {
pr_debug("Error building add user to locking range command.\n");
return err;
@@ -1827,6 +1990,27 @@ static int add_user_to_lr(struct opal_dev *dev, void *data)
return finalize_and_send(dev, parse_and_check_status);
}
+static int add_user_to_lr_ace(struct opal_dev *dev, void *data)
+{
+ int err;
+ struct opal_lock_unlock *lkul = data;
+ const u8 users[] = {
+ OPAL_ADMIN1,
+ lkul->session.who
+ };
+
+ err = set_lr_boolean_ace(dev, OPAL_LOCKINGRANGE_ACE_START_TO_KEY,
+ lkul->session.opal_key.lr, users,
+ ARRAY_SIZE(users));
+
+ if (err) {
+ pr_debug("Error building add user to locking ranges ACEs.\n");
+ return err;
+ }
+
+ return finalize_and_send(dev, parse_and_check_status);
+}
+
static int lock_unlock_locking_range(struct opal_dev *dev, void *data)
{
u8 lr_buffer[OPAL_UID_LENGTH];
@@ -2364,6 +2548,7 @@ static int opal_add_user_to_lr(struct opal_dev *dev,
const struct opal_step steps[] = {
{ start_admin1LSP_opal_session, &lk_unlk->session.opal_key },
{ add_user_to_lr, lk_unlk },
+ { add_user_to_lr_ace, lk_unlk },
{ end_opal_session, }
};
int ret;
@@ -2580,6 +2765,33 @@ static int opal_setup_locking_range(struct opal_dev *dev,
return ret;
}
+static int opal_locking_range_status(struct opal_dev *dev,
+ struct opal_lr_status *opal_lrst,
+ void __user *data)
+{
+ const struct opal_step lr_steps[] = {
+ { start_auth_opal_session, &opal_lrst->session },
+ { locking_range_status, opal_lrst },
+ { end_opal_session, }
+ };
+ int ret;
+
+ mutex_lock(&dev->dev_lock);
+ setup_opal_dev(dev);
+ ret = execute_steps(dev, lr_steps, ARRAY_SIZE(lr_steps));
+ mutex_unlock(&dev->dev_lock);
+
+ /* skip session info when copying back to uspace */
+ if (!ret && copy_to_user(data + offsetof(struct opal_lr_status, range_start),
+ (void *)opal_lrst + offsetof(struct opal_lr_status, range_start),
+ sizeof(*opal_lrst) - offsetof(struct opal_lr_status, range_start))) {
+ pr_debug("Error copying status to userspace\n");
+ return -EFAULT;
+ }
+
+ return ret;
+}
+
static int opal_set_new_pw(struct opal_dev *dev, struct opal_new_pw *opal_pw)
{
const struct opal_step pw_steps[] = {
@@ -2814,6 +3026,9 @@ int sed_ioctl(struct opal_dev *dev, unsigned int cmd, void __user *arg)
case IOC_OPAL_GET_STATUS:
ret = opal_get_status(dev, arg);
break;
+ case IOC_OPAL_GET_LR_STATUS:
+ ret = opal_locking_range_status(dev, p, arg);
+ break;
default:
break;
}
diff --git a/drivers/accel/ivpu/ivpu_drv.c b/drivers/accel/ivpu/ivpu_drv.c
index 231f29b..6a320a7 100644
--- a/drivers/accel/ivpu/ivpu_drv.c
+++ b/drivers/accel/ivpu/ivpu_drv.c
@@ -8,7 +8,6 @@
#include <linux/pci.h>
#include <drm/drm_accel.h>
-#include <drm/drm_drv.h>
#include <drm/drm_file.h>
#include <drm/drm_gem.h>
#include <drm/drm_ioctl.h>
@@ -118,6 +117,10 @@ static int ivpu_get_param_ioctl(struct drm_device *dev, void *data, struct drm_f
struct pci_dev *pdev = to_pci_dev(vdev->drm.dev);
struct drm_ivpu_param *args = data;
int ret = 0;
+ int idx;
+
+ if (!drm_dev_enter(dev, &idx))
+ return -ENODEV;
switch (args->param) {
case DRM_IVPU_PARAM_DEVICE_ID:
@@ -171,6 +174,7 @@ static int ivpu_get_param_ioctl(struct drm_device *dev, void *data, struct drm_f
break;
}
+ drm_dev_exit(idx);
return ret;
}
@@ -470,8 +474,8 @@ static int ivpu_dev_init(struct ivpu_device *vdev)
vdev->hw->ops = &ivpu_hw_mtl_ops;
vdev->platform = IVPU_PLATFORM_INVALID;
- vdev->context_xa_limit.min = IVPU_GLOBAL_CONTEXT_MMU_SSID + 1;
- vdev->context_xa_limit.max = IVPU_CONTEXT_LIMIT;
+ vdev->context_xa_limit.min = IVPU_USER_CONTEXT_MIN_SSID;
+ vdev->context_xa_limit.max = IVPU_USER_CONTEXT_MAX_SSID;
atomic64_set(&vdev->unique_id_counter, 0);
xa_init_flags(&vdev->context_xa, XA_FLAGS_ALLOC);
xa_init_flags(&vdev->submitted_jobs_xa, XA_FLAGS_ALLOC1);
@@ -565,6 +569,8 @@ static int ivpu_dev_init(struct ivpu_device *vdev)
ivpu_mmu_global_context_fini(vdev);
err_power_down:
ivpu_hw_power_down(vdev);
+ if (IVPU_WA(d3hot_after_power_off))
+ pci_set_power_state(to_pci_dev(vdev->drm.dev), PCI_D3hot);
err_xa_destroy:
xa_destroy(&vdev->submitted_jobs_xa);
xa_destroy(&vdev->context_xa);
@@ -575,7 +581,11 @@ static void ivpu_dev_fini(struct ivpu_device *vdev)
{
ivpu_pm_disable(vdev);
ivpu_shutdown(vdev);
+ if (IVPU_WA(d3hot_after_power_off))
+ pci_set_power_state(to_pci_dev(vdev->drm.dev), PCI_D3hot);
ivpu_job_done_thread_fini(vdev);
+ ivpu_pm_cancel_recovery(vdev);
+
ivpu_ipc_fini(vdev);
ivpu_fw_fini(vdev);
ivpu_mmu_global_context_fini(vdev);
@@ -622,7 +632,7 @@ static void ivpu_remove(struct pci_dev *pdev)
{
struct ivpu_device *vdev = pci_get_drvdata(pdev);
- drm_dev_unregister(&vdev->drm);
+ drm_dev_unplug(&vdev->drm);
ivpu_dev_fini(vdev);
}
diff --git a/drivers/accel/ivpu/ivpu_drv.h b/drivers/accel/ivpu/ivpu_drv.h
index f47b496..d3013fb 100644
--- a/drivers/accel/ivpu/ivpu_drv.h
+++ b/drivers/accel/ivpu/ivpu_drv.h
@@ -7,6 +7,7 @@
#define __IVPU_DRV_H__
#include <drm/drm_device.h>
+#include <drm/drm_drv.h>
#include <drm/drm_managed.h>
#include <drm/drm_mm.h>
#include <drm/drm_print.h>
@@ -24,7 +25,10 @@
#define PCI_DEVICE_ID_MTL 0x7d1d
#define IVPU_GLOBAL_CONTEXT_MMU_SSID 0
-#define IVPU_CONTEXT_LIMIT 64
+/* SSID 1 is used by the VPU to represent invalid context */
+#define IVPU_USER_CONTEXT_MIN_SSID 2
+#define IVPU_USER_CONTEXT_MAX_SSID (IVPU_USER_CONTEXT_MIN_SSID + 63)
+
#define IVPU_NUM_ENGINES 2
#define IVPU_PLATFORM_SILICON 0
@@ -70,6 +74,7 @@
struct ivpu_wa_table {
bool punit_disabled;
bool clear_runtime_mem;
+ bool d3hot_after_power_off;
};
struct ivpu_hw_info;
diff --git a/drivers/accel/ivpu/ivpu_hw_mtl.c b/drivers/accel/ivpu/ivpu_hw_mtl.c
index 62bfaa9..382ec12 100644
--- a/drivers/accel/ivpu/ivpu_hw_mtl.c
+++ b/drivers/accel/ivpu/ivpu_hw_mtl.c
@@ -12,24 +12,23 @@
#include "ivpu_mmu.h"
#include "ivpu_pm.h"
-#define TILE_FUSE_ENABLE_BOTH 0x0
-#define TILE_FUSE_ENABLE_UPPER 0x1
-#define TILE_FUSE_ENABLE_LOWER 0x2
-
-#define TILE_SKU_BOTH_MTL 0x3630
-#define TILE_SKU_LOWER_MTL 0x3631
-#define TILE_SKU_UPPER_MTL 0x3632
+#define TILE_FUSE_ENABLE_BOTH 0x0
+#define TILE_SKU_BOTH_MTL 0x3630
/* Work point configuration values */
-#define WP_CONFIG_1_TILE_5_3_RATIO 0x0101
-#define WP_CONFIG_1_TILE_4_3_RATIO 0x0102
-#define WP_CONFIG_2_TILE_5_3_RATIO 0x0201
-#define WP_CONFIG_2_TILE_4_3_RATIO 0x0202
-#define WP_CONFIG_0_TILE_PLL_OFF 0x0000
+#define CONFIG_1_TILE 0x01
+#define CONFIG_2_TILE 0x02
+#define PLL_RATIO_5_3 0x01
+#define PLL_RATIO_4_3 0x02
+#define WP_CONFIG(tile, ratio) (((tile) << 8) | (ratio))
+#define WP_CONFIG_1_TILE_5_3_RATIO WP_CONFIG(CONFIG_1_TILE, PLL_RATIO_5_3)
+#define WP_CONFIG_1_TILE_4_3_RATIO WP_CONFIG(CONFIG_1_TILE, PLL_RATIO_4_3)
+#define WP_CONFIG_2_TILE_5_3_RATIO WP_CONFIG(CONFIG_2_TILE, PLL_RATIO_5_3)
+#define WP_CONFIG_2_TILE_4_3_RATIO WP_CONFIG(CONFIG_2_TILE, PLL_RATIO_4_3)
+#define WP_CONFIG_0_TILE_PLL_OFF WP_CONFIG(0, 0)
#define PLL_REF_CLK_FREQ (50 * 1000000)
#define PLL_SIMULATION_FREQ (10 * 1000000)
-#define PLL_RATIO_TO_FREQ(x) ((x) * PLL_REF_CLK_FREQ)
#define PLL_DEFAULT_EPP_VALUE 0x80
#define TIM_SAFE_ENABLE 0xf1d0dead
@@ -101,6 +100,7 @@ static void ivpu_hw_wa_init(struct ivpu_device *vdev)
{
vdev->wa.punit_disabled = ivpu_is_fpga(vdev);
vdev->wa.clear_runtime_mem = false;
+ vdev->wa.d3hot_after_power_off = true;
}
static void ivpu_hw_timeouts_init(struct ivpu_device *vdev)
@@ -218,7 +218,8 @@ static int ivpu_pll_drive(struct ivpu_device *vdev, bool enable)
config = 0;
}
- ivpu_dbg(vdev, PM, "PLL workpoint request: %d Hz\n", PLL_RATIO_TO_FREQ(target_ratio));
+ ivpu_dbg(vdev, PM, "PLL workpoint request: config 0x%04x pll ratio 0x%x\n",
+ config, target_ratio);
ret = ivpu_pll_cmd_send(vdev, hw->pll.min_ratio, hw->pll.max_ratio, target_ratio, config);
if (ret) {
@@ -403,11 +404,6 @@ static int ivpu_boot_host_ss_axi_enable(struct ivpu_device *vdev)
return ivpu_boot_host_ss_axi_drive(vdev, true);
}
-static int ivpu_boot_host_ss_axi_disable(struct ivpu_device *vdev)
-{
- return ivpu_boot_host_ss_axi_drive(vdev, false);
-}
-
static int ivpu_boot_host_ss_top_noc_drive(struct ivpu_device *vdev, bool enable)
{
int ret;
@@ -441,11 +437,6 @@ static int ivpu_boot_host_ss_top_noc_enable(struct ivpu_device *vdev)
return ivpu_boot_host_ss_top_noc_drive(vdev, true);
}
-static int ivpu_boot_host_ss_top_noc_disable(struct ivpu_device *vdev)
-{
- return ivpu_boot_host_ss_top_noc_drive(vdev, false);
-}
-
static void ivpu_boot_pwr_island_trickle_drive(struct ivpu_device *vdev, bool enable)
{
u32 val = REGV_RD32(MTL_VPU_HOST_SS_AON_PWR_ISLAND_TRICKLE_EN0);
@@ -504,16 +495,6 @@ static void ivpu_boot_dpu_active_drive(struct ivpu_device *vdev, bool enable)
REGV_WR32(MTL_VPU_HOST_SS_AON_DPU_ACTIVE, val);
}
-static int ivpu_boot_pwr_domain_disable(struct ivpu_device *vdev)
-{
- ivpu_boot_dpu_active_drive(vdev, false);
- ivpu_boot_pwr_island_isolation_drive(vdev, true);
- ivpu_boot_pwr_island_trickle_drive(vdev, false);
- ivpu_boot_pwr_island_drive(vdev, false);
-
- return ivpu_boot_wait_for_pwr_island_status(vdev, 0x0);
-}
-
static int ivpu_boot_pwr_domain_enable(struct ivpu_device *vdev)
{
int ret;
@@ -629,34 +610,10 @@ static int ivpu_boot_d0i3_drive(struct ivpu_device *vdev, bool enable)
static int ivpu_hw_mtl_info_init(struct ivpu_device *vdev)
{
struct ivpu_hw_info *hw = vdev->hw;
- u32 tile_fuse;
- tile_fuse = REGB_RD32(MTL_BUTTRESS_TILE_FUSE);
- if (!REG_TEST_FLD(MTL_BUTTRESS_TILE_FUSE, VALID, tile_fuse))
- ivpu_warn(vdev, "Tile Fuse: Invalid (0x%x)\n", tile_fuse);
-
- hw->tile_fuse = REG_GET_FLD(MTL_BUTTRESS_TILE_FUSE, SKU, tile_fuse);
- switch (hw->tile_fuse) {
- case TILE_FUSE_ENABLE_LOWER:
- hw->sku = TILE_SKU_LOWER_MTL;
- hw->config = WP_CONFIG_1_TILE_5_3_RATIO;
- ivpu_dbg(vdev, MISC, "Tile Fuse: Enable Lower\n");
- break;
- case TILE_FUSE_ENABLE_UPPER:
- hw->sku = TILE_SKU_UPPER_MTL;
- hw->config = WP_CONFIG_1_TILE_4_3_RATIO;
- ivpu_dbg(vdev, MISC, "Tile Fuse: Enable Upper\n");
- break;
- case TILE_FUSE_ENABLE_BOTH:
- hw->sku = TILE_SKU_BOTH_MTL;
- hw->config = WP_CONFIG_2_TILE_5_3_RATIO;
- ivpu_dbg(vdev, MISC, "Tile Fuse: Enable Both\n");
- break;
- default:
- hw->config = WP_CONFIG_0_TILE_PLL_OFF;
- ivpu_dbg(vdev, MISC, "Tile Fuse: Disable\n");
- break;
- }
+ hw->tile_fuse = TILE_FUSE_ENABLE_BOTH;
+ hw->sku = TILE_SKU_BOTH_MTL;
+ hw->config = WP_CONFIG_2_TILE_4_3_RATIO;
ivpu_pll_init_frequency_ratios(vdev);
@@ -797,21 +754,8 @@ static int ivpu_hw_mtl_power_down(struct ivpu_device *vdev)
{
int ret = 0;
- /* FPGA requires manual clearing of IP_Reset bit by enabling quiescent state */
- if (ivpu_is_fpga(vdev)) {
- if (ivpu_boot_host_ss_top_noc_disable(vdev)) {
- ivpu_err(vdev, "Failed to disable TOP NOC\n");
- ret = -EIO;
- }
-
- if (ivpu_boot_host_ss_axi_disable(vdev)) {
- ivpu_err(vdev, "Failed to disable AXI\n");
- ret = -EIO;
- }
- }
-
- if (ivpu_boot_pwr_domain_disable(vdev)) {
- ivpu_err(vdev, "Failed to disable power domain\n");
+ if (ivpu_hw_mtl_reset(vdev)) {
+ ivpu_err(vdev, "Failed to reset the VPU\n");
ret = -EIO;
}
@@ -844,6 +788,19 @@ static void ivpu_hw_mtl_wdt_disable(struct ivpu_device *vdev)
REGV_WR32(MTL_VPU_CPU_SS_TIM_GEN_CONFIG, val);
}
+static u32 ivpu_hw_mtl_pll_to_freq(u32 ratio, u32 config)
+{
+ u32 pll_clock = PLL_REF_CLK_FREQ * ratio;
+ u32 cpu_clock;
+
+ if ((config & 0xff) == PLL_RATIO_4_3)
+ cpu_clock = pll_clock * 2 / 4;
+ else
+ cpu_clock = pll_clock * 2 / 5;
+
+ return cpu_clock;
+}
+
/* Register indirect accesses */
static u32 ivpu_hw_mtl_reg_pll_freq_get(struct ivpu_device *vdev)
{
@@ -855,7 +812,7 @@ static u32 ivpu_hw_mtl_reg_pll_freq_get(struct ivpu_device *vdev)
if (!ivpu_is_silicon(vdev))
return PLL_SIMULATION_FREQ;
- return PLL_RATIO_TO_FREQ(pll_curr_ratio);
+ return ivpu_hw_mtl_pll_to_freq(pll_curr_ratio, vdev->hw->config);
}
static u32 ivpu_hw_mtl_reg_telemetry_offset_get(struct ivpu_device *vdev)
diff --git a/drivers/accel/ivpu/ivpu_ipc.h b/drivers/accel/ivpu/ivpu_ipc.h
index 9838202..68f5b66 100644
--- a/drivers/accel/ivpu/ivpu_ipc.h
+++ b/drivers/accel/ivpu/ivpu_ipc.h
@@ -21,7 +21,7 @@ struct ivpu_bo;
#define IVPU_IPC_ALIGNMENT 64
#define IVPU_IPC_HDR_FREE 0
-#define IVPU_IPC_HDR_ALLOCATED 0
+#define IVPU_IPC_HDR_ALLOCATED 1
/**
* struct ivpu_ipc_hdr - The IPC message header structure, exchanged
diff --git a/drivers/accel/ivpu/ivpu_job.c b/drivers/accel/ivpu/ivpu_job.c
index 94068ae..910301f 100644
--- a/drivers/accel/ivpu/ivpu_job.c
+++ b/drivers/accel/ivpu/ivpu_job.c
@@ -489,12 +489,12 @@ ivpu_job_prepare_bos_for_submit(struct drm_file *file, struct ivpu_job *job, u32
int ivpu_submit_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
{
- int ret = 0;
struct ivpu_file_priv *file_priv = file->driver_priv;
struct ivpu_device *vdev = file_priv->vdev;
struct drm_ivpu_submit *params = data;
struct ivpu_job *job;
u32 *buf_handles;
+ int idx, ret;
if (params->engine > DRM_IVPU_ENGINE_COPY)
return -EINVAL;
@@ -523,6 +523,11 @@ int ivpu_submit_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
goto free_handles;
}
+ if (!drm_dev_enter(&vdev->drm, &idx)) {
+ ret = -ENODEV;
+ goto free_handles;
+ }
+
ivpu_dbg(vdev, JOB, "Submit ioctl: ctx %u buf_count %u\n",
file_priv->ctx.id, params->buffer_count);
@@ -530,7 +535,7 @@ int ivpu_submit_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
if (!job) {
ivpu_err(vdev, "Failed to create job\n");
ret = -ENOMEM;
- goto free_handles;
+ goto dev_exit;
}
ret = ivpu_job_prepare_bos_for_submit(file, job, buf_handles, params->buffer_count,
@@ -548,6 +553,8 @@ int ivpu_submit_ioctl(struct drm_device *dev, void *data, struct drm_file *file)
job_put:
job_put(job);
+dev_exit:
+ drm_dev_exit(idx);
free_handles:
kfree(buf_handles);
diff --git a/drivers/accel/ivpu/ivpu_pm.c b/drivers/accel/ivpu/ivpu_pm.c
index 553bcbd..7df72fa 100644
--- a/drivers/accel/ivpu/ivpu_pm.c
+++ b/drivers/accel/ivpu/ivpu_pm.c
@@ -98,12 +98,18 @@ static int ivpu_resume(struct ivpu_device *vdev)
static void ivpu_pm_recovery_work(struct work_struct *work)
{
struct ivpu_pm_info *pm = container_of(work, struct ivpu_pm_info, recovery_work);
- struct ivpu_device *vdev = pm->vdev;
+ struct ivpu_device *vdev = pm->vdev;
char *evt[2] = {"IVPU_PM_EVENT=IVPU_RECOVER", NULL};
int ret;
- ret = pci_reset_function(to_pci_dev(vdev->drm.dev));
- if (ret)
+retry:
+ ret = pci_try_reset_function(to_pci_dev(vdev->drm.dev));
+ if (ret == -EAGAIN && !drm_dev_is_unplugged(&vdev->drm)) {
+ cond_resched();
+ goto retry;
+ }
+
+ if (ret && ret != -EAGAIN)
ivpu_err(vdev, "Failed to reset VPU: %d\n", ret);
kobject_uevent_env(&vdev->drm.dev->kobj, KOBJ_CHANGE, evt);
@@ -306,6 +312,11 @@ int ivpu_pm_init(struct ivpu_device *vdev)
return 0;
}
+void ivpu_pm_cancel_recovery(struct ivpu_device *vdev)
+{
+ cancel_work_sync(&vdev->pm->recovery_work);
+}
+
void ivpu_pm_enable(struct ivpu_device *vdev)
{
struct device *dev = vdev->drm.dev;
diff --git a/drivers/accel/ivpu/ivpu_pm.h b/drivers/accel/ivpu/ivpu_pm.h
index dc1b375..baca981 100644
--- a/drivers/accel/ivpu/ivpu_pm.h
+++ b/drivers/accel/ivpu/ivpu_pm.h
@@ -21,6 +21,7 @@ struct ivpu_pm_info {
int ivpu_pm_init(struct ivpu_device *vdev);
void ivpu_pm_enable(struct ivpu_device *vdev);
void ivpu_pm_disable(struct ivpu_device *vdev);
+void ivpu_pm_cancel_recovery(struct ivpu_device *vdev);
int ivpu_pm_suspend_cb(struct device *dev);
int ivpu_pm_resume_cb(struct device *dev);
diff --git a/drivers/acpi/bus.c b/drivers/acpi/bus.c
index 9531dd0..a96da65 100644
--- a/drivers/acpi/bus.c
+++ b/drivers/acpi/bus.c
@@ -459,85 +459,67 @@ static void acpi_bus_osc_negotiate_usb_control(void)
Notification Handling
-------------------------------------------------------------------------- */
-/*
- * acpi_bus_notify
- * ---------------
- * Callback for all 'system-level' device notifications (values 0x00-0x7F).
+/**
+ * acpi_bus_notify - Global system-level (0x00-0x7F) notifications handler
+ * @handle: Target ACPI object.
+ * @type: Notification type.
+ * @data: Ignored.
+ *
+ * This only handles notifications related to device hotplug.
*/
static void acpi_bus_notify(acpi_handle handle, u32 type, void *data)
{
struct acpi_device *adev;
- u32 ost_code = ACPI_OST_SC_NON_SPECIFIC_FAILURE;
- bool hotplug_event = false;
switch (type) {
case ACPI_NOTIFY_BUS_CHECK:
acpi_handle_debug(handle, "ACPI_NOTIFY_BUS_CHECK event\n");
- hotplug_event = true;
break;
case ACPI_NOTIFY_DEVICE_CHECK:
acpi_handle_debug(handle, "ACPI_NOTIFY_DEVICE_CHECK event\n");
- hotplug_event = true;
break;
case ACPI_NOTIFY_DEVICE_WAKE:
acpi_handle_debug(handle, "ACPI_NOTIFY_DEVICE_WAKE event\n");
- break;
+ return;
case ACPI_NOTIFY_EJECT_REQUEST:
acpi_handle_debug(handle, "ACPI_NOTIFY_EJECT_REQUEST event\n");
- hotplug_event = true;
break;
case ACPI_NOTIFY_DEVICE_CHECK_LIGHT:
acpi_handle_debug(handle, "ACPI_NOTIFY_DEVICE_CHECK_LIGHT event\n");
/* TBD: Exactly what does 'light' mean? */
- break;
+ return;
case ACPI_NOTIFY_FREQUENCY_MISMATCH:
acpi_handle_err(handle, "Device cannot be configured due "
"to a frequency mismatch\n");
- break;
+ return;
case ACPI_NOTIFY_BUS_MODE_MISMATCH:
acpi_handle_err(handle, "Device cannot be configured due "
"to a bus mode mismatch\n");
- break;
+ return;
case ACPI_NOTIFY_POWER_FAULT:
acpi_handle_err(handle, "Device has suffered a power fault\n");
- break;
+ return;
default:
acpi_handle_debug(handle, "Unknown event type 0x%x\n", type);
- break;
- }
-
- adev = acpi_get_acpi_dev(handle);
- if (!adev)
- goto err;
-
- if (adev->dev.driver) {
- struct acpi_driver *driver = to_acpi_driver(adev->dev.driver);
-
- if (driver && driver->ops.notify &&
- (driver->flags & ACPI_DRIVER_ALL_NOTIFY_EVENTS))
- driver->ops.notify(adev, type);
- }
-
- if (!hotplug_event) {
- acpi_put_acpi_dev(adev);
return;
}
- if (ACPI_SUCCESS(acpi_hotplug_schedule(adev, type)))
+ adev = acpi_get_acpi_dev(handle);
+
+ if (adev && ACPI_SUCCESS(acpi_hotplug_schedule(adev, type)))
return;
acpi_put_acpi_dev(adev);
- err:
- acpi_evaluate_ost(handle, type, ost_code, NULL);
+ acpi_evaluate_ost(handle, type, ACPI_OST_SC_NON_SPECIFIC_FAILURE, NULL);
}
static void acpi_notify_device(acpi_handle handle, u32 event, void *data)
@@ -562,42 +544,51 @@ static u32 acpi_device_fixed_event(void *data)
return ACPI_INTERRUPT_HANDLED;
}
-static int acpi_device_install_notify_handler(struct acpi_device *device)
+static int acpi_device_install_notify_handler(struct acpi_device *device,
+ struct acpi_driver *acpi_drv)
{
acpi_status status;
- if (device->device_type == ACPI_BUS_TYPE_POWER_BUTTON)
+ if (device->device_type == ACPI_BUS_TYPE_POWER_BUTTON) {
status =
acpi_install_fixed_event_handler(ACPI_EVENT_POWER_BUTTON,
acpi_device_fixed_event,
device);
- else if (device->device_type == ACPI_BUS_TYPE_SLEEP_BUTTON)
+ } else if (device->device_type == ACPI_BUS_TYPE_SLEEP_BUTTON) {
status =
acpi_install_fixed_event_handler(ACPI_EVENT_SLEEP_BUTTON,
acpi_device_fixed_event,
device);
- else
- status = acpi_install_notify_handler(device->handle,
- ACPI_DEVICE_NOTIFY,
+ } else {
+ u32 type = acpi_drv->flags & ACPI_DRIVER_ALL_NOTIFY_EVENTS ?
+ ACPI_ALL_NOTIFY : ACPI_DEVICE_NOTIFY;
+
+ status = acpi_install_notify_handler(device->handle, type,
acpi_notify_device,
device);
+ }
if (ACPI_FAILURE(status))
return -EINVAL;
return 0;
}
-static void acpi_device_remove_notify_handler(struct acpi_device *device)
+static void acpi_device_remove_notify_handler(struct acpi_device *device,
+ struct acpi_driver *acpi_drv)
{
- if (device->device_type == ACPI_BUS_TYPE_POWER_BUTTON)
+ if (device->device_type == ACPI_BUS_TYPE_POWER_BUTTON) {
acpi_remove_fixed_event_handler(ACPI_EVENT_POWER_BUTTON,
acpi_device_fixed_event);
- else if (device->device_type == ACPI_BUS_TYPE_SLEEP_BUTTON)
+ } else if (device->device_type == ACPI_BUS_TYPE_SLEEP_BUTTON) {
acpi_remove_fixed_event_handler(ACPI_EVENT_SLEEP_BUTTON,
acpi_device_fixed_event);
- else
- acpi_remove_notify_handler(device->handle, ACPI_DEVICE_NOTIFY,
+ } else {
+ u32 type = acpi_drv->flags & ACPI_DRIVER_ALL_NOTIFY_EVENTS ?
+ ACPI_ALL_NOTIFY : ACPI_DEVICE_NOTIFY;
+
+ acpi_remove_notify_handler(device->handle, type,
acpi_notify_device);
+ }
}
/* Handle events targeting \_SB device (at present only graceful shutdown) */
@@ -1039,7 +1030,7 @@ static int acpi_device_probe(struct device *dev)
acpi_drv->name, acpi_dev->pnp.bus_id);
if (acpi_drv->ops.notify) {
- ret = acpi_device_install_notify_handler(acpi_dev);
+ ret = acpi_device_install_notify_handler(acpi_dev, acpi_drv);
if (ret) {
if (acpi_drv->ops.remove)
acpi_drv->ops.remove(acpi_dev);
@@ -1062,7 +1053,7 @@ static void acpi_device_remove(struct device *dev)
struct acpi_driver *acpi_drv = to_acpi_driver(dev->driver);
if (acpi_drv->ops.notify)
- acpi_device_remove_notify_handler(acpi_dev);
+ acpi_device_remove_notify_handler(acpi_dev, acpi_drv);
if (acpi_drv->ops.remove)
acpi_drv->ops.remove(acpi_dev);
diff --git a/drivers/base/cacheinfo.c b/drivers/base/cacheinfo.c
index f6573c3..f3903d0 100644
--- a/drivers/base/cacheinfo.c
+++ b/drivers/base/cacheinfo.c
@@ -474,12 +474,18 @@ int detect_cache_attributes(unsigned int cpu)
populate_leaves:
/*
- * populate_cache_leaves() may completely setup the cache leaves and
- * shared_cpu_map or it may leave it partially setup.
+ * If LLC is valid the cache leaves were already populated so just go to
+ * update the cpu map.
*/
- ret = populate_cache_leaves(cpu);
- if (ret)
- goto free_ci;
+ if (!last_level_cache_is_valid(cpu)) {
+ /*
+ * populate_cache_leaves() may completely setup the cache leaves and
+ * shared_cpu_map or it may leave it partially setup.
+ */
+ ret = populate_cache_leaves(cpu);
+ if (ret)
+ goto free_ci;
+ }
/*
* For systems using DT for cache hierarchy, fw_token
diff --git a/drivers/block/drbd/drbd_actlog.c b/drivers/block/drbd/drbd_actlog.c
index 4292558..64b3a1c 100644
--- a/drivers/block/drbd/drbd_actlog.c
+++ b/drivers/block/drbd/drbd_actlog.c
@@ -735,8 +735,9 @@ static bool update_rs_extent(struct drbd_device *device,
return false;
}
-void drbd_advance_rs_marks(struct drbd_device *device, unsigned long still_to_go)
+void drbd_advance_rs_marks(struct drbd_peer_device *peer_device, unsigned long still_to_go)
{
+ struct drbd_device *device = peer_device->device;
unsigned long now = jiffies;
unsigned long last = device->rs_mark_time[device->rs_last_mark];
int next = (device->rs_last_mark + 1) % DRBD_SYNC_MARKS;
@@ -819,7 +820,7 @@ static int update_sync_bits(struct drbd_device *device,
if (mode == SET_IN_SYNC) {
unsigned long still_to_go = drbd_bm_total_weight(device);
bool rs_is_done = (still_to_go <= device->rs_failed);
- drbd_advance_rs_marks(device, still_to_go);
+ drbd_advance_rs_marks(first_peer_device(device), still_to_go);
if (cleared || rs_is_done)
maybe_schedule_on_disk_bitmap_update(device, rs_is_done);
} else if (mode == RECORD_RS_FAILED)
@@ -843,10 +844,11 @@ static bool plausible_request_size(int size)
* called by worker on C_SYNC_TARGET and receiver on SyncSource.
*
*/
-int __drbd_change_sync(struct drbd_device *device, sector_t sector, int size,
+int __drbd_change_sync(struct drbd_peer_device *peer_device, sector_t sector, int size,
enum update_sync_bits_mode mode)
{
/* Is called from worker and receiver context _only_ */
+ struct drbd_device *device = peer_device->device;
unsigned long sbnr, ebnr, lbnr;
unsigned long count = 0;
sector_t esector, nr_sectors;
@@ -1009,14 +1011,15 @@ int drbd_rs_begin_io(struct drbd_device *device, sector_t sector)
* tries to set it to BME_LOCKED. Returns 0 upon success, and -EAGAIN
* if there is still application IO going on in this area.
*/
-int drbd_try_rs_begin_io(struct drbd_device *device, sector_t sector)
+int drbd_try_rs_begin_io(struct drbd_peer_device *peer_device, sector_t sector)
{
+ struct drbd_device *device = peer_device->device;
unsigned int enr = BM_SECT_TO_EXT(sector);
const unsigned int al_enr = enr*AL_EXT_PER_BM_SECT;
struct lc_element *e;
struct bm_extent *bm_ext;
int i;
- bool throttle = drbd_rs_should_slow_down(device, sector, true);
+ bool throttle = drbd_rs_should_slow_down(peer_device, sector, true);
/* If we need to throttle, a half-locked (only marked BME_NO_WRITES,
* not yet BME_LOCKED) extent needs to be kicked out explicitly if we
diff --git a/drivers/block/drbd/drbd_bitmap.c b/drivers/block/drbd/drbd_bitmap.c
index 289876f..6ac8c54 100644
--- a/drivers/block/drbd/drbd_bitmap.c
+++ b/drivers/block/drbd/drbd_bitmap.c
@@ -1216,7 +1216,9 @@ static int bm_rw(struct drbd_device *device, const unsigned int flags, unsigned
* drbd_bm_read() - Read the whole bitmap from its on disk location.
* @device: DRBD device.
*/
-int drbd_bm_read(struct drbd_device *device) __must_hold(local)
+int drbd_bm_read(struct drbd_device *device,
+ struct drbd_peer_device *peer_device) __must_hold(local)
+
{
return bm_rw(device, BM_AIO_READ, 0);
}
@@ -1227,7 +1229,8 @@ int drbd_bm_read(struct drbd_device *device) __must_hold(local)
*
* Will only write pages that have changed since last IO.
*/
-int drbd_bm_write(struct drbd_device *device) __must_hold(local)
+int drbd_bm_write(struct drbd_device *device,
+ struct drbd_peer_device *peer_device) __must_hold(local)
{
return bm_rw(device, 0, 0);
}
@@ -1238,7 +1241,8 @@ int drbd_bm_write(struct drbd_device *device) __must_hold(local)
*
* Will write all pages.
*/
-int drbd_bm_write_all(struct drbd_device *device) __must_hold(local)
+int drbd_bm_write_all(struct drbd_device *device,
+ struct drbd_peer_device *peer_device) __must_hold(local)
{
return bm_rw(device, BM_AIO_WRITE_ALL_PAGES, 0);
}
@@ -1264,7 +1268,8 @@ int drbd_bm_write_lazy(struct drbd_device *device, unsigned upper_idx) __must_ho
* verify is aborted due to a failed peer disk, while local IO continues, or
* pending resync acks are still being processed.
*/
-int drbd_bm_write_copy_pages(struct drbd_device *device) __must_hold(local)
+int drbd_bm_write_copy_pages(struct drbd_device *device,
+ struct drbd_peer_device *peer_device) __must_hold(local)
{
return bm_rw(device, BM_AIO_COPY_PAGES, 0);
}
diff --git a/drivers/block/drbd/drbd_int.h b/drivers/block/drbd/drbd_int.h
index d89b7d0..a30a5ed 100644
--- a/drivers/block/drbd/drbd_int.h
+++ b/drivers/block/drbd/drbd_int.h
@@ -66,6 +66,7 @@ extern int drbd_proc_details;
struct drbd_device;
struct drbd_connection;
+struct drbd_peer_device;
/* Defines to control fault insertion */
enum {
@@ -126,8 +127,8 @@ struct bm_xfer_ctx {
unsigned bytes[2];
};
-extern void INFO_bm_xfer_stats(struct drbd_device *device,
- const char *direction, struct bm_xfer_ctx *c);
+extern void INFO_bm_xfer_stats(struct drbd_peer_device *peer_device,
+ const char *direction, struct bm_xfer_ctx *c);
static inline void bm_xfer_ctx_bit_to_word_offset(struct bm_xfer_ctx *c)
{
@@ -541,9 +542,10 @@ struct drbd_md_io {
struct bm_io_work {
struct drbd_work w;
+ struct drbd_peer_device *peer_device;
char *why;
enum bm_flag flags;
- int (*io_fn)(struct drbd_device *device);
+ int (*io_fn)(struct drbd_device *device, struct drbd_peer_device *peer_device);
void (*done)(struct drbd_device *device, int rv);
};
@@ -1041,7 +1043,7 @@ extern int drbd_send_drequest_csum(struct drbd_peer_device *, sector_t sector,
enum drbd_packet cmd);
extern int drbd_send_ov_request(struct drbd_peer_device *, sector_t sector, int size);
-extern int drbd_send_bitmap(struct drbd_device *device);
+extern int drbd_send_bitmap(struct drbd_device *device, struct drbd_peer_device *peer_device);
extern void drbd_send_sr_reply(struct drbd_peer_device *, enum drbd_state_rv retcode);
extern void conn_send_sr_reply(struct drbd_connection *connection, enum drbd_state_rv retcode);
extern int drbd_send_rs_deallocated(struct drbd_peer_device *, struct drbd_peer_request *);
@@ -1065,17 +1067,22 @@ extern void drbd_md_clear_flag(struct drbd_device *device, int flags)__must_hold
extern int drbd_md_test_flag(struct drbd_backing_dev *, int);
extern void drbd_md_mark_dirty(struct drbd_device *device);
extern void drbd_queue_bitmap_io(struct drbd_device *device,
- int (*io_fn)(struct drbd_device *),
+ int (*io_fn)(struct drbd_device *, struct drbd_peer_device *),
void (*done)(struct drbd_device *, int),
- char *why, enum bm_flag flags);
+ char *why, enum bm_flag flags,
+ struct drbd_peer_device *peer_device);
extern int drbd_bitmap_io(struct drbd_device *device,
- int (*io_fn)(struct drbd_device *),
- char *why, enum bm_flag flags);
+ int (*io_fn)(struct drbd_device *, struct drbd_peer_device *),
+ char *why, enum bm_flag flags,
+ struct drbd_peer_device *peer_device);
extern int drbd_bitmap_io_from_worker(struct drbd_device *device,
- int (*io_fn)(struct drbd_device *),
- char *why, enum bm_flag flags);
-extern int drbd_bmio_set_n_write(struct drbd_device *device) __must_hold(local);
-extern int drbd_bmio_clear_n_write(struct drbd_device *device) __must_hold(local);
+ int (*io_fn)(struct drbd_device *, struct drbd_peer_device *),
+ char *why, enum bm_flag flags,
+ struct drbd_peer_device *peer_device);
+extern int drbd_bmio_set_n_write(struct drbd_device *device,
+ struct drbd_peer_device *peer_device) __must_hold(local);
+extern int drbd_bmio_clear_n_write(struct drbd_device *device,
+ struct drbd_peer_device *peer_device) __must_hold(local);
/* Meta data layout
*
@@ -1284,14 +1291,18 @@ extern void _drbd_bm_set_bits(struct drbd_device *device,
const unsigned long s, const unsigned long e);
extern int drbd_bm_test_bit(struct drbd_device *device, unsigned long bitnr);
extern int drbd_bm_e_weight(struct drbd_device *device, unsigned long enr);
-extern int drbd_bm_read(struct drbd_device *device) __must_hold(local);
+extern int drbd_bm_read(struct drbd_device *device,
+ struct drbd_peer_device *peer_device) __must_hold(local);
extern void drbd_bm_mark_for_writeout(struct drbd_device *device, int page_nr);
-extern int drbd_bm_write(struct drbd_device *device) __must_hold(local);
+extern int drbd_bm_write(struct drbd_device *device,
+ struct drbd_peer_device *peer_device) __must_hold(local);
extern void drbd_bm_reset_al_hints(struct drbd_device *device) __must_hold(local);
extern int drbd_bm_write_hinted(struct drbd_device *device) __must_hold(local);
extern int drbd_bm_write_lazy(struct drbd_device *device, unsigned upper_idx) __must_hold(local);
-extern int drbd_bm_write_all(struct drbd_device *device) __must_hold(local);
-extern int drbd_bm_write_copy_pages(struct drbd_device *device) __must_hold(local);
+extern int drbd_bm_write_all(struct drbd_device *device,
+ struct drbd_peer_device *peer_device) __must_hold(local);
+extern int drbd_bm_write_copy_pages(struct drbd_device *device,
+ struct drbd_peer_device *peer_device) __must_hold(local);
extern size_t drbd_bm_words(struct drbd_device *device);
extern unsigned long drbd_bm_bits(struct drbd_device *device);
extern sector_t drbd_bm_capacity(struct drbd_device *device);
@@ -1422,21 +1433,24 @@ void drbd_resync_after_changed(struct drbd_device *device);
extern void drbd_start_resync(struct drbd_device *device, enum drbd_conns side);
extern void resume_next_sg(struct drbd_device *device);
extern void suspend_other_sg(struct drbd_device *device);
-extern int drbd_resync_finished(struct drbd_device *device);
+extern int drbd_resync_finished(struct drbd_peer_device *peer_device);
/* maybe rather drbd_main.c ? */
extern void *drbd_md_get_buffer(struct drbd_device *device, const char *intent);
extern void drbd_md_put_buffer(struct drbd_device *device);
extern int drbd_md_sync_page_io(struct drbd_device *device,
struct drbd_backing_dev *bdev, sector_t sector, enum req_op op);
-extern void drbd_ov_out_of_sync_found(struct drbd_device *, sector_t, int);
+extern void drbd_ov_out_of_sync_found(struct drbd_peer_device *peer_device,
+ sector_t sector, int size);
extern void wait_until_done_or_force_detached(struct drbd_device *device,
struct drbd_backing_dev *bdev, unsigned int *done);
-extern void drbd_rs_controller_reset(struct drbd_device *device);
+extern void drbd_rs_controller_reset(struct drbd_peer_device *peer_device);
-static inline void ov_out_of_sync_print(struct drbd_device *device)
+static inline void ov_out_of_sync_print(struct drbd_peer_device *peer_device)
{
+ struct drbd_device *device = peer_device->device;
+
if (device->ov_last_oos_size) {
- drbd_err(device, "Out of sync: start=%llu, size=%lu (sectors)\n",
+ drbd_err(peer_device, "Out of sync: start=%llu, size=%lu (sectors)\n",
(unsigned long long)device->ov_last_oos_start,
(unsigned long)device->ov_last_oos_size);
}
@@ -1475,7 +1489,7 @@ extern int drbd_ack_receiver(struct drbd_thread *thi);
extern void drbd_send_ping_wf(struct work_struct *ws);
extern void drbd_send_acks_wf(struct work_struct *ws);
extern bool drbd_rs_c_min_rate_throttle(struct drbd_device *device);
-extern bool drbd_rs_should_slow_down(struct drbd_device *device, sector_t sector,
+extern bool drbd_rs_should_slow_down(struct drbd_peer_device *peer_device, sector_t sector,
bool throttle_if_app_is_waiting);
extern int drbd_submit_peer_request(struct drbd_peer_request *peer_req);
extern int drbd_free_peer_reqs(struct drbd_device *, struct list_head *);
@@ -1531,22 +1545,22 @@ extern void drbd_al_begin_io(struct drbd_device *device, struct drbd_interval *i
extern void drbd_al_complete_io(struct drbd_device *device, struct drbd_interval *i);
extern void drbd_rs_complete_io(struct drbd_device *device, sector_t sector);
extern int drbd_rs_begin_io(struct drbd_device *device, sector_t sector);
-extern int drbd_try_rs_begin_io(struct drbd_device *device, sector_t sector);
+extern int drbd_try_rs_begin_io(struct drbd_peer_device *peer_device, sector_t sector);
extern void drbd_rs_cancel_all(struct drbd_device *device);
extern int drbd_rs_del_all(struct drbd_device *device);
-extern void drbd_rs_failed_io(struct drbd_device *device,
+extern void drbd_rs_failed_io(struct drbd_peer_device *peer_device,
sector_t sector, int size);
-extern void drbd_advance_rs_marks(struct drbd_device *device, unsigned long still_to_go);
+extern void drbd_advance_rs_marks(struct drbd_peer_device *peer_device, unsigned long still_to_go);
enum update_sync_bits_mode { RECORD_RS_FAILED, SET_OUT_OF_SYNC, SET_IN_SYNC };
-extern int __drbd_change_sync(struct drbd_device *device, sector_t sector, int size,
+extern int __drbd_change_sync(struct drbd_peer_device *peer_device, sector_t sector, int size,
enum update_sync_bits_mode mode);
-#define drbd_set_in_sync(device, sector, size) \
- __drbd_change_sync(device, sector, size, SET_IN_SYNC)
-#define drbd_set_out_of_sync(device, sector, size) \
- __drbd_change_sync(device, sector, size, SET_OUT_OF_SYNC)
-#define drbd_rs_failed_io(device, sector, size) \
- __drbd_change_sync(device, sector, size, RECORD_RS_FAILED)
+#define drbd_set_in_sync(peer_device, sector, size) \
+ __drbd_change_sync(peer_device, sector, size, SET_IN_SYNC)
+#define drbd_set_out_of_sync(peer_device, sector, size) \
+ __drbd_change_sync(peer_device, sector, size, SET_OUT_OF_SYNC)
+#define drbd_rs_failed_io(peer_device, sector, size) \
+ __drbd_change_sync(peer_device, sector, size, RECORD_RS_FAILED)
extern void drbd_al_shrink(struct drbd_device *device);
extern int drbd_al_initialize(struct drbd_device *, void *);
@@ -1918,18 +1932,14 @@ static inline void inc_ap_pending(struct drbd_device *device)
atomic_inc(&device->ap_pending_cnt);
}
-#define ERR_IF_CNT_IS_NEGATIVE(which, func, line) \
- if (atomic_read(&device->which) < 0) \
- drbd_err(device, "in %s:%d: " #which " = %d < 0 !\n", \
- func, line, \
- atomic_read(&device->which))
-
-#define dec_ap_pending(device) _dec_ap_pending(device, __func__, __LINE__)
-static inline void _dec_ap_pending(struct drbd_device *device, const char *func, int line)
+#define dec_ap_pending(device) ((void)expect((device), __dec_ap_pending(device) >= 0))
+static inline int __dec_ap_pending(struct drbd_device *device)
{
- if (atomic_dec_and_test(&device->ap_pending_cnt))
+ int ap_pending_cnt = atomic_dec_return(&device->ap_pending_cnt);
+
+ if (ap_pending_cnt == 0)
wake_up(&device->misc_wait);
- ERR_IF_CNT_IS_NEGATIVE(ap_pending_cnt, func, line);
+ return ap_pending_cnt;
}
/* counts how many resync-related answers we still expect from the peer
@@ -1938,16 +1948,16 @@ static inline void _dec_ap_pending(struct drbd_device *device, const char *func,
* C_SYNC_SOURCE sends P_RS_DATA_REPLY (and expects P_WRITE_ACK with ID_SYNCER)
* (or P_NEG_ACK with ID_SYNCER)
*/
-static inline void inc_rs_pending(struct drbd_device *device)
+static inline void inc_rs_pending(struct drbd_peer_device *peer_device)
{
- atomic_inc(&device->rs_pending_cnt);
+ atomic_inc(&peer_device->device->rs_pending_cnt);
}
-#define dec_rs_pending(device) _dec_rs_pending(device, __func__, __LINE__)
-static inline void _dec_rs_pending(struct drbd_device *device, const char *func, int line)
+#define dec_rs_pending(peer_device) \
+ ((void)expect((peer_device), __dec_rs_pending(peer_device) >= 0))
+static inline int __dec_rs_pending(struct drbd_peer_device *peer_device)
{
- atomic_dec(&device->rs_pending_cnt);
- ERR_IF_CNT_IS_NEGATIVE(rs_pending_cnt, func, line);
+ return atomic_dec_return(&peer_device->device->rs_pending_cnt);
}
/* counts how many answers we still need to send to the peer.
@@ -1964,18 +1974,16 @@ static inline void inc_unacked(struct drbd_device *device)
atomic_inc(&device->unacked_cnt);
}
-#define dec_unacked(device) _dec_unacked(device, __func__, __LINE__)
-static inline void _dec_unacked(struct drbd_device *device, const char *func, int line)
+#define dec_unacked(device) ((void)expect(device, __dec_unacked(device) >= 0))
+static inline int __dec_unacked(struct drbd_device *device)
{
- atomic_dec(&device->unacked_cnt);
- ERR_IF_CNT_IS_NEGATIVE(unacked_cnt, func, line);
+ return atomic_dec_return(&device->unacked_cnt);
}
-#define sub_unacked(device, n) _sub_unacked(device, n, __func__, __LINE__)
-static inline void _sub_unacked(struct drbd_device *device, int n, const char *func, int line)
+#define sub_unacked(device, n) ((void)expect(device, __sub_unacked(device) >= 0))
+static inline int __sub_unacked(struct drbd_device *device, int n)
{
- atomic_sub(n, &device->unacked_cnt);
- ERR_IF_CNT_IS_NEGATIVE(unacked_cnt, func, line);
+ return atomic_sub_return(n, &device->unacked_cnt);
}
static inline bool is_sync_target_state(enum drbd_conns connection_state)
diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
index 2c764f7..83987e7 100644
--- a/drivers/block/drbd/drbd_main.c
+++ b/drivers/block/drbd/drbd_main.c
@@ -231,9 +231,11 @@ void tl_release(struct drbd_connection *connection, unsigned int barrier_nr,
}
req = list_prepare_entry(tmp, &connection->transfer_log, tl_requests);
list_for_each_entry_safe_from(req, r, &connection->transfer_log, tl_requests) {
+ struct drbd_peer_device *peer_device;
if (req->epoch != expect_epoch)
break;
- _req_mod(req, BARRIER_ACKED);
+ peer_device = conn_peer_device(connection, req->device->vnr);
+ _req_mod(req, BARRIER_ACKED, peer_device);
}
spin_unlock_irq(&connection->resource->req_lock);
@@ -256,10 +258,13 @@ void tl_release(struct drbd_connection *connection, unsigned int barrier_nr,
/* must hold resource->req_lock */
void _tl_restart(struct drbd_connection *connection, enum drbd_req_event what)
{
+ struct drbd_peer_device *peer_device;
struct drbd_request *req, *r;
- list_for_each_entry_safe(req, r, &connection->transfer_log, tl_requests)
- _req_mod(req, what);
+ list_for_each_entry_safe(req, r, &connection->transfer_log, tl_requests) {
+ peer_device = conn_peer_device(connection, req->device->vnr);
+ _req_mod(req, what, peer_device);
+ }
}
void tl_restart(struct drbd_connection *connection, enum drbd_req_event what)
@@ -297,7 +302,7 @@ void tl_abort_disk_io(struct drbd_device *device)
continue;
if (req->device != device)
continue;
- _req_mod(req, ABORT_DISK_IO);
+ _req_mod(req, ABORT_DISK_IO, NULL);
}
spin_unlock_irq(&connection->resource->req_lock);
}
@@ -1198,10 +1203,11 @@ static int fill_bitmap_rle_bits(struct drbd_device *device,
* code upon failure.
*/
static int
-send_bitmap_rle_or_plain(struct drbd_device *device, struct bm_xfer_ctx *c)
+send_bitmap_rle_or_plain(struct drbd_peer_device *peer_device, struct bm_xfer_ctx *c)
{
- struct drbd_socket *sock = &first_peer_device(device)->connection->data;
- unsigned int header_size = drbd_header_size(first_peer_device(device)->connection);
+ struct drbd_device *device = peer_device->device;
+ struct drbd_socket *sock = &peer_device->connection->data;
+ unsigned int header_size = drbd_header_size(peer_device->connection);
struct p_compressed_bm *p = sock->sbuf + header_size;
int len, err;
@@ -1212,7 +1218,7 @@ send_bitmap_rle_or_plain(struct drbd_device *device, struct bm_xfer_ctx *c)
if (len) {
dcbp_set_code(p, RLE_VLI_Bits);
- err = __send_command(first_peer_device(device)->connection, device->vnr, sock,
+ err = __send_command(peer_device->connection, device->vnr, sock,
P_COMPRESSED_BITMAP, sizeof(*p) + len,
NULL, 0);
c->packets[0]++;
@@ -1233,7 +1239,8 @@ send_bitmap_rle_or_plain(struct drbd_device *device, struct bm_xfer_ctx *c)
len = num_words * sizeof(*p);
if (len)
drbd_bm_get_lel(device, c->word_offset, num_words, p);
- err = __send_command(first_peer_device(device)->connection, device->vnr, sock, P_BITMAP, len, NULL, 0);
+ err = __send_command(peer_device->connection, device->vnr, sock, P_BITMAP,
+ len, NULL, 0);
c->word_offset += num_words;
c->bit_offset = c->word_offset * BITS_PER_LONG;
@@ -1245,7 +1252,7 @@ send_bitmap_rle_or_plain(struct drbd_device *device, struct bm_xfer_ctx *c)
}
if (!err) {
if (len == 0) {
- INFO_bm_xfer_stats(device, "send", c);
+ INFO_bm_xfer_stats(peer_device, "send", c);
return 0;
} else
return 1;
@@ -1254,7 +1261,8 @@ send_bitmap_rle_or_plain(struct drbd_device *device, struct bm_xfer_ctx *c)
}
/* See the comment at receive_bitmap() */
-static int _drbd_send_bitmap(struct drbd_device *device)
+static int _drbd_send_bitmap(struct drbd_device *device,
+ struct drbd_peer_device *peer_device)
{
struct bm_xfer_ctx c;
int err;
@@ -1266,7 +1274,7 @@ static int _drbd_send_bitmap(struct drbd_device *device)
if (drbd_md_test_flag(device->ldev, MDF_FULL_SYNC)) {
drbd_info(device, "Writing the whole bitmap, MDF_FullSync was set.\n");
drbd_bm_set_all(device);
- if (drbd_bm_write(device)) {
+ if (drbd_bm_write(device, peer_device)) {
/* write_bm did fail! Leave full sync flag set in Meta P_DATA
* but otherwise process as per normal - need to tell other
* side that a full resync is required! */
@@ -1285,20 +1293,20 @@ static int _drbd_send_bitmap(struct drbd_device *device)
};
do {
- err = send_bitmap_rle_or_plain(device, &c);
+ err = send_bitmap_rle_or_plain(peer_device, &c);
} while (err > 0);
return err == 0;
}
-int drbd_send_bitmap(struct drbd_device *device)
+int drbd_send_bitmap(struct drbd_device *device, struct drbd_peer_device *peer_device)
{
- struct drbd_socket *sock = &first_peer_device(device)->connection->data;
+ struct drbd_socket *sock = &peer_device->connection->data;
int err = -1;
mutex_lock(&sock->mutex);
if (sock->socket)
- err = !_drbd_send_bitmap(device);
+ err = !_drbd_send_bitmap(device, peer_device);
mutex_unlock(&sock->mutex);
return err;
}
@@ -3406,7 +3414,9 @@ void drbd_uuid_set_bm(struct drbd_device *device, u64 val) __must_hold(local)
*
* Sets all bits in the bitmap and writes the whole bitmap to stable storage.
*/
-int drbd_bmio_set_n_write(struct drbd_device *device) __must_hold(local)
+int drbd_bmio_set_n_write(struct drbd_device *device,
+ struct drbd_peer_device *peer_device) __must_hold(local)
+
{
int rv = -EIO;
@@ -3414,7 +3424,7 @@ int drbd_bmio_set_n_write(struct drbd_device *device) __must_hold(local)
drbd_md_sync(device);
drbd_bm_set_all(device);
- rv = drbd_bm_write(device);
+ rv = drbd_bm_write(device, peer_device);
if (!rv) {
drbd_md_clear_flag(device, MDF_FULL_SYNC);
@@ -3430,11 +3440,13 @@ int drbd_bmio_set_n_write(struct drbd_device *device) __must_hold(local)
*
* Clears all bits in the bitmap and writes the whole bitmap to stable storage.
*/
-int drbd_bmio_clear_n_write(struct drbd_device *device) __must_hold(local)
+int drbd_bmio_clear_n_write(struct drbd_device *device,
+ struct drbd_peer_device *peer_device) __must_hold(local)
+
{
drbd_resume_al(device);
drbd_bm_clear_all(device);
- return drbd_bm_write(device);
+ return drbd_bm_write(device, peer_device);
}
static int w_bitmap_io(struct drbd_work *w, int unused)
@@ -3453,7 +3465,7 @@ static int w_bitmap_io(struct drbd_work *w, int unused)
if (get_ldev(device)) {
drbd_bm_lock(device, work->why, work->flags);
- rv = work->io_fn(device);
+ rv = work->io_fn(device, work->peer_device);
drbd_bm_unlock(device);
put_ldev(device);
}
@@ -3488,11 +3500,12 @@ static int w_bitmap_io(struct drbd_work *w, int unused)
* put_ldev().
*/
void drbd_queue_bitmap_io(struct drbd_device *device,
- int (*io_fn)(struct drbd_device *),
+ int (*io_fn)(struct drbd_device *, struct drbd_peer_device *),
void (*done)(struct drbd_device *, int),
- char *why, enum bm_flag flags)
+ char *why, enum bm_flag flags,
+ struct drbd_peer_device *peer_device)
{
- D_ASSERT(device, current == first_peer_device(device)->connection->worker.task);
+ D_ASSERT(device, current == peer_device->connection->worker.task);
D_ASSERT(device, !test_bit(BITMAP_IO_QUEUED, &device->flags));
D_ASSERT(device, !test_bit(BITMAP_IO, &device->flags));
@@ -3501,6 +3514,7 @@ void drbd_queue_bitmap_io(struct drbd_device *device,
drbd_err(device, "FIXME going to queue '%s' but '%s' still pending?\n",
why, device->bm_io_work.why);
+ device->bm_io_work.peer_device = peer_device;
device->bm_io_work.io_fn = io_fn;
device->bm_io_work.done = done;
device->bm_io_work.why = why;
@@ -3512,7 +3526,7 @@ void drbd_queue_bitmap_io(struct drbd_device *device,
* application IO does not conflict anyways. */
if (flags == BM_LOCKED_CHANGE_ALLOWED || atomic_read(&device->ap_bio_cnt) == 0) {
if (!test_and_set_bit(BITMAP_IO_QUEUED, &device->flags))
- drbd_queue_work(&first_peer_device(device)->connection->sender_work,
+ drbd_queue_work(&peer_device->connection->sender_work,
&device->bm_io_work.w);
}
spin_unlock_irq(&device->resource->req_lock);
@@ -3528,8 +3542,10 @@ void drbd_queue_bitmap_io(struct drbd_device *device,
* freezes application IO while that the actual IO operations runs. This
* functions MAY NOT be called from worker context.
*/
-int drbd_bitmap_io(struct drbd_device *device, int (*io_fn)(struct drbd_device *),
- char *why, enum bm_flag flags)
+int drbd_bitmap_io(struct drbd_device *device,
+ int (*io_fn)(struct drbd_device *, struct drbd_peer_device *),
+ char *why, enum bm_flag flags,
+ struct drbd_peer_device *peer_device)
{
/* Only suspend io, if some operation is supposed to be locked out */
const bool do_suspend_io = flags & (BM_DONT_CLEAR|BM_DONT_SET|BM_DONT_TEST);
@@ -3541,7 +3557,7 @@ int drbd_bitmap_io(struct drbd_device *device, int (*io_fn)(struct drbd_device *
drbd_suspend_io(device);
drbd_bm_lock(device, why, flags);
- rv = io_fn(device);
+ rv = io_fn(device, peer_device);
drbd_bm_unlock(device);
if (do_suspend_io)
diff --git a/drivers/block/drbd/drbd_nl.c b/drivers/block/drbd/drbd_nl.c
index 60757ac..8967298 100644
--- a/drivers/block/drbd/drbd_nl.c
+++ b/drivers/block/drbd/drbd_nl.c
@@ -1053,7 +1053,7 @@ drbd_determine_dev_size(struct drbd_device *device, enum dds_flags flags, struct
la_size_changed ? "size changed" : "md moved");
/* next line implicitly does drbd_suspend_io()+drbd_resume_io() */
drbd_bitmap_io(device, md_moved ? &drbd_bm_write_all : &drbd_bm_write,
- "size changed", BM_LOCKED_MASK);
+ "size changed", BM_LOCKED_MASK, NULL);
/* on-disk bitmap and activity log is authoritative again
* (unless there was an IO error meanwhile...) */
@@ -2027,13 +2027,15 @@ int drbd_adm_attach(struct sk_buff *skb, struct genl_info *info)
drbd_info(device, "Assuming that all blocks are out of sync "
"(aka FullSync)\n");
if (drbd_bitmap_io(device, &drbd_bmio_set_n_write,
- "set_n_write from attaching", BM_LOCKED_MASK)) {
+ "set_n_write from attaching", BM_LOCKED_MASK,
+ NULL)) {
retcode = ERR_IO_MD_DISK;
goto force_diskless_dec;
}
} else {
if (drbd_bitmap_io(device, &drbd_bm_read,
- "read from attaching", BM_LOCKED_MASK)) {
+ "read from attaching", BM_LOCKED_MASK,
+ NULL)) {
retcode = ERR_IO_MD_DISK;
goto force_diskless_dec;
}
@@ -2972,7 +2974,7 @@ int drbd_adm_invalidate(struct sk_buff *skb, struct genl_info *info)
retcode = drbd_request_state(device, NS(disk, D_INCONSISTENT));
if (retcode >= SS_SUCCESS) {
if (drbd_bitmap_io(device, &drbd_bmio_set_n_write,
- "set_n_write from invalidate", BM_LOCKED_MASK))
+ "set_n_write from invalidate", BM_LOCKED_MASK, NULL))
retcode = ERR_IO_MD_DISK;
}
} else
@@ -3005,11 +3007,12 @@ static int drbd_adm_simple_request_state(struct sk_buff *skb, struct genl_info *
return 0;
}
-static int drbd_bmio_set_susp_al(struct drbd_device *device) __must_hold(local)
+static int drbd_bmio_set_susp_al(struct drbd_device *device,
+ struct drbd_peer_device *peer_device) __must_hold(local)
{
int rv;
- rv = drbd_bmio_set_n_write(device);
+ rv = drbd_bmio_set_n_write(device, peer_device);
drbd_suspend_al(device);
return rv;
}
@@ -3052,7 +3055,7 @@ int drbd_adm_invalidate_peer(struct sk_buff *skb, struct genl_info *info)
if (retcode >= SS_SUCCESS) {
if (drbd_bitmap_io(device, &drbd_bmio_set_susp_al,
"set_n_write from invalidate_peer",
- BM_LOCKED_SET_ALLOWED))
+ BM_LOCKED_SET_ALLOWED, NULL))
retcode = ERR_IO_MD_DISK;
}
} else
@@ -4148,7 +4151,7 @@ int drbd_adm_new_c_uuid(struct sk_buff *skb, struct genl_info *info)
if (args.clear_bm) {
err = drbd_bitmap_io(device, &drbd_bmio_clear_n_write,
- "clear_n_write from new_c_uuid", BM_LOCKED_MASK);
+ "clear_n_write from new_c_uuid", BM_LOCKED_MASK, NULL);
if (err) {
drbd_err(device, "Writing bitmap failed with %d\n", err);
retcode = ERR_IO_MD_DISK;
diff --git a/drivers/block/drbd/drbd_receiver.c b/drivers/block/drbd/drbd_receiver.c
index 757f469..e54404c 100644
--- a/drivers/block/drbd/drbd_receiver.c
+++ b/drivers/block/drbd/drbd_receiver.c
@@ -2044,11 +2044,11 @@ static int e_end_resync_block(struct drbd_work *w, int unused)
D_ASSERT(device, drbd_interval_empty(&peer_req->i));
if (likely((peer_req->flags & EE_WAS_ERROR) == 0)) {
- drbd_set_in_sync(device, sector, peer_req->i.size);
+ drbd_set_in_sync(peer_device, sector, peer_req->i.size);
err = drbd_send_ack(peer_device, P_RS_WRITE_ACK, peer_req);
} else {
/* Record failure to sync */
- drbd_rs_failed_io(device, sector, peer_req->i.size);
+ drbd_rs_failed_io(peer_device, sector, peer_req->i.size);
err = drbd_send_ack(peer_device, P_NEG_ACK, peer_req);
}
@@ -2067,7 +2067,7 @@ static int recv_resync_read(struct drbd_peer_device *peer_device, sector_t secto
if (!peer_req)
goto fail;
- dec_rs_pending(device);
+ dec_rs_pending(peer_device);
inc_unacked(device);
/* corresponding dec_unacked() in e_end_resync_block()
@@ -2138,7 +2138,7 @@ static int receive_DataReply(struct drbd_connection *connection, struct packet_i
err = recv_dless_read(peer_device, req, sector, pi->size);
if (!err)
- req_mod(req, DATA_RECEIVED);
+ req_mod(req, DATA_RECEIVED, peer_device);
/* else: nothing. handled from drbd_disconnect...
* I don't think we may complete this just yet
* in case we are "on-disconnect: freeze" */
@@ -2196,7 +2196,7 @@ static void restart_conflicting_writes(struct drbd_device *device,
continue;
/* as it is RQ_POSTPONED, this will cause it to
* be queued on the retry workqueue. */
- __req_mod(req, CONFLICT_RESOLVED, NULL);
+ __req_mod(req, CONFLICT_RESOLVED, NULL, NULL);
}
}
@@ -2220,7 +2220,7 @@ static int e_end_block(struct drbd_work *w, int cancel)
P_RS_WRITE_ACK : P_WRITE_ACK;
err = drbd_send_ack(peer_device, pcmd, peer_req);
if (pcmd == P_RS_WRITE_ACK)
- drbd_set_in_sync(device, sector, peer_req->i.size);
+ drbd_set_in_sync(peer_device, sector, peer_req->i.size);
} else {
err = drbd_send_ack(peer_device, P_NEG_ACK, peer_req);
/* we expect it to be marked out of sync anyways...
@@ -2420,6 +2420,7 @@ static blk_opf_t wire_flags_to_bio(struct drbd_connection *connection, u32 dpf)
static void fail_postponed_requests(struct drbd_device *device, sector_t sector,
unsigned int size)
{
+ struct drbd_peer_device *peer_device = first_peer_device(device);
struct drbd_interval *i;
repeat:
@@ -2433,7 +2434,7 @@ static void fail_postponed_requests(struct drbd_device *device, sector_t sector,
if (!(req->rq_state & RQ_POSTPONED))
continue;
req->rq_state &= ~RQ_POSTPONED;
- __req_mod(req, NEG_ACKED, &m);
+ __req_mod(req, NEG_ACKED, peer_device, &m);
spin_unlock_irq(&device->resource->req_lock);
if (m.bio)
complete_master_bio(device, &m);
@@ -2690,7 +2691,7 @@ static int receive_Data(struct drbd_connection *connection, struct packet_info *
if (device->state.pdsk < D_INCONSISTENT) {
/* In case we have the only disk of the cluster, */
- drbd_set_out_of_sync(device, peer_req->i.sector, peer_req->i.size);
+ drbd_set_out_of_sync(peer_device, peer_req->i.sector, peer_req->i.size);
peer_req->flags &= ~EE_MAY_SET_IN_SYNC;
drbd_al_begin_io(device, &peer_req->i);
peer_req->flags |= EE_CALL_AL_COMPLETE_IO;
@@ -2729,9 +2730,10 @@ static int receive_Data(struct drbd_connection *connection, struct packet_info *
* The current sync rate used here uses only the most recent two step marks,
* to have a short time average so we can react faster.
*/
-bool drbd_rs_should_slow_down(struct drbd_device *device, sector_t sector,
+bool drbd_rs_should_slow_down(struct drbd_peer_device *peer_device, sector_t sector,
bool throttle_if_app_is_waiting)
{
+ struct drbd_device *device = peer_device->device;
struct lc_element *tmp;
bool throttle = drbd_rs_c_min_rate_throttle(device);
@@ -2843,7 +2845,7 @@ static int receive_DataRequest(struct drbd_connection *connection, struct packet
break;
case P_OV_REPLY:
verb = 0;
- dec_rs_pending(device);
+ dec_rs_pending(peer_device);
drbd_send_ack_ex(peer_device, P_OV_RESULT, sector, size, ID_IN_SYNC);
break;
default:
@@ -2914,7 +2916,7 @@ static int receive_DataRequest(struct drbd_connection *connection, struct packet
/* track progress, we may need to throttle */
atomic_add(size >> 9, &device->rs_sect_in);
peer_req->w.cb = w_e_end_ov_reply;
- dec_rs_pending(device);
+ dec_rs_pending(peer_device);
/* drbd_rs_begin_io done when we sent this request,
* but accounting still needs to be done. */
goto submit_for_resync;
@@ -2977,7 +2979,7 @@ static int receive_DataRequest(struct drbd_connection *connection, struct packet
update_receiver_timing_details(connection, drbd_rs_should_slow_down);
if (device->state.peer != R_PRIMARY
- && drbd_rs_should_slow_down(device, sector, false))
+ && drbd_rs_should_slow_down(peer_device, sector, false))
schedule_timeout_uninterruptible(HZ/10);
update_receiver_timing_details(connection, drbd_rs_begin_io);
if (drbd_rs_begin_io(device, sector))
@@ -3226,10 +3228,11 @@ static void drbd_uuid_dump(struct drbd_device *device, char *text, u64 *uuid,
-1096 requires proto 96
*/
-static int drbd_uuid_compare(struct drbd_device *const device, enum drbd_role const peer_role, int *rule_nr) __must_hold(local)
+static int drbd_uuid_compare(struct drbd_peer_device *const peer_device,
+ enum drbd_role const peer_role, int *rule_nr) __must_hold(local)
{
- struct drbd_peer_device *const peer_device = first_peer_device(device);
- struct drbd_connection *const connection = peer_device ? peer_device->connection : NULL;
+ struct drbd_connection *const connection = peer_device->connection;
+ struct drbd_device *device = peer_device->device;
u64 self, peer;
int i, j;
@@ -3465,7 +3468,7 @@ static enum drbd_conns drbd_sync_handshake(struct drbd_peer_device *peer_device,
drbd_uuid_dump(device, "peer", device->p_uuid,
device->p_uuid[UI_SIZE], device->p_uuid[UI_FLAGS]);
- hg = drbd_uuid_compare(device, peer_role, &rule_nr);
+ hg = drbd_uuid_compare(peer_device, peer_role, &rule_nr);
spin_unlock_irq(&device->ldev->md.uuid_lock);
drbd_info(device, "uuid_compare()=%d by rule %d\n", hg, rule_nr);
@@ -3591,7 +3594,7 @@ static enum drbd_conns drbd_sync_handshake(struct drbd_peer_device *peer_device,
if (abs(hg) >= 2) {
drbd_info(device, "Writing the whole bitmap, full sync required after drbd_sync_handshake.\n");
if (drbd_bitmap_io(device, &drbd_bmio_set_n_write, "set_n_write from sync_handshake",
- BM_LOCKED_SET_ALLOWED))
+ BM_LOCKED_SET_ALLOWED, NULL))
return C_MASK;
}
@@ -4270,7 +4273,7 @@ static int receive_uuids(struct drbd_connection *connection, struct packet_info
drbd_info(device, "Accepted new current UUID, preparing to skip initial sync\n");
drbd_bitmap_io(device, &drbd_bmio_clear_n_write,
"clear_n_write from receive_uuids",
- BM_LOCKED_TEST_ALLOWED);
+ BM_LOCKED_TEST_ALLOWED, NULL);
_drbd_uuid_set(device, UI_CURRENT, p_uuid[UI_CURRENT]);
_drbd_uuid_set(device, UI_BITMAP, 0);
_drbd_set_state(_NS2(device, disk, D_UP_TO_DATE, pdsk, D_UP_TO_DATE),
@@ -4448,7 +4451,7 @@ static int receive_state(struct drbd_connection *connection, struct packet_info
else if (os.conn >= C_SYNC_SOURCE &&
peer_state.conn == C_CONNECTED) {
if (drbd_bm_total_weight(device) <= device->rs_failed)
- drbd_resync_finished(device);
+ drbd_resync_finished(peer_device);
return 0;
}
}
@@ -4456,8 +4459,8 @@ static int receive_state(struct drbd_connection *connection, struct packet_info
/* explicit verify finished notification, stop sector reached. */
if (os.conn == C_VERIFY_T && os.disk == D_UP_TO_DATE &&
peer_state.conn == C_CONNECTED && real_peer_disk == D_UP_TO_DATE) {
- ov_out_of_sync_print(device);
- drbd_resync_finished(device);
+ ov_out_of_sync_print(peer_device);
+ drbd_resync_finished(peer_device);
return 0;
}
@@ -4766,11 +4769,11 @@ decode_bitmap_c(struct drbd_peer_device *peer_device,
return -EIO;
}
-void INFO_bm_xfer_stats(struct drbd_device *device,
+void INFO_bm_xfer_stats(struct drbd_peer_device *peer_device,
const char *direction, struct bm_xfer_ctx *c)
{
/* what would it take to transfer it "plaintext" */
- unsigned int header_size = drbd_header_size(first_peer_device(device)->connection);
+ unsigned int header_size = drbd_header_size(peer_device->connection);
unsigned int data_size = DRBD_SOCKET_BUFFER_SIZE - header_size;
unsigned int plain =
header_size * (DIV_ROUND_UP(c->bm_words, data_size) + 1) +
@@ -4794,7 +4797,7 @@ void INFO_bm_xfer_stats(struct drbd_device *device,
r = 1000;
r = 1000 - r;
- drbd_info(device, "%s bitmap stats [Bytes(packets)]: plain %u(%u), RLE %u(%u), "
+ drbd_info(peer_device, "%s bitmap stats [Bytes(packets)]: plain %u(%u), RLE %u(%u), "
"total %u; compression: %u.%u%%\n",
direction,
c->bytes[1], c->packets[1],
@@ -4872,12 +4875,12 @@ static int receive_bitmap(struct drbd_connection *connection, struct packet_info
goto out;
}
- INFO_bm_xfer_stats(device, "receive", &c);
+ INFO_bm_xfer_stats(peer_device, "receive", &c);
if (device->state.conn == C_WF_BITMAP_T) {
enum drbd_state_rv rv;
- err = drbd_send_bitmap(device);
+ err = drbd_send_bitmap(device, peer_device);
if (err)
goto out;
/* Omit CS_ORDERED with this state transition to avoid deadlocks. */
@@ -4935,7 +4938,7 @@ static int receive_out_of_sync(struct drbd_connection *connection, struct packet
drbd_conn_str(device->state.conn));
}
- drbd_set_out_of_sync(device, be64_to_cpu(p->sector), be32_to_cpu(p->blksize));
+ drbd_set_out_of_sync(peer_device, be64_to_cpu(p->sector), be32_to_cpu(p->blksize));
return 0;
}
@@ -4956,7 +4959,7 @@ static int receive_rs_deallocated(struct drbd_connection *connection, struct pac
sector = be64_to_cpu(p->sector);
size = be32_to_cpu(p->blksize);
- dec_rs_pending(device);
+ dec_rs_pending(peer_device);
if (get_ldev(device)) {
struct drbd_peer_request *peer_req;
@@ -5214,7 +5217,7 @@ static int drbd_disconnected(struct drbd_peer_device *peer_device)
if (get_ldev(device)) {
drbd_bitmap_io(device, &drbd_bm_write_copy_pages,
- "write from disconnected", BM_LOCKED_CHANGE_ALLOWED);
+ "write from disconnected", BM_LOCKED_CHANGE_ALLOWED, NULL);
put_ldev(device);
}
@@ -5648,22 +5651,23 @@ static int got_IsInSync(struct drbd_connection *connection, struct packet_info *
if (get_ldev(device)) {
drbd_rs_complete_io(device, sector);
- drbd_set_in_sync(device, sector, blksize);
+ drbd_set_in_sync(peer_device, sector, blksize);
/* rs_same_csums is supposed to count in units of BM_BLOCK_SIZE */
device->rs_same_csum += (blksize >> BM_BLOCK_SHIFT);
put_ldev(device);
}
- dec_rs_pending(device);
+ dec_rs_pending(peer_device);
atomic_add(blksize >> 9, &device->rs_sect_in);
return 0;
}
static int
-validate_req_change_req_state(struct drbd_device *device, u64 id, sector_t sector,
+validate_req_change_req_state(struct drbd_peer_device *peer_device, u64 id, sector_t sector,
struct rb_root *root, const char *func,
enum drbd_req_event what, bool missing_ok)
{
+ struct drbd_device *device = peer_device->device;
struct drbd_request *req;
struct bio_and_error m;
@@ -5673,7 +5677,7 @@ validate_req_change_req_state(struct drbd_device *device, u64 id, sector_t secto
spin_unlock_irq(&device->resource->req_lock);
return -EIO;
}
- __req_mod(req, what, &m);
+ __req_mod(req, what, peer_device, &m);
spin_unlock_irq(&device->resource->req_lock);
if (m.bio)
@@ -5698,8 +5702,8 @@ static int got_BlockAck(struct drbd_connection *connection, struct packet_info *
update_peer_seq(peer_device, be32_to_cpu(p->seq_num));
if (p->block_id == ID_SYNCER) {
- drbd_set_in_sync(device, sector, blksize);
- dec_rs_pending(device);
+ drbd_set_in_sync(peer_device, sector, blksize);
+ dec_rs_pending(peer_device);
return 0;
}
switch (pi->cmd) {
@@ -5722,7 +5726,7 @@ static int got_BlockAck(struct drbd_connection *connection, struct packet_info *
BUG();
}
- return validate_req_change_req_state(device, p->block_id, sector,
+ return validate_req_change_req_state(peer_device, p->block_id, sector,
&device->write_requests, __func__,
what, false);
}
@@ -5744,12 +5748,12 @@ static int got_NegAck(struct drbd_connection *connection, struct packet_info *pi
update_peer_seq(peer_device, be32_to_cpu(p->seq_num));
if (p->block_id == ID_SYNCER) {
- dec_rs_pending(device);
- drbd_rs_failed_io(device, sector, size);
+ dec_rs_pending(peer_device);
+ drbd_rs_failed_io(peer_device, sector, size);
return 0;
}
- err = validate_req_change_req_state(device, p->block_id, sector,
+ err = validate_req_change_req_state(peer_device, p->block_id, sector,
&device->write_requests, __func__,
NEG_ACKED, true);
if (err) {
@@ -5758,7 +5762,7 @@ static int got_NegAck(struct drbd_connection *connection, struct packet_info *pi
request is no longer in the collision hash. */
/* In Protocol B we might already have got a P_RECV_ACK
but then get a P_NEG_ACK afterwards. */
- drbd_set_out_of_sync(device, sector, size);
+ drbd_set_out_of_sync(peer_device, sector, size);
}
return 0;
}
@@ -5780,7 +5784,7 @@ static int got_NegDReply(struct drbd_connection *connection, struct packet_info
drbd_err(device, "Got NegDReply; Sector %llus, len %u.\n",
(unsigned long long)sector, be32_to_cpu(p->blksize));
- return validate_req_change_req_state(device, p->block_id, sector,
+ return validate_req_change_req_state(peer_device, p->block_id, sector,
&device->read_requests, __func__,
NEG_ACKED, false);
}
@@ -5803,13 +5807,13 @@ static int got_NegRSDReply(struct drbd_connection *connection, struct packet_inf
update_peer_seq(peer_device, be32_to_cpu(p->seq_num));
- dec_rs_pending(device);
+ dec_rs_pending(peer_device);
if (get_ldev_if_state(device, D_FAILED)) {
drbd_rs_complete_io(device, sector);
switch (pi->cmd) {
case P_NEG_RS_DREPLY:
- drbd_rs_failed_io(device, sector, size);
+ drbd_rs_failed_io(peer_device, sector, size);
break;
case P_RS_CANCEL:
break;
@@ -5866,21 +5870,21 @@ static int got_OVResult(struct drbd_connection *connection, struct packet_info *
update_peer_seq(peer_device, be32_to_cpu(p->seq_num));
if (be64_to_cpu(p->block_id) == ID_OUT_OF_SYNC)
- drbd_ov_out_of_sync_found(device, sector, size);
+ drbd_ov_out_of_sync_found(peer_device, sector, size);
else
- ov_out_of_sync_print(device);
+ ov_out_of_sync_print(peer_device);
if (!get_ldev(device))
return 0;
drbd_rs_complete_io(device, sector);
- dec_rs_pending(device);
+ dec_rs_pending(peer_device);
--device->ov_left;
/* let's advance progress step marks only for every other megabyte */
if ((device->ov_left & 0x200) == 0x200)
- drbd_advance_rs_marks(device, device->ov_left);
+ drbd_advance_rs_marks(peer_device, device->ov_left);
if (device->ov_left == 0) {
dw = kmalloc(sizeof(*dw), GFP_NOIO);
@@ -5890,8 +5894,8 @@ static int got_OVResult(struct drbd_connection *connection, struct packet_info *
drbd_queue_work(&peer_device->connection->sender_work, &dw->w);
} else {
drbd_err(device, "kmalloc(dw) failed.");
- ov_out_of_sync_print(device);
- drbd_resync_finished(device);
+ ov_out_of_sync_print(peer_device);
+ drbd_resync_finished(peer_device);
}
}
put_ldev(device);
diff --git a/drivers/block/drbd/drbd_req.c b/drivers/block/drbd/drbd_req.c
index e36216d5..380e658 100644
--- a/drivers/block/drbd/drbd_req.c
+++ b/drivers/block/drbd/drbd_req.c
@@ -122,12 +122,13 @@ void drbd_req_destroy(struct kref *kref)
* before it even was submitted or sent.
* In that case we do not want to touch the bitmap at all.
*/
+ struct drbd_peer_device *peer_device = first_peer_device(device);
if ((s & (RQ_POSTPONED|RQ_LOCAL_MASK|RQ_NET_MASK)) != RQ_POSTPONED) {
if (!(s & RQ_NET_OK) || !(s & RQ_LOCAL_OK))
- drbd_set_out_of_sync(device, req->i.sector, req->i.size);
+ drbd_set_out_of_sync(peer_device, req->i.sector, req->i.size);
if ((s & RQ_NET_OK) && (s & RQ_LOCAL_OK) && (s & RQ_NET_SIS))
- drbd_set_in_sync(device, req->i.sector, req->i.size);
+ drbd_set_in_sync(peer_device, req->i.sector, req->i.size);
}
/* one might be tempted to move the drbd_al_complete_io
@@ -552,12 +553,15 @@ static inline bool is_pending_write_protocol_A(struct drbd_request *req)
* happen "atomically" within the req_lock,
* and it enforces that we have to think in a very structured manner
* about the "events" that may happen to a request during its life time ...
+ *
+ *
+ * peer_device == NULL means local disk
*/
int __req_mod(struct drbd_request *req, enum drbd_req_event what,
+ struct drbd_peer_device *peer_device,
struct bio_and_error *m)
{
struct drbd_device *const device = req->device;
- struct drbd_peer_device *const peer_device = first_peer_device(device);
struct drbd_connection *const connection = peer_device ? peer_device->connection : NULL;
struct net_conf *nc;
int p, rv = 0;
@@ -617,7 +621,7 @@ int __req_mod(struct drbd_request *req, enum drbd_req_event what,
break;
case READ_COMPLETED_WITH_ERROR:
- drbd_set_out_of_sync(device, req->i.sector, req->i.size);
+ drbd_set_out_of_sync(peer_device, req->i.sector, req->i.size);
drbd_report_io_error(device, req);
__drbd_chk_io_error(device, DRBD_READ_ERROR);
fallthrough;
@@ -1100,6 +1104,7 @@ static bool drbd_should_send_out_of_sync(union drbd_dev_state s)
static int drbd_process_write_request(struct drbd_request *req)
{
struct drbd_device *device = req->device;
+ struct drbd_peer_device *peer_device = first_peer_device(device);
int remote, send_oos;
remote = drbd_should_do_remote(device->state);
@@ -1115,7 +1120,7 @@ static int drbd_process_write_request(struct drbd_request *req)
/* The only size==0 bios we expect are empty flushes. */
D_ASSERT(device, req->master_bio->bi_opf & REQ_PREFLUSH);
if (remote)
- _req_mod(req, QUEUE_AS_DRBD_BARRIER);
+ _req_mod(req, QUEUE_AS_DRBD_BARRIER, peer_device);
return remote;
}
@@ -1125,10 +1130,10 @@ static int drbd_process_write_request(struct drbd_request *req)
D_ASSERT(device, !(remote && send_oos));
if (remote) {
- _req_mod(req, TO_BE_SENT);
- _req_mod(req, QUEUE_FOR_NET_WRITE);
- } else if (drbd_set_out_of_sync(device, req->i.sector, req->i.size))
- _req_mod(req, QUEUE_FOR_SEND_OOS);
+ _req_mod(req, TO_BE_SENT, peer_device);
+ _req_mod(req, QUEUE_FOR_NET_WRITE, peer_device);
+ } else if (drbd_set_out_of_sync(peer_device, req->i.sector, req->i.size))
+ _req_mod(req, QUEUE_FOR_SEND_OOS, peer_device);
return remote;
}
@@ -1312,6 +1317,7 @@ static void drbd_update_plug(struct drbd_plug_cb *plug, struct drbd_request *req
static void drbd_send_and_submit(struct drbd_device *device, struct drbd_request *req)
{
struct drbd_resource *resource = device->resource;
+ struct drbd_peer_device *peer_device = first_peer_device(device);
const int rw = bio_data_dir(req->master_bio);
struct bio_and_error m = { NULL, };
bool no_remote = false;
@@ -1375,8 +1381,8 @@ static void drbd_send_and_submit(struct drbd_device *device, struct drbd_request
/* We either have a private_bio, or we can read from remote.
* Otherwise we had done the goto nodata above. */
if (req->private_bio == NULL) {
- _req_mod(req, TO_BE_SENT);
- _req_mod(req, QUEUE_FOR_NET_READ);
+ _req_mod(req, TO_BE_SENT, peer_device);
+ _req_mod(req, QUEUE_FOR_NET_READ, peer_device);
} else
no_remote = true;
}
@@ -1397,7 +1403,7 @@ static void drbd_send_and_submit(struct drbd_device *device, struct drbd_request
req->pre_submit_jif = jiffies;
list_add_tail(&req->req_pending_local,
&device->pending_completion[rw == WRITE]);
- _req_mod(req, TO_BE_SUBMITTED);
+ _req_mod(req, TO_BE_SUBMITTED, NULL);
/* but we need to give up the spinlock to submit */
submit_private_bio = true;
} else if (no_remote) {
diff --git a/drivers/block/drbd/drbd_req.h b/drivers/block/drbd/drbd_req.h
index b4017b5..9ae860e 100644
--- a/drivers/block/drbd/drbd_req.h
+++ b/drivers/block/drbd/drbd_req.h
@@ -267,6 +267,7 @@ struct bio_and_error {
extern void start_new_tl_epoch(struct drbd_connection *connection);
extern void drbd_req_destroy(struct kref *kref);
extern int __req_mod(struct drbd_request *req, enum drbd_req_event what,
+ struct drbd_peer_device *peer_device,
struct bio_and_error *m);
extern void complete_master_bio(struct drbd_device *device,
struct bio_and_error *m);
@@ -280,14 +281,15 @@ extern void drbd_restart_request(struct drbd_request *req);
/* use this if you don't want to deal with calling complete_master_bio()
* outside the spinlock, e.g. when walking some list on cleanup. */
-static inline int _req_mod(struct drbd_request *req, enum drbd_req_event what)
+static inline int _req_mod(struct drbd_request *req, enum drbd_req_event what,
+ struct drbd_peer_device *peer_device)
{
struct drbd_device *device = req->device;
struct bio_and_error m;
int rv;
/* __req_mod possibly frees req, do not touch req after that! */
- rv = __req_mod(req, what, &m);
+ rv = __req_mod(req, what, peer_device, &m);
if (m.bio)
complete_master_bio(device, &m);
@@ -299,7 +301,8 @@ static inline int _req_mod(struct drbd_request *req, enum drbd_req_event what)
* of the lower level driver completion callback, so we need to
* spin_lock_irqsave here. */
static inline int req_mod(struct drbd_request *req,
- enum drbd_req_event what)
+ enum drbd_req_event what,
+ struct drbd_peer_device *peer_device)
{
unsigned long flags;
struct drbd_device *device = req->device;
@@ -307,7 +310,7 @@ static inline int req_mod(struct drbd_request *req,
int rv;
spin_lock_irqsave(&device->resource->req_lock, flags);
- rv = __req_mod(req, what, &m);
+ rv = __req_mod(req, what, peer_device, &m);
spin_unlock_irqrestore(&device->resource->req_lock, flags);
if (m.bio)
diff --git a/drivers/block/drbd/drbd_state.c b/drivers/block/drbd/drbd_state.c
index 75d13ea..563e67f 100644
--- a/drivers/block/drbd/drbd_state.c
+++ b/drivers/block/drbd/drbd_state.c
@@ -1222,9 +1222,11 @@ void drbd_resume_al(struct drbd_device *device)
}
/* helper for _drbd_set_state */
-static void set_ov_position(struct drbd_device *device, enum drbd_conns cs)
+static void set_ov_position(struct drbd_peer_device *peer_device, enum drbd_conns cs)
{
- if (first_peer_device(device)->connection->agreed_pro_version < 90)
+ struct drbd_device *device = peer_device->device;
+
+ if (peer_device->connection->agreed_pro_version < 90)
device->ov_start_sector = 0;
device->rs_total = drbd_bm_bits(device);
device->ov_position = 0;
@@ -1387,7 +1389,7 @@ _drbd_set_state(struct drbd_device *device, union drbd_state ns,
unsigned long now = jiffies;
int i;
- set_ov_position(device, ns.conn);
+ set_ov_position(peer_device, ns.conn);
device->rs_start = now;
device->rs_last_sect_ev = 0;
device->ov_last_oos_size = 0;
@@ -1398,7 +1400,7 @@ _drbd_set_state(struct drbd_device *device, union drbd_state ns,
device->rs_mark_time[i] = now;
}
- drbd_rs_controller_reset(device);
+ drbd_rs_controller_reset(peer_device);
if (ns.conn == C_VERIFY_S) {
drbd_info(device, "Starting Online Verify from sector %llu\n",
@@ -1518,8 +1520,9 @@ static void abw_start_sync(struct drbd_device *device, int rv)
}
int drbd_bitmap_io_from_worker(struct drbd_device *device,
- int (*io_fn)(struct drbd_device *),
- char *why, enum bm_flag flags)
+ int (*io_fn)(struct drbd_device *, struct drbd_peer_device *),
+ char *why, enum bm_flag flags,
+ struct drbd_peer_device *peer_device)
{
int rv;
@@ -1529,7 +1532,7 @@ int drbd_bitmap_io_from_worker(struct drbd_device *device,
atomic_inc(&device->suspend_cnt);
drbd_bm_lock(device, why, flags);
- rv = io_fn(device);
+ rv = io_fn(device, peer_device);
drbd_bm_unlock(device);
drbd_resume_io(device);
@@ -1809,7 +1812,7 @@ static void after_state_ch(struct drbd_device *device, union drbd_state os,
device->state.conn == C_WF_BITMAP_S)
drbd_queue_bitmap_io(device, &drbd_send_bitmap, NULL,
"send_bitmap (WFBitMapS)",
- BM_LOCKED_TEST_ALLOWED);
+ BM_LOCKED_TEST_ALLOWED, peer_device);
/* Lost contact to peer's copy of the data */
if (lost_contact_to_peer_data(os.pdsk, ns.pdsk)) {
@@ -1839,7 +1842,7 @@ static void after_state_ch(struct drbd_device *device, union drbd_state os,
* No harm done if the bitmap still changes,
* redirtied pages will follow later. */
drbd_bitmap_io_from_worker(device, &drbd_bm_write,
- "demote diskless peer", BM_LOCKED_SET_ALLOWED);
+ "demote diskless peer", BM_LOCKED_SET_ALLOWED, peer_device);
put_ldev(device);
}
@@ -1851,7 +1854,7 @@ static void after_state_ch(struct drbd_device *device, union drbd_state os,
/* No changes to the bitmap expected this time, so assert that,
* even though no harm was done if it did change. */
drbd_bitmap_io_from_worker(device, &drbd_bm_write,
- "demote", BM_LOCKED_TEST_ALLOWED);
+ "demote", BM_LOCKED_TEST_ALLOWED, peer_device);
put_ldev(device);
}
@@ -1888,7 +1891,8 @@ static void after_state_ch(struct drbd_device *device, union drbd_state os,
/* no other bitmap changes expected during this phase */
drbd_queue_bitmap_io(device,
&drbd_bmio_set_n_write, &abw_start_sync,
- "set_n_write from StartingSync", BM_LOCKED_TEST_ALLOWED);
+ "set_n_write from StartingSync", BM_LOCKED_TEST_ALLOWED,
+ peer_device);
/* first half of local IO error, failure to attach,
* or administrative detach */
@@ -2011,7 +2015,8 @@ static void after_state_ch(struct drbd_device *device, union drbd_state os,
if ((os.conn > C_CONNECTED && os.conn < C_AHEAD) &&
(ns.conn == C_CONNECTED || ns.conn >= C_AHEAD) && get_ldev(device)) {
drbd_queue_bitmap_io(device, &drbd_bm_write_copy_pages, NULL,
- "write from resync_finished", BM_LOCKED_CHANGE_ALLOWED);
+ "write from resync_finished", BM_LOCKED_CHANGE_ALLOWED,
+ peer_device);
put_ldev(device);
}
diff --git a/drivers/block/drbd/drbd_worker.c b/drivers/block/drbd/drbd_worker.c
index f467380..4352a50f 100644
--- a/drivers/block/drbd/drbd_worker.c
+++ b/drivers/block/drbd/drbd_worker.c
@@ -28,8 +28,8 @@
#include "drbd_protocol.h"
#include "drbd_req.h"
-static int make_ov_request(struct drbd_device *, int);
-static int make_resync_request(struct drbd_device *, int);
+static int make_ov_request(struct drbd_peer_device *, int);
+static int make_resync_request(struct drbd_peer_device *, int);
/* endio handlers:
* drbd_md_endio (defined here)
@@ -124,7 +124,7 @@ void drbd_endio_write_sec_final(struct drbd_peer_request *peer_req) __releases(l
* In case of a write error, send the neg ack anyways. */
if (!__test_and_set_bit(__EE_SEND_WRITE_ACK, &peer_req->flags))
inc_unacked(device);
- drbd_set_out_of_sync(device, peer_req->i.sector, peer_req->i.size);
+ drbd_set_out_of_sync(peer_device, peer_req->i.sector, peer_req->i.size);
}
spin_lock_irqsave(&device->resource->req_lock, flags);
@@ -276,7 +276,7 @@ void drbd_request_endio(struct bio *bio)
/* not req_mod(), we need irqsave here! */
spin_lock_irqsave(&device->resource->req_lock, flags);
- __req_mod(req, what, &m);
+ __req_mod(req, what, NULL, &m);
spin_unlock_irqrestore(&device->resource->req_lock, flags);
put_ldev(device);
@@ -363,7 +363,7 @@ static int w_e_send_csum(struct drbd_work *w, int cancel)
* drbd_alloc_pages due to pp_in_use > max_buffers. */
drbd_free_peer_req(device, peer_req);
peer_req = NULL;
- inc_rs_pending(device);
+ inc_rs_pending(peer_device);
err = drbd_send_drequest_csum(peer_device, sector, size,
digest, digest_size,
P_CSUM_RS_REQUEST);
@@ -430,10 +430,10 @@ int w_resync_timer(struct drbd_work *w, int cancel)
switch (device->state.conn) {
case C_VERIFY_S:
- make_ov_request(device, cancel);
+ make_ov_request(first_peer_device(device), cancel);
break;
case C_SYNC_TARGET:
- make_resync_request(device, cancel);
+ make_resync_request(first_peer_device(device), cancel);
break;
}
@@ -493,8 +493,9 @@ struct fifo_buffer *fifo_alloc(unsigned int fifo_size)
return fb;
}
-static int drbd_rs_controller(struct drbd_device *device, unsigned int sect_in)
+static int drbd_rs_controller(struct drbd_peer_device *peer_device, unsigned int sect_in)
{
+ struct drbd_device *device = peer_device->device;
struct disk_conf *dc;
unsigned int want; /* The number of sectors we want in-flight */
int req_sect; /* Number of sectors to request in this turn */
@@ -545,8 +546,9 @@ static int drbd_rs_controller(struct drbd_device *device, unsigned int sect_in)
return req_sect;
}
-static int drbd_rs_number_requests(struct drbd_device *device)
+static int drbd_rs_number_requests(struct drbd_peer_device *peer_device)
{
+ struct drbd_device *device = peer_device->device;
unsigned int sect_in; /* Number of sectors that came in since the last turn */
int number, mxb;
@@ -556,7 +558,7 @@ static int drbd_rs_number_requests(struct drbd_device *device)
rcu_read_lock();
mxb = drbd_get_max_buffers(device) / 2;
if (rcu_dereference(device->rs_plan_s)->size) {
- number = drbd_rs_controller(device, sect_in) >> (BM_BLOCK_SHIFT - 9);
+ number = drbd_rs_controller(peer_device, sect_in) >> (BM_BLOCK_SHIFT - 9);
device->c_sync_rate = number * HZ * (BM_BLOCK_SIZE / 1024) / SLEEP_TIME;
} else {
device->c_sync_rate = rcu_dereference(device->ldev->disk_conf)->resync_rate;
@@ -580,9 +582,9 @@ static int drbd_rs_number_requests(struct drbd_device *device)
return number;
}
-static int make_resync_request(struct drbd_device *const device, int cancel)
+static int make_resync_request(struct drbd_peer_device *const peer_device, int cancel)
{
- struct drbd_peer_device *const peer_device = first_peer_device(device);
+ struct drbd_device *const device = peer_device->device;
struct drbd_connection *const connection = peer_device ? peer_device->connection : NULL;
unsigned long bit;
sector_t sector;
@@ -598,7 +600,7 @@ static int make_resync_request(struct drbd_device *const device, int cancel)
if (device->rs_total == 0) {
/* empty resync? */
- drbd_resync_finished(device);
+ drbd_resync_finished(peer_device);
return 0;
}
@@ -618,7 +620,7 @@ static int make_resync_request(struct drbd_device *const device, int cancel)
}
max_bio_size = queue_max_hw_sectors(device->rq_queue) << 9;
- number = drbd_rs_number_requests(device);
+ number = drbd_rs_number_requests(peer_device);
if (number <= 0)
goto requeue;
@@ -653,7 +655,7 @@ static int make_resync_request(struct drbd_device *const device, int cancel)
sector = BM_BIT_TO_SECT(bit);
- if (drbd_try_rs_begin_io(device, sector)) {
+ if (drbd_try_rs_begin_io(peer_device, sector)) {
device->bm_resync_fo = bit;
goto requeue;
}
@@ -729,13 +731,13 @@ static int make_resync_request(struct drbd_device *const device, int cancel)
} else {
int err;
- inc_rs_pending(device);
+ inc_rs_pending(peer_device);
err = drbd_send_drequest(peer_device,
size == discard_granularity ? P_RS_THIN_REQ : P_RS_DATA_REQUEST,
sector, size, ID_SYNCER);
if (err) {
drbd_err(device, "drbd_send_drequest() failed, aborting...\n");
- dec_rs_pending(device);
+ dec_rs_pending(peer_device);
put_ldev(device);
return err;
}
@@ -760,8 +762,9 @@ static int make_resync_request(struct drbd_device *const device, int cancel)
return 0;
}
-static int make_ov_request(struct drbd_device *device, int cancel)
+static int make_ov_request(struct drbd_peer_device *peer_device, int cancel)
{
+ struct drbd_device *device = peer_device->device;
int number, i, size;
sector_t sector;
const sector_t capacity = get_capacity(device->vdisk);
@@ -770,7 +773,7 @@ static int make_ov_request(struct drbd_device *device, int cancel)
if (unlikely(cancel))
return 1;
- number = drbd_rs_number_requests(device);
+ number = drbd_rs_number_requests(peer_device);
sector = device->ov_position;
for (i = 0; i < number; i++) {
@@ -788,7 +791,7 @@ static int make_ov_request(struct drbd_device *device, int cancel)
size = BM_BLOCK_SIZE;
- if (drbd_try_rs_begin_io(device, sector)) {
+ if (drbd_try_rs_begin_io(peer_device, sector)) {
device->ov_position = sector;
goto requeue;
}
@@ -796,9 +799,9 @@ static int make_ov_request(struct drbd_device *device, int cancel)
if (sector + (size>>9) > capacity)
size = (capacity-sector)<<9;
- inc_rs_pending(device);
+ inc_rs_pending(peer_device);
if (drbd_send_ov_request(first_peer_device(device), sector, size)) {
- dec_rs_pending(device);
+ dec_rs_pending(peer_device);
return 0;
}
sector += BM_SECT_PER_BIT;
@@ -818,8 +821,8 @@ int w_ov_finished(struct drbd_work *w, int cancel)
container_of(w, struct drbd_device_work, w);
struct drbd_device *device = dw->device;
kfree(dw);
- ov_out_of_sync_print(device);
- drbd_resync_finished(device);
+ ov_out_of_sync_print(first_peer_device(device));
+ drbd_resync_finished(first_peer_device(device));
return 0;
}
@@ -831,7 +834,7 @@ static int w_resync_finished(struct drbd_work *w, int cancel)
struct drbd_device *device = dw->device;
kfree(dw);
- drbd_resync_finished(device);
+ drbd_resync_finished(first_peer_device(device));
return 0;
}
@@ -846,9 +849,10 @@ static void ping_peer(struct drbd_device *device)
test_bit(GOT_PING_ACK, &connection->flags) || device->state.conn < C_CONNECTED);
}
-int drbd_resync_finished(struct drbd_device *device)
+int drbd_resync_finished(struct drbd_peer_device *peer_device)
{
- struct drbd_connection *connection = first_peer_device(device)->connection;
+ struct drbd_device *device = peer_device->device;
+ struct drbd_connection *connection = peer_device->connection;
unsigned long db, dt, dbdt;
unsigned long n_oos;
union drbd_state os, ns;
@@ -1129,7 +1133,7 @@ int w_e_end_rsdata_req(struct drbd_work *w, int cancel)
err = drbd_send_ack(peer_device, P_RS_CANCEL, peer_req);
} else if (likely((peer_req->flags & EE_WAS_ERROR) == 0)) {
if (likely(device->state.pdsk >= D_INCONSISTENT)) {
- inc_rs_pending(device);
+ inc_rs_pending(peer_device);
if (peer_req->flags & EE_RS_THIN_REQ && all_zero(peer_req))
err = drbd_send_rs_deallocated(peer_device, peer_req);
else
@@ -1148,7 +1152,7 @@ int w_e_end_rsdata_req(struct drbd_work *w, int cancel)
err = drbd_send_ack(peer_device, P_NEG_RS_DREPLY, peer_req);
/* update resync data with failure */
- drbd_rs_failed_io(device, peer_req->i.sector, peer_req->i.size);
+ drbd_rs_failed_io(peer_device, peer_req->i.sector, peer_req->i.size);
}
dec_unacked(device);
@@ -1199,12 +1203,12 @@ int w_e_end_csum_rs_req(struct drbd_work *w, int cancel)
}
if (eq) {
- drbd_set_in_sync(device, peer_req->i.sector, peer_req->i.size);
+ drbd_set_in_sync(peer_device, peer_req->i.sector, peer_req->i.size);
/* rs_same_csums unit is BM_BLOCK_SIZE */
device->rs_same_csum += peer_req->i.size >> BM_BLOCK_SHIFT;
err = drbd_send_ack(peer_device, P_RS_IS_IN_SYNC, peer_req);
} else {
- inc_rs_pending(device);
+ inc_rs_pending(peer_device);
peer_req->block_id = ID_SYNCER; /* By setting block_id, digest pointer becomes invalid! */
peer_req->flags &= ~EE_HAS_DIGEST; /* This peer request no longer has a digest pointer */
kfree(di);
@@ -1257,10 +1261,10 @@ int w_e_end_ov_req(struct drbd_work *w, int cancel)
* drbd_alloc_pages due to pp_in_use > max_buffers. */
drbd_free_peer_req(device, peer_req);
peer_req = NULL;
- inc_rs_pending(device);
+ inc_rs_pending(peer_device);
err = drbd_send_drequest_csum(peer_device, sector, size, digest, digest_size, P_OV_REPLY);
if (err)
- dec_rs_pending(device);
+ dec_rs_pending(peer_device);
kfree(digest);
out:
@@ -1270,15 +1274,16 @@ int w_e_end_ov_req(struct drbd_work *w, int cancel)
return err;
}
-void drbd_ov_out_of_sync_found(struct drbd_device *device, sector_t sector, int size)
+void drbd_ov_out_of_sync_found(struct drbd_peer_device *peer_device, sector_t sector, int size)
{
+ struct drbd_device *device = peer_device->device;
if (device->ov_last_oos_start + device->ov_last_oos_size == sector) {
device->ov_last_oos_size += size>>9;
} else {
device->ov_last_oos_start = sector;
device->ov_last_oos_size = size>>9;
}
- drbd_set_out_of_sync(device, sector, size);
+ drbd_set_out_of_sync(peer_device, sector, size);
}
int w_e_end_ov_reply(struct drbd_work *w, int cancel)
@@ -1328,9 +1333,9 @@ int w_e_end_ov_reply(struct drbd_work *w, int cancel)
* drbd_alloc_pages due to pp_in_use > max_buffers. */
drbd_free_peer_req(device, peer_req);
if (!eq)
- drbd_ov_out_of_sync_found(device, sector, size);
+ drbd_ov_out_of_sync_found(peer_device, sector, size);
else
- ov_out_of_sync_print(device);
+ ov_out_of_sync_print(peer_device);
err = drbd_send_ack_ex(peer_device, P_OV_RESULT, sector, size,
eq ? ID_IN_SYNC : ID_OUT_OF_SYNC);
@@ -1341,14 +1346,14 @@ int w_e_end_ov_reply(struct drbd_work *w, int cancel)
/* let's advance progress step marks only for every other megabyte */
if ((device->ov_left & 0x200) == 0x200)
- drbd_advance_rs_marks(device, device->ov_left);
+ drbd_advance_rs_marks(peer_device, device->ov_left);
stop_sector_reached = verify_can_do_stop_sector(device) &&
(sector + (size>>9)) >= device->ov_stop_sector;
if (device->ov_left == 0 || stop_sector_reached) {
- ov_out_of_sync_print(device);
- drbd_resync_finished(device);
+ ov_out_of_sync_print(peer_device);
+ drbd_resync_finished(peer_device);
}
return err;
@@ -1425,7 +1430,7 @@ int w_send_out_of_sync(struct drbd_work *w, int cancel)
int err;
if (unlikely(cancel)) {
- req_mod(req, SEND_CANCELED);
+ req_mod(req, SEND_CANCELED, peer_device);
return 0;
}
req->pre_send_jif = jiffies;
@@ -1437,7 +1442,7 @@ int w_send_out_of_sync(struct drbd_work *w, int cancel)
maybe_send_barrier(connection, req->epoch);
err = drbd_send_out_of_sync(peer_device, req);
- req_mod(req, OOS_HANDED_TO_NETWORK);
+ req_mod(req, OOS_HANDED_TO_NETWORK, peer_device);
return err;
}
@@ -1457,7 +1462,7 @@ int w_send_dblock(struct drbd_work *w, int cancel)
int err;
if (unlikely(cancel)) {
- req_mod(req, SEND_CANCELED);
+ req_mod(req, SEND_CANCELED, peer_device);
return 0;
}
req->pre_send_jif = jiffies;
@@ -1467,7 +1472,7 @@ int w_send_dblock(struct drbd_work *w, int cancel)
connection->send.current_epoch_writes++;
err = drbd_send_dblock(peer_device, req);
- req_mod(req, err ? SEND_FAILED : HANDED_OVER_TO_NETWORK);
+ req_mod(req, err ? SEND_FAILED : HANDED_OVER_TO_NETWORK, peer_device);
if (do_send_unplug && !err)
pd_send_unplug_remote(peer_device);
@@ -1490,7 +1495,7 @@ int w_send_read_req(struct drbd_work *w, int cancel)
int err;
if (unlikely(cancel)) {
- req_mod(req, SEND_CANCELED);
+ req_mod(req, SEND_CANCELED, peer_device);
return 0;
}
req->pre_send_jif = jiffies;
@@ -1502,7 +1507,7 @@ int w_send_read_req(struct drbd_work *w, int cancel)
err = drbd_send_drequest(peer_device, P_DATA_REQUEST, req->i.sector, req->i.size,
(unsigned long)req);
- req_mod(req, err ? SEND_FAILED : HANDED_OVER_TO_NETWORK);
+ req_mod(req, err ? SEND_FAILED : HANDED_OVER_TO_NETWORK, peer_device);
if (do_send_unplug && !err)
pd_send_unplug_remote(peer_device);
@@ -1668,8 +1673,9 @@ void drbd_resync_after_changed(struct drbd_device *device)
} while (changed);
}
-void drbd_rs_controller_reset(struct drbd_device *device)
+void drbd_rs_controller_reset(struct drbd_peer_device *peer_device)
{
+ struct drbd_device *device = peer_device->device;
struct gendisk *disk = device->ldev->backing_bdev->bd_disk;
struct fifo_buffer *plan;
@@ -1891,10 +1897,10 @@ void drbd_start_resync(struct drbd_device *device, enum drbd_conns side)
rcu_read_unlock();
schedule_timeout_interruptible(timeo);
}
- drbd_resync_finished(device);
+ drbd_resync_finished(peer_device);
}
- drbd_rs_controller_reset(device);
+ drbd_rs_controller_reset(peer_device);
/* ns.conn may already be != device->state.conn,
* we may have been paused in between, or become paused until
* the timer triggers.
@@ -1909,8 +1915,9 @@ void drbd_start_resync(struct drbd_device *device, enum drbd_conns side)
mutex_unlock(device->state_mutex);
}
-static void update_on_disk_bitmap(struct drbd_device *device, bool resync_done)
+static void update_on_disk_bitmap(struct drbd_peer_device *peer_device, bool resync_done)
{
+ struct drbd_device *device = peer_device->device;
struct sib_info sib = { .sib_reason = SIB_SYNC_PROGRESS, };
device->rs_last_bcast = jiffies;
@@ -1919,7 +1926,7 @@ static void update_on_disk_bitmap(struct drbd_device *device, bool resync_done)
drbd_bm_write_lazy(device, 0);
if (resync_done && is_sync_state(device->state.conn))
- drbd_resync_finished(device);
+ drbd_resync_finished(peer_device);
drbd_bcast_event(device, &sib);
/* update timestamp, in case it took a while to write out stuff */
@@ -1945,6 +1952,7 @@ static void drbd_ldev_destroy(struct drbd_device *device)
static void go_diskless(struct drbd_device *device)
{
+ struct drbd_peer_device *peer_device = first_peer_device(device);
D_ASSERT(device, device->state.disk == D_FAILED);
/* we cannot assert local_cnt == 0 here, as get_ldev_if_state will
* inc/dec it frequently. Once we are D_DISKLESS, no one will touch
@@ -1970,7 +1978,7 @@ static void go_diskless(struct drbd_device *device)
* Any modifications would not be expected anymore, though.
*/
if (drbd_bitmap_io_from_worker(device, drbd_bm_write,
- "detach", BM_LOCKED_TEST_ALLOWED)) {
+ "detach", BM_LOCKED_TEST_ALLOWED, peer_device)) {
if (test_bit(WAS_READ_ERROR, &device->flags)) {
drbd_md_set_flag(device, MDF_FULL_SYNC);
drbd_md_sync(device);
@@ -2017,7 +2025,7 @@ static void do_device_work(struct drbd_device *device, const unsigned long todo)
do_md_sync(device);
if (test_bit(RS_DONE, &todo) ||
test_bit(RS_PROGRESS, &todo))
- update_on_disk_bitmap(device, test_bit(RS_DONE, &todo));
+ update_on_disk_bitmap(first_peer_device(device), test_bit(RS_DONE, &todo));
if (test_bit(GO_DISKLESS, &todo))
go_diskless(device);
if (test_bit(DESTROY_DISK, &todo))
diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 28eb59f..bc31bb7 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -1010,9 +1010,6 @@ static int loop_configure(struct loop_device *lo, fmode_t mode,
/* This is safe, since we have a reference from open(). */
__module_get(THIS_MODULE);
- /* suppress uevents while reconfiguring the device */
- dev_set_uevent_suppress(disk_to_dev(lo->lo_disk), 1);
-
/*
* If we don't hold exclusive handle for the device, upgrade to it
* here to avoid changing device under exclusive owner.
@@ -1067,6 +1064,9 @@ static int loop_configure(struct loop_device *lo, fmode_t mode,
}
}
+ /* suppress uevents while reconfiguring the device */
+ dev_set_uevent_suppress(disk_to_dev(lo->lo_disk), 1);
+
disk_force_media_change(lo->lo_disk, DISK_EVENT_MEDIA_CHANGE);
set_disk_ro(lo->lo_disk, (lo->lo_flags & LO_FLAGS_READ_ONLY) != 0);
@@ -1109,17 +1109,17 @@ static int loop_configure(struct loop_device *lo, fmode_t mode,
if (partscan)
clear_bit(GD_SUPPRESS_PART_SCAN, &lo->lo_disk->state);
+ /* enable and uncork uevent now that we are done */
+ dev_set_uevent_suppress(disk_to_dev(lo->lo_disk), 0);
+
loop_global_unlock(lo, is_loop);
if (partscan)
loop_reread_partitions(lo);
+
if (!(mode & FMODE_EXCL))
bd_abort_claiming(bdev, loop_configure);
- error = 0;
-done:
- /* enable and uncork uevent now that we are done */
- dev_set_uevent_suppress(disk_to_dev(lo->lo_disk), 0);
- return error;
+ return 0;
out_unlock:
loop_global_unlock(lo, is_loop);
@@ -1130,7 +1130,7 @@ static int loop_configure(struct loop_device *lo, fmode_t mode,
fput(file);
/* This is safe: open() is still holding a reference. */
module_put(THIS_MODULE);
- goto done;
+ return error;
}
static void __loop_clr_fd(struct loop_device *lo, bool release)
diff --git a/drivers/block/nbd.c b/drivers/block/nbd.c
index 592cfa8..c0b1611 100644
--- a/drivers/block/nbd.c
+++ b/drivers/block/nbd.c
@@ -1934,11 +1934,11 @@ static int nbd_genl_connect(struct sk_buff *skb, struct genl_info *info)
return -EINVAL;
}
}
- if (!info->attrs[NBD_ATTR_SOCKETS]) {
+ if (GENL_REQ_ATTR_CHECK(info, NBD_ATTR_SOCKETS)) {
pr_err("must specify at least one socket\n");
return -EINVAL;
}
- if (!info->attrs[NBD_ATTR_SIZE_BYTES]) {
+ if (GENL_REQ_ATTR_CHECK(info, NBD_ATTR_SIZE_BYTES)) {
pr_err("must specify a size in bytes for the device\n");
return -EINVAL;
}
@@ -2123,7 +2123,7 @@ static int nbd_genl_disconnect(struct sk_buff *skb, struct genl_info *info)
if (!netlink_capable(skb, CAP_SYS_ADMIN))
return -EPERM;
- if (!info->attrs[NBD_ATTR_INDEX]) {
+ if (GENL_REQ_ATTR_CHECK(info, NBD_ATTR_INDEX)) {
pr_err("must specify an index to disconnect\n");
return -EINVAL;
}
@@ -2161,7 +2161,7 @@ static int nbd_genl_reconfigure(struct sk_buff *skb, struct genl_info *info)
if (!netlink_capable(skb, CAP_SYS_ADMIN))
return -EPERM;
- if (!info->attrs[NBD_ATTR_INDEX]) {
+ if (GENL_REQ_ATTR_CHECK(info, NBD_ATTR_INDEX)) {
pr_err("must specify a device to reconfigure\n");
return -EINVAL;
}
@@ -2325,6 +2325,7 @@ static struct genl_family nbd_genl_family __ro_after_init = {
.n_small_ops = ARRAY_SIZE(nbd_connect_genl_ops),
.resv_start_op = NBD_CMD_STATUS + 1,
.maxattr = NBD_ATTR_MAX,
+ .netnsok = 1,
.policy = nbd_attr_policy,
.mcgrps = nbd_mcast_grps,
.n_mcgrps = ARRAY_SIZE(nbd_mcast_grps),
diff --git a/drivers/block/null_blk/main.c b/drivers/block/null_blk/main.c
index 9e6b032..bc2c587 100644
--- a/drivers/block/null_blk/main.c
+++ b/drivers/block/null_blk/main.c
@@ -1030,8 +1030,8 @@ static int null_flush_cache_page(struct nullb *nullb, struct nullb_page *c_page)
if (!t_page)
return -ENOMEM;
- src = kmap_atomic(c_page->page);
- dst = kmap_atomic(t_page->page);
+ src = kmap_local_page(c_page->page);
+ dst = kmap_local_page(t_page->page);
for (i = 0; i < PAGE_SECTORS;
i += (nullb->dev->blocksize >> SECTOR_SHIFT)) {
@@ -1043,8 +1043,8 @@ static int null_flush_cache_page(struct nullb *nullb, struct nullb_page *c_page)
}
}
- kunmap_atomic(dst);
- kunmap_atomic(src);
+ kunmap_local(dst);
+ kunmap_local(src);
ret = radix_tree_delete_item(&nullb->dev->cache, idx, c_page);
null_free_page(ret);
@@ -1112,7 +1112,6 @@ static int copy_to_nullb(struct nullb *nullb, struct page *source,
size_t temp, count = 0;
unsigned int offset;
struct nullb_page *t_page;
- void *dst, *src;
while (count < n) {
temp = min_t(size_t, nullb->dev->blocksize, n - count);
@@ -1126,11 +1125,7 @@ static int copy_to_nullb(struct nullb *nullb, struct page *source,
if (!t_page)
return -ENOSPC;
- src = kmap_atomic(source);
- dst = kmap_atomic(t_page->page);
- memcpy(dst + offset, src + off + count, temp);
- kunmap_atomic(dst);
- kunmap_atomic(src);
+ memcpy_page(t_page->page, offset, source, off + count, temp);
__set_bit(sector & SECTOR_MASK, t_page->bitmap);
@@ -1149,7 +1144,6 @@ static int copy_from_nullb(struct nullb *nullb, struct page *dest,
size_t temp, count = 0;
unsigned int offset;
struct nullb_page *t_page;
- void *dst, *src;
while (count < n) {
temp = min_t(size_t, nullb->dev->blocksize, n - count);
@@ -1158,16 +1152,11 @@ static int copy_from_nullb(struct nullb *nullb, struct page *dest,
t_page = null_lookup_page(nullb, sector, false,
!null_cache_active(nullb));
- dst = kmap_atomic(dest);
- if (!t_page) {
- memset(dst + off + count, 0, temp);
- goto next;
- }
- src = kmap_atomic(t_page->page);
- memcpy(dst + off + count, src + offset, temp);
- kunmap_atomic(src);
-next:
- kunmap_atomic(dst);
+ if (t_page)
+ memcpy_page(dest, off + count, t_page->page, offset,
+ temp);
+ else
+ zero_user(dest, off + count, temp);
count += temp;
sector += temp >> SECTOR_SHIFT;
@@ -1178,11 +1167,7 @@ static int copy_from_nullb(struct nullb *nullb, struct page *dest,
static void nullb_fill_pattern(struct nullb *nullb, struct page *page,
unsigned int len, unsigned int off)
{
- void *dst;
-
- dst = kmap_atomic(page);
- memset(dst + off, 0xFF, len);
- kunmap_atomic(dst);
+ memset_page(page, off, 0xff, len);
}
blk_status_t null_handle_discard(struct nullb_device *dev,
diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index c73cc57..fdccbf5 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -298,9 +298,7 @@ static inline bool ublk_can_use_task_work(const struct ublk_queue *ubq)
static inline bool ublk_need_get_data(const struct ublk_queue *ubq)
{
- if (ubq->flags & UBLK_F_NEED_GET_DATA)
- return true;
- return false;
+ return ubq->flags & UBLK_F_NEED_GET_DATA;
}
static struct ublk_device *ublk_get_device(struct ublk_device *ub)
@@ -349,25 +347,19 @@ static inline int ublk_queue_cmd_buf_size(struct ublk_device *ub, int q_id)
static inline bool ublk_queue_can_use_recovery_reissue(
struct ublk_queue *ubq)
{
- if ((ubq->flags & UBLK_F_USER_RECOVERY) &&
- (ubq->flags & UBLK_F_USER_RECOVERY_REISSUE))
- return true;
- return false;
+ return (ubq->flags & UBLK_F_USER_RECOVERY) &&
+ (ubq->flags & UBLK_F_USER_RECOVERY_REISSUE);
}
static inline bool ublk_queue_can_use_recovery(
struct ublk_queue *ubq)
{
- if (ubq->flags & UBLK_F_USER_RECOVERY)
- return true;
- return false;
+ return ubq->flags & UBLK_F_USER_RECOVERY;
}
static inline bool ublk_can_use_recovery(struct ublk_device *ub)
{
- if (ub->dev_info.flags & UBLK_F_USER_RECOVERY)
- return true;
- return false;
+ return ub->dev_info.flags & UBLK_F_USER_RECOVERY;
}
static void ublk_free_disk(struct gendisk *disk)
@@ -428,10 +420,9 @@ static const struct block_device_operations ub_fops = {
#define UBLK_MAX_PIN_PAGES 32
struct ublk_map_data {
- const struct ublk_queue *ubq;
const struct request *rq;
- const struct ublk_io *io;
- unsigned max_bytes;
+ unsigned long ubuf;
+ unsigned int len;
};
struct ublk_io_iter {
@@ -488,18 +479,17 @@ static inline unsigned ublk_copy_io_pages(struct ublk_io_iter *data,
return done;
}
-static inline int ublk_copy_user_pages(struct ublk_map_data *data,
- bool to_vm)
+static int ublk_copy_user_pages(struct ublk_map_data *data, bool to_vm)
{
const unsigned int gup_flags = to_vm ? FOLL_WRITE : 0;
- const unsigned long start_vm = data->io->addr;
+ const unsigned long start_vm = data->ubuf;
unsigned int done = 0;
struct ublk_io_iter iter = {
.pg_off = start_vm & (PAGE_SIZE - 1),
.bio = data->rq->bio,
.iter = data->rq->bio->bi_iter,
};
- const unsigned int nr_pages = round_up(data->max_bytes +
+ const unsigned int nr_pages = round_up(data->len +
(start_vm & (PAGE_SIZE - 1)), PAGE_SIZE) >> PAGE_SHIFT;
while (done < nr_pages) {
@@ -512,42 +502,49 @@ static inline int ublk_copy_user_pages(struct ublk_map_data *data,
iter.pages);
if (iter.nr_pages <= 0)
return done == 0 ? iter.nr_pages : done;
- len = ublk_copy_io_pages(&iter, data->max_bytes, to_vm);
+ len = ublk_copy_io_pages(&iter, data->len, to_vm);
for (i = 0; i < iter.nr_pages; i++) {
if (to_vm)
set_page_dirty(iter.pages[i]);
put_page(iter.pages[i]);
}
- data->max_bytes -= len;
+ data->len -= len;
done += iter.nr_pages;
}
return done;
}
+static inline bool ublk_need_map_req(const struct request *req)
+{
+ return ublk_rq_has_data(req) && req_op(req) == REQ_OP_WRITE;
+}
+
+static inline bool ublk_need_unmap_req(const struct request *req)
+{
+ return ublk_rq_has_data(req) && req_op(req) == REQ_OP_READ;
+}
+
static int ublk_map_io(const struct ublk_queue *ubq, const struct request *req,
struct ublk_io *io)
{
const unsigned int rq_bytes = blk_rq_bytes(req);
+
/*
* no zero copy, we delay copy WRITE request data into ublksrv
* context and the big benefit is that pinning pages in current
* context is pretty fast, see ublk_pin_user_pages
*/
- if (req_op(req) != REQ_OP_WRITE && req_op(req) != REQ_OP_FLUSH)
- return rq_bytes;
-
- if (ublk_rq_has_data(req)) {
+ if (ublk_need_map_req(req)) {
struct ublk_map_data data = {
- .ubq = ubq,
.rq = req,
- .io = io,
- .max_bytes = rq_bytes,
+ .ubuf = io->addr,
+ .len = rq_bytes,
};
ublk_copy_user_pages(&data, true);
- return rq_bytes - data.max_bytes;
+ return rq_bytes - data.len;
}
return rq_bytes;
}
@@ -558,19 +555,18 @@ static int ublk_unmap_io(const struct ublk_queue *ubq,
{
const unsigned int rq_bytes = blk_rq_bytes(req);
- if (req_op(req) == REQ_OP_READ && ublk_rq_has_data(req)) {
+ if (ublk_need_unmap_req(req)) {
struct ublk_map_data data = {
- .ubq = ubq,
.rq = req,
- .io = io,
- .max_bytes = io->res,
+ .ubuf = io->addr,
+ .len = io->res,
};
WARN_ON_ONCE(io->res > rq_bytes);
ublk_copy_user_pages(&data, false);
- return io->res - data.max_bytes;
+ return io->res - data.len;
}
return rq_bytes;
}
@@ -655,14 +651,15 @@ static void ublk_complete_rq(struct request *req)
struct ublk_queue *ubq = req->mq_hctx->driver_data;
struct ublk_io *io = &ubq->ios[req->tag];
unsigned int unmapped_bytes;
+ blk_status_t res = BLK_STS_OK;
/* failed read IO if nothing is read */
if (!io->res && req_op(req) == REQ_OP_READ)
io->res = -EIO;
if (io->res < 0) {
- blk_mq_end_request(req, errno_to_blk_status(io->res));
- return;
+ res = errno_to_blk_status(io->res);
+ goto exit;
}
/*
@@ -671,10 +668,8 @@ static void ublk_complete_rq(struct request *req)
*
* Both the two needn't unmap.
*/
- if (req_op(req) != REQ_OP_READ && req_op(req) != REQ_OP_WRITE) {
- blk_mq_end_request(req, BLK_STS_OK);
- return;
- }
+ if (req_op(req) != REQ_OP_READ && req_op(req) != REQ_OP_WRITE)
+ goto exit;
/* for READ request, writing data in iod->addr to rq buffers */
unmapped_bytes = ublk_unmap_io(ubq, req, io);
@@ -691,6 +686,10 @@ static void ublk_complete_rq(struct request *req)
blk_mq_requeue_request(req, true);
else
__blk_mq_end_request(req, BLK_STS_OK);
+
+ return;
+exit:
+ blk_mq_end_request(req, res);
}
/*
@@ -771,9 +770,7 @@ static inline void __ublk_rq_task_work(struct request *req,
return;
}
- if (ublk_need_get_data(ubq) &&
- (req_op(req) == REQ_OP_WRITE ||
- req_op(req) == REQ_OP_FLUSH)) {
+ if (ublk_need_get_data(ubq) && ublk_need_map_req(req)) {
/*
* We have not handled UBLK_IO_NEED_GET_DATA command yet,
* so immepdately pass UBLK_IO_RES_NEED_GET_DATA to ublksrv
diff --git a/drivers/char/random.c b/drivers/char/random.c
index 253f2dd..2b47b50 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -1546,7 +1546,7 @@ const struct file_operations random_fops = {
.compat_ioctl = compat_ptr_ioctl,
.fasync = random_fasync,
.llseek = noop_llseek,
- .splice_read = generic_file_splice_read,
+ .splice_read = direct_splice_read,
.splice_write = iter_file_splice_write,
};
@@ -1557,7 +1557,7 @@ const struct file_operations urandom_fops = {
.compat_ioctl = compat_ptr_ioctl,
.fasync = random_fasync,
.llseek = noop_llseek,
- .splice_read = generic_file_splice_read,
+ .splice_read = direct_splice_read,
.splice_write = iter_file_splice_write,
};
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
index 60b1857..aeeec21 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c
@@ -981,7 +981,12 @@ static bool amdgpu_atcs_pci_probe_handle(struct pci_dev *pdev)
*/
bool amdgpu_acpi_should_gpu_reset(struct amdgpu_device *adev)
{
- if (adev->flags & AMD_IS_APU)
+ if ((adev->flags & AMD_IS_APU) &&
+ adev->gfx.imu.funcs) /* Not need to do mode2 reset for IMU enabled APUs */
+ return false;
+
+ if ((adev->flags & AMD_IS_APU) &&
+ amdgpu_acpi_is_s3_active(adev))
return false;
if (amdgpu_sriov_vf(adev))
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
index e25e1b2..8dc442f 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c
@@ -212,6 +212,21 @@ bool needs_dsc_aux_workaround(struct dc_link *link)
return false;
}
+bool is_synaptics_cascaded_panamera(struct dc_link *link, struct drm_dp_mst_port *port)
+{
+ u8 branch_vendor_data[4] = { 0 }; // Vendor data 0x50C ~ 0x50F
+
+ if (drm_dp_dpcd_read(port->mgr->aux, DP_BRANCH_VENDOR_SPECIFIC_START, &branch_vendor_data, 4) == 4) {
+ if (link->dpcd_caps.branch_dev_id == DP_BRANCH_DEVICE_ID_90CC24 &&
+ IS_SYNAPTICS_CASCADED_PANAMERA(link->dpcd_caps.branch_dev_name, branch_vendor_data)) {
+ DRM_INFO("Synaptics Cascaded MST hub\n");
+ return true;
+ }
+ }
+
+ return false;
+}
+
static bool validate_dsc_caps_on_connector(struct amdgpu_dm_connector *aconnector)
{
struct dc_sink *dc_sink = aconnector->dc_sink;
@@ -235,6 +250,10 @@ static bool validate_dsc_caps_on_connector(struct amdgpu_dm_connector *aconnecto
needs_dsc_aux_workaround(aconnector->dc_link))
aconnector->dsc_aux = &aconnector->mst_root->dm_dp_aux.aux;
+ /* synaptics cascaded MST hub case */
+ if (!aconnector->dsc_aux && is_synaptics_cascaded_panamera(aconnector->dc_link, port))
+ aconnector->dsc_aux = port->mgr->aux;
+
if (!aconnector->dsc_aux)
return false;
@@ -662,12 +681,25 @@ struct dsc_mst_fairness_params {
struct amdgpu_dm_connector *aconnector;
};
-static int kbps_to_peak_pbn(int kbps)
+static uint16_t get_fec_overhead_multiplier(struct dc_link *dc_link)
+{
+ u8 link_coding_cap;
+ uint16_t fec_overhead_multiplier_x1000 = PBN_FEC_OVERHEAD_MULTIPLIER_8B_10B;
+
+ link_coding_cap = dc_link_dp_mst_decide_link_encoding_format(dc_link);
+ if (link_coding_cap == DP_128b_132b_ENCODING)
+ fec_overhead_multiplier_x1000 = PBN_FEC_OVERHEAD_MULTIPLIER_128B_132B;
+
+ return fec_overhead_multiplier_x1000;
+}
+
+static int kbps_to_peak_pbn(int kbps, uint16_t fec_overhead_multiplier_x1000)
{
u64 peak_kbps = kbps;
peak_kbps *= 1006;
- peak_kbps = div_u64(peak_kbps, 1000);
+ peak_kbps *= fec_overhead_multiplier_x1000;
+ peak_kbps = div_u64(peak_kbps, 1000 * 1000);
return (int) DIV64_U64_ROUND_UP(peak_kbps * 64, (54 * 8 * 1000));
}
@@ -761,11 +793,12 @@ static int increase_dsc_bpp(struct drm_atomic_state *state,
int link_timeslots_used;
int fair_pbn_alloc;
int ret = 0;
+ uint16_t fec_overhead_multiplier_x1000 = get_fec_overhead_multiplier(dc_link);
for (i = 0; i < count; i++) {
if (vars[i + k].dsc_enabled) {
initial_slack[i] =
- kbps_to_peak_pbn(params[i].bw_range.max_kbps) - vars[i + k].pbn;
+ kbps_to_peak_pbn(params[i].bw_range.max_kbps, fec_overhead_multiplier_x1000) - vars[i + k].pbn;
bpp_increased[i] = false;
remaining_to_increase += 1;
} else {
@@ -861,6 +894,7 @@ static int try_disable_dsc(struct drm_atomic_state *state,
int next_index;
int remaining_to_try = 0;
int ret;
+ uint16_t fec_overhead_multiplier_x1000 = get_fec_overhead_multiplier(dc_link);
for (i = 0; i < count; i++) {
if (vars[i + k].dsc_enabled
@@ -890,7 +924,7 @@ static int try_disable_dsc(struct drm_atomic_state *state,
if (next_index == -1)
break;
- vars[next_index].pbn = kbps_to_peak_pbn(params[next_index].bw_range.stream_kbps);
+ vars[next_index].pbn = kbps_to_peak_pbn(params[next_index].bw_range.stream_kbps, fec_overhead_multiplier_x1000);
ret = drm_dp_atomic_find_time_slots(state,
params[next_index].port->mgr,
params[next_index].port,
@@ -903,7 +937,7 @@ static int try_disable_dsc(struct drm_atomic_state *state,
vars[next_index].dsc_enabled = false;
vars[next_index].bpp_x16 = 0;
} else {
- vars[next_index].pbn = kbps_to_peak_pbn(params[next_index].bw_range.max_kbps);
+ vars[next_index].pbn = kbps_to_peak_pbn(params[next_index].bw_range.max_kbps, fec_overhead_multiplier_x1000);
ret = drm_dp_atomic_find_time_slots(state,
params[next_index].port->mgr,
params[next_index].port,
@@ -932,6 +966,7 @@ static int compute_mst_dsc_configs_for_link(struct drm_atomic_state *state,
int count = 0;
int i, k, ret;
bool debugfs_overwrite = false;
+ uint16_t fec_overhead_multiplier_x1000 = get_fec_overhead_multiplier(dc_link);
memset(params, 0, sizeof(params));
@@ -993,7 +1028,7 @@ static int compute_mst_dsc_configs_for_link(struct drm_atomic_state *state,
/* Try no compression */
for (i = 0; i < count; i++) {
vars[i + k].aconnector = params[i].aconnector;
- vars[i + k].pbn = kbps_to_peak_pbn(params[i].bw_range.stream_kbps);
+ vars[i + k].pbn = kbps_to_peak_pbn(params[i].bw_range.stream_kbps, fec_overhead_multiplier_x1000);
vars[i + k].dsc_enabled = false;
vars[i + k].bpp_x16 = 0;
ret = drm_dp_atomic_find_time_slots(state, params[i].port->mgr, params[i].port,
@@ -1012,7 +1047,7 @@ static int compute_mst_dsc_configs_for_link(struct drm_atomic_state *state,
/* Try max compression */
for (i = 0; i < count; i++) {
if (params[i].compression_possible && params[i].clock_force_enable != DSC_CLK_FORCE_DISABLE) {
- vars[i + k].pbn = kbps_to_peak_pbn(params[i].bw_range.min_kbps);
+ vars[i + k].pbn = kbps_to_peak_pbn(params[i].bw_range.min_kbps, fec_overhead_multiplier_x1000);
vars[i + k].dsc_enabled = true;
vars[i + k].bpp_x16 = params[i].bw_range.min_target_bpp_x16;
ret = drm_dp_atomic_find_time_slots(state, params[i].port->mgr,
@@ -1020,7 +1055,7 @@ static int compute_mst_dsc_configs_for_link(struct drm_atomic_state *state,
if (ret < 0)
return ret;
} else {
- vars[i + k].pbn = kbps_to_peak_pbn(params[i].bw_range.stream_kbps);
+ vars[i + k].pbn = kbps_to_peak_pbn(params[i].bw_range.stream_kbps, fec_overhead_multiplier_x1000);
vars[i + k].dsc_enabled = false;
vars[i + k].bpp_x16 = 0;
ret = drm_dp_atomic_find_time_slots(state, params[i].port->mgr,
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.h b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.h
index 97fd70d..1e4ede1 100644
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.h
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.h
@@ -34,6 +34,21 @@
#define SYNAPTICS_RC_OFFSET 0x4BC
#define SYNAPTICS_RC_DATA 0x4C0
+#define DP_BRANCH_VENDOR_SPECIFIC_START 0x50C
+
+/**
+ * Panamera MST Hub detection
+ * Offset DPCD 050Eh == 0x5A indicates cascaded MST hub case
+ * Check from beginning of branch device vendor specific field (050Ch)
+ */
+#define IS_SYNAPTICS_PANAMERA(branchDevName) (((int)branchDevName[4] & 0xF0) == 0x50 ? 1 : 0)
+#define BRANCH_HW_REVISION_PANAMERA_A2 0x10
+#define SYNAPTICS_CASCADED_HUB_ID 0x5A
+#define IS_SYNAPTICS_CASCADED_PANAMERA(devName, data) ((IS_SYNAPTICS_PANAMERA(devName) && ((int)data[2] == SYNAPTICS_CASCADED_HUB_ID)) ? 1 : 0)
+
+#define PBN_FEC_OVERHEAD_MULTIPLIER_8B_10B 1031
+#define PBN_FEC_OVERHEAD_MULTIPLIER_128B_132B 1000
+
struct amdgpu_display_manager;
struct amdgpu_dm_connector;
diff --git a/drivers/gpu/drm/drm_buddy.c b/drivers/gpu/drm/drm_buddy.c
index 3d1f50f4..7098f12 100644
--- a/drivers/gpu/drm/drm_buddy.c
+++ b/drivers/gpu/drm/drm_buddy.c
@@ -146,8 +146,8 @@ int drm_buddy_init(struct drm_buddy *mm, u64 size, u64 chunk_size)
unsigned int order;
u64 root_size;
- root_size = rounddown_pow_of_two(size);
- order = ilog2(root_size) - ilog2(chunk_size);
+ order = ilog2(size) - ilog2(chunk_size);
+ root_size = chunk_size << order;
root = drm_block_alloc(mm, NULL, order, offset);
if (!root)
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_drv.c b/drivers/gpu/drm/etnaviv/etnaviv_drv.c
index 44ca803..31a7f59 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_drv.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_drv.c
@@ -22,7 +22,6 @@
#include "etnaviv_gem.h"
#include "etnaviv_mmu.h"
#include "etnaviv_perfmon.h"
-#include "common.xml.h"
/*
* DRM operations:
@@ -476,47 +475,7 @@ static const struct drm_ioctl_desc etnaviv_ioctls[] = {
ETNA_IOCTL(PM_QUERY_SIG, pm_query_sig, DRM_RENDER_ALLOW),
};
-static void etnaviv_fop_show_fdinfo(struct seq_file *m, struct file *f)
-{
- struct drm_file *file = f->private_data;
- struct drm_device *dev = file->minor->dev;
- struct etnaviv_drm_private *priv = dev->dev_private;
- struct etnaviv_file_private *ctx = file->driver_priv;
-
- /*
- * For a description of the text output format used here, see
- * Documentation/gpu/drm-usage-stats.rst.
- */
- seq_printf(m, "drm-driver:\t%s\n", dev->driver->name);
- seq_printf(m, "drm-client-id:\t%u\n", ctx->id);
-
- for (int i = 0; i < ETNA_MAX_PIPES; i++) {
- struct etnaviv_gpu *gpu = priv->gpu[i];
- char engine[10] = "UNK";
- int cur = 0;
-
- if (!gpu)
- continue;
-
- if (gpu->identity.features & chipFeatures_PIPE_2D)
- cur = snprintf(engine, sizeof(engine), "2D");
- if (gpu->identity.features & chipFeatures_PIPE_3D)
- cur = snprintf(engine + cur, sizeof(engine) - cur,
- "%s3D", cur ? "/" : "");
- if (gpu->identity.nn_core_count > 0)
- cur = snprintf(engine + cur, sizeof(engine) - cur,
- "%sNN", cur ? "/" : "");
-
- seq_printf(m, "drm-engine-%s:\t%llu ns\n", engine,
- ctx->sched_entity[i].elapsed_ns);
- }
-}
-
-static const struct file_operations fops = {
- .owner = THIS_MODULE,
- DRM_GEM_FOPS,
- .show_fdinfo = etnaviv_fop_show_fdinfo,
-};
+DEFINE_DRM_GEM_FOPS(fops);
static const struct drm_driver etnaviv_drm_driver = {
.driver_features = DRIVER_GEM | DRIVER_RENDER,
diff --git a/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c b/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c
index 7031db1..3524b58 100644
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c
+++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c
@@ -91,7 +91,15 @@ static void *etnaviv_gem_prime_vmap_impl(struct etnaviv_gem_object *etnaviv_obj)
static int etnaviv_gem_prime_mmap_obj(struct etnaviv_gem_object *etnaviv_obj,
struct vm_area_struct *vma)
{
- return dma_buf_mmap(etnaviv_obj->base.dma_buf, vma, 0);
+ int ret;
+
+ ret = dma_buf_mmap(etnaviv_obj->base.dma_buf, vma, 0);
+ if (!ret) {
+ /* Drop the reference acquired by drm_gem_mmap_obj(). */
+ drm_gem_object_put(&etnaviv_obj->base);
+ }
+
+ return ret;
}
static const struct etnaviv_gem_ops etnaviv_gem_prime_ops = {
diff --git a/drivers/gpu/drm/i915/display/intel_color.c b/drivers/gpu/drm/i915/display/intel_color.c
index 8d97c29..bd598a7 100644
--- a/drivers/gpu/drm/i915/display/intel_color.c
+++ b/drivers/gpu/drm/i915/display/intel_color.c
@@ -47,6 +47,11 @@ struct intel_color_funcs {
*/
void (*color_commit_arm)(const struct intel_crtc_state *crtc_state);
/*
+ * Perform any extra tasks needed after all the
+ * double buffered registers have been latched.
+ */
+ void (*color_post_update)(const struct intel_crtc_state *crtc_state);
+ /*
* Load LUTs (and other single buffered color management
* registers). Will (hopefully) be called during the vblank
* following the latching of any double buffered registers
@@ -614,9 +619,33 @@ static void ilk_lut_12p4_pack(struct drm_color_lut *entry, u32 ldw, u32 udw)
static void icl_color_commit_noarm(const struct intel_crtc_state *crtc_state)
{
+ /*
+ * Despite Wa_1406463849, ICL no longer suffers from the SKL
+ * DC5/PSR CSC black screen issue (see skl_color_commit_noarm()).
+ * Possibly due to the extra sticky CSC arming
+ * (see icl_color_post_update()).
+ *
+ * On TGL+ all CSC arming issues have been properly fixed.
+ */
icl_load_csc_matrix(crtc_state);
}
+static void skl_color_commit_noarm(const struct intel_crtc_state *crtc_state)
+{
+ /*
+ * Possibly related to display WA #1184, SKL CSC loses the latched
+ * CSC coeff/offset register values if the CSC registers are disarmed
+ * between DC5 exit and PSR exit. This will cause the plane(s) to
+ * output all black (until CSC_MODE is rearmed and properly latched).
+ * Once PSR exit (and proper register latching) has occurred the
+ * danger is over. Thus when PSR is enabled the CSC coeff/offset
+ * register programming will be peformed from skl_color_commit_arm()
+ * which is called after PSR exit.
+ */
+ if (!crtc_state->has_psr)
+ ilk_load_csc_matrix(crtc_state);
+}
+
static void ilk_color_commit_noarm(const struct intel_crtc_state *crtc_state)
{
ilk_load_csc_matrix(crtc_state);
@@ -659,6 +688,9 @@ static void skl_color_commit_arm(const struct intel_crtc_state *crtc_state)
enum pipe pipe = crtc->pipe;
u32 val = 0;
+ if (crtc_state->has_psr)
+ ilk_load_csc_matrix(crtc_state);
+
/*
* We don't (yet) allow userspace to control the pipe background color,
* so force it to black, but apply pipe gamma and CSC appropriately
@@ -677,6 +709,47 @@ static void skl_color_commit_arm(const struct intel_crtc_state *crtc_state)
crtc_state->csc_mode);
}
+static void icl_color_commit_arm(const struct intel_crtc_state *crtc_state)
+{
+ struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
+ struct drm_i915_private *i915 = to_i915(crtc->base.dev);
+ enum pipe pipe = crtc->pipe;
+
+ /*
+ * We don't (yet) allow userspace to control the pipe background color,
+ * so force it to black.
+ */
+ intel_de_write(i915, SKL_BOTTOM_COLOR(pipe), 0);
+
+ intel_de_write(i915, GAMMA_MODE(crtc->pipe),
+ crtc_state->gamma_mode);
+
+ intel_de_write_fw(i915, PIPE_CSC_MODE(crtc->pipe),
+ crtc_state->csc_mode);
+}
+
+static void icl_color_post_update(const struct intel_crtc_state *crtc_state)
+{
+ struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
+ struct drm_i915_private *i915 = to_i915(crtc->base.dev);
+
+ /*
+ * Despite Wa_1406463849, ICL CSC is no longer disarmed by
+ * coeff/offset register *writes*. Instead, once CSC_MODE
+ * is armed it stays armed, even after it has been latched.
+ * Afterwards the coeff/offset registers become effectively
+ * self-arming. That self-arming must be disabled before the
+ * next icl_color_commit_noarm() tries to write the next set
+ * of coeff/offset registers. Fortunately register *reads*
+ * do still disarm the CSC. Naturally this must not be done
+ * until the previously written CSC registers have actually
+ * been latched.
+ *
+ * TGL+ no longer need this workaround.
+ */
+ intel_de_read_fw(i915, PIPE_CSC_PREOFF_HI(crtc->pipe));
+}
+
static struct drm_property_blob *
create_linear_lut(struct drm_i915_private *i915, int lut_size)
{
@@ -1373,6 +1446,14 @@ void intel_color_commit_arm(const struct intel_crtc_state *crtc_state)
i915->display.funcs.color->color_commit_arm(crtc_state);
}
+void intel_color_post_update(const struct intel_crtc_state *crtc_state)
+{
+ struct drm_i915_private *i915 = to_i915(crtc_state->uapi.crtc->dev);
+
+ if (i915->display.funcs.color->color_post_update)
+ i915->display.funcs.color->color_post_update(crtc_state);
+}
+
void intel_color_prepare_commit(struct intel_crtc_state *crtc_state)
{
struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc);
@@ -3064,10 +3145,20 @@ static const struct intel_color_funcs i9xx_color_funcs = {
.lut_equal = i9xx_lut_equal,
};
+static const struct intel_color_funcs tgl_color_funcs = {
+ .color_check = icl_color_check,
+ .color_commit_noarm = icl_color_commit_noarm,
+ .color_commit_arm = icl_color_commit_arm,
+ .load_luts = icl_load_luts,
+ .read_luts = icl_read_luts,
+ .lut_equal = icl_lut_equal,
+};
+
static const struct intel_color_funcs icl_color_funcs = {
.color_check = icl_color_check,
.color_commit_noarm = icl_color_commit_noarm,
- .color_commit_arm = skl_color_commit_arm,
+ .color_commit_arm = icl_color_commit_arm,
+ .color_post_update = icl_color_post_update,
.load_luts = icl_load_luts,
.read_luts = icl_read_luts,
.lut_equal = icl_lut_equal,
@@ -3075,7 +3166,7 @@ static const struct intel_color_funcs icl_color_funcs = {
static const struct intel_color_funcs glk_color_funcs = {
.color_check = glk_color_check,
- .color_commit_noarm = ilk_color_commit_noarm,
+ .color_commit_noarm = skl_color_commit_noarm,
.color_commit_arm = skl_color_commit_arm,
.load_luts = glk_load_luts,
.read_luts = glk_read_luts,
@@ -3084,7 +3175,7 @@ static const struct intel_color_funcs glk_color_funcs = {
static const struct intel_color_funcs skl_color_funcs = {
.color_check = ivb_color_check,
- .color_commit_noarm = ilk_color_commit_noarm,
+ .color_commit_noarm = skl_color_commit_noarm,
.color_commit_arm = skl_color_commit_arm,
.load_luts = bdw_load_luts,
.read_luts = bdw_read_luts,
@@ -3180,7 +3271,9 @@ void intel_color_init_hooks(struct drm_i915_private *i915)
else
i915->display.funcs.color = &i9xx_color_funcs;
} else {
- if (DISPLAY_VER(i915) >= 11)
+ if (DISPLAY_VER(i915) >= 12)
+ i915->display.funcs.color = &tgl_color_funcs;
+ else if (DISPLAY_VER(i915) == 11)
i915->display.funcs.color = &icl_color_funcs;
else if (DISPLAY_VER(i915) == 10)
i915->display.funcs.color = &glk_color_funcs;
diff --git a/drivers/gpu/drm/i915/display/intel_color.h b/drivers/gpu/drm/i915/display/intel_color.h
index d620b5b..8002492 100644
--- a/drivers/gpu/drm/i915/display/intel_color.h
+++ b/drivers/gpu/drm/i915/display/intel_color.h
@@ -21,6 +21,7 @@ void intel_color_prepare_commit(struct intel_crtc_state *crtc_state);
void intel_color_cleanup_commit(struct intel_crtc_state *crtc_state);
void intel_color_commit_noarm(const struct intel_crtc_state *crtc_state);
void intel_color_commit_arm(const struct intel_crtc_state *crtc_state);
+void intel_color_post_update(const struct intel_crtc_state *crtc_state);
void intel_color_load_luts(const struct intel_crtc_state *crtc_state);
void intel_color_get_config(struct intel_crtc_state *crtc_state);
bool intel_color_lut_equal(const struct intel_crtc_state *crtc_state,
diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index 208b1b5..63b4b73 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -1209,6 +1209,9 @@ static void intel_post_plane_update(struct intel_atomic_state *state,
if (needs_cursorclk_wa(old_crtc_state) &&
!needs_cursorclk_wa(new_crtc_state))
icl_wa_cursorclkgating(dev_priv, pipe, false);
+
+ if (intel_crtc_needs_color_update(new_crtc_state))
+ intel_color_post_update(new_crtc_state);
}
static void intel_crtc_enable_flip_done(struct intel_atomic_state *state,
@@ -7091,6 +7094,8 @@ static void intel_update_crtc(struct intel_atomic_state *state,
intel_fbc_update(state, crtc);
+ drm_WARN_ON(&i915->drm, !intel_display_power_is_enabled(i915, POWER_DOMAIN_DC_OFF));
+
if (!modeset &&
intel_crtc_needs_color_update(new_crtc_state))
intel_color_commit_noarm(new_crtc_state);
@@ -7458,8 +7463,28 @@ static void intel_atomic_commit_tail(struct intel_atomic_state *state)
drm_atomic_helper_wait_for_dependencies(&state->base);
drm_dp_mst_atomic_wait_for_dependencies(&state->base);
- if (state->modeset)
- wakeref = intel_display_power_get(dev_priv, POWER_DOMAIN_MODESET);
+ /*
+ * During full modesets we write a lot of registers, wait
+ * for PLLs, etc. Doing that while DC states are enabled
+ * is not a good idea.
+ *
+ * During fastsets and other updates we also need to
+ * disable DC states due to the following scenario:
+ * 1. DC5 exit and PSR exit happen
+ * 2. Some or all _noarm() registers are written
+ * 3. Due to some long delay PSR is re-entered
+ * 4. DC5 entry -> DMC saves the already written new
+ * _noarm() registers and the old not yet written
+ * _arm() registers
+ * 5. DC5 exit -> DMC restores a mixture of old and
+ * new register values and arms the update
+ * 6. PSR exit -> hardware latches a mixture of old and
+ * new register values -> corrupted frame, or worse
+ * 7. New _arm() registers are finally written
+ * 8. Hardware finally latches a complete set of new
+ * register values, and subsequent frames will be OK again
+ */
+ wakeref = intel_display_power_get(dev_priv, POWER_DOMAIN_DC_OFF);
intel_atomic_prepare_plane_clear_colors(state);
@@ -7608,8 +7633,8 @@ static void intel_atomic_commit_tail(struct intel_atomic_state *state)
* the culprit.
*/
intel_uncore_arm_unclaimed_mmio_detection(&dev_priv->uncore);
- intel_display_power_put(dev_priv, POWER_DOMAIN_MODESET, wakeref);
}
+ intel_display_power_put(dev_priv, POWER_DOMAIN_DC_OFF, wakeref);
intel_runtime_pm_put(&dev_priv->runtime_pm, state->wakeref);
/*
diff --git a/drivers/gpu/drm/i915/display/intel_dpt.c b/drivers/gpu/drm/i915/display/intel_dpt.c
index ad1a37b..2a9f40a 100644
--- a/drivers/gpu/drm/i915/display/intel_dpt.c
+++ b/drivers/gpu/drm/i915/display/intel_dpt.c
@@ -301,6 +301,7 @@ intel_dpt_create(struct intel_framebuffer *fb)
vm->pte_encode = gen8_ggtt_pte_encode;
dpt->obj = dpt_obj;
+ dpt->obj->is_dpt = true;
return &dpt->vm;
}
@@ -309,5 +310,6 @@ void intel_dpt_destroy(struct i915_address_space *vm)
{
struct i915_dpt *dpt = i915_vm_to_dpt(vm);
+ dpt->obj->is_dpt = false;
i915_vm_put(&dpt->vm);
}
diff --git a/drivers/gpu/drm/i915/display/intel_tc.c b/drivers/gpu/drm/i915/display/intel_tc.c
index f453287..be510b9 100644
--- a/drivers/gpu/drm/i915/display/intel_tc.c
+++ b/drivers/gpu/drm/i915/display/intel_tc.c
@@ -418,9 +418,9 @@ static bool icl_tc_phy_is_owned(struct intel_digital_port *dig_port)
val = intel_de_read(i915, PORT_TX_DFLEXDPCSSS(dig_port->tc_phy_fia));
if (val == 0xffffffff) {
drm_dbg_kms(&i915->drm,
- "Port %s: PHY in TCCOLD, assume safe mode\n",
+ "Port %s: PHY in TCCOLD, assume not owned\n",
dig_port->tc_port_name);
- return true;
+ return false;
}
return val & DP_PHY_MODE_STATUS_NOT_SAFE(dig_port->tc_phy_fia_idx);
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
index 8949fb0..3198b64 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c
@@ -127,7 +127,8 @@ i915_gem_object_create_lmem_from_data(struct drm_i915_private *i915,
memcpy(map, data, size);
- i915_gem_object_unpin_map(obj);
+ i915_gem_object_flush_map(obj);
+ __i915_gem_object_release_map(obj);
return obj;
}
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object.h b/drivers/gpu/drm/i915/gem/i915_gem_object.h
index f9a8acb..885ccde 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h
@@ -303,7 +303,7 @@ i915_gem_object_never_mmap(const struct drm_i915_gem_object *obj)
static inline bool
i915_gem_object_is_framebuffer(const struct drm_i915_gem_object *obj)
{
- return READ_ONCE(obj->frontbuffer);
+ return READ_ONCE(obj->frontbuffer) || obj->is_dpt;
}
static inline unsigned int
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
index 19c9bdd..5dcbbef 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
+++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h
@@ -491,6 +491,9 @@ struct drm_i915_gem_object {
*/
unsigned int cache_dirty:1;
+ /* @is_dpt: Object houses a display page table (DPT) */
+ unsigned int is_dpt:1;
+
/**
* @read_domains: Read memory domains.
*
diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c
index f5d7b51..2c92fa9 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.c
+++ b/drivers/gpu/drm/i915/gt/intel_rps.c
@@ -2075,16 +2075,6 @@ void intel_rps_sanitize(struct intel_rps *rps)
rps_disable_interrupts(rps);
}
-u32 intel_rps_read_rpstat_fw(struct intel_rps *rps)
-{
- struct drm_i915_private *i915 = rps_to_i915(rps);
- i915_reg_t rpstat;
-
- rpstat = (GRAPHICS_VER(i915) >= 12) ? GEN12_RPSTAT1 : GEN6_RPSTAT1;
-
- return intel_uncore_read_fw(rps_to_gt(rps)->uncore, rpstat);
-}
-
u32 intel_rps_read_rpstat(struct intel_rps *rps)
{
struct drm_i915_private *i915 = rps_to_i915(rps);
@@ -2095,7 +2085,7 @@ u32 intel_rps_read_rpstat(struct intel_rps *rps)
return intel_uncore_read(rps_to_gt(rps)->uncore, rpstat);
}
-u32 intel_rps_get_cagf(struct intel_rps *rps, u32 rpstat)
+static u32 intel_rps_get_cagf(struct intel_rps *rps, u32 rpstat)
{
struct drm_i915_private *i915 = rps_to_i915(rps);
u32 cagf;
@@ -2118,10 +2108,11 @@ u32 intel_rps_get_cagf(struct intel_rps *rps, u32 rpstat)
return cagf;
}
-static u32 read_cagf(struct intel_rps *rps)
+static u32 __read_cagf(struct intel_rps *rps, bool take_fw)
{
struct drm_i915_private *i915 = rps_to_i915(rps);
struct intel_uncore *uncore = rps_to_uncore(rps);
+ i915_reg_t r = INVALID_MMIO_REG;
u32 freq;
/*
@@ -2129,22 +2120,30 @@ static u32 read_cagf(struct intel_rps *rps)
* registers will return 0 freq when GT is in RC6
*/
if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70)) {
- freq = intel_uncore_read(uncore, MTL_MIRROR_TARGET_WP1);
+ r = MTL_MIRROR_TARGET_WP1;
} else if (GRAPHICS_VER(i915) >= 12) {
- freq = intel_uncore_read(uncore, GEN12_RPSTAT1);
+ r = GEN12_RPSTAT1;
} else if (IS_VALLEYVIEW(i915) || IS_CHERRYVIEW(i915)) {
vlv_punit_get(i915);
freq = vlv_punit_read(i915, PUNIT_REG_GPU_FREQ_STS);
vlv_punit_put(i915);
} else if (GRAPHICS_VER(i915) >= 6) {
- freq = intel_uncore_read(uncore, GEN6_RPSTAT1);
+ r = GEN6_RPSTAT1;
} else {
- freq = intel_uncore_read(uncore, MEMSTAT_ILK);
+ r = MEMSTAT_ILK;
}
+ if (i915_mmio_reg_valid(r))
+ freq = take_fw ? intel_uncore_read(uncore, r) : intel_uncore_read_fw(uncore, r);
+
return intel_rps_get_cagf(rps, freq);
}
+static u32 read_cagf(struct intel_rps *rps)
+{
+ return __read_cagf(rps, true);
+}
+
u32 intel_rps_read_actual_frequency(struct intel_rps *rps)
{
struct intel_runtime_pm *rpm = rps_to_uncore(rps)->rpm;
@@ -2157,7 +2156,12 @@ u32 intel_rps_read_actual_frequency(struct intel_rps *rps)
return freq;
}
-u32 intel_rps_read_punit_req(struct intel_rps *rps)
+u32 intel_rps_read_actual_frequency_fw(struct intel_rps *rps)
+{
+ return intel_gpu_freq(rps, __read_cagf(rps, false));
+}
+
+static u32 intel_rps_read_punit_req(struct intel_rps *rps)
{
struct intel_uncore *uncore = rps_to_uncore(rps);
struct intel_runtime_pm *rpm = rps_to_uncore(rps)->rpm;
diff --git a/drivers/gpu/drm/i915/gt/intel_rps.h b/drivers/gpu/drm/i915/gt/intel_rps.h
index c622962..a3fa987 100644
--- a/drivers/gpu/drm/i915/gt/intel_rps.h
+++ b/drivers/gpu/drm/i915/gt/intel_rps.h
@@ -37,8 +37,8 @@ void intel_rps_mark_interactive(struct intel_rps *rps, bool interactive);
int intel_gpu_freq(struct intel_rps *rps, int val);
int intel_freq_opcode(struct intel_rps *rps, int val);
-u32 intel_rps_get_cagf(struct intel_rps *rps, u32 rpstat1);
u32 intel_rps_read_actual_frequency(struct intel_rps *rps);
+u32 intel_rps_read_actual_frequency_fw(struct intel_rps *rps);
u32 intel_rps_get_requested_frequency(struct intel_rps *rps);
u32 intel_rps_get_min_frequency(struct intel_rps *rps);
u32 intel_rps_get_min_raw_freq(struct intel_rps *rps);
@@ -49,10 +49,8 @@ int intel_rps_set_max_frequency(struct intel_rps *rps, u32 val);
u32 intel_rps_get_rp0_frequency(struct intel_rps *rps);
u32 intel_rps_get_rp1_frequency(struct intel_rps *rps);
u32 intel_rps_get_rpn_frequency(struct intel_rps *rps);
-u32 intel_rps_read_punit_req(struct intel_rps *rps);
u32 intel_rps_read_punit_req_frequency(struct intel_rps *rps);
u32 intel_rps_read_rpstat(struct intel_rps *rps);
-u32 intel_rps_read_rpstat_fw(struct intel_rps *rps);
void gen6_rps_get_freq_caps(struct intel_rps *rps, struct intel_rps_freq_caps *caps);
void intel_rps_raise_unslice(struct intel_rps *rps);
void intel_rps_lower_unslice(struct intel_rps *rps);
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c
index 824a34e..283a4a3 100644
--- a/drivers/gpu/drm/i915/i915_perf.c
+++ b/drivers/gpu/drm/i915/i915_perf.c
@@ -1592,9 +1592,7 @@ static void i915_oa_stream_destroy(struct i915_perf_stream *stream)
/*
* Wa_16011777198:dg2: Unset the override of GUCRC mode to enable rc6.
*/
- if (intel_uc_uses_guc_rc(>->uc) &&
- (IS_DG2_GRAPHICS_STEP(gt->i915, G10, STEP_A0, STEP_C0) ||
- IS_DG2_GRAPHICS_STEP(gt->i915, G11, STEP_A0, STEP_B0)))
+ if (stream->override_gucrc)
drm_WARN_ON(>->i915->drm,
intel_guc_slpc_unset_gucrc_mode(>->uc.guc.slpc));
@@ -3305,8 +3303,10 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
if (ret) {
drm_dbg(&stream->perf->i915->drm,
"Unable to override gucrc mode\n");
- goto err_config;
+ goto err_gucrc;
}
+
+ stream->override_gucrc = true;
}
ret = alloc_oa_buffer(stream);
@@ -3345,11 +3345,15 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream,
free_oa_buffer(stream);
err_oa_buf_alloc:
- free_oa_configs(stream);
+ if (stream->override_gucrc)
+ intel_guc_slpc_unset_gucrc_mode(>->uc.guc.slpc);
+err_gucrc:
intel_uncore_forcewake_put(stream->uncore, FORCEWAKE_ALL);
intel_engine_pm_put(stream->engine);
+ free_oa_configs(stream);
+
err_config:
free_noa_wait(stream);
diff --git a/drivers/gpu/drm/i915/i915_perf_types.h b/drivers/gpu/drm/i915/i915_perf_types.h
index ca150b7..4d5d8c3 100644
--- a/drivers/gpu/drm/i915/i915_perf_types.h
+++ b/drivers/gpu/drm/i915/i915_perf_types.h
@@ -316,6 +316,12 @@ struct i915_perf_stream {
* buffer should be checked for available data.
*/
u64 poll_oa_period;
+
+ /**
+ * @override_gucrc: GuC RC has been overridden for the perf stream,
+ * and we need to restore the default configuration on release.
+ */
+ bool override_gucrc;
};
/**
diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c
index 52531ab..6d422b0 100644
--- a/drivers/gpu/drm/i915/i915_pmu.c
+++ b/drivers/gpu/drm/i915/i915_pmu.c
@@ -393,14 +393,12 @@ frequency_sample(struct intel_gt *gt, unsigned int period_ns)
* case we assume the system is running at the intended
* frequency. Fortunately, the read should rarely fail!
*/
- val = intel_rps_read_rpstat_fw(rps);
- if (val)
- val = intel_rps_get_cagf(rps, val);
- else
- val = rps->cur_freq;
+ val = intel_rps_read_actual_frequency_fw(rps);
+ if (!val)
+ val = intel_gpu_freq(rps, rps->cur_freq);
add_sample_mult(&pmu->sample[__I915_SAMPLE_FREQ_ACT],
- intel_gpu_freq(rps, val), period_ns / 1000);
+ val, period_ns / 1000);
}
if (pmu->enable & config_mask(I915_PMU_REQUESTED_FREQUENCY)) {
diff --git a/drivers/gpu/drm/nouveau/nouveau_backlight.c b/drivers/gpu/drm/nouveau/nouveau_backlight.c
index 40409a2..91b5ecc 100644
--- a/drivers/gpu/drm/nouveau/nouveau_backlight.c
+++ b/drivers/gpu/drm/nouveau/nouveau_backlight.c
@@ -33,6 +33,7 @@
#include <linux/apple-gmux.h>
#include <linux/backlight.h>
#include <linux/idr.h>
+#include <drm/drm_probe_helper.h>
#include "nouveau_drv.h"
#include "nouveau_reg.h"
@@ -299,8 +300,12 @@ nv50_backlight_init(struct nouveau_backlight *bl,
struct nouveau_drm *drm = nouveau_drm(nv_encoder->base.base.dev);
struct nvif_object *device = &drm->client.device.object;
+ /*
+ * Note when this runs the connectors have not been probed yet,
+ * so nv_conn->base.status is not set yet.
+ */
if (!nvif_rd32(device, NV50_PDISP_SOR_PWM_CTL(ffs(nv_encoder->dcb->or) - 1)) ||
- nv_conn->base.status != connector_status_connected)
+ drm_helper_probe_detect(&nv_conn->base, NULL, false) != connector_status_connected)
return -ENODEV;
if (nv_conn->type == DCB_CONNECTOR_eDP) {
diff --git a/drivers/gpu/drm/scheduler/sched_main.c b/drivers/gpu/drm/scheduler/sched_main.c
index 4e6ad6e..0e43784 100644
--- a/drivers/gpu/drm/scheduler/sched_main.c
+++ b/drivers/gpu/drm/scheduler/sched_main.c
@@ -906,12 +906,6 @@ drm_sched_get_cleanup_job(struct drm_gpu_scheduler *sched)
spin_unlock(&sched->job_list_lock);
- if (job) {
- job->entity->elapsed_ns += ktime_to_ns(
- ktime_sub(job->s_fence->finished.timestamp,
- job->s_fence->scheduled.timestamp));
- }
-
return job;
}
diff --git a/drivers/gpu/drm/tests/drm_buddy_test.c b/drivers/gpu/drm/tests/drm_buddy_test.c
index f8ee714..09ee6f6 100644
--- a/drivers/gpu/drm/tests/drm_buddy_test.c
+++ b/drivers/gpu/drm/tests/drm_buddy_test.c
@@ -89,7 +89,8 @@ static int check_block(struct kunit *test, struct drm_buddy *mm,
err = -EINVAL;
}
- if (!is_power_of_2(block_size)) {
+ /* We can't use is_power_of_2() for a u64 on 32-bit systems. */
+ if (block_size & (block_size - 1)) {
kunit_err(test, "block size not power of two\n");
err = -EINVAL;
}
diff --git a/drivers/input/joystick/xpad.c b/drivers/input/joystick/xpad.c
index f642ec8..29131f1 100644
--- a/drivers/input/joystick/xpad.c
+++ b/drivers/input/joystick/xpad.c
@@ -781,9 +781,6 @@ static void xpad_process_packet(struct usb_xpad *xpad, u16 cmd, unsigned char *d
input_report_key(dev, BTN_C, data[8]);
input_report_key(dev, BTN_Z, data[9]);
- /* Profile button has a value of 0-3, so it is reported as an axis */
- if (xpad->mapping & MAP_PROFILE_BUTTON)
- input_report_abs(dev, ABS_PROFILE, data[34]);
input_sync(dev);
}
@@ -1061,6 +1058,10 @@ static void xpadone_process_packet(struct usb_xpad *xpad, u16 cmd, unsigned char
(__u16) le16_to_cpup((__le16 *)(data + 8)));
}
+ /* Profile button has a value of 0-3, so it is reported as an axis */
+ if (xpad->mapping & MAP_PROFILE_BUTTON)
+ input_report_abs(dev, ABS_PROFILE, data[34]);
+
/* paddle handling */
/* based on SDL's SDL_hidapi_xboxone.c */
if (xpad->mapping & MAP_PADDLES) {
diff --git a/drivers/input/mouse/alps.c b/drivers/input/mouse/alps.c
index 989228b..e2c11d9 100644
--- a/drivers/input/mouse/alps.c
+++ b/drivers/input/mouse/alps.c
@@ -852,8 +852,8 @@ static void alps_process_packet_v6(struct psmouse *psmouse)
x = y = z = 0;
/* Divide 4 since trackpoint's speed is too fast */
- input_report_rel(dev2, REL_X, (char)x / 4);
- input_report_rel(dev2, REL_Y, -((char)y / 4));
+ input_report_rel(dev2, REL_X, (s8)x / 4);
+ input_report_rel(dev2, REL_Y, -((s8)y / 4));
psmouse_report_standard_buttons(dev2, packet[3]);
@@ -1104,8 +1104,8 @@ static void alps_process_trackstick_packet_v7(struct psmouse *psmouse)
((packet[3] & 0x20) << 1);
z = (packet[5] & 0x3f) | ((packet[3] & 0x80) >> 1);
- input_report_rel(dev2, REL_X, (char)x);
- input_report_rel(dev2, REL_Y, -((char)y));
+ input_report_rel(dev2, REL_X, (s8)x);
+ input_report_rel(dev2, REL_Y, -((s8)y));
input_report_abs(dev2, ABS_PRESSURE, z);
psmouse_report_standard_buttons(dev2, packet[1]);
@@ -2294,20 +2294,20 @@ static int alps_get_v3_v7_resolution(struct psmouse *psmouse, int reg_pitch)
if (reg < 0)
return reg;
- x_pitch = (char)(reg << 4) >> 4; /* sign extend lower 4 bits */
+ x_pitch = (s8)(reg << 4) >> 4; /* sign extend lower 4 bits */
x_pitch = 50 + 2 * x_pitch; /* In 0.1 mm units */
- y_pitch = (char)reg >> 4; /* sign extend upper 4 bits */
+ y_pitch = (s8)reg >> 4; /* sign extend upper 4 bits */
y_pitch = 36 + 2 * y_pitch; /* In 0.1 mm units */
reg = alps_command_mode_read_reg(psmouse, reg_pitch + 1);
if (reg < 0)
return reg;
- x_electrode = (char)(reg << 4) >> 4; /* sign extend lower 4 bits */
+ x_electrode = (s8)(reg << 4) >> 4; /* sign extend lower 4 bits */
x_electrode = 17 + x_electrode;
- y_electrode = (char)reg >> 4; /* sign extend upper 4 bits */
+ y_electrode = (s8)reg >> 4; /* sign extend upper 4 bits */
y_electrode = 13 + y_electrode;
x_phys = x_pitch * (x_electrode - 1); /* In 0.1 mm units */
diff --git a/drivers/input/mouse/focaltech.c b/drivers/input/mouse/focaltech.c
index 6fd5fff..c74b990 100644
--- a/drivers/input/mouse/focaltech.c
+++ b/drivers/input/mouse/focaltech.c
@@ -202,8 +202,8 @@ static void focaltech_process_rel_packet(struct psmouse *psmouse,
state->pressed = packet[0] >> 7;
finger1 = ((packet[0] >> 4) & 0x7) - 1;
if (finger1 < FOC_MAX_FINGERS) {
- state->fingers[finger1].x += (char)packet[1];
- state->fingers[finger1].y += (char)packet[2];
+ state->fingers[finger1].x += (s8)packet[1];
+ state->fingers[finger1].y += (s8)packet[2];
} else {
psmouse_err(psmouse, "First finger in rel packet invalid: %d\n",
finger1);
@@ -218,8 +218,8 @@ static void focaltech_process_rel_packet(struct psmouse *psmouse,
*/
finger2 = ((packet[3] >> 4) & 0x7) - 1;
if (finger2 < FOC_MAX_FINGERS) {
- state->fingers[finger2].x += (char)packet[4];
- state->fingers[finger2].y += (char)packet[5];
+ state->fingers[finger2].x += (s8)packet[4];
+ state->fingers[finger2].y += (s8)packet[5];
}
}
diff --git a/drivers/input/serio/i8042-acpipnpio.h b/drivers/input/serio/i8042-acpipnpio.h
index efc6173..028e45b 100644
--- a/drivers/input/serio/i8042-acpipnpio.h
+++ b/drivers/input/serio/i8042-acpipnpio.h
@@ -611,6 +611,14 @@ static const struct dmi_system_id i8042_dmi_quirk_table[] __initconst = {
.driver_data = (void *)(SERIO_QUIRK_NOMUX)
},
{
+ /* Fujitsu Lifebook A574/H */
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "FUJITSU"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "FMVA0501PZ"),
+ },
+ .driver_data = (void *)(SERIO_QUIRK_NOMUX)
+ },
+ {
/* Gigabyte M912 */
.matches = {
DMI_MATCH(DMI_SYS_VENDOR, "GIGABYTE"),
@@ -1117,6 +1125,20 @@ static const struct dmi_system_id i8042_dmi_quirk_table[] __initconst = {
SERIO_QUIRK_NOLOOP | SERIO_QUIRK_NOPNP)
},
{
+ /*
+ * Setting SERIO_QUIRK_NOMUX or SERIO_QUIRK_RESET_ALWAYS makes
+ * the keyboard very laggy for ~5 seconds after boot and
+ * sometimes also after resume.
+ * However both are required for the keyboard to not fail
+ * completely sometimes after boot or resume.
+ */
+ .matches = {
+ DMI_MATCH(DMI_BOARD_NAME, "N150CU"),
+ },
+ .driver_data = (void *)(SERIO_QUIRK_NOMUX | SERIO_QUIRK_RESET_ALWAYS |
+ SERIO_QUIRK_NOLOOP | SERIO_QUIRK_NOPNP)
+ },
+ {
.matches = {
DMI_MATCH(DMI_BOARD_NAME, "NH5xAx"),
},
@@ -1124,6 +1146,20 @@ static const struct dmi_system_id i8042_dmi_quirk_table[] __initconst = {
SERIO_QUIRK_NOLOOP | SERIO_QUIRK_NOPNP)
},
{
+ /*
+ * Setting SERIO_QUIRK_NOMUX or SERIO_QUIRK_RESET_ALWAYS makes
+ * the keyboard very laggy for ~5 seconds after boot and
+ * sometimes also after resume.
+ * However both are required for the keyboard to not fail
+ * completely sometimes after boot or resume.
+ */
+ .matches = {
+ DMI_MATCH(DMI_BOARD_NAME, "NHxxRZQ"),
+ },
+ .driver_data = (void *)(SERIO_QUIRK_NOMUX | SERIO_QUIRK_RESET_ALWAYS |
+ SERIO_QUIRK_NOLOOP | SERIO_QUIRK_NOPNP)
+ },
+ {
.matches = {
DMI_MATCH(DMI_BOARD_NAME, "NL5xRU"),
},
diff --git a/drivers/input/touchscreen/goodix.c b/drivers/input/touchscreen/goodix.c
index b348172..d77f116 100644
--- a/drivers/input/touchscreen/goodix.c
+++ b/drivers/input/touchscreen/goodix.c
@@ -124,10 +124,18 @@ static const unsigned long goodix_irq_flags[] = {
static const struct dmi_system_id nine_bytes_report[] = {
#if defined(CONFIG_DMI) && defined(CONFIG_X86)
{
- .ident = "Lenovo YogaBook",
- /* YB1-X91L/F and YB1-X90L/F */
+ /* Lenovo Yoga Book X90F / X90L */
.matches = {
- DMI_MATCH(DMI_PRODUCT_NAME, "Lenovo YB1-X9")
+ DMI_EXACT_MATCH(DMI_SYS_VENDOR, "Intel Corporation"),
+ DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "CHERRYVIEW D1 PLATFORM"),
+ DMI_EXACT_MATCH(DMI_PRODUCT_VERSION, "YETI-11"),
+ }
+ },
+ {
+ /* Lenovo Yoga Book X91F / X91L */
+ .matches = {
+ /* Non exact match to match F + L versions */
+ DMI_MATCH(DMI_PRODUCT_NAME, "Lenovo YB1-X91"),
}
},
#endif
diff --git a/drivers/iommu/exynos-iommu.c b/drivers/iommu/exynos-iommu.c
index 483aaae..1abd187 100644
--- a/drivers/iommu/exynos-iommu.c
+++ b/drivers/iommu/exynos-iommu.c
@@ -1415,23 +1415,26 @@ static struct iommu_device *exynos_iommu_probe_device(struct device *dev)
return &data->iommu;
}
-static void exynos_iommu_release_device(struct device *dev)
+static void exynos_iommu_set_platform_dma(struct device *dev)
{
struct exynos_iommu_owner *owner = dev_iommu_priv_get(dev);
- struct sysmmu_drvdata *data;
if (owner->domain) {
struct iommu_group *group = iommu_group_get(dev);
if (group) {
-#ifndef CONFIG_ARM
- WARN_ON(owner->domain !=
- iommu_group_default_domain(group));
-#endif
exynos_iommu_detach_device(owner->domain, dev);
iommu_group_put(group);
}
}
+}
+
+static void exynos_iommu_release_device(struct device *dev)
+{
+ struct exynos_iommu_owner *owner = dev_iommu_priv_get(dev);
+ struct sysmmu_drvdata *data;
+
+ exynos_iommu_set_platform_dma(dev);
list_for_each_entry(data, &owner->controllers, owner_node)
device_link_del(data->link);
@@ -1479,7 +1482,7 @@ static const struct iommu_ops exynos_iommu_ops = {
.domain_alloc = exynos_iommu_domain_alloc,
.device_group = generic_device_group,
#ifdef CONFIG_ARM
- .set_platform_dma_ops = exynos_iommu_release_device,
+ .set_platform_dma_ops = exynos_iommu_set_platform_dma,
#endif
.probe_device = exynos_iommu_probe_device,
.release_device = exynos_iommu_release_device,
diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
index 6acfe87..23828d1 100644
--- a/drivers/iommu/intel/dmar.c
+++ b/drivers/iommu/intel/dmar.c
@@ -1071,7 +1071,8 @@ static int alloc_iommu(struct dmar_drhd_unit *drhd)
}
err = -EINVAL;
- if (cap_sagaw(iommu->cap) == 0) {
+ if (!cap_sagaw(iommu->cap) &&
+ (!ecap_smts(iommu->ecap) || ecap_slts(iommu->ecap))) {
pr_info("%s: No supported address widths. Not attempting DMA translation.\n",
iommu->name);
drhd->ignored = 1;
diff --git a/drivers/iommu/intel/iommu.h b/drivers/iommu/intel/iommu.h
index d6df3b8..694ab9b 100644
--- a/drivers/iommu/intel/iommu.h
+++ b/drivers/iommu/intel/iommu.h
@@ -641,6 +641,8 @@ struct iommu_pmu {
DECLARE_BITMAP(used_mask, IOMMU_PMU_IDX_MAX);
struct perf_event *event_list[IOMMU_PMU_IDX_MAX];
unsigned char irq_name[16];
+ struct hlist_node cpuhp_node;
+ int cpu;
};
#define IOMMU_IRQ_ID_OFFSET_PRQ (DMAR_UNITS_SUPPORTED)
diff --git a/drivers/iommu/intel/irq_remapping.c b/drivers/iommu/intel/irq_remapping.c
index 6d01fa0..df9e261 100644
--- a/drivers/iommu/intel/irq_remapping.c
+++ b/drivers/iommu/intel/irq_remapping.c
@@ -311,14 +311,12 @@ static int set_ioapic_sid(struct irte *irte, int apic)
if (!irte)
return -1;
- down_read(&dmar_global_lock);
for (i = 0; i < MAX_IO_APICS; i++) {
if (ir_ioapic[i].iommu && ir_ioapic[i].id == apic) {
sid = (ir_ioapic[i].bus << 8) | ir_ioapic[i].devfn;
break;
}
}
- up_read(&dmar_global_lock);
if (sid == 0) {
pr_warn("Failed to set source-id of IOAPIC (%d)\n", apic);
@@ -338,14 +336,12 @@ static int set_hpet_sid(struct irte *irte, u8 id)
if (!irte)
return -1;
- down_read(&dmar_global_lock);
for (i = 0; i < MAX_HPET_TBS; i++) {
if (ir_hpet[i].iommu && ir_hpet[i].id == id) {
sid = (ir_hpet[i].bus << 8) | ir_hpet[i].devfn;
break;
}
}
- up_read(&dmar_global_lock);
if (sid == 0) {
pr_warn("Failed to set source-id of HPET block (%d)\n", id);
@@ -1339,9 +1335,7 @@ static int intel_irq_remapping_alloc(struct irq_domain *domain,
if (!data)
goto out_free_parent;
- down_read(&dmar_global_lock);
index = alloc_irte(iommu, &data->irq_2_iommu, nr_irqs);
- up_read(&dmar_global_lock);
if (index < 0) {
pr_warn("Failed to allocate IRTE\n");
kfree(data);
diff --git a/drivers/iommu/intel/perfmon.c b/drivers/iommu/intel/perfmon.c
index e17d974..cf43e79 100644
--- a/drivers/iommu/intel/perfmon.c
+++ b/drivers/iommu/intel/perfmon.c
@@ -773,19 +773,34 @@ static void iommu_pmu_unset_interrupt(struct intel_iommu *iommu)
iommu->perf_irq = 0;
}
-static int iommu_pmu_cpu_online(unsigned int cpu)
+static int iommu_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
{
+ struct iommu_pmu *iommu_pmu = hlist_entry_safe(node, typeof(*iommu_pmu), cpuhp_node);
+
if (cpumask_empty(&iommu_pmu_cpu_mask))
cpumask_set_cpu(cpu, &iommu_pmu_cpu_mask);
+ if (cpumask_test_cpu(cpu, &iommu_pmu_cpu_mask))
+ iommu_pmu->cpu = cpu;
+
return 0;
}
-static int iommu_pmu_cpu_offline(unsigned int cpu)
+static int iommu_pmu_cpu_offline(unsigned int cpu, struct hlist_node *node)
{
- struct dmar_drhd_unit *drhd;
- struct intel_iommu *iommu;
- int target;
+ struct iommu_pmu *iommu_pmu = hlist_entry_safe(node, typeof(*iommu_pmu), cpuhp_node);
+ int target = cpumask_first(&iommu_pmu_cpu_mask);
+
+ /*
+ * The iommu_pmu_cpu_mask has been updated when offline the CPU
+ * for the first iommu_pmu. Migrate the other iommu_pmu to the
+ * new target.
+ */
+ if (target < nr_cpu_ids && target != iommu_pmu->cpu) {
+ perf_pmu_migrate_context(&iommu_pmu->pmu, cpu, target);
+ iommu_pmu->cpu = target;
+ return 0;
+ }
if (!cpumask_test_and_clear_cpu(cpu, &iommu_pmu_cpu_mask))
return 0;
@@ -795,45 +810,50 @@ static int iommu_pmu_cpu_offline(unsigned int cpu)
if (target < nr_cpu_ids)
cpumask_set_cpu(target, &iommu_pmu_cpu_mask);
else
- target = -1;
+ return 0;
- rcu_read_lock();
-
- for_each_iommu(iommu, drhd) {
- if (!iommu->pmu)
- continue;
- perf_pmu_migrate_context(&iommu->pmu->pmu, cpu, target);
- }
- rcu_read_unlock();
+ perf_pmu_migrate_context(&iommu_pmu->pmu, cpu, target);
+ iommu_pmu->cpu = target;
return 0;
}
static int nr_iommu_pmu;
+static enum cpuhp_state iommu_cpuhp_slot;
static int iommu_pmu_cpuhp_setup(struct iommu_pmu *iommu_pmu)
{
int ret;
- if (nr_iommu_pmu++)
- return 0;
+ if (!nr_iommu_pmu) {
+ ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN,
+ "driver/iommu/intel/perfmon:online",
+ iommu_pmu_cpu_online,
+ iommu_pmu_cpu_offline);
+ if (ret < 0)
+ return ret;
+ iommu_cpuhp_slot = ret;
+ }
- ret = cpuhp_setup_state(CPUHP_AP_PERF_X86_IOMMU_PERF_ONLINE,
- "driver/iommu/intel/perfmon:online",
- iommu_pmu_cpu_online,
- iommu_pmu_cpu_offline);
- if (ret)
- nr_iommu_pmu = 0;
+ ret = cpuhp_state_add_instance(iommu_cpuhp_slot, &iommu_pmu->cpuhp_node);
+ if (ret) {
+ if (!nr_iommu_pmu)
+ cpuhp_remove_multi_state(iommu_cpuhp_slot);
+ return ret;
+ }
+ nr_iommu_pmu++;
- return ret;
+ return 0;
}
static void iommu_pmu_cpuhp_free(struct iommu_pmu *iommu_pmu)
{
+ cpuhp_state_remove_instance(iommu_cpuhp_slot, &iommu_pmu->cpuhp_node);
+
if (--nr_iommu_pmu)
return;
- cpuhp_remove_state(CPUHP_AP_PERF_X86_IOMMU_PERF_ONLINE);
+ cpuhp_remove_multi_state(iommu_cpuhp_slot);
}
void iommu_pmu_register(struct intel_iommu *iommu)
diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index 2055a75..7899f5f 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -1202,21 +1202,12 @@ struct dm_crypto_profile {
struct mapped_device *md;
};
-struct dm_keyslot_evict_args {
- const struct blk_crypto_key *key;
- int err;
-};
-
static int dm_keyslot_evict_callback(struct dm_target *ti, struct dm_dev *dev,
sector_t start, sector_t len, void *data)
{
- struct dm_keyslot_evict_args *args = data;
- int err;
+ const struct blk_crypto_key *key = data;
- err = blk_crypto_evict_key(dev->bdev, args->key);
- if (!args->err)
- args->err = err;
- /* Always try to evict the key from all devices. */
+ blk_crypto_evict_key(dev->bdev, key);
return 0;
}
@@ -1229,7 +1220,6 @@ static int dm_keyslot_evict(struct blk_crypto_profile *profile,
{
struct mapped_device *md =
container_of(profile, struct dm_crypto_profile, profile)->md;
- struct dm_keyslot_evict_args args = { key };
struct dm_table *t;
int srcu_idx;
@@ -1242,11 +1232,12 @@ static int dm_keyslot_evict(struct blk_crypto_profile *profile,
if (!ti->type->iterate_devices)
continue;
- ti->type->iterate_devices(ti, dm_keyslot_evict_callback, &args);
+ ti->type->iterate_devices(ti, dm_keyslot_evict_callback,
+ (void *)key);
}
dm_put_live_table(md, srcu_idx);
- return args.err;
+ return 0;
}
static int
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 2d0f934..dfde008 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1467,7 +1467,8 @@ static void setup_split_accounting(struct clone_info *ci, unsigned int len)
}
static void alloc_multiple_bios(struct bio_list *blist, struct clone_info *ci,
- struct dm_target *ti, unsigned int num_bios)
+ struct dm_target *ti, unsigned int num_bios,
+ unsigned *len)
{
struct bio *bio;
int try;
@@ -1478,7 +1479,7 @@ static void alloc_multiple_bios(struct bio_list *blist, struct clone_info *ci,
if (try)
mutex_lock(&ci->io->md->table_devices_lock);
for (bio_nr = 0; bio_nr < num_bios; bio_nr++) {
- bio = alloc_tio(ci, ti, bio_nr, NULL,
+ bio = alloc_tio(ci, ti, bio_nr, len,
try ? GFP_NOIO : GFP_NOWAIT);
if (!bio)
break;
@@ -1513,8 +1514,10 @@ static int __send_duplicate_bios(struct clone_info *ci, struct dm_target *ti,
ret = 1;
break;
default:
+ if (len)
+ setup_split_accounting(ci, *len);
/* dm_accept_partial_bio() is not supported with shared tio->len_ptr */
- alloc_multiple_bios(&blist, ci, ti, num_bios);
+ alloc_multiple_bios(&blist, ci, ti, num_bios, len);
while ((clone = bio_list_pop(&blist))) {
dm_tio_set_flag(clone_to_tio(clone), DM_TIO_IS_DUPLICATE_BIO);
__map_bio(clone);
diff --git a/drivers/md/md.c b/drivers/md/md.c
index 39e49e5..13321db 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -6260,7 +6260,6 @@ static void __md_stop(struct mddev *mddev)
module_put(pers->owner);
clear_bit(MD_RECOVERY_FROZEN, &mddev->recovery);
- percpu_ref_exit(&mddev->writes_pending);
percpu_ref_exit(&mddev->active_io);
bioset_exit(&mddev->bio_set);
bioset_exit(&mddev->sync_set);
@@ -6273,6 +6272,7 @@ void md_stop(struct mddev *mddev)
*/
__md_stop_writes(mddev);
__md_stop(mddev);
+ percpu_ref_exit(&mddev->writes_pending);
}
EXPORT_SYMBOL_GPL(md_stop);
@@ -7843,6 +7843,7 @@ static void md_free_disk(struct gendisk *disk)
{
struct mddev *mddev = disk->private_data;
+ percpu_ref_exit(&mddev->writes_pending);
mddev_free(mddev);
}
diff --git a/drivers/media/i2c/imx290.c b/drivers/media/i2c/imx290.c
index 49d6c8b..48ae2e0 100644
--- a/drivers/media/i2c/imx290.c
+++ b/drivers/media/i2c/imx290.c
@@ -1098,7 +1098,7 @@ static int imx290_runtime_suspend(struct device *dev)
}
static const struct dev_pm_ops imx290_pm_ops = {
- SET_RUNTIME_PM_OPS(imx290_runtime_suspend, imx290_runtime_resume, NULL)
+ RUNTIME_PM_OPS(imx290_runtime_suspend, imx290_runtime_resume, NULL)
};
/* ----------------------------------------------------------------------------
@@ -1362,8 +1362,8 @@ static struct i2c_driver imx290_i2c_driver = {
.remove = imx290_remove,
.driver = {
.name = "imx290",
- .pm = &imx290_pm_ops,
- .of_match_table = of_match_ptr(imx290_of_match),
+ .pm = pm_ptr(&imx290_pm_ops),
+ .of_match_table = imx290_of_match,
},
};
diff --git a/drivers/media/platform/qcom/venus/firmware.c b/drivers/media/platform/qcom/venus/firmware.c
index 61ff20a..cfb11c5 100644
--- a/drivers/media/platform/qcom/venus/firmware.c
+++ b/drivers/media/platform/qcom/venus/firmware.c
@@ -38,8 +38,8 @@ static void venus_reset_cpu(struct venus_core *core)
writel(fw_size, wrapper_base + WRAPPER_FW_END_ADDR);
writel(0, wrapper_base + WRAPPER_CPA_START_ADDR);
writel(fw_size, wrapper_base + WRAPPER_CPA_END_ADDR);
- writel(0, wrapper_base + WRAPPER_NONPIX_START_ADDR);
- writel(0, wrapper_base + WRAPPER_NONPIX_END_ADDR);
+ writel(fw_size, wrapper_base + WRAPPER_NONPIX_START_ADDR);
+ writel(fw_size, wrapper_base + WRAPPER_NONPIX_END_ADDR);
if (IS_V6(core)) {
/* Bring XTSS out of reset */
diff --git a/drivers/mtd/spi-nor/core.c b/drivers/mtd/spi-nor/core.c
index 0a78045..522d375 100644
--- a/drivers/mtd/spi-nor/core.c
+++ b/drivers/mtd/spi-nor/core.c
@@ -3343,7 +3343,19 @@ static struct spi_mem_driver spi_nor_driver = {
.remove = spi_nor_remove,
.shutdown = spi_nor_shutdown,
};
-module_spi_mem_driver(spi_nor_driver);
+
+static int __init spi_nor_module_init(void)
+{
+ return spi_mem_driver_register(&spi_nor_driver);
+}
+module_init(spi_nor_module_init);
+
+static void __exit spi_nor_module_exit(void)
+{
+ spi_mem_driver_unregister(&spi_nor_driver);
+ spi_nor_debugfs_shutdown();
+}
+module_exit(spi_nor_module_exit);
MODULE_LICENSE("GPL v2");
MODULE_AUTHOR("Huang Shijie <shijie8@gmail.com>");
diff --git a/drivers/mtd/spi-nor/core.h b/drivers/mtd/spi-nor/core.h
index 25423225..e0cc42a 100644
--- a/drivers/mtd/spi-nor/core.h
+++ b/drivers/mtd/spi-nor/core.h
@@ -711,8 +711,10 @@ static inline struct spi_nor *mtd_to_spi_nor(struct mtd_info *mtd)
#ifdef CONFIG_DEBUG_FS
void spi_nor_debugfs_register(struct spi_nor *nor);
+void spi_nor_debugfs_shutdown(void);
#else
static inline void spi_nor_debugfs_register(struct spi_nor *nor) {}
+static inline void spi_nor_debugfs_shutdown(void) {}
#endif
#endif /* __LINUX_MTD_SPI_NOR_INTERNAL_H */
diff --git a/drivers/mtd/spi-nor/debugfs.c b/drivers/mtd/spi-nor/debugfs.c
index 845b78c..fc7ad20 100644
--- a/drivers/mtd/spi-nor/debugfs.c
+++ b/drivers/mtd/spi-nor/debugfs.c
@@ -226,13 +226,13 @@ static void spi_nor_debugfs_unregister(void *data)
nor->debugfs_root = NULL;
}
+static struct dentry *rootdir;
+
void spi_nor_debugfs_register(struct spi_nor *nor)
{
- struct dentry *rootdir, *d;
+ struct dentry *d;
int ret;
- /* Create rootdir once. Will never be deleted again. */
- rootdir = debugfs_lookup(SPI_NOR_DEBUGFS_ROOT, NULL);
if (!rootdir)
rootdir = debugfs_create_dir(SPI_NOR_DEBUGFS_ROOT, NULL);
@@ -247,3 +247,8 @@ void spi_nor_debugfs_register(struct spi_nor *nor)
debugfs_create_file("capabilities", 0444, d, nor,
&spi_nor_capabilities_fops);
}
+
+void spi_nor_debugfs_shutdown(void)
+{
+ debugfs_remove(rootdir);
+}
diff --git a/drivers/net/dsa/b53/b53_mmap.c b/drivers/net/dsa/b53/b53_mmap.c
index 70887e0..d9434ed 100644
--- a/drivers/net/dsa/b53/b53_mmap.c
+++ b/drivers/net/dsa/b53/b53_mmap.c
@@ -216,6 +216,18 @@ static int b53_mmap_write64(struct b53_device *dev, u8 page, u8 reg,
return 0;
}
+static int b53_mmap_phy_read16(struct b53_device *dev, int addr, int reg,
+ u16 *value)
+{
+ return -EIO;
+}
+
+static int b53_mmap_phy_write16(struct b53_device *dev, int addr, int reg,
+ u16 value)
+{
+ return -EIO;
+}
+
static const struct b53_io_ops b53_mmap_ops = {
.read8 = b53_mmap_read8,
.read16 = b53_mmap_read16,
@@ -227,6 +239,8 @@ static const struct b53_io_ops b53_mmap_ops = {
.write32 = b53_mmap_write32,
.write48 = b53_mmap_write48,
.write64 = b53_mmap_write64,
+ .phy_read16 = b53_mmap_phy_read16,
+ .phy_write16 = b53_mmap_phy_write16,
};
static int b53_mmap_probe_of(struct platform_device *pdev,
diff --git a/drivers/net/dsa/microchip/ksz8795.c b/drivers/net/dsa/microchip/ksz8795.c
index 003b0ac..3fffd5d 100644
--- a/drivers/net/dsa/microchip/ksz8795.c
+++ b/drivers/net/dsa/microchip/ksz8795.c
@@ -958,15 +958,14 @@ int ksz8_fdb_dump(struct ksz_device *dev, int port,
u16 entries = 0;
u8 timestamp = 0;
u8 fid;
- u8 member;
- struct alu_struct alu;
+ u8 src_port;
+ u8 mac[ETH_ALEN];
do {
- alu.is_static = false;
- ret = ksz8_r_dyn_mac_table(dev, i, alu.mac, &fid, &member,
+ ret = ksz8_r_dyn_mac_table(dev, i, mac, &fid, &src_port,
×tamp, &entries);
- if (!ret && (member & BIT(port))) {
- ret = cb(alu.mac, alu.fid, alu.is_static, data);
+ if (!ret && port == src_port) {
+ ret = cb(mac, fid, false, data);
if (ret)
break;
}
diff --git a/drivers/net/dsa/microchip/ksz8863_smi.c b/drivers/net/dsa/microchip/ksz8863_smi.c
index 2f4623f..3698112 100644
--- a/drivers/net/dsa/microchip/ksz8863_smi.c
+++ b/drivers/net/dsa/microchip/ksz8863_smi.c
@@ -82,22 +82,16 @@ static const struct regmap_bus regmap_smi[] = {
{
.read = ksz8863_mdio_read,
.write = ksz8863_mdio_write,
- .max_raw_read = 1,
- .max_raw_write = 1,
},
{
.read = ksz8863_mdio_read,
.write = ksz8863_mdio_write,
.val_format_endian_default = REGMAP_ENDIAN_BIG,
- .max_raw_read = 2,
- .max_raw_write = 2,
},
{
.read = ksz8863_mdio_read,
.write = ksz8863_mdio_write,
.val_format_endian_default = REGMAP_ENDIAN_BIG,
- .max_raw_read = 4,
- .max_raw_write = 4,
}
};
@@ -108,7 +102,6 @@ static const struct regmap_config ksz8863_regmap_config[] = {
.pad_bits = 24,
.val_bits = 8,
.cache_type = REGCACHE_NONE,
- .use_single_read = 1,
.lock = ksz_regmap_lock,
.unlock = ksz_regmap_unlock,
},
@@ -118,7 +111,6 @@ static const struct regmap_config ksz8863_regmap_config[] = {
.pad_bits = 24,
.val_bits = 16,
.cache_type = REGCACHE_NONE,
- .use_single_read = 1,
.lock = ksz_regmap_lock,
.unlock = ksz_regmap_unlock,
},
@@ -128,7 +120,6 @@ static const struct regmap_config ksz8863_regmap_config[] = {
.pad_bits = 24,
.val_bits = 32,
.cache_type = REGCACHE_NONE,
- .use_single_read = 1,
.lock = ksz_regmap_lock,
.unlock = ksz_regmap_unlock,
}
diff --git a/drivers/net/dsa/microchip/ksz_common.c b/drivers/net/dsa/microchip/ksz_common.c
index 7fc2155..74c56d0 100644
--- a/drivers/net/dsa/microchip/ksz_common.c
+++ b/drivers/net/dsa/microchip/ksz_common.c
@@ -404,13 +404,13 @@ static const u32 ksz8863_masks[] = {
[VLAN_TABLE_VALID] = BIT(19),
[STATIC_MAC_TABLE_VALID] = BIT(19),
[STATIC_MAC_TABLE_USE_FID] = BIT(21),
- [STATIC_MAC_TABLE_FID] = GENMASK(29, 26),
+ [STATIC_MAC_TABLE_FID] = GENMASK(25, 22),
[STATIC_MAC_TABLE_OVERRIDE] = BIT(20),
[STATIC_MAC_TABLE_FWD_PORTS] = GENMASK(18, 16),
- [DYNAMIC_MAC_TABLE_ENTRIES_H] = GENMASK(5, 0),
- [DYNAMIC_MAC_TABLE_MAC_EMPTY] = BIT(7),
+ [DYNAMIC_MAC_TABLE_ENTRIES_H] = GENMASK(1, 0),
+ [DYNAMIC_MAC_TABLE_MAC_EMPTY] = BIT(2),
[DYNAMIC_MAC_TABLE_NOT_READY] = BIT(7),
- [DYNAMIC_MAC_TABLE_ENTRIES] = GENMASK(31, 28),
+ [DYNAMIC_MAC_TABLE_ENTRIES] = GENMASK(31, 24),
[DYNAMIC_MAC_TABLE_FID] = GENMASK(19, 16),
[DYNAMIC_MAC_TABLE_SRC_PORT] = GENMASK(21, 20),
[DYNAMIC_MAC_TABLE_TIMESTAMP] = GENMASK(23, 22),
@@ -420,10 +420,10 @@ static u8 ksz8863_shifts[] = {
[VLAN_TABLE_MEMBERSHIP_S] = 16,
[STATIC_MAC_FWD_PORTS] = 16,
[STATIC_MAC_FID] = 22,
- [DYNAMIC_MAC_ENTRIES_H] = 3,
+ [DYNAMIC_MAC_ENTRIES_H] = 8,
[DYNAMIC_MAC_ENTRIES] = 24,
[DYNAMIC_MAC_FID] = 16,
- [DYNAMIC_MAC_TIMESTAMP] = 24,
+ [DYNAMIC_MAC_TIMESTAMP] = 22,
[DYNAMIC_MAC_SRC_PORT] = 20,
};
diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c
index 30383c4..0de7b36 100644
--- a/drivers/net/dsa/mv88e6xxx/chip.c
+++ b/drivers/net/dsa/mv88e6xxx/chip.c
@@ -3354,9 +3354,14 @@ static int mv88e6xxx_setup_port(struct mv88e6xxx_chip *chip, int port)
* If this is the upstream port for this switch, enable
* forwarding of unknown unicasts and multicasts.
*/
- reg = MV88E6XXX_PORT_CTL0_IGMP_MLD_SNOOP |
- MV88E6185_PORT_CTL0_USE_TAG | MV88E6185_PORT_CTL0_USE_IP |
+ reg = MV88E6185_PORT_CTL0_USE_TAG | MV88E6185_PORT_CTL0_USE_IP |
MV88E6XXX_PORT_CTL0_STATE_FORWARDING;
+ /* Forward any IPv4 IGMP or IPv6 MLD frames received
+ * by a USER port to the CPU port to allow snooping.
+ */
+ if (dsa_is_user_port(ds, port))
+ reg |= MV88E6XXX_PORT_CTL0_IGMP_MLD_SNOOP;
+
err = mv88e6xxx_port_write(chip, port, MV88E6XXX_PORT_CTL0, reg);
if (err)
return err;
diff --git a/drivers/net/dsa/realtek/realtek-mdio.c b/drivers/net/dsa/realtek/realtek-mdio.c
index 3e54fac..5a8fe70 100644
--- a/drivers/net/dsa/realtek/realtek-mdio.c
+++ b/drivers/net/dsa/realtek/realtek-mdio.c
@@ -21,6 +21,7 @@
#include <linux/module.h>
#include <linux/of_device.h>
+#include <linux/overflow.h>
#include <linux/regmap.h>
#include "realtek.h"
@@ -152,7 +153,9 @@ static int realtek_mdio_probe(struct mdio_device *mdiodev)
if (!var)
return -EINVAL;
- priv = devm_kzalloc(&mdiodev->dev, sizeof(*priv), GFP_KERNEL);
+ priv = devm_kzalloc(&mdiodev->dev,
+ size_add(sizeof(*priv), var->chip_data_sz),
+ GFP_KERNEL);
if (!priv)
return -ENOMEM;
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
index 16c4906..12083b9 100644
--- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
+++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c
@@ -672,6 +672,18 @@ static int bnx2x_fill_frag_skb(struct bnx2x *bp, struct bnx2x_fastpath *fp,
return 0;
}
+static struct sk_buff *
+bnx2x_build_skb(const struct bnx2x_fastpath *fp, void *data)
+{
+ struct sk_buff *skb;
+
+ if (fp->rx_frag_size)
+ skb = build_skb(data, fp->rx_frag_size);
+ else
+ skb = slab_build_skb(data);
+ return skb;
+}
+
static void bnx2x_frag_free(const struct bnx2x_fastpath *fp, void *data)
{
if (fp->rx_frag_size)
@@ -779,7 +791,7 @@ static void bnx2x_tpa_stop(struct bnx2x *bp, struct bnx2x_fastpath *fp,
dma_unmap_single(&bp->pdev->dev, dma_unmap_addr(rx_buf, mapping),
fp->rx_buf_size, DMA_FROM_DEVICE);
if (likely(new_data))
- skb = build_skb(data, fp->rx_frag_size);
+ skb = bnx2x_build_skb(fp, data);
if (likely(skb)) {
#ifdef BNX2X_STOP_ON_ERROR
@@ -1046,7 +1058,7 @@ static int bnx2x_rx_int(struct bnx2x_fastpath *fp, int budget)
dma_unmap_addr(rx_buf, mapping),
fp->rx_buf_size,
DMA_FROM_DEVICE);
- skb = build_skb(data, fp->rx_frag_size);
+ skb = bnx2x_build_skb(fp, data);
if (unlikely(!skb)) {
bnx2x_frag_free(fp, data);
bnx2x_fp_qstats(bp, fp)->
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
index e2e2c98..c23e3b3 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c
@@ -175,12 +175,12 @@ static const struct pci_device_id bnxt_pci_tbl[] = {
{ PCI_VDEVICE(BROADCOM, 0x1750), .driver_data = BCM57508 },
{ PCI_VDEVICE(BROADCOM, 0x1751), .driver_data = BCM57504 },
{ PCI_VDEVICE(BROADCOM, 0x1752), .driver_data = BCM57502 },
- { PCI_VDEVICE(BROADCOM, 0x1800), .driver_data = BCM57508_NPAR },
+ { PCI_VDEVICE(BROADCOM, 0x1800), .driver_data = BCM57502_NPAR },
{ PCI_VDEVICE(BROADCOM, 0x1801), .driver_data = BCM57504_NPAR },
- { PCI_VDEVICE(BROADCOM, 0x1802), .driver_data = BCM57502_NPAR },
- { PCI_VDEVICE(BROADCOM, 0x1803), .driver_data = BCM57508_NPAR },
+ { PCI_VDEVICE(BROADCOM, 0x1802), .driver_data = BCM57508_NPAR },
+ { PCI_VDEVICE(BROADCOM, 0x1803), .driver_data = BCM57502_NPAR },
{ PCI_VDEVICE(BROADCOM, 0x1804), .driver_data = BCM57504_NPAR },
- { PCI_VDEVICE(BROADCOM, 0x1805), .driver_data = BCM57502_NPAR },
+ { PCI_VDEVICE(BROADCOM, 0x1805), .driver_data = BCM57508_NPAR },
{ PCI_VDEVICE(BROADCOM, 0xd802), .driver_data = BCM58802 },
{ PCI_VDEVICE(BROADCOM, 0xd804), .driver_data = BCM58804 },
#ifdef CONFIG_BNXT_SRIOV
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
index c0628ac..5928430 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h
@@ -1226,6 +1226,7 @@ struct bnxt_link_info {
#define BNXT_LINK_SPEED_40GB PORT_PHY_QCFG_RESP_LINK_SPEED_40GB
#define BNXT_LINK_SPEED_50GB PORT_PHY_QCFG_RESP_LINK_SPEED_50GB
#define BNXT_LINK_SPEED_100GB PORT_PHY_QCFG_RESP_LINK_SPEED_100GB
+#define BNXT_LINK_SPEED_200GB PORT_PHY_QCFG_RESP_LINK_SPEED_200GB
u16 support_speeds;
u16 support_pam4_speeds;
u16 auto_link_speeds; /* fw adv setting */
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
index ec57312..6bd18eb 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c
@@ -1714,6 +1714,8 @@ u32 bnxt_fw_to_ethtool_speed(u16 fw_link_speed)
return SPEED_50000;
case BNXT_LINK_SPEED_100GB:
return SPEED_100000;
+ case BNXT_LINK_SPEED_200GB:
+ return SPEED_200000;
default:
return SPEED_UNKNOWN;
}
@@ -3738,6 +3740,7 @@ static void bnxt_self_test(struct net_device *dev, struct ethtool_test *etest,
bnxt_ulp_stop(bp);
rc = bnxt_close_nic(bp, true, false);
if (rc) {
+ etest->flags |= ETH_TEST_FL_FAILED;
bnxt_ulp_start(bp, rc);
return;
}
diff --git a/drivers/net/ethernet/intel/i40e/i40e_diag.c b/drivers/net/ethernet/intel/i40e/i40e_diag.c
index 5b3519c..97fe178 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_diag.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_diag.c
@@ -44,7 +44,7 @@ static int i40e_diag_reg_pattern_test(struct i40e_hw *hw,
return 0;
}
-struct i40e_diag_reg_test_info i40e_reg_list[] = {
+const struct i40e_diag_reg_test_info i40e_reg_list[] = {
/* offset mask elements stride */
{I40E_QTX_CTL(0), 0x0000FFBF, 1,
I40E_QTX_CTL(1) - I40E_QTX_CTL(0)},
@@ -78,27 +78,28 @@ int i40e_diag_reg_test(struct i40e_hw *hw)
{
int ret_code = 0;
u32 reg, mask;
+ u32 elements;
u32 i, j;
for (i = 0; i40e_reg_list[i].offset != 0 &&
!ret_code; i++) {
+ elements = i40e_reg_list[i].elements;
/* set actual reg range for dynamically allocated resources */
if (i40e_reg_list[i].offset == I40E_QTX_CTL(0) &&
hw->func_caps.num_tx_qp != 0)
- i40e_reg_list[i].elements = hw->func_caps.num_tx_qp;
+ elements = hw->func_caps.num_tx_qp;
if ((i40e_reg_list[i].offset == I40E_PFINT_ITRN(0, 0) ||
i40e_reg_list[i].offset == I40E_PFINT_ITRN(1, 0) ||
i40e_reg_list[i].offset == I40E_PFINT_ITRN(2, 0) ||
i40e_reg_list[i].offset == I40E_QINT_TQCTL(0) ||
i40e_reg_list[i].offset == I40E_QINT_RQCTL(0)) &&
hw->func_caps.num_msix_vectors != 0)
- i40e_reg_list[i].elements =
- hw->func_caps.num_msix_vectors - 1;
+ elements = hw->func_caps.num_msix_vectors - 1;
/* test register access */
mask = i40e_reg_list[i].mask;
- for (j = 0; j < i40e_reg_list[i].elements && !ret_code; j++) {
+ for (j = 0; j < elements && !ret_code; j++) {
reg = i40e_reg_list[i].offset +
(j * i40e_reg_list[i].stride);
ret_code = i40e_diag_reg_pattern_test(hw, reg, mask);
diff --git a/drivers/net/ethernet/intel/i40e/i40e_diag.h b/drivers/net/ethernet/intel/i40e/i40e_diag.h
index e641035..c3ce5f3 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_diag.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_diag.h
@@ -20,7 +20,7 @@ struct i40e_diag_reg_test_info {
u32 stride; /* bytes between each element */
};
-extern struct i40e_diag_reg_test_info i40e_reg_list[];
+extern const struct i40e_diag_reg_test_info i40e_reg_list[];
int i40e_diag_reg_test(struct i40e_hw *hw);
int i40e_diag_eeprom_test(struct i40e_hw *hw);
diff --git a/drivers/net/ethernet/intel/ice/ice_sched.c b/drivers/net/ethernet/intel/ice/ice_sched.c
index 4eca8d1..b7682de0a 100644
--- a/drivers/net/ethernet/intel/ice/ice_sched.c
+++ b/drivers/net/ethernet/intel/ice/ice_sched.c
@@ -2788,7 +2788,7 @@ static int
ice_sched_assoc_vsi_to_agg(struct ice_port_info *pi, u32 agg_id,
u16 vsi_handle, unsigned long *tc_bitmap)
{
- struct ice_sched_agg_vsi_info *agg_vsi_info, *old_agg_vsi_info = NULL;
+ struct ice_sched_agg_vsi_info *agg_vsi_info, *iter, *old_agg_vsi_info = NULL;
struct ice_sched_agg_info *agg_info, *old_agg_info;
struct ice_hw *hw = pi->hw;
int status = 0;
@@ -2806,11 +2806,13 @@ ice_sched_assoc_vsi_to_agg(struct ice_port_info *pi, u32 agg_id,
if (old_agg_info && old_agg_info != agg_info) {
struct ice_sched_agg_vsi_info *vtmp;
- list_for_each_entry_safe(old_agg_vsi_info, vtmp,
+ list_for_each_entry_safe(iter, vtmp,
&old_agg_info->agg_vsi_list,
list_entry)
- if (old_agg_vsi_info->vsi_handle == vsi_handle)
+ if (iter->vsi_handle == vsi_handle) {
+ old_agg_vsi_info = iter;
break;
+ }
}
/* check if entry already exist */
diff --git a/drivers/net/ethernet/intel/ice/ice_switch.c b/drivers/net/ethernet/intel/ice/ice_switch.c
index 61f844d..46b3685 100644
--- a/drivers/net/ethernet/intel/ice/ice_switch.c
+++ b/drivers/net/ethernet/intel/ice/ice_switch.c
@@ -1780,18 +1780,36 @@ ice_update_vsi(struct ice_hw *hw, u16 vsi_handle, struct ice_vsi_ctx *vsi_ctx,
int
ice_cfg_rdma_fltr(struct ice_hw *hw, u16 vsi_handle, bool enable)
{
- struct ice_vsi_ctx *ctx;
+ struct ice_vsi_ctx *ctx, *cached_ctx;
+ int status;
- ctx = ice_get_vsi_ctx(hw, vsi_handle);
+ cached_ctx = ice_get_vsi_ctx(hw, vsi_handle);
+ if (!cached_ctx)
+ return -ENOENT;
+
+ ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
if (!ctx)
- return -EIO;
+ return -ENOMEM;
+
+ ctx->info.q_opt_rss = cached_ctx->info.q_opt_rss;
+ ctx->info.q_opt_tc = cached_ctx->info.q_opt_tc;
+ ctx->info.q_opt_flags = cached_ctx->info.q_opt_flags;
+
+ ctx->info.valid_sections = cpu_to_le16(ICE_AQ_VSI_PROP_Q_OPT_VALID);
if (enable)
ctx->info.q_opt_flags |= ICE_AQ_VSI_Q_OPT_PE_FLTR_EN;
else
ctx->info.q_opt_flags &= ~ICE_AQ_VSI_Q_OPT_PE_FLTR_EN;
- return ice_update_vsi(hw, vsi_handle, ctx, NULL);
+ status = ice_update_vsi(hw, vsi_handle, ctx, NULL);
+ if (!status) {
+ cached_ctx->info.q_opt_flags = ctx->info.q_opt_flags;
+ cached_ctx->info.valid_sections |= ctx->info.valid_sections;
+ }
+
+ kfree(ctx);
+ return status;
}
/**
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx.c b/drivers/net/ethernet/intel/ice/ice_txrx.c
index b61dd9f..4fcf2d0 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx.c
@@ -938,6 +938,7 @@ ice_reuse_rx_page(struct ice_rx_ring *rx_ring, struct ice_rx_buf *old_buf)
* ice_get_rx_buf - Fetch Rx buffer and synchronize data for use
* @rx_ring: Rx descriptor ring to transact packets on
* @size: size of buffer to add to skb
+ * @ntc: index of next to clean element
*
* This function will pull an Rx buffer from the ring and synchronize it
* for use by the CPU.
@@ -1026,7 +1027,6 @@ ice_build_skb(struct ice_rx_ring *rx_ring, struct xdp_buff *xdp)
/**
* ice_construct_skb - Allocate skb and populate it
* @rx_ring: Rx descriptor ring to transact packets on
- * @rx_buf: Rx buffer to pull data from
* @xdp: xdp_buff pointing to the data
*
* This function allocates an skb. It then populates it with the page
diff --git a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
index 7bc5aa3..c8322fb 100644
--- a/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
+++ b/drivers/net/ethernet/intel/ice/ice_txrx_lib.c
@@ -438,6 +438,7 @@ int __ice_xmit_xdp_ring(struct xdp_buff *xdp, struct ice_tx_ring *xdp_ring,
* ice_finalize_xdp_rx - Bump XDP Tx tail and/or flush redirect map
* @xdp_ring: XDP ring
* @xdp_res: Result of the receive batch
+ * @first_idx: index to write from caller
*
* This function bumps XDP Tx tail and/or flush redirect map, and
* should be called when a batch of packets has been processed in the
diff --git a/drivers/net/ethernet/intel/ice/ice_virtchnl_fdir.c b/drivers/net/ethernet/intel/ice/ice_virtchnl_fdir.c
index e6ef6b3..5fd75e7 100644
--- a/drivers/net/ethernet/intel/ice/ice_virtchnl_fdir.c
+++ b/drivers/net/ethernet/intel/ice/ice_virtchnl_fdir.c
@@ -542,6 +542,72 @@ static void ice_vc_fdir_rem_prof_all(struct ice_vf *vf)
}
/**
+ * ice_vc_fdir_has_prof_conflict
+ * @vf: pointer to the VF structure
+ * @conf: FDIR configuration for each filter
+ *
+ * Check if @conf has conflicting profile with existing profiles
+ *
+ * Return: true on success, and false on error.
+ */
+static bool
+ice_vc_fdir_has_prof_conflict(struct ice_vf *vf,
+ struct virtchnl_fdir_fltr_conf *conf)
+{
+ struct ice_fdir_fltr *desc;
+
+ list_for_each_entry(desc, &vf->fdir.fdir_rule_list, fltr_node) {
+ struct virtchnl_fdir_fltr_conf *existing_conf;
+ enum ice_fltr_ptype flow_type_a, flow_type_b;
+ struct ice_fdir_fltr *a, *b;
+
+ existing_conf = to_fltr_conf_from_desc(desc);
+ a = &existing_conf->input;
+ b = &conf->input;
+ flow_type_a = a->flow_type;
+ flow_type_b = b->flow_type;
+
+ /* No need to compare two rules with different tunnel types or
+ * with the same protocol type.
+ */
+ if (existing_conf->ttype != conf->ttype ||
+ flow_type_a == flow_type_b)
+ continue;
+
+ switch (flow_type_a) {
+ case ICE_FLTR_PTYPE_NONF_IPV4_UDP:
+ case ICE_FLTR_PTYPE_NONF_IPV4_TCP:
+ case ICE_FLTR_PTYPE_NONF_IPV4_SCTP:
+ if (flow_type_b == ICE_FLTR_PTYPE_NONF_IPV4_OTHER)
+ return true;
+ break;
+ case ICE_FLTR_PTYPE_NONF_IPV4_OTHER:
+ if (flow_type_b == ICE_FLTR_PTYPE_NONF_IPV4_UDP ||
+ flow_type_b == ICE_FLTR_PTYPE_NONF_IPV4_TCP ||
+ flow_type_b == ICE_FLTR_PTYPE_NONF_IPV4_SCTP)
+ return true;
+ break;
+ case ICE_FLTR_PTYPE_NONF_IPV6_UDP:
+ case ICE_FLTR_PTYPE_NONF_IPV6_TCP:
+ case ICE_FLTR_PTYPE_NONF_IPV6_SCTP:
+ if (flow_type_b == ICE_FLTR_PTYPE_NONF_IPV6_OTHER)
+ return true;
+ break;
+ case ICE_FLTR_PTYPE_NONF_IPV6_OTHER:
+ if (flow_type_b == ICE_FLTR_PTYPE_NONF_IPV6_UDP ||
+ flow_type_b == ICE_FLTR_PTYPE_NONF_IPV6_TCP ||
+ flow_type_b == ICE_FLTR_PTYPE_NONF_IPV6_SCTP)
+ return true;
+ break;
+ default:
+ break;
+ }
+ }
+
+ return false;
+}
+
+/**
* ice_vc_fdir_write_flow_prof
* @vf: pointer to the VF structure
* @flow: filter flow type
@@ -677,6 +743,13 @@ ice_vc_fdir_config_input_set(struct ice_vf *vf, struct virtchnl_fdir_add *fltr,
enum ice_fltr_ptype flow;
int ret;
+ ret = ice_vc_fdir_has_prof_conflict(vf, conf);
+ if (ret) {
+ dev_dbg(dev, "Found flow profile conflict for VF %d\n",
+ vf->vf_id);
+ return ret;
+ }
+
flow = input->flow_type;
ret = ice_vc_fdir_alloc_prof(vf, flow);
if (ret) {
diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index 0e39d19..2cad76d0 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -3549,6 +3549,8 @@ static void mvneta_txq_sw_deinit(struct mvneta_port *pp,
netdev_tx_reset_queue(nq);
+ txq->buf = NULL;
+ txq->tso_hdrs = NULL;
txq->descs = NULL;
txq->last_desc = 0;
txq->next_desc_to_proc = 0;
diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_cls.c b/drivers/net/ethernet/marvell/mvpp2/mvpp2_cls.c
index 41d935d..40aeaa7 100644
--- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_cls.c
+++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_cls.c
@@ -62,35 +62,38 @@ static const struct mvpp2_cls_flow cls_flows[MVPP2_N_PRS_FLOWS] = {
MVPP2_DEF_FLOW(MVPP22_FLOW_TCP4, MVPP2_FL_IP4_TCP_FRAG_UNTAG,
MVPP22_CLS_HEK_IP4_2T,
MVPP2_PRS_RI_VLAN_NONE | MVPP2_PRS_RI_L3_IP4 |
- MVPP2_PRS_RI_L4_TCP,
+ MVPP2_PRS_RI_IP_FRAG_TRUE | MVPP2_PRS_RI_L4_TCP,
MVPP2_PRS_IP_MASK | MVPP2_PRS_RI_VLAN_MASK),
MVPP2_DEF_FLOW(MVPP22_FLOW_TCP4, MVPP2_FL_IP4_TCP_FRAG_UNTAG,
MVPP22_CLS_HEK_IP4_2T,
MVPP2_PRS_RI_VLAN_NONE | MVPP2_PRS_RI_L3_IP4_OPT |
- MVPP2_PRS_RI_L4_TCP,
+ MVPP2_PRS_RI_IP_FRAG_TRUE | MVPP2_PRS_RI_L4_TCP,
MVPP2_PRS_IP_MASK | MVPP2_PRS_RI_VLAN_MASK),
MVPP2_DEF_FLOW(MVPP22_FLOW_TCP4, MVPP2_FL_IP4_TCP_FRAG_UNTAG,
MVPP22_CLS_HEK_IP4_2T,
MVPP2_PRS_RI_VLAN_NONE | MVPP2_PRS_RI_L3_IP4_OTHER |
- MVPP2_PRS_RI_L4_TCP,
+ MVPP2_PRS_RI_IP_FRAG_TRUE | MVPP2_PRS_RI_L4_TCP,
MVPP2_PRS_IP_MASK | MVPP2_PRS_RI_VLAN_MASK),
/* TCP over IPv4 flows, fragmented, with vlan tag */
MVPP2_DEF_FLOW(MVPP22_FLOW_TCP4, MVPP2_FL_IP4_TCP_FRAG_TAG,
MVPP22_CLS_HEK_IP4_2T | MVPP22_CLS_HEK_TAGGED,
- MVPP2_PRS_RI_L3_IP4 | MVPP2_PRS_RI_L4_TCP,
+ MVPP2_PRS_RI_L3_IP4 | MVPP2_PRS_RI_IP_FRAG_TRUE |
+ MVPP2_PRS_RI_L4_TCP,
MVPP2_PRS_IP_MASK),
MVPP2_DEF_FLOW(MVPP22_FLOW_TCP4, MVPP2_FL_IP4_TCP_FRAG_TAG,
MVPP22_CLS_HEK_IP4_2T | MVPP22_CLS_HEK_TAGGED,
- MVPP2_PRS_RI_L3_IP4_OPT | MVPP2_PRS_RI_L4_TCP,
+ MVPP2_PRS_RI_L3_IP4_OPT | MVPP2_PRS_RI_IP_FRAG_TRUE |
+ MVPP2_PRS_RI_L4_TCP,
MVPP2_PRS_IP_MASK),
MVPP2_DEF_FLOW(MVPP22_FLOW_TCP4, MVPP2_FL_IP4_TCP_FRAG_TAG,
MVPP22_CLS_HEK_IP4_2T | MVPP22_CLS_HEK_TAGGED,
- MVPP2_PRS_RI_L3_IP4_OTHER | MVPP2_PRS_RI_L4_TCP,
+ MVPP2_PRS_RI_L3_IP4_OTHER | MVPP2_PRS_RI_IP_FRAG_TRUE |
+ MVPP2_PRS_RI_L4_TCP,
MVPP2_PRS_IP_MASK),
/* UDP over IPv4 flows, Not fragmented, no vlan tag */
@@ -132,35 +135,38 @@ static const struct mvpp2_cls_flow cls_flows[MVPP2_N_PRS_FLOWS] = {
MVPP2_DEF_FLOW(MVPP22_FLOW_UDP4, MVPP2_FL_IP4_UDP_FRAG_UNTAG,
MVPP22_CLS_HEK_IP4_2T,
MVPP2_PRS_RI_VLAN_NONE | MVPP2_PRS_RI_L3_IP4 |
- MVPP2_PRS_RI_L4_UDP,
+ MVPP2_PRS_RI_IP_FRAG_TRUE | MVPP2_PRS_RI_L4_UDP,
MVPP2_PRS_IP_MASK | MVPP2_PRS_RI_VLAN_MASK),
MVPP2_DEF_FLOW(MVPP22_FLOW_UDP4, MVPP2_FL_IP4_UDP_FRAG_UNTAG,
MVPP22_CLS_HEK_IP4_2T,
MVPP2_PRS_RI_VLAN_NONE | MVPP2_PRS_RI_L3_IP4_OPT |
- MVPP2_PRS_RI_L4_UDP,
+ MVPP2_PRS_RI_IP_FRAG_TRUE | MVPP2_PRS_RI_L4_UDP,
MVPP2_PRS_IP_MASK | MVPP2_PRS_RI_VLAN_MASK),
MVPP2_DEF_FLOW(MVPP22_FLOW_UDP4, MVPP2_FL_IP4_UDP_FRAG_UNTAG,
MVPP22_CLS_HEK_IP4_2T,
MVPP2_PRS_RI_VLAN_NONE | MVPP2_PRS_RI_L3_IP4_OTHER |
- MVPP2_PRS_RI_L4_UDP,
+ MVPP2_PRS_RI_IP_FRAG_TRUE | MVPP2_PRS_RI_L4_UDP,
MVPP2_PRS_IP_MASK | MVPP2_PRS_RI_VLAN_MASK),
/* UDP over IPv4 flows, fragmented, with vlan tag */
MVPP2_DEF_FLOW(MVPP22_FLOW_UDP4, MVPP2_FL_IP4_UDP_FRAG_TAG,
MVPP22_CLS_HEK_IP4_2T | MVPP22_CLS_HEK_TAGGED,
- MVPP2_PRS_RI_L3_IP4 | MVPP2_PRS_RI_L4_UDP,
+ MVPP2_PRS_RI_L3_IP4 | MVPP2_PRS_RI_IP_FRAG_TRUE |
+ MVPP2_PRS_RI_L4_UDP,
MVPP2_PRS_IP_MASK),
MVPP2_DEF_FLOW(MVPP22_FLOW_UDP4, MVPP2_FL_IP4_UDP_FRAG_TAG,
MVPP22_CLS_HEK_IP4_2T | MVPP22_CLS_HEK_TAGGED,
- MVPP2_PRS_RI_L3_IP4_OPT | MVPP2_PRS_RI_L4_UDP,
+ MVPP2_PRS_RI_L3_IP4_OPT | MVPP2_PRS_RI_IP_FRAG_TRUE |
+ MVPP2_PRS_RI_L4_UDP,
MVPP2_PRS_IP_MASK),
MVPP2_DEF_FLOW(MVPP22_FLOW_UDP4, MVPP2_FL_IP4_UDP_FRAG_TAG,
MVPP22_CLS_HEK_IP4_2T | MVPP22_CLS_HEK_TAGGED,
- MVPP2_PRS_RI_L3_IP4_OTHER | MVPP2_PRS_RI_L4_UDP,
+ MVPP2_PRS_RI_L3_IP4_OTHER | MVPP2_PRS_RI_IP_FRAG_TRUE |
+ MVPP2_PRS_RI_L4_UDP,
MVPP2_PRS_IP_MASK),
/* TCP over IPv6 flows, not fragmented, no vlan tag */
diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c b/drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c
index 75ba57b..9af22f4 100644
--- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c
+++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c
@@ -1539,8 +1539,8 @@ static int mvpp2_prs_vlan_init(struct platform_device *pdev, struct mvpp2 *priv)
if (!priv->prs_double_vlans)
return -ENOMEM;
- /* Double VLAN: 0x8100, 0x88A8 */
- err = mvpp2_prs_double_vlan_add(priv, ETH_P_8021Q, ETH_P_8021AD,
+ /* Double VLAN: 0x88A8, 0x8100 */
+ err = mvpp2_prs_double_vlan_add(priv, ETH_P_8021AD, ETH_P_8021Q,
MVPP2_PRS_PORT_MASK);
if (err)
return err;
@@ -1607,59 +1607,45 @@ static int mvpp2_prs_vlan_init(struct platform_device *pdev, struct mvpp2 *priv)
static int mvpp2_prs_pppoe_init(struct mvpp2 *priv)
{
struct mvpp2_prs_entry pe;
- int tid;
+ int tid, ihl;
- /* IPv4 over PPPoE with options */
- tid = mvpp2_prs_tcam_first_free(priv, MVPP2_PE_FIRST_FREE_TID,
- MVPP2_PE_LAST_FREE_TID);
- if (tid < 0)
- return tid;
+ /* IPv4 over PPPoE with header length >= 5 */
+ for (ihl = MVPP2_PRS_IPV4_IHL_MIN; ihl <= MVPP2_PRS_IPV4_IHL_MAX; ihl++) {
+ tid = mvpp2_prs_tcam_first_free(priv, MVPP2_PE_FIRST_FREE_TID,
+ MVPP2_PE_LAST_FREE_TID);
+ if (tid < 0)
+ return tid;
- memset(&pe, 0, sizeof(pe));
- mvpp2_prs_tcam_lu_set(&pe, MVPP2_PRS_LU_PPPOE);
- pe.index = tid;
+ memset(&pe, 0, sizeof(pe));
+ mvpp2_prs_tcam_lu_set(&pe, MVPP2_PRS_LU_PPPOE);
+ pe.index = tid;
- mvpp2_prs_match_etype(&pe, 0, PPP_IP);
+ mvpp2_prs_match_etype(&pe, 0, PPP_IP);
+ mvpp2_prs_tcam_data_byte_set(&pe, MVPP2_ETH_TYPE_LEN,
+ MVPP2_PRS_IPV4_HEAD | ihl,
+ MVPP2_PRS_IPV4_HEAD_MASK |
+ MVPP2_PRS_IPV4_IHL_MASK);
- mvpp2_prs_sram_next_lu_set(&pe, MVPP2_PRS_LU_IP4);
- mvpp2_prs_sram_ri_update(&pe, MVPP2_PRS_RI_L3_IP4_OPT,
- MVPP2_PRS_RI_L3_PROTO_MASK);
- /* goto ipv4 dest-address (skip eth_type + IP-header-size - 4) */
- mvpp2_prs_sram_shift_set(&pe, MVPP2_ETH_TYPE_LEN +
- sizeof(struct iphdr) - 4,
- MVPP2_PRS_SRAM_OP_SEL_SHIFT_ADD);
- /* Set L3 offset */
- mvpp2_prs_sram_offset_set(&pe, MVPP2_PRS_SRAM_UDF_TYPE_L3,
- MVPP2_ETH_TYPE_LEN,
- MVPP2_PRS_SRAM_OP_SEL_UDF_ADD);
+ mvpp2_prs_sram_next_lu_set(&pe, MVPP2_PRS_LU_IP4);
+ mvpp2_prs_sram_ri_update(&pe, MVPP2_PRS_RI_L3_IP4,
+ MVPP2_PRS_RI_L3_PROTO_MASK);
+ /* goto ipv4 dst-address (skip eth_type + IP-header-size - 4) */
+ mvpp2_prs_sram_shift_set(&pe, MVPP2_ETH_TYPE_LEN +
+ sizeof(struct iphdr) - 4,
+ MVPP2_PRS_SRAM_OP_SEL_SHIFT_ADD);
+ /* Set L3 offset */
+ mvpp2_prs_sram_offset_set(&pe, MVPP2_PRS_SRAM_UDF_TYPE_L3,
+ MVPP2_ETH_TYPE_LEN,
+ MVPP2_PRS_SRAM_OP_SEL_UDF_ADD);
+ /* Set L4 offset */
+ mvpp2_prs_sram_offset_set(&pe, MVPP2_PRS_SRAM_UDF_TYPE_L4,
+ MVPP2_ETH_TYPE_LEN + (ihl * 4),
+ MVPP2_PRS_SRAM_OP_SEL_UDF_ADD);
- /* Update shadow table and hw entry */
- mvpp2_prs_shadow_set(priv, pe.index, MVPP2_PRS_LU_PPPOE);
- mvpp2_prs_hw_write(priv, &pe);
-
- /* IPv4 over PPPoE without options */
- tid = mvpp2_prs_tcam_first_free(priv, MVPP2_PE_FIRST_FREE_TID,
- MVPP2_PE_LAST_FREE_TID);
- if (tid < 0)
- return tid;
-
- pe.index = tid;
-
- mvpp2_prs_tcam_data_byte_set(&pe, MVPP2_ETH_TYPE_LEN,
- MVPP2_PRS_IPV4_HEAD |
- MVPP2_PRS_IPV4_IHL_MIN,
- MVPP2_PRS_IPV4_HEAD_MASK |
- MVPP2_PRS_IPV4_IHL_MASK);
-
- /* Clear ri before updating */
- pe.sram[MVPP2_PRS_SRAM_RI_WORD] = 0x0;
- pe.sram[MVPP2_PRS_SRAM_RI_CTRL_WORD] = 0x0;
- mvpp2_prs_sram_ri_update(&pe, MVPP2_PRS_RI_L3_IP4,
- MVPP2_PRS_RI_L3_PROTO_MASK);
-
- /* Update shadow table and hw entry */
- mvpp2_prs_shadow_set(priv, pe.index, MVPP2_PRS_LU_PPPOE);
- mvpp2_prs_hw_write(priv, &pe);
+ /* Update shadow table and hw entry */
+ mvpp2_prs_shadow_set(priv, pe.index, MVPP2_PRS_LU_PPPOE);
+ mvpp2_prs_hw_write(priv, &pe);
+ }
/* IPv6 over PPPoE */
tid = mvpp2_prs_tcam_first_free(priv, MVPP2_PE_FIRST_FREE_TID,
diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index 3cb4362..282f943 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -763,8 +763,6 @@ static void mtk_mac_link_up(struct phylink_config *config,
break;
}
- mtk_set_queue_speed(mac->hw, mac->id, speed);
-
/* Configure duplex */
if (duplex == DUPLEX_FULL)
mcr |= MAC_MCR_FORCE_DPX;
@@ -2059,9 +2057,6 @@ static int mtk_poll_rx(struct napi_struct *napi, int budget,
skb_checksum_none_assert(skb);
skb->protocol = eth_type_trans(skb, netdev);
- if (reason == MTK_PPE_CPU_REASON_HIT_UNBIND_RATE_REACHED)
- mtk_ppe_check_skb(eth->ppe[0], skb, hash);
-
if (netdev->features & NETIF_F_HW_VLAN_CTAG_RX) {
if (MTK_HAS_CAPS(eth->soc->caps, MTK_NETSYS_V2)) {
if (trxd.rxd3 & RX_DMA_VTAG_V2) {
@@ -2089,6 +2084,9 @@ static int mtk_poll_rx(struct napi_struct *napi, int budget,
__vlan_hwaccel_put_tag(skb, htons(vlan_proto), vlan_tci);
}
+ if (reason == MTK_PPE_CPU_REASON_HIT_UNBIND_RATE_REACHED)
+ mtk_ppe_check_skb(eth->ppe[0], skb, hash);
+
skb_record_rx_queue(skb, 0);
napi_gro_receive(napi, skb);
diff --git a/drivers/net/ethernet/mediatek/mtk_ppe.c b/drivers/net/ethernet/mediatek/mtk_ppe.c
index 6883eb3..fd07d6e 100644
--- a/drivers/net/ethernet/mediatek/mtk_ppe.c
+++ b/drivers/net/ethernet/mediatek/mtk_ppe.c
@@ -8,6 +8,7 @@
#include <linux/platform_device.h>
#include <linux/if_ether.h>
#include <linux/if_vlan.h>
+#include <net/dst_metadata.h>
#include <net/dsa.h>
#include "mtk_eth_soc.h"
#include "mtk_ppe.h"
@@ -458,6 +459,7 @@ __mtk_foe_entry_clear(struct mtk_ppe *ppe, struct mtk_flow_entry *entry)
hwe->ib1 &= ~MTK_FOE_IB1_STATE;
hwe->ib1 |= FIELD_PREP(MTK_FOE_IB1_STATE, MTK_FOE_STATE_INVALID);
dma_wmb();
+ mtk_ppe_cache_clear(ppe);
}
entry->hash = 0xffff;
@@ -699,7 +701,9 @@ void __mtk_ppe_check_skb(struct mtk_ppe *ppe, struct sk_buff *skb, u16 hash)
skb->dev->dsa_ptr->tag_ops->proto != DSA_TAG_PROTO_MTK)
goto out;
- tag += 4;
+ if (!skb_metadata_dst(skb))
+ tag += 4;
+
if (get_unaligned_be16(tag) != ETH_P_8021Q)
break;
diff --git a/drivers/net/ethernet/mediatek/mtk_ppe_offload.c b/drivers/net/ethernet/mediatek/mtk_ppe_offload.c
index 81afd5e..161751b 100644
--- a/drivers/net/ethernet/mediatek/mtk_ppe_offload.c
+++ b/drivers/net/ethernet/mediatek/mtk_ppe_offload.c
@@ -576,6 +576,7 @@ mtk_eth_setup_tc_block(struct net_device *dev, struct flow_block_offload *f)
if (IS_ERR(block_cb))
return PTR_ERR(block_cb);
+ flow_block_cb_incref(block_cb);
flow_block_cb_add(block_cb, f);
list_add_tail(&block_cb->driver_list, &block_cb_list);
return 0;
@@ -584,7 +585,7 @@ mtk_eth_setup_tc_block(struct net_device *dev, struct flow_block_offload *f)
if (!block_cb)
return -ENOENT;
- if (flow_block_cb_decref(block_cb)) {
+ if (!flow_block_cb_decref(block_cb)) {
flow_block_cb_remove(block_cb, f);
list_del(&block_cb->driver_list);
}
diff --git a/drivers/net/ethernet/realtek/r8169_phy_config.c b/drivers/net/ethernet/realtek/r8169_phy_config.c
index 930496c..b50f167 100644
--- a/drivers/net/ethernet/realtek/r8169_phy_config.c
+++ b/drivers/net/ethernet/realtek/r8169_phy_config.c
@@ -826,6 +826,9 @@ static void rtl8168h_2_hw_phy_config(struct rtl8169_private *tp,
/* disable phy pfm mode */
phy_modify_paged(phydev, 0x0a44, 0x11, BIT(7), 0);
+ /* disable 10m pll off */
+ phy_modify_paged(phydev, 0x0a43, 0x10, BIT(0), 0);
+
rtl8168g_disable_aldps(phydev);
rtl8168g_config_eee_phy(phydev);
}
diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c
index 7022fb2..d30459d 100644
--- a/drivers/net/ethernet/sfc/ef10.c
+++ b/drivers/net/ethernet/sfc/ef10.c
@@ -1304,7 +1304,8 @@ static void efx_ef10_fini_nic(struct efx_nic *efx)
static int efx_ef10_init_nic(struct efx_nic *efx)
{
struct efx_ef10_nic_data *nic_data = efx->nic_data;
- netdev_features_t hw_enc_features = 0;
+ struct net_device *net_dev = efx->net_dev;
+ netdev_features_t tun_feats, tso_feats;
int rc;
if (nic_data->must_check_datapath_caps) {
@@ -1349,20 +1350,30 @@ static int efx_ef10_init_nic(struct efx_nic *efx)
nic_data->must_restore_piobufs = false;
}
- /* add encapsulated checksum offload features */
+ /* encap features might change during reset if fw variant changed */
if (efx_has_cap(efx, VXLAN_NVGRE) && !efx_ef10_is_vf(efx))
- hw_enc_features |= NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM;
- /* add encapsulated TSO features */
+ net_dev->hw_enc_features |= NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM;
+ else
+ net_dev->hw_enc_features &= ~(NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM);
+
+ tun_feats = NETIF_F_GSO_UDP_TUNNEL | NETIF_F_GSO_GRE |
+ NETIF_F_GSO_UDP_TUNNEL_CSUM | NETIF_F_GSO_GRE_CSUM;
+ tso_feats = NETIF_F_TSO | NETIF_F_TSO6;
+
if (efx_has_cap(efx, TX_TSO_V2_ENCAP)) {
- netdev_features_t encap_tso_features;
-
- encap_tso_features = NETIF_F_GSO_UDP_TUNNEL | NETIF_F_GSO_GRE |
- NETIF_F_GSO_UDP_TUNNEL_CSUM | NETIF_F_GSO_GRE_CSUM;
-
- hw_enc_features |= encap_tso_features | NETIF_F_TSO;
- efx->net_dev->features |= encap_tso_features;
+ /* If this is first nic_init, or if it is a reset and a new fw
+ * variant has added new features, enable them by default.
+ * If the features are not new, maintain their current value.
+ */
+ if (!(net_dev->hw_features & tun_feats))
+ net_dev->features |= tun_feats;
+ net_dev->hw_enc_features |= tun_feats | tso_feats;
+ net_dev->hw_features |= tun_feats;
+ } else {
+ net_dev->hw_enc_features &= ~(tun_feats | tso_feats);
+ net_dev->hw_features &= ~tun_feats;
+ net_dev->features &= ~tun_feats;
}
- efx->net_dev->hw_enc_features = hw_enc_features;
/* don't fail init if RSS setup doesn't work */
rc = efx->type->rx_push_rss_config(efx, false,
@@ -4021,7 +4032,10 @@ static unsigned int efx_ef10_recycle_ring_size(const struct efx_nic *efx)
NETIF_F_HW_VLAN_CTAG_FILTER | \
NETIF_F_IPV6_CSUM | \
NETIF_F_RXHASH | \
- NETIF_F_NTUPLE)
+ NETIF_F_NTUPLE | \
+ NETIF_F_SG | \
+ NETIF_F_RXCSUM | \
+ NETIF_F_RXALL)
const struct efx_nic_type efx_hunt_a0_vf_nic_type = {
.is_vf = true,
diff --git a/drivers/net/ethernet/sfc/efx.c b/drivers/net/ethernet/sfc/efx.c
index 02c2ade..884d8d1 100644
--- a/drivers/net/ethernet/sfc/efx.c
+++ b/drivers/net/ethernet/sfc/efx.c
@@ -1001,21 +1001,18 @@ static int efx_pci_probe_post_io(struct efx_nic *efx)
}
/* Determine netdevice features */
- net_dev->features |= (efx->type->offload_features | NETIF_F_SG |
- NETIF_F_TSO | NETIF_F_RXCSUM | NETIF_F_RXALL);
- if (efx->type->offload_features & (NETIF_F_IPV6_CSUM | NETIF_F_HW_CSUM)) {
- net_dev->features |= NETIF_F_TSO6;
- if (efx_has_cap(efx, TX_TSO_V2_ENCAP))
- net_dev->hw_enc_features |= NETIF_F_TSO6;
- }
- /* Check whether device supports TSO */
- if (!efx->type->tso_versions || !efx->type->tso_versions(efx))
- net_dev->features &= ~NETIF_F_ALL_TSO;
+ net_dev->features |= efx->type->offload_features;
+
+ /* Add TSO features */
+ if (efx->type->tso_versions && efx->type->tso_versions(efx))
+ net_dev->features |= NETIF_F_TSO | NETIF_F_TSO6;
+
/* Mask for features that also apply to VLAN devices */
net_dev->vlan_features |= (NETIF_F_HW_CSUM | NETIF_F_SG |
NETIF_F_HIGHDMA | NETIF_F_ALL_TSO |
NETIF_F_RXCSUM);
+ /* Determine user configurable features */
net_dev->hw_features |= net_dev->features & ~efx->fixed_features;
/* Disable receiving frames with bad FCS, by default. */
diff --git a/drivers/net/ethernet/smsc/smsc911x.c b/drivers/net/ethernet/smsc/smsc911x.c
index a2e5119..a690d13 100644
--- a/drivers/net/ethernet/smsc/smsc911x.c
+++ b/drivers/net/ethernet/smsc/smsc911x.c
@@ -1037,8 +1037,6 @@ static int smsc911x_mii_probe(struct net_device *dev)
return ret;
}
- /* Indicate that the MAC is responsible for managing PHY PM */
- phydev->mac_managed_pm = true;
phy_attached_info(phydev);
phy_set_max_speed(phydev, SPEED_100);
@@ -1066,6 +1064,7 @@ static int smsc911x_mii_init(struct platform_device *pdev,
struct net_device *dev)
{
struct smsc911x_data *pdata = netdev_priv(dev);
+ struct phy_device *phydev;
int err = -ENXIO;
pdata->mii_bus = mdiobus_alloc();
@@ -1108,6 +1107,10 @@ static int smsc911x_mii_init(struct platform_device *pdev,
goto err_out_free_bus_2;
}
+ phydev = phy_find_first(pdata->mii_bus);
+ if (phydev)
+ phydev->mac_managed_pm = true;
+
return 0;
err_out_free_bus_2:
diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h
index ec9c130..54bb072 100644
--- a/drivers/net/ethernet/stmicro/stmmac/common.h
+++ b/drivers/net/ethernet/stmicro/stmmac/common.h
@@ -532,7 +532,6 @@ struct mac_device_info {
unsigned int xlgmac;
unsigned int num_vlan;
u32 vlan_filter[32];
- unsigned int promisc;
bool vlan_fail_q_en;
u8 vlan_fail_q;
};
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c
index 8c7a0b7..36251ec 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c
@@ -472,12 +472,6 @@ static int dwmac4_add_hw_vlan_rx_fltr(struct net_device *dev,
if (vid > 4095)
return -EINVAL;
- if (hw->promisc) {
- netdev_err(dev,
- "Adding VLAN in promisc mode not supported\n");
- return -EPERM;
- }
-
/* Single Rx VLAN Filter */
if (hw->num_vlan == 1) {
/* For single VLAN filter, VID 0 means VLAN promiscuous */
@@ -527,12 +521,6 @@ static int dwmac4_del_hw_vlan_rx_fltr(struct net_device *dev,
{
int i, ret = 0;
- if (hw->promisc) {
- netdev_err(dev,
- "Deleting VLAN in promisc mode not supported\n");
- return -EPERM;
- }
-
/* Single Rx VLAN Filter */
if (hw->num_vlan == 1) {
if ((hw->vlan_filter[0] & GMAC_VLAN_TAG_VID) == vid) {
@@ -557,39 +545,6 @@ static int dwmac4_del_hw_vlan_rx_fltr(struct net_device *dev,
return ret;
}
-static void dwmac4_vlan_promisc_enable(struct net_device *dev,
- struct mac_device_info *hw)
-{
- void __iomem *ioaddr = hw->pcsr;
- u32 value;
- u32 hash;
- u32 val;
- int i;
-
- /* Single Rx VLAN Filter */
- if (hw->num_vlan == 1) {
- dwmac4_write_single_vlan(dev, 0);
- return;
- }
-
- /* Extended Rx VLAN Filter Enable */
- for (i = 0; i < hw->num_vlan; i++) {
- if (hw->vlan_filter[i] & GMAC_VLAN_TAG_DATA_VEN) {
- val = hw->vlan_filter[i] & ~GMAC_VLAN_TAG_DATA_VEN;
- dwmac4_write_vlan_filter(dev, hw, i, val);
- }
- }
-
- hash = readl(ioaddr + GMAC_VLAN_HASH_TABLE);
- if (hash & GMAC_VLAN_VLHT) {
- value = readl(ioaddr + GMAC_VLAN_TAG);
- if (value & GMAC_VLAN_VTHM) {
- value &= ~GMAC_VLAN_VTHM;
- writel(value, ioaddr + GMAC_VLAN_TAG);
- }
- }
-}
-
static void dwmac4_restore_hw_vlan_rx_fltr(struct net_device *dev,
struct mac_device_info *hw)
{
@@ -709,22 +664,12 @@ static void dwmac4_set_filter(struct mac_device_info *hw,
}
/* VLAN filtering */
- if (dev->features & NETIF_F_HW_VLAN_CTAG_FILTER)
+ if (dev->flags & IFF_PROMISC && !hw->vlan_fail_q_en)
+ value &= ~GMAC_PACKET_FILTER_VTFE;
+ else if (dev->features & NETIF_F_HW_VLAN_CTAG_FILTER)
value |= GMAC_PACKET_FILTER_VTFE;
writel(value, ioaddr + GMAC_PACKET_FILTER);
-
- if (dev->flags & IFF_PROMISC && !hw->vlan_fail_q_en) {
- if (!hw->promisc) {
- hw->promisc = 1;
- dwmac4_vlan_promisc_enable(dev, hw);
- }
- } else {
- if (hw->promisc) {
- hw->promisc = 0;
- dwmac4_restore_hw_vlan_rx_fltr(dev, hw);
- }
- }
}
static void dwmac4_flow_ctrl(struct mac_device_info *hw, unsigned int duplex,
diff --git a/drivers/net/ethernet/wangxun/libwx/wx_type.h b/drivers/net/ethernet/wangxun/libwx/wx_type.h
index 77d8d7f..97e2c1e 100644
--- a/drivers/net/ethernet/wangxun/libwx/wx_type.h
+++ b/drivers/net/ethernet/wangxun/libwx/wx_type.h
@@ -222,7 +222,7 @@
#define WX_PX_INTA 0x110
#define WX_PX_GPIE 0x118
#define WX_PX_GPIE_MODEL BIT(0)
-#define WX_PX_IC 0x120
+#define WX_PX_IC(_i) (0x120 + (_i) * 4)
#define WX_PX_IMS(_i) (0x140 + (_i) * 4)
#define WX_PX_IMC(_i) (0x150 + (_i) * 4)
#define WX_PX_ISB_ADDR_L 0x160
diff --git a/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c b/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
index 5b564d3..17412e5 100644
--- a/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
+++ b/drivers/net/ethernet/wangxun/ngbe/ngbe_main.c
@@ -352,7 +352,7 @@ static void ngbe_up(struct wx *wx)
netif_tx_start_all_queues(wx->netdev);
/* clear any pending interrupts, may auto mask */
- rd32(wx, WX_PX_IC);
+ rd32(wx, WX_PX_IC(0));
rd32(wx, WX_PX_MISC_IC);
ngbe_irq_enable(wx, true);
if (wx->gpio_ctrl)
diff --git a/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c b/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c
index 6c0a982..a58ce54 100644
--- a/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c
+++ b/drivers/net/ethernet/wangxun/txgbe/txgbe_main.c
@@ -229,7 +229,8 @@ static void txgbe_up_complete(struct wx *wx)
wx_napi_enable_all(wx);
/* clear any pending interrupts, may auto mask */
- rd32(wx, WX_PX_IC);
+ rd32(wx, WX_PX_IC(0));
+ rd32(wx, WX_PX_IC(1));
rd32(wx, WX_PX_MISC_IC);
txgbe_irq_enable(wx, true);
diff --git a/drivers/net/ieee802154/ca8210.c b/drivers/net/ieee802154/ca8210.c
index 0b0c6c0..d0b5129 100644
--- a/drivers/net/ieee802154/ca8210.c
+++ b/drivers/net/ieee802154/ca8210.c
@@ -1902,10 +1902,9 @@ static int ca8210_skb_tx(
struct ca8210_priv *priv
)
{
- int status;
struct ieee802154_hdr header = { };
struct secspec secspec;
- unsigned int mac_len;
+ int mac_len, status;
dev_dbg(&priv->spi->dev, "%s called\n", __func__);
diff --git a/drivers/net/ipa/gsi_trans.c b/drivers/net/ipa/gsi_trans.c
index 0f52c06..ee6fb00 100644
--- a/drivers/net/ipa/gsi_trans.c
+++ b/drivers/net/ipa/gsi_trans.c
@@ -156,7 +156,7 @@ int gsi_trans_pool_init_dma(struct device *dev, struct gsi_trans_pool *pool,
* gsi_trans_pool_exit_dma() can assume the total allocated
* size is exactly (count * size).
*/
- total_size = get_order(total_size) << PAGE_SHIFT;
+ total_size = PAGE_SIZE << get_order(total_size);
virt = dma_alloc_coherent(dev, total_size, &addr, GFP_KERNEL);
if (!virt)
diff --git a/drivers/net/net_failover.c b/drivers/net/net_failover.c
index 7a28e08..d0c916a 100644
--- a/drivers/net/net_failover.c
+++ b/drivers/net/net_failover.c
@@ -130,14 +130,10 @@ static u16 net_failover_select_queue(struct net_device *dev,
txq = ops->ndo_select_queue(primary_dev, skb, sb_dev);
else
txq = netdev_pick_tx(primary_dev, skb, NULL);
-
- qdisc_skb_cb(skb)->slave_dev_queue_mapping = skb->queue_mapping;
-
- return txq;
+ } else {
+ txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) : 0;
}
- txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) : 0;
-
/* Save the original txq to restore before passing to the driver */
qdisc_skb_cb(skb)->slave_dev_queue_mapping = skb->queue_mapping;
diff --git a/drivers/net/phy/dp83869.c b/drivers/net/phy/dp83869.c
index b4ff9c5..9ab5eff 100644
--- a/drivers/net/phy/dp83869.c
+++ b/drivers/net/phy/dp83869.c
@@ -588,15 +588,13 @@ static int dp83869_of_init(struct phy_device *phydev)
&dp83869_internal_delay[0],
delay_size, true);
if (dp83869->rx_int_delay < 0)
- dp83869->rx_int_delay =
- dp83869_internal_delay[DP83869_CLK_DELAY_DEF];
+ dp83869->rx_int_delay = DP83869_CLK_DELAY_DEF;
dp83869->tx_int_delay = phy_get_internal_delay(phydev, dev,
&dp83869_internal_delay[0],
delay_size, false);
if (dp83869->tx_int_delay < 0)
- dp83869->tx_int_delay =
- dp83869_internal_delay[DP83869_CLK_DELAY_DEF];
+ dp83869->tx_int_delay = DP83869_CLK_DELAY_DEF;
return ret;
}
diff --git a/drivers/net/phy/micrel.c b/drivers/net/phy/micrel.c
index 2c84fcc..4e884e4 100644
--- a/drivers/net/phy/micrel.c
+++ b/drivers/net/phy/micrel.c
@@ -4151,6 +4151,7 @@ static struct phy_driver ksphy_driver[] = {
.resume = kszphy_resume,
.cable_test_start = ksz9x31_cable_test_start,
.cable_test_get_status = ksz9x31_cable_test_get_status,
+ .get_features = ksz9477_get_features,
}, {
.phy_id = PHY_ID_KSZ8873MLL,
.phy_id_mask = MICREL_PHY_ID_MASK,
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 1785f1c..1de3e33 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -3057,7 +3057,7 @@ EXPORT_SYMBOL_GPL(device_phy_find_device);
* and "phy-device" are not supported in ACPI. DT supports all the three
* named references to the phy node.
*/
-struct fwnode_handle *fwnode_get_phy_node(struct fwnode_handle *fwnode)
+struct fwnode_handle *fwnode_get_phy_node(const struct fwnode_handle *fwnode)
{
struct fwnode_handle *phy_node;
diff --git a/drivers/net/phy/sfp-bus.c b/drivers/net/phy/sfp-bus.c
index daac293..9fc50fc 100644
--- a/drivers/net/phy/sfp-bus.c
+++ b/drivers/net/phy/sfp-bus.c
@@ -17,7 +17,7 @@ struct sfp_bus {
/* private: */
struct kref kref;
struct list_head node;
- struct fwnode_handle *fwnode;
+ const struct fwnode_handle *fwnode;
const struct sfp_socket_ops *socket_ops;
struct device *sfp_dev;
@@ -390,7 +390,7 @@ static const struct sfp_upstream_ops *sfp_get_upstream_ops(struct sfp_bus *bus)
return bus->registered ? bus->upstream_ops : NULL;
}
-static struct sfp_bus *sfp_bus_get(struct fwnode_handle *fwnode)
+static struct sfp_bus *sfp_bus_get(const struct fwnode_handle *fwnode)
{
struct sfp_bus *sfp, *new, *found = NULL;
@@ -593,7 +593,7 @@ static void sfp_upstream_clear(struct sfp_bus *bus)
* - %-ENOMEM if we failed to allocate the bus.
* - an error from the upstream's connect_phy() method.
*/
-struct sfp_bus *sfp_bus_find_fwnode(struct fwnode_handle *fwnode)
+struct sfp_bus *sfp_bus_find_fwnode(const struct fwnode_handle *fwnode)
{
struct fwnode_reference_args ref;
struct sfp_bus *bus;
diff --git a/drivers/net/vmxnet3/vmxnet3_drv.c b/drivers/net/vmxnet3/vmxnet3_drv.c
index 6829870..da488cbb 100644
--- a/drivers/net/vmxnet3/vmxnet3_drv.c
+++ b/drivers/net/vmxnet3/vmxnet3_drv.c
@@ -1688,7 +1688,9 @@ vmxnet3_rq_rx_complete(struct vmxnet3_rx_queue *rq,
if (unlikely(rcd->ts))
__vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q), rcd->tci);
- if (adapter->netdev->features & NETIF_F_LRO)
+ /* Use GRO callback if UPT is enabled */
+ if ((adapter->netdev->features & NETIF_F_LRO) &&
+ !rq->shared->updateRxProd)
netif_receive_skb(skb);
else
napi_gro_receive(&rq->napi, skb);
diff --git a/drivers/net/wwan/iosm/iosm_ipc_imem.c b/drivers/net/wwan/iosm/iosm_ipc_imem.c
index 1e6a479..c066b00 100644
--- a/drivers/net/wwan/iosm/iosm_ipc_imem.c
+++ b/drivers/net/wwan/iosm/iosm_ipc_imem.c
@@ -587,6 +587,13 @@ static void ipc_imem_run_state_worker(struct work_struct *instance)
while (ctrl_chl_idx < IPC_MEM_MAX_CHANNELS) {
if (!ipc_chnl_cfg_get(&chnl_cfg_port, ctrl_chl_idx)) {
ipc_imem->ipc_port[ctrl_chl_idx] = NULL;
+
+ if (ipc_imem->pcie->pci->device == INTEL_CP_DEVICE_7560_ID &&
+ chnl_cfg_port.wwan_port_type == WWAN_PORT_XMMRPC) {
+ ctrl_chl_idx++;
+ continue;
+ }
+
if (ipc_imem->pcie->pci->device == INTEL_CP_DEVICE_7360_ID &&
chnl_cfg_port.wwan_port_type == WWAN_PORT_MBIM) {
ctrl_chl_idx++;
diff --git a/drivers/net/xen-netback/common.h b/drivers/net/xen-netback/common.h
index 3dbfc8a..1fcbd83 100644
--- a/drivers/net/xen-netback/common.h
+++ b/drivers/net/xen-netback/common.h
@@ -166,7 +166,7 @@ struct xenvif_queue { /* Per-queue data for xenvif */
struct pending_tx_info pending_tx_info[MAX_PENDING_REQS];
grant_handle_t grant_tx_handle[MAX_PENDING_REQS];
- struct gnttab_copy tx_copy_ops[MAX_PENDING_REQS];
+ struct gnttab_copy tx_copy_ops[2 * MAX_PENDING_REQS];
struct gnttab_map_grant_ref tx_map_ops[MAX_PENDING_REQS];
struct gnttab_unmap_grant_ref tx_unmap_ops[MAX_PENDING_REQS];
/* passed to gnttab_[un]map_refs with pages under (un)mapping */
diff --git a/drivers/net/xen-netback/netback.c b/drivers/net/xen-netback/netback.c
index 1b42676..c1501f4 100644
--- a/drivers/net/xen-netback/netback.c
+++ b/drivers/net/xen-netback/netback.c
@@ -334,6 +334,7 @@ static int xenvif_count_requests(struct xenvif_queue *queue,
struct xenvif_tx_cb {
u16 copy_pending_idx[XEN_NETBK_LEGACY_SLOTS_MAX + 1];
u8 copy_count;
+ u32 split_mask;
};
#define XENVIF_TX_CB(skb) ((struct xenvif_tx_cb *)(skb)->cb)
@@ -361,6 +362,8 @@ static inline struct sk_buff *xenvif_alloc_skb(unsigned int size)
struct sk_buff *skb =
alloc_skb(size + NET_SKB_PAD + NET_IP_ALIGN,
GFP_ATOMIC | __GFP_NOWARN);
+
+ BUILD_BUG_ON(sizeof(*XENVIF_TX_CB(skb)) > sizeof(skb->cb));
if (unlikely(skb == NULL))
return NULL;
@@ -396,11 +399,13 @@ static void xenvif_get_requests(struct xenvif_queue *queue,
nr_slots = shinfo->nr_frags + 1;
copy_count(skb) = 0;
+ XENVIF_TX_CB(skb)->split_mask = 0;
/* Create copy ops for exactly data_len bytes into the skb head. */
__skb_put(skb, data_len);
while (data_len > 0) {
int amount = data_len > txp->size ? txp->size : data_len;
+ bool split = false;
cop->source.u.ref = txp->gref;
cop->source.domid = queue->vif->domid;
@@ -413,6 +418,13 @@ static void xenvif_get_requests(struct xenvif_queue *queue,
cop->dest.u.gmfn = virt_to_gfn(skb->data + skb_headlen(skb)
- data_len);
+ /* Don't cross local page boundary! */
+ if (cop->dest.offset + amount > XEN_PAGE_SIZE) {
+ amount = XEN_PAGE_SIZE - cop->dest.offset;
+ XENVIF_TX_CB(skb)->split_mask |= 1U << copy_count(skb);
+ split = true;
+ }
+
cop->len = amount;
cop->flags = GNTCOPY_source_gref;
@@ -420,7 +432,8 @@ static void xenvif_get_requests(struct xenvif_queue *queue,
pending_idx = queue->pending_ring[index];
callback_param(queue, pending_idx).ctx = NULL;
copy_pending_idx(skb, copy_count(skb)) = pending_idx;
- copy_count(skb)++;
+ if (!split)
+ copy_count(skb)++;
cop++;
data_len -= amount;
@@ -441,7 +454,8 @@ static void xenvif_get_requests(struct xenvif_queue *queue,
nr_slots--;
} else {
/* The copy op partially covered the tx_request.
- * The remainder will be mapped.
+ * The remainder will be mapped or copied in the next
+ * iteration.
*/
txp->offset += amount;
txp->size -= amount;
@@ -539,6 +553,13 @@ static int xenvif_tx_check_gop(struct xenvif_queue *queue,
pending_idx = copy_pending_idx(skb, i);
newerr = (*gopp_copy)->status;
+
+ /* Split copies need to be handled together. */
+ if (XENVIF_TX_CB(skb)->split_mask & (1U << i)) {
+ (*gopp_copy)++;
+ if (!newerr)
+ newerr = (*gopp_copy)->status;
+ }
if (likely(!newerr)) {
/* The first frag might still have this slot mapped */
if (i < copy_count(skb) - 1 || !sharedslot)
@@ -973,10 +994,8 @@ static void xenvif_tx_build_gops(struct xenvif_queue *queue,
/* No crossing a page as the payload mustn't fragment. */
if (unlikely((txreq.offset + txreq.size) > XEN_PAGE_SIZE)) {
- netdev_err(queue->vif->dev,
- "txreq.offset: %u, size: %u, end: %lu\n",
- txreq.offset, txreq.size,
- (unsigned long)(txreq.offset&~XEN_PAGE_MASK) + txreq.size);
+ netdev_err(queue->vif->dev, "Cross page boundary, txreq.offset: %u, size: %u\n",
+ txreq.offset, txreq.size);
xenvif_fatal_tx_err(queue->vif);
break;
}
@@ -1061,10 +1080,6 @@ static void xenvif_tx_build_gops(struct xenvif_queue *queue,
__skb_queue_tail(&queue->tx_queue, skb);
queue->tx.req_cons = idx;
-
- if ((*map_ops >= ARRAY_SIZE(queue->tx_map_ops)) ||
- (*copy_ops >= ARRAY_SIZE(queue->tx_copy_ops)))
- break;
}
return;
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index b615906..282d808 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -3441,7 +3441,8 @@ static const struct pci_device_id nvme_id_table[] = {
{ PCI_DEVICE(0x1d97, 0x1d97), /* Lexar NM620 */
.driver_data = NVME_QUIRK_BOGUS_NID, },
{ PCI_DEVICE(0x1d97, 0x2269), /* Lexar NM760 */
- .driver_data = NVME_QUIRK_BOGUS_NID, },
+ .driver_data = NVME_QUIRK_BOGUS_NID |
+ NVME_QUIRK_IGNORE_DEV_SUBNQN, },
{ PCI_DEVICE(PCI_VENDOR_ID_AMAZON, 0x0061),
.driver_data = NVME_QUIRK_DMA_ADDRESS_BITS_48, },
{ PCI_DEVICE(PCI_VENDOR_ID_AMAZON, 0x0065),
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 42c0598..49c9e7b 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1620,22 +1620,7 @@ static int nvme_tcp_alloc_queue(struct nvme_ctrl *nctrl, int qid)
if (ret)
goto err_init_connect;
- queue->rd_enabled = true;
set_bit(NVME_TCP_Q_ALLOCATED, &queue->flags);
- nvme_tcp_init_recv_ctx(queue);
-
- write_lock_bh(&queue->sock->sk->sk_callback_lock);
- queue->sock->sk->sk_user_data = queue;
- queue->state_change = queue->sock->sk->sk_state_change;
- queue->data_ready = queue->sock->sk->sk_data_ready;
- queue->write_space = queue->sock->sk->sk_write_space;
- queue->sock->sk->sk_data_ready = nvme_tcp_data_ready;
- queue->sock->sk->sk_state_change = nvme_tcp_state_change;
- queue->sock->sk->sk_write_space = nvme_tcp_write_space;
-#ifdef CONFIG_NET_RX_BUSY_POLL
- queue->sock->sk->sk_ll_usec = 1;
-#endif
- write_unlock_bh(&queue->sock->sk->sk_callback_lock);
return 0;
@@ -1655,7 +1640,7 @@ static int nvme_tcp_alloc_queue(struct nvme_ctrl *nctrl, int qid)
return ret;
}
-static void nvme_tcp_restore_sock_calls(struct nvme_tcp_queue *queue)
+static void nvme_tcp_restore_sock_ops(struct nvme_tcp_queue *queue)
{
struct socket *sock = queue->sock;
@@ -1670,7 +1655,7 @@ static void nvme_tcp_restore_sock_calls(struct nvme_tcp_queue *queue)
static void __nvme_tcp_stop_queue(struct nvme_tcp_queue *queue)
{
kernel_sock_shutdown(queue->sock, SHUT_RDWR);
- nvme_tcp_restore_sock_calls(queue);
+ nvme_tcp_restore_sock_ops(queue);
cancel_work_sync(&queue->io_work);
}
@@ -1688,21 +1673,42 @@ static void nvme_tcp_stop_queue(struct nvme_ctrl *nctrl, int qid)
mutex_unlock(&queue->queue_lock);
}
+static void nvme_tcp_setup_sock_ops(struct nvme_tcp_queue *queue)
+{
+ write_lock_bh(&queue->sock->sk->sk_callback_lock);
+ queue->sock->sk->sk_user_data = queue;
+ queue->state_change = queue->sock->sk->sk_state_change;
+ queue->data_ready = queue->sock->sk->sk_data_ready;
+ queue->write_space = queue->sock->sk->sk_write_space;
+ queue->sock->sk->sk_data_ready = nvme_tcp_data_ready;
+ queue->sock->sk->sk_state_change = nvme_tcp_state_change;
+ queue->sock->sk->sk_write_space = nvme_tcp_write_space;
+#ifdef CONFIG_NET_RX_BUSY_POLL
+ queue->sock->sk->sk_ll_usec = 1;
+#endif
+ write_unlock_bh(&queue->sock->sk->sk_callback_lock);
+}
+
static int nvme_tcp_start_queue(struct nvme_ctrl *nctrl, int idx)
{
struct nvme_tcp_ctrl *ctrl = to_tcp_ctrl(nctrl);
+ struct nvme_tcp_queue *queue = &ctrl->queues[idx];
int ret;
+ queue->rd_enabled = true;
+ nvme_tcp_init_recv_ctx(queue);
+ nvme_tcp_setup_sock_ops(queue);
+
if (idx)
ret = nvmf_connect_io_queue(nctrl, idx);
else
ret = nvmf_connect_admin_queue(nctrl);
if (!ret) {
- set_bit(NVME_TCP_Q_LIVE, &ctrl->queues[idx].flags);
+ set_bit(NVME_TCP_Q_LIVE, &queue->flags);
} else {
- if (test_bit(NVME_TCP_Q_ALLOCATED, &ctrl->queues[idx].flags))
- __nvme_tcp_stop_queue(&ctrl->queues[idx]);
+ if (test_bit(NVME_TCP_Q_ALLOCATED, &queue->flags))
+ __nvme_tcp_stop_queue(queue);
dev_err(nctrl->device,
"failed to connect queue: %d ret=%d\n", idx, ret);
}
diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c
index 53a16b8..8e33e6e 100644
--- a/drivers/pci/controller/dwc/pcie-designware.c
+++ b/drivers/pci/controller/dwc/pcie-designware.c
@@ -1001,11 +1001,6 @@ void dw_pcie_setup(struct dw_pcie *pci)
dw_pcie_writel_dbi(pci, PCIE_LINK_WIDTH_SPEED_CONTROL, val);
}
- val = dw_pcie_readl_dbi(pci, PCIE_PORT_LINK_CONTROL);
- val &= ~PORT_LINK_FAST_LINK_MODE;
- val |= PORT_LINK_DLL_LINK_EN;
- dw_pcie_writel_dbi(pci, PCIE_PORT_LINK_CONTROL, val);
-
if (dw_pcie_cap_is(pci, CDM_CHECK)) {
val = dw_pcie_readl_dbi(pci, PCIE_PL_CHK_REG_CONTROL_STATUS);
val |= PCIE_PL_CHK_REG_CHK_REG_CONTINUOUS |
@@ -1013,6 +1008,11 @@ void dw_pcie_setup(struct dw_pcie *pci)
dw_pcie_writel_dbi(pci, PCIE_PL_CHK_REG_CONTROL_STATUS, val);
}
+ val = dw_pcie_readl_dbi(pci, PCIE_PORT_LINK_CONTROL);
+ val &= ~PORT_LINK_FAST_LINK_MODE;
+ val |= PORT_LINK_DLL_LINK_EN;
+ dw_pcie_writel_dbi(pci, PCIE_PORT_LINK_CONTROL, val);
+
if (!pci->num_lanes) {
dev_dbg(pci->dev, "Using h/w default number of lanes\n");
return;
diff --git a/drivers/pinctrl/mediatek/Kconfig b/drivers/pinctrl/mediatek/Kconfig
index f20c283..a71874f 100644
--- a/drivers/pinctrl/mediatek/Kconfig
+++ b/drivers/pinctrl/mediatek/Kconfig
@@ -45,35 +45,35 @@
# For ARMv7 SoCs
config PINCTRL_MT2701
- bool "Mediatek MT2701 pin control"
+ bool "MediaTek MT2701 pin control"
depends on MACH_MT7623 || MACH_MT2701 || COMPILE_TEST
depends on OF
default MACH_MT2701
select PINCTRL_MTK
config PINCTRL_MT7623
- bool "Mediatek MT7623 pin control with generic binding"
+ bool "MediaTek MT7623 pin control with generic binding"
depends on MACH_MT7623 || COMPILE_TEST
depends on OF
default MACH_MT7623
select PINCTRL_MTK_MOORE
config PINCTRL_MT7629
- bool "Mediatek MT7629 pin control"
+ bool "MediaTek MT7629 pin control"
depends on MACH_MT7629 || COMPILE_TEST
depends on OF
default MACH_MT7629
select PINCTRL_MTK_MOORE
config PINCTRL_MT8135
- bool "Mediatek MT8135 pin control"
+ bool "MediaTek MT8135 pin control"
depends on MACH_MT8135 || COMPILE_TEST
depends on OF
default MACH_MT8135
select PINCTRL_MTK
config PINCTRL_MT8127
- bool "Mediatek MT8127 pin control"
+ bool "MediaTek MT8127 pin control"
depends on MACH_MT8127 || COMPILE_TEST
depends on OF
default MACH_MT8127
@@ -88,33 +88,33 @@
select PINCTRL_MTK
config PINCTRL_MT6765
- tristate "Mediatek MT6765 pin control"
+ tristate "MediaTek MT6765 pin control"
depends on OF
depends on ARM64 || COMPILE_TEST
default ARM64 && ARCH_MEDIATEK
select PINCTRL_MTK_PARIS
config PINCTRL_MT6779
- tristate "Mediatek MT6779 pin control"
+ tristate "MediaTek MT6779 pin control"
depends on OF
depends on ARM64 || COMPILE_TEST
default ARM64 && ARCH_MEDIATEK
select PINCTRL_MTK_PARIS
help
Say yes here to support pin controller and gpio driver
- on Mediatek MT6779 SoC.
+ on MediaTek MT6779 SoC.
In MTK platform, we support virtual gpio and use it to
map specific eint which doesn't have real gpio pin.
config PINCTRL_MT6795
- bool "Mediatek MT6795 pin control"
+ bool "MediaTek MT6795 pin control"
depends on OF
depends on ARM64 || COMPILE_TEST
default ARM64 && ARCH_MEDIATEK
select PINCTRL_MTK_PARIS
config PINCTRL_MT6797
- bool "Mediatek MT6797 pin control"
+ bool "MediaTek MT6797 pin control"
depends on OF
depends on ARM64 || COMPILE_TEST
default ARM64 && ARCH_MEDIATEK
@@ -128,40 +128,42 @@
select PINCTRL_MTK_MOORE
config PINCTRL_MT7981
- bool "Mediatek MT7981 pin control"
+ bool "MediaTek MT7981 pin control"
depends on OF
+ depends on ARM64 || COMPILE_TEST
+ default ARM64 && ARCH_MEDIATEK
select PINCTRL_MTK_MOORE
config PINCTRL_MT7986
- bool "Mediatek MT7986 pin control"
+ bool "MediaTek MT7986 pin control"
depends on OF
depends on ARM64 || COMPILE_TEST
default ARM64 && ARCH_MEDIATEK
select PINCTRL_MTK_MOORE
config PINCTRL_MT8167
- bool "Mediatek MT8167 pin control"
+ bool "MediaTek MT8167 pin control"
depends on OF
depends on ARM64 || COMPILE_TEST
default ARM64 && ARCH_MEDIATEK
select PINCTRL_MTK
config PINCTRL_MT8173
- bool "Mediatek MT8173 pin control"
+ bool "MediaTek MT8173 pin control"
depends on OF
depends on ARM64 || COMPILE_TEST
default ARM64 && ARCH_MEDIATEK
select PINCTRL_MTK
config PINCTRL_MT8183
- bool "Mediatek MT8183 pin control"
+ bool "MediaTek MT8183 pin control"
depends on OF
depends on ARM64 || COMPILE_TEST
default ARM64 && ARCH_MEDIATEK
select PINCTRL_MTK_PARIS
config PINCTRL_MT8186
- bool "Mediatek MT8186 pin control"
+ bool "MediaTek MT8186 pin control"
depends on OF
depends on ARM64 || COMPILE_TEST
default ARM64 && ARCH_MEDIATEK
@@ -180,28 +182,28 @@
map specific eint which doesn't have real gpio pin.
config PINCTRL_MT8192
- bool "Mediatek MT8192 pin control"
+ bool "MediaTek MT8192 pin control"
depends on OF
depends on ARM64 || COMPILE_TEST
default ARM64 && ARCH_MEDIATEK
select PINCTRL_MTK_PARIS
config PINCTRL_MT8195
- bool "Mediatek MT8195 pin control"
+ bool "MediaTek MT8195 pin control"
depends on OF
depends on ARM64 || COMPILE_TEST
default ARM64 && ARCH_MEDIATEK
select PINCTRL_MTK_PARIS
config PINCTRL_MT8365
- bool "Mediatek MT8365 pin control"
+ bool "MediaTek MT8365 pin control"
depends on OF
depends on ARM64 || COMPILE_TEST
default ARM64 && ARCH_MEDIATEK
select PINCTRL_MTK
config PINCTRL_MT8516
- bool "Mediatek MT8516 pin control"
+ bool "MediaTek MT8516 pin control"
depends on OF
depends on ARM64 || COMPILE_TEST
default ARM64 && ARCH_MEDIATEK
@@ -209,7 +211,7 @@
# For PMIC
config PINCTRL_MT6397
- bool "Mediatek MT6397 pin control"
+ bool "MediaTek MT6397 pin control"
depends on MFD_MT6397 || COMPILE_TEST
depends on OF
default MFD_MT6397
diff --git a/drivers/pinctrl/pinctrl-amd.c b/drivers/pinctrl/pinctrl-amd.c
index 9236a13..609821b 100644
--- a/drivers/pinctrl/pinctrl-amd.c
+++ b/drivers/pinctrl/pinctrl-amd.c
@@ -872,32 +872,34 @@ static const struct pinconf_ops amd_pinconf_ops = {
.pin_config_group_set = amd_pinconf_group_set,
};
-static void amd_gpio_irq_init(struct amd_gpio *gpio_dev)
+static void amd_gpio_irq_init_pin(struct amd_gpio *gpio_dev, int pin)
{
- struct pinctrl_desc *desc = gpio_dev->pctrl->desc;
+ const struct pin_desc *pd;
unsigned long flags;
u32 pin_reg, mask;
- int i;
mask = BIT(WAKE_CNTRL_OFF_S0I3) | BIT(WAKE_CNTRL_OFF_S3) |
BIT(INTERRUPT_MASK_OFF) | BIT(INTERRUPT_ENABLE_OFF) |
BIT(WAKE_CNTRL_OFF_S4);
- for (i = 0; i < desc->npins; i++) {
- int pin = desc->pins[i].number;
- const struct pin_desc *pd = pin_desc_get(gpio_dev->pctrl, pin);
+ pd = pin_desc_get(gpio_dev->pctrl, pin);
+ if (!pd)
+ return;
- if (!pd)
- continue;
+ raw_spin_lock_irqsave(&gpio_dev->lock, flags);
+ pin_reg = readl(gpio_dev->base + pin * 4);
+ pin_reg &= ~mask;
+ writel(pin_reg, gpio_dev->base + pin * 4);
+ raw_spin_unlock_irqrestore(&gpio_dev->lock, flags);
+}
- raw_spin_lock_irqsave(&gpio_dev->lock, flags);
+static void amd_gpio_irq_init(struct amd_gpio *gpio_dev)
+{
+ struct pinctrl_desc *desc = gpio_dev->pctrl->desc;
+ int i;
- pin_reg = readl(gpio_dev->base + i * 4);
- pin_reg &= ~mask;
- writel(pin_reg, gpio_dev->base + i * 4);
-
- raw_spin_unlock_irqrestore(&gpio_dev->lock, flags);
- }
+ for (i = 0; i < desc->npins; i++)
+ amd_gpio_irq_init_pin(gpio_dev, i);
}
#ifdef CONFIG_PM_SLEEP
@@ -950,8 +952,10 @@ static int amd_gpio_resume(struct device *dev)
for (i = 0; i < desc->npins; i++) {
int pin = desc->pins[i].number;
- if (!amd_gpio_should_save(gpio_dev, pin))
+ if (!amd_gpio_should_save(gpio_dev, pin)) {
+ amd_gpio_irq_init_pin(gpio_dev, pin);
continue;
+ }
raw_spin_lock_irqsave(&gpio_dev->lock, flags);
gpio_dev->saved_regs[i] |= readl(gpio_dev->base + pin * 4) & PIN_IRQ_PENDING;
diff --git a/drivers/pinctrl/pinctrl-at91-pio4.c b/drivers/pinctrl/pinctrl-at91-pio4.c
index 373eed8..c775d23 100644
--- a/drivers/pinctrl/pinctrl-at91-pio4.c
+++ b/drivers/pinctrl/pinctrl-at91-pio4.c
@@ -1206,7 +1206,6 @@ static int atmel_pinctrl_probe(struct platform_device *pdev)
dev_err(dev, "can't add the irq domain\n");
return -ENODEV;
}
- atmel_pioctrl->irq_domain->name = "atmel gpio";
for (i = 0; i < atmel_pioctrl->npins; i++) {
int irq = irq_create_mapping(atmel_pioctrl->irq_domain, i);
diff --git a/drivers/pinctrl/pinctrl-ocelot.c b/drivers/pinctrl/pinctrl-ocelot.c
index 29e4a62..1dcbd09 100644
--- a/drivers/pinctrl/pinctrl-ocelot.c
+++ b/drivers/pinctrl/pinctrl-ocelot.c
@@ -1204,7 +1204,7 @@ static int ocelot_pinmux_set_mux(struct pinctrl_dev *pctldev,
regmap_update_bits(info->map, REG_ALT(0, info, pin->pin),
BIT(p), f << p);
regmap_update_bits(info->map, REG_ALT(1, info, pin->pin),
- BIT(p), f << (p - 1));
+ BIT(p), (f >> 1) << p);
return 0;
}
diff --git a/drivers/pinctrl/stm32/pinctrl-stm32.c b/drivers/pinctrl/stm32/pinctrl-stm32.c
index cb33a23..04ace4c 100644
--- a/drivers/pinctrl/stm32/pinctrl-stm32.c
+++ b/drivers/pinctrl/stm32/pinctrl-stm32.c
@@ -1330,7 +1330,7 @@ static int stm32_gpiolib_register_bank(struct stm32_pinctrl *pctl, struct fwnode
if (fwnode_property_read_u32(fwnode, "st,bank-ioport", &bank_ioport_nr))
bank_ioport_nr = bank_nr;
- bank->gpio_chip.base = bank_nr * STM32_GPIO_PINS_PER_BANK;
+ bank->gpio_chip.base = -1;
bank->gpio_chip.ngpio = npins;
bank->gpio_chip.fwnode = fwnode;
diff --git a/drivers/platform/x86/asus-nb-wmi.c b/drivers/platform/x86/asus-nb-wmi.c
index cb15acd..e2c9a68 100644
--- a/drivers/platform/x86/asus-nb-wmi.c
+++ b/drivers/platform/x86/asus-nb-wmi.c
@@ -464,7 +464,8 @@ static const struct dmi_system_id asus_quirks[] = {
.ident = "ASUS ROG FLOW X13",
.matches = {
DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
- DMI_MATCH(DMI_PRODUCT_NAME, "GV301Q"),
+ /* Match GV301** */
+ DMI_MATCH(DMI_PRODUCT_NAME, "GV301"),
},
.driver_data = &quirk_asus_tablet_mode,
},
diff --git a/drivers/platform/x86/gigabyte-wmi.c b/drivers/platform/x86/gigabyte-wmi.c
index 4dd39ab..2a42604 100644
--- a/drivers/platform/x86/gigabyte-wmi.c
+++ b/drivers/platform/x86/gigabyte-wmi.c
@@ -151,6 +151,7 @@ static const struct dmi_system_id gigabyte_wmi_known_working_platforms[] = {
DMI_EXACT_MATCH_GIGABYTE_BOARD_NAME("B550I AORUS PRO AX"),
DMI_EXACT_MATCH_GIGABYTE_BOARD_NAME("B550M AORUS PRO-P"),
DMI_EXACT_MATCH_GIGABYTE_BOARD_NAME("B550M DS3H"),
+ DMI_EXACT_MATCH_GIGABYTE_BOARD_NAME("B650 AORUS ELITE AX"),
DMI_EXACT_MATCH_GIGABYTE_BOARD_NAME("B660 GAMING X DDR4"),
DMI_EXACT_MATCH_GIGABYTE_BOARD_NAME("B660I AORUS PRO DDR4"),
DMI_EXACT_MATCH_GIGABYTE_BOARD_NAME("Z390 I AORUS PRO WIFI-CF"),
@@ -160,6 +161,7 @@ static const struct dmi_system_id gigabyte_wmi_known_working_platforms[] = {
DMI_EXACT_MATCH_GIGABYTE_BOARD_NAME("X570 GAMING X"),
DMI_EXACT_MATCH_GIGABYTE_BOARD_NAME("X570 I AORUS PRO WIFI"),
DMI_EXACT_MATCH_GIGABYTE_BOARD_NAME("X570 UD"),
+ DMI_EXACT_MATCH_GIGABYTE_BOARD_NAME("X570S AORUS ELITE"),
DMI_EXACT_MATCH_GIGABYTE_BOARD_NAME("Z690M AORUS ELITE AX DDR4"),
{ }
};
diff --git a/drivers/platform/x86/ideapad-laptop.c b/drivers/platform/x86/ideapad-laptop.c
index 0eb5bfd..959ec3c 100644
--- a/drivers/platform/x86/ideapad-laptop.c
+++ b/drivers/platform/x86/ideapad-laptop.c
@@ -1170,7 +1170,6 @@ static const struct key_entry ideapad_keymap[] = {
{ KE_KEY, 65, { KEY_PROG4 } },
{ KE_KEY, 66, { KEY_TOUCHPAD_OFF } },
{ KE_KEY, 67, { KEY_TOUCHPAD_ON } },
- { KE_KEY, 68, { KEY_TOUCHPAD_TOGGLE } },
{ KE_KEY, 128, { KEY_ESC } },
/*
@@ -1526,18 +1525,16 @@ static void ideapad_sync_touchpad_state(struct ideapad_private *priv, bool send_
if (priv->features.ctrl_ps2_aux_port)
i8042_command(¶m, value ? I8042_CMD_AUX_ENABLE : I8042_CMD_AUX_DISABLE);
- if (send_events) {
- /*
- * On older models the EC controls the touchpad and toggles it
- * on/off itself, in this case we report KEY_TOUCHPAD_ON/_OFF.
- * If the EC did not toggle, report KEY_TOUCHPAD_TOGGLE.
- */
- if (value != priv->r_touchpad_val) {
- ideapad_input_report(priv, value ? 67 : 66);
- sysfs_notify(&priv->platform_device->dev.kobj, NULL, "touchpad");
- } else {
- ideapad_input_report(priv, 68);
- }
+ /*
+ * On older models the EC controls the touchpad and toggles it on/off
+ * itself, in this case we report KEY_TOUCHPAD_ON/_OFF. Some models do
+ * an acpi-notify with VPC bit 5 set on resume, so this function get
+ * called with send_events=true on every resume. Therefor if the EC did
+ * not toggle, do nothing to avoid sending spurious KEY_TOUCHPAD_TOGGLE.
+ */
+ if (send_events && value != priv->r_touchpad_val) {
+ ideapad_input_report(priv, value ? 67 : 66);
+ sysfs_notify(&priv->platform_device->dev.kobj, NULL, "touchpad");
}
priv->r_touchpad_val = value;
diff --git a/drivers/platform/x86/intel/pmc/core.c b/drivers/platform/x86/intel/pmc/core.c
index 3a15d32..b959196 100644
--- a/drivers/platform/x86/intel/pmc/core.c
+++ b/drivers/platform/x86/intel/pmc/core.c
@@ -66,7 +66,18 @@ static inline void pmc_core_reg_write(struct pmc_dev *pmcdev, int reg_offset,
static inline u64 pmc_core_adjust_slp_s0_step(struct pmc_dev *pmcdev, u32 value)
{
- return (u64)value * pmcdev->map->slp_s0_res_counter_step;
+ /*
+ * ADL PCH does not have the SLP_S0 counter and LPM Residency counters are
+ * used as a workaround which uses 30.5 usec tick. All other client
+ * programs have the legacy SLP_S0 residency counter that is using the 122
+ * usec tick.
+ */
+ const int lpm_adj_x2 = pmcdev->map->lpm_res_counter_step_x2;
+
+ if (pmcdev->map == &adl_reg_map)
+ return (u64)value * GET_X2_COUNTER((u64)lpm_adj_x2);
+ else
+ return (u64)value * pmcdev->map->slp_s0_res_counter_step;
}
static int set_etr3(struct pmc_dev *pmcdev)
diff --git a/drivers/ptp/ptp_qoriq.c b/drivers/ptp/ptp_qoriq.c
index 6153016..350154e 100644
--- a/drivers/ptp/ptp_qoriq.c
+++ b/drivers/ptp/ptp_qoriq.c
@@ -637,7 +637,7 @@ static int ptp_qoriq_probe(struct platform_device *dev)
return 0;
no_clock:
- iounmap(ptp_qoriq->base);
+ iounmap(base);
no_ioremap:
release_resource(ptp_qoriq->rsrc);
no_resource:
diff --git a/drivers/regulator/fixed.c b/drivers/regulator/fixed.c
index 2a9867a..e6724a22 100644
--- a/drivers/regulator/fixed.c
+++ b/drivers/regulator/fixed.c
@@ -215,7 +215,7 @@ static int reg_fixed_voltage_probe(struct platform_device *pdev)
drvdata->enable_clock = devm_clk_get(dev, NULL);
if (IS_ERR(drvdata->enable_clock)) {
dev_err(dev, "Can't get enable-clock from devicetree\n");
- return -ENOENT;
+ return PTR_ERR(drvdata->enable_clock);
}
} else if (drvtype && drvtype->has_performance_state) {
drvdata->desc.ops = &fixed_voltage_domain_ops;
diff --git a/drivers/s390/block/dasd.c b/drivers/s390/block/dasd.c
index a9c2a8d..9fbfce7 100644
--- a/drivers/s390/block/dasd.c
+++ b/drivers/s390/block/dasd.c
@@ -73,7 +73,8 @@ static void dasd_profile_init(struct dasd_profile *, struct dentry *);
static void dasd_profile_exit(struct dasd_profile *);
static void dasd_hosts_init(struct dentry *, struct dasd_device *);
static void dasd_hosts_exit(struct dasd_device *);
-
+static int dasd_handle_autoquiesce(struct dasd_device *, struct dasd_ccw_req *,
+ unsigned int);
/*
* SECTION: Operations on the device structure.
*/
@@ -1451,6 +1452,8 @@ int dasd_start_IO(struct dasd_ccw_req *cqr)
case -ENODEV:
DBF_DEV_EVENT(DBF_WARNING, device, "%s",
"start_IO: -ENODEV device gone, retry");
+ /* this is equivalent to CC=3 for SSCH report this to EER */
+ dasd_handle_autoquiesce(device, cqr, DASD_EER_STARTIO);
break;
case -EIO:
DBF_DEV_EVENT(DBF_WARNING, device, "%s",
@@ -1953,6 +1956,16 @@ static void __dasd_device_process_final_queue(struct dasd_device *device,
}
/*
+ * check if device should be autoquiesced due to too many timeouts
+ */
+static void __dasd_device_check_autoquiesce_timeout(struct dasd_device *device,
+ struct dasd_ccw_req *cqr)
+{
+ if ((device->default_retries - cqr->retries) >= device->aq_timeouts)
+ dasd_handle_autoquiesce(device, cqr, DASD_EER_TIMEOUTS);
+}
+
+/*
* Take a look at the first request on the ccw queue and check
* if it reached its expire time. If so, terminate the IO.
*/
@@ -1986,6 +1999,7 @@ static void __dasd_device_check_expire(struct dasd_device *device)
"remaining\n", cqr, (cqr->expires/HZ),
cqr->retries);
}
+ __dasd_device_check_autoquiesce_timeout(device, cqr);
}
}
@@ -2325,7 +2339,7 @@ static int _dasd_sleep_on(struct dasd_ccw_req *maincqr, int interruptible)
/* Non-temporary stop condition will trigger fail fast */
if (device->stopped & ~DASD_STOPPED_PENDING &&
test_bit(DASD_CQR_FLAGS_FAILFAST, &cqr->flags) &&
- (!dasd_eer_enabled(device))) {
+ !dasd_eer_enabled(device) && device->aq_mask == 0) {
cqr->status = DASD_CQR_FAILED;
cqr->intrc = -ENOLINK;
continue;
@@ -2801,20 +2815,18 @@ static void __dasd_process_block_ccw_queue(struct dasd_block *block,
dasd_log_sense(cqr, &cqr->irb);
}
- /* First of all call extended error reporting. */
- if (dasd_eer_enabled(base) &&
- cqr->status == DASD_CQR_FAILED) {
- dasd_eer_write(base, cqr, DASD_EER_FATALERROR);
-
- /* restart request */
+ /*
+ * First call extended error reporting and check for autoquiesce
+ */
+ spin_lock_irqsave(get_ccwdev_lock(base->cdev), flags);
+ if (cqr->status == DASD_CQR_FAILED &&
+ dasd_handle_autoquiesce(base, cqr, DASD_EER_FATALERROR)) {
cqr->status = DASD_CQR_FILLED;
cqr->retries = 255;
- spin_lock_irqsave(get_ccwdev_lock(base->cdev), flags);
- dasd_device_set_stop_bits(base, DASD_STOPPED_QUIESCE);
- spin_unlock_irqrestore(get_ccwdev_lock(base->cdev),
- flags);
+ spin_unlock_irqrestore(get_ccwdev_lock(base->cdev), flags);
goto restart;
}
+ spin_unlock_irqrestore(get_ccwdev_lock(base->cdev), flags);
/* Process finished ERP request. */
if (cqr->refers) {
@@ -2856,7 +2868,7 @@ static void __dasd_block_start_head(struct dasd_block *block)
/* Non-temporary stop condition will trigger fail fast */
if (block->base->stopped & ~DASD_STOPPED_PENDING &&
test_bit(DASD_CQR_FLAGS_FAILFAST, &cqr->flags) &&
- (!dasd_eer_enabled(block->base))) {
+ !dasd_eer_enabled(block->base) && block->base->aq_mask == 0) {
cqr->status = DASD_CQR_FAILED;
cqr->intrc = -ENOLINK;
dasd_schedule_block_bh(block);
@@ -2941,7 +2953,7 @@ static int _dasd_requeue_request(struct dasd_ccw_req *cqr)
return 0;
spin_lock_irq(&cqr->dq->lock);
req = (struct request *) cqr->callback_data;
- blk_mq_requeue_request(req, false);
+ blk_mq_requeue_request(req, true);
spin_unlock_irq(&cqr->dq->lock);
return 0;
@@ -3670,8 +3682,8 @@ int dasd_generic_last_path_gone(struct dasd_device *device)
dev_warn(&device->cdev->dev, "No operational channel path is left "
"for the device\n");
DBF_DEV_EVENT(DBF_WARNING, device, "%s", "last path gone");
- /* First of all call extended error reporting. */
- dasd_eer_write(device, NULL, DASD_EER_NOPATH);
+ /* First call extended error reporting and check for autoquiesce. */
+ dasd_handle_autoquiesce(device, NULL, DASD_EER_NOPATH);
if (device->state < DASD_STATE_BASIC)
return 0;
@@ -3803,7 +3815,8 @@ void dasd_generic_path_event(struct ccw_device *cdev, int *path_event)
"No verified channel paths remain for the device\n");
DBF_DEV_EVENT(DBF_WARNING, device,
"%s", "last verified path gone");
- dasd_eer_write(device, NULL, DASD_EER_NOPATH);
+ /* First call extended error reporting and check for autoquiesce. */
+ dasd_handle_autoquiesce(device, NULL, DASD_EER_NOPATH);
dasd_device_set_stop_bits(device,
DASD_STOPPED_DC_WAIT);
}
@@ -3825,7 +3838,8 @@ EXPORT_SYMBOL_GPL(dasd_generic_verify_path);
void dasd_generic_space_exhaust(struct dasd_device *device,
struct dasd_ccw_req *cqr)
{
- dasd_eer_write(device, NULL, DASD_EER_NOSPC);
+ /* First call extended error reporting and check for autoquiesce. */
+ dasd_handle_autoquiesce(device, NULL, DASD_EER_NOSPC);
if (device->state < DASD_STATE_BASIC)
return;
@@ -3958,6 +3972,31 @@ void dasd_schedule_requeue(struct dasd_device *device)
}
EXPORT_SYMBOL(dasd_schedule_requeue);
+static int dasd_handle_autoquiesce(struct dasd_device *device,
+ struct dasd_ccw_req *cqr,
+ unsigned int reason)
+{
+ /* in any case write eer message with reason */
+ if (dasd_eer_enabled(device))
+ dasd_eer_write(device, cqr, reason);
+
+ if (!test_bit(reason, &device->aq_mask))
+ return 0;
+
+ /* notify eer about autoquiesce */
+ if (dasd_eer_enabled(device))
+ dasd_eer_write(device, NULL, DASD_EER_AUTOQUIESCE);
+
+ pr_info("%s: The DASD has been put in the quiesce state\n",
+ dev_name(&device->cdev->dev));
+ dasd_device_set_stop_bits(device, DASD_STOPPED_QUIESCE);
+
+ if (device->features & DASD_FEATURE_REQUEUEQUIESCE)
+ dasd_schedule_requeue(device);
+
+ return 1;
+}
+
static struct dasd_ccw_req *dasd_generic_build_rdc(struct dasd_device *device,
int rdc_buffer_size,
int magic)
diff --git a/drivers/s390/block/dasd_devmap.c b/drivers/s390/block/dasd_devmap.c
index df17f0f..620fab0 100644
--- a/drivers/s390/block/dasd_devmap.c
+++ b/drivers/s390/block/dasd_devmap.c
@@ -50,6 +50,7 @@ struct dasd_devmap {
unsigned short features;
struct dasd_device *device;
struct dasd_copy_relation *copy;
+ unsigned int aq_mask;
};
/*
@@ -1476,6 +1477,128 @@ dasd_eer_store(struct device *dev, struct device_attribute *attr,
static DEVICE_ATTR(eer_enabled, 0644, dasd_eer_show, dasd_eer_store);
/*
+ * aq_mask controls if the DASD should be quiesced on certain triggers
+ * The aq_mask attribute is interpreted as bitmap of the DASD_EER_* triggers.
+ */
+static ssize_t dasd_aq_mask_show(struct device *dev, struct device_attribute *attr,
+ char *buf)
+{
+ struct dasd_devmap *devmap;
+ unsigned int aq_mask = 0;
+
+ devmap = dasd_find_busid(dev_name(dev));
+ if (!IS_ERR(devmap))
+ aq_mask = devmap->aq_mask;
+
+ return sysfs_emit(buf, "%d\n", aq_mask);
+}
+
+static ssize_t dasd_aq_mask_store(struct device *dev, struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct dasd_devmap *devmap;
+ unsigned int val;
+
+ if (kstrtouint(buf, 0, &val) || val > DASD_EER_VALID)
+ return -EINVAL;
+
+ devmap = dasd_devmap_from_cdev(to_ccwdev(dev));
+ if (IS_ERR(devmap))
+ return PTR_ERR(devmap);
+
+ spin_lock(&dasd_devmap_lock);
+ devmap->aq_mask = val;
+ if (devmap->device)
+ devmap->device->aq_mask = devmap->aq_mask;
+ spin_unlock(&dasd_devmap_lock);
+
+ return count;
+}
+
+static DEVICE_ATTR(aq_mask, 0644, dasd_aq_mask_show, dasd_aq_mask_store);
+
+/*
+ * aq_requeue controls if requests are returned to the blocklayer on quiesce
+ * or if requests are only not started
+ */
+static ssize_t dasd_aqr_show(struct device *dev, struct device_attribute *attr,
+ char *buf)
+{
+ struct dasd_devmap *devmap;
+ int flag;
+
+ devmap = dasd_find_busid(dev_name(dev));
+ if (!IS_ERR(devmap))
+ flag = (devmap->features & DASD_FEATURE_REQUEUEQUIESCE) != 0;
+ else
+ flag = (DASD_FEATURE_DEFAULT &
+ DASD_FEATURE_REQUEUEQUIESCE) != 0;
+ return sysfs_emit(buf, "%d\n", flag);
+}
+
+static ssize_t dasd_aqr_store(struct device *dev, struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ bool val;
+ int rc;
+
+ if (kstrtobool(buf, &val))
+ return -EINVAL;
+
+ rc = dasd_set_feature(to_ccwdev(dev), DASD_FEATURE_REQUEUEQUIESCE, val);
+
+ return rc ? : count;
+}
+
+static DEVICE_ATTR(aq_requeue, 0644, dasd_aqr_show, dasd_aqr_store);
+
+/*
+ * aq_timeouts controls how much retries have to time out until
+ * a device gets autoquiesced
+ */
+static ssize_t
+dasd_aq_timeouts_show(struct device *dev, struct device_attribute *attr,
+ char *buf)
+{
+ struct dasd_device *device;
+ int len;
+
+ device = dasd_device_from_cdev(to_ccwdev(dev));
+ if (IS_ERR(device))
+ return -ENODEV;
+ len = sysfs_emit(buf, "%u\n", device->aq_timeouts);
+ dasd_put_device(device);
+ return len;
+}
+
+static ssize_t
+dasd_aq_timeouts_store(struct device *dev, struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct dasd_device *device;
+ unsigned int val;
+
+ device = dasd_device_from_cdev(to_ccwdev(dev));
+ if (IS_ERR(device))
+ return -ENODEV;
+
+ if ((kstrtouint(buf, 10, &val) != 0) ||
+ val > DASD_RETRIES_MAX || val == 0) {
+ dasd_put_device(device);
+ return -EINVAL;
+ }
+
+ if (val)
+ device->aq_timeouts = val;
+
+ dasd_put_device(device);
+ return count;
+}
+
+static DEVICE_ATTR(aq_timeouts, 0644, dasd_aq_timeouts_show,
+ dasd_aq_timeouts_store);
+
+/*
* expiration time for default requests
*/
static ssize_t
@@ -2324,6 +2447,9 @@ static struct attribute * dasd_attrs[] = {
&dev_attr_copy_pair.attr,
&dev_attr_copy_role.attr,
&dev_attr_ping.attr,
+ &dev_attr_aq_mask.attr,
+ &dev_attr_aq_requeue.attr,
+ &dev_attr_aq_timeouts.attr,
NULL,
};
diff --git a/drivers/s390/block/dasd_eckd.c b/drivers/s390/block/dasd_eckd.c
index 1a69f97..ade1369 100644
--- a/drivers/s390/block/dasd_eckd.c
+++ b/drivers/s390/block/dasd_eckd.c
@@ -2109,6 +2109,7 @@ dasd_eckd_check_characteristics(struct dasd_device *device)
device->default_retries = DASD_RETRIES;
device->path_thrhld = DASD_ECKD_PATH_THRHLD;
device->path_interval = DASD_ECKD_PATH_INTERVAL;
+ device->aq_timeouts = DASD_RETRIES_MAX;
if (private->conf.gneq) {
value = 1;
diff --git a/drivers/s390/block/dasd_eer.c b/drivers/s390/block/dasd_eer.c
index a4cc772..c956de7 100644
--- a/drivers/s390/block/dasd_eer.c
+++ b/drivers/s390/block/dasd_eer.c
@@ -387,6 +387,7 @@ void dasd_eer_write(struct dasd_device *device, struct dasd_ccw_req *cqr,
break;
case DASD_EER_NOPATH:
case DASD_EER_NOSPC:
+ case DASD_EER_AUTOQUIESCE:
dasd_eer_write_standard_trigger(device, NULL, id);
break;
case DASD_EER_STATECHANGE:
diff --git a/drivers/s390/block/dasd_int.h b/drivers/s390/block/dasd_int.h
index 97adc8a..33f812f 100644
--- a/drivers/s390/block/dasd_int.h
+++ b/drivers/s390/block/dasd_int.h
@@ -444,22 +444,22 @@ struct dasd_discipline {
extern struct dasd_discipline *dasd_diag_discipline_pointer;
-/*
- * Notification numbers for extended error reporting notifications:
- * The DASD_EER_DISABLE notification is sent before a dasd_device (and it's
- * eer pointer) is freed. The error reporting module needs to do all necessary
- * cleanup steps.
- * The DASD_EER_TRIGGER notification sends the actual error reports (triggers).
- */
-#define DASD_EER_DISABLE 0
-#define DASD_EER_TRIGGER 1
+/* Trigger IDs for extended error reporting DASD EER and autoquiesce */
+enum eer_trigger {
+ DASD_EER_FATALERROR = 1,
+ DASD_EER_NOPATH,
+ DASD_EER_STATECHANGE,
+ DASD_EER_PPRCSUSPEND,
+ DASD_EER_NOSPC,
+ DASD_EER_TIMEOUTS,
+ DASD_EER_STARTIO,
-/* Trigger IDs for extended error reporting DASD_EER_TRIGGER notification */
-#define DASD_EER_FATALERROR 1
-#define DASD_EER_NOPATH 2
-#define DASD_EER_STATECHANGE 3
-#define DASD_EER_PPRCSUSPEND 4
-#define DASD_EER_NOSPC 5
+ /* enum end marker, only add new trigger above */
+ DASD_EER_MAX,
+ DASD_EER_AUTOQUIESCE = 31, /* internal only */
+};
+
+#define DASD_EER_VALID ((1U << DASD_EER_MAX) - 1)
/* DASD path handling */
@@ -637,6 +637,8 @@ struct dasd_device {
struct dasd_format_entry format_entry;
struct kset *paths_info;
struct dasd_copy_relation *copy;
+ unsigned long aq_mask;
+ unsigned int aq_timeouts;
};
struct dasd_block {
diff --git a/drivers/s390/crypto/vfio_ap_drv.c b/drivers/s390/crypto/vfio_ap_drv.c
index 997b524..a48c693 100644
--- a/drivers/s390/crypto/vfio_ap_drv.c
+++ b/drivers/s390/crypto/vfio_ap_drv.c
@@ -54,8 +54,9 @@ static struct ap_driver vfio_ap_drv = {
static void vfio_ap_matrix_dev_release(struct device *dev)
{
- struct ap_matrix_dev *matrix_dev = dev_get_drvdata(dev);
+ struct ap_matrix_dev *matrix_dev;
+ matrix_dev = container_of(dev, struct ap_matrix_dev, device);
kfree(matrix_dev);
}
diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c
index 3ceece9..c895189 100644
--- a/drivers/scsi/megaraid/megaraid_sas_base.c
+++ b/drivers/scsi/megaraid/megaraid_sas_base.c
@@ -3298,7 +3298,7 @@ fw_crash_buffer_show(struct device *cdev,
spin_lock_irqsave(&instance->crashdump_lock, flags);
buff_offset = instance->fw_crash_buffer_offset;
- if (!instance->crash_dump_buf &&
+ if (!instance->crash_dump_buf ||
!((instance->fw_crash_state == AVAILABLE) ||
(instance->fw_crash_state == COPYING))) {
dev_err(&instance->pdev->dev,
diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c
index 84c9a55..8a83f3fc 100644
--- a/drivers/scsi/megaraid/megaraid_sas_fusion.c
+++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c
@@ -4771,7 +4771,7 @@ int megasas_task_abort_fusion(struct scsi_cmnd *scmd)
devhandle = megasas_get_tm_devhandle(scmd->device);
if (devhandle == (u16)ULONG_MAX) {
- ret = SUCCESS;
+ ret = FAILED;
sdev_printk(KERN_INFO, scmd->device,
"task abort issued for invalid devhandle\n");
mutex_unlock(&instance->reset_mutex);
@@ -4841,7 +4841,7 @@ int megasas_reset_target_fusion(struct scsi_cmnd *scmd)
devhandle = megasas_get_tm_devhandle(scmd->device);
if (devhandle == (u16)ULONG_MAX) {
- ret = SUCCESS;
+ ret = FAILED;
sdev_printk(KERN_INFO, scmd->device,
"target reset issued for invalid devhandle\n");
mutex_unlock(&instance->reset_mutex);
diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c
index 2ee9ea5..14ae0a9 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -6616,11 +6616,6 @@ _base_allocate_memory_pools(struct MPT3SAS_ADAPTER *ioc)
else if (rc == -EAGAIN)
goto try_32bit_dma;
total_sz += sense_sz;
- ioc_info(ioc,
- "sense pool(0x%p)- dma(0x%llx): depth(%d),"
- "element_size(%d), pool_size(%d kB)\n",
- ioc->sense, (unsigned long long)ioc->sense_dma, ioc->scsiio_depth,
- SCSI_SENSE_BUFFERSIZE, sz / 1024);
/* reply pool, 4 byte align */
sz = ioc->reply_free_queue_depth * ioc->reply_sz;
rc = _base_allocate_reply_pool(ioc, sz);
diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index 5cce1ba..09ef0b3 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -314,11 +314,18 @@ static int scsi_vpd_inquiry(struct scsi_device *sdev, unsigned char *buffer,
if (result)
return -EIO;
- /* Sanity check that we got the page back that we asked for */
+ /*
+ * Sanity check that we got the page back that we asked for and that
+ * the page size is not 0.
+ */
if (buffer[1] != page)
return -EIO;
- return get_unaligned_be16(&buffer[2]) + 4;
+ result = get_unaligned_be16(&buffer[2]);
+ if (!result)
+ return -EIO;
+
+ return result + 4;
}
static int scsi_get_vpd_size(struct scsi_device *sdev, u8 page)
diff --git a/drivers/thermal/intel/int340x_thermal/processor_thermal_device_pci.c b/drivers/thermal/intel/int340x_thermal/processor_thermal_device_pci.c
index 90526f4..d71ee50 100644
--- a/drivers/thermal/intel/int340x_thermal/processor_thermal_device_pci.c
+++ b/drivers/thermal/intel/int340x_thermal/processor_thermal_device_pci.c
@@ -153,7 +153,6 @@ static int sys_set_trip_temp(struct thermal_zone_device *tzd, int trip, int temp
cancel_delayed_work_sync(&pci_info->work);
proc_thermal_mmio_write(pci_info, PROC_THERMAL_MMIO_INT_ENABLE_0, 0);
proc_thermal_mmio_write(pci_info, PROC_THERMAL_MMIO_THRES_0, 0);
- thermal_zone_device_disable(tzd);
pci_info->stored_thres = 0;
return 0;
}
diff --git a/drivers/thermal/intel/intel_powerclamp.c b/drivers/thermal/intel/intel_powerclamp.c
index c7ba568..91fc7e2 100644
--- a/drivers/thermal/intel/intel_powerclamp.c
+++ b/drivers/thermal/intel/intel_powerclamp.c
@@ -235,6 +235,12 @@ static int max_idle_set(const char *arg, const struct kernel_param *kp)
goto skip_limit_set;
}
+ if (!cpumask_available(idle_injection_cpu_mask)) {
+ ret = allocate_copy_idle_injection_mask(cpu_present_mask);
+ if (ret)
+ goto skip_limit_set;
+ }
+
if (check_invalid(idle_injection_cpu_mask, new_max_idle)) {
ret = -EINVAL;
goto skip_limit_set;
@@ -791,7 +797,8 @@ static int __init powerclamp_init(void)
return retval;
mutex_lock(&powerclamp_lock);
- retval = allocate_copy_idle_injection_mask(cpu_present_mask);
+ if (!cpumask_available(idle_injection_cpu_mask))
+ retval = allocate_copy_idle_injection_mask(cpu_present_mask);
mutex_unlock(&powerclamp_lock);
if (retval)
diff --git a/drivers/thermal/thermal_sysfs.c b/drivers/thermal/thermal_sysfs.c
index a4aba7b..6c20c9f 100644
--- a/drivers/thermal/thermal_sysfs.c
+++ b/drivers/thermal/thermal_sysfs.c
@@ -876,8 +876,6 @@ static void cooling_device_stats_setup(struct thermal_cooling_device *cdev)
unsigned long states = cdev->max_state + 1;
int var;
- lockdep_assert_held(&cdev->lock);
-
var = sizeof(*stats);
var += sizeof(*stats->time_in_state) * states;
var += sizeof(*stats->trans_table) * states * states;
@@ -903,8 +901,6 @@ static void cooling_device_stats_setup(struct thermal_cooling_device *cdev)
static void cooling_device_stats_destroy(struct thermal_cooling_device *cdev)
{
- lockdep_assert_held(&cdev->lock);
-
kfree(cdev->stats);
cdev->stats = NULL;
}
@@ -931,6 +927,8 @@ void thermal_cooling_device_destroy_sysfs(struct thermal_cooling_device *cdev)
void thermal_cooling_device_stats_reinit(struct thermal_cooling_device *cdev)
{
+ lockdep_assert_held(&cdev->lock);
+
cooling_device_stats_destroy(cdev);
cooling_device_stats_setup(cdev);
}
diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index 36fb945..9d117e57 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -466,7 +466,7 @@ static const struct file_operations tty_fops = {
.llseek = no_llseek,
.read_iter = tty_read,
.write_iter = tty_write,
- .splice_read = generic_file_splice_read,
+ .splice_read = direct_splice_read,
.splice_write = iter_file_splice_write,
.poll = tty_poll,
.unlocked_ioctl = tty_ioctl,
@@ -481,7 +481,7 @@ static const struct file_operations console_fops = {
.llseek = no_llseek,
.read_iter = tty_read,
.write_iter = redirected_tty_write,
- .splice_read = generic_file_splice_read,
+ .splice_read = direct_splice_read,
.splice_write = iter_file_splice_write,
.poll = tty_poll,
.unlocked_ioctl = tty_ioctl,
diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c
index 90e40d5..e54f088 100644
--- a/fs/btrfs/backref.c
+++ b/fs/btrfs/backref.c
@@ -1921,8 +1921,7 @@ int btrfs_is_data_extent_shared(struct btrfs_inode *inode, u64 bytenr,
level = -1;
ULIST_ITER_INIT(&uiter);
while (1) {
- bool is_shared;
- bool cached;
+ const unsigned long prev_ref_count = ctx->refs.nnodes;
walk_ctx.bytenr = bytenr;
ret = find_parent_nodes(&walk_ctx, &shared);
@@ -1940,21 +1939,36 @@ int btrfs_is_data_extent_shared(struct btrfs_inode *inode, u64 bytenr,
ret = 0;
/*
- * If our data extent was not directly shared (without multiple
- * reference items), than it might have a single reference item
- * with a count > 1 for the same offset, which means there are 2
- * (or more) file extent items that point to the data extent -
- * this happens when a file extent item needs to be split and
- * then one item gets moved to another leaf due to a b+tree leaf
- * split when inserting some item. In this case the file extent
- * items may be located in different leaves and therefore some
- * of the leaves may be referenced through shared subtrees while
- * others are not. Since our extent buffer cache only works for
- * a single path (by far the most common case and simpler to
- * deal with), we can not use it if we have multiple leaves
- * (which implies multiple paths).
+ * More than one extent buffer (bytenr) may have been added to
+ * the ctx->refs ulist, in which case we have to check multiple
+ * tree paths in case the first one is not shared, so we can not
+ * use the path cache which is made for a single path. Multiple
+ * extent buffers at the current level happen when:
+ *
+ * 1) level -1, the data extent: If our data extent was not
+ * directly shared (without multiple reference items), then
+ * it might have a single reference item with a count > 1 for
+ * the same offset, which means there are 2 (or more) file
+ * extent items that point to the data extent - this happens
+ * when a file extent item needs to be split and then one
+ * item gets moved to another leaf due to a b+tree leaf split
+ * when inserting some item. In this case the file extent
+ * items may be located in different leaves and therefore
+ * some of the leaves may be referenced through shared
+ * subtrees while others are not. Since our extent buffer
+ * cache only works for a single path (by far the most common
+ * case and simpler to deal with), we can not use it if we
+ * have multiple leaves (which implies multiple paths).
+ *
+ * 2) level >= 0, a tree node/leaf: We can have a mix of direct
+ * and indirect references on a b+tree node/leaf, so we have
+ * to check multiple paths, and the extent buffer (the
+ * current bytenr) may be shared or not. One example is
+ * during relocation as we may get a shared tree block ref
+ * (direct ref) and a non-shared tree block ref (indirect
+ * ref) for the same node/leaf.
*/
- if (level == -1 && ctx->refs.nnodes > 1)
+ if ((ctx->refs.nnodes - prev_ref_count) > 1)
ctx->use_path_cache = false;
if (level >= 0)
@@ -1964,12 +1978,17 @@ int btrfs_is_data_extent_shared(struct btrfs_inode *inode, u64 bytenr,
if (!node)
break;
bytenr = node->val;
- level++;
- cached = lookup_backref_shared_cache(ctx, root, bytenr, level,
- &is_shared);
- if (cached) {
- ret = (is_shared ? 1 : 0);
- break;
+ if (ctx->use_path_cache) {
+ bool is_shared;
+ bool cached;
+
+ level++;
+ cached = lookup_backref_shared_cache(ctx, root, bytenr,
+ level, &is_shared);
+ if (cached) {
+ ret = (is_shared ? 1 : 0);
+ break;
+ }
}
shared.share_count = 0;
shared.have_delayed_delete_refs = false;
@@ -1977,6 +1996,28 @@ int btrfs_is_data_extent_shared(struct btrfs_inode *inode, u64 bytenr,
}
/*
+ * If the path cache is disabled, then it means at some tree level we
+ * got multiple parents due to a mix of direct and indirect backrefs or
+ * multiple leaves with file extent items pointing to the same data
+ * extent. We have to invalidate the cache and cache only the sharedness
+ * result for the levels where we got only one node/reference.
+ */
+ if (!ctx->use_path_cache) {
+ int i = 0;
+
+ level--;
+ if (ret >= 0 && level >= 0) {
+ bytenr = ctx->path_cache_entries[level].bytenr;
+ ctx->use_path_cache = true;
+ store_backref_shared_cache(ctx, root, bytenr, level, ret);
+ i = level + 1;
+ }
+
+ for ( ; i < BTRFS_MAX_LEVEL; i++)
+ ctx->path_cache_entries[i].bytenr = 0;
+ }
+
+ /*
* Cache the sharedness result for the data extent if we know our inode
* has more than 1 file extent item that refers to the data extent.
*/
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index a0ef1a1..ba769a1 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -3732,7 +3732,9 @@ static long btrfs_ioctl_qgroup_assign(struct file *file, void __user *arg)
}
/* update qgroup status and info */
+ mutex_lock(&fs_info->qgroup_ioctl_lock);
err = btrfs_run_qgroups(trans);
+ mutex_unlock(&fs_info->qgroup_ioctl_lock);
if (err < 0)
btrfs_handle_fs_error(fs_info, err,
"failed to update qgroup status and info");
diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index 52a7d2f..f41da7a 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -2828,13 +2828,22 @@ int btrfs_qgroup_account_extents(struct btrfs_trans_handle *trans)
}
/*
- * called from commit_transaction. Writes all changed qgroups to disk.
+ * Writes all changed qgroups to disk.
+ * Called by the transaction commit path and the qgroup assign ioctl.
*/
int btrfs_run_qgroups(struct btrfs_trans_handle *trans)
{
struct btrfs_fs_info *fs_info = trans->fs_info;
int ret = 0;
+ /*
+ * In case we are called from the qgroup assign ioctl, assert that we
+ * are holding the qgroup_ioctl_lock, otherwise we can race with a quota
+ * disable operation (ioctl) and access a freed quota root.
+ */
+ if (trans->transaction->state != TRANS_STATE_COMMIT_DOING)
+ lockdep_assert_held(&fs_info->qgroup_ioctl_lock);
+
if (!fs_info->quota_root)
return ret;
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 18329eb..b8d5b1f 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -2035,7 +2035,20 @@ static void cleanup_transaction(struct btrfs_trans_handle *trans, int err)
if (current->journal_info == trans)
current->journal_info = NULL;
- btrfs_scrub_cancel(fs_info);
+
+ /*
+ * If relocation is running, we can't cancel scrub because that will
+ * result in a deadlock. Before relocating a block group, relocation
+ * pauses scrub, then starts and commits a transaction before unpausing
+ * scrub. If the transaction commit is being done by the relocation
+ * task or triggered by another task and the relocation task is waiting
+ * for the commit, and we end up here due to an error in the commit
+ * path, then calling btrfs_scrub_cancel() will deadlock, as we are
+ * asking for scrub to stop while having it asked to be paused higher
+ * above in relocation code.
+ */
+ if (!test_bit(BTRFS_FS_RELOC_RUNNING, &fs_info->flags))
+ btrfs_scrub_cancel(fs_info);
kmem_cache_free(btrfs_trans_handle_cachep, trans);
}
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 6d0124b..c6d5928 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1366,8 +1366,17 @@ struct btrfs_device *btrfs_scan_one_device(const char *path, fmode_t flags,
* So, we need to add a special mount option to scan for
* later supers, using BTRFS_SUPER_MIRROR_MAX instead
*/
- flags |= FMODE_EXCL;
+ /*
+ * Avoid using flag |= FMODE_EXCL here, as the systemd-udev may
+ * initiate the device scan which may race with the user's mount
+ * or mkfs command, resulting in failure.
+ * Since the device scan is solely for reading purposes, there is
+ * no need for FMODE_EXCL. Additionally, the devices are read again
+ * during the mount process. It is ok to get some inconsistent
+ * values temporarily, as the device paths of the fsid are the only
+ * required information for assembling the volume.
+ */
bdev = blkdev_get_by_path(path, flags, holder);
if (IS_ERR(bdev))
return ERR_CAST(bdev);
@@ -3266,8 +3275,15 @@ int btrfs_relocate_chunk(struct btrfs_fs_info *fs_info, u64 chunk_offset)
btrfs_scrub_pause(fs_info);
ret = btrfs_relocate_block_group(fs_info, chunk_offset);
btrfs_scrub_continue(fs_info);
- if (ret)
+ if (ret) {
+ /*
+ * If we had a transaction abort, stop all running scrubs.
+ * See transaction.c:cleanup_transaction() why we do it here.
+ */
+ if (BTRFS_FS_ERROR(fs_info))
+ btrfs_scrub_cancel(fs_info);
return ret;
+ }
block_group = btrfs_lookup_block_group(fs_info, chunk_offset);
if (!block_group)
diff --git a/fs/cifs/cifsfs.c b/fs/cifs/cifsfs.c
index ac9034f..5fd67c1 100644
--- a/fs/cifs/cifsfs.c
+++ b/fs/cifs/cifsfs.c
@@ -1362,7 +1362,7 @@ const struct file_operations cifs_file_ops = {
.fsync = cifs_fsync,
.flush = cifs_flush,
.mmap = cifs_file_mmap,
- .splice_read = cifs_splice_read,
+ .splice_read = generic_file_splice_read,
.splice_write = iter_file_splice_write,
.llseek = cifs_llseek,
.unlocked_ioctl = cifs_ioctl,
@@ -1382,7 +1382,7 @@ const struct file_operations cifs_file_strict_ops = {
.fsync = cifs_strict_fsync,
.flush = cifs_flush,
.mmap = cifs_file_strict_mmap,
- .splice_read = cifs_splice_read,
+ .splice_read = generic_file_splice_read,
.splice_write = iter_file_splice_write,
.llseek = cifs_llseek,
.unlocked_ioctl = cifs_ioctl,
@@ -1420,7 +1420,7 @@ const struct file_operations cifs_file_nobrl_ops = {
.fsync = cifs_fsync,
.flush = cifs_flush,
.mmap = cifs_file_mmap,
- .splice_read = cifs_splice_read,
+ .splice_read = generic_file_splice_read,
.splice_write = iter_file_splice_write,
.llseek = cifs_llseek,
.unlocked_ioctl = cifs_ioctl,
@@ -1438,7 +1438,7 @@ const struct file_operations cifs_file_strict_nobrl_ops = {
.fsync = cifs_strict_fsync,
.flush = cifs_flush,
.mmap = cifs_file_strict_mmap,
- .splice_read = cifs_splice_read,
+ .splice_read = generic_file_splice_read,
.splice_write = iter_file_splice_write,
.llseek = cifs_llseek,
.unlocked_ioctl = cifs_ioctl,
diff --git a/fs/cifs/cifsfs.h b/fs/cifs/cifsfs.h
index 71fe0a0a..9321c16 100644
--- a/fs/cifs/cifsfs.h
+++ b/fs/cifs/cifsfs.h
@@ -100,9 +100,6 @@ extern ssize_t cifs_strict_readv(struct kiocb *iocb, struct iov_iter *to);
extern ssize_t cifs_user_writev(struct kiocb *iocb, struct iov_iter *from);
extern ssize_t cifs_direct_writev(struct kiocb *iocb, struct iov_iter *from);
extern ssize_t cifs_strict_writev(struct kiocb *iocb, struct iov_iter *from);
-extern ssize_t cifs_splice_read(struct file *in, loff_t *ppos,
- struct pipe_inode_info *pipe, size_t len,
- unsigned int flags);
extern int cifs_flock(struct file *pfile, int cmd, struct file_lock *plock);
extern int cifs_lock(struct file *, int, struct file_lock *);
extern int cifs_fsync(struct file *, loff_t, loff_t, int);
@@ -124,7 +121,10 @@ extern const struct dentry_operations cifs_ci_dentry_ops;
#ifdef CONFIG_CIFS_DFS_UPCALL
extern struct vfsmount *cifs_dfs_d_automount(struct path *path);
#else
-#define cifs_dfs_d_automount NULL
+static inline struct vfsmount *cifs_dfs_d_automount(struct path *path)
+{
+ return ERR_PTR(-EREMOTE);
+}
#endif
/* Functions related to symlinks */
diff --git a/fs/cifs/cifssmb.c b/fs/cifs/cifssmb.c
index 38a697e..0d30b17 100644
--- a/fs/cifs/cifssmb.c
+++ b/fs/cifs/cifssmb.c
@@ -71,7 +71,7 @@ cifs_reconnect_tcon(struct cifs_tcon *tcon, int smb_command)
int rc;
struct cifs_ses *ses;
struct TCP_Server_Info *server;
- struct nls_table *nls_codepage;
+ struct nls_table *nls_codepage = NULL;
/*
* SMBs NegProt, SessSetup, uLogoff do not have tcon yet so check for
@@ -99,6 +99,7 @@ cifs_reconnect_tcon(struct cifs_tcon *tcon, int smb_command)
}
spin_unlock(&tcon->tc_lock);
+again:
rc = cifs_wait_for_server_reconnect(server, tcon->retry);
if (rc)
return rc;
@@ -110,8 +111,7 @@ cifs_reconnect_tcon(struct cifs_tcon *tcon, int smb_command)
}
spin_unlock(&ses->chan_lock);
- nls_codepage = load_nls_default();
-
+ mutex_lock(&ses->session_mutex);
/*
* Recheck after acquire mutex. If another thread is negotiating
* and the server never sends an answer the socket will be closed
@@ -120,29 +120,38 @@ cifs_reconnect_tcon(struct cifs_tcon *tcon, int smb_command)
spin_lock(&server->srv_lock);
if (server->tcpStatus == CifsNeedReconnect) {
spin_unlock(&server->srv_lock);
+ mutex_lock(&ses->session_mutex);
+
+ if (tcon->retry)
+ goto again;
rc = -EHOSTDOWN;
goto out;
}
spin_unlock(&server->srv_lock);
+ nls_codepage = load_nls_default();
+
/*
* need to prevent multiple threads trying to simultaneously
* reconnect the same SMB session
*/
+ spin_lock(&ses->ses_lock);
spin_lock(&ses->chan_lock);
- if (!cifs_chan_needs_reconnect(ses, server)) {
+ if (!cifs_chan_needs_reconnect(ses, server) &&
+ ses->ses_status == SES_GOOD) {
spin_unlock(&ses->chan_lock);
+ spin_unlock(&ses->ses_lock);
/* this means that we only need to tree connect */
if (tcon->need_reconnect)
goto skip_sess_setup;
- rc = -EHOSTDOWN;
+ mutex_unlock(&ses->session_mutex);
goto out;
}
spin_unlock(&ses->chan_lock);
+ spin_unlock(&ses->ses_lock);
- mutex_lock(&ses->session_mutex);
rc = cifs_negotiate_protocol(0, ses, server);
if (!rc)
rc = cifs_setup_session(0, ses, server, nls_codepage);
@@ -4373,8 +4382,13 @@ CIFSGetDFSRefer(const unsigned int xid, struct cifs_ses *ses,
return -ENODEV;
getDFSRetry:
- rc = smb_init(SMB_COM_TRANSACTION2, 15, ses->tcon_ipc, (void **) &pSMB,
- (void **) &pSMBr);
+ /*
+ * Use smb_init_no_reconnect() instead of smb_init() as
+ * CIFSGetDFSRefer() may be called from cifs_reconnect_tcon() and thus
+ * causing an infinite recursion.
+ */
+ rc = smb_init_no_reconnect(SMB_COM_TRANSACTION2, 15, ses->tcon_ipc,
+ (void **)&pSMB, (void **)&pSMBr);
if (rc)
return rc;
diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index 6831a99..040bb30 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -5066,19 +5066,3 @@ const struct address_space_operations cifs_addr_ops_smallbuf = {
.launder_folio = cifs_launder_folio,
.migrate_folio = filemap_migrate_folio,
};
-
-/*
- * Splice data from a file into a pipe.
- */
-ssize_t cifs_splice_read(struct file *in, loff_t *ppos,
- struct pipe_inode_info *pipe, size_t len,
- unsigned int flags)
-{
- if (unlikely(*ppos >= file_inode(in)->i_sb->s_maxbytes))
- return 0;
- if (unlikely(!len))
- return 0;
- if (in->f_flags & O_DIRECT)
- return direct_splice_read(in, ppos, pipe, len, flags);
- return filemap_splice_read(in, ppos, pipe, len, flags);
-}
diff --git a/fs/cifs/smb2pdu.c b/fs/cifs/smb2pdu.c
index 6bd2aa6..2b92132 100644
--- a/fs/cifs/smb2pdu.c
+++ b/fs/cifs/smb2pdu.c
@@ -310,7 +310,6 @@ smb2_reconnect(__le16 smb2_command, struct cifs_tcon *tcon,
case SMB2_READ:
case SMB2_WRITE:
case SMB2_LOCK:
- case SMB2_IOCTL:
case SMB2_QUERY_DIRECTORY:
case SMB2_CHANGE_NOTIFY:
case SMB2_QUERY_INFO:
diff --git a/fs/coda/file.c b/fs/coda/file.c
index 3f3c81e..12b26bd 100644
--- a/fs/coda/file.c
+++ b/fs/coda/file.c
@@ -23,6 +23,7 @@
#include <linux/slab.h>
#include <linux/uaccess.h>
#include <linux/uio.h>
+#include <linux/splice.h>
#include <linux/coda.h>
#include "coda_psdev.h"
@@ -94,6 +95,32 @@ coda_file_write_iter(struct kiocb *iocb, struct iov_iter *to)
return ret;
}
+static ssize_t
+coda_file_splice_read(struct file *coda_file, loff_t *ppos,
+ struct pipe_inode_info *pipe,
+ size_t len, unsigned int flags)
+{
+ struct inode *coda_inode = file_inode(coda_file);
+ struct coda_file_info *cfi = coda_ftoc(coda_file);
+ struct file *in = cfi->cfi_container;
+ loff_t ki_pos = *ppos;
+ ssize_t ret;
+
+ ret = venus_access_intent(coda_inode->i_sb, coda_i2f(coda_inode),
+ &cfi->cfi_access_intent,
+ len, ki_pos, CODA_ACCESS_TYPE_READ);
+ if (ret)
+ goto finish_read;
+
+ ret = vfs_splice_read(in, ppos, pipe, len, flags);
+
+finish_read:
+ venus_access_intent(coda_inode->i_sb, coda_i2f(coda_inode),
+ &cfi->cfi_access_intent,
+ len, ki_pos, CODA_ACCESS_TYPE_READ_FINISH);
+ return ret;
+}
+
static void
coda_vm_open(struct vm_area_struct *vma)
{
@@ -302,5 +329,5 @@ const struct file_operations coda_file_operations = {
.open = coda_open,
.release = coda_release,
.fsync = coda_fsync,
- .splice_read = generic_file_splice_read,
+ .splice_read = coda_file_splice_read,
};
diff --git a/fs/direct-io.c b/fs/direct-io.c
index ab0d7ea..47b90c6 100644
--- a/fs/direct-io.c
+++ b/fs/direct-io.c
@@ -403,6 +403,8 @@ dio_bio_alloc(struct dio *dio, struct dio_submit *sdio,
bio->bi_end_io = dio_bio_end_aio;
else
bio->bi_end_io = dio_bio_end_io;
+ /* for now require references for all pages */
+ bio_set_flag(bio, BIO_PAGE_REFFED);
sdio->bio = bio;
sdio->logical_offset_in_bio = sdio->cur_page_fs_offset;
}
diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index 0b8b449..d101b3b 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -899,7 +899,8 @@ static int ext4_file_open(struct inode *inode, struct file *filp)
return ret;
}
- filp->f_mode |= FMODE_NOWAIT | FMODE_BUF_RASYNC;
+ filp->f_mode |= FMODE_NOWAIT | FMODE_BUF_RASYNC |
+ FMODE_DIO_PARALLEL_WRITE;
return dquot_file_open(inode, filp);
}
diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c
index f771001..ceeb0a1 100644
--- a/fs/iomap/direct-io.c
+++ b/fs/iomap/direct-io.c
@@ -202,7 +202,6 @@ static void iomap_dio_zero(const struct iomap_iter *iter, struct iomap_dio *dio,
bio->bi_private = dio;
bio->bi_end_io = iomap_dio_bio_end_io;
- get_page(page);
__bio_add_page(bio, page, len, 0);
iomap_dio_submit_bio(iter, dio, bio, pos);
}
diff --git a/fs/kernfs/file.c b/fs/kernfs/file.c
index e4a50e4..9d23b81 100644
--- a/fs/kernfs/file.c
+++ b/fs/kernfs/file.c
@@ -1011,7 +1011,7 @@ const struct file_operations kernfs_file_fops = {
.release = kernfs_fop_release,
.poll = kernfs_fop_poll,
.fsync = noop_fsync,
- .splice_read = generic_file_splice_read,
+ .splice_read = direct_splice_read,
.splice_write = iter_file_splice_write,
};
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 22a93ae..5607b1e 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -1980,8 +1980,7 @@ _nfs4_opendata_reclaim_to_nfs4_state(struct nfs4_opendata *data)
if (!data->rpc_done) {
if (data->rpc_status)
return ERR_PTR(data->rpc_status);
- /* cached opens have already been processed */
- goto update;
+ return nfs4_try_open_cached(data);
}
ret = nfs_refresh_inode(inode, &data->f_attr);
@@ -1990,7 +1989,7 @@ _nfs4_opendata_reclaim_to_nfs4_state(struct nfs4_opendata *data)
if (data->o_res.delegation_type != 0)
nfs4_opendata_check_deleg(data, state);
-update:
+
if (!update_open_stateid(state, &data->o_res.stateid,
NULL, data->o_arg.fmode))
return ERR_PTR(-EAGAIN);
diff --git a/fs/overlayfs/file.c b/fs/overlayfs/file.c
index 7c04f03..8619788 100644
--- a/fs/overlayfs/file.c
+++ b/fs/overlayfs/file.c
@@ -419,6 +419,27 @@ static ssize_t ovl_write_iter(struct kiocb *iocb, struct iov_iter *iter)
return ret;
}
+static ssize_t ovl_splice_read(struct file *in, loff_t *ppos,
+ struct pipe_inode_info *pipe, size_t len,
+ unsigned int flags)
+{
+ const struct cred *old_cred;
+ struct fd real;
+ ssize_t ret;
+
+ ret = ovl_real_fdget(in, &real);
+ if (ret)
+ return ret;
+
+ old_cred = ovl_override_creds(file_inode(in)->i_sb);
+ ret = vfs_splice_read(real.file, ppos, pipe, len, flags);
+ revert_creds(old_cred);
+ ovl_file_accessed(in);
+
+ fdput(real);
+ return ret;
+}
+
/*
* Calling iter_file_splice_write() directly from overlay's f_op may deadlock
* due to lock order inversion between pipe->mutex in iter_file_splice_write()
@@ -695,7 +716,7 @@ const struct file_operations ovl_file_operations = {
.fallocate = ovl_fallocate,
.fadvise = ovl_fadvise,
.flush = ovl_flush,
- .splice_read = generic_file_splice_read,
+ .splice_read = ovl_splice_read,
.splice_write = ovl_splice_write,
.copy_file_range = ovl_copy_file_range,
diff --git a/fs/proc/inode.c b/fs/proc/inode.c
index f495fdb..711f127 100644
--- a/fs/proc/inode.c
+++ b/fs/proc/inode.c
@@ -591,7 +591,7 @@ static const struct file_operations proc_iter_file_ops = {
.llseek = proc_reg_llseek,
.read_iter = proc_reg_read_iter,
.write = proc_reg_write,
- .splice_read = generic_file_splice_read,
+ .splice_read = direct_splice_read,
.poll = proc_reg_poll,
.unlocked_ioctl = proc_reg_unlocked_ioctl,
.mmap = proc_reg_mmap,
@@ -617,7 +617,7 @@ static const struct file_operations proc_reg_file_ops_compat = {
static const struct file_operations proc_iter_file_ops_compat = {
.llseek = proc_reg_llseek,
.read_iter = proc_reg_read_iter,
- .splice_read = generic_file_splice_read,
+ .splice_read = direct_splice_read,
.write = proc_reg_write,
.poll = proc_reg_poll,
.unlocked_ioctl = proc_reg_unlocked_ioctl,
diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c
index 5851eb5..e49f996 100644
--- a/fs/proc/proc_sysctl.c
+++ b/fs/proc/proc_sysctl.c
@@ -869,7 +869,7 @@ static const struct file_operations proc_sys_file_operations = {
.poll = proc_sys_poll,
.read_iter = proc_sys_read,
.write_iter = proc_sys_write,
- .splice_read = generic_file_splice_read,
+ .splice_read = direct_splice_read,
.splice_write = iter_file_splice_write,
.llseek = default_llseek,
};
diff --git a/fs/proc_namespace.c b/fs/proc_namespace.c
index 846f945..492abbb 100644
--- a/fs/proc_namespace.c
+++ b/fs/proc_namespace.c
@@ -324,7 +324,7 @@ static int mountstats_open(struct inode *inode, struct file *file)
const struct file_operations proc_mounts_operations = {
.open = mounts_open,
.read_iter = seq_read_iter,
- .splice_read = generic_file_splice_read,
+ .splice_read = direct_splice_read,
.llseek = seq_lseek,
.release = mounts_release,
.poll = mounts_poll,
@@ -333,7 +333,7 @@ const struct file_operations proc_mounts_operations = {
const struct file_operations proc_mountinfo_operations = {
.open = mountinfo_open,
.read_iter = seq_read_iter,
- .splice_read = generic_file_splice_read,
+ .splice_read = direct_splice_read,
.llseek = seq_lseek,
.release = mounts_release,
.poll = mounts_poll,
@@ -342,7 +342,7 @@ const struct file_operations proc_mountinfo_operations = {
const struct file_operations proc_mountstats_operations = {
.open = mountstats_open,
.read_iter = seq_read_iter,
- .splice_read = generic_file_splice_read,
+ .splice_read = direct_splice_read,
.llseek = seq_lseek,
.release = mounts_release,
};
diff --git a/fs/splice.c b/fs/splice.c
index 2c3dec2..b7fd976 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -295,7 +295,7 @@ ssize_t direct_splice_read(struct file *in, loff_t *ppos,
struct kiocb kiocb;
struct page **pages;
ssize_t ret;
- size_t used, npages, chunk, remain, reclaim;
+ size_t used, npages, chunk, remain, keep = 0;
int i;
/* Work out how much data we can actually add into the pipe */
@@ -309,7 +309,7 @@ ssize_t direct_splice_read(struct file *in, loff_t *ppos,
if (!bv)
return -ENOMEM;
- pages = (void *)(bv + npages);
+ pages = (struct page **)(bv + npages);
npages = alloc_pages_bulk_array(GFP_USER, npages, pages);
if (!npages) {
kfree(bv);
@@ -332,11 +332,8 @@ ssize_t direct_splice_read(struct file *in, loff_t *ppos,
kiocb.ki_pos = *ppos;
ret = call_read_iter(in, &kiocb, &to);
- reclaim = npages * PAGE_SIZE;
- remain = 0;
if (ret > 0) {
- reclaim -= ret;
- remain = ret;
+ keep = DIV_ROUND_UP(ret, PAGE_SIZE);
*ppos = kiocb.ki_pos;
file_accessed(in);
} else if (ret < 0) {
@@ -349,14 +346,12 @@ ssize_t direct_splice_read(struct file *in, loff_t *ppos,
}
/* Free any pages that didn't get touched at all. */
- reclaim /= PAGE_SIZE;
- if (reclaim) {
- npages -= reclaim;
- release_pages(pages + npages, reclaim);
- }
+ if (keep < npages)
+ release_pages(pages + keep, npages - keep);
/* Push the remaining pages into the pipe. */
- for (i = 0; i < npages; i++) {
+ remain = ret;
+ for (i = 0; i < keep; i++) {
struct pipe_buffer *buf = pipe_head_buf(pipe);
chunk = min_t(size_t, remain, PAGE_SIZE);
@@ -392,29 +387,13 @@ ssize_t generic_file_splice_read(struct file *in, loff_t *ppos,
struct pipe_inode_info *pipe, size_t len,
unsigned int flags)
{
- struct iov_iter to;
- struct kiocb kiocb;
- int ret;
-
- iov_iter_pipe(&to, ITER_DEST, pipe, len);
- init_sync_kiocb(&kiocb, in);
- kiocb.ki_pos = *ppos;
- ret = call_read_iter(in, &kiocb, &to);
- if (ret > 0) {
- *ppos = kiocb.ki_pos;
- file_accessed(in);
- } else if (ret < 0) {
- /* free what was emitted */
- pipe_discard_from(pipe, to.start_head);
- /*
- * callers of ->splice_read() expect -EAGAIN on
- * "can't put anything in there", rather than -EFAULT.
- */
- if (ret == -EFAULT)
- ret = -EAGAIN;
- }
-
- return ret;
+ if (unlikely(*ppos >= file_inode(in)->i_sb->s_maxbytes))
+ return 0;
+ if (unlikely(!len))
+ return 0;
+ if (in->f_flags & O_DIRECT)
+ return direct_splice_read(in, ppos, pipe, len, flags);
+ return filemap_splice_read(in, ppos, pipe, len, flags);
}
EXPORT_SYMBOL(generic_file_splice_read);
@@ -856,12 +835,24 @@ static long do_splice_from(struct pipe_inode_info *pipe, struct file *out,
return out->f_op->splice_write(pipe, out, ppos, len, flags);
}
-/*
- * Attempt to initiate a splice from a file to a pipe.
+/**
+ * vfs_splice_read - Read data from a file and splice it into a pipe
+ * @in: File to splice from
+ * @ppos: Input file offset
+ * @pipe: Pipe to splice to
+ * @len: Number of bytes to splice
+ * @flags: Splice modifier flags (SPLICE_F_*)
+ *
+ * Splice the requested amount of data from the input file to the pipe. This
+ * is synchronous as the caller must hold the pipe lock across the entire
+ * operation.
+ *
+ * If successful, it returns the amount of data spliced, 0 if it hit the EOF or
+ * a hole and a negative error code otherwise.
*/
-static long do_splice_to(struct file *in, loff_t *ppos,
- struct pipe_inode_info *pipe, size_t len,
- unsigned int flags)
+long vfs_splice_read(struct file *in, loff_t *ppos,
+ struct pipe_inode_info *pipe, size_t len,
+ unsigned int flags)
{
unsigned int p_space;
int ret;
@@ -884,6 +875,7 @@ static long do_splice_to(struct file *in, loff_t *ppos,
return warn_unsupported(in, "read");
return in->f_op->splice_read(in, ppos, pipe, len, flags);
}
+EXPORT_SYMBOL_GPL(vfs_splice_read);
/**
* splice_direct_to_actor - splices data directly between two non-pipes
@@ -953,7 +945,7 @@ ssize_t splice_direct_to_actor(struct file *in, struct splice_desc *sd,
size_t read_len;
loff_t pos = sd->pos, prev_pos = pos;
- ret = do_splice_to(in, &pos, pipe, len, flags);
+ ret = vfs_splice_read(in, &pos, pipe, len, flags);
if (unlikely(ret <= 0))
goto out_release;
@@ -1101,7 +1093,7 @@ long splice_file_to_pipe(struct file *in,
pipe_lock(opipe);
ret = wait_for_space(opipe, flags);
if (!ret)
- ret = do_splice_to(in, offset, opipe, len, flags);
+ ret = vfs_splice_read(in, offset, opipe, len, flags);
pipe_unlock(opipe);
if (ret > 0)
wakeup_pipe_readers(opipe);
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 705250f..863289a 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1171,7 +1171,8 @@ xfs_file_open(
{
if (xfs_is_shutdown(XFS_M(inode->i_sb)))
return -EIO;
- file->f_mode |= FMODE_NOWAIT | FMODE_BUF_RASYNC | FMODE_BUF_WASYNC;
+ file->f_mode |= FMODE_NOWAIT | FMODE_BUF_RASYNC | FMODE_BUF_WASYNC |
+ FMODE_DIO_PARALLEL_WRITE;
return generic_file_open(inode, file);
}
diff --git a/fs/zonefs/file.c b/fs/zonefs/file.c
index 617e4f9..132f01d 100644
--- a/fs/zonefs/file.c
+++ b/fs/zonefs/file.c
@@ -382,6 +382,7 @@ static ssize_t zonefs_file_dio_append(struct kiocb *iocb, struct iov_iter *from)
struct zonefs_zone *z = zonefs_inode_zone(inode);
struct block_device *bdev = inode->i_sb->s_bdev;
unsigned int max = bdev_max_zone_append_sectors(bdev);
+ pgoff_t start, end;
struct bio *bio;
ssize_t size = 0;
int nr_pages;
@@ -390,6 +391,19 @@ static ssize_t zonefs_file_dio_append(struct kiocb *iocb, struct iov_iter *from)
max = ALIGN_DOWN(max << SECTOR_SHIFT, inode->i_sb->s_blocksize);
iov_iter_truncate(from, max);
+ /*
+ * If the inode block size (zone write granularity) is smaller than the
+ * page size, we may be appending data belonging to the last page of the
+ * inode straddling inode->i_size, with that page already cached due to
+ * a buffered read or readahead. So make sure to invalidate that page.
+ * This will always be a no-op for the case where the block size is
+ * equal to the page size.
+ */
+ start = iocb->ki_pos >> PAGE_SHIFT;
+ end = (iocb->ki_pos + iov_iter_count(from) - 1) >> PAGE_SHIFT;
+ if (invalidate_inode_pages2_range(inode->i_mapping, start, end))
+ return -EBUSY;
+
nr_pages = iov_iter_npages(from, BIO_MAX_VECS);
if (!nr_pages)
return 0;
@@ -567,11 +581,21 @@ static ssize_t zonefs_file_dio_write(struct kiocb *iocb, struct iov_iter *from)
append = sync;
}
- if (append)
+ if (append) {
ret = zonefs_file_dio_append(iocb, from);
- else
+ } else {
+ /*
+ * iomap_dio_rw() may return ENOTBLK if there was an issue with
+ * page invalidation. Overwrite that error code with EBUSY to
+ * be consistent with zonefs_file_dio_append() return value for
+ * similar issues.
+ */
ret = iomap_dio_rw(iocb, from, &zonefs_write_iomap_ops,
&zonefs_write_dio_ops, 0, NULL, 0);
+ if (ret == -ENOTBLK)
+ ret = -EBUSY;
+ }
+
if (zonefs_zone_is_seq(z) &&
(ret > 0 || ret == -EIOCBQUEUED)) {
if (ret > 0)
diff --git a/include/drm/gpu_scheduler.h b/include/drm/gpu_scheduler.h
index 9db9e5e..9935d1e 100644
--- a/include/drm/gpu_scheduler.h
+++ b/include/drm/gpu_scheduler.h
@@ -228,13 +228,6 @@ struct drm_sched_entity {
*/
struct rb_node rb_tree_node;
- /**
- * @elapsed_ns:
- *
- * Records the amount of time where jobs from this entity were active
- * on the GPU.
- */
- uint64_t elapsed_ns;
};
/**
diff --git a/include/linux/bio.h b/include/linux/bio.h
index d766be7..d8c30c7 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -229,7 +229,7 @@ static inline void bio_cnt_set(struct bio *bio, unsigned int count)
static inline bool bio_flagged(struct bio *bio, unsigned int bit)
{
- return (bio->bi_flags & (1U << bit)) != 0;
+ return bio->bi_flags & (1U << bit);
}
static inline void bio_set_flag(struct bio *bio, unsigned int bit)
@@ -488,7 +488,8 @@ void zero_fill_bio(struct bio *bio);
static inline void bio_release_pages(struct bio *bio, bool mark_dirty)
{
- if (!bio_flagged(bio, BIO_NO_PAGE_REF))
+ if (bio_flagged(bio, BIO_PAGE_REFFED) ||
+ bio_flagged(bio, BIO_PAGE_PINNED))
__bio_release_pages(bio, mark_dirty);
}
diff --git a/include/linux/blk-crypto.h b/include/linux/blk-crypto.h
index 1e3e5d0..5e5822c 100644
--- a/include/linux/blk-crypto.h
+++ b/include/linux/blk-crypto.h
@@ -95,8 +95,8 @@ int blk_crypto_init_key(struct blk_crypto_key *blk_key, const u8 *raw_key,
int blk_crypto_start_using_key(struct block_device *bdev,
const struct blk_crypto_key *key);
-int blk_crypto_evict_key(struct block_device *bdev,
- const struct blk_crypto_key *key);
+void blk_crypto_evict_key(struct block_device *bdev,
+ const struct blk_crypto_key *key);
bool blk_crypto_config_supported_natively(struct block_device *bdev,
const struct blk_crypto_config *cfg);
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index de0b0c3..06caacd 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -57,8 +57,6 @@ typedef __u32 __bitwise req_flags_t;
#define RQF_SPECIAL_PAYLOAD ((__force req_flags_t)(1 << 18))
/* The per-zone write lock is held for this request */
#define RQF_ZONE_WRITE_LOCKED ((__force req_flags_t)(1 << 19))
-/* already slept for hybrid poll */
-#define RQF_MQ_POLL_SLEPT ((__force req_flags_t)(1 << 20))
/* ->timeout has been called, don't expire again */
#define RQF_TIMED_OUT ((__force req_flags_t)(1 << 21))
/* queue has elevator attached */
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 99be590..a0e339f 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -318,7 +318,8 @@ struct bio {
* bio flags
*/
enum {
- BIO_NO_PAGE_REF, /* don't put release vec pages */
+ BIO_PAGE_PINNED, /* Unpin pages in bio_release_pages() */
+ BIO_PAGE_REFFED, /* put pages in bio_release_pages() */
BIO_CLONED, /* doesn't own data */
BIO_BOUNCED, /* bio is a bounce bio */
BIO_QUIET, /* Make BIO Quiet */
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 941304f..e3242e6 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -44,12 +44,6 @@ extern const struct device_type disk_type;
extern struct device_type part_type;
extern struct class block_class;
-/* Must be consistent with blk_mq_poll_stats_bkt() */
-#define BLK_MQ_POLL_STATS_BKTS 16
-
-/* Doing classic polling */
-#define BLK_MQ_POLL_CLASSIC -1
-
/*
* Maximum number of blkcg policies allowed to be registered concurrently.
* Defined here to simplify include dependency.
@@ -468,10 +462,6 @@ struct request_queue {
#endif
unsigned int rq_timeout;
- int poll_nsec;
-
- struct blk_stat_callback *poll_cb;
- struct blk_rq_stat *poll_stat;
struct timer_list timeout;
struct work_struct timeout_work;
@@ -870,8 +860,6 @@ blk_status_t errno_to_blk_status(int errno);
/* only poll the hardware once, don't continue until a completion was found */
#define BLK_POLL_ONESHOT (1 << 0)
-/* do not sleep to wait for the expected completion time */
-#define BLK_POLL_NOSLEEP (1 << 1)
int bio_poll(struct bio *bio, struct io_comp_batch *iob, unsigned int flags);
int iocb_bio_iopoll(struct kiocb *kiocb, struct io_comp_batch *iob,
unsigned int flags);
diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
index c6fab00..5b2f814 100644
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -218,7 +218,6 @@ enum cpuhp_state {
CPUHP_AP_PERF_X86_CQM_ONLINE,
CPUHP_AP_PERF_X86_CSTATE_ONLINE,
CPUHP_AP_PERF_X86_IDXD_ONLINE,
- CPUHP_AP_PERF_X86_IOMMU_PERF_ONLINE,
CPUHP_AP_PERF_S390_CF_ONLINE,
CPUHP_AP_PERF_S390_SF_ONLINE,
CPUHP_AP_PERF_ARM_CCI_ONLINE,
diff --git a/include/linux/fs.h b/include/linux/fs.h
index c85916e..475d886 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -168,6 +168,9 @@ typedef int (dio_iodone_t)(struct kiocb *iocb, loff_t offset,
#define FMODE_NOREUSE ((__force fmode_t)0x800000)
+/* File supports non-exclusive O_DIRECT writes from multiple threads */
+#define FMODE_DIO_PARALLEL_WRITE ((__force fmode_t)0x1000000)
+
/* File was opened by fanotify and shouldn't generate fanotify events */
#define FMODE_NONOTIFY ((__force fmode_t)0x4000000)
diff --git a/include/linux/genl_magic_func.h b/include/linux/genl_magic_func.h
index 4a4b387..2984b0c 100644
--- a/include/linux/genl_magic_func.h
+++ b/include/linux/genl_magic_func.h
@@ -209,7 +209,7 @@ static int s_name ## _from_attrs_for_change(struct s_name *s, \
* Magic: define op number to op name mapping {{{1
* {{{2
*/
-const char *CONCAT_(GENL_MAGIC_FAMILY, _genl_cmd_to_str)(__u8 cmd)
+static const char *CONCAT_(GENL_MAGIC_FAMILY, _genl_cmd_to_str)(__u8 cmd)
{
switch (cmd) {
#undef GENL_op
diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h
index 00689c1..fa621a5 100644
--- a/include/linux/io_uring_types.h
+++ b/include/linux/io_uring_types.h
@@ -188,8 +188,10 @@ struct io_ev_fd {
};
struct io_alloc_cache {
- struct hlist_head list;
+ struct io_wq_work_node list;
unsigned int nr_cached;
+ unsigned int max_cached;
+ size_t elem_size;
};
struct io_ring_ctx {
@@ -239,7 +241,6 @@ struct io_ring_ctx {
* uring_lock, and updated through io_uring_register(2)
*/
struct io_rsrc_node *rsrc_node;
- int rsrc_cached_refs;
atomic_t cancel_seq;
struct io_file_table file_table;
unsigned nr_user_files;
@@ -295,7 +296,7 @@ struct io_ring_ctx {
spinlock_t completion_lock;
bool poll_multi_queue;
- bool cq_waiting;
+ atomic_t cq_wait_nr;
/*
* ->iopoll_list is protected by the ctx->uring_lock for
@@ -330,11 +331,9 @@ struct io_ring_ctx {
struct io_rsrc_data *file_data;
struct io_rsrc_data *buf_data;
- struct delayed_work rsrc_put_work;
- struct callback_head rsrc_put_tw;
- struct llist_head rsrc_put_llist;
+ /* protected by ->uring_lock */
struct list_head rsrc_ref_list;
- spinlock_t rsrc_ref_lock;
+ struct io_alloc_cache rsrc_node_cache;
struct list_head io_buffers_pages;
@@ -366,6 +365,11 @@ struct io_ring_ctx {
unsigned evfd_last_cq_tail;
};
+struct io_tw_state {
+ /* ->uring_lock is taken, callbacks can use io_tw_lock to lock it */
+ bool locked;
+};
+
enum {
REQ_F_FIXED_FILE_BIT = IOSQE_FIXED_FILE_BIT,
REQ_F_IO_DRAIN_BIT = IOSQE_IO_DRAIN_BIT,
@@ -472,7 +476,7 @@ enum {
REQ_F_HASH_LOCKED = BIT(REQ_F_HASH_LOCKED_BIT),
};
-typedef void (*io_req_tw_func_t)(struct io_kiocb *req, bool *locked);
+typedef void (*io_req_tw_func_t)(struct io_kiocb *req, struct io_tw_state *ts);
struct io_task_work {
struct llist_node node;
@@ -562,6 +566,7 @@ struct io_kiocb {
atomic_t refs;
atomic_t poll_refs;
struct io_task_work io_task_work;
+ unsigned nr_tw;
/* for polled requests, i.e. IORING_OP_POLL_ADD and async armed poll */
union {
struct hlist_node hash_node;
diff --git a/include/linux/phy.h b/include/linux/phy.h
index 36bf0bb..db7c0bd 100644
--- a/include/linux/phy.h
+++ b/include/linux/phy.h
@@ -1547,7 +1547,7 @@ int fwnode_get_phy_id(struct fwnode_handle *fwnode, u32 *phy_id);
struct mdio_device *fwnode_mdio_find_device(struct fwnode_handle *fwnode);
struct phy_device *fwnode_phy_find_device(struct fwnode_handle *phy_fwnode);
struct phy_device *device_phy_find_device(struct device *dev);
-struct fwnode_handle *fwnode_get_phy_node(struct fwnode_handle *fwnode);
+struct fwnode_handle *fwnode_get_phy_node(const struct fwnode_handle *fwnode);
struct phy_device *get_phy_device(struct mii_bus *bus, int addr, bool is_c45);
int phy_device_register(struct phy_device *phy);
void phy_device_free(struct phy_device *phydev);
diff --git a/include/linux/sed-opal.h b/include/linux/sed-opal.h
index 31ac562a..042c1e2 100644
--- a/include/linux/sed-opal.h
+++ b/include/linux/sed-opal.h
@@ -45,6 +45,7 @@ static inline bool is_sed_ioctl(unsigned int cmd)
case IOC_OPAL_WRITE_SHADOW_MBR:
case IOC_OPAL_GENERIC_TABLE_RW:
case IOC_OPAL_GET_STATUS:
+ case IOC_OPAL_GET_LR_STATUS:
return true;
}
return false;
diff --git a/include/linux/sfp.h b/include/linux/sfp.h
index 52b98f9..ef06a19 100644
--- a/include/linux/sfp.h
+++ b/include/linux/sfp.h
@@ -557,7 +557,7 @@ int sfp_get_module_eeprom_by_page(struct sfp_bus *bus,
void sfp_upstream_start(struct sfp_bus *bus);
void sfp_upstream_stop(struct sfp_bus *bus);
void sfp_bus_put(struct sfp_bus *bus);
-struct sfp_bus *sfp_bus_find_fwnode(struct fwnode_handle *fwnode);
+struct sfp_bus *sfp_bus_find_fwnode(const struct fwnode_handle *fwnode);
int sfp_bus_add_upstream(struct sfp_bus *bus, void *upstream,
const struct sfp_upstream_ops *ops);
void sfp_bus_del_upstream(struct sfp_bus *bus);
@@ -619,7 +619,8 @@ static inline void sfp_bus_put(struct sfp_bus *bus)
{
}
-static inline struct sfp_bus *sfp_bus_find_fwnode(struct fwnode_handle *fwnode)
+static inline struct sfp_bus *
+sfp_bus_find_fwnode(const struct fwnode_handle *fwnode)
{
return NULL;
}
diff --git a/include/linux/splice.h b/include/linux/splice.h
index a55179f..8f052c3 100644
--- a/include/linux/splice.h
+++ b/include/linux/splice.h
@@ -76,6 +76,9 @@ extern ssize_t splice_to_pipe(struct pipe_inode_info *,
struct splice_pipe_desc *);
extern ssize_t add_to_pipe(struct pipe_inode_info *,
struct pipe_buffer *);
+long vfs_splice_read(struct file *in, loff_t *ppos,
+ struct pipe_inode_info *pipe, size_t len,
+ unsigned int flags);
extern ssize_t splice_direct_to_actor(struct file *, struct splice_desc *,
splice_direct_actor *);
extern long do_splice(struct file *in, loff_t *off_in,
diff --git a/include/linux/uio.h b/include/linux/uio.h
index ed35f44..de6e93d 100644
--- a/include/linux/uio.h
+++ b/include/linux/uio.h
@@ -11,7 +11,6 @@
#include <uapi/linux/uio.h>
struct page;
-struct pipe_inode_info;
typedef unsigned int __bitwise iov_iter_extraction_t;
@@ -25,7 +24,6 @@ enum iter_type {
ITER_IOVEC,
ITER_KVEC,
ITER_BVEC,
- ITER_PIPE,
ITER_XARRAY,
ITER_DISCARD,
ITER_UBUF,
@@ -45,10 +43,7 @@ struct iov_iter {
bool nofault;
bool data_source;
bool user_backed;
- union {
- size_t iov_offset;
- int last_offset;
- };
+ size_t iov_offset;
/*
* Hack alert: overlay ubuf_iovec with iovec + count, so
* that the members resolve correctly regardless of the type
@@ -73,7 +68,6 @@ struct iov_iter {
const struct kvec *kvec;
const struct bio_vec *bvec;
struct xarray *xarray;
- struct pipe_inode_info *pipe;
void __user *ubuf;
};
size_t count;
@@ -81,10 +75,6 @@ struct iov_iter {
};
union {
unsigned long nr_segs;
- struct {
- unsigned int head;
- unsigned int start_head;
- };
loff_t xarray_start;
};
};
@@ -132,11 +122,6 @@ static inline bool iov_iter_is_bvec(const struct iov_iter *i)
return iov_iter_type(i) == ITER_BVEC;
}
-static inline bool iov_iter_is_pipe(const struct iov_iter *i)
-{
- return iov_iter_type(i) == ITER_PIPE;
-}
-
static inline bool iov_iter_is_discard(const struct iov_iter *i)
{
return iov_iter_type(i) == ITER_DISCARD;
@@ -269,19 +254,11 @@ void iov_iter_kvec(struct iov_iter *i, unsigned int direction, const struct kvec
unsigned long nr_segs, size_t count);
void iov_iter_bvec(struct iov_iter *i, unsigned int direction, const struct bio_vec *bvec,
unsigned long nr_segs, size_t count);
-void iov_iter_pipe(struct iov_iter *i, unsigned int direction, struct pipe_inode_info *pipe,
- size_t count);
void iov_iter_discard(struct iov_iter *i, unsigned int direction, size_t count);
void iov_iter_xarray(struct iov_iter *i, unsigned int direction, struct xarray *xarray,
loff_t start, size_t count);
-ssize_t iov_iter_get_pages(struct iov_iter *i, struct page **pages,
- size_t maxsize, unsigned maxpages, size_t *start,
- iov_iter_extraction_t extraction_flags);
ssize_t iov_iter_get_pages2(struct iov_iter *i, struct page **pages,
size_t maxsize, unsigned maxpages, size_t *start);
-ssize_t iov_iter_get_pages_alloc(struct iov_iter *i,
- struct page ***pages, size_t maxsize, size_t *start,
- iov_iter_extraction_t extraction_flags);
ssize_t iov_iter_get_pages_alloc2(struct iov_iter *i, struct page ***pages,
size_t maxsize, size_t *start);
int iov_iter_npages(const struct iov_iter *i, int maxpages);
diff --git a/include/trace/events/f2fs.h b/include/trace/events/f2fs.h
index 1322d34..99cbc59 100644
--- a/include/trace/events/f2fs.h
+++ b/include/trace/events/f2fs.h
@@ -512,7 +512,7 @@ TRACE_EVENT(f2fs_truncate_partial_nodes,
TP_STRUCT__entry(
__field(dev_t, dev)
__field(ino_t, ino)
- __field(nid_t, nid[3])
+ __array(nid_t, nid, 3)
__field(int, depth)
__field(int, err)
),
diff --git a/include/trace/events/io_uring.h b/include/trace/events/io_uring.h
index 936fd41..69454f1 100644
--- a/include/trace/events/io_uring.h
+++ b/include/trace/events/io_uring.h
@@ -360,19 +360,18 @@ TRACE_EVENT(io_uring_complete,
);
/**
- * io_uring_submit_sqe - called before submitting one SQE
+ * io_uring_submit_req - called before submitting a request
*
* @req: pointer to a submitted request
- * @force_nonblock: whether a context blocking or not
*
* Allows to track SQE submitting, to understand what was the source of it, SQ
* thread or io_uring_enter call.
*/
-TRACE_EVENT(io_uring_submit_sqe,
+TRACE_EVENT(io_uring_submit_req,
- TP_PROTO(struct io_kiocb *req, bool force_nonblock),
+ TP_PROTO(struct io_kiocb *req),
- TP_ARGS(req, force_nonblock),
+ TP_ARGS(req),
TP_STRUCT__entry (
__field( void *, ctx )
@@ -380,7 +379,6 @@ TRACE_EVENT(io_uring_submit_sqe,
__field( unsigned long long, user_data )
__field( u8, opcode )
__field( u32, flags )
- __field( bool, force_nonblock )
__field( bool, sq_thread )
__string( op_str, io_uring_get_opcode(req->opcode) )
@@ -392,16 +390,15 @@ TRACE_EVENT(io_uring_submit_sqe,
__entry->user_data = req->cqe.user_data;
__entry->opcode = req->opcode;
__entry->flags = req->flags;
- __entry->force_nonblock = force_nonblock;
__entry->sq_thread = req->ctx->flags & IORING_SETUP_SQPOLL;
__assign_str(op_str, io_uring_get_opcode(req->opcode));
),
TP_printk("ring %p, req %p, user_data 0x%llx, opcode %s, flags 0x%x, "
- "non block %d, sq_thread %d", __entry->ctx, __entry->req,
+ "sq_thread %d", __entry->ctx, __entry->req,
__entry->user_data, __get_str(op_str),
- __entry->flags, __entry->force_nonblock, __entry->sq_thread)
+ __entry->flags, __entry->sq_thread)
);
/*
diff --git a/include/trace/events/rcu.h b/include/trace/events/rcu.h
index 90b2fb0..012fa0d 100644
--- a/include/trace/events/rcu.h
+++ b/include/trace/events/rcu.h
@@ -768,7 +768,7 @@ TRACE_EVENT_RCU(rcu_torture_read,
TP_ARGS(rcutorturename, rhp, secs, c_old, c),
TP_STRUCT__entry(
- __field(char, rcutorturename[RCUTORTURENAME_LEN])
+ __array(char, rcutorturename, RCUTORTURENAME_LEN)
__field(struct rcu_head *, rhp)
__field(unsigned long, secs)
__field(unsigned long, c_old)
diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h
index 709de6d..f8d14d1 100644
--- a/include/uapi/linux/io_uring.h
+++ b/include/uapi/linux/io_uring.h
@@ -389,6 +389,9 @@ enum {
#define IORING_OFF_SQ_RING 0ULL
#define IORING_OFF_CQ_RING 0x8000000ULL
#define IORING_OFF_SQES 0x10000000ULL
+#define IORING_OFF_PBUF_RING 0x80000000ULL
+#define IORING_OFF_PBUF_SHIFT 16
+#define IORING_OFF_MMAP_MASK 0xf8000000ULL
/*
* Filled with the offset for mmap(2)
@@ -568,19 +571,6 @@ struct io_uring_rsrc_update2 {
__u32 resv2;
};
-struct io_uring_notification_slot {
- __u64 tag;
- __u64 resv[3];
-};
-
-struct io_uring_notification_register {
- __u32 nr_slots;
- __u32 resv;
- __u64 resv2;
- __u64 data;
- __u64 resv3;
-};
-
/* Skip updating fd indexes set to this value in the fd table */
#define IORING_REGISTER_FILES_SKIP (-2)
@@ -635,12 +625,26 @@ struct io_uring_buf_ring {
};
};
+/*
+ * Flags for IORING_REGISTER_PBUF_RING.
+ *
+ * IOU_PBUF_RING_MMAP: If set, kernel will allocate the memory for the ring.
+ * The application must not set a ring_addr in struct
+ * io_uring_buf_reg, instead it must subsequently call
+ * mmap(2) with the offset set as:
+ * IORING_OFF_PBUF_RING | (bgid << IORING_OFF_PBUF_SHIFT)
+ * to get a virtual mapping for the ring.
+ */
+enum {
+ IOU_PBUF_RING_MMAP = 1,
+};
+
/* argument for IORING_(UN)REGISTER_PBUF_RING */
struct io_uring_buf_reg {
__u64 ring_addr;
__u32 ring_entries;
__u16 bgid;
- __u16 pad;
+ __u16 flags;
__u64 resv[3];
};
diff --git a/include/uapi/linux/sed-opal.h b/include/uapi/linux/sed-opal.h
index d7a1524..3905c8f 100644
--- a/include/uapi/linux/sed-opal.h
+++ b/include/uapi/linux/sed-opal.h
@@ -78,6 +78,16 @@ struct opal_user_lr_setup {
struct opal_session_info session;
};
+struct opal_lr_status {
+ struct opal_session_info session;
+ __u64 range_start;
+ __u64 range_length;
+ __u32 RLE; /* Read Lock enabled */
+ __u32 WLE; /* Write Lock Enabled */
+ __u32 l_state;
+ __u8 align[4];
+};
+
struct opal_lock_unlock {
struct opal_session_info session;
__u32 l_state;
@@ -168,5 +178,6 @@ struct opal_status {
#define IOC_OPAL_WRITE_SHADOW_MBR _IOW('p', 234, struct opal_shadow_mbr)
#define IOC_OPAL_GENERIC_TABLE_RW _IOW('p', 235, struct opal_read_write_table)
#define IOC_OPAL_GET_STATUS _IOR('p', 236, struct opal_status)
+#define IOC_OPAL_GET_LR_STATUS _IOW('p', 237, struct opal_lr_status)
#endif /* _UAPI_SED_OPAL_H */
diff --git a/io_uring/alloc_cache.h b/io_uring/alloc_cache.h
index 729793a..851a527 100644
--- a/io_uring/alloc_cache.h
+++ b/io_uring/alloc_cache.h
@@ -7,15 +7,17 @@
#define IO_ALLOC_CACHE_MAX 512
struct io_cache_entry {
- struct hlist_node node;
+ struct io_wq_work_node node;
};
static inline bool io_alloc_cache_put(struct io_alloc_cache *cache,
struct io_cache_entry *entry)
{
- if (cache->nr_cached < IO_ALLOC_CACHE_MAX) {
+ if (cache->nr_cached < cache->max_cached) {
cache->nr_cached++;
- hlist_add_head(&entry->node, &cache->list);
+ wq_stack_add_head(&entry->node, &cache->list);
+ /* KASAN poisons object */
+ kasan_slab_free_mempool(entry);
return true;
}
return false;
@@ -23,30 +25,37 @@ static inline bool io_alloc_cache_put(struct io_alloc_cache *cache,
static inline struct io_cache_entry *io_alloc_cache_get(struct io_alloc_cache *cache)
{
- if (!hlist_empty(&cache->list)) {
- struct hlist_node *node = cache->list.first;
+ if (cache->list.next) {
+ struct io_cache_entry *entry;
- hlist_del(node);
- return container_of(node, struct io_cache_entry, node);
+ entry = container_of(cache->list.next, struct io_cache_entry, node);
+ kasan_unpoison_range(entry, cache->elem_size);
+ cache->list.next = cache->list.next->next;
+ cache->nr_cached--;
+ return entry;
}
return NULL;
}
-static inline void io_alloc_cache_init(struct io_alloc_cache *cache)
+static inline void io_alloc_cache_init(struct io_alloc_cache *cache,
+ unsigned max_nr, size_t size)
{
- INIT_HLIST_HEAD(&cache->list);
+ cache->list.next = NULL;
cache->nr_cached = 0;
+ cache->max_cached = max_nr;
+ cache->elem_size = size;
}
static inline void io_alloc_cache_free(struct io_alloc_cache *cache,
void (*free)(struct io_cache_entry *))
{
- while (!hlist_empty(&cache->list)) {
- struct hlist_node *node = cache->list.first;
+ while (1) {
+ struct io_cache_entry *entry = io_alloc_cache_get(cache);
- hlist_del(node);
- free(container_of(node, struct io_cache_entry, node));
+ if (!entry)
+ break;
+ free(entry);
}
cache->nr_cached = 0;
}
diff --git a/io_uring/io-wq.c b/io_uring/io-wq.c
index f81c0a7..b271598 100644
--- a/io_uring/io-wq.c
+++ b/io_uring/io-wq.c
@@ -15,6 +15,7 @@
#include <linux/cpu.h>
#include <linux/task_work.h>
#include <linux/audit.h>
+#include <linux/mmu_context.h>
#include <uapi/linux/io_uring.h>
#include "io-wq.h"
@@ -39,7 +40,7 @@ enum {
};
/*
- * One for each thread in a wqe pool
+ * One for each thread in a wq pool
*/
struct io_worker {
refcount_t ref;
@@ -47,7 +48,7 @@ struct io_worker {
struct hlist_nulls_node nulls_node;
struct list_head all_list;
struct task_struct *task;
- struct io_wqe *wqe;
+ struct io_wq *wq;
struct io_wq_work *cur_work;
struct io_wq_work *next_work;
@@ -73,7 +74,7 @@ struct io_worker {
#define IO_WQ_NR_HASH_BUCKETS (1u << IO_WQ_HASH_ORDER)
-struct io_wqe_acct {
+struct io_wq_acct {
unsigned nr_workers;
unsigned max_workers;
int index;
@@ -90,26 +91,6 @@ enum {
};
/*
- * Per-node worker thread pool
- */
-struct io_wqe {
- raw_spinlock_t lock;
- struct io_wqe_acct acct[IO_WQ_ACCT_NR];
-
- int node;
-
- struct hlist_nulls_head free_list;
- struct list_head all_list;
-
- struct wait_queue_entry wait;
-
- struct io_wq *wq;
- struct io_wq_work *hash_tail[IO_WQ_NR_HASH_BUCKETS];
-
- cpumask_var_t cpu_mask;
-};
-
-/*
* Per io_wq state
*/
struct io_wq {
@@ -127,7 +108,19 @@ struct io_wq {
struct task_struct *task;
- struct io_wqe *wqes[];
+ struct io_wq_acct acct[IO_WQ_ACCT_NR];
+
+ /* lock protects access to elements below */
+ raw_spinlock_t lock;
+
+ struct hlist_nulls_head free_list;
+ struct list_head all_list;
+
+ struct wait_queue_entry wait;
+
+ struct io_wq_work *hash_tail[IO_WQ_NR_HASH_BUCKETS];
+
+ cpumask_var_t cpu_mask;
};
static enum cpuhp_state io_wq_online;
@@ -140,10 +133,10 @@ struct io_cb_cancel_data {
bool cancel_all;
};
-static bool create_io_worker(struct io_wq *wq, struct io_wqe *wqe, int index);
-static void io_wqe_dec_running(struct io_worker *worker);
-static bool io_acct_cancel_pending_work(struct io_wqe *wqe,
- struct io_wqe_acct *acct,
+static bool create_io_worker(struct io_wq *wq, int index);
+static void io_wq_dec_running(struct io_worker *worker);
+static bool io_acct_cancel_pending_work(struct io_wq *wq,
+ struct io_wq_acct *acct,
struct io_cb_cancel_data *match);
static void create_worker_cb(struct callback_head *cb);
static void io_wq_cancel_tw_create(struct io_wq *wq);
@@ -159,20 +152,20 @@ static void io_worker_release(struct io_worker *worker)
complete(&worker->ref_done);
}
-static inline struct io_wqe_acct *io_get_acct(struct io_wqe *wqe, bool bound)
+static inline struct io_wq_acct *io_get_acct(struct io_wq *wq, bool bound)
{
- return &wqe->acct[bound ? IO_WQ_ACCT_BOUND : IO_WQ_ACCT_UNBOUND];
+ return &wq->acct[bound ? IO_WQ_ACCT_BOUND : IO_WQ_ACCT_UNBOUND];
}
-static inline struct io_wqe_acct *io_work_get_acct(struct io_wqe *wqe,
- struct io_wq_work *work)
+static inline struct io_wq_acct *io_work_get_acct(struct io_wq *wq,
+ struct io_wq_work *work)
{
- return io_get_acct(wqe, !(work->flags & IO_WQ_WORK_UNBOUND));
+ return io_get_acct(wq, !(work->flags & IO_WQ_WORK_UNBOUND));
}
-static inline struct io_wqe_acct *io_wqe_get_acct(struct io_worker *worker)
+static inline struct io_wq_acct *io_wq_get_acct(struct io_worker *worker)
{
- return io_get_acct(worker->wqe, worker->flags & IO_WORKER_F_BOUND);
+ return io_get_acct(worker->wq, worker->flags & IO_WORKER_F_BOUND);
}
static void io_worker_ref_put(struct io_wq *wq)
@@ -183,14 +176,13 @@ static void io_worker_ref_put(struct io_wq *wq)
static void io_worker_cancel_cb(struct io_worker *worker)
{
- struct io_wqe_acct *acct = io_wqe_get_acct(worker);
- struct io_wqe *wqe = worker->wqe;
- struct io_wq *wq = wqe->wq;
+ struct io_wq_acct *acct = io_wq_get_acct(worker);
+ struct io_wq *wq = worker->wq;
atomic_dec(&acct->nr_running);
- raw_spin_lock(&worker->wqe->lock);
+ raw_spin_lock(&wq->lock);
acct->nr_workers--;
- raw_spin_unlock(&worker->wqe->lock);
+ raw_spin_unlock(&wq->lock);
io_worker_ref_put(wq);
clear_bit_unlock(0, &worker->create_state);
io_worker_release(worker);
@@ -208,8 +200,7 @@ static bool io_task_worker_match(struct callback_head *cb, void *data)
static void io_worker_exit(struct io_worker *worker)
{
- struct io_wqe *wqe = worker->wqe;
- struct io_wq *wq = wqe->wq;
+ struct io_wq *wq = worker->wq;
while (1) {
struct callback_head *cb = task_work_cancel_match(wq->task,
@@ -223,23 +214,23 @@ static void io_worker_exit(struct io_worker *worker)
io_worker_release(worker);
wait_for_completion(&worker->ref_done);
- raw_spin_lock(&wqe->lock);
+ raw_spin_lock(&wq->lock);
if (worker->flags & IO_WORKER_F_FREE)
hlist_nulls_del_rcu(&worker->nulls_node);
list_del_rcu(&worker->all_list);
- raw_spin_unlock(&wqe->lock);
- io_wqe_dec_running(worker);
+ raw_spin_unlock(&wq->lock);
+ io_wq_dec_running(worker);
worker->flags = 0;
preempt_disable();
current->flags &= ~PF_IO_WORKER;
preempt_enable();
kfree_rcu(worker, rcu);
- io_worker_ref_put(wqe->wq);
+ io_worker_ref_put(wq);
do_exit(0);
}
-static inline bool io_acct_run_queue(struct io_wqe_acct *acct)
+static inline bool io_acct_run_queue(struct io_wq_acct *acct)
{
bool ret = false;
@@ -256,8 +247,8 @@ static inline bool io_acct_run_queue(struct io_wqe_acct *acct)
* Check head of free list for an available worker. If one isn't available,
* caller must create one.
*/
-static bool io_wqe_activate_free_worker(struct io_wqe *wqe,
- struct io_wqe_acct *acct)
+static bool io_wq_activate_free_worker(struct io_wq *wq,
+ struct io_wq_acct *acct)
__must_hold(RCU)
{
struct hlist_nulls_node *n;
@@ -268,10 +259,10 @@ static bool io_wqe_activate_free_worker(struct io_wqe *wqe,
* activate. If a given worker is on the free_list but in the process
* of exiting, keep trying.
*/
- hlist_nulls_for_each_entry_rcu(worker, n, &wqe->free_list, nulls_node) {
+ hlist_nulls_for_each_entry_rcu(worker, n, &wq->free_list, nulls_node) {
if (!io_worker_get(worker))
continue;
- if (io_wqe_get_acct(worker) != acct) {
+ if (io_wq_get_acct(worker) != acct) {
io_worker_release(worker);
continue;
}
@@ -289,7 +280,7 @@ static bool io_wqe_activate_free_worker(struct io_wqe *wqe,
* We need a worker. If we find a free one, we're good. If not, and we're
* below the max number of workers, create one.
*/
-static bool io_wqe_create_worker(struct io_wqe *wqe, struct io_wqe_acct *acct)
+static bool io_wq_create_worker(struct io_wq *wq, struct io_wq_acct *acct)
{
/*
* Most likely an attempt to queue unbounded work on an io_wq that
@@ -298,21 +289,21 @@ static bool io_wqe_create_worker(struct io_wqe *wqe, struct io_wqe_acct *acct)
if (unlikely(!acct->max_workers))
pr_warn_once("io-wq is not configured for unbound workers");
- raw_spin_lock(&wqe->lock);
+ raw_spin_lock(&wq->lock);
if (acct->nr_workers >= acct->max_workers) {
- raw_spin_unlock(&wqe->lock);
+ raw_spin_unlock(&wq->lock);
return true;
}
acct->nr_workers++;
- raw_spin_unlock(&wqe->lock);
+ raw_spin_unlock(&wq->lock);
atomic_inc(&acct->nr_running);
- atomic_inc(&wqe->wq->worker_refs);
- return create_io_worker(wqe->wq, wqe, acct->index);
+ atomic_inc(&wq->worker_refs);
+ return create_io_worker(wq, acct->index);
}
-static void io_wqe_inc_running(struct io_worker *worker)
+static void io_wq_inc_running(struct io_worker *worker)
{
- struct io_wqe_acct *acct = io_wqe_get_acct(worker);
+ struct io_wq_acct *acct = io_wq_get_acct(worker);
atomic_inc(&acct->nr_running);
}
@@ -321,22 +312,22 @@ static void create_worker_cb(struct callback_head *cb)
{
struct io_worker *worker;
struct io_wq *wq;
- struct io_wqe *wqe;
- struct io_wqe_acct *acct;
+
+ struct io_wq_acct *acct;
bool do_create = false;
worker = container_of(cb, struct io_worker, create_work);
- wqe = worker->wqe;
- wq = wqe->wq;
- acct = &wqe->acct[worker->create_index];
- raw_spin_lock(&wqe->lock);
+ wq = worker->wq;
+ acct = &wq->acct[worker->create_index];
+ raw_spin_lock(&wq->lock);
+
if (acct->nr_workers < acct->max_workers) {
acct->nr_workers++;
do_create = true;
}
- raw_spin_unlock(&wqe->lock);
+ raw_spin_unlock(&wq->lock);
if (do_create) {
- create_io_worker(wq, wqe, worker->create_index);
+ create_io_worker(wq, worker->create_index);
} else {
atomic_dec(&acct->nr_running);
io_worker_ref_put(wq);
@@ -346,11 +337,10 @@ static void create_worker_cb(struct callback_head *cb)
}
static bool io_queue_worker_create(struct io_worker *worker,
- struct io_wqe_acct *acct,
+ struct io_wq_acct *acct,
task_work_func_t func)
{
- struct io_wqe *wqe = worker->wqe;
- struct io_wq *wq = wqe->wq;
+ struct io_wq *wq = worker->wq;
/* raced with exit, just ignore create call */
if (test_bit(IO_WQ_BIT_EXIT, &wq->state))
@@ -392,10 +382,10 @@ static bool io_queue_worker_create(struct io_worker *worker,
return false;
}
-static void io_wqe_dec_running(struct io_worker *worker)
+static void io_wq_dec_running(struct io_worker *worker)
{
- struct io_wqe_acct *acct = io_wqe_get_acct(worker);
- struct io_wqe *wqe = worker->wqe;
+ struct io_wq_acct *acct = io_wq_get_acct(worker);
+ struct io_wq *wq = worker->wq;
if (!(worker->flags & IO_WORKER_F_UP))
return;
@@ -406,7 +396,7 @@ static void io_wqe_dec_running(struct io_worker *worker)
return;
atomic_inc(&acct->nr_running);
- atomic_inc(&wqe->wq->worker_refs);
+ atomic_inc(&wq->worker_refs);
io_queue_worker_create(worker, acct, create_worker_cb);
}
@@ -414,29 +404,25 @@ static void io_wqe_dec_running(struct io_worker *worker)
* Worker will start processing some work. Move it to the busy list, if
* it's currently on the freelist
*/
-static void __io_worker_busy(struct io_wqe *wqe, struct io_worker *worker)
+static void __io_worker_busy(struct io_wq *wq, struct io_worker *worker)
{
if (worker->flags & IO_WORKER_F_FREE) {
worker->flags &= ~IO_WORKER_F_FREE;
- raw_spin_lock(&wqe->lock);
+ raw_spin_lock(&wq->lock);
hlist_nulls_del_init_rcu(&worker->nulls_node);
- raw_spin_unlock(&wqe->lock);
+ raw_spin_unlock(&wq->lock);
}
}
/*
- * No work, worker going to sleep. Move to freelist, and unuse mm if we
- * have one attached. Dropping the mm may potentially sleep, so we drop
- * the lock in that case and return success. Since the caller has to
- * retry the loop in that case (we changed task state), we don't regrab
- * the lock if we return success.
+ * No work, worker going to sleep. Move to freelist.
*/
-static void __io_worker_idle(struct io_wqe *wqe, struct io_worker *worker)
- __must_hold(wqe->lock)
+static void __io_worker_idle(struct io_wq *wq, struct io_worker *worker)
+ __must_hold(wq->lock)
{
if (!(worker->flags & IO_WORKER_F_FREE)) {
worker->flags |= IO_WORKER_F_FREE;
- hlist_nulls_add_head_rcu(&worker->nulls_node, &wqe->free_list);
+ hlist_nulls_add_head_rcu(&worker->nulls_node, &wq->free_list);
}
}
@@ -445,17 +431,16 @@ static inline unsigned int io_get_work_hash(struct io_wq_work *work)
return work->flags >> IO_WQ_HASH_SHIFT;
}
-static bool io_wait_on_hash(struct io_wqe *wqe, unsigned int hash)
+static bool io_wait_on_hash(struct io_wq *wq, unsigned int hash)
{
- struct io_wq *wq = wqe->wq;
bool ret = false;
spin_lock_irq(&wq->hash->wait.lock);
- if (list_empty(&wqe->wait.entry)) {
- __add_wait_queue(&wq->hash->wait, &wqe->wait);
+ if (list_empty(&wq->wait.entry)) {
+ __add_wait_queue(&wq->hash->wait, &wq->wait);
if (!test_bit(hash, &wq->hash->map)) {
__set_current_state(TASK_RUNNING);
- list_del_init(&wqe->wait.entry);
+ list_del_init(&wq->wait.entry);
ret = true;
}
}
@@ -463,14 +448,14 @@ static bool io_wait_on_hash(struct io_wqe *wqe, unsigned int hash)
return ret;
}
-static struct io_wq_work *io_get_next_work(struct io_wqe_acct *acct,
+static struct io_wq_work *io_get_next_work(struct io_wq_acct *acct,
struct io_worker *worker)
__must_hold(acct->lock)
{
struct io_wq_work_node *node, *prev;
struct io_wq_work *work, *tail;
unsigned int stall_hash = -1U;
- struct io_wqe *wqe = worker->wqe;
+ struct io_wq *wq = worker->wq;
wq_list_for_each(node, prev, &acct->work_list) {
unsigned int hash;
@@ -485,11 +470,11 @@ static struct io_wq_work *io_get_next_work(struct io_wqe_acct *acct,
hash = io_get_work_hash(work);
/* all items with this hash lie in [work, tail] */
- tail = wqe->hash_tail[hash];
+ tail = wq->hash_tail[hash];
/* hashed, can run if not already running */
- if (!test_and_set_bit(hash, &wqe->wq->hash->map)) {
- wqe->hash_tail[hash] = NULL;
+ if (!test_and_set_bit(hash, &wq->hash->map)) {
+ wq->hash_tail[hash] = NULL;
wq_list_cut(&acct->work_list, &tail->list, prev);
return work;
}
@@ -508,12 +493,12 @@ static struct io_wq_work *io_get_next_work(struct io_wqe_acct *acct,
*/
set_bit(IO_ACCT_STALLED_BIT, &acct->flags);
raw_spin_unlock(&acct->lock);
- unstalled = io_wait_on_hash(wqe, stall_hash);
+ unstalled = io_wait_on_hash(wq, stall_hash);
raw_spin_lock(&acct->lock);
if (unstalled) {
clear_bit(IO_ACCT_STALLED_BIT, &acct->flags);
- if (wq_has_sleeper(&wqe->wq->hash->wait))
- wake_up(&wqe->wq->hash->wait);
+ if (wq_has_sleeper(&wq->hash->wait))
+ wake_up(&wq->hash->wait);
}
}
@@ -534,13 +519,10 @@ static void io_assign_current_work(struct io_worker *worker,
raw_spin_unlock(&worker->lock);
}
-static void io_wqe_enqueue(struct io_wqe *wqe, struct io_wq_work *work);
-
static void io_worker_handle_work(struct io_worker *worker)
{
- struct io_wqe_acct *acct = io_wqe_get_acct(worker);
- struct io_wqe *wqe = worker->wqe;
- struct io_wq *wq = wqe->wq;
+ struct io_wq_acct *acct = io_wq_get_acct(worker);
+ struct io_wq *wq = worker->wq;
bool do_kill = test_bit(IO_WQ_BIT_EXIT, &wq->state);
do {
@@ -557,7 +539,7 @@ static void io_worker_handle_work(struct io_worker *worker)
work = io_get_next_work(acct, worker);
raw_spin_unlock(&acct->lock);
if (work) {
- __io_worker_busy(wqe, worker);
+ __io_worker_busy(wq, worker);
/*
* Make sure cancelation can find this, even before
@@ -595,7 +577,7 @@ static void io_worker_handle_work(struct io_worker *worker)
}
io_assign_current_work(worker, work);
if (linked)
- io_wqe_enqueue(wqe, linked);
+ io_wq_enqueue(wq, linked);
if (hash != -1U && !next_hashed) {
/* serialize hash clear with wake_up() */
@@ -610,12 +592,11 @@ static void io_worker_handle_work(struct io_worker *worker)
} while (1);
}
-static int io_wqe_worker(void *data)
+static int io_wq_worker(void *data)
{
struct io_worker *worker = data;
- struct io_wqe_acct *acct = io_wqe_get_acct(worker);
- struct io_wqe *wqe = worker->wqe;
- struct io_wq *wq = wqe->wq;
+ struct io_wq_acct *acct = io_wq_get_acct(worker);
+ struct io_wq *wq = worker->wq;
bool exit_mask = false, last_timeout = false;
char buf[TASK_COMM_LEN];
@@ -631,20 +612,20 @@ static int io_wqe_worker(void *data)
while (io_acct_run_queue(acct))
io_worker_handle_work(worker);
- raw_spin_lock(&wqe->lock);
+ raw_spin_lock(&wq->lock);
/*
* Last sleep timed out. Exit if we're not the last worker,
* or if someone modified our affinity.
*/
if (last_timeout && (exit_mask || acct->nr_workers > 1)) {
acct->nr_workers--;
- raw_spin_unlock(&wqe->lock);
+ raw_spin_unlock(&wq->lock);
__set_current_state(TASK_RUNNING);
break;
}
last_timeout = false;
- __io_worker_idle(wqe, worker);
- raw_spin_unlock(&wqe->lock);
+ __io_worker_idle(wq, worker);
+ raw_spin_unlock(&wq->lock);
if (io_run_task_work())
continue;
ret = schedule_timeout(WORKER_IDLE_TIMEOUT);
@@ -658,7 +639,7 @@ static int io_wqe_worker(void *data)
if (!ret) {
last_timeout = true;
exit_mask = !cpumask_test_cpu(raw_smp_processor_id(),
- wqe->cpu_mask);
+ wq->cpu_mask);
}
}
@@ -683,7 +664,7 @@ void io_wq_worker_running(struct task_struct *tsk)
if (worker->flags & IO_WORKER_F_RUNNING)
return;
worker->flags |= IO_WORKER_F_RUNNING;
- io_wqe_inc_running(worker);
+ io_wq_inc_running(worker);
}
/*
@@ -702,21 +683,21 @@ void io_wq_worker_sleeping(struct task_struct *tsk)
return;
worker->flags &= ~IO_WORKER_F_RUNNING;
- io_wqe_dec_running(worker);
+ io_wq_dec_running(worker);
}
-static void io_init_new_worker(struct io_wqe *wqe, struct io_worker *worker,
+static void io_init_new_worker(struct io_wq *wq, struct io_worker *worker,
struct task_struct *tsk)
{
tsk->worker_private = worker;
worker->task = tsk;
- set_cpus_allowed_ptr(tsk, wqe->cpu_mask);
+ set_cpus_allowed_ptr(tsk, wq->cpu_mask);
- raw_spin_lock(&wqe->lock);
- hlist_nulls_add_head_rcu(&worker->nulls_node, &wqe->free_list);
- list_add_tail_rcu(&worker->all_list, &wqe->all_list);
+ raw_spin_lock(&wq->lock);
+ hlist_nulls_add_head_rcu(&worker->nulls_node, &wq->free_list);
+ list_add_tail_rcu(&worker->all_list, &wq->all_list);
worker->flags |= IO_WORKER_F_FREE;
- raw_spin_unlock(&wqe->lock);
+ raw_spin_unlock(&wq->lock);
wake_up_new_task(tsk);
}
@@ -749,21 +730,21 @@ static void create_worker_cont(struct callback_head *cb)
{
struct io_worker *worker;
struct task_struct *tsk;
- struct io_wqe *wqe;
+ struct io_wq *wq;
worker = container_of(cb, struct io_worker, create_work);
clear_bit_unlock(0, &worker->create_state);
- wqe = worker->wqe;
- tsk = create_io_thread(io_wqe_worker, worker, wqe->node);
+ wq = worker->wq;
+ tsk = create_io_thread(io_wq_worker, worker, NUMA_NO_NODE);
if (!IS_ERR(tsk)) {
- io_init_new_worker(wqe, worker, tsk);
+ io_init_new_worker(wq, worker, tsk);
io_worker_release(worker);
return;
} else if (!io_should_retry_thread(PTR_ERR(tsk))) {
- struct io_wqe_acct *acct = io_wqe_get_acct(worker);
+ struct io_wq_acct *acct = io_wq_get_acct(worker);
atomic_dec(&acct->nr_running);
- raw_spin_lock(&wqe->lock);
+ raw_spin_lock(&wq->lock);
acct->nr_workers--;
if (!acct->nr_workers) {
struct io_cb_cancel_data match = {
@@ -771,13 +752,13 @@ static void create_worker_cont(struct callback_head *cb)
.cancel_all = true,
};
- raw_spin_unlock(&wqe->lock);
- while (io_acct_cancel_pending_work(wqe, acct, &match))
+ raw_spin_unlock(&wq->lock);
+ while (io_acct_cancel_pending_work(wq, acct, &match))
;
} else {
- raw_spin_unlock(&wqe->lock);
+ raw_spin_unlock(&wq->lock);
}
- io_worker_ref_put(wqe->wq);
+ io_worker_ref_put(wq);
kfree(worker);
return;
}
@@ -790,42 +771,42 @@ static void create_worker_cont(struct callback_head *cb)
static void io_workqueue_create(struct work_struct *work)
{
struct io_worker *worker = container_of(work, struct io_worker, work);
- struct io_wqe_acct *acct = io_wqe_get_acct(worker);
+ struct io_wq_acct *acct = io_wq_get_acct(worker);
if (!io_queue_worker_create(worker, acct, create_worker_cont))
kfree(worker);
}
-static bool create_io_worker(struct io_wq *wq, struct io_wqe *wqe, int index)
+static bool create_io_worker(struct io_wq *wq, int index)
{
- struct io_wqe_acct *acct = &wqe->acct[index];
+ struct io_wq_acct *acct = &wq->acct[index];
struct io_worker *worker;
struct task_struct *tsk;
__set_current_state(TASK_RUNNING);
- worker = kzalloc_node(sizeof(*worker), GFP_KERNEL, wqe->node);
+ worker = kzalloc(sizeof(*worker), GFP_KERNEL);
if (!worker) {
fail:
atomic_dec(&acct->nr_running);
- raw_spin_lock(&wqe->lock);
+ raw_spin_lock(&wq->lock);
acct->nr_workers--;
- raw_spin_unlock(&wqe->lock);
+ raw_spin_unlock(&wq->lock);
io_worker_ref_put(wq);
return false;
}
refcount_set(&worker->ref, 1);
- worker->wqe = wqe;
+ worker->wq = wq;
raw_spin_lock_init(&worker->lock);
init_completion(&worker->ref_done);
if (index == IO_WQ_ACCT_BOUND)
worker->flags |= IO_WORKER_F_BOUND;
- tsk = create_io_thread(io_wqe_worker, worker, wqe->node);
+ tsk = create_io_thread(io_wq_worker, worker, NUMA_NO_NODE);
if (!IS_ERR(tsk)) {
- io_init_new_worker(wqe, worker, tsk);
+ io_init_new_worker(wq, worker, tsk);
} else if (!io_should_retry_thread(PTR_ERR(tsk))) {
kfree(worker);
goto fail;
@@ -841,14 +822,14 @@ static bool create_io_worker(struct io_wq *wq, struct io_wqe *wqe, int index)
* Iterate the passed in list and call the specific function for each
* worker that isn't exiting
*/
-static bool io_wq_for_each_worker(struct io_wqe *wqe,
+static bool io_wq_for_each_worker(struct io_wq *wq,
bool (*func)(struct io_worker *, void *),
void *data)
{
struct io_worker *worker;
bool ret = false;
- list_for_each_entry_rcu(worker, &wqe->all_list, all_list) {
+ list_for_each_entry_rcu(worker, &wq->all_list, all_list) {
if (io_worker_get(worker)) {
/* no task if node is/was offline */
if (worker->task)
@@ -869,10 +850,8 @@ static bool io_wq_worker_wake(struct io_worker *worker, void *data)
return false;
}
-static void io_run_cancel(struct io_wq_work *work, struct io_wqe *wqe)
+static void io_run_cancel(struct io_wq_work *work, struct io_wq *wq)
{
- struct io_wq *wq = wqe->wq;
-
do {
work->flags |= IO_WQ_WORK_CANCEL;
wq->do_work(work);
@@ -880,9 +859,9 @@ static void io_run_cancel(struct io_wq_work *work, struct io_wqe *wqe)
} while (work);
}
-static void io_wqe_insert_work(struct io_wqe *wqe, struct io_wq_work *work)
+static void io_wq_insert_work(struct io_wq *wq, struct io_wq_work *work)
{
- struct io_wqe_acct *acct = io_work_get_acct(wqe, work);
+ struct io_wq_acct *acct = io_work_get_acct(wq, work);
unsigned int hash;
struct io_wq_work *tail;
@@ -893,8 +872,8 @@ static void io_wqe_insert_work(struct io_wqe *wqe, struct io_wq_work *work)
}
hash = io_get_work_hash(work);
- tail = wqe->hash_tail[hash];
- wqe->hash_tail[hash] = work;
+ tail = wq->hash_tail[hash];
+ wq->hash_tail[hash] = work;
if (!tail)
goto append;
@@ -906,9 +885,9 @@ static bool io_wq_work_match_item(struct io_wq_work *work, void *data)
return work == data;
}
-static void io_wqe_enqueue(struct io_wqe *wqe, struct io_wq_work *work)
+void io_wq_enqueue(struct io_wq *wq, struct io_wq_work *work)
{
- struct io_wqe_acct *acct = io_work_get_acct(wqe, work);
+ struct io_wq_acct *acct = io_work_get_acct(wq, work);
struct io_cb_cancel_data match;
unsigned work_flags = work->flags;
bool do_create;
@@ -917,55 +896,48 @@ static void io_wqe_enqueue(struct io_wqe *wqe, struct io_wq_work *work)
* If io-wq is exiting for this task, or if the request has explicitly
* been marked as one that should not get executed, cancel it here.
*/
- if (test_bit(IO_WQ_BIT_EXIT, &wqe->wq->state) ||
+ if (test_bit(IO_WQ_BIT_EXIT, &wq->state) ||
(work->flags & IO_WQ_WORK_CANCEL)) {
- io_run_cancel(work, wqe);
+ io_run_cancel(work, wq);
return;
}
raw_spin_lock(&acct->lock);
- io_wqe_insert_work(wqe, work);
+ io_wq_insert_work(wq, work);
clear_bit(IO_ACCT_STALLED_BIT, &acct->flags);
raw_spin_unlock(&acct->lock);
- raw_spin_lock(&wqe->lock);
+ raw_spin_lock(&wq->lock);
rcu_read_lock();
- do_create = !io_wqe_activate_free_worker(wqe, acct);
+ do_create = !io_wq_activate_free_worker(wq, acct);
rcu_read_unlock();
- raw_spin_unlock(&wqe->lock);
+ raw_spin_unlock(&wq->lock);
if (do_create && ((work_flags & IO_WQ_WORK_CONCURRENT) ||
!atomic_read(&acct->nr_running))) {
bool did_create;
- did_create = io_wqe_create_worker(wqe, acct);
+ did_create = io_wq_create_worker(wq, acct);
if (likely(did_create))
return;
- raw_spin_lock(&wqe->lock);
+ raw_spin_lock(&wq->lock);
if (acct->nr_workers) {
- raw_spin_unlock(&wqe->lock);
+ raw_spin_unlock(&wq->lock);
return;
}
- raw_spin_unlock(&wqe->lock);
+ raw_spin_unlock(&wq->lock);
/* fatal condition, failed to create the first worker */
match.fn = io_wq_work_match_item,
match.data = work,
match.cancel_all = false,
- io_acct_cancel_pending_work(wqe, acct, &match);
+ io_acct_cancel_pending_work(wq, acct, &match);
}
}
-void io_wq_enqueue(struct io_wq *wq, struct io_wq_work *work)
-{
- struct io_wqe *wqe = wq->wqes[numa_node_id()];
-
- io_wqe_enqueue(wqe, work);
-}
-
/*
* Work items that hash to the same value will not be done in parallel.
* Used to limit concurrent writes, generally hashed by inode.
@@ -1008,27 +980,27 @@ static bool io_wq_worker_cancel(struct io_worker *worker, void *data)
return match->nr_running && !match->cancel_all;
}
-static inline void io_wqe_remove_pending(struct io_wqe *wqe,
+static inline void io_wq_remove_pending(struct io_wq *wq,
struct io_wq_work *work,
struct io_wq_work_node *prev)
{
- struct io_wqe_acct *acct = io_work_get_acct(wqe, work);
+ struct io_wq_acct *acct = io_work_get_acct(wq, work);
unsigned int hash = io_get_work_hash(work);
struct io_wq_work *prev_work = NULL;
- if (io_wq_is_hashed(work) && work == wqe->hash_tail[hash]) {
+ if (io_wq_is_hashed(work) && work == wq->hash_tail[hash]) {
if (prev)
prev_work = container_of(prev, struct io_wq_work, list);
if (prev_work && io_get_work_hash(prev_work) == hash)
- wqe->hash_tail[hash] = prev_work;
+ wq->hash_tail[hash] = prev_work;
else
- wqe->hash_tail[hash] = NULL;
+ wq->hash_tail[hash] = NULL;
}
wq_list_del(&acct->work_list, &work->list, prev);
}
-static bool io_acct_cancel_pending_work(struct io_wqe *wqe,
- struct io_wqe_acct *acct,
+static bool io_acct_cancel_pending_work(struct io_wq *wq,
+ struct io_wq_acct *acct,
struct io_cb_cancel_data *match)
{
struct io_wq_work_node *node, *prev;
@@ -1039,9 +1011,9 @@ static bool io_acct_cancel_pending_work(struct io_wqe *wqe,
work = container_of(node, struct io_wq_work, list);
if (!match->fn(work, match->data))
continue;
- io_wqe_remove_pending(wqe, work, prev);
+ io_wq_remove_pending(wq, work, prev);
raw_spin_unlock(&acct->lock);
- io_run_cancel(work, wqe);
+ io_run_cancel(work, wq);
match->nr_pending++;
/* not safe to continue after unlock */
return true;
@@ -1051,15 +1023,15 @@ static bool io_acct_cancel_pending_work(struct io_wqe *wqe,
return false;
}
-static void io_wqe_cancel_pending_work(struct io_wqe *wqe,
- struct io_cb_cancel_data *match)
+static void io_wq_cancel_pending_work(struct io_wq *wq,
+ struct io_cb_cancel_data *match)
{
int i;
retry:
for (i = 0; i < IO_WQ_ACCT_NR; i++) {
- struct io_wqe_acct *acct = io_get_acct(wqe, i == 0);
+ struct io_wq_acct *acct = io_get_acct(wq, i == 0);
- if (io_acct_cancel_pending_work(wqe, acct, match)) {
+ if (io_acct_cancel_pending_work(wq, acct, match)) {
if (match->cancel_all)
goto retry;
break;
@@ -1067,11 +1039,11 @@ static void io_wqe_cancel_pending_work(struct io_wqe *wqe,
}
}
-static void io_wqe_cancel_running_work(struct io_wqe *wqe,
+static void io_wq_cancel_running_work(struct io_wq *wq,
struct io_cb_cancel_data *match)
{
rcu_read_lock();
- io_wq_for_each_worker(wqe, io_wq_worker_cancel, match);
+ io_wq_for_each_worker(wq, io_wq_worker_cancel, match);
rcu_read_unlock();
}
@@ -1083,7 +1055,6 @@ enum io_wq_cancel io_wq_cancel_cb(struct io_wq *wq, work_cancel_fn *cancel,
.data = data,
.cancel_all = cancel_all,
};
- int node;
/*
* First check pending list, if we're lucky we can just remove it
@@ -1095,22 +1066,18 @@ enum io_wq_cancel io_wq_cancel_cb(struct io_wq *wq, work_cancel_fn *cancel,
* as an indication that we attempt to signal cancellation. The
* completion will run normally in this case.
*
- * Do both of these while holding the wqe->lock, to ensure that
+ * Do both of these while holding the wq->lock, to ensure that
* we'll find a work item regardless of state.
*/
- for_each_node(node) {
- struct io_wqe *wqe = wq->wqes[node];
+ io_wq_cancel_pending_work(wq, &match);
+ if (match.nr_pending && !match.cancel_all)
+ return IO_WQ_CANCEL_OK;
- io_wqe_cancel_pending_work(wqe, &match);
- if (match.nr_pending && !match.cancel_all)
- return IO_WQ_CANCEL_OK;
-
- raw_spin_lock(&wqe->lock);
- io_wqe_cancel_running_work(wqe, &match);
- raw_spin_unlock(&wqe->lock);
- if (match.nr_running && !match.cancel_all)
- return IO_WQ_CANCEL_RUNNING;
- }
+ raw_spin_lock(&wq->lock);
+ io_wq_cancel_running_work(wq, &match);
+ raw_spin_unlock(&wq->lock);
+ if (match.nr_running && !match.cancel_all)
+ return IO_WQ_CANCEL_RUNNING;
if (match.nr_running)
return IO_WQ_CANCEL_RUNNING;
@@ -1119,20 +1086,20 @@ enum io_wq_cancel io_wq_cancel_cb(struct io_wq *wq, work_cancel_fn *cancel,
return IO_WQ_CANCEL_NOTFOUND;
}
-static int io_wqe_hash_wake(struct wait_queue_entry *wait, unsigned mode,
+static int io_wq_hash_wake(struct wait_queue_entry *wait, unsigned mode,
int sync, void *key)
{
- struct io_wqe *wqe = container_of(wait, struct io_wqe, wait);
+ struct io_wq *wq = container_of(wait, struct io_wq, wait);
int i;
list_del_init(&wait->entry);
rcu_read_lock();
for (i = 0; i < IO_WQ_ACCT_NR; i++) {
- struct io_wqe_acct *acct = &wqe->acct[i];
+ struct io_wq_acct *acct = &wq->acct[i];
if (test_and_clear_bit(IO_ACCT_STALLED_BIT, &acct->flags))
- io_wqe_activate_free_worker(wqe, acct);
+ io_wq_activate_free_worker(wq, acct);
}
rcu_read_unlock();
return 1;
@@ -1140,7 +1107,7 @@ static int io_wqe_hash_wake(struct wait_queue_entry *wait, unsigned mode,
struct io_wq *io_wq_create(unsigned bounded, struct io_wq_data *data)
{
- int ret, node, i;
+ int ret, i;
struct io_wq *wq;
if (WARN_ON_ONCE(!data->free_work || !data->do_work))
@@ -1148,7 +1115,7 @@ struct io_wq *io_wq_create(unsigned bounded, struct io_wq_data *data)
if (WARN_ON_ONCE(!bounded))
return ERR_PTR(-EINVAL);
- wq = kzalloc(struct_size(wq, wqes, nr_node_ids), GFP_KERNEL);
+ wq = kzalloc(sizeof(struct io_wq), GFP_KERNEL);
if (!wq)
return ERR_PTR(-ENOMEM);
ret = cpuhp_state_add_instance_nocalls(io_wq_online, &wq->cpuhp_node);
@@ -1161,39 +1128,28 @@ struct io_wq *io_wq_create(unsigned bounded, struct io_wq_data *data)
wq->do_work = data->do_work;
ret = -ENOMEM;
- for_each_node(node) {
- struct io_wqe *wqe;
- int alloc_node = node;
- if (!node_online(alloc_node))
- alloc_node = NUMA_NO_NODE;
- wqe = kzalloc_node(sizeof(struct io_wqe), GFP_KERNEL, alloc_node);
- if (!wqe)
- goto err;
- wq->wqes[node] = wqe;
- if (!alloc_cpumask_var(&wqe->cpu_mask, GFP_KERNEL))
- goto err;
- cpumask_copy(wqe->cpu_mask, cpumask_of_node(node));
- wqe->node = alloc_node;
- wqe->acct[IO_WQ_ACCT_BOUND].max_workers = bounded;
- wqe->acct[IO_WQ_ACCT_UNBOUND].max_workers =
- task_rlimit(current, RLIMIT_NPROC);
- INIT_LIST_HEAD(&wqe->wait.entry);
- wqe->wait.func = io_wqe_hash_wake;
- for (i = 0; i < IO_WQ_ACCT_NR; i++) {
- struct io_wqe_acct *acct = &wqe->acct[i];
+ if (!alloc_cpumask_var(&wq->cpu_mask, GFP_KERNEL))
+ goto err;
+ cpumask_copy(wq->cpu_mask, cpu_possible_mask);
+ wq->acct[IO_WQ_ACCT_BOUND].max_workers = bounded;
+ wq->acct[IO_WQ_ACCT_UNBOUND].max_workers =
+ task_rlimit(current, RLIMIT_NPROC);
+ INIT_LIST_HEAD(&wq->wait.entry);
+ wq->wait.func = io_wq_hash_wake;
+ for (i = 0; i < IO_WQ_ACCT_NR; i++) {
+ struct io_wq_acct *acct = &wq->acct[i];
- acct->index = i;
- atomic_set(&acct->nr_running, 0);
- INIT_WQ_LIST(&acct->work_list);
- raw_spin_lock_init(&acct->lock);
- }
- wqe->wq = wq;
- raw_spin_lock_init(&wqe->lock);
- INIT_HLIST_NULLS_HEAD(&wqe->free_list, 0);
- INIT_LIST_HEAD(&wqe->all_list);
+ acct->index = i;
+ atomic_set(&acct->nr_running, 0);
+ INIT_WQ_LIST(&acct->work_list);
+ raw_spin_lock_init(&acct->lock);
}
+ raw_spin_lock_init(&wq->lock);
+ INIT_HLIST_NULLS_HEAD(&wq->free_list, 0);
+ INIT_LIST_HEAD(&wq->all_list);
+
wq->task = get_task_struct(data->task);
atomic_set(&wq->worker_refs, 1);
init_completion(&wq->worker_done);
@@ -1201,12 +1157,8 @@ struct io_wq *io_wq_create(unsigned bounded, struct io_wq_data *data)
err:
io_wq_put_hash(data->hash);
cpuhp_state_remove_instance_nocalls(io_wq_online, &wq->cpuhp_node);
- for_each_node(node) {
- if (!wq->wqes[node])
- continue;
- free_cpumask_var(wq->wqes[node]->cpu_mask);
- kfree(wq->wqes[node]);
- }
+
+ free_cpumask_var(wq->cpu_mask);
err_wq:
kfree(wq);
return ERR_PTR(ret);
@@ -1219,7 +1171,7 @@ static bool io_task_work_match(struct callback_head *cb, void *data)
if (cb->func != create_worker_cb && cb->func != create_worker_cont)
return false;
worker = container_of(cb, struct io_worker, create_work);
- return worker->wqe->wq == data;
+ return worker->wq == data;
}
void io_wq_exit_start(struct io_wq *wq)
@@ -1247,48 +1199,35 @@ static void io_wq_cancel_tw_create(struct io_wq *wq)
static void io_wq_exit_workers(struct io_wq *wq)
{
- int node;
-
if (!wq->task)
return;
io_wq_cancel_tw_create(wq);
rcu_read_lock();
- for_each_node(node) {
- struct io_wqe *wqe = wq->wqes[node];
-
- io_wq_for_each_worker(wqe, io_wq_worker_wake, NULL);
- }
+ io_wq_for_each_worker(wq, io_wq_worker_wake, NULL);
rcu_read_unlock();
io_worker_ref_put(wq);
wait_for_completion(&wq->worker_done);
- for_each_node(node) {
- spin_lock_irq(&wq->hash->wait.lock);
- list_del_init(&wq->wqes[node]->wait.entry);
- spin_unlock_irq(&wq->hash->wait.lock);
- }
+ spin_lock_irq(&wq->hash->wait.lock);
+ list_del_init(&wq->wait.entry);
+ spin_unlock_irq(&wq->hash->wait.lock);
+
put_task_struct(wq->task);
wq->task = NULL;
}
static void io_wq_destroy(struct io_wq *wq)
{
- int node;
+ struct io_cb_cancel_data match = {
+ .fn = io_wq_work_match_all,
+ .cancel_all = true,
+ };
cpuhp_state_remove_instance_nocalls(io_wq_online, &wq->cpuhp_node);
-
- for_each_node(node) {
- struct io_wqe *wqe = wq->wqes[node];
- struct io_cb_cancel_data match = {
- .fn = io_wq_work_match_all,
- .cancel_all = true,
- };
- io_wqe_cancel_pending_work(wqe, &match);
- free_cpumask_var(wqe->cpu_mask);
- kfree(wqe);
- }
+ io_wq_cancel_pending_work(wq, &match);
+ free_cpumask_var(wq->cpu_mask);
io_wq_put_hash(wq->hash);
kfree(wq);
}
@@ -1311,9 +1250,9 @@ static bool io_wq_worker_affinity(struct io_worker *worker, void *data)
struct online_data *od = data;
if (od->online)
- cpumask_set_cpu(od->cpu, worker->wqe->cpu_mask);
+ cpumask_set_cpu(od->cpu, worker->wq->cpu_mask);
else
- cpumask_clear_cpu(od->cpu, worker->wqe->cpu_mask);
+ cpumask_clear_cpu(od->cpu, worker->wq->cpu_mask);
return false;
}
@@ -1323,11 +1262,9 @@ static int __io_wq_cpu_online(struct io_wq *wq, unsigned int cpu, bool online)
.cpu = cpu,
.online = online
};
- int i;
rcu_read_lock();
- for_each_node(i)
- io_wq_for_each_worker(wq->wqes[i], io_wq_worker_affinity, &od);
+ io_wq_for_each_worker(wq, io_wq_worker_affinity, &od);
rcu_read_unlock();
return 0;
}
@@ -1348,18 +1285,13 @@ static int io_wq_cpu_offline(unsigned int cpu, struct hlist_node *node)
int io_wq_cpu_affinity(struct io_wq *wq, cpumask_var_t mask)
{
- int i;
-
rcu_read_lock();
- for_each_node(i) {
- struct io_wqe *wqe = wq->wqes[i];
-
- if (mask)
- cpumask_copy(wqe->cpu_mask, mask);
- else
- cpumask_copy(wqe->cpu_mask, cpumask_of_node(i));
- }
+ if (mask)
+ cpumask_copy(wq->cpu_mask, mask);
+ else
+ cpumask_copy(wq->cpu_mask, cpu_possible_mask);
rcu_read_unlock();
+
return 0;
}
@@ -1369,9 +1301,9 @@ int io_wq_cpu_affinity(struct io_wq *wq, cpumask_var_t mask)
*/
int io_wq_max_workers(struct io_wq *wq, int *new_count)
{
+ struct io_wq_acct *acct;
int prev[IO_WQ_ACCT_NR];
- bool first_node = true;
- int i, node;
+ int i;
BUILD_BUG_ON((int) IO_WQ_ACCT_BOUND != (int) IO_WQ_BOUND);
BUILD_BUG_ON((int) IO_WQ_ACCT_UNBOUND != (int) IO_WQ_UNBOUND);
@@ -1386,21 +1318,15 @@ int io_wq_max_workers(struct io_wq *wq, int *new_count)
prev[i] = 0;
rcu_read_lock();
- for_each_node(node) {
- struct io_wqe *wqe = wq->wqes[node];
- struct io_wqe_acct *acct;
- raw_spin_lock(&wqe->lock);
- for (i = 0; i < IO_WQ_ACCT_NR; i++) {
- acct = &wqe->acct[i];
- if (first_node)
- prev[i] = max_t(int, acct->max_workers, prev[i]);
- if (new_count[i])
- acct->max_workers = new_count[i];
- }
- raw_spin_unlock(&wqe->lock);
- first_node = false;
+ raw_spin_lock(&wq->lock);
+ for (i = 0; i < IO_WQ_ACCT_NR; i++) {
+ acct = &wq->acct[i];
+ prev[i] = max_t(int, acct->max_workers, prev[i]);
+ if (new_count[i])
+ acct->max_workers = new_count[i];
}
+ raw_spin_unlock(&wq->lock);
rcu_read_unlock();
for (i = 0; i < IO_WQ_ACCT_NR; i++)
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 722624b..9bbf582 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -72,6 +72,7 @@
#include <linux/io_uring.h>
#include <linux/audit.h>
#include <linux/security.h>
+#include <asm/shmparam.h>
#define CREATE_TRACE_POINTS
#include <trace/events/io_uring.h>
@@ -246,12 +247,12 @@ static __cold void io_fallback_req_func(struct work_struct *work)
fallback_work.work);
struct llist_node *node = llist_del_all(&ctx->fallback_llist);
struct io_kiocb *req, *tmp;
- bool locked = true;
+ struct io_tw_state ts = { .locked = true, };
mutex_lock(&ctx->uring_lock);
llist_for_each_entry_safe(req, tmp, node, io_task_work.node)
- req->io_task_work.func(req, &locked);
- if (WARN_ON_ONCE(!locked))
+ req->io_task_work.func(req, &ts);
+ if (WARN_ON_ONCE(!ts.locked))
return;
io_submit_flush_completions(ctx);
mutex_unlock(&ctx->uring_lock);
@@ -309,8 +310,12 @@ static __cold struct io_ring_ctx *io_ring_ctx_alloc(struct io_uring_params *p)
INIT_LIST_HEAD(&ctx->sqd_list);
INIT_LIST_HEAD(&ctx->cq_overflow_list);
INIT_LIST_HEAD(&ctx->io_buffers_cache);
- io_alloc_cache_init(&ctx->apoll_cache);
- io_alloc_cache_init(&ctx->netmsg_cache);
+ io_alloc_cache_init(&ctx->rsrc_node_cache, IO_NODE_ALLOC_CACHE_MAX,
+ sizeof(struct io_rsrc_node));
+ io_alloc_cache_init(&ctx->apoll_cache, IO_ALLOC_CACHE_MAX,
+ sizeof(struct async_poll));
+ io_alloc_cache_init(&ctx->netmsg_cache, IO_ALLOC_CACHE_MAX,
+ sizeof(struct io_async_msghdr));
init_completion(&ctx->ref_comp);
xa_init_flags(&ctx->personalities, XA_FLAGS_ALLOC1);
mutex_init(&ctx->uring_lock);
@@ -324,11 +329,7 @@ static __cold struct io_ring_ctx *io_ring_ctx_alloc(struct io_uring_params *p)
INIT_LIST_HEAD(&ctx->defer_list);
INIT_LIST_HEAD(&ctx->timeout_list);
INIT_LIST_HEAD(&ctx->ltimeout_list);
- spin_lock_init(&ctx->rsrc_ref_lock);
INIT_LIST_HEAD(&ctx->rsrc_ref_list);
- INIT_DELAYED_WORK(&ctx->rsrc_put_work, io_rsrc_put_work);
- init_task_work(&ctx->rsrc_put_tw, io_rsrc_put_tw);
- init_llist_head(&ctx->rsrc_put_llist);
init_llist_head(&ctx->work_llist);
INIT_LIST_HEAD(&ctx->tctx_list);
ctx->submit_state.free_list.next = NULL;
@@ -425,7 +426,13 @@ static void io_prep_async_work(struct io_kiocb *req)
req->flags |= io_file_get_flags(req->file) << REQ_F_SUPPORT_NOWAIT_BIT;
if (req->flags & REQ_F_ISREG) {
- if (def->hash_reg_file || (ctx->flags & IORING_SETUP_IOPOLL))
+ bool should_hash = def->hash_reg_file;
+
+ /* don't serialize this request if the fs doesn't need it */
+ if (should_hash && (req->file->f_flags & O_DIRECT) &&
+ (req->file->f_mode & FMODE_DIO_PARALLEL_WRITE))
+ should_hash = false;
+ if (should_hash || (ctx->flags & IORING_SETUP_IOPOLL))
io_wq_hash_work(&req->work, file_inode(req->file));
} else if (!req->file || !S_ISBLK(file_inode(req->file)->i_mode)) {
if (def->unbound_nonreg_file)
@@ -450,7 +457,7 @@ static void io_prep_async_link(struct io_kiocb *req)
}
}
-void io_queue_iowq(struct io_kiocb *req, bool *dont_use)
+void io_queue_iowq(struct io_kiocb *req, struct io_tw_state *ts_dont_use)
{
struct io_kiocb *link = io_prep_linked_timeout(req);
struct io_uring_task *tctx = req->task->io_uring;
@@ -620,22 +627,22 @@ static inline void __io_cq_unlock_post(struct io_ring_ctx *ctx)
io_cqring_wake(ctx);
}
-static inline void __io_cq_unlock_post_flush(struct io_ring_ctx *ctx)
+static void __io_cq_unlock_post_flush(struct io_ring_ctx *ctx)
__releases(ctx->completion_lock)
{
io_commit_cqring(ctx);
- __io_cq_unlock(ctx);
- io_commit_cqring_flush(ctx);
- /*
- * As ->task_complete implies that the ring is single tasked, cq_wait
- * may only be waited on by the current in io_cqring_wait(), but since
- * it will re-check the wakeup conditions once we return we can safely
- * skip waking it up.
- */
- if (!(ctx->flags & IORING_SETUP_DEFER_TASKRUN)) {
- smp_mb();
- __io_cqring_wake(ctx);
+ if (ctx->task_complete) {
+ /*
+ * ->task_complete implies that only current might be waiting
+ * for CQEs, and obviously, we currently don't. No one is
+ * waiting, wakeups are futile, skip them.
+ */
+ io_commit_cqring_flush(ctx);
+ } else {
+ __io_cq_unlock(ctx);
+ io_commit_cqring_flush(ctx);
+ io_cqring_wake(ctx);
}
}
@@ -960,9 +967,10 @@ bool io_aux_cqe(struct io_ring_ctx *ctx, bool defer, u64 user_data, s32 res, u32
return true;
}
-static void __io_req_complete_post(struct io_kiocb *req)
+static void __io_req_complete_post(struct io_kiocb *req, unsigned issue_flags)
{
struct io_ring_ctx *ctx = req->ctx;
+ struct io_rsrc_node *rsrc_node = NULL;
io_cq_lock(ctx);
if (!(req->flags & REQ_F_CQE_SKIP))
@@ -983,7 +991,7 @@ static void __io_req_complete_post(struct io_kiocb *req)
}
io_put_kbuf_comp(req);
io_dismantle_req(req);
- io_req_put_rsrc(req);
+ rsrc_node = req->rsrc_node;
/*
* Selected buffer deallocation in io_clean_op() assumes that
* we don't hold ->completion_lock. Clean them here to avoid
@@ -994,6 +1002,12 @@ static void __io_req_complete_post(struct io_kiocb *req)
ctx->locked_free_nr++;
}
io_cq_unlock_post(ctx);
+
+ if (rsrc_node) {
+ io_ring_submit_lock(ctx, issue_flags);
+ io_put_rsrc_node(ctx, rsrc_node);
+ io_ring_submit_unlock(ctx, issue_flags);
+ }
}
void io_req_complete_post(struct io_kiocb *req, unsigned issue_flags)
@@ -1003,12 +1017,12 @@ void io_req_complete_post(struct io_kiocb *req, unsigned issue_flags)
io_req_task_work_add(req);
} else if (!(issue_flags & IO_URING_F_UNLOCKED) ||
!(req->ctx->flags & IORING_SETUP_IOPOLL)) {
- __io_req_complete_post(req);
+ __io_req_complete_post(req, issue_flags);
} else {
struct io_ring_ctx *ctx = req->ctx;
mutex_lock(&ctx->uring_lock);
- __io_req_complete_post(req);
+ __io_req_complete_post(req, issue_flags & ~IO_URING_F_UNLOCKED);
mutex_unlock(&ctx->uring_lock);
}
}
@@ -1106,11 +1120,14 @@ static inline void io_dismantle_req(struct io_kiocb *req)
io_put_file(req->file);
}
-__cold void io_free_req(struct io_kiocb *req)
+static __cold void io_free_req_tw(struct io_kiocb *req, struct io_tw_state *ts)
{
struct io_ring_ctx *ctx = req->ctx;
- io_req_put_rsrc(req);
+ if (req->rsrc_node) {
+ io_tw_lock(ctx, ts);
+ io_put_rsrc_node(ctx, req->rsrc_node);
+ }
io_dismantle_req(req);
io_put_task_remote(req->task, 1);
@@ -1120,6 +1137,12 @@ __cold void io_free_req(struct io_kiocb *req)
spin_unlock(&ctx->completion_lock);
}
+__cold void io_free_req(struct io_kiocb *req)
+{
+ req->io_task_work.func = io_free_req_tw;
+ io_req_task_work_add(req);
+}
+
static void __io_req_find_next_prep(struct io_kiocb *req)
{
struct io_ring_ctx *ctx = req->ctx;
@@ -1146,22 +1169,23 @@ static inline struct io_kiocb *io_req_find_next(struct io_kiocb *req)
return nxt;
}
-static void ctx_flush_and_put(struct io_ring_ctx *ctx, bool *locked)
+static void ctx_flush_and_put(struct io_ring_ctx *ctx, struct io_tw_state *ts)
{
if (!ctx)
return;
if (ctx->flags & IORING_SETUP_TASKRUN_FLAG)
atomic_andnot(IORING_SQ_TASKRUN, &ctx->rings->sq_flags);
- if (*locked) {
+ if (ts->locked) {
io_submit_flush_completions(ctx);
mutex_unlock(&ctx->uring_lock);
- *locked = false;
+ ts->locked = false;
}
percpu_ref_put(&ctx->refs);
}
static unsigned int handle_tw_list(struct llist_node *node,
- struct io_ring_ctx **ctx, bool *locked,
+ struct io_ring_ctx **ctx,
+ struct io_tw_state *ts,
struct llist_node *last)
{
unsigned int count = 0;
@@ -1174,18 +1198,17 @@ static unsigned int handle_tw_list(struct llist_node *node,
prefetch(container_of(next, struct io_kiocb, io_task_work.node));
if (req->ctx != *ctx) {
- ctx_flush_and_put(*ctx, locked);
+ ctx_flush_and_put(*ctx, ts);
*ctx = req->ctx;
/* if not contended, grab and improve batching */
- *locked = mutex_trylock(&(*ctx)->uring_lock);
+ ts->locked = mutex_trylock(&(*ctx)->uring_lock);
percpu_ref_get(&(*ctx)->refs);
- } else if (!*locked)
- *locked = mutex_trylock(&(*ctx)->uring_lock);
- req->io_task_work.func(req, locked);
+ }
+ req->io_task_work.func(req, ts);
node = next;
count++;
if (unlikely(need_resched())) {
- ctx_flush_and_put(*ctx, locked);
+ ctx_flush_and_put(*ctx, ts);
*ctx = NULL;
cond_resched();
}
@@ -1226,7 +1249,7 @@ static inline struct llist_node *io_llist_cmpxchg(struct llist_head *head,
void tctx_task_work(struct callback_head *cb)
{
- bool uring_locked = false;
+ struct io_tw_state ts = {};
struct io_ring_ctx *ctx = NULL;
struct io_uring_task *tctx = container_of(cb, struct io_uring_task,
task_work);
@@ -1243,12 +1266,12 @@ void tctx_task_work(struct callback_head *cb)
do {
loops++;
node = io_llist_xchg(&tctx->task_list, &fake);
- count += handle_tw_list(node, &ctx, &uring_locked, &fake);
+ count += handle_tw_list(node, &ctx, &ts, &fake);
/* skip expensive cmpxchg if there are items in the list */
if (READ_ONCE(tctx->task_list.first) != &fake)
continue;
- if (uring_locked && !wq_list_empty(&ctx->submit_state.compl_reqs)) {
+ if (ts.locked && !wq_list_empty(&ctx->submit_state.compl_reqs)) {
io_submit_flush_completions(ctx);
if (READ_ONCE(tctx->task_list.first) != &fake)
continue;
@@ -1256,7 +1279,7 @@ void tctx_task_work(struct callback_head *cb)
node = io_llist_cmpxchg(&tctx->task_list, &fake, NULL);
} while (node != &fake);
- ctx_flush_and_put(ctx, &uring_locked);
+ ctx_flush_and_put(ctx, &ts);
/* relaxed read is enough as only the task itself sets ->in_cancel */
if (unlikely(atomic_read(&tctx->in_cancel)))
@@ -1279,42 +1302,67 @@ static __cold void io_fallback_tw(struct io_uring_task *tctx)
}
}
-static void io_req_local_work_add(struct io_kiocb *req)
+static void io_req_local_work_add(struct io_kiocb *req, unsigned flags)
{
struct io_ring_ctx *ctx = req->ctx;
+ unsigned nr_wait, nr_tw, nr_tw_prev;
+ struct llist_node *first;
- percpu_ref_get(&ctx->refs);
+ if (req->flags & (REQ_F_LINK | REQ_F_HARDLINK))
+ flags &= ~IOU_F_TWQ_LAZY_WAKE;
- if (!llist_add(&req->io_task_work.node, &ctx->work_llist))
- goto put_ref;
+ first = READ_ONCE(ctx->work_llist.first);
+ do {
+ nr_tw_prev = 0;
+ if (first) {
+ struct io_kiocb *first_req = container_of(first,
+ struct io_kiocb,
+ io_task_work.node);
+ /*
+ * Might be executed at any moment, rely on
+ * SLAB_TYPESAFE_BY_RCU to keep it alive.
+ */
+ nr_tw_prev = READ_ONCE(first_req->nr_tw);
+ }
+ nr_tw = nr_tw_prev + 1;
+ /* Large enough to fail the nr_wait comparison below */
+ if (!(flags & IOU_F_TWQ_LAZY_WAKE))
+ nr_tw = -1U;
- /* needed for the following wake up */
- smp_mb__after_atomic();
+ req->nr_tw = nr_tw;
+ req->io_task_work.node.next = first;
+ } while (!try_cmpxchg(&ctx->work_llist.first, &first,
+ &req->io_task_work.node));
- if (unlikely(atomic_read(&req->task->io_uring->in_cancel))) {
- io_move_task_work_from_local(ctx);
- goto put_ref;
+ if (!first) {
+ if (ctx->flags & IORING_SETUP_TASKRUN_FLAG)
+ atomic_or(IORING_SQ_TASKRUN, &ctx->rings->sq_flags);
+ if (ctx->has_evfd)
+ io_eventfd_signal(ctx);
}
- if (ctx->flags & IORING_SETUP_TASKRUN_FLAG)
- atomic_or(IORING_SQ_TASKRUN, &ctx->rings->sq_flags);
- if (ctx->has_evfd)
- io_eventfd_signal(ctx);
-
- if (READ_ONCE(ctx->cq_waiting))
- wake_up_state(ctx->submitter_task, TASK_INTERRUPTIBLE);
-
-put_ref:
- percpu_ref_put(&ctx->refs);
+ nr_wait = atomic_read(&ctx->cq_wait_nr);
+ /* no one is waiting */
+ if (!nr_wait)
+ return;
+ /* either not enough or the previous add has already woken it up */
+ if (nr_wait > nr_tw || nr_tw_prev >= nr_wait)
+ return;
+ /* pairs with set_current_state() in io_cqring_wait() */
+ smp_mb__after_atomic();
+ wake_up_state(ctx->submitter_task, TASK_INTERRUPTIBLE);
}
-void __io_req_task_work_add(struct io_kiocb *req, bool allow_local)
+void __io_req_task_work_add(struct io_kiocb *req, unsigned flags)
{
struct io_uring_task *tctx = req->task->io_uring;
struct io_ring_ctx *ctx = req->ctx;
- if (allow_local && ctx->flags & IORING_SETUP_DEFER_TASKRUN) {
- io_req_local_work_add(req);
+ if (!(flags & IOU_F_TWQ_FORCE_NORMAL) &&
+ (ctx->flags & IORING_SETUP_DEFER_TASKRUN)) {
+ rcu_read_lock();
+ io_req_local_work_add(req, flags);
+ rcu_read_unlock();
return;
}
@@ -1341,11 +1389,11 @@ static void __cold io_move_task_work_from_local(struct io_ring_ctx *ctx)
io_task_work.node);
node = node->next;
- __io_req_task_work_add(req, false);
+ __io_req_task_work_add(req, IOU_F_TWQ_FORCE_NORMAL);
}
}
-static int __io_run_local_work(struct io_ring_ctx *ctx, bool *locked)
+static int __io_run_local_work(struct io_ring_ctx *ctx, struct io_tw_state *ts)
{
struct llist_node *node;
unsigned int loops = 0;
@@ -1362,7 +1410,7 @@ static int __io_run_local_work(struct io_ring_ctx *ctx, bool *locked)
struct io_kiocb *req = container_of(node, struct io_kiocb,
io_task_work.node);
prefetch(container_of(next, struct io_kiocb, io_task_work.node));
- req->io_task_work.func(req, locked);
+ req->io_task_work.func(req, ts);
ret++;
node = next;
}
@@ -1370,7 +1418,7 @@ static int __io_run_local_work(struct io_ring_ctx *ctx, bool *locked)
if (!llist_empty(&ctx->work_llist))
goto again;
- if (*locked) {
+ if (ts->locked) {
io_submit_flush_completions(ctx);
if (!llist_empty(&ctx->work_llist))
goto again;
@@ -1381,46 +1429,46 @@ static int __io_run_local_work(struct io_ring_ctx *ctx, bool *locked)
static inline int io_run_local_work_locked(struct io_ring_ctx *ctx)
{
- bool locked;
+ struct io_tw_state ts = { .locked = true, };
int ret;
if (llist_empty(&ctx->work_llist))
return 0;
- locked = true;
- ret = __io_run_local_work(ctx, &locked);
+ ret = __io_run_local_work(ctx, &ts);
/* shouldn't happen! */
- if (WARN_ON_ONCE(!locked))
+ if (WARN_ON_ONCE(!ts.locked))
mutex_lock(&ctx->uring_lock);
return ret;
}
static int io_run_local_work(struct io_ring_ctx *ctx)
{
- bool locked = mutex_trylock(&ctx->uring_lock);
+ struct io_tw_state ts = {};
int ret;
- ret = __io_run_local_work(ctx, &locked);
- if (locked)
+ ts.locked = mutex_trylock(&ctx->uring_lock);
+ ret = __io_run_local_work(ctx, &ts);
+ if (ts.locked)
mutex_unlock(&ctx->uring_lock);
return ret;
}
-static void io_req_task_cancel(struct io_kiocb *req, bool *locked)
+static void io_req_task_cancel(struct io_kiocb *req, struct io_tw_state *ts)
{
- io_tw_lock(req->ctx, locked);
+ io_tw_lock(req->ctx, ts);
io_req_defer_failed(req, req->cqe.res);
}
-void io_req_task_submit(struct io_kiocb *req, bool *locked)
+void io_req_task_submit(struct io_kiocb *req, struct io_tw_state *ts)
{
- io_tw_lock(req->ctx, locked);
+ io_tw_lock(req->ctx, ts);
/* req->task == current here, checking PF_EXITING is safe */
if (unlikely(req->task->flags & PF_EXITING))
io_req_defer_failed(req, -EFAULT);
else if (req->flags & REQ_F_FORCE_ASYNC)
- io_queue_iowq(req, locked);
+ io_queue_iowq(req, ts);
else
io_queue_sqe(req);
}
@@ -1646,9 +1694,9 @@ static int io_iopoll_check(struct io_ring_ctx *ctx, long min)
return ret;
}
-void io_req_task_complete(struct io_kiocb *req, bool *locked)
+void io_req_task_complete(struct io_kiocb *req, struct io_tw_state *ts)
{
- if (*locked)
+ if (ts->locked)
io_req_complete_defer(req);
else
io_req_complete_post(req, IO_URING_F_UNLOCKED);
@@ -1927,9 +1975,9 @@ static int io_issue_sqe(struct io_kiocb *req, unsigned int issue_flags)
return 0;
}
-int io_poll_issue(struct io_kiocb *req, bool *locked)
+int io_poll_issue(struct io_kiocb *req, struct io_tw_state *ts)
{
- io_tw_lock(req->ctx, locked);
+ io_tw_lock(req->ctx, ts);
return io_issue_sqe(req, IO_URING_F_NONBLOCK|IO_URING_F_MULTISHOT|
IO_URING_F_COMPLETE_DEFER);
}
@@ -2298,8 +2346,7 @@ static inline int io_submit_sqe(struct io_ring_ctx *ctx, struct io_kiocb *req,
if (unlikely(ret))
return io_submit_fail_init(sqe, req, ret);
- /* don't need @sqe from now on */
- trace_io_uring_submit_sqe(req, true);
+ trace_io_uring_submit_req(req);
/*
* If we already have a head request, queue this one for async
@@ -2428,7 +2475,7 @@ int io_submit_sqes(struct io_ring_ctx *ctx, unsigned int nr)
if (unlikely(!entries))
return 0;
/* make sure SQ entry isn't read before tail */
- ret = left = min3(nr, ctx->sq_entries, entries);
+ ret = left = min(nr, entries);
io_get_task_refs(left);
io_submit_state_start(&ctx->submit_state, left);
@@ -2600,7 +2647,9 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events,
unsigned long check_cq;
if (ctx->flags & IORING_SETUP_DEFER_TASKRUN) {
- WRITE_ONCE(ctx->cq_waiting, 1);
+ int nr_wait = (int) iowq.cq_tail - READ_ONCE(ctx->rings->cq.tail);
+
+ atomic_set(&ctx->cq_wait_nr, nr_wait);
set_current_state(TASK_INTERRUPTIBLE);
} else {
prepare_to_wait_exclusive(&ctx->cq_wait, &iowq.wq,
@@ -2609,7 +2658,7 @@ static int io_cqring_wait(struct io_ring_ctx *ctx, int min_events,
ret = io_cqring_wait_schedule(ctx, &iowq);
__set_current_state(TASK_RUNNING);
- WRITE_ONCE(ctx->cq_waiting, 0);
+ atomic_set(&ctx->cq_wait_nr, 0);
if (ret < 0)
break;
@@ -2772,10 +2821,14 @@ static void io_req_caches_free(struct io_ring_ctx *ctx)
mutex_unlock(&ctx->uring_lock);
}
+static void io_rsrc_node_cache_free(struct io_cache_entry *entry)
+{
+ kfree(container_of(entry, struct io_rsrc_node, cache));
+}
+
static __cold void io_ring_ctx_free(struct io_ring_ctx *ctx)
{
io_sq_thread_finish(ctx);
- io_rsrc_refs_drop(ctx);
/* __io_rsrc_put_work() may need uring_lock to progress, wait w/o it */
io_wait_rsrc_data(ctx->buf_data);
io_wait_rsrc_data(ctx->file_data);
@@ -2798,14 +2851,11 @@ static __cold void io_ring_ctx_free(struct io_ring_ctx *ctx)
/* there are no registered resources left, nobody uses it */
if (ctx->rsrc_node)
- io_rsrc_node_destroy(ctx->rsrc_node);
+ io_rsrc_node_destroy(ctx, ctx->rsrc_node);
if (ctx->rsrc_backup_node)
- io_rsrc_node_destroy(ctx->rsrc_backup_node);
- flush_delayed_work(&ctx->rsrc_put_work);
- flush_delayed_work(&ctx->fallback_work);
+ io_rsrc_node_destroy(ctx, ctx->rsrc_backup_node);
WARN_ON_ONCE(!list_empty(&ctx->rsrc_ref_list));
- WARN_ON_ONCE(!llist_empty(&ctx->rsrc_put_llist));
#if defined(CONFIG_UNIX)
if (ctx->ring_sock) {
@@ -2815,6 +2865,7 @@ static __cold void io_ring_ctx_free(struct io_ring_ctx *ctx)
#endif
WARN_ON_ONCE(!list_empty(&ctx->ltimeout_list));
+ io_alloc_cache_free(&ctx->rsrc_node_cache, io_rsrc_node_cache_free);
if (ctx->mm_account) {
mmdrop(ctx->mm_account);
ctx->mm_account = NULL;
@@ -3031,6 +3082,10 @@ static __cold void io_ring_exit_work(struct work_struct *work)
spin_lock(&ctx->completion_lock);
spin_unlock(&ctx->completion_lock);
+ /* pairs with RCU read section in io_req_local_work_add() */
+ if (ctx->flags & IORING_SETUP_DEFER_TASKRUN)
+ synchronize_rcu();
+
io_ring_ctx_free(ctx);
}
@@ -3146,6 +3201,12 @@ static __cold bool io_uring_try_cancel_requests(struct io_ring_ctx *ctx,
enum io_wq_cancel cret;
bool ret = false;
+ /* set it so io_req_local_work_add() would wake us up */
+ if (ctx->flags & IORING_SETUP_DEFER_TASKRUN) {
+ atomic_set(&ctx->cq_wait_nr, 1);
+ smp_mb();
+ }
+
/* failed during ring init, it couldn't have issued any requests */
if (!ctx->rings)
return false;
@@ -3200,6 +3261,8 @@ __cold void io_uring_cancel_generic(bool cancel_all, struct io_sq_data *sqd)
{
struct io_uring_task *tctx = current->io_uring;
struct io_ring_ctx *ctx;
+ struct io_tctx_node *node;
+ unsigned long index;
s64 inflight;
DEFINE_WAIT(wait);
@@ -3221,9 +3284,6 @@ __cold void io_uring_cancel_generic(bool cancel_all, struct io_sq_data *sqd)
break;
if (!sqd) {
- struct io_tctx_node *node;
- unsigned long index;
-
xa_for_each(&tctx->xa, index, node) {
/* sqpoll task will cancel all its requests */
if (node->ctx->sq_data)
@@ -3246,7 +3306,13 @@ __cold void io_uring_cancel_generic(bool cancel_all, struct io_sq_data *sqd)
prepare_to_wait(&tctx->wait, &wait, TASK_INTERRUPTIBLE);
io_run_task_work();
io_uring_drop_tctx_refs(current);
-
+ xa_for_each(&tctx->xa, index, node) {
+ if (!llist_empty(&node->ctx->work_llist)) {
+ WARN_ON_ONCE(node->ctx->submitter_task &&
+ node->ctx->submitter_task != current);
+ goto end_wait;
+ }
+ }
/*
* If we've seen completions, retry without waiting. This
* avoids a race where a completion comes in before we did
@@ -3254,6 +3320,7 @@ __cold void io_uring_cancel_generic(bool cancel_all, struct io_sq_data *sqd)
*/
if (inflight == tctx_inflight(tctx, !cancel_all))
schedule();
+end_wait:
finish_wait(&tctx->wait, &wait);
} while (1);
@@ -3282,7 +3349,7 @@ static void *io_uring_validate_mmap_request(struct file *file,
struct page *page;
void *ptr;
- switch (offset) {
+ switch (offset & IORING_OFF_MMAP_MASK) {
case IORING_OFF_SQ_RING:
case IORING_OFF_CQ_RING:
ptr = ctx->rings;
@@ -3290,6 +3357,17 @@ static void *io_uring_validate_mmap_request(struct file *file,
case IORING_OFF_SQES:
ptr = ctx->sq_sqes;
break;
+ case IORING_OFF_PBUF_RING: {
+ unsigned int bgid;
+
+ bgid = (offset & ~IORING_OFF_MMAP_MASK) >> IORING_OFF_PBUF_SHIFT;
+ mutex_lock(&ctx->uring_lock);
+ ptr = io_pbuf_get_address(ctx, bgid);
+ mutex_unlock(&ctx->uring_lock);
+ if (!ptr)
+ return ERR_PTR(-EINVAL);
+ break;
+ }
default:
return ERR_PTR(-EINVAL);
}
@@ -3317,6 +3395,54 @@ static __cold int io_uring_mmap(struct file *file, struct vm_area_struct *vma)
return remap_pfn_range(vma, vma->vm_start, pfn, sz, vma->vm_page_prot);
}
+static unsigned long io_uring_mmu_get_unmapped_area(struct file *filp,
+ unsigned long addr, unsigned long len,
+ unsigned long pgoff, unsigned long flags)
+{
+ const unsigned long mmap_end = arch_get_mmap_end(addr, len, flags);
+ struct vm_unmapped_area_info info;
+ void *ptr;
+
+ /*
+ * Do not allow to map to user-provided address to avoid breaking the
+ * aliasing rules. Userspace is not able to guess the offset address of
+ * kernel kmalloc()ed memory area.
+ */
+ if (addr)
+ return -EINVAL;
+
+ ptr = io_uring_validate_mmap_request(filp, pgoff, len);
+ if (IS_ERR(ptr))
+ return -ENOMEM;
+
+ info.flags = VM_UNMAPPED_AREA_TOPDOWN;
+ info.length = len;
+ info.low_limit = max(PAGE_SIZE, mmap_min_addr);
+ info.high_limit = arch_get_mmap_base(addr, current->mm->mmap_base);
+#ifdef SHM_COLOUR
+ info.align_mask = PAGE_MASK & (SHM_COLOUR - 1UL);
+#else
+ info.align_mask = PAGE_MASK & (SHMLBA - 1UL);
+#endif
+ info.align_offset = (unsigned long) ptr;
+
+ /*
+ * A failed mmap() very likely causes application failure,
+ * so fall back to the bottom-up function here. This scenario
+ * can happen with large stack limits and large mmap()
+ * allocations.
+ */
+ addr = vm_unmapped_area(&info);
+ if (offset_in_page(addr)) {
+ info.flags = 0;
+ info.low_limit = TASK_UNMAPPED_BASE;
+ info.high_limit = mmap_end;
+ addr = vm_unmapped_area(&info);
+ }
+
+ return addr;
+}
+
#else /* !CONFIG_MMU */
static int io_uring_mmap(struct file *file, struct vm_area_struct *vma)
@@ -3529,6 +3655,8 @@ static const struct file_operations io_uring_fops = {
#ifndef CONFIG_MMU
.get_unmapped_area = io_uring_nommu_get_unmapped_area,
.mmap_capabilities = io_uring_nommu_mmap_capabilities,
+#else
+ .get_unmapped_area = io_uring_mmu_get_unmapped_area,
#endif
.poll = io_uring_poll,
#ifdef CONFIG_PROC_FS
@@ -4425,7 +4553,7 @@ static int __init io_uring_init(void)
io_uring_optable_init();
req_cachep = KMEM_CACHE(io_kiocb, SLAB_HWCACHE_ALIGN | SLAB_PANIC |
- SLAB_ACCOUNT);
+ SLAB_ACCOUNT | SLAB_TYPESAFE_BY_RCU);
return 0;
};
__initcall(io_uring_init);
diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index 2711865..ef449e4 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -16,6 +16,20 @@
#endif
enum {
+ /* don't use deferred task_work */
+ IOU_F_TWQ_FORCE_NORMAL = 1,
+
+ /*
+ * A hint to not wake right away but delay until there are enough of
+ * tw's queued to match the number of CQEs the task is waiting for.
+ *
+ * Must not be used wirh requests generating more than one CQE.
+ * It's also ignored unless IORING_SETUP_DEFER_TASKRUN is set.
+ */
+ IOU_F_TWQ_LAZY_WAKE = 2,
+};
+
+enum {
IOU_OK = 0,
IOU_ISSUE_SKIP_COMPLETE = -EIOCBQUEUED,
@@ -48,20 +62,20 @@ static inline bool io_req_ffs_set(struct io_kiocb *req)
return req->flags & REQ_F_FIXED_FILE;
}
-void __io_req_task_work_add(struct io_kiocb *req, bool allow_local);
+void __io_req_task_work_add(struct io_kiocb *req, unsigned flags);
bool io_is_uring_fops(struct file *file);
bool io_alloc_async_data(struct io_kiocb *req);
void io_req_task_queue(struct io_kiocb *req);
-void io_queue_iowq(struct io_kiocb *req, bool *dont_use);
-void io_req_task_complete(struct io_kiocb *req, bool *locked);
+void io_queue_iowq(struct io_kiocb *req, struct io_tw_state *ts_dont_use);
+void io_req_task_complete(struct io_kiocb *req, struct io_tw_state *ts);
void io_req_task_queue_fail(struct io_kiocb *req, int ret);
-void io_req_task_submit(struct io_kiocb *req, bool *locked);
+void io_req_task_submit(struct io_kiocb *req, struct io_tw_state *ts);
void tctx_task_work(struct callback_head *cb);
__cold void io_uring_cancel_generic(bool cancel_all, struct io_sq_data *sqd);
int io_uring_alloc_task_context(struct task_struct *task,
struct io_ring_ctx *ctx);
-int io_poll_issue(struct io_kiocb *req, bool *locked);
+int io_poll_issue(struct io_kiocb *req, struct io_tw_state *ts);
int io_submit_sqes(struct io_ring_ctx *ctx, unsigned int nr);
int io_do_iopoll(struct io_ring_ctx *ctx, bool force_nonspin);
void io_free_batch_list(struct io_ring_ctx *ctx, struct io_wq_work_node *node);
@@ -93,7 +107,7 @@ bool io_match_task_safe(struct io_kiocb *head, struct task_struct *task,
static inline void io_req_task_work_add(struct io_kiocb *req)
{
- __io_req_task_work_add(req, true);
+ __io_req_task_work_add(req, 0);
}
#define io_for_each_link(pos, head) \
@@ -228,8 +242,7 @@ static inline void io_poll_wq_wake(struct io_ring_ctx *ctx)
poll_to_key(EPOLL_URING_WAKE | EPOLLIN));
}
-/* requires smb_mb() prior, see wq_has_sleeper() */
-static inline void __io_cqring_wake(struct io_ring_ctx *ctx)
+static inline void io_cqring_wake(struct io_ring_ctx *ctx)
{
/*
* Trigger waitqueue handler on all waiters on our waitqueue. This
@@ -241,17 +254,11 @@ static inline void __io_cqring_wake(struct io_ring_ctx *ctx)
* waitqueue handlers, we know we have a dependency between eventfd or
* epoll and should terminate multishot poll at that point.
*/
- if (waitqueue_active(&ctx->cq_wait))
+ if (wq_has_sleeper(&ctx->cq_wait))
__wake_up(&ctx->cq_wait, TASK_NORMAL, 0,
poll_to_key(EPOLL_URING_WAKE | EPOLLIN));
}
-static inline void io_cqring_wake(struct io_ring_ctx *ctx)
-{
- smp_mb();
- __io_cqring_wake(ctx);
-}
-
static inline bool io_sqring_full(struct io_ring_ctx *ctx)
{
struct io_rings *r = ctx->rings;
@@ -262,9 +269,11 @@ static inline bool io_sqring_full(struct io_ring_ctx *ctx)
static inline unsigned int io_sqring_entries(struct io_ring_ctx *ctx)
{
struct io_rings *rings = ctx->rings;
+ unsigned int entries;
/* make sure SQ entry isn't read before tail */
- return smp_load_acquire(&rings->sq.tail) - ctx->cached_sq_head;
+ entries = smp_load_acquire(&rings->sq.tail) - ctx->cached_sq_head;
+ return min(entries, ctx->sq_entries);
}
static inline int io_run_task_work(void)
@@ -299,11 +308,11 @@ static inline bool io_task_work_pending(struct io_ring_ctx *ctx)
return task_work_pending(current) || !wq_list_empty(&ctx->work_llist);
}
-static inline void io_tw_lock(struct io_ring_ctx *ctx, bool *locked)
+static inline void io_tw_lock(struct io_ring_ctx *ctx, struct io_tw_state *ts)
{
- if (!*locked) {
+ if (!ts->locked) {
mutex_lock(&ctx->uring_lock);
- *locked = true;
+ ts->locked = true;
}
}
diff --git a/io_uring/kbuf.c b/io_uring/kbuf.c
index 3002dc8..79c2545 100644
--- a/io_uring/kbuf.c
+++ b/io_uring/kbuf.c
@@ -137,7 +137,8 @@ static void __user *io_ring_buffer_select(struct io_kiocb *req, size_t *len,
return NULL;
head &= bl->mask;
- if (head < IO_BUFFER_LIST_BUF_PER_PAGE) {
+ /* mmaped buffers are always contig */
+ if (bl->is_mmap || head < IO_BUFFER_LIST_BUF_PER_PAGE) {
buf = &br->bufs[head];
} else {
int off = head & (IO_BUFFER_LIST_BUF_PER_PAGE - 1);
@@ -179,7 +180,7 @@ void __user *io_buffer_select(struct io_kiocb *req, size_t *len,
bl = io_buffer_get_list(ctx, req->buf_index);
if (likely(bl)) {
- if (bl->buf_nr_pages)
+ if (bl->is_mapped)
ret = io_ring_buffer_select(req, len, bl, issue_flags);
else
ret = io_provided_buffer_select(req, len, bl);
@@ -214,17 +215,30 @@ static int __io_remove_buffers(struct io_ring_ctx *ctx,
if (!nbufs)
return 0;
- if (bl->buf_nr_pages) {
- int j;
-
+ if (bl->is_mapped) {
i = bl->buf_ring->tail - bl->head;
- for (j = 0; j < bl->buf_nr_pages; j++)
- unpin_user_page(bl->buf_pages[j]);
- kvfree(bl->buf_pages);
- bl->buf_pages = NULL;
- bl->buf_nr_pages = 0;
+ if (bl->is_mmap) {
+ if (bl->buf_ring) {
+ struct page *page;
+
+ page = virt_to_head_page(bl->buf_ring);
+ if (put_page_testzero(page))
+ free_compound_page(page);
+ bl->buf_ring = NULL;
+ }
+ bl->is_mmap = 0;
+ } else if (bl->buf_nr_pages) {
+ int j;
+
+ for (j = 0; j < bl->buf_nr_pages; j++)
+ unpin_user_page(bl->buf_pages[j]);
+ kvfree(bl->buf_pages);
+ bl->buf_pages = NULL;
+ bl->buf_nr_pages = 0;
+ }
/* make sure it's seen as empty */
INIT_LIST_HEAD(&bl->buf_list);
+ bl->is_mapped = 0;
return i;
}
@@ -303,7 +317,7 @@ int io_remove_buffers(struct io_kiocb *req, unsigned int issue_flags)
if (bl) {
ret = -EINVAL;
/* can't use provide/remove buffers command on mapped buffers */
- if (!bl->buf_nr_pages)
+ if (!bl->is_mapped)
ret = __io_remove_buffers(ctx, bl, p->nbufs);
}
io_ring_submit_unlock(ctx, issue_flags);
@@ -448,7 +462,7 @@ int io_provide_buffers(struct io_kiocb *req, unsigned int issue_flags)
}
}
/* can't add buffers via this command for a mapped buffer ring */
- if (bl->buf_nr_pages) {
+ if (bl->is_mapped) {
ret = -EINVAL;
goto err;
}
@@ -463,23 +477,87 @@ int io_provide_buffers(struct io_kiocb *req, unsigned int issue_flags)
return IOU_OK;
}
-int io_register_pbuf_ring(struct io_ring_ctx *ctx, void __user *arg)
+static int io_pin_pbuf_ring(struct io_uring_buf_reg *reg,
+ struct io_buffer_list *bl)
{
struct io_uring_buf_ring *br;
- struct io_uring_buf_reg reg;
- struct io_buffer_list *bl, *free_bl = NULL;
struct page **pages;
int nr_pages;
+ pages = io_pin_pages(reg->ring_addr,
+ flex_array_size(br, bufs, reg->ring_entries),
+ &nr_pages);
+ if (IS_ERR(pages))
+ return PTR_ERR(pages);
+
+ br = page_address(pages[0]);
+#ifdef SHM_COLOUR
+ /*
+ * On platforms that have specific aliasing requirements, SHM_COLOUR
+ * is set and we must guarantee that the kernel and user side align
+ * nicely. We cannot do that if IOU_PBUF_RING_MMAP isn't set and
+ * the application mmap's the provided ring buffer. Fail the request
+ * if we, by chance, don't end up with aligned addresses. The app
+ * should use IOU_PBUF_RING_MMAP instead, and liburing will handle
+ * this transparently.
+ */
+ if ((reg->ring_addr | (unsigned long) br) & (SHM_COLOUR - 1)) {
+ int i;
+
+ for (i = 0; i < nr_pages; i++)
+ unpin_user_page(pages[i]);
+ return -EINVAL;
+ }
+#endif
+ bl->buf_pages = pages;
+ bl->buf_nr_pages = nr_pages;
+ bl->buf_ring = br;
+ bl->is_mapped = 1;
+ bl->is_mmap = 0;
+ return 0;
+}
+
+static int io_alloc_pbuf_ring(struct io_uring_buf_reg *reg,
+ struct io_buffer_list *bl)
+{
+ gfp_t gfp = GFP_KERNEL_ACCOUNT | __GFP_ZERO | __GFP_NOWARN | __GFP_COMP;
+ size_t ring_size;
+ void *ptr;
+
+ ring_size = reg->ring_entries * sizeof(struct io_uring_buf_ring);
+ ptr = (void *) __get_free_pages(gfp, get_order(ring_size));
+ if (!ptr)
+ return -ENOMEM;
+
+ bl->buf_ring = ptr;
+ bl->is_mapped = 1;
+ bl->is_mmap = 1;
+ return 0;
+}
+
+int io_register_pbuf_ring(struct io_ring_ctx *ctx, void __user *arg)
+{
+ struct io_uring_buf_reg reg;
+ struct io_buffer_list *bl, *free_bl = NULL;
+ int ret;
+
if (copy_from_user(®, arg, sizeof(reg)))
return -EFAULT;
- if (reg.pad || reg.resv[0] || reg.resv[1] || reg.resv[2])
+ if (reg.resv[0] || reg.resv[1] || reg.resv[2])
return -EINVAL;
- if (!reg.ring_addr)
- return -EFAULT;
- if (reg.ring_addr & ~PAGE_MASK)
+ if (reg.flags & ~IOU_PBUF_RING_MMAP)
return -EINVAL;
+ if (!(reg.flags & IOU_PBUF_RING_MMAP)) {
+ if (!reg.ring_addr)
+ return -EFAULT;
+ if (reg.ring_addr & ~PAGE_MASK)
+ return -EINVAL;
+ } else {
+ if (reg.ring_addr)
+ return -EINVAL;
+ }
+
if (!is_power_of_2(reg.ring_entries))
return -EINVAL;
@@ -496,7 +574,7 @@ int io_register_pbuf_ring(struct io_ring_ctx *ctx, void __user *arg)
bl = io_buffer_get_list(ctx, reg.bgid);
if (bl) {
/* if mapped buffer ring OR classic exists, don't allow */
- if (bl->buf_nr_pages || !list_empty(&bl->buf_list))
+ if (bl->is_mapped || !list_empty(&bl->buf_list))
return -EEXIST;
} else {
free_bl = bl = kzalloc(sizeof(*bl), GFP_KERNEL);
@@ -504,22 +582,21 @@ int io_register_pbuf_ring(struct io_ring_ctx *ctx, void __user *arg)
return -ENOMEM;
}
- pages = io_pin_pages(reg.ring_addr,
- flex_array_size(br, bufs, reg.ring_entries),
- &nr_pages);
- if (IS_ERR(pages)) {
- kfree(free_bl);
- return PTR_ERR(pages);
+ if (!(reg.flags & IOU_PBUF_RING_MMAP))
+ ret = io_pin_pbuf_ring(®, bl);
+ else
+ ret = io_alloc_pbuf_ring(®, bl);
+
+ if (!ret) {
+ bl->nr_entries = reg.ring_entries;
+ bl->mask = reg.ring_entries - 1;
+
+ io_buffer_add_list(ctx, bl, reg.bgid);
+ return 0;
}
- br = page_address(pages[0]);
- bl->buf_pages = pages;
- bl->buf_nr_pages = nr_pages;
- bl->nr_entries = reg.ring_entries;
- bl->buf_ring = br;
- bl->mask = reg.ring_entries - 1;
- io_buffer_add_list(ctx, bl, reg.bgid);
- return 0;
+ kfree(free_bl);
+ return ret;
}
int io_unregister_pbuf_ring(struct io_ring_ctx *ctx, void __user *arg)
@@ -529,13 +606,15 @@ int io_unregister_pbuf_ring(struct io_ring_ctx *ctx, void __user *arg)
if (copy_from_user(®, arg, sizeof(reg)))
return -EFAULT;
- if (reg.pad || reg.resv[0] || reg.resv[1] || reg.resv[2])
+ if (reg.resv[0] || reg.resv[1] || reg.resv[2])
+ return -EINVAL;
+ if (reg.flags)
return -EINVAL;
bl = io_buffer_get_list(ctx, reg.bgid);
if (!bl)
return -ENOENT;
- if (!bl->buf_nr_pages)
+ if (!bl->is_mapped)
return -EINVAL;
__io_remove_buffers(ctx, bl, -1U);
@@ -545,3 +624,14 @@ int io_unregister_pbuf_ring(struct io_ring_ctx *ctx, void __user *arg)
}
return 0;
}
+
+void *io_pbuf_get_address(struct io_ring_ctx *ctx, unsigned long bgid)
+{
+ struct io_buffer_list *bl;
+
+ bl = io_buffer_get_list(ctx, bgid);
+ if (!bl || !bl->is_mmap)
+ return NULL;
+
+ return bl->buf_ring;
+}
diff --git a/io_uring/kbuf.h b/io_uring/kbuf.h
index c23e15d..d14345e 100644
--- a/io_uring/kbuf.h
+++ b/io_uring/kbuf.h
@@ -23,6 +23,11 @@ struct io_buffer_list {
__u16 nr_entries;
__u16 head;
__u16 mask;
+
+ /* ring mapped provided buffers */
+ __u8 is_mapped;
+ /* ring mapped provided buffers, but mmap'ed by application */
+ __u8 is_mmap;
};
struct io_buffer {
@@ -50,6 +55,8 @@ unsigned int __io_put_kbuf(struct io_kiocb *req, unsigned issue_flags);
void io_kbuf_recycle_legacy(struct io_kiocb *req, unsigned issue_flags);
+void *io_pbuf_get_address(struct io_ring_ctx *ctx, unsigned long bgid);
+
static inline void io_kbuf_recycle_ring(struct io_kiocb *req)
{
/*
diff --git a/io_uring/net.h b/io_uring/net.h
index 5ffa11b..1910099 100644
--- a/io_uring/net.h
+++ b/io_uring/net.h
@@ -5,8 +5,8 @@
#include "alloc_cache.h"
-#if defined(CONFIG_NET)
struct io_async_msghdr {
+#if defined(CONFIG_NET)
union {
struct iovec fast_iov[UIO_FASTIOV];
struct {
@@ -22,8 +22,11 @@ struct io_async_msghdr {
struct sockaddr __user *uaddr;
struct msghdr msg;
struct sockaddr_storage addr;
+#endif
};
+#if defined(CONFIG_NET)
+
struct io_async_connect {
struct sockaddr_storage address;
};
diff --git a/io_uring/notif.c b/io_uring/notif.c
index 09dfd08..e1846a2 100644
--- a/io_uring/notif.c
+++ b/io_uring/notif.c
@@ -9,7 +9,7 @@
#include "notif.h"
#include "rsrc.h"
-static void io_notif_complete_tw_ext(struct io_kiocb *notif, bool *locked)
+static void io_notif_complete_tw_ext(struct io_kiocb *notif, struct io_tw_state *ts)
{
struct io_notif_data *nd = io_notif_to_data(notif);
struct io_ring_ctx *ctx = notif->ctx;
@@ -21,7 +21,7 @@ static void io_notif_complete_tw_ext(struct io_kiocb *notif, bool *locked)
__io_unaccount_mem(ctx->user, nd->account_pages);
nd->account_pages = 0;
}
- io_req_task_complete(notif, locked);
+ io_req_task_complete(notif, ts);
}
static void io_tx_ubuf_callback(struct sk_buff *skb, struct ubuf_info *uarg,
@@ -31,7 +31,7 @@ static void io_tx_ubuf_callback(struct sk_buff *skb, struct ubuf_info *uarg,
struct io_kiocb *notif = cmd_to_io_kiocb(nd);
if (refcount_dec_and_test(&uarg->refcnt))
- io_req_task_work_add(notif);
+ __io_req_task_work_add(notif, IOU_F_TWQ_LAZY_WAKE);
}
static void io_tx_ubuf_callback_ext(struct sk_buff *skb, struct ubuf_info *uarg,
diff --git a/io_uring/notif.h b/io_uring/notif.h
index c88c800..6dd1b30 100644
--- a/io_uring/notif.h
+++ b/io_uring/notif.h
@@ -33,7 +33,7 @@ static inline void io_notif_flush(struct io_kiocb *notif)
/* drop slot's master ref */
if (refcount_dec_and_test(&nd->uarg.refcnt))
- io_req_task_work_add(notif);
+ __io_req_task_work_add(notif, IOU_F_TWQ_LAZY_WAKE);
}
static inline int io_notif_account_mem(struct io_kiocb *notif, unsigned len)
diff --git a/io_uring/poll.c b/io_uring/poll.c
index 795facb..c90e47d 100644
--- a/io_uring/poll.c
+++ b/io_uring/poll.c
@@ -148,7 +148,7 @@ static void io_poll_req_insert_locked(struct io_kiocb *req)
hlist_add_head(&req->hash_node, &table->hbs[index].list);
}
-static void io_poll_tw_hash_eject(struct io_kiocb *req, bool *locked)
+static void io_poll_tw_hash_eject(struct io_kiocb *req, struct io_tw_state *ts)
{
struct io_ring_ctx *ctx = req->ctx;
@@ -159,7 +159,7 @@ static void io_poll_tw_hash_eject(struct io_kiocb *req, bool *locked)
* already grabbed the mutex for us, but there is a chance it
* failed.
*/
- io_tw_lock(ctx, locked);
+ io_tw_lock(ctx, ts);
hash_del(&req->hash_node);
req->flags &= ~REQ_F_HASH_LOCKED;
} else {
@@ -238,7 +238,7 @@ enum {
* req->cqe.res. IOU_POLL_REMOVE_POLL_USE_RES indicates to remove multishot
* poll and that the result is stored in req->cqe.
*/
-static int io_poll_check_events(struct io_kiocb *req, bool *locked)
+static int io_poll_check_events(struct io_kiocb *req, struct io_tw_state *ts)
{
int v;
@@ -300,13 +300,13 @@ static int io_poll_check_events(struct io_kiocb *req, bool *locked)
__poll_t mask = mangle_poll(req->cqe.res &
req->apoll_events);
- if (!io_aux_cqe(req->ctx, *locked, req->cqe.user_data,
+ if (!io_aux_cqe(req->ctx, ts->locked, req->cqe.user_data,
mask, IORING_CQE_F_MORE, false)) {
io_req_set_res(req, mask, 0);
return IOU_POLL_REMOVE_POLL_USE_RES;
}
} else {
- int ret = io_poll_issue(req, locked);
+ int ret = io_poll_issue(req, ts);
if (ret == IOU_STOP_MULTISHOT)
return IOU_POLL_REMOVE_POLL_USE_RES;
if (ret < 0)
@@ -326,15 +326,15 @@ static int io_poll_check_events(struct io_kiocb *req, bool *locked)
return IOU_POLL_NO_ACTION;
}
-static void io_poll_task_func(struct io_kiocb *req, bool *locked)
+static void io_poll_task_func(struct io_kiocb *req, struct io_tw_state *ts)
{
int ret;
- ret = io_poll_check_events(req, locked);
+ ret = io_poll_check_events(req, ts);
if (ret == IOU_POLL_NO_ACTION)
return;
io_poll_remove_entries(req);
- io_poll_tw_hash_eject(req, locked);
+ io_poll_tw_hash_eject(req, ts);
if (req->opcode == IORING_OP_POLL_ADD) {
if (ret == IOU_POLL_DONE) {
@@ -343,7 +343,7 @@ static void io_poll_task_func(struct io_kiocb *req, bool *locked)
poll = io_kiocb_to_cmd(req, struct io_poll);
req->cqe.res = mangle_poll(req->cqe.res & poll->events);
} else if (ret == IOU_POLL_REISSUE) {
- io_req_task_submit(req, locked);
+ io_req_task_submit(req, ts);
return;
} else if (ret != IOU_POLL_REMOVE_POLL_USE_RES) {
req->cqe.res = ret;
@@ -351,14 +351,14 @@ static void io_poll_task_func(struct io_kiocb *req, bool *locked)
}
io_req_set_res(req, req->cqe.res, 0);
- io_req_task_complete(req, locked);
+ io_req_task_complete(req, ts);
} else {
- io_tw_lock(req->ctx, locked);
+ io_tw_lock(req->ctx, ts);
if (ret == IOU_POLL_REMOVE_POLL_USE_RES)
- io_req_task_complete(req, locked);
+ io_req_task_complete(req, ts);
else if (ret == IOU_POLL_DONE || ret == IOU_POLL_REISSUE)
- io_req_task_submit(req, locked);
+ io_req_task_submit(req, ts);
else
io_req_defer_failed(req, ret);
}
@@ -726,6 +726,7 @@ int io_arm_poll_handler(struct io_kiocb *req, unsigned issue_flags)
apoll = io_req_alloc_apoll(req, issue_flags);
if (!apoll)
return IO_APOLL_ABORTED;
+ req->flags &= ~(REQ_F_SINGLE_POLL | REQ_F_DOUBLE_POLL);
req->flags |= REQ_F_POLLED;
ipt.pt._qproc = io_async_queue_proc;
@@ -976,7 +977,7 @@ int io_poll_remove(struct io_kiocb *req, unsigned int issue_flags)
struct io_hash_bucket *bucket;
struct io_kiocb *preq;
int ret2, ret = 0;
- bool locked;
+ struct io_tw_state ts = {};
preq = io_poll_find(ctx, true, &cd, &ctx->cancel_table, &bucket);
ret2 = io_poll_disarm(preq);
@@ -1026,8 +1027,8 @@ int io_poll_remove(struct io_kiocb *req, unsigned int issue_flags)
req_set_fail(preq);
io_req_set_res(preq, -ECANCELED, 0);
- locked = !(issue_flags & IO_URING_F_UNLOCKED);
- io_req_task_complete(preq, &locked);
+ ts.locked = !(issue_flags & IO_URING_F_UNLOCKED);
+ io_req_task_complete(preq, &ts);
out:
if (ret < 0) {
req_set_fail(req);
diff --git a/io_uring/rsrc.c b/io_uring/rsrc.c
index 7a43aed..603a783 100644
--- a/io_uring/rsrc.c
+++ b/io_uring/rsrc.c
@@ -27,19 +27,13 @@ static int io_sqe_buffer_register(struct io_ring_ctx *ctx, struct iovec *iov,
struct io_mapped_ubuf **pimu,
struct page **last_hpage);
-#define IO_RSRC_REF_BATCH 100
-
/* only define max */
#define IORING_MAX_FIXED_FILES (1U << 20)
#define IORING_MAX_REG_BUFFERS (1U << 14)
-void io_rsrc_refs_drop(struct io_ring_ctx *ctx)
- __must_hold(&ctx->uring_lock)
+static inline bool io_put_rsrc_data_ref(struct io_rsrc_data *rsrc_data)
{
- if (ctx->rsrc_cached_refs) {
- io_rsrc_put_node(ctx->rsrc_node, ctx->rsrc_cached_refs);
- ctx->rsrc_cached_refs = 0;
- }
+ return !--rsrc_data->refs;
}
int __io_account_mem(struct user_struct *user, unsigned long nr_pages)
@@ -151,132 +145,84 @@ static void io_buffer_unmap(struct io_ring_ctx *ctx, struct io_mapped_ubuf **slo
*slot = NULL;
}
-void io_rsrc_refs_refill(struct io_ring_ctx *ctx)
- __must_hold(&ctx->uring_lock)
+static void io_rsrc_put_work_one(struct io_rsrc_data *rsrc_data,
+ struct io_rsrc_put *prsrc)
{
- ctx->rsrc_cached_refs += IO_RSRC_REF_BATCH;
- percpu_ref_get_many(&ctx->rsrc_node->refs, IO_RSRC_REF_BATCH);
+ struct io_ring_ctx *ctx = rsrc_data->ctx;
+
+ if (prsrc->tag)
+ io_post_aux_cqe(ctx, prsrc->tag, 0, 0);
+ rsrc_data->do_put(ctx, prsrc);
}
static void __io_rsrc_put_work(struct io_rsrc_node *ref_node)
{
struct io_rsrc_data *rsrc_data = ref_node->rsrc_data;
- struct io_ring_ctx *ctx = rsrc_data->ctx;
struct io_rsrc_put *prsrc, *tmp;
- list_for_each_entry_safe(prsrc, tmp, &ref_node->rsrc_list, list) {
+ if (ref_node->inline_items)
+ io_rsrc_put_work_one(rsrc_data, &ref_node->item);
+
+ list_for_each_entry_safe(prsrc, tmp, &ref_node->item_list, list) {
list_del(&prsrc->list);
-
- if (prsrc->tag) {
- if (ctx->flags & IORING_SETUP_IOPOLL) {
- mutex_lock(&ctx->uring_lock);
- io_post_aux_cqe(ctx, prsrc->tag, 0, 0);
- mutex_unlock(&ctx->uring_lock);
- } else {
- io_post_aux_cqe(ctx, prsrc->tag, 0, 0);
- }
- }
-
- rsrc_data->do_put(ctx, prsrc);
+ io_rsrc_put_work_one(rsrc_data, prsrc);
kfree(prsrc);
}
- io_rsrc_node_destroy(ref_node);
- if (atomic_dec_and_test(&rsrc_data->refs))
+ io_rsrc_node_destroy(rsrc_data->ctx, ref_node);
+ if (io_put_rsrc_data_ref(rsrc_data))
complete(&rsrc_data->done);
}
-void io_rsrc_put_work(struct work_struct *work)
-{
- struct io_ring_ctx *ctx;
- struct llist_node *node;
-
- ctx = container_of(work, struct io_ring_ctx, rsrc_put_work.work);
- node = llist_del_all(&ctx->rsrc_put_llist);
-
- while (node) {
- struct io_rsrc_node *ref_node;
- struct llist_node *next = node->next;
-
- ref_node = llist_entry(node, struct io_rsrc_node, llist);
- __io_rsrc_put_work(ref_node);
- node = next;
- }
-}
-
-void io_rsrc_put_tw(struct callback_head *cb)
-{
- struct io_ring_ctx *ctx = container_of(cb, struct io_ring_ctx,
- rsrc_put_tw);
-
- io_rsrc_put_work(&ctx->rsrc_put_work.work);
-}
-
void io_wait_rsrc_data(struct io_rsrc_data *data)
{
- if (data && !atomic_dec_and_test(&data->refs))
+ if (data && !io_put_rsrc_data_ref(data))
wait_for_completion(&data->done);
}
-void io_rsrc_node_destroy(struct io_rsrc_node *ref_node)
+void io_rsrc_node_destroy(struct io_ring_ctx *ctx, struct io_rsrc_node *node)
{
- percpu_ref_exit(&ref_node->refs);
- kfree(ref_node);
+ if (!io_alloc_cache_put(&ctx->rsrc_node_cache, &node->cache))
+ kfree(node);
}
-static __cold void io_rsrc_node_ref_zero(struct percpu_ref *ref)
+void io_rsrc_node_ref_zero(struct io_rsrc_node *node)
+ __must_hold(&node->rsrc_data->ctx->uring_lock)
{
- struct io_rsrc_node *node = container_of(ref, struct io_rsrc_node, refs);
struct io_ring_ctx *ctx = node->rsrc_data->ctx;
- unsigned long flags;
- bool first_add = false;
- unsigned long delay = HZ;
- spin_lock_irqsave(&ctx->rsrc_ref_lock, flags);
node->done = true;
-
- /* if we are mid-quiesce then do not delay */
- if (node->rsrc_data->quiesce)
- delay = 0;
-
while (!list_empty(&ctx->rsrc_ref_list)) {
node = list_first_entry(&ctx->rsrc_ref_list,
struct io_rsrc_node, node);
/* recycle ref nodes in order */
if (!node->done)
break;
+
list_del(&node->node);
- first_add |= llist_add(&node->llist, &ctx->rsrc_put_llist);
+ __io_rsrc_put_work(node);
}
- spin_unlock_irqrestore(&ctx->rsrc_ref_lock, flags);
-
- if (!first_add)
- return;
-
- if (ctx->submitter_task) {
- if (!task_work_add(ctx->submitter_task, &ctx->rsrc_put_tw,
- ctx->notify_method))
- return;
- }
- mod_delayed_work(system_wq, &ctx->rsrc_put_work, delay);
}
-static struct io_rsrc_node *io_rsrc_node_alloc(void)
+static struct io_rsrc_node *io_rsrc_node_alloc(struct io_ring_ctx *ctx)
{
struct io_rsrc_node *ref_node;
+ struct io_cache_entry *entry;
- ref_node = kzalloc(sizeof(*ref_node), GFP_KERNEL);
- if (!ref_node)
- return NULL;
-
- if (percpu_ref_init(&ref_node->refs, io_rsrc_node_ref_zero,
- 0, GFP_KERNEL)) {
- kfree(ref_node);
- return NULL;
+ entry = io_alloc_cache_get(&ctx->rsrc_node_cache);
+ if (entry) {
+ ref_node = container_of(entry, struct io_rsrc_node, cache);
+ } else {
+ ref_node = kzalloc(sizeof(*ref_node), GFP_KERNEL);
+ if (!ref_node)
+ return NULL;
}
+
+ ref_node->refs = 1;
INIT_LIST_HEAD(&ref_node->node);
- INIT_LIST_HEAD(&ref_node->rsrc_list);
+ INIT_LIST_HEAD(&ref_node->item_list);
ref_node->done = false;
+ ref_node->inline_items = 0;
return ref_node;
}
@@ -287,18 +233,15 @@ void io_rsrc_node_switch(struct io_ring_ctx *ctx,
WARN_ON_ONCE(!ctx->rsrc_backup_node);
WARN_ON_ONCE(data_to_kill && !ctx->rsrc_node);
- io_rsrc_refs_drop(ctx);
-
if (data_to_kill) {
struct io_rsrc_node *rsrc_node = ctx->rsrc_node;
rsrc_node->rsrc_data = data_to_kill;
- spin_lock_irq(&ctx->rsrc_ref_lock);
list_add_tail(&rsrc_node->node, &ctx->rsrc_ref_list);
- spin_unlock_irq(&ctx->rsrc_ref_lock);
- atomic_inc(&data_to_kill->refs);
- percpu_ref_kill(&rsrc_node->refs);
+ data_to_kill->refs++;
+ /* put master ref */
+ io_put_rsrc_node(ctx, rsrc_node);
ctx->rsrc_node = NULL;
}
@@ -312,7 +255,7 @@ int io_rsrc_node_switch_start(struct io_ring_ctx *ctx)
{
if (ctx->rsrc_backup_node)
return 0;
- ctx->rsrc_backup_node = io_rsrc_node_alloc();
+ ctx->rsrc_backup_node = io_rsrc_node_alloc(ctx);
return ctx->rsrc_backup_node ? 0 : -ENOMEM;
}
@@ -329,8 +272,8 @@ __cold static int io_rsrc_ref_quiesce(struct io_rsrc_data *data,
return ret;
io_rsrc_node_switch(ctx, data);
- /* kill initial ref, already quiesced if zero */
- if (atomic_dec_and_test(&data->refs))
+ /* kill initial ref */
+ if (io_put_rsrc_data_ref(data))
return 0;
data->quiesce = true;
@@ -338,19 +281,19 @@ __cold static int io_rsrc_ref_quiesce(struct io_rsrc_data *data,
do {
ret = io_run_task_work_sig(ctx);
if (ret < 0) {
- atomic_inc(&data->refs);
- /* wait for all works potentially completing data->done */
- flush_delayed_work(&ctx->rsrc_put_work);
- reinit_completion(&data->done);
mutex_lock(&ctx->uring_lock);
+ if (!data->refs) {
+ ret = 0;
+ } else {
+ /* restore the master reference */
+ data->refs++;
+ }
break;
}
-
- flush_delayed_work(&ctx->rsrc_put_work);
ret = wait_for_completion_interruptible(&data->done);
if (!ret) {
mutex_lock(&ctx->uring_lock);
- if (atomic_read(&data->refs) <= 0)
+ if (!data->refs)
break;
/*
* it has been revived by another thread while
@@ -425,6 +368,7 @@ __cold static int io_rsrc_data_alloc(struct io_ring_ctx *ctx,
data->nr = nr;
data->ctx = ctx;
data->do_put = do_put;
+ data->refs = 1;
if (utags) {
ret = -EFAULT;
for (i = 0; i < nr; i++) {
@@ -435,8 +379,6 @@ __cold static int io_rsrc_data_alloc(struct io_ring_ctx *ctx,
goto fail;
}
}
-
- atomic_set(&data->refs, 1);
init_completion(&data->done);
*pdata = data;
return 0;
@@ -758,15 +700,23 @@ int io_queue_rsrc_removal(struct io_rsrc_data *data, unsigned idx,
{
u64 *tag_slot = io_get_tag_slot(data, idx);
struct io_rsrc_put *prsrc;
+ bool inline_item = true;
- prsrc = kzalloc(sizeof(*prsrc), GFP_KERNEL);
- if (!prsrc)
- return -ENOMEM;
+ if (!node->inline_items) {
+ prsrc = &node->item;
+ node->inline_items++;
+ } else {
+ prsrc = kzalloc(sizeof(*prsrc), GFP_KERNEL);
+ if (!prsrc)
+ return -ENOMEM;
+ inline_item = false;
+ }
prsrc->tag = *tag_slot;
*tag_slot = 0;
prsrc->rsrc = rsrc;
- list_add(&prsrc->list, &node->rsrc_list);
+ if (!inline_item)
+ list_add(&prsrc->list, &node->item_list);
return 0;
}
diff --git a/io_uring/rsrc.h b/io_uring/rsrc.h
index 2b87436..8729f2f 100644
--- a/io_uring/rsrc.h
+++ b/io_uring/rsrc.h
@@ -4,6 +4,10 @@
#include <net/af_unix.h>
+#include "alloc_cache.h"
+
+#define IO_NODE_ALLOC_CACHE_MAX 32
+
#define IO_RSRC_TAG_TABLE_SHIFT (PAGE_SHIFT - 3)
#define IO_RSRC_TAG_TABLE_MAX (1U << IO_RSRC_TAG_TABLE_SHIFT)
#define IO_RSRC_TAG_TABLE_MASK (IO_RSRC_TAG_TABLE_MAX - 1)
@@ -31,18 +35,29 @@ struct io_rsrc_data {
u64 **tags;
unsigned int nr;
rsrc_put_fn *do_put;
- atomic_t refs;
struct completion done;
+ int refs;
bool quiesce;
};
struct io_rsrc_node {
- struct percpu_ref refs;
+ union {
+ struct io_cache_entry cache;
+ struct io_rsrc_data *rsrc_data;
+ };
struct list_head node;
- struct list_head rsrc_list;
- struct io_rsrc_data *rsrc_data;
struct llist_node llist;
+ int refs;
bool done;
+
+ /*
+ * Keeps a list of struct io_rsrc_put to be completed. Each entry
+ * represents one rsrc (e.g. file or buffer), but all of them should've
+ * came from the same table and so are of the same type.
+ */
+ struct list_head item_list;
+ struct io_rsrc_put item;
+ int inline_items;
};
struct io_mapped_ubuf {
@@ -54,11 +69,10 @@ struct io_mapped_ubuf {
};
void io_rsrc_put_tw(struct callback_head *cb);
+void io_rsrc_node_ref_zero(struct io_rsrc_node *node);
void io_rsrc_put_work(struct work_struct *work);
-void io_rsrc_refs_refill(struct io_ring_ctx *ctx);
void io_wait_rsrc_data(struct io_rsrc_data *data);
-void io_rsrc_node_destroy(struct io_rsrc_node *ref_node);
-void io_rsrc_refs_drop(struct io_ring_ctx *ctx);
+void io_rsrc_node_destroy(struct io_ring_ctx *ctx, struct io_rsrc_node *ref_node);
int io_rsrc_node_switch_start(struct io_ring_ctx *ctx);
int io_queue_rsrc_removal(struct io_rsrc_data *data, unsigned idx,
struct io_rsrc_node *node, void *rsrc);
@@ -107,36 +121,24 @@ int io_register_rsrc_update(struct io_ring_ctx *ctx, void __user *arg,
int io_register_rsrc(struct io_ring_ctx *ctx, void __user *arg,
unsigned int size, unsigned int type);
-static inline void io_rsrc_put_node(struct io_rsrc_node *node, int nr)
+static inline void io_put_rsrc_node(struct io_ring_ctx *ctx, struct io_rsrc_node *node)
{
- percpu_ref_put_many(&node->refs, nr);
-}
+ lockdep_assert_held(&ctx->uring_lock);
-static inline void io_req_put_rsrc(struct io_kiocb *req)
-{
- if (req->rsrc_node)
- io_rsrc_put_node(req->rsrc_node, 1);
+ if (node && !--node->refs)
+ io_rsrc_node_ref_zero(node);
}
static inline void io_req_put_rsrc_locked(struct io_kiocb *req,
struct io_ring_ctx *ctx)
- __must_hold(&ctx->uring_lock)
{
- struct io_rsrc_node *node = req->rsrc_node;
-
- if (node) {
- if (node == ctx->rsrc_node)
- ctx->rsrc_cached_refs++;
- else
- io_rsrc_put_node(node, 1);
- }
+ io_put_rsrc_node(ctx, req->rsrc_node);
}
-static inline void io_charge_rsrc_node(struct io_ring_ctx *ctx)
+static inline void io_charge_rsrc_node(struct io_ring_ctx *ctx,
+ struct io_rsrc_node *node)
{
- ctx->rsrc_cached_refs--;
- if (unlikely(ctx->rsrc_cached_refs < 0))
- io_rsrc_refs_refill(ctx);
+ node->refs++;
}
static inline void io_req_set_rsrc_node(struct io_kiocb *req,
@@ -144,15 +146,13 @@ static inline void io_req_set_rsrc_node(struct io_kiocb *req,
unsigned int issue_flags)
{
if (!req->rsrc_node) {
+ io_ring_submit_lock(ctx, issue_flags);
+
+ lockdep_assert_held(&ctx->uring_lock);
+
req->rsrc_node = ctx->rsrc_node;
-
- if (!(issue_flags & IO_URING_F_UNLOCKED)) {
- lockdep_assert_held(&ctx->uring_lock);
-
- io_charge_rsrc_node(ctx);
- } else {
- percpu_ref_get(&req->rsrc_node->refs);
- }
+ io_charge_rsrc_node(ctx, ctx->rsrc_node);
+ io_ring_submit_unlock(ctx, issue_flags);
}
}
diff --git a/io_uring/rw.c b/io_uring/rw.c
index f33ba6f..3f118ed 100644
--- a/io_uring/rw.c
+++ b/io_uring/rw.c
@@ -283,16 +283,16 @@ static inline int io_fixup_rw_res(struct io_kiocb *req, long res)
return res;
}
-static void io_req_rw_complete(struct io_kiocb *req, bool *locked)
+static void io_req_rw_complete(struct io_kiocb *req, struct io_tw_state *ts)
{
io_req_io_end(req);
if (req->flags & (REQ_F_BUFFER_SELECTED|REQ_F_BUFFER_RING)) {
- unsigned issue_flags = *locked ? 0 : IO_URING_F_UNLOCKED;
+ unsigned issue_flags = ts->locked ? 0 : IO_URING_F_UNLOCKED;
req->cqe.flags |= io_put_kbuf(req, issue_flags);
}
- io_req_task_complete(req, locked);
+ io_req_task_complete(req, ts);
}
static void io_complete_rw(struct kiocb *kiocb, long res)
@@ -304,7 +304,7 @@ static void io_complete_rw(struct kiocb *kiocb, long res)
return;
io_req_set_res(req, io_fixup_rw_res(req, res), 0);
req->io_task_work.func = io_req_rw_complete;
- io_req_task_work_add(req);
+ __io_req_task_work_add(req, IOU_F_TWQ_LAZY_WAKE);
}
static void io_complete_rw_iopoll(struct kiocb *kiocb, long res)
@@ -1001,7 +1001,7 @@ void io_rw_fail(struct io_kiocb *req)
int io_do_iopoll(struct io_ring_ctx *ctx, bool force_nonspin)
{
struct io_wq_work_node *pos, *start, *prev;
- unsigned int poll_flags = BLK_POLL_NOSLEEP;
+ unsigned int poll_flags = 0;
DEFINE_IO_COMP_BATCH(iob);
int nr_events = 0;
diff --git a/io_uring/timeout.c b/io_uring/timeout.c
index 826a51b..5c6c6f7 100644
--- a/io_uring/timeout.c
+++ b/io_uring/timeout.c
@@ -101,9 +101,9 @@ __cold void io_flush_timeouts(struct io_ring_ctx *ctx)
spin_unlock_irq(&ctx->timeout_lock);
}
-static void io_req_tw_fail_links(struct io_kiocb *link, bool *locked)
+static void io_req_tw_fail_links(struct io_kiocb *link, struct io_tw_state *ts)
{
- io_tw_lock(link->ctx, locked);
+ io_tw_lock(link->ctx, ts);
while (link) {
struct io_kiocb *nxt = link->link;
long res = -ECANCELED;
@@ -112,7 +112,7 @@ static void io_req_tw_fail_links(struct io_kiocb *link, bool *locked)
res = link->cqe.res;
link->link = NULL;
io_req_set_res(link, res, 0);
- io_req_task_complete(link, locked);
+ io_req_task_complete(link, ts);
link = nxt;
}
}
@@ -265,9 +265,9 @@ int io_timeout_cancel(struct io_ring_ctx *ctx, struct io_cancel_data *cd)
return 0;
}
-static void io_req_task_link_timeout(struct io_kiocb *req, bool *locked)
+static void io_req_task_link_timeout(struct io_kiocb *req, struct io_tw_state *ts)
{
- unsigned issue_flags = *locked ? 0 : IO_URING_F_UNLOCKED;
+ unsigned issue_flags = ts->locked ? 0 : IO_URING_F_UNLOCKED;
struct io_timeout *timeout = io_kiocb_to_cmd(req, struct io_timeout);
struct io_kiocb *prev = timeout->prev;
int ret = -ENOENT;
@@ -282,11 +282,11 @@ static void io_req_task_link_timeout(struct io_kiocb *req, bool *locked)
ret = io_try_cancel(req->task->io_uring, &cd, issue_flags);
}
io_req_set_res(req, ret ?: -ETIME, 0);
- io_req_task_complete(req, locked);
+ io_req_task_complete(req, ts);
io_put_req(prev);
} else {
io_req_set_res(req, -ETIME, 0);
- io_req_task_complete(req, locked);
+ io_req_task_complete(req, ts);
}
}
diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c
index 9a1dee5..f7a96bc7 100644
--- a/io_uring/uring_cmd.c
+++ b/io_uring/uring_cmd.c
@@ -12,10 +12,10 @@
#include "rsrc.h"
#include "uring_cmd.h"
-static void io_uring_cmd_work(struct io_kiocb *req, bool *locked)
+static void io_uring_cmd_work(struct io_kiocb *req, struct io_tw_state *ts)
{
struct io_uring_cmd *ioucmd = io_kiocb_to_cmd(req, struct io_uring_cmd);
- unsigned issue_flags = *locked ? 0 : IO_URING_F_UNLOCKED;
+ unsigned issue_flags = ts->locked ? 0 : IO_URING_F_UNLOCKED;
ioucmd->task_work_cb(ioucmd, issue_flags);
}
@@ -73,6 +73,7 @@ int io_uring_cmd_prep_async(struct io_kiocb *req)
cmd_size = uring_cmd_pdu_size(req->ctx->flags & IORING_SETUP_SQE128);
memcpy(req->async_data, ioucmd->cmd, cmd_size);
+ ioucmd->cmd = req->async_data;
return 0;
}
@@ -129,9 +130,6 @@ int io_uring_cmd(struct io_kiocb *req, unsigned int issue_flags)
WRITE_ONCE(ioucmd->cookie, NULL);
}
- if (req_has_async_data(req))
- ioucmd->cmd = req->async_data;
-
ret = file->f_op->uring_cmd(ioucmd, issue_flags);
if (ret == -EAGAIN) {
if (!req_has_async_data(req)) {
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c
index 03e3251..5b919ef 100644
--- a/kernel/dma/swiotlb.c
+++ b/kernel/dma/swiotlb.c
@@ -625,8 +625,8 @@ static int swiotlb_do_find_slots(struct device *dev, int area_index,
unsigned int iotlb_align_mask =
dma_get_min_align_mask(dev) & ~(IO_TLB_SIZE - 1);
unsigned int nslots = nr_slots(alloc_size), stride;
- unsigned int index, wrap, count = 0, i;
unsigned int offset = swiotlb_align_offset(dev, orig_addr);
+ unsigned int index, slots_checked, count = 0, i;
unsigned long flags;
unsigned int slot_base;
unsigned int slot_index;
@@ -635,29 +635,34 @@ static int swiotlb_do_find_slots(struct device *dev, int area_index,
BUG_ON(area_index >= mem->nareas);
/*
+ * For allocations of PAGE_SIZE or larger only look for page aligned
+ * allocations.
+ */
+ if (alloc_size >= PAGE_SIZE)
+ iotlb_align_mask &= PAGE_MASK;
+ iotlb_align_mask &= alloc_align_mask;
+
+ /*
* For mappings with an alignment requirement don't bother looping to
- * unaligned slots once we found an aligned one. For allocations of
- * PAGE_SIZE or larger only look for page aligned allocations.
+ * unaligned slots once we found an aligned one.
*/
stride = (iotlb_align_mask >> IO_TLB_SHIFT) + 1;
- if (alloc_size >= PAGE_SIZE)
- stride = max(stride, stride << (PAGE_SHIFT - IO_TLB_SHIFT));
- stride = max(stride, (alloc_align_mask >> IO_TLB_SHIFT) + 1);
spin_lock_irqsave(&area->lock, flags);
if (unlikely(nslots > mem->area_nslabs - area->used))
goto not_found;
slot_base = area_index * mem->area_nslabs;
- index = wrap = wrap_area_index(mem, ALIGN(area->index, stride));
+ index = area->index;
- do {
+ for (slots_checked = 0; slots_checked < mem->area_nslabs; ) {
slot_index = slot_base + index;
if (orig_addr &&
(slot_addr(tbl_dma_addr, slot_index) &
iotlb_align_mask) != (orig_addr & iotlb_align_mask)) {
index = wrap_area_index(mem, index + 1);
+ slots_checked++;
continue;
}
@@ -673,7 +678,8 @@ static int swiotlb_do_find_slots(struct device *dev, int area_index,
goto found;
}
index = wrap_area_index(mem, index + stride);
- } while (index != wrap);
+ slots_checked += stride;
+ }
not_found:
spin_unlock_irqrestore(&area->lock, flags);
@@ -693,10 +699,7 @@ static int swiotlb_do_find_slots(struct device *dev, int area_index,
/*
* Update the indices to avoid searching in the next round.
*/
- if (index + nslots < mem->area_nslabs)
- area->index = index + nslots;
- else
- area->index = 0;
+ area->index = wrap_area_index(mem, index + nslots);
area->used += nslots;
spin_unlock_irqrestore(&area->lock, flags);
return slot_index;
diff --git a/lib/iov_iter.c b/lib/iov_iter.c
index 86a066a..f3477c3 100644
--- a/lib/iov_iter.c
+++ b/lib/iov_iter.c
@@ -14,8 +14,6 @@
#include <linux/scatterlist.h>
#include <linux/instrumented.h>
-#define PIPE_PARANOIA /* for now */
-
/* covers ubuf and kbuf alike */
#define iterate_buf(i, n, base, len, off, __p, STEP) { \
size_t __maybe_unused off = 0; \
@@ -186,150 +184,6 @@ static int copyin(void *to, const void __user *from, size_t n)
return res;
}
-#ifdef PIPE_PARANOIA
-static bool sanity(const struct iov_iter *i)
-{
- struct pipe_inode_info *pipe = i->pipe;
- unsigned int p_head = pipe->head;
- unsigned int p_tail = pipe->tail;
- unsigned int p_occupancy = pipe_occupancy(p_head, p_tail);
- unsigned int i_head = i->head;
- unsigned int idx;
-
- if (i->last_offset) {
- struct pipe_buffer *p;
- if (unlikely(p_occupancy == 0))
- goto Bad; // pipe must be non-empty
- if (unlikely(i_head != p_head - 1))
- goto Bad; // must be at the last buffer...
-
- p = pipe_buf(pipe, i_head);
- if (unlikely(p->offset + p->len != abs(i->last_offset)))
- goto Bad; // ... at the end of segment
- } else {
- if (i_head != p_head)
- goto Bad; // must be right after the last buffer
- }
- return true;
-Bad:
- printk(KERN_ERR "idx = %d, offset = %d\n", i_head, i->last_offset);
- printk(KERN_ERR "head = %d, tail = %d, buffers = %d\n",
- p_head, p_tail, pipe->ring_size);
- for (idx = 0; idx < pipe->ring_size; idx++)
- printk(KERN_ERR "[%p %p %d %d]\n",
- pipe->bufs[idx].ops,
- pipe->bufs[idx].page,
- pipe->bufs[idx].offset,
- pipe->bufs[idx].len);
- WARN_ON(1);
- return false;
-}
-#else
-#define sanity(i) true
-#endif
-
-static struct page *push_anon(struct pipe_inode_info *pipe, unsigned size)
-{
- struct page *page = alloc_page(GFP_USER);
- if (page) {
- struct pipe_buffer *buf = pipe_buf(pipe, pipe->head++);
- *buf = (struct pipe_buffer) {
- .ops = &default_pipe_buf_ops,
- .page = page,
- .offset = 0,
- .len = size
- };
- }
- return page;
-}
-
-static void push_page(struct pipe_inode_info *pipe, struct page *page,
- unsigned int offset, unsigned int size)
-{
- struct pipe_buffer *buf = pipe_buf(pipe, pipe->head++);
- *buf = (struct pipe_buffer) {
- .ops = &page_cache_pipe_buf_ops,
- .page = page,
- .offset = offset,
- .len = size
- };
- get_page(page);
-}
-
-static inline int last_offset(const struct pipe_buffer *buf)
-{
- if (buf->ops == &default_pipe_buf_ops)
- return buf->len; // buf->offset is 0 for those
- else
- return -(buf->offset + buf->len);
-}
-
-static struct page *append_pipe(struct iov_iter *i, size_t size,
- unsigned int *off)
-{
- struct pipe_inode_info *pipe = i->pipe;
- int offset = i->last_offset;
- struct pipe_buffer *buf;
- struct page *page;
-
- if (offset > 0 && offset < PAGE_SIZE) {
- // some space in the last buffer; add to it
- buf = pipe_buf(pipe, pipe->head - 1);
- size = min_t(size_t, size, PAGE_SIZE - offset);
- buf->len += size;
- i->last_offset += size;
- i->count -= size;
- *off = offset;
- return buf->page;
- }
- // OK, we need a new buffer
- *off = 0;
- size = min_t(size_t, size, PAGE_SIZE);
- if (pipe_full(pipe->head, pipe->tail, pipe->max_usage))
- return NULL;
- page = push_anon(pipe, size);
- if (!page)
- return NULL;
- i->head = pipe->head - 1;
- i->last_offset = size;
- i->count -= size;
- return page;
-}
-
-static size_t copy_page_to_iter_pipe(struct page *page, size_t offset, size_t bytes,
- struct iov_iter *i)
-{
- struct pipe_inode_info *pipe = i->pipe;
- unsigned int head = pipe->head;
-
- if (unlikely(bytes > i->count))
- bytes = i->count;
-
- if (unlikely(!bytes))
- return 0;
-
- if (!sanity(i))
- return 0;
-
- if (offset && i->last_offset == -offset) { // could we merge it?
- struct pipe_buffer *buf = pipe_buf(pipe, head - 1);
- if (buf->page == page) {
- buf->len += bytes;
- i->last_offset -= bytes;
- i->count -= bytes;
- return bytes;
- }
- }
- if (pipe_full(pipe->head, pipe->tail, pipe->max_usage))
- return 0;
-
- push_page(pipe, page, offset, bytes);
- i->last_offset = -(offset + bytes);
- i->head = head;
- i->count -= bytes;
- return bytes;
-}
-
/*
* fault_in_iov_iter_readable - fault in iov iterator for reading
* @i: iterator
@@ -433,46 +287,6 @@ void iov_iter_init(struct iov_iter *i, unsigned int direction,
}
EXPORT_SYMBOL(iov_iter_init);
-// returns the offset in partial buffer (if any)
-static inline unsigned int pipe_npages(const struct iov_iter *i, int *npages)
-{
- struct pipe_inode_info *pipe = i->pipe;
- int used = pipe->head - pipe->tail;
- int off = i->last_offset;
-
- *npages = max((int)pipe->max_usage - used, 0);
-
- if (off > 0 && off < PAGE_SIZE) { // anon and not full
- (*npages)++;
- return off;
- }
- return 0;
-}
-
-static size_t copy_pipe_to_iter(const void *addr, size_t bytes,
- struct iov_iter *i)
-{
- unsigned int off, chunk;
-
- if (unlikely(bytes > i->count))
- bytes = i->count;
- if (unlikely(!bytes))
- return 0;
-
- if (!sanity(i))
- return 0;
-
- for (size_t n = bytes; n; n -= chunk) {
- struct page *page = append_pipe(i, n, &off);
- chunk = min_t(size_t, n, PAGE_SIZE - off);
- if (!page)
- return bytes - n;
- memcpy_to_page(page, off, addr, chunk);
- addr += chunk;
- }
- return bytes;
-}
-
static __wsum csum_and_memcpy(void *to, const void *from, size_t len,
__wsum sum, size_t off)
{
@@ -480,44 +294,10 @@ static __wsum csum_and_memcpy(void *to, const void *from, size_t len,
return csum_block_add(sum, next, off);
}
-static size_t csum_and_copy_to_pipe_iter(const void *addr, size_t bytes,
- struct iov_iter *i, __wsum *sump)
-{
- __wsum sum = *sump;
- size_t off = 0;
- unsigned int chunk, r;
-
- if (unlikely(bytes > i->count))
- bytes = i->count;
- if (unlikely(!bytes))
- return 0;
-
- if (!sanity(i))
- return 0;
-
- while (bytes) {
- struct page *page = append_pipe(i, bytes, &r);
- char *p;
-
- if (!page)
- break;
- chunk = min_t(size_t, bytes, PAGE_SIZE - r);
- p = kmap_local_page(page);
- sum = csum_and_memcpy(p + r, addr + off, chunk, sum, off);
- kunmap_local(p);
- off += chunk;
- bytes -= chunk;
- }
- *sump = sum;
- return off;
-}
-
size_t _copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
{
if (WARN_ON_ONCE(i->data_source))
return 0;
- if (unlikely(iov_iter_is_pipe(i)))
- return copy_pipe_to_iter(addr, bytes, i);
if (user_backed_iter(i))
might_fault();
iterate_and_advance(i, bytes, base, len, off,
@@ -539,42 +319,6 @@ static int copyout_mc(void __user *to, const void *from, size_t n)
return n;
}
-static size_t copy_mc_pipe_to_iter(const void *addr, size_t bytes,
- struct iov_iter *i)
-{
- size_t xfer = 0;
- unsigned int off, chunk;
-
- if (unlikely(bytes > i->count))
- bytes = i->count;
- if (unlikely(!bytes))
- return 0;
-
- if (!sanity(i))
- return 0;
-
- while (bytes) {
- struct page *page = append_pipe(i, bytes, &off);
- unsigned long rem;
- char *p;
-
- if (!page)
- break;
- chunk = min_t(size_t, bytes, PAGE_SIZE - off);
- p = kmap_local_page(page);
- rem = copy_mc_to_kernel(p + off, addr + xfer, chunk);
- chunk -= rem;
- kunmap_local(p);
- xfer += chunk;
- bytes -= chunk;
- if (rem) {
- iov_iter_revert(i, rem);
- break;
- }
- }
- return xfer;
-}
-
/**
* _copy_mc_to_iter - copy to iter with source memory error exception handling
* @addr: source kernel address
@@ -594,9 +338,8 @@ static size_t copy_mc_pipe_to_iter(const void *addr, size_t bytes,
* alignment and poison alignment assumptions to avoid re-triggering
* hardware exceptions.
*
- * * ITER_KVEC, ITER_PIPE, and ITER_BVEC can return short copies.
- * Compare to copy_to_iter() where only ITER_IOVEC attempts might return
- * a short copy.
+ * * ITER_KVEC and ITER_BVEC can return short copies. Compare to
+ * copy_to_iter() where only ITER_IOVEC attempts might return a short copy.
*
* Return: number of bytes copied (may be %0)
*/
@@ -604,8 +347,6 @@ size_t _copy_mc_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
{
if (WARN_ON_ONCE(i->data_source))
return 0;
- if (unlikely(iov_iter_is_pipe(i)))
- return copy_mc_pipe_to_iter(addr, bytes, i);
if (user_backed_iter(i))
might_fault();
__iterate_and_advance(i, bytes, base, len, off,
@@ -711,8 +452,6 @@ size_t copy_page_to_iter(struct page *page, size_t offset, size_t bytes,
return 0;
if (WARN_ON_ONCE(i->data_source))
return 0;
- if (unlikely(iov_iter_is_pipe(i)))
- return copy_page_to_iter_pipe(page, offset, bytes, i);
page += offset / PAGE_SIZE; // first subpage
offset %= PAGE_SIZE;
while (1) {
@@ -761,36 +500,8 @@ size_t copy_page_from_iter(struct page *page, size_t offset, size_t bytes,
}
EXPORT_SYMBOL(copy_page_from_iter);
-static size_t pipe_zero(size_t bytes, struct iov_iter *i)
-{
- unsigned int chunk, off;
-
- if (unlikely(bytes > i->count))
- bytes = i->count;
- if (unlikely(!bytes))
- return 0;
-
- if (!sanity(i))
- return 0;
-
- for (size_t n = bytes; n; n -= chunk) {
- struct page *page = append_pipe(i, n, &off);
- char *p;
-
- if (!page)
- return bytes - n;
- chunk = min_t(size_t, n, PAGE_SIZE - off);
- p = kmap_local_page(page);
- memset(p + off, 0, chunk);
- kunmap_local(p);
- }
- return bytes;
-}
-
size_t iov_iter_zero(size_t bytes, struct iov_iter *i)
{
- if (unlikely(iov_iter_is_pipe(i)))
- return pipe_zero(bytes, i);
iterate_and_advance(i, bytes, base, len, count,
clear_user(base, len),
memset(base, 0, len)
@@ -821,32 +532,6 @@ size_t copy_page_from_iter_atomic(struct page *page, unsigned offset, size_t byt
}
EXPORT_SYMBOL(copy_page_from_iter_atomic);
-static void pipe_advance(struct iov_iter *i, size_t size)
-{
- struct pipe_inode_info *pipe = i->pipe;
- int off = i->last_offset;
-
- if (!off && !size) {
- pipe_discard_from(pipe, i->start_head); // discard everything
- return;
- }
- i->count -= size;
- while (1) {
- struct pipe_buffer *buf = pipe_buf(pipe, i->head);
- if (off) /* make it relative to the beginning of buffer */
- size += abs(off) - buf->offset;
- if (size <= buf->len) {
- buf->len = size;
- i->last_offset = last_offset(buf);
- break;
- }
- size -= buf->len;
- i->head++;
- off = 0;
- }
- pipe_discard_from(pipe, i->head + 1); // discard everything past this one
-}
-
static void iov_iter_bvec_advance(struct iov_iter *i, size_t size)
{
const struct bio_vec *bvec, *end;
@@ -898,8 +583,6 @@ void iov_iter_advance(struct iov_iter *i, size_t size)
iov_iter_iovec_advance(i, size);
} else if (iov_iter_is_bvec(i)) {
iov_iter_bvec_advance(i, size);
- } else if (iov_iter_is_pipe(i)) {
- pipe_advance(i, size);
} else if (iov_iter_is_discard(i)) {
i->count -= size;
}
@@ -913,26 +596,6 @@ void iov_iter_revert(struct iov_iter *i, size_t unroll)
if (WARN_ON(unroll > MAX_RW_COUNT))
return;
i->count += unroll;
- if (unlikely(iov_iter_is_pipe(i))) {
- struct pipe_inode_info *pipe = i->pipe;
- unsigned int head = pipe->head;
-
- while (head > i->start_head) {
- struct pipe_buffer *b = pipe_buf(pipe, --head);
- if (unroll < b->len) {
- b->len -= unroll;
- i->last_offset = last_offset(b);
- i->head = head;
- return;
- }
- unroll -= b->len;
- pipe_buf_release(pipe, b);
- pipe->head--;
- }
- i->last_offset = 0;
- i->head = head;
- return;
- }
if (unlikely(iov_iter_is_discard(i)))
return;
if (unroll <= i->iov_offset) {
@@ -1020,24 +683,6 @@ void iov_iter_bvec(struct iov_iter *i, unsigned int direction,
}
EXPORT_SYMBOL(iov_iter_bvec);
-void iov_iter_pipe(struct iov_iter *i, unsigned int direction,
- struct pipe_inode_info *pipe,
- size_t count)
-{
- BUG_ON(direction != READ);
- WARN_ON(pipe_full(pipe->head, pipe->tail, pipe->ring_size));
- *i = (struct iov_iter){
- .iter_type = ITER_PIPE,
- .data_source = false,
- .pipe = pipe,
- .head = pipe->head,
- .start_head = pipe->head,
- .last_offset = 0,
- .count = count
- };
-}
-EXPORT_SYMBOL(iov_iter_pipe);
-
/**
* iov_iter_xarray - Initialise an I/O iterator to use the pages in an xarray
* @i: The iterator to initialise.
@@ -1163,19 +808,6 @@ bool iov_iter_is_aligned(const struct iov_iter *i, unsigned addr_mask,
if (iov_iter_is_bvec(i))
return iov_iter_aligned_bvec(i, addr_mask, len_mask);
- if (iov_iter_is_pipe(i)) {
- size_t size = i->count;
-
- if (size & len_mask)
- return false;
- if (size && i->last_offset > 0) {
- if (i->last_offset & addr_mask)
- return false;
- }
-
- return true;
- }
-
if (iov_iter_is_xarray(i)) {
if (i->count & len_mask)
return false;
@@ -1246,14 +878,6 @@ unsigned long iov_iter_alignment(const struct iov_iter *i)
if (iov_iter_is_bvec(i))
return iov_iter_alignment_bvec(i);
- if (iov_iter_is_pipe(i)) {
- size_t size = i->count;
-
- if (size && i->last_offset > 0)
- return size | i->last_offset;
- return size;
- }
-
if (iov_iter_is_xarray(i))
return (i->xarray_start + i->iov_offset) | i->count;
@@ -1306,36 +930,6 @@ static int want_pages_array(struct page ***res, size_t size,
return count;
}
-static ssize_t pipe_get_pages(struct iov_iter *i,
- struct page ***pages, size_t maxsize, unsigned maxpages,
- size_t *start)
-{
- unsigned int npages, count, off, chunk;
- struct page **p;
- size_t left;
-
- if (!sanity(i))
- return -EFAULT;
-
- *start = off = pipe_npages(i, &npages);
- if (!npages)
- return -EFAULT;
- count = want_pages_array(pages, maxsize, off, min(npages, maxpages));
- if (!count)
- return -ENOMEM;
- p = *pages;
- for (npages = 0, left = maxsize ; npages < count; npages++, left -= chunk) {
- struct page *page = append_pipe(i, left, &off);
- if (!page)
- break;
- chunk = min_t(size_t, left, PAGE_SIZE - off);
- get_page(*p++ = page);
- }
- if (!npages)
- return -EFAULT;
- return maxsize - left;
-}
-
static ssize_t iter_xarray_populate_pages(struct page **pages, struct xarray *xa,
pgoff_t index, unsigned int nr_pages)
{
@@ -1429,8 +1023,7 @@ static struct page *first_bvec_segment(const struct iov_iter *i,
static ssize_t __iov_iter_get_pages_alloc(struct iov_iter *i,
struct page ***pages, size_t maxsize,
- unsigned int maxpages, size_t *start,
- iov_iter_extraction_t extraction_flags)
+ unsigned int maxpages, size_t *start)
{
unsigned int n, gup_flags = 0;
@@ -1440,8 +1033,6 @@ static ssize_t __iov_iter_get_pages_alloc(struct iov_iter *i,
return 0;
if (maxsize > MAX_RW_COUNT)
maxsize = MAX_RW_COUNT;
- if (extraction_flags & ITER_ALLOW_P2PDMA)
- gup_flags |= FOLL_PCI_P2PDMA;
if (likely(user_backed_iter(i))) {
unsigned long addr;
@@ -1486,57 +1077,37 @@ static ssize_t __iov_iter_get_pages_alloc(struct iov_iter *i,
}
return maxsize;
}
- if (iov_iter_is_pipe(i))
- return pipe_get_pages(i, pages, maxsize, maxpages, start);
if (iov_iter_is_xarray(i))
return iter_xarray_get_pages(i, pages, maxsize, maxpages, start);
return -EFAULT;
}
-ssize_t iov_iter_get_pages(struct iov_iter *i,
- struct page **pages, size_t maxsize, unsigned maxpages,
- size_t *start, iov_iter_extraction_t extraction_flags)
+ssize_t iov_iter_get_pages2(struct iov_iter *i, struct page **pages,
+ size_t maxsize, unsigned maxpages, size_t *start)
{
if (!maxpages)
return 0;
BUG_ON(!pages);
- return __iov_iter_get_pages_alloc(i, &pages, maxsize, maxpages,
- start, extraction_flags);
-}
-EXPORT_SYMBOL_GPL(iov_iter_get_pages);
-
-ssize_t iov_iter_get_pages2(struct iov_iter *i, struct page **pages,
- size_t maxsize, unsigned maxpages, size_t *start)
-{
- return iov_iter_get_pages(i, pages, maxsize, maxpages, start, 0);
+ return __iov_iter_get_pages_alloc(i, &pages, maxsize, maxpages, start);
}
EXPORT_SYMBOL(iov_iter_get_pages2);
-ssize_t iov_iter_get_pages_alloc(struct iov_iter *i,
- struct page ***pages, size_t maxsize,
- size_t *start, iov_iter_extraction_t extraction_flags)
+ssize_t iov_iter_get_pages_alloc2(struct iov_iter *i,
+ struct page ***pages, size_t maxsize, size_t *start)
{
ssize_t len;
*pages = NULL;
- len = __iov_iter_get_pages_alloc(i, pages, maxsize, ~0U, start,
- extraction_flags);
+ len = __iov_iter_get_pages_alloc(i, pages, maxsize, ~0U, start);
if (len <= 0) {
kvfree(*pages);
*pages = NULL;
}
return len;
}
-EXPORT_SYMBOL_GPL(iov_iter_get_pages_alloc);
-
-ssize_t iov_iter_get_pages_alloc2(struct iov_iter *i,
- struct page ***pages, size_t maxsize, size_t *start)
-{
- return iov_iter_get_pages_alloc(i, pages, maxsize, start, 0);
-}
-EXPORT_SYMBOL(iov_iter_get_pages_alloc2);
+EXPORT_SYMBOL_GPL(iov_iter_get_pages_alloc2);
size_t csum_and_copy_from_iter(void *addr, size_t bytes, __wsum *csum,
struct iov_iter *i)
@@ -1577,9 +1148,7 @@ size_t csum_and_copy_to_iter(const void *addr, size_t bytes, void *_csstate,
}
sum = csum_shift(csstate->csum, csstate->off);
- if (unlikely(iov_iter_is_pipe(i)))
- bytes = csum_and_copy_to_pipe_iter(addr, bytes, i, &sum);
- else iterate_and_advance(i, bytes, base, len, off, ({
+ iterate_and_advance(i, bytes, base, len, off, ({
next = csum_and_copy_to_user(addr + off, base, len);
sum = csum_block_add(sum, next, off);
next ? 0 : len;
@@ -1664,15 +1233,6 @@ int iov_iter_npages(const struct iov_iter *i, int maxpages)
return iov_npages(i, maxpages);
if (iov_iter_is_bvec(i))
return bvec_npages(i, maxpages);
- if (iov_iter_is_pipe(i)) {
- int npages;
-
- if (!sanity(i))
- return 0;
-
- pipe_npages(i, &npages);
- return min(npages, maxpages);
- }
if (iov_iter_is_xarray(i)) {
unsigned offset = (i->xarray_start + i->iov_offset) % PAGE_SIZE;
int npages = DIV_ROUND_UP(offset + i->count, PAGE_SIZE);
@@ -1685,10 +1245,6 @@ EXPORT_SYMBOL(iov_iter_npages);
const void *dup_iter(struct iov_iter *new, struct iov_iter *old, gfp_t flags)
{
*new = *old;
- if (unlikely(iov_iter_is_pipe(new))) {
- WARN_ON(1);
- return NULL;
- }
if (iov_iter_is_bvec(new))
return new->bvec = kmemdup(new->bvec,
new->nr_segs * sizeof(struct bio_vec),
diff --git a/mm/filemap.c b/mm/filemap.c
index 2723104..470be06 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2690,8 +2690,7 @@ ssize_t filemap_read(struct kiocb *iocb, struct iov_iter *iter,
if (unlikely(iocb->ki_pos >= i_size_read(inode)))
break;
- error = filemap_get_pages(iocb, iter->count, &fbatch,
- iov_iter_is_pipe(iter));
+ error = filemap_get_pages(iocb, iter->count, &fbatch, false);
if (error < 0)
break;
@@ -2967,7 +2966,6 @@ ssize_t filemap_splice_read(struct file *in, loff_t *ppos,
return total_spliced ? total_spliced : error;
}
-EXPORT_SYMBOL(filemap_splice_read);
static inline loff_t folio_seek_hole_data(struct xa_state *xas,
struct address_space *mapping, struct folio *folio,
diff --git a/mm/shmem.c b/mm/shmem.c
index 448f393..a0c268d 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2719,6 +2719,138 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
return retval ? retval : error;
}
+static bool zero_pipe_buf_get(struct pipe_inode_info *pipe,
+ struct pipe_buffer *buf)
+{
+ return true;
+}
+
+static void zero_pipe_buf_release(struct pipe_inode_info *pipe,
+ struct pipe_buffer *buf)
+{
+}
+
+static bool zero_pipe_buf_try_steal(struct pipe_inode_info *pipe,
+ struct pipe_buffer *buf)
+{
+ return false;
+}
+
+static const struct pipe_buf_operations zero_pipe_buf_ops = {
+ .release = zero_pipe_buf_release,
+ .try_steal = zero_pipe_buf_try_steal,
+ .get = zero_pipe_buf_get,
+};
+
+static size_t splice_zeropage_into_pipe(struct pipe_inode_info *pipe,
+ loff_t fpos, size_t size)
+{
+ size_t offset = fpos & ~PAGE_MASK;
+
+ size = min_t(size_t, size, PAGE_SIZE - offset);
+
+ if (!pipe_full(pipe->head, pipe->tail, pipe->max_usage)) {
+ struct pipe_buffer *buf = pipe_head_buf(pipe);
+
+ *buf = (struct pipe_buffer) {
+ .ops = &zero_pipe_buf_ops,
+ .page = ZERO_PAGE(0),
+ .offset = offset,
+ .len = size,
+ };
+ pipe->head++;
+ }
+
+ return size;
+}
+
+static ssize_t shmem_file_splice_read(struct file *in, loff_t *ppos,
+ struct pipe_inode_info *pipe,
+ size_t len, unsigned int flags)
+{
+ struct inode *inode = file_inode(in);
+ struct address_space *mapping = inode->i_mapping;
+ struct folio *folio = NULL;
+ size_t total_spliced = 0, used, npages, n, part;
+ loff_t isize;
+ int error = 0;
+
+ /* Work out how much data we can actually add into the pipe */
+ used = pipe_occupancy(pipe->head, pipe->tail);
+ npages = max_t(ssize_t, pipe->max_usage - used, 0);
+ len = min_t(size_t, len, npages * PAGE_SIZE);
+
+ do {
+ if (*ppos >= i_size_read(inode))
+ break;
+
+ error = shmem_get_folio(inode, *ppos / PAGE_SIZE, &folio, SGP_READ);
+ if (error) {
+ if (error == -EINVAL)
+ error = 0;
+ break;
+ }
+ if (folio) {
+ folio_unlock(folio);
+
+ if (folio_test_hwpoison(folio)) {
+ error = -EIO;
+ break;
+ }
+ }
+
+ /*
+ * i_size must be checked after we know the pages are Uptodate.
+ *
+ * Checking i_size after the check allows us to calculate
+ * the correct value for "nr", which means the zero-filled
+ * part of the page is not copied back to userspace (unless
+ * another truncate extends the file - this is desired though).
+ */
+ isize = i_size_read(inode);
+ if (unlikely(*ppos >= isize))
+ break;
+ part = min_t(loff_t, isize - *ppos, len);
+
+ if (folio) {
+ /*
+ * If users can be writing to this page using arbitrary
+ * virtual addresses, take care about potential aliasing
+ * before reading the page on the kernel side.
+ */
+ if (mapping_writably_mapped(mapping))
+ flush_dcache_folio(folio);
+ folio_mark_accessed(folio);
+ /*
+ * Ok, we have the page, and it's up-to-date, so we can
+ * now splice it into the pipe.
+ */
+ n = splice_folio_into_pipe(pipe, folio, *ppos, part);
+ folio_put(folio);
+ folio = NULL;
+ } else {
+ n = splice_zeropage_into_pipe(pipe, *ppos, len);
+ }
+
+ if (!n)
+ break;
+ len -= n;
+ total_spliced += n;
+ *ppos += n;
+ in->f_ra.prev_pos = *ppos;
+ if (pipe_full(pipe->head, pipe->tail, pipe->max_usage))
+ break;
+
+ cond_resched();
+ } while (len);
+
+ if (folio)
+ folio_put(folio);
+
+ file_accessed(in);
+ return total_spliced ? total_spliced : error;
+}
+
static loff_t shmem_file_llseek(struct file *file, loff_t offset, int whence)
{
struct address_space *mapping = file->f_mapping;
@@ -3938,7 +4070,7 @@ static const struct file_operations shmem_file_operations = {
.read_iter = shmem_file_read_iter,
.write_iter = generic_file_write_iter,
.fsync = noop_fsync,
- .splice_read = generic_file_splice_read,
+ .splice_read = shmem_file_splice_read,
.splice_write = iter_file_splice_write,
.fallocate = shmem_fallocate,
#endif
diff --git a/net/can/bcm.c b/net/can/bcm.c
index 27706f6..a962ec2 100644
--- a/net/can/bcm.c
+++ b/net/can/bcm.c
@@ -941,6 +941,8 @@ static int bcm_tx_setup(struct bcm_msg_head *msg_head, struct msghdr *msg,
cf = op->frames + op->cfsiz * i;
err = memcpy_from_msg((u8 *)cf, msg, op->cfsiz);
+ if (err < 0)
+ goto free_op;
if (op->flags & CAN_FD_FRAME) {
if (cf->len > 64)
@@ -950,12 +952,8 @@ static int bcm_tx_setup(struct bcm_msg_head *msg_head, struct msghdr *msg,
err = -EINVAL;
}
- if (err < 0) {
- if (op->frames != &op->sframe)
- kfree(op->frames);
- kfree(op);
- return err;
- }
+ if (err < 0)
+ goto free_op;
if (msg_head->flags & TX_CP_CAN_ID) {
/* copy can_id into frame */
@@ -1026,6 +1024,12 @@ static int bcm_tx_setup(struct bcm_msg_head *msg_head, struct msghdr *msg,
bcm_tx_start_timer(op);
return msg_head->nframes * op->cfsiz + MHSIZ;
+
+free_op:
+ if (op->frames != &op->sframe)
+ kfree(op->frames);
+ kfree(op);
+ return err;
}
/*
diff --git a/net/can/j1939/transport.c b/net/can/j1939/transport.c
index fce9b9e..fb92c36 100644
--- a/net/can/j1939/transport.c
+++ b/net/can/j1939/transport.c
@@ -1124,8 +1124,6 @@ static void __j1939_session_cancel(struct j1939_session *session,
if (session->sk)
j1939_sk_send_loop_abort(session->sk, session->err);
- else
- j1939_sk_errqueue(session, J1939_ERRQUEUE_RX_ABORT);
}
static void j1939_session_cancel(struct j1939_session *session,
@@ -1140,6 +1138,9 @@ static void j1939_session_cancel(struct j1939_session *session,
}
j1939_session_list_unlock(session->priv);
+
+ if (!session->sk)
+ j1939_sk_errqueue(session, J1939_ERRQUEUE_RX_ABORT);
}
static enum hrtimer_restart j1939_tp_txtimer(struct hrtimer *hrtimer)
@@ -1253,6 +1254,9 @@ static enum hrtimer_restart j1939_tp_rxtimer(struct hrtimer *hrtimer)
__j1939_session_cancel(session, J1939_XTP_ABORT_TIMEOUT);
}
j1939_session_list_unlock(session->priv);
+
+ if (!session->sk)
+ j1939_sk_errqueue(session, J1939_ERRQUEUE_RX_ABORT);
}
j1939_session_put(session);
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index cac1718..165bb2c 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -57,6 +57,12 @@ struct dsa_standalone_event_work {
u16 vid;
};
+struct dsa_host_vlan_rx_filtering_ctx {
+ struct net_device *dev;
+ const unsigned char *addr;
+ enum dsa_standalone_event event;
+};
+
static bool dsa_switch_supports_uc_filtering(struct dsa_switch *ds)
{
return ds->ops->port_fdb_add && ds->ops->port_fdb_del &&
@@ -155,18 +161,37 @@ static int dsa_slave_schedule_standalone_work(struct net_device *dev,
return 0;
}
+static int dsa_slave_host_vlan_rx_filtering(struct net_device *vdev, int vid,
+ void *arg)
+{
+ struct dsa_host_vlan_rx_filtering_ctx *ctx = arg;
+
+ return dsa_slave_schedule_standalone_work(ctx->dev, ctx->event,
+ ctx->addr, vid);
+}
+
static int dsa_slave_sync_uc(struct net_device *dev,
const unsigned char *addr)
{
struct net_device *master = dsa_slave_to_master(dev);
struct dsa_port *dp = dsa_slave_to_port(dev);
+ struct dsa_host_vlan_rx_filtering_ctx ctx = {
+ .dev = dev,
+ .addr = addr,
+ .event = DSA_UC_ADD,
+ };
+ int err;
dev_uc_add(master, addr);
if (!dsa_switch_supports_uc_filtering(dp->ds))
return 0;
- return dsa_slave_schedule_standalone_work(dev, DSA_UC_ADD, addr, 0);
+ err = dsa_slave_schedule_standalone_work(dev, DSA_UC_ADD, addr, 0);
+ if (err)
+ return err;
+
+ return vlan_for_each(dev, dsa_slave_host_vlan_rx_filtering, &ctx);
}
static int dsa_slave_unsync_uc(struct net_device *dev,
@@ -174,13 +199,23 @@ static int dsa_slave_unsync_uc(struct net_device *dev,
{
struct net_device *master = dsa_slave_to_master(dev);
struct dsa_port *dp = dsa_slave_to_port(dev);
+ struct dsa_host_vlan_rx_filtering_ctx ctx = {
+ .dev = dev,
+ .addr = addr,
+ .event = DSA_UC_DEL,
+ };
+ int err;
dev_uc_del(master, addr);
if (!dsa_switch_supports_uc_filtering(dp->ds))
return 0;
- return dsa_slave_schedule_standalone_work(dev, DSA_UC_DEL, addr, 0);
+ err = dsa_slave_schedule_standalone_work(dev, DSA_UC_DEL, addr, 0);
+ if (err)
+ return err;
+
+ return vlan_for_each(dev, dsa_slave_host_vlan_rx_filtering, &ctx);
}
static int dsa_slave_sync_mc(struct net_device *dev,
@@ -188,13 +223,23 @@ static int dsa_slave_sync_mc(struct net_device *dev,
{
struct net_device *master = dsa_slave_to_master(dev);
struct dsa_port *dp = dsa_slave_to_port(dev);
+ struct dsa_host_vlan_rx_filtering_ctx ctx = {
+ .dev = dev,
+ .addr = addr,
+ .event = DSA_MC_ADD,
+ };
+ int err;
dev_mc_add(master, addr);
if (!dsa_switch_supports_mc_filtering(dp->ds))
return 0;
- return dsa_slave_schedule_standalone_work(dev, DSA_MC_ADD, addr, 0);
+ err = dsa_slave_schedule_standalone_work(dev, DSA_MC_ADD, addr, 0);
+ if (err)
+ return err;
+
+ return vlan_for_each(dev, dsa_slave_host_vlan_rx_filtering, &ctx);
}
static int dsa_slave_unsync_mc(struct net_device *dev,
@@ -202,13 +247,23 @@ static int dsa_slave_unsync_mc(struct net_device *dev,
{
struct net_device *master = dsa_slave_to_master(dev);
struct dsa_port *dp = dsa_slave_to_port(dev);
+ struct dsa_host_vlan_rx_filtering_ctx ctx = {
+ .dev = dev,
+ .addr = addr,
+ .event = DSA_MC_DEL,
+ };
+ int err;
dev_mc_del(master, addr);
if (!dsa_switch_supports_mc_filtering(dp->ds))
return 0;
- return dsa_slave_schedule_standalone_work(dev, DSA_MC_DEL, addr, 0);
+ err = dsa_slave_schedule_standalone_work(dev, DSA_MC_DEL, addr, 0);
+ if (err)
+ return err;
+
+ return vlan_for_each(dev, dsa_slave_host_vlan_rx_filtering, &ctx);
}
void dsa_slave_sync_ha(struct net_device *dev)
@@ -1702,6 +1757,8 @@ static int dsa_slave_vlan_rx_add_vid(struct net_device *dev, __be16 proto,
.flags = 0,
};
struct netlink_ext_ack extack = {0};
+ struct dsa_switch *ds = dp->ds;
+ struct netdev_hw_addr *ha;
int ret;
/* User port... */
@@ -1721,6 +1778,30 @@ static int dsa_slave_vlan_rx_add_vid(struct net_device *dev, __be16 proto,
return ret;
}
+ if (!dsa_switch_supports_uc_filtering(ds) &&
+ !dsa_switch_supports_mc_filtering(ds))
+ return 0;
+
+ netif_addr_lock_bh(dev);
+
+ if (dsa_switch_supports_mc_filtering(ds)) {
+ netdev_for_each_synced_mc_addr(ha, dev) {
+ dsa_slave_schedule_standalone_work(dev, DSA_MC_ADD,
+ ha->addr, vid);
+ }
+ }
+
+ if (dsa_switch_supports_uc_filtering(ds)) {
+ netdev_for_each_synced_uc_addr(ha, dev) {
+ dsa_slave_schedule_standalone_work(dev, DSA_UC_ADD,
+ ha->addr, vid);
+ }
+ }
+
+ netif_addr_unlock_bh(dev);
+
+ dsa_flush_workqueue();
+
return 0;
}
@@ -1733,13 +1814,43 @@ static int dsa_slave_vlan_rx_kill_vid(struct net_device *dev, __be16 proto,
/* This API only allows programming tagged, non-PVID VIDs */
.flags = 0,
};
+ struct dsa_switch *ds = dp->ds;
+ struct netdev_hw_addr *ha;
int err;
err = dsa_port_vlan_del(dp, &vlan);
if (err)
return err;
- return dsa_port_host_vlan_del(dp, &vlan);
+ err = dsa_port_host_vlan_del(dp, &vlan);
+ if (err)
+ return err;
+
+ if (!dsa_switch_supports_uc_filtering(ds) &&
+ !dsa_switch_supports_mc_filtering(ds))
+ return 0;
+
+ netif_addr_lock_bh(dev);
+
+ if (dsa_switch_supports_mc_filtering(ds)) {
+ netdev_for_each_synced_mc_addr(ha, dev) {
+ dsa_slave_schedule_standalone_work(dev, DSA_MC_DEL,
+ ha->addr, vid);
+ }
+ }
+
+ if (dsa_switch_supports_uc_filtering(ds)) {
+ netdev_for_each_synced_uc_addr(ha, dev) {
+ dsa_slave_schedule_standalone_work(dev, DSA_UC_DEL,
+ ha->addr, vid);
+ }
+ }
+
+ netif_addr_unlock_bh(dev);
+
+ dsa_flush_workqueue();
+
+ return 0;
}
static int dsa_slave_restore_vlan(struct net_device *vdev, int vid, void *arg)
diff --git a/net/ieee802154/nl802154.c b/net/ieee802154/nl802154.c
index d8f4379..832e3c50 100644
--- a/net/ieee802154/nl802154.c
+++ b/net/ieee802154/nl802154.c
@@ -2488,8 +2488,7 @@ static int nl802154_del_llsec_seclevel(struct sk_buff *skb,
if (wpan_dev->iftype == NL802154_IFTYPE_MONITOR)
return -EOPNOTSUPP;
- if (!info->attrs[NL802154_ATTR_SEC_LEVEL] ||
- llsec_parse_seclevel(info->attrs[NL802154_ATTR_SEC_LEVEL],
+ if (llsec_parse_seclevel(info->attrs[NL802154_ATTR_SEC_LEVEL],
&sl) < 0)
return -EINVAL;
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index adcbedc..6cacd70 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -2158,6 +2158,7 @@ static void xs_tcp_shutdown(struct rpc_xprt *xprt)
switch (skst) {
case TCP_FIN_WAIT1:
case TCP_FIN_WAIT2:
+ case TCP_LAST_ACK:
break;
case TCP_ESTABLISHED:
case TCP_CLOSE_WAIT:
diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
index 6564192..37934df 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -363,6 +363,13 @@ virtio_transport_stream_do_dequeue(struct vsock_sock *vsk,
u32 free_space;
spin_lock_bh(&vvs->rx_lock);
+
+ if (WARN_ONCE(skb_queue_empty(&vvs->rx_queue) && vvs->rx_bytes,
+ "rx_queue is empty, but rx_bytes is non-zero\n")) {
+ spin_unlock_bh(&vvs->rx_lock);
+ return err;
+ }
+
while (total < len && !skb_queue_empty(&vvs->rx_queue)) {
skb = skb_peek(&vvs->rx_queue);
@@ -1068,7 +1075,7 @@ virtio_transport_recv_enqueue(struct vsock_sock *vsk,
memcpy(skb_put(last_skb, skb->len), skb->data, skb->len);
free_pkt = true;
last_hdr->flags |= hdr->flags;
- last_hdr->len = cpu_to_le32(last_skb->len);
+ le32_add_cpu(&last_hdr->len, len);
goto out;
}
}
diff --git a/net/vmw_vsock/vsock_loopback.c b/net/vmw_vsock/vsock_loopback.c
index 671e032..89905c0 100644
--- a/net/vmw_vsock/vsock_loopback.c
+++ b/net/vmw_vsock/vsock_loopback.c
@@ -15,7 +15,6 @@
struct vsock_loopback {
struct workqueue_struct *workqueue;
- spinlock_t pkt_list_lock; /* protects pkt_list */
struct sk_buff_head pkt_queue;
struct work_struct pkt_work;
};
@@ -32,9 +31,7 @@ static int vsock_loopback_send_pkt(struct sk_buff *skb)
struct vsock_loopback *vsock = &the_vsock_loopback;
int len = skb->len;
- spin_lock_bh(&vsock->pkt_list_lock);
skb_queue_tail(&vsock->pkt_queue, skb);
- spin_unlock_bh(&vsock->pkt_list_lock);
queue_work(vsock->workqueue, &vsock->pkt_work);
@@ -113,9 +110,9 @@ static void vsock_loopback_work(struct work_struct *work)
skb_queue_head_init(&pkts);
- spin_lock_bh(&vsock->pkt_list_lock);
+ spin_lock_bh(&vsock->pkt_queue.lock);
skb_queue_splice_init(&vsock->pkt_queue, &pkts);
- spin_unlock_bh(&vsock->pkt_list_lock);
+ spin_unlock_bh(&vsock->pkt_queue.lock);
while ((skb = __skb_dequeue(&pkts))) {
virtio_transport_deliver_tap_pkt(skb);
@@ -132,7 +129,6 @@ static int __init vsock_loopback_init(void)
if (!vsock->workqueue)
return -ENOMEM;
- spin_lock_init(&vsock->pkt_list_lock);
skb_queue_head_init(&vsock->pkt_queue);
INIT_WORK(&vsock->pkt_work, vsock_loopback_work);
@@ -156,9 +152,7 @@ static void __exit vsock_loopback_exit(void)
flush_work(&vsock->pkt_work);
- spin_lock_bh(&vsock->pkt_list_lock);
virtio_vsock_skb_queue_purge(&vsock->pkt_queue);
- spin_unlock_bh(&vsock->pkt_list_lock);
destroy_workqueue(vsock->workqueue);
}
diff --git a/scripts/kconfig/merge_config.sh b/scripts/kconfig/merge_config.sh
index 32620de..902eb42 100755
--- a/scripts/kconfig/merge_config.sh
+++ b/scripts/kconfig/merge_config.sh
@@ -145,7 +145,7 @@
NEW_VAL=$(grep -w $CFG $MERGE_FILE)
BUILTIN_FLAG=false
if [ "$BUILTIN" = "true" ] && [ "${NEW_VAL#CONFIG_*=}" = "m" ] && [ "${PREV_VAL#CONFIG_*=}" = "y" ]; then
- ${WARNOVVERIDE} Previous value: $PREV_VAL
+ ${WARNOVERRIDE} Previous value: $PREV_VAL
${WARNOVERRIDE} New value: $NEW_VAL
${WARNOVERRIDE} -y passed, will not demote y to m
${WARNOVERRIDE}
diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
index efff807..9466b6a 100644
--- a/scripts/mod/modpost.c
+++ b/scripts/mod/modpost.c
@@ -1733,7 +1733,7 @@ static void extract_crcs_for_object(const char *object, struct module *mod)
if (!isdigit(*p))
continue; /* skip this line */
- crc = strtol(p, &p, 0);
+ crc = strtoul(p, &p, 0);
if (*p != '\n')
continue; /* skip this line */
diff --git a/scripts/package/builddeb b/scripts/package/builddeb
index c5ae571..7b23f52c 100755
--- a/scripts/package/builddeb
+++ b/scripts/package/builddeb
@@ -162,6 +162,7 @@
install_kernel_headers () {
pdir=$1
+ version=$2
rm -rf $pdir
@@ -229,7 +230,7 @@
linux-libc-dev)
install_libc_headers debian/linux-libc-dev;;
linux-headers-*)
- install_kernel_headers debian/linux-headers;;
+ install_kernel_headers debian/linux-headers ${package#linux-headers-};;
esac
done
diff --git a/sound/core/pcm_lib.c b/sound/core/pcm_lib.c
index 8b6aeb8..02fd659 100644
--- a/sound/core/pcm_lib.c
+++ b/sound/core/pcm_lib.c
@@ -2155,6 +2155,8 @@ int pcm_lib_apply_appl_ptr(struct snd_pcm_substream *substream,
ret = substream->ops->ack(substream);
if (ret < 0) {
runtime->control->appl_ptr = old_appl_ptr;
+ if (ret == -EPIPE)
+ __snd_pcm_xrun(substream);
return ret;
}
}
diff --git a/sound/pci/hda/patch_conexant.c b/sound/pci/hda/patch_conexant.c
index 75e1d00..a889ccc 100644
--- a/sound/pci/hda/patch_conexant.c
+++ b/sound/pci/hda/patch_conexant.c
@@ -980,7 +980,10 @@ static const struct snd_pci_quirk cxt5066_fixups[] = {
SND_PCI_QUIRK(0x17aa, 0x3905, "Lenovo G50-30", CXT_FIXUP_STEREO_DMIC),
SND_PCI_QUIRK(0x17aa, 0x390b, "Lenovo G50-80", CXT_FIXUP_STEREO_DMIC),
SND_PCI_QUIRK(0x17aa, 0x3975, "Lenovo U300s", CXT_FIXUP_STEREO_DMIC),
- SND_PCI_QUIRK(0x17aa, 0x3977, "Lenovo IdeaPad U310", CXT_PINCFG_LENOVO_NOTEBOOK),
+ /* NOTE: we'd need to extend the quirk for 17aa:3977 as the same
+ * PCI SSID is used on multiple Lenovo models
+ */
+ SND_PCI_QUIRK(0x17aa, 0x3977, "Lenovo IdeaPad U310", CXT_FIXUP_STEREO_DMIC),
SND_PCI_QUIRK(0x17aa, 0x3978, "Lenovo G50-70", CXT_FIXUP_STEREO_DMIC),
SND_PCI_QUIRK(0x17aa, 0x397b, "Lenovo S205", CXT_FIXUP_STEREO_DMIC),
SND_PCI_QUIRK_VENDOR(0x17aa, "Thinkpad", CXT_FIXUP_THINKPAD_ACPI),
@@ -1003,6 +1006,7 @@ static const struct hda_model_fixup cxt5066_fixup_models[] = {
{ .id = CXT_FIXUP_MUTE_LED_GPIO, .name = "mute-led-gpio" },
{ .id = CXT_FIXUP_HP_ZBOOK_MUTE_LED, .name = "hp-zbook-mute-led" },
{ .id = CXT_FIXUP_HP_MIC_NO_PRESENCE, .name = "hp-mic-fix" },
+ { .id = CXT_PINCFG_LENOVO_NOTEBOOK, .name = "lenovo-20149" },
{}
};
diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c
index f09a1d7..a2706fd 100644
--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -2631,6 +2631,7 @@ static const struct snd_pci_quirk alc882_fixup_tbl[] = {
SND_PCI_QUIRK(0x1558, 0x65e5, "Clevo PC50D[PRS](?:-D|-G)?", ALC1220_FIXUP_CLEVO_PB51ED_PINS),
SND_PCI_QUIRK(0x1558, 0x65f1, "Clevo PC50HS", ALC1220_FIXUP_CLEVO_PB51ED_PINS),
SND_PCI_QUIRK(0x1558, 0x65f5, "Clevo PD50PN[NRT]", ALC1220_FIXUP_CLEVO_PB51ED_PINS),
+ SND_PCI_QUIRK(0x1558, 0x66a2, "Clevo PE60RNE", ALC1220_FIXUP_CLEVO_PB51ED_PINS),
SND_PCI_QUIRK(0x1558, 0x67d1, "Clevo PB71[ER][CDF]", ALC1220_FIXUP_CLEVO_PB51ED_PINS),
SND_PCI_QUIRK(0x1558, 0x67e1, "Clevo PB71[DE][CDF]", ALC1220_FIXUP_CLEVO_PB51ED_PINS),
SND_PCI_QUIRK(0x1558, 0x67e5, "Clevo PC70D[PRS](?:-D|-G)?", ALC1220_FIXUP_CLEVO_PB51ED_PINS),
@@ -2651,6 +2652,7 @@ static const struct snd_pci_quirk alc882_fixup_tbl[] = {
SND_PCI_QUIRK(0x1558, 0x96e1, "Clevo P960[ER][CDFN]-K", ALC1220_FIXUP_CLEVO_P950),
SND_PCI_QUIRK(0x1558, 0x97e1, "Clevo P970[ER][CDFN]", ALC1220_FIXUP_CLEVO_P950),
SND_PCI_QUIRK(0x1558, 0x97e2, "Clevo P970RC-M", ALC1220_FIXUP_CLEVO_P950),
+ SND_PCI_QUIRK(0x1558, 0xd502, "Clevo PD50SNE", ALC1220_FIXUP_CLEVO_PB51ED_PINS),
SND_PCI_QUIRK_VENDOR(0x1558, "Clevo laptop", ALC882_FIXUP_EAPD),
SND_PCI_QUIRK(0x161f, 0x2054, "Medion laptop", ALC883_FIXUP_EAPD),
SND_PCI_QUIRK(0x17aa, 0x3a0d, "Lenovo Y530", ALC882_FIXUP_LENOVO_Y530),
@@ -9260,7 +9262,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
SND_PCI_QUIRK(0x1028, 0x0a62, "Dell Precision 5560", ALC289_FIXUP_DUAL_SPK),
SND_PCI_QUIRK(0x1028, 0x0a9d, "Dell Latitude 5430", ALC269_FIXUP_DELL4_MIC_NO_PRESENCE),
SND_PCI_QUIRK(0x1028, 0x0a9e, "Dell Latitude 5430", ALC269_FIXUP_DELL4_MIC_NO_PRESENCE),
- SND_PCI_QUIRK(0x1028, 0x0ac9, "Dell Precision 3260", ALC295_FIXUP_CHROME_BOOK),
+ SND_PCI_QUIRK(0x1028, 0x0ac9, "Dell Precision 3260", ALC283_FIXUP_CHROME_BOOK),
SND_PCI_QUIRK(0x1028, 0x0b19, "Dell XPS 15 9520", ALC289_FIXUP_DUAL_SPK),
SND_PCI_QUIRK(0x1028, 0x0b1a, "Dell Precision 5570", ALC289_FIXUP_DUAL_SPK),
SND_PCI_QUIRK(0x1028, 0x0b37, "Dell Inspiron 16 Plus 7620 2-in-1", ALC295_FIXUP_DELL_INSPIRON_TOP_SPEAKERS),
@@ -9575,6 +9577,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
SND_PCI_QUIRK(0x1558, 0x5101, "Clevo S510WU", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
SND_PCI_QUIRK(0x1558, 0x5157, "Clevo W517GU1", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
SND_PCI_QUIRK(0x1558, 0x51a1, "Clevo NS50MU", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x1558, 0x5630, "Clevo NP50RNJS", ALC256_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
SND_PCI_QUIRK(0x1558, 0x70a1, "Clevo NB70T[HJK]", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
SND_PCI_QUIRK(0x1558, 0x70b3, "Clevo NK70SB", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
SND_PCI_QUIRK(0x1558, 0x70f2, "Clevo NH79EPY", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
@@ -9609,6 +9612,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
SND_PCI_QUIRK(0x1558, 0x971d, "Clevo N970T[CDF]", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
SND_PCI_QUIRK(0x1558, 0xa500, "Clevo NL5[03]RU", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
SND_PCI_QUIRK(0x1558, 0xa600, "Clevo NL50NU", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
+ SND_PCI_QUIRK(0x1558, 0xa671, "Clevo NP70SN[CDE]", ALC256_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
SND_PCI_QUIRK(0x1558, 0xb018, "Clevo NP50D[BE]", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
SND_PCI_QUIRK(0x1558, 0xb019, "Clevo NH77D[BE]Q", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
SND_PCI_QUIRK(0x1558, 0xb022, "Clevo NH77D[DC][QW]", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
@@ -9709,6 +9713,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
SND_PCI_QUIRK(0x17aa, 0x511e, "Thinkpad", ALC298_FIXUP_TPT470_DOCK),
SND_PCI_QUIRK(0x17aa, 0x511f, "Thinkpad", ALC298_FIXUP_TPT470_DOCK),
SND_PCI_QUIRK(0x17aa, 0x9e54, "LENOVO NB", ALC269_FIXUP_LENOVO_EAPD),
+ SND_PCI_QUIRK(0x17aa, 0x9e56, "Lenovo ZhaoYang CF4620Z", ALC286_FIXUP_SONY_MIC_NO_PRESENCE),
SND_PCI_QUIRK(0x1849, 0x1233, "ASRock NUC Box 1100", ALC233_FIXUP_NO_AUDIO_JACK),
SND_PCI_QUIRK(0x1849, 0xa233, "Positivo Master C6300", ALC269_FIXUP_HEADSET_MIC),
SND_PCI_QUIRK(0x19e5, 0x3204, "Huawei MACH-WX9", ALC256_FIXUP_HUAWEI_MACH_WX9_PINS),
diff --git a/sound/pci/ymfpci/ymfpci.c b/sound/pci/ymfpci/ymfpci.c
index 1e198e4..82d4e0f 100644
--- a/sound/pci/ymfpci/ymfpci.c
+++ b/sound/pci/ymfpci/ymfpci.c
@@ -170,7 +170,7 @@ static int snd_card_ymfpci_probe(struct pci_dev *pci,
return -ENOENT;
}
- err = snd_card_new(&pci->dev, index[dev], id[dev], THIS_MODULE,
+ err = snd_devm_card_new(&pci->dev, index[dev], id[dev], THIS_MODULE,
sizeof(*chip), &card);
if (err < 0)
return err;
diff --git a/sound/pci/ymfpci/ymfpci_main.c b/sound/pci/ymfpci/ymfpci_main.c
index c80114c..b492c32 100644
--- a/sound/pci/ymfpci/ymfpci_main.c
+++ b/sound/pci/ymfpci/ymfpci_main.c
@@ -2165,7 +2165,7 @@ static int snd_ymfpci_memalloc(struct snd_ymfpci *chip)
chip->work_base = ptr;
chip->work_base_addr = ptr_addr;
- snd_BUG_ON(ptr + chip->work_size !=
+ snd_BUG_ON(ptr + PAGE_ALIGN(chip->work_size) !=
chip->work_ptr->area + chip->work_ptr->bytes);
snd_ymfpci_writel(chip, YDSXGR_PLAYCTRLBASE, chip->bank_base_playback_addr);
diff --git a/sound/usb/endpoint.c b/sound/usb/endpoint.c
index 419302e..647fa05 100644
--- a/sound/usb/endpoint.c
+++ b/sound/usb/endpoint.c
@@ -455,8 +455,8 @@ static void push_back_to_ready_list(struct snd_usb_endpoint *ep,
* This function is used both for implicit feedback endpoints and in low-
* latency playback mode.
*/
-void snd_usb_queue_pending_output_urbs(struct snd_usb_endpoint *ep,
- bool in_stream_lock)
+int snd_usb_queue_pending_output_urbs(struct snd_usb_endpoint *ep,
+ bool in_stream_lock)
{
bool implicit_fb = snd_usb_endpoint_implicit_feedback_sink(ep);
@@ -480,7 +480,7 @@ void snd_usb_queue_pending_output_urbs(struct snd_usb_endpoint *ep,
spin_unlock_irqrestore(&ep->lock, flags);
if (ctx == NULL)
- return;
+ break;
/* copy over the length information */
if (implicit_fb) {
@@ -495,11 +495,14 @@ void snd_usb_queue_pending_output_urbs(struct snd_usb_endpoint *ep,
break;
if (err < 0) {
/* push back to ready list again for -EAGAIN */
- if (err == -EAGAIN)
+ if (err == -EAGAIN) {
push_back_to_ready_list(ep, ctx);
- else
+ break;
+ }
+
+ if (!in_stream_lock)
notify_xrun(ep);
- return;
+ return -EPIPE;
}
err = usb_submit_urb(ctx->urb, GFP_ATOMIC);
@@ -507,13 +510,16 @@ void snd_usb_queue_pending_output_urbs(struct snd_usb_endpoint *ep,
usb_audio_err(ep->chip,
"Unable to submit urb #%d: %d at %s\n",
ctx->index, err, __func__);
- notify_xrun(ep);
- return;
+ if (!in_stream_lock)
+ notify_xrun(ep);
+ return -EPIPE;
}
set_bit(ctx->index, &ep->active_mask);
atomic_inc(&ep->submitted_urbs);
}
+
+ return 0;
}
/*
diff --git a/sound/usb/endpoint.h b/sound/usb/endpoint.h
index 924f435..c09f68c 100644
--- a/sound/usb/endpoint.h
+++ b/sound/usb/endpoint.h
@@ -52,7 +52,7 @@ int snd_usb_endpoint_implicit_feedback_sink(struct snd_usb_endpoint *ep);
int snd_usb_endpoint_next_packet_size(struct snd_usb_endpoint *ep,
struct snd_urb_ctx *ctx, int idx,
unsigned int avail);
-void snd_usb_queue_pending_output_urbs(struct snd_usb_endpoint *ep,
- bool in_stream_lock);
+int snd_usb_queue_pending_output_urbs(struct snd_usb_endpoint *ep,
+ bool in_stream_lock);
#endif /* __USBAUDIO_ENDPOINT_H */
diff --git a/sound/usb/format.c b/sound/usb/format.c
index 405dc0b..4b1c5ba 100644
--- a/sound/usb/format.c
+++ b/sound/usb/format.c
@@ -39,8 +39,12 @@ static u64 parse_audio_format_i_type(struct snd_usb_audio *chip,
case UAC_VERSION_1:
default: {
struct uac_format_type_i_discrete_descriptor *fmt = _fmt;
- if (format >= 64)
- return 0; /* invalid format */
+ if (format >= 64) {
+ usb_audio_info(chip,
+ "%u:%d: invalid format type 0x%llx is detected, processed as PCM\n",
+ fp->iface, fp->altsetting, format);
+ format = UAC_FORMAT_TYPE_I_PCM;
+ }
sample_width = fmt->bBitResolution;
sample_bytes = fmt->bSubframeSize;
format = 1ULL << format;
diff --git a/sound/usb/pcm.c b/sound/usb/pcm.c
index d959da7..eec5232 100644
--- a/sound/usb/pcm.c
+++ b/sound/usb/pcm.c
@@ -1639,7 +1639,7 @@ static int snd_usb_pcm_playback_ack(struct snd_pcm_substream *substream)
* outputs here
*/
if (!ep->active_mask)
- snd_usb_queue_pending_output_urbs(ep, true);
+ return snd_usb_queue_pending_output_urbs(ep, true);
return 0;
}
diff --git a/tools/testing/selftests/sigaltstack/current_stack_pointer.h b/tools/testing/selftests/sigaltstack/current_stack_pointer.h
new file mode 100644
index 0000000..ea9bdf3
--- /dev/null
+++ b/tools/testing/selftests/sigaltstack/current_stack_pointer.h
@@ -0,0 +1,23 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#if __alpha__
+register unsigned long sp asm("$30");
+#elif __arm__ || __aarch64__ || __csky__ || __m68k__ || __mips__ || __riscv
+register unsigned long sp asm("sp");
+#elif __i386__
+register unsigned long sp asm("esp");
+#elif __loongarch64
+register unsigned long sp asm("$sp");
+#elif __ppc__
+register unsigned long sp asm("r1");
+#elif __s390x__
+register unsigned long sp asm("%15");
+#elif __sh__
+register unsigned long sp asm("r15");
+#elif __x86_64__
+register unsigned long sp asm("rsp");
+#elif __XTENSA__
+register unsigned long sp asm("a1");
+#else
+#error "implement current_stack_pointer equivalent"
+#endif
diff --git a/tools/testing/selftests/sigaltstack/sas.c b/tools/testing/selftests/sigaltstack/sas.c
index c53b070..98d37cb 100644
--- a/tools/testing/selftests/sigaltstack/sas.c
+++ b/tools/testing/selftests/sigaltstack/sas.c
@@ -20,6 +20,7 @@
#include <sys/auxv.h>
#include "../kselftest.h"
+#include "current_stack_pointer.h"
#ifndef SS_AUTODISARM
#define SS_AUTODISARM (1U << 31)
@@ -46,12 +47,6 @@ void my_usr1(int sig, siginfo_t *si, void *u)
stack_t stk;
struct stk_data *p;
-#if __s390x__
- register unsigned long sp asm("%15");
-#else
- register unsigned long sp asm("sp");
-#endif
-
if (sp < (unsigned long)sstack ||
sp >= (unsigned long)sstack + stack_size) {
ksft_exit_fail_msg("SP is not on sigaltstack\n");
diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/vsock_test.c
index 3de10db..12b97c9 100644
--- a/tools/testing/vsock/vsock_test.c
+++ b/tools/testing/vsock/vsock_test.c
@@ -968,6 +968,91 @@ static void test_seqpacket_inv_buf_server(const struct test_opts *opts)
test_inv_buf_server(opts, false);
}
+#define HELLO_STR "HELLO"
+#define WORLD_STR "WORLD"
+
+static void test_stream_virtio_skb_merge_client(const struct test_opts *opts)
+{
+ ssize_t res;
+ int fd;
+
+ fd = vsock_stream_connect(opts->peer_cid, 1234);
+ if (fd < 0) {
+ perror("connect");
+ exit(EXIT_FAILURE);
+ }
+
+ /* Send first skbuff. */
+ res = send(fd, HELLO_STR, strlen(HELLO_STR), 0);
+ if (res != strlen(HELLO_STR)) {
+ fprintf(stderr, "unexpected send(2) result %zi\n", res);
+ exit(EXIT_FAILURE);
+ }
+
+ control_writeln("SEND0");
+ /* Peer reads part of first skbuff. */
+ control_expectln("REPLY0");
+
+ /* Send second skbuff, it will be appended to the first. */
+ res = send(fd, WORLD_STR, strlen(WORLD_STR), 0);
+ if (res != strlen(WORLD_STR)) {
+ fprintf(stderr, "unexpected send(2) result %zi\n", res);
+ exit(EXIT_FAILURE);
+ }
+
+ control_writeln("SEND1");
+ /* Peer reads merged skbuff packet. */
+ control_expectln("REPLY1");
+
+ close(fd);
+}
+
+static void test_stream_virtio_skb_merge_server(const struct test_opts *opts)
+{
+ unsigned char buf[64];
+ ssize_t res;
+ int fd;
+
+ fd = vsock_stream_accept(VMADDR_CID_ANY, 1234, NULL);
+ if (fd < 0) {
+ perror("accept");
+ exit(EXIT_FAILURE);
+ }
+
+ control_expectln("SEND0");
+
+ /* Read skbuff partially. */
+ res = recv(fd, buf, 2, 0);
+ if (res != 2) {
+ fprintf(stderr, "expected recv(2) returns 2 bytes, got %zi\n", res);
+ exit(EXIT_FAILURE);
+ }
+
+ control_writeln("REPLY0");
+ control_expectln("SEND1");
+
+ res = recv(fd, buf + 2, sizeof(buf) - 2, 0);
+ if (res != 8) {
+ fprintf(stderr, "expected recv(2) returns 8 bytes, got %zi\n", res);
+ exit(EXIT_FAILURE);
+ }
+
+ res = recv(fd, buf, sizeof(buf) - 8 - 2, MSG_DONTWAIT);
+ if (res != -1) {
+ fprintf(stderr, "expected recv(2) failure, got %zi\n", res);
+ exit(EXIT_FAILURE);
+ }
+
+ if (memcmp(buf, HELLO_STR WORLD_STR, strlen(HELLO_STR WORLD_STR))) {
+ fprintf(stderr, "pattern mismatch\n");
+ exit(EXIT_FAILURE);
+ }
+
+ control_writeln("REPLY1");
+
+ close(fd);
+}
+
static struct test_case test_cases[] = {
{
.name = "SOCK_STREAM connection reset",
@@ -1038,6 +1123,11 @@ static struct test_case test_cases[] = {
.run_client = test_seqpacket_inv_buf_client,
.run_server = test_seqpacket_inv_buf_server,
},
+ {
+ .name = "SOCK_STREAM virtio skb merge",
+ .run_client = test_stream_virtio_skb_merge_client,
+ .run_server = test_stream_virtio_skb_merge_server,
+ },
{},
};