Merge tag 'media/v5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media
Pull more media updates from Mauro Carvalho Chehab:
- a set of atomisp patches. They remove several abstraction layers, and
fixes clang and gcc warnings (that were hidden via some macros that
were disabling 4 or 5 types of warnings there). There are also some
important fixes and sensor auto-detection on newer BIOSes via ACPI
_DCM tables.
- some fixes
* tag 'media/v5.8-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media: (95 commits)
media: rkvdec: Fix H264 scaling list order
media: v4l2-ctrls: Unset correct HEVC loop filter flag
media: videobuf2-dma-contig: fix bad kfree in vb2_dma_contig_clear_max_seg_size
media: v4l2-subdev.rst: correct information about v4l2 events
media: s5p-mfc: Properly handle dma_parms for the allocated devices
media: medium: cec: Make MEDIA_CEC_SUPPORT default to n if !MEDIA_SUPPORT
media: cedrus: Implement runtime PM
media: cedrus: Program output format during each run
media: atomisp: improve ACPI/DMI detection logs
media: Revert "media: atomisp: add Asus Transform T101HA ACPI vars"
media: Revert "media: atomisp: Add some ACPI detection info"
media: atomisp: improve sensor detection code to use _DSM table
media: atomisp: get rid of an iomem abstraction layer
media: atomisp: get rid of a string_support.h abstraction layer
media: atomisp: use strscpy() instead of less secure variants
media: atomisp: set DFS to MAX if sensor doesn't report fps
media: atomisp: use different dfs failed messages
media: atomisp: change the detection of ISP2401 at runtime
media: atomisp: use macros from intel-family.h
media: atomisp: don't set hpll_freq twice with different values
...
diff --git a/.clang-format b/.clang-format
index 6ec5558..a0a9608 100644
--- a/.clang-format
+++ b/.clang-format
@@ -80,6 +80,7 @@
- 'ax25_uid_for_each'
- '__bio_for_each_bvec'
- 'bio_for_each_bvec'
+ - 'bio_for_each_bvec_all'
- 'bio_for_each_integrity_vec'
- '__bio_for_each_segment'
- 'bio_for_each_segment'
@@ -142,10 +143,13 @@
- 'for_each_card_auxs'
- 'for_each_card_auxs_safe'
- 'for_each_card_components'
+ - 'for_each_card_dapms'
- 'for_each_card_pre_auxs'
- 'for_each_card_prelinks'
- 'for_each_card_rtds'
- 'for_each_card_rtds_safe'
+ - 'for_each_card_widgets'
+ - 'for_each_card_widgets_safe'
- 'for_each_cgroup_storage_type'
- 'for_each_child_of_node'
- 'for_each_clear_bit'
@@ -160,6 +164,7 @@
- 'for_each_cpu_and'
- 'for_each_cpu_not'
- 'for_each_cpu_wrap'
+ - 'for_each_dapm_widgets'
- 'for_each_dev_addr'
- 'for_each_dev_scope'
- 'for_each_displayid_db'
@@ -170,7 +175,6 @@
- 'for_each_dpcm_fe'
- 'for_each_drhd_unit'
- 'for_each_dss_dev'
- - 'for_each_efi_handle'
- 'for_each_efi_memory_desc'
- 'for_each_efi_memory_desc_in_map'
- 'for_each_element'
@@ -191,6 +195,7 @@
- 'for_each_ip_tunnel_rcu'
- 'for_each_irq_nr'
- 'for_each_link_codecs'
+ - 'for_each_link_cpus'
- 'for_each_link_platforms'
- 'for_each_lru'
- 'for_each_matching_node'
@@ -250,6 +255,7 @@
- 'for_each_pci_bridge'
- 'for_each_pci_dev'
- 'for_each_pci_msi_entry'
+ - 'for_each_pcm_streams'
- 'for_each_populated_zone'
- 'for_each_possible_cpu'
- 'for_each_present_cpu'
@@ -260,9 +266,12 @@
- 'for_each_property_of_node'
- 'for_each_registered_fb'
- 'for_each_reserved_mem_region'
- - 'for_each_rtd_codec_dai'
- - 'for_each_rtd_codec_dai_rollback'
+ - 'for_each_rtd_codec_dais'
+ - 'for_each_rtd_codec_dais_rollback'
- 'for_each_rtd_components'
+ - 'for_each_rtd_cpu_dais'
+ - 'for_each_rtd_cpu_dais_rollback'
+ - 'for_each_rtd_dais'
- 'for_each_set_bit'
- 'for_each_set_bit_from'
- 'for_each_set_clump8'
@@ -334,6 +343,7 @@
- 'klp_for_each_object'
- 'klp_for_each_object_safe'
- 'klp_for_each_object_static'
+ - 'kunit_suite_for_each_test_case'
- 'kvm_for_each_memslot'
- 'kvm_for_each_vcpu'
- 'list_for_each'
@@ -387,6 +397,7 @@
- 'of_property_for_each_string'
- 'of_property_for_each_u32'
- 'pci_bus_for_each_resource'
+ - 'pcm_for_each_format'
- 'ping_portaddr_for_each_entry'
- 'plist_for_each'
- 'plist_for_each_continue'
@@ -482,7 +493,7 @@
MacroBlockBegin: ''
MacroBlockEnd: ''
MaxEmptyLinesToKeep: 1
-NamespaceIndentation: Inner
+NamespaceIndentation: None
#ObjCBinPackProtocolList: Auto # Unknown to clang-format-5.0
ObjCBlockIndentWidth: 8
ObjCSpaceAfterProperty: true
diff --git a/.gitignore b/.gitignore
index 2258e90..87b9dd8 100644
--- a/.gitignore
+++ b/.gitignore
@@ -56,6 +56,7 @@
/linux
/vmlinux
/vmlinux.32
+/vmlinux.symvers
/vmlinux-gdb.py
/vmlinuz
/System.map
diff --git a/.mailmap b/.mailmap
index db3754a4..c69d9c7 100644
--- a/.mailmap
+++ b/.mailmap
@@ -152,6 +152,7 @@
Kuninori Morimoto <kuninori.morimoto.gx@renesas.com>
Leon Romanovsky <leon@kernel.org> <leon@leon.nu>
Leon Romanovsky <leon@kernel.org> <leonro@mellanox.com>
+Leonardo Bras <leobras.c@gmail.com> <leonardo@linux.ibm.com>
Leonid I Ananiev <leonid.i.ananiev@intel.com>
Linas Vepstas <linas@austin.ibm.com>
Linus Lüssing <linus.luessing@c0d3.blue> <linus.luessing@web.de>
@@ -234,7 +235,9 @@
Ralf Wildenhues <Ralf.Wildenhues@gmx.de>
Randy Dunlap <rdunlap@infradead.org> <rdunlap@xenotime.net>
Rémi Denis-Courmont <rdenis@simphalempin.com>
-Ricardo Ribalda Delgado <ricardo.ribalda@gmail.com>
+Ricardo Ribalda <ribalda@kernel.org> <ricardo.ribalda@gmail.com>
+Ricardo Ribalda <ribalda@kernel.org> <ricardo@ribalda.com>
+Ricardo Ribalda <ribalda@kernel.org> Ricardo Ribalda Delgado <ribalda@kernel.org>
Ross Zwisler <zwisler@kernel.org> <ross.zwisler@linux.intel.com>
Rudolf Marek <R.Marek@sh.cvut.cz>
Rui Saraiva <rmps@joel.ist.utl.pt>
@@ -288,6 +291,8 @@
Vladimir Davydov <vdavydov.dev@gmail.com> <vdavydov@parallels.com>
Takashi YOSHII <takashi.yoshii.zj@renesas.com>
Will Deacon <will@kernel.org> <will.deacon@arm.com>
+Wolfram Sang <wsa@kernel.org> <wsa@the-dreams.de>
+Wolfram Sang <wsa@kernel.org> <w.sang@pengutronix.de>
Yakir Yang <kuankuan.y@gmail.com> <ykk@rock-chips.com>
Yusuke Goda <goda.yusuke@renesas.com>
Gustavo Padovan <gustavo@las.ic.unicamp.br>
diff --git a/CREDITS b/CREDITS
index 032b599..0787b5872 100644
--- a/CREDITS
+++ b/CREDITS
@@ -3104,14 +3104,16 @@
D: Generic Z8530 driver, AX.25 DAMA slave implementation
D: Several AX.25 hacks
-N: Ricardo Ribalda Delgado
-E: ricardo.ribalda@gmail.com
+N: Ricardo Ribalda
+E: ribalda@kernel.org
W: http://ribalda.com
D: PLX USB338x driver
D: PCA9634 driver
D: Option GTM671WFS
D: Fintek F81216A
D: AD5761 iio driver
+D: TI DAC7612 driver
+D: Sony IMX214 driver
D: Various kernel hacks
S: Qtechnology A/S
S: Valby Langgade 142
diff --git a/Documentation/ABI/obsolete/sysfs-cpuidle b/Documentation/ABI/obsolete/sysfs-cpuidle
new file mode 100644
index 0000000..e398fb5
--- /dev/null
+++ b/Documentation/ABI/obsolete/sysfs-cpuidle
@@ -0,0 +1,9 @@
+What: /sys/devices/system/cpu/cpuidle/current_governor_ro
+Date: April, 2020
+Contact: linux-pm@vger.kernel.org
+Description:
+ current_governor_ro shows current using cpuidle governor, but read only.
+ with the update that cpuidle governor can be changed at runtime in default,
+ both current_governor and current_governor_ro co-exist under
+ /sys/devices/system/cpu/cpuidle/ file, it's duplicate so make
+ current_governor_ro obselete.
diff --git a/Documentation/ABI/obsolete/sysfs-driver-intel_pmc_bxt b/Documentation/ABI/obsolete/sysfs-driver-intel_pmc_bxt
new file mode 100644
index 0000000..39d5659
--- /dev/null
+++ b/Documentation/ABI/obsolete/sysfs-driver-intel_pmc_bxt
@@ -0,0 +1,22 @@
+These files allow sending arbitrary IPC commands to the PMC/SCU which
+may be dangerous. These will be removed eventually and should not be
+used in any new applications.
+
+What: /sys/bus/platform/devices/INT34D2:00/simplecmd
+Date: Jun 2015
+KernelVersion: 4.1
+Contact: Mika Westerberg <mika.westerberg@linux.intel.com>
+Description: This interface allows userspace to send an arbitrary
+ IPC command to the PMC/SCU.
+
+ Format: %d %d where first number is command and
+ second number is subcommand.
+
+What: /sys/bus/platform/devices/INT34D2:00/northpeak
+Date: Jun 2015
+KernelVersion: 4.1
+Contact: Mika Westerberg <mika.westerberg@linux.intel.com>
+Description: This interface allows userspace to enable and disable
+ Northpeak through the PMC/SCU.
+
+ Format: %u.
diff --git a/Documentation/ABI/stable/sysfs-devices-node b/Documentation/ABI/stable/sysfs-devices-node
index df8413c..484fc04 100644
--- a/Documentation/ABI/stable/sysfs-devices-node
+++ b/Documentation/ABI/stable/sysfs-devices-node
@@ -54,7 +54,7 @@
Contact: Linux Memory Management list <linux-mm@kvack.org>
Description:
Provides information about the node's distribution and memory
- utilization. Similar to /proc/meminfo, see Documentation/filesystems/proc.txt
+ utilization. Similar to /proc/meminfo, see Documentation/filesystems/proc.rst
What: /sys/devices/system/node/nodeX/numastat
Date: October 2002
diff --git a/Documentation/ABI/stable/sysfs-driver-dma-idxd b/Documentation/ABI/stable/sysfs-driver-dma-idxd
index f4be46cc..b5bebf6 100644
--- a/Documentation/ABI/stable/sysfs-driver-dma-idxd
+++ b/Documentation/ABI/stable/sysfs-driver-dma-idxd
@@ -1,3 +1,9 @@
+What: sys/bus/dsa/devices/dsa<m>/version
+Date: Apr 15, 2020
+KernelVersion: 5.8.0
+Contact: dmaengine@vger.kernel.org
+Description: The hardware version number.
+
What: sys/bus/dsa/devices/dsa<m>/cdev_major
Date: Oct 25, 2019
KernelVersion: 5.6.0
diff --git a/Documentation/ABI/stable/sysfs-driver-firmware-zynqmp b/Documentation/ABI/stable/sysfs-driver-firmware-zynqmp
new file mode 100644
index 0000000..00fa04c
--- /dev/null
+++ b/Documentation/ABI/stable/sysfs-driver-firmware-zynqmp
@@ -0,0 +1,103 @@
+What: /sys/devices/platform/firmware\:zynqmp-firmware/ggs*
+Date: March 2020
+KernelVersion: 5.6
+Contact: "Jolly Shah" <jollys@xilinx.com>
+Description:
+ Read/Write PMU global general storage register value,
+ GLOBAL_GEN_STORAGE{0:3}.
+ Global general storage register that can be used
+ by system to pass information between masters.
+
+ The register is reset during system or power-on
+ resets. Three registers are used by the FSBL and
+ other Xilinx software products: GLOBAL_GEN_STORAGE{4:6}.
+
+ Usage:
+ # cat /sys/devices/platform/firmware\:zynqmp-firmware/ggs0
+ # echo <value> > /sys/devices/platform/firmware\:zynqmp-firmware/ggs0
+
+ Example:
+ # cat /sys/devices/platform/firmware\:zynqmp-firmware/ggs0
+ # echo 0x1234ABCD > /sys/devices/platform/firmware\:zynqmp-firmware/ggs0
+
+Users: Xilinx
+
+What: /sys/devices/platform/firmware\:zynqmp-firmware/pggs*
+Date: March 2020
+KernelVersion: 5.6
+Contact: "Jolly Shah" <jollys@xilinx.com>
+Description:
+ Read/Write PMU persistent global general storage register
+ value, PERS_GLOB_GEN_STORAGE{0:3}.
+ Persistent global general storage register that
+ can be used by system to pass information between
+ masters.
+
+ This register is only reset by the power-on reset
+ and maintains its value through a system reset.
+ Four registers are used by the FSBL and other Xilinx
+ software products: PERS_GLOB_GEN_STORAGE{4:7}.
+ Register is reset only by a POR reset.
+
+ Usage:
+ # cat /sys/devices/platform/firmware\:zynqmp-firmware/pggs0
+ # echo <value> > /sys/devices/platform/firmware\:zynqmp-firmware/pggs0
+
+ Example:
+ # cat /sys/devices/platform/firmware\:zynqmp-firmware/pggs0
+ # echo 0x1234ABCD > /sys/devices/platform/firmware\:zynqmp-firmware/pggs0
+
+Users: Xilinx
+
+What: /sys/devices/platform/firmware\:zynqmp-firmware/shutdown_scope
+Date: March 2020
+KernelVersion: 5.6
+Contact: "Jolly Shah" <jollys@xilinx.com>
+Description:
+ This sysfs interface allows to set the shutdown scope for the
+ next shutdown request. When the next shutdown is performed, the
+ platform specific portion of PSCI-system_off can use the chosen
+ shutdown scope.
+
+ Following are available shutdown scopes(subtypes):
+
+ subsystem: Only the APU along with all of its peripherals
+ not used by other processing units will be
+ shut down. This may result in the FPD power
+ domain being shut down provided that no other
+ processing unit uses FPD peripherals or DRAM.
+ ps_only: The complete PS will be shut down, including the
+ RPU, PMU, etc. Only the PL domain (FPGA)
+ remains untouched.
+ system: The complete system/device is shut down.
+
+ Usage:
+ # cat /sys/devices/platform/firmware\:zynqmp-firmware/shutdown_scope
+ # echo <scope> > /sys/devices/platform/firmware\:zynqmp-firmware/shutdown_scope
+
+ Example:
+ # cat /sys/devices/platform/firmware\:zynqmp-firmware/shutdown_scope
+ # echo "subsystem" > /sys/devices/platform/firmware\:zynqmp-firmware/shutdown_scope
+
+Users: Xilinx
+
+What: /sys/devices/platform/firmware\:zynqmp-firmware/health_status
+Date: March 2020
+KernelVersion: 5.6
+Contact: "Jolly Shah" <jollys@xilinx.com>
+Description:
+ This sysfs interface allows to set the health status. If PMUFW
+ is compiled with CHECK_HEALTHY_BOOT, it will check the healthy
+ bit on FPD WDT expiration. If healthy bit is set by a user
+ application running in Linux, PMUFW will do APU only restart. If
+ healthy bit is not set during FPD WDT expiration, PMUFW will do
+ system restart.
+
+ Usage:
+ Set healthy bit
+ # echo 1 > /sys/devices/platform/firmware\:zynqmp-firmware/health_status
+
+ Unset healthy bit
+ # echo 0 > /sys/devices/platform/firmware\:zynqmp-firmware/health_status
+
+Users: Xilinx
diff --git a/Documentation/ABI/testing/debugfs-driver-habanalabs b/Documentation/ABI/testing/debugfs-driver-habanalabs
index a73601c..f6d9c2a 100644
--- a/Documentation/ABI/testing/debugfs-driver-habanalabs
+++ b/Documentation/ABI/testing/debugfs-driver-habanalabs
@@ -8,6 +8,16 @@
only when the IOMMU is disabled.
The acceptable value is a string that starts with "0x"
+What: /sys/kernel/debug/habanalabs/hl<n>/clk_gate
+Date: May 2020
+KernelVersion: 5.8
+Contact: oded.gabbay@gmail.com
+Description: Allow the root user to disable/enable in runtime the clock
+ gating mechanism in Gaudi. Due to how Gaudi is built, the
+ clock gating needs to be disabled in order to access the
+ registers of the TPC and MME engines. This is sometimes needed
+ during debug by the user and hence the user needs this option
+
What: /sys/kernel/debug/habanalabs/hl<n>/command_buffers
Date: Jan 2019
KernelVersion: 5.1
@@ -150,3 +160,10 @@
Contact: oded.gabbay@gmail.com
Description: Displays a list with information about all the active virtual
address mappings per ASID
+
+What: /sys/kernel/debug/habanalabs/hl<n>/stop_on_err
+Date: Mar 2020
+KernelVersion: 5.6
+Contact: oded.gabbay@gmail.com
+Description: Sets the stop-on_error option for the device engines. Value of
+ "0" is for disable, otherwise enable.
diff --git a/Documentation/ABI/testing/debugfs-hisi-hpre b/Documentation/ABI/testing/debugfs-hisi-hpre
index ec4a79e..b4be5f1 100644
--- a/Documentation/ABI/testing/debugfs-hisi-hpre
+++ b/Documentation/ABI/testing/debugfs-hisi-hpre
@@ -33,7 +33,7 @@
Description: Dump debug registers from the HPRE.
Only available for PF.
-What: /sys/kernel/debug/hisi_hpre/<bdf>/qm/qm_regs
+What: /sys/kernel/debug/hisi_hpre/<bdf>/qm/regs
Date: Sep 2019
Contact: linux-crypto@vger.kernel.org
Description: Dump debug registers from the QM.
@@ -44,14 +44,97 @@
Date: Sep 2019
Contact: linux-crypto@vger.kernel.org
Description: One QM may contain multiple queues. Select specific queue to
- show its debug registers in above qm_regs.
+ show its debug registers in above regs.
Only available for PF.
What: /sys/kernel/debug/hisi_hpre/<bdf>/qm/clear_enable
Date: Sep 2019
Contact: linux-crypto@vger.kernel.org
-Description: QM debug registers(qm_regs) read clear control. 1 means enable
+Description: QM debug registers(regs) read clear control. 1 means enable
register read clear, otherwise 0.
Writing to this file has no functional effect, only enable or
disable counters clear after reading of these registers.
Only available for PF.
+
+What: /sys/kernel/debug/hisi_hpre/<bdf>/qm/err_irq
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the number of invalid interrupts for
+ QM task completion.
+ Available for both PF and VF, and take no other effect on HPRE.
+
+What: /sys/kernel/debug/hisi_hpre/<bdf>/qm/aeq_irq
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the number of QM async event queue interrupts.
+ Available for both PF and VF, and take no other effect on HPRE.
+
+What: /sys/kernel/debug/hisi_hpre/<bdf>/qm/abnormal_irq
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the number of interrupts for QM abnormal event.
+ Available for both PF and VF, and take no other effect on HPRE.
+
+What: /sys/kernel/debug/hisi_hpre/<bdf>/qm/create_qp_err
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the number of queue allocation errors.
+ Available for both PF and VF, and take no other effect on HPRE.
+
+What: /sys/kernel/debug/hisi_hpre/<bdf>/qm/mb_err
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the number of failed QM mailbox commands.
+ Available for both PF and VF, and take no other effect on HPRE.
+
+What: /sys/kernel/debug/hisi_hpre/<bdf>/qm/status
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the status of the QM.
+ Four states: initiated, started, stopped and closed.
+ Available for both PF and VF, and take no other effect on HPRE.
+
+What: /sys/kernel/debug/hisi_hpre/<bdf>/hpre_dfx/send_cnt
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the total number of sent requests.
+ Available for both PF and VF, and take no other effect on HPRE.
+
+What: /sys/kernel/debug/hisi_hpre/<bdf>/hpre_dfx/recv_cnt
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the total number of received requests.
+ Available for both PF and VF, and take no other effect on HPRE.
+
+What: /sys/kernel/debug/hisi_hpre/<bdf>/hpre_dfx/send_busy_cnt
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the total number of requests sent
+ with returning busy.
+ Available for both PF and VF, and take no other effect on HPRE.
+
+What: /sys/kernel/debug/hisi_hpre/<bdf>/hpre_dfx/send_fail_cnt
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the total number of completed but error requests.
+ Available for both PF and VF, and take no other effect on HPRE.
+
+What: /sys/kernel/debug/hisi_hpre/<bdf>/hpre_dfx/invalid_req_cnt
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the total number of invalid requests being received.
+ Available for both PF and VF, and take no other effect on HPRE.
+
+What: /sys/kernel/debug/hisi_hpre/<bdf>/hpre_dfx/overtime_thrhld
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Set the threshold time for counting the request which is
+ processed longer than the threshold.
+ 0: disable(default), 1: 1 microsecond.
+ Available for both PF and VF, and take no other effect on HPRE.
+
+What: /sys/kernel/debug/hisi_hpre/<bdf>/hpre_dfx/over_thrhld_cnt
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the total number of time out requests.
+ Available for both PF and VF, and take no other effect on HPRE.
diff --git a/Documentation/ABI/testing/debugfs-hisi-sec b/Documentation/ABI/testing/debugfs-hisi-sec
index 06adb89..85feb44 100644
--- a/Documentation/ABI/testing/debugfs-hisi-sec
+++ b/Documentation/ABI/testing/debugfs-hisi-sec
@@ -1,10 +1,4 @@
-What: /sys/kernel/debug/hisi_sec/<bdf>/sec_dfx
-Date: Oct 2019
-Contact: linux-crypto@vger.kernel.org
-Description: Dump the debug registers of SEC cores.
- Only available for PF.
-
-What: /sys/kernel/debug/hisi_sec/<bdf>/clear_enable
+What: /sys/kernel/debug/hisi_sec2/<bdf>/clear_enable
Date: Oct 2019
Contact: linux-crypto@vger.kernel.org
Description: Enabling/disabling of clear action after reading
@@ -12,7 +6,7 @@
0: disable, 1: enable.
Only available for PF, and take no other effect on SEC.
-What: /sys/kernel/debug/hisi_sec/<bdf>/current_qm
+What: /sys/kernel/debug/hisi_sec2/<bdf>/current_qm
Date: Oct 2019
Contact: linux-crypto@vger.kernel.org
Description: One SEC controller has one PF and multiple VFs, each function
@@ -20,24 +14,100 @@
qm refers to.
Only available for PF.
-What: /sys/kernel/debug/hisi_sec/<bdf>/qm/qm_regs
+What: /sys/kernel/debug/hisi_sec2/<bdf>/qm/qm_regs
Date: Oct 2019
Contact: linux-crypto@vger.kernel.org
Description: Dump of QM related debug registers.
Available for PF and VF in host. VF in guest currently only
has one debug register.
-What: /sys/kernel/debug/hisi_sec/<bdf>/qm/current_q
+What: /sys/kernel/debug/hisi_sec2/<bdf>/qm/current_q
Date: Oct 2019
Contact: linux-crypto@vger.kernel.org
Description: One QM of SEC may contain multiple queues. Select specific
- queue to show its debug registers in above 'qm_regs'.
+ queue to show its debug registers in above 'regs'.
Only available for PF.
-What: /sys/kernel/debug/hisi_sec/<bdf>/qm/clear_enable
+What: /sys/kernel/debug/hisi_sec2/<bdf>/qm/clear_enable
Date: Oct 2019
Contact: linux-crypto@vger.kernel.org
Description: Enabling/disabling of clear action after reading
the SEC's QM debug registers.
0: disable, 1: enable.
Only available for PF, and take no other effect on SEC.
+
+What: /sys/kernel/debug/hisi_sec2/<bdf>/qm/err_irq
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the number of invalid interrupts for
+ QM task completion.
+ Available for both PF and VF, and take no other effect on SEC.
+
+What: /sys/kernel/debug/hisi_sec2/<bdf>/qm/aeq_irq
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the number of QM async event queue interrupts.
+ Available for both PF and VF, and take no other effect on SEC.
+
+What: /sys/kernel/debug/hisi_sec2/<bdf>/qm/abnormal_irq
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the number of interrupts for QM abnormal event.
+ Available for both PF and VF, and take no other effect on SEC.
+
+What: /sys/kernel/debug/hisi_sec2/<bdf>/qm/create_qp_err
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the number of queue allocation errors.
+ Available for both PF and VF, and take no other effect on SEC.
+
+What: /sys/kernel/debug/hisi_sec2/<bdf>/qm/mb_err
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the number of failed QM mailbox commands.
+ Available for both PF and VF, and take no other effect on SEC.
+
+What: /sys/kernel/debug/hisi_sec2/<bdf>/qm/status
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the status of the QM.
+ Four states: initiated, started, stopped and closed.
+ Available for both PF and VF, and take no other effect on SEC.
+
+What: /sys/kernel/debug/hisi_sec2/<bdf>/sec_dfx/send_cnt
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the total number of sent requests.
+ Available for both PF and VF, and take no other effect on SEC.
+
+What: /sys/kernel/debug/hisi_sec2/<bdf>/sec_dfx/recv_cnt
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the total number of received requests.
+ Available for both PF and VF, and take no other effect on SEC.
+
+What: /sys/kernel/debug/hisi_sec2/<bdf>/sec_dfx/send_busy_cnt
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the total number of requests sent with returning busy.
+ Available for both PF and VF, and take no other effect on SEC.
+
+What: /sys/kernel/debug/hisi_sec2/<bdf>/sec_dfx/err_bd_cnt
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the total number of BD type error requests
+ to be received.
+ Available for both PF and VF, and take no other effect on SEC.
+
+What: /sys/kernel/debug/hisi_sec2/<bdf>/sec_dfx/invalid_req_cnt
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the total number of invalid requests being received.
+ Available for both PF and VF, and take no other effect on SEC.
+
+What: /sys/kernel/debug/hisi_sec2/<bdf>/sec_dfx/done_flag_cnt
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the total number of completed but marked error requests
+ to be received.
+ Available for both PF and VF, and take no other effect on SEC.
diff --git a/Documentation/ABI/testing/debugfs-hisi-zip b/Documentation/ABI/testing/debugfs-hisi-zip
index a7c63e6..3034a2b 100644
--- a/Documentation/ABI/testing/debugfs-hisi-zip
+++ b/Documentation/ABI/testing/debugfs-hisi-zip
@@ -26,7 +26,7 @@
has a QM. Select the QM which below qm refers to.
Only available for PF.
-What: /sys/kernel/debug/hisi_zip/<bdf>/qm/qm_regs
+What: /sys/kernel/debug/hisi_zip/<bdf>/qm/regs
Date: Nov 2018
Contact: linux-crypto@vger.kernel.org
Description: Dump of QM related debug registers.
@@ -37,14 +37,78 @@
Date: Nov 2018
Contact: linux-crypto@vger.kernel.org
Description: One QM may contain multiple queues. Select specific queue to
- show its debug registers in above qm_regs.
+ show its debug registers in above regs.
Only available for PF.
What: /sys/kernel/debug/hisi_zip/<bdf>/qm/clear_enable
Date: Nov 2018
Contact: linux-crypto@vger.kernel.org
-Description: QM debug registers(qm_regs) read clear control. 1 means enable
+Description: QM debug registers(regs) read clear control. 1 means enable
register read clear, otherwise 0.
Writing to this file has no functional effect, only enable or
disable counters clear after reading of these registers.
Only available for PF.
+
+What: /sys/kernel/debug/hisi_zip/<bdf>/qm/err_irq
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the number of invalid interrupts for
+ QM task completion.
+ Available for both PF and VF, and take no other effect on ZIP.
+
+What: /sys/kernel/debug/hisi_zip/<bdf>/qm/aeq_irq
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the number of QM async event queue interrupts.
+ Available for both PF and VF, and take no other effect on ZIP.
+
+What: /sys/kernel/debug/hisi_zip/<bdf>/qm/abnormal_irq
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the number of interrupts for QM abnormal event.
+ Available for both PF and VF, and take no other effect on ZIP.
+
+What: /sys/kernel/debug/hisi_zip/<bdf>/qm/create_qp_err
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the number of queue allocation errors.
+ Available for both PF and VF, and take no other effect on ZIP.
+
+What: /sys/kernel/debug/hisi_zip/<bdf>/qm/mb_err
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the number of failed QM mailbox commands.
+ Available for both PF and VF, and take no other effect on ZIP.
+
+What: /sys/kernel/debug/hisi_zip/<bdf>/qm/status
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the status of the QM.
+ Four states: initiated, started, stopped and closed.
+ Available for both PF and VF, and take no other effect on ZIP.
+
+What: /sys/kernel/debug/hisi_zip/<bdf>/zip_dfx/send_cnt
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the total number of sent requests.
+ Available for both PF and VF, and take no other effect on ZIP.
+
+What: /sys/kernel/debug/hisi_zip/<bdf>/zip_dfx/recv_cnt
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the total number of received requests.
+ Available for both PF and VF, and take no other effect on ZIP.
+
+What: /sys/kernel/debug/hisi_zip/<bdf>/zip_dfx/send_busy_cnt
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the total number of requests received
+ with returning busy.
+ Available for both PF and VF, and take no other effect on ZIP.
+
+What: /sys/kernel/debug/hisi_zip/<bdf>/zip_dfx/err_bd_cnt
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the total number of BD type error requests
+ to be received.
+ Available for both PF and VF, and take no other effect on ZIP.
diff --git a/Documentation/ABI/testing/dev-kmsg b/Documentation/ABI/testing/dev-kmsg
index f307506e..1e6c28b 100644
--- a/Documentation/ABI/testing/dev-kmsg
+++ b/Documentation/ABI/testing/dev-kmsg
@@ -56,6 +56,11 @@
seek after the last record available at the time
the last SYSLOG_ACTION_CLEAR was issued.
+ Due to the record nature of this interface with a "read all"
+ behavior and the specific positions each seek operation sets,
+ SEEK_CUR is not supported, returning -ESPIPE (invalid seek) to
+ errno whenever requested.
+
The output format consists of a prefix carrying the syslog
prefix including priority and facility, the 64 bit message
sequence number and the monotonic timestamp in microseconds,
diff --git a/Documentation/ABI/testing/procfs-smaps_rollup b/Documentation/ABI/testing/procfs-smaps_rollup
index 274df44..0469781 100644
--- a/Documentation/ABI/testing/procfs-smaps_rollup
+++ b/Documentation/ABI/testing/procfs-smaps_rollup
@@ -11,7 +11,7 @@
Additionally, the fields Pss_Anon, Pss_File and Pss_Shmem
are not present in /proc/pid/smaps. These fields represent
the sum of the Pss field of each type (anon, file, shmem).
- For more details, see Documentation/filesystems/proc.txt
+ For more details, see Documentation/filesystems/proc.rst
and the procfs man page.
Typical output looks like this:
diff --git a/Documentation/ABI/testing/sysfs-block-rnbd b/Documentation/ABI/testing/sysfs-block-rnbd
new file mode 100644
index 0000000..8f070b4
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-block-rnbd
@@ -0,0 +1,46 @@
+What: /sys/block/rnbd<N>/rnbd/unmap_device
+Date: Feb 2020
+KernelVersion: 5.7
+Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
+Description: To unmap a volume, "normal" or "force" has to be written to:
+ /sys/block/rnbd<N>/rnbd/unmap_device
+
+ When "normal" is used, the operation will fail with EBUSY if any process
+ is using the device. When "force" is used, the device is also unmapped
+ when device is in use. All I/Os that are in progress will fail.
+
+ Example:
+
+ # echo "normal" > /sys/block/rnbd0/rnbd/unmap_device
+
+What: /sys/block/rnbd<N>/rnbd/state
+Date: Feb 2020
+KernelVersion: 5.7
+Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
+Description: The file contains the current state of the block device. The state file
+ returns "open" when the device is successfully mapped from the server
+ and accepting I/O requests. When the connection to the server gets
+ disconnected in case of an error (e.g. link failure), the state file
+ returns "closed" and all I/O requests submitted to it will fail with -EIO.
+
+What: /sys/block/rnbd<N>/rnbd/session
+Date: Feb 2020
+KernelVersion: 5.7
+Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
+Description: RNBD uses RTRS session to transport the data between client and
+ server. The entry "session" contains the name of the session, that
+ was used to establish the RTRS session. It's the same name that
+ was passed as server parameter to the map_device entry.
+
+What: /sys/block/rnbd<N>/rnbd/mapping_path
+Date: Feb 2020
+KernelVersion: 5.7
+Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
+Description: Contains the path that was passed as "device_path" to the map_device
+ operation.
+
+What: /sys/block/rnbd<N>/rnbd/access_mode
+Date: Feb 2020
+KernelVersion: 5.7
+Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
+Description: Contains the device access mode: ro, rw or migration.
diff --git a/Documentation/ABI/testing/sysfs-bus-event_source-devices-dfl_fme b/Documentation/ABI/testing/sysfs-bus-event_source-devices-dfl_fme
new file mode 100644
index 0000000..c9278a3
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-bus-event_source-devices-dfl_fme
@@ -0,0 +1,104 @@
+What: /sys/bus/event_source/devices/dfl_fmeX/format
+Date: April 2020
+KernelVersion: 5.8
+Contact: Wu Hao <hao.wu@intel.com>
+Description: Read-only. Attribute group to describe the magic bits
+ that go into perf_event_attr.config for a particular pmu.
+ (See ABI/testing/sysfs-bus-event_source-devices-format).
+
+ Each attribute under this group defines a bit range of the
+ perf_event_attr.config. All supported attributes are listed
+ below.
+
+ event = "config:0-11" - event ID
+ evtype = "config:12-15" - event type
+ portid = "config:16-23" - event source
+
+ For example,
+
+ fab_mmio_read = "event=0x06,evtype=0x02,portid=0xff"
+
+ It shows this fab_mmio_read is a fabric type (0x02) event with
+ 0x06 local event id for overall monitoring (portid=0xff).
+
+What: /sys/bus/event_source/devices/dfl_fmeX/cpumask
+Date: April 2020
+KernelVersion: 5.8
+Contact: Wu Hao <hao.wu@intel.com>
+Description: Read-only. This file always returns cpu which the PMU is bound
+ for access to all fme pmu performance monitoring events.
+
+What: /sys/bus/event_source/devices/dfl_fmeX/events
+Date: April 2020
+KernelVersion: 5.8
+Contact: Wu Hao <hao.wu@intel.com>
+Description: Read-only. Attribute group to describe performance monitoring
+ events specific to fme. Each attribute in this group describes
+ a single performance monitoring event supported by this fme pmu.
+ The name of the file is the name of the event.
+ (See ABI/testing/sysfs-bus-event_source-devices-events).
+
+ All supported performance monitoring events are listed below.
+
+ Basic events (evtype=0x00)
+
+ clock = "event=0x00,evtype=0x00,portid=0xff"
+
+ Cache events (evtype=0x01)
+
+ cache_read_hit = "event=0x00,evtype=0x01,portid=0xff"
+ cache_read_miss = "event=0x01,evtype=0x01,portid=0xff"
+ cache_write_hit = "event=0x02,evtype=0x01,portid=0xff"
+ cache_write_miss = "event=0x03,evtype=0x01,portid=0xff"
+ cache_hold_request = "event=0x05,evtype=0x01,portid=0xff"
+ cache_data_write_port_contention =
+ "event=0x06,evtype=0x01,portid=0xff"
+ cache_tag_write_port_contention =
+ "event=0x07,evtype=0x01,portid=0xff"
+ cache_tx_req_stall = "event=0x08,evtype=0x01,portid=0xff"
+ cache_rx_req_stall = "event=0x09,evtype=0x01,portid=0xff"
+ cache_eviction = "event=0x0a,evtype=0x01,portid=0xff"
+
+ Fabric events (evtype=0x02)
+
+ fab_pcie0_read = "event=0x00,evtype=0x02,portid=0xff"
+ fab_pcie0_write = "event=0x01,evtype=0x02,portid=0xff"
+ fab_pcie1_read = "event=0x02,evtype=0x02,portid=0xff"
+ fab_pcie1_write = "event=0x03,evtype=0x02,portid=0xff"
+ fab_upi_read = "event=0x04,evtype=0x02,portid=0xff"
+ fab_upi_write = "event=0x05,evtype=0x02,portid=0xff"
+ fab_mmio_read = "event=0x06,evtype=0x02,portid=0xff"
+ fab_mmio_write = "event=0x07,evtype=0x02,portid=0xff"
+ fab_port_pcie0_read = "event=0x00,evtype=0x02,portid=?"
+ fab_port_pcie0_write = "event=0x01,evtype=0x02,portid=?"
+ fab_port_pcie1_read = "event=0x02,evtype=0x02,portid=?"
+ fab_port_pcie1_write = "event=0x03,evtype=0x02,portid=?"
+ fab_port_upi_read = "event=0x04,evtype=0x02,portid=?"
+ fab_port_upi_write = "event=0x05,evtype=0x02,portid=?"
+ fab_port_mmio_read = "event=0x06,evtype=0x02,portid=?"
+ fab_port_mmio_write = "event=0x07,evtype=0x02,portid=?"
+
+ VTD events (evtype=0x03)
+
+ vtd_port_read_transaction = "event=0x00,evtype=0x03,portid=?"
+ vtd_port_write_transaction = "event=0x01,evtype=0x03,portid=?"
+ vtd_port_devtlb_read_hit = "event=0x02,evtype=0x03,portid=?"
+ vtd_port_devtlb_write_hit = "event=0x03,evtype=0x03,portid=?"
+ vtd_port_devtlb_4k_fill = "event=0x04,evtype=0x03,portid=?"
+ vtd_port_devtlb_2m_fill = "event=0x05,evtype=0x03,portid=?"
+ vtd_port_devtlb_1g_fill = "event=0x06,evtype=0x03,portid=?"
+
+ VTD SIP events (evtype=0x04)
+
+ vtd_sip_iotlb_4k_hit = "event=0x00,evtype=0x04,portid=0xff"
+ vtd_sip_iotlb_2m_hit = "event=0x01,evtype=0x04,portid=0xff"
+ vtd_sip_iotlb_1g_hit = "event=0x02,evtype=0x04,portid=0xff"
+ vtd_sip_slpwc_l3_hit = "event=0x03,evtype=0x04,portid=0xff"
+ vtd_sip_slpwc_l4_hit = "event=0x04,evtype=0x04,portid=0xff"
+ vtd_sip_rcc_hit = "event=0x05,evtype=0x04,portid=0xff"
+ vtd_sip_iotlb_4k_miss = "event=0x06,evtype=0x04,portid=0xff"
+ vtd_sip_iotlb_2m_miss = "event=0x07,evtype=0x04,portid=0xff"
+ vtd_sip_iotlb_1g_miss = "event=0x08,evtype=0x04,portid=0xff"
+ vtd_sip_slpwc_l3_miss = "event=0x09,evtype=0x04,portid=0xff"
+ vtd_sip_slpwc_l4_miss = "event=0x0a,evtype=0x04,portid=0xff"
+ vtd_sip_rcc_miss = "event=0x0b,evtype=0x04,portid=0xff"
diff --git a/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7 b/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7
index ec27c6c..e8698af 100644
--- a/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7
+++ b/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7
@@ -22,6 +22,27 @@
Exposes the "version" field of the 24x7 catalog. This is also
extractable from the provided binary "catalog" sysfs entry.
+What: /sys/devices/hv_24x7/interface/sockets
+Date: May 2020
+Contact: Linux on PowerPC Developer List <linuxppc-dev@lists.ozlabs.org>
+Description: read only
+ This sysfs interface exposes the number of sockets present in the
+ system.
+
+What: /sys/devices/hv_24x7/interface/chipspersocket
+Date: May 2020
+Contact: Linux on PowerPC Developer List <linuxppc-dev@lists.ozlabs.org>
+Description: read only
+ This sysfs interface exposes the number of chips per socket
+ present in the system.
+
+What: /sys/devices/hv_24x7/interface/coresperchip
+Date: May 2020
+Contact: Linux on PowerPC Developer List <linuxppc-dev@lists.ozlabs.org>
+Description: read only
+ This sysfs interface exposes the number of cores per chip
+ present in the system.
+
What: /sys/bus/event_source/devices/hv_24x7/event_descs/<event-name>
Date: February 2014
Contact: Linux on PowerPC Developer List <linuxppc-dev@lists.ozlabs.org>
diff --git a/Documentation/ABI/testing/sysfs-bus-iio-proximity b/Documentation/ABI/testing/sysfs-bus-iio-proximity
new file mode 100644
index 0000000..2172f3b
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-bus-iio-proximity
@@ -0,0 +1,10 @@
+What: /sys/bus/iio/devices/iio:deviceX/in_proximity_nearlevel
+Date: March 2020
+KernelVersion: 5.7
+Contact: linux-iio@vger.kernel.org
+Description:
+ Near level for proximity sensors. This is a single integer
+ value that tells user space when an object should be
+ considered close to the device. If the value read from the
+ sensor is above or equal to the value in this file an object
+ should typically be considered near.
diff --git a/Documentation/ABI/testing/sysfs-bus-iio-sx9310 b/Documentation/ABI/testing/sysfs-bus-iio-sx9310
new file mode 100644
index 0000000..3ac7759
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-bus-iio-sx9310
@@ -0,0 +1,10 @@
+What: /sys/bus/iio/devices/iio:deviceX/in_proximity3_comb_raw
+Date: February 2019
+KernelVersion: 5.6
+Contact: Daniel Campello <campello@chromium.org>
+Description:
+ Proximity measurement indicating that some object is
+ near the combined sensor. The combined sensor presents
+ proximity measurements constructed by hardware by
+ combining measurements taken from a given set of
+ physical sensors.
diff --git a/Documentation/ABI/testing/sysfs-bus-most b/Documentation/ABI/testing/sysfs-bus-most
index 6b1d06e..ec0a603 100644
--- a/Documentation/ABI/testing/sysfs-bus-most
+++ b/Documentation/ABI/testing/sysfs-bus-most
@@ -1,14 +1,14 @@
-What: /sys/bus/most/devices/.../description
+What: /sys/bus/most/devices/<dev>/description
Date: March 2017
KernelVersion: 4.15
Contact: Christian Gromm <christian.gromm@microchip.com>
Description:
- Provides information about the interface type and the physical
- location of the device. Hardware attached via USB, for instance,
+ Provides information about the physical location of the
+ device. Hardware attached via USB, for instance,
might return <1-1.1:1.0>
Users:
-What: /sys/bus/most/devices/.../interface
+What: /sys/bus/most/devices/<dev>/interface
Date: March 2017
KernelVersion: 4.15
Contact: Christian Gromm <christian.gromm@microchip.com>
@@ -16,7 +16,7 @@
Indicates the type of peripheral interface the device uses.
Users:
-What: /sys/bus/most/devices/.../dci
+What: /sys/bus/most/devices/<dev>/dci
Date: June 2016
KernelVersion: 4.15
Contact: Christian Gromm <christian.gromm@microchip.com>
@@ -26,7 +26,7 @@
write the controller's DCI registers.
Users:
-What: /sys/bus/most/devices/.../dci/arb_address
+What: /sys/bus/most/devices/<dev>/dci/arb_address
Date: June 2016
KernelVersion: 4.15
Contact: Christian Gromm <christian.gromm@microchip.com>
@@ -35,7 +35,7 @@
application wants to read from or write to.
Users:
-What: /sys/bus/most/devices/.../dci/arb_value
+What: /sys/bus/most/devices/<dev>/dci/arb_value
Date: June 2016
KernelVersion: 4.15
Contact: Christian Gromm <christian.gromm@microchip.com>
@@ -44,7 +44,7 @@
is stored in arb_address.
Users:
-What: /sys/bus/most/devices/.../dci/mep_eui48_hi
+What: /sys/bus/most/devices/<dev>/dci/mep_eui48_hi
Date: June 2016
KernelVersion: 4.15
Contact: Christian Gromm <christian.gromm@microchip.com>
@@ -52,7 +52,7 @@
This is used to check and configure the MAC address.
Users:
-What: /sys/bus/most/devices/.../dci/mep_eui48_lo
+What: /sys/bus/most/devices/<dev>/dci/mep_eui48_lo
Date: June 2016
KernelVersion: 4.15
Contact: Christian Gromm <christian.gromm@microchip.com>
@@ -60,7 +60,7 @@
This is used to check and configure the MAC address.
Users:
-What: /sys/bus/most/devices/.../dci/mep_eui48_mi
+What: /sys/bus/most/devices/<dev>/dci/mep_eui48_mi
Date: June 2016
KernelVersion: 4.15
Contact: Christian Gromm <christian.gromm@microchip.com>
@@ -68,7 +68,7 @@
This is used to check and configure the MAC address.
Users:
-What: /sys/bus/most/devices/.../dci/mep_filter
+What: /sys/bus/most/devices/<dev>/dci/mep_filter
Date: June 2016
KernelVersion: 4.15
Contact: Christian Gromm <christian.gromm@microchip.com>
@@ -76,7 +76,7 @@
This is used to check and configure the MEP filter address.
Users:
-What: /sys/bus/most/devices/.../dci/mep_hash0
+What: /sys/bus/most/devices/<dev>/dci/mep_hash0
Date: June 2016
KernelVersion: 4.15
Contact: Christian Gromm <christian.gromm@microchip.com>
@@ -84,7 +84,7 @@
This is used to check and configure the MEP hash table.
Users:
-What: /sys/bus/most/devices/.../dci/mep_hash1
+What: /sys/bus/most/devices/<dev>/dci/mep_hash1
Date: June 2016
KernelVersion: 4.15
Contact: Christian Gromm <christian.gromm@microchip.com>
@@ -92,7 +92,7 @@
This is used to check and configure the MEP hash table.
Users:
-What: /sys/bus/most/devices/.../dci/mep_hash2
+What: /sys/bus/most/devices/<dev>/dci/mep_hash2
Date: June 2016
KernelVersion: 4.15
Contact: Christian Gromm <christian.gromm@microchip.com>
@@ -100,7 +100,7 @@
This is used to check and configure the MEP hash table.
Users:
-What: /sys/bus/most/devices/.../dci/mep_hash3
+What: /sys/bus/most/devices/<dev>/dci/mep_hash3
Date: June 2016
KernelVersion: 4.15
Contact: Christian Gromm <christian.gromm@microchip.com>
@@ -108,7 +108,7 @@
This is used to check and configure the MEP hash table.
Users:
-What: /sys/bus/most/devices/.../dci/ni_state
+What: /sys/bus/most/devices/<dev>/dci/ni_state
Date: June 2016
KernelVersion: 4.15
Contact: Christian Gromm <christian.gromm@microchip.com>
@@ -116,7 +116,7 @@
Indicates the current network interface state.
Users:
-What: /sys/bus/most/devices/.../dci/node_address
+What: /sys/bus/most/devices/<dev>/dci/node_address
Date: June 2016
KernelVersion: 4.15
Contact: Christian Gromm <christian.gromm@microchip.com>
@@ -124,7 +124,7 @@
Indicates the current node address.
Users:
-What: /sys/bus/most/devices/.../dci/node_position
+What: /sys/bus/most/devices/<dev>/dci/node_position
Date: June 2016
KernelVersion: 4.15
Contact: Christian Gromm <christian.gromm@microchip.com>
@@ -132,7 +132,7 @@
Indicates the current node position.
Users:
-What: /sys/bus/most/devices/.../dci/packet_bandwidth
+What: /sys/bus/most/devices/<dev>/dci/packet_bandwidth
Date: June 2016
KernelVersion: 4.15
Contact: Christian Gromm <christian.gromm@microchip.com>
@@ -140,7 +140,7 @@
Indicates the configured packet bandwidth.
Users:
-What: /sys/bus/most/devices/.../dci/sync_ep
+What: /sys/bus/most/devices/<dev>/dci/sync_ep
Date: June 2016
KernelVersion: 4.15
Contact: Christian Gromm <christian.gromm@microchip.com>
@@ -149,7 +149,7 @@
endpoint.
Users:
-What: /sys/bus/most/devices/.../<channel>/
+What: /sys/bus/most/devices/<dev>/<channel>/
Date: March 2017
KernelVersion: 4.15
Contact: Christian Gromm <christian.gromm@microchip.com>
@@ -160,91 +160,92 @@
configure it.
Users:
-What: /sys/bus/most/devices/.../<channel>/available_datatypes
+What: /sys/bus/most/devices/<dev>/<channel>/available_datatypes
Date: March 2017
KernelVersion: 4.15
Contact: Christian Gromm <christian.gromm@microchip.com>
Description:
- Indicates the data types the current channel can transport.
+ Indicates the data types the channel can transport.
Users:
-What: /sys/bus/most/devices/.../<channel>/available_directions
+What: /sys/bus/most/devices/<dev>/<channel>/available_directions
Date: March 2017
KernelVersion: 4.15
Contact: Christian Gromm <christian.gromm@microchip.com>
Description:
- Indicates the directions the current channel is capable of.
+ Indicates the directions the channel is capable of.
Users:
-What: /sys/bus/most/devices/.../<channel>/number_of_packet_buffers
+What: /sys/bus/most/devices/<dev>/<channel>/number_of_packet_buffers
Date: March 2017
KernelVersion: 4.15
Contact: Christian Gromm <christian.gromm@microchip.com>
Description:
- Indicates the number of packet buffers the current channel can
+ Indicates the number of packet buffers the channel can
handle.
Users:
-What: /sys/bus/most/devices/.../<channel>/number_of_stream_buffers
+What: /sys/bus/most/devices/<dev>/<channel>/number_of_stream_buffers
Date: March 2017
KernelVersion: 4.15
Contact: Christian Gromm <christian.gromm@microchip.com>
Description:
- Indicates the number of streaming buffers the current channel can
+ Indicates the number of streaming buffers the channel can
handle.
Users:
-What: /sys/bus/most/devices/.../<channel>/size_of_packet_buffer
+What: /sys/bus/most/devices/<dev>/<channel>/size_of_packet_buffer
Date: March 2017
KernelVersion: 4.15
Contact: Christian Gromm <christian.gromm@microchip.com>
Description:
- Indicates the size of a packet buffer the current channel can
+ Indicates the size of a packet buffer the channel can
handle.
Users:
-What: /sys/bus/most/devices/.../<channel>/size_of_stream_buffer
+What: /sys/bus/most/devices/<dev>/<channel>/size_of_stream_buffer
Date: March 2017
KernelVersion: 4.15
Contact: Christian Gromm <christian.gromm@microchip.com>
Description:
- Indicates the size of a streaming buffer the current channel can
+ Indicates the size of a streaming buffer the channel can
handle.
Users:
-What: /sys/bus/most/devices/.../<channel>/set_number_of_buffers
+What: /sys/bus/most/devices/<dev>/<channel>/set_number_of_buffers
Date: March 2017
KernelVersion: 4.15
Contact: Christian Gromm <christian.gromm@microchip.com>
Description:
- This is to configure the number of buffers of the current channel.
+ This is to read back the configured number of buffers of
+ the channel.
Users:
-What: /sys/bus/most/devices/.../<channel>/set_buffer_size
+What: /sys/bus/most/devices/<dev>/<channel>/set_buffer_size
Date: March 2017
KernelVersion: 4.15
Contact: Christian Gromm <christian.gromm@microchip.com>
Description:
- This is to configure the size of a buffer of the current channel.
+ This is to read back the configured buffer size of the channel.
Users:
-What: /sys/bus/most/devices/.../<channel>/set_direction
+What: /sys/bus/most/devices/<dev>/<channel>/set_direction
Date: March 2017
KernelVersion: 4.15
Contact: Christian Gromm <christian.gromm@microchip.com>
Description:
- This is to configure the direction of the current channel.
+ This is to read back the configured direction of the channel.
The following strings will be accepted:
- 'dir_tx',
- 'dir_rx'
+ 'tx',
+ 'rx'
Users:
-What: /sys/bus/most/devices/.../<channel>/set_datatype
+What: /sys/bus/most/devices/<dev>/<channel>/set_datatype
Date: March 2017
KernelVersion: 4.15
Contact: Christian Gromm <christian.gromm@microchip.com>
Description:
- This is to configure the data type of the current channel.
+ This is to read back the configured data type of the channel.
The following strings will be accepted:
'control',
'async',
@@ -252,30 +253,31 @@
'isoc_avp'
Users:
-What: /sys/bus/most/devices/.../<channel>/set_subbuffer_size
+What: /sys/bus/most/devices/<dev>/<channel>/set_subbuffer_size
Date: March 2017
KernelVersion: 4.15
Contact: Christian Gromm <christian.gromm@microchip.com>
Description:
- This is to configure the subbuffer size of the current channel.
+ This is to read back the configured subbuffer size of
+ the channel.
Users:
-What: /sys/bus/most/devices/.../<channel>/set_packets_per_xact
+What: /sys/bus/most/devices/<dev>/<channel>/set_packets_per_xact
Date: March 2017
KernelVersion: 4.15
Contact: Christian Gromm <christian.gromm@microchip.com>
Description:
- This is to configure the number of packets per transaction of
- the current channel. This is only needed network interface
- controller is attached via USB.
+ This is to read back the configured number of packets per
+ transaction of the channel. This is only applicable when
+ connected via USB.
Users:
-What: /sys/bus/most/devices/.../<channel>/channel_starving
+What: /sys/bus/most/devices/<dev>/<channel>/channel_starving
Date: March 2017
KernelVersion: 4.15
Contact: Christian Gromm <christian.gromm@microchip.com>
Description:
- Indicates whether current channel ran out of buffers.
+ Indicates whether channel ran out of buffers.
Users:
What: /sys/bus/most/drivers/most_core/components
diff --git a/Documentation/ABI/testing/sysfs-bus-soundwire-master b/Documentation/ABI/testing/sysfs-bus-soundwire-master
new file mode 100644
index 0000000..46ef038
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-bus-soundwire-master
@@ -0,0 +1,23 @@
+What: /sys/bus/soundwire/devices/sdw-master-N/revision
+ /sys/bus/soundwire/devices/sdw-master-N/clk_stop_modes
+ /sys/bus/soundwire/devices/sdw-master-N/clk_freq
+ /sys/bus/soundwire/devices/sdw-master-N/clk_gears
+ /sys/bus/soundwire/devices/sdw-master-N/default_col
+ /sys/bus/soundwire/devices/sdw-master-N/default_frame_rate
+ /sys/bus/soundwire/devices/sdw-master-N/default_row
+ /sys/bus/soundwire/devices/sdw-master-N/dynamic_shape
+ /sys/bus/soundwire/devices/sdw-master-N/err_threshold
+ /sys/bus/soundwire/devices/sdw-master-N/max_clk_freq
+
+Date: April 2020
+
+Contact: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
+ Bard Liao <yung-chuan.liao@linux.intel.com>
+ Vinod Koul <vkoul@kernel.org>
+
+Description: SoundWire Master-N DisCo properties.
+ These properties are defined by MIPI DisCo Specification
+ for SoundWire. They define various properties of the Master
+ and are used by the bus to configure the Master. clk_stop_modes
+ is a bitmask for simplifications and combines the
+ clock-stop-mode0 and clock-stop-mode1 properties.
diff --git a/Documentation/ABI/testing/sysfs-bus-soundwire-slave b/Documentation/ABI/testing/sysfs-bus-soundwire-slave
new file mode 100644
index 0000000..db4c951
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-bus-soundwire-slave
@@ -0,0 +1,91 @@
+What: /sys/bus/soundwire/devices/sdw:.../dev-properties/mipi_revision
+ /sys/bus/soundwire/devices/sdw:.../dev-properties/wake_capable
+ /sys/bus/soundwire/devices/sdw:.../dev-properties/test_mode_capable
+ /sys/bus/soundwire/devices/sdw:.../dev-properties/clk_stop_mode1
+ /sys/bus/soundwire/devices/sdw:.../dev-properties/simple_clk_stop_capable
+ /sys/bus/soundwire/devices/sdw:.../dev-properties/clk_stop_timeout
+ /sys/bus/soundwire/devices/sdw:.../dev-properties/ch_prep_timeout
+ /sys/bus/soundwire/devices/sdw:.../dev-properties/reset_behave
+ /sys/bus/soundwire/devices/sdw:.../dev-properties/high_PHY_capable
+ /sys/bus/soundwire/devices/sdw:.../dev-properties/paging_support
+ /sys/bus/soundwire/devices/sdw:.../dev-properties/bank_delay_support
+ /sys/bus/soundwire/devices/sdw:.../dev-properties/p15_behave
+ /sys/bus/soundwire/devices/sdw:.../dev-properties/master_count
+ /sys/bus/soundwire/devices/sdw:.../dev-properties/source_ports
+ /sys/bus/soundwire/devices/sdw:.../dev-properties/sink_ports
+
+Date: May 2020
+
+Contact: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
+ Bard Liao <yung-chuan.liao@linux.intel.com>
+ Vinod Koul <vkoul@kernel.org>
+
+Description: SoundWire Slave DisCo properties.
+ These properties are defined by MIPI DisCo Specification
+ for SoundWire. They define various properties of the
+ SoundWire Slave and are used by the bus to configure
+ the Slave
+
+
+What: /sys/bus/soundwire/devices/sdw:.../dp0/max_word
+ /sys/bus/soundwire/devices/sdw:.../dp0/min_word
+ /sys/bus/soundwire/devices/sdw:.../dp0/words
+ /sys/bus/soundwire/devices/sdw:.../dp0/BRA_flow_controlled
+ /sys/bus/soundwire/devices/sdw:.../dp0/simple_ch_prep_sm
+ /sys/bus/soundwire/devices/sdw:.../dp0/imp_def_interrupts
+
+Date: May 2020
+
+Contact: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
+ Bard Liao <yung-chuan.liao@linux.intel.com>
+ Vinod Koul <vkoul@kernel.org>
+
+Description: SoundWire Slave Data Port-0 DisCo properties.
+ These properties are defined by MIPI DisCo Specification
+ for the SoundWire. They define various properties of the
+ Data port 0 are used by the bus to configure the Data Port 0.
+
+
+What: /sys/bus/soundwire/devices/sdw:.../dpN_src/max_word
+ /sys/bus/soundwire/devices/sdw:.../dpN_src/min_word
+ /sys/bus/soundwire/devices/sdw:.../dpN_src/words
+ /sys/bus/soundwire/devices/sdw:.../dpN_src/type
+ /sys/bus/soundwire/devices/sdw:.../dpN_src/max_grouping
+ /sys/bus/soundwire/devices/sdw:.../dpN_src/simple_ch_prep_sm
+ /sys/bus/soundwire/devices/sdw:.../dpN_src/ch_prep_timeout
+ /sys/bus/soundwire/devices/sdw:.../dpN_src/imp_def_interrupts
+ /sys/bus/soundwire/devices/sdw:.../dpN_src/min_ch
+ /sys/bus/soundwire/devices/sdw:.../dpN_src/max_ch
+ /sys/bus/soundwire/devices/sdw:.../dpN_src/channels
+ /sys/bus/soundwire/devices/sdw:.../dpN_src/ch_combinations
+ /sys/bus/soundwire/devices/sdw:.../dpN_src/max_async_buffer
+ /sys/bus/soundwire/devices/sdw:.../dpN_src/block_pack_mode
+ /sys/bus/soundwire/devices/sdw:.../dpN_src/port_encoding
+
+ /sys/bus/soundwire/devices/sdw:.../dpN_sink/max_word
+ /sys/bus/soundwire/devices/sdw:.../dpN_sink/min_word
+ /sys/bus/soundwire/devices/sdw:.../dpN_sink/words
+ /sys/bus/soundwire/devices/sdw:.../dpN_sink/type
+ /sys/bus/soundwire/devices/sdw:.../dpN_sink/max_grouping
+ /sys/bus/soundwire/devices/sdw:.../dpN_sink/simple_ch_prep_sm
+ /sys/bus/soundwire/devices/sdw:.../dpN_sink/ch_prep_timeout
+ /sys/bus/soundwire/devices/sdw:.../dpN_sink/imp_def_interrupts
+ /sys/bus/soundwire/devices/sdw:.../dpN_sink/min_ch
+ /sys/bus/soundwire/devices/sdw:.../dpN_sink/max_ch
+ /sys/bus/soundwire/devices/sdw:.../dpN_sink/channels
+ /sys/bus/soundwire/devices/sdw:.../dpN_sink/ch_combinations
+ /sys/bus/soundwire/devices/sdw:.../dpN_sink/max_async_buffer
+ /sys/bus/soundwire/devices/sdw:.../dpN_sink/block_pack_mode
+ /sys/bus/soundwire/devices/sdw:.../dpN_sink/port_encoding
+
+Date: May 2020
+
+Contact: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
+ Bard Liao <yung-chuan.liao@linux.intel.com>
+ Vinod Koul <vkoul@kernel.org>
+
+Description: SoundWire Slave Data Source/Sink Port-N DisCo properties.
+ These properties are defined by MIPI DisCo Specification
+ for SoundWire. They define various properties of the
+ Source/Sink Data port N and are used by the bus to configure
+ the Data Port N.
diff --git a/Documentation/ABI/testing/sysfs-class-net b/Documentation/ABI/testing/sysfs-class-net
index 664a8f6..3b40457 100644
--- a/Documentation/ABI/testing/sysfs-class-net
+++ b/Documentation/ABI/testing/sysfs-class-net
@@ -124,6 +124,19 @@
authentication is performed (e.g: 802.1x). 'link_mode' attribute
will also reflect the dormant state.
+What: /sys/class/net/<iface>/testing
+Date: April 2002
+KernelVersion: 5.8
+Contact: netdev@vger.kernel.org
+Description:
+ Indicates whether the interface is under test. Possible
+ values are:
+ 0: interface is not being tested
+ 1: interface is being tested
+
+ When an interface is under test, it cannot be expected
+ to pass packets as normal.
+
What: /sys/clas/net/<iface>/duplex
Date: October 2009
KernelVersion: 2.6.33
diff --git a/Documentation/ABI/testing/sysfs-class-power b/Documentation/ABI/testing/sysfs-class-power
index bf3b48f..216d61a 100644
--- a/Documentation/ABI/testing/sysfs-class-power
+++ b/Documentation/ABI/testing/sysfs-class-power
@@ -74,6 +74,21 @@
Access: Read, Write
Valid values: 0 - 100 (percent)
+What: /sys/class/power_supply/<supply_name>/capacity_error_margin
+Date: April 2019
+Contact: linux-pm@vger.kernel.org
+Description:
+ Battery capacity measurement becomes unreliable without
+ recalibration. This values provides the maximum error
+ margin expected to exist by the fuel gauge in percent.
+ Values close to 0% will be returned after (re-)calibration
+ has happened. Over time the error margin will increase.
+ 100% means, that the capacity related values are basically
+ completely useless.
+
+ Access: Read
+ Valid values: 0 - 100 (percent)
+
What: /sys/class/power_supply/<supply_name>/capacity_level
Date: June 2009
Contact: linux-pm@vger.kernel.org
@@ -190,7 +205,7 @@
Valid values: "Unknown", "Good", "Overheat", "Dead",
"Over voltage", "Unspecified failure", "Cold",
"Watchdog timer expire", "Safety timer expire",
- "Over current"
+ "Over current", "Calibration required"
What: /sys/class/power_supply/<supply_name>/precharge_current
Date: June 2017
@@ -665,3 +680,31 @@
Valid values:
- 1: enabled
- 0: disabled
+
+What: /sys/class/power_supply/<supply_name>/manufacture_year
+Date: January 2020
+Contact: linux-pm@vger.kernel.org
+Description:
+ Reports the year (following Gregorian calendar) when the device has been
+ manufactured.
+
+ Access: Read
+ Valid values: Reported as integer
+
+What: /sys/class/power_supply/<supply_name>/manufacture_month
+Date: January 2020
+Contact: linux-pm@vger.kernel.org
+Description:
+ Reports the month when the device has been manufactured.
+
+ Access: Read
+ Valid values: 1-12
+
+What: /sys/class/power_supply/<supply_name>/manufacture_day
+Date: January 2020
+Contact: linux-pm@vger.kernel.org
+Description:
+ Reports the day of month when the device has been manufactured.
+
+ Access: Read
+ Valid values: 1-31
diff --git a/Documentation/ABI/testing/sysfs-class-power-mp2629 b/Documentation/ABI/testing/sysfs-class-power-mp2629
new file mode 100644
index 0000000..327a07e
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-class-power-mp2629
@@ -0,0 +1,8 @@
+What: /sys/class/power_supply/mp2629_battery/batt_impedance_compen
+Date: April 2020
+KernelVersion: 5.7
+Description:
+ Represents a battery impedance compensation to accelerate charging.
+
+ Access: Read, Write
+ Valid values: Represented in milli-ohms. Valid range is [0, 140].
diff --git a/Documentation/ABI/testing/sysfs-class-rnbd-client b/Documentation/ABI/testing/sysfs-class-rnbd-client
new file mode 100644
index 0000000..c084f20
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-class-rnbd-client
@@ -0,0 +1,111 @@
+What: /sys/class/rnbd-client
+Date: Feb 2020
+KernelVersion: 5.7
+Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
+Description: Provide information about RNBD-client.
+ All sysfs files that are not read-only provide the usage information on read:
+
+ Example:
+ # cat /sys/class/rnbd-client/ctl/map_device
+
+ > Usage: echo "sessname=<name of the rtrs session> path=<[srcaddr,]dstaddr>
+ > [path=<[srcaddr,]dstaddr>] device_path=<full path on remote side>
+ > [access_mode=<ro|rw|migration>] > map_device
+ >
+ > addr ::= [ ip:<ipv4> | ip:<ipv6> | gid:<gid> ]
+
+What: /sys/class/rnbd-client/ctl/map_device
+Date: Feb 2020
+KernelVersion: 5.7
+Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
+Description: Expected format is the following:
+
+ sessname=<name of the rtrs session>
+ path=<[srcaddr,]dstaddr> [path=<[srcaddr,]dstaddr> ...]
+ device_path=<full path on remote side>
+ [access_mode=<ro|rw|migration>]
+
+ Where:
+
+ sessname: accepts a string not bigger than 256 chars, which identifies
+ a given session on the client and on the server.
+ I.e. "clt_hostname-srv_hostname" could be a natural choice.
+
+ path: describes a connection between the client and the server by
+ specifying destination and, when required, the source address.
+ The addresses are to be provided in the following format:
+
+ ip:<IPv6>
+ ip:<IPv4>
+ gid:<GID>
+
+ for example:
+
+ path=ip:10.0.0.66
+ The single addr is treated as the destination.
+ The connection will be established to this server from any client IP address.
+
+ path=ip:10.0.0.66,ip:10.0.1.66
+ First addr is the source address and the second is the destination.
+
+ If multiple "path=" options are specified multiple connection
+ will be established and data will be sent according to
+ the selected multipath policy (see RTRS mp_policy sysfs entry description).
+
+ device_path: Path to the block device on the server side. Path is specified
+ relative to the directory on server side configured in the
+ 'dev_search_path' module parameter of the rnbd_server.
+ The rnbd_server prepends the <device_path> received from client
+ with <dev_search_path> and tries to open the
+ <dev_search_path>/<device_path> block device. On success,
+ a /dev/rnbd<N> device file, a /sys/block/rnbd_client/rnbd<N>/
+ directory and an entry in /sys/class/rnbd-client/ctl/devices
+ will be created.
+
+ If 'dev_search_path' contains '%SESSNAME%', then each session can
+ have different devices namespace, e.g. server was configured with
+ the following parameter "dev_search_path=/run/rnbd-devs/%SESSNAME%",
+ client has this string "sessname=blya device_path=sda", then server
+ will try to open: /run/rnbd-devs/blya/sda.
+
+ access_mode: the access_mode parameter specifies if the device is to be
+ mapped as "ro" read-only or "rw" read-write. The server allows
+ a device to be exported in rw mode only once. The "migration"
+ access mode has to be specified if a second mapping in read-write
+ mode is desired.
+
+ By default "rw" is used.
+
+ Exit Codes:
+
+ If the device is already mapped it will fail with EEXIST. If the input
+ has an invalid format it will return EINVAL. If the device path cannot
+ be found on the server, it will fail with ENOENT.
+
+ Finding device file after mapping
+ ---------------------------------
+
+ After mapping, the device file can be found by:
+ o The symlink /sys/class/rnbd-client/ctl/devices/<device_id>
+ points to /sys/block/<dev-name>. The last part of the symlink destination
+ is the same as the device name. By extracting the last part of the
+ path the path to the device /dev/<dev-name> can be build.
+
+ o /dev/block/$(cat /sys/class/rnbd-client/ctl/devices/<device_id>/dev)
+
+ How to find the <device_id> of the device is described on the next
+ section.
+
+What: /sys/class/rnbd-client/ctl/devices/
+Date: Feb 2020
+KernelVersion: 5.7
+Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
+Description: For each device mapped on the client a new symbolic link is created as
+ /sys/class/rnbd-client/ctl/devices/<device_id>, which points
+ to the block device created by rnbd (/sys/block/rnbd<N>/).
+ The <device_id> of each device is created as follows:
+
+ - If the 'device_path' provided during mapping contains slashes ("/"),
+ they are replaced by exclamation mark ("!") and used as as the
+ <device_id>. Otherwise, the <device_id> will be the same as the
+ "device_path" provided.
diff --git a/Documentation/ABI/testing/sysfs-class-rnbd-server b/Documentation/ABI/testing/sysfs-class-rnbd-server
new file mode 100644
index 0000000..ba60a90
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-class-rnbd-server
@@ -0,0 +1,50 @@
+What: /sys/class/rnbd-server
+Date: Feb 2020
+KernelVersion: 5.7
+Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
+Description: provide information about RNBD-server.
+
+What: /sys/class/rnbd-server/ctl/
+Date: Feb 2020
+KernelVersion: 5.7
+Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
+Description: When a client maps a device, a directory entry with the name of the
+ block device is created under /sys/class/rnbd-server/ctl/devices/.
+
+What: /sys/class/rnbd-server/ctl/devices/<device_name>/block_dev
+Date: Feb 2020
+KernelVersion: 5.7
+Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
+Description: Is a symlink to the sysfs entry of the exported device.
+
+ Example:
+ block_dev -> ../../../../class/block/ram0
+
+What: /sys/class/rnbd-server/ctl/devices/<device_name>/sessions/
+Date: Feb 2020
+KernelVersion: 5.7
+Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
+Description: For each client a particular device is exported to, following directory will be
+ created:
+
+ /sys/class/rnbd-server/ctl/devices/<device_name>/sessions/<session-name>/
+
+ When the device is unmapped by that client, the directory will be removed.
+
+What: /sys/class/rnbd-server/ctl/devices/<device_name>/sessions/<session-name>/read_only
+Date: Feb 2020
+KernelVersion: 5.7
+Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
+Description: Contains '1' if device is mapped read-only, otherwise '0'.
+
+What: /sys/class/rnbd-server/ctl/devices/<device_name>/sessions/<session-name>/mapping_path
+Date: Feb 2020
+KernelVersion: 5.7
+Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
+Description: Contains the relative device path provided by the user during mapping.
+
+What: /sys/class/rnbd-server/ctl/devices/<device_name>/sessions/<session-name>/access_mode
+Date: Feb 2020
+KernelVersion: 5.7
+Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
+Description: Contains the device access mode: ro, rw or migration.
diff --git a/Documentation/ABI/testing/sysfs-class-rtrs-client b/Documentation/ABI/testing/sysfs-class-rtrs-client
new file mode 100644
index 0000000..e7e718d
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-class-rtrs-client
@@ -0,0 +1,131 @@
+What: /sys/class/rtrs-client
+Date: Feb 2020
+KernelVersion: 5.7
+Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
+Description: When a user of RTRS API creates a new session, a directory entry with
+ the name of that session is created under /sys/class/rtrs-client/<session-name>/
+
+What: /sys/class/rtrs-client/<session-name>/add_path
+Date: Feb 2020
+KernelVersion: 5.7
+Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
+Description: RW, adds a new path (connection) to an existing session. Expected format is the
+ following:
+
+ <[source addr,]destination addr>
+ *addr ::= [ ip:<ipv4|ipv6> | gid:<gid> ]
+
+What: /sys/class/rtrs-client/<session-name>/max_reconnect_attempts
+Date: Feb 2020
+KernelVersion: 5.7
+Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
+Description: Maximum number reconnect attempts the client should make before giving up
+ after connection breaks unexpectedly.
+
+What: /sys/class/rtrs-client/<session-name>/mp_policy
+Date: Feb 2020
+KernelVersion: 5.7
+Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
+Description: Multipath policy specifies which path should be selected on each IO:
+
+ round-robin (0):
+ select path in per CPU round-robin manner.
+
+ min-inflight (1):
+ select path with minimum inflights.
+
+What: /sys/class/rtrs-client/<session-name>/paths/
+Date: Feb 2020
+KernelVersion: 5.7
+Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
+Description: Each path belonging to a given session is listed here by its source and
+ destination address. When a new path is added to a session by writing to
+ the "add_path" entry, a directory <src@dst> is created.
+
+What: /sys/class/rtrs-client/<session-name>/paths/<src@dst>/state
+Date: Feb 2020
+KernelVersion: 5.7
+Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
+Description: RO, Contains "connected" if the session is connected to the peer and fully
+ functional. Otherwise the file contains "disconnected"
+
+What: /sys/class/rtrs-client/<session-name>/paths/<src@dst>/reconnect
+Date: Feb 2020
+KernelVersion: 5.7
+Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
+Description: Write "1" to the file in order to reconnect the path.
+ Operation is blocking and returns 0 if reconnect was successful.
+
+What: /sys/class/rtrs-client/<session-name>/paths/<src@dst>/disconnect
+Date: Feb 2020
+KernelVersion: 5.7
+Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
+Description: Write "1" to the file in order to disconnect the path.
+ Operation blocks until RTRS path is disconnected.
+
+What: /sys/class/rtrs-client/<session-name>/paths/<src@dst>/remove_path
+Date: Feb 2020
+KernelVersion: 5.7
+Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
+Description: Write "1" to the file in order to disconnected and remove the path
+ from the session. Operation blocks until the path is disconnected
+ and removed from the session.
+
+What: /sys/class/rtrs-client/<session-name>/paths/<src@dst>/hca_name
+Date: Feb 2020
+KernelVersion: 5.7
+Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
+Description: RO, Contains the the name of HCA the connection established on.
+
+What: /sys/class/rtrs-client/<session-name>/paths/<src@dst>/hca_port
+Date: Feb 2020
+KernelVersion: 5.7
+Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
+Description: RO, Contains the port number of active port traffic is going through.
+
+What: /sys/class/rtrs-client/<session-name>/paths/<src@dst>/src_addr
+Date: Feb 2020
+KernelVersion: 5.7
+Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
+Description: RO, Contains the source address of the path
+
+What: /sys/class/rtrs-client/<session-name>/paths/<src@dst>/dst_addr
+Date: Feb 2020
+KernelVersion: 5.7
+Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
+Description: RO, Contains the destination address of the path
+
+What: /sys/class/rtrs-client/<session-name>/paths/<src@dst>/stats/reset_all
+Date: Feb 2020
+KernelVersion: 5.7
+Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
+Description: RW, Read will return usage help, write 0 will clear all the statistics.
+
+What: /sys/class/rtrs-client/<session-name>/paths/<src@dst>/stats/cpu_migration
+Date: Feb 2020
+KernelVersion: 5.7
+Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
+Description: RTRS expects that each HCA IRQ is pinned to a separate CPU. If it's
+ not the case, the processing of an I/O response could be processed on a
+ different CPU than where it was originally submitted. This file shows
+ how many interrupts where generated on a non expected CPU.
+ "from:" is the CPU on which the IRQ was expected, but not generated.
+ "to:" is the CPU on which the IRQ was generated, but not expected.
+
+What: /sys/class/rtrs-client/<session-name>/paths/<src@dst>/stats/reconnects
+Date: Feb 2020
+KernelVersion: 5.7
+Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
+Description: Contains 2 unsigned int values, the first one records number of successful
+ reconnects in the path lifetime, the second one records number of failed
+ reconnects in the path lifetime.
+
+What: /sys/class/rtrs-client/<session-name>/paths/<src@dst>/stats/rdma
+Date: Feb 2020
+KernelVersion: 5.7
+Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
+Description: Contains statistics regarding rdma operations and inflight operations.
+ The output consists of 6 values:
+
+ <read-count> <read-total-size> <write-count> <write-total-size> \
+ <inflights> <failovered>
diff --git a/Documentation/ABI/testing/sysfs-class-rtrs-server b/Documentation/ABI/testing/sysfs-class-rtrs-server
new file mode 100644
index 0000000..3b6d5b0
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-class-rtrs-server
@@ -0,0 +1,53 @@
+What: /sys/class/rtrs-server
+Date: Feb 2020
+KernelVersion: 5.7
+Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
+Description: When a user of RTRS API creates a new session on a client side, a
+ directory entry with the name of that session is created in here.
+
+What: /sys/class/rtrs-server/<session-name>/paths/
+Date: Feb 2020
+KernelVersion: 5.7
+Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
+Description: When new path is created by writing to "add_path" entry on client side,
+ a directory entry named as <source address>@<destination address> is created
+ on server.
+
+What: /sys/class/rtrs-server/<session-name>/paths/<src@dst>/disconnect
+Date: Feb 2020
+KernelVersion: 5.7
+Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
+Description: When "1" is written to the file, the RTRS session is being disconnected.
+ Operations is non-blocking and returns control immediately to the caller.
+
+What: /sys/class/rtrs-server/<session-name>/paths/<src@dst>/hca_name
+Date: Feb 2020
+KernelVersion: 5.7
+Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
+Description: RO, Contains the the name of HCA the connection established on.
+
+What: /sys/class/rtrs-server/<session-name>/paths/<src@dst>/hca_port
+Date: Feb 2020
+KernelVersion: 5.7
+Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
+Description: RO, Contains the port number of active port traffic is going through.
+
+What: /sys/class/rtrs-server/<session-name>/paths/<src@dst>/src_addr
+Date: Feb 2020
+KernelVersion: 5.7
+Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
+Description: RO, Contains the source address of the path
+
+What: /sys/class/rtrs-server/<session-name>/paths/<src@dst>/dst_addr
+Date: Feb 2020
+KernelVersion: 5.7
+Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
+Description: RO, Contains the destination address of the path
+
+What: /sys/class/rtrs-server/<session-name>/paths/<src@dst>/stats/rdma
+Date: Feb 2020
+KernelVersion: 5.7
+Contact: Jack Wang <jinpu.wang@cloud.ionos.com> Danil Kipnis <danil.kipnis@cloud.ionos.com>
+Description: Contains statistics regarding rdma operations and inflight operations.
+ The output consists of 5 values:
+ <read-count> <read-total-size> <write-count> <write-total-size> <inflights>
diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
index 2e0e3b4..b555df8 100644
--- a/Documentation/ABI/testing/sysfs-devices-system-cpu
+++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
@@ -106,10 +106,10 @@
See Documentation/admin-guide/cputopology.rst for more information.
-What: /sys/devices/system/cpu/cpuidle/current_driver
- /sys/devices/system/cpu/cpuidle/current_governer_ro
- /sys/devices/system/cpu/cpuidle/available_governors
+What: /sys/devices/system/cpu/cpuidle/available_governors
+ /sys/devices/system/cpu/cpuidle/current_driver
/sys/devices/system/cpu/cpuidle/current_governor
+ /sys/devices/system/cpu/cpuidle/current_governer_ro
Date: September 2007
Contact: Linux kernel mailing list <linux-kernel@vger.kernel.org>
Description: Discover cpuidle policy and mechanism
@@ -119,24 +119,18 @@
consumption during idle.
Idle policy (governor) is differentiated from idle mechanism
- (driver)
-
- current_driver: (RO) displays current idle mechanism
-
- current_governor_ro: (RO) displays current idle policy
-
- With the cpuidle_sysfs_switch boot option enabled (meant for
- developer testing), the following three attributes are visible
- instead:
-
- current_driver: same as described above
+ (driver).
available_governors: (RO) displays a space separated list of
- available governors
+ available governors.
+
+ current_driver: (RO) displays current idle mechanism.
current_governor: (RW) displays current idle policy. Users can
switch the governor at runtime by writing to this file.
+ current_governor_ro: (RO) displays current idle policy.
+
See Documentation/admin-guide/pm/cpuidle.rst and
Documentation/driver-api/pm/cpuidle.rst for more information.
@@ -492,6 +486,7 @@
/sys/devices/system/cpu/vulnerabilities/spec_store_bypass
/sys/devices/system/cpu/vulnerabilities/l1tf
/sys/devices/system/cpu/vulnerabilities/mds
+ /sys/devices/system/cpu/vulnerabilities/srbds
/sys/devices/system/cpu/vulnerabilities/tsx_async_abort
/sys/devices/system/cpu/vulnerabilities/itlb_multihit
Date: January 2018
@@ -580,3 +575,42 @@
If 1, it means the system is using the Protected Execution
Facility in POWER9 and newer processors. i.e., it is a Secure
Virtual Machine.
+
+What: /sys/devices/system/cpu/cpuX/purr
+Date: Apr 2005
+Contact: Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
+Description: PURR ticks for this CPU since the system boot.
+
+ The Processor Utilization Resources Register (PURR) is
+ a 64-bit counter which provides an estimate of the
+ resources used by the CPU thread. The contents of this
+ register increases monotonically. This sysfs interface
+ exposes the number of PURR ticks for cpuX.
+
+What: /sys/devices/system/cpu/cpuX/spurr
+Date: Dec 2006
+Contact: Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
+Description: SPURR ticks for this CPU since the system boot.
+
+ The Scaled Processor Utilization Resources Register
+ (SPURR) is a 64-bit counter that provides a frequency
+ invariant estimate of the resources used by the CPU
+ thread. The contents of this register increases
+ monotonically. This sysfs interface exposes the number
+ of SPURR ticks for cpuX.
+
+What: /sys/devices/system/cpu/cpuX/idle_purr
+Date: Apr 2020
+Contact: Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
+Description: PURR ticks for cpuX when it was idle.
+
+ This sysfs interface exposes the number of PURR ticks
+ for cpuX when it was idle.
+
+What: /sys/devices/system/cpu/cpuX/idle_spurr
+Date: Apr 2020
+Contact: Linux for PowerPC mailing list <linuxppc-dev@ozlabs.org>
+Description: SPURR ticks for cpuX when it was idle.
+
+ This sysfs interface exposes the number of SPURR ticks
+ for cpuX when it was idle.
diff --git a/Documentation/ABI/testing/sysfs-driver-habanalabs b/Documentation/ABI/testing/sysfs-driver-habanalabs
index 782df74..1a14bf9 100644
--- a/Documentation/ABI/testing/sysfs-driver-habanalabs
+++ b/Documentation/ABI/testing/sysfs-driver-habanalabs
@@ -10,6 +10,23 @@
Contact: oded.gabbay@gmail.com
Description: Version of the application running on the device's CPU
+What: /sys/class/habanalabs/hl<n>/clk_max_freq_mhz
+Date: Jun 2019
+KernelVersion: not yet upstreamed
+Contact: oded.gabbay@gmail.com
+Description: Allows the user to set the maximum clock frequency, in MHz.
+ The device clock might be set to lower value than the maximum.
+ The user should read the clk_cur_freq_mhz to see the actual
+ frequency value of the device clock. This property is valid
+ only for the Gaudi ASIC family
+
+What: /sys/class/habanalabs/hl<n>/clk_cur_freq_mhz
+Date: Jun 2019
+KernelVersion: not yet upstreamed
+Contact: oded.gabbay@gmail.com
+Description: Displays the current frequency, in MHz, of the device clock.
+ This property is valid only for the Gaudi ASIC family
+
What: /sys/class/habanalabs/hl<n>/cpld_ver
Date: Jan 2019
KernelVersion: 5.1
diff --git a/Documentation/ABI/testing/sysfs-driver-w1_therm b/Documentation/ABI/testing/sysfs-driver-w1_therm
new file mode 100644
index 0000000..076659d
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-driver-w1_therm
@@ -0,0 +1,116 @@
+What: /sys/bus/w1/devices/.../alarms
+Date: May 2020
+Contact: Akira Shimahara <akira215corp@gmail.com>
+Description:
+ (RW) read or write TH and TL (Temperature High an Low) alarms.
+ Values shall be space separated and in the device range
+ (typical -55 degC to 125 degC), if not values will be trimmed
+ to device min/max capabilities. Values are integer as they are
+ stored in a 8bit register in the device. Lowest value is
+ automatically put to TL. Once set, alarms could be search at
+ master level, refer to Documentation/w1/w1_generic.rst for
+ detailed information
+Users: any user space application which wants to communicate with
+ w1_term device
+
+
+What: /sys/bus/w1/devices/.../eeprom
+Date: May 2020
+Contact: Akira Shimahara <akira215corp@gmail.com>
+Description:
+ (WO) writing that file will either trigger a save of the
+ device data to its embedded EEPROM, either restore data
+ embedded in device EEPROM. Be aware that devices support
+ limited EEPROM writing cycles (typical 50k)
+ * 'save': save device RAM to EEPROM
+ * 'restore': restore EEPROM data in device RAM
+Users: any user space application which wants to communicate with
+ w1_term device
+
+
+What: /sys/bus/w1/devices/.../ext_power
+Date: May 2020
+Contact: Akira Shimahara <akira215corp@gmail.com>
+Description:
+ (RO) return the power status by asking the device
+ * '0': device parasite powered
+ * '1': device externally powered
+ * '-xx': xx is kernel error when reading power status
+Users: any user space application which wants to communicate with
+ w1_term device
+
+
+What: /sys/bus/w1/devices/.../resolution
+Date: May 2020
+Contact: Akira Shimahara <akira215corp@gmail.com>
+Description:
+ (RW) get or set the device resolution (on supported devices,
+ if not, this entry is not present). Note that the resolution
+ will be changed only in device RAM, so it will be cleared when
+ power is lost. Trigger a 'save' to EEPROM command to keep
+ values after power-on. Read or write are :
+ * '9..12': device resolution in bit
+ or resolution to set in bit
+ * '-xx': xx is kernel error when reading the resolution
+ * Anything else: do nothing
+Users: any user space application which wants to communicate with
+ w1_term device
+
+
+What: /sys/bus/w1/devices/.../temperature
+Date: May 2020
+Contact: Akira Shimahara <akira215corp@gmail.com>
+Description:
+ (RO) return the temperature in 1/1000 degC.
+ * If a bulk read has been triggered, it will directly
+ return the temperature computed when the bulk read
+ occurred, if available. If not yet available, nothing
+ is returned (a debug kernel message is sent), you
+ should retry later on.
+ * If no bulk read has been triggered, it will trigger
+ a conversion and send the result. Note that the
+ conversion duration depend on the resolution (if
+ device support this feature). It takes 94ms in 9bits
+ resolution, 750ms for 12bits.
+Users: any user space application which wants to communicate with
+ w1_term device
+
+
+What: /sys/bus/w1/devices/.../w1_slave
+Date: May 2020
+Contact: Akira Shimahara <akira215corp@gmail.com>
+Description:
+ (RW) return the temperature in 1/1000 degC.
+ *read*: return 2 lines with the hexa output data sent on the
+ bus, return the CRC check and temperature in 1/1000 degC
+ *write* :
+ * '0' : save the 2 or 3 bytes to the device EEPROM
+ (i.e. TH, TL and config register)
+ * '9..12' : set the device resolution in RAM
+ (if supported)
+ * Anything else: do nothing
+ refer to Documentation/w1/slaves/w1_therm.rst for detailed
+ information.
+Users: any user space application which wants to communicate with
+ w1_term device
+
+
+What: /sys/bus/w1/devices/w1_bus_masterXX/therm_bulk_read
+Date: May 2020
+Contact: Akira Shimahara <akira215corp@gmail.com>
+Description:
+ (RW) trigger a bulk read conversion. read the status
+ *read*:
+ * '-1': conversion in progress on at least 1 sensor
+ * '1' : conversion complete but at least one sensor
+ value has not been read yet
+ * '0' : no bulk operation. Reading temperature will
+ trigger a conversion on each device
+ *write*: 'trigger': trigger a bulk read on all supporting
+ devices on the bus
+ Note that if a bulk read is sent but one sensor is not read
+ immediately, the next access to temperature on this device
+ will return the temperature measured at the time of issue
+ of the bulk read command (not the current temperature).
+Users: any user space application which wants to communicate with
+ w1_term device
diff --git a/Documentation/ABI/testing/sysfs-fs-f2fs b/Documentation/ABI/testing/sysfs-fs-f2fs
index bd8a0d1..4bb93a0 100644
--- a/Documentation/ABI/testing/sysfs-fs-f2fs
+++ b/Documentation/ABI/testing/sysfs-fs-f2fs
@@ -323,3 +323,27 @@
Date: February 2020
Contact: "Jaegeuk Kim" <jaegeuk@kernel.org>
Description: Show the mounted time in secs of this partition.
+
+What: /sys/fs/f2fs/<disk>/data_io_flag
+Date: April 2020
+Contact: "Jaegeuk Kim" <jaegeuk@kernel.org>
+Description: Give a way to attach REQ_META|FUA to data writes
+ given temperature-based bits. Now the bits indicate:
+ * REQ_META | REQ_FUA |
+ * 5 | 4 | 3 | 2 | 1 | 0 |
+ * Cold | Warm | Hot | Cold | Warm | Hot |
+
+What: /sys/fs/f2fs/<disk>/node_io_flag
+Date: June 2020
+Contact: "Jaegeuk Kim" <jaegeuk@kernel.org>
+Description: Give a way to attach REQ_META|FUA to node writes
+ given temperature-based bits. Now the bits indicate:
+ * REQ_META | REQ_FUA |
+ * 5 | 4 | 3 | 2 | 1 | 0 |
+ * Cold | Warm | Hot | Cold | Warm | Hot |
+
+What: /sys/fs/f2fs/<disk>/iostat_period_ms
+Date: April 2020
+Contact: "Daeho Jeong" <daehojeong@google.com>
+Description: Give a way to change iostat_period time. 3secs by default.
+ The new iostat trace gives stats gap given the period.
diff --git a/Documentation/ABI/testing/sysfs-platform-dptf b/Documentation/ABI/testing/sysfs-platform-dptf
index 325dc06..eeed81c 100644
--- a/Documentation/ABI/testing/sysfs-platform-dptf
+++ b/Documentation/ABI/testing/sysfs-platform-dptf
@@ -27,10 +27,12 @@
Contact: linux-acpi@vger.kernel.org
Description:
(RO) Display the platform power source
- 0x00 = DC
- 0x01 = AC
- 0x02 = USB
- 0x03 = Wireless Charger
+ bits[3:0] Current power source
+ 0x00 = DC
+ 0x01 = AC
+ 0x02 = USB
+ 0x03 = Wireless Charger
+ bits[7:4] Power source sequence number
What: /sys/bus/platform/devices/INT3407:00/dptf_power/battery_steady_power
Date: Jul, 2016
@@ -38,3 +40,55 @@
Contact: linux-acpi@vger.kernel.org
Description:
(RO) The maximum sustained power for battery in milliwatts.
+
+What: /sys/bus/platform/devices/INT3407:00/dptf_power/rest_of_platform_power_mw
+Date: June, 2020
+KernelVersion: v5.8
+Contact: linux-acpi@vger.kernel.org
+Description:
+ (RO) Shows the rest (outside of SoC) of worst-case platform power.
+
+What: /sys/bus/platform/devices/INT3407:00/dptf_power/prochot_confirm
+Date: June, 2020
+KernelVersion: v5.8
+Contact: linux-acpi@vger.kernel.org
+Description:
+ (WO) Confirm embedded controller about a prochot notification.
+
+What: /sys/bus/platform/devices/INT3532:00/dptf_battery/max_platform_power_mw
+Date: June, 2020
+KernelVersion: v5.8
+Contact: linux-acpi@vger.kernel.org
+Description:
+ (RO) The maximum platform power that can be supported by the battery in milli watts.
+
+What: /sys/bus/platform/devices/INT3532:00/dptf_battery/max_steady_state_power_mw
+Date: June, 2020
+KernelVersion: v5.8
+Contact: linux-acpi@vger.kernel.org
+Description:
+ (RO) The maximum sustained power for battery in milli watts.
+
+What: /sys/bus/platform/devices/INT3532:00/dptf_battery/high_freq_impedance_mohm
+Date: June, 2020
+KernelVersion: v5.8
+Contact: linux-acpi@vger.kernel.org
+Description:
+ (RO) The high frequency impedance value that can be obtained from battery
+ fuel gauge in milli Ohms.
+
+What: /sys/bus/platform/devices/INT3532:00/dptf_battery/no_load_voltage_mv
+Date: June, 2020
+KernelVersion: v5.8
+Contact: linux-acpi@vger.kernel.org
+Description:
+ (RO) The no-load voltage that can be obtained from battery fuel gauge in
+ milli volts.
+
+What: /sys/bus/platform/devices/INT3532:00/dptf_battery/current_discharge_capbility_ma
+Date: June, 2020
+KernelVersion: v5.8
+Contact: linux-acpi@vger.kernel.org
+Description:
+ (RO) The battery discharge current capability obtained from battery fuel gauge in
+ milli Amps.
diff --git a/Documentation/ABI/testing/sysfs-platform-intel-wmi-sbl-fw-update b/Documentation/ABI/testing/sysfs-platform-intel-wmi-sbl-fw-update
new file mode 100644
index 0000000..5aa6189
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-platform-intel-wmi-sbl-fw-update
@@ -0,0 +1,12 @@
+What: /sys/bus/wmi/devices/44FADEB1-B204-40F2-8581-394BBDC1B651/firmware_update_request
+Date: April 2020
+KernelVersion: 5.7
+Contact: "Jithu Joseph" <jithu.joseph@intel.com>
+Description:
+ Allow user space entities to trigger update of Slim
+ Bootloader (SBL). This attribute normally has a value
+ of 0 and userspace can signal SBL to update firmware,
+ on next reboot, by writing a value of 1.
+ There are two available states:
+ * 0 -> Skip firmware update while rebooting
+ * 1 -> Attempt firmware update on next reboot
diff --git a/Documentation/COPYING-logo b/Documentation/COPYING-logo
index 296f0f7..b21c7cf 100644
--- a/Documentation/COPYING-logo
+++ b/Documentation/COPYING-logo
@@ -9,5 +9,5 @@
you want to use it for: for the full range of logos take a look at
Larry's web-page:
- http://www.isc.tamu.edu/~lewing/linux/
+ https://www.isc.tamu.edu/~lewing/linux/
diff --git a/Documentation/IRQ-domain.txt b/Documentation/IRQ-domain.txt
deleted file mode 100644
index 507775c..0000000
--- a/Documentation/IRQ-domain.txt
+++ /dev/null
@@ -1,269 +0,0 @@
-===============================================
-The irq_domain interrupt number mapping library
-===============================================
-
-The current design of the Linux kernel uses a single large number
-space where each separate IRQ source is assigned a different number.
-This is simple when there is only one interrupt controller, but in
-systems with multiple interrupt controllers the kernel must ensure
-that each one gets assigned non-overlapping allocations of Linux
-IRQ numbers.
-
-The number of interrupt controllers registered as unique irqchips
-show a rising tendency: for example subdrivers of different kinds
-such as GPIO controllers avoid reimplementing identical callback
-mechanisms as the IRQ core system by modelling their interrupt
-handlers as irqchips, i.e. in effect cascading interrupt controllers.
-
-Here the interrupt number loose all kind of correspondence to
-hardware interrupt numbers: whereas in the past, IRQ numbers could
-be chosen so they matched the hardware IRQ line into the root
-interrupt controller (i.e. the component actually fireing the
-interrupt line to the CPU) nowadays this number is just a number.
-
-For this reason we need a mechanism to separate controller-local
-interrupt numbers, called hardware irq's, from Linux IRQ numbers.
-
-The irq_alloc_desc*() and irq_free_desc*() APIs provide allocation of
-irq numbers, but they don't provide any support for reverse mapping of
-the controller-local IRQ (hwirq) number into the Linux IRQ number
-space.
-
-The irq_domain library adds mapping between hwirq and IRQ numbers on
-top of the irq_alloc_desc*() API. An irq_domain to manage mapping is
-preferred over interrupt controller drivers open coding their own
-reverse mapping scheme.
-
-irq_domain also implements translation from an abstract irq_fwspec
-structure to hwirq numbers (Device Tree and ACPI GSI so far), and can
-be easily extended to support other IRQ topology data sources.
-
-irq_domain usage
-================
-
-An interrupt controller driver creates and registers an irq_domain by
-calling one of the irq_domain_add_*() functions (each mapping method
-has a different allocator function, more on that later). The function
-will return a pointer to the irq_domain on success. The caller must
-provide the allocator function with an irq_domain_ops structure.
-
-In most cases, the irq_domain will begin empty without any mappings
-between hwirq and IRQ numbers. Mappings are added to the irq_domain
-by calling irq_create_mapping() which accepts the irq_domain and a
-hwirq number as arguments. If a mapping for the hwirq doesn't already
-exist then it will allocate a new Linux irq_desc, associate it with
-the hwirq, and call the .map() callback so the driver can perform any
-required hardware setup.
-
-When an interrupt is received, irq_find_mapping() function should
-be used to find the Linux IRQ number from the hwirq number.
-
-The irq_create_mapping() function must be called *atleast once*
-before any call to irq_find_mapping(), lest the descriptor will not
-be allocated.
-
-If the driver has the Linux IRQ number or the irq_data pointer, and
-needs to know the associated hwirq number (such as in the irq_chip
-callbacks) then it can be directly obtained from irq_data->hwirq.
-
-Types of irq_domain mappings
-============================
-
-There are several mechanisms available for reverse mapping from hwirq
-to Linux irq, and each mechanism uses a different allocation function.
-Which reverse map type should be used depends on the use case. Each
-of the reverse map types are described below:
-
-Linear
-------
-
-::
-
- irq_domain_add_linear()
- irq_domain_create_linear()
-
-The linear reverse map maintains a fixed size table indexed by the
-hwirq number. When a hwirq is mapped, an irq_desc is allocated for
-the hwirq, and the IRQ number is stored in the table.
-
-The Linear map is a good choice when the maximum number of hwirqs is
-fixed and a relatively small number (~ < 256). The advantages of this
-map are fixed time lookup for IRQ numbers, and irq_descs are only
-allocated for in-use IRQs. The disadvantage is that the table must be
-as large as the largest possible hwirq number.
-
-irq_domain_add_linear() and irq_domain_create_linear() are functionally
-equivalent, except for the first argument is different - the former
-accepts an Open Firmware specific 'struct device_node', while the latter
-accepts a more general abstraction 'struct fwnode_handle'.
-
-The majority of drivers should use the linear map.
-
-Tree
-----
-
-::
-
- irq_domain_add_tree()
- irq_domain_create_tree()
-
-The irq_domain maintains a radix tree map from hwirq numbers to Linux
-IRQs. When an hwirq is mapped, an irq_desc is allocated and the
-hwirq is used as the lookup key for the radix tree.
-
-The tree map is a good choice if the hwirq number can be very large
-since it doesn't need to allocate a table as large as the largest
-hwirq number. The disadvantage is that hwirq to IRQ number lookup is
-dependent on how many entries are in the table.
-
-irq_domain_add_tree() and irq_domain_create_tree() are functionally
-equivalent, except for the first argument is different - the former
-accepts an Open Firmware specific 'struct device_node', while the latter
-accepts a more general abstraction 'struct fwnode_handle'.
-
-Very few drivers should need this mapping.
-
-No Map
-------
-
-::
-
- irq_domain_add_nomap()
-
-The No Map mapping is to be used when the hwirq number is
-programmable in the hardware. In this case it is best to program the
-Linux IRQ number into the hardware itself so that no mapping is
-required. Calling irq_create_direct_mapping() will allocate a Linux
-IRQ number and call the .map() callback so that driver can program the
-Linux IRQ number into the hardware.
-
-Most drivers cannot use this mapping.
-
-Legacy
-------
-
-::
-
- irq_domain_add_simple()
- irq_domain_add_legacy()
- irq_domain_add_legacy_isa()
-
-The Legacy mapping is a special case for drivers that already have a
-range of irq_descs allocated for the hwirqs. It is used when the
-driver cannot be immediately converted to use the linear mapping. For
-example, many embedded system board support files use a set of #defines
-for IRQ numbers that are passed to struct device registrations. In that
-case the Linux IRQ numbers cannot be dynamically assigned and the legacy
-mapping should be used.
-
-The legacy map assumes a contiguous range of IRQ numbers has already
-been allocated for the controller and that the IRQ number can be
-calculated by adding a fixed offset to the hwirq number, and
-visa-versa. The disadvantage is that it requires the interrupt
-controller to manage IRQ allocations and it requires an irq_desc to be
-allocated for every hwirq, even if it is unused.
-
-The legacy map should only be used if fixed IRQ mappings must be
-supported. For example, ISA controllers would use the legacy map for
-mapping Linux IRQs 0-15 so that existing ISA drivers get the correct IRQ
-numbers.
-
-Most users of legacy mappings should use irq_domain_add_simple() which
-will use a legacy domain only if an IRQ range is supplied by the
-system and will otherwise use a linear domain mapping. The semantics
-of this call are such that if an IRQ range is specified then
-descriptors will be allocated on-the-fly for it, and if no range is
-specified it will fall through to irq_domain_add_linear() which means
-*no* irq descriptors will be allocated.
-
-A typical use case for simple domains is where an irqchip provider
-is supporting both dynamic and static IRQ assignments.
-
-In order to avoid ending up in a situation where a linear domain is
-used and no descriptor gets allocated it is very important to make sure
-that the driver using the simple domain call irq_create_mapping()
-before any irq_find_mapping() since the latter will actually work
-for the static IRQ assignment case.
-
-Hierarchy IRQ domain
---------------------
-
-On some architectures, there may be multiple interrupt controllers
-involved in delivering an interrupt from the device to the target CPU.
-Let's look at a typical interrupt delivering path on x86 platforms::
-
- Device --> IOAPIC -> Interrupt remapping Controller -> Local APIC -> CPU
-
-There are three interrupt controllers involved:
-
-1) IOAPIC controller
-2) Interrupt remapping controller
-3) Local APIC controller
-
-To support such a hardware topology and make software architecture match
-hardware architecture, an irq_domain data structure is built for each
-interrupt controller and those irq_domains are organized into hierarchy.
-When building irq_domain hierarchy, the irq_domain near to the device is
-child and the irq_domain near to CPU is parent. So a hierarchy structure
-as below will be built for the example above::
-
- CPU Vector irq_domain (root irq_domain to manage CPU vectors)
- ^
- |
- Interrupt Remapping irq_domain (manage irq_remapping entries)
- ^
- |
- IOAPIC irq_domain (manage IOAPIC delivery entries/pins)
-
-There are four major interfaces to use hierarchy irq_domain:
-
-1) irq_domain_alloc_irqs(): allocate IRQ descriptors and interrupt
- controller related resources to deliver these interrupts.
-2) irq_domain_free_irqs(): free IRQ descriptors and interrupt controller
- related resources associated with these interrupts.
-3) irq_domain_activate_irq(): activate interrupt controller hardware to
- deliver the interrupt.
-4) irq_domain_deactivate_irq(): deactivate interrupt controller hardware
- to stop delivering the interrupt.
-
-Following changes are needed to support hierarchy irq_domain:
-
-1) a new field 'parent' is added to struct irq_domain; it's used to
- maintain irq_domain hierarchy information.
-2) a new field 'parent_data' is added to struct irq_data; it's used to
- build hierarchy irq_data to match hierarchy irq_domains. The irq_data
- is used to store irq_domain pointer and hardware irq number.
-3) new callbacks are added to struct irq_domain_ops to support hierarchy
- irq_domain operations.
-
-With support of hierarchy irq_domain and hierarchy irq_data ready, an
-irq_domain structure is built for each interrupt controller, and an
-irq_data structure is allocated for each irq_domain associated with an
-IRQ. Now we could go one step further to support stacked(hierarchy)
-irq_chip. That is, an irq_chip is associated with each irq_data along
-the hierarchy. A child irq_chip may implement a required action by
-itself or by cooperating with its parent irq_chip.
-
-With stacked irq_chip, interrupt controller driver only needs to deal
-with the hardware managed by itself and may ask for services from its
-parent irq_chip when needed. So we could achieve a much cleaner
-software architecture.
-
-For an interrupt controller driver to support hierarchy irq_domain, it
-needs to:
-
-1) Implement irq_domain_ops.alloc and irq_domain_ops.free
-2) Optionally implement irq_domain_ops.activate and
- irq_domain_ops.deactivate.
-3) Optionally implement an irq_chip to manage the interrupt controller
- hardware.
-4) No need to implement irq_domain_ops.map and irq_domain_ops.unmap,
- they are unused with hierarchy irq_domain.
-
-Hierarchy irq_domain is in no way x86 specific, and is heavily used to
-support other architectures, such as ARM, ARM64 etc.
-
-=== Debugging ===
-
-Most of the internals of the IRQ subsystem are exposed in debugfs by
-turning CONFIG_GENERIC_IRQ_DEBUGFS on.
diff --git a/Documentation/Makefile b/Documentation/Makefile
index 7e12d10..6b12dd8 100644
--- a/Documentation/Makefile
+++ b/Documentation/Makefile
@@ -98,7 +98,11 @@
pdfdocs: latexdocs
@$(srctree)/scripts/sphinx-pre-install --version-check
- $(foreach var,$(SPHINXDIRS), $(MAKE) PDFLATEX="$(PDFLATEX)" LATEXOPTS="$(LATEXOPTS)" -C $(BUILDDIR)/$(var)/latex || exit;)
+ $(foreach var,$(SPHINXDIRS), \
+ $(MAKE) PDFLATEX="$(PDFLATEX)" LATEXOPTS="$(LATEXOPTS)" -C $(BUILDDIR)/$(var)/latex || exit; \
+ mkdir -p $(BUILDDIR)/$(var)/pdf; \
+ mv $(subst .tex,.pdf,$(wildcard $(BUILDDIR)/$(var)/latex/*.tex)) $(BUILDDIR)/$(var)/pdf/; \
+ )
endif # HAVE_PDFLATEX
diff --git a/Documentation/PCI/boot-interrupts.rst b/Documentation/PCI/boot-interrupts.rst
index d078ef3..2ec7012 100644
--- a/Documentation/PCI/boot-interrupts.rst
+++ b/Documentation/PCI/boot-interrupts.rst
@@ -32,12 +32,13 @@
Spurious Interrupts. The IRQ will be disabled by the Linux kernel after it
reaches a specific count with the error "nobody cared". This disabled IRQ
now prevents valid usage by an existing interrupt which may happen to share
-the IRQ line.
+the IRQ line::
irq 19: nobody cared (try booting with the "irqpoll" option)
CPU: 0 PID: 2988 Comm: irq/34-nipalk Tainted: 4.14.87-rt49-02410-g4a640ec-dirty #1
Hardware name: National Instruments NI PXIe-8880/NI PXIe-8880, BIOS 2.1.5f1 01/09/2020
Call Trace:
+
<IRQ>
? dump_stack+0x46/0x5e
? __report_bad_irq+0x2e/0xb0
@@ -85,15 +86,18 @@
The mitigations take the form of PCI quirks. The preference has been to
first identify and make use of a means to disable the routing to the PCH.
In such a case a quirk to disable boot interrupt generation can be
-added.[1]
+added. [1]_
- Intel® 6300ESB I/O Controller Hub
+Intel® 6300ESB I/O Controller Hub
Alternate Base Address Register:
BIE: Boot Interrupt Enable
- 0 = Boot interrupt is enabled.
- 1 = Boot interrupt is disabled.
- Intel® Sandy Bridge through Sky Lake based Xeon servers:
+ == ===========================
+ 0 Boot interrupt is enabled.
+ 1 Boot interrupt is disabled.
+ == ===========================
+
+Intel® Sandy Bridge through Sky Lake based Xeon servers:
Coherent Interface Protocol Interrupt Control
dis_intx_route2pch/dis_intx_route2ich/dis_intx_route2dmi2:
When this bit is set. Local INTx messages received from the
@@ -109,12 +113,12 @@
disabled, the Linux kernel will reroute the valid interrupt to its legacy
interrupt. This redirection of the handler will prevent the occurrence of
the spurious interrupt detection which would ordinarily disable the IRQ
-line due to excessive unhandled counts.[2]
+line due to excessive unhandled counts. [2]_
The config option X86_REROUTE_FOR_BROKEN_BOOT_IRQS exists to enable (or
disable) the redirection of the interrupt handler to the PCH interrupt
line. The option can be overridden by either pci=ioapicreroute or
-pci=noioapicreroute.[3]
+pci=noioapicreroute. [3]_
More Documentation
@@ -127,19 +131,19 @@
Example of disabling of the boot interrupt
------------------------------------------
-Intel® 6300ESB I/O Controller Hub (Document # 300641-004US)
+ - Intel® 6300ESB I/O Controller Hub (Document # 300641-004US)
5.7.3 Boot Interrupt
https://www.intel.com/content/dam/doc/datasheet/6300esb-io-controller-hub-datasheet.pdf
-Intel® Xeon® Processor E5-1600/2400/2600/4600 v3 Product Families
-Datasheet - Volume 2: Registers (Document # 330784-003)
+ - Intel® Xeon® Processor E5-1600/2400/2600/4600 v3 Product Families
+ Datasheet - Volume 2: Registers (Document # 330784-003)
6.6.41 cipintrc Coherent Interface Protocol Interrupt Control
https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/xeon-e5-v3-datasheet-vol-2.pdf
Example of handler rerouting
----------------------------
-Intel® 6700PXH 64-bit PCI Hub (Document # 302628)
+ - Intel® 6700PXH 64-bit PCI Hub (Document # 302628)
2.15.2 PCI Express Legacy INTx Support and Boot Interrupt
https://www.intel.com/content/dam/doc/datasheet/6700pxh-64-bit-pci-hub-datasheet.pdf
@@ -150,6 +154,6 @@
Sean V Kelley
sean.v.kelley@linux.intel.com
-[1] https://lore.kernel.org/r/12131949181903-git-send-email-sassmann@suse.de/
-[2] https://lore.kernel.org/r/12131949182094-git-send-email-sassmann@suse.de/
-[3] https://lore.kernel.org/r/487C8EA7.6020205@suse.de/
+.. [1] https://lore.kernel.org/r/12131949181903-git-send-email-sassmann@suse.de/
+.. [2] https://lore.kernel.org/r/12131949182094-git-send-email-sassmann@suse.de/
+.. [3] https://lore.kernel.org/r/487C8EA7.6020205@suse.de/
diff --git a/Documentation/PCI/endpoint/pci-endpoint.rst b/Documentation/PCI/endpoint/pci-endpoint.rst
index 0e2311b..7536be44 100644
--- a/Documentation/PCI/endpoint/pci-endpoint.rst
+++ b/Documentation/PCI/endpoint/pci-endpoint.rst
@@ -78,8 +78,8 @@
Cleanup the pci_epc_mem structure allocated during pci_epc_mem_init().
-APIs for the PCI Endpoint Function Driver
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+EPC APIs for the PCI Endpoint Function Driver
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This section lists the APIs that the PCI Endpoint core provides to be used
by the PCI endpoint function driver.
@@ -117,8 +117,8 @@
The PCI endpoint function driver should use pci_epc_mem_free_addr() to
free the memory space allocated using pci_epc_mem_alloc_addr().
-Other APIs
-~~~~~~~~~~
+Other EPC APIs
+~~~~~~~~~~~~~~
There are other APIs provided by the EPC library. These are used for binding
the EPF device with EPC device. pci-ep-cfs.c can be used as reference for
@@ -160,8 +160,8 @@
The EPF library provides APIs to be used by the function driver and the EPC
library to provide endpoint mode functionality.
-APIs for the PCI Endpoint Function Driver
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+EPF APIs for the PCI Endpoint Function Driver
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This section lists the APIs that the PCI Endpoint core provides to be used
by the PCI endpoint function driver.
@@ -204,8 +204,8 @@
The PCI endpoint controller library invokes pci_epf_linkup() when the
EPC device has established the connection to the host.
-Other APIs
-~~~~~~~~~~
+Other EPF APIs
+~~~~~~~~~~~~~~
There are other APIs provided by the EPF library. These are used to notify
the function driver when the EPF device is bound to the EPC device.
diff --git a/Documentation/RCU/Design/Requirements/Requirements.rst b/Documentation/RCU/Design/Requirements/Requirements.rst
index fd5e2cb..75b8ca0 100644
--- a/Documentation/RCU/Design/Requirements/Requirements.rst
+++ b/Documentation/RCU/Design/Requirements/Requirements.rst
@@ -1943,56 +1943,27 @@
Scheduler and RCU
~~~~~~~~~~~~~~~~~
-RCU depends on the scheduler, and the scheduler uses RCU to protect some
-of its data structures. The preemptible-RCU ``rcu_read_unlock()``
-implementation must therefore be written carefully to avoid deadlocks
-involving the scheduler's runqueue and priority-inheritance locks. In
-particular, ``rcu_read_unlock()`` must tolerate an interrupt where the
-interrupt handler invokes both ``rcu_read_lock()`` and
-``rcu_read_unlock()``. This possibility requires ``rcu_read_unlock()``
-to use negative nesting levels to avoid destructive recursion via
-interrupt handler's use of RCU.
-
-This scheduler-RCU requirement came as a `complete
-surprise <https://lwn.net/Articles/453002/>`__.
-
-As noted above, RCU makes use of kthreads, and it is necessary to avoid
-excessive CPU-time accumulation by these kthreads. This requirement was
-no surprise, but RCU's violation of it when running context-switch-heavy
-workloads when built with ``CONFIG_NO_HZ_FULL=y`` `did come as a
-surprise
+RCU makes use of kthreads, and it is necessary to avoid excessive CPU-time
+accumulation by these kthreads. This requirement was no surprise, but
+RCU's violation of it when running context-switch-heavy workloads when
+built with ``CONFIG_NO_HZ_FULL=y`` `did come as a surprise
[PDF] <http://www.rdrop.com/users/paulmck/scalability/paper/BareMetal.2015.01.15b.pdf>`__.
RCU has made good progress towards meeting this requirement, even for
context-switch-heavy ``CONFIG_NO_HZ_FULL=y`` workloads, but there is
room for further improvement.
-It is forbidden to hold any of scheduler's runqueue or
-priority-inheritance spinlocks across an ``rcu_read_unlock()`` unless
-interrupts have been disabled across the entire RCU read-side critical
-section, that is, up to and including the matching ``rcu_read_lock()``.
-Violating this restriction can result in deadlocks involving these
-scheduler spinlocks. There was hope that this restriction might be
-lifted when interrupt-disabled calls to ``rcu_read_unlock()`` started
-deferring the reporting of the resulting RCU-preempt quiescent state
-until the end of the corresponding interrupts-disabled region.
-Unfortunately, timely reporting of the corresponding quiescent state to
-expedited grace periods requires a call to ``raise_softirq()``, which
-can acquire these scheduler spinlocks. In addition, real-time systems
-using RCU priority boosting need this restriction to remain in effect
-because deferred quiescent-state reporting would also defer deboosting,
-which in turn would degrade real-time latencies.
+There is no longer any prohibition against holding any of
+scheduler's runqueue or priority-inheritance spinlocks across an
+``rcu_read_unlock()``, even if interrupts and preemption were enabled
+somewhere within the corresponding RCU read-side critical section.
+Therefore, it is now perfectly legal to execute ``rcu_read_lock()``
+with preemption enabled, acquire one of the scheduler locks, and hold
+that lock across the matching ``rcu_read_unlock()``.
-In theory, if a given RCU read-side critical section could be guaranteed
-to be less than one second in duration, holding a scheduler spinlock
-across that critical section's ``rcu_read_unlock()`` would require only
-that preemption be disabled across the entire RCU read-side critical
-section, not interrupts. Unfortunately, given the possibility of vCPU
-preemption, long-running interrupts, and so on, it is not possible in
-practice to guarantee that a given RCU read-side critical section will
-complete in less than one second. Therefore, as noted above, if
-scheduler spinlocks are held across a given call to
-``rcu_read_unlock()``, interrupts must be disabled across the entire RCU
-read-side critical section.
+Similarly, the RCU flavor consolidation has removed the need for negative
+nesting. The fact that interrupt-disabled regions of code act as RCU
+read-side critical sections implicitly avoids earlier issues that used
+to result in destructive recursion via interrupt handler's use of RCU.
Tracing and RCU
~~~~~~~~~~~~~~~
diff --git a/Documentation/admin-guide/LSM/tomoyo.rst b/Documentation/admin-guide/LSM/tomoyo.rst
index e2d6b6e..4bc9c2b 100644
--- a/Documentation/admin-guide/LSM/tomoyo.rst
+++ b/Documentation/admin-guide/LSM/tomoyo.rst
@@ -27,29 +27,29 @@
=======================
User <-> Kernel interface documentation is available at
-http://tomoyo.osdn.jp/2.5/policy-specification/index.html .
+https://tomoyo.osdn.jp/2.5/policy-specification/index.html .
Materials we prepared for seminars and symposiums are available at
-http://osdn.jp/projects/tomoyo/docs/?category_id=532&language_id=1 .
+https://osdn.jp/projects/tomoyo/docs/?category_id=532&language_id=1 .
Below lists are chosen from three aspects.
What is TOMOYO?
TOMOYO Linux Overview
- http://osdn.jp/projects/tomoyo/docs/lca2009-takeda.pdf
+ https://osdn.jp/projects/tomoyo/docs/lca2009-takeda.pdf
TOMOYO Linux: pragmatic and manageable security for Linux
- http://osdn.jp/projects/tomoyo/docs/freedomhectaipei-tomoyo.pdf
+ https://osdn.jp/projects/tomoyo/docs/freedomhectaipei-tomoyo.pdf
TOMOYO Linux: A Practical Method to Understand and Protect Your Own Linux Box
- http://osdn.jp/projects/tomoyo/docs/PacSec2007-en-no-demo.pdf
+ https://osdn.jp/projects/tomoyo/docs/PacSec2007-en-no-demo.pdf
What can TOMOYO do?
Deep inside TOMOYO Linux
- http://osdn.jp/projects/tomoyo/docs/lca2009-kumaneko.pdf
+ https://osdn.jp/projects/tomoyo/docs/lca2009-kumaneko.pdf
The role of "pathname based access control" in security.
- http://osdn.jp/projects/tomoyo/docs/lfj2008-bof.pdf
+ https://osdn.jp/projects/tomoyo/docs/lfj2008-bof.pdf
History of TOMOYO?
Realities of Mainlining
- http://osdn.jp/projects/tomoyo/docs/lfj2008.pdf
+ https://osdn.jp/projects/tomoyo/docs/lfj2008.pdf
What is future plan?
====================
diff --git a/Documentation/admin-guide/README.rst b/Documentation/admin-guide/README.rst
index cc6151f..5fb5269 100644
--- a/Documentation/admin-guide/README.rst
+++ b/Documentation/admin-guide/README.rst
@@ -209,15 +209,22 @@
store the lsmod of that machine into a file
and pass it in as a LSMOD parameter.
+ Also, you can preserve modules in certain folders
+ or kconfig files by specifying their paths in
+ parameter LMC_KEEP.
+
target$ lsmod > /tmp/mylsmod
target$ scp /tmp/mylsmod host:/tmp
- host$ make LSMOD=/tmp/mylsmod localmodconfig
+ host$ make LSMOD=/tmp/mylsmod \
+ LMC_KEEP="drivers/usb:drivers/gpu:fs" \
+ localmodconfig
The above also works when cross compiling.
"make localyesconfig" Similar to localmodconfig, except it will convert
- all module options to built in (=y) options.
+ all module options to built in (=y) options. You can
+ also preserve modules by LMC_KEEP.
"make kvmconfig" Enable additional options for kvm guest kernel support.
diff --git a/Documentation/admin-guide/acpi/initrd_table_override.rst b/Documentation/admin-guide/acpi/initrd_table_override.rst
index cbd7682..bb24fa6 100644
--- a/Documentation/admin-guide/acpi/initrd_table_override.rst
+++ b/Documentation/admin-guide/acpi/initrd_table_override.rst
@@ -102,7 +102,7 @@
=================================
iasl and acpixtract are part of Intel's ACPICA project:
-http://acpica.org/
+https://acpica.org/
and should be packaged by distributions (for example in the acpica package
on SUSE).
diff --git a/Documentation/admin-guide/acpi/ssdt-overlays.rst b/Documentation/admin-guide/acpi/ssdt-overlays.rst
index da37455..5d7e259 100644
--- a/Documentation/admin-guide/acpi/ssdt-overlays.rst
+++ b/Documentation/admin-guide/acpi/ssdt-overlays.rst
@@ -63,7 +63,7 @@
ASL Input: minnomax.asl - 30 lines, 614 bytes, 7 keywords
AML Output: minnowmax.aml - 165 bytes, 6 named objects, 1 executable opcodes
-[1] http://wiki.minnowboard.org/MinnowBoard_MAX#Low_Speed_Expansion_Connector_.28Top.29
+[1] https://www.elinux.org/Minnowboard:MinnowMax#Low_Speed_Expansion_.28Top.29
The resulting AML code can then be loaded by the kernel using one of the methods
below.
diff --git a/Documentation/admin-guide/bcache.rst b/Documentation/admin-guide/bcache.rst
index c0ce64d..1eccf95 100644
--- a/Documentation/admin-guide/bcache.rst
+++ b/Documentation/admin-guide/bcache.rst
@@ -7,9 +7,9 @@
Wiki and git repositories are at:
- - http://bcache.evilpiepirate.org
+ - https://bcache.evilpiepirate.org
- http://evilpiepirate.org/git/linux-bcache.git
- - http://evilpiepirate.org/git/bcache-tools.git
+ - https://evilpiepirate.org/git/bcache-tools.git
It's designed around the performance characteristics of SSDs - it only allocates
in erase block sized buckets, and it uses a hybrid btree/log to track cached
diff --git a/Documentation/admin-guide/bug-hunting.rst b/Documentation/admin-guide/bug-hunting.rst
index 44b8a4e..f7c80f4 100644
--- a/Documentation/admin-guide/bug-hunting.rst
+++ b/Documentation/admin-guide/bug-hunting.rst
@@ -49,15 +49,19 @@
Despite being an **Oops** or some other sort of stack trace, the offended
line is usually required to identify and handle the bug. Along this chapter,
-we'll refer to "Oops" for all kinds of stack traces that need to be analized.
+we'll refer to "Oops" for all kinds of stack traces that need to be analyzed.
-.. note::
+If the kernel is compiled with ``CONFIG_DEBUG_INFO``, you can enhance the
+quality of the stack trace by using file:`scripts/decode_stacktrace.sh`.
- ``ksymoops`` is useless on 2.6 or upper. Please use the Oops in its original
- format (from ``dmesg``, etc). Ignore any references in this or other docs to
- "decoding the Oops" or "running it through ksymoops".
- If you post an Oops from 2.6+ that has been run through ``ksymoops``,
- people will just tell you to repost it.
+Modules linked in
+-----------------
+
+Modules that are tainted or are being loaded or unloaded are marked with
+"(...)", where the taint flags are described in
+file:`Documentation/admin-guide/tainted-kernels.rst`, "being loaded" is
+annotated with "+", and "being unloaded" is annotated with "-".
+
Where is the Oops message is located?
-------------------------------------
@@ -71,7 +75,7 @@
Sometimes ``klogd`` dies, in which case you can run ``dmesg > file`` to
read the data from the kernel buffers and save it. Or you can
``cat /proc/kmsg > file``, however you have to break in to stop the transfer,
-``kmsg`` is a "never ending file".
+since ``kmsg`` is a "never ending file".
If the machine has crashed so badly that you cannot enter commands or
the disk is not available then you have three options:
@@ -81,9 +85,9 @@
planned for a crash. Alternatively, you can take a picture of
the screen with a digital camera - not nice, but better than
nothing. If the messages scroll off the top of the console, you
- may find that booting with a higher resolution (eg, ``vga=791``)
+ may find that booting with a higher resolution (e.g., ``vga=791``)
will allow you to read more of the text. (Caveat: This needs ``vesafb``,
- so won't help for 'early' oopses)
+ so won't help for 'early' oopses.)
(2) Boot with a serial console (see
:ref:`Documentation/admin-guide/serial-console.rst <serial_console>`),
@@ -104,7 +108,7 @@
gdb
^^^
-The GNU debug (``gdb``) is the best way to figure out the exact file and line
+The GNU debugger (``gdb``) is the best way to figure out the exact file and line
number of the OOPS from the ``vmlinux`` file.
The usage of gdb works best on a kernel compiled with ``CONFIG_DEBUG_INFO``.
@@ -165,7 +169,7 @@
[<ffffffff8802770b>] :jbd:journal_stop+0x1be/0x1ee
...
-this shows the problem likely in the :jbd: module. You can load that module
+this shows the problem likely is in the :jbd: module. You can load that module
in gdb and list the relevant code::
$ gdb fs/jbd/jbd.ko
@@ -199,8 +203,9 @@
You need to be at the top level of the kernel tree for this to pick up
your C files.
-If you don't have access to the code you can also debug on some crash dumps
-e.g. crash dump output as shown by Dave Miller::
+If you don't have access to the source code you can still debug some crash
+dumps using the following method (example crash dump output as shown by
+Dave Miller)::
EIP is at +0x14/0x4c0
...
@@ -230,6 +235,9 @@
mov 0x8(%ebp), %ebx ! %ebx = skb->sk
mov 0x13c(%ebx), %eax ! %eax = inet_sk(sk)->opt
+file:`scripts/decodecode` can be used to automate most of this, depending
+on what CPU architecture is being debugged.
+
Reporting the bug
-----------------
@@ -241,7 +249,7 @@
the ``get_maintainer.pl`` script.
For example, if you find a bug at the gspca's sonixj.c file, you can get
-their maintainers with::
+its maintainers with::
$ ./scripts/get_maintainer.pl -f drivers/media/usb/gspca/sonixj.c
Hans Verkuil <hverkuil@xs4all.nl> (odd fixer:GSPCA USB WEBCAM DRIVER,commit_signer:1/1=100%)
@@ -253,16 +261,17 @@
Please notice that it will point to:
-- The last developers that touched on the source code. On the above example,
- Tejun and Bhaktipriya (in this specific case, none really envolved on the
- development of this file);
+- The last developers that touched the source code (if this is done inside
+ a git tree). On the above example, Tejun and Bhaktipriya (in this
+ specific case, none really envolved on the development of this file);
- The driver maintainer (Hans Verkuil);
- The subsystem maintainer (Mauro Carvalho Chehab);
- The driver and/or subsystem mailing list (linux-media@vger.kernel.org);
- the Linux Kernel mailing list (linux-kernel@vger.kernel.org).
Usually, the fastest way to have your bug fixed is to report it to mailing
-list used for the development of the code (linux-media ML) copying the driver maintainer (Hans).
+list used for the development of the code (linux-media ML) copying the
+driver maintainer (Hans).
If you are totally stumped as to whom to send the report, and
``get_maintainer.pl`` didn't provide you anything useful, send it to
@@ -303,9 +312,9 @@
and forwarded to the kernel developers.
Two types of address resolution are performed by ``klogd``. The first is
-static translation and the second is dynamic translation. Static
-translation uses the System.map file in much the same manner that
-ksymoops does. In order to do static translation the ``klogd`` daemon
+static translation and the second is dynamic translation.
+Static translation uses the System.map file.
+In order to do static translation the ``klogd`` daemon
must be able to find a system map file at daemon initialization time.
See the klogd man page for information on how ``klogd`` searches for map
files.
diff --git a/Documentation/admin-guide/cgroup-v1/memory.rst b/Documentation/admin-guide/cgroup-v1/memory.rst
index 0ae4f56..12757e6 100644
--- a/Documentation/admin-guide/cgroup-v1/memory.rst
+++ b/Documentation/admin-guide/cgroup-v1/memory.rst
@@ -199,11 +199,11 @@
unaccounted when it's removed from radix-tree. Even if RSS pages are fully
unmapped (by kswapd), they may exist as SwapCache in the system until they
are really freed. Such SwapCaches are also accounted.
-A swapped-in page is not accounted until it's mapped.
+A swapped-in page is accounted after adding into swapcache.
Note: The kernel does swapin-readahead and reads multiple swaps at once.
-This means swapped-in pages may contain pages for other tasks than a task
-causing page fault. So, we avoid accounting at swap-in I/O.
+Since page's memcg recorded into swap whatever memsw enabled, the page will
+be accounted after swapin.
At page migration, accounting information is kept.
@@ -222,18 +222,13 @@
But see section 8.2: when moving a task to another cgroup, its pages may
be recharged to the new cgroup, if move_charge_at_immigrate has been chosen.
-Exception: If CONFIG_MEMCG_SWAP is not used.
-When you do swapoff and make swapped-out pages of shmem(tmpfs) to
-be backed into memory in force, charges for pages are accounted against the
-caller of swapoff rather than the users of shmem.
-
-2.4 Swap Extension (CONFIG_MEMCG_SWAP)
+2.4 Swap Extension
--------------------------------------
-Swap Extension allows you to record charge for swap. A swapped-in page is
-charged back to original page allocator if possible.
+Swap usage is always recorded for each of cgroup. Swap Extension allows you to
+read and limit it.
-When swap is accounted, following files are added.
+When CONFIG_SWAP is enabled, following files are added.
- memory.memsw.usage_in_bytes.
- memory.memsw.limit_in_bytes.
diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index bcc80269..ce3e05e 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -714,9 +714,7 @@
- Settings for a single feature should be contained in a single file.
- The root cgroup should be exempt from resource control and thus
- shouldn't have resource control interface files. Also,
- informational files on the root cgroup which end up showing global
- information available elsewhere shouldn't exist.
+ shouldn't have resource control interface files.
- The default time unit is microseconds. If a different unit is ever
used, an explicit unit suffix must be present.
@@ -985,7 +983,7 @@
All time durations are in microseconds.
cpu.stat
- A read-only flat-keyed file which exists on non-root cgroups.
+ A read-only flat-keyed file.
This file exists whether the controller is enabled or not.
It always reports the following three stats:
@@ -1172,6 +1170,13 @@
Under certain circumstances, the usage may go over the limit
temporarily.
+ In default configuration regular 0-order allocations always
+ succeed unless OOM killer chooses current task as a victim.
+
+ Some kinds of allocations don't invoke the OOM killer.
+ Caller could retry them differently, return into userspace
+ as -ENOMEM or silently ignore in cases like disk readahead.
+
This is the ultimate protection mechanism. As long as the
high limit is used and monitored properly, this limit's
utility is limited to providing the final safety net.
@@ -1228,17 +1233,9 @@
The number of time the cgroup's memory usage was
reached the limit and allocation was about to fail.
- Depending on context result could be invocation of OOM
- killer and retrying allocation or failing allocation.
-
- Failed allocation in its turn could be returned into
- userspace as -ENOMEM or silently ignored in cases like
- disk readahead. For now OOM in memory cgroup kills
- tasks iff shortage has happened inside page fault.
-
This event is not raised if the OOM killer is not
considered as an option, e.g. for failed high-order
- allocations.
+ allocations or if caller asked to not retry attempts.
oom_kill
The number of processes belonging to this cgroup
@@ -1329,6 +1326,10 @@
workingset_activate
Number of refaulted pages that were immediately activated
+ workingset_restore
+ Number of restored pages which have been detected as an active
+ workingset before they got reclaimed.
+
workingset_nodereclaim
Number of times a shadow node has been reclaimed
@@ -1370,6 +1371,22 @@
The total amount of swap currently being used by the cgroup
and its descendants.
+ memory.swap.high
+ A read-write single value file which exists on non-root
+ cgroups. The default is "max".
+
+ Swap usage throttle limit. If a cgroup's swap usage exceeds
+ this limit, all its further allocations will be throttled to
+ allow userspace to implement custom out-of-memory procedures.
+
+ This limit marks a point of no return for the cgroup. It is NOT
+ designed to manage the amount of swapping a workload does
+ during regular operation. Compare to memory.swap.max, which
+ prohibits swapping past a set amount, but lets the cgroup
+ continue unimpeded as long as other memory can be reclaimed.
+
+ Healthy workloads are not expected to reach this limit.
+
memory.swap.max
A read-write single value file which exists on non-root
cgroups. The default is "max".
@@ -1383,6 +1400,10 @@
otherwise, a value change in this file generates a file
modified event.
+ high
+ The number of times the cgroup's swap usage was over
+ the high threshold.
+
max
The number of times the cgroup's swap usage was about
to go over the max boundary and swap allocation
diff --git a/Documentation/admin-guide/cpu-load.rst b/Documentation/admin-guide/cpu-load.rst
index 2d01ce4..ebdecf8 100644
--- a/Documentation/admin-guide/cpu-load.rst
+++ b/Documentation/admin-guide/cpu-load.rst
@@ -105,7 +105,7 @@
----------
- http://lkml.org/lkml/2007/2/12/6
-- Documentation/filesystems/proc.txt (1.8)
+- Documentation/filesystems/proc.rst (1.8)
Thanks
diff --git a/Documentation/admin-guide/device-mapper/dm-ebs.rst b/Documentation/admin-guide/device-mapper/dm-ebs.rst
new file mode 100644
index 0000000..534fa38
--- /dev/null
+++ b/Documentation/admin-guide/device-mapper/dm-ebs.rst
@@ -0,0 +1,51 @@
+======
+dm-ebs
+======
+
+
+This target is similar to the linear target except that it emulates
+a smaller logical block size on a device with a larger logical block
+size. Its main purpose is to provide emulation of 512 byte sectors on
+devices that do not provide this emulation (i.e. 4K native disks).
+
+Supported emulated logical block sizes 512, 1024, 2048 and 4096.
+
+Underlying block size can be set to > 4K to test buffering larger units.
+
+
+Table parameters
+----------------
+ <dev path> <offset> <emulated sectors> [<underlying sectors>]
+
+Mandatory parameters:
+
+ <dev path>:
+ Full pathname to the underlying block-device,
+ or a "major:minor" device-number.
+ <offset>:
+ Starting sector within the device;
+ has to be a multiple of <emulated sectors>.
+ <emulated sectors>:
+ Number of sectors defining the logical block size to be emulated;
+ 1, 2, 4, 8 sectors of 512 bytes supported.
+
+Optional parameter:
+
+ <underyling sectors>:
+ Number of sectors defining the logical block size of <dev path>.
+ 2^N supported, e.g. 8 = emulate 8 sectors of 512 bytes = 4KiB.
+ If not provided, the logical block size of <dev path> will be used.
+
+
+Examples:
+
+Emulate 1 sector = 512 bytes logical block size on /dev/sda starting at
+offset 1024 sectors with underlying devices block size automatically set:
+
+ebs /dev/sda 1024 1
+
+Emulate 2 sector = 1KiB logical block size on /dev/sda starting at
+offset 128 sectors, enforce 2KiB underlying device block size.
+This presumes 2KiB logical blocksize on /dev/sda or less to work:
+
+ebs /dev/sda 128 2 4
diff --git a/Documentation/admin-guide/device-mapper/dm-integrity.rst b/Documentation/admin-guide/device-mapper/dm-integrity.rst
index c00f9f1..9edd455 100644
--- a/Documentation/admin-guide/device-mapper/dm-integrity.rst
+++ b/Documentation/admin-guide/device-mapper/dm-integrity.rst
@@ -182,12 +182,23 @@
space-efficient. If this option is not present, large padding is
used - that is for compatibility with older kernels.
+allow_discards
+ Allow block discard requests (a.k.a. TRIM) for the integrity device.
+ Discards are only allowed to devices using internal hash.
-The journal mode (D/J), buffer_sectors, journal_watermark, commit_time can
-be changed when reloading the target (load an inactive table and swap the
-tables with suspend and resume). The other arguments should not be changed
-when reloading the target because the layout of disk data depend on them
-and the reloaded target would be non-functional.
+The journal mode (D/J), buffer_sectors, journal_watermark, commit_time and
+allow_discards can be changed when reloading the target (load an inactive
+table and swap the tables with suspend and resume). The other arguments
+should not be changed when reloading the target because the layout of disk
+data depend on them and the reloaded target would be non-functional.
+
+
+Status line:
+
+1. the number of integrity mismatches
+2. provided data sectors - that is the number of sectors that the user
+ could use
+3. the current recalculating position (or '-' if we didn't recalculate)
The layout of the formatted block device:
diff --git a/Documentation/admin-guide/device-mapper/dm-zoned.rst b/Documentation/admin-guide/device-mapper/dm-zoned.rst
index 07f56eb..553752e 100644
--- a/Documentation/admin-guide/device-mapper/dm-zoned.rst
+++ b/Documentation/admin-guide/device-mapper/dm-zoned.rst
@@ -37,9 +37,13 @@
dm-zoned implements an on-disk buffering scheme to handle non-sequential
write accesses to the sequential zones of a zoned block device.
Conventional zones are used for caching as well as for storing internal
-metadata.
+metadata. It can also use a regular block device together with the zoned
+block device; in that case the regular block device will be split logically
+in zones with the same size as the zoned block device. These zones will be
+placed in front of the zones from the zoned block device and will be handled
+just like conventional zones.
-The zones of the device are separated into 2 types:
+The zones of the device(s) are separated into 2 types:
1) Metadata zones: these are conventional zones used to store metadata.
Metadata zones are not reported as useable capacity to the user.
@@ -127,6 +131,13 @@
discard requests. Read requests can be processed concurrently while
metadata flush is being executed.
+If a regular device is used in conjunction with the zoned block device,
+a third set of metadata (without the zone bitmaps) is written to the
+start of the zoned block device. This metadata has a generation counter of
+'0' and will never be updated during normal operation; it just serves for
+identification purposes. The first and second copy of the metadata
+are located at the start of the regular block device.
+
Usage
=====
@@ -138,9 +149,46 @@
dmzadm --format /dev/sdxx
-For a formatted device, the target can be created normally with the
-dmsetup utility. The only parameter that dm-zoned requires is the
-underlying zoned block device name. Ex::
- echo "0 `blockdev --getsize ${dev}` zoned ${dev}" | \
- dmsetup create dmz-`basename ${dev}`
+If two drives are to be used, both devices must be specified, with the
+regular block device as the first device.
+
+Ex::
+
+ dmzadm --format /dev/sdxx /dev/sdyy
+
+
+Fomatted device(s) can be started with the dmzadm utility, too.:
+
+Ex::
+
+ dmzadm --start /dev/sdxx /dev/sdyy
+
+
+Information about the internal layout and current usage of the zones can
+be obtained with the 'status' callback from dmsetup:
+
+Ex::
+
+ dmsetup status /dev/dm-X
+
+will return a line
+
+ 0 <size> zoned <nr_zones> zones <nr_unmap_rnd>/<nr_rnd> random <nr_unmap_seq>/<nr_seq> sequential
+
+where <nr_zones> is the total number of zones, <nr_unmap_rnd> is the number
+of unmapped (ie free) random zones, <nr_rnd> the total number of zones,
+<nr_unmap_seq> the number of unmapped sequential zones, and <nr_seq> the
+total number of sequential zones.
+
+Normally the reclaim process will be started once there are less than 50
+percent free random zones. In order to start the reclaim process manually
+even before reaching this threshold the 'dmsetup message' function can be
+used:
+
+Ex::
+
+ dmsetup message /dev/dm-X 0 reclaim
+
+will start the reclaim process and random zones will be moved to sequential
+zones.
diff --git a/Documentation/admin-guide/devices.rst b/Documentation/admin-guide/devices.rst
index d41671a..035275f 100644
--- a/Documentation/admin-guide/devices.rst
+++ b/Documentation/admin-guide/devices.rst
@@ -17,7 +17,7 @@
to involve for character and block devices.
This document is included by reference into the Filesystem Hierarchy
-Standard (FHS). The FHS is available from http://www.pathname.com/fhs/.
+Standard (FHS). The FHS is available from https://www.pathname.com/fhs/.
Allocations marked (68k/Amiga) apply to Linux/68k on the Amiga
platform only. Allocations marked (68k/Atari) apply to Linux/68k on
diff --git a/Documentation/admin-guide/dynamic-debug-howto.rst b/Documentation/admin-guide/dynamic-debug-howto.rst
index 0dc2eb8..1012bd9 100644
--- a/Documentation/admin-guide/dynamic-debug-howto.rst
+++ b/Documentation/admin-guide/dynamic-debug-howto.rst
@@ -13,6 +13,11 @@
``print_hex_dump_debug()``/``print_hex_dump_bytes()`` calls can be dynamically
enabled per-callsite.
+If you do not want to enable dynamic debug globally (i.e. in some embedded
+system), you may set ``CONFIG_DYNAMIC_DEBUG_CORE`` as basic support of dynamic
+debug and add ``ccflags := -DDYNAMIC_DEBUG_MODULE`` into the Makefile of any
+modules which you'd like to dynamically debug later.
+
If ``CONFIG_DYNAMIC_DEBUG`` is not set, ``print_hex_dump_debug()`` is just
shortcut for ``print_hex_dump(KERN_DEBUG)``.
diff --git a/Documentation/admin-guide/gpio/gpio-aggregator.rst b/Documentation/admin-guide/gpio/gpio-aggregator.rst
new file mode 100644
index 0000000..5cd1e72
--- /dev/null
+++ b/Documentation/admin-guide/gpio/gpio-aggregator.rst
@@ -0,0 +1,111 @@
+.. SPDX-License-Identifier: GPL-2.0-only
+
+GPIO Aggregator
+===============
+
+The GPIO Aggregator provides a mechanism to aggregate GPIOs, and expose them as
+a new gpio_chip. This supports the following use cases.
+
+
+Aggregating GPIOs using Sysfs
+-----------------------------
+
+GPIO controllers are exported to userspace using /dev/gpiochip* character
+devices. Access control to these devices is provided by standard UNIX file
+system permissions, on an all-or-nothing basis: either a GPIO controller is
+accessible for a user, or it is not.
+
+The GPIO Aggregator provides access control for a set of one or more GPIOs, by
+aggregating them into a new gpio_chip, which can be assigned to a group or user
+using standard UNIX file ownership and permissions. Furthermore, this
+simplifies and hardens exporting GPIOs to a virtual machine, as the VM can just
+grab the full GPIO controller, and no longer needs to care about which GPIOs to
+grab and which not, reducing the attack surface.
+
+Aggregated GPIO controllers are instantiated and destroyed by writing to
+write-only attribute files in sysfs.
+
+ /sys/bus/platform/drivers/gpio-aggregator/
+
+ "new_device" ...
+ Userspace may ask the kernel to instantiate an aggregated GPIO
+ controller by writing a string describing the GPIOs to
+ aggregate to the "new_device" file, using the format
+
+ .. code-block:: none
+
+ [<gpioA>] [<gpiochipB> <offsets>] ...
+
+ Where:
+
+ "<gpioA>" ...
+ is a GPIO line name,
+
+ "<gpiochipB>" ...
+ is a GPIO chip label, and
+
+ "<offsets>" ...
+ is a comma-separated list of GPIO offsets and/or
+ GPIO offset ranges denoted by dashes.
+
+ Example: Instantiate a new GPIO aggregator by aggregating GPIO
+ line 19 of "e6052000.gpio" and GPIO lines 20-21 of
+ "e6050000.gpio" into a new gpio_chip:
+
+ .. code-block:: sh
+
+ $ echo 'e6052000.gpio 19 e6050000.gpio 20-21' > new_device
+
+ "delete_device" ...
+ Userspace may ask the kernel to destroy an aggregated GPIO
+ controller after use by writing its device name to the
+ "delete_device" file.
+
+ Example: Destroy the previously-created aggregated GPIO
+ controller, assumed to be "gpio-aggregator.0":
+
+ .. code-block:: sh
+
+ $ echo gpio-aggregator.0 > delete_device
+
+
+Generic GPIO Driver
+-------------------
+
+The GPIO Aggregator can also be used as a generic driver for a simple
+GPIO-operated device described in DT, without a dedicated in-kernel driver.
+This is useful in industrial control, and is not unlike e.g. spidev, which
+allows the user to communicate with an SPI device from userspace.
+
+Binding a device to the GPIO Aggregator is performed either by modifying the
+gpio-aggregator driver, or by writing to the "driver_override" file in Sysfs.
+
+Example: If "door" is a GPIO-operated device described in DT, using its own
+compatible value::
+
+ door {
+ compatible = "myvendor,mydoor";
+
+ gpios = <&gpio2 19 GPIO_ACTIVE_HIGH>,
+ <&gpio2 20 GPIO_ACTIVE_LOW>;
+ gpio-line-names = "open", "lock";
+ };
+
+it can be bound to the GPIO Aggregator by either:
+
+1. Adding its compatible value to ``gpio_aggregator_dt_ids[]``,
+2. Binding manually using "driver_override":
+
+.. code-block:: sh
+
+ $ echo gpio-aggregator > /sys/bus/platform/devices/door/driver_override
+ $ echo door > /sys/bus/platform/drivers/gpio-aggregator/bind
+
+After that, a new gpiochip "door" has been created:
+
+.. code-block:: sh
+
+ $ gpioinfo door
+ gpiochip12 - 2 lines:
+ line 0: "open" unused input active-high
+ line 1: "lock" unused input active-high
diff --git a/Documentation/admin-guide/gpio/index.rst b/Documentation/admin-guide/gpio/index.rst
index a244ba4..ef28386 100644
--- a/Documentation/admin-guide/gpio/index.rst
+++ b/Documentation/admin-guide/gpio/index.rst
@@ -7,6 +7,7 @@
.. toctree::
:maxdepth: 1
+ gpio-aggregator
sysfs
.. only:: subproject and html
diff --git a/Documentation/admin-guide/hw-vuln/index.rst b/Documentation/admin-guide/hw-vuln/index.rst
index 0795e3c..ca4dbdd 100644
--- a/Documentation/admin-guide/hw-vuln/index.rst
+++ b/Documentation/admin-guide/hw-vuln/index.rst
@@ -14,3 +14,4 @@
mds
tsx_async_abort
multihit.rst
+ special-register-buffer-data-sampling.rst
diff --git a/Documentation/admin-guide/hw-vuln/l1tf.rst b/Documentation/admin-guide/hw-vuln/l1tf.rst
index f83212f..3eeeb48 100644
--- a/Documentation/admin-guide/hw-vuln/l1tf.rst
+++ b/Documentation/admin-guide/hw-vuln/l1tf.rst
@@ -268,7 +268,7 @@
/proc/irq/$NR/smp_affinity[_list] files. Limited documentation is
available at:
- https://www.kernel.org/doc/Documentation/IRQ-affinity.txt
+ https://www.kernel.org/doc/Documentation/core-api/irq/irq-affinity.rst
.. _smt_control:
diff --git a/Documentation/admin-guide/hw-vuln/special-register-buffer-data-sampling.rst b/Documentation/admin-guide/hw-vuln/special-register-buffer-data-sampling.rst
new file mode 100644
index 0000000..47b1b3a
--- /dev/null
+++ b/Documentation/admin-guide/hw-vuln/special-register-buffer-data-sampling.rst
@@ -0,0 +1,149 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+SRBDS - Special Register Buffer Data Sampling
+=============================================
+
+SRBDS is a hardware vulnerability that allows MDS :doc:`mds` techniques to
+infer values returned from special register accesses. Special register
+accesses are accesses to off core registers. According to Intel's evaluation,
+the special register reads that have a security expectation of privacy are
+RDRAND, RDSEED and SGX EGETKEY.
+
+When RDRAND, RDSEED and EGETKEY instructions are used, the data is moved
+to the core through the special register mechanism that is susceptible
+to MDS attacks.
+
+Affected processors
+--------------------
+Core models (desktop, mobile, Xeon-E3) that implement RDRAND and/or RDSEED may
+be affected.
+
+A processor is affected by SRBDS if its Family_Model and stepping is
+in the following list, with the exception of the listed processors
+exporting MDS_NO while Intel TSX is available yet not enabled. The
+latter class of processors are only affected when Intel TSX is enabled
+by software using TSX_CTRL_MSR otherwise they are not affected.
+
+ ============= ============ ========
+ common name Family_Model Stepping
+ ============= ============ ========
+ IvyBridge 06_3AH All
+
+ Haswell 06_3CH All
+ Haswell_L 06_45H All
+ Haswell_G 06_46H All
+
+ Broadwell_G 06_47H All
+ Broadwell 06_3DH All
+
+ Skylake_L 06_4EH All
+ Skylake 06_5EH All
+
+ Kabylake_L 06_8EH <= 0xC
+ Kabylake 06_9EH <= 0xD
+ ============= ============ ========
+
+Related CVEs
+------------
+
+The following CVE entry is related to this SRBDS issue:
+
+ ============== ===== =====================================
+ CVE-2020-0543 SRBDS Special Register Buffer Data Sampling
+ ============== ===== =====================================
+
+Attack scenarios
+----------------
+An unprivileged user can extract values returned from RDRAND and RDSEED
+executed on another core or sibling thread using MDS techniques.
+
+
+Mitigation mechanism
+-------------------
+Intel will release microcode updates that modify the RDRAND, RDSEED, and
+EGETKEY instructions to overwrite secret special register data in the shared
+staging buffer before the secret data can be accessed by another logical
+processor.
+
+During execution of the RDRAND, RDSEED, or EGETKEY instructions, off-core
+accesses from other logical processors will be delayed until the special
+register read is complete and the secret data in the shared staging buffer is
+overwritten.
+
+This has three effects on performance:
+
+#. RDRAND, RDSEED, or EGETKEY instructions have higher latency.
+
+#. Executing RDRAND at the same time on multiple logical processors will be
+ serialized, resulting in an overall reduction in the maximum RDRAND
+ bandwidth.
+
+#. Executing RDRAND, RDSEED or EGETKEY will delay memory accesses from other
+ logical processors that miss their core caches, with an impact similar to
+ legacy locked cache-line-split accesses.
+
+The microcode updates provide an opt-out mechanism (RNGDS_MITG_DIS) to disable
+the mitigation for RDRAND and RDSEED instructions executed outside of Intel
+Software Guard Extensions (Intel SGX) enclaves. On logical processors that
+disable the mitigation using this opt-out mechanism, RDRAND and RDSEED do not
+take longer to execute and do not impact performance of sibling logical
+processors memory accesses. The opt-out mechanism does not affect Intel SGX
+enclaves (including execution of RDRAND or RDSEED inside an enclave, as well
+as EGETKEY execution).
+
+IA32_MCU_OPT_CTRL MSR Definition
+--------------------------------
+Along with the mitigation for this issue, Intel added a new thread-scope
+IA32_MCU_OPT_CTRL MSR, (address 0x123). The presence of this MSR and
+RNGDS_MITG_DIS (bit 0) is enumerated by CPUID.(EAX=07H,ECX=0).EDX[SRBDS_CTRL =
+9]==1. This MSR is introduced through the microcode update.
+
+Setting IA32_MCU_OPT_CTRL[0] (RNGDS_MITG_DIS) to 1 for a logical processor
+disables the mitigation for RDRAND and RDSEED executed outside of an Intel SGX
+enclave on that logical processor. Opting out of the mitigation for a
+particular logical processor does not affect the RDRAND and RDSEED mitigations
+for other logical processors.
+
+Note that inside of an Intel SGX enclave, the mitigation is applied regardless
+of the value of RNGDS_MITG_DS.
+
+Mitigation control on the kernel command line
+---------------------------------------------
+The kernel command line allows control over the SRBDS mitigation at boot time
+with the option "srbds=". The option for this is:
+
+ ============= =============================================================
+ off This option disables SRBDS mitigation for RDRAND and RDSEED on
+ affected platforms.
+ ============= =============================================================
+
+SRBDS System Information
+-----------------------
+The Linux kernel provides vulnerability status information through sysfs. For
+SRBDS this can be accessed by the following sysfs file:
+/sys/devices/system/cpu/vulnerabilities/srbds
+
+The possible values contained in this file are:
+
+ ============================== =============================================
+ Not affected Processor not vulnerable
+ Vulnerable Processor vulnerable and mitigation disabled
+ Vulnerable: No microcode Processor vulnerable and microcode is missing
+ mitigation
+ Mitigation: Microcode Processor is vulnerable and mitigation is in
+ effect.
+ Mitigation: TSX disabled Processor is only vulnerable when TSX is
+ enabled while this system was booted with TSX
+ disabled.
+ Unknown: Dependent on
+ hypervisor status Running on virtual guest processor that is
+ affected but with no way to know if host
+ processor is mitigated or vulnerable.
+ ============================== =============================================
+
+SRBDS Default mitigation
+------------------------
+This new microcode serializes processor access during execution of RDRAND,
+RDSEED ensures that the shared buffer is overwritten before it is released for
+reuse. Use the "srbds=off" kernel command line to disable the mitigation for
+RDRAND and RDSEED.
diff --git a/Documentation/admin-guide/init.rst b/Documentation/admin-guide/init.rst
index e89d97f..41f06a0 100644
--- a/Documentation/admin-guide/init.rst
+++ b/Documentation/admin-guide/init.rst
@@ -1,52 +1,48 @@
-Explaining the dreaded "No init found." boot hang message
+Explaining the "No working init found." boot hang message
=========================================================
+:Authors: Andreas Mohr <andi at lisas period de>
+ Cristian Souza <cristianmsbr at gmail period com>
-OK, so you've got this pretty unintuitive message (currently located
-in init/main.c) and are wondering what the H*** went wrong.
-Some high-level reasons for failure (listed roughly in order of execution)
-to load the init binary are:
+This document provides some high-level reasons for failure
+(listed roughly in order of execution) to load the init binary.
-A) Unable to mount root FS
-B) init binary doesn't exist on rootfs
-C) broken console device
-D) binary exists but dependencies not available
-E) binary cannot be loaded
+1) **Unable to mount root FS**: Set "debug" kernel parameter (in bootloader
+ config file or CONFIG_CMDLINE) to get more detailed kernel messages.
-Detailed explanations:
+2) **init binary doesn't exist on rootfs**: Make sure you have the correct
+ root FS type (and ``root=`` kernel parameter points to the correct
+ partition), required drivers such as storage hardware (such as SCSI or
+ USB!) and filesystem (ext3, jffs2, etc.) are builtin (alternatively as
+ modules, to be pre-loaded by an initrd).
-A) Set "debug" kernel parameter (in bootloader config file or CONFIG_CMDLINE)
- to get more detailed kernel messages.
-B) make sure you have the correct root FS type
- (and ``root=`` kernel parameter points to the correct partition),
- required drivers such as storage hardware (such as SCSI or USB!)
- and filesystem (ext3, jffs2 etc.) are builtin (alternatively as modules,
- to be pre-loaded by an initrd)
-C) Possibly a conflict in ``console= setup`` --> initial console unavailable.
- E.g. some serial consoles are unreliable due to serial IRQ issues (e.g.
- missing interrupt-based configuration).
+3) **Broken console device**: Possibly a conflict in ``console= setup``
+ --> initial console unavailable. E.g. some serial consoles are unreliable
+ due to serial IRQ issues (e.g. missing interrupt-based configuration).
Try using a different ``console= device`` or e.g. ``netconsole=``.
-D) e.g. required library dependencies of the init binary such as
- ``/lib/ld-linux.so.2`` missing or broken. Use
- ``readelf -d <INIT>|grep NEEDED`` to find out which libraries are required.
-E) make sure the binary's architecture matches your hardware.
- E.g. i386 vs. x86_64 mismatch, or trying to load x86 on ARM hardware.
- In case you tried loading a non-binary file here (shell script?),
- you should make sure that the script specifies an interpreter in its shebang
- header line (``#!/...``) that is fully working (including its library
- dependencies). And before tackling scripts, better first test a simple
- non-script binary such as ``/bin/sh`` and confirm its successful execution.
- To find out more, add code ``to init/main.c`` to display kernel_execve()s
- return values.
+
+4) **Binary exists but dependencies not available**: E.g. required library
+ dependencies of the init binary such as ``/lib/ld-linux.so.2`` missing or
+ broken. Use ``readelf -d <INIT>|grep NEEDED`` to find out which libraries
+ are required.
+
+5) **Binary cannot be loaded**: Make sure the binary's architecture matches
+ your hardware. E.g. i386 vs. x86_64 mismatch, or trying to load x86 on ARM
+ hardware. In case you tried loading a non-binary file here (shell script?),
+ you should make sure that the script specifies an interpreter in its
+ shebang header line (``#!/...``) that is fully working (including its
+ library dependencies). And before tackling scripts, better first test a
+ simple non-script binary such as ``/bin/sh`` and confirm its successful
+ execution. To find out more, add code ``to init/main.c`` to display
+ kernel_execve()s return values.
Please extend this explanation whenever you find new failure causes
(after all loading the init binary is a CRITICAL and hard transition step
-which needs to be made as painless as possible), then submit patch to LKML.
+which needs to be made as painless as possible), then submit a patch to LKML.
Further TODOs:
- Implement the various ``run_init_process()`` invocations via a struct array
which can then store the ``kernel_execve()`` result value and on failure
log it all by iterating over **all** results (very important usability fix).
-- try to make the implementation itself more helpful in general,
- e.g. by providing additional error messages at affected places.
+- Try to make the implementation itself more helpful in general, e.g. by
+ providing additional error messages at affected places.
-Andreas Mohr <andi at lisas period de>
diff --git a/Documentation/admin-guide/initrd.rst b/Documentation/admin-guide/initrd.rst
index a03daba..67bbad8 100644
--- a/Documentation/admin-guide/initrd.rst
+++ b/Documentation/admin-guide/initrd.rst
@@ -376,7 +376,7 @@
---------
.. [#f1] Almesberger, Werner; "Booting Linux: The History and the Future"
- http://www.almesberger.net/cv/papers/ols2k-9.ps.gz
+ https://www.almesberger.net/cv/papers/ols2k-9.ps.gz
.. [#f2] newlib package (experimental), with initrd example
https://www.sourceware.org/newlib/
.. [#f3] util-linux: Miscellaneous utilities for Linux
diff --git a/Documentation/admin-guide/kdump/kdump.rst b/Documentation/admin-guide/kdump/kdump.rst
index ac7e131..2da65fe 100644
--- a/Documentation/admin-guide/kdump/kdump.rst
+++ b/Documentation/admin-guide/kdump/kdump.rst
@@ -521,6 +521,14 @@
to specify this during runtime, /proc/sys/kernel/panic_on_warn can be set to 1
to achieve the same behaviour.
+Trigger Kdump on add_taint()
+============================
+
+The kernel parameter panic_on_taint facilitates a conditional call to panic()
+from within add_taint() whenever the value set in this bitmask matches with the
+bit flag being set by add_taint().
+This will cause a kdump to occur at the add_taint()->panic() call.
+
Contact
=======
diff --git a/Documentation/admin-guide/kdump/vmcoreinfo.rst b/Documentation/admin-guide/kdump/vmcoreinfo.rst
index 007a6b8..e4ee8b2 100644
--- a/Documentation/admin-guide/kdump/vmcoreinfo.rst
+++ b/Documentation/admin-guide/kdump/vmcoreinfo.rst
@@ -393,6 +393,12 @@
The kernel randomization offset. Used to compute the page offset. If
KASLR is disabled, this value is zero.
+KERNELPACMASK
+-------------
+
+The mask to extract the Pointer Authentication Code from a kernel virtual
+address.
+
arm
===
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 3ed0b42..fb95fad 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -356,7 +356,7 @@
shot down by NMI
autoconf= [IPV6]
- See Documentation/networking/ipv6.txt.
+ See Documentation/networking/ipv6.rst.
show_lapic= [APIC,X86] Advanced Programmable Interrupt Controller
Limit apic dumping. The parameter defines the maximal
@@ -638,7 +638,7 @@
See Documentation/admin-guide/serial-console.rst for more
information. See
- Documentation/networking/netconsole.txt for an
+ Documentation/networking/netconsole.rst for an
alternative.
uart[8250],io,<addr>[,options]
@@ -831,15 +831,18 @@
decnet.addr= [HW,NET]
Format: <area>[,<node>]
- See also Documentation/networking/decnet.txt.
+ See also Documentation/networking/decnet.rst.
default_hugepagesz=
- [same as hugepagesz=] The size of the default
- HugeTLB page size. This is the size represented by
- the legacy /proc/ hugepages APIs, used for SHM, and
- default size when mounting hugetlbfs filesystems.
- Defaults to the default architecture's huge page size
- if not specified.
+ [HW] The size of the default HugeTLB page. This is
+ the size represented by the legacy /proc/ hugepages
+ APIs. In addition, this is the default hugetlb size
+ used for shmget(), mmap() and mounting hugetlbfs
+ filesystems. If not specified, defaults to the
+ architecture's default huge page size. Huge page
+ sizes are architecture dependent. See also
+ Documentation/admin-guide/mm/hugetlbpage.rst.
+ Format: size[KMG]
deferred_probe_timeout=
[KNL] Debugging option to set a timeout in seconds for
@@ -871,8 +874,13 @@
can be useful when debugging issues that require an SLB
miss to occur.
+ stress_slb [PPC]
+ Limits the number of kernel SLB entries, and flushes
+ them frequently to increase the rate of SLB faults
+ on kernel addresses.
+
disable= [IPV6]
- See Documentation/networking/ipv6.txt.
+ See Documentation/networking/ipv6.rst.
hardened_usercopy=
[KNL] Under CONFIG_HARDENED_USERCOPY, whether
@@ -912,7 +920,7 @@
to workaround buggy firmware.
disable_ipv6= [IPV6]
- See Documentation/networking/ipv6.txt.
+ See Documentation/networking/ipv6.rst.
disable_mtrr_cleanup [X86]
The kernel tries to adjust MTRR layout from continuous
@@ -1190,6 +1198,11 @@
This is designed to be used in conjunction with
the boot argument: earlyprintk=vga
+ This parameter works in place of the kgdboc parameter
+ but can only be used if the backing tty is available
+ very early in the boot process. For early debugging
+ via a serial port see kgdboc_earlycon instead.
+
edd= [EDD]
Format: {"off" | "on" | "skip[mbr]"}
@@ -1432,7 +1445,7 @@
hardlockup_all_cpu_backtrace=
[KNL] Should the hard-lockup detector generate
backtraces on all cpus.
- Format: <integer>
+ Format: 0 | 1
hashdist= [KNL,NUMA] Large hashes allocated during boot
are distributed across NUMA nodes. Defaults on
@@ -1479,19 +1492,30 @@
hugepages using the cma allocator. If enabled, the
boot-time allocation of gigantic hugepages is skipped.
- hugepages= [HW,X86-32,IA-64] HugeTLB pages to allocate at boot.
- hugepagesz= [HW,IA-64,PPC,X86-64] The size of the HugeTLB pages.
- On x86-64 and powerpc, this option can be specified
- multiple times interleaved with hugepages= to reserve
- huge pages of different sizes. Valid pages sizes on
- x86-64 are 2M (when the CPU supports "pse") and 1G
- (when the CPU supports the "pdpe1gb" cpuinfo flag).
+ hugepages= [HW] Number of HugeTLB pages to allocate at boot.
+ If this follows hugepagesz (below), it specifies
+ the number of pages of hugepagesz to be allocated.
+ If this is the first HugeTLB parameter on the command
+ line, it specifies the number of pages to allocate for
+ the default huge page size. See also
+ Documentation/admin-guide/mm/hugetlbpage.rst.
+ Format: <integer>
+
+ hugepagesz=
+ [HW] The size of the HugeTLB pages. This is used in
+ conjunction with hugepages (above) to allocate huge
+ pages of a specific size at boot. The pair
+ hugepagesz=X hugepages=Y can be specified once for
+ each supported huge page size. Huge page sizes are
+ architecture dependent. See also
+ Documentation/admin-guide/mm/hugetlbpage.rst.
+ Format: size[KMG]
hung_task_panic=
[KNL] Should the hung task detector generate panics.
- Format: <integer>
+ Format: 0 | 1
- A nonzero value instructs the kernel to panic when a
+ A value of 1 instructs the kernel to panic when a
hung task is detected. The default value is controlled
by the CONFIG_BOOTPARAM_HUNG_TASK_PANIC build-time
option. The value selected by this boot parameter can
@@ -1748,6 +1772,13 @@
initrd= [BOOT] Specify the location of the initial ramdisk
+ initrdmem= [KNL] Specify a physical address and size from which to
+ load the initrd. If an initrd is compiled in or
+ specified in the bootparams, it takes priority over this
+ setting.
+ Format: ss[KMG],nn[KMG]
+ Default is 0, 0
+
init_on_alloc= [MM] Fill newly allocated pages and heap objects with
zeroes.
Format: 0 | 1
@@ -2105,6 +2136,21 @@
kms, kbd format: kms,kbd
kms, kbd and serial format: kms,kbd,<ser_dev>[,baud]
+ kgdboc_earlycon= [KGDB,HW]
+ If the boot console provides the ability to read
+ characters and can work in polling mode, you can use
+ this parameter to tell kgdb to use it as a backend
+ until the normal console is registered. Intended to
+ be used together with the kgdboc parameter which
+ specifies the normal console to transition to.
+
+ The name of the early console should be specified
+ as the value of this parameter. Note that the name of
+ the early console might be different than the tty
+ name passed to kgdboc. It's OK to leave the value
+ blank and the first boot console that implements
+ read() will be picked.
+
kgdbwait [KGDB] Stop kernel execution and enter the
kernel debugger at the earliest opportunity.
@@ -3329,7 +3375,7 @@
See Documentation/admin-guide/sysctl/vm.rst for details.
ohci1394_dma=early [HW] enable debugging via the ohci1394 driver.
- See Documentation/debugging-via-ohci1394.txt for more
+ See Documentation/core-api/debugging-via-ohci1394.rst for more
info.
olpc_ec_timeout= [OLPC] ms delay when issuing EC commands
@@ -3401,6 +3447,19 @@
bit 4: print ftrace buffer
bit 5: print all printk messages in buffer
+ panic_on_taint= Bitmask for conditionally calling panic() in add_taint()
+ Format: <hex>[,nousertaint]
+ Hexadecimal bitmask representing the set of TAINT flags
+ that will cause the kernel to panic when add_taint() is
+ called with any of the flags in this set.
+ The optional switch "nousertaint" can be utilized to
+ prevent userspace forced crashes by writing to sysctl
+ /proc/sys/kernel/tainted any flagset matching with the
+ bitmask set on panic_on_taint.
+ See Documentation/admin-guide/tainted-kernels.rst for
+ extra details on the taint flags that users can pick
+ to compose the bitmask to assign to panic_on_taint.
+
panic_on_warn panic() instead of WARN(). Useful to cause kdump
on a WARN().
@@ -3669,6 +3728,8 @@
may put more devices in an IOMMU group.
force_floating [S390] Force usage of floating interrupts.
nomio [S390] Do not use MIO instructions.
+ norid [S390] ignore the RID field and force use of
+ one PCI domain per PCI function
pcie_aspm= [PCIE] Forcibly enable or disable PCIe Active State Power
Management.
@@ -4210,12 +4271,24 @@
Duration of CPU stall (s) to test RCU CPU stall
warnings, zero to disable.
+ rcutorture.stall_cpu_block= [KNL]
+ Sleep while stalling if set. This will result
+ in warnings from preemptible RCU in addition
+ to any other stall-related activity.
+
rcutorture.stall_cpu_holdoff= [KNL]
Time to wait (s) after boot before inducing stall.
rcutorture.stall_cpu_irqsoff= [KNL]
Disable interrupts while stalling if set.
+ rcutorture.stall_gp_kthread= [KNL]
+ Duration (s) of forced sleep within RCU
+ grace-period kthread to test RCU CPU stall
+ warnings, zero to disable. If both stall_cpu
+ and stall_gp_kthread are specified, the
+ kthread is starved first, then the CPU.
+
rcutorture.stat_interval= [KNL]
Time (s) between statistics printk()s.
@@ -4286,6 +4359,13 @@
only normal grace-period primitives. No effect
on CONFIG_TINY_RCU kernels.
+ rcupdate.rcu_task_ipi_delay= [KNL]
+ Set time in jiffies during which RCU tasks will
+ avoid sending IPIs, starting with the beginning
+ of a given grace period. Setting a large
+ number avoids disturbing real-time workloads,
+ but lengthens grace periods.
+
rcupdate.rcu_task_stall_timeout= [KNL]
Set timeout in jiffies for RCU task stall warning
messages. Disable with a value less than or equal
@@ -4587,9 +4667,9 @@
softlockup_panic=
[KNL] Should the soft-lockup detector generate panics.
- Format: <integer>
+ Format: 0 | 1
- A nonzero value instructs the soft-lockup detector
+ A value of 1 instructs the soft-lockup detector
to panic the machine when a soft-lockup occurs. It is
also controlled by the kernel.softlockup_panic sysctl
and CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC, which is the
@@ -4598,7 +4678,7 @@
softlockup_all_cpu_backtrace=
[KNL] Should the soft-lockup detector generate
backtraces on all cpus.
- Format: <integer>
+ Format: 0 | 1
sonypi.*= [HW] Sony Programmable I/O Control Device driver
See Documentation/admin-guide/laptops/sonypi.rst
@@ -4757,6 +4837,26 @@
the kernel will oops in either "warn" or "fatal"
mode.
+ srbds= [X86,INTEL]
+ Control the Special Register Buffer Data Sampling
+ (SRBDS) mitigation.
+
+ Certain CPUs are vulnerable to an MDS-like
+ exploit which can leak bits from the random
+ number generator.
+
+ By default, this issue is mitigated by
+ microcode. However, the microcode fix can cause
+ the RDRAND and RDSEED instructions to become
+ much slower. Among other effects, this will
+ result in reduced throughput from /dev/urandom.
+
+ The microcode mitigation can be disabled with
+ the following option:
+
+ off: Disable mitigation and remove
+ performance impact to RDRAND and RDSEED
+
srcutree.counter_wrap_check [KNL]
Specifies how frequently to check for
grace-period sequence counter wrap for the
@@ -4891,6 +4991,15 @@
switches= [HW,M68k]
+ sysctl.*= [KNL]
+ Set a sysctl parameter, right before loading the init
+ process, as if the value was written to the respective
+ /proc/sys/... file. Both '.' and '/' are recognized as
+ separators. Unrecognized parameters and invalid values
+ are reported in the kernel log. Sysctls registered
+ later by a loaded module cannot be set this way.
+ Example: sysctl.vm.swappiness=40
+
sysfs.deprecated=0|1 [KNL]
Enable/disable old style sysfs layout for old udev
on older distributions. When this option is enabled
@@ -4910,7 +5019,7 @@
Set the number of tcp_metrics_hash slots.
Default value is 8192 or 16384 depending on total
ram pages. This is used to specify the TCP metrics
- cache size. See Documentation/networking/ip-sysctl.txt
+ cache size. See Documentation/networking/ip-sysctl.rst
"tcp_no_metrics_save" section for more details.
tdfx= [HW,DRM]
@@ -5067,6 +5176,12 @@
interruptions from clocksource watchdog are not
acceptable).
+ tsc_early_khz= [X86] Skip early TSC calibration and use the given
+ value instead. Useful when the early TSC frequency discovery
+ procedure is not reliable, such as on overclocked systems
+ with CPUID.16h support and partial CPUID.15h support.
+ Format: <unsigned int>
+
tsx= [X86] Control Transactional Synchronization
Extensions (TSX) feature in Intel processors that
support TSX control.
@@ -5187,8 +5302,7 @@
usbcore.old_scheme_first=
[USB] Start with the old device initialization
- scheme, applies only to low and full-speed devices
- (default 0 = off).
+ scheme (default 0 = off).
usbcore.usbfs_memory_mb=
[USB] Memory limit (in MB) for buffers allocated by
diff --git a/Documentation/admin-guide/kernel-per-CPU-kthreads.rst b/Documentation/admin-guide/kernel-per-CPU-kthreads.rst
index 21818ac..dc36aeb 100644
--- a/Documentation/admin-guide/kernel-per-CPU-kthreads.rst
+++ b/Documentation/admin-guide/kernel-per-CPU-kthreads.rst
@@ -10,7 +10,7 @@
References
==========
-- Documentation/IRQ-affinity.txt: Binding interrupts to sets of CPUs.
+- Documentation/core-api/irq/irq-affinity.rst: Binding interrupts to sets of CPUs.
- Documentation/admin-guide/cgroup-v1: Using cgroups to bind tasks to sets of CPUs.
diff --git a/Documentation/admin-guide/md.rst b/Documentation/admin-guide/md.rst
index 3c51084..d973d46 100644
--- a/Documentation/admin-guide/md.rst
+++ b/Documentation/admin-guide/md.rst
@@ -5,7 +5,7 @@
---------------------------------
Tools that manage md devices can be found at
- http://www.kernel.org/pub/linux/utils/raid/
+ https://www.kernel.org/pub/linux/utils/raid/
You can boot with your md device with the following kernel command
diff --git a/Documentation/admin-guide/mm/hugetlbpage.rst b/Documentation/admin-guide/mm/hugetlbpage.rst
index 1cc0bc7..5026e58 100644
--- a/Documentation/admin-guide/mm/hugetlbpage.rst
+++ b/Documentation/admin-guide/mm/hugetlbpage.rst
@@ -100,6 +100,41 @@
be specified in bytes with optional scale suffix [kKmMgG]. The default huge
page size may be selected with the "default_hugepagesz=<size>" boot parameter.
+Hugetlb boot command line parameter semantics
+hugepagesz - Specify a huge page size. Used in conjunction with hugepages
+ parameter to preallocate a number of huge pages of the specified
+ size. Hence, hugepagesz and hugepages are typically specified in
+ pairs such as:
+ hugepagesz=2M hugepages=512
+ hugepagesz can only be specified once on the command line for a
+ specific huge page size. Valid huge page sizes are architecture
+ dependent.
+hugepages - Specify the number of huge pages to preallocate. This typically
+ follows a valid hugepagesz or default_hugepagesz parameter. However,
+ if hugepages is the first or only hugetlb command line parameter it
+ implicitly specifies the number of huge pages of default size to
+ allocate. If the number of huge pages of default size is implicitly
+ specified, it can not be overwritten by a hugepagesz,hugepages
+ parameter pair for the default size.
+ For example, on an architecture with 2M default huge page size:
+ hugepages=256 hugepagesz=2M hugepages=512
+ will result in 256 2M huge pages being allocated and a warning message
+ indicating that the hugepages=512 parameter is ignored. If a hugepages
+ parameter is preceded by an invalid hugepagesz parameter, it will
+ be ignored.
+default_hugepagesz - Specify the default huge page size. This parameter can
+ only be specified once on the command line. default_hugepagesz can
+ optionally be followed by the hugepages parameter to preallocate a
+ specific number of huge pages of default size. The number of default
+ sized huge pages to preallocate can also be implicitly specified as
+ mentioned in the hugepages section above. Therefore, on an
+ architecture with 2M default huge page size:
+ hugepages=256
+ default_hugepagesz=2M hugepages=256
+ hugepages=256 default_hugepagesz=2M
+ will all result in 256 2M huge pages being allocated. Valid default
+ huge page size is architecture dependent.
+
When multiple huge page sizes are supported, ``/proc/sys/vm/nr_hugepages``
indicates the current number of pre-allocated huge pages of the default size.
Thus, one can use the following command to dynamically allocate/deallocate
diff --git a/Documentation/admin-guide/mm/numa_memory_policy.rst b/Documentation/admin-guide/mm/numa_memory_policy.rst
index 8463f55..067a90a 100644
--- a/Documentation/admin-guide/mm/numa_memory_policy.rst
+++ b/Documentation/admin-guide/mm/numa_memory_policy.rst
@@ -364,19 +364,19 @@
2) for querying the policy, we do not need to take an extra reference on the
target task's task policy nor vma policies because we always acquire the
- task's mm's mmap_sem for read during the query. The set_mempolicy() and
- mbind() APIs [see below] always acquire the mmap_sem for write when
+ task's mm's mmap_lock for read during the query. The set_mempolicy() and
+ mbind() APIs [see below] always acquire the mmap_lock for write when
installing or replacing task or vma policies. Thus, there is no possibility
of a task or thread freeing a policy while another task or thread is
querying it.
3) Page allocation usage of task or vma policy occurs in the fault path where
- we hold them mmap_sem for read. Again, because replacing the task or vma
- policy requires that the mmap_sem be held for write, the policy can't be
+ we hold them mmap_lock for read. Again, because replacing the task or vma
+ policy requires that the mmap_lock be held for write, the policy can't be
freed out from under us while we're using it for page allocation.
4) Shared policies require special consideration. One task can replace a
- shared memory policy while another task, with a distinct mmap_sem, is
+ shared memory policy while another task, with a distinct mmap_lock, is
querying or allocating a page based on the policy. To resolve this
potential race, the shared policy infrastructure adds an extra reference
to the shared policy during lookup while holding a spin lock on the shared
diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
index 2f31de8..6a233e4 100644
--- a/Documentation/admin-guide/mm/transhuge.rst
+++ b/Documentation/admin-guide/mm/transhuge.rst
@@ -220,6 +220,13 @@
collapsed, resulting fewer pages being collapsed into
THPs, and lower memory access performance.
+``max_ptes_shared`` specifies how many pages can be shared across multiple
+processes. Exceeding the number would block the collapse::
+
+ /sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_shared
+
+A higher value may increase memory footprint for some workloads.
+
Boot parameter
==============
diff --git a/Documentation/admin-guide/mm/userfaultfd.rst b/Documentation/admin-guide/mm/userfaultfd.rst
index c30176e..1dc2d5f 100644
--- a/Documentation/admin-guide/mm/userfaultfd.rst
+++ b/Documentation/admin-guide/mm/userfaultfd.rst
@@ -12,107 +12,107 @@
memory page faults, something otherwise only the kernel code could do.
For example userfaults allows a proper and more optimal implementation
-of the PROT_NONE+SIGSEGV trick.
+of the ``PROT_NONE+SIGSEGV`` trick.
Design
======
-Userfaults are delivered and resolved through the userfaultfd syscall.
+Userfaults are delivered and resolved through the ``userfaultfd`` syscall.
-The userfaultfd (aside from registering and unregistering virtual
+The ``userfaultfd`` (aside from registering and unregistering virtual
memory ranges) provides two primary functionalities:
-1) read/POLLIN protocol to notify a userland thread of the faults
+1) ``read/POLLIN`` protocol to notify a userland thread of the faults
happening
-2) various UFFDIO_* ioctls that can manage the virtual memory regions
- registered in the userfaultfd that allows userland to efficiently
+2) various ``UFFDIO_*`` ioctls that can manage the virtual memory regions
+ registered in the ``userfaultfd`` that allows userland to efficiently
resolve the userfaults it receives via 1) or to manage the virtual
memory in the background
The real advantage of userfaults if compared to regular virtual memory
management of mremap/mprotect is that the userfaults in all their
operations never involve heavyweight structures like vmas (in fact the
-userfaultfd runtime load never takes the mmap_sem for writing).
+``userfaultfd`` runtime load never takes the mmap_lock for writing).
Vmas are not suitable for page- (or hugepage) granular fault tracking
when dealing with virtual address spaces that could span
Terabytes. Too many vmas would be needed for that.
-The userfaultfd once opened by invoking the syscall, can also be
+The ``userfaultfd`` once opened by invoking the syscall, can also be
passed using unix domain sockets to a manager process, so the same
manager process could handle the userfaults of a multitude of
different processes without them being aware about what is going on
-(well of course unless they later try to use the userfaultfd
+(well of course unless they later try to use the ``userfaultfd``
themselves on the same region the manager is already tracking, which
-is a corner case that would currently return -EBUSY).
+is a corner case that would currently return ``-EBUSY``).
API
===
-When first opened the userfaultfd must be enabled invoking the
-UFFDIO_API ioctl specifying a uffdio_api.api value set to UFFD_API (or
-a later API version) which will specify the read/POLLIN protocol
-userland intends to speak on the UFFD and the uffdio_api.features
-userland requires. The UFFDIO_API ioctl if successful (i.e. if the
-requested uffdio_api.api is spoken also by the running kernel and the
+When first opened the ``userfaultfd`` must be enabled invoking the
+``UFFDIO_API`` ioctl specifying a ``uffdio_api.api`` value set to ``UFFD_API`` (or
+a later API version) which will specify the ``read/POLLIN`` protocol
+userland intends to speak on the ``UFFD`` and the ``uffdio_api.features``
+userland requires. The ``UFFDIO_API`` ioctl if successful (i.e. if the
+requested ``uffdio_api.api`` is spoken also by the running kernel and the
requested features are going to be enabled) will return into
-uffdio_api.features and uffdio_api.ioctls two 64bit bitmasks of
+``uffdio_api.features`` and ``uffdio_api.ioctls`` two 64bit bitmasks of
respectively all the available features of the read(2) protocol and
the generic ioctl available.
-The uffdio_api.features bitmask returned by the UFFDIO_API ioctl
-defines what memory types are supported by the userfaultfd and what
+The ``uffdio_api.features`` bitmask returned by the ``UFFDIO_API`` ioctl
+defines what memory types are supported by the ``userfaultfd`` and what
events, except page fault notifications, may be generated.
-If the kernel supports registering userfaultfd ranges on hugetlbfs
-virtual memory areas, UFFD_FEATURE_MISSING_HUGETLBFS will be set in
-uffdio_api.features. Similarly, UFFD_FEATURE_MISSING_SHMEM will be
-set if the kernel supports registering userfaultfd ranges on shared
-memory (covering all shmem APIs, i.e. tmpfs, IPCSHM, /dev/zero
-MAP_SHARED, memfd_create, etc).
+If the kernel supports registering ``userfaultfd`` ranges on hugetlbfs
+virtual memory areas, ``UFFD_FEATURE_MISSING_HUGETLBFS`` will be set in
+``uffdio_api.features``. Similarly, ``UFFD_FEATURE_MISSING_SHMEM`` will be
+set if the kernel supports registering ``userfaultfd`` ranges on shared
+memory (covering all shmem APIs, i.e. tmpfs, ``IPCSHM``, ``/dev/zero``,
+``MAP_SHARED``, ``memfd_create``, etc).
-The userland application that wants to use userfaultfd with hugetlbfs
+The userland application that wants to use ``userfaultfd`` with hugetlbfs
or shared memory need to set the corresponding flag in
-uffdio_api.features to enable those features.
+``uffdio_api.features`` to enable those features.
If the userland desires to receive notifications for events other than
-page faults, it has to verify that uffdio_api.features has appropriate
-UFFD_FEATURE_EVENT_* bits set. These events are described in more
-detail below in "Non-cooperative userfaultfd" section.
+page faults, it has to verify that ``uffdio_api.features`` has appropriate
+``UFFD_FEATURE_EVENT_*`` bits set. These events are described in more
+detail below in `Non-cooperative userfaultfd`_ section.
-Once the userfaultfd has been enabled the UFFDIO_REGISTER ioctl should
-be invoked (if present in the returned uffdio_api.ioctls bitmask) to
-register a memory range in the userfaultfd by setting the
-uffdio_register structure accordingly. The uffdio_register.mode
+Once the ``userfaultfd`` has been enabled the ``UFFDIO_REGISTER`` ioctl should
+be invoked (if present in the returned ``uffdio_api.ioctls`` bitmask) to
+register a memory range in the ``userfaultfd`` by setting the
+uffdio_register structure accordingly. The ``uffdio_register.mode``
bitmask will specify to the kernel which kind of faults to track for
-the range (UFFDIO_REGISTER_MODE_MISSING would track missing
-pages). The UFFDIO_REGISTER ioctl will return the
-uffdio_register.ioctls bitmask of ioctls that are suitable to resolve
+the range (``UFFDIO_REGISTER_MODE_MISSING`` would track missing
+pages). The ``UFFDIO_REGISTER`` ioctl will return the
+``uffdio_register.ioctls`` bitmask of ioctls that are suitable to resolve
userfaults on the range registered. Not all ioctls will necessarily be
supported for all memory types depending on the underlying virtual
memory backend (anonymous memory vs tmpfs vs real filebacked
mappings).
-Userland can use the uffdio_register.ioctls to manage the virtual
+Userland can use the ``uffdio_register.ioctls`` to manage the virtual
address space in the background (to add or potentially also remove
-memory from the userfaultfd registered range). This means a userfault
+memory from the ``userfaultfd`` registered range). This means a userfault
could be triggering just before userland maps in the background the
user-faulted page.
-The primary ioctl to resolve userfaults is UFFDIO_COPY. That
+The primary ioctl to resolve userfaults is ``UFFDIO_COPY``. That
atomically copies a page into the userfault registered range and wakes
-up the blocked userfaults (unless uffdio_copy.mode &
-UFFDIO_COPY_MODE_DONTWAKE is set). Other ioctl works similarly to
-UFFDIO_COPY. They're atomic as in guaranteeing that nothing can see an
-half copied page since it'll keep userfaulting until the copy has
-finished.
+up the blocked userfaults
+(unless ``uffdio_copy.mode & UFFDIO_COPY_MODE_DONTWAKE`` is set).
+Other ioctl works similarly to ``UFFDIO_COPY``. They're atomic as in
+guaranteeing that nothing can see an half copied page since it'll
+keep userfaulting until the copy has finished.
Notes:
-- If you requested UFFDIO_REGISTER_MODE_MISSING when registering then
+- If you requested ``UFFDIO_REGISTER_MODE_MISSING`` when registering then
you must provide some kind of page in your thread after reading from
- the uffd. You must provide either UFFDIO_COPY or UFFDIO_ZEROPAGE.
+ the uffd. You must provide either ``UFFDIO_COPY`` or ``UFFDIO_ZEROPAGE``.
The normal behavior of the OS automatically providing a zero page on
an annonymous mmaping is not in place.
@@ -122,13 +122,13 @@
- You get the address of the access that triggered the missing page
event out of a struct uffd_msg that you read in the thread from the
- uffd. You can supply as many pages as you want with UFFDIO_COPY or
- UFFDIO_ZEROPAGE. Keep in mind that unless you used DONTWAKE then
+ uffd. You can supply as many pages as you want with ``UFFDIO_COPY`` or
+ ``UFFDIO_ZEROPAGE``. Keep in mind that unless you used DONTWAKE then
the first of any of those IOCTLs wakes up the faulting thread.
-- Be sure to test for all errors including (pollfd[0].revents &
- POLLERR). This can happen, e.g. when ranges supplied were
- incorrect.
+- Be sure to test for all errors including
+ (``pollfd[0].revents & POLLERR``). This can happen, e.g. when ranges
+ supplied were incorrect.
Write Protect Notifications
---------------------------
@@ -136,41 +136,42 @@
This is equivalent to (but faster than) using mprotect and a SIGSEGV
signal handler.
-Firstly you need to register a range with UFFDIO_REGISTER_MODE_WP.
-Instead of using mprotect(2) you use ioctl(uffd, UFFDIO_WRITEPROTECT,
-struct *uffdio_writeprotect) while mode = UFFDIO_WRITEPROTECT_MODE_WP
+Firstly you need to register a range with ``UFFDIO_REGISTER_MODE_WP``.
+Instead of using mprotect(2) you use
+``ioctl(uffd, UFFDIO_WRITEPROTECT, struct *uffdio_writeprotect)``
+while ``mode = UFFDIO_WRITEPROTECT_MODE_WP``
in the struct passed in. The range does not default to and does not
have to be identical to the range you registered with. You can write
protect as many ranges as you like (inside the registered range).
Then, in the thread reading from uffd the struct will have
-msg.arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WP set. Now you send
-ioctl(uffd, UFFDIO_WRITEPROTECT, struct *uffdio_writeprotect) again
-while pagefault.mode does not have UFFDIO_WRITEPROTECT_MODE_WP set.
-This wakes up the thread which will continue to run with writes. This
+``msg.arg.pagefault.flags & UFFD_PAGEFAULT_FLAG_WP`` set. Now you send
+``ioctl(uffd, UFFDIO_WRITEPROTECT, struct *uffdio_writeprotect)``
+again while ``pagefault.mode`` does not have ``UFFDIO_WRITEPROTECT_MODE_WP``
+set. This wakes up the thread which will continue to run with writes. This
allows you to do the bookkeeping about the write in the uffd reading
thread before the ioctl.
-If you registered with both UFFDIO_REGISTER_MODE_MISSING and
-UFFDIO_REGISTER_MODE_WP then you need to think about the sequence in
+If you registered with both ``UFFDIO_REGISTER_MODE_MISSING`` and
+``UFFDIO_REGISTER_MODE_WP`` then you need to think about the sequence in
which you supply a page and undo write protect. Note that there is a
difference between writes into a WP area and into a !WP area. The
-former will have UFFD_PAGEFAULT_FLAG_WP set, the latter
-UFFD_PAGEFAULT_FLAG_WRITE. The latter did not fail on protection but
-you still need to supply a page when UFFDIO_REGISTER_MODE_MISSING was
+former will have ``UFFD_PAGEFAULT_FLAG_WP`` set, the latter
+``UFFD_PAGEFAULT_FLAG_WRITE``. The latter did not fail on protection but
+you still need to supply a page when ``UFFDIO_REGISTER_MODE_MISSING`` was
used.
QEMU/KVM
========
-QEMU/KVM is using the userfaultfd syscall to implement postcopy live
+QEMU/KVM is using the ``userfaultfd`` syscall to implement postcopy live
migration. Postcopy live migration is one form of memory
externalization consisting of a virtual machine running with part or
all of its memory residing on a different node in the cloud. The
-userfaultfd abstraction is generic enough that not a single line of
+``userfaultfd`` abstraction is generic enough that not a single line of
KVM kernel code had to be modified in order to add postcopy live
migration to QEMU.
-Guest async page faults, FOLL_NOWAIT and all other GUP features work
+Guest async page faults, ``FOLL_NOWAIT`` and all other ``GUP*`` features work
just fine in combination with userfaults. Userfaults trigger async
page faults in the guest scheduler so those guest processes that
aren't waiting for userfaults (i.e. network bound) can keep running in
@@ -183,19 +184,19 @@
The implementation of postcopy live migration currently uses one
single bidirectional socket but in the future two different sockets
will be used (to reduce the latency of the userfaults to the minimum
-possible without having to decrease /proc/sys/net/ipv4/tcp_wmem).
+possible without having to decrease ``/proc/sys/net/ipv4/tcp_wmem``).
The QEMU in the source node writes all pages that it knows are missing
in the destination node, into the socket, and the migration thread of
-the QEMU running in the destination node runs UFFDIO_COPY|ZEROPAGE
-ioctls on the userfaultfd in order to map the received pages into the
-guest (UFFDIO_ZEROCOPY is used if the source page was a zero page).
+the QEMU running in the destination node runs ``UFFDIO_COPY|ZEROPAGE``
+ioctls on the ``userfaultfd`` in order to map the received pages into the
+guest (``UFFDIO_ZEROCOPY`` is used if the source page was a zero page).
A different postcopy thread in the destination node listens with
-poll() to the userfaultfd in parallel. When a POLLIN event is
+poll() to the ``userfaultfd`` in parallel. When a ``POLLIN`` event is
generated after a userfault triggers, the postcopy thread read() from
-the userfaultfd and receives the fault address (or -EAGAIN in case the
-userfault was already resolved and waken by a UFFDIO_COPY|ZEROPAGE run
+the ``userfaultfd`` and receives the fault address (or ``-EAGAIN`` in case the
+userfault was already resolved and waken by a ``UFFDIO_COPY|ZEROPAGE`` run
by the parallel QEMU migration thread).
After the QEMU postcopy thread (running in the destination node) gets
@@ -206,7 +207,7 @@
(just the time to flush the tcp_wmem queue through the network) the
migration thread in the QEMU running in the destination node will
receive the page that triggered the userfault and it'll map it as
-usual with the UFFDIO_COPY|ZEROPAGE (without actually knowing if it
+usual with the ``UFFDIO_COPY|ZEROPAGE`` (without actually knowing if it
was spontaneously sent by the source or if it was an urgent page
requested through a userfault).
@@ -219,74 +220,74 @@
over it when receiving incoming userfaults. After sending each page of
course the bitmap is updated accordingly. It's also useful to avoid
sending the same page twice (in case the userfault is read by the
-postcopy thread just before UFFDIO_COPY|ZEROPAGE runs in the migration
+postcopy thread just before ``UFFDIO_COPY|ZEROPAGE`` runs in the migration
thread).
Non-cooperative userfaultfd
===========================
-When the userfaultfd is monitored by an external manager, the manager
+When the ``userfaultfd`` is monitored by an external manager, the manager
must be able to track changes in the process virtual memory
layout. Userfaultfd can notify the manager about such changes using
the same read(2) protocol as for the page fault notifications. The
manager has to explicitly enable these events by setting appropriate
-bits in uffdio_api.features passed to UFFDIO_API ioctl:
+bits in ``uffdio_api.features`` passed to ``UFFDIO_API`` ioctl:
-UFFD_FEATURE_EVENT_FORK
- enable userfaultfd hooks for fork(). When this feature is
- enabled, the userfaultfd context of the parent process is
+``UFFD_FEATURE_EVENT_FORK``
+ enable ``userfaultfd`` hooks for fork(). When this feature is
+ enabled, the ``userfaultfd`` context of the parent process is
duplicated into the newly created process. The manager
- receives UFFD_EVENT_FORK with file descriptor of the new
- userfaultfd context in the uffd_msg.fork.
+ receives ``UFFD_EVENT_FORK`` with file descriptor of the new
+ ``userfaultfd`` context in the ``uffd_msg.fork``.
-UFFD_FEATURE_EVENT_REMAP
+``UFFD_FEATURE_EVENT_REMAP``
enable notifications about mremap() calls. When the
non-cooperative process moves a virtual memory area to a
different location, the manager will receive
- UFFD_EVENT_REMAP. The uffd_msg.remap will contain the old and
+ ``UFFD_EVENT_REMAP``. The ``uffd_msg.remap`` will contain the old and
new addresses of the area and its original length.
-UFFD_FEATURE_EVENT_REMOVE
+``UFFD_FEATURE_EVENT_REMOVE``
enable notifications about madvise(MADV_REMOVE) and
- madvise(MADV_DONTNEED) calls. The event UFFD_EVENT_REMOVE will
- be generated upon these calls to madvise. The uffd_msg.remove
+ madvise(MADV_DONTNEED) calls. The event ``UFFD_EVENT_REMOVE`` will
+ be generated upon these calls to madvise(). The ``uffd_msg.remove``
will contain start and end addresses of the removed area.
-UFFD_FEATURE_EVENT_UNMAP
+``UFFD_FEATURE_EVENT_UNMAP``
enable notifications about memory unmapping. The manager will
- get UFFD_EVENT_UNMAP with uffd_msg.remove containing start and
+ get ``UFFD_EVENT_UNMAP`` with ``uffd_msg.remove`` containing start and
end addresses of the unmapped area.
-Although the UFFD_FEATURE_EVENT_REMOVE and UFFD_FEATURE_EVENT_UNMAP
+Although the ``UFFD_FEATURE_EVENT_REMOVE`` and ``UFFD_FEATURE_EVENT_UNMAP``
are pretty similar, they quite differ in the action expected from the
-userfaultfd manager. In the former case, the virtual memory is
+``userfaultfd`` manager. In the former case, the virtual memory is
removed, but the area is not, the area remains monitored by the
-userfaultfd, and if a page fault occurs in that area it will be
+``userfaultfd``, and if a page fault occurs in that area it will be
delivered to the manager. The proper resolution for such page fault is
to zeromap the faulting address. However, in the latter case, when an
area is unmapped, either explicitly (with munmap() system call), or
implicitly (e.g. during mremap()), the area is removed and in turn the
-userfaultfd context for such area disappears too and the manager will
+``userfaultfd`` context for such area disappears too and the manager will
not get further userland page faults from the removed area. Still, the
notification is required in order to prevent manager from using
-UFFDIO_COPY on the unmapped area.
+``UFFDIO_COPY`` on the unmapped area.
Unlike userland page faults which have to be synchronous and require
explicit or implicit wakeup, all the events are delivered
asynchronously and the non-cooperative process resumes execution as
-soon as manager executes read(). The userfaultfd manager should
-carefully synchronize calls to UFFDIO_COPY with the events
-processing. To aid the synchronization, the UFFDIO_COPY ioctl will
-return -ENOSPC when the monitored process exits at the time of
-UFFDIO_COPY, and -ENOENT, when the non-cooperative process has changed
-its virtual memory layout simultaneously with outstanding UFFDIO_COPY
+soon as manager executes read(). The ``userfaultfd`` manager should
+carefully synchronize calls to ``UFFDIO_COPY`` with the events
+processing. To aid the synchronization, the ``UFFDIO_COPY`` ioctl will
+return ``-ENOSPC`` when the monitored process exits at the time of
+``UFFDIO_COPY``, and ``-ENOENT``, when the non-cooperative process has changed
+its virtual memory layout simultaneously with outstanding ``UFFDIO_COPY``
operation.
The current asynchronous model of the event delivery is optimal for
-single threaded non-cooperative userfaultfd manager implementations. A
+single threaded non-cooperative ``userfaultfd`` manager implementations. A
synchronous event delivery model can be added later as a new
-userfaultfd feature to facilitate multithreading enhancements of the
-non cooperative manager, for example to allow UFFDIO_COPY ioctls to
+``userfaultfd`` feature to facilitate multithreading enhancements of the
+non cooperative manager, for example to allow ``UFFDIO_COPY`` ioctls to
run in parallel to the event reception. Single threaded
implementations should continue to use the current async event
delivery model instead.
diff --git a/Documentation/admin-guide/mono.rst b/Documentation/admin-guide/mono.rst
index 59e6d59..c6dab56 100644
--- a/Documentation/admin-guide/mono.rst
+++ b/Documentation/admin-guide/mono.rst
@@ -12,11 +12,11 @@
a binary package, a source tarball or by installing from Git. Binary
packages for several distributions can be found at:
- http://www.mono-project.com/download/
+ https://www.mono-project.com/download/
Instructions for compiling Mono can be found at:
- http://www.mono-project.com/docs/compiling-mono/linux/
+ https://www.mono-project.com/docs/compiling-mono/linux/
Once the Mono CLR support has been installed, just check that
``/usr/bin/mono`` (which could be located elsewhere, for example
diff --git a/Documentation/admin-guide/nfs/nfsroot.rst b/Documentation/admin-guide/nfs/nfsroot.rst
index 82a4fda..c677207 100644
--- a/Documentation/admin-guide/nfs/nfsroot.rst
+++ b/Documentation/admin-guide/nfs/nfsroot.rst
@@ -18,7 +18,7 @@
In order to use a diskless system, such as an X-terminal or printer server for
example, it is necessary for the root filesystem to be present on a non-disk
device. This may be an initramfs (see
-Documentation/filesystems/ramfs-rootfs-initramfs.txt), a ramdisk (see
+Documentation/filesystems/ramfs-rootfs-initramfs.rst), a ramdisk (see
Documentation/admin-guide/initrd.rst) or a filesystem mounted via NFS. The
following text describes on how to use NFS for the root filesystem. For the rest
of this text 'client' means the diskless system, and 'server' means the NFS
diff --git a/Documentation/admin-guide/numastat.rst b/Documentation/admin-guide/numastat.rst
index aaf1667..08ec2c2 100644
--- a/Documentation/admin-guide/numastat.rst
+++ b/Documentation/admin-guide/numastat.rst
@@ -6,6 +6,21 @@
All units are pages. Hugepages have separate counters.
+The numa_hit, numa_miss and numa_foreign counters reflect how well processes
+are able to allocate memory from nodes they prefer. If they succeed, numa_hit
+is incremented on the preferred node, otherwise numa_foreign is incremented on
+the preferred node and numa_miss on the node where allocation succeeded.
+
+Usually preferred node is the one local to the CPU where the process executes,
+but restrictions such as mempolicies can change that, so there are also two
+counters based on CPU local node. local_node is similar to numa_hit and is
+incremented on allocation from a node by CPU on the same node. other_node is
+similar to numa_miss and is incremented on the node where allocation succeeds
+from a CPU from a different node. Note there is no counter analogical to
+numa_foreign.
+
+In more detail:
+
=============== ============================================================
numa_hit A process wanted to allocate memory from this node,
and succeeded.
@@ -14,11 +29,13 @@
but ended up with memory from this node.
numa_foreign A process wanted to allocate on this node,
- but ended up with memory from another one.
+ but ended up with memory from another node.
-local_node A process ran on this node and got memory from it.
+local_node A process ran on this node's CPU,
+ and got memory from this node.
-other_node A process ran on this node and got memory from another node.
+other_node A process ran on a different node's CPU
+ and got memory from this node.
interleave_hit Interleaving wanted to allocate from this node
and succeeded.
@@ -28,3 +45,11 @@
(http://oss.sgi.com/projects/libnuma/). Note that it only works
well right now on machines with a small number of CPUs.
+Note that on systems with memoryless nodes (where a node has CPUs but no
+memory) the numa_hit, numa_miss and numa_foreign statistics can be skewed
+heavily. In the current kernel implementation, if a process prefers a
+memoryless node (i.e. because it is running on one of its local CPU), the
+implementation actually treats one of the nearest nodes with memory as the
+preferred node. As a result, such allocation will not increase the numa_foreign
+counter on the memoryless node, and will skew the numa_hit, numa_miss and
+numa_foreign statistics of the nearest node.
diff --git a/Documentation/admin-guide/perf-security.rst b/Documentation/admin-guide/perf-security.rst
index 72effa7..1307b52 100644
--- a/Documentation/admin-guide/perf-security.rst
+++ b/Documentation/admin-guide/perf-security.rst
@@ -1,6 +1,6 @@
.. _perf_security:
-Perf Events and tool security
+Perf events and tool security
=============================
Overview
@@ -42,11 +42,11 @@
Data that belong to the fourth category can potentially contain
sensitive process data. If PMUs in some monitoring modes capture values
of execution context registers or data from process memory then access
-to such monitoring capabilities requires to be ordered and secured
-properly. So, perf_events/Perf performance monitoring is the subject for
-security access control management [5]_ .
+to such monitoring modes requires to be ordered and secured properly.
+So, perf_events performance monitoring and observability operations are
+the subject for security access control management [5]_ .
-perf_events/Perf access control
+perf_events access control
-------------------------------
To perform security checks, the Linux implementation splits processes
@@ -66,11 +66,25 @@
independently enabled and disabled on per-thread basis for processes and
files of unprivileged users.
-Unprivileged processes with enabled CAP_SYS_ADMIN capability are treated
+Unprivileged processes with enabled CAP_PERFMON capability are treated
as privileged processes with respect to perf_events performance
-monitoring and bypass *scope* permissions checks in the kernel.
+monitoring and observability operations, thus, bypass *scope* permissions
+checks in the kernel. CAP_PERFMON implements the principle of least
+privilege [13]_ (POSIX 1003.1e: 2.2.2.39) for performance monitoring and
+observability operations in the kernel and provides a secure approach to
+perfomance monitoring and observability in the system.
-Unprivileged processes using perf_events system call API is also subject
+For backward compatibility reasons the access to perf_events monitoring and
+observability operations is also open for CAP_SYS_ADMIN privileged
+processes but CAP_SYS_ADMIN usage for secure monitoring and observability
+use cases is discouraged with respect to the CAP_PERFMON capability.
+If system audit records [14]_ for a process using perf_events system call
+API contain denial records of acquiring both CAP_PERFMON and CAP_SYS_ADMIN
+capabilities then providing the process with CAP_PERFMON capability singly
+is recommended as the preferred secure approach to resolve double access
+denial logging related to usage of performance monitoring and observability.
+
+Unprivileged processes using perf_events system call are also subject
for PTRACE_MODE_READ_REALCREDS ptrace access mode check [7]_ , whose
outcome determines whether monitoring is permitted. So unprivileged
processes provided with CAP_SYS_PTRACE capability are effectively
@@ -82,14 +96,14 @@
CAP_SYSLOG capability permits reading kernel space memory addresses from
/proc/kallsyms file.
-perf_events/Perf privileged users
+Privileged Perf users groups
---------------------------------
Mechanisms of capabilities, privileged capability-dumb files [6]_ and
-file system ACLs [10]_ can be used to create a dedicated group of
-perf_events/Perf privileged users who are permitted to execute
-performance monitoring without scope limits. The following steps can be
-taken to create such a group of privileged Perf users.
+file system ACLs [10]_ can be used to create dedicated groups of
+privileged Perf users who are permitted to execute performance monitoring
+and observability without scope limits. The following steps can be
+taken to create such groups of privileged Perf users.
1. Create perf_users group of privileged Perf users, assign perf_users
group to Perf tool executable and limit access to the executable for
@@ -108,30 +122,51 @@
-rwxr-x--- 2 root perf_users 11M Oct 19 15:12 perf
2. Assign the required capabilities to the Perf tool executable file and
- enable members of perf_users group with performance monitoring
+ enable members of perf_users group with monitoring and observability
privileges [6]_ :
::
- # setcap "cap_sys_admin,cap_sys_ptrace,cap_syslog=ep" perf
- # setcap -v "cap_sys_admin,cap_sys_ptrace,cap_syslog=ep" perf
+ # setcap "cap_perfmon,cap_sys_ptrace,cap_syslog=ep" perf
+ # setcap -v "cap_perfmon,cap_sys_ptrace,cap_syslog=ep" perf
perf: OK
# getcap perf
- perf = cap_sys_ptrace,cap_sys_admin,cap_syslog+ep
+ perf = cap_sys_ptrace,cap_syslog,cap_perfmon+ep
+
+If the libcap installed doesn't yet support "cap_perfmon", use "38" instead,
+i.e.:
+
+::
+
+ # setcap "38,cap_ipc_lock,cap_sys_ptrace,cap_syslog=ep" perf
+
+Note that you may need to have 'cap_ipc_lock' in the mix for tools such as
+'perf top', alternatively use 'perf top -m N', to reduce the memory that
+it uses for the perf ring buffer, see the memory allocation section below.
+
+Using a libcap without support for CAP_PERFMON will make cap_get_flag(caps, 38,
+CAP_EFFECTIVE, &val) fail, which will lead the default event to be 'cycles:u',
+so as a workaround explicitly ask for the 'cycles' event, i.e.:
+
+::
+
+ # perf top -e cycles
+
+To get kernel and user samples with a perf binary with just CAP_PERFMON.
As a result, members of perf_users group are capable of conducting
-performance monitoring by using functionality of the configured Perf
-tool executable that, when executes, passes perf_events subsystem scope
-checks.
+performance monitoring and observability by using functionality of the
+configured Perf tool executable that, when executes, passes perf_events
+subsystem scope checks.
This specific access control management is only available to superuser
or root running processes with CAP_SETPCAP, CAP_SETFCAP [6]_
capabilities.
-perf_events/Perf unprivileged users
+Unprivileged users
-----------------------------------
-perf_events/Perf *scope* and *access* control for unprivileged processes
+perf_events *scope* and *access* control for unprivileged processes
is governed by perf_event_paranoid [2]_ setting:
-1:
@@ -166,7 +201,7 @@
perf_event_mlock_kb locking limit is imposed but ignored for
unprivileged processes with CAP_IPC_LOCK capability.
-perf_events/Perf resource control
+Resource control
---------------------------------
Open file descriptors
@@ -227,4 +262,5 @@
.. [10] `<http://man7.org/linux/man-pages/man5/acl.5.html>`_
.. [11] `<http://man7.org/linux/man-pages/man2/getrlimit.2.html>`_
.. [12] `<http://man7.org/linux/man-pages/man5/limits.conf.5.html>`_
-
+.. [13] `<https://sites.google.com/site/fullycapable>`_
+.. [14] `<http://man7.org/linux/man-pages/man8/auditd.8.html>`_
diff --git a/Documentation/admin-guide/pm/cpuidle.rst b/Documentation/admin-guide/pm/cpuidle.rst
index 5605cc6..a96a423 100644
--- a/Documentation/admin-guide/pm/cpuidle.rst
+++ b/Documentation/admin-guide/pm/cpuidle.rst
@@ -159,17 +159,15 @@
and that is the primary reason for having more than one governor in the
``CPUIdle`` subsystem.
-There are three ``CPUIdle`` governors available, ``menu``, `TEO <teo-gov_>`_
-and ``ladder``. Which of them is used by default depends on the configuration
-of the kernel and in particular on whether or not the scheduler tick can be
-`stopped by the idle loop <idle-cpus-and-tick_>`_. It is possible to change the
-governor at run time if the ``cpuidle_sysfs_switch`` command line parameter has
-been passed to the kernel, but that is not safe in general, so it should not be
-done on production systems (that may change in the future, though). The name of
-the ``CPUIdle`` governor currently used by the kernel can be read from the
-:file:`current_governor_ro` (or :file:`current_governor` if
-``cpuidle_sysfs_switch`` is present in the kernel command line) file under
-:file:`/sys/devices/system/cpu/cpuidle/` in ``sysfs``.
+There are four ``CPUIdle`` governors available, ``menu``, `TEO <teo-gov_>`_,
+``ladder`` and ``haltpoll``. Which of them is used by default depends on the
+configuration of the kernel and in particular on whether or not the scheduler
+tick can be `stopped by the idle loop <idle-cpus-and-tick_>`_. Available
+governors can be read from the :file:`available_governors`, and the governor
+can be changed at runtime. The name of the ``CPUIdle`` governor currently
+used by the kernel can be read from the :file:`current_governor_ro` or
+:file:`current_governor` file under :file:`/sys/devices/system/cpu/cpuidle/`
+in ``sysfs``.
Which ``CPUIdle`` driver is used, on the other hand, usually depends on the
platform the kernel is running on, but there are platforms with more than one
diff --git a/Documentation/admin-guide/pm/intel-speed-select.rst b/Documentation/admin-guide/pm/intel-speed-select.rst
new file mode 100644
index 0000000..b2ca601
--- /dev/null
+++ b/Documentation/admin-guide/pm/intel-speed-select.rst
@@ -0,0 +1,917 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+============================================================
+Intel(R) Speed Select Technology User Guide
+============================================================
+
+The Intel(R) Speed Select Technology (Intel(R) SST) provides a powerful new
+collection of features that give more granular control over CPU performance.
+With Intel(R) SST, one server can be configured for power and performance for a
+variety of diverse workload requirements.
+
+Refer to the links below for an overview of the technology:
+
+- https://www.intel.com/content/www/us/en/architecture-and-technology/speed-select-technology-article.html
+- https://builders.intel.com/docs/networkbuilders/intel-speed-select-technology-base-frequency-enhancing-performance.pdf
+
+These capabilities are further enhanced in some of the newer generations of
+server platforms where these features can be enumerated and controlled
+dynamically without pre-configuring via BIOS setup options. This dynamic
+configuration is done via mailbox commands to the hardware. One way to enumerate
+and configure these features is by using the Intel Speed Select utility.
+
+This document explains how to use the Intel Speed Select tool to enumerate and
+control Intel(R) SST features. This document gives example commands and explains
+how these commands change the power and performance profile of the system under
+test. Using this tool as an example, customers can replicate the messaging
+implemented in the tool in their production software.
+
+intel-speed-select configuration tool
+======================================
+
+Most Linux distribution packages may include the "intel-speed-select" tool. If not,
+it can be built by downloading the Linux kernel tree from kernel.org. Once
+downloaded, the tool can be built without building the full kernel.
+
+From the kernel tree, run the following commands::
+
+# cd tools/power/x86/intel-speed-select/
+# make
+# make install
+
+Getting Help
+------------
+
+To get help with the tool, execute the command below::
+
+# intel-speed-select --help
+
+The top-level help describes arguments and features. Notice that there is a
+multi-level help structure in the tool. For example, to get help for the feature "perf-profile"::
+
+# intel-speed-select perf-profile --help
+
+To get help on a command, another level of help is provided. For example for the command info "info"::
+
+# intel-speed-select perf-profile info --help
+
+Summary of platform capability
+------------------------------
+To check the current platform and driver capaibilities, execute::
+
+#intel-speed-select --info
+
+For example on a test system::
+
+ # intel-speed-select --info
+ Intel(R) Speed Select Technology
+ Executing on CPU model: X
+ Platform: API version : 1
+ Platform: Driver version : 1
+ Platform: mbox supported : 1
+ Platform: mmio supported : 1
+ Intel(R) SST-PP (feature perf-profile) is supported
+ TDP level change control is unlocked, max level: 4
+ Intel(R) SST-TF (feature turbo-freq) is supported
+ Intel(R) SST-BF (feature base-freq) is not supported
+ Intel(R) SST-CP (feature core-power) is supported
+
+Intel(R) Speed Select Technology - Performance Profile (Intel(R) SST-PP)
+------------------------------------------------------------------------
+
+This feature allows configuration of a server dynamically based on workload
+performance requirements. This helps users during deployment as they do not have
+to choose a specific server configuration statically. This Intel(R) Speed Select
+Technology - Performance Profile (Intel(R) SST-PP) feature introduces a mechanism
+that allows multiple optimized performance profiles per system. Each profile
+defines a set of CPUs that need to be online and rest offline to sustain a
+guaranteed base frequency. Once the user issues a command to use a specific
+performance profile and meet CPU online/offline requirement, the user can expect
+a change in the base frequency dynamically. This feature is called
+"perf-profile" when using the Intel Speed Select tool.
+
+Number or performance levels
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+There can be multiple performance profiles on a system. To get the number of
+profiles, execute the command below::
+
+ # intel-speed-select perf-profile get-config-levels
+ Intel(R) Speed Select Technology
+ Executing on CPU model: X
+ package-0
+ die-0
+ cpu-0
+ get-config-levels:4
+ package-1
+ die-0
+ cpu-14
+ get-config-levels:4
+
+On this system under test, there are 4 performance profiles in addition to the
+base performance profile (which is performance level 0).
+
+Lock/Unlock status
+~~~~~~~~~~~~~~~~~~
+
+Even if there are multiple performance profiles, it is possible that that they
+are locked. If they are locked, users cannot issue a command to change the
+performance state. It is possible that there is a BIOS setup to unlock or check
+with your system vendor.
+
+To check if the system is locked, execute the following command::
+
+ # intel-speed-select perf-profile get-lock-status
+ Intel(R) Speed Select Technology
+ Executing on CPU model: X
+ package-0
+ die-0
+ cpu-0
+ get-lock-status:0
+ package-1
+ die-0
+ cpu-14
+ get-lock-status:0
+
+In this case, lock status is 0, which means that the system is unlocked.
+
+Properties of a performance level
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To get properties of a specific performance level (For example for the level 0, below), execute the command below::
+
+ # intel-speed-select perf-profile info -l 0
+ Intel(R) Speed Select Technology
+ Executing on CPU model: X
+ package-0
+ die-0
+ cpu-0
+ perf-profile-level-0
+ cpu-count:28
+ enable-cpu-mask:000003ff,f0003fff
+ enable-cpu-list:0,1,2,3,4,5,6,7,8,9,10,11,12,13,28,29,30,31,32,33,34,35,36,37,38,39,40,41
+ thermal-design-power-ratio:26
+ base-frequency(MHz):2600
+ speed-select-turbo-freq:disabled
+ speed-select-base-freq:disabled
+ ...
+ ...
+
+Here -l option is used to specify a performance level.
+
+If the option -l is omitted, then this command will print information about all
+the performance levels. The above command is printing properties of the
+performance level 0.
+
+For this performance profile, the list of CPUs displayed by the
+"enable-cpu-mask/enable-cpu-list" at the max can be "online." When that
+condition is met, then base frequency of 2600 MHz can be maintained. To
+understand more, execute "intel-speed-select perf-profile info" for performance
+level 4::
+
+ # intel-speed-select perf-profile info -l 4
+ Intel(R) Speed Select Technology
+ Executing on CPU model: X
+ package-0
+ die-0
+ cpu-0
+ perf-profile-level-4
+ cpu-count:28
+ enable-cpu-mask:000000fa,f0000faf
+ enable-cpu-list:0,1,2,3,5,7,8,9,10,11,28,29,30,31,33,35,36,37,38,39
+ thermal-design-power-ratio:28
+ base-frequency(MHz):2800
+ speed-select-turbo-freq:disabled
+ speed-select-base-freq:unsupported
+ ...
+ ...
+
+There are fewer CPUs in the "enable-cpu-mask/enable-cpu-list". Consequently, if
+the user only keeps these CPUs online and the rest "offline," then the base
+frequency is increased to 2.8 GHz compared to 2.6 GHz at performance level 0.
+
+Get current performance level
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To get the current performance level, execute::
+
+ # intel-speed-select perf-profile get-config-current-level
+ Intel(R) Speed Select Technology
+ Executing on CPU model: X
+ package-0
+ die-0
+ cpu-0
+ get-config-current_level:0
+
+First verify that the base_frequency displayed by the cpufreq sysfs is correct::
+
+ # cat /sys/devices/system/cpu/cpu0/cpufreq/base_frequency
+ 2600000
+
+This matches the base-frequency (MHz) field value displayed from the
+"perf-profile info" command for performance level 0(cpufreq frequency is in
+KHz).
+
+To check if the average frequency is equal to the base frequency for a 100% busy
+workload, disable turbo::
+
+# echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo
+
+Then runs a busy workload on all CPUs, for example::
+
+#stress -c 64
+
+To verify the base frequency, run turbostat::
+
+ #turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1
+
+ Package Core CPU Bzy_MHz
+ - - 2600
+ 0 0 0 2600
+ 0 1 1 2600
+ 0 2 2 2600
+ 0 3 3 2600
+ 0 4 4 2600
+ . . . .
+
+
+Changing performance level
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To the change the performance level to 4, execute::
+
+ # intel-speed-select -d perf-profile set-config-level -l 4 -o
+ Intel(R) Speed Select Technology
+ Executing on CPU model: X
+ package-0
+ die-0
+ cpu-0
+ perf-profile
+ set_tdp_level:success
+
+In the command above, "-o" is optional. If it is specified, then it will also
+offline CPUs which are not present in the enable_cpu_mask for this performance
+level.
+
+Now if the base_frequency is checked::
+
+ #cat /sys/devices/system/cpu/cpu0/cpufreq/base_frequency
+ 2800000
+
+Which shows that the base frequency now increased from 2600 MHz at performance
+level 0 to 2800 MHz at performance level 4. As a result, any workload, which can
+use fewer CPUs, can see a boost of 200 MHz compared to performance level 0.
+
+Check presence of other Intel(R) SST features
+---------------------------------------------
+
+Each of the performance profiles also specifies weather there is support of
+other two Intel(R) SST features (Intel(R) Speed Select Technology - Base Frequency
+(Intel(R) SST-BF) and Intel(R) Speed Select Technology - Turbo Frequency (Intel
+SST-TF)).
+
+For example, from the output of "perf-profile info" above, for level 0 and level
+4:
+
+For level 0::
+ speed-select-turbo-freq:disabled
+ speed-select-base-freq:disabled
+
+For level 4::
+ speed-select-turbo-freq:disabled
+ speed-select-base-freq:unsupported
+
+Given these results, the "speed-select-base-freq" (Intel(R) SST-BF) in level 4
+changed from "disabled" to "unsupported" compared to performance level 0.
+
+This means that at performance level 4, the "speed-select-base-freq" feature is
+not supported. However, at performance level 0, this feature is "supported", but
+currently "disabled", meaning the user has not activated this feature. Whereas
+"speed-select-turbo-freq" (Intel(R) SST-TF) is supported at both performance
+levels, but currently not activated by the user.
+
+The Intel(R) SST-BF and the Intel(R) SST-TF features are built on a foundation
+technology called Intel(R) Speed Select Technology - Core Power (Intel(R) SST-CP).
+The platform firmware enables this feature when Intel(R) SST-BF or Intel(R) SST-TF
+is supported on a platform.
+
+Intel(R) Speed Select Technology Core Power (Intel(R) SST-CP)
+---------------------------------------------------------------
+
+Intel(R) Speed Select Technology Core Power (Intel(R) SST-CP) is an interface that
+allows users to define per core priority. This defines a mechanism to distribute
+power among cores when there is a power constrained scenario. This defines a
+class of service (CLOS) configuration.
+
+The user can configure up to 4 class of service configurations. Each CLOS group
+configuration allows definitions of parameters, which affects how the frequency
+can be limited and power is distributed. Each CPU core can be tied to a class of
+service and hence an associated priority. The granularity is at core level not
+at per CPU level.
+
+Enable CLOS based prioritization
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To use CLOS based prioritization feature, firmware must be informed to enable
+and use a priority type. There is a default per platform priority type, which
+can be changed with optional command line parameter.
+
+To enable and check the options, execute::
+
+ # intel-speed-select core-power enable --help
+ Intel(R) Speed Select Technology
+ Executing on CPU model: X
+ Enable core-power for a package/die
+ Clos Enable: Specify priority type with [--priority|-p]
+ 0: Proportional, 1: Ordered
+
+There are two types of priority types:
+
+- Ordered
+
+Priority for ordered throttling is defined based on the index of the assigned
+CLOS group. Where CLOS0 gets highest priority (throttled last).
+
+Priority order is:
+CLOS0 > CLOS1 > CLOS2 > CLOS3.
+
+- Proportional
+
+When proportional priority is used, there is an additional parameter called
+frequency_weight, which can be specified per CLOS group. The goal of
+proportional priority is to provide each core with the requested min., then
+distribute all remaining (excess/deficit) budgets in proportion to a defined
+weight. This proportional priority can be configured using "core-power config"
+command.
+
+To enable with the platform default priority type, execute::
+
+ # intel-speed-select core-power enable
+ Intel(R) Speed Select Technology
+ Executing on CPU model: X
+ package-0
+ die-0
+ cpu-0
+ core-power
+ enable:success
+ package-1
+ die-0
+ cpu-6
+ core-power
+ enable:success
+
+The scope of this enable is per package or die scoped when a package contains
+multiple dies. To check if CLOS is enabled and get priority type, "core-power
+info" command can be used. For example to check the status of core-power feature
+on CPU 0, execute::
+
+ # intel-speed-select -c 0 core-power info
+ Intel(R) Speed Select Technology
+ Executing on CPU model: X
+ package-0
+ die-0
+ cpu-0
+ core-power
+ support-status:supported
+ enable-status:enabled
+ clos-enable-status:enabled
+ priority-type:proportional
+ package-1
+ die-0
+ cpu-24
+ core-power
+ support-status:supported
+ enable-status:enabled
+ clos-enable-status:enabled
+ priority-type:proportional
+
+Configuring CLOS groups
+~~~~~~~~~~~~~~~~~~~~~~~
+
+Each CLOS group has its own attributes including min, max, freq_weight and
+desired. These parameters can be configured with "core-power config" command.
+Defaults will be used if user skips setting a parameter except clos id, which is
+mandatory. To check core-power config options, execute::
+
+ # intel-speed-select core-power config --help
+ Intel(R) Speed Select Technology
+ Executing on CPU model: X
+ Set core-power configuration for one of the four clos ids
+ Specify targeted clos id with [--clos|-c]
+ Specify clos Proportional Priority [--weight|-w]
+ Specify clos min in MHz with [--min|-n]
+ Specify clos max in MHz with [--max|-m]
+
+For example::
+
+ # intel-speed-select core-power config -c 0
+ Intel(R) Speed Select Technology
+ Executing on CPU model: X
+ clos epp is not specified, default: 0
+ clos frequency weight is not specified, default: 0
+ clos min is not specified, default: 0 MHz
+ clos max is not specified, default: 25500 MHz
+ clos desired is not specified, default: 0
+ package-0
+ die-0
+ cpu-0
+ core-power
+ config:success
+ package-1
+ die-0
+ cpu-6
+ core-power
+ config:success
+
+The user has the option to change defaults. For example, the user can change the
+"min" and set the base frequency to always get guaranteed base frequency.
+
+Get the current CLOS configuration
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To check the current configuration, "core-power get-config" can be used. For
+example, to get the configuration of CLOS 0::
+
+ # intel-speed-select core-power get-config -c 0
+ Intel(R) Speed Select Technology
+ Executing on CPU model: X
+ package-0
+ die-0
+ cpu-0
+ core-power
+ clos:0
+ epp:0
+ clos-proportional-priority:0
+ clos-min:0 MHz
+ clos-max:Max Turbo frequency
+ clos-desired:0 MHz
+ package-1
+ die-0
+ cpu-24
+ core-power
+ clos:0
+ epp:0
+ clos-proportional-priority:0
+ clos-min:0 MHz
+ clos-max:Max Turbo frequency
+ clos-desired:0 MHz
+
+Associating a CPU with a CLOS group
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To associate a CPU to a CLOS group "core-power assoc" command can be used::
+
+ # intel-speed-select core-power assoc --help
+ Intel(R) Speed Select Technology
+ Executing on CPU model: X
+ Associate a clos id to a CPU
+ Specify targeted clos id with [--clos|-c]
+
+
+For example to associate CPU 10 to CLOS group 3, execute::
+
+ # intel-speed-select -c 10 core-power assoc -c 3
+ Intel(R) Speed Select Technology
+ Executing on CPU model: X
+ package-0
+ die-0
+ cpu-10
+ core-power
+ assoc:success
+
+Once a CPU is associated, its sibling CPUs are also associated to a CLOS group.
+Once associated, avoid changing Linux "cpufreq" subsystem scaling frequency
+limits.
+
+To check the existing association for a CPU, "core-power get-assoc" command can
+be used. For example, to get association of CPU 10, execute::
+
+ # intel-speed-select -c 10 core-power get-assoc
+ Intel(R) Speed Select Technology
+ Executing on CPU model: X
+ package-1
+ die-0
+ cpu-10
+ get-assoc
+ clos:3
+
+This shows that CPU 10 is part of a CLOS group 3.
+
+
+Disable CLOS based prioritization
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To disable, execute::
+
+# intel-speed-select core-power disable
+
+Some features like Intel(R) SST-TF can only be enabled when CLOS based prioritization
+is enabled. For this reason, disabling while Intel(R) SST-TF is enabled can cause
+Intel(R) SST-TF to fail. This will cause the "disable" command to display an error
+if Intel(R) SST-TF is already enabled. In turn, to disable, the Intel(R) SST-TF
+feature must be disabled first.
+
+Intel(R) Speed Select Technology - Base Frequency (Intel(R) SST-BF)
+-------------------------------------------------------------------
+
+The Intel(R) Speed Select Technology - Base Frequency (Intel(R) SST-BF) feature lets
+the user control base frequency. If some critical workload threads demand
+constant high guaranteed performance, then this feature can be used to execute
+the thread at higher base frequency on specific sets of CPUs (high priority
+CPUs) at the cost of lower base frequency (low priority CPUs) on other CPUs.
+This feature does not require offline of the low priority CPUs.
+
+The support of Intel(R) SST-BF depends on the Intel(R) Speed Select Technology -
+Performance Profile (Intel(R) SST-PP) performance level configuration. It is
+possible that only certain performance levels support Intel(R) SST-BF. It is also
+possible that only base performance level (level = 0) has support of Intel
+SST-BF. Consequently, first select the desired performance level to enable this
+feature.
+
+In the system under test here, Intel(R) SST-BF is supported at the base
+performance level 0, but currently disabled. For example for the level 0::
+
+ # intel-speed-select -c 0 perf-profile info -l 0
+ Intel(R) Speed Select Technology
+ Executing on CPU model: X
+ package-0
+ die-0
+ cpu-0
+ perf-profile-level-0
+ ...
+
+ speed-select-base-freq:disabled
+ ...
+
+Before enabling Intel(R) SST-BF and measuring its impact on a workload
+performance, execute some workload and measure performance and get a baseline
+performance to compare against.
+
+Here the user wants more guaranteed performance. For this reason, it is likely
+that turbo is disabled. To disable turbo, execute::
+
+#echo 1 > /sys/devices/system/cpu/intel_pstate/no_turbo
+
+Based on the output of the "intel-speed-select perf-profile info -l 0" base
+frequency of guaranteed frequency 2600 MHz.
+
+
+Measure baseline performance for comparison
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To compare, pick a multi-threaded workload where each thread can be scheduled on
+separate CPUs. "Hackbench pipe" test is a good example on how to improve
+performance using Intel(R) SST-BF.
+
+Below, the workload is measuring average scheduler wakeup latency, so a lower
+number means better performance::
+
+ # taskset -c 3,4 perf bench -r 100 sched pipe
+ # Running 'sched/pipe' benchmark:
+ # Executed 1000000 pipe operations between two processes
+ Total time: 6.102 [sec]
+ 6.102445 usecs/op
+ 163868 ops/sec
+
+While running the above test, if we take turbostat output, it will show us that
+2 of the CPUs are busy and reaching max. frequency (which would be the base
+frequency as the turbo is disabled). The turbostat output::
+
+ #turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1
+ Package Core CPU Bzy_MHz
+ 0 0 0 1000
+ 0 1 1 1005
+ 0 2 2 1000
+ 0 3 3 2600
+ 0 4 4 2600
+ 0 5 5 1000
+ 0 6 6 1000
+ 0 7 7 1005
+ 0 8 8 1005
+ 0 9 9 1000
+ 0 10 10 1000
+ 0 11 11 995
+ 0 12 12 1000
+ 0 13 13 1000
+
+From the above turbostat output, both CPU 3 and 4 are very busy and reaching
+full guaranteed frequency of 2600 MHz.
+
+Intel(R) SST-BF Capabilities
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To get capabilities of Intel(R) SST-BF for the current performance level 0,
+execute::
+
+ # intel-speed-select base-freq info -l 0
+ Intel(R) Speed Select Technology
+ Executing on CPU model: X
+ package-0
+ die-0
+ cpu-0
+ speed-select-base-freq
+ high-priority-base-frequency(MHz):3000
+ high-priority-cpu-mask:00000216,00002160
+ high-priority-cpu-list:5,6,8,13,33,34,36,41
+ low-priority-base-frequency(MHz):2400
+ tjunction-temperature(C):125
+ thermal-design-power(W):205
+
+The above capabilities show that there are some CPUs on this system that can
+offer base frequency of 3000 MHz compared to the standard base frequency at this
+performance levels. Nevertheless, these CPUs are fixed, and they are presented
+via high-priority-cpu-list/high-priority-cpu-mask. But if this Intel(R) SST-BF
+feature is selected, the low priorities CPUs (which are not in
+high-priority-cpu-list) can only offer up to 2400 MHz. As a result, if this
+clipping of low priority CPUs is acceptable, then the user can enable Intel
+SST-BF feature particularly for the above "sched pipe" workload since only two
+CPUs are used, they can be scheduled on high priority CPUs and can get boost of
+400 MHz.
+
+Enable Intel(R) SST-BF
+~~~~~~~~~~~~~~~~~~~~~~
+
+To enable Intel(R) SST-BF feature, execute::
+
+ # intel-speed-select base-freq enable -a
+ Intel(R) Speed Select Technology
+ Executing on CPU model: X
+ package-0
+ die-0
+ cpu-0
+ base-freq
+ enable:success
+ package-1
+ die-0
+ cpu-14
+ base-freq
+ enable:success
+
+In this case, -a option is optional. This not only enables Intel(R) SST-BF, but it
+also adjusts the priority of cores using Intel(R) Speed Select Technology Core
+Power (Intel(R) SST-CP) features. This option sets the minimum performance of each
+Intel(R) Speed Select Technology - Performance Profile (Intel(R) SST-PP) class to
+maximum performance so that the hardware will give maximum performance possible
+for each CPU.
+
+If -a option is not used, then the following steps are required before enabling
+Intel(R) SST-BF:
+
+- Discover Intel(R) SST-BF and note low and high priority base frequency
+- Note the high prioity CPU list
+- Enable CLOS using core-power feature set
+- Configure CLOS parameters. Use CLOS.min to set to minimum performance
+- Subscribe desired CPUs to CLOS groups
+
+With this configuration, if the same workload is executed by pinning the
+workload to high priority CPUs (CPU 5 and 6 in this case)::
+
+ #taskset -c 5,6 perf bench -r 100 sched pipe
+ # Running 'sched/pipe' benchmark:
+ # Executed 1000000 pipe operations between two processes
+ Total time: 5.627 [sec]
+ 5.627922 usecs/op
+ 177685 ops/sec
+
+This way, by enabling Intel(R) SST-BF, the performance of this benchmark is
+improved (latency reduced) by 7.79%. From the turbostat output, it can be
+observed that the high priority CPUs reached 3000 MHz compared to 2600 MHz.
+The turbostat output::
+
+ #turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1
+ Package Core CPU Bzy_MHz
+ 0 0 0 2151
+ 0 1 1 2166
+ 0 2 2 2175
+ 0 3 3 2175
+ 0 4 4 2175
+ 0 5 5 3000
+ 0 6 6 3000
+ 0 7 7 2180
+ 0 8 8 2662
+ 0 9 9 2176
+ 0 10 10 2175
+ 0 11 11 2176
+ 0 12 12 2176
+ 0 13 13 2661
+
+Disable Intel(R) SST-BF
+~~~~~~~~~~~~~~~~~~~~~~~
+
+To disable the Intel(R) SST-BF feature, execute::
+
+# intel-speed-select base-freq disable -a
+
+
+Intel(R) Speed Select Technology - Turbo Frequency (Intel(R) SST-TF)
+--------------------------------------------------------------------
+
+This feature enables the ability to set different "All core turbo ratio limits"
+to cores based on the priority. By using this feature, some cores can be
+configured to get higher turbo frequency by designating them as high priority at
+the cost of lower or no turbo frequency on the low priority cores.
+
+For this reason, this feature is only useful when system is busy utilizing all
+CPUs, but the user wants some configurable option to get high performance on
+some CPUs.
+
+The support of Intel(R) Speed Select Technology - Turbo Frequency (Intel(R) SST-TF)
+depends on the Intel(R) Speed Select Technology - Performance Profile (Intel
+SST-PP) performance level configuration. It is possible that only a certain
+performance level supports Intel(R) SST-TF. It is also possible that only the base
+performance level (level = 0) has the support of Intel(R) SST-TF. Hence, first
+select the desired performance level to enable this feature.
+
+In the system under test here, Intel(R) SST-TF is supported at the base
+performance level 0, but currently disabled::
+
+ # intel-speed-select -c 0 perf-profile info -l 0
+ Intel(R) Speed Select Technology
+ package-0
+ die-0
+ cpu-0
+ perf-profile-level-0
+ ...
+ ...
+ speed-select-turbo-freq:disabled
+ ...
+ ...
+
+
+To check if performance can be improved using Intel(R) SST-TF feature, get the turbo
+frequency properties with Intel(R) SST-TF enabled and compare to the base turbo
+capability of this system.
+
+Get Base turbo capability
+~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To get the base turbo capability of performance level 0, execute::
+
+ # intel-speed-select perf-profile info -l 0
+ Intel(R) Speed Select Technology
+ Executing on CPU model: X
+ package-0
+ die-0
+ cpu-0
+ perf-profile-level-0
+ ...
+ ...
+ turbo-ratio-limits-sse
+ bucket-0
+ core-count:2
+ max-turbo-frequency(MHz):3200
+ bucket-1
+ core-count:4
+ max-turbo-frequency(MHz):3100
+ bucket-2
+ core-count:6
+ max-turbo-frequency(MHz):3100
+ bucket-3
+ core-count:8
+ max-turbo-frequency(MHz):3100
+ bucket-4
+ core-count:10
+ max-turbo-frequency(MHz):3100
+ bucket-5
+ core-count:12
+ max-turbo-frequency(MHz):3100
+ bucket-6
+ core-count:14
+ max-turbo-frequency(MHz):3100
+ bucket-7
+ core-count:16
+ max-turbo-frequency(MHz):3100
+
+Based on the data above, when all the CPUS are busy, the max. frequency of 3100
+MHz can be achieved. If there is some busy workload on cpu 0 - 11 (e.g. stress)
+and on CPU 12 and 13, execute "hackbench pipe" workload::
+
+ # taskset -c 12,13 perf bench -r 100 sched pipe
+ # Running 'sched/pipe' benchmark:
+ # Executed 1000000 pipe operations between two processes
+ Total time: 5.705 [sec]
+ 5.705488 usecs/op
+ 175269 ops/sec
+
+The turbostat output::
+
+ #turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1
+ Package Core CPU Bzy_MHz
+ 0 0 0 3000
+ 0 1 1 3000
+ 0 2 2 3000
+ 0 3 3 3000
+ 0 4 4 3000
+ 0 5 5 3100
+ 0 6 6 3100
+ 0 7 7 3000
+ 0 8 8 3100
+ 0 9 9 3000
+ 0 10 10 3000
+ 0 11 11 3000
+ 0 12 12 3100
+ 0 13 13 3100
+
+Based on turbostat output, the performance is limited by frequency cap of 3100
+MHz. To check if the hackbench performance can be improved for CPU 12 and CPU
+13, first check the capability of the Intel(R) SST-TF feature for this performance
+level.
+
+Get Intel(R) SST-TF Capability
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To get the capability, the "turbo-freq info" command can be used::
+
+ # intel-speed-select turbo-freq info -l 0
+ Intel(R) Speed Select Technology
+ Executing on CPU model: X
+ package-0
+ die-0
+ cpu-0
+ speed-select-turbo-freq
+ bucket-0
+ high-priority-cores-count:2
+ high-priority-max-frequency(MHz):3200
+ high-priority-max-avx2-frequency(MHz):3200
+ high-priority-max-avx512-frequency(MHz):3100
+ bucket-1
+ high-priority-cores-count:4
+ high-priority-max-frequency(MHz):3100
+ high-priority-max-avx2-frequency(MHz):3000
+ high-priority-max-avx512-frequency(MHz):2900
+ bucket-2
+ high-priority-cores-count:6
+ high-priority-max-frequency(MHz):3100
+ high-priority-max-avx2-frequency(MHz):3000
+ high-priority-max-avx512-frequency(MHz):2900
+ speed-select-turbo-freq-clip-frequencies
+ low-priority-max-frequency(MHz):2600
+ low-priority-max-avx2-frequency(MHz):2400
+ low-priority-max-avx512-frequency(MHz):2100
+
+Based on the output above, there is an Intel(R) SST-TF bucket for which there are
+two high priority cores. If only two high priority cores are set, then max.
+turbo frequency on those cores can be increased to 3200 MHz. This is 100 MHz
+more than the base turbo capability for all cores.
+
+In turn, for the hackbench workload, two CPUs can be set as high priority and
+rest as low priority. One side effect is that once enabled, the low priority
+cores will be clipped to a lower frequency of 2600 MHz.
+
+Enable Intel(R) SST-TF
+~~~~~~~~~~~~~~~~~~~~~~
+
+To enable Intel(R) SST-TF, execute::
+
+ # intel-speed-select -c 12,13 turbo-freq enable -a
+ Intel(R) Speed Select Technology
+ Executing on CPU model: X
+ package-0
+ die-0
+ cpu-12
+ turbo-freq
+ enable:success
+ package-0
+ die-0
+ cpu-13
+ turbo-freq
+ enable:success
+ package--1
+ die-0
+ cpu-63
+ turbo-freq --auto
+ enable:success
+
+In this case, the option "-a" is optional. If set, it enables Intel(R) SST-TF
+feature and also sets the CPUs to high and and low priority using Intel Speed
+Select Technology Core Power (Intel(R) SST-CP) features. The CPU numbers passed
+with "-c" arguments are marked as high priority, including its siblings.
+
+If -a option is not used, then the following steps are required before enabling
+Intel(R) SST-TF:
+
+- Discover Intel(R) SST-TF and note buckets of high priority cores and maximum frequency
+
+- Enable CLOS using core-power feature set - Configure CLOS parameters
+
+- Subscribe desired CPUs to CLOS groups making sure that high priority cores are set to the maximum frequency
+
+If the same hackbench workload is executed, schedule hackbench threads on high
+priority CPUs::
+
+ #taskset -c 12,13 perf bench -r 100 sched pipe
+ # Running 'sched/pipe' benchmark:
+ # Executed 1000000 pipe operations between two processes
+ Total time: 5.510 [sec]
+ 5.510165 usecs/op
+ 180826 ops/sec
+
+This improved performance by around 3.3% improvement on a busy system. Here the
+turbostat output will show that the CPU 12 and CPU 13 are getting 100 MHz boost.
+The turbostat output::
+
+ #turbostat -c 0-13 --show Package,Core,CPU,Bzy_MHz -i 1
+ Package Core CPU Bzy_MHz
+ ...
+ 0 12 12 3200
+ 0 13 13 3200
diff --git a/Documentation/admin-guide/pm/intel_pstate.rst b/Documentation/admin-guide/pm/intel_pstate.rst
index ad392f3..39d80bc 100644
--- a/Documentation/admin-guide/pm/intel_pstate.rst
+++ b/Documentation/admin-guide/pm/intel_pstate.rst
@@ -62,9 +62,10 @@
Active Mode
-----------
-This is the default operation mode of ``intel_pstate``. If it works in this
-mode, the ``scaling_driver`` policy attribute in ``sysfs`` for all ``CPUFreq``
-policies contains the string "intel_pstate".
+This is the default operation mode of ``intel_pstate`` for processors with
+hardware-managed P-states (HWP) support. If it works in this mode, the
+``scaling_driver`` policy attribute in ``sysfs`` for all ``CPUFreq`` policies
+contains the string "intel_pstate".
In this mode the driver bypasses the scaling governors layer of ``CPUFreq`` and
provides its own scaling algorithms for P-state selection. Those algorithms
@@ -138,12 +139,13 @@
Active Mode Without HWP
~~~~~~~~~~~~~~~~~~~~~~~
-This is the default operation mode for processors that do not support the HWP
-feature. It also is used by default with the ``intel_pstate=no_hwp`` argument
-in the kernel command line. However, in this mode ``intel_pstate`` may refuse
-to work with the given processor if it does not recognize it. [Note that
-``intel_pstate`` will never refuse to work with any processor with the HWP
-feature enabled.]
+This operation mode is optional for processors that do not support the HWP
+feature or when the ``intel_pstate=no_hwp`` argument is passed to the kernel in
+the command line. The active mode is used in those cases if the
+``intel_pstate=active`` argument is passed to the kernel in the command line.
+In this mode ``intel_pstate`` may refuse to work with processors that are not
+recognized by it. [Note that ``intel_pstate`` will never refuse to work with
+any processor with the HWP feature enabled.]
In this mode ``intel_pstate`` registers utilization update callbacks with the
CPU scheduler in order to run a P-state selection algorithm, either
@@ -188,10 +190,14 @@
Passive Mode
------------
-This mode is used if the ``intel_pstate=passive`` argument is passed to the
-kernel in the command line (it implies the ``intel_pstate=no_hwp`` setting too).
-Like in the active mode without HWP support, in this mode ``intel_pstate`` may
-refuse to work with the given processor if it does not recognize it.
+This is the default operation mode of ``intel_pstate`` for processors without
+hardware-managed P-states (HWP) support. It is always used if the
+``intel_pstate=passive`` argument is passed to the kernel in the command line
+regardless of whether or not the given processor supports HWP. [Note that the
+``intel_pstate=no_hwp`` setting implies ``intel_pstate=passive`` if it is used
+without ``intel_pstate=active``.] Like in the active mode without HWP support,
+in this mode ``intel_pstate`` may refuse to work with processors that are not
+recognized by it.
If the driver works in this mode, the ``scaling_driver`` policy attribute in
``sysfs`` for all ``CPUFreq`` policies contains the string "intel_cpufreq".
diff --git a/Documentation/admin-guide/pm/working-state.rst b/Documentation/admin-guide/pm/working-state.rst
index 0a38cdf..f40994c 100644
--- a/Documentation/admin-guide/pm/working-state.rst
+++ b/Documentation/admin-guide/pm/working-state.rst
@@ -13,3 +13,4 @@
intel_pstate
cpufreq_drivers
intel_epb
+ intel-speed-select
diff --git a/Documentation/admin-guide/pstore-blk.rst b/Documentation/admin-guide/pstore-blk.rst
new file mode 100644
index 0000000..296d502
--- /dev/null
+++ b/Documentation/admin-guide/pstore-blk.rst
@@ -0,0 +1,243 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+pstore block oops/panic logger
+==============================
+
+Introduction
+------------
+
+pstore block (pstore/blk) is an oops/panic logger that writes its logs to a
+block device and non-block device before the system crashes. You can get
+these log files by mounting pstore filesystem like::
+
+ mount -t pstore pstore /sys/fs/pstore
+
+
+pstore block concepts
+---------------------
+
+pstore/blk provides efficient configuration method for pstore/blk, which
+divides all configurations into two parts, configurations for user and
+configurations for driver.
+
+Configurations for user determine how pstore/blk works, such as pmsg_size,
+kmsg_size and so on. All of them support both Kconfig and module parameters,
+but module parameters have priority over Kconfig.
+
+Configurations for driver are all about block device and non-block device,
+such as total_size of block device and read/write operations.
+
+Configurations for user
+-----------------------
+
+All of these configurations support both Kconfig and module parameters, but
+module parameters have priority over Kconfig.
+
+Here is an example for module parameters::
+
+ pstore_blk.blkdev=179:7 pstore_blk.kmsg_size=64
+
+The detail of each configurations may be of interest to you.
+
+blkdev
+~~~~~~
+
+The block device to use. Most of the time, it is a partition of block device.
+It's required for pstore/blk. It is also used for MTD device.
+
+It accepts the following variants for block device:
+
+1. <hex_major><hex_minor> device number in hexadecimal represents itself; no
+ leading 0x, for example b302.
+#. /dev/<disk_name> represents the device number of disk
+#. /dev/<disk_name><decimal> represents the device number of partition - device
+ number of disk plus the partition number
+#. /dev/<disk_name>p<decimal> - same as the above; this form is used when disk
+ name of partitioned disk ends with a digit.
+#. PARTUUID=00112233-4455-6677-8899-AABBCCDDEEFF represents the unique id of
+ a partition if the partition table provides it. The UUID may be either an
+ EFI/GPT UUID, or refer to an MSDOS partition using the format SSSSSSSS-PP,
+ where SSSSSSSS is a zero-filled hex representation of the 32-bit
+ "NT disk signature", and PP is a zero-filled hex representation of the
+ 1-based partition number.
+#. PARTUUID=<UUID>/PARTNROFF=<int> to select a partition in relation to a
+ partition with a known unique id.
+#. <major>:<minor> major and minor number of the device separated by a colon.
+
+It accepts the following variants for MTD device:
+
+1. <device name> MTD device name. "pstore" is recommended.
+#. <device number> MTD device number.
+
+kmsg_size
+~~~~~~~~~
+
+The chunk size in KB for oops/panic front-end. It **MUST** be a multiple of 4.
+It's optional if you do not care oops/panic log.
+
+There are multiple chunks for oops/panic front-end depending on the remaining
+space except other pstore front-ends.
+
+pstore/blk will log to oops/panic chunks one by one, and always overwrite the
+oldest chunk if there is no more free chunk.
+
+pmsg_size
+~~~~~~~~~
+
+The chunk size in KB for pmsg front-end. It **MUST** be a multiple of 4.
+It's optional if you do not care pmsg log.
+
+Unlike oops/panic front-end, there is only one chunk for pmsg front-end.
+
+Pmsg is a user space accessible pstore object. Writes to */dev/pmsg0* are
+appended to the chunk. On reboot the contents are available in
+*/sys/fs/pstore/pmsg-pstore-blk-0*.
+
+console_size
+~~~~~~~~~~~~
+
+The chunk size in KB for console front-end. It **MUST** be a multiple of 4.
+It's optional if you do not care console log.
+
+Similar to pmsg front-end, there is only one chunk for console front-end.
+
+All log of console will be appended to the chunk. On reboot the contents are
+available in */sys/fs/pstore/console-pstore-blk-0*.
+
+ftrace_size
+~~~~~~~~~~~
+
+The chunk size in KB for ftrace front-end. It **MUST** be a multiple of 4.
+It's optional if you do not care console log.
+
+Similar to oops front-end, there are multiple chunks for ftrace front-end
+depending on the count of cpu processors. Each chunk size is equal to
+ftrace_size / processors_count.
+
+All log of ftrace will be appended to the chunk. On reboot the contents are
+combined and available in */sys/fs/pstore/ftrace-pstore-blk-0*.
+
+Persistent function tracing might be useful for debugging software or hardware
+related hangs. Here is an example of usage::
+
+ # mount -t pstore pstore /sys/fs/pstore
+ # mount -t debugfs debugfs /sys/kernel/debug/
+ # echo 1 > /sys/kernel/debug/pstore/record_ftrace
+ # reboot -f
+ [...]
+ # mount -t pstore pstore /sys/fs/pstore
+ # tail /sys/fs/pstore/ftrace-pstore-blk-0
+ CPU:0 ts:5914676 c0063828 c0063b94 call_cpuidle <- cpu_startup_entry+0x1b8/0x1e0
+ CPU:0 ts:5914678 c039ecdc c006385c cpuidle_enter_state <- call_cpuidle+0x44/0x48
+ CPU:0 ts:5914680 c039e9a0 c039ecf0 cpuidle_enter_freeze <- cpuidle_enter_state+0x304/0x314
+ CPU:0 ts:5914681 c0063870 c039ea30 sched_idle_set_state <- cpuidle_enter_state+0x44/0x314
+ CPU:1 ts:5916720 c0160f59 c015ee04 kernfs_unmap_bin_file <- __kernfs_remove+0x140/0x204
+ CPU:1 ts:5916721 c05ca625 c015ee0c __mutex_lock_slowpath <- __kernfs_remove+0x148/0x204
+ CPU:1 ts:5916723 c05c813d c05ca630 yield_to <- __mutex_lock_slowpath+0x314/0x358
+ CPU:1 ts:5916724 c05ca2d1 c05ca638 __ww_mutex_lock <- __mutex_lock_slowpath+0x31c/0x358
+
+max_reason
+~~~~~~~~~~
+
+Limiting which kinds of kmsg dumps are stored can be controlled via
+the ``max_reason`` value, as defined in include/linux/kmsg_dump.h's
+``enum kmsg_dump_reason``. For example, to store both Oopses and Panics,
+``max_reason`` should be set to 2 (KMSG_DUMP_OOPS), to store only Panics
+``max_reason`` should be set to 1 (KMSG_DUMP_PANIC). Setting this to 0
+(KMSG_DUMP_UNDEF), means the reason filtering will be controlled by the
+``printk.always_kmsg_dump`` boot param: if unset, it'll be KMSG_DUMP_OOPS,
+otherwise KMSG_DUMP_MAX.
+
+Configurations for driver
+-------------------------
+
+Only a block device driver cares about these configurations. A block device
+driver uses ``register_pstore_blk`` to register to pstore/blk.
+
+.. kernel-doc:: fs/pstore/blk.c
+ :identifiers: register_pstore_blk
+
+A non-block device driver uses ``register_pstore_device`` with
+``struct pstore_device_info`` to register to pstore/blk.
+
+.. kernel-doc:: fs/pstore/blk.c
+ :identifiers: register_pstore_device
+
+.. kernel-doc:: include/linux/pstore_blk.h
+ :identifiers: pstore_device_info
+
+Compression and header
+----------------------
+
+Block device is large enough for uncompressed oops data. Actually we do not
+recommend data compression because pstore/blk will insert some information into
+the first line of oops/panic data. For example::
+
+ Panic: Total 16 times
+
+It means that it's OOPS|Panic for the 16th time since the first booting.
+Sometimes the number of occurrences of oops|panic since the first booting is
+important to judge whether the system is stable.
+
+The following line is inserted by pstore filesystem. For example::
+
+ Oops#2 Part1
+
+It means that it's OOPS for the 2nd time on the last boot.
+
+Reading the data
+----------------
+
+The dump data can be read from the pstore filesystem. The format for these
+files is ``dmesg-pstore-blk-[N]`` for oops/panic front-end,
+``pmsg-pstore-blk-0`` for pmsg front-end and so on. The timestamp of the
+dump file records the trigger time. To delete a stored record from block
+device, simply unlink the respective pstore file.
+
+Attentions in panic read/write APIs
+-----------------------------------
+
+If on panic, the kernel is not going to run for much longer, the tasks will not
+be scheduled and most kernel resources will be out of service. It
+looks like a single-threaded program running on a single-core computer.
+
+The following points require special attention for panic read/write APIs:
+
+1. Can **NOT** allocate any memory.
+ If you need memory, just allocate while the block driver is initializing
+ rather than waiting until the panic.
+#. Must be polled, **NOT** interrupt driven.
+ No task schedule any more. The block driver should delay to ensure the write
+ succeeds, but NOT sleep.
+#. Can **NOT** take any lock.
+ There is no other task, nor any shared resource; you are safe to break all
+ locks.
+#. Just use CPU to transfer.
+ Do not use DMA to transfer unless you are sure that DMA will not keep lock.
+#. Control registers directly.
+ Please control registers directly rather than use Linux kernel resources.
+ Do I/O map while initializing rather than wait until a panic occurs.
+#. Reset your block device and controller if necessary.
+ If you are not sure of the state of your block device and controller when
+ a panic occurs, you are safe to stop and reset them.
+
+pstore/blk supports psblk_blkdev_info(), which is defined in
+*linux/pstore_blk.h*, to get information of using block device, such as the
+device number, sector count and start sector of the whole disk.
+
+pstore block internals
+----------------------
+
+For developer reference, here are all the important structures and APIs:
+
+.. kernel-doc:: fs/pstore/zone.c
+ :internal:
+
+.. kernel-doc:: include/linux/pstore_zone.h
+ :internal:
+
+.. kernel-doc:: fs/pstore/blk.c
+ :export:
+
+.. kernel-doc:: include/linux/pstore_blk.h
+ :internal:
diff --git a/Documentation/admin-guide/ramoops.rst b/Documentation/admin-guide/ramoops.rst
index 6dbcc54..a60a962 100644
--- a/Documentation/admin-guide/ramoops.rst
+++ b/Documentation/admin-guide/ramoops.rst
@@ -32,11 +32,17 @@
memory are implementation defined, and won't work on many ARMs such as omaps.
The memory area is divided into ``record_size`` chunks (also rounded down to
-power of two) and each oops/panic writes a ``record_size`` chunk of
+power of two) and each kmesg dump writes a ``record_size`` chunk of
information.
-Dumping both oopses and panics can be done by setting 1 in the ``dump_oops``
-variable while setting 0 in that variable dumps only the panics.
+Limiting which kinds of kmsg dumps are stored can be controlled via
+the ``max_reason`` value, as defined in include/linux/kmsg_dump.h's
+``enum kmsg_dump_reason``. For example, to store both Oopses and Panics,
+``max_reason`` should be set to 2 (KMSG_DUMP_OOPS), to store only Panics
+``max_reason`` should be set to 1 (KMSG_DUMP_PANIC). Setting this to 0
+(KMSG_DUMP_UNDEF), means the reason filtering will be controlled by the
+``printk.always_kmsg_dump`` boot param: if unset, it'll be KMSG_DUMP_OOPS,
+otherwise KMSG_DUMP_MAX.
The module uses a counter to record multiple dumps but the counter gets reset
on restart (i.e. new dumps after the restart will overwrite old ones).
@@ -90,7 +96,7 @@
.mem_address = <...>,
.mem_type = <...>,
.record_size = <...>,
- .dump_oops = <...>,
+ .max_reason = <...>,
.ecc = <...>,
};
diff --git a/Documentation/admin-guide/ras.rst b/Documentation/admin-guide/ras.rst
index 0310db6..7b481b2 100644
--- a/Documentation/admin-guide/ras.rst
+++ b/Documentation/admin-guide/ras.rst
@@ -156,11 +156,11 @@
ECC memory
----------
-As mentioned on the previous section, ECC memory has extra bits to be
-used for error correction. So, on 64 bit systems, a memory module
-has 64 bits of *data width*, and 74 bits of *total width*. So, there are
-8 bits extra bits to be used for the error detection and correction
-mechanisms. Those extra bits are called *syndrome*\ [#f1]_\ [#f2]_.
+As mentioned in the previous section, ECC memory has extra bits to be
+used for error correction. In the above example, a memory module has
+64 bits of *data width*, and 72 bits of *total width*. The extra 8
+bits which are used for the error detection and correction mechanisms
+are referred to as the *syndrome*\ [#f1]_\ [#f2]_.
So, when the cpu requests the memory controller to write a word with
*data width*, the memory controller calculates the *syndrome* in real time,
@@ -212,7 +212,7 @@
purposes.
When the subsystem was pushed upstream for the first time, on
- Kernel 2.6.16, for the first time, it was renamed to ``EDAC``.
+ Kernel 2.6.16, it was renamed to ``EDAC``.
Purpose
-------
@@ -351,15 +351,17 @@
+------------+-----------+-----------+
| | ``ch0`` | ``ch1`` |
+============+===========+===========+
- | ``csrow0`` | DIMM_A0 | DIMM_B0 |
- | | rank0 | rank0 |
- +------------+ - | - |
+ | |**DIMM_A0**|**DIMM_B0**|
+ +------------+-----------+-----------+
+ | ``csrow0`` | rank0 | rank0 |
+ +------------+-----------+-----------+
| ``csrow1`` | rank1 | rank1 |
+------------+-----------+-----------+
- | ``csrow2`` | DIMM_A1 | DIMM_B1 |
- | | rank0 | rank0 |
- +------------+ - | - |
- | ``csrow3`` | rank1 | rank1 |
+ | |**DIMM_A1**|**DIMM_B1**|
+ +------------+-----------+-----------+
+ | ``csrow2`` | rank0 | rank0 |
+ +------------+-----------+-----------+
+ | ``csrow3`` | rank1 | rank1 |
+------------+-----------+-----------+
In the above example, there are 4 physical slots on the motherboard
diff --git a/Documentation/admin-guide/reporting-bugs.rst b/Documentation/admin-guide/reporting-bugs.rst
index 49ac8dc..42481ea 100644
--- a/Documentation/admin-guide/reporting-bugs.rst
+++ b/Documentation/admin-guide/reporting-bugs.rst
@@ -75,7 +75,7 @@
If you haven't reported a bug before, please read:
- http://www.chiark.greenend.org.uk/~sgtatham/bugs.html
+ https://www.chiark.greenend.org.uk/~sgtatham/bugs.html
http://www.catb.org/esr/faqs/smart-questions.html
diff --git a/Documentation/admin-guide/serial-console.rst b/Documentation/admin-guide/serial-console.rst
index a8d1e36..58b3283 100644
--- a/Documentation/admin-guide/serial-console.rst
+++ b/Documentation/admin-guide/serial-console.rst
@@ -54,7 +54,7 @@
``/dev/console`` is now character device 5,1.
(You can also use a network device as a console. See
-``Documentation/networking/netconsole.txt`` for information on that.)
+``Documentation/networking/netconsole.rst`` for information on that.)
Here's an example that will use ``/dev/ttyS1`` (COM2) as the console.
Replace the sample values as needed.
diff --git a/Documentation/admin-guide/sysctl/kernel.rst b/Documentation/admin-guide/sysctl/kernel.rst
index 0d427fd..83acf50 100644
--- a/Documentation/admin-guide/sysctl/kernel.rst
+++ b/Documentation/admin-guide/sysctl/kernel.rst
@@ -102,6 +102,30 @@
:doc:`/x86/boot` for additional information.
+bpf_stats_enabled
+=================
+
+Controls whether the kernel should collect statistics on BPF programs
+(total time spent running, number of times run...). Enabling
+statistics causes a slight reduction in performance on each program
+run. The statistics can be seen using ``bpftool``.
+
+= ===================================
+0 Don't collect statistics (default).
+1 Collect statistics.
+= ===================================
+
+
+cad_pid
+=======
+
+This is the pid which will be signalled on reboot (notably, by
+Ctrl-Alt-Delete). Writing a value to this file which doesn't
+correspond to a running process will result in ``-ESRCH``.
+
+See also `ctrl-alt-del`_.
+
+
cap_last_cap
============
@@ -241,6 +265,40 @@
see the ``hostname(1)`` man page.
+firmware_config
+===============
+
+See :doc:`/driver-api/firmware/fallback-mechanisms`.
+
+The entries in this directory allow the firmware loader helper
+fallback to be controlled:
+
+* ``force_sysfs_fallback``, when set to 1, forces the use of the
+ fallback;
+* ``ignore_sysfs_fallback``, when set to 1, ignores any fallback.
+
+
+ftrace_dump_on_oops
+===================
+
+Determines whether ``ftrace_dump()`` should be called on an oops (or
+kernel panic). This will output the contents of the ftrace buffers to
+the console. This is very useful for capturing traces that lead to
+crashes and outputting them to a serial console.
+
+= ===================================================
+0 Disabled (default).
+1 Dump buffers of all CPUs.
+2 Dump the buffer of the CPU that triggered the oops.
+= ===================================================
+
+
+ftrace_enabled, stack_tracer_enabled
+====================================
+
+See :doc:`/trace/ftrace`.
+
+
hardlockup_all_cpu_backtrace
============================
@@ -277,6 +335,20 @@
Default value is "``/sbin/hotplug``".
+hung_task_all_cpu_backtrace:
+================
+
+If this option is set, the kernel will send an NMI to all CPUs to dump
+their backtraces when a hung task is detected. This file shows up if
+CONFIG_DETECT_HUNG_TASK and CONFIG_SMP are enabled.
+
+0: Won't show all CPUs backtraces when a hung task is detected.
+This is the default behavior.
+
+1: Will non-maskably interrupt all CPUs and dump their backtraces when
+a hung task is detected.
+
+
hung_task_panic
===============
@@ -344,6 +416,25 @@
= =========================================================
+ignore-unaligned-usertrap
+=========================
+
+On architectures where unaligned accesses cause traps, and where this
+feature is supported (``CONFIG_SYSCTL_ARCH_UNALIGN_NO_WARN``;
+currently, ``arc`` and ``ia64``), controls whether all unaligned traps
+are logged.
+
+= =============================================================
+0 Log all unaligned accesses.
+1 Only warn the first time a process traps. This is the default
+ setting.
+= =============================================================
+
+See also `unaligned-trap`_ and `unaligned-dump-stack`_. On ``ia64``,
+this allows system administrators to override the
+``IA64_THREAD_UAC_NOPRINT`` ``prctl`` and avoid logs being flooded.
+
+
kexec_load_disabled
===================
@@ -459,6 +550,15 @@
successful IPC object allocation. If an IPC object allocation syscall
fails, it is undefined if the value remains unmodified or is reset to -1.
+
+ngroups_max
+===========
+
+Maximum number of supplementary groups, _i.e._ the maximum size which
+``setgroups`` will accept. Exports ``NGROUPS_MAX`` from the kernel.
+
+
+
nmi_watchdog
============
@@ -546,6 +646,22 @@
scanned for a given scan.
+oops_all_cpu_backtrace:
+================
+
+If this option is set, the kernel will send an NMI to all CPUs to dump
+their backtraces when an oops event occurs. It should be used as a last
+resort in case a panic cannot be triggered (to protect VMs running, for
+example) or kdump can't be collected. This file shows up if CONFIG_SMP
+is enabled.
+
+0: Won't show all CPUs backtraces when an oops is detected.
+This is the default behavior.
+
+1: Will non-maskably interrupt all CPUs and dump their backtraces when
+an oops event is detected.
+
+
osrelease, ostype & version
===========================
@@ -721,7 +837,13 @@
===================
Controls use of the performance events system by unprivileged
-users (without CAP_SYS_ADMIN). The default value is 2.
+users (without CAP_PERFMON). The default value is 2.
+
+For backward compatibility reasons access to system performance
+monitoring and observability remains open for CAP_SYS_ADMIN
+privileged processes but CAP_SYS_ADMIN usage for secure system
+performance monitoring and observability operations is discouraged
+with respect to CAP_PERFMON use cases.
=== ==================================================================
-1 Allow use of (almost) all events by all users.
@@ -730,13 +852,13 @@
``CAP_IPC_LOCK``.
>=0 Disallow ftrace function tracepoint by users without
- ``CAP_SYS_ADMIN``.
+ ``CAP_PERFMON``.
- Disallow raw tracepoint access by users without ``CAP_SYS_ADMIN``.
+ Disallow raw tracepoint access by users without ``CAP_PERFMON``.
->=1 Disallow CPU event access by users without ``CAP_SYS_ADMIN``.
+>=1 Disallow CPU event access by users without ``CAP_PERFMON``.
->=2 Disallow kernel profiling by users without ``CAP_SYS_ADMIN``.
+>=2 Disallow kernel profiling by users without ``CAP_PERFMON``.
=== ==================================================================
@@ -871,7 +993,7 @@
pty
===
-See Documentation/filesystems/devpts.txt.
+See Documentation/filesystems/devpts.rst.
randomize_va_space
@@ -1147,6 +1269,13 @@
See :doc:`/admin-guide/tainted-kernels` for more information.
+Note:
+ writes to this sysctl interface will fail with ``EINVAL`` if the kernel is
+ booted with the command line option ``panic_on_taint=<bitmask>,nousertaint``
+ and any of the ORed together values being written to ``tainted`` match with
+ the bitmask declared on panic_on_taint.
+ See :doc:`/admin-guide/kernel-parameters` for more details on that particular
+ kernel command line option and its optional ``nousertaint`` switch.
threads-max
===========
@@ -1167,6 +1296,65 @@
``EINVAL`` error occurs.
+traceoff_on_warning
+===================
+
+When set, disables tracing (see :doc:`/trace/ftrace`) when a
+``WARN()`` is hit.
+
+
+tracepoint_printk
+=================
+
+When tracepoints are sent to printk() (enabled by the ``tp_printk``
+boot parameter), this entry provides runtime control::
+
+ echo 0 > /proc/sys/kernel/tracepoint_printk
+
+will stop tracepoints from being sent to printk(), and::
+
+ echo 1 > /proc/sys/kernel/tracepoint_printk
+
+will send them to printk() again.
+
+This only works if the kernel was booted with ``tp_printk`` enabled.
+
+See :doc:`/admin-guide/kernel-parameters` and
+:doc:`/trace/boottime-trace`.
+
+
+.. _unaligned-dump-stack:
+
+unaligned-dump-stack (ia64)
+===========================
+
+When logging unaligned accesses, controls whether the stack is
+dumped.
+
+= ===================================================
+0 Do not dump the stack. This is the default setting.
+1 Dump the stack.
+= ===================================================
+
+See also `ignore-unaligned-usertrap`_.
+
+
+unaligned-trap
+==============
+
+On architectures where unaligned accesses cause traps, and where this
+feature is supported (``CONFIG_SYSCTL_ARCH_UNALIGN_ALLOW``; currently,
+``arc`` and ``parisc``), controls whether unaligned traps are caught
+and emulated (instead of failing).
+
+= ========================================================
+0 Do not emulate unaligned accesses.
+1 Emulate unaligned accesses. This is the default setting.
+= ========================================================
+
+See also `ignore-unaligned-usertrap`_.
+
+
unknown_nmi_panic
=================
@@ -1178,6 +1366,16 @@
example. If a system hangs up, try pressing the NMI switch.
+unprivileged_bpf_disabled
+=========================
+
+Writing 1 to this entry will disable unprivileged calls to ``bpf()``;
+once disabled, calling ``bpf()`` without ``CAP_SYS_ADMIN`` will return
+``-EPERM``.
+
+Once set, this can't be cleared.
+
+
watchdog
========
diff --git a/Documentation/admin-guide/sysctl/net.rst b/Documentation/admin-guide/sysctl/net.rst
index e043c92..42cd04b 100644
--- a/Documentation/admin-guide/sysctl/net.rst
+++ b/Documentation/admin-guide/sysctl/net.rst
@@ -339,7 +339,9 @@
If set to 1, both IPv4 and IPv6 settings are forced to inherit from
current ones in init_net. If set to 2, both IPv4 and IPv6 settings are
-forced to reset to their default values.
+forced to reset to their default values. If set to 3, both IPv4 and IPv6
+settings are forced to inherit from current ones in the netns where this
+new netns has been created.
Default : 0 (for compatibility reasons)
@@ -353,8 +355,8 @@
3. /proc/sys/net/ipv4 - IPV4 settings
-------------------------------------
-Please see: Documentation/networking/ip-sysctl.txt and ipvs-sysctl.txt for
-descriptions of these entries.
+Please see: Documentation/networking/ip-sysctl.rst and
+Documentation/admin-guide/sysctl/net.rst for descriptions of these entries.
4. Appletalk
diff --git a/Documentation/admin-guide/sysctl/vm.rst b/Documentation/admin-guide/sysctl/vm.rst
index 0329a4d..d46d5b7 100644
--- a/Documentation/admin-guide/sysctl/vm.rst
+++ b/Documentation/admin-guide/sysctl/vm.rst
@@ -831,14 +831,27 @@
swappiness
==========
-This control is used to define how aggressive the kernel will swap
-memory pages. Higher values will increase aggressiveness, lower values
-decrease the amount of swap. A value of 0 instructs the kernel not to
-initiate swap until the amount of free and file-backed pages is less
-than the high water mark in a zone.
+This control is used to define the rough relative IO cost of swapping
+and filesystem paging, as a value between 0 and 200. At 100, the VM
+assumes equal IO cost and will thus apply memory pressure to the page
+cache and swap-backed pages equally; lower values signify more
+expensive swap IO, higher values indicates cheaper.
+
+Keep in mind that filesystem IO patterns under memory pressure tend to
+be more efficient than swap's random IO. An optimal value will require
+experimentation and will also be workload-dependent.
The default value is 60.
+For in-memory swap, like zram or zswap, as well as hybrid setups that
+have swap on faster devices than the filesystem, values beyond 100 can
+be considered. For example, if the random IO against the swap device
+is on average 2x faster than IO from the filesystem, swappiness should
+be 133 (x + 2x = 200, 2x = 133.33).
+
+At 0, the kernel will not initiate swap until the amount of free and
+file-backed pages is less than the high watermark in a zone.
+
unprivileged_userfaultfd
========================
diff --git a/Documentation/admin-guide/sysrq.rst b/Documentation/admin-guide/sysrq.rst
index a46209f..e6424d8 100644
--- a/Documentation/admin-guide/sysrq.rst
+++ b/Documentation/admin-guide/sysrq.rst
@@ -231,13 +231,13 @@
handler is called. Your handler must conform to the prototype in 'sysrq.h'.
After the ``sysrq_key_op`` is created, you can call the kernel function
-``register_sysrq_key(int key, struct sysrq_key_op *op_p);`` this will
+``register_sysrq_key(int key, const struct sysrq_key_op *op_p);`` this will
register the operation pointed to by ``op_p`` at table key 'key',
if that slot in the table is blank. At module unload time, you must call
-the function ``unregister_sysrq_key(int key, struct sysrq_key_op *op_p)``, which
-will remove the key op pointed to by 'op_p' from the key 'key', if and only if
-it is currently registered in that slot. This is in case the slot has been
-overwritten since you registered it.
+the function ``unregister_sysrq_key(int key, const struct sysrq_key_op *op_p)``,
+which will remove the key op pointed to by 'op_p' from the key 'key', if and
+only if it is currently registered in that slot. This is in case the slot has
+been overwritten since you registered it.
The Magic SysRQ system works by registering key operations against a key op
lookup table, which is defined in 'drivers/tty/sysrq.c'. This key table has
diff --git a/Documentation/admin-guide/unicode.rst b/Documentation/admin-guide/unicode.rst
index 7425a33..290fe83 100644
--- a/Documentation/admin-guide/unicode.rst
+++ b/Documentation/admin-guide/unicode.rst
@@ -114,7 +114,7 @@
This range is now officially managed by the ConScript Unicode
Registry. The normative reference is at:
- http://www.evertype.com/standards/csur/klingon.html
+ https://www.evertype.com/standards/csur/klingon.html
Klingon has an alphabet of 26 characters, a positional numeric writing
system with 10 digits, and is written left-to-right, top-to-bottom.
@@ -178,7 +178,7 @@
<jcowan@reutershealth.com> and Michael Everson <everson@evertype.com>.
The ConScript Unicode Registry is accessible at:
- http://www.evertype.com/standards/csur/
+ https://www.evertype.com/standards/csur/
The ranges used fall at the low end of the End User Zone and can hence
not be normatively assigned, but it is recommended that people who
diff --git a/Documentation/arm/microchip.rst b/Documentation/arm/microchip.rst
index 05e5f2d..9c01329 100644
--- a/Documentation/arm/microchip.rst
+++ b/Documentation/arm/microchip.rst
@@ -192,7 +192,7 @@
considered as "Unstable". To be completely clear, any at91 binding can change at
any time. So, be sure to use a Device Tree Binary and a Kernel Image generated from
the same source tree.
-Please refer to the Documentation/devicetree/bindings/ABI.txt file for a
+Please refer to the Documentation/devicetree/bindings/ABI.rst file for a
definition of a "Stable" binding/ABI.
This statement will be removed by AT91 MAINTAINERS when appropriate.
diff --git a/Documentation/arm64/amu.rst b/Documentation/arm64/amu.rst
index 5057b11..452ec8b 100644
--- a/Documentation/arm64/amu.rst
+++ b/Documentation/arm64/amu.rst
@@ -23,6 +23,7 @@
Version 1 of the Activity Monitors architecture implements a counter group
of four fixed and architecturally defined 64-bit event counters.
+
- CPU cycle counter: increments at the frequency of the CPU.
- Constant counter: increments at the fixed frequency of the system
clock.
@@ -57,6 +58,7 @@
Firmware (code running at higher exception levels, e.g. arm-tf) support is
needed to:
+
- Enable access for lower exception levels (EL2 and EL1) to the AMU
registers.
- Enable the counters. If not enabled these will read as 0.
@@ -78,6 +80,7 @@
The fixed counters of AMUv1 are accessible though the following system
register definitions:
+
- SYS_AMEVCNTR0_CORE_EL0
- SYS_AMEVCNTR0_CONST_EL0
- SYS_AMEVCNTR0_INST_RET_EL0
@@ -93,6 +96,7 @@
----------------
Currently, access from userspace to the AMU registers is disabled due to:
+
- Security reasons: they might expose information about code executed in
secure mode.
- Purpose: AMU counters are intended for system management use.
@@ -105,6 +109,7 @@
Currently, access from userspace (EL0) and kernelspace (EL1) on the KVM
guest side is disabled due to:
+
- Security reasons: they might expose information about code executed
by other guests or the host.
diff --git a/Documentation/arm64/booting.rst b/Documentation/arm64/booting.rst
index a3f1a47..7552dbc 100644
--- a/Documentation/arm64/booting.rst
+++ b/Documentation/arm64/booting.rst
@@ -173,7 +173,10 @@
- Caches, MMUs
The MMU must be off.
- Instruction cache may be on or off.
+
+ The instruction cache may be on or off, and must not hold any stale
+ entries corresponding to the loaded kernel image.
+
The address range corresponding to the loaded kernel image must be
cleaned to the PoC. In the presence of a system cache or other
coherent masters with caches enabled, this will typically require
@@ -238,6 +241,7 @@
- The DT or ACPI tables must describe a GICv2 interrupt controller.
For CPUs with pointer authentication functionality:
+
- If EL3 is present:
- SCR_EL3.APK (bit 16) must be initialised to 0b1
@@ -249,18 +253,22 @@
- HCR_EL2.API (bit 41) must be initialised to 0b1
For CPUs with Activity Monitors Unit v1 (AMUv1) extension present:
+
- If EL3 is present:
- CPTR_EL3.TAM (bit 30) must be initialised to 0b0
- CPTR_EL2.TAM (bit 30) must be initialised to 0b0
- AMCNTENSET0_EL0 must be initialised to 0b1111
- AMCNTENSET1_EL0 must be initialised to a platform specific value
- having 0b1 set for the corresponding bit for each of the auxiliary
- counters present.
+
+ - CPTR_EL3.TAM (bit 30) must be initialised to 0b0
+ - CPTR_EL2.TAM (bit 30) must be initialised to 0b0
+ - AMCNTENSET0_EL0 must be initialised to 0b1111
+ - AMCNTENSET1_EL0 must be initialised to a platform specific value
+ having 0b1 set for the corresponding bit for each of the auxiliary
+ counters present.
+
- If the kernel is entered at EL1:
- AMCNTENSET0_EL0 must be initialised to 0b1111
- AMCNTENSET1_EL0 must be initialised to a platform specific value
- having 0b1 set for the corresponding bit for each of the auxiliary
- counters present.
+
+ - AMCNTENSET0_EL0 must be initialised to 0b1111
+ - AMCNTENSET1_EL0 must be initialised to a platform specific value
+ having 0b1 set for the corresponding bit for each of the auxiliary
+ counters present.
The requirements described above for CPU mode, caches, MMUs, architected
timers, coherency and system registers apply to all CPUs. All CPUs must
@@ -304,7 +312,8 @@
Documentation/devicetree/bindings/arm/psci.yaml.
- Secondary CPU general-purpose register settings
- x0 = 0 (reserved for future use)
- x1 = 0 (reserved for future use)
- x2 = 0 (reserved for future use)
- x3 = 0 (reserved for future use)
+
+ - x0 = 0 (reserved for future use)
+ - x1 = 0 (reserved for future use)
+ - x2 = 0 (reserved for future use)
+ - x3 = 0 (reserved for future use)
diff --git a/Documentation/arm64/cpu-feature-registers.rst b/Documentation/arm64/cpu-feature-registers.rst
index 41937a8..314fa5b 100644
--- a/Documentation/arm64/cpu-feature-registers.rst
+++ b/Documentation/arm64/cpu-feature-registers.rst
@@ -176,6 +176,8 @@
+------------------------------+---------+---------+
| SSBS | [7-4] | y |
+------------------------------+---------+---------+
+ | BT | [3-0] | y |
+ +------------------------------+---------+---------+
4) MIDR_EL1 - Main ID Register
diff --git a/Documentation/arm64/elf_hwcaps.rst b/Documentation/arm64/elf_hwcaps.rst
index 7dfb97d..84a9fd2 100644
--- a/Documentation/arm64/elf_hwcaps.rst
+++ b/Documentation/arm64/elf_hwcaps.rst
@@ -236,6 +236,11 @@
Functionality implied by ID_AA64ISAR0_EL1.RNDR == 0b0001.
+HWCAP2_BTI
+
+ Functionality implied by ID_AA64PFR0_EL1.BT == 0b0001.
+
+
4. Unused AT_HWCAP bits
-----------------------
diff --git a/Documentation/arm64/silicon-errata.rst b/Documentation/arm64/silicon-errata.rst
index 2c08c62..936cf2a 100644
--- a/Documentation/arm64/silicon-errata.rst
+++ b/Documentation/arm64/silicon-errata.rst
@@ -64,6 +64,10 @@
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A53 | #843419 | ARM64_ERRATUM_843419 |
+----------------+-----------------+-----------------+-----------------------------+
+| ARM | Cortex-A55 | #1024718 | ARM64_ERRATUM_1024718 |
++----------------+-----------------+-----------------+-----------------------------+
+| ARM | Cortex-A55 | #1530923 | ARM64_ERRATUM_1530923 |
++----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A57 | #832075 | ARM64_ERRATUM_832075 |
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A57 | #852523 | N/A |
@@ -78,8 +82,6 @@
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A73 | #858921 | ARM64_ERRATUM_858921 |
+----------------+-----------------+-----------------+-----------------------------+
-| ARM | Cortex-A55 | #1024718 | ARM64_ERRATUM_1024718 |
-+----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A76 | #1188873,1418040| ARM64_ERRATUM_1418040 |
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A76 | #1165522 | ARM64_ERRATUM_1165522 |
@@ -88,8 +90,6 @@
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Cortex-A76 | #1463225 | ARM64_ERRATUM_1463225 |
+----------------+-----------------+-----------------+-----------------------------+
-| ARM | Cortex-A55 | #1530923 | ARM64_ERRATUM_1530923 |
-+----------------+-----------------+-----------------+-----------------------------+
| ARM | Neoverse-N1 | #1188873,1418040| ARM64_ERRATUM_1418040 |
+----------------+-----------------+-----------------+-----------------------------+
| ARM | Neoverse-N1 | #1349291 | N/A |
diff --git a/Documentation/block/biovecs.rst b/Documentation/block/biovecs.rst
index ad303a2..36771a1 100644
--- a/Documentation/block/biovecs.rst
+++ b/Documentation/block/biovecs.rst
@@ -129,6 +129,7 @@
::
bio_for_each_segment_all()
+ bio_for_each_bvec_all()
bio_first_bvec_all()
bio_first_page_all()
bio_last_bvec_all()
@@ -143,4 +144,5 @@
bio_vec' will contain a multi-page IO vector during the iteration::
bio_for_each_bvec()
+ bio_for_each_bvec_all()
rq_for_each_bvec()
diff --git a/Documentation/block/index.rst b/Documentation/block/index.rst
index 3fa7a52..026addf 100644
--- a/Documentation/block/index.rst
+++ b/Documentation/block/index.rst
@@ -14,6 +14,7 @@
cmdline-partition
data-integrity
deadline-iosched
+ inline-encryption
ioprio
kyber-iosched
null_blk
diff --git a/Documentation/block/inline-encryption.rst b/Documentation/block/inline-encryption.rst
new file mode 100644
index 0000000..354817b
--- /dev/null
+++ b/Documentation/block/inline-encryption.rst
@@ -0,0 +1,263 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=================
+Inline Encryption
+=================
+
+Background
+==========
+
+Inline encryption hardware sits logically between memory and the disk, and can
+en/decrypt data as it goes in/out of the disk. Inline encryption hardware has a
+fixed number of "keyslots" - slots into which encryption contexts (i.e. the
+encryption key, encryption algorithm, data unit size) can be programmed by the
+kernel at any time. Each request sent to the disk can be tagged with the index
+of a keyslot (and also a data unit number to act as an encryption tweak), and
+the inline encryption hardware will en/decrypt the data in the request with the
+encryption context programmed into that keyslot. This is very different from
+full disk encryption solutions like self encrypting drives/TCG OPAL/ATA
+Security standards, since with inline encryption, any block on disk could be
+encrypted with any encryption context the kernel chooses.
+
+
+Objective
+=========
+
+We want to support inline encryption (IE) in the kernel.
+To allow for testing, we also want a crypto API fallback when actual
+IE hardware is absent. We also want IE to work with layered devices
+like dm and loopback (i.e. we want to be able to use the IE hardware
+of the underlying devices if present, or else fall back to crypto API
+en/decryption).
+
+
+Constraints and notes
+=====================
+
+- IE hardware has a limited number of "keyslots" that can be programmed
+ with an encryption context (key, algorithm, data unit size, etc.) at any time.
+ One can specify a keyslot in a data request made to the device, and the
+ device will en/decrypt the data using the encryption context programmed into
+ that specified keyslot. When possible, we want to make multiple requests with
+ the same encryption context share the same keyslot.
+
+- We need a way for upper layers like filesystems to specify an encryption
+ context to use for en/decrypting a struct bio, and a device driver (like UFS)
+ needs to be able to use that encryption context when it processes the bio.
+
+- We need a way for device drivers to expose their inline encryption
+ capabilities in a unified way to the upper layers.
+
+
+Design
+======
+
+We add a :c:type:`struct bio_crypt_ctx` to :c:type:`struct bio` that can
+represent an encryption context, because we need to be able to pass this
+encryption context from the upper layers (like the fs layer) to the
+device driver to act upon.
+
+While IE hardware works on the notion of keyslots, the FS layer has no
+knowledge of keyslots - it simply wants to specify an encryption context to
+use while en/decrypting a bio.
+
+We introduce a keyslot manager (KSM) that handles the translation from
+encryption contexts specified by the FS to keyslots on the IE hardware.
+This KSM also serves as the way IE hardware can expose its capabilities to
+upper layers. The generic mode of operation is: each device driver that wants
+to support IE will construct a KSM and set it up in its struct request_queue.
+Upper layers that want to use IE on this device can then use this KSM in
+the device's struct request_queue to translate an encryption context into
+a keyslot. The presence of the KSM in the request queue shall be used to mean
+that the device supports IE.
+
+The KSM uses refcounts to track which keyslots are idle (either they have no
+encryption context programmed, or there are no in-flight struct bios
+referencing that keyslot). When a new encryption context needs a keyslot, it
+tries to find a keyslot that has already been programmed with the same
+encryption context, and if there is no such keyslot, it evicts the least
+recently used idle keyslot and programs the new encryption context into that
+one. If no idle keyslots are available, then the caller will sleep until there
+is at least one.
+
+
+blk-mq changes, other block layer changes and blk-crypto-fallback
+=================================================================
+
+We add a pointer to a ``bi_crypt_context`` and ``keyslot`` to
+:c:type:`struct request`. These will be referred to as the ``crypto fields``
+for the request. This ``keyslot`` is the keyslot into which the
+``bi_crypt_context`` has been programmed in the KSM of the ``request_queue``
+that this request is being sent to.
+
+We introduce ``block/blk-crypto-fallback.c``, which allows upper layers to remain
+blissfully unaware of whether or not real inline encryption hardware is present
+underneath. When a bio is submitted with a target ``request_queue`` that doesn't
+support the encryption context specified with the bio, the block layer will
+en/decrypt the bio with the blk-crypto-fallback.
+
+If the bio is a ``WRITE`` bio, a bounce bio is allocated, and the data in the bio
+is encrypted stored in the bounce bio - blk-mq will then proceed to process the
+bounce bio as if it were not encrypted at all (except when blk-integrity is
+concerned). ``blk-crypto-fallback`` sets the bounce bio's ``bi_end_io`` to an
+internal function that cleans up the bounce bio and ends the original bio.
+
+If the bio is a ``READ`` bio, the bio's ``bi_end_io`` (and also ``bi_private``)
+is saved and overwritten by ``blk-crypto-fallback`` to
+``bio_crypto_fallback_decrypt_bio``. The bio's ``bi_crypt_context`` is also
+overwritten with ``NULL``, so that to the rest of the stack, the bio looks
+as if it was a regular bio that never had an encryption context specified.
+``bio_crypto_fallback_decrypt_bio`` will decrypt the bio, restore the original
+``bi_end_io`` (and also ``bi_private``) and end the bio again.
+
+Regardless of whether real inline encryption hardware is used or the
+blk-crypto-fallback is used, the ciphertext written to disk (and hence the
+on-disk format of data) will be the same (assuming the hardware's implementation
+of the algorithm being used adheres to spec and functions correctly).
+
+If a ``request queue``'s inline encryption hardware claimed to support the
+encryption context specified with a bio, then it will not be handled by the
+``blk-crypto-fallback``. We will eventually reach a point in blk-mq when a
+:c:type:`struct request` needs to be allocated for that bio. At that point,
+blk-mq tries to program the encryption context into the ``request_queue``'s
+keyslot_manager, and obtain a keyslot, which it stores in its newly added
+``keyslot`` field. This keyslot is released when the request is completed.
+
+When the first bio is added to a request, ``blk_crypto_rq_bio_prep`` is called,
+which sets the request's ``crypt_ctx`` to a copy of the bio's
+``bi_crypt_context``. bio_crypt_do_front_merge is called whenever a subsequent
+bio is merged to the front of the request, which updates the ``crypt_ctx`` of
+the request so that it matches the newly merged bio's ``bi_crypt_context``. In particular, the request keeps a copy of the ``bi_crypt_context`` of the first
+bio in its bio-list (blk-mq needs to be careful to maintain this invariant
+during bio and request merges).
+
+To make it possible for inline encryption to work with request queue based
+layered devices, when a request is cloned, its ``crypto fields`` are cloned as
+well. When the cloned request is submitted, blk-mq programs the
+``bi_crypt_context`` of the request into the clone's request_queue's keyslot
+manager, and stores the returned keyslot in the clone's ``keyslot``.
+
+
+API presented to users of the block layer
+=========================================
+
+``struct blk_crypto_key`` represents a crypto key (the raw key, size of the
+key, the crypto algorithm to use, the data unit size to use, and the number of
+bytes required to represent data unit numbers that will be specified with the
+``bi_crypt_context``).
+
+``blk_crypto_init_key`` allows upper layers to initialize such a
+``blk_crypto_key``.
+
+``bio_crypt_set_ctx`` should be called on any bio that a user of
+the block layer wants en/decrypted via inline encryption (or the
+blk-crypto-fallback, if hardware support isn't available for the desired
+crypto configuration). This function takes the ``blk_crypto_key`` and the
+data unit number (DUN) to use when en/decrypting the bio.
+
+``blk_crypto_config_supported`` allows upper layers to query whether or not the
+an encryption context passed to request queue can be handled by blk-crypto
+(either by real inline encryption hardware, or by the blk-crypto-fallback).
+This is useful e.g. when blk-crypto-fallback is disabled, and the upper layer
+wants to use an algorithm that may not supported by hardware - this function
+lets the upper layer know ahead of time that the algorithm isn't supported,
+and the upper layer can fallback to something else if appropriate.
+
+``blk_crypto_start_using_key`` - Upper layers must call this function on
+``blk_crypto_key`` and a ``request_queue`` before using the key with any bio
+headed for that ``request_queue``. This function ensures that either the
+hardware supports the key's crypto settings, or the crypto API fallback has
+transforms for the needed mode allocated and ready to go. Note that this
+function may allocate an ``skcipher``, and must not be called from the data
+path, since allocating ``skciphers`` from the data path can deadlock.
+
+``blk_crypto_evict_key`` *must* be called by upper layers before a
+``blk_crypto_key`` is freed. Further, it *must* only be called only once
+there are no more in-flight requests that use that ``blk_crypto_key``.
+``blk_crypto_evict_key`` will ensure that a key is removed from any keyslots in
+inline encryption hardware that the key might have been programmed into (or the blk-crypto-fallback).
+
+API presented to device drivers
+===============================
+
+A :c:type:``struct blk_keyslot_manager`` should be set up by device drivers in
+the ``request_queue`` of the device. The device driver needs to call
+``blk_ksm_init`` on the ``blk_keyslot_manager``, which specifying the number of
+keyslots supported by the hardware.
+
+The device driver also needs to tell the KSM how to actually manipulate the
+IE hardware in the device to do things like programming the crypto key into
+the IE hardware into a particular keyslot. All this is achieved through the
+:c:type:`struct blk_ksm_ll_ops` field in the KSM that the device driver
+must fill up after initing the ``blk_keyslot_manager``.
+
+The KSM also handles runtime power management for the device when applicable
+(e.g. when it wants to program a crypto key into the IE hardware, the device
+must be runtime powered on) - so the device driver must also set the ``dev``
+field in the ksm to point to the `struct device` for the KSM to use for runtime
+power management.
+
+``blk_ksm_reprogram_all_keys`` can be called by device drivers if the device
+needs each and every of its keyslots to be reprogrammed with the key it
+"should have" at the point in time when the function is called. This is useful
+e.g. if a device loses all its keys on runtime power down/up.
+
+``blk_ksm_destroy`` should be called to free up all resources used by a keyslot
+manager upon ``blk_ksm_init``, once the ``blk_keyslot_manager`` is no longer
+needed.
+
+
+Layered Devices
+===============
+
+Request queue based layered devices like dm-rq that wish to support IE need to
+create their own keyslot manager for their request queue, and expose whatever
+functionality they choose. When a layered device wants to pass a clone of that
+request to another ``request_queue``, blk-crypto will initialize and prepare the
+clone as necessary - see ``blk_crypto_insert_cloned_request`` in
+``blk-crypto.c``.
+
+
+Future Optimizations for layered devices
+========================================
+
+Creating a keyslot manager for a layered device uses up memory for each
+keyslot, and in general, a layered device merely passes the request on to a
+"child" device, so the keyslots in the layered device itself are completely
+unused, and don't need any refcounting or keyslot programming. We can instead
+define a new type of KSM; the "passthrough KSM", that layered devices can use
+to advertise an unlimited number of keyslots, and support for any encryption
+algorithms they choose, while not actually using any memory for each keyslot.
+Another use case for the "passthrough KSM" is for IE devices that do not have a
+limited number of keyslots.
+
+
+Interaction between inline encryption and blk integrity
+=======================================================
+
+At the time of this patch, there is no real hardware that supports both these
+features. However, these features do interact with each other, and it's not
+completely trivial to make them both work together properly. In particular,
+when a WRITE bio wants to use inline encryption on a device that supports both
+features, the bio will have an encryption context specified, after which
+its integrity information is calculated (using the plaintext data, since
+the encryption will happen while data is being written), and the data and
+integrity info is sent to the device. Obviously, the integrity info must be
+verified before the data is encrypted. After the data is encrypted, the device
+must not store the integrity info that it received with the plaintext data
+since that might reveal information about the plaintext data. As such, it must
+re-generate the integrity info from the ciphertext data and store that on disk
+instead. Another issue with storing the integrity info of the plaintext data is
+that it changes the on disk format depending on whether hardware inline
+encryption support is present or the kernel crypto API fallback is used (since
+if the fallback is used, the device will receive the integrity info of the
+ciphertext, not that of the plaintext).
+
+Because there isn't any real hardware yet, it seems prudent to assume that
+hardware implementations might not implement both features together correctly,
+and disallow the combination for now. Whenever a device supports integrity, the
+kernel will pretend that the device does not support hardware inline encryption
+(by essentially setting the keyslot manager in the request_queue of the device
+to NULL). When the crypto API fallback is enabled, this means that all bios with
+and encryption context will use the fallback, and IO will complete as usual.
+When the fallback is disabled, a bio with an encryption context will be failed.
diff --git a/Documentation/bpf/bpf_devel_QA.rst b/Documentation/bpf/bpf_devel_QA.rst
index 38c15c6..0b3db91 100644
--- a/Documentation/bpf/bpf_devel_QA.rst
+++ b/Documentation/bpf/bpf_devel_QA.rst
@@ -437,6 +437,21 @@
See the kernels selftest `Documentation/dev-tools/kselftest.rst`_
document for further documentation.
+To maximize the number of tests passing, the .config of the kernel
+under test should match the config file fragment in
+tools/testing/selftests/bpf as closely as possible.
+
+Finally to ensure support for latest BPF Type Format features -
+discussed in `Documentation/bpf/btf.rst`_ - pahole version 1.16
+is required for kernels built with CONFIG_DEBUG_INFO_BTF=y.
+pahole is delivered in the dwarves package or can be built
+from source at
+
+https://github.com/acmel/dwarves
+
+Some distros have pahole version 1.16 packaged already, e.g.
+Fedora, Gentoo.
+
Q: Which BPF kernel selftests version should I run my kernel against?
---------------------------------------------------------------------
A: If you run a kernel ``xyz``, then always run the BPF kernel selftests
diff --git a/Documentation/bpf/index.rst b/Documentation/bpf/index.rst
index f99677f..38b4db8 100644
--- a/Documentation/bpf/index.rst
+++ b/Documentation/bpf/index.rst
@@ -7,7 +7,7 @@
This kernel side documentation is still work in progress. The main
textual documentation is (for historical reasons) described in
-`Documentation/networking/filter.txt`_, which describe both classical
+`Documentation/networking/filter.rst`_, which describe both classical
and extended BPF instruction-set.
The Cilium project also maintains a `BPF and XDP Reference Guide`_
that goes into great technical depth about the BPF Architecture.
@@ -59,7 +59,7 @@
.. Links:
-.. _Documentation/networking/filter.txt: ../networking/filter.txt
+.. _Documentation/networking/filter.rst: ../networking/filter.txt
.. _man-pages: https://www.kernel.org/doc/man-pages/
.. _bpf(2): http://man7.org/linux/man-pages/man2/bpf.2.html
.. _BPF and XDP Reference Guide: http://cilium.readthedocs.io/en/latest/bpf/
diff --git a/Documentation/bpf/ringbuf.rst b/Documentation/bpf/ringbuf.rst
new file mode 100644
index 0000000..75f943f
--- /dev/null
+++ b/Documentation/bpf/ringbuf.rst
@@ -0,0 +1,209 @@
+===============
+BPF ring buffer
+===============
+
+This document describes BPF ring buffer design, API, and implementation details.
+
+.. contents::
+ :local:
+ :depth: 2
+
+Motivation
+----------
+
+There are two distinctive motivators for this work, which are not satisfied by
+existing perf buffer, which prompted creation of a new ring buffer
+implementation.
+
+- more efficient memory utilization by sharing ring buffer across CPUs;
+- preserving ordering of events that happen sequentially in time, even across
+ multiple CPUs (e.g., fork/exec/exit events for a task).
+
+These two problems are independent, but perf buffer fails to satisfy both.
+Both are a result of a choice to have per-CPU perf ring buffer. Both can be
+also solved by having an MPSC implementation of ring buffer. The ordering
+problem could technically be solved for perf buffer with some in-kernel
+counting, but given the first one requires an MPSC buffer, the same solution
+would solve the second problem automatically.
+
+Semantics and APIs
+------------------
+
+Single ring buffer is presented to BPF programs as an instance of BPF map of
+type ``BPF_MAP_TYPE_RINGBUF``. Two other alternatives considered, but
+ultimately rejected.
+
+One way would be to, similar to ``BPF_MAP_TYPE_PERF_EVENT_ARRAY``, make
+``BPF_MAP_TYPE_RINGBUF`` could represent an array of ring buffers, but not
+enforce "same CPU only" rule. This would be more familiar interface compatible
+with existing perf buffer use in BPF, but would fail if application needed more
+advanced logic to lookup ring buffer by arbitrary key.
+``BPF_MAP_TYPE_HASH_OF_MAPS`` addresses this with current approach.
+Additionally, given the performance of BPF ringbuf, many use cases would just
+opt into a simple single ring buffer shared among all CPUs, for which current
+approach would be an overkill.
+
+Another approach could introduce a new concept, alongside BPF map, to represent
+generic "container" object, which doesn't necessarily have key/value interface
+with lookup/update/delete operations. This approach would add a lot of extra
+infrastructure that has to be built for observability and verifier support. It
+would also add another concept that BPF developers would have to familiarize
+themselves with, new syntax in libbpf, etc. But then would really provide no
+additional benefits over the approach of using a map. ``BPF_MAP_TYPE_RINGBUF``
+doesn't support lookup/update/delete operations, but so doesn't few other map
+types (e.g., queue and stack; array doesn't support delete, etc).
+
+The approach chosen has an advantage of re-using existing BPF map
+infrastructure (introspection APIs in kernel, libbpf support, etc), being
+familiar concept (no need to teach users a new type of object in BPF program),
+and utilizing existing tooling (bpftool). For common scenario of using a single
+ring buffer for all CPUs, it's as simple and straightforward, as would be with
+a dedicated "container" object. On the other hand, by being a map, it can be
+combined with ``ARRAY_OF_MAPS`` and ``HASH_OF_MAPS`` map-in-maps to implement
+a wide variety of topologies, from one ring buffer for each CPU (e.g., as
+a replacement for perf buffer use cases), to a complicated application
+hashing/sharding of ring buffers (e.g., having a small pool of ring buffers
+with hashed task's tgid being a look up key to preserve order, but reduce
+contention).
+
+Key and value sizes are enforced to be zero. ``max_entries`` is used to specify
+the size of ring buffer and has to be a power of 2 value.
+
+There are a bunch of similarities between perf buffer
+(``BPF_MAP_TYPE_PERF_EVENT_ARRAY``) and new BPF ring buffer semantics:
+
+- variable-length records;
+- if there is no more space left in ring buffer, reservation fails, no
+ blocking;
+- memory-mappable data area for user-space applications for ease of
+ consumption and high performance;
+- epoll notifications for new incoming data;
+- but still the ability to do busy polling for new data to achieve the
+ lowest latency, if necessary.
+
+BPF ringbuf provides two sets of APIs to BPF programs:
+
+- ``bpf_ringbuf_output()`` allows to *copy* data from one place to a ring
+ buffer, similarly to ``bpf_perf_event_output()``;
+- ``bpf_ringbuf_reserve()``/``bpf_ringbuf_commit()``/``bpf_ringbuf_discard()``
+ APIs split the whole process into two steps. First, a fixed amount of space
+ is reserved. If successful, a pointer to a data inside ring buffer data
+ area is returned, which BPF programs can use similarly to a data inside
+ array/hash maps. Once ready, this piece of memory is either committed or
+ discarded. Discard is similar to commit, but makes consumer ignore the
+ record.
+
+``bpf_ringbuf_output()`` has disadvantage of incurring extra memory copy,
+because record has to be prepared in some other place first. But it allows to
+submit records of the length that's not known to verifier beforehand. It also
+closely matches ``bpf_perf_event_output()``, so will simplify migration
+significantly.
+
+``bpf_ringbuf_reserve()`` avoids the extra copy of memory by providing a memory
+pointer directly to ring buffer memory. In a lot of cases records are larger
+than BPF stack space allows, so many programs have use extra per-CPU array as
+a temporary heap for preparing sample. bpf_ringbuf_reserve() avoid this needs
+completely. But in exchange, it only allows a known constant size of memory to
+be reserved, such that verifier can verify that BPF program can't access memory
+outside its reserved record space. bpf_ringbuf_output(), while slightly slower
+due to extra memory copy, covers some use cases that are not suitable for
+``bpf_ringbuf_reserve()``.
+
+The difference between commit and discard is very small. Discard just marks
+a record as discarded, and such records are supposed to be ignored by consumer
+code. Discard is useful for some advanced use-cases, such as ensuring
+all-or-nothing multi-record submission, or emulating temporary
+``malloc()``/``free()`` within single BPF program invocation.
+
+Each reserved record is tracked by verifier through existing
+reference-tracking logic, similar to socket ref-tracking. It is thus
+impossible to reserve a record, but forget to submit (or discard) it.
+
+``bpf_ringbuf_query()`` helper allows to query various properties of ring
+buffer. Currently 4 are supported:
+
+- ``BPF_RB_AVAIL_DATA`` returns amount of unconsumed data in ring buffer;
+- ``BPF_RB_RING_SIZE`` returns the size of ring buffer;
+- ``BPF_RB_CONS_POS``/``BPF_RB_PROD_POS`` returns current logical possition
+ of consumer/producer, respectively.
+
+Returned values are momentarily snapshots of ring buffer state and could be
+off by the time helper returns, so this should be used only for
+debugging/reporting reasons or for implementing various heuristics, that take
+into account highly-changeable nature of some of those characteristics.
+
+One such heuristic might involve more fine-grained control over poll/epoll
+notifications about new data availability in ring buffer. Together with
+``BPF_RB_NO_WAKEUP``/``BPF_RB_FORCE_WAKEUP`` flags for output/commit/discard
+helpers, it allows BPF program a high degree of control and, e.g., more
+efficient batched notifications. Default self-balancing strategy, though,
+should be adequate for most applications and will work reliable and efficiently
+already.
+
+Design and Implementation
+-------------------------
+
+This reserve/commit schema allows a natural way for multiple producers, either
+on different CPUs or even on the same CPU/in the same BPF program, to reserve
+independent records and work with them without blocking other producers. This
+means that if BPF program was interruped by another BPF program sharing the
+same ring buffer, they will both get a record reserved (provided there is
+enough space left) and can work with it and submit it independently. This
+applies to NMI context as well, except that due to using a spinlock during
+reservation, in NMI context, ``bpf_ringbuf_reserve()`` might fail to get
+a lock, in which case reservation will fail even if ring buffer is not full.
+
+The ring buffer itself internally is implemented as a power-of-2 sized
+circular buffer, with two logical and ever-increasing counters (which might
+wrap around on 32-bit architectures, that's not a problem):
+
+- consumer counter shows up to which logical position consumer consumed the
+ data;
+- producer counter denotes amount of data reserved by all producers.
+
+Each time a record is reserved, producer that "owns" the record will
+successfully advance producer counter. At that point, data is still not yet
+ready to be consumed, though. Each record has 8 byte header, which contains the
+length of reserved record, as well as two extra bits: busy bit to denote that
+record is still being worked on, and discard bit, which might be set at commit
+time if record is discarded. In the latter case, consumer is supposed to skip
+the record and move on to the next one. Record header also encodes record's
+relative offset from the beginning of ring buffer data area (in pages). This
+allows ``bpf_ringbuf_commit()``/``bpf_ringbuf_discard()`` to accept only the
+pointer to the record itself, without requiring also the pointer to ring buffer
+itself. Ring buffer memory location will be restored from record metadata
+header. This significantly simplifies verifier, as well as improving API
+usability.
+
+Producer counter increments are serialized under spinlock, so there is
+a strict ordering between reservations. Commits, on the other hand, are
+completely lockless and independent. All records become available to consumer
+in the order of reservations, but only after all previous records where
+already committed. It is thus possible for slow producers to temporarily hold
+off submitted records, that were reserved later.
+
+Reservation/commit/consumer protocol is verified by litmus tests in
+Documentation/litmus_tests/bpf-rb/_.
+
+One interesting implementation bit, that significantly simplifies (and thus
+speeds up as well) implementation of both producers and consumers is how data
+area is mapped twice contiguously back-to-back in the virtual memory. This
+allows to not take any special measures for samples that have to wrap around
+at the end of the circular buffer data area, because the next page after the
+last data page would be first data page again, and thus the sample will still
+appear completely contiguous in virtual memory. See comment and a simple ASCII
+diagram showing this visually in ``bpf_ringbuf_area_alloc()``.
+
+Another feature that distinguishes BPF ringbuf from perf ring buffer is
+a self-pacing notifications of new data being availability.
+``bpf_ringbuf_commit()`` implementation will send a notification of new record
+being available after commit only if consumer has already caught up right up to
+the record being committed. If not, consumer still has to catch up and thus
+will see new data anyways without needing an extra poll notification.
+Benchmarks (see tools/testing/selftests/bpf/benchs/bench_ringbuf.c_) show that
+this allows to achieve a very high throughput without having to resort to
+tricks like "notify only every Nth sample", which are necessary with perf
+buffer. For extreme cases, when BPF program wants more manual control of
+notifications, commit/discard/output helpers accept ``BPF_RB_NO_WAKEUP`` and
+``BPF_RB_FORCE_WAKEUP`` flags, which give full control over notifications of
+data availability, but require extra caution and diligence in using this API.
diff --git a/Documentation/conf.py b/Documentation/conf.py
index 9ae8e9a..c503188 100644
--- a/Documentation/conf.py
+++ b/Documentation/conf.py
@@ -388,44 +388,6 @@
# author, documentclass [howto, manual, or own class]).
# Sorted in alphabetical order
latex_documents = [
- ('admin-guide/index', 'linux-user.tex', 'Linux Kernel User Documentation',
- 'The kernel development community', 'manual'),
- ('core-api/index', 'core-api.tex', 'The kernel core API manual',
- 'The kernel development community', 'manual'),
- ('crypto/index', 'crypto-api.tex', 'Linux Kernel Crypto API manual',
- 'The kernel development community', 'manual'),
- ('dev-tools/index', 'dev-tools.tex', 'Development tools for the Kernel',
- 'The kernel development community', 'manual'),
- ('doc-guide/index', 'kernel-doc-guide.tex', 'Linux Kernel Documentation Guide',
- 'The kernel development community', 'manual'),
- ('driver-api/index', 'driver-api.tex', 'The kernel driver API manual',
- 'The kernel development community', 'manual'),
- ('filesystems/index', 'filesystems.tex', 'Linux Filesystems API',
- 'The kernel development community', 'manual'),
- ('admin-guide/ext4', 'ext4-admin-guide.tex', 'ext4 Administration Guide',
- 'ext4 Community', 'manual'),
- ('filesystems/ext4/index', 'ext4-data-structures.tex',
- 'ext4 Data Structures and Algorithms', 'ext4 Community', 'manual'),
- ('gpu/index', 'gpu.tex', 'Linux GPU Driver Developer\'s Guide',
- 'The kernel development community', 'manual'),
- ('input/index', 'linux-input.tex', 'The Linux input driver subsystem',
- 'The kernel development community', 'manual'),
- ('kernel-hacking/index', 'kernel-hacking.tex', 'Unreliable Guide To Hacking The Linux Kernel',
- 'The kernel development community', 'manual'),
- ('media/index', 'media.tex', 'Linux Media Subsystem Documentation',
- 'The kernel development community', 'manual'),
- ('networking/index', 'networking.tex', 'Linux Networking Documentation',
- 'The kernel development community', 'manual'),
- ('process/index', 'development-process.tex', 'Linux Kernel Development Documentation',
- 'The kernel development community', 'manual'),
- ('security/index', 'security.tex', 'The kernel security subsystem manual',
- 'The kernel development community', 'manual'),
- ('sh/index', 'sh.tex', 'SuperH architecture implementation manual',
- 'The kernel development community', 'manual'),
- ('sound/index', 'sound.tex', 'Linux Sound Subsystem Documentation',
- 'The kernel development community', 'manual'),
- ('userspace-api/index', 'userspace-api.tex', 'The Linux kernel user-space API guide',
- 'The kernel development community', 'manual'),
]
# Add all other index files from Documentation/ subdirectories
@@ -576,7 +538,7 @@
# Grouping the document tree into PDF files. List of tuples
# (source start file, target name, title, author, options).
#
-# See the Sphinx chapter of http://ralsina.me/static/manual.pdf
+# See the Sphinx chapter of https://ralsina.me/static/manual.pdf
#
# FIXME: Do not add the index file here; the result will be too big. Adding
# multiple PDF files here actually tries to get the cross-referencing right
diff --git a/Documentation/core-api/cachetlb.rst b/Documentation/core-api/cachetlb.rst
index 93cb65d..a1582cc 100644
--- a/Documentation/core-api/cachetlb.rst
+++ b/Documentation/core-api/cachetlb.rst
@@ -213,7 +213,7 @@
there will be no entries in the cache for the kernel address
space for virtual addresses in the range 'start' to 'end-1'.
- The first of these two routines is invoked after map_vm_area()
+ The first of these two routines is invoked after map_kernel_range()
has installed the page table entries. The second is invoked
before unmap_kernel_range() deletes the page table entries.
diff --git a/Documentation/debugging-via-ohci1394.txt b/Documentation/core-api/debugging-via-ohci1394.rst
similarity index 100%
rename from Documentation/debugging-via-ohci1394.txt
rename to Documentation/core-api/debugging-via-ohci1394.rst
diff --git a/Documentation/DMA-API-HOWTO.txt b/Documentation/core-api/dma-api-howto.rst
similarity index 100%
rename from Documentation/DMA-API-HOWTO.txt
rename to Documentation/core-api/dma-api-howto.rst
diff --git a/Documentation/DMA-API.txt b/Documentation/core-api/dma-api.rst
similarity index 100%
rename from Documentation/DMA-API.txt
rename to Documentation/core-api/dma-api.rst
diff --git a/Documentation/DMA-attributes.txt b/Documentation/core-api/dma-attributes.rst
similarity index 100%
rename from Documentation/DMA-attributes.txt
rename to Documentation/core-api/dma-attributes.rst
diff --git a/Documentation/DMA-ISA-LPC.txt b/Documentation/core-api/dma-isa-lpc.rst
similarity index 100%
rename from Documentation/DMA-ISA-LPC.txt
rename to Documentation/core-api/dma-isa-lpc.rst
diff --git a/Documentation/core-api/index.rst b/Documentation/core-api/index.rst
index 0897ad1..15ab861 100644
--- a/Documentation/core-api/index.rst
+++ b/Documentation/core-api/index.rst
@@ -18,6 +18,7 @@
kernel-api
workqueue
+ printk-basics
printk-formats
symbol-namespaces
@@ -30,10 +31,12 @@
:maxdepth: 1
kobject
+ kref
assoc_array
xarray
idr
circular-buffers
+ rbtree
generic-radix-tree
packing
timekeeping
@@ -50,6 +53,7 @@
atomic_ops
refcount-vs-atomic
+ irq/index
local_ops
padata
../RCU/index
@@ -78,6 +82,10 @@
:maxdepth: 1
memory-allocation
+ dma-api
+ dma-api-howto
+ dma-attributes
+ dma-isa-lpc
mm-api
genalloc
pin_user_pages
@@ -92,6 +100,7 @@
debug-objects
tracepoint
+ debugging-via-ohci1394
Everything else
===============
diff --git a/Documentation/IRQ.txt b/Documentation/core-api/irq/concepts.rst
similarity index 100%
rename from Documentation/IRQ.txt
rename to Documentation/core-api/irq/concepts.rst
diff --git a/Documentation/core-api/irq/index.rst b/Documentation/core-api/irq/index.rst
new file mode 100644
index 0000000..0d65d11
--- /dev/null
+++ b/Documentation/core-api/irq/index.rst
@@ -0,0 +1,11 @@
+====
+IRQs
+====
+
+.. toctree::
+ :maxdepth: 1
+
+ concepts
+ irq-affinity
+ irq-domain
+ irqflags-tracing
diff --git a/Documentation/IRQ-affinity.txt b/Documentation/core-api/irq/irq-affinity.rst
similarity index 100%
rename from Documentation/IRQ-affinity.txt
rename to Documentation/core-api/irq/irq-affinity.rst
diff --git a/Documentation/core-api/irq/irq-domain.rst b/Documentation/core-api/irq/irq-domain.rst
new file mode 100644
index 0000000..096db12
--- /dev/null
+++ b/Documentation/core-api/irq/irq-domain.rst
@@ -0,0 +1,270 @@
+===============================================
+The irq_domain interrupt number mapping library
+===============================================
+
+The current design of the Linux kernel uses a single large number
+space where each separate IRQ source is assigned a different number.
+This is simple when there is only one interrupt controller, but in
+systems with multiple interrupt controllers the kernel must ensure
+that each one gets assigned non-overlapping allocations of Linux
+IRQ numbers.
+
+The number of interrupt controllers registered as unique irqchips
+show a rising tendency: for example subdrivers of different kinds
+such as GPIO controllers avoid reimplementing identical callback
+mechanisms as the IRQ core system by modelling their interrupt
+handlers as irqchips, i.e. in effect cascading interrupt controllers.
+
+Here the interrupt number loose all kind of correspondence to
+hardware interrupt numbers: whereas in the past, IRQ numbers could
+be chosen so they matched the hardware IRQ line into the root
+interrupt controller (i.e. the component actually fireing the
+interrupt line to the CPU) nowadays this number is just a number.
+
+For this reason we need a mechanism to separate controller-local
+interrupt numbers, called hardware irq's, from Linux IRQ numbers.
+
+The irq_alloc_desc*() and irq_free_desc*() APIs provide allocation of
+irq numbers, but they don't provide any support for reverse mapping of
+the controller-local IRQ (hwirq) number into the Linux IRQ number
+space.
+
+The irq_domain library adds mapping between hwirq and IRQ numbers on
+top of the irq_alloc_desc*() API. An irq_domain to manage mapping is
+preferred over interrupt controller drivers open coding their own
+reverse mapping scheme.
+
+irq_domain also implements translation from an abstract irq_fwspec
+structure to hwirq numbers (Device Tree and ACPI GSI so far), and can
+be easily extended to support other IRQ topology data sources.
+
+irq_domain usage
+================
+
+An interrupt controller driver creates and registers an irq_domain by
+calling one of the irq_domain_add_*() functions (each mapping method
+has a different allocator function, more on that later). The function
+will return a pointer to the irq_domain on success. The caller must
+provide the allocator function with an irq_domain_ops structure.
+
+In most cases, the irq_domain will begin empty without any mappings
+between hwirq and IRQ numbers. Mappings are added to the irq_domain
+by calling irq_create_mapping() which accepts the irq_domain and a
+hwirq number as arguments. If a mapping for the hwirq doesn't already
+exist then it will allocate a new Linux irq_desc, associate it with
+the hwirq, and call the .map() callback so the driver can perform any
+required hardware setup.
+
+When an interrupt is received, irq_find_mapping() function should
+be used to find the Linux IRQ number from the hwirq number.
+
+The irq_create_mapping() function must be called *atleast once*
+before any call to irq_find_mapping(), lest the descriptor will not
+be allocated.
+
+If the driver has the Linux IRQ number or the irq_data pointer, and
+needs to know the associated hwirq number (such as in the irq_chip
+callbacks) then it can be directly obtained from irq_data->hwirq.
+
+Types of irq_domain mappings
+============================
+
+There are several mechanisms available for reverse mapping from hwirq
+to Linux irq, and each mechanism uses a different allocation function.
+Which reverse map type should be used depends on the use case. Each
+of the reverse map types are described below:
+
+Linear
+------
+
+::
+
+ irq_domain_add_linear()
+ irq_domain_create_linear()
+
+The linear reverse map maintains a fixed size table indexed by the
+hwirq number. When a hwirq is mapped, an irq_desc is allocated for
+the hwirq, and the IRQ number is stored in the table.
+
+The Linear map is a good choice when the maximum number of hwirqs is
+fixed and a relatively small number (~ < 256). The advantages of this
+map are fixed time lookup for IRQ numbers, and irq_descs are only
+allocated for in-use IRQs. The disadvantage is that the table must be
+as large as the largest possible hwirq number.
+
+irq_domain_add_linear() and irq_domain_create_linear() are functionally
+equivalent, except for the first argument is different - the former
+accepts an Open Firmware specific 'struct device_node', while the latter
+accepts a more general abstraction 'struct fwnode_handle'.
+
+The majority of drivers should use the linear map.
+
+Tree
+----
+
+::
+
+ irq_domain_add_tree()
+ irq_domain_create_tree()
+
+The irq_domain maintains a radix tree map from hwirq numbers to Linux
+IRQs. When an hwirq is mapped, an irq_desc is allocated and the
+hwirq is used as the lookup key for the radix tree.
+
+The tree map is a good choice if the hwirq number can be very large
+since it doesn't need to allocate a table as large as the largest
+hwirq number. The disadvantage is that hwirq to IRQ number lookup is
+dependent on how many entries are in the table.
+
+irq_domain_add_tree() and irq_domain_create_tree() are functionally
+equivalent, except for the first argument is different - the former
+accepts an Open Firmware specific 'struct device_node', while the latter
+accepts a more general abstraction 'struct fwnode_handle'.
+
+Very few drivers should need this mapping.
+
+No Map
+------
+
+::
+
+ irq_domain_add_nomap()
+
+The No Map mapping is to be used when the hwirq number is
+programmable in the hardware. In this case it is best to program the
+Linux IRQ number into the hardware itself so that no mapping is
+required. Calling irq_create_direct_mapping() will allocate a Linux
+IRQ number and call the .map() callback so that driver can program the
+Linux IRQ number into the hardware.
+
+Most drivers cannot use this mapping.
+
+Legacy
+------
+
+::
+
+ irq_domain_add_simple()
+ irq_domain_add_legacy()
+ irq_domain_add_legacy_isa()
+
+The Legacy mapping is a special case for drivers that already have a
+range of irq_descs allocated for the hwirqs. It is used when the
+driver cannot be immediately converted to use the linear mapping. For
+example, many embedded system board support files use a set of #defines
+for IRQ numbers that are passed to struct device registrations. In that
+case the Linux IRQ numbers cannot be dynamically assigned and the legacy
+mapping should be used.
+
+The legacy map assumes a contiguous range of IRQ numbers has already
+been allocated for the controller and that the IRQ number can be
+calculated by adding a fixed offset to the hwirq number, and
+visa-versa. The disadvantage is that it requires the interrupt
+controller to manage IRQ allocations and it requires an irq_desc to be
+allocated for every hwirq, even if it is unused.
+
+The legacy map should only be used if fixed IRQ mappings must be
+supported. For example, ISA controllers would use the legacy map for
+mapping Linux IRQs 0-15 so that existing ISA drivers get the correct IRQ
+numbers.
+
+Most users of legacy mappings should use irq_domain_add_simple() which
+will use a legacy domain only if an IRQ range is supplied by the
+system and will otherwise use a linear domain mapping. The semantics
+of this call are such that if an IRQ range is specified then
+descriptors will be allocated on-the-fly for it, and if no range is
+specified it will fall through to irq_domain_add_linear() which means
+*no* irq descriptors will be allocated.
+
+A typical use case for simple domains is where an irqchip provider
+is supporting both dynamic and static IRQ assignments.
+
+In order to avoid ending up in a situation where a linear domain is
+used and no descriptor gets allocated it is very important to make sure
+that the driver using the simple domain call irq_create_mapping()
+before any irq_find_mapping() since the latter will actually work
+for the static IRQ assignment case.
+
+Hierarchy IRQ domain
+--------------------
+
+On some architectures, there may be multiple interrupt controllers
+involved in delivering an interrupt from the device to the target CPU.
+Let's look at a typical interrupt delivering path on x86 platforms::
+
+ Device --> IOAPIC -> Interrupt remapping Controller -> Local APIC -> CPU
+
+There are three interrupt controllers involved:
+
+1) IOAPIC controller
+2) Interrupt remapping controller
+3) Local APIC controller
+
+To support such a hardware topology and make software architecture match
+hardware architecture, an irq_domain data structure is built for each
+interrupt controller and those irq_domains are organized into hierarchy.
+When building irq_domain hierarchy, the irq_domain near to the device is
+child and the irq_domain near to CPU is parent. So a hierarchy structure
+as below will be built for the example above::
+
+ CPU Vector irq_domain (root irq_domain to manage CPU vectors)
+ ^
+ |
+ Interrupt Remapping irq_domain (manage irq_remapping entries)
+ ^
+ |
+ IOAPIC irq_domain (manage IOAPIC delivery entries/pins)
+
+There are four major interfaces to use hierarchy irq_domain:
+
+1) irq_domain_alloc_irqs(): allocate IRQ descriptors and interrupt
+ controller related resources to deliver these interrupts.
+2) irq_domain_free_irqs(): free IRQ descriptors and interrupt controller
+ related resources associated with these interrupts.
+3) irq_domain_activate_irq(): activate interrupt controller hardware to
+ deliver the interrupt.
+4) irq_domain_deactivate_irq(): deactivate interrupt controller hardware
+ to stop delivering the interrupt.
+
+Following changes are needed to support hierarchy irq_domain:
+
+1) a new field 'parent' is added to struct irq_domain; it's used to
+ maintain irq_domain hierarchy information.
+2) a new field 'parent_data' is added to struct irq_data; it's used to
+ build hierarchy irq_data to match hierarchy irq_domains. The irq_data
+ is used to store irq_domain pointer and hardware irq number.
+3) new callbacks are added to struct irq_domain_ops to support hierarchy
+ irq_domain operations.
+
+With support of hierarchy irq_domain and hierarchy irq_data ready, an
+irq_domain structure is built for each interrupt controller, and an
+irq_data structure is allocated for each irq_domain associated with an
+IRQ. Now we could go one step further to support stacked(hierarchy)
+irq_chip. That is, an irq_chip is associated with each irq_data along
+the hierarchy. A child irq_chip may implement a required action by
+itself or by cooperating with its parent irq_chip.
+
+With stacked irq_chip, interrupt controller driver only needs to deal
+with the hardware managed by itself and may ask for services from its
+parent irq_chip when needed. So we could achieve a much cleaner
+software architecture.
+
+For an interrupt controller driver to support hierarchy irq_domain, it
+needs to:
+
+1) Implement irq_domain_ops.alloc and irq_domain_ops.free
+2) Optionally implement irq_domain_ops.activate and
+ irq_domain_ops.deactivate.
+3) Optionally implement an irq_chip to manage the interrupt controller
+ hardware.
+4) No need to implement irq_domain_ops.map and irq_domain_ops.unmap,
+ they are unused with hierarchy irq_domain.
+
+Hierarchy irq_domain is in no way x86 specific, and is heavily used to
+support other architectures, such as ARM, ARM64 etc.
+
+Debugging
+=========
+
+Most of the internals of the IRQ subsystem are exposed in debugfs by
+turning CONFIG_GENERIC_IRQ_DEBUGFS on.
diff --git a/Documentation/irqflags-tracing.txt b/Documentation/core-api/irq/irqflags-tracing.rst
similarity index 100%
rename from Documentation/irqflags-tracing.txt
rename to Documentation/core-api/irq/irqflags-tracing.rst
diff --git a/Documentation/core-api/kobject.rst b/Documentation/core-api/kobject.rst
index 1f62d4d..e93dc8c 100644
--- a/Documentation/core-api/kobject.rst
+++ b/Documentation/core-api/kobject.rst
@@ -80,11 +80,11 @@
(such as assuming that the kobject is at the beginning of the structure)
and, instead, use the container_of() macro, found in ``<linux/kernel.h>``::
- container_of(pointer, type, member)
+ container_of(ptr, type, member)
where:
- * ``pointer`` is the pointer to the embedded kobject,
+ * ``ptr`` is the pointer to the embedded kobject,
* ``type`` is the type of the containing structure, and
* ``member`` is the name of the structure field to which ``pointer`` points.
@@ -140,7 +140,7 @@
int kobject_rename(struct kobject *kobj, const char *new_name);
-kobject_rename does not perform any locking or have a solid notion of
+kobject_rename() does not perform any locking or have a solid notion of
what names are valid so the caller must provide their own sanity checking
and serialization.
@@ -210,7 +210,7 @@
If all that you want to use a kobject for is to provide a reference counter
for your structure, please use the struct kref instead; a kobject would be
overkill. For more information on how to use struct kref, please see the
-file Documentation/kref.txt in the Linux kernel source tree.
+file Documentation/core-api/kref.rst in the Linux kernel source tree.
Creating "simple" kobjects
@@ -222,17 +222,17 @@
exception where a single kobject should be created. To create such an
entry, use the function::
- struct kobject *kobject_create_and_add(char *name, struct kobject *parent);
+ struct kobject *kobject_create_and_add(const char *name, struct kobject *parent);
This function will create a kobject and place it in sysfs in the location
underneath the specified parent kobject. To create simple attributes
associated with this kobject, use::
- int sysfs_create_file(struct kobject *kobj, struct attribute *attr);
+ int sysfs_create_file(struct kobject *kobj, const struct attribute *attr);
or::
- int sysfs_create_group(struct kobject *kobj, struct attribute_group *grp);
+ int sysfs_create_group(struct kobject *kobj, const struct attribute_group *grp);
Both types of attributes used here, with a kobject that has been created
with the kobject_create_and_add(), can be of type kobj_attribute, so no
@@ -300,8 +300,10 @@
void (*release)(struct kobject *kobj);
const struct sysfs_ops *sysfs_ops;
struct attribute **default_attrs;
+ const struct attribute_group **default_groups;
const struct kobj_ns_type_operations *(*child_ns_type)(struct kobject *kobj);
const void *(*namespace)(struct kobject *kobj);
+ void (*get_ownership)(struct kobject *kobj, kuid_t *uid, kgid_t *gid);
};
This structure is used to describe a particular type of kobject (or, more
@@ -352,12 +354,12 @@
kset use::
struct kset *kset_create_and_add(const char *name,
- struct kset_uevent_ops *u,
- struct kobject *parent);
+ const struct kset_uevent_ops *uevent_ops,
+ struct kobject *parent_kobj);
When you are finished with the kset, call::
- void kset_unregister(struct kset *kset);
+ void kset_unregister(struct kset *k);
to destroy it. This removes the kset from sysfs and decrements its reference
count. When the reference count goes to zero, the kset will be released.
@@ -371,9 +373,9 @@
associated with it, it can use the struct kset_uevent_ops to handle it::
struct kset_uevent_ops {
- int (*filter)(struct kset *kset, struct kobject *kobj);
- const char *(*name)(struct kset *kset, struct kobject *kobj);
- int (*uevent)(struct kset *kset, struct kobject *kobj,
+ int (* const filter)(struct kset *kset, struct kobject *kobj);
+ const char *(* const name)(struct kset *kset, struct kobject *kobj);
+ int (* const uevent)(struct kset *kset, struct kobject *kobj,
struct kobj_uevent_env *env);
};
diff --git a/Documentation/kref.txt b/Documentation/core-api/kref.rst
similarity index 100%
rename from Documentation/kref.txt
rename to Documentation/core-api/kref.rst
diff --git a/Documentation/core-api/padata.rst b/Documentation/core-api/padata.rst
index 9a24c11..0830e5b 100644
--- a/Documentation/core-api/padata.rst
+++ b/Documentation/core-api/padata.rst
@@ -4,23 +4,26 @@
The padata parallel execution mechanism
=======================================
-:Date: December 2019
+:Date: May 2020
Padata is a mechanism by which the kernel can farm jobs out to be done in
-parallel on multiple CPUs while retaining their ordering. It was developed for
-use with the IPsec code, which needs to be able to perform encryption and
-decryption on large numbers of packets without reordering those packets. The
-crypto developers made a point of writing padata in a sufficiently general
-fashion that it could be put to other uses as well.
+parallel on multiple CPUs while optionally retaining their ordering.
-Usage
-=====
+It was originally developed for IPsec, which needs to perform encryption and
+decryption on large numbers of packets without reordering those packets. This
+is currently the sole consumer of padata's serialized job support.
+
+Padata also supports multithreaded jobs, splitting up the job evenly while load
+balancing and coordinating between threads.
+
+Running Serialized Jobs
+=======================
Initializing
------------
-The first step in using padata is to set up a padata_instance structure for
-overall control of how jobs are to be run::
+The first step in using padata to run serialized jobs is to set up a
+padata_instance structure for overall control of how jobs are to be run::
#include <linux/padata.h>
@@ -162,6 +165,24 @@
It is the user's responsibility to ensure all outstanding jobs are complete
before any of the above are called.
+Running Multithreaded Jobs
+==========================
+
+A multithreaded job has a main thread and zero or more helper threads, with the
+main thread participating in the job and then waiting until all helpers have
+finished. padata splits the job into units called chunks, where a chunk is a
+piece of the job that one thread completes in one call to the thread function.
+
+A user has to do three things to run a multithreaded job. First, describe the
+job by defining a padata_mt_job structure, which is explained in the Interface
+section. This includes a pointer to the thread function, which padata will
+call each time it assigns a job chunk to a thread. Then, define the thread
+function, which accepts three arguments, ``start``, ``end``, and ``arg``, where
+the first two delimit the range that the thread operates on and the last is a
+pointer to the job's shared state, if any. Prepare the shared state, which is
+typically allocated on the main thread's stack. Last, call
+padata_do_multithreaded(), which will return once the job is finished.
+
Interface
=========
diff --git a/Documentation/core-api/pin_user_pages.rst b/Documentation/core-api/pin_user_pages.rst
index 2e939ff..6068266 100644
--- a/Documentation/core-api/pin_user_pages.rst
+++ b/Documentation/core-api/pin_user_pages.rst
@@ -148,23 +148,46 @@
because DAX pages do not have a separate page cache, and so "pinning" implies
locking down file system blocks, which is not (yet) supported in that way.
-CASE 3: Hardware with page faulting support
--------------------------------------------
-Here, a well-written driver doesn't normally need to pin pages at all. However,
-if the driver does choose to do so, it can register MMU notifiers for the range,
-and will be called back upon invalidation. Either way (avoiding page pinning, or
-using MMU notifiers to unpin upon request), there is proper synchronization with
-both filesystem and mm (page_mkclean(), munmap(), etc).
+CASE 3: MMU notifier registration, with or without page faulting hardware
+-------------------------------------------------------------------------
+Device drivers can pin pages via get_user_pages*(), and register for mmu
+notifier callbacks for the memory range. Then, upon receiving a notifier
+"invalidate range" callback , stop the device from using the range, and unpin
+the pages. There may be other possible schemes, such as for example explicitly
+synchronizing against pending IO, that accomplish approximately the same thing.
-Therefore, neither flag needs to be set.
+Or, if the hardware supports replayable page faults, then the device driver can
+avoid pinning entirely (this is ideal), as follows: register for mmu notifier
+callbacks as above, but instead of stopping the device and unpinning in the
+callback, simply remove the range from the device's page tables.
-In this case, ideally, neither get_user_pages() nor pin_user_pages() should be
-called. Instead, the software should be written so that it does not pin pages.
-This allows mm and filesystems to operate more efficiently and reliably.
+Either way, as long as the driver unpins the pages upon mmu notifier callback,
+then there is proper synchronization with both filesystem and mm
+(page_mkclean(), munmap(), etc). Therefore, neither flag needs to be set.
CASE 4: Pinning for struct page manipulation only
-------------------------------------------------
-Here, normal GUP calls are sufficient, so neither flag needs to be set.
+If only struct page data (as opposed to the actual memory contents that a page
+is tracking) is affected, then normal GUP calls are sufficient, and neither flag
+needs to be set.
+
+CASE 5: Pinning in order to write to the data within the page
+-------------------------------------------------------------
+Even though neither DMA nor Direct IO is involved, just a simple case of "pin,
+write to a page's data, unpin" can cause a problem. Case 5 may be considered a
+superset of Case 1, plus Case 2, plus anything that invokes that pattern. In
+other words, if the code is neither Case 1 nor Case 2, it may still require
+FOLL_PIN, for patterns like this:
+
+Correct (uses FOLL_PIN calls):
+ pin_user_pages()
+ write to the data within the pages
+ unpin_user_pages()
+
+INCORRECT (uses FOLL_GET calls):
+ get_user_pages()
+ write to the data within the pages
+ put_page()
page_maybe_dma_pinned(): the whole point of pinning
===================================================
diff --git a/Documentation/core-api/printk-basics.rst b/Documentation/core-api/printk-basics.rst
new file mode 100644
index 0000000..563a9ce
--- /dev/null
+++ b/Documentation/core-api/printk-basics.rst
@@ -0,0 +1,115 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===========================
+Message logging with printk
+===========================
+
+printk() is one of the most widely known functions in the Linux kernel. It's the
+standard tool we have for printing messages and usually the most basic way of
+tracing and debugging. If you're familiar with printf(3) you can tell printk()
+is based on it, although it has some functional differences:
+
+ - printk() messages can specify a log level.
+
+ - the format string, while largely compatible with C99, doesn't follow the
+ exact same specification. It has some extensions and a few limitations
+ (no ``%n`` or floating point conversion specifiers). See :ref:`How to get
+ printk format specifiers right <printk-specifiers>`.
+
+All printk() messages are printed to the kernel log buffer, which is a ring
+buffer exported to userspace through /dev/kmsg. The usual way to read it is
+using ``dmesg``.
+
+printk() is typically used like this::
+
+ printk(KERN_INFO "Message: %s\n", arg);
+
+where ``KERN_INFO`` is the log level (note that it's concatenated to the format
+string, the log level is not a separate argument). The available log levels are:
+
++----------------+--------+-----------------------------------------------+
+| Name | String | Alias function |
++================+========+===============================================+
+| KERN_EMERG | "0" | pr_emerg() |
++----------------+--------+-----------------------------------------------+
+| KERN_ALERT | "1" | pr_alert() |
++----------------+--------+-----------------------------------------------+
+| KERN_CRIT | "2" | pr_crit() |
++----------------+--------+-----------------------------------------------+
+| KERN_ERR | "3" | pr_err() |
++----------------+--------+-----------------------------------------------+
+| KERN_WARNING | "4" | pr_warn() |
++----------------+--------+-----------------------------------------------+
+| KERN_NOTICE | "5" | pr_notice() |
++----------------+--------+-----------------------------------------------+
+| KERN_INFO | "6" | pr_info() |
++----------------+--------+-----------------------------------------------+
+| KERN_DEBUG | "7" | pr_debug() and pr_devel() if DEBUG is defined |
++----------------+--------+-----------------------------------------------+
+| KERN_DEFAULT | "" | |
++----------------+--------+-----------------------------------------------+
+| KERN_CONT | "c" | pr_cont() |
++----------------+--------+-----------------------------------------------+
+
+
+The log level specifies the importance of a message. The kernel decides whether
+to show the message immediately (printing it to the current console) depending
+on its log level and the current *console_loglevel* (a kernel variable). If the
+message priority is higher (lower log level value) than the *console_loglevel*
+the message will be printed to the console.
+
+If the log level is omitted, the message is printed with ``KERN_DEFAULT``
+level.
+
+You can check the current *console_loglevel* with::
+
+ $ cat /proc/sys/kernel/printk
+ 4 4 1 7
+
+The result shows the *current*, *default*, *minimum* and *boot-time-default* log
+levels.
+
+To change the current console_loglevel simply write the the desired level to
+``/proc/sys/kernel/printk``. For example, to print all messages to the console::
+
+ # echo 8 > /proc/sys/kernel/printk
+
+Another way, using ``dmesg``::
+
+ # dmesg -n 5
+
+sets the console_loglevel to print KERN_WARNING (4) or more severe messages to
+console. See ``dmesg(1)`` for more information.
+
+As an alternative to printk() you can use the ``pr_*()`` aliases for
+logging. This family of macros embed the log level in the macro names. For
+example::
+
+ pr_info("Info message no. %d\n", msg_num);
+
+prints a ``KERN_INFO`` message.
+
+Besides being more concise than the equivalent printk() calls, they can use a
+common definition for the format string through the pr_fmt() macro. For
+instance, defining this at the top of a source file (before any ``#include``
+directive)::
+
+ #define pr_fmt(fmt) "%s:%s: " fmt, KBUILD_MODNAME, __func__
+
+would prefix every pr_*() message in that file with the module and function name
+that originated the message.
+
+For debugging purposes there are also two conditionally-compiled macros:
+pr_debug() and pr_devel(), which are compiled-out unless ``DEBUG`` (or
+also ``CONFIG_DYNAMIC_DEBUG`` in the case of pr_debug()) is defined.
+
+
+Function reference
+==================
+
+.. kernel-doc:: kernel/printk/printk.c
+ :functions: printk
+
+.. kernel-doc:: include/linux/printk.h
+ :functions: pr_emerg pr_alert pr_crit pr_err pr_warn pr_notice pr_info
+ pr_fmt pr_debug pr_devel pr_cont
diff --git a/Documentation/core-api/printk-formats.rst b/Documentation/core-api/printk-formats.rst
index 8ebe46b1..8c9aba2 100644
--- a/Documentation/core-api/printk-formats.rst
+++ b/Documentation/core-api/printk-formats.rst
@@ -2,6 +2,8 @@
How to get printk format specifiers right
=========================================
+.. _printk-specifiers:
+
:Author: Randy Dunlap <rdunlap@infradead.org>
:Author: Andrew Murray <amurray@mpc-data.co.uk>
@@ -112,6 +114,20 @@
consideration the effect of compiler optimisations which may occur
when tail-calls are used and marked with the noreturn GCC attribute.
+Probed Pointers from BPF / tracing
+----------------------------------
+
+::
+
+ %pks kernel string
+ %pus user string
+
+The ``k`` and ``u`` specifiers are used for printing prior probed memory from
+either kernel memory (k) or user memory (u). The subsequent ``s`` specifier
+results in printing a string. For direct use in regular vsnprintf() the (k)
+and (u) annotation is ignored, however, when used out of BPF's bpf_trace_printk(),
+for example, it reads the memory it is pointing to without faulting.
+
Kernel Pointers
---------------
@@ -468,21 +484,23 @@
%pfwf /ocp@68000000/i2c@48072000/camera@10/port/endpoint - Full name
%pfwP endpoint - Node name
-Time and date (struct rtc_time)
--------------------------------
+Time and date
+-------------
::
- %ptR YYYY-mm-ddTHH:MM:SS
- %ptRd YYYY-mm-dd
- %ptRt HH:MM:SS
- %ptR[dt][r]
+ %pt[RT] YYYY-mm-ddTHH:MM:SS
+ %pt[RT]d YYYY-mm-dd
+ %pt[RT]t HH:MM:SS
+ %pt[RT][dt][r]
-For printing date and time as represented by struct rtc_time structure in
-human readable format.
+For printing date and time as represented by
+ R struct rtc_time structure
+ T time64_t type
+in human readable format.
-By default year will be incremented by 1900 and month by 1. Use %ptRr (raw)
-to suppress this behaviour.
+By default year will be incremented by 1900 and month by 1.
+Use %pt[RT]r (raw) to suppress this behaviour.
Passed by reference.
diff --git a/Documentation/core-api/protection-keys.rst b/Documentation/core-api/protection-keys.rst
index 49d9833..ec575e7 100644
--- a/Documentation/core-api/protection-keys.rst
+++ b/Documentation/core-api/protection-keys.rst
@@ -5,8 +5,9 @@
======================
Memory Protection Keys for Userspace (PKU aka PKEYs) is a feature
-which is found on Intel's Skylake "Scalable Processor" Server CPUs.
-It will be avalable in future non-server parts.
+which is found on Intel's Skylake (and later) "Scalable Processor"
+Server CPUs. It will be available in future non-server Intel parts
+and future AMD processors.
For anyone wishing to test or use this feature, it is available in
Amazon's EC2 C5 instances and is known to work there using an Ubuntu
diff --git a/Documentation/core-api/rbtree.rst b/Documentation/core-api/rbtree.rst
new file mode 100644
index 0000000..6b88837
--- /dev/null
+++ b/Documentation/core-api/rbtree.rst
@@ -0,0 +1,429 @@
+=================================
+Red-black Trees (rbtree) in Linux
+=================================
+
+
+:Date: January 18, 2007
+:Author: Rob Landley <rob@landley.net>
+
+What are red-black trees, and what are they for?
+------------------------------------------------
+
+Red-black trees are a type of self-balancing binary search tree, used for
+storing sortable key/value data pairs. This differs from radix trees (which
+are used to efficiently store sparse arrays and thus use long integer indexes
+to insert/access/delete nodes) and hash tables (which are not kept sorted to
+be easily traversed in order, and must be tuned for a specific size and
+hash function where rbtrees scale gracefully storing arbitrary keys).
+
+Red-black trees are similar to AVL trees, but provide faster real-time bounded
+worst case performance for insertion and deletion (at most two rotations and
+three rotations, respectively, to balance the tree), with slightly slower
+(but still O(log n)) lookup time.
+
+To quote Linux Weekly News:
+
+ There are a number of red-black trees in use in the kernel.
+ The deadline and CFQ I/O schedulers employ rbtrees to
+ track requests; the packet CD/DVD driver does the same.
+ The high-resolution timer code uses an rbtree to organize outstanding
+ timer requests. The ext3 filesystem tracks directory entries in a
+ red-black tree. Virtual memory areas (VMAs) are tracked with red-black
+ trees, as are epoll file descriptors, cryptographic keys, and network
+ packets in the "hierarchical token bucket" scheduler.
+
+This document covers use of the Linux rbtree implementation. For more
+information on the nature and implementation of Red Black Trees, see:
+
+ Linux Weekly News article on red-black trees
+ https://lwn.net/Articles/184495/
+
+ Wikipedia entry on red-black trees
+ https://en.wikipedia.org/wiki/Red-black_tree
+
+Linux implementation of red-black trees
+---------------------------------------
+
+Linux's rbtree implementation lives in the file "lib/rbtree.c". To use it,
+"#include <linux/rbtree.h>".
+
+The Linux rbtree implementation is optimized for speed, and thus has one
+less layer of indirection (and better cache locality) than more traditional
+tree implementations. Instead of using pointers to separate rb_node and data
+structures, each instance of struct rb_node is embedded in the data structure
+it organizes. And instead of using a comparison callback function pointer,
+users are expected to write their own tree search and insert functions
+which call the provided rbtree functions. Locking is also left up to the
+user of the rbtree code.
+
+Creating a new rbtree
+---------------------
+
+Data nodes in an rbtree tree are structures containing a struct rb_node member::
+
+ struct mytype {
+ struct rb_node node;
+ char *keystring;
+ };
+
+When dealing with a pointer to the embedded struct rb_node, the containing data
+structure may be accessed with the standard container_of() macro. In addition,
+individual members may be accessed directly via rb_entry(node, type, member).
+
+At the root of each rbtree is an rb_root structure, which is initialized to be
+empty via:
+
+ struct rb_root mytree = RB_ROOT;
+
+Searching for a value in an rbtree
+----------------------------------
+
+Writing a search function for your tree is fairly straightforward: start at the
+root, compare each value, and follow the left or right branch as necessary.
+
+Example::
+
+ struct mytype *my_search(struct rb_root *root, char *string)
+ {
+ struct rb_node *node = root->rb_node;
+
+ while (node) {
+ struct mytype *data = container_of(node, struct mytype, node);
+ int result;
+
+ result = strcmp(string, data->keystring);
+
+ if (result < 0)
+ node = node->rb_left;
+ else if (result > 0)
+ node = node->rb_right;
+ else
+ return data;
+ }
+ return NULL;
+ }
+
+Inserting data into an rbtree
+-----------------------------
+
+Inserting data in the tree involves first searching for the place to insert the
+new node, then inserting the node and rebalancing ("recoloring") the tree.
+
+The search for insertion differs from the previous search by finding the
+location of the pointer on which to graft the new node. The new node also
+needs a link to its parent node for rebalancing purposes.
+
+Example::
+
+ int my_insert(struct rb_root *root, struct mytype *data)
+ {
+ struct rb_node **new = &(root->rb_node), *parent = NULL;
+
+ /* Figure out where to put new node */
+ while (*new) {
+ struct mytype *this = container_of(*new, struct mytype, node);
+ int result = strcmp(data->keystring, this->keystring);
+
+ parent = *new;
+ if (result < 0)
+ new = &((*new)->rb_left);
+ else if (result > 0)
+ new = &((*new)->rb_right);
+ else
+ return FALSE;
+ }
+
+ /* Add new node and rebalance tree. */
+ rb_link_node(&data->node, parent, new);
+ rb_insert_color(&data->node, root);
+
+ return TRUE;
+ }
+
+Removing or replacing existing data in an rbtree
+------------------------------------------------
+
+To remove an existing node from a tree, call::
+
+ void rb_erase(struct rb_node *victim, struct rb_root *tree);
+
+Example::
+
+ struct mytype *data = mysearch(&mytree, "walrus");
+
+ if (data) {
+ rb_erase(&data->node, &mytree);
+ myfree(data);
+ }
+
+To replace an existing node in a tree with a new one with the same key, call::
+
+ void rb_replace_node(struct rb_node *old, struct rb_node *new,
+ struct rb_root *tree);
+
+Replacing a node this way does not re-sort the tree: If the new node doesn't
+have the same key as the old node, the rbtree will probably become corrupted.
+
+Iterating through the elements stored in an rbtree (in sort order)
+------------------------------------------------------------------
+
+Four functions are provided for iterating through an rbtree's contents in
+sorted order. These work on arbitrary trees, and should not need to be
+modified or wrapped (except for locking purposes)::
+
+ struct rb_node *rb_first(struct rb_root *tree);
+ struct rb_node *rb_last(struct rb_root *tree);
+ struct rb_node *rb_next(struct rb_node *node);
+ struct rb_node *rb_prev(struct rb_node *node);
+
+To start iterating, call rb_first() or rb_last() with a pointer to the root
+of the tree, which will return a pointer to the node structure contained in
+the first or last element in the tree. To continue, fetch the next or previous
+node by calling rb_next() or rb_prev() on the current node. This will return
+NULL when there are no more nodes left.
+
+The iterator functions return a pointer to the embedded struct rb_node, from
+which the containing data structure may be accessed with the container_of()
+macro, and individual members may be accessed directly via
+rb_entry(node, type, member).
+
+Example::
+
+ struct rb_node *node;
+ for (node = rb_first(&mytree); node; node = rb_next(node))
+ printk("key=%s\n", rb_entry(node, struct mytype, node)->keystring);
+
+Cached rbtrees
+--------------
+
+Computing the leftmost (smallest) node is quite a common task for binary
+search trees, such as for traversals or users relying on a the particular
+order for their own logic. To this end, users can use 'struct rb_root_cached'
+to optimize O(logN) rb_first() calls to a simple pointer fetch avoiding
+potentially expensive tree iterations. This is done at negligible runtime
+overhead for maintanence; albeit larger memory footprint.
+
+Similar to the rb_root structure, cached rbtrees are initialized to be
+empty via::
+
+ struct rb_root_cached mytree = RB_ROOT_CACHED;
+
+Cached rbtree is simply a regular rb_root with an extra pointer to cache the
+leftmost node. This allows rb_root_cached to exist wherever rb_root does,
+which permits augmented trees to be supported as well as only a few extra
+interfaces::
+
+ struct rb_node *rb_first_cached(struct rb_root_cached *tree);
+ void rb_insert_color_cached(struct rb_node *, struct rb_root_cached *, bool);
+ void rb_erase_cached(struct rb_node *node, struct rb_root_cached *);
+
+Both insert and erase calls have their respective counterpart of augmented
+trees::
+
+ void rb_insert_augmented_cached(struct rb_node *node, struct rb_root_cached *,
+ bool, struct rb_augment_callbacks *);
+ void rb_erase_augmented_cached(struct rb_node *, struct rb_root_cached *,
+ struct rb_augment_callbacks *);
+
+
+Support for Augmented rbtrees
+-----------------------------
+
+Augmented rbtree is an rbtree with "some" additional data stored in
+each node, where the additional data for node N must be a function of
+the contents of all nodes in the subtree rooted at N. This data can
+be used to augment some new functionality to rbtree. Augmented rbtree
+is an optional feature built on top of basic rbtree infrastructure.
+An rbtree user who wants this feature will have to call the augmentation
+functions with the user provided augmentation callback when inserting
+and erasing nodes.
+
+C files implementing augmented rbtree manipulation must include
+<linux/rbtree_augmented.h> instead of <linux/rbtree.h>. Note that
+linux/rbtree_augmented.h exposes some rbtree implementations details
+you are not expected to rely on; please stick to the documented APIs
+there and do not include <linux/rbtree_augmented.h> from header files
+either so as to minimize chances of your users accidentally relying on
+such implementation details.
+
+On insertion, the user must update the augmented information on the path
+leading to the inserted node, then call rb_link_node() as usual and
+rb_augment_inserted() instead of the usual rb_insert_color() call.
+If rb_augment_inserted() rebalances the rbtree, it will callback into
+a user provided function to update the augmented information on the
+affected subtrees.
+
+When erasing a node, the user must call rb_erase_augmented() instead of
+rb_erase(). rb_erase_augmented() calls back into user provided functions
+to updated the augmented information on affected subtrees.
+
+In both cases, the callbacks are provided through struct rb_augment_callbacks.
+3 callbacks must be defined:
+
+- A propagation callback, which updates the augmented value for a given
+ node and its ancestors, up to a given stop point (or NULL to update
+ all the way to the root).
+
+- A copy callback, which copies the augmented value for a given subtree
+ to a newly assigned subtree root.
+
+- A tree rotation callback, which copies the augmented value for a given
+ subtree to a newly assigned subtree root AND recomputes the augmented
+ information for the former subtree root.
+
+The compiled code for rb_erase_augmented() may inline the propagation and
+copy callbacks, which results in a large function, so each augmented rbtree
+user should have a single rb_erase_augmented() call site in order to limit
+compiled code size.
+
+
+Sample usage
+^^^^^^^^^^^^
+
+Interval tree is an example of augmented rb tree. Reference -
+"Introduction to Algorithms" by Cormen, Leiserson, Rivest and Stein.
+More details about interval trees:
+
+Classical rbtree has a single key and it cannot be directly used to store
+interval ranges like [lo:hi] and do a quick lookup for any overlap with a new
+lo:hi or to find whether there is an exact match for a new lo:hi.
+
+However, rbtree can be augmented to store such interval ranges in a structured
+way making it possible to do efficient lookup and exact match.
+
+This "extra information" stored in each node is the maximum hi
+(max_hi) value among all the nodes that are its descendants. This
+information can be maintained at each node just be looking at the node
+and its immediate children. And this will be used in O(log n) lookup
+for lowest match (lowest start address among all possible matches)
+with something like::
+
+ struct interval_tree_node *
+ interval_tree_first_match(struct rb_root *root,
+ unsigned long start, unsigned long last)
+ {
+ struct interval_tree_node *node;
+
+ if (!root->rb_node)
+ return NULL;
+ node = rb_entry(root->rb_node, struct interval_tree_node, rb);
+
+ while (true) {
+ if (node->rb.rb_left) {
+ struct interval_tree_node *left =
+ rb_entry(node->rb.rb_left,
+ struct interval_tree_node, rb);
+ if (left->__subtree_last >= start) {
+ /*
+ * Some nodes in left subtree satisfy Cond2.
+ * Iterate to find the leftmost such node N.
+ * If it also satisfies Cond1, that's the match
+ * we are looking for. Otherwise, there is no
+ * matching interval as nodes to the right of N
+ * can't satisfy Cond1 either.
+ */
+ node = left;
+ continue;
+ }
+ }
+ if (node->start <= last) { /* Cond1 */
+ if (node->last >= start) /* Cond2 */
+ return node; /* node is leftmost match */
+ if (node->rb.rb_right) {
+ node = rb_entry(node->rb.rb_right,
+ struct interval_tree_node, rb);
+ if (node->__subtree_last >= start)
+ continue;
+ }
+ }
+ return NULL; /* No match */
+ }
+ }
+
+Insertion/removal are defined using the following augmented callbacks::
+
+ static inline unsigned long
+ compute_subtree_last(struct interval_tree_node *node)
+ {
+ unsigned long max = node->last, subtree_last;
+ if (node->rb.rb_left) {
+ subtree_last = rb_entry(node->rb.rb_left,
+ struct interval_tree_node, rb)->__subtree_last;
+ if (max < subtree_last)
+ max = subtree_last;
+ }
+ if (node->rb.rb_right) {
+ subtree_last = rb_entry(node->rb.rb_right,
+ struct interval_tree_node, rb)->__subtree_last;
+ if (max < subtree_last)
+ max = subtree_last;
+ }
+ return max;
+ }
+
+ static void augment_propagate(struct rb_node *rb, struct rb_node *stop)
+ {
+ while (rb != stop) {
+ struct interval_tree_node *node =
+ rb_entry(rb, struct interval_tree_node, rb);
+ unsigned long subtree_last = compute_subtree_last(node);
+ if (node->__subtree_last == subtree_last)
+ break;
+ node->__subtree_last = subtree_last;
+ rb = rb_parent(&node->rb);
+ }
+ }
+
+ static void augment_copy(struct rb_node *rb_old, struct rb_node *rb_new)
+ {
+ struct interval_tree_node *old =
+ rb_entry(rb_old, struct interval_tree_node, rb);
+ struct interval_tree_node *new =
+ rb_entry(rb_new, struct interval_tree_node, rb);
+
+ new->__subtree_last = old->__subtree_last;
+ }
+
+ static void augment_rotate(struct rb_node *rb_old, struct rb_node *rb_new)
+ {
+ struct interval_tree_node *old =
+ rb_entry(rb_old, struct interval_tree_node, rb);
+ struct interval_tree_node *new =
+ rb_entry(rb_new, struct interval_tree_node, rb);
+
+ new->__subtree_last = old->__subtree_last;
+ old->__subtree_last = compute_subtree_last(old);
+ }
+
+ static const struct rb_augment_callbacks augment_callbacks = {
+ augment_propagate, augment_copy, augment_rotate
+ };
+
+ void interval_tree_insert(struct interval_tree_node *node,
+ struct rb_root *root)
+ {
+ struct rb_node **link = &root->rb_node, *rb_parent = NULL;
+ unsigned long start = node->start, last = node->last;
+ struct interval_tree_node *parent;
+
+ while (*link) {
+ rb_parent = *link;
+ parent = rb_entry(rb_parent, struct interval_tree_node, rb);
+ if (parent->__subtree_last < last)
+ parent->__subtree_last = last;
+ if (start < parent->start)
+ link = &parent->rb.rb_left;
+ else
+ link = &parent->rb.rb_right;
+ }
+
+ node->__subtree_last = last;
+ rb_link_node(&node->rb, rb_parent, link);
+ rb_insert_augmented(&node->rb, root, &augment_callbacks);
+ }
+
+ void interval_tree_remove(struct interval_tree_node *node,
+ struct rb_root *root)
+ {
+ rb_erase_augmented(&node->rb, root, &augment_callbacks);
+ }
diff --git a/Documentation/dev-tools/coccinelle.rst b/Documentation/dev-tools/coccinelle.rst
index 00a3409..70274c3 100644
--- a/Documentation/dev-tools/coccinelle.rst
+++ b/Documentation/dev-tools/coccinelle.rst
@@ -14,7 +14,7 @@
tree-wide patches and detection of problematic programming patterns.
Getting Coccinelle
--------------------
+------------------
The semantic patches included in the kernel use features and options
which are provided by Coccinelle version 1.0.0-rc11 and above.
@@ -56,7 +56,7 @@
https://github.com/coccinelle/coccinelle/blob/master/install.txt
Supplemental documentation
----------------------------
+--------------------------
For supplemental documentation refer to the wiki:
@@ -128,7 +128,7 @@
make coccicheck MODE=report V=1
Coccinelle parallelization
----------------------------
+--------------------------
By default, coccicheck tries to run as parallel as possible. To change
the parallelism, set the J= variable. For example, to run across 4 CPUs::
@@ -333,7 +333,7 @@
// Requires: 1.0.5
Proposing new semantic patches
--------------------------------
+------------------------------
New semantic patches can be proposed and submitted by kernel
developers. For sake of clarity, they should be organized in the
diff --git a/Documentation/dev-tools/gdb-kernel-debugging.rst b/Documentation/dev-tools/gdb-kernel-debugging.rst
index 19df792..4756f6b 100644
--- a/Documentation/dev-tools/gdb-kernel-debugging.rst
+++ b/Documentation/dev-tools/gdb-kernel-debugging.rst
@@ -24,7 +24,7 @@
- Create a virtual Linux machine for QEMU/KVM (see www.linux-kvm.org and
www.qemu.org for more details). For cross-development,
- http://landley.net/aboriginal/bin keeps a pool of machine images and
+ https://landley.net/aboriginal/bin keeps a pool of machine images and
toolchains that can be helpful to start from.
- Build the kernel with CONFIG_GDB_SCRIPTS enabled, but leave
diff --git a/Documentation/dev-tools/index.rst b/Documentation/dev-tools/index.rst
index 09dee10..f7809c7 100644
--- a/Documentation/dev-tools/index.rst
+++ b/Documentation/dev-tools/index.rst
@@ -21,6 +21,7 @@
kasan
ubsan
kmemleak
+ kcsan
gdb-kernel-debugging
kgdb
kselftest
diff --git a/Documentation/dev-tools/kcov.rst b/Documentation/dev-tools/kcov.rst
index 1c4e182..8548b0b 100644
--- a/Documentation/dev-tools/kcov.rst
+++ b/Documentation/dev-tools/kcov.rst
@@ -217,14 +217,15 @@
threads: the global ones, that are spawned during kernel boot in a limited
number of instances (e.g. one USB hub_event() worker thread is spawned per
USB HCD); and the local ones, that are spawned when a user interacts with
-some kernel interface (e.g. vhost workers).
+some kernel interface (e.g. vhost workers); as well as from soft
+interrupts.
-To enable collecting coverage from a global background thread, a unique
-global handle must be assigned and passed to the corresponding
-kcov_remote_start() call. Then a userspace process can pass a list of such
-handles to the KCOV_REMOTE_ENABLE ioctl in the handles array field of the
-kcov_remote_arg struct. This will attach the used kcov device to the code
-sections, that are referenced by those handles.
+To enable collecting coverage from a global background thread or from a
+softirq, a unique global handle must be assigned and passed to the
+corresponding kcov_remote_start() call. Then a userspace process can pass
+a list of such handles to the KCOV_REMOTE_ENABLE ioctl in the handles
+array field of the kcov_remote_arg struct. This will attach the used kcov
+device to the code sections, that are referenced by those handles.
Since there might be many local background threads spawned from different
userspace processes, we can't use a single global handle per annotation.
@@ -242,7 +243,7 @@
currently reserved and must be zero. In the future the number of bytes
used for the subsystem or handle ids might be increased.
-When a particular userspace proccess collects coverage by via a common
+When a particular userspace proccess collects coverage via a common
handle, kcov will collect coverage for each code section that is annotated
to use the common handle obtained as kcov_handle from the current
task_struct. However non common handles allow to collect coverage
diff --git a/Documentation/dev-tools/kcsan.rst b/Documentation/dev-tools/kcsan.rst
new file mode 100644
index 0000000..ce4bbd9
--- /dev/null
+++ b/Documentation/dev-tools/kcsan.rst
@@ -0,0 +1,321 @@
+The Kernel Concurrency Sanitizer (KCSAN)
+========================================
+
+The Kernel Concurrency Sanitizer (KCSAN) is a dynamic race detector, which
+relies on compile-time instrumentation, and uses a watchpoint-based sampling
+approach to detect races. KCSAN's primary purpose is to detect `data races`_.
+
+Usage
+-----
+
+KCSAN requires Clang version 11 or later.
+
+To enable KCSAN configure the kernel with::
+
+ CONFIG_KCSAN = y
+
+KCSAN provides several other configuration options to customize behaviour (see
+the respective help text in ``lib/Kconfig.kcsan`` for more info).
+
+Error reports
+~~~~~~~~~~~~~
+
+A typical data race report looks like this::
+
+ ==================================================================
+ BUG: KCSAN: data-race in generic_permission / kernfs_refresh_inode
+
+ write to 0xffff8fee4c40700c of 4 bytes by task 175 on cpu 4:
+ kernfs_refresh_inode+0x70/0x170
+ kernfs_iop_permission+0x4f/0x90
+ inode_permission+0x190/0x200
+ link_path_walk.part.0+0x503/0x8e0
+ path_lookupat.isra.0+0x69/0x4d0
+ filename_lookup+0x136/0x280
+ user_path_at_empty+0x47/0x60
+ vfs_statx+0x9b/0x130
+ __do_sys_newlstat+0x50/0xb0
+ __x64_sys_newlstat+0x37/0x50
+ do_syscall_64+0x85/0x260
+ entry_SYSCALL_64_after_hwframe+0x44/0xa9
+
+ read to 0xffff8fee4c40700c of 4 bytes by task 166 on cpu 6:
+ generic_permission+0x5b/0x2a0
+ kernfs_iop_permission+0x66/0x90
+ inode_permission+0x190/0x200
+ link_path_walk.part.0+0x503/0x8e0
+ path_lookupat.isra.0+0x69/0x4d0
+ filename_lookup+0x136/0x280
+ user_path_at_empty+0x47/0x60
+ do_faccessat+0x11a/0x390
+ __x64_sys_access+0x3c/0x50
+ do_syscall_64+0x85/0x260
+ entry_SYSCALL_64_after_hwframe+0x44/0xa9
+
+ Reported by Kernel Concurrency Sanitizer on:
+ CPU: 6 PID: 166 Comm: systemd-journal Not tainted 5.3.0-rc7+ #1
+ Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
+ ==================================================================
+
+The header of the report provides a short summary of the functions involved in
+the race. It is followed by the access types and stack traces of the 2 threads
+involved in the data race.
+
+The other less common type of data race report looks like this::
+
+ ==================================================================
+ BUG: KCSAN: data-race in e1000_clean_rx_irq+0x551/0xb10
+
+ race at unknown origin, with read to 0xffff933db8a2ae6c of 1 bytes by interrupt on cpu 0:
+ e1000_clean_rx_irq+0x551/0xb10
+ e1000_clean+0x533/0xda0
+ net_rx_action+0x329/0x900
+ __do_softirq+0xdb/0x2db
+ irq_exit+0x9b/0xa0
+ do_IRQ+0x9c/0xf0
+ ret_from_intr+0x0/0x18
+ default_idle+0x3f/0x220
+ arch_cpu_idle+0x21/0x30
+ do_idle+0x1df/0x230
+ cpu_startup_entry+0x14/0x20
+ rest_init+0xc5/0xcb
+ arch_call_rest_init+0x13/0x2b
+ start_kernel+0x6db/0x700
+
+ Reported by Kernel Concurrency Sanitizer on:
+ CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.3.0-rc7+ #2
+ Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.12.0-1 04/01/2014
+ ==================================================================
+
+This report is generated where it was not possible to determine the other
+racing thread, but a race was inferred due to the data value of the watched
+memory location having changed. These can occur either due to missing
+instrumentation or e.g. DMA accesses. These reports will only be generated if
+``CONFIG_KCSAN_REPORT_RACE_UNKNOWN_ORIGIN=y`` (selected by default).
+
+Selective analysis
+~~~~~~~~~~~~~~~~~~
+
+It may be desirable to disable data race detection for specific accesses,
+functions, compilation units, or entire subsystems. For static blacklisting,
+the below options are available:
+
+* KCSAN understands the ``data_race(expr)`` annotation, which tells KCSAN that
+ any data races due to accesses in ``expr`` should be ignored and resulting
+ behaviour when encountering a data race is deemed safe.
+
+* Disabling data race detection for entire functions can be accomplished by
+ using the function attribute ``__no_kcsan``::
+
+ __no_kcsan
+ void foo(void) {
+ ...
+
+ To dynamically limit for which functions to generate reports, see the
+ `DebugFS interface`_ blacklist/whitelist feature.
+
+ For ``__always_inline`` functions, replace ``__always_inline`` with
+ ``__no_kcsan_or_inline`` (which implies ``__always_inline``)::
+
+ static __no_kcsan_or_inline void foo(void) {
+ ...
+
+* To disable data race detection for a particular compilation unit, add to the
+ ``Makefile``::
+
+ KCSAN_SANITIZE_file.o := n
+
+* To disable data race detection for all compilation units listed in a
+ ``Makefile``, add to the respective ``Makefile``::
+
+ KCSAN_SANITIZE := n
+
+Furthermore, it is possible to tell KCSAN to show or hide entire classes of
+data races, depending on preferences. These can be changed via the following
+Kconfig options:
+
+* ``CONFIG_KCSAN_REPORT_VALUE_CHANGE_ONLY``: If enabled and a conflicting write
+ is observed via a watchpoint, but the data value of the memory location was
+ observed to remain unchanged, do not report the data race.
+
+* ``CONFIG_KCSAN_ASSUME_PLAIN_WRITES_ATOMIC``: Assume that plain aligned writes
+ up to word size are atomic by default. Assumes that such writes are not
+ subject to unsafe compiler optimizations resulting in data races. The option
+ causes KCSAN to not report data races due to conflicts where the only plain
+ accesses are aligned writes up to word size.
+
+DebugFS interface
+~~~~~~~~~~~~~~~~~
+
+The file ``/sys/kernel/debug/kcsan`` provides the following interface:
+
+* Reading ``/sys/kernel/debug/kcsan`` returns various runtime statistics.
+
+* Writing ``on`` or ``off`` to ``/sys/kernel/debug/kcsan`` allows turning KCSAN
+ on or off, respectively.
+
+* Writing ``!some_func_name`` to ``/sys/kernel/debug/kcsan`` adds
+ ``some_func_name`` to the report filter list, which (by default) blacklists
+ reporting data races where either one of the top stackframes are a function
+ in the list.
+
+* Writing either ``blacklist`` or ``whitelist`` to ``/sys/kernel/debug/kcsan``
+ changes the report filtering behaviour. For example, the blacklist feature
+ can be used to silence frequently occurring data races; the whitelist feature
+ can help with reproduction and testing of fixes.
+
+Tuning performance
+~~~~~~~~~~~~~~~~~~
+
+Core parameters that affect KCSAN's overall performance and bug detection
+ability are exposed as kernel command-line arguments whose defaults can also be
+changed via the corresponding Kconfig options.
+
+* ``kcsan.skip_watch`` (``CONFIG_KCSAN_SKIP_WATCH``): Number of per-CPU memory
+ operations to skip, before another watchpoint is set up. Setting up
+ watchpoints more frequently will result in the likelihood of races to be
+ observed to increase. This parameter has the most significant impact on
+ overall system performance and race detection ability.
+
+* ``kcsan.udelay_task`` (``CONFIG_KCSAN_UDELAY_TASK``): For tasks, the
+ microsecond delay to stall execution after a watchpoint has been set up.
+ Larger values result in the window in which we may observe a race to
+ increase.
+
+* ``kcsan.udelay_interrupt`` (``CONFIG_KCSAN_UDELAY_INTERRUPT``): For
+ interrupts, the microsecond delay to stall execution after a watchpoint has
+ been set up. Interrupts have tighter latency requirements, and their delay
+ should generally be smaller than the one chosen for tasks.
+
+They may be tweaked at runtime via ``/sys/module/kcsan/parameters/``.
+
+Data Races
+----------
+
+In an execution, two memory accesses form a *data race* if they *conflict*,
+they happen concurrently in different threads, and at least one of them is a
+*plain access*; they *conflict* if both access the same memory location, and at
+least one is a write. For a more thorough discussion and definition, see `"Plain
+Accesses and Data Races" in the LKMM`_.
+
+.. _"Plain Accesses and Data Races" in the LKMM: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/memory-model/Documentation/explanation.txt#n1922
+
+Relationship with the Linux-Kernel Memory Consistency Model (LKMM)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The LKMM defines the propagation and ordering rules of various memory
+operations, which gives developers the ability to reason about concurrent code.
+Ultimately this allows to determine the possible executions of concurrent code,
+and if that code is free from data races.
+
+KCSAN is aware of *marked atomic operations* (``READ_ONCE``, ``WRITE_ONCE``,
+``atomic_*``, etc.), but is oblivious of any ordering guarantees and simply
+assumes that memory barriers are placed correctly. In other words, KCSAN
+assumes that as long as a plain access is not observed to race with another
+conflicting access, memory operations are correctly ordered.
+
+This means that KCSAN will not report *potential* data races due to missing
+memory ordering. Developers should therefore carefully consider the required
+memory ordering requirements that remain unchecked. If, however, missing
+memory ordering (that is observable with a particular compiler and
+architecture) leads to an observable data race (e.g. entering a critical
+section erroneously), KCSAN would report the resulting data race.
+
+Race Detection Beyond Data Races
+--------------------------------
+
+For code with complex concurrency design, race-condition bugs may not always
+manifest as data races. Race conditions occur if concurrently executing
+operations result in unexpected system behaviour. On the other hand, data races
+are defined at the C-language level. The following macros can be used to check
+properties of concurrent code where bugs would not manifest as data races.
+
+.. kernel-doc:: include/linux/kcsan-checks.h
+ :functions: ASSERT_EXCLUSIVE_WRITER ASSERT_EXCLUSIVE_WRITER_SCOPED
+ ASSERT_EXCLUSIVE_ACCESS ASSERT_EXCLUSIVE_ACCESS_SCOPED
+ ASSERT_EXCLUSIVE_BITS
+
+Implementation Details
+----------------------
+
+KCSAN relies on observing that two accesses happen concurrently. Crucially, we
+want to (a) increase the chances of observing races (especially for races that
+manifest rarely), and (b) be able to actually observe them. We can accomplish
+(a) by injecting various delays, and (b) by using address watchpoints (or
+breakpoints).
+
+If we deliberately stall a memory access, while we have a watchpoint for its
+address set up, and then observe the watchpoint to fire, two accesses to the
+same address just raced. Using hardware watchpoints, this is the approach taken
+in `DataCollider
+<http://usenix.org/legacy/events/osdi10/tech/full_papers/Erickson.pdf>`_.
+Unlike DataCollider, KCSAN does not use hardware watchpoints, but instead
+relies on compiler instrumentation and "soft watchpoints".
+
+In KCSAN, watchpoints are implemented using an efficient encoding that stores
+access type, size, and address in a long; the benefits of using "soft
+watchpoints" are portability and greater flexibility. KCSAN then relies on the
+compiler instrumenting plain accesses. For each instrumented plain access:
+
+1. Check if a matching watchpoint exists; if yes, and at least one access is a
+ write, then we encountered a racing access.
+
+2. Periodically, if no matching watchpoint exists, set up a watchpoint and
+ stall for a small randomized delay.
+
+3. Also check the data value before the delay, and re-check the data value
+ after delay; if the values mismatch, we infer a race of unknown origin.
+
+To detect data races between plain and marked accesses, KCSAN also annotates
+marked accesses, but only to check if a watchpoint exists; i.e. KCSAN never
+sets up a watchpoint on marked accesses. By never setting up watchpoints for
+marked operations, if all accesses to a variable that is accessed concurrently
+are properly marked, KCSAN will never trigger a watchpoint and therefore never
+report the accesses.
+
+Key Properties
+~~~~~~~~~~~~~~
+
+1. **Memory Overhead:** The overall memory overhead is only a few MiB
+ depending on configuration. The current implementation uses a small array of
+ longs to encode watchpoint information, which is negligible.
+
+2. **Performance Overhead:** KCSAN's runtime aims to be minimal, using an
+ efficient watchpoint encoding that does not require acquiring any shared
+ locks in the fast-path. For kernel boot on a system with 8 CPUs:
+
+ - 5.0x slow-down with the default KCSAN config;
+ - 2.8x slow-down from runtime fast-path overhead only (set very large
+ ``KCSAN_SKIP_WATCH`` and unset ``KCSAN_SKIP_WATCH_RANDOMIZE``).
+
+3. **Annotation Overheads:** Minimal annotations are required outside the KCSAN
+ runtime. As a result, maintenance overheads are minimal as the kernel
+ evolves.
+
+4. **Detects Racy Writes from Devices:** Due to checking data values upon
+ setting up watchpoints, racy writes from devices can also be detected.
+
+5. **Memory Ordering:** KCSAN is *not* explicitly aware of the LKMM's ordering
+ rules; this may result in missed data races (false negatives).
+
+6. **Analysis Accuracy:** For observed executions, due to using a sampling
+ strategy, the analysis is *unsound* (false negatives possible), but aims to
+ be complete (no false positives).
+
+Alternatives Considered
+-----------------------
+
+An alternative data race detection approach for the kernel can be found in the
+`Kernel Thread Sanitizer (KTSAN) <https://github.com/google/ktsan/wiki>`_.
+KTSAN is a happens-before data race detector, which explicitly establishes the
+happens-before order between memory operations, which can then be used to
+determine data races as defined in `Data Races`_.
+
+To build a correct happens-before relation, KTSAN must be aware of all ordering
+rules of the LKMM and synchronization primitives. Unfortunately, any omission
+leads to large numbers of false positives, which is especially detrimental in
+the context of the kernel which includes numerous custom synchronization
+mechanisms. To track the happens-before relation, KTSAN's implementation
+requires metadata for each memory location (shadow memory), which for each page
+corresponds to 4 pages of shadow memory, and can translate into overhead of
+tens of GiB on a large system.
diff --git a/Documentation/dev-tools/kgdb.rst b/Documentation/dev-tools/kgdb.rst
index d38be58..61293f4 100644
--- a/Documentation/dev-tools/kgdb.rst
+++ b/Documentation/dev-tools/kgdb.rst
@@ -274,6 +274,30 @@
on the initial connect, or to use a debugger proxy that allows an
unmodified gdb to do the debugging.
+Kernel parameter: ``kgdboc_earlycon``
+-------------------------------------
+
+If you specify the kernel parameter ``kgdboc_earlycon`` and your serial
+driver registers a boot console that supports polling (doesn't need
+interrupts and implements a nonblocking read() function) kgdb will attempt
+to work using the boot console until it can transition to the regular
+tty driver specified by the ``kgdboc`` parameter.
+
+Normally there is only one boot console (especially that implements the
+read() function) so just adding ``kgdboc_earlycon`` on its own is
+sufficient to make this work. If you have more than one boot console you
+can add the boot console's name to differentiate. Note that names that
+are registered through the boot console layer and the tty layer are not
+the same for the same port.
+
+For instance, on one board to be explicit you might do::
+
+ kgdboc_earlycon=qcom_geni kgdboc=ttyMSM0
+
+If the only boot console on the device was "qcom_geni", you could simplify::
+
+ kgdboc_earlycon kgdboc=ttyMSM0
+
Kernel parameter: ``kgdbwait``
------------------------------
diff --git a/Documentation/dev-tools/kselftest.rst b/Documentation/dev-tools/kselftest.rst
index 61ae13c..469d115 100644
--- a/Documentation/dev-tools/kselftest.rst
+++ b/Documentation/dev-tools/kselftest.rst
@@ -151,6 +151,29 @@
$ cd kselftest
$ ./run_kselftest.sh
+Packaging selftests
+===================
+
+In some cases packaging is desired, such as when tests need to run on a
+different system. To package selftests, run::
+
+ $ make -C tools/testing/selftests gen_tar
+
+This generates a tarball in the `INSTALL_PATH/kselftest-packages` directory. By
+default, `.gz` format is used. The tar format can be overridden by specifying
+a `FORMAT` make variable. Any value recognized by `tar's auto-compress`_ option
+is supported, such as::
+
+ $ make -C tools/testing/selftests gen_tar FORMAT=.xz
+
+`make gen_tar` invokes `make install` so you can use it to package a subset of
+tests by using variables specified in `Running a subset of selftests`_
+section::
+
+ $ make -C tools/testing/selftests gen_tar TARGETS="bpf" FORMAT=.xz
+
+.. _tar's auto-compress: https://www.gnu.org/software/tar/manual/html_node/gzip.html#auto_002dcompress
+
Contributing new tests
======================
@@ -301,7 +324,8 @@
.. kernel-doc:: tools/testing/selftests/kselftest_harness.h
:functions: TH_LOG TEST TEST_SIGNAL FIXTURE FIXTURE_DATA FIXTURE_SETUP
- FIXTURE_TEARDOWN TEST_F TEST_HARNESS_MAIN
+ FIXTURE_TEARDOWN TEST_F TEST_HARNESS_MAIN FIXTURE_VARIANT
+ FIXTURE_VARIANT_ADD
Operators
---------
diff --git a/Documentation/dev-tools/kunit/start.rst b/Documentation/dev-tools/kunit/start.rst
index e1c5ce8..bb112cf 100644
--- a/Documentation/dev-tools/kunit/start.rst
+++ b/Documentation/dev-tools/kunit/start.rst
@@ -32,15 +32,17 @@
options required by the tests.
A good starting point for a ``.kunitconfig`` is the KUnit defconfig:
+
.. code-block:: bash
cd $PATH_TO_LINUX_REPO
cp arch/um/configs/kunit_defconfig .kunitconfig
You can then add any other Kconfig options you wish, e.g.:
+
.. code-block:: none
- CONFIG_LIST_KUNIT_TEST=y
+ CONFIG_LIST_KUNIT_TEST=y
:doc:`kunit_tool <kunit-tool>` will ensure that all config options set in
``.kunitconfig`` are set in the kernel ``.config`` before running the tests.
@@ -54,8 +56,8 @@
other tools (such as make menuconfig) to adjust other config options.
-Running the tests
------------------
+Running the tests (KUnit Wrapper)
+---------------------------------
To make sure that everything is set up correctly, simply invoke the Python
wrapper from your kernel repo:
@@ -105,8 +107,9 @@
KUnit and KUnit tests can be compiled as modules: in this case the tests in a
module will be run when the module is loaded.
-Running the tests
------------------
+
+Running the tests (w/o KUnit Wrapper)
+-------------------------------------
Build and run your kernel as usual. Test output will be written to the kernel
log in `TAP <https://testanything.org/>`_ format.
diff --git a/Documentation/dev-tools/kunit/usage.rst b/Documentation/dev-tools/kunit/usage.rst
index 473a236..3c3fe8b 100644
--- a/Documentation/dev-tools/kunit/usage.rst
+++ b/Documentation/dev-tools/kunit/usage.rst
@@ -595,7 +595,7 @@
KUnit debugfs representation
============================
When kunit test suites are initialized, they create an associated directory
-in /sys/kernel/debug/kunit/<test-suite>. The directory contains one file
+in ``/sys/kernel/debug/kunit/<test-suite>``. The directory contains one file
- results: "cat results" displays results of each test case and the results
of the entire suite for the last test run.
@@ -604,4 +604,4 @@
run in a native environment, either as modules or builtin. Having a way
to display results like this is valuable as otherwise results can be
intermixed with other events in dmesg output. The maximum size of each
-results file is KUNIT_LOG_SIZE bytes (defined in include/kunit/test.h).
+results file is KUNIT_LOG_SIZE bytes (defined in ``include/kunit/test.h``).
diff --git a/Documentation/devicetree/bindings/ABI.rst b/Documentation/devicetree/bindings/ABI.rst
new file mode 100644
index 0000000..a885713c
--- /dev/null
+++ b/Documentation/devicetree/bindings/ABI.rst
@@ -0,0 +1,42 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+===================
+Devicetree (DT) ABI
+===================
+
+I. Regarding stable bindings/ABI, we quote from the 2013 ARM mini-summit
+ summary document:
+
+ "That still leaves the question of, what does a stable binding look
+ like? Certainly a stable binding means that a newer kernel will not
+ break on an older device tree, but that doesn't mean the binding is
+ frozen for all time. Grant said there are ways to change bindings that
+ don't result in breakage. For instance, if a new property is added,
+ then default to the previous behaviour if it is missing. If a binding
+ truly needs an incompatible change, then change the compatible string
+ at the same time. The driver can bind against both the old and the
+ new. These guidelines aren't new, but they desperately need to be
+ documented."
+
+II. General binding rules
+
+ 1) Maintainers, don't let perfect be the enemy of good. Don't hold up a
+ binding because it isn't perfect.
+
+ 2) Use specific compatible strings so that if we need to add a feature (DMA)
+ in the future, we can create a new compatible string. See I.
+
+ 3) Bindings can be augmented, but the driver shouldn't break when given
+ the old binding. ie. add additional properties, but don't change the
+ meaning of an existing property. For drivers, default to the original
+ behaviour when a newly added property is missing.
+
+ 4) Don't submit bindings for staging or unstable. That will be decided by
+ the devicetree maintainers *after* discussion on the mailinglist.
+
+III. Notes
+
+ 1) This document is intended as a general familiarization with the process as
+ decided at the 2013 Kernel Summit. When in doubt, the current word of the
+ devicetree maintainers overrules this document. In that situation, a patch
+ updating this document would be appreciated.
diff --git a/Documentation/devicetree/bindings/ABI.txt b/Documentation/devicetree/bindings/ABI.txt
deleted file mode 100644
index d25f8d3..0000000
--- a/Documentation/devicetree/bindings/ABI.txt
+++ /dev/null
@@ -1,39 +0,0 @@
-
- Devicetree (DT) ABI
-
-I. Regarding stable bindings/ABI, we quote from the 2013 ARM mini-summit
- summary document:
-
- "That still leaves the question of, what does a stable binding look
- like? Certainly a stable binding means that a newer kernel will not
- break on an older device tree, but that doesn't mean the binding is
- frozen for all time. Grant said there are ways to change bindings that
- don't result in breakage. For instance, if a new property is added,
- then default to the previous behaviour if it is missing. If a binding
- truly needs an incompatible change, then change the compatible string
- at the same time. The driver can bind against both the old and the
- new. These guidelines aren't new, but they desperately need to be
- documented."
-
-II. General binding rules
-
- 1) Maintainers, don't let perfect be the enemy of good. Don't hold up a
- binding because it isn't perfect.
-
- 2) Use specific compatible strings so that if we need to add a feature (DMA)
- in the future, we can create a new compatible string. See I.
-
- 3) Bindings can be augmented, but the driver shouldn't break when given
- the old binding. ie. add additional properties, but don't change the
- meaning of an existing property. For drivers, default to the original
- behaviour when a newly added property is missing.
-
- 4) Don't submit bindings for staging or unstable. That will be decided by
- the devicetree maintainers *after* discussion on the mailinglist.
-
-III. Notes
-
- 1) This document is intended as a general familiarization with the process as
- decided at the 2013 Kernel Summit. When in doubt, the current word of the
- devicetree maintainers overrules this document. In that situation, a patch
- updating this document would be appreciated.
diff --git a/Documentation/devicetree/bindings/Makefile b/Documentation/devicetree/bindings/Makefile
index 1df680d..a638989 100644
--- a/Documentation/devicetree/bindings/Makefile
+++ b/Documentation/devicetree/bindings/Makefile
@@ -2,27 +2,38 @@
DT_DOC_CHECKER ?= dt-doc-validate
DT_EXTRACT_EX ?= dt-extract-example
DT_MK_SCHEMA ?= dt-mk-schema
+DT_MK_SCHEMA_USERONLY_FLAG := $(if $(DT_SCHEMA_FILES), -u)
+
+DT_SCHEMA_MIN_VERSION = 2020.5
+
+PHONY += check_dtschema_version
+check_dtschema_version:
+ @{ echo $(DT_SCHEMA_MIN_VERSION); \
+ $(DT_DOC_CHECKER) --version 2>/dev/null || echo 0; } | sort -VC || \
+ { echo "ERROR: dtschema minimum version is v$(DT_SCHEMA_MIN_VERSION)" >&2; false; }
quiet_cmd_chk_binding = CHKDT $(patsubst $(srctree)/%,%,$<)
cmd_chk_binding = $(DT_DOC_CHECKER) -u $(srctree)/$(src) $< ; \
$(DT_EXTRACT_EX) $< > $@
-$(obj)/%.example.dts: $(src)/%.yaml FORCE
+$(obj)/%.example.dts: $(src)/%.yaml check_dtschema_version FORCE
$(call if_changed,chk_binding)
# Use full schemas when checking %.example.dts
DT_TMP_SCHEMA := $(obj)/processed-schema-examples.yaml
-quiet_cmd_mk_schema = SCHEMA $@
- cmd_mk_schema = $(DT_MK_SCHEMA) $(DT_MK_SCHEMA_FLAGS) -o $@ $(real-prereqs)
-
-DT_DOCS = $(addprefix $(src)/, \
- $(shell \
- cd $(srctree)/$(src) && \
- find * \( -name '*.yaml' ! \
+find_cmd = find $(srctree)/$(src) \( -name '*.yaml' ! \
-name 'processed-schema*' ! \
- -name '*.example.dt.yaml' \) \
- ))
+ -name '*.example.dt.yaml' \)
+
+quiet_cmd_mk_schema = SCHEMA $@
+ cmd_mk_schema = rm -f $@ ; \
+ $(if $(DT_MK_SCHEMA_FLAGS), \
+ echo $(real-prereqs), \
+ $(find_cmd)) | \
+ xargs $(DT_MK_SCHEMA) $(DT_MK_SCHEMA_FLAGS) >> $@
+
+DT_DOCS = $(shell $(find_cmd) | sed -e 's|^$(srctree)/||')
DT_SCHEMA_FILES ?= $(DT_DOCS)
@@ -34,11 +45,11 @@
-Wno-avoid_unnecessary_addr_size \
-Wno-graph_child_address
-$(obj)/processed-schema-examples.yaml: $(DT_DOCS) FORCE
+$(obj)/processed-schema-examples.yaml: $(DT_DOCS) check_dtschema_version FORCE
$(call if_changed,mk_schema)
-$(obj)/processed-schema.yaml: DT_MK_SCHEMA_FLAGS := -u
-$(obj)/processed-schema.yaml: $(DT_SCHEMA_FILES) FORCE
+$(obj)/processed-schema.yaml: DT_MK_SCHEMA_FLAGS := $(DT_MK_SCHEMA_USERONLY_FLAG)
+$(obj)/processed-schema.yaml: $(DT_SCHEMA_FILES) check_dtschema_version FORCE
$(call if_changed,mk_schema)
extra-y += processed-schema.yaml
diff --git a/Documentation/devicetree/bindings/arm/altera.yaml b/Documentation/devicetree/bindings/arm/altera.yaml
index 49e0362..b388c5aa 100644
--- a/Documentation/devicetree/bindings/arm/altera.yaml
+++ b/Documentation/devicetree/bindings/arm/altera.yaml
@@ -13,8 +13,8 @@
compatible:
items:
- enum:
- - altr,socfpga-cyclone5
- - altr,socfpga-arria5
- - altr,socfpga-arria10
+ - altr,socfpga-cyclone5
+ - altr,socfpga-arria5
+ - altr,socfpga-arria10
- const: altr,socfpga
...
diff --git a/Documentation/devicetree/bindings/arm/amlogic.yaml b/Documentation/devicetree/bindings/arm/amlogic.yaml
index f74aba4..378229f 100644
--- a/Documentation/devicetree/bindings/arm/amlogic.yaml
+++ b/Documentation/devicetree/bindings/arm/amlogic.yaml
@@ -17,7 +17,7 @@
any time. Be sure to use a device tree binary and a kernel image
generated from the same source tree.
- Please refer to Documentation/devicetree/bindings/ABI.txt for a definition of a
+ Please refer to Documentation/devicetree/bindings/ABI.rst for a definition of a
stable binding/ABI.
properties:
@@ -107,6 +107,7 @@
- amlogic,p231
- libretech,aml-s905d-pc
- phicomm,n1
+ - smartlabs,sml5442tw
- const: amlogic,s905d
- const: amlogic,meson-gxl
@@ -148,6 +149,8 @@
- description: Boards with the Amlogic Meson G12B S922X SoC
items:
- enum:
+ - azw,gtking
+ - azw,gtking-pro
- hardkernel,odroid-n2
- khadas,vim3
- ugoos,am6
@@ -159,6 +162,7 @@
- enum:
- seirobotics,sei610
- khadas,vim3l
+ - hardkernel,odroid-c4
- const: amlogic,sm1
- description: Boards with the Amlogic Meson A1 A113L SoC
diff --git a/Documentation/devicetree/bindings/arm/amlogic/amlogic,meson-gx-ao-secure.yaml b/Documentation/devicetree/bindings/arm/amlogic/amlogic,meson-gx-ao-secure.yaml
index 66213bd..6cc7452 100644
--- a/Documentation/devicetree/bindings/arm/amlogic/amlogic,meson-gx-ao-secure.yaml
+++ b/Documentation/devicetree/bindings/arm/amlogic/amlogic,meson-gx-ao-secure.yaml
@@ -25,7 +25,7 @@
properties:
compatible:
- items:
+ items:
- const: amlogic,meson-gx-ao-secure
- const: syscon
diff --git a/Documentation/devicetree/bindings/arm/arm,scmi.txt b/Documentation/devicetree/bindings/arm/arm,scmi.txt
index dc102c4e..1f293ea 100644
--- a/Documentation/devicetree/bindings/arm/arm,scmi.txt
+++ b/Documentation/devicetree/bindings/arm/arm,scmi.txt
@@ -14,7 +14,7 @@
The scmi node with the following properties shall be under the /firmware/ node.
-- compatible : shall be "arm,scmi"
+- compatible : shall be "arm,scmi" or "arm,scmi-smc" for smc/hvc transports
- mboxes: List of phandle and mailbox channel specifiers. It should contain
exactly one or two mailboxes, one for transmitting messages("tx")
and another optional for receiving the notifications("rx") if
@@ -25,6 +25,7 @@
protocol identifier for a given sub-node.
- #size-cells : should be '0' as 'reg' property doesn't have any size
associated with it.
+- arm,smc-id : SMC id required when using smc or hvc transports
Optional properties:
diff --git a/Documentation/devicetree/bindings/arm/arm,vexpress-juno.yaml b/Documentation/devicetree/bindings/arm/arm,vexpress-juno.yaml
index 8c06a73..a3420c8 100644
--- a/Documentation/devicetree/bindings/arm/arm,vexpress-juno.yaml
+++ b/Documentation/devicetree/bindings/arm/arm,vexpress-juno.yaml
@@ -131,26 +131,23 @@
property, describing the physical location of the children nodes.
0 means motherboard site, while 1 and 2 are daughterboard sites, and
0xf means "sisterboard" which is the site containing the main CPU tile.
- allOf:
- - $ref: '/schemas/types.yaml#/definitions/uint32'
- - minimum: 0
- maximum: 15
+ $ref: '/schemas/types.yaml#/definitions/uint32'
+ minimum: 0
+ maximum: 15
arm,vexpress,position:
description: When daughterboards are stacked on one site, their position
in the stack be be described this attribute.
- allOf:
- - $ref: '/schemas/types.yaml#/definitions/uint32'
- - minimum: 0
- maximum: 3
+ $ref: '/schemas/types.yaml#/definitions/uint32'
+ minimum: 0
+ maximum: 3
arm,vexpress,dcc:
description: When describing tiles consisting of more than one DCC, its
number can be specified with this attribute.
- allOf:
- - $ref: '/schemas/types.yaml#/definitions/uint32'
- - minimum: 0
- maximum: 3
+ $ref: '/schemas/types.yaml#/definitions/uint32'
+ minimum: 0
+ maximum: 3
patternProperties:
"^bus@[0-9a-f]+$":
@@ -162,8 +159,7 @@
"simple-bus". If the compatible is placed in the "motherboard" node,
it is stricter and always has two compatibles.
type: object
- allOf:
- - $ref: '/schemas/simple-bus.yaml'
+ $ref: '/schemas/simple-bus.yaml'
properties:
compatible:
@@ -195,11 +191,11 @@
- const: simple-bus
arm,v2m-memory-map:
description: This describes the memory map type.
- allOf:
- - $ref: '/schemas/types.yaml#/definitions/string'
- - enum:
- - rs1
- - rs2
+ $ref: '/schemas/types.yaml#/definitions/string'
+ enum:
+ - rs1
+ - rs2
+
required:
- compatible
required:
diff --git a/Documentation/devicetree/bindings/arm/atmel-at91.yaml b/Documentation/devicetree/bindings/arm/atmel-at91.yaml
index 0357314..31b0c54 100644
--- a/Documentation/devicetree/bindings/arm/atmel-at91.yaml
+++ b/Documentation/devicetree/bindings/arm/atmel-at91.yaml
@@ -82,6 +82,13 @@
- const: atmel,sama5d2
- const: atmel,sama5
+ - description: Microchip SAMA5D2 Industrial Connectivity Platform
+ items:
+ - const: microchip,sama5d2-icp
+ - const: atmel,sama5d27
+ - const: atmel,sama5d2
+ - const: atmel,sama5
+
- description: SAM9X60-EK board
items:
- const: microchip,sam9x60ek
diff --git a/Documentation/devicetree/bindings/arm/bitmain.yaml b/Documentation/devicetree/bindings/arm/bitmain.yaml
index 0efdb4ac..5cd5b36 100644
--- a/Documentation/devicetree/bindings/arm/bitmain.yaml
+++ b/Documentation/devicetree/bindings/arm/bitmain.yaml
@@ -13,6 +13,6 @@
compatible:
items:
- enum:
- - bitmain,sophon-edge
+ - bitmain,sophon-edge
- const: bitmain,bm1880
...
diff --git a/Documentation/devicetree/bindings/arm/calxeda/hb-sregs.yaml b/Documentation/devicetree/bindings/arm/calxeda/hb-sregs.yaml
new file mode 100644
index 0000000..dfdc970
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/calxeda/hb-sregs.yaml
@@ -0,0 +1,49 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/arm/calxeda/hb-sregs.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Calxeda Highbank system registers
+
+description: |
+ The Calxeda Highbank system has a block of MMIO registers controlling
+ several generic system aspects. Those can be used to control some power
+ management, they also contain some gate and PLL clocks.
+
+maintainers:
+ - Andre Przywara <andre.przywara@arm.com>
+
+properties:
+ compatible:
+ const: calxeda,hb-sregs
+
+ reg:
+ maxItems: 1
+
+ clocks:
+ type: object
+
+required:
+ - compatible
+ - reg
+
+additionalProperties: false
+
+examples:
+ - |
+ sregs@fff3c000 {
+ compatible = "calxeda,hb-sregs";
+ reg = <0xfff3c000 0x1000>;
+
+ clocks {
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ osc: oscillator {
+ #clock-cells = <0>;
+ compatible = "fixed-clock";
+ clock-frequency = <33333000>;
+ };
+ };
+ };
diff --git a/Documentation/devicetree/bindings/arm/calxeda/l2ecc.txt b/Documentation/devicetree/bindings/arm/calxeda/l2ecc.txt
deleted file mode 100644
index 94e642a..0000000
--- a/Documentation/devicetree/bindings/arm/calxeda/l2ecc.txt
+++ /dev/null
@@ -1,15 +0,0 @@
-Calxeda Highbank L2 cache ECC
-
-Properties:
-- compatible : Should be "calxeda,hb-sregs-l2-ecc"
-- reg : Address and size for ECC error interrupt clear registers.
-- interrupts : Should be single bit error interrupt, then double bit error
- interrupt.
-
-Example:
-
- sregs@fff3c200 {
- compatible = "calxeda,hb-sregs-l2-ecc";
- reg = <0xfff3c200 0x100>;
- interrupts = <0 71 4 0 72 4>;
- };
diff --git a/Documentation/devicetree/bindings/arm/calxeda/l2ecc.yaml b/Documentation/devicetree/bindings/arm/calxeda/l2ecc.yaml
new file mode 100644
index 0000000..a9fe012
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/calxeda/l2ecc.yaml
@@ -0,0 +1,42 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/arm/calxeda/l2ecc.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Calxeda Highbank L2 cache ECC
+
+description: |
+ Binding for the Calxeda Highbank L2 cache controller ECC device.
+ This does not cover the actual L2 cache controller control registers,
+ but just the error reporting functionality.
+
+maintainers:
+ - Andre Przywara <andre.przywara@arm.com>
+
+properties:
+ compatible:
+ const: "calxeda,hb-sregs-l2-ecc"
+
+ reg:
+ maxItems: 1
+
+ interrupts:
+ items:
+ - description: single bit error interrupt
+ - description: double bit error interrupt
+
+required:
+ - compatible
+ - reg
+ - interrupts
+
+additionalProperties: false
+
+examples:
+ - |
+ sregs@fff3c200 {
+ compatible = "calxeda,hb-sregs-l2-ecc";
+ reg = <0xfff3c200 0x100>;
+ interrupts = <0 71 4>, <0 72 4>;
+ };
diff --git a/Documentation/devicetree/bindings/arm/coresight-cti.yaml b/Documentation/devicetree/bindings/arm/coresight-cti.yaml
index 3db3642..17df5cd 100644
--- a/Documentation/devicetree/bindings/arm/coresight-cti.yaml
+++ b/Documentation/devicetree/bindings/arm/coresight-cti.yaml
@@ -140,16 +140,14 @@
maxItems: 1
arm,trig-in-sigs:
- allOf:
- - $ref: /schemas/types.yaml#/definitions/uint32-array
+ $ref: /schemas/types.yaml#/definitions/uint32-array
minItems: 1
maxItems: 32
description:
List of CTI trigger in signal numbers in use by a trig-conns node.
arm,trig-in-types:
- allOf:
- - $ref: /schemas/types.yaml#/definitions/uint32-array
+ $ref: /schemas/types.yaml#/definitions/uint32-array
minItems: 1
maxItems: 32
description:
@@ -159,16 +157,14 @@
completely, then the types will default to GEN_IO.
arm,trig-out-sigs:
- allOf:
- - $ref: /schemas/types.yaml#/definitions/uint32-array
+ $ref: /schemas/types.yaml#/definitions/uint32-array
minItems: 1
maxItems: 32
description:
List of CTI trigger out signal numbers in use by a trig-conns node.
arm,trig-out-types:
- allOf:
- - $ref: /schemas/types.yaml#/definitions/uint32-array
+ $ref: /schemas/types.yaml#/definitions/uint32-array
minItems: 1
maxItems: 32
description:
@@ -178,8 +174,7 @@
or omitted completely, then the types will default to GEN_IO.
arm,trig-filters:
- allOf:
- - $ref: /schemas/types.yaml#/definitions/uint32-array
+ $ref: /schemas/types.yaml#/definitions/uint32-array
minItems: 1
maxItems: 32
description:
@@ -187,8 +182,7 @@
active, unless filtering is disabled on the driver.
arm,trig-conn-name:
- allOf:
- - $ref: /schemas/types.yaml#/definitions/string
+ $ref: /schemas/types.yaml#/definitions/string
description:
Defines a connection name that will be displayed, if the cpu or
arm,cs-dev-assoc properties are not being used in this connection.
@@ -301,7 +295,7 @@
- |
cti@20110000 {
compatible = "arm,coresight-cti", "arm,primecell";
- reg = <0 0x20110000 0 0x1000>;
+ reg = <0x20110000 0x1000>;
clocks = <&soc_smc50mhz>;
clock-names = "apb_pclk";
diff --git a/Documentation/devicetree/bindings/arm/cpus.yaml b/Documentation/devicetree/bindings/arm/cpus.yaml
index a018147..40f692c 100644
--- a/Documentation/devicetree/bindings/arm/cpus.yaml
+++ b/Documentation/devicetree/bindings/arm/cpus.yaml
@@ -167,53 +167,53 @@
- qcom,kryo260
- qcom,kryo280
- qcom,kryo385
+ - qcom,kryo468
- qcom,kryo485
- qcom,scorpion
enable-method:
- allOf:
- - $ref: '/schemas/types.yaml#/definitions/string'
- - oneOf:
- # On ARM v8 64-bit this property is required
- - enum:
- - psci
- - spin-table
- # On ARM 32-bit systems this property is optional
- - enum:
- - actions,s500-smp
- - allwinner,sun6i-a31
- - allwinner,sun8i-a23
- - allwinner,sun9i-a80-smp
- - allwinner,sun8i-a83t-smp
- - amlogic,meson8-smp
- - amlogic,meson8b-smp
- - arm,realview-smp
- - aspeed,ast2600-smp
- - brcm,bcm11351-cpu-method
- - brcm,bcm23550
- - brcm,bcm2836-smp
- - brcm,bcm63138
- - brcm,bcm-nsp-smp
- - brcm,brahma-b15
- - marvell,armada-375-smp
- - marvell,armada-380-smp
- - marvell,armada-390-smp
- - marvell,armada-xp-smp
- - marvell,98dx3236-smp
- - marvell,mmp3-smp
- - mediatek,mt6589-smp
- - mediatek,mt81xx-tz-smp
- - qcom,gcc-msm8660
- - qcom,kpss-acc-v1
- - qcom,kpss-acc-v2
- - renesas,apmu
- - renesas,r9a06g032-smp
- - rockchip,rk3036-smp
- - rockchip,rk3066-smp
- - socionext,milbeaut-m10v-smp
- - ste,dbx500-smp
- - ti,am3352
- - ti,am4372
+ $ref: '/schemas/types.yaml#/definitions/string'
+ oneOf:
+ # On ARM v8 64-bit this property is required
+ - enum:
+ - psci
+ - spin-table
+ # On ARM 32-bit systems this property is optional
+ - enum:
+ - actions,s500-smp
+ - allwinner,sun6i-a31
+ - allwinner,sun8i-a23
+ - allwinner,sun9i-a80-smp
+ - allwinner,sun8i-a83t-smp
+ - amlogic,meson8-smp
+ - amlogic,meson8b-smp
+ - arm,realview-smp
+ - aspeed,ast2600-smp
+ - brcm,bcm11351-cpu-method
+ - brcm,bcm23550
+ - brcm,bcm2836-smp
+ - brcm,bcm63138
+ - brcm,bcm-nsp-smp
+ - brcm,brahma-b15
+ - marvell,armada-375-smp
+ - marvell,armada-380-smp
+ - marvell,armada-390-smp
+ - marvell,armada-xp-smp
+ - marvell,98dx3236-smp
+ - marvell,mmp3-smp
+ - mediatek,mt6589-smp
+ - mediatek,mt81xx-tz-smp
+ - qcom,gcc-msm8660
+ - qcom,kpss-acc-v1
+ - qcom,kpss-acc-v2
+ - renesas,apmu
+ - renesas,r9a06g032-smp
+ - rockchip,rk3036-smp
+ - rockchip,rk3066-smp
+ - socionext,milbeaut-m10v-smp
+ - ste,dbx500-smp
+ - ti,am3352
+ - ti,am4372
cpu-release-addr:
$ref: '/schemas/types.yaml#/definitions/uint64'
diff --git a/Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt b/Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt
index 623fedf..7150474 100644
--- a/Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt
+++ b/Documentation/devicetree/bindings/arm/freescale/fsl,scu.txt
@@ -108,7 +108,8 @@
Required properties:
- compatible: Should be one of:
"fsl,imx8qm-iomuxc",
- "fsl,imx8qxp-iomuxc".
+ "fsl,imx8qxp-iomuxc",
+ "fsl,imx8dxl-iomuxc".
Required properties for Pinctrl sub nodes:
- fsl,pins: Each entry consists of 3 integers which represents
@@ -116,7 +117,8 @@
integers <pin_id mux_mode> are specified using a
PIN_FUNC_ID macro, which can be found in
<dt-bindings/pinctrl/pads-imx8qm.h>,
- <dt-bindings/pinctrl/pads-imx8qxp.h>.
+ <dt-bindings/pinctrl/pads-imx8qxp.h>,
+ <dt-bindings/pinctrl/pads-imx8dxl.h>.
The last integer CONFIG is the pad setting value like
pull-up on this pin.
diff --git a/Documentation/devicetree/bindings/arm/fsl.yaml b/Documentation/devicetree/bindings/arm/fsl.yaml
index cd3fbe7..05906e2 100644
--- a/Documentation/devicetree/bindings/arm/fsl.yaml
+++ b/Documentation/devicetree/bindings/arm/fsl.yaml
@@ -119,6 +119,7 @@
- fsl,imx6q-sabreauto
- fsl,imx6q-sabrelite
- fsl,imx6q-sabresd
+ - kontron,imx6q-samx6i # Kontron i.MX6 Dual/Quad SMARC Module
- technexion,imx6q-pico-dwarf # TechNexion i.MX6Q Pico-Dwarf
- technexion,imx6q-pico-hobbit # TechNexion i.MX6Q Pico-Hobbit
- technexion,imx6q-pico-nymph # TechNexion i.MX6Q Pico-Nymph
@@ -170,6 +171,7 @@
- emtrion,emcon-mx6-avari # emCON-MX6S or emCON-MX6DL SoM on Avari Base
- fsl,imx6dl-sabreauto # i.MX6 DualLite/Solo SABRE Automotive Board
- fsl,imx6dl-sabresd # i.MX6 DualLite SABRE Smart Device Board
+ - kontron,imx6dl-samx6i # Kontron i.MX6 Solo SMARC Module
- technexion,imx6dl-pico-dwarf # TechNexion i.MX6DL Pico-Dwarf
- technexion,imx6dl-pico-hobbit # TechNexion i.MX6DL Pico-Hobbit
- technexion,imx6dl-pico-nymph # TechNexion i.MX6DL Pico-Nymph
@@ -177,7 +179,9 @@
- technologic,imx6dl-ts4900
- technologic,imx6dl-ts7970
- toradex,colibri_imx6dl # Colibri iMX6 Module
+ - toradex,colibri_imx6dl-v1_1 # Colibri iMX6 Module V1.1
- toradex,colibri_imx6dl-eval-v3 # Colibri iMX6 Module on Colibri Evaluation Board V3
+ - toradex,colibri_imx6dl-v1_1-eval-v3 # Colibri iMX6 Module V1.1 on Colibri Evaluation Board V3
- ysoft,imx6dl-yapp4-draco # i.MX6 DualLite Y Soft IOTA Draco board
- ysoft,imx6dl-yapp4-hydra # i.MX6 DualLite Y Soft IOTA Hydra board
- ysoft,imx6dl-yapp4-ursa # i.MX6 Solo Y Soft IOTA Ursa board
diff --git a/Documentation/devicetree/bindings/arm/l2c2x0.yaml b/Documentation/devicetree/bindings/arm/l2c2x0.yaml
index 5d1d50e..6b8f4d4 100644
--- a/Documentation/devicetree/bindings/arm/l2c2x0.yaml
+++ b/Documentation/devicetree/bindings/arm/l2c2x0.yaml
@@ -70,43 +70,39 @@
description: Cycles of latency for Data RAM accesses. Specifies 3 cells of
read, write and setup latencies. Minimum valid values are 1. Controllers
without setup latency control should use a value of 0.
- allOf:
- - $ref: /schemas/types.yaml#/definitions/uint32-array
- - minItems: 2
- maxItems: 3
- items:
- minimum: 0
- maximum: 8
+ $ref: /schemas/types.yaml#/definitions/uint32-array
+ minItems: 2
+ maxItems: 3
+ items:
+ minimum: 0
+ maximum: 8
arm,tag-latency:
description: Cycles of latency for Tag RAM accesses. Specifies 3 cells of
read, write and setup latencies. Controllers without setup latency control
should use 0. Controllers without separate read and write Tag RAM latency
values should only use the first cell.
- allOf:
- - $ref: /schemas/types.yaml#/definitions/uint32-array
- - minItems: 1
- maxItems: 3
- items:
- minimum: 0
- maximum: 8
+ $ref: /schemas/types.yaml#/definitions/uint32-array
+ minItems: 1
+ maxItems: 3
+ items:
+ minimum: 0
+ maximum: 8
arm,dirty-latency:
description: Cycles of latency for Dirty RAMs. This is a single cell.
- allOf:
- - $ref: /schemas/types.yaml#/definitions/uint32
- - minimum: 1
- maximum: 8
+ $ref: /schemas/types.yaml#/definitions/uint32
+ minimum: 1
+ maximum: 8
arm,filter-ranges:
description: <start length> Starting address and length of window to
filter. Addresses in the filter window are directed to the M1 port. Other
addresses will go to the M0 port.
- allOf:
- - $ref: /schemas/types.yaml#/definitions/uint32-array
- - items:
- minItems: 2
- maxItems: 2
+ $ref: /schemas/types.yaml#/definitions/uint32-array
+ items:
+ minItems: 2
+ maxItems: 2
arm,io-coherent:
description: indicates that the system is operating in an hardware
@@ -131,36 +127,31 @@
arm,double-linefill:
description: Override double linefill enable setting. Enable if
non-zero, disable if zero.
- allOf:
- - $ref: /schemas/types.yaml#/definitions/uint32
- - enum: [ 0, 1 ]
+ $ref: /schemas/types.yaml#/definitions/uint32
+ enum: [0, 1]
arm,double-linefill-incr:
description: Override double linefill on INCR read. Enable
if non-zero, disable if zero.
- allOf:
- - $ref: /schemas/types.yaml#/definitions/uint32
- - enum: [ 0, 1 ]
+ $ref: /schemas/types.yaml#/definitions/uint32
+ enum: [0, 1]
arm,double-linefill-wrap:
description: Override double linefill on WRAP read. Enable
if non-zero, disable if zero.
- allOf:
- - $ref: /schemas/types.yaml#/definitions/uint32
- - enum: [ 0, 1 ]
+ $ref: /schemas/types.yaml#/definitions/uint32
+ enum: [0, 1]
arm,prefetch-drop:
description: Override prefetch drop enable setting. Enable if non-zero,
disable if zero.
- allOf:
- - $ref: /schemas/types.yaml#/definitions/uint32
- - enum: [ 0, 1 ]
+ $ref: /schemas/types.yaml#/definitions/uint32
+ enum: [0, 1]
arm,prefetch-offset:
description: Override prefetch offset value.
- allOf:
- - $ref: /schemas/types.yaml#/definitions/uint32
- - enum: [ 0, 1, 2, 3, 4, 5, 6, 7, 15, 23, 31 ]
+ $ref: /schemas/types.yaml#/definitions/uint32
+ enum: [0, 1, 2, 3, 4, 5, 6, 7, 15, 23, 31]
arm,shared-override:
description: The default behavior of the L220 or PL310 cache
@@ -193,35 +184,31 @@
description: |
Data prefetch. Value: <0> (forcibly disable), <1>
(forcibly enable), property absent (retain settings set by firmware)
- allOf:
- - $ref: /schemas/types.yaml#/definitions/uint32
- - enum: [ 0, 1 ]
+ $ref: /schemas/types.yaml#/definitions/uint32
+ enum: [0, 1]
prefetch-instr:
description: |
Instruction prefetch. Value: <0> (forcibly disable),
<1> (forcibly enable), property absent (retain settings set by
firmware)
- allOf:
- - $ref: /schemas/types.yaml#/definitions/uint32
- - enum: [ 0, 1 ]
+ $ref: /schemas/types.yaml#/definitions/uint32
+ enum: [0, 1]
arm,dynamic-clock-gating:
description: |
L2 dynamic clock gating. Value: <0> (forcibly
disable), <1> (forcibly enable), property absent (OS specific behavior,
preferably retain firmware settings)
- allOf:
- - $ref: /schemas/types.yaml#/definitions/uint32
- - enum: [ 0, 1 ]
+ $ref: /schemas/types.yaml#/definitions/uint32
+ enum: [0, 1]
arm,standby-mode:
description: L2 standby mode enable. Value <0> (forcibly disable),
<1> (forcibly enable), property absent (OS specific behavior,
preferably retain firmware settings)
- allOf:
- - $ref: /schemas/types.yaml#/definitions/uint32
- - enum: [ 0, 1 ]
+ $ref: /schemas/types.yaml#/definitions/uint32
+ enum: [0, 1]
arm,early-bresp-disable:
description: Disable the CA9 optimization Early BRESP (PL310)
diff --git a/Documentation/devicetree/bindings/arm/mediatek.yaml b/Documentation/devicetree/bindings/arm/mediatek.yaml
index 4043c50..abc544d 100644
--- a/Documentation/devicetree/bindings/arm/mediatek.yaml
+++ b/Documentation/devicetree/bindings/arm/mediatek.yaml
@@ -84,6 +84,28 @@
- enum:
- mediatek,mt8135-evbp1
- const: mediatek,mt8135
+ - description: Google Elm (Acer Chromebook R13)
+ items:
+ - const: google,elm-rev8
+ - const: google,elm-rev7
+ - const: google,elm-rev6
+ - const: google,elm-rev5
+ - const: google,elm-rev4
+ - const: google,elm-rev3
+ - const: google,elm
+ - const: mediatek,mt8173
+ - description: Google Hana (Lenovo Chromebook N23 Yoga, C330, 300e,...)
+ items:
+ - const: google,hana-rev6
+ - const: google,hana-rev5
+ - const: google,hana-rev4
+ - const: google,hana-rev3
+ - const: google,hana
+ - const: mediatek,mt8173
+ - description: Google Hana rev7 (Poin2 Chromebook 11C)
+ items:
+ - const: google,hana-rev7
+ - const: mediatek,mt8173
- items:
- enum:
- mediatek,mt8173-evb
diff --git a/Documentation/devicetree/bindings/arm/mediatek/mediatek,apmixedsys.txt b/Documentation/devicetree/bindings/arm/mediatek/mediatek,apmixedsys.txt
index ff000cc..bd7a0fa 100644
--- a/Documentation/devicetree/bindings/arm/mediatek/mediatek,apmixedsys.txt
+++ b/Documentation/devicetree/bindings/arm/mediatek/mediatek,apmixedsys.txt
@@ -8,6 +8,7 @@
- compatible: Should be one of:
- "mediatek,mt2701-apmixedsys"
- "mediatek,mt2712-apmixedsys", "syscon"
+ - "mediatek,mt6765-apmixedsys", "syscon"
- "mediatek,mt6779-apmixedsys", "syscon"
- "mediatek,mt6797-apmixedsys"
- "mediatek,mt7622-apmixedsys"
diff --git a/Documentation/devicetree/bindings/arm/mediatek/mediatek,audsys.txt b/Documentation/devicetree/bindings/arm/mediatek/mediatek,audsys.txt
index e4ca7b7..38309db 100644
--- a/Documentation/devicetree/bindings/arm/mediatek/mediatek,audsys.txt
+++ b/Documentation/devicetree/bindings/arm/mediatek/mediatek,audsys.txt
@@ -7,6 +7,7 @@
- compatible: Should be one of:
- "mediatek,mt2701-audsys", "syscon"
+ - "mediatek,mt6765-audsys", "syscon"
- "mediatek,mt6779-audio", "syscon"
- "mediatek,mt7622-audsys", "syscon"
- "mediatek,mt7623-audsys", "mediatek,mt2701-audsys", "syscon"
diff --git a/Documentation/devicetree/bindings/arm/mediatek/mediatek,camsys.txt b/Documentation/devicetree/bindings/arm/mediatek/mediatek,camsys.txt
index 1f4aaa1..a0ce820 100644
--- a/Documentation/devicetree/bindings/arm/mediatek/mediatek,camsys.txt
+++ b/Documentation/devicetree/bindings/arm/mediatek/mediatek,camsys.txt
@@ -6,6 +6,7 @@
Required Properties:
- compatible: Should be one of:
+ - "mediatek,mt6765-camsys", "syscon"
- "mediatek,mt6779-camsys", "syscon"
- "mediatek,mt8183-camsys", "syscon"
- #clock-cells: Must be 1
diff --git a/Documentation/devicetree/bindings/arm/mediatek/mediatek,imgsys.txt b/Documentation/devicetree/bindings/arm/mediatek/mediatek,imgsys.txt
index 2b693e3..1e1f007 100644
--- a/Documentation/devicetree/bindings/arm/mediatek/mediatek,imgsys.txt
+++ b/Documentation/devicetree/bindings/arm/mediatek/mediatek,imgsys.txt
@@ -8,6 +8,7 @@
- compatible: Should be one of:
- "mediatek,mt2701-imgsys", "syscon"
- "mediatek,mt2712-imgsys", "syscon"
+ - "mediatek,mt6765-imgsys", "syscon"
- "mediatek,mt6779-imgsys", "syscon"
- "mediatek,mt6797-imgsys", "syscon"
- "mediatek,mt7623-imgsys", "mediatek,mt2701-imgsys", "syscon"
diff --git a/Documentation/devicetree/bindings/arm/mediatek/mediatek,infracfg.txt b/Documentation/devicetree/bindings/arm/mediatek/mediatek,infracfg.txt
index db2f4fd..49a968b 100644
--- a/Documentation/devicetree/bindings/arm/mediatek/mediatek,infracfg.txt
+++ b/Documentation/devicetree/bindings/arm/mediatek/mediatek,infracfg.txt
@@ -9,6 +9,7 @@
- compatible: Should be one of:
- "mediatek,mt2701-infracfg", "syscon"
- "mediatek,mt2712-infracfg", "syscon"
+ - "mediatek,mt6765-infracfg", "syscon"
- "mediatek,mt6779-infracfg_ao", "syscon"
- "mediatek,mt6797-infracfg", "syscon"
- "mediatek,mt7622-infracfg", "syscon"
diff --git a/Documentation/devicetree/bindings/arm/mediatek/mediatek,mipi0a.txt b/Documentation/devicetree/bindings/arm/mediatek/mediatek,mipi0a.txt
new file mode 100644
index 0000000..8be5978
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/mediatek/mediatek,mipi0a.txt
@@ -0,0 +1,28 @@
+Mediatek mipi0a (mipi_rx_ana_csi0a) controller
+============================
+
+The Mediatek mipi0a controller provides various clocks
+to the system.
+
+Required Properties:
+
+- compatible: Should be one of:
+ - "mediatek,mt6765-mipi0a", "syscon"
+- #clock-cells: Must be 1
+
+The mipi0a controller uses the common clk binding from
+Documentation/devicetree/bindings/clock/clock-bindings.txt
+The available clocks are defined in dt-bindings/clock/mt*-clk.h.
+
+The mipi0a controller also uses the common power domain from
+Documentation/devicetree/bindings/soc/mediatek/scpsys.txt
+The available power doamins are defined in dt-bindings/power/mt*-power.h.
+
+Example:
+
+mipi0a: clock-controller@11c10000 {
+ compatible = "mediatek,mt6765-mipi0a", "syscon";
+ reg = <0 0x11c10000 0 0x1000>;
+ power-domains = <&scpsys MT6765_POWER_DOMAIN_CAM>;
+ #clock-cells = <1>;
+};
diff --git a/Documentation/devicetree/bindings/arm/mediatek/mediatek,mmsys.txt b/Documentation/devicetree/bindings/arm/mediatek/mediatek,mmsys.txt
index 301eefb..d8c9108 100644
--- a/Documentation/devicetree/bindings/arm/mediatek/mediatek,mmsys.txt
+++ b/Documentation/devicetree/bindings/arm/mediatek/mediatek,mmsys.txt
@@ -1,13 +1,15 @@
Mediatek mmsys controller
============================
-The Mediatek mmsys controller provides various clocks to the system.
+The Mediatek mmsys system controller provides clock control, routing control,
+and miscellaneous control in mmsys partition.
Required Properties:
- compatible: Should be one of:
- "mediatek,mt2701-mmsys", "syscon"
- "mediatek,mt2712-mmsys", "syscon"
+ - "mediatek,mt6765-mmsys", "syscon"
- "mediatek,mt6779-mmsys", "syscon"
- "mediatek,mt6797-mmsys", "syscon"
- "mediatek,mt7623-mmsys", "mediatek,mt2701-mmsys", "syscon"
@@ -15,13 +17,13 @@
- "mediatek,mt8183-mmsys", "syscon"
- #clock-cells: Must be 1
-The mmsys controller uses the common clk binding from
+For the clock control, the mmsys controller uses the common clk binding from
Documentation/devicetree/bindings/clock/clock-bindings.txt
The available clocks are defined in dt-bindings/clock/mt*-clk.h.
Example:
-mmsys: clock-controller@14000000 {
+mmsys: syscon@14000000 {
compatible = "mediatek,mt8173-mmsys", "syscon";
reg = <0 0x14000000 0 0x1000>;
#clock-cells = <1>;
diff --git a/Documentation/devicetree/bindings/arm/mediatek/mediatek,pericfg.txt b/Documentation/devicetree/bindings/arm/mediatek/mediatek,pericfg.txt
deleted file mode 100644
index ecf027a..0000000
--- a/Documentation/devicetree/bindings/arm/mediatek/mediatek,pericfg.txt
+++ /dev/null
@@ -1,36 +0,0 @@
-Mediatek pericfg controller
-===========================
-
-The Mediatek pericfg controller provides various clocks and reset
-outputs to the system.
-
-Required Properties:
-
-- compatible: Should be one of:
- - "mediatek,mt2701-pericfg", "syscon"
- - "mediatek,mt2712-pericfg", "syscon"
- - "mediatek,mt7622-pericfg", "syscon"
- - "mediatek,mt7623-pericfg", "mediatek,mt2701-pericfg", "syscon"
- - "mediatek,mt7629-pericfg", "syscon"
- - "mediatek,mt8135-pericfg", "syscon"
- - "mediatek,mt8173-pericfg", "syscon"
- - "mediatek,mt8183-pericfg", "syscon"
-- #clock-cells: Must be 1
-- #reset-cells: Must be 1
-
-The pericfg controller uses the common clk binding from
-Documentation/devicetree/bindings/clock/clock-bindings.txt
-The available clocks are defined in dt-bindings/clock/mt*-clk.h.
-Also it uses the common reset controller binding from
-Documentation/devicetree/bindings/reset/reset.txt.
-The available reset outputs are defined in
-dt-bindings/reset/mt*-resets.h
-
-Example:
-
-pericfg: power-controller@10003000 {
- compatible = "mediatek,mt8173-pericfg", "syscon";
- reg = <0 0x10003000 0 0x1000>;
- #clock-cells = <1>;
- #reset-cells = <1>;
-};
diff --git a/Documentation/devicetree/bindings/arm/mediatek/mediatek,pericfg.yaml b/Documentation/devicetree/bindings/arm/mediatek/mediatek,pericfg.yaml
new file mode 100644
index 0000000..e271c46
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/mediatek/mediatek,pericfg.yaml
@@ -0,0 +1,65 @@
+# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: "http://devicetree.org/schemas/arm/mediatek/mediatek,pericfg.yaml#"
+$schema: "http://devicetree.org/meta-schemas/core.yaml#"
+
+title: MediaTek Peripheral Configuration Controller
+
+maintainers:
+ - Bartosz Golaszewski <bgolaszewski@baylibre.com>
+
+description:
+ The Mediatek pericfg controller provides various clocks and reset outputs
+ to the system.
+
+properties:
+ compatible:
+ oneOf:
+ - items:
+ - enum:
+ - mediatek,mt2701-pericfg
+ - mediatek,mt2712-pericfg
+ - mediatek,mt6765-pericfg
+ - mediatek,mt7622-pericfg
+ - mediatek,mt7629-pericfg
+ - mediatek,mt8135-pericfg
+ - mediatek,mt8173-pericfg
+ - mediatek,mt8183-pericfg
+ - mediatek,mt8516-pericfg
+ - const: syscon
+ - items:
+ # Special case for mt7623 for backward compatibility
+ - const: mediatek,mt7623-pericfg
+ - const: mediatek,mt2701-pericfg
+ - const: syscon
+
+ reg:
+ maxItems: 1
+
+ '#clock-cells':
+ const: 1
+
+ '#reset-cells':
+ const: 1
+
+required:
+ - compatible
+ - reg
+
+examples:
+ - |
+ pericfg@10003000 {
+ compatible = "mediatek,mt8173-pericfg", "syscon";
+ reg = <0x10003000 0x1000>;
+ #clock-cells = <1>;
+ #reset-cells = <1>;
+ };
+
+ - |
+ pericfg@10003000 {
+ compatible = "mediatek,mt7623-pericfg", "mediatek,mt2701-pericfg", "syscon";
+ reg = <0x10003000 0x1000>;
+ #clock-cells = <1>;
+ #reset-cells = <1>;
+ };
diff --git a/Documentation/devicetree/bindings/arm/mediatek/mediatek,topckgen.txt b/Documentation/devicetree/bindings/arm/mediatek/mediatek,topckgen.txt
index 0293d69..9b0394c 100644
--- a/Documentation/devicetree/bindings/arm/mediatek/mediatek,topckgen.txt
+++ b/Documentation/devicetree/bindings/arm/mediatek/mediatek,topckgen.txt
@@ -8,6 +8,7 @@
- compatible: Should be one of:
- "mediatek,mt2701-topckgen"
- "mediatek,mt2712-topckgen", "syscon"
+ - "mediatek,mt6765-topckgen", "syscon"
- "mediatek,mt6779-topckgen", "syscon"
- "mediatek,mt6797-topckgen"
- "mediatek,mt7622-topckgen"
diff --git a/Documentation/devicetree/bindings/arm/mediatek/mediatek,vcodecsys.txt b/Documentation/devicetree/bindings/arm/mediatek/mediatek,vcodecsys.txt
new file mode 100644
index 0000000..c877bcc
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/mediatek/mediatek,vcodecsys.txt
@@ -0,0 +1,27 @@
+Mediatek vcodecsys controller
+============================
+
+The Mediatek vcodecsys controller provides various clocks to the system.
+
+Required Properties:
+
+- compatible: Should be one of:
+ - "mediatek,mt6765-vcodecsys", "syscon"
+- #clock-cells: Must be 1
+
+The vcodecsys controller uses the common clk binding from
+Documentation/devicetree/bindings/clock/clock-bindings.txt
+The available clocks are defined in dt-bindings/clock/mt*-clk.h.
+
+The vcodecsys controller also uses the common power domain from
+Documentation/devicetree/bindings/soc/mediatek/scpsys.txt
+The available power doamins are defined in dt-bindings/power/mt*-power.h.
+
+Example:
+
+venc_gcon: clock-controller@17000000 {
+ compatible = "mediatek,mt6765-vcodecsys", "syscon";
+ reg = <0 0x17000000 0 0x10000>;
+ power-domains = <&scpsys MT6765_POWER_DOMAIN_VCODEC>;
+ #clock-cells = <1>;
+};
diff --git a/Documentation/devicetree/bindings/arm/nxp/lpc32xx.yaml b/Documentation/devicetree/bindings/arm/nxp/lpc32xx.yaml
index 07f39d3..f7f0249 100644
--- a/Documentation/devicetree/bindings/arm/nxp/lpc32xx.yaml
+++ b/Documentation/devicetree/bindings/arm/nxp/lpc32xx.yaml
@@ -17,9 +17,8 @@
- nxp,lpc3230
- nxp,lpc3240
- items:
- - enum:
- - ea,ea3250
- - phytec,phy3250
- - const: nxp,lpc3250
-
+ - enum:
+ - ea,ea3250
+ - phytec,phy3250
+ - const: nxp,lpc3250
...
diff --git a/Documentation/devicetree/bindings/arm/psci.yaml b/Documentation/devicetree/bindings/arm/psci.yaml
index 9247b58..8b77cf8 100644
--- a/Documentation/devicetree/bindings/arm/psci.yaml
+++ b/Documentation/devicetree/bindings/arm/psci.yaml
@@ -69,13 +69,11 @@
method:
description: The method of calling the PSCI firmware.
- allOf:
- - $ref: /schemas/types.yaml#/definitions/string-array
- - enum:
- # SMC #0, with the register assignments specified in this binding.
- - smc
- # HVC #0, with the register assignments specified in this binding.
- - hvc
+ $ref: /schemas/types.yaml#/definitions/string-array
+ enum:
+ - smc
+ # HVC #0, with the register assignments specified in this binding.
+ - hvc
cpu_suspend:
$ref: /schemas/types.yaml#/definitions/uint32
@@ -107,8 +105,8 @@
patternProperties:
"^power-domain-":
- allOf:
- - $ref: "../power/power-domain.yaml#"
+ $ref: "../power/power-domain.yaml#"
+
type: object
description: |
ARM systems can have multiple cores, sometimes in an hierarchical
diff --git a/Documentation/devicetree/bindings/arm/qcom.yaml b/Documentation/devicetree/bindings/arm/qcom.yaml
index 64ddae3..6031aee 100644
--- a/Documentation/devicetree/bindings/arm/qcom.yaml
+++ b/Documentation/devicetree/bindings/arm/qcom.yaml
@@ -37,6 +37,8 @@
msm8994
msm8996
sc7180
+ sdm630
+ sdm660
sdm845
The 'board' element must be one of the following strings:
@@ -155,6 +157,11 @@
- items:
- enum:
+ - xiaomi,lavender
+ - const: qcom,sdm660
+
+ - items:
+ - enum:
- qcom,ipq6018-cp01-c1
- const: qcom,ipq6018
diff --git a/Documentation/devicetree/bindings/arm/realtek.yaml b/Documentation/devicetree/bindings/arm/realtek.yaml
index ab59de1..845f9c7 100644
--- a/Documentation/devicetree/bindings/arm/realtek.yaml
+++ b/Documentation/devicetree/bindings/arm/realtek.yaml
@@ -14,6 +14,13 @@
const: '/'
compatible:
oneOf:
+ # RTD1195 SoC based boards
+ - items:
+ - enum:
+ - mele,x1000 # MeLE X1000
+ - realtek,horseradish # Realtek Horseradish EVB
+ - const: realtek,rtd1195
+
# RTD1293 SoC based boards
- items:
- enum:
@@ -25,6 +32,7 @@
- enum:
- mele,v9 # MeLE V9
- probox2,ava # ProBox2 AVA
+ - xnano,x5 # Xnano X5
- zidoo,x9s # Zidoo X9S
- const: realtek,rtd1295
@@ -33,4 +41,17 @@
- enum:
- synology,ds418 # Synology DiskStation DS418
- const: realtek,rtd1296
+
+ # RTD1395 SoC based boards
+ - items:
+ - enum:
+ - bananapi,bpi-m4 # Banana Pi BPI-M4
+ - realtek,lion-skin # Realtek Lion Skin EVB
+ - const: realtek,rtd1395
+
+ # RTD1619 SoC based boards
+ - items:
+ - enum:
+ - realtek,mjolnir # Realtek Mjolnir EVB
+ - const: realtek,rtd1619
...
diff --git a/Documentation/devicetree/bindings/arm/renesas,prr.yaml b/Documentation/devicetree/bindings/arm/renesas,prr.yaml
index dd08764..1f80767 100644
--- a/Documentation/devicetree/bindings/arm/renesas,prr.yaml
+++ b/Documentation/devicetree/bindings/arm/renesas,prr.yaml
@@ -33,5 +33,5 @@
- |
prr: chipid@ff000044 {
compatible = "renesas,prr";
- reg = <0 0xff000044 0 4>;
+ reg = <0xff000044 4>;
};
diff --git a/Documentation/devicetree/bindings/arm/renesas.yaml b/Documentation/devicetree/bindings/arm/renesas.yaml
index 611094d..b7d2e92 100644
--- a/Documentation/devicetree/bindings/arm/renesas.yaml
+++ b/Documentation/devicetree/bindings/arm/renesas.yaml
@@ -54,6 +54,16 @@
- description: RZ/G1H (R8A77420)
items:
+ - enum:
+ # iWave Systems RZ/G1H Qseven System On Module (iW-RainboW-G21M-Qseven)
+ - iwave,g21m
+ - const: renesas,r8a7742
+
+ - items:
+ - enum:
+ # iWave Systems RZ/G1H Qseven Development Platform (iW-RainboW-G21D-Qseven)
+ - iwave,g21d
+ - const: iwave,g21m
- const: renesas,r8a7742
- description: RZ/G1M (R8A77430)
diff --git a/Documentation/devicetree/bindings/arm/rockchip.yaml b/Documentation/devicetree/bindings/arm/rockchip.yaml
index 715586d..d4a4045 100644
--- a/Documentation/devicetree/bindings/arm/rockchip.yaml
+++ b/Documentation/devicetree/bindings/arm/rockchip.yaml
@@ -358,6 +358,11 @@
- const: haoyu,marsboard-rk3066
- const: rockchip,rk3066a
+ - description: Hardkernel Odroid Go Advance
+ items:
+ - const: hardkernel,rk3326-odroid-go2
+ - const: rockchip,rk3326
+
- description: Hugsun X99 TV Box
items:
- const: hugsun,x99
diff --git a/Documentation/devicetree/bindings/arm/samsung/exynos-chipid.yaml b/Documentation/devicetree/bindings/arm/samsung/exynos-chipid.yaml
index 0425d33..f99c0c6 100644
--- a/Documentation/devicetree/bindings/arm/samsung/exynos-chipid.yaml
+++ b/Documentation/devicetree/bindings/arm/samsung/exynos-chipid.yaml
@@ -22,9 +22,8 @@
Adaptive Supply Voltage bin selection. This can be used
to determine the ASV bin of an SoC if respective information
is missing in the CHIPID registers or in the OTP memory.
- allOf:
- - $ref: /schemas/types.yaml#/definitions/uint32
- - enum: [ 0, 1, 2, 3 ]
+ $ref: /schemas/types.yaml#/definitions/uint32
+ enum: [0, 1, 2, 3]
required:
- compatible
diff --git a/Documentation/devicetree/bindings/arm/samsung/samsung-boards.yaml b/Documentation/devicetree/bindings/arm/samsung/samsung-boards.yaml
index 63acd57..eb92f9e 100644
--- a/Documentation/devicetree/bindings/arm/samsung/samsung-boards.yaml
+++ b/Documentation/devicetree/bindings/arm/samsung/samsung-boards.yaml
@@ -52,6 +52,7 @@
items:
- enum:
- insignal,origen # Insignal Origen
+ - samsung,i9100 # Samsung Galaxy S2 (GT-I9100)
- samsung,smdkv310 # Samsung SMDKV310 eval
- samsung,trats # Samsung Tizen Reference
- samsung,universal_c210 # Samsung C210
diff --git a/Documentation/devicetree/bindings/arm/socionext/uniphier.yaml b/Documentation/devicetree/bindings/arm/socionext/uniphier.yaml
index 65ad6d8..6caf1f9 100644
--- a/Documentation/devicetree/bindings/arm/socionext/uniphier.yaml
+++ b/Documentation/devicetree/bindings/arm/socionext/uniphier.yaml
@@ -17,45 +17,46 @@
- description: LD4 SoC boards
items:
- enum:
- - socionext,uniphier-ld4-ref
+ - socionext,uniphier-ld4-ref
- const: socionext,uniphier-ld4
- description: Pro4 SoC boards
items:
- enum:
- - socionext,uniphier-pro4-ace
- - socionext,uniphier-pro4-ref
- - socionext,uniphier-pro4-sanji
+ - socionext,uniphier-pro4-ace
+ - socionext,uniphier-pro4-ref
+ - socionext,uniphier-pro4-sanji
- const: socionext,uniphier-pro4
- description: sLD8 SoC boards
items:
- enum:
- - socionext,uniphier-sld8-ref
+ - socionext,uniphier-sld8-ref
- const: socionext,uniphier-sld8
- description: PXs2 SoC boards
items:
- enum:
- - socionext,uniphier-pxs2-gentil
- - socionext,uniphier-pxs2-vodka
+ - socionext,uniphier-pxs2-gentil
+ - socionext,uniphier-pxs2-vodka
- const: socionext,uniphier-pxs2
- description: LD6b SoC boards
items:
- enum:
- - socionext,uniphier-ld6b-ref
+ - socionext,uniphier-ld6b-ref
- const: socionext,uniphier-ld6b
- description: LD11 SoC boards
items:
- enum:
- - socionext,uniphier-ld11-global
- - socionext,uniphier-ld11-ref
+ - socionext,uniphier-ld11-global
+ - socionext,uniphier-ld11-ref
- const: socionext,uniphier-ld11
- description: LD20 SoC boards
items:
- enum:
- - socionext,uniphier-ld20-global
- - socionext,uniphier-ld20-ref
+ - socionext,uniphier-ld20-akebi96
+ - socionext,uniphier-ld20-global
+ - socionext,uniphier-ld20-ref
- const: socionext,uniphier-ld20
- description: PXs3 SoC boards
items:
- enum:
- - socionext,uniphier-pxs3-ref
+ - socionext,uniphier-pxs3-ref
- const: socionext,uniphier-pxs3
diff --git a/Documentation/devicetree/bindings/arm/stm32/st,mlahb.yaml b/Documentation/devicetree/bindings/arm/stm32/st,mlahb.yaml
index 55f7938..9f276bc9 100644
--- a/Documentation/devicetree/bindings/arm/stm32/st,mlahb.yaml
+++ b/Documentation/devicetree/bindings/arm/stm32/st,mlahb.yaml
@@ -20,7 +20,7 @@
[2]: https://wiki.st.com/stm32mpu/wiki/STM32MP15_RAM_mapping
allOf:
- - $ref: /schemas/simple-bus.yaml#
+ - $ref: /schemas/simple-bus.yaml#
properties:
compatible:
diff --git a/Documentation/devicetree/bindings/arm/stm32/st,stm32-syscon.yaml b/Documentation/devicetree/bindings/arm/stm32/st,stm32-syscon.yaml
index baff801..cf5db5e 100644
--- a/Documentation/devicetree/bindings/arm/stm32/st,stm32-syscon.yaml
+++ b/Documentation/devicetree/bindings/arm/stm32/st,stm32-syscon.yaml
@@ -14,9 +14,9 @@
compatible:
oneOf:
- items:
- - enum:
- - st,stm32mp157-syscfg
- - const: syscon
+ - enum:
+ - st,stm32mp157-syscfg
+ - const: syscon
reg:
maxItems: 1
diff --git a/Documentation/devicetree/bindings/arm/stm32/stm32.yaml b/Documentation/devicetree/bindings/arm/stm32/stm32.yaml
index 1fcf306..790e6dd 100644
--- a/Documentation/devicetree/bindings/arm/stm32/stm32.yaml
+++ b/Documentation/devicetree/bindings/arm/stm32/stm32.yaml
@@ -38,6 +38,9 @@
- items:
- enum:
- arrow,stm32mp157a-avenger96 # Avenger96
+ - lxa,stm32mp157c-mc1
+ - shiratech,stm32mp157a-iot-box # IoT Box
+ - shiratech,stm32mp157a-stinger96 # Stinger96
- st,stm32mp157c-ed1
- st,stm32mp157a-dk1
- st,stm32mp157c-dk2
diff --git a/Documentation/devicetree/bindings/arm/sunxi.yaml b/Documentation/devicetree/bindings/arm/sunxi.yaml
index abf2d97..87817ff 100644
--- a/Documentation/devicetree/bindings/arm/sunxi.yaml
+++ b/Documentation/devicetree/bindings/arm/sunxi.yaml
@@ -561,6 +561,11 @@
- const: olimex,a20-olinuxino-lime
- const: allwinner,sun7i-a20
+ - description: Olimex A20-OlinuXino LIME (with eMMC)
+ items:
+ - const: olimex,a20-olinuxino-lime-emmc
+ - const: allwinner,sun7i-a20
+
- description: Olimex A20-OlinuXino LIME2
items:
- const: olimex,a20-olinuxino-lime2
diff --git a/Documentation/devicetree/bindings/arm/syna.txt b/Documentation/devicetree/bindings/arm/syna.txt
index 2face46..d8b48f2e 100644
--- a/Documentation/devicetree/bindings/arm/syna.txt
+++ b/Documentation/devicetree/bindings/arm/syna.txt
@@ -13,7 +13,7 @@
time. Be sure to use a device tree binary and a kernel image generated from the
same source tree.
-Please refer to Documentation/devicetree/bindings/ABI.txt for a definition of a
+Please refer to Documentation/devicetree/bindings/ABI.rst for a definition of a
stable binding/ABI.
---------------------------------------------------------------
diff --git a/Documentation/devicetree/bindings/arm/tegra/nvidia,tegra20-pmc.yaml b/Documentation/devicetree/bindings/arm/tegra/nvidia,tegra20-pmc.yaml
index f17bb35..b71a20a 100644
--- a/Documentation/devicetree/bindings/arm/tegra/nvidia,tegra20-pmc.yaml
+++ b/Documentation/devicetree/bindings/arm/tegra/nvidia,tegra20-pmc.yaml
@@ -85,9 +85,8 @@
CPU power good signal from external PMIC to PMC is enabled.
nvidia,suspend-mode:
- allOf:
- - $ref: /schemas/types.yaml#/definitions/uint32
- - enum: [0, 1, 2]
+ $ref: /schemas/types.yaml#/definitions/uint32
+ enum: [0, 1, 2]
description:
The suspend mode that the platform should use.
Mode 0 is for LP0, CPU + Core voltage off and DRAM in self-refresh
@@ -323,7 +322,7 @@
tegra_pmc: pmc@7000e400 {
compatible = "nvidia,tegra210-pmc";
- reg = <0x0 0x7000e400 0x0 0x400>;
+ reg = <0x7000e400 0x400>;
clocks = <&tegra_car TEGRA210_CLK_PCLK>, <&clk32k_in>;
clock-names = "pclk", "clk32k_in";
#clock-cells = <1>;
diff --git a/Documentation/devicetree/bindings/ata/faraday,ftide010.yaml b/Documentation/devicetree/bindings/ata/faraday,ftide010.yaml
index bfc6357..6451928 100644
--- a/Documentation/devicetree/bindings/ata/faraday,ftide010.yaml
+++ b/Documentation/devicetree/bindings/ata/faraday,ftide010.yaml
@@ -26,8 +26,8 @@
oneOf:
- const: faraday,ftide010
- items:
- - const: cortina,gemini-pata
- - const: faraday,ftide010
+ - const: cortina,gemini-pata
+ - const: faraday,ftide010
reg:
maxItems: 1
diff --git a/Documentation/devicetree/bindings/ata/renesas,rcar-sata.yaml b/Documentation/devicetree/bindings/ata/renesas,rcar-sata.yaml
index 7b69831..d06096a 100644
--- a/Documentation/devicetree/bindings/ata/renesas,rcar-sata.yaml
+++ b/Documentation/devicetree/bindings/ata/renesas,rcar-sata.yaml
@@ -17,6 +17,7 @@
- renesas,sata-r8a7779 # R-Car H1
- items:
- enum:
+ - renesas,sata-r8a7742 # RZ/G1H
- renesas,sata-r8a7790-es1 # R-Car H2 ES1
- renesas,sata-r8a7790 # R-Car H2 other than ES1
- renesas,sata-r8a7791 # R-Car M2-W
diff --git a/Documentation/devicetree/bindings/ata/sata_highbank.txt b/Documentation/devicetree/bindings/ata/sata_highbank.txt
deleted file mode 100644
index aa83407..0000000
--- a/Documentation/devicetree/bindings/ata/sata_highbank.txt
+++ /dev/null
@@ -1,44 +0,0 @@
-* Calxeda AHCI SATA Controller
-
-SATA nodes are defined to describe on-chip Serial ATA controllers.
-The Calxeda SATA controller mostly conforms to the AHCI interface
-with some special extensions to add functionality.
-Each SATA controller should have its own node.
-
-Required properties:
-- compatible : compatible list, contains "calxeda,hb-ahci"
-- interrupts : <interrupt mapping for SATA IRQ>
-- reg : <registers mapping>
-
-Optional properties:
-- dma-coherent : Present if dma operations are coherent
-- calxeda,port-phys : phandle-combophy and lane assignment, which maps each
- SATA port to a combophy and a lane within that
- combophy
-- calxeda,sgpio-gpio: phandle-gpio bank, bit offset, and default on or off,
- which indicates that the driver supports SGPIO
- indicator lights using the indicated GPIOs
-- calxeda,led-order : a u32 array that map port numbers to offsets within the
- SGPIO bitstream.
-- calxeda,tx-atten : a u32 array that contains TX attenuation override
- codes, one per port. The upper 3 bytes are always
- 0 and thus ignored.
-- calxeda,pre-clocks : a u32 that indicates the number of additional clock
- cycles to transmit before sending an SGPIO pattern
-- calxeda,post-clocks: a u32 that indicates the number of additional clock
- cycles to transmit after sending an SGPIO pattern
-
-Example:
- sata@ffe08000 {
- compatible = "calxeda,hb-ahci";
- reg = <0xffe08000 0x1000>;
- interrupts = <115>;
- dma-coherent;
- calxeda,port-phys = <&combophy5 0 &combophy0 0 &combophy0 1
- &combophy0 2 &combophy0 3>;
- calxeda,sgpio-gpio =<&gpioh 5 1 &gpioh 6 1 &gpioh 7 1>;
- calxeda,led-order = <4 0 1 2 3>;
- calxeda,tx-atten = <0xff 22 0xff 0xff 23>;
- calxeda,pre-clocks = <10>;
- calxeda,post-clocks = <0>;
- };
diff --git a/Documentation/devicetree/bindings/ata/sata_highbank.yaml b/Documentation/devicetree/bindings/ata/sata_highbank.yaml
new file mode 100644
index 0000000..5e2a239
--- /dev/null
+++ b/Documentation/devicetree/bindings/ata/sata_highbank.yaml
@@ -0,0 +1,92 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/ata/sata_highbank.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Calxeda AHCI SATA Controller
+
+description: |
+ The Calxeda SATA controller mostly conforms to the AHCI interface
+ with some special extensions to add functionality, to map GPIOs for
+ activity LEDs and for mapping the ComboPHYs.
+
+maintainers:
+ - Andre Przywara <andre.przywara@arm.com>
+
+properties:
+ compatible:
+ const: calxeda,hb-ahci
+
+ reg:
+ maxItems: 1
+
+ interrupts:
+ maxItems: 1
+
+ dma-coherent: true
+
+ calxeda,pre-clocks:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ description: |
+ Indicates the number of additional clock cycles to transmit before
+ sending an SGPIO pattern.
+
+ calxeda,post-clocks:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ description: |
+ Indicates the number of additional clock cycles to transmit after
+ sending an SGPIO pattern.
+
+ calxeda,led-order:
+ description: Maps port numbers to offsets within the SGPIO bitstream.
+ $ref: /schemas/types.yaml#/definitions/uint32-array
+ minItems: 1
+ maxItems: 8
+
+ calxeda,port-phys:
+ description: |
+ phandle-combophy and lane assignment, which maps each SATA port to a
+ combophy and a lane within that combophy
+ $ref: /schemas/types.yaml#/definitions/phandle-array
+ minItems: 1
+ maxItems: 8
+
+ calxeda,tx-atten:
+ description: |
+ Contains TX attenuation override codes, one per port.
+ The upper 24 bits of each entry are always 0 and thus ignored.
+ $ref: /schemas/types.yaml#/definitions/uint32-array
+ minItems: 1
+ maxItems: 8
+
+ calxeda,sgpio-gpio:
+ description: |
+ phandle-gpio bank, bit offset, and default on or off, which indicates
+ that the driver supports SGPIO indicator lights using the indicated
+ GPIOs.
+
+required:
+ - compatible
+ - reg
+ - interrupts
+
+additionalProperties: false
+
+examples:
+ - |
+ sata@ffe08000 {
+ compatible = "calxeda,hb-ahci";
+ reg = <0xffe08000 0x1000>;
+ interrupts = <115>;
+ dma-coherent;
+ calxeda,port-phys = <&combophy5 0>, <&combophy0 0>, <&combophy0 1>,
+ <&combophy0 2>, <&combophy0 3>;
+ calxeda,sgpio-gpio =<&gpioh 5 1>, <&gpioh 6 1>, <&gpioh 7 1>;
+ calxeda,led-order = <4 0 1 2 3>;
+ calxeda,tx-atten = <0xff 22 0xff 0xff 23>;
+ calxeda,pre-clocks = <10>;
+ calxeda,post-clocks = <0>;
+ };
+
+...
diff --git a/Documentation/devicetree/bindings/auxdisplay/hit,hd44780.txt b/Documentation/devicetree/bindings/auxdisplay/hit,hd44780.txt
deleted file mode 100644
index 2aa24b88..0000000
--- a/Documentation/devicetree/bindings/auxdisplay/hit,hd44780.txt
+++ /dev/null
@@ -1,45 +0,0 @@
-DT bindings for the Hitachi HD44780 Character LCD Controller
-
-The Hitachi HD44780 Character LCD Controller is commonly used on character LCDs
-that can display one or more lines of text. It exposes an M6800 bus interface,
-which can be used in either 4-bit or 8-bit mode.
-
-Required properties:
- - compatible: Must contain "hit,hd44780",
- - data-gpios: Must contain an array of either 4 or 8 GPIO specifiers,
- referring to the GPIO pins connected to the data signal lines DB0-DB7
- (8-bit mode) or DB4-DB7 (4-bit mode) of the LCD Controller's bus interface,
- - enable-gpios: Must contain a GPIO specifier, referring to the GPIO pin
- connected to the "E" (Enable) signal line of the LCD Controller's bus
- interface,
- - rs-gpios: Must contain a GPIO specifier, referring to the GPIO pin
- connected to the "RS" (Register Select) signal line of the LCD Controller's
- bus interface,
- - display-height-chars: Height of the display, in character cells,
- - display-width-chars: Width of the display, in character cells.
-
-Optional properties:
- - rw-gpios: Must contain a GPIO specifier, referring to the GPIO pin
- connected to the "RW" (Read/Write) signal line of the LCD Controller's bus
- interface,
- - backlight-gpios: Must contain a GPIO specifier, referring to the GPIO pin
- used for enabling the LCD's backlight,
- - internal-buffer-width: Internal buffer width (default is 40 for displays
- with 1 or 2 lines, and display-width-chars for displays with more than 2
- lines).
-
-Example:
-
- auxdisplay {
- compatible = "hit,hd44780";
-
- data-gpios = <&hc595 0 GPIO_ACTIVE_HIGH>,
- <&hc595 1 GPIO_ACTIVE_HIGH>,
- <&hc595 2 GPIO_ACTIVE_HIGH>,
- <&hc595 3 GPIO_ACTIVE_HIGH>;
- enable-gpios = <&hc595 4 GPIO_ACTIVE_HIGH>;
- rs-gpios = <&hc595 5 GPIO_ACTIVE_HIGH>;
-
- display-height-chars = <2>;
- display-width-chars = <16>;
- };
diff --git a/Documentation/devicetree/bindings/auxdisplay/hit,hd44780.yaml b/Documentation/devicetree/bindings/auxdisplay/hit,hd44780.yaml
new file mode 100644
index 0000000..9222b06
--- /dev/null
+++ b/Documentation/devicetree/bindings/auxdisplay/hit,hd44780.yaml
@@ -0,0 +1,96 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/auxdisplay/hit,hd44780.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Hitachi HD44780 Character LCD Controller
+
+maintainers:
+ - Geert Uytterhoeven <geert@linux-m68k.org>
+
+description:
+ The Hitachi HD44780 Character LCD Controller is commonly used on character
+ LCDs that can display one or more lines of text. It exposes an M6800 bus
+ interface, which can be used in either 4-bit or 8-bit mode.
+
+properties:
+ compatible:
+ const: hit,hd44780
+
+ data-gpios:
+ description:
+ GPIO pins connected to the data signal lines DB0-DB7 (8-bit mode) or
+ DB4-DB7 (4-bit mode) of the LCD Controller's bus interface.
+ oneOf:
+ - maxItems: 4
+ - maxItems: 8
+
+ enable-gpios:
+ description:
+ GPIO pin connected to the "E" (Enable) signal line of the LCD
+ Controller's bus interface.
+ maxItems: 1
+
+ rs-gpios:
+ description:
+ GPIO pin connected to the "RS" (Register Select) signal line of the LCD
+ Controller's bus interface.
+ maxItems: 1
+
+ rw-gpios:
+ description:
+ GPIO pin connected to the "RW" (Read/Write) signal line of the LCD
+ Controller's bus interface.
+ maxItems: 1
+
+ backlight-gpios:
+ description: GPIO pin used for enabling the LCD's backlight.
+ maxItems: 1
+
+ display-height-chars:
+ description: Height of the display, in character cells,
+ $ref: /schemas/types.yaml#/definitions/uint32
+ minimum: 1
+ maximum: 4
+
+ display-width-chars:
+ description: Width of the display, in character cells.
+ $ref: /schemas/types.yaml#/definitions/uint32
+ minimum: 1
+ maximum: 64
+
+ internal-buffer-width:
+ description:
+ Internal buffer width (default is 40 for displays with 1 or 2 lines, and
+ display-width-chars for displays with more than 2 lines).
+ $ref: /schemas/types.yaml#/definitions/uint32
+ minimum: 1
+ maximum: 64
+
+required:
+ - compatible
+ - data-gpios
+ - enable-gpios
+ - rs-gpios
+ - display-height-chars
+ - display-width-chars
+
+additionalProperties: false
+
+examples:
+ - |
+ #include <dt-bindings/gpio/gpio.h>
+ auxdisplay {
+ compatible = "hit,hd44780";
+
+ data-gpios = <&hc595 0 GPIO_ACTIVE_HIGH>,
+ <&hc595 1 GPIO_ACTIVE_HIGH>,
+ <&hc595 2 GPIO_ACTIVE_HIGH>,
+ <&hc595 3 GPIO_ACTIVE_HIGH>;
+ enable-gpios = <&hc595 4 GPIO_ACTIVE_HIGH>;
+ rs-gpios = <&hc595 5 GPIO_ACTIVE_HIGH>;
+
+ display-height-chars = <2>;
+ display-width-chars = <16>;
+ };
diff --git a/Documentation/devicetree/bindings/bus/allwinner,sun50i-a64-de2.yaml b/Documentation/devicetree/bindings/bus/allwinner,sun50i-a64-de2.yaml
index f0b3d30f..0503651 100644
--- a/Documentation/devicetree/bindings/bus/allwinner,sun50i-a64-de2.yaml
+++ b/Documentation/devicetree/bindings/bus/allwinner,sun50i-a64-de2.yaml
@@ -31,12 +31,11 @@
maxItems: 1
allwinner,sram:
- allOf:
- - $ref: /schemas/types.yaml#definitions/phandle-array
- - maxItems: 1
description:
The SRAM that needs to be claimed to access the display engine
bus.
+ $ref: /schemas/types.yaml#definitions/phandle-array
+ maxItems: 1
ranges: true
diff --git a/Documentation/devicetree/bindings/bus/allwinner,sun8i-a23-rsb.yaml b/Documentation/devicetree/bindings/bus/allwinner,sun8i-a23-rsb.yaml
index 8097361..32d33b9 100644
--- a/Documentation/devicetree/bindings/bus/allwinner,sun8i-a23-rsb.yaml
+++ b/Documentation/devicetree/bindings/bus/allwinner,sun8i-a23-rsb.yaml
@@ -21,8 +21,8 @@
oneOf:
- const: allwinner,sun8i-a23-rsb
- items:
- - const: allwinner,sun8i-a83t-rsb
- - const: allwinner,sun8i-a23-rsb
+ - const: allwinner,sun8i-a83t-rsb
+ - const: allwinner,sun8i-a23-rsb
reg:
maxItems: 1
diff --git a/Documentation/devicetree/bindings/bus/arm,integrator-ap-lm.yaml b/Documentation/devicetree/bindings/bus/arm,integrator-ap-lm.yaml
new file mode 100644
index 0000000..4722742
--- /dev/null
+++ b/Documentation/devicetree/bindings/bus/arm,integrator-ap-lm.yaml
@@ -0,0 +1,83 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/bus/arm,integrator-ap-lm.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Integrator/AP Logic Module extension bus
+
+maintainers:
+ - Linus Walleij <linusw@kernel.org>
+
+description: The Integrator/AP is a prototyping platform and as such has a
+ site for stacking up to four logic modules (LM) designed specifically for
+ use with this platform. A special system controller register can be read to
+ determine if a logic module is connected at index 0, 1, 2 or 3. The logic
+ module connector is described in this binding. The logic modules per se
+ then have their own specific per-module bindings and they will be described
+ as subnodes under this logic module extension bus.
+
+properties:
+ "#address-cells":
+ const: 1
+
+ "#size-cells":
+ const: 1
+
+ compatible:
+ items:
+ - const: arm,integrator-ap-lm
+
+ ranges: true
+ dma-ranges: true
+
+patternProperties:
+ "^bus(@[0-9a-f]*)?$":
+ description: Nodes on the Logic Module bus represent logic modules
+ and are named with bus. The first module is at 0xc0000000, the second
+ at 0xd0000000 and so on until the top of the memory of the system at
+ 0xffffffff. All information about the memory used by the module is
+ in ranges and dma-ranges.
+ type: object
+
+ required:
+ - compatible
+
+required:
+ - compatible
+
+examples:
+ - |
+ bus@c0000000 {
+ compatible = "arm,integrator-ap-lm";
+ #address-cells = <1>;
+ #size-cells = <1>;
+ ranges = <0xc0000000 0xc0000000 0x40000000>;
+ dma-ranges;
+
+ bus@c0000000 {
+ compatible = "simple-bus";
+ ranges = <0x00000000 0xc0000000 0x10000000>;
+ /* The Logic Modules sees the Core Module 0 RAM @80000000 */
+ dma-ranges = <0x00000000 0x80000000 0x10000000>;
+ #address-cells = <1>;
+ #size-cells = <1>;
+
+ serial@100000 {
+ compatible = "arm,pl011", "arm,primecell";
+ reg = <0x00100000 0x1000>;
+ interrupts-extended = <&impd1_vic 1>;
+ };
+
+ impd1_vic: interrupt-controller@3000000 {
+ compatible = "arm,pl192-vic";
+ interrupt-controller;
+ #interrupt-cells = <1>;
+ reg = <0x03000000 0x1000>;
+ valid-mask = <0x00000bff>;
+ interrupts-extended = <&pic 9>;
+ };
+ };
+ };
+
+additionalProperties: false
diff --git a/Documentation/devicetree/bindings/bus/baikal,bt1-apb.yaml b/Documentation/devicetree/bindings/bus/baikal,bt1-apb.yaml
new file mode 100644
index 0000000..68b0131
--- /dev/null
+++ b/Documentation/devicetree/bindings/bus/baikal,bt1-apb.yaml
@@ -0,0 +1,90 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+# Copyright (C) 2020 BAIKAL ELECTRONICS, JSC
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/bus/baikal,bt1-apb.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Baikal-T1 APB-bus
+
+maintainers:
+ - Serge Semin <fancer.lancer@gmail.com>
+
+description: |
+ Baikal-T1 CPU or DMAC MMIO requests are handled by the AMBA 3 AXI Interconnect
+ which routes them to the AXI-APB bridge. This interface is a single master
+ multiple slaves bus in turn serializing IO accesses and routing them to the
+ addressed APB slave devices. In case of any APB protocol collisions, slave
+ device not responding on timeout an IRQ is raised with an erroneous address
+ reported to the APB terminator (APB Errors Handler Block).
+
+allOf:
+ - $ref: /schemas/simple-bus.yaml#
+
+properties:
+ compatible:
+ contains:
+ const: baikal,bt1-apb
+
+ reg:
+ items:
+ - description: APB EHB MMIO registers
+ - description: APB MMIO region with no any device mapped
+
+ reg-names:
+ items:
+ - const: ehb
+ - const: nodev
+
+ interrupts:
+ maxItems: 1
+
+ clocks:
+ items:
+ - description: APB reference clock
+
+ clock-names:
+ items:
+ - const: pclk
+
+ resets:
+ items:
+ - description: APB domain reset line
+
+ reset-names:
+ items:
+ - const: prst
+
+unevaluatedProperties: false
+
+required:
+ - compatible
+ - reg
+ - reg-names
+ - interrupts
+ - clocks
+ - clock-names
+
+examples:
+ - |
+ #include <dt-bindings/interrupt-controller/mips-gic.h>
+
+ bus@1f059000 {
+ compatible = "baikal,bt1-apb", "simple-bus";
+ reg = <0x1f059000 0x1000>,
+ <0x1d000000 0x2040000>;
+ reg-names = "ehb", "nodev";
+ #address-cells = <1>;
+ #size-cells = <1>;
+
+ ranges;
+
+ interrupts = <GIC_SHARED 16 IRQ_TYPE_LEVEL_HIGH>;
+
+ clocks = <&ccu_sys 1>;
+ clock-names = "pclk";
+
+ resets = <&ccu_sys 1>;
+ reset-names = "prst";
+ };
+...
diff --git a/Documentation/devicetree/bindings/bus/baikal,bt1-axi.yaml b/Documentation/devicetree/bindings/bus/baikal,bt1-axi.yaml
new file mode 100644
index 0000000..29e1aae
--- /dev/null
+++ b/Documentation/devicetree/bindings/bus/baikal,bt1-axi.yaml
@@ -0,0 +1,107 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+# Copyright (C) 2020 BAIKAL ELECTRONICS, JSC
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/bus/baikal,bt1-axi.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Baikal-T1 AXI-bus
+
+maintainers:
+ - Serge Semin <fancer.lancer@gmail.com>
+
+description: |
+ AXI3-bus is the main communication bus of Baikal-T1 SoC connecting all
+ high-speed peripheral IP-cores with RAM controller and with MIPS P5600
+ cores. Traffic arbitration is done by means of DW AXI Interconnect (so
+ called AXI Main Interconnect) routing IO requests from one block to
+ another: from CPU to SoC peripherals and between some SoC peripherals
+ (mostly between peripheral devices and RAM, but also between DMA and
+ some peripherals). In case of any protocol error, device not responding
+ an IRQ is raised and a faulty situation is reported to the AXI EHB
+ (Errors Handler Block) embedded on top of the DW AXI Interconnect and
+ accessible by means of the Baikal-T1 System Controller.
+
+allOf:
+ - $ref: /schemas/simple-bus.yaml#
+
+properties:
+ compatible:
+ contains:
+ const: baikal,bt1-axi
+
+ reg:
+ minItems: 1
+ items:
+ - description: Synopsys DesignWare AXI Interconnect QoS registers
+ - description: AXI EHB MMIO system controller registers
+
+ reg-names:
+ minItems: 1
+ items:
+ - const: qos
+ - const: ehb
+
+ '#interconnect-cells':
+ const: 1
+
+ syscon:
+ $ref: /schemas/types.yaml#definitions/phandle
+ description: Phandle to the Baikal-T1 System Controller DT node
+
+ interrupts:
+ maxItems: 1
+
+ clocks:
+ items:
+ - description: Main Interconnect uplink reference clock
+
+ clock-names:
+ items:
+ - const: aclk
+
+ resets:
+ items:
+ - description: Main Interconnect reset line
+
+ reset-names:
+ items:
+ - const: arst
+
+unevaluatedProperties: false
+
+required:
+ - compatible
+ - reg
+ - reg-names
+ - syscon
+ - interrupts
+ - clocks
+ - clock-names
+
+examples:
+ - |
+ #include <dt-bindings/interrupt-controller/mips-gic.h>
+
+ bus@1f05a000 {
+ compatible = "baikal,bt1-axi", "simple-bus";
+ reg = <0x1f05a000 0x1000>,
+ <0x1f04d110 0x8>;
+ reg-names = "qos", "ehb";
+ #address-cells = <1>;
+ #size-cells = <1>;
+ #interconnect-cells = <1>;
+
+ syscon = <&syscon>;
+
+ ranges;
+
+ interrupts = <GIC_SHARED 127 IRQ_TYPE_LEVEL_HIGH>;
+
+ clocks = <&ccu_axi 0>;
+ clock-names = "aclk";
+
+ resets = <&ccu_axi 0>;
+ reset-names = "arst";
+ };
+...
diff --git a/Documentation/devicetree/bindings/clock/allwinner,sun4i-a10-gates-clk.yaml b/Documentation/devicetree/bindings/clock/allwinner,sun4i-a10-gates-clk.yaml
index ed1b212..9a37a357 100644
--- a/Documentation/devicetree/bindings/clock/allwinner,sun4i-a10-gates-clk.yaml
+++ b/Documentation/devicetree/bindings/clock/allwinner,sun4i-a10-gates-clk.yaml
@@ -52,12 +52,12 @@
- const: allwinner,sun4i-a10-dram-gates-clk
- items:
- - const: allwinner,sun5i-a13-dram-gates-clk
- - const: allwinner,sun4i-a10-gates-clk
+ - const: allwinner,sun5i-a13-dram-gates-clk
+ - const: allwinner,sun4i-a10-gates-clk
- items:
- - const: allwinner,sun8i-h3-apb0-gates-clk
- - const: allwinner,sun4i-a10-gates-clk
+ - const: allwinner,sun8i-h3-apb0-gates-clk
+ - const: allwinner,sun4i-a10-gates-clk
reg:
maxItems: 1
diff --git a/Documentation/devicetree/bindings/clock/baikal,bt1-ccu-div.yaml b/Documentation/devicetree/bindings/clock/baikal,bt1-ccu-div.yaml
new file mode 100644
index 0000000..2821425
--- /dev/null
+++ b/Documentation/devicetree/bindings/clock/baikal,bt1-ccu-div.yaml
@@ -0,0 +1,188 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+# Copyright (C) 2020 BAIKAL ELECTRONICS, JSC
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/clock/baikal,bt1-ccu-div.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Baikal-T1 Clock Control Unit Dividers
+
+maintainers:
+ - Serge Semin <fancer.lancer@gmail.com>
+
+description: |
+ Clocks Control Unit is the core of Baikal-T1 SoC System Controller
+ responsible for the chip subsystems clocking and resetting. The CCU is
+ connected with an external fixed rate oscillator, which signal is transformed
+ into clocks of various frequencies and then propagated to either individual
+ IP-blocks or to groups of blocks (clock domains). The transformation is done
+ by means of an embedded into CCU PLLs and gateable/non-gateable dividers. The
+ later ones are described in this binding. Each clock domain can be also
+ individually reset by using the domain clocks divider configuration
+ registers. Baikal-T1 CCU is logically divided into the next components:
+ 1) External oscillator (normally XTAL's 25 MHz crystal oscillator, but
+ in general can provide any frequency supported by the CCU PLLs).
+ 2) PLLs clocks generators (PLLs).
+ 3) AXI-bus clock dividers (AXI) - described in this binding file.
+ 4) System devices reference clock dividers (SYS) - described in this binding
+ file.
+ which are connected with each other as shown on the next figure:
+
+ +---------------+
+ | Baikal-T1 CCU |
+ | +----+------|- MIPS P5600 cores
+ | +-|PLLs|------|- DDR controller
+ | | +----+ |
+ +----+ | | | | |
+ |XTAL|--|-+ | | +---+-|
+ +----+ | | | +-|AXI|-|- AXI-bus
+ | | | +---+-|
+ | | | |
+ | | +----+---+-|- APB-bus
+ | +-------|SYS|-|- Low-speed Devices
+ | +---+-|- High-speed Devices
+ +---------------+
+
+ Each sub-block is represented as a separate DT node and has an individual
+ driver to be bound with.
+
+ In order to create signals of wide range frequencies the external oscillator
+ output is primarily connected to a set of CCU PLLs. Some of PLLs CLKOUT are
+ then passed over CCU dividers to create signals required for the target clock
+ domain (like AXI-bus or System Device consumers). The dividers have the
+ following structure:
+
+ +--------------+
+ CLKIN --|->+----+ 1|\ |
+ SETCLK--|--|/DIV|->| | |
+ CLKDIV--|--| | | |-|->CLKLOUT
+ LOCK----|--+----+ | | |
+ | |/ |
+ | | |
+ EN------|-----------+ |
+ RST-----|--------------|->RSTOUT
+ +--------------+
+
+ where CLKIN is the reference clock coming either from CCU PLLs or from an
+ external clock oscillator, SETCLK - a command to update the output clock in
+ accordance with a set divider, CLKDIV - clocks divider, LOCK - a signal of
+ the output clock stabilization, EN - enable/disable the divider block,
+ RST/RSTOUT - reset clocks domain signal. Depending on the consumer IP-core
+ peculiarities the dividers may lack of some functionality depicted on the
+ figure above (like EN, CLKDIV/LOCK/SETCLK). In this case the corresponding
+ clock provider just doesn't expose either switching functions, or the rate
+ configuration, or both of them.
+
+ The clock dividers, which output clock is then consumed by the SoC individual
+ devices, are united into a single clocks provider called System Devices CCU.
+ Similarly the dividers with output clocks utilized as AXI-bus reference clocks
+ are called AXI-bus CCU. Both of them use the common clock bindings with no
+ custom properties. The list of exported clocks and reset signals can be found
+ in the files: 'include/dt-bindings/clock/bt1-ccu.h' and
+ 'include/dt-bindings/reset/bt1-ccu.h'. Since System Devices and AXI-bus CCU
+ are a part of the Baikal-T1 SoC System Controller their DT nodes are supposed
+ to be a children of later one.
+
+if:
+ properties:
+ compatible:
+ contains:
+ const: baikal,bt1-ccu-axi
+
+then:
+ properties:
+ clocks:
+ items:
+ - description: CCU SATA PLL output clock
+ - description: CCU PCIe PLL output clock
+ - description: CCU Ethernet PLL output clock
+
+ clock-names:
+ items:
+ - const: sata_clk
+ - const: pcie_clk
+ - const: eth_clk
+
+else:
+ properties:
+ clocks:
+ items:
+ - description: External reference clock
+ - description: CCU SATA PLL output clock
+ - description: CCU PCIe PLL output clock
+ - description: CCU Ethernet PLL output clock
+
+ clock-names:
+ items:
+ - const: ref_clk
+ - const: sata_clk
+ - const: pcie_clk
+ - const: eth_clk
+
+properties:
+ compatible:
+ enum:
+ - baikal,bt1-ccu-axi
+ - baikal,bt1-ccu-sys
+
+ reg:
+ maxItems: 1
+
+ "#clock-cells":
+ const: 1
+
+ "#reset-cells":
+ const: 1
+
+unevaluatedProperties: false
+
+required:
+ - compatible
+ - "#clock-cells"
+ - clocks
+ - clock-names
+
+examples:
+ # AXI-bus Clock Control Unit node:
+ - |
+ #include <dt-bindings/clock/bt1-ccu.h>
+
+ clock-controller@1f04d030 {
+ compatible = "baikal,bt1-ccu-axi";
+ reg = <0x1f04d030 0x030>;
+ #clock-cells = <1>;
+ #reset-cells = <1>;
+
+ clocks = <&ccu_pll CCU_SATA_PLL>,
+ <&ccu_pll CCU_PCIE_PLL>,
+ <&ccu_pll CCU_ETH_PLL>;
+ clock-names = "sata_clk", "pcie_clk", "eth_clk";
+ };
+ # System Devices Clock Control Unit node:
+ - |
+ #include <dt-bindings/clock/bt1-ccu.h>
+
+ clock-controller@1f04d060 {
+ compatible = "baikal,bt1-ccu-sys";
+ reg = <0x1f04d060 0x0a0>;
+ #clock-cells = <1>;
+ #reset-cells = <1>;
+
+ clocks = <&clk25m>,
+ <&ccu_pll CCU_SATA_PLL>,
+ <&ccu_pll CCU_PCIE_PLL>,
+ <&ccu_pll CCU_ETH_PLL>;
+ clock-names = "ref_clk", "sata_clk", "pcie_clk",
+ "eth_clk";
+ };
+ # Required Clock Control Unit PLL node:
+ - |
+ ccu_pll: clock-controller@1f04d000 {
+ compatible = "baikal,bt1-ccu-pll";
+ reg = <0x1f04d000 0x028>;
+ #clock-cells = <1>;
+
+ clocks = <&clk25m>;
+ clock-names = "ref_clk";
+ };
+...
diff --git a/Documentation/devicetree/bindings/clock/baikal,bt1-ccu-pll.yaml b/Documentation/devicetree/bindings/clock/baikal,bt1-ccu-pll.yaml
new file mode 100644
index 0000000..97131bf
--- /dev/null
+++ b/Documentation/devicetree/bindings/clock/baikal,bt1-ccu-pll.yaml
@@ -0,0 +1,131 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+# Copyright (C) 2020 BAIKAL ELECTRONICS, JSC
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/clock/baikal,bt1-ccu-pll.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Baikal-T1 Clock Control Unit PLL
+
+maintainers:
+ - Serge Semin <fancer.lancer@gmail.com>
+
+description: |
+ Clocks Control Unit is the core of Baikal-T1 SoC System Controller
+ responsible for the chip subsystems clocking and resetting. The CCU is
+ connected with an external fixed rate oscillator, which signal is transformed
+ into clocks of various frequencies and then propagated to either individual
+ IP-blocks or to groups of blocks (clock domains). The transformation is done
+ by means of PLLs and gateable/non-gateable dividers embedded into the CCU.
+ It's logically divided into the next components:
+ 1) External oscillator (normally XTAL's 25 MHz crystal oscillator, but
+ in general can provide any frequency supported by the CCU PLLs).
+ 2) PLLs clocks generators (PLLs) - described in this binding file.
+ 3) AXI-bus clock dividers (AXI).
+ 4) System devices reference clock dividers (SYS).
+ which are connected with each other as shown on the next figure:
+
+ +---------------+
+ | Baikal-T1 CCU |
+ | +----+------|- MIPS P5600 cores
+ | +-|PLLs|------|- DDR controller
+ | | +----+ |
+ +----+ | | | | |
+ |XTAL|--|-+ | | +---+-|
+ +----+ | | | +-|AXI|-|- AXI-bus
+ | | | +---+-|
+ | | | |
+ | | +----+---+-|- APB-bus
+ | +-------|SYS|-|- Low-speed Devices
+ | +---+-|- High-speed Devices
+ +---------------+
+
+ Each CCU sub-block is represented as a separate dts-node and has an
+ individual driver to be bound with.
+
+ In order to create signals of wide range frequencies the external oscillator
+ output is primarily connected to a set of CCU PLLs. There are five PLLs
+ to create a clock for the MIPS P5600 cores, the embedded DDR controller,
+ SATA, Ethernet and PCIe domains. The last three domains though named by the
+ biggest system interfaces in fact include nearly all of the rest SoC
+ peripherals. Each of the PLLs is based on True Circuits TSMC CLN28HPM core
+ with an interface wrapper (so called safe PLL' clocks switcher) to simplify
+ the PLL configuration procedure. The PLLs work as depicted on the next
+ diagram:
+
+ +--------------------------+
+ | |
+ +-->+---+ +---+ +---+ | +---+ 0|\
+ CLKF--->|/NF|--->|PFD|...|VCO|-+->|/OD|--->| |
+ +---+ +->+---+ +---+ /->+---+ | |--->CLKOUT
+ CLKOD---------C----------------+ 1| |
+ +--------C--------------------------->|/
+ | | ^
+ Rclk-+->+---+ | |
+ CLKR--->|/NR|-+ |
+ +---+ |
+ BYPASS--------------------------------------+
+ BWADJ--->
+
+ where Rclk is the reference clock coming from XTAL, NR - reference clock
+ divider, NF - PLL clock multiplier, OD - VCO output clock divider, CLKOUT -
+ output clock, BWADJ is the PLL bandwidth adjustment parameter. At this moment
+ the binding supports the PLL dividers configuration in accordance with a
+ requested rate, while bypassing and bandwidth adjustment settings can be
+ added in future if it gets to be necessary.
+
+ The PLLs CLKOUT is then either directly connected with the corresponding
+ clocks consumer (like P5600 cores or DDR controller) or passed over a CCU
+ divider to create a signal required for the clock domain.
+
+ The CCU PLL dts-node uses the common clock bindings with no custom
+ parameters. The list of exported clocks can be found in
+ 'include/dt-bindings/clock/bt1-ccu.h'. Since CCU PLL is a part of the
+ Baikal-T1 SoC System Controller its DT node is supposed to be a child of
+ later one.
+
+properties:
+ compatible:
+ const: baikal,bt1-ccu-pll
+
+ reg:
+ maxItems: 1
+
+ "#clock-cells":
+ const: 1
+
+ clocks:
+ description: External reference clock
+ maxItems: 1
+
+ clock-names:
+ const: ref_clk
+
+unevaluatedProperties: false
+
+required:
+ - compatible
+ - "#clock-cells"
+ - clocks
+ - clock-names
+
+examples:
+ # Clock Control Unit PLL node:
+ - |
+ clock-controller@1f04d000 {
+ compatible = "baikal,bt1-ccu-pll";
+ reg = <0x1f04d000 0x028>;
+ #clock-cells = <1>;
+
+ clocks = <&clk25m>;
+ clock-names = "ref_clk";
+ };
+ # Required external oscillator:
+ - |
+ clk25m: clock-oscillator-25m {
+ compatible = "fixed-clock";
+ #clock-cells = <0>;
+ clock-frequency = <25000000>;
+ clock-output-names = "clk25m";
+ };
+...
diff --git a/Documentation/devicetree/bindings/clock/bitmain,bm1880-clk.yaml b/Documentation/devicetree/bindings/clock/bitmain,bm1880-clk.yaml
index 8559fe8..228c931 100644
--- a/Documentation/devicetree/bindings/clock/bitmain,bm1880-clk.yaml
+++ b/Documentation/devicetree/bindings/clock/bitmain,bm1880-clk.yaml
@@ -65,7 +65,7 @@
- |
uart0: serial@58018000 {
compatible = "snps,dw-apb-uart";
- reg = <0x0 0x58018000 0x0 0x2000>;
+ reg = <0x58018000 0x2000>;
clocks = <&clk 45>, <&clk 46>;
clock-names = "baudclk", "apb_pclk";
interrupts = <0 9 4>;
diff --git a/Documentation/devicetree/bindings/clock/calxeda.txt b/Documentation/devicetree/bindings/clock/calxeda.txt
deleted file mode 100644
index 0a6ac1b..0000000
--- a/Documentation/devicetree/bindings/clock/calxeda.txt
+++ /dev/null
@@ -1,17 +0,0 @@
-Device Tree Clock bindings for Calxeda highbank platform
-
-This binding uses the common clock binding[1].
-
-[1] Documentation/devicetree/bindings/clock/clock-bindings.txt
-
-Required properties:
-- compatible : shall be one of the following:
- "calxeda,hb-pll-clock" - for a PLL clock
- "calxeda,hb-a9periph-clock" - The A9 peripheral clock divided from the
- A9 clock.
- "calxeda,hb-a9bus-clock" - The A9 bus clock divided from the A9 clock.
- "calxeda,hb-emmc-clock" - Divided clock for MMC/SD controller.
-- reg : shall be the control register offset from SYSREGs base for the clock.
-- clocks : shall be the input parent clock phandle for the clock. This is
- either an oscillator or a pll output.
-- #clock-cells : from common clock binding; shall be set to 0.
diff --git a/Documentation/devicetree/bindings/clock/calxeda.yaml b/Documentation/devicetree/bindings/clock/calxeda.yaml
new file mode 100644
index 0000000..a34cbf3
--- /dev/null
+++ b/Documentation/devicetree/bindings/clock/calxeda.yaml
@@ -0,0 +1,82 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/clock/calxeda.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Device Tree Clock bindings for Calxeda highbank platform
+
+description: |
+ This binding covers the Calxeda SoC internal peripheral and bus clocks
+ as used by peripherals. The clocks live inside the "system register"
+ region of the SoC, so are typically presented as children of an
+ "hb-sregs" node.
+
+maintainers:
+ - Andre Przywara <andre.przywara@arm.com>
+
+properties:
+ "#clock-cells":
+ const: 0
+
+ compatible:
+ enum:
+ - calxeda,hb-pll-clock
+ - calxeda,hb-a9periph-clock
+ - calxeda,hb-a9bus-clock
+ - calxeda,hb-emmc-clock
+
+ reg:
+ maxItems: 1
+
+ clocks:
+ maxItems: 1
+
+required:
+ - "#clock-cells"
+ - compatible
+ - clocks
+ - reg
+
+additionalProperties: false
+
+examples:
+ - |
+ sregs@3fffc000 {
+ compatible = "calxeda,hb-sregs";
+ reg = <0x3fffc000 0x1000>;
+
+ clocks {
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ osc: oscillator {
+ #clock-cells = <0>;
+ compatible = "fixed-clock";
+ clock-frequency = <33333000>;
+ };
+
+ ddrpll: ddrpll@108 {
+ #clock-cells = <0>;
+ compatible = "calxeda,hb-pll-clock";
+ clocks = <&osc>;
+ reg = <0x108>;
+ };
+
+ a9pll: a9pll@100 {
+ #clock-cells = <0>;
+ compatible = "calxeda,hb-pll-clock";
+ clocks = <&osc>;
+ reg = <0x100>;
+ };
+
+ a9periphclk: a9periphclk@104 {
+ #clock-cells = <0>;
+ compatible = "calxeda,hb-a9periph-clock";
+ clocks = <&a9pll>;
+ reg = <0x104>;
+ };
+ };
+ };
+
+...
diff --git a/Documentation/devicetree/bindings/clock/cirrus,lochnagar.txt b/Documentation/devicetree/bindings/clock/cirrus,lochnagar.txt
deleted file mode 100644
index 52a064c..0000000
--- a/Documentation/devicetree/bindings/clock/cirrus,lochnagar.txt
+++ /dev/null
@@ -1,94 +0,0 @@
-Cirrus Logic Lochnagar Audio Development Board
-
-Lochnagar is an evaluation and development board for Cirrus Logic
-Smart CODEC and Amp devices. It allows the connection of most Cirrus
-Logic devices on mini-cards, as well as allowing connection of
-various application processor systems to provide a full evaluation
-platform. Audio system topology, clocking and power can all be
-controlled through the Lochnagar, allowing the device under test
-to be used in a variety of possible use cases.
-
-This binding document describes the binding for the clock portion of
-the driver.
-
-Also see these documents for generic binding information:
- [1] Clock : ../clock/clock-bindings.txt
-
-And these for relevant defines:
- [2] include/dt-bindings/clock/lochnagar.h
-
-This binding must be part of the Lochnagar MFD binding:
- [3] ../mfd/cirrus,lochnagar.txt
-
-Required properties:
-
- - compatible : One of the following strings:
- "cirrus,lochnagar1-clk"
- "cirrus,lochnagar2-clk"
-
- - #clock-cells : Must be 1. The first cell indicates the clock
- number, see [2] for available clocks and [1].
-
-Optional properties:
-
- - clocks : Must contain an entry for each clock in clock-names.
- - clock-names : May contain entries for each of the following
- clocks:
- - ln-cdc-clkout : Output clock from CODEC card.
- - ln-dsp-clkout : Output clock from DSP card.
- - ln-gf-mclk1,ln-gf-mclk2,ln-gf-mclk3,ln-gf-mclk4 : Optional
- input audio clocks from host system.
- - ln-psia1-mclk, ln-psia2-mclk : Optional input audio clocks from
- external connector.
- - ln-spdif-mclk : Optional input audio clock from SPDIF.
- - ln-spdif-clkout : Optional input audio clock from SPDIF.
- - ln-adat-mclk : Optional input audio clock from ADAT.
- - ln-pmic-32k : On board fixed clock.
- - ln-clk-12m : On board fixed clock.
- - ln-clk-11m : On board fixed clock.
- - ln-clk-24m : On board fixed clock.
- - ln-clk-22m : On board fixed clock.
- - ln-clk-8m : On board fixed clock.
- - ln-usb-clk-24m : On board fixed clock.
- - ln-usb-clk-12m : On board fixed clock.
-
- - assigned-clocks : A list of Lochnagar clocks to be reparented, see
- [2] for available clocks.
- - assigned-clock-parents : Parents to be assigned to the clocks
- listed in "assigned-clocks".
-
-Optional nodes:
-
- - fixed-clock nodes may be registered for the following on board clocks:
- - ln-pmic-32k : 32768 Hz
- - ln-clk-12m : 12288000 Hz
- - ln-clk-11m : 11298600 Hz
- - ln-clk-24m : 24576000 Hz
- - ln-clk-22m : 22579200 Hz
- - ln-clk-8m : 8192000 Hz
- - ln-usb-clk-24m : 24576000 Hz
- - ln-usb-clk-12m : 12288000 Hz
-
-Example:
-
-lochnagar {
- lochnagar-clk {
- compatible = "cirrus,lochnagar2-clk";
-
- #clock-cells = <1>;
-
- clocks = <&clk-audio>, <&clk_pmic>;
- clock-names = "ln-gf-mclk2", "ln-pmic-32k";
-
- assigned-clocks = <&lochnagar-clk LOCHNAGAR_CDC_MCLK1>,
- <&lochnagar-clk LOCHNAGAR_CDC_MCLK2>;
- assigned-clock-parents = <&clk-audio>,
- <&clk-pmic>;
- };
-
- clk-pmic: clk-pmic {
- compatible = "fixed-clock";
- clock-cells = <0>;
- clock-frequency = <32768>;
- };
-};
diff --git a/Documentation/devicetree/bindings/clock/cirrus,lochnagar.yaml b/Documentation/devicetree/bindings/clock/cirrus,lochnagar.yaml
new file mode 100644
index 0000000..59de125
--- /dev/null
+++ b/Documentation/devicetree/bindings/clock/cirrus,lochnagar.yaml
@@ -0,0 +1,78 @@
+# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/clock/cirrus,lochnagar.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Cirrus Logic Lochnagar Audio Development Board
+
+maintainers:
+ - patches@opensource.cirrus.com
+
+description: |
+ Lochnagar is an evaluation and development board for Cirrus Logic
+ Smart CODEC and Amp devices. It allows the connection of most Cirrus
+ Logic devices on mini-cards, as well as allowing connection of various
+ application processor systems to provide a full evaluation platform.
+ Audio system topology, clocking and power can all be controlled through
+ the Lochnagar, allowing the device under test to be used in a variety of
+ possible use cases.
+
+ This binding document describes the binding for the clock portion of the
+ driver.
+
+ Also see these documents for generic binding information:
+ [1] Clock : ../clock/clock-bindings.txt
+
+ And these for relevant defines:
+ [2] include/dt-bindings/clock/lochnagar.h
+
+ This binding must be part of the Lochnagar MFD binding:
+ [3] ../mfd/cirrus,lochnagar.yaml
+
+properties:
+ compatible:
+ enum:
+ - cirrus,lochnagar1-clk
+ - cirrus,lochnagar2-clk
+
+ '#clock-cells':
+ description:
+ The first cell indicates the clock number, see [2] for available
+ clocks and [1].
+ const: 1
+
+ clock-names:
+ items:
+ enum:
+ - ln-cdc-clkout # Output clock from CODEC card.
+ - ln-dsp-clkout # Output clock from DSP card.
+ - ln-gf-mclk1 # Optional input clock from host system.
+ - ln-gf-mclk2 # Optional input clock from host system.
+ - ln-gf-mclk3 # Optional input clock from host system.
+ - ln-gf-mclk4 # Optional input clock from host system.
+ - ln-psia1-mclk # Optional input clock from external connector.
+ - ln-psia2-mclk # Optional input clock from external connector.
+ - ln-spdif-mclk # Optional input clock from SPDIF.
+ - ln-spdif-clkout # Optional input clock from SPDIF.
+ - ln-adat-mclk # Optional input clock from ADAT.
+ - ln-pmic-32k # On board fixed clock.
+ - ln-clk-12m # On board fixed clock.
+ - ln-clk-11m # On board fixed clock.
+ - ln-clk-24m # On board fixed clock.
+ - ln-clk-22m # On board fixed clock.
+ - ln-clk-8m # On board fixed clock.
+ - ln-usb-clk-24m # On board fixed clock.
+ - ln-usb-clk-12m # On board fixed clock.
+ minItems: 1
+ maxItems: 19
+
+ clocks: true
+ assigned-clocks: true
+ assigned-clock-parents: true
+
+additionalProperties: false
+
+required:
+ - compatible
+ - '#clock-cells'
diff --git a/Documentation/devicetree/bindings/clock/fixed-factor-clock.yaml b/Documentation/devicetree/bindings/clock/fixed-factor-clock.yaml
index b567f80..f415845 100644
--- a/Documentation/devicetree/bindings/clock/fixed-factor-clock.yaml
+++ b/Documentation/devicetree/bindings/clock/fixed-factor-clock.yaml
@@ -24,9 +24,8 @@
clock-div:
description: Fixed divider
- allOf:
- - $ref: /schemas/types.yaml#/definitions/uint32
- - minimum: 1
+ $ref: /schemas/types.yaml#/definitions/uint32
+ minimum: 1
clock-mult:
description: Fixed multiplier
diff --git a/Documentation/devicetree/bindings/clock/fsl,plldig.yaml b/Documentation/devicetree/bindings/clock/fsl,plldig.yaml
index a203d5d..9ac716d 100644
--- a/Documentation/devicetree/bindings/clock/fsl,plldig.yaml
+++ b/Documentation/devicetree/bindings/clock/fsl,plldig.yaml
@@ -28,15 +28,14 @@
const: 0
fsl,vco-hz:
- description: Optional for VCO frequency of the PLL in Hertz.
- The VCO frequency of this PLL cannot be changed during runtime
- only at startup. Therefore, the output frequencies are very
- limited and might not even closely match the requested frequency.
- To work around this restriction the user may specify its own
- desired VCO frequency for the PLL.
- minimum: 650000000
- maximum: 1300000000
- default: 1188000000
+ description: Optional for VCO frequency of the PLL in Hertz. The VCO frequency
+ of this PLL cannot be changed during runtime only at startup. Therefore,
+ the output frequencies are very limited and might not even closely match
+ the requested frequency. To work around this restriction the user may specify
+ its own desired VCO frequency for the PLL.
+ minimum: 650000000
+ maximum: 1300000000
+ default: 1188000000
required:
- compatible
@@ -51,7 +50,7 @@
- |
dpclk: clock-display@f1f0000 {
compatible = "fsl,ls1028a-plldig";
- reg = <0x0 0xf1f0000 0x0 0xffff>;
+ reg = <0xf1f0000 0xffff>;
#clock-cells = <0>;
clocks = <&osc_27m>;
};
diff --git a/Documentation/devicetree/bindings/clock/idt,versaclock5.txt b/Documentation/devicetree/bindings/clock/idt,versaclock5.txt
index 05a245c..bcff681 100644
--- a/Documentation/devicetree/bindings/clock/idt,versaclock5.txt
+++ b/Documentation/devicetree/bindings/clock/idt,versaclock5.txt
@@ -12,6 +12,7 @@
"idt,5p49v5933"
"idt,5p49v5935"
"idt,5p49v6901"
+ "idt,5p49v6965"
- reg: i2c device address, shall be 0x68 or 0x6a.
- #clock-cells: from common clock binding; shall be set to 1.
- clocks: from common clock binding; list of parent clock handles,
diff --git a/Documentation/devicetree/bindings/clock/imx1-clock.txt b/Documentation/devicetree/bindings/clock/imx1-clock.txt
deleted file mode 100644
index 9823baf..0000000
--- a/Documentation/devicetree/bindings/clock/imx1-clock.txt
+++ /dev/null
@@ -1,26 +0,0 @@
-* Clock bindings for Freescale i.MX1 CPUs
-
-Required properties:
-- compatible: Should be "fsl,imx1-ccm".
-- reg: Address and length of the register set.
-- #clock-cells: Should be <1>.
-
-The clock consumer should specify the desired clock by having the clock
-ID in its "clocks" phandle cell. See include/dt-bindings/clock/imx1-clock.h
-for the full list of i.MX1 clock IDs.
-
-Examples:
- clks: ccm@21b000 {
- #clock-cells = <1>;
- compatible = "fsl,imx1-ccm";
- reg = <0x0021b000 0x1000>;
- };
-
- pwm: pwm@208000 {
- #pwm-cells = <2>;
- compatible = "fsl,imx1-pwm";
- reg = <0x00208000 0x1000>;
- interrupts = <34>;
- clocks = <&clks IMX1_CLK_DUMMY>, <&clks IMX1_CLK_PER1>;
- clock-names = "ipg", "per";
- };
diff --git a/Documentation/devicetree/bindings/clock/imx1-clock.yaml b/Documentation/devicetree/bindings/clock/imx1-clock.yaml
new file mode 100644
index 0000000..f4833a2
--- /dev/null
+++ b/Documentation/devicetree/bindings/clock/imx1-clock.yaml
@@ -0,0 +1,51 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/clock/imx1-clock.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Clock bindings for Freescale i.MX1 CPUs
+
+maintainers:
+ - Alexander Shiyan <shc_work@mail.ru>
+
+description: |
+ The clock consumer should specify the desired clock by having the clock
+ ID in its "clocks" phandle cell. See include/dt-bindings/clock/imx1-clock.h
+ for the full list of i.MX1 clock IDs.
+
+properties:
+ compatible:
+ const: fsl,imx1-ccm
+
+ reg:
+ maxItems: 1
+
+ '#clock-cells':
+ const: 1
+
+required:
+ - compatible
+ - reg
+ - '#clock-cells'
+
+additionalProperties: false
+
+examples:
+ - |
+ #include <dt-bindings/clock/imx1-clock.h>
+
+ clock-controller@21b000 {
+ #clock-cells = <1>;
+ compatible = "fsl,imx1-ccm";
+ reg = <0x0021b000 0x1000>;
+ };
+
+ pwm@208000 {
+ #pwm-cells = <2>;
+ compatible = "fsl,imx1-pwm";
+ reg = <0x00208000 0x1000>;
+ interrupts = <34>;
+ clocks = <&clks IMX1_CLK_DUMMY>, <&clks IMX1_CLK_PER1>;
+ clock-names = "ipg", "per";
+ };
diff --git a/Documentation/devicetree/bindings/clock/imx21-clock.txt b/Documentation/devicetree/bindings/clock/imx21-clock.txt
deleted file mode 100644
index 806f63d..0000000
--- a/Documentation/devicetree/bindings/clock/imx21-clock.txt
+++ /dev/null
@@ -1,27 +0,0 @@
-* Clock bindings for Freescale i.MX21
-
-Required properties:
-- compatible : Should be "fsl,imx21-ccm".
-- reg : Address and length of the register set.
-- interrupts : Should contain CCM interrupt.
-- #clock-cells: Should be <1>.
-
-The clock consumer should specify the desired clock by having the clock
-ID in its "clocks" phandle cell. See include/dt-bindings/clock/imx21-clock.h
-for the full list of i.MX21 clock IDs.
-
-Examples:
- clks: ccm@10027000{
- compatible = "fsl,imx21-ccm";
- reg = <0x10027000 0x800>;
- #clock-cells = <1>;
- };
-
- uart1: serial@1000a000 {
- compatible = "fsl,imx21-uart";
- reg = <0x1000a000 0x1000>;
- interrupts = <20>;
- clocks = <&clks IMX21_CLK_UART1_IPG_GATE>,
- <&clks IMX21_CLK_PER1>;
- clock-names = "ipg", "per";
- };
diff --git a/Documentation/devicetree/bindings/clock/imx21-clock.yaml b/Documentation/devicetree/bindings/clock/imx21-clock.yaml
new file mode 100644
index 0000000..518ad9a
--- /dev/null
+++ b/Documentation/devicetree/bindings/clock/imx21-clock.yaml
@@ -0,0 +1,51 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/clock/imx21-clock.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Clock bindings for Freescale i.MX21
+
+maintainers:
+ - Alexander Shiyan <shc_work@mail.ru>
+
+description: |
+ The clock consumer should specify the desired clock by having the clock
+ ID in its "clocks" phandle cell. See include/dt-bindings/clock/imx21-clock.h
+ for the full list of i.MX21 clock IDs.
+
+properties:
+ compatible:
+ const: fsl,imx21-ccm
+
+ reg:
+ maxItems: 1
+
+ '#clock-cells':
+ const: 1
+
+required:
+ - compatible
+ - reg
+ - '#clock-cells'
+
+additionalProperties: false
+
+examples:
+ - |
+ #include <dt-bindings/clock/imx21-clock.h>
+
+ clock-controller@10027000 {
+ compatible = "fsl,imx21-ccm";
+ reg = <0x10027000 0x800>;
+ #clock-cells = <1>;
+ };
+
+ serial@1000a000 {
+ compatible = "fsl,imx21-uart";
+ reg = <0x1000a000 0x1000>;
+ interrupts = <20>;
+ clocks = <&clks IMX21_CLK_UART1_IPG_GATE>,
+ <&clks IMX21_CLK_PER1>;
+ clock-names = "ipg", "per";
+ };
diff --git a/Documentation/devicetree/bindings/clock/imx23-clock.txt b/Documentation/devicetree/bindings/clock/imx23-clock.txt
deleted file mode 100644
index 8385348..0000000
--- a/Documentation/devicetree/bindings/clock/imx23-clock.txt
+++ /dev/null
@@ -1,70 +0,0 @@
-* Clock bindings for Freescale i.MX23
-
-Required properties:
-- compatible: Should be "fsl,imx23-clkctrl"
-- reg: Address and length of the register set
-- #clock-cells: Should be <1>
-
-The clock consumer should specify the desired clock by having the clock
-ID in its "clocks" phandle cell. The following is a full list of i.MX23
-clocks and IDs.
-
- Clock ID
- ------------------
- ref_xtal 0
- pll 1
- ref_cpu 2
- ref_emi 3
- ref_pix 4
- ref_io 5
- saif_sel 6
- lcdif_sel 7
- gpmi_sel 8
- ssp_sel 9
- emi_sel 10
- cpu 11
- etm_sel 12
- cpu_pll 13
- cpu_xtal 14
- hbus 15
- xbus 16
- lcdif_div 17
- ssp_div 18
- gpmi_div 19
- emi_pll 20
- emi_xtal 21
- etm_div 22
- saif_div 23
- clk32k_div 24
- rtc 25
- adc 26
- spdif_div 27
- clk32k 28
- dri 29
- pwm 30
- filt 31
- uart 32
- ssp 33
- gpmi 34
- spdif 35
- emi 36
- saif 37
- lcdif 38
- etm 39
- usb 40
- usb_phy 41
-
-Examples:
-
-clks: clkctrl@80040000 {
- compatible = "fsl,imx23-clkctrl";
- reg = <0x80040000 0x2000>;
- #clock-cells = <1>;
-};
-
-auart0: serial@8006c000 {
- compatible = "fsl,imx23-auart";
- reg = <0x8006c000 0x2000>;
- interrupts = <24 25 23>;
- clocks = <&clks 32>;
-};
diff --git a/Documentation/devicetree/bindings/clock/imx23-clock.yaml b/Documentation/devicetree/bindings/clock/imx23-clock.yaml
new file mode 100644
index 0000000..66cb238
--- /dev/null
+++ b/Documentation/devicetree/bindings/clock/imx23-clock.yaml
@@ -0,0 +1,92 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/clock/imx23-clock.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Clock bindings for Freescale i.MX23
+
+maintainers:
+ - Shawn Guo <shawn.guo@linaro.org>
+
+description: |
+ The clock consumer should specify the desired clock by having the clock
+ ID in its "clocks" phandle cell. The following is a full list of i.MX23
+ clocks and IDs.
+
+ Clock ID
+ ------------------
+ ref_xtal 0
+ pll 1
+ ref_cpu 2
+ ref_emi 3
+ ref_pix 4
+ ref_io 5
+ saif_sel 6
+ lcdif_sel 7
+ gpmi_sel 8
+ ssp_sel 9
+ emi_sel 10
+ cpu 11
+ etm_sel 12
+ cpu_pll 13
+ cpu_xtal 14
+ hbus 15
+ xbus 16
+ lcdif_div 17
+ ssp_div 18
+ gpmi_div 19
+ emi_pll 20
+ emi_xtal 21
+ etm_div 22
+ saif_div 23
+ clk32k_div 24
+ rtc 25
+ adc 26
+ spdif_div 27
+ clk32k 28
+ dri 29
+ pwm 30
+ filt 31
+ uart 32
+ ssp 33
+ gpmi 34
+ spdif 35
+ emi 36
+ saif 37
+ lcdif 38
+ etm 39
+ usb 40
+ usb_phy 41
+
+properties:
+ compatible:
+ const: fsl,imx23-clkctrl
+
+ reg:
+ maxItems: 1
+
+ '#clock-cells':
+ const: 1
+
+required:
+ - compatible
+ - reg
+ - '#clock-cells'
+
+additionalProperties: false
+
+examples:
+ - |
+ clock-controller@80040000 {
+ compatible = "fsl,imx23-clkctrl";
+ reg = <0x80040000 0x2000>;
+ #clock-cells = <1>;
+ };
+
+ serial@8006c000 {
+ compatible = "fsl,imx23-auart";
+ reg = <0x8006c000 0x2000>;
+ interrupts = <24 25 23>;
+ clocks = <&clks 32>;
+ };
diff --git a/Documentation/devicetree/bindings/clock/imx25-clock.txt b/Documentation/devicetree/bindings/clock/imx25-clock.txt
deleted file mode 100644
index f8135ea..0000000
--- a/Documentation/devicetree/bindings/clock/imx25-clock.txt
+++ /dev/null
@@ -1,160 +0,0 @@
-* Clock bindings for Freescale i.MX25
-
-Required properties:
-- compatible: Should be "fsl,imx25-ccm"
-- reg: Address and length of the register set
-- interrupts: Should contain CCM interrupt
-- #clock-cells: Should be <1>
-
-The clock consumer should specify the desired clock by having the clock
-ID in its "clocks" phandle cell. The following is a full list of i.MX25
-clocks and IDs.
-
- Clock ID
- ---------------------------
- dummy 0
- osc 1
- mpll 2
- upll 3
- mpll_cpu_3_4 4
- cpu_sel 5
- cpu 6
- ahb 7
- usb_div 8
- ipg 9
- per0_sel 10
- per1_sel 11
- per2_sel 12
- per3_sel 13
- per4_sel 14
- per5_sel 15
- per6_sel 16
- per7_sel 17
- per8_sel 18
- per9_sel 19
- per10_sel 20
- per11_sel 21
- per12_sel 22
- per13_sel 23
- per14_sel 24
- per15_sel 25
- per0 26
- per1 27
- per2 28
- per3 29
- per4 30
- per5 31
- per6 32
- per7 33
- per8 34
- per9 35
- per10 36
- per11 37
- per12 38
- per13 39
- per14 40
- per15 41
- csi_ipg_per 42
- epit_ipg_per 43
- esai_ipg_per 44
- esdhc1_ipg_per 45
- esdhc2_ipg_per 46
- gpt_ipg_per 47
- i2c_ipg_per 48
- lcdc_ipg_per 49
- nfc_ipg_per 50
- owire_ipg_per 51
- pwm_ipg_per 52
- sim1_ipg_per 53
- sim2_ipg_per 54
- ssi1_ipg_per 55
- ssi2_ipg_per 56
- uart_ipg_per 57
- ata_ahb 58
- reserved 59
- csi_ahb 60
- emi_ahb 61
- esai_ahb 62
- esdhc1_ahb 63
- esdhc2_ahb 64
- fec_ahb 65
- lcdc_ahb 66
- rtic_ahb 67
- sdma_ahb 68
- slcdc_ahb 69
- usbotg_ahb 70
- reserved 71
- reserved 72
- reserved 73
- reserved 74
- can1_ipg 75
- can2_ipg 76
- csi_ipg 77
- cspi1_ipg 78
- cspi2_ipg 79
- cspi3_ipg 80
- dryice_ipg 81
- ect_ipg 82
- epit1_ipg 83
- epit2_ipg 84
- reserved 85
- esdhc1_ipg 86
- esdhc2_ipg 87
- fec_ipg 88
- reserved 89
- reserved 90
- reserved 91
- gpt1_ipg 92
- gpt2_ipg 93
- gpt3_ipg 94
- gpt4_ipg 95
- reserved 96
- reserved 97
- reserved 98
- iim_ipg 99
- reserved 100
- reserved 101
- kpp_ipg 102
- lcdc_ipg 103
- reserved 104
- pwm1_ipg 105
- pwm2_ipg 106
- pwm3_ipg 107
- pwm4_ipg 108
- rngb_ipg 109
- reserved 110
- scc_ipg 111
- sdma_ipg 112
- sim1_ipg 113
- sim2_ipg 114
- slcdc_ipg 115
- spba_ipg 116
- ssi1_ipg 117
- ssi2_ipg 118
- tsc_ipg 119
- uart1_ipg 120
- uart2_ipg 121
- uart3_ipg 122
- uart4_ipg 123
- uart5_ipg 124
- reserved 125
- wdt_ipg 126
- cko_div 127
- cko_sel 128
- cko 129
-
-Examples:
-
-clks: ccm@53f80000 {
- compatible = "fsl,imx25-ccm";
- reg = <0x53f80000 0x4000>;
- interrupts = <31>;
-};
-
-uart1: serial@43f90000 {
- compatible = "fsl,imx25-uart", "fsl,imx21-uart";
- reg = <0x43f90000 0x4000>;
- interrupts = <45>;
- clocks = <&clks 79>, <&clks 50>;
- clock-names = "ipg", "per";
-};
diff --git a/Documentation/devicetree/bindings/clock/imx25-clock.yaml b/Documentation/devicetree/bindings/clock/imx25-clock.yaml
new file mode 100644
index 0000000..2a2b107
--- /dev/null
+++ b/Documentation/devicetree/bindings/clock/imx25-clock.yaml
@@ -0,0 +1,186 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/clock/imx25-clock.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Clock bindings for Freescale i.MX25
+
+maintainers:
+ - Sascha Hauer <s.hauer@pengutronix.de>
+
+description: |
+ The clock consumer should specify the desired clock by having the clock
+ ID in its "clocks" phandle cell. The following is a full list of i.MX25
+ clocks and IDs.
+
+ Clock ID
+ --------------------------
+ dummy 0
+ osc 1
+ mpll 2
+ upll 3
+ mpll_cpu_3_4 4
+ cpu_sel 5
+ cpu 6
+ ahb 7
+ usb_div 8
+ ipg 9
+ per0_sel 10
+ per1_sel 11
+ per2_sel 12
+ per3_sel 13
+ per4_sel 14
+ per5_sel 15
+ per6_sel 16
+ per7_sel 17
+ per8_sel 18
+ per9_sel 19
+ per10_sel 20
+ per11_sel 21
+ per12_sel 22
+ per13_sel 23
+ per14_sel 24
+ per15_sel 25
+ per0 26
+ per1 27
+ per2 28
+ per3 29
+ per4 30
+ per5 31
+ per6 32
+ per7 33
+ per8 34
+ per9 35
+ per10 36
+ per11 37
+ per12 38
+ per13 39
+ per14 40
+ per15 41
+ csi_ipg_per 42
+ epit_ipg_per 43
+ esai_ipg_per 44
+ esdhc1_ipg_per 45
+ esdhc2_ipg_per 46
+ gpt_ipg_per 47
+ i2c_ipg_per 48
+ lcdc_ipg_per 49
+ nfc_ipg_per 50
+ owire_ipg_per 51
+ pwm_ipg_per 52
+ sim1_ipg_per 53
+ sim2_ipg_per 54
+ ssi1_ipg_per 55
+ ssi2_ipg_per 56
+ uart_ipg_per 57
+ ata_ahb 58
+ reserved 59
+ csi_ahb 60
+ emi_ahb 61
+ esai_ahb 62
+ esdhc1_ahb 63
+ esdhc2_ahb 64
+ fec_ahb 65
+ lcdc_ahb 66
+ rtic_ahb 67
+ sdma_ahb 68
+ slcdc_ahb 69
+ usbotg_ahb 70
+ reserved 71
+ reserved 72
+ reserved 73
+ reserved 74
+ can1_ipg 75
+ can2_ipg 76
+ csi_ipg 77
+ cspi1_ipg 78
+ cspi2_ipg 79
+ cspi3_ipg 80
+ dryice_ipg 81
+ ect_ipg 82
+ epit1_ipg 83
+ epit2_ipg 84
+ reserved 85
+ esdhc1_ipg 86
+ esdhc2_ipg 87
+ fec_ipg 88
+ reserved 89
+ reserved 90
+ reserved 91
+ gpt1_ipg 92
+ gpt2_ipg 93
+ gpt3_ipg 94
+ gpt4_ipg 95
+ reserved 96
+ reserved 97
+ reserved 98
+ iim_ipg 99
+ reserved 100
+ reserved 101
+ kpp_ipg 102
+ lcdc_ipg 103
+ reserved 104
+ pwm1_ipg 105
+ pwm2_ipg 106
+ pwm3_ipg 107
+ pwm4_ipg 108
+ rngb_ipg 109
+ reserved 110
+ scc_ipg 111
+ sdma_ipg 112
+ sim1_ipg 113
+ sim2_ipg 114
+ slcdc_ipg 115
+ spba_ipg 116
+ ssi1_ipg 117
+ ssi2_ipg 118
+ tsc_ipg 119
+ uart1_ipg 120
+ uart2_ipg 121
+ uart3_ipg 122
+ uart4_ipg 123
+ uart5_ipg 124
+ reserved 125
+ wdt_ipg 126
+ cko_div 127
+ cko_sel 128
+ cko 129
+
+properties:
+ compatible:
+ const: fsl,imx25-ccm
+
+ reg:
+ maxItems: 1
+
+ interrupts:
+ maxItems: 1
+
+ '#clock-cells':
+ const: 1
+
+required:
+ - compatible
+ - reg
+ - interrupts
+ - '#clock-cells'
+
+additionalProperties: false
+
+examples:
+ - |
+ clock-controller@53f80000 {
+ compatible = "fsl,imx25-ccm";
+ reg = <0x53f80000 0x4000>;
+ interrupts = <31>;
+ #clock-cells = <1>;
+ };
+
+ serial@43f90000 {
+ compatible = "fsl,imx25-uart", "fsl,imx21-uart";
+ reg = <0x43f90000 0x4000>;
+ interrupts = <45>;
+ clocks = <&clks 79>, <&clks 50>;
+ clock-names = "ipg", "per";
+ };
diff --git a/Documentation/devicetree/bindings/clock/imx27-clock.txt b/Documentation/devicetree/bindings/clock/imx27-clock.txt
deleted file mode 100644
index 4c95c04..0000000
--- a/Documentation/devicetree/bindings/clock/imx27-clock.txt
+++ /dev/null
@@ -1,27 +0,0 @@
-* Clock bindings for Freescale i.MX27
-
-Required properties:
-- compatible: Should be "fsl,imx27-ccm"
-- reg: Address and length of the register set
-- interrupts: Should contain CCM interrupt
-- #clock-cells: Should be <1>
-
-The clock consumer should specify the desired clock by having the clock
-ID in its "clocks" phandle cell. See include/dt-bindings/clock/imx27-clock.h
-for the full list of i.MX27 clock IDs.
-
-Examples:
- clks: ccm@10027000{
- compatible = "fsl,imx27-ccm";
- reg = <0x10027000 0x1000>;
- #clock-cells = <1>;
- };
-
- uart1: serial@1000a000 {
- compatible = "fsl,imx27-uart", "fsl,imx21-uart";
- reg = <0x1000a000 0x1000>;
- interrupts = <20>;
- clocks = <&clks IMX27_CLK_UART1_IPG_GATE>,
- <&clks IMX27_CLK_PER1_GATE>;
- clock-names = "ipg", "per";
- };
diff --git a/Documentation/devicetree/bindings/clock/imx27-clock.yaml b/Documentation/devicetree/bindings/clock/imx27-clock.yaml
new file mode 100644
index 0000000..b5f3ed0
--- /dev/null
+++ b/Documentation/devicetree/bindings/clock/imx27-clock.yaml
@@ -0,0 +1,55 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/clock/imx27-clock.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Clock bindings for Freescale i.MX27
+
+maintainers:
+ - Fabio Estevam <fabio.estevam@freescale.com>
+
+description: |
+ The clock consumer should specify the desired clock by having the clock
+ ID in its "clocks" phandle cell. See include/dt-bindings/clock/imx27-clock.h
+ for the full list of i.MX27 clock IDs.
+
+properties:
+ compatible:
+ const: fsl,imx27-ccm
+
+ reg:
+ maxItems: 1
+
+ interrupts:
+ maxItems: 1
+
+ '#clock-cells':
+ const: 1
+
+required:
+ - compatible
+ - reg
+ - '#clock-cells'
+
+additionalProperties: false
+
+examples:
+ - |
+ #include <dt-bindings/clock/imx27-clock.h>
+
+ clock-controller@10027000 {
+ compatible = "fsl,imx27-ccm";
+ reg = <0x10027000 0x1000>;
+ interrupts = <31>;
+ #clock-cells = <1>;
+ };
+
+ serial@1000a000 {
+ compatible = "fsl,imx27-uart", "fsl,imx21-uart";
+ reg = <0x1000a000 0x1000>;
+ interrupts = <20>;
+ clocks = <&clks IMX27_CLK_UART1_IPG_GATE>,
+ <&clks IMX27_CLK_PER1_GATE>;
+ clock-names = "ipg", "per";
+ };
diff --git a/Documentation/devicetree/bindings/clock/imx28-clock.txt b/Documentation/devicetree/bindings/clock/imx28-clock.txt
deleted file mode 100644
index d84a37d..0000000
--- a/Documentation/devicetree/bindings/clock/imx28-clock.txt
+++ /dev/null
@@ -1,93 +0,0 @@
-* Clock bindings for Freescale i.MX28
-
-Required properties:
-- compatible: Should be "fsl,imx28-clkctrl"
-- reg: Address and length of the register set
-- #clock-cells: Should be <1>
-
-The clock consumer should specify the desired clock by having the clock
-ID in its "clocks" phandle cell. The following is a full list of i.MX28
-clocks and IDs.
-
- Clock ID
- ------------------
- ref_xtal 0
- pll0 1
- pll1 2
- pll2 3
- ref_cpu 4
- ref_emi 5
- ref_io0 6
- ref_io1 7
- ref_pix 8
- ref_hsadc 9
- ref_gpmi 10
- saif0_sel 11
- saif1_sel 12
- gpmi_sel 13
- ssp0_sel 14
- ssp1_sel 15
- ssp2_sel 16
- ssp3_sel 17
- emi_sel 18
- etm_sel 19
- lcdif_sel 20
- cpu 21
- ptp_sel 22
- cpu_pll 23
- cpu_xtal 24
- hbus 25
- xbus 26
- ssp0_div 27
- ssp1_div 28
- ssp2_div 29
- ssp3_div 30
- gpmi_div 31
- emi_pll 32
- emi_xtal 33
- lcdif_div 34
- etm_div 35
- ptp 36
- saif0_div 37
- saif1_div 38
- clk32k_div 39
- rtc 40
- lradc 41
- spdif_div 42
- clk32k 43
- pwm 44
- uart 45
- ssp0 46
- ssp1 47
- ssp2 48
- ssp3 49
- gpmi 50
- spdif 51
- emi 52
- saif0 53
- saif1 54
- lcdif 55
- etm 56
- fec 57
- can0 58
- can1 59
- usb0 60
- usb1 61
- usb0_phy 62
- usb1_phy 63
- enet_out 64
-
-Examples:
-
-clks: clkctrl@80040000 {
- compatible = "fsl,imx28-clkctrl";
- reg = <0x80040000 0x2000>;
- #clock-cells = <1>;
-};
-
-auart0: serial@8006a000 {
- compatible = "fsl,imx28-auart", "fsl,imx23-auart";
- reg = <0x8006a000 0x2000>;
- interrupts = <112 70 71>;
- clocks = <&clks 45>;
-};
diff --git a/Documentation/devicetree/bindings/clock/imx28-clock.yaml b/Documentation/devicetree/bindings/clock/imx28-clock.yaml
new file mode 100644
index 0000000..72328d5
--- /dev/null
+++ b/Documentation/devicetree/bindings/clock/imx28-clock.yaml
@@ -0,0 +1,115 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/clock/imx28-clock.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Clock bindings for Freescale i.MX28
+
+maintainers:
+ - Shawn Guo <shawn.guo@linaro.org>
+
+description: |
+ The clock consumer should specify the desired clock by having the clock
+ ID in its "clocks" phandle cell. The following is a full list of i.MX28
+ clocks and IDs.
+
+ Clock ID
+ ------------------
+ ref_xtal 0
+ pll0 1
+ pll1 2
+ pll2 3
+ ref_cpu 4
+ ref_emi 5
+ ref_io0 6
+ ref_io1 7
+ ref_pix 8
+ ref_hsadc 9
+ ref_gpmi 10
+ saif0_sel 11
+ saif1_sel 12
+ gpmi_sel 13
+ ssp0_sel 14
+ ssp1_sel 15
+ ssp2_sel 16
+ ssp3_sel 17
+ emi_sel 18
+ etm_sel 19
+ lcdif_sel 20
+ cpu 21
+ ptp_sel 22
+ cpu_pll 23
+ cpu_xtal 24
+ hbus 25
+ xbus 26
+ ssp0_div 27
+ ssp1_div 28
+ ssp2_div 29
+ ssp3_div 30
+ gpmi_div 31
+ emi_pll 32
+ emi_xtal 33
+ lcdif_div 34
+ etm_div 35
+ ptp 36
+ saif0_div 37
+ saif1_div 38
+ clk32k_div 39
+ rtc 40
+ lradc 41
+ spdif_div 42
+ clk32k 43
+ pwm 44
+ uart 45
+ ssp0 46
+ ssp1 47
+ ssp2 48
+ ssp3 49
+ gpmi 50
+ spdif 51
+ emi 52
+ saif0 53
+ saif1 54
+ lcdif 55
+ etm 56
+ fec 57
+ can0 58
+ can1 59
+ usb0 60
+ usb1 61
+ usb0_phy 62
+ usb1_phy 63
+ enet_out 64
+
+properties:
+ compatible:
+ const: fsl,imx28-clkctrl
+
+ reg:
+ maxItems: 1
+
+ '#clock-cells':
+ const: 1
+
+required:
+ - compatible
+ - reg
+ - '#clock-cells'
+
+additionalProperties: false
+
+examples:
+ - |
+ clock-controller@80040000 {
+ compatible = "fsl,imx28-clkctrl";
+ reg = <0x80040000 0x2000>;
+ #clock-cells = <1>;
+ };
+
+ serial@8006a000 {
+ compatible = "fsl,imx28-auart", "fsl,imx23-auart";
+ reg = <0x8006a000 0x2000>;
+ interrupts = <112 70 71>;
+ clocks = <&clks 45>;
+ };
diff --git a/Documentation/devicetree/bindings/clock/imx31-clock.txt b/Documentation/devicetree/bindings/clock/imx31-clock.txt
deleted file mode 100644
index 0a29109..0000000
--- a/Documentation/devicetree/bindings/clock/imx31-clock.txt
+++ /dev/null
@@ -1,90 +0,0 @@
-* Clock bindings for Freescale i.MX31
-
-Required properties:
-- compatible: Should be "fsl,imx31-ccm"
-- reg: Address and length of the register set
-- interrupts: Should contain CCM interrupt
-- #clock-cells: Should be <1>
-
-The clock consumer should specify the desired clock by having the clock
-ID in its "clocks" phandle cell. The following is a full list of i.MX31
-clocks and IDs.
-
- Clock ID
- -----------------------
- dummy 0
- ckih 1
- ckil 2
- mpll 3
- spll 4
- upll 5
- mcu_main 6
- hsp 7
- ahb 8
- nfc 9
- ipg 10
- per_div 11
- per 12
- csi_sel 13
- fir_sel 14
- csi_div 15
- usb_div_pre 16
- usb_div_post 17
- fir_div_pre 18
- fir_div_post 19
- sdhc1_gate 20
- sdhc2_gate 21
- gpt_gate 22
- epit1_gate 23
- epit2_gate 24
- iim_gate 25
- ata_gate 26
- sdma_gate 27
- cspi3_gate 28
- rng_gate 29
- uart1_gate 30
- uart2_gate 31
- ssi1_gate 32
- i2c1_gate 33
- i2c2_gate 34
- i2c3_gate 35
- hantro_gate 36
- mstick1_gate 37
- mstick2_gate 38
- csi_gate 39
- rtc_gate 40
- wdog_gate 41
- pwm_gate 42
- sim_gate 43
- ect_gate 44
- usb_gate 45
- kpp_gate 46
- ipu_gate 47
- uart3_gate 48
- uart4_gate 49
- uart5_gate 50
- owire_gate 51
- ssi2_gate 52
- cspi1_gate 53
- cspi2_gate 54
- gacc_gate 55
- emi_gate 56
- rtic_gate 57
- firi_gate 58
-
-Examples:
-
-clks: ccm@53f80000{
- compatible = "fsl,imx31-ccm";
- reg = <0x53f80000 0x4000>;
- interrupts = <31>, <53>;
- #clock-cells = <1>;
-};
-
-uart1: serial@43f90000 {
- compatible = "fsl,imx31-uart", "fsl,imx21-uart";
- reg = <0x43f90000 0x4000>;
- interrupts = <45>;
- clocks = <&clks 10>, <&clks 30>;
- clock-names = "ipg", "per";
-};
diff --git a/Documentation/devicetree/bindings/clock/imx31-clock.yaml b/Documentation/devicetree/bindings/clock/imx31-clock.yaml
new file mode 100644
index 0000000..1b6f75d
--- /dev/null
+++ b/Documentation/devicetree/bindings/clock/imx31-clock.yaml
@@ -0,0 +1,120 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/clock/imx31-clock.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Clock bindings for Freescale i.MX31
+
+maintainers:
+ - Fabio Estevam <fabio.estevam@freescale.com>
+
+description: |
+ The clock consumer should specify the desired clock by having the clock
+ ID in its "clocks" phandle cell. The following is a full list of i.MX31
+ clocks and IDs.
+
+ Clock ID
+ -----------------------
+ dummy 0
+ ckih 1
+ ckil 2
+ mpll 3
+ spll 4
+ upll 5
+ mcu_main 6
+ hsp 7
+ ahb 8
+ nfc 9
+ ipg 10
+ per_div 11
+ per 12
+ csi_sel 13
+ fir_sel 14
+ csi_div 15
+ usb_div_pre 16
+ usb_div_post 17
+ fir_div_pre 18
+ fir_div_post 19
+ sdhc1_gate 20
+ sdhc2_gate 21
+ gpt_gate 22
+ epit1_gate 23
+ epit2_gate 24
+ iim_gate 25
+ ata_gate 26
+ sdma_gate 27
+ cspi3_gate 28
+ rng_gate 29
+ uart1_gate 30
+ uart2_gate 31
+ ssi1_gate 32
+ i2c1_gate 33
+ i2c2_gate 34
+ i2c3_gate 35
+ hantro_gate 36
+ mstick1_gate 37
+ mstick2_gate 38
+ csi_gate 39
+ rtc_gate 40
+ wdog_gate 41
+ pwm_gate 42
+ sim_gate 43
+ ect_gate 44
+ usb_gate 45
+ kpp_gate 46
+ ipu_gate 47
+ uart3_gate 48
+ uart4_gate 49
+ uart5_gate 50
+ owire_gate 51
+ ssi2_gate 52
+ cspi1_gate 53
+ cspi2_gate 54
+ gacc_gate 55
+ emi_gate 56
+ rtic_gate 57
+ firi_gate 58
+
+properties:
+ compatible:
+ const: fsl,imx31-ccm
+
+ reg:
+ maxItems: 1
+
+ interrupts:
+ description: CCM provides 2 interrupt requests, request 1 is to generate
+ interrupt for DVFS when a frequency change is requested, request 2 is
+ to generate interrupt for DPTC when a voltage change is requested.
+ items:
+ - description: CCM DVFS interrupt request 1
+ - description: CCM DPTC interrupt request 2
+
+ '#clock-cells':
+ const: 1
+
+required:
+ - compatible
+ - reg
+ - interrupts
+ - '#clock-cells'
+
+additionalProperties: false
+
+examples:
+ - |
+ clock-controller@53f80000 {
+ compatible = "fsl,imx31-ccm";
+ reg = <0x53f80000 0x4000>;
+ interrupts = <31>, <53>;
+ #clock-cells = <1>;
+ };
+
+ serial@43f90000 {
+ compatible = "fsl,imx31-uart", "fsl,imx21-uart";
+ reg = <0x43f90000 0x4000>;
+ interrupts = <45>;
+ clocks = <&clks 10>, <&clks 30>;
+ clock-names = "ipg", "per";
+ };
diff --git a/Documentation/devicetree/bindings/clock/imx35-clock.txt b/Documentation/devicetree/bindings/clock/imx35-clock.txt
deleted file mode 100644
index f497832..0000000
--- a/Documentation/devicetree/bindings/clock/imx35-clock.txt
+++ /dev/null
@@ -1,114 +0,0 @@
-* Clock bindings for Freescale i.MX35
-
-Required properties:
-- compatible: Should be "fsl,imx35-ccm"
-- reg: Address and length of the register set
-- interrupts: Should contain CCM interrupt
-- #clock-cells: Should be <1>
-
-The clock consumer should specify the desired clock by having the clock
-ID in its "clocks" phandle cell. The following is a full list of i.MX35
-clocks and IDs.
-
- Clock ID
- ---------------------------
- ckih 0
- mpll 1
- ppll 2
- mpll_075 3
- arm 4
- hsp 5
- hsp_div 6
- hsp_sel 7
- ahb 8
- ipg 9
- arm_per_div 10
- ahb_per_div 11
- ipg_per 12
- uart_sel 13
- uart_div 14
- esdhc_sel 15
- esdhc1_div 16
- esdhc2_div 17
- esdhc3_div 18
- spdif_sel 19
- spdif_div_pre 20
- spdif_div_post 21
- ssi_sel 22
- ssi1_div_pre 23
- ssi1_div_post 24
- ssi2_div_pre 25
- ssi2_div_post 26
- usb_sel 27
- usb_div 28
- nfc_div 29
- asrc_gate 30
- pata_gate 31
- audmux_gate 32
- can1_gate 33
- can2_gate 34
- cspi1_gate 35
- cspi2_gate 36
- ect_gate 37
- edio_gate 38
- emi_gate 39
- epit1_gate 40
- epit2_gate 41
- esai_gate 42
- esdhc1_gate 43
- esdhc2_gate 44
- esdhc3_gate 45
- fec_gate 46
- gpio1_gate 47
- gpio2_gate 48
- gpio3_gate 49
- gpt_gate 50
- i2c1_gate 51
- i2c2_gate 52
- i2c3_gate 53
- iomuxc_gate 54
- ipu_gate 55
- kpp_gate 56
- mlb_gate 57
- mshc_gate 58
- owire_gate 59
- pwm_gate 60
- rngc_gate 61
- rtc_gate 62
- rtic_gate 63
- scc_gate 64
- sdma_gate 65
- spba_gate 66
- spdif_gate 67
- ssi1_gate 68
- ssi2_gate 69
- uart1_gate 70
- uart2_gate 71
- uart3_gate 72
- usbotg_gate 73
- wdog_gate 74
- max_gate 75
- admux_gate 76
- csi_gate 77
- csi_div 78
- csi_sel 79
- iim_gate 80
- gpu2d_gate 81
- ckli_gate 82
-
-Examples:
-
-clks: ccm@53f80000 {
- compatible = "fsl,imx35-ccm";
- reg = <0x53f80000 0x4000>;
- interrupts = <31>;
- #clock-cells = <1>;
-};
-
-esdhc1: esdhc@53fb4000 {
- compatible = "fsl,imx35-esdhc";
- reg = <0x53fb4000 0x4000>;
- interrupts = <7>;
- clocks = <&clks 9>, <&clks 8>, <&clks 43>;
- clock-names = "ipg", "ahb", "per";
-};
diff --git a/Documentation/devicetree/bindings/clock/imx35-clock.yaml b/Documentation/devicetree/bindings/clock/imx35-clock.yaml
new file mode 100644
index 0000000..bd871da
--- /dev/null
+++ b/Documentation/devicetree/bindings/clock/imx35-clock.yaml
@@ -0,0 +1,139 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/clock/imx35-clock.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Clock bindings for Freescale i.MX35
+
+maintainers:
+ - Steffen Trumtrar <s.trumtrar@pengutronix.de>
+
+description: |
+ The clock consumer should specify the desired clock by having the clock
+ ID in its "clocks" phandle cell. The following is a full list of i.MX35
+ clocks and IDs.
+
+ Clock ID
+ ---------------------------
+ ckih 0
+ mpll 1
+ ppll 2
+ mpll_075 3
+ arm 4
+ hsp 5
+ hsp_div 6
+ hsp_sel 7
+ ahb 8
+ ipg 9
+ arm_per_div 10
+ ahb_per_div 11
+ ipg_per 12
+ uart_sel 13
+ uart_div 14
+ esdhc_sel 15
+ esdhc1_div 16
+ esdhc2_div 17
+ esdhc3_div 18
+ spdif_sel 19
+ spdif_div_pre 20
+ spdif_div_post 21
+ ssi_sel 22
+ ssi1_div_pre 23
+ ssi1_div_post 24
+ ssi2_div_pre 25
+ ssi2_div_post 26
+ usb_sel 27
+ usb_div 28
+ nfc_div 29
+ asrc_gate 30
+ pata_gate 31
+ audmux_gate 32
+ can1_gate 33
+ can2_gate 34
+ cspi1_gate 35
+ cspi2_gate 36
+ ect_gate 37
+ edio_gate 38
+ emi_gate 39
+ epit1_gate 40
+ epit2_gate 41
+ esai_gate 42
+ esdhc1_gate 43
+ esdhc2_gate 44
+ esdhc3_gate 45
+ fec_gate 46
+ gpio1_gate 47
+ gpio2_gate 48
+ gpio3_gate 49
+ gpt_gate 50
+ i2c1_gate 51
+ i2c2_gate 52
+ i2c3_gate 53
+ iomuxc_gate 54
+ ipu_gate 55
+ kpp_gate 56
+ mlb_gate 57
+ mshc_gate 58
+ owire_gate 59
+ pwm_gate 60
+ rngc_gate 61
+ rtc_gate 62
+ rtic_gate 63
+ scc_gate 64
+ sdma_gate 65
+ spba_gate 66
+ spdif_gate 67
+ ssi1_gate 68
+ ssi2_gate 69
+ uart1_gate 70
+ uart2_gate 71
+ uart3_gate 72
+ usbotg_gate 73
+ wdog_gate 74
+