The MSI Driver Guide HOWTO
Tom L Nguyen tom.l.nguyen@intel.com
10/03/2003
Revised Feb 12, 2004 by Martine Silbermann
email: Martine.Silbermann@hp.com
1. About this guide
This guide describes the basics of Message Signaled Interrupts (MSI),
the advantages of using MSI over traditional interrupt mechanisms,
and how to enable your driver to use MSI or MSI-X. Also included is
a Frequently Asked Questions section.
2. Copyright 2003 Intel Corporation
3. What is MSI/MSI-X?
Message Signaled Interrupt (MSI), as described in the PCI Local Bus
Specification Revision 2.3 or later, is an optional feature for PCI
devices and a required feature for PCI Express devices. MSI enables
a device function to request service by sending an inbound Memory
Write on its PCI bus to the FSB as a Message Signaled Interrupt
transaction. Because MSI is generated in the form of a Memory Write,
all transaction conditions, such as a Retry, Master-Abort,
Target-Abort or normal completion, are supported.
A PCI device that supports MSI must also support the pin IRQ
assertion interrupt mechanism to provide backward compatibility for
systems that do not support MSI. In systems that support MSI, the
bus driver is responsible for initializing the message address and
message data of the device function's MSI/MSI-X capability structure
during initial device configuration.
An MSI capable device function indicates MSI support by implementing
the MSI/MSI-X capability structure in its PCI capability list. The
device function may implement both the MSI capability structure and
the MSI-X capability structure; however, the bus driver should not
enable both, but instead enable only the MSI-X capability structure.
The MSI capability structure contains the Message Control register,
the Message Address register and the Message Data register. These registers
provide the bus driver control over MSI. The Message Control register
indicates the MSI capability supported by the device. The Message
Address register specifies the target address and the Message Data
register specifies the characteristics of the message. To request
service, the device function writes the content of the Message Data
register to the target address. The device and its software driver
are prohibited from writing to these registers.
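For illustration, below is a minimal C sketch of the 32-bit
(non-64-bit addressing) MSI capability structure layout described
above; the structure and field names are illustrative, not kernel
definitions, and the offsets follow the PCI Local Bus Specification
Revision 2.3.

    /* Layout of the MSI capability structure (32-bit address, no
     * per-vector masking) as it appears in PCI Configuration Space.
     * u8/u16/u32 are the kernel's fixed-width types from
     * <linux/types.h>. */
    struct msi_capability {
            u8  cap_id;        /* 0x05 identifies MSI                   */
            u8  next_ptr;      /* offset of next capability in the list */
            u16 msg_control;   /* MSI capabilities supported/enabled    */
            u32 msg_address;   /* target address, set by the bus driver */
            u16 msg_data;      /* data written to the target address to
                                  request service, set by bus driver    */
    };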
The MSI-X capability structure is an optional extension to MSI. It
uses an independent and separate capability structure. There are
some key advantages to implementing the MSI-X capability structure
over the MSI capability structure as described below.
- Support a larger maximum number of vectors per function.
- Provide the ability for system software to configure
each vector with an independent message address and message
data, specified by a table that resides in Memory Space.
- MSI and MSI-X both support per-vector masking. Per-vector
masking is an optional extension of MSI but a required
feature for MSI-X. Per-vector masking provides the kernel the
ability to mask/unmask an MSI vector when servicing its
interrupt service routine. If per-vector masking is not
supported, then the device driver should provide the
hardware/software synchronization to ensure that the device
generates MSI only when the driver wants it to do so.
4. Why use MSI?
As a benefit to the simplification of board design, MSI allows board
designers to remove out-of-band interrupt routing. MSI is another
step towards a legacy-free environment.
Due to increasing pressure on chipset and processor packages to
reduce pin count, the need for interrupt pins is expected to
diminish over time. Devices, due to pin constraints, may implement
message-based interrupts to increase performance.
PCI Express endpoints use INTx emulation (in-band messages) instead
of IRQ pin assertion. Using INTx emulation requires interrupt
sharing among devices connected to the same node (PCI bridge), while
MSI is unique (non-shared) and does not require BIOS configuration
support. As a result, the PCI Express technology requires MSI
support for better interrupt performance.
Using MSI enables the device functions to support two or more
vectors, which can be configured to target different CPUs to
increase scalability.
5. Configuring a driver to use MSI/MSI-X
By default, the kernel will not enable MSI/MSI-X on all devices that
support this capability. The CONFIG_PCI_USE_VECTOR kernel option
must be selected to enable MSI/MSI-X support.
5.1 Including MSI support into the kernel
To allow MSI-Capable device drivers to selectively enable MSI (using
pci_enable_msi as described below), the VECTOR based scheme needs to
be enabled by setting CONFIG_PCI_USE_VECTOR.
Since the target of the inbound message is the local APIC, the
availability of CONFIG_PCI_USE_VECTOR depends on whether
CONFIG_X86_LOCAL_APIC is enabled.
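For example, a kernel configuration fragment that includes MSI
support on an X86 platform would contain:

    CONFIG_X86_LOCAL_APIC=y
    CONFIG_PCI_USE_VECTOR=y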
int pci_enable_msi(struct pci_dev *)
With this new API, any existing device driver that would like to
have MSI enabled on its device function must call this API
explicitly. A successful call will initialize the MSI/MSI-X
capability structure with ONE vector, regardless of whether the
device function is capable of supporting multiple messages. This
vector replaces the pre-assigned dev->irq with a new MSI vector. To
avoid a conflict between the newly assigned vector and the existing
pre-assigned vector, the device driver must call this API before
calling request_irq(...).
The diagram below shows the events that switch the interrupt mode
of an MSI-capable device function between MSI mode and PIN-IRQ
assertion mode.
------------   pci_enable_msi   --------------------------
|          | <=============== |                          |
| MSI MODE |                  |  PIN-IRQ ASSERTION MODE  |
|          | ===============> |                          |
------------      free_irq     --------------------------
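Below is a minimal sketch of this calling sequence; the driver name,
interrupt handler and error handling are illustrative only:

    #include <linux/init.h>
    #include <linux/kernel.h>
    #include <linux/pci.h>
    #include <linux/interrupt.h>

    static irqreturn_t my_dev_interrupt(int irq, void *dev_id,
                                        struct pt_regs *regs);

    static int __devinit my_dev_probe(struct pci_dev *dev,
                                      const struct pci_device_id *id)
    {
            if (pci_enable_device(dev))
                    return -EIO;

            /* Try to switch the device function to MSI mode; on
               failure it simply stays in PIN-IRQ assertion mode. */
            if (pci_enable_msi(dev) == 0)
                    printk(KERN_INFO "my_dev: MSI mode enabled\n");

            /* request_irq() comes after pci_enable_msi() so that the
               handler is bound to the final value of dev->irq. */
            return request_irq(dev->irq, my_dev_interrupt, 0,
                               "my_dev", dev);
    }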
5.2 Configuring for MSI support
Due to the non-contiguous fashion in vector assignment of the
existing Linux kernel, this version does not support multiple
messages, regardless of whether the device function is capable of
supporting more than one vector. The bus driver initializes only
entry 0 of this capability if pci_enable_msi(...) is called
successfully by the device driver.
5.3 Configuring for MSI-X support
Both the MSI capability structure and the MSI-X capability structure
share the same semantics described above; however, due to the
ability of the system software to configure each vector of the
MSI-X capability structure with an independent message address and
message data, the non-contiguous fashion in vector assignment of
the existing Linux kernel has no impact on supporting multiple
messages on MSI-X capable device functions. By default, as
mentioned above, ONE vector is always allocated to the MSI-X
capability structure at entry 0. The bus driver does not initialize
other entries of the MSI-X table.
Note that the PCI subsystem has full control of the MSI-X table,
which resides in Memory Space. The software device driver should
not access this table.
To request additional vectors, the device's software driver should
call the function msi_alloc_vectors(). It is recommended that the
software driver call this function once during the initialization
phase of the device driver.
The function msi_alloc_vectors(), once invoked, enables either
all or nothing, depending on the current availability of vector
resources. If no vector resources are available, the device function
still works with ONE vector. If the vector resources are available
for the number of vectors requested by the driver, this function
will reconfigure the MSI-X capability structure of the device with
additional messages, starting from entry 1. For example, the device
may be capable of supporting a maximum of 32 vectors while its
software driver usually may request only 4 vectors.
After this successful call, the device driver is responsible for
calling other functions, such as request_irq() and enable_irq(), to
enable each vector with its corresponding interrupt service handler.
It is the device driver's choice to have all vectors share the same
interrupt service handler or to give each vector a unique interrupt
service handler.
In addition to the function msi_alloc_vectors(), another function
msi_free_vectors() is provided to allow the software driver to
release a number of vectors back to the vector resources. Once
invoked, the PCI subsystem disables (masks) each vector released.
These vectors are no longer valid for the hardware device and its
software driver to use. As with free_irq(), it is recommended that
the device driver also call msi_free_vectors() to release all
additional vectors previously requested.
int msi_alloc_vectors(struct pci_dev *dev, int *vector, int nvec)
This API enables the software driver to request the PCI subsystem
for additional messages. Depending on the number of vectors
available, the PCI subsystem enables either all or nothing.
Argument dev points to the device (pci_dev) structure.
Argument vector is a pointer of integer type. The number of
elements is indicated in argument nvec.
Argument nvec is an integer indicating the number of messages
requested.
A return of zero indicates that the requested number of vectors
was successfully allocated. Otherwise, it indicates that resources
are not available.
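A minimal usage sketch based on the signature above, assuming the
PCI subsystem fills the vector array on success; MY_DEV_NVEC and
my_dev_interrupt() are illustrative:

    #define MY_DEV_NVEC 4   /* vectors requested beyond entry 0 */

    static int my_dev_setup_vectors(struct pci_dev *dev)
    {
            int vector[MY_DEV_NVEC];
            int i, status;

            /* All or nothing: on failure the device function still
               works with the ONE vector at entry 0 (dev->irq). */
            status = msi_alloc_vectors(dev, vector, MY_DEV_NVEC);
            if (status)
                    return status;

            /* Register a handler on each additional vector; a single
               shared handler would be equally valid. */
            for (i = 0; i < MY_DEV_NVEC; i++) {
                    status = request_irq(vector[i], my_dev_interrupt,
                                         0, "my_dev", dev);
                    if (status)
                            return status;
            }
            return 0;
    }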
int msi_free_vectors(struct pci_dev* dev, int *vector, int nvec)
This API enables the software driver to inform the PCI subsystem
that it is willing to release a number of vectors back to the
MSI resource pool. Once invoked, the PCI subsystem disables each
MSI-X entry associated with each vector stored in the vector argument.
These vectors are no longer valid for the hardware device and
its software driver to use.
Argument dev points to the device (pci_dev) structure.
Argument vector is a pointer of integer type. The number of
elements is indicated in argument nvec.
Argument nvec is an integer indicating the number of messages
released.
A return of zero indicates that the vectors were successfully
released. Otherwise, it indicates a failure.
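A matching teardown sketch, mirroring the allocation example above:

    static void my_dev_release_vectors(struct pci_dev *dev,
                                       int *vector, int nvec)
    {
            int i;

            /* Unhook the handlers first, then return the additional
               vectors to the MSI resource pool. */
            for (i = 0; i < nvec; i++)
                    free_irq(vector[i], dev);

            msi_free_vectors(dev, vector, nvec);
    }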
5.4 Hardware requirements for MSI support
MSI support requires support from both system hardware and
individual hardware device functions.
5.4.1 System hardware support
Since the target of the MSI address is the CPU's local APIC,
enabling MSI support in the Linux kernel depends on whether the
existing system hardware supports the local APIC. Users should
verify that their system works when CONFIG_X86_LOCAL_APIC=y.
In an SMP environment, CONFIG_X86_LOCAL_APIC is automatically set;
however, in a UP environment, users must manually set
CONFIG_X86_LOCAL_APIC. Once CONFIG_X86_LOCAL_APIC=y, setting
CONFIG_PCI_USE_VECTOR enables the VECTOR based scheme and
the option for MSI-capable device drivers to selectively enable
MSI (using pci_enable_msi as described above).
Note that the CONFIG_X86_IO_APIC setting is irrelevant because an
MSI vector is newly allocated at runtime and MSI support does not
depend on BIOS support. This key independence enables MSI support
on future IOxAPIC-free platforms.
5.4.2 Device hardware support
The hardware device function supports MSI by indicating the
MSI/MSI-X capability structure on its PCI capability list. By
default, this capability structure will not be initialized by
the kernel to enable MSI during the system boot. In other words,
the device function is running in its default pin assertion mode.
Note that in many cases hardware supporting MSI has bugs, which
may result in a system hang. The software driver of specific
MSI-capable hardware is responsible for deciding whether to call
pci_enable_msi or not. A return of zero indicates that the kernel
successfully initialized the MSI/MSI-X capability structure of the
device function. The device function is now running in MSI mode.
5.5 How to tell whether MSI is enabled on device function
At the driver level, a return of zero from pci_enable_msi(...)
indicates to the device driver that its device function is
initialized successfully and ready to run in MSI mode.
At the user level, users can use the command 'cat /proc/interrupts'
to display the vector allocated for each device and its interrupt
mode, as shown below.
           CPU0       CPU1
  0:     324639          0    IO-APIC-edge   timer
  1:       1186          0    IO-APIC-edge   i8042
  2:          0          0          XT-PIC   cascade
 12:       2797          0    IO-APIC-edge   i8042
 14:       6543          0    IO-APIC-edge   ide0
 15:          1          0    IO-APIC-edge   ide1
169:          0          0   IO-APIC-level   uhci-hcd
185:          0          0   IO-APIC-level   uhci-hcd
193:        138         10         PCI MSI   aic79xx
201:         30          0         PCI MSI   aic79xx
225:         30          0   IO-APIC-level   aic7xxx
233:         30          0   IO-APIC-level   aic7xxx
NMI:          0          0
LOC:     324553     325068
ERR:          0
MIS:          0
6. FAQ
Q1. Are there any limitations on using the MSI?
A1. If the PCI device supports MSI and conforms to the
specification and the platform supports the APIC local bus,
then using MSI should work.
Q2. Will it work on all the Pentium processors (P3, P4, Xeon,
AMD processors)? In P3 IPIs are transmitted on the APIC local
bus and in P4 and Xeon they are transmitted on the system
bus. Are there any implications with this?
A2. MSI support enables a PCI device to send an inbound
memory write (0xfeexxxxx as the target address) on its PCI bus
directly to the FSB. Since the message address has the
redirection hint bit cleared, it should work.
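For illustration, a sketch of how the 0xfeexxxxx message address
decomposes; the field positions follow the Intel local APIC
documentation and the macro names are illustrative:

    #define MSI_ADDR_BASE          0xfee00000u  /* fixed upper bits  */
    #define MSI_ADDR_DEST_ID(cpu)  (((cpu) & 0xff) << 12)
    #define MSI_ADDR_REDIR_HINT    (1u << 3)    /* cleared: deliver
                                                   to destination ID */
    #define MSI_ADDR_DEST_MODE     (1u << 2)    /* physical/logical  */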
Q3. The target address 0xfeexxxxx will be translated by the
Host Bridge into an interrupt message. Are there any
limitations on the chipsets such as Intel 8xx, Intel e7xxx,
or VIA?
A3. If these chipsets support an inbound memory write with the
target address set as 0xfeexxxxx, conforming to the PCI
specification 2.3 or later, then it should work.
Q4. From the driver's point of view, if the MSI is lost because
errors occur during the inbound memory write, then the driver may
wait forever. Is there a mechanism for it to recover?
A4. Since the target of the transaction is an inbound memory
write, all transaction termination conditions (Retry,
Master-Abort, Target-Abort, or normal completion) are
supported. A device sending an MSI must abide by all the PCI
rules and conditions regarding that inbound memory write. So,
if a retry is signaled it must retry, etc. We believe that
the recommendation for Abort is also a retry (refer to the PCI
specification 2.3 or later).