Linux Device Driver Programmer's Guide,Porting to SGI Altix Systems
(document number: 007-4520-007 / published: 2008-09-24)
Chapter 7. PCI-X I/O and Memory Resources
This chapter describes programmable
I/O (PIO) architecture, first describing the PIO address and then describing
the flow of PIO operations.
On SGI Altix 3000 systems, the PIO address that the device
driver and CPU encounter is different from the PCI-X addresses that are
initialized on the base address registers (BARs). PCI-X host bridge
adapters on SGI Altix 3000 systems can generate only single address cycles
for PIO read and write operations. This limits the size of the PCI address
on a PCI bus to 32 bits.
PCI-X host adapters on SGI Altix 3000 systems provide a set of device
registers that help maintain PIO attributes. Two of the most common
PIO attributes are as follows:
DEV_IO_MEM
    Enables device memory or I/O space. When set, the request
    generated on the PCI bus is for the PCI memory resource. Otherwise, the
    request is generated for the PCI I/O resource.
DEV_OFF
    Specifies PCI-X address offset bits. These 12 bits replace
    bits 31 to 20 of the PIO address that the PCI-X host bridge adapter obtains
    from the CPU.
The diagram in Figure 7-1 provides the
breakdown of a “mapped” PIO address as seen by the device
driver on the CPU. Note that this address is quite different from the
addresses that are initialized on the BARs.
The following sections describe the flow of PIO operation
to local and remote PCI-X devices.
Targeting a PCI-X Device on a Local Node
The flow of PIO to a local PCI-X device is depicted in Figure 7-2. Following the figure is an explanation of the
numbered components.
1. The CPU sees that the address is “uncached.” It forwards the address on the front side bus (FSB).
2. The SHub receives the request and determines from the NASID (bits 48 to 38) that it is targeted to itself.
3. The SHub forwards the request to the attached PCI-X host bridge adapter via the Xtown2 link.
4. The PCI-X host bridge adapter receives the request, parses the addresses, and places the modified PIO address (PCI-X bus address) on the PCI-X bus.
5. The PCI-X device with the matching BARs responds to the request.
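The NASID check that the SHub performs can be pictured as a plain bit-field extraction. The following is a minimal sketch that assumes only what the text states, namely that the NASID occupies bits 48 to 38 of the PIO address. The names pio_nasid and pio_is_local are hypothetical and are not part of any SGI or Linux interface.

```c
#include <stdint.h>

/* Bits 48..38 of the PIO address carry the NASID (11 bits). */
#define NASID_SHIFT 38
#define NASID_BITS  11
#define NASID_MASK  ((UINT64_C(1) << NASID_BITS) - 1)

/* Hypothetical helper: extract the NASID field from a PIO address. */
static inline unsigned int pio_nasid(uint64_t pio_addr)
{
    return (unsigned int)((pio_addr >> NASID_SHIFT) & NASID_MASK);
}

/* Hypothetical helper: nonzero when the request targets this SHub's own node. */
static inline int pio_is_local(uint64_t pio_addr, unsigned int my_nasid)
{
    return pio_nasid(pio_addr) == my_nasid;
}
```

A result of zero from pio_is_local corresponds to the remote case, in which the request is forwarded over NUMAlink instead of being handled by the local SHub.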
Targeting a PCI-X Device on a Remote Node
The flow of PIO to a remote PCI-X device is depicted in Figure 7-3. Following the figure is an explanation of the
numbered components.
1. The CPU places the PIO mapped address on the FSB.
2. The local SHub receives the request and determines from the NASID that it is not targeted to itself.
3. The local SHub forwards the request via the NUMAlink to the targeted remote SHub.
4. The PCI-X host bridge adapter receives the request, parses the addresses, and places the modified PIO address (PCI-X bus address) on the PCI-X bus.
5. The PCI-X device with the matching BARs responds to the request.
PIO Address Translation from CPU to PCI Bus
The PIO address that the CPU issues does not look
anything like the PCI address on the targeted base address register (BAR).
Consider the following example:
Example 7-1. Address Translation
A PCI-X device has requested an I/O resource of 512 bytes. It
is connected via NASID (node ID) 0x0, and it is on that node's local PCI
bus (widget identifier) 0xe.
At boot time the system has initialized the BARs to 0x1fff_0001.
Given the previous information, the "mapped" PIO address looks like the
following to the device driver and the CPU:
The diagram in Figure 7-4 provides the breakdown
of an address that the CPU issues. This is the address you get from the
pci_dev structure.
The targeted PCI-X host bridge adapter gets the PIO address as the
following:
The device register contents of 0x11ff specify the following:
    DEV_IO_MEM == IO
    DEV_OFF == 0x1ff
The relevant device register is the one identified by PCI bus widgets
0xe and 0x4. With this information from the device register for this
address, the PCI-X host bridge adapter places the following PCI-X bus
address on widget 0xe as a PCI I/O transaction (read or write operation):
The PCI-X host bridge adapter strips the 0xe4 (from 0x0e4f_0000
to form 0xf_0000) and prepends 0x1ff (from the device register
DEV_OFF value) to 0xf_0000 to make the PCI-X bus address 0x1fff_0000.
This value matches the value as initialized in the BAR.
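The arithmetic in this example can be written out as a small helper. This is a sketch under the assumptions stated in the text only: the bridge keeps bits 19 to 0 of the PIO address it receives and substitutes the 12-bit DEV_OFF value into bits 31 to 20. The function name pio_to_pci_bus_addr and the macro names are hypothetical.

```c
#include <stdint.h>

#define PIO_LOW_MASK  UINT32_C(0x000fffff)  /* bits 19..0 are kept as-is */
#define DEV_OFF_SHIFT 20                     /* DEV_OFF lands in bits 31..20 */
#define DEV_OFF_MASK  UINT32_C(0xfff)        /* DEV_OFF is 12 bits wide */

/*
 * Hypothetical helper: form the PCI-X bus address from the PIO address
 * seen by the host bridge and the DEV_OFF field of the device register.
 */
static inline uint32_t pio_to_pci_bus_addr(uint32_t bridge_pio_addr,
                                           uint32_t dev_off)
{
    return ((dev_off & DEV_OFF_MASK) << DEV_OFF_SHIFT) |
           (bridge_pio_addr & PIO_LOW_MASK);
}
```

With the values from Example 7-1, pio_to_pci_bus_addr(0x0e4f0000, 0x1ff) yields 0x1fff0000. The BAR itself reads 0x1fff_0001 because bit 0 of a PCI I/O BAR is the I/O space indicator rather than an address bit.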
Note: Reading the BARs for an address to use as a PIO address does not work on SGI
Altix 3000 systems. It might or might not work on other systems.
Most importantly, it makes your code nonportable.
PCI-X PIO Resource Management
Device drivers on SGI Altix 3000 systems must
use the PCI resource routines described in the following sections to obtain
either the I/O or memory PIO addresses that are initialized by the platform.
Device drivers must not read or use the BARs directly.
PCI-X I/O Resource Address
Linux provides the following PCI resource interfaces
to obtain the PCI I/O resource address.
To retrieve the start address of an I/O resource:
    pci_resource_start(dev,bar)
To retrieve the ending address of an I/O resource:
    pci_resource_end(dev,bar)
To obtain the length of an I/O resource:
    pci_resource_len(dev,bar)
To obtain the flags that describe the resource type:
    pci_resource_flags(dev,bar)
For example:
    reg_base = pci_resource_start(pdev, 0);
    reg_len = pci_resource_len(pdev, 0);
    flags = pci_resource_flags(pdev, 0);
    if (flags & IORESOURCE_IO) {
        /* This is an I/O resource. */
    }
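The start, end, and length values are related by simple arithmetic, because the ending address is the last address inside the resource rather than one past it. The following sketch illustrates that relationship with a hypothetical stand-in structure; toy_resource and toy_resource_len are illustrative names only, and a real driver should call the pci_resource_* interfaces instead of computing lengths itself.

```c
#include <stdint.h>

/* Hypothetical stand-in for one BAR's resource; not the kernel's own type. */
struct toy_resource {
    uint64_t start;  /* first address of the region */
    uint64_t end;    /* last address of the region (inclusive) */
};

/*
 * Length of an inclusive [start, end] range. A range with both values
 * zero is treated here as "not assigned" and reported as length 0.
 */
static inline uint64_t toy_resource_len(const struct toy_resource *r)
{
    if (r->start == 0 && r->end == 0)
        return 0;
    return r->end - r->start + 1;
}
```

For a 512-byte I/O resource that starts at 0x1000, the ending address is 0x11ff and the computed length is 512.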
PCI-X Memory Resource Address
Linux provides the following PCI resource
interfaces to obtain PCI memory resource addresses.
To retrieve the start address of a memory resource:
    pci_resource_start(dev,bar)
To retrieve the ending address of a memory resource:
    pci_resource_end(dev,bar)
To obtain the length of a memory resource:
    pci_resource_len(dev,bar)
To obtain the flags that describe the resource type:
    pci_resource_flags(dev,bar)
For example:
    reg_base = pci_resource_start(pdev, 0);
    reg_len = pci_resource_len(pdev, 0);
    flags = pci_resource_flags(pdev, 0);
    if (flags & IORESOURCE_MEM) {
        /* This is a memory resource. */
    }
If the device driver provides the ability to memory-map the
memory resource address into user space, the pgprot_noncached()
macro must be used to set appropriate caching
attributes on the corresponding virtual memory area.
For example:
    #include <linux/mm.h>
    #include <linux/pci.h>
    #include <asm/page.h>

    static int
    my_dev_mmap(struct file *filp, struct vm_area_struct *vma)
    {
        unsigned long my_dev_page;
        struct pci_dev *dev;

        /* Determine dev by methods specific to your driver, then... */
        /* Check validity of input arguments, then... */

        my_dev_page = pci_resource_start(dev, 0) + MY_DEV_PAGE_OFFSET;
        vma->vm_flags |= VM_IO | VM_RESERVED;
        /* Map the device page uncached. */
        vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
    #if defined(CONFIG_IA64) || defined(CONFIG_IA64_GENERIC)
        /* On ia64, use the region offset of the mapped address. */
        my_dev_page = REGION_OFFSET(my_dev_page);
    #endif
        return io_remap_page_range(vma, vma->vm_start, my_dev_page,
                                   MY_DEV_PAGE_LEN, vma->vm_page_prot);
    }
PCI-X I/O Resource Reservation
It is strongly recommended that device drivers
call the following PCI-X resource reservation routines to ensure that
no other drivers are currently using a resource by mistake.
To reserve a PCI I/O resource region:
    request_region(start,n,name)
To reserve a PCI memory resource region:
    request_mem_region(start,n,name)
To release the PCI I/O resource region:
    release_region(start,n)
To release the PCI memory resource region:
    release_mem_region(start,n)
For example:
    request_region(reg_base, reg_len, "any_id");
    .....
    release_region(reg_base, reg_len);
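The reason for reserving a region is that a second claim on an overlapping range is refused until the first owner releases it. The following user-space toy model sketches that behavior. It is not the kernel implementation: toy_request_region and toy_release_region are hypothetical stand-ins, and they return 0 or -1, whereas the real request_region() returns NULL when the range is already claimed.

```c
#include <stddef.h>

#define TOY_MAX_REGIONS 8

/* One reserved range in the toy registry. */
struct toy_region {
    unsigned long start;
    unsigned long len;
    const char *name;   /* owner label, kept for illustration only */
    int in_use;
};

static struct toy_region toy_regions[TOY_MAX_REGIONS];

/* Returns 0 on success, -1 if the range is busy or the table is full. */
static int toy_request_region(unsigned long start, unsigned long len,
                              const char *name)
{
    struct toy_region *free_slot = NULL;
    size_t i;

    if (len == 0)
        return -1;
    for (i = 0; i < TOY_MAX_REGIONS; i++) {
        struct toy_region *r = &toy_regions[i];

        if (!r->in_use) {
            if (free_slot == NULL)
                free_slot = r;
            continue;
        }
        /* Two ranges overlap when each starts before the other ends. */
        if (start < r->start + r->len && r->start < start + len)
            return -1;
    }
    if (free_slot == NULL)
        return -1;
    free_slot->start = start;
    free_slot->len = len;
    free_slot->name = name;
    free_slot->in_use = 1;
    return 0;
}

/* Drops the reservation that exactly matches (start, len), if any. */
static void toy_release_region(unsigned long start, unsigned long len)
{
    size_t i;

    for (i = 0; i < TOY_MAX_REGIONS; i++) {
        struct toy_region *r = &toy_regions[i];

        if (r->in_use && r->start == start && r->len == len)
            r->in_use = 0;
    }
}
```

In this model, a second driver that asks for a range inside an already reserved 512-byte region is refused, and the same request succeeds after the first driver releases its reservation. In a real driver, the equivalent check is to test the return value of request_region() or request_mem_region() before touching the device.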
PCI-X I/O Resource Use Macros
You should reference PCI-X I/O resource
addresses by using the following macros.
Single byte access macros:
    inb(address);
    outb(value, address);
Single word access macros:
    inw(address);
    outw(value, address);
Single long access macros:
    inl(address);
    outl(value, address);
Multiple byte access macros:
    insb(address, value_address, byte_count);
    outsb(address, value_address, byte_count);
Multiple word access macros:
    insw(address, value_address, word_count);
    outsw(address, value_address, word_count);
Multiple long access macros:
    insl(address, value_address, long_count);
    outsl(address, value_address, long_count);
Note: On SGI Altix 3000 systems, PCI-X I/O resource
addresses are mapped addresses and can be referenced without using any
of the macros in the preceding list. Even so, it is recommended that you use these
macros so that your code is portable.
PCI-X Memory Resource Use Macros
PCI-X memory resource addresses should
not be used alone. Use the following platform-independent macros with
PCI-X memory resource addresses.
Single byte access macros:
    readb(address);
    writeb(value, address);
Single word access macros:
    readw(address);
    writew(value, address);
Single long access macros (4 bytes):
    readl(address);
    writel(value, address);
Single unsigned long access macros (8 bytes):
    readq(address);
    writeq(value, address);
PIO Write (Posted) Synchronization
PIO write operations on SGI Altix 3000 systems can be cached
in the various system components prior to actual arrival at the device.
These PIO write operations are called “posted” operations.
To explicitly flush these write operations, the device driver is required
to perform a PIO read operation (also known as a “PIO flush”)
after the last significant PIO write operation.
The need to perform PIO flushes becomes apparent when you consider
a multithreaded driver. Multithreaded drivers use a memory lock for synchronization,
as shown in the example sequence in Table 7-1.
Table 7-1. Memory Locks
| Time | CPU 0 | CPU 1 |
|---|---|---|
| n | (1) Grab lock (This CPU wins the race for the lock) | (1) Grab lock (This CPU must wait, as CPU 0 has the lock) |
| n + 1 | (2) PIO write of 0xa to device x | (2) Waiting |
| n + 2 | (3) Release lock (but no guarantee that #2 has completed) | (3) Receive lock |
| n + 3 | (4) No activity | (4) PIO write of 0xb to device x |
| n + 4 | (5) Device can receive 0xb before 0xa | |
To avoid releasing the memory lock before the PIO write has
completed, drivers for SGI Altix 3000 systems can issue
an additional operation (a read operation to the same controller, called
a PIO flush) to force the data to be delivered to the device before the
memory lock is released and a second thread can issue its own PIO write.
The sequence shown in Table 7-2 illustrates the correct
usage.
Table 7-2. Correct Memory Lock Usage
| Time | CPU 0 | CPU 1 |
|---|---|---|
| n | (1) Grab lock (This CPU wins the race for the lock) | (1) Grab lock (This CPU must wait, as CPU 0 has the lock) |
| n + 1 | (2) PIO write of 0xa to device x | (2) Waiting |
| n + 2 | (3) PIO read to the same controller; (4) Device receives 0xa | (3) Waiting |
| n + 3 | (5) Release lock | (4) Receive lock |
| n + 4 | (6) No activity | (5) PIO write of 0xb to device x; (6) PIO read to the same controller; (7) Device receives 0xb |
Without the PIO flush, the fact that CPU 0 issued the PIO write at n + 1
does not guarantee that the device has received the
data (0xa) before n + 3. Similarly, nothing
guarantees that the PIO write issued by CPU 1 at n + 3
does not arrive at the device before the write that was issued
by CPU 0 at n + 1.
Following is a more concrete example from a hypothetical device
driver:
    ...
    CPU A: spin_lock_irqsave(&dev_lock, flags);
    CPU A: val = readl(my_status);
    CPU A: ...
    CPU A: writel(newval, ring_ptr);
    CPU A: spin_unlock_irqrestore(&dev_lock, flags);
    ...
    CPU B: spin_lock_irqsave(&dev_lock, flags);
    CPU B: val = readl(my_status);
    CPU B: ...
    CPU B: writel(newval2, ring_ptr);
    CPU B: spin_unlock_irqrestore(&dev_lock, flags);
    ...
In the case above, the device may receive newval2
before it receives newval, which could cause problems.
Following is a fix for the problem:
    ...
    CPU A: spin_lock_irqsave(&dev_lock, flags);
    CPU A: val = readl(my_status);
    CPU A: ...
    CPU A: writel(newval, ring_ptr);
    /* The following read is the PIO flush that fixes the problem. */
    CPU A: (void)readl(safe_register); /* maybe a config register? */
    CPU A: spin_unlock_irqrestore(&dev_lock, flags);
    ...
    CPU B: spin_lock_irqsave(&dev_lock, flags);
    CPU B: val = readl(my_status);
    CPU B: ...
    CPU B: writel(newval2, ring_ptr);
    CPU B: (void)readl(safe_register); /* maybe a config register? */
    CPU B: spin_unlock_irqrestore(&dev_lock, flags);
Here, the read operations from safe_register
cause the I/O chipset to flush any pending write operations before actually
posting the read operation to the chipset, thus preventing possible data
corruption.
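The effect of the flush can be illustrated with a small user-space toy model. It is a sketch only and makes no claim about how the SHub or host bridge is implemented: it assumes that each CPU's posted writes sit in a separate buffer, that the buffers may drain to the device in any order, and that a PIO read from a CPU drains that CPU's buffer first. All toy_ names are hypothetical.

```c
#include <stddef.h>

#define TOY_CPUS  2
#define TOY_DEPTH 4
#define TOY_LOG   (TOY_CPUS * TOY_DEPTH)

/* Writes posted by each CPU that have not yet reached the device. */
static unsigned int toy_posted[TOY_CPUS][TOY_DEPTH];
static size_t toy_posted_cnt[TOY_CPUS];

/* Values in the order in which they actually arrived at the device. */
static unsigned int toy_device_log[TOY_LOG];
static size_t toy_device_cnt;

/* Clear all buffers and the device log. */
static void toy_reset(void)
{
    size_t i;

    for (i = 0; i < TOY_CPUS; i++)
        toy_posted_cnt[i] = 0;
    toy_device_cnt = 0;
}

/* Post a write: it is buffered, not delivered. */
static void toy_pio_write(int cpu, unsigned int val)
{
    if (toy_posted_cnt[cpu] < TOY_DEPTH)
        toy_posted[cpu][toy_posted_cnt[cpu]++] = val;
}

/* Deliver every write still buffered for this CPU, oldest first. */
static void toy_drain(int cpu)
{
    size_t i;

    for (i = 0; i < toy_posted_cnt[cpu]; i++) {
        if (toy_device_cnt < TOY_LOG)
            toy_device_log[toy_device_cnt++] = toy_posted[cpu][i];
    }
    toy_posted_cnt[cpu] = 0;
}

/* A PIO read forces this CPU's earlier writes out before it completes. */
static unsigned int toy_pio_read(int cpu)
{
    toy_drain(cpu);
    return 0; /* register contents are not modelled */
}
```

If CPU 0 writes 0xa, CPU 1 writes 0xb, and the buffers then drain with CPU 1 first, the device log shows 0xb before 0xa, which is the failure in Table 7-1. If each write is followed by toy_pio_read() on the same CPU before the next CPU proceeds, the log always shows 0xa before 0xb, which matches Table 7-2.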
For more information, see Appendix A, “Memory Operation Ordering on SGI Altix Systems”.
PIO Read Flushing Posted DMA Buffers
SGI Altix system hardware can buffer DMA write data.
These buffers are flushed only when the device generates
an interrupt.
The PCI specification requires that any bridge that can buffer DMA writes
must ensure that these posted buffers are flushed whenever a PIO
read is issued to the device. Because SGI Altix hardware does not meet
this requirement, all of the PIO read macros (for example,
inX() and readX()) have been enhanced to
perform a DMA write flush before returning to the caller. However, on
some devices and device drivers, this enhancement can cause a slight
performance degradation. Because of this potential performance implication,
a “fast” PIO call procedure is available. These calls do not
perform any DMA write buffer flushing. For devices that do not depend
on a PIO read to flush posted write DMA buffers, you can use the following
set of interfaces:
    sn_inb_fast(unsigned long port)
    sn_inw_fast(unsigned long port)
    sn_inl_fast(unsigned long port)
    sn_readb_fast(void *addr)
    sn_readw_fast(void *addr)
    sn_readl_fast(void *addr)
These calls are defined in the include/asm-ia64/sn/sn2/io.h
file.