Technical Briefing

August, 1996

Transfer Methods for Adapter Cards

Introduction

Data to be processed by the computer's CPU (Central Processing Unit) is stored in PC memory. The CPU reads data from PC memory into its registers, and also writes data to PC memory from its registers. I/O devices, including network adapters, can also move data into and out of PC memory.

This paper describes four possible ways for a computer to move data between an adapter and PC memory.

In the two transfer methods that the paper describes first (PC-controlled DMA and Bus Master DMA), the data completely bypasses the CPU, and the network adapter accesses PC memory directly.

In the two methods described second (Programmed Input/Output and Memory Mapped Input/Output), the CPU is central to the transfer process, and all of the data passes through the CPU.

Bus types

A computer's CPU and PC memory are connected to each other and to I/O devices by the PC bus (a generic term for the data, address, and control buses in the PC). When data is moved inside the PC, the number of individual transfer operations required to accomplish the whole transfer depends upon, and is limited by, the size of the PC bus. Where the PC bus connects the CPU to PC memory, it might be as much as 64 bits wide. Where it connects I/O devices (often via an expansion slot) to the CPU or to PC memory, it might be 8, 16, or 32 bits wide: the original IBM PCs had an 8-bit ISA bus; AT computers have 16-bit ISA buses; and EISA, MC32, and PCI computers have 32-bit buses. Of course, it is quicker to transfer a given quantity of data over a bus 32 bits at a time than 16 or 8 bits at a time.
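The relationship between bus size and the number of transfer operations can be sketched with a little arithmetic. The function name and the 1514-byte frame size below are illustrative assumptions, not values taken from this briefing.

```python
def transfers_needed(num_bytes, bus_width_bits):
    """Number of individual bus transfers needed to move num_bytes."""
    if bus_width_bits not in (8, 16, 32, 64):
        raise ValueError("unsupported bus width")
    bytes_per_transfer = bus_width_bits // 8
    # Round up: a final partial word still costs one whole transfer.
    return -(-num_bytes // bytes_per_transfer)


# Hypothetical 1514-byte frame moved over 8-, 16-, and 32-bit buses.
for width in (8, 16, 32):
    print(f"{width}-bit bus: {transfers_needed(1514, width)} transfers")
```

Under these assumptions the same frame needs 1514 transfers on an 8-bit bus, 757 on a 16-bit bus, and 379 on a 32-bit bus.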

Data transfer speeds are also affected by whether or not the data transfers use streaming. A transfer that uses streaming is one in which the device performing the transfer (the CPU or a network adapter, for example) does so without having to be told which memory address to use for each individual transfer that is required. Instead, the device is told which address to read from or write to first, and it simply advances the address by the width of one transfer (1, 2, or 4 bytes for an 8-, 16-, or 32-bit bus) for each subsequent memory access until the whole transfer is complete.
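As a rough illustration of streaming, the sketch below lists the addresses a device would step through given only a starting address. The function name and the example values are assumptions made for illustration.

```python
def streamed_addresses(start, num_bytes, bus_width_bits):
    """Addresses used by a streamed transfer, derived from the start address only."""
    step = bus_width_bits // 8          # bytes moved per memory access
    accesses = -(-num_bytes // step)    # round up to whole accesses
    return [start + i * step for i in range(accesses)]


# Eight bytes over a 16-bit bus, starting at a hypothetical address 0x1000.
print([hex(a) for a in streamed_addresses(0x1000, 8, 16)])
```

Only the first address is supplied; every later address follows from the bus width.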

Finally, it is possible to install adapters designed for different buses into the same PC. For example, PCI computers often have an ISA bus as well as a PCI one, and can therefore use PCI or ISA adapters. Also, the EISA expansion slots in EISA computers can accept ISA adapters. An ISA adapter in an EISA slot, however, will use 16-bit transfers.

The figure below shows that the size of a single data transfer is limited by the size of the computer's bus.

PC-controlled Direct Memory Access

Before an adapter can use PC-controlled DMA, it needs to be allocated a DMA channel to the PC's DMA controller. The DMA channel is a user-configurable value on most adapters.

PC-controlled DMA transfers bypass the computer's CPU, and are managed by the PC's DMA Controller. When the adapter has data in its receive memory, it sends a control signal to the DMA controller to tell it that data is available. The DMA controller will have been pre-programmed by the CPU (that is, by the adapter driver) with an address in memory for received data. It tells the adapter the address it can access in memory, and the adapter writes the data to memory.

To transmit data, the CPU tells the DMA controller to move a block of data from an address in PC memory (determined by the adapter driver) to the adapter. Each time the adapter receives data from the bus it writes it to its transmit area of memory.
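The receive sequence described above can be modelled in a few lines. This is a toy model only: the class and method names are hypothetical and do not correspond to the register interface of any real DMA controller.

```python
class DmaController:
    """Toy model of one channel of a PC DMA controller (assumed interface)."""

    def __init__(self, memory):
        self.memory = memory
        self.address = 0
        self.remaining = 0

    def program(self, address, count):
        # Done in advance by the CPU, that is, by the adapter driver.
        if address < 0 or address + count > len(self.memory):
            raise ValueError("buffer lies outside PC memory")
        self.address = address
        self.remaining = count

    def request(self, data_byte):
        """Adapter signals that a byte is available; the CPU is not involved.

        Returns False once the programmed count has been used up.
        """
        if self.remaining == 0:
            return False
        self.memory[self.address] = data_byte
        self.address += 1
        self.remaining -= 1
        return True


pc_memory = bytearray(8)
controller = DmaController(pc_memory)
controller.program(address=2, count=3)      # driver sets up the receive buffer
for received in b"abcd":                    # adapter has four bytes to deliver
    controller.request(received)
print(pc_memory)
```

In this sketch the fourth byte is refused because the driver programmed the channel for only three bytes.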

The figure below illustrates PC-controlled DMA

Bus Master Direct Memory Access

Before an adapter can use Bus Master DMA, it needs to be allocated a DMA channel to the PC's DMA controller. The DMA channel is used by the adapter's Bus Master circuitry.

In Bus Master DMA the adapter controls the whole of the data transfer. When the adapter has received data, it signals a request to the system to let it use the bus. When the system grants it the use of the bus, the adapter writes the data directly into PC memory.

Similarly, to transmit data, the adapter signals a request to the system to let it use the bus. Then, when it has the use of the bus, it performs the transfer.
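The request-and-grant handshake can be sketched as follows. The SystemBus class and the bus_master_receive function are hypothetical names, and real bus arbitration is considerably more involved than this single-owner model.

```python
class SystemBus:
    """Toy bus with a single owner at a time (assumed model)."""

    def __init__(self, memory):
        self.memory = memory
        self.owner = None

    def request(self, device):
        # The system grants the bus only if nobody else is using it.
        if self.owner is None:
            self.owner = device
            return True
        return False

    def release(self, device):
        if self.owner == device:
            self.owner = None

    def write(self, device, address, data):
        if self.owner != device:
            raise PermissionError("device does not own the bus")
        if address < 0 or address + len(data) > len(self.memory):
            raise ValueError("write lies outside PC memory")
        self.memory[address:address + len(data)] = data


def bus_master_receive(bus, device, address, data):
    """Adapter-driven transfer: request the bus, write to memory, release."""
    if not bus.request(device):
        return False            # bus busy; the adapter must try again later
    try:
        bus.write(device, address, data)
    finally:
        bus.release(device)
    return True


bus = SystemBus(bytearray(8))
print(bus_master_receive(bus, "adapter", 2, b"hi"), bus.memory)
```

The adapter, not the CPU or a DMA controller, performs the write once it has been granted the bus.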

Modern PC buses allow adapters to use streaming for Bus Master transfers. Streaming is where the adapter signals to the PC's memory circuitry that the data is sequential, and the memory circuitry then increments the address automatically for each memory access that the transfer requires. This speeds up transfers because it enables the adapter to put an address onto the address bus only once (for the first access) instead of once for every memory access that a transfer requires.
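The saving from streaming can be estimated by counting address phases. The sketch below assumes, purely for illustration, that each address phase and each data phase costs one bus cycle; real buses have different timings.

```python
def bus_cycles(num_accesses, streaming):
    """Estimated cycles for a transfer (assumes 1 cycle per address or data phase)."""
    if num_accesses <= 0:
        return 0
    # With streaming the address is put on the bus once; otherwise once per access.
    address_phases = 1 if streaming else num_accesses
    data_phases = num_accesses
    return address_phases + data_phases


print(bus_cycles(64, streaming=False), bus_cycles(64, streaming=True))
```

Under that assumption a 64-access transfer drops from 128 cycles to 65 when streaming is used.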

The figure below illustrates Bus Master DMA

Programmed Input/Output (PIO)

In the method of data transfer known as PIO, the computer's CPU is in complete control. When the network adapter has received data, it interrupts the CPU, which then reads the data from a fixed location in I/O space and writes it into PC memory.

Each time the CPU reads data from the I/O space, a processor on the adapter performs its own read operation to recover the data from the adapter's memory. It then puts the data onto the bus for the CPU, which writes it into memory. The CPU continues to read from the adapter's location in I/O space until the transfer is completed.

The size of a single data transfer is limited by the size of the computer's bus.

The figure below illustrates CPU Controlled Transfers

To transmit data, the CPU reads from PC memory and writes to the adapter's fixed location in I/O space. Each time it performs a write operation (with 8-, 16-, or 32-bits of data), the processor on the adapter writes the data to the transmit area in adapter memory, and the adapter transmits it onto the network.
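The PIO receive and transmit loops might look like the sketch below. AdapterPort is a hypothetical stand-in for the adapter's fixed location in I/O space, and the transfers here are one byte wide for simplicity.

```python
from collections import deque


class AdapterPort:
    """Hypothetical adapter seen through one fixed data port in I/O space."""

    def __init__(self, received):
        self._rx = deque(received)   # adapter's receive memory
        self.tx = bytearray()        # adapter's transmit memory

    def pending(self):
        return len(self._rx)

    def read(self):
        # Adapter fetches the next byte from its own memory for the CPU.
        return self._rx.popleft()

    def write(self, value):
        # Adapter stores the byte in its transmit area.
        self.tx.append(value)


def pio_receive(port, memory, address):
    """CPU reads the same port repeatedly and writes each byte to PC memory."""
    count = 0
    while port.pending():
        memory[address + count] = port.read()
        count += 1
    return count


def pio_transmit(port, memory, address, length):
    """CPU reads PC memory and writes each byte to the same port."""
    for offset in range(length):
        port.write(memory[address + offset])


pc_memory = bytearray(8)
port = AdapterPort(b"xyz")
pio_receive(port, pc_memory, 1)
pio_transmit(port, pc_memory, 1, 2)
print(pc_memory, port.tx)
```

Every byte passes through the CPU in both directions, which is what distinguishes PIO from the two DMA methods.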

Memory Mapped Input/Output (MMIO)

MMIO is the same as PIO except that, instead of reading from and writing to a fixed location in I/O space, the CPU uses a specific range of addresses in its memory map. The range is determined either by the adapter or by the driver, depending on whether the network administrator has configured the driver to use a specific area of mapped memory.

When the CPU accesses the area of mapped memory associated with the adapter, it uses the same signals as it would use if it were really accessing a memory chip. When the adapter circuitry responds to provide or accept data it does so by faking the response signals that a memory chip would use. For each access, the adapter either reads from the receive area on the adapter (for receive operations) or writes to the transmit area (for transmit operations).
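The address decoding described above can be sketched as a memory map in which one window of addresses is answered by the adapter instead of by RAM. The class name, window base, and sizes are hypothetical values chosen for illustration.

```python
class MemoryMap:
    """Toy CPU memory map with one adapter window (all values assumed)."""

    def __init__(self, ram_size, window_base, window_size):
        self.ram = bytearray(ram_size)
        self.window_base = window_base
        self.window_size = window_size
        self.adapter_rx = bytearray(window_size)   # receive area on the adapter
        self.adapter_tx = bytearray(window_size)   # transmit area on the adapter

    def _in_window(self, address):
        return self.window_base <= address < self.window_base + self.window_size

    def read(self, address):
        if self._in_window(address):
            # The adapter responds as if it were a memory chip.
            return self.adapter_rx[address - self.window_base]
        return self.ram[address]

    def write(self, address, value):
        if self._in_window(address):
            self.adapter_tx[address - self.window_base] = value
        else:
            self.ram[address] = value


memory_map = MemoryMap(ram_size=0x100, window_base=0x200, window_size=0x10)
memory_map.adapter_rx[0] = 0x11          # a byte the adapter has received
memory_map.write(0x10, memory_map.read(0x200))   # CPU copies it into RAM
print(hex(memory_map.ram[0x10]))
```

The CPU uses the same read and write operations for both targets; only the address decides whether RAM or the adapter responds.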

If MMIO is being used to transfer streams of data, the CPU normally accesses the data sequentially starting from the base location of the adapter's address range. If MMIO is being used only to transfer control data, the adapter can enable the CPU to perform random accesses (the technique it uses for this is similar to the technique it uses to enable the CPU to access Shared Memory).

MMIO can be faster than PIO. In some systems it is faster for a CPU to move data between two addresses in PC memory than it is for the CPU to move data between an address in memory and an address in I/O space. When it is moving data between two memory locations, the CPU can burst data between them. This means that it does not need to provide the memory circuitry with the address for each read or write operation of a given data transfer. Once it has the address for the first read or write operation, the memory circuitry assumes that the address for the next one will simply be the next memory address along.

Faster CPU instructions for PIO and MMIO

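The instructions that the CPU uses to perform PIO and MMIO are IN and OUT (for PIO, to read from and write to I/O space) and MOV (for MMIO, to move data to and from mapped memory).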

Today's CPUs can also use the following Repeat String instructions: REP INS, REP OUTS, and REP MOVS. These instructions perform the same operations as those associated with the IN, OUT, and MOV instructions, except that they repeat those operations a preset number of times. This is faster than executing individual transfers a number of times.
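The difference can be illustrated by counting how many instructions the CPU has to issue. The sketch below is a toy accounting model: it assumes one IN plus one MOV per byte for the individual case, ignores loop-control instructions and real cycle timings, and uses hypothetical function names.

```python
def in_loop(port, memory, address, count):
    """Individual transfers: one IN and one MOV issued per byte (assumed)."""
    issued = 0
    for i in range(count):
        value = next(port)            # models IN from the adapter's port
        issued += 1
        memory[address + i] = value   # models MOV into PC memory
        issued += 1
    return issued


def rep_ins(port, memory, address, count):
    """Models REP INS: a single instruction repeats the transfer count times."""
    for i in range(count):
        memory[address + i] = next(port)
    return 1


loop_memory, rep_memory = bytearray(4), bytearray(4)
print(in_loop(iter(b"abcd"), loop_memory, 0, 4),
      rep_ins(iter(b"abcd"), rep_memory, 0, 4))
```

Both routines leave the same data in memory; the repeat form simply needs far fewer instructions to be fetched and decoded.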

Most adapter cards support these String CPU instructions. The 32-bit variant of the instructions gives the fastest throughput, but support for the instructions depends upon the adapter hardware. Some 32-bit PCI adapters will not support them, and some 16-bit ISA adapters will (by providing two 16-bit transfers for each 32-bit access).
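How a 16-bit adapter might satisfy a 32-bit access with two 16-bit transfers can be sketched as below. The function names are hypothetical, and the low-word-first ordering is an assumption that matches the little-endian byte order of PC-compatible CPUs.

```python
def split_32bit_access(value):
    """Split one 32-bit value into (low_word, high_word) 16-bit transfers."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("value does not fit in 32 bits")
    return value & 0xFFFF, value >> 16


def join_16bit_transfers(low_word, high_word):
    """Rebuild the 32-bit value from the two 16-bit transfers."""
    return (high_word << 16) | low_word


low, high = split_32bit_access(0x12345678)
print(hex(low), hex(high), hex(join_16bit_transfers(low, high)))
```

The CPU still executes one 32-bit instruction; the bus and adapter carry it out as two narrower transfers.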
