The main methods for transferring large amounts of data between PCI-device
memory addresses and a host machine's random-access memory (RAM) are
block transfers (which may or may not result in PCI burst transfers) and
bus-master DMA.
Block transfers are easier to implement, but DMA is much more effective and
reliable for transferring large amounts of data, as explained in this document.
NOTE
For general information on how to improve your driver's performance with
WinDriver, refer to Technical Document #17.
Block Transfers
You can use the WinDriver WDC_ReadAddrBlock() and
WDC_WriteAddrBlock() functions (or the low-level
WD_Transfer() function with a string command) to perform block
(string) transfers, i.e., to transfer blocks of data from the device memory
(read) or to the device memory (write). You can also use
WDC_MultiTransfer() (or the low-level
WD_MultiTransfer() function) to group multiple block transfers
into a single function call. This is more efficient than performing multiple
single transfers.
The WinDriver block-transfer functions use assembler string instructions (such
as REP MOVSD, or a 64-bit MMX instruction for 64-bit transfers) to move a
block of memory between PCI-mapped memory on the device and the host's RAM.
From a software perspective, this is the most that can be done to encourage
PCI burst transfers.
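The difference between many single transfers and one block transfer can be sketched with a mock memory region. This is only an illustration under stated assumptions: mock_region, mock_read32(), and mock_read_block32() are hypothetical names, not part of the WinDriver API, and the plain copy loop merely stands in for the string instruction that a real block-transfer function would use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a memory-mapped PCI region (not a WinDriver type). */
typedef struct {
    volatile uint32_t *base;  /* start of the mapped region */
    size_t size_bytes;        /* size of the region, in bytes */
    unsigned long calls;      /* number of transfer calls issued so far */
} mock_region;

/* Single transfer: one call per 32-bit word. */
uint32_t mock_read32(mock_region *r, size_t offset)
{
    assert(offset % 4 == 0 && offset + 4 <= r->size_bytes);
    r->calls++;
    return r->base[offset / 4];
}

/* Block transfer: one call moves the whole range of consecutive words. */
void mock_read_block32(mock_region *r, size_t offset, uint32_t *dst, size_t bytes)
{
    assert(offset % 4 == 0 && bytes % 4 == 0 && offset + bytes <= r->size_bytes);
    r->calls++;
    for (size_t i = 0; i < bytes / 4; i++)
        dst[i] = r->base[offset / 4 + i];
}
```

Reading eight words with mock_read32() costs eight calls, while mock_read_block32() fetches the same data in one call that walks consecutive addresses, which is the access pattern a host controller is most likely to be able to turn into a burst.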
Burst Transfers
The hardware uses PCI burst mode to perform burst transfers — i.e.,
transfer the data in "bursts" of block reads/writes, resulting in a small
performance improvement compared to the alternative of single WORD transfers.
Some host controllers implement burst transfers by grouping access to
successive PCI addresses into PCI bursts.
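The grouping of successive addresses into bursts can be modeled in a few lines of C. The sketch below is a simulation only: count_bursts() and its max_beats limit are hypothetical, and a real host controller applies its own rules, which software cannot observe or control.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Count the bus transactions that a hypothetical host controller would issue
 * if it merged accesses to consecutive addresses into bursts of at most
 * max_beats data phases. width is the access size in bytes.
 */
size_t count_bursts(const uint32_t *addrs, size_t n, uint32_t width, size_t max_beats)
{
    assert(width > 0 && max_beats > 0);

    size_t bursts = 0;
    size_t beats = 0;  /* data phases in the current burst */

    for (size_t i = 0; i < n; i++) {
        int contiguous = (i > 0) && (addrs[i] == addrs[i - 1] + width);
        if (!contiguous || beats == max_beats) {
            bursts++;   /* start a new transaction */
            beats = 0;
        }
        beats++;
    }
    return bursts;
}
```

In this model, eight DWORD accesses to consecutive addresses with a limit of four data phases per burst collapse into two transactions, whereas four accesses with gaps between them remain four separate transactions.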
The host-side software has no way to control whether a target PCI transfer is
issued as a burst transfer. The most the host can do is initiate transfers
using assembler string instructions — as done by
the WinDriver block-transfer APIs — but
there's no guarantee that this will translate into burst transfers, as this is
entirely up to the hardware.
Most PCI host controllers support PCI burst mode for write transfers.
It is generally less common to find similar burst-mode support for PCI
read transfers.
64-Bit Transfers
WinDriver supports performing 64-bit transfers, using QWORD string
(block) transfers, on both 64-bit and 32-bit
platforms (see the WinDriver PCI User's Manual for the supported platforms). If
you have 64-bit PCI hardware (card and bus), you may be able to
improve the transfer rate by using 64-bit transfers, even if your
host platform is only 32-bit. However, note the following:
- The ability to perform actual 64-bit transfers requires
that such transfers be supported by the hardware — including the
CPU, the PCI card, the PCI host controller, and the PCI bridge —
and it can be affected by any of these components or their specific
combination.
- The conventional wisdom among hardware engineers is that performing two
32-bit DWORD transfers is more efficient than performing a
single 64-bit QWORD transfer; the reason is that the
64-bit transfer requires an additional CPU cycle to
negotiate a 64-bit transfer mode, and this cycle can be
used, instead, to perform a second 32-bit transfer.
Therefore, performing 64-bit transfers is generally more
advisable if you wish to transfer more than 64 bits of data in a single
burst.
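The reasoning in the second point can be checked with a deliberately simplified cycle count. The model below assumes one cycle per data phase and exactly one extra cycle to negotiate 64-bit mode, as described above; it ignores address phases, wait states, and everything else real hardware does. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Cycles to move `bytes` in 32-bit data phases (simplified model). */
size_t cycles_32bit(size_t bytes)
{
    assert(bytes > 0);
    return (bytes + 3) / 4;
}

/* Cycles to move `bytes` in 64-bit data phases, plus one negotiation cycle. */
size_t cycles_64bit(size_t bytes)
{
    assert(bytes > 0);
    return 1 + (bytes + 7) / 8;
}
```

Under these assumptions, 8 bytes (64 bits) cost 2 cycles either way, so the 64-bit mode gains nothing. At 16 bytes the counts are 4 versus 3, and at 64 bytes they are 16 versus 9, so the advantage of 64-bit transfers only appears once a single burst carries more than 64 bits.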
DMA
The best way to improve the performance of large PCI memory data transfers is
to use bus-master direct memory access (DMA) rather than block transfers
(which, as explained above, may or may not result in PCI burst transfers).
Most PCI architectures today provide DMA capability, which enables data to be
transferred directly between memory-mapped addresses on the PCI device and the
host's RAM, freeing the CPU from involvement in the data transfer and thus
improving the host's performance. DMA data-buffer sizes are limited only by the
size of the host's RAM and the available memory.
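The division of labor in a bus-master DMA transfer can be sketched with a small simulation. Everything here is hypothetical: mock_dma_engine is not a WinDriver structure and does not describe any real device's registers, and memcpy() merely stands in for the data movement that the hardware performs on its own. A real driver must also lock the host buffer and hand the device a physical address, which this sketch does not model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical register block of a bus-master device (simulation only). */
typedef struct {
    const uint8_t *dev_mem;  /* simulated device-side memory */
    uint8_t *host_addr;      /* simulated address of the host buffer */
    size_t length;           /* number of bytes to move */
    int done;                /* completion flag, set by the device */
    unsigned cpu_writes;     /* register writes issued by the CPU */
} mock_dma_engine;

/* CPU side: program the destination and length, then leave the device to it. */
void dma_program(mock_dma_engine *e, uint8_t *host_buf, size_t length)
{
    assert(host_buf != NULL && length > 0);
    e->host_addr = host_buf; e->cpu_writes++;
    e->length = length;      e->cpu_writes++;
    e->done = 0;
}

/* Device side: moves the data without CPU involvement (simulated by memcpy). */
void dma_device_run(mock_dma_engine *e)
{
    memcpy(e->host_addr, e->dev_mem, e->length);
    e->done = 1;
}
```

The point of the model is that the CPU's work in dma_program() is a fixed two register writes regardless of the transfer length, whereas in a block transfer the CPU itself executes the instruction that moves every word.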
For detailed information on DMA and how to implement it with WinDriver, refer
to the WinDriver PCI User's Manual. (The low-level WinDriver DMA APIs are
documented in the WinDriver PCI Low-Level API Reference.)
In addition, see the WinDriver DMA Technical Documents.