Technical Document #17

How can I improve my driver's performance with WinDriver?

The performance of the driver you develop, and its data-transfer rate, depend on the specific OS, the hardware, and the driver design.
The following suggestions may help you improve your driver's performance with WinDriver:


  • "PCI" references below include PCI Express (PCIe), PCMCIA, and CardBus.
  • For detailed information on how to access PCI and ISA addresses and improve your performance with WinDriver, and for a full WinDriver PCI/ISA API reference, refer to the WinDriver PCI User's Manual.
    For a description of the low-level WD_xxx APIs, refer to the WinDriver PCI Low-Level API Reference.

For PCI devices, use memory rather than I/O mapped ranges in your hardware design, as it is much faster to access memory than I/O.

In general, it's faster to access memory addresses directly. Use the WinDriver WDC_MEM_DIRECT_ADDR macro to obtain the direct memory mapping of the relevant region on a card previously registered using WDC_PciDeviceOpen() / WDC_PcmciaDeviceOpen() / WDC_IsaDeviceOpen() (depending on your hardware). Then access the memory directly by passing this address to one of the WDC_ReadMemXXX or WDC_WriteMemXXX macros, or to one of the WDC_ReadAddrXXX() or WDC_WriteAddrXXX() functions (not including WDC_ReadAddrBlock() / WDC_WriteAddrBlock(), which are discussed below).
(When using the low-level WinDriver API, use the direct user-mode memory-range mapping returned by WD_CardRegister() in cardReg.Card.Item[i].I.Mem.pUserDirectAddr, or in dwUserDirectAddr in versions before v11.8.0.)
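
To illustrate why direct access is cheap, the following plain-C sketch treats a mapped region as an ordinary base address and reads or writes a 32-bit register with a single volatile load or store. It does not use the WinDriver headers: read_reg32() and write_reg32() are hypothetical helpers, not WinDriver functions. In a real driver, the base address would be the value obtained from WDC_MEM_DIRECT_ADDR (or from the low-level direct-address field), and the access would go through the WDC macros named above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (NOT part of the WinDriver API): read a 32-bit
 * register at byte offset `offset` from a directly mapped base address.
 * Assumes `base` itself is 4-byte aligned. */
static uint32_t read_reg32(uintptr_t base, size_t offset)
{
    assert(offset % sizeof(uint32_t) == 0); /* keep the access aligned */
    /* volatile: the compiler must perform the load and may not cache it */
    return *(volatile uint32_t *)(base + offset);
}

/* Hypothetical helper (NOT part of the WinDriver API): write a 32-bit
 * register at byte offset `offset` from a directly mapped base address. */
static void write_reg32(uintptr_t base, size_t offset, uint32_t value)
{
    assert(offset % sizeof(uint32_t) == 0);
    *(volatile uint32_t *)(base + offset) = value;
}
```

Each helper compiles down to an address computation plus one memory access, with no call into the kernel, which is the reason small accesses through a direct mapping tend to be fast.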

When accessing I/O addresses, or when transferring a large amount of data to or from memory, you might improve the performance by implementing block (string) transfers, using WDC_ReadAddrBlock() or WDC_WriteAddrBlock() (or the low-level WD_Transfer() function with a string command), and/or by grouping several transfers into a single function call using WDC_MultiTransfer() (or the low-level WD_MultiTransfer() function). For small memory data transfers, direct access to the memory, as described above, is generally more efficient.
The block transfers translate to assembler string instructions, which may further improve the performance by initiating PCI burst mode, if supported by the hardware.
If you need to transfer more than 64 bits of data in a single burst, you may be able to improve the performance using 64-bit transfers, which can be implemented with WinDriver using QWORD string (block) transfers. This can be especially useful on 32-bit host platforms. Note, however, that the ability to perform actual 64-bit transfers is dependent on the hardware components involved in the transfer, as explained in the WinDriver PCI User's Manual.
For more information on PCI block transfers (including PCI burst and 64-bit transfers), refer to Technical Document #108.
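
As a rough illustration of the 64-bit point above, the following sketch only does the arithmetic: it counts how many element accesses a block of a given size requires at a given access width. accesses_needed() is a hypothetical helper, not a WinDriver function, and the count says nothing about whether the bus actually performs 64-bit cycles, which depends on the hardware as noted above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (NOT part of the WinDriver API): number of element
 * accesses needed to move `total_bytes` using accesses of `width_bytes`
 * each (4 = DWORD, 8 = QWORD). A trailing partial element counts as one
 * additional access. */
static size_t accesses_needed(size_t total_bytes, size_t width_bytes)
{
    assert(width_bytes > 0);
    return total_bytes / width_bytes + (total_bytes % width_bytes != 0);
}
```

For a 4096-byte block this gives 1024 accesses with DWORD elements and 512 with QWORD elements, which is the most the wider element size can save.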

When performing large memory transfers, it's preferable to use PCI bus-master direct memory access (DMA) when possible, as it is generally faster than PCI target access (which is what the other memory-access methods described above use); see also Technical Document #108. For detailed instructions on how to use the WinDriver APIs (namely, the WDC_DMAxxx() or low-level WD_DMAxxx() functions) to implement DMA for PCI devices capable of acting as bus masters, refer to the WinDriver PCI User's Manual.

To further improve the driver's performance, you can use WinDriver's Kernel PlugIn feature to move performance-critical sections of your code from the user mode to the kernel. This feature allows you, for example, to handle interrupts directly in the kernel (see also Technical Document #48 regarding the Kernel PlugIn interrupt latency). A detailed description of the Kernel PlugIn feature can be found in the WinDriver User's Manual.
The Kernel PlugIn is not supported on Windows CE and VxWorks, since these operating systems do not distinguish between user and kernel mode. On VxWorks (last supported in WinDriver v5.2.2) you can improve the interrupt handling rate by using the windrvr_isr callback function, as explained in the manual and in Technical Document #115.


For detailed information on how to perform USB transfers and improve your performance with WinDriver, and for a full WinDriver USB API reference, refer to the WinDriver USB User's Manual.

To increase the data transfer rate, try replacing several data transfers that use relatively small data buffers with a single transfer that uses a large data buffer. This eliminates some of the function-call overhead and reduces the number of context switches between user and kernel mode.
(The size of the data buffer is set in the dwBufferSize parameter of the WDU_Transfer() function.)
NOTE: The size of the buffer used in the calls to WDU_Transfer() is not limited to the maximum packet size for the device, although we recommend that you use buffer sizes that are multiples of the maximum packet size.
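
The buffer-size advice above can be sketched as two small calculations. Both helpers are hypothetical and are not part of the WinDriver API: round_up_to_packet_multiple() picks a buffer size that is a multiple of the maximum packet size, and transfer_calls_needed() counts how many transfer calls a given buffer size implies. The 512-byte maximum packet size used in the note below is an assumed example value, not one taken from a specific device.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (NOT part of the WinDriver API): round `size` up to
 * the next multiple of `max_packet`. A size that is already a multiple is
 * returned unchanged. */
static size_t round_up_to_packet_multiple(size_t size, size_t max_packet)
{
    assert(max_packet > 0);
    size_t rem = size % max_packet;
    return rem == 0 ? size : size + (max_packet - rem);
}

/* Hypothetical helper (NOT part of the WinDriver API): number of transfer
 * calls needed to move `total_bytes` when each call carries at most
 * `buffer_size` bytes. */
static size_t transfer_calls_needed(size_t total_bytes, size_t buffer_size)
{
    assert(buffer_size > 0);
    return total_bytes / buffer_size + (total_bytes % buffer_size != 0);
}
```

With an assumed 512-byte maximum packet size, a 1000-byte request rounds up to 1024 bytes, and moving 64 KB takes 128 calls with a 512-byte buffer but only one call with a 64 KB buffer. The chosen size is the value you would pass in the dwBufferSize parameter of WDU_Transfer().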

You might also be able to improve the transfer rate by modifying the device's firmware (for example, by increasing the maximum packet size for the hardware).