Chapter 9. Advanced Issues

This chapter covers advanced driver development issues and contains guidelines for using WinDriver to perform tasks that cannot be fully automated by the DriverWizard.

Note that WinDriver's enhanced support for specific chipsets [7] includes custom APIs for performing hardware-specific tasks like DMA and interrupt handling, thus freeing developers of drivers for these chipsets from the need to implement the code for performing these tasks themselves.

9.1. Performing Direct Memory Access (DMA)

This section describes how to use WinDriver to implement bus-master Direct Memory Access (DMA) for devices capable of acting as bus masters. Such devices have a DMA controller, which the driver should program directly.

DMA is a capability provided by some computer bus architectures — including PCI and PCIe — which allows data to be sent directly from an attached device to the memory on the host, freeing the CPU from involvement with the data transfer and thus improving the host's performance.

A DMA buffer can be allocated in two ways:

  • Contiguous buffer — A contiguous block of memory is allocated.
  • Scatter/Gather — The allocated buffer can be fragmented in the physical memory and does not need to be allocated contiguously. The allocated physical memory blocks are mapped to a contiguous buffer in the calling process's virtual address space, thus enabling easy access to the allocated physical memory blocks.
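To make the difference between the two allocation methods concrete, the following sketch counts how many physical pages a user-mode buffer can span. It is an illustration only: the 4096-byte page size and the helper name sg_max_fragments() are assumptions made for this example and are not part of the WinDriver API. The actual page list of a locked buffer is reported by the DMA lock functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed page size for this sketch; the real value is platform dependent. */
#define PAGE_SIZE_BYTES 4096u

/*
 * Hypothetical helper (not a WinDriver function): returns the number of
 * memory pages touched by a buffer that starts at address 'addr' and is
 * 'size' bytes long. A Scatter/Gather buffer may be split into up to this
 * many physical fragments, whereas a contiguous buffer is always a single
 * physical block.
 */
static size_t sg_max_fragments(uintptr_t addr, size_t size)
{
    uintptr_t first_page, last_page;

    if (size == 0)
        return 0;

    /* The buffer must not wrap around the end of the address space */
    assert(addr <= UINTPTR_MAX - (size - 1));

    first_page = addr / PAGE_SIZE_BYTES;
    last_page = (addr + size - 1) / PAGE_SIZE_BYTES;

    return (size_t)(last_page - first_page + 1);
}
```

Under these assumptions, a page-aligned 1MB buffer spans 256 pages, and the same buffer shifted by one byte spans 257, which is why the number of Scatter/Gather fragments depends on the buffer's alignment as well as its size.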

The programming of a device's DMA controller is hardware specific. Normally, you need to program your device with the local address (on your device), the host address (the physical memory address on your PC) and the transfer count (the size of the memory block to transfer), and then set the register that initiates the transfer.
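The programming sequence described above can be sketched as follows. The register layout, the bit definitions and every Demo/DEMO name are hypothetical and exist only for this illustration; a real device defines its own registers, and a real driver would write to the device's mapped address space rather than to a plain struct in host memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical register block of an imaginary bus-master device.
 * Consult your device's data sheet for the real layout.
 */
typedef struct {
    volatile uint32_t local_addr;   /* Address on the device */
    volatile uint32_t host_addr_lo; /* Host physical address, low 32 bits */
    volatile uint32_t host_addr_hi; /* Host physical address, high 32 bits */
    volatile uint32_t count;        /* Transfer count, in bytes */
    volatile uint32_t control;      /* Direction and start bits */
} DemoDmaRegs;

#define DEMO_CTRL_START  0x1u /* Hypothetical "start transfer" bit */
#define DEMO_CTRL_TO_DEV 0x2u /* Hypothetical "host-to-device" bit */

/* Program the addresses and the count, then set the start bit last */
static void demo_dma_program_and_start(DemoDmaRegs *regs, uint32_t local_addr,
    uint64_t host_addr, uint32_t bytes, int to_dev)
{
    assert(regs != NULL);

    regs->local_addr = local_addr;
    regs->host_addr_lo = (uint32_t)(host_addr & 0xFFFFFFFFu);
    regs->host_addr_hi = (uint32_t)(host_addr >> 32);
    regs->count = bytes;

    /* Writing the control register is what initiates the transfer */
    regs->control = (to_dev ? DEMO_CTRL_TO_DEV : 0u) | DEMO_CTRL_START;
}
```

The start bit is written last so that the device never begins a transfer with partially programmed addresses.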

WinDriver provides an API for implementing both contiguous-buffer DMA and Scatter/Gather DMA (if supported by the hardware): see the descriptions of WDC_DMAContigBufLock() [B.3.40], WDC_DMASGBufLock() [B.3.41], and WDC_DMABufUnlock() [B.3.43]. (The lower-level WD_DMAxxx API is described in the WinDriver PCI Low-Level API Reference, but we recommend using the convenient wrapper WDC_xxx API instead.)

The following sections include code samples that demonstrate how to use WinDriver to implement Scatter/Gather DMA [9.1.1] and contiguous-buffer DMA [9.1.2], and an explanation of how to preallocate contiguous DMA buffers on Windows [9.1.2.1].

[Note]
  • The sample routines demonstrate using either an interrupt mechanism or a polling mechanism to determine DMA completion.
  • The sample routines allocate a DMA buffer and enable DMA interrupts (if polling is not used) and then free the buffer and disable the interrupts (if enabled) for each DMA transfer. However, when you implement your actual DMA code, you can allocate DMA buffer(s) once, at the beginning of your application, enable the DMA interrupts (if polling is not used), then perform DMA transfers repeatedly, using the same buffer(s), and disable the interrupts (if enabled) and free the buffer(s) only when your application no longer needs to perform DMA.
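The allocate-once pattern recommended in the note above can be sketched as follows. This is a simulation, not WinDriver code: the DemoDmaSession type and the demo_session_xxx() functions are hypothetical, malloc() stands in for locking a DMA buffer (and enabling interrupts), and memset() stands in for the data that a device would transfer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical session object that owns one reusable buffer */
typedef struct {
    void *buf;          /* Stand-in for a locked DMA buffer */
    size_t size;        /* Buffer size, in bytes */
    unsigned transfers; /* Number of transfers performed so far */
} DemoDmaSession;

/* Called once, at application start: allocate the buffer */
static int demo_session_open(DemoDmaSession *s, size_t size)
{
    assert(s != NULL);
    s->buf = malloc(size);
    s->size = size;
    s->transfers = 0;
    return s->buf != NULL;
}

/* Called repeatedly: reuses the buffer instead of reallocating it */
static int demo_session_transfer(DemoDmaSession *s, unsigned char pattern)
{
    if (s->buf == NULL)
        return 0;
    memset(s->buf, pattern, s->size); /* Simulated device-to-host data */
    s->transfers++;
    return 1;
}

/* Called once, when DMA is no longer needed: free the buffer */
static void demo_session_close(DemoDmaSession *s)
{
    free(s->buf);
    s->buf = NULL;
}
```

In an actual driver, the open step would lock the DMA buffer and enable the DMA interrupts, each transfer would synchronize the caches and start the device, and the close step would disable the interrupts and unlock the buffer.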

9.1.1. Implementing Scatter/Gather DMA

Following is a sample routine that uses WinDriver's WDC API [B.2] to allocate a Scatter/Gather DMA buffer and perform bus-master DMA transfers.
A more detailed example, which is specific to the enhanced support for PLX chipsets [7], can be found in the WinDriver/plx/lib/plx_lib.c library file and the WinDriver/plx/diag_lib/plx_diag_lib.c diagnostics library file (which utilizes the plx_lib.c DMA API).

/* Forward declarations of the helper routines defined below */
BOOL DMAOpen(WDC_DEVICE_HANDLE hDev, PVOID pBuf, UINT32 u32LocalAddr,
    DWORD dwDMABufSize, BOOL fToDev, WD_DMA **ppDma);
void DMAClose(WDC_DEVICE_HANDLE hDev, WD_DMA *pDma, BOOL fPolling);

BOOL DMARoutine(WDC_DEVICE_HANDLE hDev, DWORD dwDMABufSize,
    UINT32 u32LocalAddr, DWORD dwOptions, BOOL fPolling, BOOL fToDev)
{
    PVOID pBuf;
    WD_DMA *pDma = NULL;
    BOOL fRet = FALSE;

    /* Allocate a user-mode buffer for Scatter/Gather DMA */
    pBuf = malloc(dwDMABufSize);
    if (!pBuf)
        return FALSE;

    /* Lock the DMA buffer and program the DMA controller */
    if (!DMAOpen(hDev, pBuf, u32LocalAddr, dwDMABufSize, fToDev, &pDma))
        goto Exit;

    /* Enable DMA interrupts (if not polling) */
    if (!fPolling)
    {
        if (!MyDMAInterruptEnable(hDev, MyDmaIntHandler, pDma))
            goto Exit; /* Failed enabling DMA interrupts */
    }

    /* Flush the CPU caches (see documentation of WDC_DMASyncCpu()) */
    WDC_DMASyncCpu(pDma);

    /* Start DMA - write to the device to initiate the DMA transfer */
    MyDMAStart(hDev, pDma);

    /* Wait for the DMA transfer to complete */
    MyDMAWaitForCompletion(hDev, pDma, fPolling);

    /* Flush the I/O caches (see documentation of WDC_DMASyncIo()) */
    WDC_DMASyncIo(pDma);

    fRet = TRUE;

Exit:
    DMAClose(hDev, pDma, fPolling);
    free(pBuf);
    return fRet;
}

/* DMAOpen: Locks a Scatter/Gather DMA buffer */
BOOL DMAOpen(WDC_DEVICE_HANDLE hDev, PVOID pBuf, UINT32 u32LocalAddr,
    DWORD dwDMABufSize, BOOL fToDev, WD_DMA **ppDma)
{
    DWORD dwStatus;
    DWORD dwOptions = fToDev ? DMA_TO_DEVICE : DMA_FROM_DEVICE;

    /* Lock a Scatter/Gather DMA buffer */
    dwStatus = WDC_DMASGBufLock(hDev, pBuf, dwOptions, dwDMABufSize, ppDma);
    if (WD_STATUS_SUCCESS != dwStatus)
    {
        printf("Failed locking a Scatter/Gather DMA buffer. Error 0x%lx - %s\n",
            dwStatus, Stat2Str(dwStatus));
        return FALSE;
    }

    /* Program the device's DMA registers for each physical page */
    MyDMAProgram((*ppDma)->Page, (*ppDma)->dwPages, fToDev);

    return TRUE;
}

/* DMAClose: Unlocks a previously locked Scatter/Gather DMA buffer */
void DMAClose(WDC_DEVICE_HANDLE hDev, WD_DMA *pDma, BOOL fPolling)
{
    /* Nothing to release if the buffer was never locked */
    if (!pDma)
        return;

    /* Disable DMA interrupts (if not polling) */
    if (!fPolling)
        MyDMAInterruptDisable(hDev);

    /* Unlock and free the DMA buffer */
    WDC_DMABufUnlock(pDma);
}

What Should You Implement?

In the code sample above, it is up to you to implement the following MyDMAxxx() routines, according to your device's specification:

  • MyDMAProgram(): Program the device's DMA registers.
    Refer to the device's data sheet for the details.
  • MyDMAStart(): Write to the device to initiate DMA transfers.
  • MyDMAInterruptEnable() and MyDMAInterruptDisable(): Use WDC_IntEnable() [B.3.48] and WDC_IntDisable() [B.3.49] (respectively) to enable/disable the software interrupts and write/read the relevant register(s) on the device in order to physically enable/disable the hardware DMA interrupts (see Section 9.2 for details regarding interrupt handling with WinDriver).
  • MyDMAWaitForCompletion(): Poll the device for completion or wait for a "DMA DONE" interrupt.
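As an illustration of what a MyDMAProgram() implementation for Scatter/Gather DMA might do, the sketch below converts a list of physical pages into a chain of transfer descriptors. All Demo/DEMO names are hypothetical: DemoDmaPage is only a stand-in for the page entries reported in the locked WD_DMA struct, and the descriptor format, the end-of-chain flag and the assumption that the device's local address advances with each page are invented for this example. Your device's data sheet defines the real format.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for one physical page entry of a locked buffer */
typedef struct {
    uint64_t phys_addr; /* Physical address of the fragment */
    uint32_t bytes;     /* Size of the fragment, in bytes */
} DemoDmaPage;

/* Hypothetical hardware descriptor: one per physical fragment */
typedef struct {
    uint64_t host_addr;  /* Host physical address */
    uint32_t local_addr; /* Address on the device */
    uint32_t bytes;      /* Transfer count for this fragment */
    uint32_t flags;      /* DEMO_DESC_END_OF_CHAIN on the last entry */
} DemoDmaDesc;

#define DEMO_DESC_END_OF_CHAIN 0x1u

/*
 * Build one descriptor per page. Returns the number of descriptors
 * written, or 0 if there are no pages or the table is too small.
 */
static size_t demo_build_sg_chain(const DemoDmaPage *pages, size_t num_pages,
    uint32_t local_addr, DemoDmaDesc *descs, size_t max_descs)
{
    size_t i;

    assert(pages != NULL && descs != NULL);

    if (num_pages == 0 || num_pages > max_descs)
        return 0;

    for (i = 0; i < num_pages; i++)
    {
        descs[i].host_addr = pages[i].phys_addr;
        descs[i].local_addr = local_addr;
        descs[i].bytes = pages[i].bytes;
        descs[i].flags = (i == num_pages - 1) ? DEMO_DESC_END_OF_CHAIN : 0u;

        /* Assumed: the device address advances by the fragment size */
        local_addr += pages[i].bytes;
    }

    return num_pages;
}
```

The fragments need not be adjacent or equally sized, so each descriptor carries its own host address and byte count.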

[Note]
When using the basic WD_xxx API (described in the WinDriver PCI Low-Level API Reference) to allocate a Scatter/Gather DMA buffer that is larger than 1MB, you need to set the DMA_LARGE_BUFFER flag in the call to WD_DMALock() and allocate memory for the additional memory pages, as explained in the following FAQ: http://www.jungo.com/st/support/windriver/windriver_faqs/#dma1. However, when using WDC_DMASGBufLock() [B.3.41] to allocate the DMA buffer, you do not need any special implementation for allocating large buffers, since the function handles this for you.

9.1.2. Implementing Contiguous-Buffer DMA

Following is a sample routine that uses WinDriver's WDC API [B.2] to allocate a contiguous DMA buffer and perform bus-master DMA transfers.
For more detailed, hardware-specific, contiguous DMA examples, refer to the following enhanced-support chipset [7] sample library files:

  • PLX — WinDriver/plx/lib/plx_lib.c and WinDriver/plx/diag_lib/plx_diag_lib.c (which utilizes the plx_lib.c DMA API)
  • Xilinx Bus Master DMA (BMD) design — WinDriver/xilinx/bmd_design/bmd_lib.c

/* Forward declarations of the helper routines defined below */
BOOL DMAOpen(WDC_DEVICE_HANDLE hDev, PVOID *ppBuf, UINT32 u32LocalAddr,
    DWORD dwDMABufSize, BOOL fToDev, WD_DMA **ppDma);
void DMAClose(WDC_DEVICE_HANDLE hDev, WD_DMA *pDma, BOOL fPolling);

BOOL DMARoutine(WDC_DEVICE_HANDLE hDev, DWORD dwDMABufSize,
    UINT32 u32LocalAddr, DWORD dwOptions, BOOL fPolling, BOOL fToDev)
{
    PVOID pBuf = NULL;
    WD_DMA *pDma = NULL;
    BOOL fRet = FALSE;

    /* Allocate a DMA buffer and open DMA for the selected channel */
    if (!DMAOpen(hDev, &pBuf, u32LocalAddr, dwDMABufSize, fToDev, &pDma))
        goto Exit;

    /* Enable DMA interrupts (if not polling) */
    if (!fPolling)
    {
        if (!MyDMAInterruptEnable(hDev, MyDmaIntHandler, pDma))
            goto Exit; /* Failed enabling DMA interrupts */
    }

    /* Flush the CPU caches (see documentation of WDC_DMASyncCpu()) */
    WDC_DMASyncCpu(pDma);

    /* Start DMA - write to the device to initiate the DMA transfer */
    MyDMAStart(hDev, pDma);

    /* Wait for the DMA transfer to complete */
    MyDMAWaitForCompletion(hDev, pDma, fPolling);

    /* Flush the I/O caches (see documentation of WDC_DMASyncIo()) */
    WDC_DMASyncIo(pDma);

    fRet = TRUE;

Exit:
    DMAClose(hDev, pDma, fPolling);
    return fRet;
}

/* DMAOpen: Allocates and locks a contiguous DMA buffer */
BOOL DMAOpen(WDC_DEVICE_HANDLE hDev, PVOID *ppBuf, UINT32 u32LocalAddr,
    DWORD dwDMABufSize, BOOL fToDev, WD_DMA **ppDma)
{
    DWORD dwStatus;
    DWORD dwOptions = fToDev ? DMA_TO_DEVICE : DMA_FROM_DEVICE;

    /* Allocate and lock a contiguous DMA buffer */
    dwStatus = WDC_DMAContigBufLock(hDev, ppBuf, dwOptions, dwDMABufSize, ppDma);
    if (WD_STATUS_SUCCESS != dwStatus)
    {
        printf("Failed locking a contiguous DMA buffer. Error 0x%lx - %s\n",
            dwStatus, Stat2Str(dwStatus));
        return FALSE;
    }

    /* Program the device's DMA registers for the physical DMA page */
    MyDMAProgram((*ppDma)->Page, (*ppDma)->dwPages, fToDev);

    return TRUE;
}

/* DMAClose: Frees a previously allocated contiguous DMA buffer */
void DMAClose(WDC_DEVICE_HANDLE hDev, WD_DMA *pDma, BOOL fPolling)
{
    /* Nothing to release if the buffer was never allocated */
    if (!pDma)
        return;

    /* Disable DMA interrupts (if not polling) */
    if (!fPolling)
        MyDMAInterruptDisable(hDev);

    /* Unlock and free the DMA buffer */
    WDC_DMABufUnlock(pDma);
}

What Should You Implement?

In the code sample above, it is up to you to implement the following MyDMAxxx() routines, according to your device's specification:

  • MyDMAProgram(): Program the device's DMA registers.
    Refer to the device's data sheet for the details.
  • MyDMAStart(): Write to the device to initiate DMA transfers.
  • MyDMAInterruptEnable() and MyDMAInterruptDisable(): Use WDC_IntEnable() [B.3.48] and WDC_IntDisable() [B.3.49] (respectively) to enable/disable the software interrupts and write/read the relevant register(s) on the device in order to physically enable/disable the hardware DMA interrupts (see Section 9.2 for details regarding interrupt handling with WinDriver).
  • MyDMAWaitForCompletion(): Poll the device for completion or wait for a "DMA DONE" interrupt.
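The polling variant of MyDMAWaitForCompletion() could follow the pattern sketched below: read a status register repeatedly and give up after a bounded number of attempts. This is a simulation under stated assumptions. The "done" bit, the callback type and the fake device are hypothetical; in a real driver the callback would read the relevant status register of your device.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_STATUS_DONE 0x1u /* Hypothetical "DMA done" status bit */

/* Callback that returns the current value of the status register */
typedef uint32_t (*DemoReadStatusFn)(void *ctx);

/*
 * Poll until the done bit is set or max_polls reads were made.
 * Returns 1 on completion, 0 on timeout.
 */
static int demo_dma_poll_done(DemoReadStatusFn read_status, void *ctx,
    unsigned max_polls)
{
    unsigned i;

    assert(read_status != NULL);

    for (i = 0; i < max_polls; i++)
    {
        if (read_status(ctx) & DEMO_STATUS_DONE)
            return 1;
    }

    return 0; /* Timed out: the caller should abort or reset the transfer */
}

/* Simulated device: reports "done" after a given number of reads */
typedef struct {
    unsigned reads_until_done;
} DemoFakeDevice;

static uint32_t demo_fake_read_status(void *ctx)
{
    DemoFakeDevice *dev = (DemoFakeDevice *)ctx;

    if (dev->reads_until_done > 0)
    {
        dev->reads_until_done--;
        return 0u;
    }

    return DEMO_STATUS_DONE;
}
```

Bounding the number of polls keeps the application from hanging if the device never signals completion. A production implementation would typically also sleep or yield between reads.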

9.1.2.1. Preallocating Contiguous DMA Buffers on Windows

WinDriver doesn't limit the size of the DMA buffer that can be allocated using its DMA APIs. However, the success of the DMA allocation is dependent on the amount of available system resources at the time of the allocation. Therefore, the earlier you try to allocate the buffer, the better your chances of succeeding.
WinDriver for Windows allows you to configure your device INF file to preallocate contiguous DMA buffers at boot time, thus increasing the odds that the allocation(s) will succeed.

[Note]
You may preallocate a maximum of 512 buffers: 256 host-to-device buffers and/or 256 device-to-host buffers.

There are two ways to preallocate contiguous DMA buffers on Windows: directly from DriverWizard, or manually by editing the INF file.

Directly from DriverWizard:
  1. In DriverWizard, start a new project, select a device from the list and click Generate .INF file as shown in the DriverWizard Walkthrough Step 2.
  2. Check Preallocate Host-To-Device DMA Buffers and/or Preallocate Device-To-Host DMA Buffers to enable the text boxes under each checkbox.
  3. Adjust the Size, Count and Flags parameters as desired.
    [Note]
    The Size and Flags fields must be hexadecimal numbers, formatted with the "0x" prefix, as shown below.

    Figure 9.1. DriverWizard INF File Information


    [Note]
    The supported WinDriver DMA flags are documented in the description of the dwOptions field of the WD_DMA struct [B.7.9]. To locate the relevant flag values to set in the INF file, look for the flag definitions in the WinDriver\include\windrvr.h file (look for the enum that contains the DMA_KERNEL_BUFFER_ALLOC flag).
  4. Click Next. You will then be prompted to choose a filename for your .INF file. After you choose a filename, the INF file is created with your desired parameters and is ready to use.

Manually by editing an existing INF file:
  1. Add the required configuration under the [UpdateRegistryDevice] registry key in your device INF file, as shown below.
    [Note]
    • The examples configure preallocation of eight DMA buffers (four in each direction), but you may, of course, choose to preallocate just one buffer (or none at all).
    • To preallocate unidirectional buffers, add these lines:
      ; Host-to-device DMA buffer:
      HKR,, "DmaToDeviceCount",0x00010001,0x04       ; Number of preallocated
                                                     ; DMA_TO_DEVICE buffers
      HKR,, "DmaToDeviceBytes",0x00010001,0x100000   ; Buffer size, in bytes
      HKR,, "DmaToDeviceOptions",0x00010001,0x41     ; DMA flags (0x40=DMA_TO_DEVICE
                                                     ; + 0x1=DMA_KERNEL_BUFFER_ALLOC)
      ; Device-to-host DMA buffer:
      HKR,, "DmaFromDeviceCount",0x00010001,0x04     ; Number of preallocated
                                                     ; DMA_FROM_DEVICE buffers
      HKR,, "DmaFromDeviceBytes",0x00010001,0x100000 ; Buffer size, in bytes
      HKR,, "DmaFromDeviceOptions",0x00010001,0x21   ; DMA flags (0x20=DMA_FROM_DEVICE
                                                     ; + 0x1=DMA_KERNEL_BUFFER_ALLOC)
      
  2. Edit the buffer sizes and add flags to the options masks in the INF file, as needed.
    Note, however, that the direction flags and the DMA_KERNEL_BUFFER_ALLOC flag must be set as shown in Step 1.
    [Note]
    The supported WinDriver DMA flags are documented in the description of the dwOptions field of the WD_DMA struct [B.7.9]. To locate the relevant flag values to set in the INF file, look for the flag definitions in the WinDriver\include\windrvr.h file (look for the enum that contains the DMA_KERNEL_BUFFER_ALLOC flag).
  3. In your code, the first n calls (if you configured the INF file to preallocate n DMA buffers) to the contiguous-DMA-lock function — WDC_DMAContigBufLock() [B.3.40] — should set parameter values that match the buffer configurations in the INF file:
    • For a device-to-host buffer, the DMA-options mask parameter (dwOptions / pDma->dwOptions) should contain the same DMA flags set in the DmaFromDeviceOptions registry key value, and the buffer-size parameter (dwDMABufSize / pDma->dwBytes) should be set to the DmaFromDeviceBytes registry key value.
    • For a host-to-device buffer, the DMA-options mask parameter (dwOptions / pDma->dwOptions) should contain the same DMA flags set in the DmaToDeviceOptions registry key value, and the buffer-size parameter (dwDMABufSize / pDma->dwBytes) should be set to the DmaToDeviceBytes registry key value.
    [Note]
    • When using WDC_DMAContigBufLock() [B.3.40] you don't need to explicitly set the DMA_KERNEL_BUFFER_ALLOC flag (which must be set in the INF-file configuration) because the function sets this flag automatically.
    • When using the low-level WinDriver WD_DMALock() function (described in the WinDriver PCI Low-Level API Reference), the DMA options are set in the function's pDma->dwOptions parameter — which must also include the DMA_KERNEL_BUFFER_ALLOC flag — and the buffer size is set in the pDma->dwBytes parameter.
    • If the buffer preallocation fails due to insufficient resources, you may need to increase the size of the non-paged pool (from which the memory is allocated), as explained in WinDriver Technical Document 58 (http://www.jungo.com/st/support/tech_docs/td58.html).
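The flag arithmetic used in the INF comments, and the matching rule described in step 3, can be sketched as follows. The DEMO_xxx values repeat the numbers quoted in the INF example (0x1, 0x20, 0x40); real code should use the definitions from windrvr.h instead. The DemoPreallocCfg type and the demo_request_matches_prealloc() helper are hypothetical, and treating "the same flags" as exact equality of the option masks is an assumption of this sketch, not documented WinDriver behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values as quoted in the INF comments; use windrvr.h in real code */
#define DEMO_DMA_KERNEL_BUFFER_ALLOC 0x01u
#define DEMO_DMA_FROM_DEVICE         0x20u
#define DEMO_DMA_TO_DEVICE           0x40u

/* Hypothetical mirror of one direction's INF configuration */
typedef struct {
    uint32_t count;   /* DmaToDeviceCount / DmaFromDeviceCount */
    uint32_t bytes;   /* DmaToDeviceBytes / DmaFromDeviceBytes */
    uint32_t options; /* DmaToDeviceOptions / DmaFromDeviceOptions */
} DemoPreallocCfg;

/*
 * Returns 1 if a contiguous-buffer lock request with the given options
 * and size corresponds to the preallocated configuration, else 0.
 */
static int demo_request_matches_prealloc(const DemoPreallocCfg *cfg,
    uint32_t req_options, uint32_t req_bytes)
{
    uint32_t effective;

    assert(cfg != NULL);

    /*
     * The WDC contiguous-lock function adds DMA_KERNEL_BUFFER_ALLOC
     * automatically, so add it here before comparing.
     */
    effective = req_options | DEMO_DMA_KERNEL_BUFFER_ALLOC;

    return cfg->count > 0 && effective == cfg->options &&
        req_bytes == cfg->bytes;
}
```

With the example INF values, a host-to-device request of 0x100000 bytes matches the configuration, whereas a request with a different size or direction does not and would presumably be served by a regular allocation instead.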