9.3. Byte Ordering

9.3.1. Introduction to Endianness

There are two main conventions for ordering the bytes of a multi-byte value in memory. They are called big endian and little endian.

  • Big endian means that the most significant byte of any multi-byte data field is stored at the lowest memory address.
    This means a Hex word like 0x1234 is stored in memory as (0x12 0x34). The big end, or upper end, is stored first. The same is true for a four-byte value; for example, 0x12345678 would be stored as (0x12 0x34 0x56 0x78).
  • Little endian means that the least significant byte of any multi-byte data field is stored at the lowest memory address.
    This means a Hex word like 0x1234 is stored in memory as (0x34 0x12). The little end, or lower end, is stored first. The same is true for a four-byte value; for example, 0x12345678 would be stored as (0x78 0x56 0x34 0x12).
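The two layouts described above can be observed directly by copying the bytes of a value out of memory. The following is a minimal sketch and is not part of WinDriver; the function names bytes_in_memory() and host_is_little_endian() are hypothetical and were chosen only for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Copy the in-memory bytes of a 32-bit value into out[0..3],
 * lowest memory address first. (Hypothetical helper.) */
static void bytes_in_memory(uint32_t value, unsigned char out[4])
{
    assert(out != NULL);
    memcpy(out, &value, sizeof value);
}

/* Return 1 if this host stores the least significant byte at the
 * lowest address (little endian), 0 otherwise. (Hypothetical helper.) */
static int host_is_little_endian(void)
{
    unsigned char b[4];

    bytes_in_memory(0x12345678u, b);
    return b[0] == 0x78;
}
```

On a little-endian host such as x86, bytes_in_memory(0x12345678, b) fills b with 0x78 0x56 0x34 0x12. On a big-endian host it fills b with 0x12 0x34 0x56 0x78.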

Most processors are designated as either big endian or little endian, although some can operate in either mode. Intel's x86 processors and their clones are little endian. Sun's SPARC, Motorola's 68K, and the PowerPC families are traditionally big endian.

An endianness difference can cause problems if a computer unknowingly tries to read binary data written in the opposite format from a shared memory location or file.
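One common way to avoid this problem is to decode shared binary data with an explicitly stated byte order instead of relying on the host's native layout. The sketch below assumes that the byte order used by the writer is known, for example from a file format definition. The helper names read_le32() and read_be32() are hypothetical and are not WinDriver functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode four bytes that were written in little-endian order.
 * The result is the same on any host. (Hypothetical helper.) */
static uint32_t read_le32(const unsigned char *p)
{
    assert(p != NULL);
    return (uint32_t)p[0]
         | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16)
         | ((uint32_t)p[3] << 24);
}

/* Decode four bytes that were written in big-endian order.
 * The result is the same on any host. (Hypothetical helper.) */
static uint32_t read_be32(const unsigned char *p)
{
    assert(p != NULL);
    return ((uint32_t)p[0] << 24)
         | ((uint32_t)p[1] << 16)
         | ((uint32_t)p[2] << 8)
         | (uint32_t)p[3];
}
```

Given the bytes (0x12 0x34 0x56 0x78), read_be32() returns 0x12345678 and read_le32() returns 0x78563412, which shows how the same stored bytes yield different values when the wrong order is assumed.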

The terms big endian and little endian are derived from the Lilliputians of Gulliver's Travels (Jonathan Swift, 1726), whose major political issue was which end of a soft-boiled egg should be opened: the little end or the big end.

9.3.2. WinDriver Byte Ordering Macros

The PCI bus is designated as little endian, in line with the x86 architecture. To prevent problems resulting from byte-ordering incompatibility between the PCI bus and the SPARC and PowerPC architectures, WinDriver includes macro definitions that convert data between little endian and big endian.

When you develop drivers with WinDriver, these macro definitions enable cross-platform portability. Using them is safe even for drivers that will be deployed on the x86 architecture.
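To illustrate the idea behind such conversions, the following sketch shows one possible way to convert between a little-endian device and the host. It is not WinDriver's implementation, whose definitions may differ. All names beginning with sketch_ are hypothetical, and the host byte order is detected at run time only to keep the example self-contained.

```c
#include <stdint.h>

/* Reverse the byte order of a 16-bit value. */
static uint16_t sketch_swap16(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Reverse the byte order of a 32-bit value. */
static uint32_t sketch_swap32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24)
         | ((v & 0x0000FF00u) << 8)
         | ((v & 0x00FF0000u) >> 8)
         | ((v & 0xFF000000u) >> 24);
}

/* Return 1 if the host stores the most significant byte first. */
static int sketch_host_is_big_endian(void)
{
    const uint16_t probe = 1;

    return *(const unsigned char *)&probe == 0;
}

/* Device (little endian) to host: swap only on a big-endian host,
 * leave the value unchanged on a little-endian host. */
static uint16_t sketch_dtoh16(uint16_t v)
{
    return sketch_host_is_big_endian() ? sketch_swap16(v) : v;
}

static uint32_t sketch_dtoh32(uint32_t v)
{
    return sketch_host_is_big_endian() ? sketch_swap32(v) : v;
}
```

In this sketch the conversion leaves the value unchanged on a little-endian host, so the same source code can be compiled for either kind of host.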

The following sections describe the macros and when to use them.

9.3.3. Macros for PCI Target Access

WinDriver's macros for PCI target access convert endianness when reading from or writing to PCI cards through memory-mapped ranges of the PCI devices.

[Note]
These macro definitions apply to the Linux PowerPC architecture.
  • dtoh16 — Macro definition for converting a WORD (device to host)
  • dtoh32 — Macro definition for converting a DWORD (device to host)
  • dtoh64 — Macro definition for converting a QWORD (device to host)

Use these macros in the following situations:

  1. To prepare data to be written to the device, in cases of direct write access to the card using a memory mapped range.

    For example:

    DWORD data = VALUE;
    *mapped_address = dtoh32(data);

  2. To process data that has been read from the device, in cases of direct read access from the card using a memory mapped range.

    For example:

    WORD data = dtoh16(*mapped_address);

[Note]
The WinDriver APIs WDC_Read/WriteXXX() [B.3.19-B.3.24], WDC_MultiTransfer() [B.3.25], and the lower-level WD_Transfer() and WD_MultiTransfer() functions (see the WinDriver PCI Low-Level API Reference) already perform the required byte-ordering translations. Therefore, when you use these APIs to read or write memory addresses, you do not need to convert the data with the dtoh16/32/64() macros. The conversion is not required for I/O addresses either.
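The two direct-access situations listed above use the same device-to-host macro for both writing and reading. This works because reversing the byte order is its own inverse. The sketch below simulates this without hardware: the byte array sim_bar stands in for a memory-mapped device register, and sim_dtoh32() is a hypothetical stand-in for a device-to-host conversion. None of these names are WinDriver definitions.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a device-to-host conversion:
 * swap bytes on a big-endian host, do nothing otherwise. */
static uint32_t sim_dtoh32(uint32_t v)
{
    const uint16_t probe = 1;
    const int host_is_big = (*(const unsigned char *)&probe == 0);

    if (!host_is_big)
        return v;
    return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8)
         | ((v & 0x00FF0000u) >> 8)  | ((v & 0xFF000000u) >> 24);
}

/* Simulated 4-byte little-endian device register. */
static unsigned char sim_bar[4];

/* Direct write: convert first, then store to the mapped range. */
static void sim_write_reg(uint32_t value)
{
    uint32_t wire = sim_dtoh32(value);

    memcpy(sim_bar, &wire, sizeof wire);
}

/* Direct read: load from the mapped range, then convert. */
static uint32_t sim_read_reg(void)
{
    uint32_t wire;

    memcpy(&wire, sim_bar, sizeof wire);
    return sim_dtoh32(wire);
}
```

After sim_write_reg(0x12345678), the simulated register holds (0x78 0x56 0x34 0x12) on both little-endian and big-endian hosts, and sim_read_reg() returns 0x12345678.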

9.3.4. Macros for PCI Master Access

WinDriver's macros for PCI master access are used for converting endianness of data in host memory that is accessed by the PCI master device, i.e., in cases of access that is initiated by the device rather than the host.

[Note]
These macro definitions apply to both Linux PowerPC and SPARC architectures.
  • htod16 — Macro definition for converting a WORD (host to device)
  • htod32 — Macro definition for converting a DWORD (host to device)
  • htod64 — Macro definition for converting a QWORD (host to device)

Use these macros to prepare data in host memory that will be read or written by the card. An example of such a case is a chain of descriptors for scatter/gather DMA.

The following example is an extract from the PLX_DMAOpen() function in WinDriver's PLX library (see WinDriver/plx/lib/plx_lib.c):

        /* Setting chain of DMA pages in the memory */
        for (dwPageNumber = 0, u32MemoryCopied = 0;
            dwPageNumber < pPLXDma->pDma->dwPages;
            dwPageNumber++)
        {
            pList[dwPageNumber].u32PADR =
                htod32((UINT32)pPLXDma->pDma->Page[dwPageNumber].pPhysicalAddr);
            pList[dwPageNumber].u32LADR =
                htod32((u32LocalAddr + (fAutoinc ? u32MemoryCopied : 0)));
            pList[dwPageNumber].u32SIZ =
                htod32((UINT32)pPLXDma->pDma->Page[dwPageNumber].dwBytes);
            pList[dwPageNumber].u32DPR =
                htod32((u32StartOfChain + sizeof(DMA_LIST) * (dwPageNumber + 1))
                | BIT0 | (fIsRead ? BIT3 : 0));
            u32MemoryCopied += pPLXDma->pDma->Page[dwPageNumber].dwBytes;
        }

        pList[dwPageNumber - 1].u32DPR |= htod32(BIT1); /* Mark end of chain */
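The last line of the extract sets a flag in a field that has already been converted. This works because a byte swap only moves bytes, so OR-ing a converted flag into a converted field gives the same result as converting the combined host value. The sketch below illustrates this property for the case in which the conversion is a byte swap, as on a big-endian host. The names are hypothetical, and the value 0x2 for SKETCH_BIT1 is an assumption made for this illustration, not taken from the PLX library.

```c
#include <stdint.h>

/* Hypothetical stand-in for a single-bit flag; the value is assumed. */
#define SKETCH_BIT1 0x00000002u

/* Reverse the byte order of a 32-bit value. */
static uint32_t sketch_bswap32(uint32_t v)
{
    return ((v & 0x000000FFu) << 24)
         | ((v & 0x0000FF00u) << 8)
         | ((v & 0x00FF0000u) >> 8)
         | ((v & 0xFF000000u) >> 24);
}

/* Set a host-order flag in a field that already holds the
 * converted (byte-swapped) representation. */
static uint32_t sketch_set_flag_converted(uint32_t converted_field,
                                          uint32_t host_flag)
{
    return converted_field | sketch_bswap32(host_flag);
}
```

For a host value of 0x00ABCD10, both sketch_bswap32(0x00ABCD10 | SKETCH_BIT1) and sketch_set_flag_converted(sketch_bswap32(0x00ABCD10), SKETCH_BIT1) yield 0x12CDAB00.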