Wednesday, November 12, 2014

Detecting and Handling Endianness at Run-Time

A long time ago, on a very remote island known as Lilliput, society was split into two factions: Big-Endians, who opened their soft-boiled eggs at the larger end ("the primitive way"), and Little-Endians, who broke their eggs at the smaller end. When the Emperor commanded all his subjects to break the smaller end, the result was a civil war with dramatic consequences: 11,000 people have, at several times, suffered death rather than submit to breaking their eggs at the smaller end [1]-[2].


Eventually, the 'Little-Endian' vs. 'Big-Endian' feud carried over into the world of computing, where it refers to the order in which the bytes of a multi-byte number are stored: most significant first (Big-Endian) or least significant first (Little-Endian) [2].

Endianness refers to how the bytes of a multibyte value are ordered within computer memory:
  • Big-Endian means that the most significant byte of any multibyte data field is stored at the lowest memory address, which is also the address of the field as a whole.
  • Little-Endian means that the least significant byte of any multibyte data field is stored at the lowest memory address, which is also the address of the field as a whole.
For example, consider the 32-bit number 0x16FAE50A. Following the Big-Endian convention, a computer stores it as follows:
  • Base_Address       : 16
  • Base_Address   + 1 : FA
  • Base_Address   + 2 : E5
  • Base_Address   + 3 : 0A
Whereas architectures that follow the Little-Endian rules will store it as follows:
  • Base_Address       : 0A
  • Base_Address   + 1 : E5
  • Base_Address   + 2 : FA
  • Base_Address   + 3 : 16
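The two layouts above can be observed directly by copying a value's bytes out of memory. The following is a minimal sketch; the helper name bytes_in_memory is hypothetical and only serves the illustration.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy the in-memory bytes of a 32-bit value
 * into out[0..3], lowest address first. */
void bytes_in_memory(uint32_t value, unsigned char out[4])
{
    memcpy(out, &value, sizeof value);
}
```

Called with 0x16FAE50A, this fills out with 16 FA E5 0A on a big-endian platform and with 0A E5 FA 16 on a little-endian one.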
Although there is no significant performance difference between the two byte orders, different microcontrollers follow different conventions: some are big-endian and others are little-endian.

The following EEPROM handler shows how serious bugs can arise when endianness is not carefully considered:
typedef struct EEPhandler_tstrPROBE_CONFIG_GROUP{
    short as16X_BOARD_TEMP_TABLE[10];
}EEPhandler_tstrPROBE_CONFIG_GROUP;

void DEE_bEepromReadSync(char* pu8BufferToRead){
    //EEPROM data is read exactly as if we wrote the following code:
    *pu8BufferToRead = 0x00;
    pu8BufferToRead++;
    *pu8BufferToRead = 0x1F;   
}

void EEPhandler_bReadSyncPROBE_CONFIG_GROUP(EEPhandler_tstrPROBE_CONFIG_GROUP* pstrPROBE_CONFIG_GROUP){
    DEE_bEepromReadSync((char *)pstrPROBE_CONFIG_GROUP);
    // After the previous line of code,
    // (*pstrPROBE_CONFIG_GROUP).as16X_BOARD_TEMP_TABLE[0] is 0x001F on big-endian platforms
    // and 0x1F00 on little-endian platforms.
}
After the call to DEE_bEepromReadSync((char *)pstrPROBE_CONFIG_GROUP), the first element of as16X_BOARD_TEMP_TABLE is 0x001F on big-endian platforms and 0x1F00 on little-endian platforms, because the bytes 0x00 and 0x1F land at the same addresses on both but are interpreted in opposite orders.
This is a serious issue if the code is meant to run on several platforms or to handle communication between different microcontrollers.
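One way to sidestep this problem is to assemble each value from its individual bytes with shifts instead of casting the buffer to a struct. The sketch below assumes the EEPROM layout is documented as either high-byte-first or low-byte-first; the helper names read_s16_be and read_s16_le are hypothetical and not part of the handler above.

```c
#include <stdint.h>

/* Hypothetical helper: decode a 16-bit value whose high byte is stored first.
 * The result does not depend on the byte order of the CPU running the code. */
int16_t read_s16_be(const unsigned char *p)
{
    uint16_t raw = (uint16_t)(((unsigned)p[0] << 8) | p[1]);
    /* For raw values above 0x7FFF this conversion is implementation-defined
     * (two's complement wrap-around on common compilers). */
    return (int16_t)raw;
}

/* Hypothetical helper: decode a 16-bit value whose low byte is stored first. */
int16_t read_s16_le(const unsigned char *p)
{
    uint16_t raw = (uint16_t)(((unsigned)p[1] << 8) | p[0]);
    return (int16_t)raw;
}
```

With the bytes 0x00, 0x1F from the example, read_s16_be gives 0x001F and read_s16_le gives 0x1F00 on every platform, so the caller decides the interpretation rather than the hardware.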

Endianness is not a compiler issue, nor even an operating system issue, but a platform issue. As a rule, you cannot count on a compiler option or workaround to hide it.
There are, however, conversion routines with which you can normalize the endianness of stored data.
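As an illustration of such normalization, stored data can always be written and read in one agreed byte order, whatever the CPU uses internally. The following is a minimal sketch that assumes big-endian is chosen as the storage order; store_u32_be and load_u32_be are hypothetical names.

```c
#include <stdint.h>

/* Hypothetical helper: write v to buf[0..3], most significant byte first. */
void store_u32_be(unsigned char *buf, uint32_t v)
{
    buf[0] = (unsigned char)(v >> 24);
    buf[1] = (unsigned char)(v >> 16);
    buf[2] = (unsigned char)(v >> 8);
    buf[3] = (unsigned char)v;
}

/* Hypothetical helper: read back a value written by store_u32_be. */
uint32_t load_u32_be(const unsigned char *buf)
{
    return ((uint32_t)buf[0] << 24) |
           ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8)  |
            (uint32_t)buf[3];
}
```

Because only shifts and masks are involved, the same source produces the same stored bytes on big-endian and little-endian platforms.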

There are programmatic ways to detect at run-time whether you are on a big-endian or a little-endian architecture.
A union can be used to detect endianness at run-time as follows:
#include <stdint.h>

int is_big_endian(void){
    union {
        uint32_t i;
        char c[4];
    } bint = {0x01020304};
    /* c[0] is the byte at the lowest address; it holds 0x01 only on a big-endian platform */
    return bint.c[0] == 1;
}

Alternatively, the pointer-casting trick can be used:
int is_big_endian(void){
    short int word = 0x0001;
    char *byte = (char *)&word;
    /* On a big-endian platform the lowest-addressed byte is the most significant one, i.e. 0 */
    return byte[0] == 0;
}
Afterwards, conversion routines can be used to change the endianness at run-time. Below is a routine that swaps the bytes of a 32-bit unsigned integer:
#include <stddef.h>
#include <stdint.h>

uint32_t swap_endian_u32(uint32_t u){
    union{
        uint32_t u;
        unsigned char u8[sizeof(uint32_t)];
    } source, dest;

    source.u = u;
    /* Copy the bytes in reverse order: first becomes last, and so on */
    for (size_t k = 0; k < sizeof(uint32_t); k++)
        dest.u8[k] = source.u8[sizeof(uint32_t) - k - 1];
    return dest.u;
}

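Detection and swapping can be combined into a single conversion that swaps only when the host needs it. The sketch below is one possible arrangement, not a definitive implementation: the names host_is_big_endian, bswap_u32 and host_to_be_u32 are hypothetical, and the swap uses shifts and masks as an alternative to the union-based approach.

```c
#include <stdint.h>

/* Returns 1 on a big-endian host, 0 on a little-endian one. */
int host_is_big_endian(void)
{
    const uint16_t probe = 0x0001;
    /* Inspect the byte at the lowest address of the probe. */
    return *(const unsigned char *)&probe == 0;
}

/* Shift-based byte swap of a 32-bit unsigned integer. */
uint32_t bswap_u32(uint32_t u)
{
    return (u >> 24) |
           ((u >> 8) & 0x0000FF00u) |
           ((u << 8) & 0x00FF0000u) |
           (u << 24);
}

/* Hypothetical helper: convert between host order and big-endian order.
 * The operation is its own inverse, so the same call converts both ways. */
uint32_t host_to_be_u32(uint32_t u)
{
    return host_is_big_endian() ? u : bswap_u32(u);
}
```

On either kind of host, the bytes of host_to_be_u32(0x16FAE50A) sit in memory as 16 FA E5 0A, which makes the value safe to write to storage or send to another microcontroller that expects big-endian data.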