Someone once asked on one of the xine mailing lists, “Is xine big endian or little endian?” Clearly, the person was confused but his heart was in the right place: He had heard about the endianness issue and that it affects machine portability somehow. Here is Multimedia Mike’s quick and easy guide to what you need to know about endianness and platform portability:
When the CPU interacts with the outside world, the CPU needs to worry about endianness.
Are you reading data from the disk or from the network or from stdin? Then you need to make sure the bytes are in the correct order. Are you writing data to the disk or the network or to stdout? Then you need to make sure the bytes are in the correct order. At all other times, you really don’t need to care. The CPU does the right thing and puts the data in whatever order it needs for internal arithmetic processing. Thus, if reading a big endian 16-bit number from the disk, read the first byte, shift it to bits 15-8 of a variable in memory, read the second byte, and logically OR it to the same variable at bits 7-0. In C notation:
/* read first_byte */
/* read second_byte */
variable = ( first_byte << 8 ) | second_byte;
After this, you don’t worry about the endianness of said variable unless you need to write the variable back out to the disk or the network and maybe stdout. The CPU does the right thing.
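Writing the variable back out is the same trick in reverse. Here is a minimal sketch, assuming the destination wants big endian order; the names put_be16() and write_be16() are made up for this example and do not come from xine or FFmpeg:

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: store a 16-bit value as two bytes,
 * most significant byte first (big endian). */
void put_be16(uint8_t *buf, uint16_t value)
{
    buf[0] = (uint8_t)(value >> 8);   /* bits 15-8 go out first */
    buf[1] = (uint8_t)(value & 0xFF); /* bits 7-0 go out second */
}

/* Hypothetical helper: write a 16-bit value to a stream in big
 * endian order. Returns 0 on success, -1 on a short write. */
int write_be16(FILE *f, uint16_t value)
{
    uint8_t buf[2];

    put_be16(buf, value);
    return fwrite(buf, 1, sizeof buf, f) == sizeof buf ? 0 : -1;
}
```

Because the bytes are pulled out of the variable with shifts and masks, the code never needs to know how the CPU stores the variable internally.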
What functions should you use to manipulate endianness? In xine and FFmpeg I added the BE_/LE_ macros. For example, BE_32() interprets the next 4 bytes of the parameter bytestream as a big endian 32-bit number and does the correct shifting. LE_32() does the same thing but for a little endian 32-bit number. These macros read each byte individually and perform the correct shifting, so they produce the same result regardless of the endianness of the CPU they run on.
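To give a rough idea of what such macros look like, here is a sketch written from the description above. These are illustrative definitions only, not copied from the xine or FFmpeg sources, and the real ones may differ in detail:

```c
#include <stdint.h>

/* Illustrative only: interpret 4 bytes at x as a big endian
 * 32-bit number, one byte at a time. */
#define BE_32(x) (((uint32_t)((const uint8_t *)(x))[0] << 24) | \
                  ((uint32_t)((const uint8_t *)(x))[1] << 16) | \
                  ((uint32_t)((const uint8_t *)(x))[2] <<  8) | \
                   (uint32_t)((const uint8_t *)(x))[3])

/* Illustrative only: the same thing for a little endian
 * 32-bit number, so the first byte is the least significant. */
#define LE_32(x) (((uint32_t)((const uint8_t *)(x))[3] << 24) | \
                  ((uint32_t)((const uint8_t *)(x))[2] << 16) | \
                  ((uint32_t)((const uint8_t *)(x))[1] <<  8) | \
                   (uint32_t)((const uint8_t *)(x))[0])
```

Given the bytes 0x12 0x34 0x56 0x78, BE_32() yields 0x12345678 and LE_32() yields 0x78563412 on any CPU. Each byte is widened to uint32_t before the shift so that no bits get lost on the way.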
If you need to write macros like this, please do not do any casting such as:
/* for big endian CPUs */
#define BE_32(x) *(uint32_t *)x
If the pointer is not aligned on a 4-byte boundary, this will cause one of several problems, depending on the CPU:
- operation will be slower than an aligned access but will still work (not a big deal)
- operation will cause a program fault (this happens on SPARC CPUs)
- CPU will take the liberty of aligning the address on a 4-byte boundary and then read the memory (perhaps the worst possible consequence, worse than the program crashing, since the program will obtain the wrong data; I have heard that certain CPUs do this but I am not sure which ones)
Also, please, please, please do not bother using the network-to-host and host-to-network endian conversion macros that come packaged with network APIs, at least not unless you are doing network programming. It is pure overkill to bring the networking API into a project just for endian facilities. Sure, every Unix-like system may have those headers but what happens when someone wants to use your nice, neat, seemingly isolated multimedia module on a unique embedded platform that has no networking facilities? Re-invent the macros or crib them from xine or FFmpeg.
Another big point regarding portability is processing structures. Just because a spec has a nice neat header structure layout with 16-bit width and 16-bit height and a few other fields does not mean you should try to read the structure directly into a pre-defined data structure with the same layout. For one thing, multi-byte numbers will make some endian assumption. Thus, to be portable, the numbers have to be swapped for one orientation or another after the read. But the bigger issue is that compilers have a tendency to pad data structures. Why? Often to make data access to individual data members faster. It’s a classic speed vs. size trade-off. Most compilers allow you to turn off such padding. But such directives tend to be specific to different compilers. Then, depending on how a particular compiler feels on a given day, it may or may not honor such directives. So just don’t do it, ’kay? Don’t read blocks of memory directly into data structures. Read a block of bytes and then sort out the bytes into data structures using endian macros. Let the compiler organize the data structures any way it chooses.
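Here is a small sketch of the read-bytes-then-sort-them-out approach. The header layout is invented for this example (16-bit width, 16-bit height, 32-bit frame count, all big endian, 8 bytes total), and so are the names video_header, rd_be16(), rd_be32() and parse_header():

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical on-disk header size: 2 + 2 + 4 bytes, no padding. */
#define HEADER_SIZE 8

/* In-memory structure; the compiler may pad or order its storage
 * however it likes, since it is never read from disk directly. */
struct video_header {
    uint16_t width;
    uint16_t height;
    uint32_t frame_count;
};

/* Byte-wise big endian readers (hypothetical helpers). */
static inline uint16_t rd_be16(const uint8_t *p)
{
    return (uint16_t)((p[0] << 8) | p[1]);
}

static inline uint32_t rd_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] <<  8) |  (uint32_t)p[3];
}

/* Sort a raw block of bytes into the structure field by field.
 * Returns 0 on success, -1 if the block is too short. */
int parse_header(const uint8_t *buf, size_t size, struct video_header *hdr)
{
    if (size < HEADER_SIZE)
        return -1;

    hdr->width       = rd_be16(buf + 0);
    hdr->height      = rd_be16(buf + 2);
    hdr->frame_count = rd_be32(buf + 4);
    return 0;
}
```

The offsets 0, 2 and 4 come straight from the (invented) file format, not from the compiler's idea of where the structure members live, so padding and host endianness never enter the picture.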
You’re already on your way to writing pain-free, portable code!