Many years ago, when my pet dinosaur was still alive, I lived through the painful transition from C on 16-bit machines to C on 32-bit machines. Many lessons from that era were incorporated into the C89 standard, which was written in almost the same time frame.
Years later, some folks are once again struggling, this time with the 32-bit to 64-bit transition. This time the C standard got there ahead of you: it has a nifty <stdint.h> include file and a whole bunch of useful standard types.
The embedded code base I am currently working with is riddled with 32-bit assumptions. First, they decided to define their own <wrong_types.h> (names changed to protect the guilty) with, you guessed it, typedefs and/or #defines for a whole bunch of symbols very nearly identical to the standard types in <stddef.h>, <stdint.h> and <stdbool.h>. And, in some cases, with incorrect implementations of standard types. Of course, compiling this code with a native 64-bit compiler, rather than the 32-bit cross compiler, has excrement exploding in all directions.
This article contains several portability lessons that folks seem to have forgotten (or, probably, are too young to have ever learned).
Lesson 0:
Why am I compiling it native when it is embedded code? Test Driven Development (TDD). It will save your sanity.
With a native build of (most of) the embedded code, I can put automated tests into the build system for anything that I can run without having to be on the target, such as (I kid you not) the BCD arithmetic library. Downloading the code to the target takes ~8 minutes, so running all those tests as part of the build is a way faster method of spotting regressions. Also, my desktop build host is an order of magnitude faster than the target, and quad core.
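To make that concrete, here is a minimal sketch of the kind of host-side test I mean. The function bcd_add8 and its test are made up for illustration (they are not from the code base in question); the point is only that pure arithmetic like this needs no target hardware to test.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical library routine: add two packed-BCD bytes (two decimal
 * digits each). Returns the low two digits and sets *carry to 1 if the
 * sum exceeded 99, else 0. */
uint8_t bcd_add8(uint8_t a, uint8_t b, unsigned *carry)
{
    unsigned lo = (a & 0x0Fu) + (b & 0x0Fu);
    unsigned hi = ((unsigned)a >> 4) + ((unsigned)b >> 4);

    if (lo > 9) {           /* decimal carry out of the low digit */
        lo -= 10;
        hi += 1;
    }
    *carry = 0;
    if (hi > 9) {           /* decimal carry out of the high digit */
        hi -= 10;
        *carry = 1;
    }
    return (uint8_t)((hi << 4) | lo);
}

/* Host-side test: runs on the build machine as part of every build. */
void test_bcd_add8(void)
{
    unsigned carry;

    assert(bcd_add8(0x19, 0x03, &carry) == 0x22 && carry == 0);
    assert(bcd_add8(0x45, 0x54, &carry) == 0x99 && carry == 0);
    assert(bcd_add8(0x99, 0x01, &carry) == 0x00 && carry == 1);
}
```

Hook test_bcd_add8 (and its friends) into a small test runner that the build executes, and a regression shows up in seconds rather than after an 8 minute download.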
Lesson 1:
Do not create a file called <wrong_types.h> and try to second-guess the C compiler. You will always and forever be wrong, plus your code is not reusable if you do this. This is a non-portable highway straight to hell. Always build embedded code on at least two systems with different word sizes (e.g. 32-bit embedded target and 64-bit build host). That way assumptions will explode at compile time or automated test time, not in front of a customer. If you can arrange things so that the two systems you are testing on also have different endianness, that’s also good for exposing assumptions.
The idea is for the code to compile correctly on both platforms, with no code changes required. Zero edits. Build both, every day, with your Continuous Integration (CI) build server. It is possible, but you have to use the types the compiler provides to tell you how big everything is.
Use <stddef.h>, <stdint.h> and <stdbool.h> instead of your own types header file. They are standards compliant, portable, and the standard’s authors have thought more deeply about the semantics than you have. Never heard of them? Find a copy of the standard and read it, now. Google for “ansi-c 2011 n1570 pdf” (N1570 is the final public draft of C11); it should be on the first page.
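As a rough sketch of what using the standard types looks like in practice, the fragment below decodes a 32-bit big-endian field one byte at a time, so it gives the same answer on a 32-bit or 64-bit host of either endianness. The names load_be32 and has_room_for are invented for this example, and the _Static_assert assumes a C11 compiler.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* State the assumption out loud so it fails at compile time, not later. */
_Static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");

/* Decode a big-endian 32-bit value byte by byte. No pointer casts, so
 * the host's endianness and alignment rules never enter into it. */
uint32_t load_be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |
            (uint32_t)p[3];
}

/* True if a buffer of buf_len bytes can hold count 32-bit fields.
 * Sizes are size_t throughout; dividing avoids overflow in count * 4. */
bool has_room_for(size_t buf_len, size_t count)
{
    return count <= buf_len / sizeof(uint32_t);
}
```

Nothing here needs a home-grown u32 or BOOL typedef, and nothing changes when the word size does.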
Oh, and don’t replace libc.a with your own implementation, either. The stuff in the libc.a provided by the compiler vendor is probably way better than yours, and will also be standards compliant. (Did you know strcmp is required by the standard to use unsigned chars for the comparison? Didn’t think so.) Don’t replace libgcc.a with your own implementation, either. When (not if) the compiler is upgraded, you don’t want all the bug fixes and security fixes in their libc.a and libgcc.a to pass you by.
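Since I brought up strcmp, here is a small sketch of the trap. Both functions are hypothetical hand-rolled comparisons, not anybody's real libc; the only difference between them is whether the final subtraction treats the bytes as unsigned char (what the standard requires) or as plain char (whose signedness is up to the implementation).

```c
#include <string.h>

/* Conforming ordering: the differing bytes are compared as unsigned char. */
int compare_as_unsigned(const char *a, const char *b)
{
    while (*a != '\0' && *a == *b) {
        a++;
        b++;
    }
    return (int)(unsigned char)*a - (int)(unsigned char)*b;
}

/* The tempting version: compares plain char. Where char is signed,
 * bytes of 0x80 and above sort before ASCII, which disagrees with strcmp. */
int compare_as_plain_char(const char *a, const char *b)
{
    while (*a != '\0' && *a == *b) {
        a++;
        b++;
    }
    return *a - *b;
}

/* Reduce a comparison result to -1, 0 or 1. */
static int sign_of(int v)
{
    return (v > 0) - (v < 0);
}

/* True if the conforming version orders a and b the same way libc does. */
int agrees_with_libc(const char *a, const char *b)
{
    return sign_of(compare_as_unsigned(a, b)) == sign_of(strcmp(a, b));
}
```

Feed "\x80" and "\x7f" to both and see which one matches your vendor's strcmp. The plain char version gives a different answer depending on the platform, which is exactly the sort of thing a second build host exposes.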
Lesson 2:
Always, always, always compile with at least -Werror -Wall -Wextra, and then fix all the problems revealed. If they aren’t bugs now, they will be when the clueless maintenance guys get done with it. (Marooned on a proprietary compiler? Just use GCC in parallel, it’s free.)
Don’t just add more casts to make the warnings go away. Actually think about what the compiler is telling you, and then fix the underlying flaw in how you are doing things. The third and fourth lessons are some specific instances of this.
“But they’re just warnings.” Bzzt, wrong. Spot the defect in the following code:
(volatile long)x = bigness;
GCC has been warning about this for almost 20 years. Since GCC 4.0 it has been a hard error, and about time, too. Still can’t figure it out? The C standard (since 1989, 22+ years ago) says that the result of a cast is never an lvalue (“What’s an lvalue?” An expression that may appear on the left of an assignment, a subset of all the possible expressions that can appear on the right hand side). The code should read
*(volatile long *)&x = bigness;
Even this re-written version probably still contains a design flaw. Unless you are talking to a device register, and even then it is questionable.
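If you really are talking to a device register, one way to keep the volatile access honest is to put it behind small accessors that take a volatile-qualified pointer, so the cast lives in exactly one place (or nowhere). This is only a sketch; reg_write32 and reg_read32 are names I made up, not part of any real HAL.

```c
#include <stdint.h>

/* Store through a volatile-qualified pointer: the compiler must emit
 * the write and may not merge or reorder it with other volatile accesses. */
static inline void reg_write32(volatile uint32_t *reg, uint32_t value)
{
    *reg = value;
}

/* Load through a volatile-qualified pointer: the read is always performed. */
static inline uint32_t reg_read32(const volatile uint32_t *reg)
{
    return *reg;
}
```

In a native test build the "register" can be an ordinary uint32_t variable, which means the code that drives it can be tested on the build host.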
Lesson 3:
You can’t fit a pointer into a long.
void *p = blah;
long fake_ptr = (long)p;
Use intptr_t or uintptr_t for this job. Better yet, ask yourself why you are pushing an integer type around, not a pointer type. This often reveals design flaws, dinosaur sized ones. BTW, using casts to make the compiler shut up, like this…
void *p = blah;
long fake_ptr = (long)(intptr_t)p;
simply obfuscates the bug. The compiler is trying to tell you that your conversion will, one day, if not already, lose bits of representation. When you
p = (void *)fake_ptr;
the pointer that went in isn’t guaranteed to be the pointer that comes out. “But this will only ever be used on a 32-bit machine.” Bzzzt, wrong. I’m still scraping excrement off the walls.
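For the rare case where an integer really is needed, a minimal sketch of the round trip through uintptr_t looks like this. The helper names are invented. Note that uintptr_t is technically optional in the standard, though every mainstream hosted compiler I know of provides it.

```c
#include <stdint.h>

/* Convert an object pointer to an integer wide enough to hold it. */
uintptr_t ptr_to_bits(void *p)
{
    return (uintptr_t)p;
}

/* Convert back. The standard promises that a valid void * converted to
 * uintptr_t and back compares equal to the original pointer. */
void *bits_to_ptr(uintptr_t bits)
{
    return (void *)bits;
}
```

No long anywhere, so it makes no difference whether long is 32 or 64 bits on the platform of the day.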
Lesson 4:
You can’t fit what sizeof returns into an int.
int len = sizeof(something);
Use size_t for this job. It isn’t guaranteed to fit into a long, either. And, just to be clear, size_t is guaranteed to be unsigned. Use ssize_t (POSIX) or ptrdiff_t (standard C, from <stddef.h>) for the signed variety, although any modern version of GCC is going to give zillions of signed-vs-unsigned warnings, and for good reason: the type promotion rules around these cases are almost certainly not what you think they are.
“What do I care? Anyone declaring a 2GB object is an idiot.” Bzzzt, wrong. Try writing a malloc implementation one day. Or, worse, reading one that assumes int32_t is big enough to hold the size of the malloc arena without losing address bits and without going negative.
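To show why those signed-vs-unsigned warnings matter, here is a sketch that spells out what a mixed comparison actually does. It assumes the usual situation where size_t is at least as wide as int; the function names are hypothetical. The explicit cast in the first function is what the compiler does implicitly when you write n < cap with an int n and a size_t cap.

```c
#include <stddef.h>

/* What "n < cap" really computes: n is converted to size_t first, so a
 * negative n becomes an enormous positive value. */
int less_than_as_compiled(int n, size_t cap)
{
    return (size_t)n < cap;
}

/* What the author of "n < cap" almost certainly meant. */
int less_than_as_intended(int n, size_t cap)
{
    return n < 0 || (size_t)n < cap;
}

/* Element count of a buffer, kept in size_t from start to finish. */
size_t element_count(size_t total_bytes, size_t element_bytes)
{
    return element_bytes == 0 ? 0 : total_bytes / element_bytes;
}
```

With n equal to -1 the two comparison functions disagree, and that disagreement is what the warning is trying to tell you about.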
Lesson 5:
(I can’t believe this one is still with us.) Do not use sprintf or vsprintf, ever, for anything. Always, always, always use snprintf or vsnprintf. This usually means fixing all of your APIs so that whenever a string/buffer pointer argument is passed, it is immediately followed by a size_t argument giving the size of that buffer. (snprintf has been in the C standard since C99. If your compiler vendor still doesn’t provide a standards compliant one, upgrade; to Linux if necessary, it’s free.)
Also, if you are tempted to write your own strncpy that doesn’t have the ugly strncpy semantics (look it up), don’t. Use snprintf instead (take your time, think it through, it wasn’t obvious to me, either).
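Here is roughly what I mean, as a sketch rather than gospel. The wrapper name copy_string and its return convention are my own invention; the snprintf behavior it relies on (always NUL-terminates when the size is non-zero, returns the length it wanted to write) is what C99 specifies.

```c
#include <stddef.h>
#include <stdio.h>

/* Copy src into dst, which has room for dst_size bytes including the
 * terminating NUL. Returns 0 on success, -1 if the string was truncated
 * or the arguments were unusable. Unlike strncpy, dst is always
 * NUL-terminated on return (when dst_size > 0) and is never zero-padded. */
int copy_string(char *dst, size_t dst_size, const char *src)
{
    int n;

    if (dst == NULL || src == NULL || dst_size == 0)
        return -1;

    n = snprintf(dst, dst_size, "%s", src);

    /* n is the length snprintf wanted to write, excluding the NUL. */
    if (n < 0 || (size_t)n >= dst_size)
        return -1;
    return 0;
}
```

Note the shape of the API: the buffer pointer is immediately followed by its size, and the caller passes sizeof buf rather than a magic number.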
Lesson 6:
Use “man gcc” and read what -Wshadow does. Then turn it on. And then, for any significantly old’n’large code base, be horrified at what it tells you. The -Werror option is your friend.
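In case you have never been bitten, here is a made-up example of the kind of bug -Wshadow finds. Both functions are hypothetical; the first one compiles cleanly without -Wshadow and quietly returns the wrong answer.

```c
#include <stddef.h>

/* Buggy: the inner "err" shadows the outer one, so the value found in
 * the loop never reaches the return statement. -Wshadow flags the
 * inner declaration. */
int first_error_shadowed(const int *codes, size_t n, int *seen)
{
    int err = 0;

    for (size_t i = 0; i < n; i++) {
        if (codes[i] != 0) {
            int err = codes[i];     /* shadows the outer err */
            *seen = err;
            break;
        }
    }
    return err;                     /* the outer err, still 0 */
}

/* Fixed: one variable, one meaning. */
int first_error_fixed(const int *codes, size_t n)
{
    int err = 0;

    for (size_t i = 0; i < n; i++) {
        if (codes[i] != 0) {
            err = codes[i];
            break;
        }
    }
    return err;
}
```

One stray "int" is the entire difference, which is why you want the compiler looking for it instead of you.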
Here endeth the lesson. (For today, anyway.)