Heroic Defender of the Stack

Problem Statement

I have been investigating stack smashing and its countermeasures (stack smashing protection, or SSP). Briefly, stack smashing occurs when a function allocates a fixed-size array on the stack and writes past the end of it, onto other local variables and eventually onto other functions’ stack frames. When it comes time to return from the function, the return address has been corrupted and the program ends up some place it really shouldn’t. In the best case, the program just crashes; in the worst case, a malicious party crafts code to exploit this malfunction.

Further, debugging such a problem is especially obnoxious because by the time the program has crashed, it has already trashed any record (on the stack) of how it got into the errant state.

Preventative Countermeasure

GCC has had SSP since version 4.1. The compiler inserts SSP as additional code when the -fstack-protector command line switch is specified. Implementation-wise, SSP inserts a special value (the literature refers to this as the ‘canary’, as in “canary in the coal mine”) at the top of the stack frame when entering the function, plus code before leaving the function to make sure the canary didn’t get stepped on. If something happens to the canary, the program is immediately aborted with a message to stderr about what happened. Further, gcc’s man page on my Ubuntu machine proudly trumpets that this functionality has been enabled by default ever since Ubuntu 6.10.

And that’s really all there is to it. Your code is safe from stack smashing by default. Or so the hand-wavy documentation would have you believe.

Not exactly

Exercising the SSP

I wanted to see the SSP in action to make sure it was a real thing. So I wrote some code that smashes the stack in pretty brazen ways so that I could reasonably expect to trigger the SSP (see later in this post for the code). Here’s what I learned that wasn’t in any documentation:

SSP is only emitted for functions that have fixed-size local arrays of 8-bit data (i.e., [unsigned] chars). If you have local arrays of other data types (like, say, 32-bit ints), those are still fair game for stack smashing.

Evaluating the security vs. speed/code size trade-offs, it makes sense that the compiler wouldn’t apply this protection everywhere (I can only muse about how my optimization-obsessive multimedia hacking colleagues would absolutely freak out if this code were unilaterally added to all functions). So why are only local char arrays deemed to be “vulnerable objects” (the wording that the gcc man page uses)? A security hacking colleague suggested that this is probably due to the fact that the kind of data which poses the highest risk is arrays of 8-bit input data from, e.g., network sources.

The gcc man page also lists an option -fstack-protector-all that is supposed to protect all functions. The man page’s definition of “all functions” perhaps differs from my own since invoking the option does not differ in result from plain, vanilla -fstack-protector.

The Valgrind Connection

“Memory trouble? Run Valgrind!” That may as well be Valgrind’s marketing slogan. Indeed, it’s the go-to utility for finding troublesome memory-related problems and has saved me on a number of occasions. However, it must be noted that it is useless for debugging this type of problem. If you understand how Valgrind works, this makes perfect sense. Valgrind operates by watching all memory accesses and ensuring that the program is only accessing memory to which it has privileges. In the stack smashing scenario, the program is fully allowed to write to that stack space; after all, the program recently, legitimately pushed that return address onto the stack when calling the errant, stack smashing function.

Valgrind embodies a suite of tools. My idea for an addition to this suite would be a mechanism which tracks return addresses every time a call instruction is encountered. The tool could track the return addresses in a separate stack data structure, though this might have some thorny consequences for some more unusual program flows. Instead, it might track them in some kind of hash/dictionary data structure and warn the programmer whenever a ‘ret’ instruction is returning to an address that isn’t in the dictionary.

Simple Stack Smashing Code

Here’s the code I wrote to test exactly how SSP gets invoked in gcc. Compile with ‘gcc -g -O0 -Wall -fstack-protector-all -Wstack-protector stack-fun.c -o stack-fun’.

stack-fun.c:

The above incarnation should just produce the traditional “Segmentation fault”. However, uncommenting and executing stack_smasher8() in favor of stack_smasher32() should result in “*** stack smashing detected ***: ./stack-fun terminated”, followed by the venerable “Segmentation fault”.

As indicated in the comments for stack_smasher32(), it’s possible to trick the compiler into emitting SSP for a function by inserting an array of at least 8 bytes (any less and SSP won’t emit, as documented, unless gcc’s ssp-buffer-size parameter is tweaked). This has to be compiled with no optimization at all (-O0) or else the compiler will (quite justifiably) optimize away the unused buffer and omit SSP.

For reference, I ran my tests on Ubuntu 10.04.1 with gcc 4.4.3 compiling the code for both x86_32 and x86_64.

3 thoughts on “Heroic Defender of the Stack”

  1. Reimar

    I expect that the valgrind callgrind component should be reasonably easy to extend for detecting rets to the wrong place (if it doesn’t already).
    It could (never checked if it can) also detect accesses that are simply outside the stack frame even when the accesses are not sequential.
    It’s also possible to imagine special call/ret instructions that handle this kind of thing with minimal effort, but I fear for x86 that chance has passed with x86-64, almost no-one will be willing to use optional instructions for such central stuff.
