Naive x86 Re-targeter | Breaking Eggs And Making Omelettes

Here are the complete, do-it-yourself instructions and code for that re-targeting experiment. First, the files:

unnamed-re-project.py, which is the re-targeter, and outputs C code that relies on the support files:
asm2c.c and
asm2c.h; these are compiled along with
testbench.c to demonstrate the program
function.txt contains the disassembly that the re-targeter is hardcoded to process

The re-targeter wants to process code like that found in function.txt. This is the disassembly format output by my favorite Win32 PE disassembler, Sang Cho’s Disassembler. I knew in advance that the function expects 4 parameters, and that fact is hardcoded in the re-targeter along with the file name. The testbench.c file contains the opcodes for the original function and allows the programmer to switch between the original opcodes and the re-targeted code for verification.

To run this experiment:

download all 5 files into one (Unix, x86) directory
./unnamed-re-project.py > bitreader.c
gcc *.c -o testbench

The testbench.c program simulates the data structure that the re-targeted bitreading function expects, along with a bitstream that looks like 0xA5, 0x5A repeated. Running the program should result in:

 12 bits = A55
  4 bits = A
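
For readers who want to see why that bitstream yields those two values, here is a minimal sketch of a most-significant-bit-first bit reader in C. It is not the re-targeted function or the structure from testbench.c; the `BitReader` type and `read_bits` name are hypothetical and chosen only for illustration. The assumption is simply that bits are consumed from the top of each byte, which is consistent with the expected output above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reader state; the real structure in testbench.c may differ. */
typedef struct {
    const uint8_t *buf;  /* bitstream bytes */
    size_t size;         /* number of bytes in buf */
    size_t bit_pos;      /* index of the next bit, counted from the start */
} BitReader;

/* Read nbits (1..32), most significant bit first.
 * Bits past the end of the buffer read as 0. */
uint32_t read_bits(BitReader *br, unsigned nbits)
{
    uint32_t value = 0;

    assert(nbits >= 1 && nbits <= 32);
    while (nbits--) {
        size_t byte = br->bit_pos >> 3;
        unsigned shift = 7 - (unsigned)(br->bit_pos & 7);
        unsigned bit = 0;

        if (byte < br->size)
            bit = (br->buf[byte] >> shift) & 1u;
        value = (value << 1) | bit;
        br->bit_pos++;
    }
    return value;
}
```

With a buffer holding 0xA5, 0x5A and `bit_pos` starting at 0, `read_bits(&br, 12)` takes all of 0xA5 plus the top nibble of 0x5A, giving 0xA55, and the following `read_bits(&br, 4)` takes the remaining nibble, giving 0xA.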

If compiling on x86_32, you can switch the “#if 0” to “#if 1” in order to test the original opcodes.

I took a stab at making the re-targeter portable to a big endian platform, as brainstormed last night. However, I soon realized what my hastily scrawled, year-old note about that task’s difficulty must have warned about: the outlined approach works for source arguments, but is not as straightforward for destination arguments.

I don’t have immediate access to an x86_64/Linux environment, but I would like to know if the re-targeted code compiles on that platform. The entire point of a re-targeter is to run code on a different platform, though I suppose alternate operating systems on the same CPU architecture is another interpretation.

So, of course the re-targeter is super-naive in its current form. It only implements enough instructions to handle that one function in function.txt. It does not handle branching at all. In order to do so, each of the instruction emitters would also need to output the code that adjusts the appropriate flags after each arithmetic instruction. Then, a branch would map to a simple goto with the correct address label.
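
To make that idea concrete, here is a rough sketch of what flag-aware emitted C could look like. Everything in it is an assumption for illustration: the `Cpu` struct, the `emit_add`/`emit_sub` helpers, the `addr_401000` label, and the little assembly loop in the comment are all hypothetical, and none of it comes from asm2c.c or the re-targeter's actual output. Only the zero, sign, and carry flags are modeled; the overflow flag is left out to keep the sketch short.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical emulated state: two registers and three flags. */
typedef struct {
    uint32_t eax, ecx;
    int zf, sf, cf;
} Cpu;

/* add dst, src: store the result and update ZF, SF, CF. */
void emit_add(Cpu *c, uint32_t *dst, uint32_t src)
{
    uint32_t r = *dst + src;

    c->cf = r < *dst;          /* unsigned wraparound means carry out */
    c->zf = (r == 0);
    c->sf = (int)(r >> 31);
    *dst = r;
}

/* sub dst, src: store the result and update ZF, SF, CF. */
void emit_sub(Cpu *c, uint32_t *dst, uint32_t src)
{
    uint32_t r = *dst - src;

    c->cf = *dst < src;        /* borrow */
    c->zf = (r == 0);
    c->sf = (int)(r >> 31);
    *dst = r;
}

/* Hand-written stand-in for re-targeted output of:
 *       mov eax, 0
 * top:  add eax, ecx
 *       sub ecx, 1
 *       jnz top
 */
uint32_t sum_down(uint32_t n)
{
    Cpu c = { 0, n, 0, 0, 0 };

    assert(n > 0);             /* n == 0 would wrap and loop 2^32 times */
    c.eax = 0;
addr_401000:
    emit_add(&c, &c.eax, c.ecx);
    emit_sub(&c, &c.ecx, 1);
    if (!c.zf)
        goto addr_401000;      /* jnz becomes a conditional goto */
    return c.eax;
}
```

Here `sum_down(4)` adds 4 + 3 + 2 + 1. The point of the sketch is only that once every arithmetic emitter records its flags, a conditional jump reduces to an `if` on the relevant flag followed by a `goto` to a label derived from the target address.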

3 thoughts on “Naive x86 Re-targeter”

astrange November 3, 2007 at 11:52 pm

It compiles on x86-64 but just crashes (it puts stack_segment above the first 32 bits of the address space). You’d have to use uintptr_t everywhere or ptrdiff instead of uint32.

Also, I think you’re using STACK_TOP wrong; the stack top is actually esp + 4*STACK_TOP. Or sizeof(uintptr_t), if you do that.

GCC is surprisingly bad at compiling this, I guess they didn’t expect all the unnecessary labels and a completely visible-to-the-compiler program that uses globals.

Multimedia Mike (Post author) November 4, 2007 at 5:16 am

Ah, another unforeseen problem (stack). Thanks for the test.

igorsk November 5, 2007 at 6:41 am

In slightly related news, IDA 4.9 has been released as freeware:
http://www.datarescue.com/idabase/idadownfreeware.htm
Only x86, but it does include an integrated win32 debugger.

P.S. If you’d like to see Hex-Rays output for a particular piece of code, just send me the binary and indicate the part you’d like to check. It doesn’t process FPU instructions at the moment, however.