I am getting more ideas in my head about how I want to put this BFRE program together.
One thing that I always forget about with these RE exercises is those CPU flags. In order to correctly virtualize the underlying i386 machine, the flags register must hold the correct bits in the correct spots. After all, some program may wish to push the flags onto the stack, pop them into a register, and test them manually. “Why?” is anyone’s guess. So I need to track down those mappings again. Let’s see, I think I had to use those in a program once before… ah! Found them again:
#define V_FLAG 0x0800 // overflow flag
#define S_FLAG 0x0080 // sign flag
#define Z_FLAG 0x0040 // zero flag
#define A_FLAG 0x0010 // auxiliary carry flag
#define P_FLAG 0x0004 // parity flag
#define C_FLAG 0x0001 // carry flag
Quick update: the remaining 3 flags occupy the following positions:
#define D_FLAG 0x0400 // direction flag
#define I_FLAG 0x0200 // interrupt flag
#define T_FLAG 0x0100 // trap flag
This comes from an old experiment I performed called the “Execution Profiler” or execprof. This is where I had essentially re-invented the Linux ptrace facilities before I had learned about them. I should write an article about that experiment, especially since I think it gathers some useful intelligence (and a lot of it).
Of the remaining 3 CPU flags (from a stock x86), the interrupt and trap flags are irrelevant in program execution. The location of the direction flag would also be useful to know in case any string instructions came around (most likely large memory shuffles or clearing a memory block).
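To make the direction flag's role concrete, here is a minimal sketch of how a "rep movsb" memory shuffle might be emulated. The function name rep_movsb and the flat byte array standing in for memory are assumptions made for illustration, not part of BFRE:

```c
#include <stdint.h>

#define D_FLAG 0x0400 /* direction flag */

/* Hypothetical emulation of "rep movsb". esi and edi are treated as
   offsets into a flat byte array that stands in for memory. */
static void rep_movsb(uint8_t *mem, uint32_t *esi, uint32_t *edi,
                      uint32_t *ecx, uint32_t flags)
{
    /* DF clear: walk upward; DF set: walk downward.
       (uint32_t)-1 wraps around, so adding it subtracts 1. */
    uint32_t step = (flags & D_FLAG) ? (uint32_t)-1 : 1u;

    while (*ecx != 0) {
        mem[*edi] = mem[*esi];
        *esi += step;
        *edi += step;
        (*ecx)--;
    }
}
```

The only thing the flag changes is the sign of the step, which is why its bit position matters as soon as a string instruction shows up.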
The tougher part about the flags situation involves doing all the right things based on any given opcode. Basically, arithmetic opcodes will monkey with the first 6 flags #define’d above. The add and sub instructions will modify all 6. The and, or, and test instructions clear the overflow and carry flags, modify the sign, zero, and parity flags, and leave auxiliary carry in an undefined state, according to my handy Borland Turbo Assembler v4.0 reference manual (purchased in 1994 and still does the job).
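As a rough sketch of the logical-instruction rules just described, the following helper applies them for a 32-bit and. The name and32_flags is hypothetical, and leaving the auxiliary carry bit untouched is just one possible choice for a flag the manual calls undefined:

```c
#include <stdint.h>

#define V_FLAG 0x0800 /* overflow flag */
#define S_FLAG 0x0080 /* sign flag */
#define Z_FLAG 0x0040 /* zero flag */
#define P_FLAG 0x0004 /* parity flag */
#define C_FLAG 0x0001 /* carry flag */

/* Hypothetical helper: returns dest & src and updates *flags the way
   a 32-bit "and" would. Auxiliary carry is left as it was (assumption). */
static uint32_t and32_flags(uint32_t dest, uint32_t src, uint32_t *flags)
{
    uint32_t result = dest & src;
    /* overflow and carry are always cleared; sign, zero, and parity
       are cleared here and recomputed below */
    uint32_t f = *flags & ~(uint32_t)(V_FLAG | C_FLAG | S_FLAG | Z_FLAG | P_FLAG);
    uint32_t low = result & 0xFF;

    if (result == 0) f |= Z_FLAG;
    if (result & 0x80000000u) f |= S_FLAG;

    /* parity looks only at the low byte: fold it down to one bit,
       then set the flag when the count of 1 bits is even */
    low ^= low >> 4;
    low ^= low >> 2;
    low ^= low >> 1;
    if (!(low & 1)) f |= P_FLAG;

    *flags = f;
    return result;
}
```

The or and test cases would differ only in the operation on the first line of the body (and test would discard the result).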
What to do about these 6 flags? The virtual processor must act faithfully as a real processor would or else correct results cannot be guaranteed. So there will be up to 6 compare-assign constructs before or after many arithmetic operations. Obviously, I am not concerned about speed for this exercise. But I am worried about cluttering up code that is already going to be hard enough to read with GOTOs. Let’s examine the following instruction:
add dword[edi+04], esi
First, BFRE has to recognize the instruction (add) and take apart the source operand (esi) and destination operand (*(unsigned int *)(edi+4)). The effective operation of this instruction is:
*(unsigned int *)(edi+4) += esi;
Then, 6 CPU flags must be modified based on the result. However, some of the flags are based on information that is not expressed by the result itself. For example, the sum itself does not express whether the addition overflowed the capacity of the register. Thus, the carry flag must be set based on an evaluation of the operands.
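To show how much bookkeeping a single add implies, here is a minimal sketch that computes all 6 flags for a 32-bit addition. The helper name add32_flags and the ARITH_FLAGS mask are hypothetical, and the code assumes operands are already held in unsigned 32-bit variables:

```c
#include <stdint.h>

#define V_FLAG 0x0800 /* overflow flag */
#define S_FLAG 0x0080 /* sign flag */
#define Z_FLAG 0x0040 /* zero flag */
#define A_FLAG 0x0010 /* auxiliary carry flag */
#define P_FLAG 0x0004 /* parity flag */
#define C_FLAG 0x0001 /* carry flag */

#define ARITH_FLAGS (V_FLAG | S_FLAG | Z_FLAG | A_FLAG | P_FLAG | C_FLAG)

/* Hypothetical helper: returns dest + src and updates the 6 arithmetic
   flags in *flags; all other bits are left alone. */
static uint32_t add32_flags(uint32_t dest, uint32_t src, uint32_t *flags)
{
    uint32_t result = dest + src;
    uint32_t f = *flags & ~(uint32_t)ARITH_FLAGS;
    uint32_t low = result & 0xFF;

    /* carry: the unsigned sum wrapped, so it is smaller than an operand */
    if (result < dest) f |= C_FLAG;
    /* overflow: operands share a sign bit but the result's differs */
    if (~(dest ^ src) & (dest ^ result) & 0x80000000u) f |= V_FLAG;
    /* auxiliary carry: a carry went from bit 3 into bit 4 */
    if ((dest ^ src ^ result) & 0x10) f |= A_FLAG;
    if (result == 0) f |= Z_FLAG;
    if (result & 0x80000000u) f |= S_FLAG;

    /* parity: set when the low byte has an even number of 1 bits */
    low ^= low >> 4;
    low ^= low >> 2;
    low ^= low >> 1;
    if (!(low & 1)) f |= P_FLAG;

    *flags = f;
    return result;
}
```

Note that carry, overflow, and auxiliary carry all need at least one of the original operands; only zero, sign, and parity can be derived from the result alone.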
All I am saying is that this could bulk up the code. A cleaner way to do it might be to write a macro along the lines of:
// WordPress has trouble saving backslash characters
#define ASM_ADD(dest_op, source_op)
  flags = ((int64_t)(dest_op) + (int64_t)(source_op) > 0xFFFFFFFF) ? (flags | C_FLAG) : (flags & ~C_FLAG);
  (dest_op) += (source_op);
  flags = !(dest_op) ? (flags | Z_FLAG) : (flags & ~Z_FLAG);
  ... and so on and so forth for the other 4 affected flags...

then:

ASM_ADD(*(unsigned int *)(edi+4), esi);
Subsequent program phases could make it a point to correctly reduce that macro to a simpler addition once a little more intelligence has been uncovered about the program and it is no longer desirable to have a C program that operates 100% as the equivalent ASM program did.