I sometimes hypothesize about reverse engineering code compiled for alternate (i.e. non-x86) CPU architectures. It makes one question why so much effort is focused on x86 RE (to which the simple and immediate answer is: because all the interesting code is compiled for the x86 architecture). Maybe I’m just enamored of how neat RISC code tends to be, with typical architectures featuring fixed 32-bit instruction words. Writing a disassembler for such an architecture obviously involves only a fraction of the complexity of a decent x86 disassembler. Fortunately, the GNU binutils take care of the disassembly details already (I recently posted a Wiki page on using objdump, even cross-compiling for non-native architectures). Here is some representative disassembly from a PowerPC ELF binary, for those who have never been exposed:
16b40: 80 e1 01 14 lwz r7,276(r1)
16b44: 7c 09 3a 14 add r0,r9,r7
16b48: 7d 3e 00 ae lbzx r9,r30,r0
16b4c: 55 20 e1 3e rlwinm r0,r9,28,4,31
16b50: 48 00 00 08 b 16b58
16b54: 38 00 00 0f li r0,15
16b58: 2c 0b 00 0f cmpwi r11,15
16b5c: 7c 04 03 78 mr r4,r0
16b60: 40 82 00 54 bne- 16bb4
16b64: 88 19 00 02 lbz r0,2(r25)
Quite a change from the typical x86 slop. Though I sometimes wonder what the ‘reduced’ in reduced instruction set computer (RISC) is really supposed to mean. It definitely doesn’t indicate reduced functionality for individual instructions. I looked up that rlwinm instruction: Rotate Left Word Immediate Then AND with Mask. I started to wonder if it would be simpler to compose an assembly re-targeter for a RISC CPU until I started reading up on this instruction.
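To make that rlwinm instruction a little less mysterious, here is a minimal Python sketch of what it computes, along with a decoder for the instruction word 55 20 e1 3e from the listing above. The function names (rotl32, ppc_mask, rlwinm, decode_rlwinm) are made up for illustration; the field layout follows the PowerPC M-form encoding (primary opcode 21), with IBM bit numbering where bit 0 is the most significant bit.

```python
def rotl32(value, shift):
    """Rotate a 32-bit value left by `shift` bits."""
    shift &= 31
    value &= 0xFFFFFFFF
    return ((value << shift) | (value >> (32 - shift))) & 0xFFFFFFFF


def ppc_mask(mb, me):
    """Build the MB..ME mask using IBM bit numbering (bit 0 is the MSB)."""
    from_mb = 0xFFFFFFFF >> mb                        # ones from bit mb to bit 31
    up_to_me = (0xFFFFFFFF << (31 - me)) & 0xFFFFFFFF  # ones from bit 0 to bit me
    # When mb > me, the mask wraps around the ends of the word.
    return (from_mb & up_to_me) if mb <= me else (from_mb | up_to_me)


def rlwinm(rs_value, sh, mb, me):
    """Rotate Left Word Immediate Then AND with Mask."""
    return rotl32(rs_value, sh) & ppc_mask(mb, me)


def decode_rlwinm(word):
    """Split a 32-bit M-form instruction word into its fields."""
    if word >> 26 != 21:
        raise ValueError("not an rlwinm instruction word")
    return {
        "rs": (word >> 21) & 31,  # source register
        "ra": (word >> 16) & 31,  # destination register
        "sh": (word >> 11) & 31,  # rotate amount
        "mb": (word >> 6) & 31,   # mask begin
        "me": (word >> 1) & 31,   # mask end
        "rc": word & 1,           # record bit (the "." suffix)
    }


if __name__ == "__main__":
    print(decode_rlwinm(0x5520E13E))
    print(hex(rlwinm(0x12345678, 28, 4, 31)))
```

With sh=28, mb=4, me=31, the rotate-and-mask works out to a plain logical shift right by 4, so rlwinm r0,r9,28,4,31 is just extracting the upper nibble of the byte loaded by the preceding lbzx.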
And here’s some MIPS RISC code:
20157c: 84820002 lh v0,2(a0)
201580: 2484000a addiu a0,a0,10
201584: 44820000 mtc1 v0,$f0
201588: 46800020 cvt.s.w $f0,$f0
20158c: 46010002 mul.s $f0,$f0,$f1
201590: e4600000 swc1 $f0,0(v1)
201594: 0501fff2 bgez t0,0x201560
201598: 24630008 addiu v1,v1,8
20159c: 1000000a b 0x2015c8
2015a0: 3c020000 lui v0,0x0
As memory serves, MIPS CPUs give you the added fun of tracking the CPU pipeline in your head. That is, an arithmetic operation from one instruction may not have completed by the time the next instruction, which happens to operate on the same register, executes. The compiler was specifically counting on that, and you need to count on it as well during your RE efforts.
You’re probably thinking of delayed branching with the MIPS. In your snippet, the addiu instruction following the bgez conditional jump will always be executed. Likewise, the lui after the b instruction will also be executed. The reason for this is avoiding a costly pipeline flush when the branch is taken. Such a design reduces the need for branch prediction, and takes less silicon to implement.
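As a rough illustration of the delay slot behavior described here, the following Python sketch walks the last four instructions of the MIPS snippet and records the order in which they execute. It is a toy model, not a MIPS emulator: the names make_program and execution_order are hypothetical, whether bgez is taken is passed in as a flag rather than computed from t0, and a branch sitting in a delay slot is not handled.

```python
def make_program(bgez_taken):
    """Map address -> (text, branch target or None if no branch is taken)."""
    return {
        0x201594: ("bgez t0,0x201560", 0x201560 if bgez_taken else None),
        0x201598: ("addiu v1,v1,8", None),   # delay slot of the bgez
        0x20159C: ("b 0x2015c8", 0x2015C8),  # unconditional, always taken
        0x2015A0: ("lui v0,0x0", None),      # delay slot of the b
    }


def execution_order(program, start, max_steps=16):
    """Return the addresses executed, honoring one branch delay slot."""
    order = []
    pc = start
    pending = None  # target of a taken branch, applied after the delay slot
    for _ in range(max_steps):
        if pc not in program:
            break  # control left the snippet
        _text, target = program[pc]
        order.append(pc)
        if pending is not None:
            # We just executed a delay slot; now the branch takes effect.
            next_pc, pending = pending, None
        else:
            next_pc = pc + 4
        if target is not None:
            # Taken branch: the instruction at pc + 4 still runs first.
            pending = target
        pc = next_pc
    return order


if __name__ == "__main__":
    for taken in (True, False):
        trace = execution_order(make_program(taken), 0x201594)
        print(taken, [hex(addr) for addr in trace])
```

In both cases the addiu at 201598 runs right after the bgez, and when the bgez falls through, the lui at 2015a0 still runs after the unconditional b before control moves to 2015c8.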
Modern MIPS cores (I don’t know about older versions) do track inter-instruction register dependencies, and they stall the pipeline if an instruction needs a result from an earlier instruction that hasn’t yet completed.
Regarding the meaning of reduced, one thing that all RISC architectures I’ve encountered have in common is a lack of instructions taking memory operands. The only memory operations supported are load and store, with some variants to support the atomic operations needed for locks. To perform arithmetic on a value stored in memory, it must be loaded into a register, manipulated there, and finally written back to memory.
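The load, modify, store pattern can be sketched in a few lines of Python. This is only a toy model under stated assumptions: memory is a big-endian bytearray, the register file is a plain dict, and the function name increment_word is invented for illustration. The PowerPC mnemonics in the comments show the three instructions a compiler would typically emit where x86 could use a single add with a memory operand.

```python
import struct


def increment_word(mem, addr, regs):
    """Add 1 to the 32-bit word at `addr`, load/store style."""
    regs["r4"] = addr
    # lwz r3,0(r4)   load the word from memory into a register
    regs["r3"] = struct.unpack_from(">I", mem, regs["r4"])[0]
    # addi r3,r3,1   arithmetic only ever touches registers
    regs["r3"] = (regs["r3"] + 1) & 0xFFFFFFFF
    # stw r3,0(r4)   write the result back to memory
    struct.pack_into(">I", mem, regs["r4"], regs["r3"])


if __name__ == "__main__":
    memory = bytearray(8)
    struct.pack_into(">I", memory, 4, 41)
    increment_word(memory, 4, {})
    print(struct.unpack_from(">I", memory, 4)[0])
```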
Thanks for clarifying the delayed branching stuff. It has been a long time since I dealt with this stuff.
You can read a lot about the inner workings and history of CPUs at http://www.sasktelwebsite.net/jbayko/cpu.html (Great Microprocessors of the Past and Present).
About the POWER architecture, it says:
“RISC initially stood for Reduced Instruction Set Computer, but IBM defined it as Reduced Instruction Set Cycles, and implemented a relatively complex processor (POWER – Performance Optimization With Enhanced RISC) with more high level instructions than even many memory-data processors.”
IBM also introduced the wonderful instruction EIEIO: Enforce In-Order Execution of I/O, more commonly known as a memory barrier. Perhaps Motorola scores still higher with the 6809 and its SEX instruction, short for Sign EXtend.