I was giddy when I recently learned that there were x86_64 builds of the Real codecs available with function names intact, if for no other reason than that it might finally provide a good reason to learn x86_64 ASM. But then Benjamin helpfully pointed out that there are .a libraries available for their codecs as well (look for .a files in the current source packages). These are far more interesting, particularly in the context of black box reverse engineering. So I set up a little proof-of-concept experiment.
There is a function in the adec40.a file (apparently the meat of the RealVideo 4.0 codec) called C_ITransform4x4_DCOnly. Judging by the name, it performs some math transform on a 4×4 matrix of coefficients for the case where only the DC coefficient is non-zero. A quick RE analysis of the short function yields my best guess: the function computes DC_coeff * 209, adds 512, divides by 1024, and places the quotient in all 16 samples of the matrix. That would be in line with what a function of this sort would do, but I’m not convinced that I have the math just right.
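For reference, here is a minimal C sketch of that guessed behavior. It is only a model of my reading of the disassembly, not the codec's code: the function names are made up, the DC coefficient is assumed to sit in element 0 of the block, and the multiplier is left as a parameter precisely because 209 is a guess. It also assumes a non-negative DC value, since right-shifting a negative int is implementation-defined in C.

```c
#include <stdint.h>

/* Hypothesized per-sample formula: multiply, add 512 to round,
 * then divide by 1024 via a right shift by 10.
 * Assumes dc >= 0. */
int32_t dc_only_sample(int32_t dc, int32_t multiplier)
{
    return (dc * multiplier + 512) >> 10;
}

/* Model of the guessed DC-only transform: take the DC coefficient
 * (assumed to be block[0]) and write the scaled value into all
 * 16 positions of the 4x4 block. */
void itransform4x4_dc_only_model(int32_t block[16], int32_t multiplier)
{
    int32_t v = dc_only_sample(block[0], multiplier);
    for (int i = 0; i < 16; i++)
        block[i] = v;
}
```

Keeping the multiplier as an argument makes it easy to compare the model against whatever the real function returns.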
The ASM function in question can’t be called directly from C: it was clearly assembled to be called from certain other functions, and it does not save the registers it trashes. Thus, I decided to write an ASM wrapper (assembled with NASM) and link it to a simple C program.
main.c:
Here is the linkage ASM file (I think the right term might be ‘thunk’). And drat, there is no NASM syntax highlighting in this code highlighter plugin yet; let’s just pretend it’s C.
real-link.asm:
And the Makefile to tie it all together (assumes adec40.a is in the same directory):
TARGET = real-link

$(TARGET): main.o real-link.o
	gcc -g -o $(TARGET) -Wall main.o real-link.o adec40.a

main.o: main.c
	gcc -g -o main.o -c main.c

real-link.o: real-link.asm
	nasm -o real-link.o -f elf real-link.asm

clean:
	rm -f *.o
So the idea behind the test program is to iterate through DC coefficients from 0 on, incrementing by 50 each time, and check on one of the transformed coefficients.
0: 0
50: 8
100: 17
150: 25
200: 33
250: 41
300: 50
350: 58
400: 66
450: 74
500: 83
550: 91
600: 99
650: 107
700: 116
750: 124
800: 132
850: 140
900: 149
950: 157
1000: 165
1050: 173
1100: 182
1150: 190
1200: 198
1250: 206
1300: 215
1350: 223
1400: 231
1450: 239
1500: 248
1550: 256
1600: 264
1650: 272
1700: 281
1750: 289
1800: 297
1850: 305
1900: 314
1950: 322
I reason that these values are probably the final Y-U-V sample values and are likely restricted to the 0..255 byte range. Based on that assumption, the DC coefficient couldn’t get quite as large as 1550, since that input already produces 256.
And my guess about the transform formula turns out to be incorrect. Imagine that. With a multiplier of 209, DC = 1000 would yield samples of 204, since (1000 * 209 + 512) / 1024 rounds down to 204. But the authoritative answer is 165. I know that the net result of the transform is to multiply the DC coefficient by some constant (in a convoluted way, using a series of lea instructions to avoid an actual multiplication instruction) and then add 512 and divide by 1024 (shift right by 10). Working backwards from this knowledge, combined with the empirical data above, the constant actually works out to be 169 rather than 209.
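The working-backwards step can be sketched as a brute-force search in C: try each candidate multiplier and keep the first one that reproduces the measurements. The names here (observed, model, find_multiplier, times169) are made up for illustration, and only a handful of rows from the output above are included. The last function shows one possible lea-style way to multiply by 169 (169 = 13 * 13); it is an assumption about how such a chain could look, not the sequence the codec actually uses.

```c
#include <stddef.h>

/* A few (DC, output) pairs copied from the measured output. */
struct observation { int dc; int out; };

static const struct observation observed[] = {
    {50, 8}, {500, 83}, {1000, 165}, {1500, 248}, {1950, 322},
};

/* Known shape of the transform: (dc * k + 512) >> 10. */
int model(int dc, int k)
{
    return (dc * k + 512) >> 10;
}

/* Return the first k in [lo, hi] that reproduces every
 * observation, or -1 if none does. */
int find_multiplier(int lo, int hi)
{
    size_t n = sizeof observed / sizeof observed[0];
    for (int k = lo; k <= hi; k++) {
        size_t i;
        for (i = 0; i < n; i++)
            if (model(observed[i].dc, k) != observed[i].out)
                break;
        if (i == n)
            return k;
    }
    return -1;
}

/* Hypothetical lea-style decomposition of x * 169. */
int times169(int x)
{
    int t = x + x * 2;  /* like lea t, [x + x*2]: 3x          */
    t = x + t * 4;      /* like lea t, [x + t*4]: 13x         */
    int u = t + t * 2;  /* like lea u, [t + t*2]: 39x         */
    u = t + u * 4;      /* like lea u, [t + u*4]: 13x + 156x  */
    return u;
}
```

Calling find_multiplier(1, 1023) returns 169. The DC = 1000 row alone is enough to pin it down, since 168 would give 164 and 170 would give 166.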
Wow, an RE experiment of mine actually worked and yielded useful information. Believe me, no one is more surprised than me.