Practical Reverse Engineering

by Mike Melanson (mike at multimedia.cx)
Updated June 29, 2004

Example: Windows Media Video 9

For this experiment, I decided to target Microsoft's Windows Media Video 9 (WMV9) binary codec module. This is a big target. One reason I chose it is that popular Unix media players can already load the Win32 binary decoding module, via some clever hacking, and use it to decode the data. Thus, it is (relatively) simple to add hooks to these open source programs to gather execution profiling data.

If you wish to follow along in this example, the target file is:

wmv9dmod.dll, 807032 bytes
MD5: 292beecb089f13b70af8e44e5bfefa5c

This file is available on the MPlayer Codec page in the Win32 Codecpack ("lite" version will do).

First, disassemble the file:

scd wmv9dmod.dll > wmv9dmod.dll.txt

Next, run the output disassembly through the scd-addresses.pl script:

scd-addresses.pl < wmv9dmod.dll.txt > wmv9dmod-functions.txt

The next step is a little more complicated. This is the part that involves gathering execution data. The general idea is to set the trap flag before the function you wish to monitor and clear it afterwards. On an Intel x86 CPU, doing this will result in a trap interrupt after every instruction. Then, a custom trap handler (acting on SIGTRAP in Linux-land) logs each instruction pointer address to a file.

With some special formatting, a little patience, and a LOT of disk space, the data is ready to be churned through another Perl script, sort-addresses.pl. This script reads in a file containing a list of executed addresses, gathered by whatever method is necessary, and one or more files containing function boundaries. The script outputs a series of profiles that show how many of the executed instructions fell within each function, which approximates where the program spent its time.
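The binning step can be sketched in a few lines. This is not sort-addresses.pl itself (that script is Perl and is not reproduced here) but a minimal Python illustration of the same idea. It assumes one hexadecimal address per line in the trace, and it assumes that each function runs from its start address up to the byte before the next start, with the last one running to FFFFFFFF. The names parse_hex_lines(), build_bins(), profile(), and format_row() are hypothetical.

```python
import bisect
from collections import Counter

CATCH_ALL_END = 0xFFFFFFFF  # upper bound assumed for the last bin


def parse_hex_lines(lines):
    """Yield integers from lines holding one hex address each (assumed format)."""
    for line in lines:
        line = line.strip()
        if line:
            yield int(line, 16)


def build_bins(starts):
    """Turn function start addresses into sorted (starts, ends) lists."""
    starts = sorted(set(starts))
    # Each function is assumed to end one byte before the next one begins.
    ends = [s - 1 for s in starts[1:]] + [CATCH_ALL_END]
    return starts, ends


def profile(addresses, function_starts):
    """Count how many executed addresses land in each function range."""
    starts, ends = build_bins(function_starts)
    counts = Counter()
    total = 0
    for addr in addresses:
        total += 1
        # Index of the last start address that is <= addr.
        i = bisect.bisect_right(starts, addr) - 1
        if i >= 0:  # addresses below the first start are only counted in total
            counts[i] += 1
    rows = [(starts[i], ends[i], n) for i, n in counts.most_common()]
    return total, rows


def format_row(start, end, count, total):
    """Format one row in the style of the profile listing."""
    return "(no name): %08X -> %08X, count = %d (%s%%)" % (
        start, end, count, 100.0 * count / total)


if __name__ == "__main__":
    sample_starts = [0x00440430, 0x004406B0, 0x004A4B20]
    sample_trace = ["00440430", "00440435", "004406B0", "7FFF0000", "00440440"]
    total, rows = profile(parse_hex_lines(sample_trace), sample_starts)
    for start, end, count in rows:
        print(format_row(start, end, count, total))
```

A binary search over the sorted start addresses keeps the per-address cost low, which matters when the trace holds millions of entries.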

The wmv9dmod.dll file conforms to the Microsoft DirectX Media Object (DMO) API. This particular type of binary module has a very small number of public functions, and they typically return structures with pointers to subfunctions within the module that do the real work. Such is the case here, as module initialization yields the addresses of ProcessInput() and ProcessOutput(). These sound like the two most interesting functions to monitor.

For this experiment, I took 5 Microsoft WMV files, all identical save for the fact that they are encoded at different bitrates: 56K, 128K, 300K, 500K, and 700K. I set up the profiling facilities to trap execution data while decoding the first 4 frames of each file. The first frames of this particular video are mostly dark. The thinking here is that the first frame of each file will be a keyframe, which will flex certain decoding functions. Meanwhile, the subsequent 3 frames should be interframes with very little difference from the first frame, and they will exercise different areas of code.

I was surprised by the relative sizes of the profiling output files:

 93614956  wmv9-56k.txt
 81672616  wmv9-128k.txt
125989596  wmv9-300k.txt
115864856  wmv9-500k.txt
115944496  wmv9-700k.txt

My guess was that progressively higher bitrates would require more and more instructions to decode, but that was not always the case. There might be other factors at work here. Perhaps a slower framerate caused a fade-in sooner on the 56K file vs. the 128K file, so that a subsequent image required more decoding logic in the 56K file.

The next thing that surprised me is that I can't coax Perl into doing quite what I expect. Actually, that is not a big surprise. Until I get the bugs ironed out of this script, I can post the typical results from running ProcessInput() on the first frame. These functions always dominated:

*************************************************************************
Profile: ProcessInput(), frame #0
total addresses executed: 6573909
*************************************************************************
(no name): 00440430 -> 004406AF, count = 1175863 (17.8868158959913%)
(no name): 0043CDE0 -> 0043F8FF, count = 991800 (15.0869140415543%)
(no name): 0044E840 -> 0044EB6F, count = 943158 (14.346988983267%)
(no name): 0044EB70 -> 0044ED9F, count = 685402 (10.4260950372145%)
(no name): 004A4B20 -> FFFFFFFF, count = 594077 (9.03689114041585%)
(no name): 00432840 -> 004333BF, count = 477467 (7.26306068428997%)
(no name): 00440870 -> 0044092F, count = 408885 (6.21981533361657%)
(no name): 004406B0 -> 0044086F, count = 234320 (3.56439372677657%)
(no name): 004334A0 -> 004337FF, count = 228732 (3.47939102899051%)
(no name): 00458CC0 -> 0045A45F, count = 186976 (2.84421338962861%)
(no name): 00434D30 -> 00434FAF, count = 116100 (1.76607251484619%)
(no name): 00432520 -> 0043283F, count = 100977 (1.53602673842914%)
(no name): 00440370 -> 0044042F, count = 98678 (1.50105515607229%)
(no name): 00440930 -> 00440AAF, count = 85124 (1.29487645782745%)
(no name): 004307F0 -> 0043085F, count = 66600 (1.0130958612296%)
[...]

The fifth entry, the one that ranges from 004A4B20 -> FFFFFFFF (4294967295 in decimal), is a catch-all bin for any execution addresses that did not fall into the other bins. These are the pieces of native support code needed to run the binary module. Eventually, these functions should be represented in the breakdown as well.
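For anyone checking the listing by hand, the arithmetic behind it is simple: each percentage is that range's count divided by the total number of addresses executed. The short Python sketch below recomputes this for the first five rows, using only numbers copied from the profile above. The variable names are made up for illustration.

```python
# Figures copied from the ProcessInput(), frame #0 profile.
TOTAL_ADDRESSES = 6573909

# (start, end, count) for the first five rows of the listing.
top_rows = [
    (0x00440430, 0x004406AF, 1175863),
    (0x0043CDE0, 0x0043F8FF, 991800),
    (0x0044E840, 0x0044EB6F, 943158),
    (0x0044EB70, 0x0044ED9F, 685402),
    (0x004A4B20, 0xFFFFFFFF, 594077),  # catch-all bin for support code
]

# Percentage of all executed addresses that fell in each range.
percentages = [100.0 * count / TOTAL_ADDRESSES for _, _, count in top_rows]

# Combined share of the five busiest ranges.
top_five_share = sum(percentages)

# The catch-all upper bound is the largest 32-bit address.
catch_all_end = int("FFFFFFFF", 16)

if __name__ == "__main__":
    for (start, end, count), pct in zip(top_rows, percentages):
        print("%08X -> %08X: %d (%.2f%%)" % (start, end, count, pct))
    print("top five combined: %.2f%%" % top_five_share)
    print("catch-all upper bound in decimal: %d" % catch_all_end)
```

Five ranges account for roughly two thirds of the executed addresses, and one of those five is the catch-all bin rather than code inside the codec.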
