Breaking Eggs And Making Omelettes

Topics On Multimedia Technology and Reverse Engineering


Assembly Bizarro

July 15th, 2006 by Multimedia Mike

In lieu of recent Pickover puzzles (I have an impressive backlog of those to work on), VAG sent in this curious x86 ASM nugget:

    xchg    bh, bl
    mov     di, bx
    shr     di, 1
    shr     di, 1
    add     di, bx
    xchg    bh, bl
    add     di, dx

The hint that VAG gives is that it has something to do with a common multimedia task. Try to figure it out.

Posted in General | 4 Comments »

4 Responses

  1. john_doe Says:

    It calculates the offset of a given (x, y) coordinate in a 320*200 buffer.
    di = bx * 320 + dx
    Since bx (the y coordinate) can only be in the range 0..199, it fits entirely in bl and bh is zero; the code uses this to its advantage.

  2. Reimar Says:

    Though I must admit it was a bit unfair not to tell us if it was AT&T or Intel syntax, that always costs me an extra scratching-my-head moment :-)
    I wonder for what CPU that was written though, 286? Because IIRC “shr di, 2” would have been possible and faster from 386 on…

  3. Multimedia Mike Says:

    Reimar: You should usually be able to sort out the syntax ordering by looking for an immediate value, as in the “shr” instruction, since it would be pretty useless to shift an immediate value.

    I’m confused by those sequential shifts as well. I can only assume it was some pipeline optimization measure. Except that this code may have predated such considerations.

  4. VAG Says:

    john_doe: Correct.
    Reimar: This should be pretty obvious. Though, AT&T syntax uses the % and $ symbols to indicate registers and constants.

    Anyway. Yes, it’s legacy 286 code. Here is an explanation of how it works.
    Generally, we can express 320 as a sum of powers of two: 256 + 64. Multiplying by 256 can be performed by shifting the value left 8 times, and multiplying by 64 by shifting it left 6 times. However, instead of a straight multiply by 64, we can reuse the multiplication by 256 and divide it by 4, i.e. shift right two times. xchg basically acts as shl bx, 8 in this sample (as bh is always zero). That’s how it works. No idea if it’s really faster than a direct table lookup. :)
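
The trick VAG describes can be checked with a small C model of the register operations. This is only a sketch: the function names (`swap_bytes`, `offset_shift`, `offset_mul`, `count_mismatches`) are made up for illustration, and the 320x200 coordinate range is taken from john_doe's reading of the code, with bx holding y and dx holding x.

```c
#include <assert.h>
#include <stdint.h>

/* Model of `xchg bh, bl`: swap the high and low bytes of a 16-bit register. */
static uint16_t swap_bytes(uint16_t v)
{
    return (uint16_t)((v << 8) | (v >> 8));
}

/* Step-by-step model of the seven instructions from the post.
 * Assumption: bx holds y with bh == 0, dx holds x. */
static uint16_t offset_shift(uint16_t bx, uint16_t dx)
{
    uint16_t di;

    assert(bx <= 0xFF);           /* the trick relies on bh being zero */

    bx = swap_bytes(bx);          /* xchg bh, bl  -> bx = y * 256      */
    di = bx;                      /* mov  di, bx  -> di = y * 256      */
    di >>= 1;                     /* shr  di, 1   -> di = y * 128      */
    di >>= 1;                     /* shr  di, 1   -> di = y * 64       */
    di = (uint16_t)(di + bx);     /* add  di, bx  -> di = y * 320      */
    bx = swap_bytes(bx);          /* xchg bh, bl  -> bx = y again      */
    di = (uint16_t)(di + dx);     /* add  di, dx  -> di = y * 320 + x  */

    (void)bx;                     /* bx is restored for the caller in the original code */
    return di;
}

/* Reference version using a plain multiply. */
static uint16_t offset_mul(uint16_t y, uint16_t x)
{
    return (uint16_t)(y * 320u + x);
}

/* Compare both versions over every coordinate of a 320x200 buffer
 * and return how many coordinates disagree. */
static int count_mismatches(void)
{
    int mismatches = 0;
    for (uint16_t y = 0; y < 200; y++) {
        for (uint16_t x = 0; x < 320; x++) {
            if (offset_shift(y, x) != offset_mul(y, x)) {
                mismatches++;
            }
        }
    }
    return mismatches;
}
```

Calling `count_mismatches()` from a test harness compares the two versions across the whole buffer. The model says nothing about speed on a real 286; it only confirms the arithmetic.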