Breaking Eggs And Making Omelettes

Topics On Multimedia Technology and Reverse Engineering


Deobfuscation Redux: JavaScript

June 19th, 2011 by Multimedia Mike

Google recently released version 12 of their Chrome browser. This version adds a feature that automatically deobfuscates obfuscated JavaScript source code.

Before:

[image]

After:

[image]

As a reverse engineering purist, I was a bit annoyed. Not at the feature, just the naming. This is clearly code beautification, but not necessarily deobfuscation. The real obfuscation comes not from removing whitespace but from renaming variables and functions to terse 1- and 2-letter identifiers. True automated deobfuscation, which entails recovering the original variable and function identifiers as well as source code comments, is basically impossible.
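To make the distinction concrete, here is a minimal Python sketch of a beautifier. It is not how Chrome implements its feature; the function name `beautify` and the sample input are made up for illustration, and the sketch naively ignores string literals, comments, and `for(;;)` headers. The point is only that restoring whitespace leaves the terse identifiers exactly as they were.

```python
def beautify(source, indent="  "):
    """Naively re-indent minified, brace-delimited code (illustrative only)."""
    out = []      # finished output lines
    depth = 0     # current brace nesting level
    line = ""     # text collected for the line being built

    def flush():
        # Emit the pending line at the current indentation, if it is non-empty.
        nonlocal line
        if line.strip():
            out.append(indent * depth + line.strip())
        line = ""

    for ch in source:
        if ch == "{":
            line += " {"
            flush()
            depth += 1
        elif ch == "}":
            flush()
            depth = max(depth - 1, 0)
            line = "}"
            flush()
        elif ch == ";":
            line += ";"
            flush()
        else:
            line += ch
    flush()
    return "\n".join(out)


minified = "function a(b,c){var d=b*c;return d+1}var e=a(2,3);"
print(beautify(minified))
```

The output is nicely indented, but the function is still called `a` and its variables are still `b`, `c`, `d` and `e`.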

Still, it makes me wonder if there is any interest in a JavaScript deobfuscator that operates similarly to my Java deobfuscator, which was one of the first things I published on this blog. The general idea is to automatically replace function names with random English verbs (since functions correspond to actions) and variable names with random animal names (I decided “English nouns” encompassed too broad a category of words). I suspect the day that someone releases a proprietary multimedia codec in a pure (though obfuscated) JavaScript format is the day that I will try to accomplish this, if it hasn’t been done already.
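As a rough sketch of that renaming idea (not the actual Java deobfuscator, whose code is not shown here), the following Python finds declared names with regular expressions and swaps them for words. The word lists, the `rename_identifiers` function, and the `seed` parameter are all assumptions made for illustration. A real tool would need a proper JavaScript parser: this version only handles `function name(...)` declarations, the first name in a `var`/`let`/`const` statement, and plain parameter lists; it also does not skip string literals or comments, and does not check that a new name is unused.

```python
import random
import re

# Hypothetical word lists; a real tool would draw on much larger ones.
VERBS = ["jump", "paint", "whistle", "juggle", "sprint", "knit"]
ANIMALS = ["otter", "lemur", "heron", "badger", "gecko", "walrus"]

IDENT = r"[A-Za-z_$][\w$]*"


def name_stream(words, rng):
    """Yield words in random order, adding a numeric suffix once they run out."""
    pool = rng.sample(words, len(words))
    round_no = 0
    while True:
        for word in pool:
            yield word if round_no == 0 else f"{word}{round_no + 1}"
        round_no += 1


def rename_identifiers(source, seed=None):
    """Return (renamed_source, mapping) for the declared names found in source."""
    rng = random.Random(seed)
    verbs = name_stream(VERBS, rng)
    animals = name_stream(ANIMALS, rng)
    mapping = {}

    # Functions are actions, so they get verbs.
    for name in re.findall(r"\bfunction\s+(" + IDENT + r")\s*\(", source):
        if name not in mapping:
            mapping[name] = next(verbs)

    # Declared variables and function parameters get animal names.
    candidates = re.findall(r"\b(?:var|let|const)\s+(" + IDENT + ")", source)
    for params in re.findall(r"\bfunction\b[^(]*\(([^)]*)\)", source):
        candidates += [p.strip() for p in params.split(",") if p.strip()]
    for name in candidates:
        if name not in mapping:
            mapping[name] = next(animals)

    if not mapping:
        return source, mapping

    # Replace whole identifiers only, and leave `.property` accesses alone.
    names = "|".join(re.escape(name) for name in mapping)
    pattern = re.compile(r"(?<![\w$.])(" + names + r")(?![\w$])")
    return pattern.sub(lambda m: mapping[m.group(1)], source), mapping


minified = "function a(b,c){var d=b*c;return d+1}var e=a(2,3);"
renamed, mapping = rename_identifiers(minified, seed=2011)
print(renamed)
print(mapping)
```

The names are chosen at random, so the result is no closer to the original source than `a` and `b` were. The hope is simply that a verb applied to a handful of animals is easier to keep in your head while reading than a wall of single letters.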

Posted in Reverse Engineering | 4 Comments »

4 Responses

  1. Mathias Says:

    Did you try it on truly obfuscated code like on this page:
    http://yaisb.blogspot.com/2006/10/defeating-dean-edwards-javascript.html
    ?
    I only know the Firefox Deobfuscator plugin and while it of course doesn’t rename the variables, it at least unpacks that code.

  2. Reimar Says:

    If you find the term “deobfuscation” annoying, I think the term “obfuscation” is also a bit unfair: as often as not, the reason might just be to reduce the size of the code and not obfuscation in itself.
    I actually wrote such a preprocessor for a C-style language myself because the machine control device strangely had a limit on the source-code size (despite only running the binary code…).
    And while random verbs/names would help, I think the possibility to rename all related (and only related) occurrences would be most useful (and unfortunately also most effort).

  3. Multimedia Mike Says:

    @Reimar: Absolutely agreed: for JavaScript — especially at Google’s scale — I suspect that obfuscation is absolutely necessary. Minimal obfuscation of IP is likely of some concern, but JS code would be unbelievably bulky to download otherwise.

  4. SvdB Says:

    The common term used to describe this is ‘minification’ ( http://en.wikipedia.org/wiki/Minification_(programming) ).