Breaking Eggs And Making Omelettes

Topics On Multimedia Technology and Reverse Engineering


Remembering Fravia

July 31st, 2009 by Multimedia Mike

I was reading up on this year’s Pwnie Awards — hoping that no nominations dealt with any software that I’m directly involved with — when I noticed someone named Fravia was up for a Lifetime Achievement Pwnie.

I remember Fravia, or really, his site. Back in 2000 when I became interested in reverse engineering due to its necessary if tangential relationship to understanding multimedia technology, I took to the web to search for tips. Fravia’s site was one of the first I found. It was apparently a goldmine of RE knowledge. But I could never know for sure– I always found the place packed with impenetrable jargon without a glossary in sight.

Further, the site seemed to focus primarily on how to reverse engineer relatively simple stuff– copy protection schemes and key generators. The targets I was — and remain — interested in tend to involve reasonably complicated mathematical algorithms compiled into machine code. Different domain, different challenges.

I think Fravia’s site was where I read an interesting document for programmers who wished to thwart reverse engineers. One tip was to load your program with blocks of NOP instructions. Apparently, these are harbingers of self-modifying code and in the context of counter-intelligence, a reverse engineer will go nuts anticipating and seeking out such aberrant code.

Fravia is no longer with us, having passed away in May of this year. His site lives on, as enigmatic, baffling, and aesthetically unsophisticated as I remember it being 9 years ago. It seems to have shifted focus somewhere along the line to studying how search engines operate. I wonder if all that RE knowledge is lost forever (or perhaps buried deep in the Internet Archive, which doesn’t make it much more useful).

In a way, Fravia was an inspiration for me: in addition to multimedia tech information, I wanted to publish data on practical reverse engineering matters so that other people could get up and running as quickly as possible without having to wade through weird jargon.

Posted in Reverse Engineering | 3 Comments »

Eee PC And Chrome

July 30th, 2009 by Multimedia Mike

I complain about a lot of software on this blog. But I wanted to take this opportunity to praise some software for once– Easy Peasy and Google Chrome. I’ve had some ups and downs with my Eee PC 701 netbook– great unit but the vendor-supplied Linux distribution was severely lacking. I auditioned some netbook-tailored distros last year and found one that worked reasonably well while being a bit rough around the edges — Ubuntu-Eee. One notable problem I experienced a few weeks after I installed it was that the wireless network driver quit working (though to be fair, I understand that was a greater problem due to an Ubuntu update around the same time).


Eee PC 701 running Easy Peasy and Google Chrome

These days, Ubuntu-Eee has been renamed Easy Peasy. I was finally sufficiently motivated to try installing it when enough other things on my existing Ubuntu-Eee distro had broken. Essentially all the problems that troubled me in its predecessor distro have vanished– wireless works again (though I still can’t seem to toggle it), all the sound controls work, even the hibernation works which impressed me greatly (even if I never use it).

Pertaining to web browsers, I have traditionally been satisfied with Firefox. Sure, it has been growing large in recent times, but what software hasn’t? It’s the price of software progress and all. However, I took this opportunity to try out Google Chrome which I never thought I would have reason to care about. I am roundly impressed with its speed and responsiveness. Seriously, this browser might even be lean enough for the guru to consider using on a regular basis.

I’m pleased that I can forgo a replacement for this classic Eee PC netbook for the foreseeable future.

Posted in General | No Comments »

XML Monkey

July 25th, 2009 by Multimedia Mike

I’m trying to come to terms with the reality that is XML. I may not like the format but that won’t change the fact that I have to interoperate with various XML data formats already in the wild. In other words, treat it like any random multimedia format. For example, suppose I want to write software to interpret the various comics that I’ve created with Taco Bell’s series of Comics Constructors CD-ROMs.


Amazon Raiders: XML Monkey, top panel

Read the rest of this entry »

Posted in Programming, Python | 9 Comments »

Ramping Up On JavaScript

July 22nd, 2009 by Multimedia Mike

I didn’t think I would ever have sufficient motivation to learn JavaScript, but here I am. I worked a little more on that new FATE index page based on Google’s Visualization API. To that end, I constructed the following plan:

Part 1: Create A JSON Data Source
Create a JSON data source, now that I figured out how to do that correctly. JSON data really is just a JavaScript data structure. It can be crazy to look at since it necessitates packing dictionaries inside of arrays inside of dictionaries inside of arrays. (Check the examples– observe that the data structure ends with “}]}]}});”.) But in the end, the Google visualization knows what to do with it.

Done.
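
To make the nesting concrete, here is a minimal server-side sketch that builds such a response in Python. The field names (cols, rows, c, v), the version string, and the google.visualization.Query.setResponse wrapper follow my reading of Google's wire format and should be treated as assumptions; the column and row values are made up for illustration.

```python
import json

def build_response(columns, rows):
    """Wrap tabular data in a JavaScript call the visualization can consume.

    columns: list of (id, label, type) tuples (hypothetical FATE columns).
    rows: list of lists of cell values, one inner list per table row.
    """
    table = {
        # One dictionary per column, packed inside an array...
        "cols": [{"id": cid, "label": label, "type": ctype}
                 for cid, label, ctype in columns],
        # ...and one dictionary per cell, inside an array, inside a
        # dictionary per row, inside another array.
        "rows": [{"c": [{"v": value} for value in row]} for row in rows],
    }
    response = {"version": "0.6", "status": "ok", "table": table}
    body = json.dumps(response, separators=(",", ":"))
    return "google.visualization.Query.setResponse(" + body + ");"

payload = build_response(
    [("config", "Configuration", "string"), ("status", "Build", "string")],
    [["x86_32 / Linux / gcc", "pass"], ["PPC / Mac OS X / gcc", "fail"]],
)
```

With at least one row and one cell, the tail of the payload comes out as the same pile of closing brackets noted above.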

Part 2: Connect the JSON Data Source
Hook the JSON data source up to the newest revision of the FATE front page, rolled out a little while ago.

Done.

Part 3: Save The User’s Most Recent Sort Criteria
The problem is that the page resets the sort criteria on a refresh. There needs to be a way to refresh the page while maintaining those criteria. This leads me to think that I should have some “refresh” button embedded in the page which asks the server for updated data using a facility I have heard of named XMLHttpRequest. I found a simple tutorial on the matter but was put off by the passage “Because of variations among the Web browsers, creating this object is more complicated than it need be.”

Backup idea: Cookies. Using this tutorial as a guide, set a cookie whenever the user changes either the sort column or the sort order.

Done, though I may want to revisit the XHR idea one day.

Part 4: Make It Look Good
Finally, figure out how the div tag works to make the layout a little cleaner.

Done. Sort of. There are 2 div tags on the page now, one for the header and one for the table. I suppose I will soon have to learn CSS to really drag this page out of 1997.

Bonus: Caching the JSON Data
Ideally, the web browser makes the JSON data request using the If-Modified-Since HTTP header. Use a sniffer to verify this. If it does, add a table to the FATE MySQL database which contains a single column specifying the timestamp when the web config cache table was last updated. If this time is earlier than the time in the request header, respond with a 304 (Not Modified) HTTP status code.

Not done. It seems that these requests don’t set the appropriate HTTP header, at least not in Firefox.
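
For when the browser does cooperate, the server-side decision could look something like this. It is only a sketch: the function name is hypothetical, the last-updated timestamp is assumed to have already been read from the database as a UTC datetime, and an equal timestamp is treated as "not modified", which is the usual HTTP convention rather than something stated above.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def status_for_request(if_modified_since, last_updated):
    """Return 304 if the cached copy is still current, otherwise 200.

    if_modified_since: raw header value, or None when the header is absent.
    last_updated: timezone-aware UTC datetime of the last cache table update.
    """
    if not if_modified_since:
        return 200  # No conditional header, so send the full data.
    try:
        since = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        return 200  # Unparseable date: play it safe and send the data.
    if since.tzinfo is None:
        since = since.replace(tzinfo=timezone.utc)
    # HTTP dates only carry whole seconds, so drop any finer precision.
    last_updated = last_updated.replace(microsecond=0)
    return 304 if last_updated <= since else 200
```

A call such as status_for_request(header_value, timestamp_from_db) would then decide whether the JSON needs to be regenerated at all.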

I hope to put this page into service soon, just as soon as I can dump the rest of the existing index.php script into this new one. As usual, not elegant, but it works.

Posted in FATE Server | 2 Comments »

Renoise XRNS

July 17th, 2009 by Multimedia Mike

A little piece of me died today when I read of the existence of XRNS, a music tracker format that uses XML. A music tracker format that uses XML! Can you imagine? If you can't, Google for "filetype:xrns" to find plenty of samples.

XML:
  <?xml version="1.0" encoding="UTF-8"?>
  <renoisesong doc_version="10">
    <globalsongdata>
      <octave>4</octave>
      <editstep>1</editstep>
      <loopplay>false</loopplay>
      <loopcoeff>4</loopcoeff>
      <loopstart>96</loopstart>
      <beatspermin>123</beatspermin>
      <ticksperline>3</ticksperline>
      <shuffleisactive>true</shuffleisactive>
      <shuffleamounts>
        <shuffleamount>36</shuffleamount>
        <shuffleamount>68</shuffleamount>
        <shuffleamount>67</shuffleamount>
        <shuffleamount>47</shuffleamount>
      </shuffleamounts>
      <songname>Untitled</songname>
      <artist>By Somebody</artist>
  ...
    </globalsongdata>
  ...
  </renoisesong>

And on it goes. It's difficult to articulate why this feels so heretical. It's like those old MOD/tracker formats were designed to be so pure, so efficient. This completely destroys that. Now your playback engine has to carry the baggage of a full XML parsing library.
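
To be fair, in a scripting language that baggage is only a few lines. Here is a rough sketch that pulls a handful of global song settings out of an abridged copy of the excerpt above. The element names are taken as they appear in the excerpt, and I am assuming real files use the same spelling and layout; the function name is my own.

```python
import xml.etree.ElementTree as ET

# Abridged, well-formed copy of the excerpt (the "..." elisions are dropped).
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<renoisesong doc_version="10">
  <globalsongdata>
    <beatspermin>123</beatspermin>
    <ticksperline>3</ticksperline>
    <shuffleamounts>
      <shuffleamount>36</shuffleamount>
      <shuffleamount>68</shuffleamount>
      <shuffleamount>67</shuffleamount>
      <shuffleamount>47</shuffleamount>
    </shuffleamounts>
    <songname>Untitled</songname>
  </globalsongdata>
</renoisesong>
"""

def read_song_globals(xml_bytes):
    """Extract tempo-related settings from the global song data element."""
    root = ET.fromstring(xml_bytes)
    song = root.find("globalsongdata")
    return {
        "name": song.findtext("songname"),
        "bpm": int(song.findtext("beatspermin")),
        "ticks_per_line": int(song.findtext("ticksperline")),
        "shuffle": [int(e.text)
                    for e in song.findall("shuffleamounts/shuffleamount")],
    }

info = read_song_globals(SAMPLE)
```

A C playback engine does not get off so lightly, of course, which is rather the point.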

There are elements of the FFmpeg development team that would enjoy seeing the program grow to be able to handle all the various tracker-type formats (myself included, obviously). It's not going to be pretty when XRNS collides with FFmpeg.

Addendum: Share the love over on the Renoise forums.

Posted in General | 66 Comments »

The Murder FILM

July 14th, 2009 by Multimedia Mike

Do you have any idea who killed Jennifer Shefield?
How well did you know Jennifer Shefield?
What can you tell me about Jennifer Shefield?

Call me cold, but I just can't bring myself to care about the above questions more than I care about what multimedia format is being used in an old Amiga shareware game simply entitled Murder, the demo of which can be downloaded in 11 .lha files from Aminet. LHA... does that ever take me back to the BBS days. That and ARJ.

Back to the format, though, it's definitely not related to Sega's FILM format (one of my all-time favorite formats). I think this may just be a series of Amiga IFF files with a header on them. This is because I see markers such as "8SVX".

The files also contain curious advice about playback. Apart from a string that specifically stipulates "Motorola 68000 family", there's also this tidbit in the header: "Amiga Hint LIST subtype FILM requires that all CAT chunks are exactly the same size, start at long word boundaries (pad with filler chunks) and come one after the other right to the end of the LIST. Write to agmsmith at 71330.3173@CompuServe.com for more info." I would email that address for the sake of due diligence but -- darn -- bad timing! CompuServe finally died just last week. The files were apparently made by a piece of software named AGMSMakeFilm which also seems to be available at the Aminet.
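
If these really are IFF-style files, walking the chunks should be simple. Below is a minimal sketch of a generic IFF chunk walker: a 4-byte ID, a 4-byte big-endian size, then the payload padded to an even length. It runs on synthetic data only; I have not verified it against the Murder files, and the header that precedes the IFF data in those files is not handled here. The helper names are my own.

```python
import struct

def iter_chunks(data, offset=0, end=None):
    """Yield (chunk_id, payload) pairs for the chunks in data[offset:end]."""
    if end is None:
        end = len(data)
    while offset + 8 <= end:
        chunk_id, size = struct.unpack_from(">4sI", data, offset)
        start = offset + 8
        yield chunk_id.decode("ascii", "replace"), data[start:start + size]
        # Chunks are padded so that the next one starts on an even offset.
        offset = start + size + (size & 1)

def make_chunk(chunk_id, payload):
    """Build a chunk for testing, adding the pad byte when needed."""
    pad = b"\x00" if len(payload) & 1 else b""
    return chunk_id + struct.pack(">I", len(payload)) + payload + pad

# Synthetic LIST of subtype FILM holding two CAT chunks.
sample = make_chunk(
    b"LIST",
    b"FILM"
    + make_chunk(b"CAT ", b"8SVX\x01")
    + make_chunk(b"CAT ", b"8SVX\x02\x03"),
)

top = list(iter_chunks(sample))
# Group chunks start with a 4-byte subtype; the nested chunks follow it.
inner = list(iter_chunks(top[0][1], offset=4))
```

Here top holds the single LIST chunk and inner holds the two CAT chunks nested inside it.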

MultimediaWiki page and samples, naturally.

Posted in Game Hacking | 3 Comments »

Google Visualizing FATE

July 11th, 2009 by Multimedia Mike

I guess that Cloud Computing stuff doesn't only apply to data storage. There are also things like Google's Visualization API for manipulating and presenting data. In this paradigm, the data is under my control but the code to manipulate it lives on Google's servers.

Good or bad? That's up for debate, but the table visualization definitely caught my eye. Look at the experimental results when I put FATE data into the table. Notice how easy it is to sort by columns (the default sort is such that the failed builds float to the top). I may be a little too close to the situation, but I think it's a little better than my last attempt. Again, no more up-to-15-minute delay with this system; new build data is available for presentation as soon as it is submitted to the database.

Let me know what you think. Personally, I think we may have a winner here. Maybe Google's other visualizations (assorted graphs and such) could be just the thing we have been searching for in order to plot trends like performance and code size.

I just wish I could understand the data source wire protocol. As it stands, the index-v3.php script generates JavaScript on the fly to populate the table. It would be a bit more elegant if the data were provided by a separate script. But, hey, this works.

Posted in FATE Server | 12 Comments »

DCT PR

July 2nd, 2009 by Multimedia Mike

Some people think that multimedia compression is basically all discrete cosine transform (DCT) and little else.

2 years ago at LinuxTag, I gave a fairly general presentation regarding FFmpeg and open source multimedia hacking (I just noticed that the main page still uses a photo of me and my presentation). I theorized that one problem our little community has when it comes to attracting new multimedia hacking talent is that the field seems scary and mathematically unapproachable. I have this perception that this is what might happen when a curious individual wants to get into multimedia hacking:

I wonder how multimedia compression works?

Well, I've heard that everyone uses something called MPEG for multimedia compression.

Further, I have heard something about how MPEG is based around the discrete cosine transform (DCT).

Let's look up what the DCT is, exactly...


Discrete cosine transform written out on a chalkboard
Clever photo cribbed from a blog actually entitled Discrete Cosine

At which point the prospective contributor screams and runs away from the possibility of ever being productive in the field.

Now, the original talk discussed how that need not be the case, because DCT is really a minor part of multimedia technology overall; how there are lots and lots of diverse problems in the field yet to solve; and how there is room for lots of different types of contributors.

The notion of DCT's paramount importance in the grand scheme of multimedia compression persists to this day. While I was reading the HTML5 spec development mailing list, I saw Silvia Pfeiffer express this same line of thinking vis-à-vis Theora:

Even if there is no vendor right now who produces an ASIC for Theora, the components of the Theora codec are not fundamentally different to the components of other DCT based codecs. Therefore, AISCs [sic] that were built for other DCT based codecs may well be adaptable by the ASIC vendor to support Theora.

This prompted me to recall something I read in the MPEG FAQ a long time ago:

MPEG is a DCT based scheme?

The DCT and Huffman algorithms receive the most press coverage (e.g. "MPEG is a DCT based scheme with Huffman coding"), but are in fact less significant when compared to the variety of coding modes signaled to the decoder as context-dependent side information. The MPEG-1 and MPEG-2 IDCT has the same definition as H.261, H.263, JPEG.

A few questions later, the FAQ describes no fewer than 18 different considerations that help compress video data in MPEG; only the first one deals with transforms. Theora is much the same way. When I wrote the document about Theora's foundation codec, VP3, I started by listing off all of the coding methods involved: DCT, quantization, run length encoding, zigzag reordering, predictive coding, motion compensation, Huffman entropy coding, and variable length run length Booleans. Theora adds a few more concepts (such as encoding the large amount of stream-specific configuration data).
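
For a sense of how small the transform is next to everything else in that list, here is a naive 8-point DCT and its inverse. This is the textbook orthonormal floating-point DCT-II, offered only as an illustration; it is not the integer transform that VP3, Theora, or any MPEG decoder actually specifies.

```python
import math

N = 8  # One row or column of an 8x8 block.

def _scale(k):
    """Normalization factor that makes the transform orthonormal."""
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

def dct_1d(samples):
    """Forward DCT-II of N samples, computed directly from the definition."""
    return [
        _scale(k) * sum(x * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
                        for n, x in enumerate(samples))
        for k in range(N)
    ]

def idct_1d(coeffs):
    """Inverse transform: rebuild the N samples from the coefficients."""
    return [
        sum(_scale(k) * c * math.cos(math.pi * (2 * n + 1) * k / (2 * N))
            for k, c in enumerate(coeffs))
        for n in range(N)
    ]

ramp = [float(v) for v in range(N)]
restored = idct_1d(dct_1d(ramp))
```

A flat input puts all of its energy into the first (DC) coefficient, and feeding the coefficients back through idct_1d recovers the original samples to within floating-point error.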

I used to have the same idea, though: I was one of the first people to download On2's VpVision package (the release of their VP3 code) and try to understand the algorithm. I remember focusing on the DCT and trying to find DCT-related code, assuming that it was central to the codec. I was surprised and confused to find that a vast amount of logic was devoted to simply reversing DC coefficient prediction. At the end of a huge amount of frame reconstruction code was a small, humble call to an IDCT function.

What I would like to get across here is that Theora is rather different than most video codecs, in just about every way you can name (no, wait: the base quantization matrix for golden frames is the same as the quantization matrix found in JPEG). As for the idea that most DCT-based codecs are all fundamentally the same, ironically, you can't even count on that with Theora-- its DCT is different than the one found in MPEG-1/2/4, H.263, and JPEG (which all use the same DCT). This was likely done in On2's valiant quest to make everything about the codec just different enough from every other popular codec, which runs quite contrary to the hope that ASIC vendors should be able to simply re-use a bunch of stuff used from other codecs.

Posted in Codec Technology | 10 Comments »

Sun OMS Has A Spec

July 1st, 2009 by Multimedia Mike

A little over a year ago, Sun was making rumblings about a brand new video codec that they were hoping to design from the ground up using known-good (read: patent-unencumbered) coding algorithms. This was to be called the Open Media Stack (OMS). I wrote a post about it, made an obligatory MultimediaWiki page about it, and then promptly forgot all about it.

Today, by way of a blog post by Opera's Bruce Lawson describing why HTML5's <video> tag is, well, stalled (to put it charitably), I learned that Sun's OMS team has published at least 2 specs, the latest one being posted a few weeks ago on June 9, 2009. As he notes, the proposed Oracle buyout of Sun puts the OMS project's status in limbo.

The spec page links to forum threads where interested parties can discuss issues in the spec. There aren't many comments but the ones that exist seem to revolve around the codec's artificial resolution limitations. For my part, I wonder about how to encapsulate it into a container format for transport. The format specifies a sequence header that is 96 bits (12 bytes) in length, though there are provisions for (currently unused) extended data as well as free-form user data. The sequence header would be categorized as extradata in an AVI file or in a MOV file's video stsd atom. Successive frames should be straightforward to store in a container format since the coding method only seems to employ intra- and inter-frames. Each frame begins with a header which specifies a 37-bit timestamp in reference to a 90 kHz clock. This allows for a recording that's just over 1 week in length. It's also one of those highly redundant items if this format were to be stuffed in a more generalized container format.
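
The timestamp range is easy to sanity-check. A minimal sketch, assuming the counter is unsigned and simply runs from zero until it wraps (the function name is mine, not the spec's):

```python
def max_duration_days(timestamp_bits, clock_hz=90_000):
    """Days until a timestamp counter of the given width wraps around."""
    ticks = 2 ** timestamp_bits       # Number of distinct timestamp values.
    seconds = ticks / clock_hz        # Each tick is 1/clock_hz of a second.
    return seconds / 86_400           # 86,400 seconds per day.

full_range = max_duration_days(37)
half_range = max_duration_days(36)
```

By this reckoning a full 37-bit counter lasts about 17.7 days, while a 36-bit range lasts about 8.8 days. The "just over 1 week" figure matches the latter, so the exact field width (or whether one bit is reserved) is worth re-checking against the spec.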

Anyway, the main video algorithm uses arithmetic and Golomb coding for its entropy coding; 8x8 pixel macroblocks, each of which can either be an entire block unto itself or be subdivided into four 4x4 sub-blocks; a YUV 4:2:0 planar colorspace; bit-exact 2x2, 4x4, and 8x8 transforms and quantization procedures; spatial prediction using left, top-left, top, and top-right blocks; and precisely specified 1/4-pel motion compensation. All in all, it appears relatively simple; the 0.91 spec (annexes and all) weighs in at a mere 96 pages.

Naturally, there are no reference implementations or samples available. This got me to wondering about how one goes about creating a reference implementation of a codec (or a production implementation, for that matter). The encoder needs to come first so that it can generate samples. Only then can a decoder be written. Ideally, the initial encoder and decoder should be 2 totally different programs, written by 2 different people, though hopefully working closely together (speedy communication helps). There is wisdom in the FFmpeg community about not basing the primary encoder and decoder on the same base library after we reverse engineered one of the Real video codecs and found a fairly obvious bug that occurred in both sides of the codec.

I think I know one way to ensure that the encoder and decoder don't share any pieces-- write them in different computer languages.

I'm still wondering what kind of application would need to record video continuously for up to a week. How about a closed-circuit security camera? With a terabyte drive, it could store video for a week assuming a bitrate of 1.5 Mbits/sec. That's roughly the same bitrate as the original MPEG-1 standard. If this coding method compresses more efficiently than MPEG-1, this might be a plausible application.
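
The back-of-the-envelope storage math can be spelled out as follows. This is only a sketch, assuming a decimal terabyte (10^12 bytes), a constant bitrate, and no container or filesystem overhead; the function names are mine.

```python
SECONDS_PER_DAY = 86_400

def storage_days(capacity_bytes, bitrate_bits_per_sec):
    """How many days of constant-bitrate video fit on the drive."""
    seconds = capacity_bytes * 8 / bitrate_bits_per_sec
    return seconds / SECONDS_PER_DAY

def bitrate_to_fill(capacity_bytes, days):
    """Bitrate (bits/sec) that would exactly fill the drive in that time."""
    return capacity_bytes * 8 / (days * SECONDS_PER_DAY)

days_at_mpeg1_rate = storage_days(10**12, 1_500_000)
week_long_bitrate = bitrate_to_fill(10**12, 7)
```

Under those assumptions, 1.5 Mbits/sec fills a terabyte in roughly 62 days, so a week fits with plenty of room to spare; a single week would only fill the drive at around 13 Mbits/sec.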

Posted in Open Source Multimedia | No Comments »