Author Archives: Multimedia Mike

Naive Sorenson Video 1 Encoder

(Yes, the word is “naive” — or rather, “naïve” — not “native”. People always try to correct me when I use the word. Indeed, it should actually be written with 2 dots over the ‘i’ but who has a keyboard that can easily do that?)

At the most primitive level, programming a video encoder is about writing out a sequence of bits that the corresponding video decoder will understand. It’s sort of like creating a program — represented as a stream of opcodes — that will run on a given microprocessor or virtual machine. In fact, reading a video codec bitstream specification will reveal a lot of terminology along the lines of “transmitting information to the decoder” or “signaling the decoder to do xyz.”

Creating a good encoder that will deliver decent quality at a reasonable bitrate is difficult. Creating a naive encoder that produces a technically compliant bitstream, not so much.



When I wrote an FFmpeg encoder for Sorenson Video 1 (SVQ1), the first step was to just create a minimally compliant bitstream. The coarsest encoding mode that SVQ1 allows is to encode the average (mean) of each 16×16 block of samples. So I created an encoder that just encoded the mean of each block. Apple’s QuickTime Player was able to play the resulting video in all of its blocky glory. The result rather reminds me of the Super Nintendo’s mosaic effect.
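To make the mean-only idea concrete, here is a minimal Python sketch of the arithmetic involved. It is not the FFmpeg encoder and it writes no SVQ1 bitstream; it only computes the mean of each block and rebuilds the flat-tiled picture a mean-only decoder would display. The function names, the flat row-major plane layout, and the rounding rule are assumptions made for illustration.

```python
def block_means(plane, width, height, block=16):
    """Return a 2D list holding the rounded mean of each block x block tile.

    `plane` is assumed to be a flat, row-major sequence of 8-bit samples.
    Tiles that run past the frame edge are averaged over the samples present.
    """
    means = []
    for by in range(0, height, block):
        row = []
        for bx in range(0, width, block):
            total = 0
            count = 0
            for y in range(by, min(by + block, height)):
                for x in range(bx, min(bx + block, width)):
                    total += plane[y * width + x]
                    count += 1
            row.append((total + count // 2) // count)  # round to nearest
        means.append(row)
    return means


def mosaic(plane, width, height, block=16):
    """Rebuild the plane as flat tiles, one mean value per tile."""
    means = block_means(plane, width, height, block)
    return [means[y // block][x // block]
            for y in range(height) for x in range(width)]
```

Calling `mosaic(plane, width, height, block=8)` gives the finer 8×8 variant of the same effect.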

Level 5 blocks (mean-only 16×16 encoding):



Level 3 blocks (mean-only 8×8 encoding):



It’s one thing for your own decoder (in this case, FFmpeg’s own decoder) to be able to decode the data. The big test is whether the official decoder (in this case, Apple QuickTime Player) can decode the file.



Now that’s a good feeling. After establishing that sort of baseline, it’s possible to adapt more and more features of the codec.

Dreamcast Anniversary Programming

This day last year saw a lot of nostalgia posts on the internet regarding the Sega Dreamcast, launched 10 years prior to that day (on 9/9/99). Regrettably, none of the retrospectives that I read really seemed to mention the homebrew potential, which is the aspect that interested me. On the occasion of the DC’s 11th anniversary, I wanted to remind myself how to build something for the unit and do so using modern equipment and build tools.



Background
Like many other programmers, I initially gained interest in programming because I desired to program video games. Not content to just plunk out games on a PC, I always had a deep, abiding ambition to program actual video game hardware. That is, I wanted to program a purpose-built video game console. The Sega Dreamcast might be the most ideal candidate to ever emerge for that task. All that was required to run your own software on the unit was the console, a PC, some free software tools, and a special connectivity measure.

The Equipment
Here is the hardware required (ideally) to build software for the DC:

  • The console itself (I happen to have 3 of them lying around, as pictured above)
  • Some peripherals: Such as the basic DC controller, the DC keyboard (flagship title: Typing of the Dead), and the visual memory unit (VMU)


  • VGA box: The DC supported 480p gaming via a device that allowed you to connect the console straight to a VGA monitor via 15-pin D-sub. Not required for development, but very useful. I happen to have 3 of them from different third parties:


  • Finally, the connectivity measure for hooking the DC to the PC.

How Much H.264 In Each Encoder?

Thanks to my recent experiments with code coverage tools, I have a powerful new — admittedly somewhat specious — method of comparing programs. For example, I am certain that I have read on more than one occasion that Apple’s H.264 encoder sucks compared to x264 due, at least in part, to the Apple encoder’s alleged inability to exercise all of H.264’s features. I wonder how to test that claim?

Experiment
Use code coverage tools to determine which H.264 encoder uses the most features.

Assumptions

  • Movie trailers hosted by Apple will all be encoded with the same settings using Apple’s encoder.
  • Similarly, Yahoo’s movie trailers will be encoded with consistent settings using an unknown encoder.
  • Encoding a video using FFmpeg’s libx264-slow setting will necessarily throw a bunch of H.264’s features into the mix (I really don’t think this assumption holds much water, but I also don’t know what “standard” x264 settings are).

Methodology

  • Grab a random Apple-hosted 1080p movie trailer and random Yahoo-hosted 1080p movie trailer from Dave’s Trailer Page.
  • Use libx264/FFmpeg with the ‘slow’ preset to encode Big Buck Bunny 1080p from raw PNG files.
  • Build FFmpeg with code coverage enabled.
  • Decode each file to raw YUV (skipping audio decoding), generate code coverage statistics using gcovr, and reset the statistics after each run by deleting the *.gcda files.
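The reset step between runs can be sketched in a few lines of Python. This is only an illustration of "delete the *.gcda files"; the function name and the idea of passing the build directory as an argument are assumptions, not part of the original procedure.

```python
from pathlib import Path


def reset_coverage(build_dir):
    """Delete every *.gcda file under build_dir; return how many were removed.

    Only the run-time counter files (*.gcda) are removed. The *.gcno files
    written at compile time are left alone so no rebuild is needed.
    """
    # Collect the paths first so nothing is deleted while the tree is walked.
    gcda_files = list(Path(build_dir).rglob("*.gcda"))
    for gcda in gcda_files:
        gcda.unlink()
    return len(gcda_files)
```

Running this between decodes keeps the counts from one sample file from leaking into the next.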

Results

  • x264 1080p video: 9968 / 134203 lines
  • Apple 1080p trailer: 9968 / 134203 lines
  • Yahoo 1080p trailer: 9914 / 134203 lines

I also ran this old x264-encoded file (ImperishableNightStage6Low.mp4) through the same test. It demonstrated the most code coverage with 10671 / 134203 lines.
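For a sense of scale, the raw line counts can be turned into percentages. The short Python sketch below uses only the figures listed above; the dictionary and function names are made up for illustration.

```python
TOTAL_LINES = 134203  # denominator reported for every run above

# Covered-line counts copied from the results listed above.
results = {
    "x264 1080p video": 9968,
    "Apple 1080p trailer": 9968,
    "Yahoo 1080p trailer": 9914,
    "ImperishableNightStage6Low.mp4": 10671,
}


def coverage_percent(covered, total=TOTAL_LINES):
    """Return covered lines as a percentage of all instrumented lines."""
    return 100.0 * covered / total


# Print the samples from most to least coverage.
for name, covered in sorted(results.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {covered} lines, {coverage_percent(covered):.2f}%")
```

All four samples land between roughly 7.4% and 8% of the instrumented lines, and the gap between the highest and lowest count is 757 lines.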

Conclusions
Conclusions? Ha! Go ahead and jump all over this test. I’m already fairly confident that it’s impossible (or maybe just very difficult) to build a single H.264-encoded video that exercises every feature that FFmpeg’s decoder supports. For example, is it possible for a file to use both CABAC and CAVLC entropy methods? If it’s possible, does any current encoder do that?

Using gcovr with FFmpeg

When I started investigating code coverage tools to analyze FFmpeg, I knew there had to be an easier way to do what I was trying to do (obtain code coverage statistics on a macro level for the entire project). I was hoping there was a way to ask the GNU gcov tool to do this directly. John K informed me in the comments of a tool called gcovr. Like my tool from the previous post, gcovr is a Python script that aggregates data collected by gcov. gcovr proves to be a little more competent than my tool.

Results
Here is the spreadsheet of results, reflecting FATE code coverage as of this writing. All FFmpeg source files are on the same sheet this time, including header files, sorted by percent covered (ascending), then total lines (descending).

Methodology
I wasn’t easily able to work with the default output from the gcovr tool. So I modified it into a tool called gcovr-csv which creates data that spreadsheets can digest more easily.

  • Build FFmpeg with '-fprofile-arcs -ftest-coverage' passed in both the extra cflags and extra ldflags configuration options
  • 'make'
  • 'make fate'
  • From build directory: 'gcovr-csv > output.csv'
  • Massage the data a bit, deleting information about system header files (assuming you don’t care how much of /usr/include/stdlib.h is covered — 66%, BTW)
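The massaging step could be scripted rather than done by hand in a spreadsheet. The Python sketch below is an assumption-heavy illustration: the actual column layout of gcovr-csv output is not described here, so the `file`, `total`, and `covered` column names are hypothetical, as is the function name. It drops system headers and applies the sort order used for the spreadsheet (percent covered ascending, then total lines descending).

```python
import csv
import io


def clean_coverage(csv_text, system_prefix="/usr/include"):
    """Filter and sort coverage rows from CSV text.

    Assumes hypothetical columns named file, total, covered.
    Returns a list of (file, total, covered, percent) tuples.
    """
    rows = []
    for rec in csv.DictReader(io.StringIO(csv_text)):
        if rec["file"].startswith(system_prefix):
            continue  # skip system headers such as /usr/include/stdlib.h
        total = int(rec["total"])
        covered = int(rec["covered"])
        percent = 100.0 * covered / total if total else 0.0
        rows.append((rec["file"], total, covered, percent))
    # Percent covered ascending, then total lines descending.
    rows.sort(key=lambda r: (r[3], -r[1]))
    return rows
```

With this ordering, the largest completely untested files float to the top of the list.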

Leftovers
I became aware of some spreadsheet limitations thanks to this tool: