Breaking Eggs And Making Omelettes

Topics On Multimedia Technology and Reverse Engineering


AAC Decoder Is In!

August 22nd, 2008 by Multimedia Mike

It certainly has been a long journey for native Advanced Audio Coding (AAC) in FFmpeg. It started as a Google Summer of Code project back in FFmpeg’s inaugural SoC season (2006), and it went unfinished. Since then, many people have endeavored to fix it up to the point where it could be included in the mainline. But it was Robert Swain who persevered toward the end goal. And now look:

$ ffmpeg -formats
[...]
Codecs:
 D V    4xm             4X Movie
 D V D  8bps            QuickTime 8BPS video
 D A    8svx_exp        8SVX exponential
 D A    8svx_fib        8SVX fibonacci
 D A    aac             Advanced Audio Coding
[...]

Robert profiled the new AAC decoder to be significantly faster than libfaad, the prevailing AAC decoding solution in the open source community. Further optimization work is ongoing, as is support for more advanced coding modes. Currently, the decoder only deals with low complexity (AAC-LC), the most common variant you are likely to encounter.

And of course, thanks also to Robert for creating more FATE work for me. I can’t avoid the problem of testing perceptual audio decoders for much longer.

Posted in FATE Server, Open Source Multimedia

5 Responses

  1. Robert Swain Says:

    Thanks for the post Mike. I think ~135% faster than FAAD isn’t too shabby. :)

    It also supports Main profile, and it’s probably worth noting that I’m being contracted to implement the Spectral Band Replication and Parametric Stereo extensions to support HE AAC v1/v2.

    It seems the LATM multiplex format is also used in some places to transport HE AAC in MPEG-TS for DVB. There is a patch floating around on the mailing list from a guy called Paul Kendall for this but I want to look at the spec to check that it’s done in “the right way”. :) At the moment I need to decide if it’s better written as a demuxer, a bitstream parser or as a pseudo-decoder.

    After reading the summary text for it in the spec it looks like it should be a demuxer, but the main use I’ve seen for it, as mentioned, is in MPEG-TS. Can FFmpeg demuxers call other FFmpeg demuxers? LATM is supposed to be able to handle multiple MPEG-4 audio payloads and it is called Low-overhead MPEG-4 Audio Transport Multiplex. It is purported to be a simpler alternative to MPEG-4 systems when not all those features are required.

    Gabriel Bouvigne (of LAME fame, amongst others methinks) has been offering a bit of guidance for the AAC encoder SoC project. I asked him a while ago what features are important for an AAC decoder. He said, other than LC, Main and HE profiles, AAC-LD (Low Delay) and BSAC (scalable) are used.

    Finally, I’m interested in AAC conformance, to some extent at least. We handle some window transitions differently to how the spec prescribes because it was deemed that the spec was poorly designed in this area and ‘our’ way does not seem to cause audible artifacts and is simpler and faster and more usual. Even the spec describes these transitions as ‘meaningless’ so it seems to look down on their use. :)

    I spotted a conformance ‘suite’ on the 3GPP site. Maybe you can do some testing using that. Apparently the K value they discuss in the accompanying PDF can be converted into a useful form with RMS = 2^(-(K-1)) / sqrt(12), as I recall. Then the std dev output from tiny_psnr can be compared to that value, and if it’s less, it’s OK. Though it should be noted that ‘our’ samples are in the range [-32768, 32767] whereas theirs are in [-1.0, 1.0], so the RMS threshold will need scaling. E-mail me if you want to talk about this some more. :)
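
The threshold check described in this comment can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the formula is read as 2^(-(K-1)) divided by sqrt(12), as the commenter recalls it; the scale factor of 32768 between the [-1.0, 1.0] range and 16-bit samples is assumed here; and the function names and the value K = 16 are hypothetical, not taken from the 3GPP document or from tiny_psnr.

```python
import math

# Assumed scale factor between the [-1.0, 1.0] range and 16-bit PCM samples.
INT16_FULL_SCALE = 32768.0


def rms_threshold(k: int, full_scale: float = 1.0) -> float:
    """Return 2^(-(K-1)) / sqrt(12), multiplied by full_scale."""
    return full_scale * 2.0 ** (-(k - 1)) / math.sqrt(12.0)


def passes_conformance(stddev_int16: float, k: int) -> bool:
    """True if a std dev measured on 16-bit samples is below the scaled threshold."""
    return stddev_int16 < rms_threshold(k, INT16_FULL_SCALE)


if __name__ == "__main__":
    # K = 16 is an illustrative value only.
    print(rms_threshold(16))                    # threshold on the [-1.0, 1.0] scale
    print(rms_threshold(16, INT16_FULL_SCALE))  # same threshold in 16-bit sample units
    print(passes_conformance(0.25, 16))         # compare a measured std dev
```

The standard deviation passed to `passes_conformance` would be the figure reported by tiny_psnr when comparing decoder output against a reference file; scaling the threshold rather than the measurement keeps the tool's output untouched.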

  2. Multimedia Mike Says:

    Oh, I’ll be in touch. :-)

  3. Andrew Says:

    Neat stuff :) should be good once I learn the command line for FFmpeg or some frontend tool makes use of it, since AAC is pretty much a standard with MPEG4.

  4. Jim Says:

    I’d love to see AAC LD included, because it looks like QT Pro on Leopard exports LD by default for an mpeg-4 if FastStart is enabled. Not a big deal, but the “audio object type 23” errors were confusing for a long time, until I found this entry.

  5. sc Says:

    any idea if the AAC LD is going to be included in ffmpeg anytime in the near future? some of the files that i’m trying to run even if i specifically force ffmpeg to use faad. thanks for any insight.