Grand Unified Theory of Compact Disc

This is something I started writing about a decade ago (and I almost certainly have some of it wrong), back when compact discs still had a fair amount of relevance. Back around 2002, after a few years investigating multimedia technology, I took an interest in compact discs of all sorts. Even though there may seem to be a wide range of CD types, I generally found that they’re all fundamentally the same. I thought I would finally publish something, incomplete though it may be.

Physical Perspective
There are a lot of ways to look at a compact disc. First, there’s the physical format, where a laser detects where pits/grooves have disturbed the smooth surface (a.k.a. lands). A lot of technical descriptions claim that these lands and pits on a CD correspond to ones and zeros. That’s not actually true, but you have to decide what level of abstraction you care about, and that abstraction is good enough if you only care about the discs from a software perspective.

Grand Unified Theory (Software Perspective)
Looking at a disc from a software perspective, I have generally found it useful to view a CD as a combination of 2 main components:

  • table of contents (TOC)
  • a long string of sectors, each of which is 2352 bytes long

I like to believe that’s pretty much all there is to it. All of the information on a CD is stored as a string of sectors that might be chopped up into a series of anywhere from 1-99 individual tracks. The exact sector locations where these individual tracks begin are defined in the TOC.
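
As a rough illustration of this model, here is a minimal Python sketch. The names (Track, Disc, track_for_sector) are made up for this example and don’t come from any real library, and the TOC is simplified down to a list of track numbers and starting sectors.

```python
from bisect import bisect_right
from dataclasses import dataclass

SECTOR_SIZE = 2352  # bytes per raw sector


@dataclass
class Track:
    number: int        # 1..99
    start_sector: int  # first sector of the track, as listed in the TOC


@dataclass
class Disc:
    toc: list           # Track entries, sorted by start_sector
    total_sectors: int  # length of the long string of sectors

    def track_for_sector(self, sector):
        """Return the Track containing the given sector, or None if no track does."""
        if not 0 <= sector < self.total_sectors:
            return None
        starts = [t.start_sector for t in self.toc]
        # Find the last track whose start is at or before the sector.
        index = bisect_right(starts, sector) - 1
        return self.toc[index] if index >= 0 else None


# Hypothetical 3-track disc.
disc = Disc(toc=[Track(1, 0), Track(2, 1000), Track(3, 5000)], total_sectors=9000)
```

With this toy disc, disc.track_for_sector(1000) lands in track 2, and any sector at or past 9000 falls off the end of the disc.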

Audio CDs (CD-DA / Red Book)
The initial purpose for the compact disc was to store digital audio. The strange sector size of 2352 bytes is an artifact of this original charter. “CD quality audio”, as any multimedia nerd knows, is formally defined as stereo PCM samples that are each 16 bits wide and played at a frequency of 44100 Hz.

(44100 audio frames / 1 second) * (2 samples / audio frame) *
  (16 bits / 1 sample) * (1 byte / 8 bits) = 176,400 bytes / second
(176,400 bytes / 1 second) / (2352 bytes / 1 sector) = 75 sectors / second

75 is the number of sectors required to store a single second of CD-quality audio. A single sector stores 1/75th of a second, or a ‘frame’ of audio (though I think ‘frame’ gets tossed around at all levels when describing CD formats).
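
The same arithmetic can be checked in a few lines of Python. This is just a sketch; the constant and function names are my own.

```python
SAMPLE_RATE = 44100    # audio frames per second
CHANNELS = 2           # stereo: 2 samples per audio frame
BYTES_PER_SAMPLE = 2   # 16 bits
SECTOR_SIZE = 2352     # bytes per sector

bytes_per_second = SAMPLE_RATE * CHANNELS * BYTES_PER_SAMPLE
sectors_per_second = bytes_per_second // SECTOR_SIZE
# Stereo audio frames (one left + one right sample) held by one sector.
audio_frames_per_sector = SECTOR_SIZE // (CHANNELS * BYTES_PER_SAMPLE)


def sectors_to_seconds(sector_count):
    """Playing time, in seconds, of a run of audio sectors."""
    return sector_count / sectors_per_second


print(bytes_per_second, sectors_per_second, audio_frames_per_sector)
```

The division comes out even: 2352 bytes hold exactly 588 stereo sample pairs, and 588 * 75 = 44100.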

The term “red book” is thrown around in relation to audio CDs. There is a series of rainbow books that define various optical disc standards and the red book describes audio CDs.

Basic Data CD-ROMs (Mode 1 / Yellow Book)
Somewhere along the line, someone decided that general digital information could be stored on these discs. Hence, the CD-ROM was born. The standard model above still applies: a TOC and a string of 2352-byte sectors. However, it’s generally only useful to have a single track on a CD-ROM. Thus, the TOC only lists a single track. That single track can easily span the entire disc (something that would be unusual for a typical audio CD).

While the model is mostly the same, the most notable difference between an audio CD and a plain CD-ROM is that, while each sector is 2352 bytes long, only 2048 bytes are used to store actual data payload. The remaining bytes are used for synchronization and additional error detection/correction.

At least, the foregoing is true for mode 1 / form 1 CD-ROMs (which are the most common). “Mode 1” CD-ROMs are defined by a publication called the yellow book. There is also mode 1 / form 2. This forgoes the additional error detection and correction afforded by form 1 and dedicates 2336 of the 2352 sector bytes to the data payload.
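
To make the common 2048-byte case concrete, here is a small Python sketch that pulls the payload out of one raw 2352-byte sector. The byte layout assumed here (12 sync bytes, a 4-byte header whose last byte is the mode, 2048 bytes of data, then 288 bytes of error detection/correction) is my understanding of the format rather than something spelled out above, and the function name is made up. It does not attempt to verify or apply the error correction.

```python
SECTOR_SIZE = 2352
SYNC = b"\x00" + b"\xff" * 10 + b"\x00"  # 12-byte sync pattern (assumed layout)
HEADER_SIZE = 4                          # 3 address bytes + 1 mode byte
USER_DATA_SIZE = 2048
DATA_OFFSET = len(SYNC) + HEADER_SIZE    # payload starts at byte 16


def mode1_user_data(raw):
    """Return the 2048-byte payload of a raw mode 1 sector."""
    if len(raw) != SECTOR_SIZE:
        raise ValueError("expected a %d-byte raw sector" % SECTOR_SIZE)
    if raw[:len(SYNC)] != SYNC:
        raise ValueError("missing sync pattern; probably not a data sector")
    mode = raw[DATA_OFFSET - 1]  # last header byte
    if mode != 1:
        raise ValueError("sector is mode %d, not mode 1" % mode)
    return raw[DATA_OFFSET:DATA_OFFSET + USER_DATA_SIZE]
```

An audio sector has no sync pattern or header at all (all 2352 bytes are samples), which is why the sketch refuses anything that doesn’t start with the sync bytes.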

CD-ROM XA (Mode 2 / Green Book)
From a software perspective, these are similar to mode 1 CD-ROMs. There are also 2 forms here. The first form gives a 2048-byte data payload while the second form yields a 2324-byte data payload.
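
Here is a sketch of how software might tell the two forms apart. It assumes (this is my understanding of the XA layout, not something stated above) that an 8-byte subheader follows the 16 bytes of sync and header, and that bit 5 of the subheader’s submode byte marks a form 2 sector. The function name is invented for this example.

```python
SECTOR_SIZE = 2352
SYNC = b"\x00" + b"\xff" * 10 + b"\x00"  # 12-byte sync pattern
SUBHEADER_OFFSET = 16                    # after 12 sync + 4 header bytes
SUBMODE_OFFSET = SUBHEADER_OFFSET + 2    # file, channel, submode, coding info
DATA_OFFSET = SUBHEADER_OFFSET + 8       # subheader is stored twice (8 bytes)
FORM2_FLAG = 0x20                        # assumed: bit 5 of the submode byte
FORM1_SIZE = 2048
FORM2_SIZE = 2324


def xa_user_data(raw):
    """Return (form, payload) for a raw mode 2 XA sector."""
    if len(raw) != SECTOR_SIZE:
        raise ValueError("expected a %d-byte raw sector" % SECTOR_SIZE)
    if raw[:len(SYNC)] != SYNC or raw[15] != 2:
        raise ValueError("not a mode 2 data sector")
    if raw[SUBMODE_OFFSET] & FORM2_FLAG:
        return 2, raw[DATA_OFFSET:DATA_OFFSET + FORM2_SIZE]
    return 1, raw[DATA_OFFSET:DATA_OFFSET + FORM1_SIZE]
```

The upshot is that the form is a per-sector property, so a single XA track can interleave 2048-byte and 2324-byte payloads.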

Video CD (VCD / White Book)
These are CD-ROM XA discs that carry MPEG-1 video and audio data.

Photo CD (Beige Book)
This is something I have never personally dealt with. But it’s supposed to conform to the CD-ROM XA standard and probably fits into my model. It seems to date back to early in the CD-ROM era when CDs were particularly cost prohibitive.

Multisession CDs (Blue Book)
Okay, I admit that this confuses me a bit. Multisession discs allow a user to burn multiple sessions to a single recordable disc. I.e., burn a lump of data, then burn another lump at a later time, and the final result will look like all the lumps were recorded as the same big lump. I remember this being incredibly useful and cost effective back when recordable CDs cost around US$10 each (vs. being able to buy a spindle of 100 CD-Rs for US$10 or less now). Studying the cdrom.h file for the Linux OS, I found an ioctl named CDROMMULTISESSION that returns the sector address of the start of the last session. If I were to hypothesize about how to make this fit into my model, I might guess that the TOC has some hint that the disc was recorded in multisession (which needs to be decided up front) and the CDROMMULTISESSION call is made to find the last session. Or it could be that a disc read initialization operation always leads off with the CDROMMULTISESSION query in order to determine this.

I suppose I could figure out how to create a multisession disc with modern software, or possibly dig up a multisession disc from 15+ years ago, and then figure out how it should be read.

CD-i
This type puzzles me as well. I do have some CD-i discs and I thought that I could read them just fine (the last time I looked, which was many years ago). But my research for this blog post has me thinking that I might not have been seeing the entire picture when I first studied my CD-i samples. I was able to see some of the data, but sources indicate that only proper CD-i hardware is able to see all of the data on the disc (apparently, the TOC doesn’t show all of the sectors on disc).

Hybrid CDs (Data + Audio)
At some point, it became a notable selling point for an audio CD to have a data track with bonus features. Even more common (particularly in the early era of CD-ROMs) were computer and console games that used the first track of a disc for all the game code and assets and the remaining tracks for beautifully rendered game audio that could also be enjoyed outside the game. Same model: TOC points to the various tracks and also makes notes about which ones are data and which are audio.

There seem to be 2 distinct things described above. One type is the mixed mode CD which generally has the data in the first track and the audio in tracks 2..n. Then there is the enhanced CD, which apparently used multisession recording and put the data at the end. I think that the reasoning for this is that most audio CD player hardware would only read tracks from the first session and would have no way to see the data track. This was a positive thing. By contrast, when placing a mixed-mode CD into an audio player, the data track would be rendered as nonsense noise.
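
In terms of the TOC model, the difference between these layouts could be sketched like this. TocEntry and classify are invented names, and the rules are only the rough descriptions from the paragraphs above, not anything taken from a standard.

```python
from dataclasses import dataclass


@dataclass
class TocEntry:
    track: int
    is_data: bool     # the TOC's note on whether the track is data or audio
    session: int = 1  # which recording session the track belongs to


def classify(toc):
    """Roughly label a disc layout from its TOC entries (sorted by track)."""
    if not toc:
        raise ValueError("empty TOC")
    data = [e for e in toc if e.is_data]
    audio = [e for e in toc if not e.is_data]
    if not data:
        return "audio"
    if not audio:
        return "data"
    first = min(e.session for e in toc)
    # Mixed mode: one session, with the data leading the disc.
    if all(e.session == first for e in toc) and toc[0].is_data:
        return "mixed mode"
    # Enhanced: audio in the first session, data tucked into a later one.
    if all(e.session == first for e in audio) and all(e.session > first for e in data):
        return "enhanced"
    return "other"


game_disc = [TocEntry(1, True), TocEntry(2, False), TocEntry(3, False)]
bonus_disc = [TocEntry(1, False), TocEntry(2, False), TocEntry(3, True, session=2)]
```

Here classify(game_disc) gives "mixed mode" and classify(bonus_disc) gives "enhanced".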

Subchannels
There’s at least one small detail that my model ignores: subchannels. CDs can encode bits of data in subchannels in sectors. This is used for things like CD-Text and CD-G. I may need to revisit this.

In Summary
There’s still a lot of ground to cover, like how those sectors might be formatted to show something useful (e.g., filesystems), and how the model applies to other types of optical discs. Sounds like something for another post.

10 thoughts on “Grand Unified Theory of Compact Disc”

  1. Ian Farquhar

    Photo CDs were Kodak’s first response to the glimmer of concern that digital photography might pose a “tiny” issue to their film business.

    It was announced in 1990, and Wikipedia has an excellent entry on it.

    People look at Kodak now and wonder how they could have missed the havoc digital photography wrought on Kodak’s business (it’s a classical disruptive technology situation). But the evidence suggested Kodak didn’t miss it at all.

    Kodak was sued by Polaroid for copying its instant film process in the 70’s. The lawsuit ran for 14 years, and in 1991 Polaroid was awarded $951M USD!

    This decimated Kodak’s finances, and resulted in a huge amount of R&D the company was working on being shut down. But it wasn’t only financial, because Kodak was also suffering the influx of the MBA (dis-)qualified management, which resulted in short-term thinking and management-through-spreadsheet. Politics were also rife, with management in Rochester killing off viable projects (e.g., Kodak Cineon’s R&D in Melbourne). The company was a mess at the time it needed true leadership and vision.

    Kodak probably still should have moved faster, but in reality, they were in a perfect storm of a disruptive technology plus MBA-management which sees the company where it is today: one of the great technology names of the past, now gone.

    In reality, the PhotoCD idea was pretty good, and made sense in 1992 (when it launched). It kept a role for Kodak’s many photoshops, it provided high-quality scanning, the codec was ok, and it delivered the photos on a viable media. They just weren’t able to deliver it with the mess the company was in.

    BTW, Kodak also claimed 200 year life for PhotoCD’s, a claim I now find laughable.

  2. Joe

    Oh man, CDs. I can go on for hours about CDs.

    I’ll try to keep it brief, though. I have some corrections.

    First, mode 1 does not have forms: all mode 1 sectors have 2048 bytes of usable space, and the rest is error correction and stuff. What you’re calling “mode 1 form 1” is actually just mode 1, and what you’re calling “mode 1 form 2” is really mode 2, without XA. I have never seen a mode 2 CD that did not use the XA extensions, but as far as I know it’s still a legal (if impractical) interpretation of the standard.

    Next, most CD-i discs can be read just fine by PC CD drives. A regular CD-i disc is a single mode 2 XA data track followed by zero or more audio tracks. The data track contains a header at LBA 16, about 30 seconds of audio, and then the rest of the filesystem. All of it can be read using a normal CD drive, provided there are no mastering errors. The ones you have to look out for are the “CD-i ready” discs. These hide the data in the pregap of track 1, which is usually skipped when the disc is read. It’s still possible to access the data, but it takes more effort.

    Finally, the size of a CD-DA “Frame” is 24 bytes according to the standard, but no one really cares about that.

    Whew.

    …Actually, I want to keep talking about CDs.

    How about multisession. As far as I can gather, once the disc is finalized, the TOC will contain everything you need to know. Unfortunately, since I don’t have any multisession CDs, I can’t say for sure if there’s anything else strange going on there. I don’t remember if you can read the sectors from between sessions.

    And subchannel data. Every CD has it, but it’s usually just timecodes. (And the TOC, but usually the CD drive won’t let you see that part of the disc.) It’s also really unreliable: you’ll get something different every time you read it. Fortunately this doesn’t matter too much with timecodes, but it might cause issues with CD+G, CD-Text, and Playstation copy protection. Which reminds me, CD-Text punches your grand unified theory in the face: it likes to hang out in a separate section from the regular data, near the TOC.

    There’s one last thing, and it’s the devil in the details: offsets. Different CD drives read audio tracks shifted by different amounts, and you’ll run into trouble when you want to have one definitive, correct copy of a CD. If it’s data only, good for you – the CD drive automatically corrects for data offsets. Audio only? That’s what AccurateRip is for. Mixed audio and data? Welcome to my pain.

    …So much for keeping it brief…

  3. Jonathan Wilson

    Go ask the developers of the MAME emulator how hard it has been to make an “accurate” RIP of any of the various different optical disk formats and layouts that the hardware they emulate uses.
    They have to deal with all the weird issues that come with trying to create a bit-accurate copy of a CD (rather than just a copy of the data or audio that’s on the CD).
    I think only a handful of devs even know how to properly dump a disk (especially if it’s got CD-audio on it and not just data).

  4. Anonymous

    MAME doesn’t really support “proper” dumps from CDs. The closest they have so far is a single session with data and audio tracks, subchannel data, and a separate TOC.

    If you want a “proper” dump of a CD, you have to find a CD drive that will let you read into the lead-in, and you have to use disc swapping to trick the drive into thinking the whole disc is audio. Very few drives can do this, and even those that can tend to have trouble seeking within the lead-in.

    If you want a bit-accurate copy of a CD, you have two options: go get a microscope, or hook up some expensive electronics to a really old CD player.

  5. Multimedia Mike Post author

    @Joe: Thanks for the info dump, and don’t worry about keeping it brief– this is a Safe Space for yammering about multimedia topics. :-)

    I’ve seen the explanation of a CD-DA frame being 24 bytes, and then there’s the idea that there are 75 2352-byte sectors. There are many ways to look at these wonderful discs but unfortunately some of the terminology gets overloaded.

  6. RC

    I don’t understand the confusion about multi-session CDs. They’re pretty simple, and information is easily available online. Basically, each session is written as a track, and the lead-in of each track contains a pointer to the locations of the previous session’s TOC. A flag at the start tells the operating system that it’s a multi-session disc, and so the software should seek to the last track first, then follow that chain of pointers, and compile the sessions into a single file-system.

    Old or dumb software would ignore the multi-session flag, and just naively read only the first TOC and not see anything else. With the proper software (add-ons for Windows, options to Linux’s mount_iso9660) you can select from a list of sessions, and view each as a snapshot. With some software not mastering multi-session discs properly, and some OSes being particularly stupid, this feature was invaluable for getting at the files on the disc you actually wanted, or un-hiding files that were marked as deleted later.

  7. Multimedia Mike Post author

    @RC: I trust that the multisession information is out there. I just still hadn’t researched it thoroughly at the time I wanted to publish this blog post, which was already too lengthy. :-)

  8. docca

    Thanks for the article, Mike. That sure did bring back good (and sometimes hellish) memories.

    I used to master/rip tons of CDs of all kinds during the 1990s, and this kind of knowledge was fundamental to me. I even owned one or two “rainbow books” at the time, and collected strange and weird media and drives. Unfortunately most of it is gone, like the stack of a dozen or so CD recorders with many different interfaces (mostly SCSI variants). Some of them did cost an arm or a leg at the time. My favourite was the NEC MultiSpin 3Xp (top-loading!) which had the quirk of reading only the first session on multisession discs. It also had a battery pack so you could lug it around as a 2-kg audio CD player. So practical… :)

    You forgot to mention the bizarre CD+G media on the Hybrid section that sometimes contained Karaoke lyrics. And the joy of popping one of those on your stereo and actually managing to damage your speakers and/or your ears after your huge CD-player unit decided to pump raw data to them.

    Ian: I’ve been wondering about the suspenders and beard as well. I already have the beard, just need to let it grow to street-bum levels when the time comes. :)
