This is something I started writing about a decade ago (and I almost certainly have some of it wrong), back when compact discs still had a fair amount of relevance. Back around 2002, after a few years investigating multimedia technology, I took an interest in compact discs of all sorts. Even though there may seem to be a wide range of CD types, I generally found that they’re all fundamentally the same. I thought I would finally publish something, incomplete though it may be.
There are a lot of ways to look at a compact disc. First, there’s the physical format, where a laser detects where pits/grooves have disturbed the smooth surface (a.k.a. lands). A lot of technical descriptions claim that these lands and pits on a CD correspond to ones and zeros. That’s not actually true, but you have to decide what level of abstraction you care about, and that abstraction is good enough if you only care about the discs from a software perspective.
Grand Unified Theory (Software Perspective)
Looking at a disc from a software perspective, I have generally found it useful to view a CD as a combination of 2 main components:
- table of contents (TOC)
- a long string of sectors, each of which is 2352 bytes long
I like to believe that’s pretty much all there is to it. All of the information on a CD is stored as a string of sectors that might be chopped up into a series of anywhere from 1-99 individual tracks. The exact sector locations where these individual tracks begin are defined in the TOC.
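The model above can be sketched as a small data structure. This is only an illustration of the TOC-plus-sectors idea: the names `Track`, `Disc`, `track_extent` and `total_sectors` are made up for this sketch, and it ignores details such as pregaps and the lead-out area.

```python
from dataclasses import dataclass

SECTOR_SIZE = 2352  # bytes per raw sector


@dataclass
class Track:
    number: int        # track number, 1..99
    start_sector: int  # first sector of the track, as listed in the TOC


@dataclass
class Disc:
    toc: list           # list of Track, sorted by start_sector
    total_sectors: int  # length of the whole sector string (hypothetical field)

    def track_extent(self, number):
        """Return (first_sector, sector_count) for the given track."""
        for i, track in enumerate(self.toc):
            if track.number == number:
                # A track runs until the next one starts, or to the end of the disc.
                if i + 1 < len(self.toc):
                    end = self.toc[i + 1].start_sector
                else:
                    end = self.total_sectors
                return track.start_sector, end - track.start_sector
        raise KeyError(f"no track {number} in TOC")

    def byte_range(self, number):
        """Return (byte_offset, byte_length) of a track within a raw sector dump."""
        first, count = self.track_extent(number)
        return first * SECTOR_SIZE, count * SECTOR_SIZE
```

With a raw dump of the sector string in hand, `byte_range` would be enough to slice out any one track.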
Audio CDs (CD-DA / Red Book)
The initial purpose for the compact disc was to store digital audio. The strange sector size of 2352 bytes is an artifact of this original charter. “CD quality audio”, as any multimedia nerd knows, is formally defined as stereo PCM samples that are each 16 bits wide and played at a frequency of 44100 Hz.
(44100 audio frames / 1 second) * (2 samples / audio frame) * (16 bits / 1 sample) * (1 byte / 8 bits) = 176,400 bytes / second
(176,400 bytes / 1 second) / (2352 bytes / 1 sector) = 75 sectors / second
75 is the number of sectors required to store a single second of CD-quality audio. A single sector stores 1/75th of a second, or a ‘frame’ of audio (though I think ‘frame’ gets tossed around at all levels when describing CD formats).
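The same arithmetic, written out as a quick sanity check. The helper names (`audio_seconds`, `sectors_for`) are just for illustration.

```python
SAMPLE_RATE = 44100     # audio frames per second
CHANNELS = 2            # stereo
BYTES_PER_SAMPLE = 2    # 16 bits
SECTOR_SIZE = 2352      # bytes per sector

BYTES_PER_SECOND = SAMPLE_RATE * CHANNELS * BYTES_PER_SAMPLE
SECTORS_PER_SECOND = BYTES_PER_SECOND // SECTOR_SIZE
# Stereo audio frames (one left + one right sample) held by a single sector
AUDIO_FRAMES_PER_SECTOR = SECTOR_SIZE // (CHANNELS * BYTES_PER_SAMPLE)


def audio_seconds(sector_count):
    """Playing time, in seconds, of a run of audio sectors."""
    return sector_count / SECTORS_PER_SECOND


def sectors_for(seconds):
    """Number of sectors needed for a whole number of seconds of audio."""
    return seconds * SECTORS_PER_SECOND


print(BYTES_PER_SECOND)         # 176400
print(SECTORS_PER_SECOND)       # 75
print(AUDIO_FRAMES_PER_SECTOR)  # 588
print(sectors_for(74 * 60))     # 333000 sectors for 74 minutes
```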
The term “red book” is thrown around in relation to audio CDs. There is a series of rainbow books that define various optical disc standards and the red book describes audio CDs.
Basic Data CD-ROMs (Mode 1 / Yellow Book)
Somewhere along the line, someone decided that general digital information could be stored on these discs. Hence, the CD-ROM was born. The standard model above still applies: a TOC and a string of 2352-byte sectors. However, it’s generally only useful to have a single track on a CD-ROM. Thus, the TOC only lists a single track. That single track can easily span the entire disc (something that would be unusual for a typical audio CD).
While the model is mostly the same, the most notable difference between an audio CD and a plain CD-ROM is that, while each sector is 2352 bytes long, only 2048 bytes are used to store actual data payload. The remaining bytes are used for synchronization and additional error detection/correction.
At least, the foregoing is true for mode 1 CD-ROMs (which are the most common). “Mode 1” CD-ROMs are defined by a publication called the yellow book. The yellow book also defines a plain mode 2. This forgoes the additional error detection and correction afforded by mode 1 and dedicates 2336 of the 2352 sector bytes to the data payload.
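To make the 2048-of-2352 split concrete, here is a minimal sketch of pulling the payload out of one raw mode 1 sector. It assumes the usual layout as I understand it: 12 sync bytes, a 4-byte header (3 address bytes plus a mode byte), 2048 bytes of data, then 288 bytes of error detection/correction. The function name `mode1_payload` is made up for the example.

```python
SECTOR_SIZE = 2352
# Every data sector leads off with this 12-byte sync pattern
SYNC = b"\x00" + b"\xff" * 10 + b"\x00"
HEADER_END = 16    # 12 sync bytes + 3 address bytes + 1 mode byte
PAYLOAD_SIZE = 2048


def mode1_payload(raw):
    """Return the 2048-byte data payload of a raw mode 1 sector."""
    if len(raw) != SECTOR_SIZE:
        raise ValueError("a raw sector must be exactly 2352 bytes")
    if raw[:12] != SYNC:
        raise ValueError("missing sync pattern (audio sector, perhaps?)")
    if raw[15] != 1:
        raise ValueError(f"not a mode 1 sector (mode byte is {raw[15]})")
    # The 288 bytes after the payload (EDC, padding, ECC) are simply skipped
    # here; this sketch does not verify or correct anything.
    return raw[HEADER_END:HEADER_END + PAYLOAD_SIZE]
```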
CD-ROM XA (Mode 2 / Green Book)
From a software perspective, these are similar to mode 1 CD-ROMs. XA sectors come in 2 forms. The first form gives a 2048-byte data payload while the second form yields a 2324-byte data payload.
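A rough sketch of telling the 2 forms apart, assuming the layout I believe XA uses: the usual 16 bytes of sync and header (with a mode byte of 2), followed by an 8-byte subheader whose third byte (the submode) carries a form flag in bit 5. The function name `xa_payload` is hypothetical.

```python
SECTOR_SIZE = 2352
SYNC = b"\x00" + b"\xff" * 10 + b"\x00"
SUBHEADER_START = 16  # right after sync + header
DATA_START = 24       # right after the 8-byte subheader
FORM2_FLAG = 0x20     # bit 5 of the submode byte


def xa_payload(raw):
    """Return (form, payload) for a raw mode 2 XA sector."""
    if len(raw) != SECTOR_SIZE:
        raise ValueError("a raw sector must be exactly 2352 bytes")
    if raw[:12] != SYNC or raw[15] != 2:
        raise ValueError("not a mode 2 sector")
    submode = raw[SUBHEADER_START + 2]
    if submode & FORM2_FLAG:
        # Form 2: larger payload, no error correction bytes
        return 2, raw[DATA_START:DATA_START + 2324]
    # Form 1: same payload size as mode 1, with error correction after it
    return 1, raw[DATA_START:DATA_START + 2048]
```

The larger form 2 payload suits data that can tolerate the occasional error, such as compressed audio and video.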
Video CD (VCD / White Book)
These are CD-ROM XA discs that carry MPEG-1 video and audio data.
Photo CD (Beige Book)
This is something I have never personally dealt with. But it’s supposed to conform to the CD-ROM XA standard and probably fits into my model. It seems to date back to early in the CD-ROM era when CDs were particularly cost prohibitive.
Multisession CDs (Blue Book)
Okay, I admit that this confuses me a bit. Multisession discs allow a user to burn multiple sessions to a single recordable disc. I.e., burn a lump of data, then burn another lump at a later time, and the final result will look like all the lumps were recorded as the same big lump. I remember this being incredibly useful and cost effective back when recordable CDs cost around US$10 each (vs. being able to buy a spindle of 100 CD-Rs for US$10 or less now). Studying the cdrom.h header file in Linux, I found an ioctl named CDROMMULTISESSION that returns the sector address of the start of the last session. If I were to hypothesize about how to make this fit into my model, I might guess that the TOC has some hint that the disc was recorded in multisession (which needs to be decided up front) and the CDROMMULTISESSION call is made to find the last session. Or it could be that a disc read initialization operation always leads off with the CDROMMULTISESSION query in order to determine this.
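For the curious, this is roughly what the CDROMMULTISESSION query might look like from Python. Treat it as an untested sketch: the request number and the struct layout are my reading of linux/cdrom.h (a 4-byte address, an XA flag byte and an address format byte), the `/dev/cdrom` path is an assumption, and the helper names are invented for the example.

```python
import struct

# Values as I read them in <linux/cdrom.h>
CDROMMULTISESSION = 0x5310
CDROM_LBA = 0x01  # ask for a plain sector number rather than min/sec/frame

# struct cdrom_multisession: 4-byte address, xa_flag, addr_format, 2 pad bytes
MULTISESSION_FMT = "iBBxx"


def pack_request():
    """Build the buffer handed to the ioctl, requesting an LBA address."""
    return struct.pack(MULTISESSION_FMT, 0, 0, CDROM_LBA)


def parse_reply(buf):
    """Return (start_sector_of_last_session, is_xa) from the ioctl reply."""
    lba, xa_flag, _addr_format = struct.unpack(MULTISESSION_FMT, buf)
    return lba, bool(xa_flag)


def last_session_start(device="/dev/cdrom"):
    """Query the drive (Linux only; needs a disc in the drive)."""
    import fcntl
    import os

    # O_NONBLOCK lets the open succeed without waiting on the drive
    fd = os.open(device, os.O_RDONLY | os.O_NONBLOCK)
    try:
        reply = fcntl.ioctl(fd, CDROMMULTISESSION, pack_request())
    finally:
        os.close(fd)
    return parse_reply(reply)
```

Calling `last_session_start()` on a single-session disc would presumably report sector 0, but I have not verified that.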
I suppose I could figure out how to create a multisession disc with modern software, or possibly dig up a multisession disc from 15+ years ago, and then figure out how it should be read.
CD-i
This type puzzles me as well. I do have some CD-i discs and I thought that I could read them just fine (the last time I looked, which was many years ago). But my research for this blog post has me thinking that I might not have been seeing the entire picture when I first studied my CD-i samples. I was able to see some of the data, but sources indicate that only proper CD-i hardware is able to see all of the data on the disc (apparently, the TOC doesn’t show all of the sectors on disc).
Hybrid CDs (Data + Audio)
At some point, it became a notable selling point for an audio CD to have a data track with bonus features. Even more common (particularly in the early era of CD-ROMs) were computer and console games that used the first track of a disc for all the game code and assets and the remaining tracks for beautifully rendered game audio that could also be enjoyed outside the game. Same model: TOC points to the various tracks and also makes notes about which ones are data and which are audio.
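As a small illustration of the TOC “making notes” about track types: each TOC entry carries a control field, and bit 0x04 of that field marks a data track (linux/cdrom.h calls it CDROM_DATA_TRACK). The tuple layout and the sample disc below are invented for the sketch.

```python
CDROM_DATA_TRACK = 0x04  # control-field bit that marks a data track


def describe_toc(entries):
    """Label each TOC entry as data or audio.

    entries: iterable of (track_number, control, start_sector) tuples,
    a simplified stand-in for real TOC entries.
    """
    described = []
    for number, control, start_sector in entries:
        kind = "data" if control & CDROM_DATA_TRACK else "audio"
        described.append((number, kind, start_sector))
    return described


# A made-up game disc: code and assets in track 1, music in tracks 2 and 3
game_disc = [(1, 0x04, 0), (2, 0x00, 150000), (3, 0x00, 170000)]
for number, kind, start_sector in describe_toc(game_disc):
    print(f"track {number}: {kind}, starts at sector {start_sector}")
```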
There seem to be 2 distinct things described above. One type is the mixed mode CD which generally has the data in the first track and the audio in tracks 2..n. Then there is the enhanced CD, which apparently used multisession recording and put the data at the end. I think that the reasoning for this is that most audio CD player hardware would only read tracks from the first session and would have no way to see the data track. This was a positive thing. By contrast, when placing a mixed-mode CD into an audio player, the data track would be rendered as nonsense noise.
There’s at least one small detail that my model ignores: subchannels. CDs can encode bits of data in subchannels in sectors. This is used for things like CD-Text and CD-G. I may need to revisit this.
There’s still a lot of ground to cover, like how those sectors might be formatted to show something useful (e.g., filesystems), and how the model applies to other types of optical discs. Sounds like something for another post.