Breaking Eggs And Making Omelettes

Topics On Multimedia Technology and Reverse Engineering


XSPF And XML

November 1st, 2007 by Multimedia Mike

Every so often, it’s a good idea to surf over to Xiph’s site to see if they have absorbed any other well-meaning multimedia-related free software projects. I’m not sure if XSPF started out as a separate effort, but it’s underneath the Xiph umbrella now. The project is billed as “the XML format for sharing playlists.” Yippee. To continue the “those who can, do…” series: Those who can, do; those who can’t, create metadata formats. Anyway, all the buzzwords are there: XML, open, portable, simple, XML. I’m surprised that it’s not an RFC yet (that I could find). I’m sure it’s only a matter of time. Going forward, all of the free multimedia players will be morally obligated to support XSPF. Be advised.

Maybe I’m just irritated because XSPF is supposed to be pronounced “spiff” which, to me, defiles the memory of Calvin.

I think those 3 letters — XML — put me off of this idea the most. Every now and then, I have entertained the idea of using XML to store or transport data for my own programs. But then I realize that I may as well just use an arbitrary binary format that is easier to parse. After all, isn’t XML just an arbitrary textual format? Actually, no. Arbitrary textual data would be easier to parse (e.g., records of data separated by carriage returns with individual fields separated by commas or some other character guaranteed not to occur in the regular data; i.e., CSV). XML requires strict structure around the arbitrary textual data.
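To make the comparison concrete, here is a minimal sketch in Python. The sample tracks, the field order, and the `<playlist>`/`<track>`/`<location>`/`<title>` element names are assumptions chosen for illustration and are not meant to reproduce the actual XSPF schema. The flat parser assumes a comma never occurs in the first field.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: the same two tracks in both representations.
FLAT_TEXT = "song1.ogg,First Song\nsong2.ogg,Second Song\n"

XML_TEXT = """<?xml version="1.0"?>
<playlist>
  <track><location>song1.ogg</location><title>First Song</title></track>
  <track><location>song2.ogg</location><title>Second Song</title></track>
</playlist>
"""


def parse_flat(text):
    """Split records on newlines and fields on the first comma."""
    tracks = []
    for line in text.splitlines():
        if not line:
            continue  # skip blank lines
        location, title = line.split(",", 1)
        tracks.append((location, title))
    return tracks


def parse_xml(text):
    """Walk the element tree; a full XML parser does the heavy lifting."""
    root = ET.fromstring(text)
    return [
        (track.findtext("location"), track.findtext("title"))
        for track in root.iter("track")
    ]


if __name__ == "__main__":
    print(parse_flat(FLAT_TEXT))
    print(parse_xml(XML_TEXT))
    print(len(FLAT_TEXT.encode()), "bytes flat vs",
          len(XML_TEXT.encode()), "bytes XML")
```

Both functions return the same list of tuples. The difference is what sits underneath: the flat version needs two string splits, while the XML version only looks short because a complete parser library is doing the work.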

As my esteemed multimedia hacking colleague, Attila Kinali, once articulated, “if you really think that XML is the answer, then you definitely misunderstood the question.” (Attila: Michael and I and the rest of the gang are going to make sure that quote is what you’re best known for.)

I think that XML is intuitively antithetical in the mind of the average multimedia hacker. Such an individual instinctively attempts to encode things with as few bits as possible for the express purpose of making transport or storage of that data more efficient. XML explicitly defies that notion by representing information with way more bits than are necessary.

Suddenly, I find myself wondering about representing DCT coefficient data using an XML schema — why not express a JPEG as a human-readable XML file?

[xml]
<?xml version="1.0"?>
[/xml]

Don’t laugh: it would be extensible. Someone could, for example, add markup to individual macroblocks. Is it any more outlandish than, say, specifying vector drawings as XML?
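For a rough sense of the cost, here is a minimal Python sketch that writes one 8x8 block of quantized DCT coefficients both as packed 16-bit integers and as XML. The coefficient values and the `<block>`/`<coeff>` element names are hypothetical, invented for this example, and the fixed 16-bit packing is only a simple baseline (a real JPEG entropy-codes its coefficients, so it is smaller still).

```python
import struct
import xml.etree.ElementTree as ET

# Hypothetical 8x8 block of quantized DCT coefficients (row-major order):
# a DC term, a few low-frequency AC terms, and zeros elsewhere.
COEFFS = [-26, -3, -6, 2] + [0] * 60


def block_to_binary(coeffs):
    """Pack 64 coefficients as big-endian signed 16-bit integers."""
    return struct.pack(">64h", *coeffs)


def block_to_xml(coeffs):
    """Emit one <coeff> element per coefficient under a <block> element."""
    block = ET.Element("block", width="8", height="8")
    for index, value in enumerate(coeffs):
        coeff = ET.SubElement(block, "coeff", index=str(index))
        coeff.text = str(value)
    return ET.tostring(block, encoding="utf-8")


def xml_to_block(data):
    """Recover the coefficient list from the XML form."""
    block = ET.fromstring(data)
    return [int(coeff.text) for coeff in block.iter("coeff")]


if __name__ == "__main__":
    binary = block_to_binary(COEFFS)
    xml_bytes = block_to_xml(COEFFS)
    assert xml_to_block(xml_bytes) == COEFFS  # the XML form round-trips
    print(len(binary), "bytes as packed integers")
    print(len(xml_bytes), "bytes as XML")
```

The packed form is always 128 bytes per block. The XML form carries the same 64 numbers plus an opening tag, a closing tag, and an attribute for every one of them.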

Posted in Open Source Multimedia

10 Responses

  1. Kostya Says:

    Mike, you’ve forgotten about using XSLT to transform it into PNG, for example.
    I thought once about XML container:

    <xmlcontainer>
    <header>
     <stream id="1" width=".." height=".." codec="&codecXVID;"/>
    </header>
    <data>
       <chunk type="video" stream="1" blkdata="(base64-encoded data)"/>
    </data>
    </xmlcontainer>

    and external stylesheet to reference all known codecs.

    It would deserve a good flame if submitted to ffmpeg-devel.

  2. SvdB Says:

    I consider the main advantage of an XML format to be the availability of tools that you can let loose on your data, thus making your data more “open”.
    As a storage format, as opposed to a format to manipulate a file in, it seems less usable, at least for larger files.

    As for expanding image data into human-readable form, are you familiar with SNG (http://sng.sourceforge.net/)? It’s one of the more useful tools when working with png files, imho.

  3. Lucas Says:

    The purpose of XML in XSPF is to enable sharing. XML is great at being precise about the details of an encoding, especially things like how you specify the character set. Your bit-level approach wouldn’t be able to do that, and would leave developers unable to create interoperable playlist software.

    If you were coming from an internet protocols background instead of DSP you’d know all this. :) Bit-level people almost never have an instinct for internet protocols. And vice versa, I guess… See http://gonze.com/playlists/playlist-format-survey.html for the state of things right before we started on XSPF.

    BTW the first home of XSPF was at Musicbrainz, which still hosts our mailing list.

  4. Multimedia Mike Says:

    @SvdB: Wow! Every time I think I am only joking about something, I come to the stark realization that I wasn’t. Anyway, I just tried to compile SNG but configure complained about not finding an X color database.

    @Lucas: At the very least, thanks for the compare and contrast paper about the various playlist formats. That will form a useful basis for documenting playlist formats in the MultimediaWiki.

  5. SvdB Says:

    The use of SNG isn’t so much about the pixel data (at least for me), but rather the other chunks in the png file.

    “X color database” refers to the rgb.txt file that comes with X. Explicitly telling the configure script where it can find that file by adding something like “--with-rgbtxt=/opt/X11/share/X11/rgb.txt” to configure should solve your problem.

  6. sean barrett Says:

    XML is so frustrating. The game industry accepted XML as the basis for a 3d model interchange format — http://en.wikipedia.org/wiki/COLLADA

    From what I can tell from the specification, it doesn’t really leverage the only part of XML that theoretically makes XML valuable: nested content. Instead, lots of content works by cross-reference (you can define a ‘stream’ in one place, then access it from another place by name). (Yes, there is some nesting.)

    “Flat ascii file” is a “meta-format”; “flat binary file” is a “meta-format”–you can define other files from them. “Flat ascii file with one record per line” is a useful meta-format; it means you can apply certain kinds of unix pipes to them. And you get that advantage at an extremely low cost (it’s an extremely simple meta-file format).

    XML, on the other hand, is huge and hugely complex (in part because of the encodings someone talks about above); is not human-readable (the original plan for XML held out human-readability as a goal, but that text eventually got silently deleted from the w3 site); and induces a huge amount of complexity for the mere advantage of being able to have nested structure (compared to the flat structure mentioned above).

    (There are similar binary meta-formats, such as IFF (RIFF, TIFF, etc.), which are much, much simpler. Again, the whole encoding thing comes into play.)

    And then people make tools that handle the full complexity of XML (mostly by using godawful slow parsers from somewhere else, because parsing XML is rather complicated because the spec is so big), and then people tout an advantage of XML being that there are those tools. (And then there’s also the possibility of DTDs, and XSLT, etc.)

    And it’s very hard to convince anyone _using_ those XML tools that we’d be better off with some simpler tools on a simpler meta-file-format, because, hey, they have these tools right now.

    Never mind that none of this addresses the “hard part”: semantics. Apparently there are a lot of “programmers” out there for whom being able to transform the syntactic representation without understanding the syntax is actually useful.

    And, anyway, XML becomes the classic sledgehammer/nail situation. XML has a ton of features piled into it (nested content; named entities; character encodings, etc.), so we get cases with things like Collada that leverage XML for some tiny subset of its features, pay a huge size-of-file and parsing cost, and probably don’t ever actually get used with any of the aforementioned tools.

    See also http://www.dajeil.com/Products.asp

  7. Multimedia Mike Says:

    http://www.dajeil.com/Products.asp … whoa! I almost choked when I was introduced just now to the concept of hardware built specifically to accelerate XML processing.

    Thanks for your comment, Sean. It goes to the heart of what annoys me most about XML. This view may be programmer-centric (a perspective which does not always lead to the most usable software), but whenever I examine a data format, I consider it in terms of the code investment for processing that data. Fixed binary formats: trivial. Line-based textual formats: still pretty straightforward. XML: [gasp]. I realize that there are all kinds of third-party libraries that are supposed to take the pain away from XML parsing. But some of us still have a desire, deep down, to make lean and mean programs. Large parsing libraries tend to violate that goal.

    Further, when I think of complex textual metadata formats, I can’t shake the concern about security: buffer overflows and such.

  8. sean barrett Says:

    It is probably not a coincidence that I am also a programmer who uses plain C.

  9. Luca Barbato Says:

    XML is a wonderful markup that should phase out those relics called TeX…

    Besides that, it shouldn’t be used for other stuff (e.g. configuration files), and it shouldn’t be used to store data you don’t intend to convert or reshape using style sheets.

    I really like being able to write a DocBook document, render it as XHTML for online viewing, produce a PDF from it, and so on.

    Actually, I’m thinking about YAML, since it seems less annoying to use. Is anybody up for preparing a DocBook-like YAML markup?

  10. Spudd86 Says:

    RE: Luca

    TeX is not a markup language. It is a macro expansion language, which is much more powerful. You can write programs in TeX (it is in fact Turing complete, meaning it can compute anything that is computable).

    TeX is easier to read than equivalent XML because TeX leverages whitespace for separation where XML utterly ignores it. TeX is less verbose than XML because it is more specialized.

    TeX is not perfect, but for a powerful typesetting environment it’s hard to think of a better conceptual design that doesn’t involve building a GUI (and a GUI that will be just as hard to learn as TeX at that). XML is not the answer here.

    XML is good at what it was originally designed for: web-based stuff that needs structured data in a format that allows free mixing of different structures.