{"id":1670,"date":"2009-07-25T12:11:48","date_gmt":"2009-07-25T19:11:48","guid":{"rendered":"http:\/\/multimedia.cx\/eggs\/?p=1670"},"modified":"2009-09-18T21:47:30","modified_gmt":"2009-09-19T04:47:30","slug":"xml-monkey","status":"publish","type":"post","link":"https:\/\/multimedia.cx\/eggs\/xml-monkey\/","title":{"rendered":"XML Monkey"},"content":{"rendered":"<p>I&#8217;m trying to come to terms with the reality that is XML. I may not like the format but that won&#8217;t change the fact that I have to interoperate with various XML data formats already in the wild. In other words, treat it like any random multimedia format. For example, suppose I want to write software to interpret the various comics that I&#8217;ve created with <a href=\"http:\/\/games.multimedia.cx\/another-taco-bell-promo\/\">Taco Bell&#8217;s series of Comics Constructors CD-ROMs<\/a>.<\/p>\n<p><center><br \/>\n<img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/multimedia.cx\/eggs\/wp-content\/uploads\/2009\/07\/xml-monkey-top-panel.jpg\" alt=\"Amazon Raiders: XML Monkey, top panel\" title=\"Amazon Raiders: XML Monkey, top panel\" width=\"447\" height=\"194\" class=\"aligncenter size-full wp-image-1671\" srcset=\"https:\/\/multimedia.cx\/eggs\/wp-content\/uploads\/2009\/07\/xml-monkey-top-panel.jpg 447w, https:\/\/multimedia.cx\/eggs\/wp-content\/uploads\/2009\/07\/xml-monkey-top-panel-300x130.jpg 300w\" sizes=\"auto, (max-width: 447px) 100vw, 447px\" \/><br \/>\n<\/center><\/p>\n<p><!--more--><\/p>\n<p><center><br \/>\n<img loading=\"lazy\" decoding=\"async\" src=\"http:\/\/multimedia.cx\/eggs\/wp-content\/uploads\/2009\/07\/xml-monkey-bottom-panels.jpg\" alt=\"Amazon Raiders: XML Monkey, bottom panels\" title=\"Amazon Raiders: XML Monkey, bottom panels\" width=\"447\" height=\"403\" class=\"aligncenter size-full wp-image-1672\" srcset=\"https:\/\/multimedia.cx\/eggs\/wp-content\/uploads\/2009\/07\/xml-monkey-bottom-panels.jpg 447w, 
https:\/\/multimedia.cx\/eggs\/wp-content\/uploads\/2009\/07\/xml-monkey-bottom-panels-300x270.jpg 300w\" sizes=\"auto, (max-width: 447px) 100vw, 447px\" \/><br \/>\n<\/center><\/p>\n<p>The comics are saved as XML files that look something like this:<\/p>\n<p>[xml]<br \/>\n<comic>\n  <page0 name=\"pgt1\">\n    <sq1 mirror=\"0\" rotation=\"0\" scale=\"350\" y=\"283\" x=\"388\" bg=\"bg07\"><br \/>\n      <object sq=\"1\" libType=\"characters\" depth=\"1\" mirror=\"0\" rotation=\"0\" scale=\"100\" y=\"368\" x=\"196\" name=\"ch01\" \/><br \/>\n      <object sq=\"1\" libType=\"characters\" depth=\"2\" mirror=\"1\" rotation=\"0\" scale=\"100\" y=\"370\" x=\"338\" name=\"ch10\" \/><br \/>\n      <object sq=\"1\" libType=\"characters\" depth=\"3\" mirror=\"0\" rotation=\"0\" scale=\"100\" y=\"376\" x=\"342\" name=\"0\" \/><br \/>\n      <object sq=\"1\" libType=\"objects\" depth=\"4\" mirror=\"0\" rotation=\"0\" scale=\"100\" y=\"367\" x=\"469\" name=\"ob02\" \/><br \/>\n      <object txtColor=\"\" cont=\"We might as well face it-- XML isn&apos;t going away\" sq=\"1\" libType=\"bubbles\" depth=\"5\" mirror=\"1\" rotation=\"0\" scale=\"100\" y=\"265\" x=\"216\" name=\"bu01\" \/><br \/>\n      <object sq=\"1\" libType=\"characters\" depth=\"6\" mirror=\"0\" rotation=\"0\" scale=\"80\" y=\"321\" x=\"168\" name=\"ch19\" \/><br \/>\n    <\/sq1><br \/>\n&#8230;\n  <\/page0>\n&#8230;<br \/>\n<\/comic><br \/>\n[\/xml]<\/p>\n<p>How to even begin with this? Sometimes a good book can help. Yesterday, I found an old book from 1999 called <a href=\"http:\/\/www.amazon.com\/Just-Xml-John-E-Simpson\/dp\/0139434178\">&#8220;Just XML&#8221; by John E. Simpson<\/a>. It weighs in at nearly 400 pages. I thought XML was supposed to be relatively straightforward to understand.<\/p>\n<p>The book is supposed to be geared toward web programmers. I&#8217;m not a web programmer, but I do wish to know how to programmatically access this data. 
I have seen that Python has interfaces to libraries that parse XML. So I shoved xml-monkey.xml through the example code shown at the end of <a href=\"http:\/\/docs.python.org\/library\/pyexpat.html\">Python&#8217;s xml.parsers.expat<\/a> documentation. This yields:<\/p>\n<pre>\r\nStart element: COMIC {}\r\nStart element: PAGE0 {u'name': u'pgt1'}\r\nStart element: SQ1 {u'scale': u'350', u'bg': u'bg07', \r\n  u'mirror': u'0', u'y': u'283', u'x': u'388', u'rotation': u'0'}\r\nStart element: OBJECT {u'scale': u'100', u'name': u'ch01', \r\n  u'sq': u'1', u'depth': u'1', u'mirror': u'0', u'y': u'368', u'x':\r\n  u'196', u'rotation': u'0', u'libType': u'characters'}\r\nEnd element: OBJECT\r\nStart element: OBJECT {u'scale': u'100', u'name': u'ch10', \r\n  u'sq': u'1', u'depth': u'2', u'mirror': u'1', u'y': u'370', u'x': \r\n  u'338', u'rotation': u'0', u'libType': u'characters'}\r\nEnd element: OBJECT\r\nStart element: OBJECT {u'scale': u'100', u'name': u'0', u'sq':\r\n  u'1', u'depth': u'3', u'mirror': u'0', u'y': u'376', u'x': u'342',\r\n  u'rotation': u'0', u'libType': u'characters'}\r\nEnd element: OBJECT\r\nStart element: OBJECT {u'scale': u'100', u'name': u'ob02', \r\n  u'sq': u'1', u'depth': u'4', u'mirror': u'0', u'y': u'367', u'x': \r\n  u'469', u'rotation': u'0', u'libType': u'objects'}\r\nEnd element: OBJECT\r\nStart element: OBJECT {u'scale': u'100', \r\n  u'cont': u\"We might as well face it-- XML isn't going away\", \r\n  u'name': u'bu01', u'sq': u'1', u'txtColor': u'', u'depth': u'5', u'mirror': \r\n  u'1', u'y': u'265', u'x': u'216', u'libType': u'bubbles', u'rotation': u'0'}\r\n...\r\n<\/pre>\n<p>So that&#8217;s something. I thought XML documents were required to start with a little more boilerplate such as &lt;?xml version=&quot;1.0&quot; encoding=&quot;UTF-8&quot;?&gt;. I see that there are a few levels of XML validity. The first is &#8220;well-formed&#8221;, in which the document adheres to basic XML syntactic rules. 
Then there&#8217;s actually being &#8220;valid&#8221;, which requires a document type definition (DTD) to validate against. That DTD, I do not have.<\/p>\n<p>But this is still a good start. I can see how I might start processing the data using Python. This is good since I am encountering more and more XML files that I&#8217;m interested in manipulating.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Getting my feet wet with Python and XML, as well as a Comics Constructor software package<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[28,55],"tags":[],"class_list":["post-1670","post","type-post","status-publish","format-standard","hentry","category-programming","category-python"],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/posts\/1670","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/comments?post=1670"}],"version-history":[{"count":7,"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/posts\/1670\/revisions"}],"predecessor-version":[{"id":1812,"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/posts\/1670\/revisions\/1812"}],"wp:attachment":[{"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/media?parent=1670"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/categories?post=1670"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/multimedia.cx\/eggs\/wp-json\/wp\/v2\/tags?post=1670"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}