overlooked
The Apache Muse code base includes a collection of DOM-based convenience methods that help us parse and construct XML fragments without a lot of DOM API boilerplate. These methods are included in an incredibly large class named XmlUtils and represent the collective knowledge, mistakes, and advice of a half dozen developers who have worked on WS-* technology for over three years. Changing any of the methods in XmlUtils could leave the Muse build totally broken; despite the fact that it's been extracted from the core engine into a small library, XmlUtils is very much a core piece of code because of how heavily dependent on it the rest of the project is.
Any piece of code that is so central to a project is bound to have one or two (or ten) ugly hacks to handle edge cases, bugs in dependencies, and backwards compatibility. XmlUtils is fairly hack-free, but it does have some code that I consider... unsavory. In my opinion, the most cringe-worthy code snippets are the ones that need to accept an XML element and iterate over its child elements. The only way to get all of the direct children of an element is to call Node.getChildNodes() and save those Node objects that are actually Elements; such code involves one or more if blocks that check the concrete type of the Node objects and ignore the undesired ones:
    if (nextNode.getNodeType() != Node.ELEMENT_NODE)
        continue;

Today these checks are common, but when I first wrote the code I allowed myself to make a number of assumptions because it was only used for traversing schema-validated SOAP messages. Then, one day, we added a configuration file[1], and all of a sudden the input was much less predictable. One of the first bugs I had to fix was the one caused by comments in the configuration file; XML comments become their own DOM nodes - different from Element or Text - and my code failed when comments were added in places where I expected a child element. My final fix eliminated comments from the DOM tree altogether, but I kept the conditionals mentioned above on the off chance that we encountered XML pre-processor nodes or CDATA nodes. I would not be fooled again!
    DocumentBuilderFactory factory =
        DocumentBuilderFactory.newInstance();

    //
    // we don't need comment nodes - they'll only
    // slow us down
    //
    factory.setIgnoringComments(true);

This bug, and others like it, were part of my education into the more pedantic corners of XML Land. Over the course of three years I learned many things about XML, and while many of them were logged in my brain and never used again, all of them helped to drive home the message that XML-related issues were never as simple as they first seemed. When reading or writing XML documents or schemas, great care must be taken to ensure that edge cases and ambiguity are not hiding in the bushes. And as much as I malign the DOM API, it does a pretty good job of alerting you to the fact that XML processing in a real world system is never as easy as more user-friendly APIs would have you believe.
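For anyone who wants to see the two fragments above working together, here is a minimal sketch. It is not the actual XmlUtils code: the class name ChildElements, the helpers getChildElements() and parse(), and the sample configuration XML are all made up for illustration. Only the node-type check and the setIgnoringComments() call come from the snippets above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ChildElements {

    // Hypothetical helper: collects only the direct children of 'parent'
    // that are Elements, skipping comments, text, CDATA, and so on.
    static List<Element> getChildElements(Element parent) {
        List<Element> result = new ArrayList<>();
        NodeList children = parent.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node nextNode = children.item(i);
            if (nextNode.getNodeType() != Node.ELEMENT_NODE)
                continue;
            result.add((Element) nextNode);
        }
        return result;
    }

    // Hypothetical helper: parses an XML string, optionally dropping
    // comment nodes while the DOM tree is being built.
    static Document parse(String xml, boolean ignoreComments) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setIgnoringComments(ignoreComments);
            return factory.newDocumentBuilder()
                          .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Made-up configuration fragment with comments between the elements.
        String xml = "<config><!-- first --><a/><!-- second --><b/></config>";

        Element kept = parse(xml, false).getDocumentElement();
        Element dropped = parse(xml, true).getDocumentElement();

        // With comments kept, the raw child list has comment nodes mixed in.
        System.out.println("raw children, comments kept:    "
                + kept.getChildNodes().getLength());
        // With comments ignored, only the two elements remain.
        System.out.println("raw children, comments ignored: "
                + dropped.getChildNodes().getLength());
        // The type check gives the same answer either way.
        System.out.println("child elements either way:      "
                + getChildElements(kept).size());
    }
}
```

The point of keeping both defenses is that they cover different cases: the factory setting removes comments before any code sees them, while the type check still protects against whatever other non-element nodes happen to turn up.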
Now, despite four excruciating paragraphs on the internals of Muse, this post is actually about frustration I've had with some of my Zero-related work. I had to create some custom Dojo widgets this week, and after a few hours of searching the Internet for proper instructions[2], I started to make some progress: I had a widget "class", a widget template, and an HTML page that was loading the widget class and calling its initialization routines. The only problem was that nothing was showing up in the page - I had picked off all of the initialization errors, and yet there was no Dojo-inspired HTML to be seen.
Just as I was about to break down and open a DOM inspector to try and read through the eighty-seven levels of HTML to find the answer, it dawned on me: comments! My widget template is encapsulated in a single <div/> tag, but the HTML file that contains that <div/> has two comment tags: one for the IBM copyright notice and one with my comments explaining the content of the file. I had a hunch that Dojo was assuming the template file had only one node (an element) and wasn't checking for other, irrelevant nodes.
I was right.
I took the copyright notice and comments out of my template file and everything worked beautifully. It's good to know that my WS-* skill set isn't completely wasted here in RESTtopia.
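The same "first node must be the element" assumption is easy to reproduce with the Java DOM API. The sketch below is only an analogue of the failure, not a description of how Dojo actually loads templates (I haven't confirmed its internals); the class name FirstNode, the helpers, and the sample template are all hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class FirstNode {

    // Hypothetical helper: parses with the default factory settings,
    // which keep comment nodes in the tree.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                                         .newDocumentBuilder()
                                         .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Hypothetical helper: walks the siblings until it finds an Element,
    // instead of assuming the very first node is one.
    static Element firstElement(Node parent) {
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE)
                return (Element) n;
        }
        return null;
    }

    public static void main(String[] args) {
        // Made-up template: two comments ahead of the single <div/>.
        String template = "<!-- copyright notice -->"
                        + "<!-- notes about this file -->"
                        + "<div class=\"widget\"/>";
        Document doc = parse(template);

        // The naive assumption: the first node is the widget element.
        Node first = doc.getFirstChild();
        System.out.println("first node is an element? "
                + (first.getNodeType() == Node.ELEMENT_NODE));

        // The defensive version skips past the comments.
        System.out.println("first element found: "
                + firstElement(doc).getTagName());
    }
}
```

Code that casts the first node to an Element without checking would blow up (or quietly do nothing) on this input, which is at least consistent with the blank page described above.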
[1] The beginning of the end for any project.
[2] The Dojo team isn't big on documentation, so I had to piece things together using half-baked suggestions from mailing lists and forums, some of which were no longer operating. I love reading documentation through Google cache!
Labels: muse, narcissism, programming