Thursday, February 14, 2008

binding

I've been working on the Zero team for almost a year now, and in that time, Groovy has become my language of choice, both for Zero applications and non-Zero utilities. Groovy is, as Jerry Cuomo put it, "the nicotine patch for Java programmers"; it provides many of the cool features found in Python while freeing me from the tedious boilerplate of Java, all with a gentle learning curve. Like most Java-turned-Groovy users, I started out writing Java-centric code, picking up Groovy's shortcuts and elegance as I grew more experienced and shared code bases with other Groovy users. There are still many features that are not part of my toolbelt, but every day I seem to pick up a new one.

Because I use Groovy both for RESTful resource implementations and utility scripts, I often use Zero's /app/scripts directory to store code that is in any way reusable; this shortens my resource scripts and keeps time spent refactoring to a minimum. The only problem with invoking code in /app/scripts is that you have to do so with generic, reflection-based APIs, like so:
def script = "FooUtils.groovy";
def method = "getFoo";
def params = ["param1", "param2", ...];
def foo = invokeMethod(script, method, params);
To make it so the code in /app/scripts is in scope for your other Groovy code, you need to create a binding. Making a Groovy binding for a script isn't hard, it's just kind of tedious: you write a Java class that maps all standalone function names to reflection-based invocations on the Java class, and then use Groovy's script engine API to call the target function. You must also update your configuration file to register your Java class as a Groovy binding. The whole process is outlined in Zero's documentation as well as every developerWorks article I've written in the last six months. If you follow the instructions prescribed by the Zero team, the block of code shown above will become much more readable:
def foo = getFoo("param1", "param2", ...);
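To make the reflection step behind a binding more concrete, here is a minimal Java sketch: a function name is looked up on a target object and invoked with positional arguments. `ReflectiveBinding`, `FooUtils`, and `call` are hypothetical names invented for this illustration; this is not Zero's binding API, and a real binding class would still need to be registered in the configuration file as described above.

```java
import java.lang.reflect.Method;

// Illustrative only (not Zero's API): maps a function name to a
// reflective call on a target object.
class ReflectiveBinding {
    private final Object target;

    ReflectiveBinding(Object target) {
        this.target = target;
    }

    // Finds a public method by name and argument count, then invokes it.
    Object call(String name, Object... args) {
        for (Method m : target.getClass().getMethods()) {
            if (m.getName().equals(name) && m.getParameterCount() == args.length) {
                try {
                    return m.invoke(target, args);
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException("call to " + name + " failed", e);
                }
            }
        }
        throw new IllegalArgumentException("no such function: " + name);
    }

    public static void main(String[] argv) {
        ReflectiveBinding binding = new ReflectiveBinding(new FooUtils());
        System.out.println(binding.call("getFoo", "param1", "param2"));
    }
}

// Hypothetical stand-in for a script stored in /app/scripts.
class FooUtils {
    public String getFoo(String a, String b) {
        return a + "-" + b;
    }
}
```

The lookup loop runs on every call, which is the kind of extra reflection cost that makes an automatic, generate-nothing approach worth measuring before adopting.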
It's not often that I put code in /app/scripts that isn't meant to be shared with the rest of my application, so after a while I started poking around zero.core to see if there was a way to enable bindings automatically, with no Java code or config stanzas. The short answer is that, yes, it would be possible, but we would take a performance hit because of some additional reflection; I have not bothered to implement this solution, so I cannot say how severe this performance hit would be. I didn't want to go through a lot of trouble only to find out that my solution was slow as molasses, so instead I wrote a Groovy script to generate the binding classes and config stanzas for me.

The script is named binding.groovy, and you can download it here. You can look at a sample console session below:
$ ls
.
..
binding.groovy
my.zero.app
$ groovy binding my.zero.app
$ zero build
$ zero run
The script generates classes and configuration without touching any of your existing files. The zero build step compiles the Java classes so that they will be on the classpath at run time (zero run). You can find more details on usage, behavior, and licensing in the header comments.

Working on this tool gave me the opportunity to make a very useful comparison between Groovy and Java. Last summer I used Java to write RESTdoc, and that tool shares many requirements and behaviors with my latest one: both analyze the structure and code in a Zero application and use that information to generate one or more files using a template. RESTdoc is more complex because it must be usable from the command line, Ant scripts, and a GUI, but many of the algorithms are the same.

Based on my two experiences, I would have to say that using Groovy was far more enjoyable than using Java. But why?

First, I was able to get right to coding, without having to create all of the boilerplate that seems to appear in all of my non-web applications. You know: first write main(), then a non-static run(), then set up the exception handling, then define an exception hierarchy, and so on; then, just as you're starting to write real code, your mind starts to map out the larger pieces of the tool, and you start to think about which of these pieces should be pluggable, and then you start defining interfaces, and soon it's the end of the day and all you've done is create an Architecture.

Tomorrow, you think, I just have to write the code. And it seems like such a logical thought to have.

But it's not.

I wrote RESTdoc in just over two days. This latest tool required four hours. Granted, I was able to reuse many of the ideas I'd had while implementing RESTdoc, but those are just ideas - I couldn't reuse most of the code because it was all so... big. I knew that the code could be much simpler in Groovy, so I rewrote it. Quickly. The Groovy tool took less time because I was able to focus on actual logic and actual testing, not Java-oriented procedures that catered to my neuroses.

Second, the ability to use closures made my code smaller while also increasing its readability. Most of my closure usage is coupled with methods like each() and collect() (and their derivatives), methods that accept a closure as a parameter and apply it to a collection. I'm sure that some people abuse closures in a way that makes them feel like Java's anonymous classes, but for the most part they seem to function as a way to get things done with less bureaucracy.
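For contrast, here is a sketch of what a collect()-style call looks like when the closure has to be spelled out as a Java anonymous class. `Transformer` and `collect` are hypothetical helpers written for this example; they are not part of the JDK or of Groovy.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class CollectDemo {
    // Hypothetical single-method interface standing in for a closure.
    interface Transformer<A, B> {
        B apply(A input);
    }

    // Applies the transformer to every item and gathers the results,
    // roughly what Groovy's collect() does for a list.
    static <A, B> List<B> collect(List<A> items, Transformer<A, B> t) {
        List<B> out = new ArrayList<B>();
        for (A item : items) {
            out.add(t.apply(item));
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("getFoo", "getBar");
        // The anonymous class carries the one line of real logic.
        List<Integer> lengths = collect(names, new Transformer<String, Integer>() {
            public Integer apply(String s) {
                return s.length();
            }
        });
        System.out.println(lengths);
    }
}
```

In Groovy the whole call collapses to names.collect { it.length() }, which is the reduction in bureaucracy described above.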

The third thing that makes my Groovy development more enjoyable is the fact that lists (java.util.List) and maps (java.util.Map) are built into the language, and I can use them to create utility data structures without defining an inner class with getter and setter methods. You can do this in Java too, but it's frowned upon; it just doesn't feel right to put so much structure around your code and then use bags of goo to store your data. But while that's an appropriate feeling to have in many scenarios, it's a real downer when you're writing a script to generate config files. I love the fact that I can represent part of a parse tree with a set of key-value pairs and not feel guilty about it, and I really love the fact that I can create that set in one line of code:
return [
    name: "getFoo",
    params: ["param1", "param2"],
    hasReturnValue: true
];
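For comparison, here is roughly what the same structure costs when built from plain Java collections. This is a sketch only: `FunctionInfo` and `describeFunction` are hypothetical names, and the keys simply mirror the Groovy literal above.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

class FunctionInfo {
    // Builds the same name/params/hasReturnValue structure as the Groovy literal.
    static Map<String, Object> describeFunction() {
        Map<String, Object> info = new LinkedHashMap<String, Object>();
        info.put("name", "getFoo");
        info.put("params", Arrays.asList("param1", "param2"));
        info.put("hasReturnValue", Boolean.TRUE);
        return info;
    }

    public static void main(String[] args) {
        System.out.println(describeFunction());
    }
}
```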
Finally, for all of the Java bashing I've done in this post, I have to remind myself that one of the best things about Groovy is the fact that it lets me devolve into traditional Java programming when I really need it. There are some tools for which Java integration is superior, and those tools aren't going to change any time soon. Java is also the original language of the JVM, and it is the best way to expose a language-agnostic API on that platform.

And sometimes, I'm just not ready to do things The Groovy Way. Like all creatures of habit, there are times when I hang on desperately to the past, for no good reason at all. Groovy allows for all of that, and it doesn't mock me when I fail to use it to the best of its abilities. It just runs my script.

He'll come around, it says to itself. Some day.


Wednesday, January 23, 2008

legendary

I started using Google Charts a few weeks ago, and I have to say: it's pretty stellar. You can create bar, line, or pie charts with multiple colors and data sets using a simple HTTP GET. The names of the query parameters are kind of... trite... but overall I think the API is very user-friendly. I'm a fan.

Of course, the API does have one problem: it requires me to send all of my data outside the IBM firewall. Perhaps you hadn't noticed, but the IBM Corporation employs a lot of lawyers, and said lawyers get very uncomfortable when you start talking about sharing company data with servers owned by our competitors[1]. It's unlikely that Google is employing a bunch of people to read through its server logs, find requests originating from its competitors' servers, and muse about their significance to Google's management team[2], but lawyers are paid to be paranoid, and ours are very good at their job. The net of this is that any IBM application that uses Google Charts and is not an obvious demo must be reading from a public data store.

I like to poke fun at IBM's giant legal department, but the truth is that it's not much different from that of other companies. IBM isn't the only company that will have trouble using Google Charts, so it would be nice to see some API enhancements with a nod to confidentiality. I think the easiest solution would be to split up the generation of charts and legends; the numbers that are used to create the actual bars or lines are only meaningful if they are accompanied by labels, so keeping the two things separate should satisfy the requirements of most corporate lawyers. The API should be augmented with some sample JavaScript code for generating legends that match the colors and font of a given chart; this code could be provided alongside the existing code for encoding data and invoked by programmers who are not allowed to share legend text with the outside world. This isn't as seamless as the original API, but it's better than nothing.
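The split proposed above might look something like the following sketch, written in Java rather than the JavaScript suggested in the post. The chart request carries only numbers and colors, and the legend is assembled locally. The host and the parameter names (cht, chs, chd, chco, and the omitted chdl for legend text) are recalled from the Google Chart image API of that period and should be treated as assumptions to verify; `PrivateLegend`, its methods, and the sample data are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class PrivateLegend {
    // Builds a chart URL carrying only numbers and colors. The legend
    // parameter (assumed to be chdl) is deliberately never sent.
    static String chartUrl(String[] seriesData, String[] colors) {
        return "http://chart.apis.google.com/chart?cht=lc&chs=300x200"
            + "&chd=t:" + String.join("|", seriesData)
            + "&chco=" + String.join(",", colors);
    }

    // Pairs each series color with its label so the page can render
    // the legend itself, behind the firewall.
    static Map<String, String> legend(String[] colors, String[] labels) {
        Map<String, String> entries = new LinkedHashMap<String, String>();
        for (int i = 0; i < colors.length; i++) {
            entries.put(colors[i], labels[i]);
        }
        return entries;
    }

    public static void main(String[] args) {
        String[] data = {"10,20,30", "15,25,5"};      // made-up sample series
        String[] colors = {"FF0000", "0000FF"};
        String[] labels = {"Series A", "Series B"};   // stays on the local side
        System.out.println(chartUrl(data, colors));
        System.out.println(legend(colors, labels));
    }
}
```

The data values still leave the building, so this only helps in the case argued above, where numbers without labels are not meaningful to an outside reader.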

Assuming that Google is in no rush to appease third-party developers using a service that doesn't generate any revenue, I'll be writing my own legend generator in the near future. I'll post the code once it's complete.

[1] I guess they're a competitor. I can't think of an area where we compete with Google directly, but my inner lawyer is telling me that once a software company reaches a certain size, it automatically becomes a competitor, regardless of its current investments.

[2] The Terms of Service explicitly denies such activity.


Tuesday, January 22, 2008

beacon

Lately my blog has been devoid of the deep technical content implied by my host name; I assure you it has not been from lack of interest. In the last two months, I've published three articles on IBM's developerWorks, each exploring a different aspect of Project Zero and REST. Check it out:

  • Title: Extend Project Zero's scripting platform with Flickr APIs

    Abstract: The Flickr photo sharing service is one of today's most popular Web applications. It provides a robust hosting service with slick social networking capabilities that make uploading, organizing, and finding photos very simple. That's all very cool, but from a developer's perspective, the most interesting thing about Flickr is its public API for reading and writing photo data. You can send API requests over HTTP using any programming language you wish, and many open source projects have sprung up to encapsulate this API for various languages. In this article, you'll learn how to "Zero-ize" the Flickr API by providing a Groovy binding that is easily reusable in your Project Zero applications. When you're done, you'll be able to read and write photo data from your Groovy scripts in just a few lines of code.

    Reader's Digest Version: You want to use the Flickr API in your Java applications, but everywhere you turn there's a factory pattern or a glorified HttpURLConnection. You have started building Flickr URLs with StringBuilder, laboring under the strain of append() and hard-to-read query strings, when you receive the tragic news: you have died of dysentery. Game over.

    Fortunately, Groovy scripting lets you use all of your Java skillz while shaking off the cruft that was keeping you down. This article creates a set of Groovy scripts for invoking the Flickr API and shows how to share the scripts throughout your Zero applications; it ends by showing you how to generate one of those ubiquitous photo collages so that your site will look exactly like every other site on the Internet.


  • Title: Manage an HTTP server using RESTful interfaces and Project Zero

    Abstract: WS-* users and REST users have an ongoing debate over which technique is most appropriate for which problem sets, with WS-* users often claiming that more complex, enterprise-level problems cannot be solved RESTfully. This article puts that theory to the test by trying to create a RESTful solution for a problem area that is not often discussed by REST users: systems management. In a previous developerWorks tutorial, I showed how to create a Web services interface for managing HTTP server products; the tutorial used concepts from WSDL and the WS-* standards to define the management interface and software from Apache Muse and Apache Axis to create the management application. For this article, I use Project Zero and REST design principles to recreate the interface and function of the original application and determine if REST is a valid option for this enterprise project.

    Reader's Digest Version: Human sacrifice! Dogs and cats living together! Mass hysteria!

    It's almost unthinkable: creating a fair and level comparison of WS-* and REST based on experience working in both worlds. Well, I went ahead and thought it, and then I wrote it down so everyone could share my completely non-hysterical evaluation of REST as a foundation for remote systems management tools.


  • Title: Add Ruby templating to your Project Zero applications

    Abstract: Ruby users, take note. You can now do everything that Groovy and PHP users can do when creating Project Zero applications! In a previous article, we showed how to augment Project Zero to provide support for the Ruby scripting language. The code that we wrote enabled Ruby users to transfer their scripting skills to the Zero platform and take advantage of its unique programming model. Of course, scripting isn't the only way that Ruby is used to create applications - programmers who use the Ruby on Rails framework also mix Ruby in HTML templates similar to JSP and PHP. These templates, called RHTML files, are very useful for creating dynamic user interfaces, and this article will show you how to extend our Ruby support to include them.

    Reader's Digest Version: Remember the first time you saw Back to the Future? As the movie ends, Marty has just discovered that his father isn't a sucker anymore, his sister is popular, his brother has a job, and something he's done has warranted his parents buying him a brand new 4x4. When Jennifer struts in a few minutes later, you undoubtedly thought, This was a killer movie. And you were right.

    But then, out of nowhere, Doc screeches into Marty's driveway in a beat-up De Lorean and tells the two teenagers that their future is in shambles and they have to go to the future to prevent a certain tragedy. No way! The movie closes with the De Lorean lifting off the ground and flying into 2015. Wow! Robert Zemeckis just turned your expectations upside-down, and now you can hardly wait for Back to the Future II. Do you remember that?

    Well, if you're like most people, you had the exact same response when you read this last abstract and realized that my Ruby on Zero article has its own Part II; you thought the first article was great, but now that you've gotten a taste, you can't imagine life without Part II and its Mr. Fusion-fueled RHTML files.


Having explored over a dozen topics related to Zero and REST, it's clear to me that one of Zero's greatest strengths is how flexible it is; in other words, Zero does not get in my way as I try to bend it to meet the needs of my project. Most of the time I don't have to do any bending at all, but sometimes I do, and rarely is something so hard that it's deemed impossible or not worth the trouble. If the Zero platform is successful, this will certainly be one of the reasons: it provides you with many tools and conventions for getting things done, but it doesn't force you into absolutes or some kind of software design religion.


Wednesday, October 17, 2007

grassroots

Zero now has a nice little database setup tool that helps one follow the guidelines described in this article. Steve Ims and I tweaked, updated, and refined the guidelines so that the tool could be reusable without programmers having to add configuration stanzas or additional artifacts to their applications; there's more information about this feature on the Zero forum, but I wanted to add a personal note about how gratifying it is to see this code make it into the Zero code base.

Ninety-nine percent of my programming habits - from how I arrange my file system to how I debug applications - are either slight variations on the habits of others or bordering on OCD. This means that a lot of the effort that I put into optimizing my productivity never benefits anyone other than me. I think this is why I feel that the most satisfying contributions that I can make to a project are the ones that originated from code that I wrote to make my own life easier, not those that are part of a feature plan. Perhaps this is just the ultimate fulfillment of my obsessions - forcing other people into my behavioral patterns - but I'd like to think it's based on the feeling that my improvement really is an improvement; with feature plans, sometimes you get it right and sometimes you don't, and you have to wait a while before you find out which is the case.

Okay, it's probably the obsessions. But it still feels good.


Wednesday, October 10, 2007

goo

After publishing one of my recent posts, I noticed a Freudian slip in the way I talked about software standards (emphasis added):
I think of security-related code as [being] responsible for authentication, authorization, encryption, and the complex protocols that the industry has created to simplify them.
My initial reaction was to correct this oxymoron, but then I realized that it was a fairly accurate description of many software projects and protocols that I encounter every day. My previous work in WS-Land was a great example of this: things started out simple, but as the specs expanded in order to handle more use cases, it became much harder to provide a simple toolkit for implementing them. Complex problems beget complex protocols, all in the name of simplification.

On the Zero forum there is a discussion about whether or not to include support for the deserialization of JSON data into Java beans. My gut reaction to this - which is based on many years of programming in strongly-typed languages and the belief that context assist is an inalienable right - was that converting JSON objects to beans is a fantastic idea, and that I wouldn't want to be friends with anyone who thought otherwise. Data binding, while complex in many ways, simplifies the even more complex problem of wading through the raw bytes of serialized objects. Right? You have to have principles.

Pat Mueller agrees with this sentiment, comparing a JSON object to a bag of goo, which doesn't sound like something that is easy to debug. If you have any doubts about this, ask a programmer on your team if he thinks his current project could be improved by adding bags of goo to the source - at best you'll get a weird look, and you probably won't be invited to give your opinions on the project anymore. In general, people don't want bags of goo in their source code.

What I could not have predicted was just how much working on Zero has changed my world view, and how much I would waver as I started to think about my experience creating RESTful services with JSON data structures. The more I thought about it, the more I realized how easy it is to get things done with JSON's Java APIs; the JSONObject and JSONArray types are just Maps and Lists, respectively, so if you're unfamiliar with the data flowing in or out of a service, it's easy to learn about it - just dump it to the console! The true nature of any JSON data structure can be determined in one step, without reading any documentation[1]. There is no data binding framework to configure or reflection errors to debug - you get the data in a simple collection of name-value pairs, and that's that. It's made service development a simple affair without the presence of a complex protocol.
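The "just dump it" workflow looks something like this sketch. It uses plain java.util collections as a stand-in for the JSONObject and JSONArray types mentioned above, and the payload (`GooDump`, its keys, and its values) is made up for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

class GooDump {
    // Hypothetical payload, shaped the way a parsed JSON object would be:
    // maps for objects, lists for arrays.
    static Map<String, Object> samplePayload() {
        Map<String, Object> owner = new LinkedHashMap<String, Object>();
        owner.put("name", "alice");

        Map<String, Object> photo = new LinkedHashMap<String, Object>();
        photo.put("id", 42);
        photo.put("tags", Arrays.asList("zero", "rest"));
        photo.put("owner", owner);
        return photo;
    }

    public static void main(String[] args) {
        // toString() on nested Maps and Lists shows the whole structure at once.
        System.out.println(samplePayload());
    }
}
```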

But why is it okay to work with bags of goo for my RESTful services while demanding strongly-typed APIs in my other code? I definitely wouldn't want to write JavaScript-style code for all of my projects - it's too frustrating - but I like it for sending packets of data between applications and processing said data in a single module. Once you have to start delegating the data processing to multiple classes or systems, the Map and List usage becomes confusing because the origin of the data is no longer clear to the reader. This is where Pat's bags of goo comment rings true for me - if you're just passing around hash tables from library to library, you've got a debugging nightmare on your hands. For Groovy scripts in Zero, though... it's nice.

The worst part about using a more dynamic, JavaScript style of coding in Java would be the inevitable move towards something like EMF, which I hate with the hate of a thousand suns. I think that as long as I stick to using simple collections for immediate processing of service I/O and use strongly-typed beans for the rest of my logic and utilities, I can enjoy the magic of JSON without getting lost in some programming black hole.
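One way to picture that boundary is a small conversion step: the service code reads the map directly, and anything handed to the rest of the application becomes a typed object first. `Photo`, `fromMap`, and the field names are hypothetical, chosen only for this sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical typed bean used beyond the service boundary.
class Photo {
    private final int id;
    private final String title;

    Photo(int id, String title) {
        this.id = id;
        this.title = title;
    }

    int getId() { return id; }
    String getTitle() { return title; }

    // Converts loosely-typed service I/O into a typed object,
    // failing fast when the goo does not have the expected shape.
    static Photo fromMap(Map<String, Object> data) {
        Object id = data.get("id");
        Object title = data.get("title");
        if (!(id instanceof Integer) || !(title instanceof String)) {
            throw new IllegalArgumentException("unexpected payload: " + data);
        }
        return new Photo((Integer) id, (String) title);
    }

    public static void main(String[] args) {
        Map<String, Object> data = new LinkedHashMap<String, Object>();
        data.put("id", 7);
        data.put("title", "sunset");
        Photo photo = fromMap(data);
        System.out.println(photo.getId() + ": " + photo.getTitle());
    }
}
```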

In conclusion, my thoughts on data binding and type safety aren't as concrete as they once were, and I think that pure JSON objects and other goo-oriented data are great for Zero developers creating RESTful resources. Hopefully Pat and I can still be friends.

[1] You may need docs to find the values of enumerations (if any), but you would have to do that for a bean-based API too.


Friday, September 21, 2007

baptism

Let's make one thing clear: I know very little about security-related programming.

Now, don't misunderstand: I know the best practices related to creating secure web applications, and I've picked up many other general security principles in the last couple of years, but those do not require me to write security-related code - just secure code. Making sure that your web application is safe from SQL injection helps to make your code secure, but it doesn't require that you understand the fundamentals behind security features. You're just an end user who has managed to perform his job correctly.

I think of security-related code as those parts of a platform that are responsible for authentication, authorization, encryption, and the complex protocols that the industry has created to simplify them. That's the stuff that I am not familiar with and, to be quite frank, has never really interested me. I'm not sure if the lack of interest was driven by perceived difficulty or the overwhelming focus on encryption that I encountered while in academia[1], but either way, I've always avoided security-related projects.

Until now.

The code base for Zero Core isn't that big, so while its event-driven architecture makes it harder for a new person to read through the code and understand the flow, it really doesn't take that long to figure out how things work. For the past few weeks, I've been trying to help out with some bugs and enhancements to Zero Core, mainly to learn more about the code and fill in my mental gaps; one of the enhancements that I was particularly interested in was related to authorization, but because there are more work items than there are Core team programmers, I was told that the enhancement would have to wait until our third milestone, meaning December-ish. The only way the code would get into the next milestone (October-ish) would be if I agreed to write it. So I did.

Let's make a second thing clear: bug 972 was not very hard. It's not a big feature, and you don't have to be a computer scientist to understand how it works. Had someone from our security team been available, I'm sure they could have written the code and the unit tests before lunch[2]. Still, I'm happy to report that I was able to read through all of the zero.core.security.* code and understand it, and I didn't break the build once during development. I feel a bit more confident in my ability to tackle security-related projects now, even if I've only scratched the surface of understanding. Baby steps.

Special thanks go to Zero security lead Todd Kaplinger, who made sure that I started off on the right foot and who didn't even make a face when I told him that I don't know much about security.

[1] Encryption is discouraging because I am certain that it is difficult, and that I do not possess the mathematical mind required to conquer it.

[2] It took me a day and a half, when you add up all of the hours.


Thursday, September 6, 2007

overlooked

The Apache Muse code base includes a collection of DOM-based convenience methods that help us parse and construct XML fragments without a lot of DOM API boilerplate. These methods are included in an incredibly large class named XmlUtils and represent the collective knowledge, mistakes, and advice of a half dozen developers who have worked on WS-* technology for over three years. Changing any of the methods in XmlUtils could leave the Muse build totally broken; despite the fact that it's been extracted from the core engine into a small library, XmlUtils is very much a core piece of code because of how heavily dependent on it the rest of the project is.

Any piece of code that is so central to a project is bound to have one or two (or ten) ugly hacks to handle edge cases, bugs in dependencies, and backwards compatibility. XmlUtils is fairly hack-free, but it does have some code that I consider... unsavory. In my opinion, the most cringe-worthy code snippets are the ones that need to accept an XML element and iterate over its child elements. The only way to get all of the direct children of an element is to call Node.getChildNodes() and save those Node objects that are actually Elements; such code involves one or more if blocks that check the concrete type of the Node objects and ignore the undesired ones:
if (nextNode.getNodeType() != Node.ELEMENT_NODE)
    continue;
Today these checks are common, but when I first wrote the code I allowed myself to make a number of assumptions because it was only used for traversing schema-validated SOAP messages. Then, one day, we added a configuration file[1], and all of a sudden the input was much less predictable. One of the first bugs I had to fix was the one caused by comments in the configuration file; XML comments become their own DOM nodes - different from Element or Text - and my code failed when comments were added in places where I expected a child element. My final fix eliminated comments from the DOM tree altogether, but I kept the conditionals mentioned above on the off chance that we encountered XML pre-processor nodes or CDATA nodes. I would not be fooled again!
DocumentBuilderFactory factory =
    DocumentBuilderFactory.newInstance();

//
// we don't need comment nodes - they'll only
// slow us down
//
factory.setIgnoringComments(true);
This bug, and others like it, were part of my education into the more pedantic corners of XML Land. Over the course of three years I learned many things about XML, and while many of them were logged in my brain and never used again, all of them helped to drive home the message that XML-related issues were never as simple as they first seemed. When reading or writing XML documents or schemas, great care must be taken to ensure that edge cases and ambiguity are not hiding in the bushes. And as much as I malign the DOM API, it does a pretty good job of alerting you to the fact that XML processing in a real world system is never as easy as more user-friendly APIs would have you believe.
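The element-only traversal described above can be sketched as a small self-contained program. This is not the Muse XmlUtils code; `ChildElements`, `childElements`, `parseRoot`, and the sample document are hypothetical, and only the standard JAXP and DOM calls are real.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

class ChildElements {
    // Returns only the direct children that are elements, skipping text,
    // comments, CDATA, and processing instructions.
    static List<Element> childElements(Element parent) {
        List<Element> result = new ArrayList<Element>();
        NodeList nodes = parent.getChildNodes();
        for (int i = 0; i < nodes.getLength(); i++) {
            Node next = nodes.item(i);
            if (next.getNodeType() != Node.ELEMENT_NODE)
                continue;
            result.add((Element) next);
        }
        return result;
    }

    // Parses a string with default settings, so comments stay in the tree.
    static Element parseRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Made-up config fragment: a comment and a text node sit among the elements.
        Element root = parseRoot("<config><!-- note --><a/> text <b/></config>");
        System.out.println(root.getChildNodes().getLength() + " child nodes, "
            + childElements(root).size() + " child elements");
    }
}
```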

Now, despite four excruciating paragraphs on the internals of Muse, this post is actually about frustration I've had with some of my Zero-related work. I had to create some custom Dojo widgets this week, and after a few hours of searching the Internet for proper instructions[2], I started to make some progress: I had a widget "class", a widget template, and an HTML page that was loading the widget class and calling its initialization routines. The only problem was that nothing was showing up in the page - I had picked off all of the initialization errors, and yet there was no Dojo-inspired HTML to be seen.

Just as I was about to break down and open a DOM inspector to try and read through the eighty-seven levels of HTML to find the answer, it dawned on me: comments! My widget template is encapsulated in a single <div/> tag, but the HTML file that contains that <div/> has two comment tags: one for the IBM copyright notice and one with my comments explaining the content of the file. I had a hunch that Dojo was assuming the template file had only one node (an element) and wasn't checking for other, irrelevant nodes.

I was right.

I took the copyright notice and comments out of my template file and everything worked beautifully. It's good to know that my WS-* skill set isn't completely wasted here in RESTtopia.
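The same trap can be reproduced with the Java DOM API. This is an analogy only: the post does not show Dojo's template-loading code, so nothing here describes Dojo's implementation. The sketch parses a made-up template with a leading comment and reports what kind of node comes first; `TemplateFirstNode` and `firstNode` are hypothetical names.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;

class TemplateFirstNode {
    // Parses the markup and returns the first node of the document,
    // optionally asking the parser to drop comments.
    static Node firstNode(String xml, boolean ignoreComments) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setIgnoringComments(ignoreComments);
            Document doc = factory.newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getFirstChild();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String template = "<!-- copyright notice --><div/>";
        // Code that assumes "first node == the element" breaks in the first case.
        System.out.println("comments kept, first node is a comment: "
            + (firstNode(template, false).getNodeType() == Node.COMMENT_NODE));
        System.out.println("comments ignored, first node is an element: "
            + (firstNode(template, true).getNodeType() == Node.ELEMENT_NODE));
    }
}
```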

[1] The beginning of the end for any project.

[2] The Dojo team isn't big on documentation, so I had to piece things together using half-baked suggestions from mailing lists and forums, some of which were no longer operating. I love reading documentation through Google cache!


Friday, August 24, 2007

focus

Bridgid and I don't have many conflicts, but one of the things that forces us to compromise is the fact that we both match well-known gender stereotypes when it comes to our work habits and attention spans. This can be amusing and frustrating at the same time.

For those of you who never made it through Psych 101 and don't work for a company that requires lots of diversity training, allow me to summarize: men are incredibly single-minded and perform well on tasks that require deep concentration, long hours, and not talking to anyone; women are excellent multi-taskers who are most productive when they are faced with disparate tasks that exercise social as well as academic skills. This is why you meet so many male programmers and female marketing executives. There are exceptions, but in my experience this stereotype is more accurate than most[1].

In fact, in the case of me and Bridgid, it is incredibly accurate. Bridgid is a biochemist who gives cancer to fish and does experiments on "genes". The fact that she's a geek would lead you to believe that she is an exception to the female stereotype, and in many ways, she is; however, when it comes to multi-tasking and the desire to work on disparate tasks, she is a perfect match. Bridgid can switch contexts almost immediately and not lose a step. Her need for long-term scheduling is limited to her need to set up multi-day experiments in such a way that she can balance her classes with her time in lab.

I am a different animal. I exhibit classic programmer behavior when I'm at work, and it's even more obvious after work, when there are no meetings to distract me. I like to block off hours of time for one project, one feature, or one set of related bugs. I save big-ticket items for days when I work from home so that any communication that I have with other people is routed through email or IM, which allows me to manage it in the same way that I manage my list of tasks. Even when I'm working on a feature that touches code shared by multiple people and requires lots of questions and communication, at some point I will buckle down and write code by myself, with no distractions to knock down the house of cards I am building in my head.

Context switches can kill hours of my day if they are timed right: three consecutive meetings with thirty minutes in between each means that I lose an hour because I can't start anything significant before I'm pulled into the next meeting. The flip side of this is that, once I am working on something, I find it very hard to put it down. I wish that I had Bridgid's ability to let things go when a context switch happens; instead, it takes upwards of an hour for the thoughts surrounding whatever it is I'm working on to leave my brain. Of course, sometimes the delay is caused by thoughts about co-worker frustration or bureaucracy, but I think that is more understandable to the average person. Thinking about Ant's classloading behavior on the way to dinner is not.

The reverse of this behavior is interesting. A large project may require many days of intense concentration and occasional t-shirt re-use on my part, but when I'm done[2], I am suddenly aware of all of the great things that are happening around me. Things to do. Fun to be had. My non-programming intensity is as strong as my programming intensity, but the two cannot coincide. This can be very confusing for Bridgid and other female humans.

I try to temper this conflict by keeping a personal schedule that extends at least two and usually three weeks in advance, identifying those times when I will be able to work consecutive days on larger problems without forgetting to eat or talk to my girlfriend; the other days become targets for meetings, smaller bugs, and tedious work that will not leave me distracted at the end of the day. This need for order and preservation means that I am constantly scheduling, with the ability to remake two weeks of plans in ten to fifteen minutes. Frequent re-ordering means that no "to do list" software is fast enough or natural enough for me; whether it's Microsoft Outlook or some Web 2.0 app with Atom feeds and rounded corners, I always come back to a plain text file on my desktop. I don't have time for calendar widgets and status markers. When something is re-scheduled, I cut and paste it. When it's done, I delete it. At the end of each day, I open todo.txt one last time, delete the current day's entry, Ctrl-S, Alt-F4, and turn off my monitor.

On that note, Fri PM (8/24) - blog post is complete. It's time for a trip to Fujisan. With my lady.

[1] I'm sure you've already thought of a few co-workers who completely defy these stereotypes. Good for you. I'm telling my story anyway.

[2] Where done means that it works on someone else's machine and I've found most of the edge cases.


Tuesday, August 14, 2007

endorsement

RESTdoc is now part of Zero Core! You can read about it here; the latest code is here. Much thanks to Steve Ims for working with me to resolve all the little details and get this into /trunk.

Go banana!


Thursday, August 9, 2007

rolling

My forum post about RESTdoc generated a lot of great ideas, almost all of which have been implemented in my RESTdoc SVN branch. I think the latest UI is pretty slick (considering it was made by a programmer, anyway), and integrating the test forms with the REST tables should help us lower the learning curve for those new to Zero's REST conventions. I've uploaded some screenshots of the latest RESTdoc UI and would appreciate any feedback related to making it prettier or more intuitive.

I have a few more features to complete and IE-related bugs to work around, but I've already opened a feature request in the bug database. I suppose I should also add some documentation to the wiki; it would be tragic irony if people had trouble using a documentation tool because it was poorly documented.


anguish

Pat Mueller: I suppose I must not have gotten around to telling Dan my horror stories of using WSDL in the early, early days of Jazz. The low point was when it once took me two working days to get the code working again, after we made some slight changes to the WSDL.

I'll see your WSDL ruined my week and raise you a WSDL caused me extreme pain that could only be soothed by setting my skin on fire and never reading a WSDL document again. I understand the pain associated with WSDL-oriented tools and code generation, but my experience creating those tools was just as difficult. During the first six months of working on the project that would become Muse 2.0, the other IBMers that I was working with started asking for client code generation features; I was working well over sixty hours a week at this point, and as important as code generation is for WS-* programmers, I had other things to worry about, like implementing all 9,087 pages of WS-RF. I finally managed to throw something together one weekend, and while it wasn't pretty, it got the job done (for a while).

Eventually, Muse grew and tooling was added, and we needed more code generation support than my weekend side project could provide. Andrew Eberbach started working on a Muse-oriented version of WSDL2Java[1] that actually had a design behind it and, as a result, was much more flexible for both Muse developers and Muse users. During the creation of WSDL2Java, I found myself spending a lot of time trying to transfer my knowledge of the nuances (nuisances?) of WSDL 1.1 and their effect on Java-based service implementations. When I first started at IBM, I had only a cursory knowledge of WSDL, but after two years, I knew what I was doing and had the scars to show for it; unlike many skills (which seem easy once you have them), reading and writing WSDL documents was something that I never got over, and that made it all the more difficult to encourage new students. By the time Muse 2.0 was released, I could read WSDL documents that were thousands (thousands) of lines long and find obscure syntactical or semantic problems in a minute or two... but it never seemed easy.

Never.

The worst part was, even as Andrew picked up my unwritten, unofficial WSDL knowledge and took ownership of our command line tools, it didn't free me from the tyranny of port types, bindings, and <xsd:any/>. As author of Muse's WS-resource deployment and request-processing engines, I had to read WSDL documents at runtime to determine how SOAP requests and WS-Addressing data should be mapped to WS-resource instances and Java method calls[2]. Keeping the WSDL assumptions consistent between tools and engines was tough when I was the only author, so you can imagine what it was like when it was split between two people, one of whom had not yet come to terms with the unstoppable, day-ruining force that was WSDL 1.1.

I really thought that I had a point when I started this rant, but now that I'm nearing the end of it, I can see that I don't. Pat's comment just triggered a flashback and it was either vent through my blog or sit in the corner with spiders crawling over my skin. Whew. Crisis averted.

[1] I believe the first instance of a tool named WSDL2Java was released as part of Apache Axis, but every web services framework I've ever seen has its own implementation of this concept. Muse was no different.

[2] This requirement was put in place in order to avoid another JAX-RPC mapping file disaster.


Tuesday, July 31, 2007

restdoc

I've been working on a new tool for Zero, and I'm going to post some background information here (rather than the forum) because it will be easier to dig up later on.

Designing and implementing Zero applications usually starts with the creation of a REST API, which can then be documented using a REST table (or Gregorio table, as we sometimes call them). Comparing a REST table to a WSDL document is an excellent way to demonstrate ease-of-use differences between REST and WS-*. Upon reading one of these tables, one already has the mental model needed to write code that uses the documented service; REST tables manage to fit the important details (URI structure, method names, status codes, etc.) into a fairly spartan layout, while WSDL documents require pages of XML that is not really meant to be parsed by humans.

Of course, there is no official mapping of these REST APIs to the code that fulfills them, and so there is no RESTful code generation or documentation extraction like there is in WS-Land. I've had to create a lot of REST tables while working on Zero, and while it wasn't hard, it was kind of tedious because I had to duplicate the same information in my code comments[1]. Having documented my REST APIs manually and not wanting to do it again, I decided to create a tool that would read through my code, pull out JavaDoc-style comments, and turn them into a pretty HTML page full of REST tables.

RESTdoc works just like JavaDoc, except that it reads Groovy and PHP scripts that follow Zero's REST conventions[2]. RESTdoc's own JavaDoc provides the following overview:
To use RESTdoc, you must first include RESTdoc-style comments in your Zero scripts. RESTdoc comments are just like JavaDoc comments, with the following new tags: success, error, and format. The new tags are used to list the HTTP status codes for success and errors, as well as the expected data format in the request or response body for each method. RESTdoc will ignore any method that is not a Zero REST method (onList(), etc.) or that does not have RESTdoc-style documentation preceding its definition. Like Zero itself, RESTdoc only supports Groovy and PHP scripts, but it is possible to add other languages in the future.

To run RESTdoc from your Zero application's directory, just type:

restdoc

By default, RESTdoc will look in ./app/resources for Zero REST scripts and it will save the generated HTML page in ./docs/rest. If you want to run RESTdoc from a script that is not in your application's directory, you can use the input and output flags to override the defaults:

restdoc -input /my-app/app/resources -output /my-docs

Finally, RESTdoc does not produce any output if all goes well. You can turn on console logging using the verbose flag if you're having trouble debugging a problem:

restdoc -verbose

I've converted some of our sample applications to use RESTdoc-style comments, and so far, the tool works well. The HTML isn't beautiful, but neither is the stuff generated by JavaDoc. The important thing is that programmers can write comments using the familiar JavaDoc style and automatically get a nice document describing their application's RESTful public interface.

When I first got the idea to make this tool, I was pretty excited because I saw it as an opportunity to apply my elite compiler skillz. It's pretty hard to find a job where you get the chance to work on a new programming language, but creating developer tools often puts you in a situation where some level of formal AST-based parsing and analysis is needed to implement a feature correctly, and I enjoy those situations immensely. Of course, once I got started, I realized that parsing Groovy, PHP, and possibly Java source files with a complete compiler front end would require me to drag in a ton of dependencies, dependencies that zero.core already has but zero.tools does not. I was pretty sure the Zero tooling team would not be happy about my request to add N MB of Groovy and PHP-related JARs to their distribution, not to mention the possibility of adding Java-related JARs of questionable origin[3]. After looking at all my options, I realized that I would need to trade in my elite skillz for some dubious regex-based hacks if I was going to make a tool that was consumable by the rest of the team. Oh well.

The current code uses all sorts of string searching and pattern matching that would best be handled by a parser generator, but is instead handled by me. The net of this is a 32 KB binary, which includes the Ant task that enables RESTdoc to be called using Zero's command line interface. I'm not sure if RESTdoc will end up being included in Zero, but if it does, I'll be sure to update this post.
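For a sense of what regex-based extraction of this kind can look like, here is a minimal sketch in Java. It is not RESTdoc's actual code: the class name RestDocSketch, the "@" prefix on the success, error, and format tags, and the assumption that a Groovy handler is declared as "def onList()" are all made up for illustration. The sketch finds each doc comment that directly precedes an on-style method and collects its tags, skipping methods that are undocumented or are not REST handlers.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RestDocSketch {

    // A doc comment immediately followed by a handler such as "def onList(".
    // The comment body may not cross a "*/", so a comment on some other
    // method is never glued onto a later handler. The "def on..." form is
    // an assumption about how the Groovy scripts are written.
    private static final Pattern DOCUMENTED_METHOD = Pattern.compile(
            "/\\*\\*((?:(?!\\*/).)*)\\*/\\s*def\\s+(on[A-Z]\\w*)\\s*\\(",
            Pattern.DOTALL);

    // One tag per comment line, e.g. " * @success 200 OK".
    // The "@" prefix is assumed, by analogy with JavaDoc.
    private static final Pattern TAG = Pattern.compile(
            "^\\s*\\*?\\s*@(success|error|format)\\s+(.+?)\\s*$",
            Pattern.MULTILINE);

    /** Maps each documented handler name to its "tag: value" entries, in source order. */
    public static Map<String, List<String>> extract(String source) {
        Map<String, List<String>> docs = new LinkedHashMap<String, List<String>>();
        Matcher method = DOCUMENTED_METHOD.matcher(source);
        while (method.find()) {
            List<String> tags = new ArrayList<String>();
            Matcher tag = TAG.matcher(method.group(1));
            while (tag.find()) {
                tags.add(tag.group(1) + ": " + tag.group(2));
            }
            docs.put(method.group(2), tags);
        }
        return docs;
    }

    public static void main(String[] args) {
        // Hypothetical Groovy resource script, inlined as a string.
        String script =
              "/**\n"
            + " * Lists all widgets.\n"
            + " * @success 200 The list was returned.\n"
            + " * @error 500 The data store was unavailable.\n"
            + " * @format JSON array of widget objects\n"
            + " */\n"
            + "def onList() {\n"
            + "}\n"
            + "\n"
            + "def helper() {\n"
            + "}\n";
        System.out.println(extract(script));
    }
}
```

Running main prints a map with a single onList entry; helper() is skipped because it is neither documented nor a REST handler. A real parser would also cope with comments inside string literals and similar corner cases, which is exactly the sort of thing a regex approach has to paper over.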

[1] Also, I hate making HTML tables. Hate.

[2] I've designed the code so that it can easily handle the addition of new languages to Zero, but I don't see that happening anytime soon.

[3] IBM Legal questions the origin of everything, no matter how obvious it may seem.


Friday, July 27, 2007

satisfaction

Coté: The culture of Java design is to push out commitment to a given way of doing things (an "implementation") as much as possible. In Java culture, dependencies, especially conceptual ones, are nasty and to be avoided. They're taboo, even.

First, this guy really has our number[1]. Second, I would go further to say that the overuse of interfaces and patterns not only helps us avoid commitment, it also helps us rationalize decisions to replace code written by other people with our own engineering masterpieces. Overly-abstract and unhelpful interfaces like the ones in JNDI give you much more flexibility when you're trying to satiate your NIH demons. Where else are programmers going to get that kind of ego-inflating satisfaction? The dating scene? I don't think so.

[1] We being Java programmers who work for large corporations.


Thursday, July 26, 2007

obsession

A few days ago, members of the Zero team got together with editors from IBM's developerWorks team to discuss the Zero-oriented articles we wanted to publish, as well as the administrivia of the publishing process. One of the issues we talked about was whether to use a single, consistent example across every article; proponents said this would minimize redundant writing and allow authors to get to the heart of the matter more quickly, while opponents said it would make authorship cumbersome and discourage exploration of new ideas. I think this is an interesting issue that goes beyond what makes good copy and will affect people's overall perception of Zero.

From my perspective, the argument for reusing a scenario across multiple articles is only attractive when I consider my personal return on investment. IBM pays developerWorks authors for their work (whether they're IBM employees or not), and the less time I spend on each article, the more dollars-per-hour I earn. I really like money, so this argument does not fall on deaf ears. Less work, more money - what's not to like?

Unfortunately, the single-scenario approach does more harm than good. Every new project has its share of hype, but when you couple that hype with one example used over and over again, the whole thing starts to reek of technology for the sake of technology. If you think back on some of the technologies that have recently fizzled after years of empty promises, you'll notice that each of them was banking on one or two contrived examples to inspire people and make it a star.

Aspect-oriented programming (AOP) is a great example of technology that is based on a single-scenario obsession. AOP proponents speak in grandiose terms about code simplification and feature injection, but when you press them for details, they always fall back on the same example: logging. Your code is littered with logging statements! they huff. How can you even read your code? It's chaos! If pressed for another example, they'll usually stammer about the ugliness of exception handling, but I've never heard anyone offer a coherent explanation of AOP's solution to this. Logging is definitely the AOP-phile's bread and butter.

Now, the AOP lovers are right: logging and exception handling make your code ugly. However, even if there were some AOP framework that could extract all of that code out of my core logic while maintaining correctness and not introducing any cumbersome dependencies, is it really worth it? For two use cases? That don't even bother me that much? I find it hard to believe that an entire paradigm can be built on two use cases, only one of which I've ever seen implemented in a real demo.
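For anyone who has not sat through the logging demo, here is roughly what "extracting the logging out of the core logic" means, sketched with nothing but the JDK's dynamic proxies. This is a stand-in, not how an AOP framework such as AspectJ actually weaves code, and the Greeter interface, PlainGreeter class, and withLogging helper are hypothetical names invented for the example.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

public class LoggingProxySketch {

    /** Hypothetical business interface. */
    public interface Greeter {
        String greet(String name);
    }

    /** The core logic: no logging statements anywhere. */
    public static class PlainGreeter implements Greeter {
        public String greet(String name) {
            return "Hello, " + name;
        }
    }

    /** Wraps target so every interface call is recorded in log before and after it runs. */
    public static Greeter withLogging(final Greeter target, final List<String> log) {
        InvocationHandler handler = new InvocationHandler() {
            public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
                log.add("enter " + method.getName());
                try {
                    return method.invoke(target, args);
                } catch (InvocationTargetException e) {
                    // Rethrow the exception thrown by the real method, not the reflection wrapper.
                    throw e.getCause();
                } finally {
                    log.add("exit " + method.getName());
                }
            }
        };
        return (Greeter) Proxy.newProxyInstance(
                Greeter.class.getClassLoader(),
                new Class<?>[] { Greeter.class },
                handler);
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<String>();
        Greeter greeter = withLogging(new PlainGreeter(), log);
        System.out.println(greeter.greet("Zero"));
        System.out.println(log);
    }
}
```

The logging lives in one place and PlainGreeter stays clean, which is the whole pitch. It also only works for calls that go through an interface, and it adds a layer of reflection to every one of them.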

I guess I'm not alone in feeling this way because the AOP frenzy has died down significantly in the last eighteen months. Even researchers and interns (who will usually buy into anything if there's a chance it will boost their resume) have stopped talking about it. No one believes in it. You can't sell a technology on one example, even if it's a good example. It makes your project look entirely academic.

I don't think that Zero has a problem when it comes to inspiring new ideas, which is all the more reason to avoid limiting ourselves to one scenario in the interest of simplifying the documentation. Fortunately, the team decided that variety was the spice of life, and if each article is a bit longer because it needs to introduce a custom example, so be it. If our forum is any indication, there should be a number of interesting and controversial ideas flooding developerWorks very soon.


Saturday, July 14, 2007

disenchanted

When I first started graduate school, my chosen specialization was generic programming[1], which required me to study and use the C++ Standard Template Library (STL) quite extensively. In fact, I would say that during my later college years, I was an expert user of the STL; I had memorized all of its tricky rules and appreciated its elegance even during its more verbose moments. I really loved using the STL and I didn't care if others found it academic and cumbersome. Plus, once you've invested a lot of time in learning to read the compiler error messages associated with C++ templates, you really want to believe that it's a valuable skill.

Towards the end of my first year, my advisor left for another university and I was faced with a choice: continue studying generic programming or move to a new field. My attention had been drawn to language design and compiler theory for a few months at that point, so I decided to jump ship and start learning more about the coolest topic in computer science. But despite my journeys into the deep, dark corners of compiler theory and development, the fact was that the Java community's tools for building compilers were far better than those in C++ Land, and my days of hardcore C++ hacking were about to come to an end.

It was tough to let go of C++ and parametric polymorphism, but with so many new technologies to learn, they soon faded from my brain. Only occasionally would they leak back in, like when a piece of Java code was snatched from the jaws of elegance because of some perceived deficiency in the Java grammar[2]. Eventually, people that I had taught at RPI - people who looked up to me and sought my advice on the matters of C++ template usage - started to ask me questions to which I could not remember the answers. And not only did I not remember the answers, but I realized that this failure had very little impact on my current job and any foreseeable jobs. I had moved on.

Until.

This past May I joined the IBM team that is working on Project Zero, a development platform that is centered around REST, AJAX, and server-side scripting languages. Project Zero has a number of interesting and excellent qualities, none of which I want to talk about today. This post is about parametric polymorphism and how it has gone from an academic indulgence that made me look smarter than my peers to a professional aggravation that sours my mood and makes me want to throw rocks at children.

Project Zero is built on Java 5.0, and our code is full of its phony template hackery. Lots of people are unimpressed by Sun's attempt to bring templates to the JDK, and I do not wish to rehash their points here. I, too, was disappointed to see that the need for bytecode compliance with previous versions meant Java templates would be limited to front-end trickery, nothing more than a minor convenience that helped programmers identify casting errors; despite this disappointment, I went easy on the Java language designers because I figured that some template support was better than none, and it wasn't hurting anything. It was not until I had to introduce Java templates into my own code that I realized how disappointed I really was, and that they do, in fact, hurt things.

I just cannot believe how much typing one has to do to use Java's templates given that they provide almost zero function. Java's version of parametric polymorphism has all of the verbosity of C++ and its STL, with none of the elegance or utility. It's just more characters for source files that most people already thought were too long. Further, I now realize that even though templates are optional, they are only optional if your whole team agrees to avoid them; once they become part of a core API, they spread to the rest of the code like a virus. Gross.
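To make the complaint concrete, here is a small Java 5-style sketch of both halves of it: the typing, and the fact that the type parameters (generics, in Sun's terminology) are erased before the code runs. The class and method names are hypothetical, invented for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ErasureSketch {

    /** True when two objects share one runtime class, whatever their type parameters were. */
    public static boolean sameRuntimeClass(Object a, Object b) {
        return a.getClass() == b.getClass();
    }

    /** Slips an Integer into a List<String> through a raw reference; this compiles with only a warning. */
    @SuppressWarnings({"unchecked", "rawtypes"})
    public static void pollute(List<String> strings) {
        List raw = strings;            // the type parameter is not visible here
        raw.add(Integer.valueOf(42));  // nothing at runtime stops this
    }

    public static void main(String[] args) {
        // The typing: in Java 5 the full parameter list is spelled out on both sides.
        Map<String, List<Map<String, Object>>> index =
                new HashMap<String, List<Map<String, Object>>>();
        index.put("widgets", new ArrayList<Map<String, Object>>());

        // The erasure: both lists are plain ArrayLists once compiled.
        List<String> names = new ArrayList<String>();
        List<Integer> counts = new ArrayList<Integer>();
        System.out.println(sameRuntimeClass(names, counts));

        pollute(names);
        try {
            String first = names.get(0); // the compiler-inserted cast fails here
            System.out.println(first);
        } catch (ClassCastException e) {
            System.out.println("ClassCastException at the point of use");
        }
    }
}
```

The bad element is accepted when it is added and only blows up later, at the cast the compiler quietly inserts on the way out. That is the "front-end trickery" in action: the checks exist at compile time, and a raw reference is enough to walk around them.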

One of my favorite parts of working on Project Zero is that I get to learn and use a number of different languages in my development work. So far I have learned PHP and Groovy, and I have been teaching myself Ruby on my own time. It's been very educational, not to mention inspiring. I feel like the time has come to replace my primary programming language once more. Like last time, the switch will be neither quick nor easy, but I'm already certain it will be for the best.

[1] Despite its name, generic programming is not about unremarkable software; in fact, it is quite remarkable. Many people have made remarks about it. The fact that a significant portion of those remarks are negative should not deter you from learning more about this topic.

[2] No function pointers! Anger!
