Tuesday, July 31, 2007

restdoc

I've been working on a new tool for Zero, and I'm going to post some background information here (rather than the forum) because it will be easier to dig up later on.

Designing and implementing Zero applications usually starts with the creation of a REST API, which can then be documented using a REST table (or Gregorio table, as we sometimes call them). Comparing a REST table to a WSDL document is an excellent way to demonstrate ease-of-use differences between REST and WS-*. Upon reading one of these tables, one already has the mental model needed to write code that uses the documented service; REST tables manage to fit the important details (URI structure, method names, status codes, etc.) into a fairly spartan layout, while WSDL documents require pages of XML that is not really meant to be parsed by humans.

Of course, there is no official mapping of these REST APIs to the code that fulfills them, and so there is no RESTful code generation or documentation extraction like there is in WS-Land. I've had to create a lot of REST tables while working on Zero, and while it wasn't hard, it was kind of tedious because I had to duplicate the same information in my code comments[1]. Having documented my REST APIs manually and not wanting to do it again, I decided to create a tool that would read through my code, pull out JavaDoc-style comments, and turn them into a pretty HTML page full of REST tables.

RESTdoc works just like JavaDoc, except that it reads Groovy and PHP scripts that follow Zero's REST conventions[2]. RESTdoc's own JavaDoc provides the following overview:
To use RESTdoc, you must first include RESTdoc-style comments in your Zero scripts. RESTdoc comments are just like JavaDoc comments, with the following new tags: success, error, and format. The new tags are used to list the HTTP status codes for success and errors, as well as the expected data format in the request or response body for each method. RESTdoc will ignore any method that is not a Zero REST method (onList(), etc.) or that does not have RESTdoc-style documentation preceding its definition. Like Zero itself, RESTdoc only supports Groovy and PHP scripts, but it is possible to add other languages in the future.

To run RESTdoc from your Zero application's directory, just type:

restdoc

By default, RESTdoc will look in ./app/resources for Zero REST scripts and it will save the generated HTML page in ./docs/rest. If you want to run RESTdoc from a script that is not in your application's directory, you can use the input and output flags to override the defaults:

restdoc -input /my-app/app/resources -output /my-docs

Finally, RESTdoc does not produce any output if all goes well. You can turn on console logging using the verbose flag if you're having trouble debugging a problem:

restdoc -verbose

I've converted some of our sample applications to use RESTdoc-style comments, and so far, the tool works well. The HTML isn't beautiful, but neither is the stuff generated by JavaDoc. The important thing is that programmers can write comments using the familiar JavaDoc style and automatically get a nice document describing their application's RESTful public interface.

When I first got the idea to make this tool, I was pretty excited because I saw it as an opportunity to apply my elite compiler skillz. It's pretty hard to find a job where you get the chance to work on a new programming language, but creating developer tools often puts you in a situation where some level of formal AST-based parsing and analysis is needed to implement a feature correctly, and I enjoy those situations immensely. Of course, once I got started, I realized that parsing Groovy, PHP, and possibly Java source files with a complete compiler front end would require me to drag in a ton of dependencies, dependencies that zero.core already has but zero.tools does not. I was pretty sure the Zero tooling team would not be happy about my request to add N MB of Groovy and PHP-related JARs to their distribution, not to mention the possibility of adding Java-related JARs of questionable origin[3]. After looking at all my options, I realized that I would need to trade in my elite skillz for some dubious regex-based hacks if I was going to make a tool that was consumable by the rest of the team. Oh well.

The current code uses all sorts of string searching and pattern matching that would best be handled by a parser generator, but is instead handled by me. The net of this is a 32 KB binary, which includes the Ant task that enables RESTdoc to be called using Zero's command line interface. I'm not sure if RESTdoc will end up being included in Zero, but if it does, I'll be sure to update this post.
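For the curious, here is a rough sketch of the kind of regex-based extraction described above. It is written in Python for brevity and is not RESTdoc's actual code. The `@success`/`@error`/`@format` spelling of the tags, the `onCreate()` handler name, and the sample Groovy script are all assumptions made for illustration.

```python
import re

# Hypothetical Groovy source. The @-prefixed tag spelling and every handler
# name other than onList() are assumptions made for this sketch.
SAMPLE = '''
/**
 * Lists all of the widgets.
 *
 * @success 200 The list was returned.
 * @error 500 The list could not be read.
 * @format JSON
 */
def onList() {
}

def helper() {
}

/**
 * Creates a widget.
 *
 * @success 201 The widget was created.
 * @format JSON
 */
def onCreate() {
}
'''

# A doc comment immediately followed by the definition of an on<Name>() method
# (Groovy "def" or PHP "function"). The body may not run past the first "*/".
DOC_AND_METHOD = re.compile(
    r"/\*\*(?P<body>(?:(?!\*/).)*)\*/\s*(?:def|function)\s+(?P<name>on[A-Z]\w*)\s*\(",
    re.DOTALL,
)
TAG = re.compile(r"@(success|error|format)\s+(.*)")


def extract(source):
    """Return one dict per documented REST method found in source."""
    methods = []
    for match in DOC_AND_METHOD.finditer(source):
        entry = {"name": match.group("name"), "summary": "",
                 "success": [], "error": [], "format": []}
        for raw in match.group("body").splitlines():
            # Strip the leading " * " decoration from each comment line.
            line = re.sub(r"^\s*\*\s?", "", raw).rstrip()
            tag = TAG.match(line)
            if tag:
                entry[tag.group(1)].append(tag.group(2))
            elif line and not entry["summary"]:
                entry["summary"] = line  # first plain line is the summary
        methods.append(entry)
    return methods


for method in extract(SAMPLE):
    print(method["name"], method["success"], method["error"], method["format"])
```

Methods without a preceding doc comment (like `helper()` here) and doc comments that precede anything other than an `on...()` method simply never match, which mirrors the "ignore anything that isn't a documented REST method" behavior described in the overview.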

[1] Also, I hate making HTML tables. Hate.

[2] I've designed the code so that it can easily handle the addition of new languages to Zero, but I don't see that happening anytime soon.

[3] IBM Legal questions the origin of everything, no matter how obvious it may seem.


Friday, July 27, 2007

satisfaction

Coté: The culture of Java design is to push out commitment to a given way of doing things (an "implementation") as much as possible. In Java culture, dependencies, especially conceptual ones, are nasty and to be avoided. They're taboo, even.

First, this guy really has our number[1]. Second, I would go further and say that the overuse of interfaces and patterns not only helps us avoid commitment, it also helps us rationalize decisions to replace code written by other people with our own engineering masterpieces. Overly abstract and unhelpful interfaces like the ones in JNDI give you much more flexibility when you're trying to satiate your NIH demons. Where else are programmers going to get that kind of ego-inflating satisfaction? The dating scene? I don't think so.

[1] We being Java programmers who work for large corporations.


Thursday, July 26, 2007

obsession

A few days ago, members of the Zero team got together with editors from IBM's developerWorks team to discuss the Zero-oriented articles we wanted to publish, as well as the administrivia of the publishing process. One of the issues we talked about was whether to use a single, consistent example across every article; proponents said this would minimize redundant writing and allow authors to get to the heart of the matter more quickly, while opponents said it would make authorship cumbersome and discourage exploration of new ideas. I think this is an interesting issue that goes beyond what makes good copy and will affect people's overall perception of Zero.

From my perspective, the argument for reusing a scenario across multiple articles is only attractive when I consider my personal return on investment. IBM pays developerWorks authors for their work (whether they're IBM employees or not), and the less time I spend on each article, the more dollars-per-hour I earn. I really like money, so this argument does not fall on deaf ears. Less work, more money - what's not to like?

Unfortunately, the single-scenario approach does more harm than good. Every new project has its share of hype, but when you couple that hype with one example used over and over again, the whole thing starts to reek of technology for the sake of technology. If you think back on some of the technologies that have recently fizzled after years of empty promises, you'll notice that all of them were banking on one or two contrived examples to inspire people and make them stars.

Aspect-oriented programming (AOP) is a great example of technology that is based on a single-scenario obsession. AOP proponents speak in grandiose terms about code simplification and feature injection, but when you press them for details, they always fall back on the same example: logging. Your code is littered with logging statements! they huff. How can you even read your code? It's chaos! If pressed for another example, they'll usually stammer about the ugliness of exception handling, but I've never heard anyone offer a coherent explanation of AOP's solution to this. Logging is definitely the AOP-phile's bread and butter.

Now, the AOP lovers are right: logging and exception handling make your code ugly. However, even if there were some AOP framework that could extract all of that code out of my core logic while maintaining correctness and not introducing any cumbersome dependencies, is it really worth it? For two use cases? That don't even bother me that much? I find it hard to believe that an entire paradigm can be built on two use cases, only one of which I've ever seen implemented in a real demo.

I guess I'm not alone in feeling this way because the AOP frenzy has died down significantly in the last eighteen months. Even researchers and interns (who will usually buy into anything if there's a chance it will boost their resume) have stopped talking about it. No one believes in it. You can't sell a technology on one example, even if it's a good example. It makes your project look entirely academic.

I don't think that Zero has a problem when it comes to inspiring new ideas, which is all the more reason to avoid limiting ourselves to one scenario in the interest of simplifying the documentation. Fortunately, the team decided that variety was the spice of life, and if each article is a bit longer because it needs to introduce a custom example, so be it. If our forum is any indication, there should be a number of interesting and controversial ideas flooding developerWorks very soon.


Thursday, July 19, 2007

greed

Pat Mueller: But honestly, I'd prefer to see none of these [Triangle cities] on any lists. We already know it's a good place to live. Until we start using impact fees or transfer taxes to help cover infrastructure required by all the new people coming in, problems like poor transportation options and overcrowded schools will only get worse.

Those familiar with my opinions on city planning already know that I despise the Triangle's unchecked growth and shameless conversion of trees into tax revenue. What was once a very pretty area is now overrun with shopping centers and McMansions, with road improvements and schools lagging far behind. The area continues to outperform most of the nation's cities when periodicals compile their Best Places to Live reports, but I think it may have jumped the shark. Le sigh.

Of course, for all of the complaints that residents make when they realize that there are now six grocery stores within a mile of their home, few of them show up to the town meetings that decide the fate of the surrounding land. I attended a meeting of Cary's Planning and Zoning Board last month to support a group of citizens who are fighting against rampant growth at one of the town's most important intersections. When it came time for the public hearing, dozens of residents waited in line to voice their opposition to a developer's plan to add a significant number of commercial buildings to his plot. I would estimate there were ten opponents for every supporter, and the supporters were all employed by the owner or the developer of the plot. It was clear that, as far as the citizens were concerned, the development was a bust.

Almost.

One of the last speakers at the meeting was the original owner of the land in question. For those of you who are not from Cary and do not drive through it on your way to work, this land includes a solid seventy acres of farm land and forest. There is a pretty white farm house on the corner of the intersection, and a beautiful red barn. From one side of the property, you can see a fishing pond at the bottom of a hill, about fifty yards from the house. The property is surrounded by white fencing and used to be home to a stable of horses. If I had to guess at the value (from the perspective of someone who will build dozens of residential and retail outfits on it), I'd peg it at ten million dollars[1].

The man who had sold his land and was guiding its fate came to the podium and introduced himself this way:
Hello. My name is Bill Sears. I was born at the corner of High House and Davis, just as my father before me, and my grandfather before him. If half as many people as are here tonight had shown up when the town decided to take my house for the widening of Davis Drive, I would still be living in it today, and we wouldn't be here.
Ouch. At that moment, I knew it was over. He went on to talk about his intentions, and how everything he was doing was legal and approved by the town, but it wasn't necessary. There was one last speaker that night who tried to bring the citizens' message back to life, but it paled in comparison to this man's bitter dismissal of an anti-growth message from the same people whose desire for faster roads had made his house uninhabitable. Final vote? 5 - 1 to go forth and build until you couldn't build anymore.

Where was I going with this? Oh, yes: first, community participation is not a part-time job. Second, I don't think that Pat's plan to use fees to prevent people from moving here will stop developers from building and/or paying the fees as "incentives"; losing a few thousand dollars on a sale isn't a big deal when you've got people chomping at the bit to pay $400,000 for half an acre of beige. Rather than fees, I think this county needs legislation requiring the development of entire schools, hospitals, and roads around the area of construction. Requirements such as the construction of a school could easily send costs soaring if not managed properly, and for some projects, the risk factor will be too high.

If you want to prevent greed from ruining our cities and towns, you have to target the source of the greed, and the source is not people who don't even live here yet. The new residents are easier to pick on, but they are just taking advantage of bad decisions made well before their time.

[1] This estimate is based on the development that has occurred on the other corners of the intersection, all of which is smaller than the proposal under debate.


Tuesday, July 17, 2007

reasonable

Sun CEO Jonathan Schwartz has posted a commentary on the behavior of corporate bloggers, and everyone should read it. Plenty of A-List bloggers and technical leaders have posted their thoughts on effective blogging ad nauseam, but Jonathan's post touches on an area that still leaves many corporations confused and afraid: personal responsibility.

I think the matter of personal responsibility is much simpler than most corporations make it out to be: you wouldn't spill corporate secrets in a crowded bar or insult a competitor at a professional conference, so why would you do these things in a blog or forum? Usually when people get themselves into trouble it's because they've decided to say unsavory things over the Internet despite the fact that their online persona is tied heavily to their job; for some reason, these people believe the Internet is still some magical playground that only geeks know about, a place where they can post clammy, mustard-stained rants without ever being held accountable for their words. This may be true if you're using an online persona like joecool997 and writing on your MySpace blog, but the minute you start using your real name and real background data, you should realize that anything you write can and will be associated with you later on. Just like in real life!

When we were about to go live with Zero, a number of IBMers who had not previously engaged any open source communities expressed concern over what would happen if one of them said The Wrong Thing on our forum. Again, this seems to be a pervasive feeling throughout big corporations, but when you apply common sense to the situation, the paranoia starts to cool off. It's not as though our employees were going to turn into misogynistic hatemongers with Tourette's just because we flipped the switch and opened Zero to the public; in fact, their forum posts are exactly the same as if the site were still internal. There are rare exceptions in the case of IBM confidential material, but for the most part it's just a discussion between people, some of whom are IBMers and some of whom are not. The rules of reasonable discussion are the same for both sets of humans.

Anyway.

Using common sense during public discourse is not a unique suggestion. What is unique is Jonathan's last paragraph, where he predicts a shift in the way people refer to bloggers or blogging:
But I'd love it if we one day eliminated the term "blogging" from the web lexicon (and that we stopped pursuing "CEO's who blog."). CEO's who have cell phones aren't "cell-phoners," those who have email accounts arent "emailers," those who give interviews on television aren't "TV'ers" - they're all leaders using technology to communicate.
This is a fantastic point. Aside from the populist, feel-good nature of his explanation, removing blogging from the lexicon would also free us from one of the ugliest and most cringe-inducing words ever added to the English language. I hate this word, and it bothers me that I've relented and used it in my own posts.

blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog blog

Terrible word. Just terrible.


Monday, July 16, 2007

abandoned

After lots of searching, I have learned that Blogger will not allow me to create a list of labels (categories) if my blog uses one of their classic templates. Since all blogs that are published via FTP are required to use classic templates, I guess I'm out of luck. It seems that other people have run up against this issue before and hacked around it, but I have no interest in such hacks. The public will have to go without.


labels

I just realized that my blog's template does not include a list of post categories (which the Blogger team has decided to call labels); I'm going to fix that today, but whenever I make a template change, Blogger re-publishes all of my posts rather than just updating all of the HTML pages. In other words, those of you reading my Atom feed may receive a bunch of old posts today. Sorry.


Saturday, July 14, 2007

disenchanted

When I first started graduate school, my chosen specialization was generic programming[1], which required me to study and use the C++ Standard Template Library (STL) quite extensively. In fact, I would say that during my later college years, I was an expert user of the STL; I had memorized all of its tricky rules and appreciated its elegance even during its more verbose moments. I really loved using the STL and I didn't care if others found it academic and cumbersome. Plus, once you've invested a lot of time in learning to read the compiler error messages associated with C++ templates, you really want to believe that it's a valuable skill.

Towards the end of my first year, my advisor left for another university and I was faced with a choice: continue studying generic programming or move to a new field. My attention had been drawn to language design and compiler theory for a few months at that point, so I decided to jump ship and start learning more about the coolest topic in computer science. But despite my journeys into the deep, dark corners of compiler theory and development, the fact was that the Java community's tools for building compilers were far better than those in C++ Land, and my days of hardcore C++ hacking were about to come to an end.

It was tough to let go of C++ and parametric polymorphism, but with so many new technologies to learn, they soon faded from my brain. Only occasionally would they leak back in, like when a piece of Java code was snatched from the jaws of elegance because of some perceived deficiency in the Java grammar[2]. Eventually, people that I had taught at RPI - people who looked up to me and sought my advice on the matters of C++ template usage - started to ask me questions to which I could not remember the answers. And not only did I not remember the answers, but I realized that this failure had very little impact on my current job and any foreseeable jobs. I had moved on.

Until.

This past May I joined the IBM team that is working on Project Zero, a development platform that is centered around REST, AJAX, and server-side scripting languages. Project Zero has a number of interesting and excellent qualities, none of which I want to talk about today. This post is about parametric polymorphism and how it has gone from an academic indulgence that made me look smarter than my peers to a professional aggravation that sours my mood and makes me want to throw rocks at children.

Project Zero is built on Java 5.0, and our code is full of its phony template hackery. Lots of people are unimpressed by Sun's attempt to bring templates to the JDK, and I do not wish to rehash their points here. I, too, was disappointed to see that the need for bytecode compliance with previous versions meant Java templates would be limited to front-end trickery, nothing more than a minor convenience that helped programmers identify casting errors; despite this disappointment, I went easy on the Java language designers because I figured that some template support was better than none, and it wasn't hurting anything. It was not until I had to introduce Java templates into my own code that I realized how disappointed I really was, and that they do, in fact, hurt things.

I just cannot believe how much typing one has to do to use Java's templates given that they provide almost zero function. Java's version of parametric polymorphism has all of the verbosity of C++ and its STL, with none of the elegance or utility. It's just more characters for source files that most people already thought were too long. Further, I now realize that even though templates are optional, they are only optional if your whole team agrees to avoid them; once they become part of a core API, they spread to the rest of the code like a virus. Gross.

One of my favorite parts of working on Project Zero is that I get to learn and use a number of different languages in my development work. So far I have learned PHP and Groovy, and I have been teaching myself Ruby on my own time. It's been very educational, not to mention inspiring. I feel like the time has come to replace my primary programming language once more. Like last time, the switch will be neither quick nor easy, but I'm already certain it will be for the best.

[1] Despite its name, generic programming is not about unremarkable software; in fact, it is quite remarkable. Many people have made remarks about it. The fact that a significant portion of those remarks are negative should not deter you from learning more about this topic.

[2] No function pointers! Anger!


Friday, July 13, 2007

obstacles

Per my last post, I have started working on an Ant script to cleanse my blog pages of Blogger's unsightly navigation bar. The most significant obstacle I am up against is the traversal of directory trees using Ant's ftp task. The ftp task allows you to specify a file set that spans sub-directories (**/*.html, etc.), but creating a listing with this pattern will result in a set of file names without relative paths:
07-10-07  01:24PM                18669 grumpy.htm
07-10-07  01:24PM                17914 neurosis.htm
07-11-07  11:24AM                18001 populism.htm
07-10-07  01:24PM                16911 three.htm
07-11-07  11:24AM                11247 atom.xml
07-11-07  11:24AM                17028 default.htm
If I try to list all artifacts in the current directory level and traverse the tree manually, I still find myself parsing a list that doesn't include directories. What bothers me is that creating a listing with an FTP client directly (using dir or ls) does show the sub-directories, so the ftp task must be filtering them out:
07-06-07  01:15PM       <DIR>          2007
07-10-07  01:11PM       <DIR>          archives
07-11-07  11:24AM                      atom.xml
07-11-07  11:24AM                      default.htm
07-06-07  11:56AM       <DIR>          images
07-06-07  01:18PM       <DIR>          labels
I took a look at the ftp code, and it's using the FTP client from Apache Commons. It looks like most of the magic is tied up in FTPClient.listFiles(), but a quick look at that code made it clear that a quick look would not suffice. Bother. I need to figure this out or I will not be able to re-upload the files after I've modified them.
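As a point of comparison, here is a minimal sketch of the manual traversal in Python rather than Ant. It assumes the raw listing lines look like the DOS-style output shown above (date, time, then either `<DIR>` or a size, then the name); the function names and the fake in-memory "server" are hypothetical and exist only to make the example runnable without a real FTP connection.

```python
import re

# Matches the DOS-style lines shown above. The format is inferred from the
# sample listings, not from any FTP specification.
ENTRY = re.compile(r"^(\d{2}-\d{2}-\d{2})\s+(\d{2}:\d{2}[AP]M)\s+(<DIR>|\d+)\s+(.+)$")


def split_listing(lines):
    """Return (directories, files) parsed from raw listing lines."""
    directories, files = [], []
    for line in lines:
        match = ENTRY.match(line.strip())
        if not match:
            continue  # skip blank or unrecognized lines
        kind, name = match.group(3), match.group(4)
        (directories if kind == "<DIR>" else files).append(name)
    return directories, files


def walk(list_directory, path=""):
    """Yield relative paths of all files, recursing into sub-directories.

    list_directory is a caller-supplied function (hypothetical) that returns
    the raw listing lines for a given relative path.
    """
    directories, files = split_listing(list_directory(path))
    for name in files:
        yield f"{path}/{name}" if path else name
    for name in directories:
        yield from walk(list_directory, f"{path}/{name}" if path else name)


# Stand-in for a real server: maps a relative path to its listing lines.
FAKE_SERVER = {
    "": ["07-06-07  01:15PM       <DIR>          2007",
         "07-11-07  11:24AM                11247 atom.xml"],
    "2007": ["07-10-07  01:24PM                18669 grumpy.htm"],
}

print(list(walk(lambda path: FAKE_SERVER.get(path, []))))
```

Against a real server, the `list_directory` callback could collect lines with `ftplib.FTP.retrlines("LIST " + path, lines.append)`, assuming the server answers with this same listing format. The point is only that keeping the `<DIR>` entries makes it possible to rebuild relative paths.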


Tuesday, July 10, 2007

populism

The people have spoken, and they want comments, archives, and auto-discovery for the Atom feed. I have heard their words, and I am pleased to present all three features in my latest set of updates. Power to the people.

At the same time, I can't tell you how much it enrages me that Google modifies my blog template to include the blue navigation bar you see at the top of this site. The <iframe/> that houses this bar is not inserted until just before publishing, making it impossible to remove using the Blogger UI. This free advertising trickery is inconsistent with the rest of Google's services and it makes me want to jam a pen into my eye every time I look at this site. My current plan is to write an Ant script that pulls down the files modified by the latest post, removes the offending <iframe/> element, and puts the files back before anyone is the wiser. Power to the people!
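The "remove the offending element" step can be sketched in a few lines. This is a Python stand-in for the eventual Ant script, not the script itself, and the iframe's `src` and `id` values in the sample page are made up for illustration. A regex is a blunt instrument for HTML, so treat this as a hack that assumes the iframe is not nested and contains no `>` inside its attributes.

```python
import re

# Matches either a self-closing <iframe .../> or an <iframe ...>...</iframe> pair.
IFRAME = re.compile(
    r"<iframe\b[^>]*?/>|<iframe\b[^>]*>.*?</iframe\s*>",
    re.IGNORECASE | re.DOTALL,
)


def strip_iframes(html):
    """Return html with every iframe element removed."""
    return IFRAME.sub("", html)


# Hypothetical published page; the attribute values are invented.
page = ('<body><iframe src="http://example.com/navbar" id="navbar-iframe">'
        '</iframe><h1>populism</h1></body>')

print(strip_iframes(page))
```

A proper HTML parser would be more robust, but for a single known element injected in a predictable way, a pattern like this is probably enough.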


Monday, July 9, 2007

neurosis

Those of you who subscribe to this blog's Atom feed may notice that most of my posts are updated frequently in the hours that follow their initial arrival. Occasionally, the updates will come days or weeks afterward. Editing blog posts is a touchy subject for many bloggers, because undocumented edits give the impression that the author is trying to retract something controversial or shirk responsibility. I would like to take this opportunity to assure you, the reader, that my edits involve none of these shameful behaviors; the fact is, I'm extremely neurotic when it comes to copy editing and layout.

In fact, neurotic is being polite. Reading a blog post with misspellings, grammatical errors, orphaned words, or asymmetrical formatting leaves me in a mental state that borders on OCD. In my head, there is no reason for any of these atrocities to happen in a premeditated missive directed at the general public.

The irony of the situation is that half of these changes - the ones focused on formatting - serve no real purpose for those subscribed to the feed, because they're reading it in their feed reader of choice, not in a browser. In fact, changes made for the sake of Web 1.0 luddites actually irritate those who have adopted the Web 2.0 technology that I promote so vigorously in my day job. To those that are suffering from my Atom 1.0-compliant OCD, I apologize, but do not expect me to stop my relentless editing. Just be happy that you don't have to share a code base with me and tolerate patch upon patch of JavaDoc corrections for your code each week.


Sunday, July 8, 2007

grumpy

Since the introduction of Zero last weekend, quite a few bloggers have gotten their shorts in a knot over the license that governs the use of Zero software. Given the huge swell of support there has been for open source software in recent years, some disappointment was expected; after all, IBM has contributed to a number of open source projects, and no other community is more open and free-spirited than the one focused on REST and RIA. It seems like the perfect match!

Despite this disappointment, I still find the jaded dismissal of the project by popular geek bloggers to be a bit over-the-top. Most of their dismissals are based on cursory inspection of the web site and the fairly narrow-minded assumption that because open source projects have become popular in the last three or four years, thirty years of industry behavior is now irrelevant and no one except for Microsoft will sell a proprietary software platform ever again. For a group that is usually excited to see new technologies sprout up in the areas of REST and RIA, there's an awful lot of grumpiness surrounding this arrival. It reminds me of the Grumpy Old Man character that Dana Carvey used to portray on SNL's Weekend Update:
In my day, we didn't use software written by big companies. If you wanted to run a program that belonged to a big company, you just re-wrote it! In K&R C! And then you printed out the code and mailed it to the company employees, and you laughed at them, and said "Look at me, I re-wrote your program in one day and it's eight times faster and cures baldness! You're all worthless programmers!" And then you threw the code away, just to spite them! And if you ever had to run the program again, you just said "Flobble-dee-flee!" and you re-wrote it. And that's the way it was, and we liked it! We loved it!
It's been almost a week since those first rants started to roll in, and so far I have held back on my desire to fire back; at this point, I feel that I can safely ignore them and focus on more positive things surrounding Zero. Regarding future discussions, I welcome debate on the merits of IBM's decision to keep Zero proprietary and its ultimate effect on the success of the project, but I hope that future blog posts will be a little more thorough in their research and commentary. And less grumpy.


Friday, July 6, 2007

three

Today is the third anniversary of my voluntary servitude with the IBM Corporation. Coincidentally, it is also the day I start my third blog under this domain. The former has had a very negative effect on the latter thus far, not because IBM is anti-blogging, but because I've had a lot of work to do.

Anyway, the team I currently work for has recently gone public with its plans for a RESTful development platform, and is allowing anyone with an Internet connection to view and comment on our code, forums, and processes. All of this has inspired me to give blogging one more go, and today seemed like the perfect day to throw my hat in the ring.

Third time's a charm? Let's hope so.
