Introduction: a brief polemic.
A colleague (Doc Searls, editor at Linux Journal) once asked "Why is there no Moore's Law for software?" I believe that part of the answer is "poor documentation". Documentation is the most neglected part of all software projects today -- no matter how excellent the rest of the software.
More precisely, I think what is neglected is what Zen calls "beginner's mind". I don't mean "writing for beginners"; I mean writing documentation with the same "beginning mind" that the developer(s) had when they started a project.
- Why does the project exist?
- What is the original problem?
- Why did we choose this way of solving it?
- What's good about this approach? What's not so good about it?
I believe that this way of thinking must be applied, fractally, at every level of documentation -- from the user manuals to the individual function or class headers. Anyone (who's written code) can write details about how a particular piece of code works or is called. But it's uncomfortable to go back to the state of mind when a project, or even a single function, was part of the formless void -- that part of design that programmers call "nailing jelly to the wall" -- and hold it in our minds (again) long enough to summarize it and write it down. I've literally seen people squirm in their seats when I force them to do this. It really is uncomfortable.
The good news is that actually doing it feels good in the end. It offers closure, and pride in the total work. (How often have you heard a developer caught between pride in their work, and an uneasy disclaimer about "...the documentation's not really all there..."?) It can even be a spiritual discipline, for some. And it honors the sticky, messy, fascinating state of mind we were in when we designed the software in the first place!
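As a minimal sketch of what those four questions might look like at the smallest scale, here is a hypothetical Python function header. The function `slugify_title`, its `max_length` parameter, and the scenario described in its docstring are all invented for illustration; they are not taken from any real project.

```python
import re
import unicodedata


def slugify_title(title: str, max_length: int = 60) -> str:
    """Turn a human-written title into a URL-safe slug.

    Why does this exist?
        (Hypothetical scenario.) Page URLs were being built from raw
        titles, so punctuation and spaces kept producing broken links.

    What is the original problem?
        Authors type titles for people, not for URLs. We need one
        predictable mapping from "whatever they typed" to a short
        string that is safe in a path.

    Why did we choose this way of solving it?
        A single regular expression over ASCII-folded text is easy to
        explain and needs nothing outside the standard library.

    What's good about it? What's not so good?
        Good: deterministic, short, no dependencies.
        Not so good: titles with no Latin letters or digits collapse
        to an empty string, and two different titles can produce the
        same slug. Callers must handle both cases.
    """
    # Decompose accented characters, then drop whatever is not ASCII.
    ascii_text = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Collapse every run of non-alphanumeric characters into one hyphen.
    slug = re.sub(r"[^a-z0-9]+", "-", ascii_text.lower()).strip("-")
    # Truncate, then strip again so the slug never ends on a hyphen.
    return slug[:max_length].rstrip("-")


if __name__ == "__main__":
    print(slugify_title("Why Is There No Moore's Law for Software?"))
```

The docstring says almost nothing about how to call the function, since the signature already covers that. It spends its words on the context that would otherwise have to be reconstructed from the code.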
Returning to the original question... this is one reason we don't get exponential improvement (in use or re-use) of software. We don't know the foundations the software was built on! Without that, everything we do to build on top of a piece of software is guesswork. Or, at best, it requires time-consuming detective work to reason backwards from how something works to why it works.
Samples
OK, end of sermon. Here are some personal samples of how I've tried, imperfectly, to incorporate this belief into my work, in both documentation and code.
- At Q2Learning, I wrote all of the technical documentation, particularly mindful of the fact that we planned to use outsourced developers in China for some projects. So my writing had to be clear, and in particular, had to be very clear about the context -- the whys and wherefores of any particular project. Some public samples include:
- At Interactive Business Systems, I embraced the 'blog concept (short for web-log) as a professional equivalent of the laboratory notebook. My IBS blog records resources, backups, and (most importantly) lessons learned week by week, as I learned them. The weekly 'blog is an invaluable record, especially for fellow team members or as part of the knowledge transfer process.
- The Danly Shopping Experience is a good example of my style of "knowledge transfer", in this case to a client, www.danly.com.
- Caucus is an "industrial-strength" web-based asynchronous "computer conferencing" tool. I wrote it first as a fully internationalized text-only package in the early 1990's, and then rewrote it as a web tool, starting in early 1995.
  - The Caucus Architecture description, circa 1996, is an under-the-hood tour of the design goals and basic architecture, plus a guide to the file layout.
  - The Caucus Markup Language ("CML") Reference Guide defines the web-scripting language I designed and implemented for Caucus. The actual Caucus user interface is all written in CML, so that individual sites could easily customize the interface. Compared to languages like Python, CML is not very smart, but it did put the power where I wanted it -- in the site administrator's hands.
  - func_wrap2html.c is a single 'C' function that is part of the CML interpreter. It tries to intelligently format text that a Caucus user would have typed into a <TEXTAREA> text-box in an HTML page.
  - Viewitem.cml is the largest single CML file in the Caucus user interface. It controls the display of the "item" page, where all of the users' discussions about a single thread occur.
  - Caucus patch history. A good sample of how I track code revisions.
  - Caucus Technical Library. More detailed technical info for the would-be Caucus installer or customizer.
- Hostinfo.pl is a Perl tool I wrote for ComputerTalk to dig out as much info as possible about who visits a web site.
- Game.py is a Python module that represents the "game-state" of a chess board. It is part of a Python rewrite (2000) of a chess program that I first wrote in Algol (1977), rewrote in PL/1 (1978), and completely redesigned in C (1985). A nice non-trivial hobby project for sinking my teeth into a new language.
- Showchildren.i is a WebSphere function (Net.Data macro language) that recursively displays the selected product category in a shopping hierarchy. The language is pretty gruesome, but it was possible to simulate real programming.
- The Properties Master's User Guide is a good example of my non-technical documentation -- in this case, a theatre manual for UMGASS, the University of Michigan Gilbert & Sullivan Society.
- My IBS Resources/Log Page is a good example of how I like to track my work and information I uncover, especially when it doesn't (yet) obviously fit into some larger whole.
- I was a registered beta-tester for Microsoft's Internet Explorer 5.0; I described one insidious little bug at ie_location.html.