Relearning the joys of DocBook

I remember the first time I looked at Simplified DocBook. I have always enjoyed compiling my writing–I wrote my senior thesis using LaTeX. When I found DocBook, I was hooked–it was easier to use and understand than any of the TeX derivatives, and the Simplified grammar had just what I needed for technical documentation. I used it to write my JAAS article.

But, I remember it being a huge hassle to set up. You had to download openjade, compile it on some systems, set up some environment variables, point to certain configuration files, and in general do quite a bit of fiddling. I grew so exasperated that I didn't even set up the XML to PDF conversion, just the XML to HTML.

Well, I went back a few weeks ago, and found things had improved greatly. With the help of this document explaining how to set DocBook up on Windows (updated 12/2/2006 to fix a broken link), I was able to generate PDF and HTML files quickly. In fact, with the DocBook XSL transformations and the power of FOP, turning a Simplified DocBook article into a snazzy-looking PDF file is as simple as this (stolen from here):


java -cp "C:\Programs\java\fop.jar; \
C:\Programs\java\batik.jar;C:\Programs\java\jimi-1.0.jar; \
C:\Programs\java\xalan.jar; C:\Programs\java\xerces.jar; \
C:\Programs\java\logkit-1.0b4.jar;C:\Programs\java\avalon-framework-4.0.jar" \org.apache.fop.apps.Fop -xsl \ "C:\user\default\xml\stylesheets\docbook-xsl-1.45\fo\docbook.xsl" \ -xml test.xml -pdf test.pdf

Wrap that up in a shell script, and you have a javac for documents.
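Here's a minimal sketch of such a wrapper as a Windows batch file. The jar and stylesheet paths are the ones from the command above, and the script name (db2pdf.bat) is just an example; adjust everything for your own installation:

@echo off
rem db2pdf.bat -- convert a (Simplified) DocBook XML file to PDF with FOP
rem Usage: db2pdf input.xml output.pdf
rem The jar and stylesheet locations below are assumptions; edit to match your setup.
set FOP_CP=C:\Programs\java\fop.jar;C:\Programs\java\batik.jar;C:\Programs\java\jimi-1.0.jar;C:\Programs\java\xalan.jar;C:\Programs\java\xerces.jar;C:\Programs\java\logkit-1.0b4.jar;C:\Programs\java\avalon-framework-4.0.jar
set STYLESHEET=C:\user\default\xml\stylesheets\docbook-xsl-1.45\fo\docbook.xsl
java -cp "%FOP_CP%" org.apache.fop.apps.Fop -xsl "%STYLESHEET%" -xml %1 -pdf %2

After that, db2pdf article.xml article.pdf is about as close to compiling a document as it gets.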

Abstractions, Climbing and Coding

I vividly remember a conversation I had in the late 1990s with a friend in college. He was an old school traditional rock climber; he was born and raised in Grand Teton National Park. We were discussing technology and the changes it wreaks on activities, particularly climbing. He was talking about sport climbing. (For those of you not in the know, there are several different types of outdoor rock climbing. The two I’ll be referring to today are sport climbing and traditional, or trad, climbing. Sport climbers clip existing protection to ensure their safety; traditional climbers insert their own protection gear into cracks.) He was not bagging on sport climbing, but was explaining to me how it opened up the sport of climbing. A rock climber did not need to spend as much money acquiring equipment nor as much time learning to use protection safely. Instead, with sport climbing, one could focus on the act of climbing.

At that moment it struck me that what he was saying was applicable to HTML generation tools (among many, many other things). During that period, I was just becoming aware of some of the WYSIWYG tools available for generating HTML (remember, in the late 1990s, the web was still gaining momentum; I'm not even sure MS Word had 'Save As HTML' until Word 97). Just like trad versus sport, there was an obvious trade off to be made between hand coding HTML and using a tool to generate it. The tool saved the user time, but acted as an abstraction layer, clouding the user's understanding of what was actually happening. In other words, when I coded HTML by hand, I understood everything that was going on. On the other hand, when I used a tool, I was able to make snazzier pages, but didn't understand what was happening. Let's just repeat that—I was able to do something and have it work, all without understanding why it worked! How powerful is that?

This trend towards making complicated things easier happens all the time. After all, the first cars were difficult to start, requiring hand cranking, but now I just get in the car and turn the key. This abstraction process is all well and good, as long as we realize it is happening and are willing to accept the costs. For there are costs, in climbing, but also in software. Joel has something to say on this topic. I saw an example of this cost myself a few months ago, when Tomcat was not behaving as I expected, and I had to work around an abstraction that had failed. I also saw a benefit to this process of abstraction when I was right out of school. In 1999, there was not the body of frameworks and best practices that currently exists. There was a lot of invention from scratch. I saw a shopping cart get built, and wrote a user authentication and authorization system myself. These were good experiences, and it was much easier to support this software, since it was understood from the ground up by the authors. But it was hugely expensive as well.

In climbing terms, I saw this trade off recently when I took a friend (a much better climber than I) trad climbing. She led a pitch far below her climbing level, and yet was twigged out by the need to place her own protection. I imagine that's exactly how I would feel were I required to fix my brakes or debug a compiler. Dropping down to a lower abstraction takes energy, time, and sometimes money. Since you only have a finite amount of time, you need to decide at what abstraction level you want to sit. Of course, this varies depending on the context; when you're working, the abstraction level of Visual Basic may be just fine, because you just need to get this small application written (though you shouldn't expect such an application to scale to multiple users). When you're climbing, you may decide that you need to dig down to the trad level of abstraction in order to go the places you want to go.

I recently read an interview with Richard Rossiter, who has written some of the canonical guidebooks for front range area climbing. When asked where he thought “climbing was going,” Rossiter replied: “My guess is that rock climbing will go toward safety and predictability as more and more people get involved. In other words, sport climbing is here to stay and will only get bigger….” A wise prediction, and analogous to mine: sometimes understanding the nuts and bolts of an application simply isn't necessary. I sympathize. I wouldn't have wanted to go climbing with hobnail boots and manila ropes, as they did in the old days; nor would I have wanted to write my own compiler, as many did in the 1960s. And, as my college friend pointed out, sport climbing does make climbing in general safer and more accessible; you don't have to invest a ton of time learning how to fiddle with equipment that will save your life. At the same time, unless you are one of the few who places bolts, you are trusting someone else's ability to place equipment that will save your life. Just like I've trusted DreamWeaver to create HTML that's readable by browsers—if it doesn't, and I don't know HTML, I have few options.

Note, though, that it is silly for folks who sit at one level of abstraction to denigrate folks at another. After all, what is the real difference between someone using a compiler and someone using DreamWeaver? They’re both trying to get something done, using something that they probably don’t understand. (And if you understand compilers, do you understand chip design? How about photo-lithography? Quantum mechanics? Everyone uses things they don’t understand at some level.)

It is important, however, to realize that even if you are using a higher abstraction level, there's a certain richness and joy that can't be achieved unless you're at the lower level. (The opposite is true as well—I'd hate to deal with strings instead of classes all the time; sport climbing frees me to enjoy movement on the rock.) Lower levels tend to be more complicated (that's what abstraction does—hides complex 'stuff' behind a veneer of simplicity), so fewer folks enjoy the benefits of, say, trad climbing or compiler design. Again, depending on context, it may be well worth your while to dip down and see whether an activity like climbing or coding can be made more fulfilling by attacking it at a lower level. You'll possibly learn a new skill, which, in the computer world, can be a career helper, and, in the climbing world, may save your life some day. You'll also probably appreciate the higher level activities if and when you head back to that level, because you'll have an understanding of the mental and temporal savings that the abstraction provides.

Passwords and authentication

Passwords are omnipresent, but just don’t work the way they should. A password should be a private string that only a user could know. It should be easy to remember, but at the same time hard to guess. It should be changed regularly, and only passed over a secure connection (SSL, ssh). At least, that’s what the password policies I’ve seen say. People, however, get in the way.

I have a friend who always has the same password: 'lemmein'. She is non-technical. Whenever she tries to sign in to a system, she has invariably forgotten her password. She tries different incarnations, and eventually becomes so frustrated that she just types 'lemmein' and, voilà, she is logged in.

I have another friend who is a computer security professional (or was). He has the same issue with forgotten passwords, but rather than have one insecure password, he keeps all his passwords in a file on a machine that he controls, protected by one master password. In this way, he only has to remember the one password, yet the machines those passwords protect aren't put at risk.
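For what it's worth, you can get most of the way to his setup with tools that ship on most unix boxes, for instance gpg's symmetric mode (the file name here is just an example):

gpg -c passwords.txt     # encrypt with a master passphrase; writes passwords.txt.gpg
gpg -d passwords.txt.gpg # prompts for the passphrase and prints the decrypted contents

Just remember to delete the plaintext file once the encrypted copy exists.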

I sympathize with both my friends, since, off the top of my head, I can easily think of ten different passwords that I currently use, for various systems and applications. In fact, the growth of web applications (since the address bar is the new command line) has exploded the number of passwords that I have to remember.

I'm not as blasé about security as my first buddy, nor as together as my second friend, so I just rely on my memory. That works, sometimes. If I seldom visit a site that requires a password, I'll just make use of the 'mail me my password' functionality that most such sites have, and won't even bother to try to remember the password.

Sometimes, password changes are imposed on you. I've been at places where your password had to be changed every three weeks, and had to be different from your previous three passwords. I was only there for a short period of time, but I'm sure that there are some folks who are cycling passwords ('oh, it's one of these four, I know it').

On the other hand, I worked at a place for three years; I had access to a number of web servers, often with sudo, yet I changed my passwords twice. It was just such a tremendous hassle to try to bring all my passwords into sync. (Yes, yes, we should have had an LDAP server responsible for all those passwords; that would have made changing them easier. There are some technical solutions that can ease password pain, at least within one organization.)

Passwords are even used in the 'real world' now. Leaving aside the obvious example of ATM PINs, my bank won't let me do anything serious to my account over the phone unless I know my password.

Passwords do have tremendous advantages. They let me authenticate myself without being physically present. They're easy to carry with you. Computers don't need special hardware or software to authenticate a user via a password. Everyone understands the concept. But even though passwords are really the least of the evils when it comes to authenticating remote users (or entities), they have real problems. They're easy to pass around, or steal, since they aren't physical. And passwords tend to be either easy to forget or easy to crack.

I guess my solution has been to break up my passwords into levels. For simple things like logging into web applications, I have one or two very easy to remember passwords, or I use the 'mail me my password' functionality mentioned above. For more sensitive accounts that I use regularly (computer logins where I'm an administrator of some kind, my email, web applications where my credit card details are viewable), I'll have a more complicated password, which may or may not be shared among similar systems. And for other systems where I need a good password but don't use it regularly, I'll write it down and store it in a safe place.

Passwords are certainly better than using an SSN, a zip code, or some other arbitrary single token that could be stolen. But they certainly aren't the optimal solution. I actually used a userid/biometric solution at a client's office (for the office door), and it rejected me only a very small percentage of the time. The overhead of adding me to the system was apparently fairly substantial, since it took weeks for that to happen. For situations where the hardware is available and deployed, biometric solutions seem like a good fit.

No one, however, is going to add finger/eye/palm scanners to every machine that I want to access, to say nothing of various interesting remote applications (I want my travelocity!). Some scheme where you log in to a single computer that then generates a certificate that uniquely identifies you (something like xauth) may be the best type of solution for general purpose non-physical authentication. But, as a software guy, my mind boggles at the infrastructure needed to support such a solution. Looks like passwords are here to stay for a while.

Slackware to the rescue

I bought a new Windows laptop about nine months ago, to replace the linux desktop that I purchased in 2000. Yesterday, I needed to check to see if I had a file or two on the old desktop computer, but I hadn't logged in for eight months; I had no idea what my password was. Now, I should have a root/boot disk set, even though floppy disks are going the way of cursive. But I didn't. Instead, I had the slackware installation disks from my first venture into linux: an IBM PS/2 with 60 megs of hard drive space, in 1997. I was able to use those disks to load a working, if spartan, linux system into RAM. Then, I mounted the old root partition and used sed (vi being unavailable) to edit the shadow file:

sed 's/root:[^:]*:/root::/' shadow > shadow.new
mv shadow.new shadow

Unmount the partition, reboot, pop the floppy out, and I'm in to find that pesky file. As far as I know, those slackware install disks are the oldest bit of software I own that is still useful.
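For the record, the whole dance looked roughly like this; the device name and mount point are illustrative (I don't remember the exact partition), so substitute your own:

mount /dev/hda1 /mnt                              # mount the old root partition (device name is a guess)
cd /mnt/etc
sed 's/root:[^:]*:/root::/' shadow > shadow.new   # blank out root's password field
mv shadow.new shadow
cd /
umount /mnt
# reboot, pop the floppy out, and log in as root with no password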

New approach to comment spam

Well, after ignoring my blog for a week, and dealing with 100+ comment spams, I’m taking a new tack. I’m not going to rename my comments.cgi script anymore, as that seems to have become less effective.

Instead, I’m closing all comments on any older entry that doesn’t have at least 2 comments. When I go through and delete any comment spam, I just close the entry. This seems to have worked, as I’ve dealt with 2-3 comment spams in the last week, rather than 10+.

I've also considered writing a bit of perl to browse through Movable Type's DBM database to ease the removal of 'tramadol' entries (rather than clicking my way to carpal tunnel). We'll see.

(I don’t even know what’s involved in using MT-Blacklist. Not sure if the return would be worth the effort for my single blog installation.)

Back to google

So, the fundamental browser feature I use the most is this set of keystrokes:
* ctrl-T–open a new tab
* g search term–to search for “search term”
(I set up g so the keyword expands and points to a search engine.)
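For the curious, in Mozilla/Firefox that just means a keyword bookmark: give the bookmark the keyword g and put a %s placeholder in the URL where the search term should go, something like:

http://www.google.com/search?q=%s

Swap in a different engine's search URL and the g keyword follows along.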

Periodically, I’ll hear of a new search engine–a google killer. And I’ll switch my bookmark so that ‘g’ points to the new search engine. I’ve tried AltaVista, Teoma and, lately, IceRocket. Yet, I always return to Google. The others have some nice features–IceRocket shows you images of the pages–and the search results are similar enough. What keeps me coming back to google is the speed of the result set delivery. I guess my attention span has just plain withered.

Anyone else have a google killer I should try?