Why open source?

Updated 2/25/2007: Corrected url.

This page has an eloquent explanation of some of the motivating factors behind the open source movement.

In case you aren’t a computer geek, the term open source (free software is another name) refers to computer programs that you can download and share with your friends. Licenses vary, but a common one (the GPL) specifies that you can do whatever you want with the software that is licensed under it, but if you redistribute any changes, you have to make them available under the same terms that the code was originally made available to you.

Some software is core business software. I was talking to a consultant who dealt with telecommunications companies. Their billing and minute tracking software really is part of their core competency. You can use that software to actually make the company more efficient in scalable ways. Ditto companies that make pacemakers–the software is entwined with their hardware and is really integral to the product.

But for many businesses, there are huge swathes of software that aren’t integral in the same way. They’re needed, but for their supporting functionality, not for the processes that they enable. For example, the web server that hosts a company’s web pages is not integral. The office suite is not a fundamental part of your business processes. The macros and files and VB programs that you write on top of an office suite probably are, but the bland office suite is not.

When software is written that defines a business process, then it is integral. When it’s a supporting platform for the business process, it’s not. And, as Bruce argues in the article above, when it isn’t integral, there are very good reasons to push the software out into the world and share the cost of maintenance.

Oh, and any discussion of open source software would be remiss if it didn’t link to The Cathedral and the Bazaar.

Book Review: The Worthing Saga

I recently re-read The Worthing Saga by Orson Scott Card. This book really matters to me, on a number of different levels. It’s not his most touching novel and by no means does it have the best characters. But it examines the nature of morality in a direct, simple manner that I’ve not found in too many other books.

The premise is that, due to genetics, a race of super beings exists, and they’ve saved humankind from all pain–they watch over everyone else. No more physical injuries–if you cut off your hand, they can heal it from afar. No more mental anguish–if your parent dies, they make it seem as though it was a year ago. No more social problems–bastards are prevented in the womb, and similar actions have no consequences.

And that’s the fundamental issue. What does it mean to be an adult human being when actions have no consequences? Without choice, what is morality? These are issues that religions and philosophers have struggled with for thousands and thousands of years, but I like Card’s answer.

In addition to the main novella, the book also contains a set of short stories that ‘back up’ the main one. Just as the Silmarillion, while not a fantastic read, enhances your appreciation of Middle Earth, these backing stories add depth to the Worthing universe. It’s not often that you get a chance to read this underlying material, and that’s another thing that makes this book unique.

It’s also fantastic to see Orson Scott Card evolve as a writer. He was able to pick and choose the best of these short stories, but even so, you can still see him pay homage to the writers he read (as he mentions in the preface) as well as develop ideas of his own.

It’s a great book, and I highly recommend it.

Documentation

I love documentation. I like writing it, and when it’s well written, I love reading it. There are many types of documentation, and they aren’t all the same. Some serves to illustrate what you can do with a product (think the little product manuals that everyone throws away). Some serves to nail down exactly what will be done (in software, between two business parties, etc). But what I’m writing about today is software documentation, especially programmer to programmer documentation.

I love it for a number of reasons. Good documentation cuts down on communication between software engineers, hence increasing scalability. At the company where I used to work, each developer had their own instance of the application server to which we were developing (whether it was ATG Dynamo, Weblogic, or Tomcat). So, every time a new developer rolled on to the project, they had to be set up. Either the programmer had to do it, or someone else did. On a couple of the projects, I was involved in setting up the first one or two, but I quickly tired of that. So, I wrote a step by step document that enabled the incoming programmer to do the setup themselves. This was good for me, because it saved me time, good for the programmer as it gave them a greater understanding of the platform on which they were developing, and good for the project, as if I got hit by a bus, the knowledge of how to set up a server wasn’t lost.

Good documentation also has come to my rescue more than once, by saving information that I struggled to find at one time, but didn’t use every day. For example, I imported a project I’m working on into Eclipse. It wasn’t strenuous, but it wasn’t a cakewalk either. So, for other programmers on the project, I wrote down how I did it. Now, a few months later, I couldn’t tell you how I did it. Not at all–that knowledge has been forced out of my brain by other more important stuff–like when my parents’ birthdays are, what I’m going to bring to my potluck tonight, the name of that game where you roll plastic pigs around and score points based on their position–you know, important stuff. But, should I have a need to do another import, I can! I know the knowledge is stored somewhere safe (in CVS, but that’s a different entry).

There are two complaints about programmer to programmer documentation that I’d like to address. One is that it quickly becomes outdated. This is true. It takes an effort to maintain documentation. When I change the procedure or meaning of something, I try to remind myself of the two benefits above. If I can convince myself that I will save more time in the long run by documenting (through not having to explain the changes to others or myself), then I do it. I’m not always successful, I’ll admit. And you can see this with product documentation (both closed and open source). Out of date documentation can be very frustrating, and I’m not sure whether it’s better dealt with by tossing the documentation or by keeping it and marking it ‘OUT OF DATE.’

The other issue is what I call the ‘protecting your job’ excuse for avoiding documentation. If you don’t document what you’ve done, you probably will have a secure job–especially if it’s an important piece of work. But that security is also a chain that binds. In addition to being a subtle gesture of distrust towards your management (always a good idea to torque off your management in this time of uncertainty), it means that when a different, and possibly better, opportunity comes along, you won’t be able to take it. Since no one else knows how to do your job (because teaching someone also is a form of documenting) you’re stuck in the same position. Not exactly good for your personal growth, eh?

In short, documentation that gets used is good documentation, and well worth the effort to write.

Book Review: Second Edition of “A Programmer’s Guide to Java Certification”

Updated 2/25/2007: Added amazon link.

I used “A Programmer’s Guide to Java Certification” as a study guide for achieving my Java Certified Programmer (JCP) status two years ago, so when I had the chance to review the second edition, I jumped at it (full disclosure: the publisher sent me the second edition to review). As I expected, I was again aghast and delighted at the level of detail, the exercises and the arrangement of this fine book.

Mughal and Rasmussen do a good job of covering all the nitty gritty details that the JCP requires one to know. Whether the length in bits of an int, the difference between overloading and overriding, or the order in which initializer expressions get executed, this book gives one enough detail to overwhelm the novice Java programmer, as well as cause those more experienced to scratch their heads and perhaps write a small program to verify what was read was valid. While this book lacks the discussion of I/O and the GUI of the previous edition (due to changes in the JCP test), it has a fine set of chapters on some of the fundamental libraries and classes. My two favorite explications are the chapter on Threads (Chapter 9), where that complicated subject is treated well enough to motivate more learning while not overwhelming the reader with detail, and the String and StringBuffer section of Chapter 10. So much of the Java programming I’ve done has been dealing with Strings, so this section, which covers the String class method by method and deals with issues of memory and performance as well as normal use, is very welcome.

The exercises were crucial to my passing the JCP, and they remain useful in this book. Grouped at the end of logical sections of chapters, they break up the text and re-iterate the lessons learned in the previous sections. The answers to these exercises are in the back of the book. Also, a full mock exam is included at the back, as well as an annotated version of the JCP exam requirements which serves as a study guide (both for the full JCP 1.4 and for the upgrade exam). Reading over the mock exam definitely let me know what areas I’d need to study if I was taking the JCP again. In short, the didactic nature of this book has not been lost.

The arrangement of this book is also useful. A fine index and the logical progression through the features of the Java language ease the onslaught of detailed information mentioned above. The extensive use of UML diagrams (especially class and sequence diagrams) was helpful as well. If one reads the book sequentially, one learns about how object references are declared (Chapter 4), then the various control structures available in Java (Chapter 5), then the basics of Object Orientation (Chapter 6), then the object life cycle (Chapter 8), in a very linear fashion. Additionally, there is extensive cross-referencing. This may not be useful to the novice programmer, but to anyone using this book as a reference, it’s invaluable, because it allows Mughal and Rasmussen to provide yet more logical linking of disparate topics.

However, this book is not for everyone. I wouldn’t buy it if I wanted to learn to program. While there are a few chapters that have general value (Chapter 1, Chapter 6), the emphasis is on mastering idiomatic Java, not general programming concepts. Also, as they state in the preface, this is not a complete reference book for Java. It covers only what is needed for the JCP. Finally, if one wants to know how to use Java in the real world, don’t buy this book. While most of the Java programming I’ve done has benefited from the understanding I gained from this book, it has not resembled the coding I did for the exercises at all. This makes sense–this book is teaching the fundamentals, and does not pretend to cover any of the higher level APIs and concepts that are used in everyday programming.
Link to this book on Amazon.

Any sufficiently advanced technology…

… is indistinguishable from magic – Arthur C Clarke.

I took my car in to be serviced a few days ago. A normal 33,000 mile checkup, which I’d postponed for about 1500 miles. Not a good thing. So, I was already nervous when a fellow came out and started talking to me about “trans-axle fluid change” and “radiator back flush”. Now, I don’t know much about cars. Sure, I have some of the basic principles down–I understand in theory how internal combustion works, for example. But I really don’t know anything about the nuts and bolts of making a car work–I’ve never understood how the two front wheels in a turning car stay synchronized, even though the outer wheel goes a greater distance (or how they handle being out of synch). This is the case even though I’ve had it explained to me multiple times. Cars are complicated pieces of engineering that have taken decades of engineering to get where they are, and auto mechanics is a specialized discipline that takes years to learn.

But here’s the point. I don’t want to know. I don’t want to understand even the slightest bit of how a car converts old dinosaur bones into energy–I just want to harness that energy to go to the grocery store.

This has cost me a fair bit of money, as you can imagine, and wrecked at least one car from the inside out. I’ve learned, to my cost, that I have to get the car checked out periodically, even if it makes me feel like a blithering idiot. “Sure, take care of that trans-axle fluidish stuff. You betcha.” And even if I take a car to the best shop in the world, I should still be verifying that everything is done according to the manual. Which requires me to read the manual. Which means that I have to learn something about a car. Dang it!

Now, consider computers:

Computers are complicated pieces of engineering that have taken decades of engineering to get where they are, and computer programming is a specialized discipline that takes years to learn. Learning how to interface with a computer takes time. They have their own jargon, just like automobiles. Most people (in the first world) need to use them every day.

Now I have both a bit less and a bit more sympathy for the computer illiterate. More, because, hey, they don’t want to learn about computers–they just want to use them. I can dig that! Less, because if I can’t get away with just driving my car, if I have to learn something about it, then they need to buck up and do the same. If they don’t, they’ll be in the same position I was at the service station–helpless before professionals.

Privacy

Update 2/25/2007: Added link to Amazon.

Database Nation, by Simson Garfinkel, is a fantastic book. I admit that I’m a fan of what I like to call ‘Chicken Little’ books (I like William Greider and I even remember thinking that Revelations was the best book in the Bible as a child). My friends tell me that one of my typical greetings is ‘Have you read XXX? You should!’ I like books that challenge me and confront me with realities that I haven’t considered before.

Database Nation definitely challenges. The author approaches the burgeoning issue of personal privacy, and the coming lack thereof, in several different ways. Whether it is biometric identification, the possibility of protecting privacy via property rights, or a chapter of possible solutions, he treats the topic in a manner befitting its fundamental nature. I found his historical emphasis, where he compares the current situation to the one created in the early 1950s by the newly forming credit reporting agencies, to be especially useful. There’s nothing new under the sun, as they say. And the problems we’ve faced with privacy before have been dealt with. The sky has fallen before, but it’s possible to pin it back up.

Privacy has been on my mind for a while now. I work in technology, and one of the things that is allowing this current invasion of privacy is the ability to collect, store and mine vast amounts of information. As an example of just how far it has gone, I can access 12 million business records (and 120 million US households) via my library’s website–they’ve bought access to a database called ReferenceUSA. Search on business size, focus, years advertising in the Yellow Pages, location, etc. Slice and dice as you wish. As part of the usage agreement, you can’t use the database for unsolicited commercial mail, but, having found the names in ReferenceUSA, you could look up the business in the Yellow Pages easily enough.

While such data aggregation has been possible for years and years (ask the insurance companies), computing power and disk space have become so cheap that it’s much less work than it used to be–and collecting such information is only getting easier. See Cringely’s column for a suggested solution. I’m not sure how I feel about it, but it’s one idea for keeping the sky from falling.

I watched Enemy of the State again recently. While I enjoyed watching Will Smith and Gene Hackman avoid the satellite images and bugs of the NSA, I have no idea how much the movie made up and how much it nailed on the head (the Economist had this to say about satellite imagery in 2000). Still, this movie displays in a fundamental way what loss of privacy can mean. When folks say ‘hey, I don’t have anything to hide’ I don’t think they realize just what it means to have no privacy. There are shades and shades of ‘hiding’; there are things that I would tell my parents that I wouldn’t tell an acquaintance. Likewise, there are items I’d tell a new friend that I would rather not be published in the daily paper. Discretion is something that all humans need–you do have things to hide since no one is perfect at all times! Having something to hide doesn’t necessarily mean that you are doing something illegal–perhaps it’s just embarrassing (or would be if exposed to certain people).

Another aspect is the federal ‘do not call’ list and all the hullabaloo surrounding it. Telemarketers feel they aren’t going to be able to survive–everyone else feels they don’t want to be called unless they opt in. Even Dave Barry has chimed in. This is an issue that resonates with everyone and calls into dramatic perspective the tension between making your contact information publicly available and wanting to control what someone else does with that information. Imagine what it would be like if everything were public.

Expectation of reasonable privacy is something fundamental. I’d hate to lose it.

Link to “Database Nation” on Amazon.

Technology is not always the answer

I volunteer at the library. I put in 2-4 hours a week at the Special Services division. One of the primary missions of the Division (which consists of one part time employee and a bevy of volunteers) is to find books that homebound patrons would like and deliver the tomes to them. Of course, one wants to make sure that the same senior doesn’t get the same title twice.

The library has a large Java-based app (probably backed up by a mainframe) that keeps track of all the books. What’s checked out, what’s in transit, and most importantly, who owes fines. But it doesn’t keep records on what patrons have checked out (don’t tell the Feds).

What Special Services does is keep a stack of catalog cards (the old cards that I used to use to look up books on the Russian Revolution or beet production reports for school), and on the blank back of these cards, records the author and date and book title that have been picked for this particular person. These cards are all banded together and kept in a cabinet, filed under the patron’s last name.
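Viewed purely as data, each patron’s bundle of cards is a list of author, title and date records filed under a last name. Here is a minimal Python sketch of that structure, only to make its shape concrete. The names card_file, record_pick and already_delivered are hypothetical, and the sample book is made up for illustration.

```python
from datetime import date

# The "cabinet": last name -> list of (author, title, date picked) records.
# Every name here is hypothetical; the real system is a stack of paper cards.
card_file = {}


def record_pick(last_name, author, title, picked_on):
    """Write one more line on the back of the patron's card."""
    card_file.setdefault(last_name, []).append((author, title, picked_on))


def already_delivered(last_name, title):
    """Check whether this title was already picked for this patron."""
    picks = card_file.get(last_name, [])
    return any(t.lower() == title.lower() for _, t, _ in picks)


# Made-up sample data for illustration.
record_pick("Smith", "Agatha Christie", "The Murder of Roger Ackroyd", date(2003, 9, 12))
print(already_delivered("Smith", "the murder of roger ackroyd"))
print(already_delivered("Smith", "Some Other Mystery"))
```

The only question the cards have to answer is the one already_delivered asks, which is part of why pen and paper handle it so well.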

This is a database, right? Just not a computerized one. The first day I volunteered, they showed me the system. Being the computer geek, I immediately thought of ways to computerize this database (with PDAs as the client and a Java app talking to a database and delivering information to those PDAs). But, there are reasons to stick with the current system.

1. It’s cheap. The cards are being reused and the time of the volunteers is free as well. Not to be discounted in a time where branch libraries have to close one day a week to save funds. A new system would probably cost thousands of dollars in hardware alone (even if it was built by volunteers with free software), because it would have to be mobile.

2. Mobility is built into the system. When I have to go pick the books for Mrs. Smith for this week, I can take an entire pack of cards out with me, and make sure that the mysteries I pick aren’t ones she’s read before. This is the primary purpose of the database, and it works very well.

3. The very low tech nature of this solution is a selling point. Many, many folks are intimidated by new technologies. But darn near everyone is comfortable with pen and paper. There’s a very low barrier to entry. I didn’t have any trouble picking up the system in an hour, and neither have any of the other volunteers.

Not every process is amenable to being computerized. This experience has driven home the old saying–when all you’ve got is a hammer, everything looks like a nail. Even if it’s not.

A new purpose for RSS

I used to host at Dear Diary. Great service, provided for free. But now I host my own blog (thanks to the good folks at Movable Type and to Dion; thanks for the tip, Dion), and I have total control over posting. I did use a simple CGI blog tool for a while, but Movable Type is quite feature rich. It generates RSS feeds from postings automatically.

I’m not sure what RSS really stands for, but it’s simple in concept. It’s an XML standard for making site content changes known to the world. It’s basically like those ‘what’s new’ announcements that appear on websites, but automatically generated and usually automatically picked up and formatted for human consumption (or aggregated). It takes over some of the functionality of those ‘sign me up to be notified of changes to this site’ email lists because, if you point your aggregator at a website’s RSS feed, you’ll be automatically notified when there are changes–no need to clutter up your inbox. It also subsumes some of the functionality of bookmarks, because, again, you pull data you need, rather than having to visit the sites to see if content has changed.
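To make the idea concrete, here is a minimal sketch, in Python, of what an aggregator does with a feed: parse the XML, list the items, and report only the ones it hasn’t seen before. It assumes an RSS 2.0 style document. The sample feed, its URLs, and the function names list_items and new_items are all made up for illustration, and a real aggregator would download the feed rather than read it from a string.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample feed; a real aggregator would fetch this from a website.
SAMPLE_FEED = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example Pundit</title>
    <link>http://example.com/</link>
    <description>Hypothetical feed used for illustration</description>
    <item>
      <title>First article</title>
      <link>http://example.com/first</link>
    </item>
    <item>
      <title>Second article</title>
      <link>http://example.com/second</link>
    </item>
  </channel>
</rss>"""


def list_items(feed_xml):
    """Return (title, link) pairs for every <item> in the feed."""
    root = ET.fromstring(feed_xml)
    items = []
    for item in root.iter("item"):
        title = item.findtext("title", default="(untitled)")
        link = item.findtext("link", default="")
        items.append((title, link))
    return items


def new_items(feed_xml, seen_links):
    """Return items whose link hasn't been seen yet, and remember them."""
    fresh = [(title, link) for title, link in list_items(feed_xml)
             if link not in seen_links]
    seen_links.update(link for _, link in fresh)
    return fresh


seen = set()
for title, link in new_items(SAMPLE_FEED, seen):
    print(title, "->", link)
```

Running new_items a second time against the same feed returns nothing, which is the whole point: you only hear about a site when something has changed.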

I used to go out and check 4-5 pundits’ websites (Joel On Software, DaveNet, SkippingDotNet, and a few others) oh, once a week. I’d visit the sites to see if they’d put up any new articles, which I’d then read. Now, however, I’ve rolled my own RSS aggregator, which outputs a nice listing of changes to some of those websites. It is nice to be informed of new postings, but the downside is that I hardly visit the sites that don’t provide RSS feeds.

I was chatting with some friends after seeing an author speak at a book signing at the Boulder Book Store (Neal Stephenson, promoting Quicksilver. It has pirates!). I was complaining because I am sure there are plenty of free and low cost events out there that I miss because I’m not aware of them. I thought it would be great to have a web site that aggregated all those events for a particular locality into one page that I could visit. ‘Hey, it’s Friday and the CU astronomy department is letting folks look through their telescopes!’ This would be a huge undertaking, however, if one had to screen scrape the ‘New Events’ pages of each interesting organization. If, however, they all made their schedule available as RSS, it would be trivial.
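As a rough sketch of how little work that would be, the Python below merges two event feeds into one listing sorted by date. It rests on assumptions: the organizations, events and URLs are invented, the function names read_events and merge_events are hypothetical, and each feed is assumed to carry the event time in the item’s pubDate field (a simplification, since pubDate normally means when the item was published).

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Hypothetical feeds; the organizations, events and URLs are made up.
BOOKSTORE_FEED = """<rss version="2.0"><channel>
  <title>Bookstore Events</title>
  <item>
    <title>Author reading and signing</title>
    <link>http://example.com/bookstore/reading</link>
    <pubDate>Fri, 10 Oct 2003 19:30:00 -0600</pubDate>
  </item>
</channel></rss>"""

OBSERVATORY_FEED = """<rss version="2.0"><channel>
  <title>Observatory Events</title>
  <item>
    <title>Public telescope night</title>
    <link>http://example.com/observatory/open-house</link>
    <pubDate>Fri, 03 Oct 2003 20:00:00 -0600</pubDate>
  </item>
</channel></rss>"""


def read_events(feed_xml):
    """Return (when, source, title, link) tuples for one feed."""
    root = ET.fromstring(feed_xml)
    source = root.findtext("channel/title", default="(unknown)")
    events = []
    for item in root.iter("item"):
        # Assumption: pubDate holds the event time in RFC 822 format.
        when = parsedate_to_datetime(item.findtext("pubDate"))
        events.append((when, source, item.findtext("title"), item.findtext("link")))
    return events


def merge_events(feeds):
    """Combine the events from several feeds, earliest first."""
    merged = []
    for feed_xml in feeds:
        merged.extend(read_events(feed_xml))
    return sorted(merged, key=lambda event: event[0])


for when, source, title, link in merge_events([BOOKSTORE_FEED, OBSERVATORY_FEED]):
    print(when.strftime("%a %b %d %H:%M"), "|", source, "|", title)
```

Adding another organization is just one more entry in the list passed to merge_events. The hard part is getting the organizations to publish feeds, not the code.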

The question is, what do the organizations gain? Increased visibility. If it’s a book signing, the purpose is to draw folks in so they buy books. If it’s a library event, then the more folks one draws, the more the library is being used. If it’s the Boulder Theater, then the more people come to an event, the more beer they can sell.

Think of it as an automated version of the “What’s Happening” section of your daily paper. Wouldn’t that be sweet!