Dan Moore! – Page 84 – My place on the web

Exceptions in API design

Here’s an old but fantastic post about API (mis)design.

To handle an exception, you have four choices, one of which is:

Log it! We bought big hard disks for those servers, let’s use them! Log the exception toString() and print its stack trace, but only if you expect the exception to be thrown over 1,000,000 times each day. Alternatively, if you think that the exception would only occur rarely and that it could indicate a problem worth looking at, just print the exception class name … since stack traces just confuse people!

Hilarious. And, it looks like it’s part 4 of a 7 part series. Via Dejan Bosanac.

Java | moore | March 10, 2006

Wiki practices for requirements documentation

I have been heads down for the last couple of weeks helping write requirements and design documents. The team I’m on is building them using a wiki. (I discussed the wiki selection process a few weeks ago.)

I just wanted to outline a few practices (I hesitate to call them best practices) for using a wiki to document business and technical requirements.

Having a wiki allows anyone to edit the requirements. This doesn’t mean that everyone will or should. Documents should still have an owner.
Require folks to identify themselves. Require Author met our needs, as it requires an editing user enter some identifier. A history function, without tracking who made which edit, is fairly useless. Note that our solution works for a small team. A larger team may want to authenticate every user.
Make sure you lock down the wiki. We have ours behind the firewall, which means that we don’t have to require a user to remember yet another password, or even login at all (beyond providing some kind of identifier once, as mentioned above).
PDF generation allows you to generate decent looking print documents. I found PmWiki2PDF to be adequate.
Think carefully about document structure. We broke out the requirements into sections, and had each section on its own wiki page; more than that, we have pages for each section for each type of requirements (business, technical) or design document. These three section pages are pulled into a page for that one component, via the page include directive, which should describe everything known about a particular component. This kind of page seems useful at present, but we haven’t begun coding.
However, if I had to do it over again, I’d build each main document as one wiki page, and then pull the component info out of that. This allows a user to view the overall history of the document, as opposed to the above setup, where, to see what has changed in the requirements, you have to visit as many pages as there are sections. (You can also look at the RecentChanges page for a group, but that has only a page level granularity, as opposed to the line leve granularity of the page history.)
Choose page names carefully. While it’s easy to move content from one page to another, realize that you lose all the history when you do that. Well, actually, you might be able to move the file on the filesystem and retain the history, but for normal users, moving a pages (that is, changing a page name) causes history loss.
Keep requirements, whether in sections or in one document, in a different group than the design document. This allows you to lock down the requirements group, via a password while letting other documents, like design, continue to evolve.
Cross reference extensively. Don’t cut and paste, link or include.
Use pictures. The support for uploading pictures in PmWiki is alright, though the support for removing them isn’t great. Regardless, don’t shy away from diagrams and other graphics in the wiki.

I’m going to be interested to see how the process continues to evolve as we get further into development. But so far, I think that a wiki has everything you really need to generate requirements documentation for a small team of developers.

Programming | moore | March 7, 2006

MySQL performance and doing calculations on varchar columns

MySQL, along with other features designed to make it easy to use, tries to do the right thing regarding strings. When you perform a math calculation on a column or columns that are of type varchar, MySQL automatically conversts that string to a number (empty strings are treated as zero.):

To cast a string to a numeric value in numeric context, you normally do not have to do anything other than to use the string value as though it were a number[.]

However, this translation, convenient as it may be, is not free. A client of mine had a query that was running calculations against two such columns. After indexing and trying to simplify the query, we were still seeing query execution times of 2+ seconds (all times are quoted for MySQL 4.1, on my relativly slow personal laptop).

The solution appears to be to change the type of the columns using the alter table syntax to type double. After doing so and running analyze table mytable, I was seeing query execution times of 0.2 seconds for the same query on the same box. Fantastic.

I am not sure if this result was due to not having to do several string conversions for each row returned by the query, or the fact that:

In some cases, a query can be optimized to retrieve values without consulting the data rows. If a query uses only columns from a table that are numeric and that form a leftmost prefix for some key, the selected values may be retrieved from the index tree for greater speed[.]

Regardless of the cause, if you’re doing some complicated calculations on columns, consider making them numbers.

Databases Programming | moore | February 23, 2006

Choosing a wiki

I am setting up a wiki at work. These are our requirements:

File based storage–so I don’t have to deal with installing a database on the wiki server
Authentication of some kind–so that we can know who made changes to what document
Versioning–so we can roll back changes if need be.
PHP based–our website already runs php and I don’t want to deal with alternate technologies if I don’t have to.
Handles binary uploads–in case someone had a legacy doc they wanted to share.
Publish to PDF–so we can use this wiki for more formal documents. We won’t publish the entire site, but being able to do this on a per document basis is required.

I see this wiki both as a repository for company muscle memory (technical decisions, R&D results) and a place for more formal documents (requirements) that might be published to PDF. Basically, I’m trying to get rid of .doc files as a technical document format. I read about this on some XP mailing lists of which I used to be a part and am interested to see how effective it might be.

I started with PHPWiki but its support for PDF was lacking. (At least, I couldn’t figure it out, even though it was documented.)

After following the wizard process at WikiMatrix (similar to CMSMatrix, which I’ve touched on before), I found PmWiki, which has support for all of our requirements. It also seems to have a nice extensible architecture.

Installation was a snap and after monkeying around with authentication and PDF installation (documented here), I’ve added the following lines to my local/config.php:

include_once('cookbook/pmwiki2pdf/pmwiki2pdf.php');
$EnablePostAuthorRequired = 1;
$EnableUpload = 1;
$UploadDir = "/path/to/wiki/uploads";
$UploadUrlFmt = "http://www.myco.com/wiki/uploads";
$UploadMaxSize = 100000000; // 100M

Technology | moore | February 19, 2006

Prosper.com: an EBay for lending

Wow. I hope Prosper.com takes off. I think it will depend on the ability of the groups to enforce repayment and keep delinquencies down.

For more on microcredit, which is essentially what Prosper.com is, check out Wikipedia and this article by The Economist on Microcredit and Disasters.

I wonder what banks think of this? Via mnot.

Technology and Society | moore | February 16, 2006

PR: another industry being transformed by the internet

No, this entry isn’t about blogs. If you want to hear how blogs are transforming PR, I suggest you visit Micro Persuasion.

Via Dave Taylor, I just found out about PRLeads, a service that lets you field requests for information from journalists. When you see a request about a topic on which you consider yourself an ‘expert’, you can correspond with the journalist. You provide information and context to the journalist and if things work out, you get publicity and gain authority by being quoted in a story.

I asked a friend who has worked in PR for a long time and this service is rather revolutionary. Ten years ago, the journalist would have looked to friends or in-shop files for an expert, but now they have access to everyone who knows about the service and is willing to pay the fee ($99/month–hardly backbreaking).

This is good for everyone. Journalists get access to experts who they might not find otherwise as well as the chance to write a more correct story (due to the fact that they’ll be exposed to more viewpoints). Experts get a chance to shape public opinion as well as publicity. And the public is exposed to a wider range of views than they’d otherwise see. Win-win-win.

PRLeads is a perfect example of an internet company, by the way. It has network effects–journalists will go where the most experts are, and experts will go where the most journalists are. It’s a service that just couldn’t be efficiently run without the internet. And the main commodity is information.

Update: Here’s a blog entry about results from PRLeads.com.

Technology and Society | moore | February 16, 2006

mod_alias vs mod_proxy for XMLHttpRequest proxying

If you’re going to use an apache proxy to fix some of the issues with XMLHttpRequest, be aware that mod_alias sends a redirect to the browser. That is, it sends one of the 3XX HTTP status codes to the XMLHttpRequest object. The XMLHttpRequest object then issues the GET itself (well, some do, check out these XMLHttpRequest tests for more). That’s fine if you’re doing a GET, but if you’re doing a POST, then some redirects will require user interaction. I found that mod_alias, which was sending a 301 (redirect permanent) just turned the POST request into a GET. (mod_alias doesn’t always come to the rescue, apparently.)

The solution? Well, since you are really proxying the XMLHttpRequest’s request, use mod_proxy.

Programming | moore | February 15, 2006

Google Maps Gotchas

I’ve done some recent work with Google Maps for a client and thought I’d share some lessons learned. (It seems I’ve been posting a lot about Google lately–I don’t know why.)

First off, like many other folks on the web, I think Google Maps are great. I’ve been a long time MapQuest user and the fact is that Google’s image panes just produces a better, slicker user experience than MapQuest’s dynamic image generation. Not to mention the fact that Google’s map API is free (as in beer, not in speech). Now, I haven’t had a chance to compare Yahoo’s map offering (as Michael Yuan did), though I have played around with Yahoo! MapMaker for Excel, but more mapping options can only be better. On to the issues I had with Google Maps.

* Geocoding is not provided with Google Maps, which means that you need to set up your own geocoding engine. However, the Tiger/Line dataset has some holes in it. Especially for rural regions, I was seeing many addresses that could not be geocoded. Even for an urban area (like Boulder, CO) around ten percent of the addresses were not geocodable. As for accuracy of the geocoding itself, I don’t even know how to test it on a large scale, but my client said he saw at least some instances where an address was geocoded incorrectly (a small number, but still significant). Well, what can you say? If you want precision, much less accuracy, pay for it. I investigated using Yahoo’s geocoding service, which is free and based on higher quality commercial data. Since my client is a commercial site (even though the maps are available for free) Yahoo said that they would require Yahoo maps on the site if it were to use their geocoding service. Fair enough. (As an aside, this was an interesting podcast of a speech by an executive of Navteq outlining some of the issues surrounding procuring good geodata.)

* PNGs are the default image type for the map pinpoints on Google Maps. These images () let you mark certain locations. However, if you display them in a list alongside the map, you’ll quickly find that transparent PNGs don’t work with Internet Explorer. They show up, but are surrounded by a black text box (Update Feb 8: transparent PNGs are outlined by a box in Internet Explorer. I’ve seen black boxes and blue boxes; regardless of the color, it looks bad). Luckily, the transparent PNG/Internet Explorer problem has already been solved.

* Each pinpoint/marker is added using an overlay. If you have significant numbers of overlays, map rendering becomes quite slow. Now, for Firefox users, it’s not as big an issue, because Firefox renders the rest of the page before the map. But for IE users, the table containing both the list and the map is not rendered until the map is displayable. On a reasonably fast box, I was seeing times of 80 seconds until the page was rendering. Clearly, this was unacceptable. Even on Firefox, some of the rendering times were just too slow. I searched the Google Maps Discussion Group and most everyone was saying that if you have a large number of markers, you should cluster them into a few representative markers until the user has zoomed sufficiently. (If I recall correctly, the maximum number of markers most folks recommended was around 20.) Such clustering can happen on the server side, or on the client side.

* Data retrieval architecture affects performance. For the first revision of the mapping application, I sent the data for each pinpoint at the same time as the map and the listing. This was the wrong way to do it–this method makes perceived rendering time much longer. The correct way to go is documented in ‘Hacking Maps with the Google Maps API’ XML.com article (linked below), where you use an XMLHttpRequest to pull in the pinpoint data asynchronously. I am in the midst of developing revision two of the interface and have noticed an appreciable speed up in rendering, so I’d recommend heading down that path at the start.

Finally, some resources. The first article you should read is Hacking Maps with the Google Maps API. This tutorial steps you through everything you need to know. The API documentation is adequate. I found some interesting discussions happening on the API discussion group. And, finally, the GoogleMapki was useful in looking at performance concerns. I haven’t read Scott Davis’ Google Maps API article, but keep meaning to spend the $8.50 and do so.

All in all, I enjoyed learning this new technology. The API is simple and easy to use, even for someone who’s no javascript expert. I look forward to adding maps to some of my other pages; my cross country skiing resources page in particular would benefit. Google has kickstarted a whole new area of web development and I think there’s plenty more to do.

Programming | moore | February 7, 2006

Google tricks

Not only can RSS get you a job and Google spare you from remembering URLs, but combining them lets you find when your namesake is in the news. Via the Justin Pfister RSS generator and Q Digital Studio.

Technology | moore | February 3, 2006

Testing XMLHttpRequest’s adherence to HTTP response codes

mnot has a set of tests looking at the behaviour of XMLHttpRequest as it follows various HTTP responses. Some of it is pretty esoteric (how many folks are using the DELETE method?–oh wait). But all in all it’s interesting, especially given the buzz surrounding AJAX, of which XMLHttpRequest is a fundamental part.

Technology | moore | February 2, 2006

Exceptions in API design

Wiki practices for requirements documentation

MySQL performance and doing calculations on varchar columns

Choosing a wiki

Prosper.com: an EBay for lending

PR: another industry being transformed by the internet

mod_alias vs mod_proxy for XMLHttpRequest proxying

Google Maps Gotchas

Google tricks

Testing XMLHttpRequest’s adherence to HTTP response codes

Letters to a New Developer

Pages

Subscribe

Socials

Categories

Archives