
Google Maps Gotchas

I’ve done some recent work with Google Maps for a client and thought I’d share some lessons learned. (It seems I’ve been posting a lot about Google lately–I don’t know why.)

First off, like many other folks on the web, I think Google Maps is great. I’ve been a longtime MapQuest user, and the fact is that Google’s image panes just produce a better, slicker user experience than MapQuest’s dynamic image generation. Not to mention that Google’s map API is free (as in beer, not as in speech). Now, I haven’t had a chance to compare Yahoo’s map offering (as Michael Yuan did), though I have played around with Yahoo! MapMaker for Excel, but more mapping options can only be a good thing. On to the issues I had with Google Maps.

* Geocoding is not provided with Google Maps, which means that you need to set up your own geocoding engine. However, the TIGER/Line dataset has some holes in it. Especially for rural regions, I was seeing many addresses that could not be geocoded, and even for an urban area (like Boulder, CO) around ten percent of addresses were not geocodable. As for the accuracy of the geocoding itself, I don’t even know how to test it on a large scale, but my client said he saw at least some instances where an address was geocoded incorrectly (a small number, but still significant). Well, what can you say? If you want precision, much less accuracy, pay for it. I investigated using Yahoo’s geocoding service, which is free and based on higher-quality commercial data. Since my client runs a commercial site (even though the maps are available for free), Yahoo said they would require Yahoo maps on the site if it were to use their geocoding service. Fair enough. (As an aside, this was an interesting podcast of a speech by a Navteq executive outlining some of the issues around procuring good geodata.) Once you do have coordinates from your own geocoder, plotting them is the easy part; see the sketch below.
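A minimal sketch of that last step, against version 1 of the Maps API (the GMap/GPoint/GMarker names come from that API; the div id and the Boulder coordinates are placeholder values):

```javascript
// assumes the v1 Maps API <script> tag is already on the page
// and the page contains <div id="map"></div>
var map = new GMap(document.getElementById("map"));

// v1's GPoint takes (longitude, latitude), in that order
var point = new GPoint(-105.2705, 40.0150); // roughly Boulder, CO

map.centerAndZoom(point, 4); // 4 is a city-level zoom on the v1 scale
map.addOverlay(new GMarker(point)); // drop a pinpoint at the location
```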

* PNGs are the default image type for the map pinpoints on Google Maps. These marker images let you flag certain locations. However, if you display them in a list alongside the map, you’ll quickly find that transparent PNGs don’t work in Internet Explorer. They show up, but surrounded by a box (Update Feb 8: transparent PNGs are outlined by a box in Internet Explorer; I’ve seen black boxes and blue boxes, and regardless of the color, it looks bad). Luckily, the transparent PNG/Internet Explorer problem has already been solved; a sketch of the usual workaround follows.
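The standard fix routes the image through IE’s proprietary AlphaImageLoader filter. A minimal sketch, assuming a 1x1 transparent spacer GIF (the spacer.gif name is a placeholder) and that you call this after the list has rendered:

```javascript
// swap each transparent PNG for IE's AlphaImageLoader filter;
// only IE 5.5/6 need this -- other browsers render alpha PNGs natively
function fixTransparentPngs() {
    if (!/MSIE (5\.5|6)/.test(navigator.userAgent)) return;
    var imgs = document.images;
    for (var i = 0; i < imgs.length; i++) {
        var img = imgs[i];
        if (/\.png$/i.test(img.src)) {
            // point the filter at the PNG, then blank out the img itself
            img.style.filter = "progid:DXImageTransform.Microsoft." +
                "AlphaImageLoader(src='" + img.src + "', sizingMethod='scale')";
            img.src = "spacer.gif"; // placeholder: a 1x1 transparent GIF
        }
    }
}
```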

* Each pinpoint/marker is added using an overlay. If you have a significant number of overlays, map rendering becomes quite slow. For Firefox users it’s not as big an issue, because Firefox renders the rest of the page before the map. But for IE users, the table containing both the list and the map is not rendered until the map is displayable. On a reasonably fast box, I was seeing 80 seconds until the page rendered. Clearly, that was unacceptable. Even on Firefox, some of the rendering times were just too slow. I searched the Google Maps Discussion Group, and most everyone was saying that if you have a large number of markers, you should cluster them into a few representative markers until the user has zoomed in sufficiently. (If I recall correctly, the maximum number of markers most folks recommended was around 20.) Such clustering can happen on the server side or on the client side; a client-side sketch follows.
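Client-side clustering can be as simple as binning points into a coarse grid and drawing one marker per occupied cell. A sketch under that assumption (the point objects and function name are mine, not the API’s):

```javascript
// bin points ({lat, lng} objects) into cells of cellSize degrees and
// return one averaged representative point per occupied cell
function clusterPoints(points, cellSize) {
    var cells = {};
    for (var i = 0; i < points.length; i++) {
        var p = points[i];
        var key = Math.floor(p.lat / cellSize) + ":" +
                  Math.floor(p.lng / cellSize);
        if (!cells[key]) cells[key] = { lat: 0, lng: 0, count: 0 };
        cells[key].lat += p.lat;
        cells[key].lng += p.lng;
        cells[key].count++;
    }
    var clusters = [];
    for (var key in cells) {
        var c = cells[key];
        clusters.push({ lat: c.lat / c.count,
                        lng: c.lng / c.count,
                        count: c.count });
    }
    return clusters;
}
```

Add one marker per cluster instead of one per point, and re-cluster with a smaller cellSize each time the user zooms in, until every cluster holds a single point.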

* Data retrieval architecture affects performance. For the first revision of the mapping application, I sent the data for each pinpoint at the same time as the map and the listing. This was the wrong way to do it; it makes perceived rendering time much longer. The correct way is documented in the ‘Hacking Maps with the Google Maps API’ XML.com article (linked below): use an XMLHttpRequest to pull in the pinpoint data asynchronously, as sketched below. I am in the midst of developing revision two of the interface and have noticed an appreciable speedup in rendering, so I’d recommend heading down that path from the start.
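A sketch of the asynchronous approach, assuming the server exposes the pinpoints as a simple XML document at a hypothetical /pinpoints.xml URL and that the GMap object from earlier lives in a global map variable:

```javascript
// create a request object; the ActiveX branch covers older IE
function createXmlHttp() {
    if (window.XMLHttpRequest) return new XMLHttpRequest();
    return new ActiveXObject("Microsoft.XMLHTTP");
}

var req = createXmlHttp();
req.open("GET", "/pinpoints.xml", true); // hypothetical endpoint
req.onreadystatechange = function () {
    if (req.readyState != 4 || req.status != 200) return;
    // assumes the response contains <marker lat=".." lng=".."/> elements
    var markers = req.responseXML.getElementsByTagName("marker");
    for (var i = 0; i < markers.length; i++) {
        var lat = parseFloat(markers[i].getAttribute("lat"));
        var lng = parseFloat(markers[i].getAttribute("lng"));
        map.addOverlay(new GMarker(new GPoint(lng, lat)));
    }
};
req.send(null);
```

Because the request is asynchronous, the map and listing render immediately and the pinpoints fill in as the data arrives.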

Finally, some resources. The first article you should read is Hacking Maps with the Google Maps API. This tutorial steps you through everything you need to know. The API documentation is adequate. I found some interesting discussions happening on the API discussion group. And, finally, the GoogleMapki was useful in looking at performance concerns. I haven’t read Scott Davis’ Google Maps API article, but keep meaning to spend the $8.50 and do so.

All in all, I enjoyed learning this new technology. The API is simple and easy to use, even for someone who’s no JavaScript expert. I look forward to adding maps to some of my other pages; my cross country skiing resources page in particular would benefit. Google has kickstarted a whole new area of web development, and I think there’s plenty more to do.

We’re from the government and we’re here to help

The US government has just released a DVD about identity theft. This DVD, “Identity Theft: Outsmarting the Crooks”,

features experts from the government and the private sector talking about the scope of the identity theft problem and how a few simple steps can significantly increase protection. Experts also cover topics such as: online safety; access to credit reports; taxpayer vulnerabilities to identity theft; and dealing with debt collectors if you are a victim of identity theft.

For only $2, it might be worth checking out.

What is interesting to me is how the government is using new technologies to increase citizen access to information. The government has RSS feeds (here are some from the Treasury Department) and a host of podcasts.

New bloggers

A couple of folks I’ve worked with in the past have begun blogging (or, have let me know they were blogging). They aren’t developers, but do deal with the software world. (I’m shocked to note that both of their blogs are much snazzier than mine.)

Susan Mowery Snipes does web design. Her blog is company news and also exhibits the perspective of a UI-focused designer–sometimes it’s a bit through the looking glass (why would someone judge a website in 1/20 of a second?), but that’s good for us software folks to take a look at.

And if a designer has a different view, a project manager is on a different planet. I admire the best PMs because they are able to manage even though they might not have any clue about technology details. Come to think of it, that may actually help. Regardless, Sarah Gilbert has been blogging for a while, but just shared the URL with me. She’s an excellent writer and I am looking forward to reading more entries like Trust Me, You Don’t Do Everything. I do wish she’d allow comments, though.

Software Licensing Haiku

I thought this list of software licensing haikus was pretty funny.

It’s kinda old, so I thought I’d update it with some other licenses:

Apache: not the GPL! / we let you reuse to sell / you break it you buy
Creative Commons: choose one from many / confused? we will help whether / simple or sample
Berkeley: do not remove the / notice, nor may you entangl’ / berkeley in your mess
Artistic: tell all if you change / package any way you like / keep our copyright

Perl to the rescue

I am using Apache JMeter to load test a web application. JMeter has an XML file format for storing load test configuration information. I wanted to system test as well, and needed to generate a large number of unique URL hits. Rather than using the clunky UI to add them (and getting carpal tunnel in the process), I analyzed the XML file format and split it into pieces. Then I put tokens (XXXTIMETOWAITXXX) in the appropriate places and used an Excel-generated CSV file to drive a perl script that assembled the pieces of text into a valid JMeter config file; a sketch of the idea follows.
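A minimal sketch of that assembly step, assuming hypothetical file names: header.xml and footer.xml hold the fixed parts of the JMeter config, entry.xml holds one URL entry containing the tokens, and urls.csv holds one url,wait pair per line (the XXXURLXXX token is mine, for illustration):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# read a whole file into one string
sub slurp {
    my ($file) = @_;
    open my $fh, '<', $file or die "can't open $file: $!";
    local $/; # slurp mode
    return scalar <$fh>;
}

my $entry = slurp('entry.xml');

print slurp('header.xml');
open my $csv, '<', 'urls.csv' or die "can't open urls.csv: $!";
while (my $line = <$csv>) {
    chomp $line;
    my ($url, $wait) = split /,/, $line;
    my $piece = $entry;
    $piece =~ s/XXXURLXXX/$url/g;          # hypothetical URL token
    $piece =~ s/XXXTIMETOWAITXXX/$wait/g;  # the wait-time token
    print $piece;
}
close $csv;
print slurp('footer.xml');
```

Redirect the output to a file and point JMeter at it.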

Well, what happened next? I needed to generate a larger number of URLs than Excel could comfortably handle. Again, perl came to the rescue, making it easy to generate umpteen lines of correctly formatted CSV, along the lines of the sketch below.
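A sketch of that generation step, with the URL pattern and wait-time range as placeholder assumptions:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# emit one "url,wait" line per id; the /app/item?id=N URL shape and
# the 1-5 second wait range are placeholders for illustration
for my $id (1 .. 50_000) {
    my $wait = int(rand(5)) + 1;
    print "/app/item?id=$id,$wait\n";
}
```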

Performance testing, complexity of

Performance testing is a bit like visiting your girlfriend’s father. You’re never quite sure what you’re accomplishing, it can be alternately frustrating and satisfying, and you have to do it. Right now I’m in the midst of performance testing a web-based application for my new company. I’ve been in such testing tangles before, though always as a consultant on a fixed-bid project. I’d have to say that performance testing as an employee is less stressful than that.

The reasons why performance testing, especially of web applications, is such a rat’s nest are many:

complexity of platforms
Most modern web applications are built on a lot of code. In our case, it’s a servlet and logging framework on top of tomcat on top of the JVM on top of the operating system. Four levels on the web server, not counting the back end or the load balancer or any interaction with the browser! And this is a relatively simple system. I’ve seen portal applications that had 6 or more levels in the web server. Each level of the software stack interacts in (sometimes unforeseen) ways with the others, which means that changing parameters can have unpredictable effects. You simply must test every change you make.

realistic hardware
Unless you’re working with Scrooge McDuck or an application that has yet to be deployed, you’re probably not going to be able to test on production hardware. Very few companies I’ve dealt with are willing to buy a duplicate of their production hardware for testing purposes, so you’ll probably be testing on a scaled down version of the production system. That means that you’ll have to make assumptions about what the smaller system will tell you about the bigger system. One usually safe conclusion is that the smaller system sets a performance minimum for the larger system.

amount of time required
Each performance test run takes a significant amount of time: minutes at best, unlike unit tests, which you want to run in seconds. Such slow turnarounds mean that performance testing just can’t be done quickly.

difficulty of understanding real user behavior
The more complicated your application is, the harder it is to understand how people are going to use it. Will they move quickly through the application? Will they leave sessions open for a long time? How many states will they go through? Anyone can come up with a reasonable guess as to the answers for these questions, but the only way to know for sure is to a) user experience test it, or b) unleash the application.

ambiguous or arbitrary goals
Unless you really understand how your userbase is going to use the application, it’s hard to come up with reasonable goals. ‘Make it run faster’ doesn’t cut it. Nor does picking an arbitrary number: ‘we want to service 10,000 hits a second’ may seem like a good goal, but if that number was just plucked from the air, a lot of misery can result. Especially if you’re on a fixed-bid project, where every hour you spend is eating into your margin. (It’s OK for performance testing to make a tech person miserable, as long as there’s business benefit; an arbitrary number is likely to under- or overshoot the optimum for the business.)

difficulty of reproducing real user behavior
I’ve not had a lot of experience with for-pay tools, but I have used a variety of free (as in beer) tools. I’ve written before about my experiences with The Grinder, and am currently using JMeter. I’ve also used apachebench. All of these tools are great at hitting URLs repeatedly and rapidly, but it’s hard to truly reproduce user (and browser) behavior with them, because they’re simple programs. For example, some versions of IE will call a servlet multiple times for a single page view. You can’t hope to replicate every browser quirk when testing, but sometimes those quirks have performance impacts.
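apachebench shows the tradeoff plainly: it will happily hammer one URL at a fixed concurrency, and nothing more. For instance (the host and path are placeholders):

```
# 10,000 requests, 50 concurrent, against a single URL
ab -n 10000 -c 50 http://localhost:8080/app/page
```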

These dimensions of complexity feed on each other. Because it takes so long to performance test an application, you are tempted to change more than one level of the application at once. Because you think you understand user behavior, you come up with an erroneous performance target.

Is it hopeless? Nope, and it can be a very good exercise; it can turn up areas of real weakness in your application. Just remember to document your assumptions, make the tradeoffs abundantly clear to non-technical folks, and realize that you’re going to miss something important. Your results won’t be worth as much as you think they will be. Oh yeah, and don’t sign any fixed-bid performance testing contracts unless you know what you’re doing.