
Article on XmlHttpRequest

XmlHttpRequest popped up on my radar a few months ago when Matt covered it. Back then, everyone and their brother was talking about Google Suggest. I haven’t found time to play with it yet, but I like the idea of asynchronous URL requests. There’s a lot of power there, not least the ability to make pull-down lists dynamic without shipping everything to the browser or submitting a form via JavaScript.

I found a great tutorial on XmlHttpRequest by Drew McLellan, who also has an interesting blog. Browser-based apps are getting better and better UIs, as Rands notices.

The Economist on Blogging

That bastion of free-trade economics and pithy British humor has an article about corporate blogging: Face Value. It focuses on Scoble and Microsoft, but also mentions other bloggers, including Jonathan Schwartz.

There’s definitely a fine line between blogging and revealing company secrets. Mark Jen certainly found that out. The quick, informal, personal nature of blogging, combined with its worldwide reach and Google’s cache, means that it poses a new challenge to corporations that want to be ‘on message’.

It also exposes a new risk for employees and contractors. I blog about all kinds of technologies, including some that I’m paid to use. At what point does the knowledge I gain from a client’s project become mine, so that I can post about it? Or does it ever? (Obviously, trade secrets are off limits, but if I discover a better way to use Spring or a solution for a common Struts exception, where’s the line?) Those required NDAs can be quite chilling to freedom of expression, and I have at least one friend who has essentially stopped blogging due to the precarious nature of his work.

JMS at the most recent BJUG

I went to BJUG last Thursday and enjoyed an informative talk about JMS by Chris Huston. It started out as a bit of a tutorial, with the typical “here’s a messaging system, here are the six types of messages, etc.” During the tutorial portion I thought it was a bit simple for a main talk, but it got better as the speaker continued. It was clear from his comments that he was deeply knowledgeable about the subject or, at the very least, had been enmeshed in JMS for a while. This wasn’t just an “I downloaded an open source JMS server and ran through the Sun tutorial” talk, and I appreciated that.

I had a couple of takeaways. One is that managing messaging with transactions is something you’re always going to want to do, but it is fraught with difficulty, since you’ll then have two transactional systems. And we all know what that means: you’ll have to buy this book. It also means that, in a real system, you’ll never want to use local transactions; you’ll want the transactions to be managed by a global transaction service, typically your application server.

Recovery of such a transactional messaging service was touched upon. If you have two different transactional systems, and failure occurs, recovering can be a real issue. Chris recommended, if at all possible, having the JMS provider and your data layer live in the same database, as then you can use the backup tools and ensure the two systems are in a consistent state.

One of the most interesting parts of the evening was a question asked by the audience. A fellow asked what scenarios JMS was useful for, and Chris said it was typically used in two ways:

1. Clustering/failover. You can set up a large number of machines, and since all they receive are messages with no context, it’s much easier to fail over to another machine. There’s no state to transfer.

I’ve seen this in the Jetspeed 1.5 project, where messaging is used to allow clustering.

2. Handling a large amount of data while increasing the responsiveness of the system. Since messages can be placed into queues, with no need for an immediate response, a message source can create a tremendous number of messages very quickly. Those messages may take quite a bit of time to process, which rules out a synchronous solution. JMS (and messaging solutions in general) provide a kind of hysteresis: the queue absorbs the burst and the consumers catch up over time.

I’ve seen this in a client’s system, where they send out a tremendous number of emails and want to track the status of each one. Writing the status to the database for each email is too slow, but sending a message to a queue is quick enough. On the receiving end, there’s some processing and the status is written to the database. The performance is acceptable, and as long as the JMS provider doesn’t crash or run out of memory, no messages are lost. (A rough sketch of the producer side of this pattern follows.)
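This is only a minimal sketch, assuming a JNDI-bound connection factory and queue (the JNDI names and the emailId:status message format are hypothetical); a separate consumer would read the queue and do the slow database write.

import javax.jms.*;
import javax.naming.InitialContext;

// Push an email-status message onto a queue instead of writing it
// synchronously to the database.
public class EmailStatusProducer {
    public static void send(String emailId, String status) throws Exception {
        InitialContext ctx = new InitialContext();
        QueueConnectionFactory factory =
            (QueueConnectionFactory) ctx.lookup("jms/ConnectionFactory");
        Queue queue = (Queue) ctx.lookup("jms/EmailStatusQueue");

        QueueConnection connection = factory.createQueueConnection();
        try {
            QueueSession session =
                connection.createQueueSession(false, Session.AUTO_ACKNOWLEDGE);
            QueueSender sender = session.createSender(queue);

            // A simple text message; the consumer updates the database later.
            TextMessage message = session.createTextMessage(emailId + ":" + status);
            sender.send(message);
        } finally {
            connection.close();
        }
    }
}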

The only scenario I thought of that Chris didn’t mention is one I haven’t seen myself: many legacy systems have some kind of messaging interface, so JMS might be an easy way (again, no context required) to integrate with such a system.

It was an interesting talk, and reminded me why I need to go to more BJUGs.

Database links

I just discovered database links in Oracle. This is a very cool feature that essentially allows you to (nearly) transparently pull data from one Oracle database to another. Combined with a view, it gives you a read-only, real-time copy of production data without giving a user access to the production database directly.
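As a rough sketch (the link name, account, TNS alias, and table are all hypothetical), the setup looks something like this:

-- On the reporting database: create a link that logs in to production
-- through a TNS alias.
CREATE DATABASE LINK prod_link
  CONNECT TO report_reader IDENTIFIED BY some_password
  USING 'PROD';

-- Wrap the remote table in a view so users query what looks like a local,
-- read-only copy of the production data.
CREATE VIEW orders_prod AS
  SELECT * FROM orders@prod_link;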

Along with the above link, section 29 of the Database Administrator’s Guide, Managing a Distributed Database, is useful. But I think you need an OTN account to view that link.

Concurrency, object orientation, and getting software done

The Free Lunch Is Over, via Random Thoughts, is a fascinating look at where CPUs are headed and what effect that has on software development. The subtitle, “The biggest sea change in software development since the OO revolution is knocking at the door, and its name is Concurrency,” drives home the author’s belief that concurrency will be the next big thing in software development.

I was struggling to write a relevant post about this topic, because I feel that, at least in the companies I’ve been with, there just wasn’t that much object-oriented software being written. I’m working on a project right now that has a minimum of object orientation, even though it is written in Java. I’m definitely more familiar with small-scale projects and web applications, but I know there are plenty of applications out there that are written and working well without the benefits of objects.

Or, should I say, written and working well without the benefits of objects directly. Servers, operating systems and general-purpose platforms are a different beast and require a different skill set. By building on top of such platforms, normal programmers don’t have to understand the intricacies of object-oriented development; they can benefit without that investment. Of course, they’d probably benefit more if they understood things, and there may come a time in their development when they’ll have to. However, the short-term gain of being able to continue on their productive plateau may be worth postponing the learning process (which will take them to a higher plateau at a short-term cost).

In the same way, I think that multi-threading won’t be required of normal business developers. I was struggling with this until the latest NTK came out, with this to say:

CPUs aren’t getting faster! Multi-core is the future! Which means we’ll all need to learn concurrent, multi-threaded programming, or else our software is never going to get faster again! That’s what Herb Sutter’s future shock article in Dr. Dobbs says (below). But before you start re-learning APL, here’s a daring thought: maybe programmers are just too *stupid* to write multi-threaded software (not you of course: that guy behind you). And maybe instead we’ll see more *background* processes springing up – filling our spare CPUs with their own weird, low i/o calculations. Guessing wildly, we think background – or remote – processes are going to be the new foreground.

From the Jan 21 edition, which should be online in a day or so. Those Brits certainly have a way with words.

If you’re a typical programmer, let the brilliant programmers who are responsible for operating systems, virtual machines and application servers figure out how best to use concurrent processor execution, and focus on process, on understanding business needs, and on making sure they’re met by your software. Or, if you have a need for speed, look at precalculation rather than multi-threading.

Expresso and dbobjects and ampersands

If you’re ever pulling a URL from a database via an Expresso dbobject (Expresso’s O-R layer) and you find yourself with mysterious &amp; entities being inserted, you may want to visit this thread and the FilterManager javadoc. Long story short, add this line:

setStringFilter("fieldname", FilterManager.RAW_FILTER);

to any fields of the dbobject that you don’t want ‘made safe’ by the default filter (which screens out dangerous HTML characters). Tested on Expresso 5.5.

(I’m omitting the rant about changing data pulled from the database without making it loud and clear that the default behavior is to filter certain characters. But it’s a Bad Idea.)

PL/SQL redux

I’ve written about PL/SQL before but recently have spent a significant amount of time writing stored procedures. Unlike some of my previous experiences, this time PL/SQL seemed like a great fit for the problem set, which was twofold.

In the first case, some of the stored procedures push data from stage tables, which are loaded via ODBC or SQL*Loader, into the tables the application accesses. PL/SQL is great for this type of task because cursors, especially when used with parameters, make row-driven data transformations a pleasure, and fast as well. Handling deltas via updates instead of inserts was all right, and PL/SQL code that manipulates data can be positively terse compared to JDBC PreparedStatements, and at least as fast. In addition, these stored procedures can easily be called over an ODBC connection, giving the client the ability to load new data into the stage tables and then call the stored procedure to update or insert the data as needed. (You could certainly do the same thing with a servlet and have the client hit a URL, but that’s a bit less self-contained.)
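Here’s a minimal sketch of that update-or-insert pattern with a parameterized cursor; the table and column names are invented, not taken from the actual project:

CREATE OR REPLACE PROCEDURE load_orders(p_batch_id IN NUMBER) IS
  -- Parameterized cursor over the stage table.
  CURSOR stage_cur(p_batch NUMBER) IS
    SELECT order_id, customer_name, amount
      FROM stage_orders
     WHERE batch_id = p_batch;
BEGIN
  FOR rec IN stage_cur(p_batch_id) LOOP
    -- Handle deltas: try an update first, insert only if the row is new.
    UPDATE orders
       SET customer_name = rec.customer_name,
           amount        = rec.amount
     WHERE order_id = rec.order_id;

    IF SQL%ROWCOUNT = 0 THEN
      INSERT INTO orders (order_id, customer_name, amount)
      VALUES (rec.order_id, rec.customer_name, rec.amount);
    END IF;
  END LOOP;

  COMMIT;
END load_orders;
/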

PL/SQL was also used to implement complex logic that was likely to change. Why do that in PL/SQL in the database rather than in Java in the application server? Well, changes to PL/SQL programs don’t require a server restart, which can be quite an issue when a server needs high levels of uptime. Instead, you just recompile the PL/SQL. Sure, you can use the reloadable attribute of the context to achieve the same thing (if you’re using Tomcat), but recompiling PL/SQL doesn’t have the same performance hit as monitoring class files for changes.

Use the right tool for the job. Even if PL/SQL ties your application to Oracle, a judicious use of this language can have significant benefits.

Under Pressure

In almost every software project of any length that I’ve participated in, the last few weeks before a release are tense and pressure-filled. (Please note that I write custom business software; that’s what these conclusions are drawn from.) Being in the middle of a project release myself, I thought I’d muse on the causes of this pressure. Why are the last few weeks before the deadline so tense? Because software is, above all else, about the details. Joel puts it well in his interview with Salon.com:

The fundamental problem that you’re trying to solve here is that humans think of things in vague, mushy terms. In order to visualize something, they don’t have to actually visualize every part of it. Whereas the programmer, in order to actually implement that thing, to create it, needs to have every part specified.

What happens on projects over a certain level of complexity is that this specification is pushed off, often until a decision must be made, or even past that point. This occurs for a number of reasons: programmers want to start coding; the client doesn’t have the information at the moment the issue is raised, and it is never revisited; the answers to certain questions (or the questions themselves) depend on answers to other questions. In the beginning of a project, the big questions are decided, but the small niggling details, which the compiler most certainly needs to know about, are perhaps noted but not dealt with.

Why not specify how the system will work, down to every exacting detail, before building any of it? Some software processes try to do this, but in general, unless the problem is very well understood (in which case the client will almost always be better served by off-the-shelf software), the requirements will change as the project progresses. (Incidentally, if they don’t, the project is a great candidate for offshoring.) The client will come to better understand the problem and the technology, and the software team will likewise better understand the problem and the domain space. So specifying the entire system up front will likely leave the customers unhappy or the system unused.

Because business software is actually business process crystallization, it matters very much that things are correct. And because business software is implemented by a group of people with specialized skills and, at best, a different focus from the users or, at worst, no understanding of the business, software delivery is unlike other deadline-driven industries in that changes are expensive and mysterious. I think every software engineer has an example of a simple change request that turned out to have massive implications throughout the system, and this effect is mysterious to normal users.

What matters is not why the details crop up, but that they do. So the last few weeks of every project consist of mentally running around and nailing down every detail. I expect this is true of every job with fixed deadlines (ever been around a retail store the day before Thanksgiving?). Every issue should be resolved or acknowledged when the software is released, and while some facets are less important than others, no detail is unimportant.

Options for connecting Tomcat and Apache

Many of the java web applications I’ve worked on run in the Tomcat servlet engine, fronted by an Apache web server. Valid reasons for wanting to run Apache in front of Tomcat are numerous and include increased clickstream statistics, Apache’s ability to quickly and efficiently serve static content such as images, the ability to host other dynamic solutions like mod_perl and PHP, and Apache’s support for SSL certificates. This last is especially important–any site with sensitive data (credit card information, for example) will usually have that data encrypted in transit, and SSL is the default manner in which to do so.

There are a number of different ways to deal with the Tomcat-Apache connection, in light of the concerns mentioned above:

Don’t deal with the connection at all. Run Tomcat alone, responding on the typical http and https ports. This has some benefits: configuration is simpler, and fewer software interfaces tend to mean fewer bugs. However, while the documentation on setting up Tomcat to respond to SSL traffic is adequate, having Apache handle SSL is, in my experience, far more common. For better or worse, Apache is seen as faster, especially when confronted with numeric challenges like encryption. Also, as of Jan 2005, Apache serves 70% of websites, while Tomcat does not serve an appreciable amount of http traffic. If you’re willing to pay, Netcraft has an SSL survey which might better illuminate the differences between SSL servers.

If, on the other hand, you choose to run some version of the Apache/Tomcat architecture, there are a few different options. mod_proxy, mod_proxy with mod_rewrite, and mod_jk all give you a way to manage the Tomcat-Apache connection.

mod_proxy, as its name suggests, proxies http traffic back and forth between Apache and Tomcat. It’s easy to install, set up and understand. However, if you use this method, Apache will decrypt all SSL data and proxy it over http to Tomcat. (There may be a way to proxy SSL traffic to a different Tomcat port using mod_proxy; if so, I was unable to find it.) That’s fine if they’re both running on the same box or in the same DMZ, the typical scenario. A byproduct of this method is that Tomcat has no way of knowing whether a particular request came in via secure or insecure means. If you’re using a tool like the Struts SSL Extension, this can be an issue, since Tomcat needs that information to decide whether redirection is required. In addition, if any of the dynamic generation in Tomcat creates absolute links, issues may arise: Tomcat sees requests for localhost or some other hidden hostname (via request.getServerName()) rather than for the public host that Apache proxied, and may generate incorrect links.

Updated 1/16: You can pass through secure connections by placing the proxy directives in certain virtual hosts:

<VirtualHost _default_:80>
ProxyPass /tomcatapp http://localhost:8000/tomcatapp
ProxyPassReverse /tomcatapp http://localhost:8000/tomcatapp
</VirtualHost>

<VirtualHost _default_:443>

SSLProxyEngine On
ProxyPass /tomcatapp https://localhost:8443/tomcatapp
ProxyPassReverse /tomcatapp https://localhost:8443/tomcatapp
</VirtualHost>

This doesn’t, however, address the getServerName issue.

Updated 1/17:

Looks like the Tomcat Proxy Howto can help you deal with the getServerName issue as well.

Another option is to run mod_proxy with mod_rewrite. Especially if the secure and insecure parts of the dynamic application are easily separable (for example, if the application is split into /secure/ and /normal/ chunks), mod_rewrite can be used to rewrite the links. If a user visits https://www.example.com/application/secure and traverses a link to /application/normal, mod_rewrite can send them to http://www.example.com/application/normal/, sparing the server the strain of needlessly encrypting pages.
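A sketch of that rewrite, assuming the /application/secure and /application/normal split used above, might look like this inside the SSL virtual host:

# Bounce requests for the non-secure part of the application back to plain
# http, so only /application/secure stays encrypted (paths are illustrative).
RewriteEngine On
RewriteRule ^/application/normal/(.*)$ http://www.example.com/application/normal/$1 [R,L]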

mod_jk is the usual way to connect Apache and Tomcat. In this case, Tomcat listens on a different port, and a piece of software known as a connector enables Apache to send requests to Tomcat with more information than is possible with a simple proxy. For instance, certain variables are sent via the connector when Apache receives an SSL request. This gives Tomcat full knowledge of the state of the request and makes using a tool like the aforementioned Struts SSL Extension possible. The documentation is good. However, using mod_jk is not always the best choice; I’ve seen performance issues with some versions of the software. You almost always have to build it yourself: binary releases of mod_jk are few and far between, I’ve rarely found the appropriate version for my version of Apache, and building mod_jk is confusing. (Even though mod_jk 1.2.8 provides an ant script, I ended up using the old ‘configure/make/make install’ process because I couldn’t make the ant script work.)
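For reference, a bare-bones mod_jk setup looks roughly like this (the worker name, paths, and mount point are illustrative):

# httpd.conf: load the connector and hand the application URLs to Tomcat
LoadModule jk_module modules/mod_jk.so
JkWorkersFile conf/workers.properties
JkLogFile logs/mod_jk.log
JkMount /application/* worker1

# conf/workers.properties: one AJP 1.3 worker pointing at the Tomcat AJP port
worker.list=worker1
worker.worker1.type=ajp13
worker.worker1.host=localhost
worker.worker1.port=8009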

In short, there are plenty of options for connecting Tomcat and Apache. In general, I’d start out using mod_jk, simply because that’s the option that was built specifically to connect the two; mod_proxy doesn’t provide quite the same level of integration.