Jini and JavaSpaces at BJUG

I went to BJUG last week to see a presentation about Jini by someone from GigaSpaces. It was an intensely interesting presentation for a number of reasons. First off, I knew the presenter, Owen Taylor. About 6 years ago, I took a class from him, along with a few other people. The class covered BEA WebLogic and EJBs. I’ve attended (and given a couple of) technical presentations in my time, including some at conferences. I don’t think I’ve ever met someone who was more energetic and practiced at conveying hard concepts than Owen Taylor. Owen! Start blogging!

Another reason it was interesting is that Brian Pontarelli, an old friend, really likes Jini and has told me about some of his experiences. I actually looked into it when Bill DeHora published his entry “two classic hardbacks”. I downloaded Jini and JavaSpaces (Jini is the framework; JavaSpaces is the tuple space repository) and started playing with it. The final reason that it was an interesting presentation is that JavaSpaces is something that I have never had a chance to use, and didn’t foresee using in the future. By the end of the presentation, I was convinced that this concept deserved more research, if nothing else.

What follows are my scribbled notes from that meeting, along with a smattering of other comments and thoughts regarding these technologies. More information is here; unfortunately, no presentation artifacts are available.

The problem with distributed systems is that they move data around a lot. What you really want is for the processing and the data to be at most one step apart. Stored procedures do this, but you can’t change the logic easily.

Jini was originally developed for pervasive computing, but the focus of the presentation was on the enterprise applications that can be built based on that spec. This class of applications has some amazing features, including low latency, extremely high throughput and ‘100%’ uptime capability.

For that reason, many large institutions are looking at replacing or augmenting JEE (née J2EE) applications with JavaSpaces applications. He mentioned that GigaSpaces recruited him with the notion that a laptop could run 3 million events an hour. This kind of blew his mind.

JavaSpaces applies the command pattern: both code and data are distributed, an approach based on Linda. Orbitz uses the technology and talks about 100% uptime. Anyplace where you are batching, you can now do it in real time. The key is to keep everything in memory and use replication for persistence, rather than disk. (Eventually you want to push it to disk, for reporting and auditing purposes, but you can do that asynchronously.) Databases tend to be used as a bus between in-memory processes right now, and you can replace that with a JavaSpace.
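To make the tuple space idea a bit more concrete, here is a minimal single-JVM sketch in the Linda style. It is not the JavaSpaces API: ToyTupleSpace and its write, take, and takeIfExists methods are names made up for illustration, and a real space adds leases, transactions, and distribution across machines. In this sketch, null fields in a template act as wildcards.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Toy, in-memory tuple space (hypothetical; NOT the JavaSpaces API).
class ToyTupleSpace {
    private final List<Object[]> tuples = new ArrayList<>();

    // Put a tuple into the space and wake up any blocked takers.
    public synchronized void write(Object... tuple) {
        tuples.add(tuple.clone());
        notifyAll();
    }

    // Remove and return the first tuple matching the template,
    // or null if nothing matches right now.
    public synchronized Object[] takeIfExists(Object... template) {
        Iterator<Object[]> it = tuples.iterator();
        while (it.hasNext()) {
            Object[] candidate = it.next();
            if (matches(template, candidate)) {
                it.remove();
                return candidate;
            }
        }
        return null;
    }

    // Blocking variant: wait until a matching tuple has been written.
    public synchronized Object[] take(Object... template) throws InterruptedException {
        Object[] found;
        while ((found = takeIfExists(template)) == null) {
            wait();
        }
        return found;
    }

    // Number of tuples currently in the space.
    public synchronized int size() {
        return tuples.size();
    }

    // A template matches when lengths agree and every non-null field is equal.
    private static boolean matches(Object[] template, Object[] tuple) {
        if (template.length != tuple.length) {
            return false;
        }
        for (int i = 0; i < template.length; i++) {
            if (template[i] != null && !template[i].equals(tuple[i])) {
                return false;
            }
        }
        return true;
    }
}
```

In this style a master would write ("task", payload) tuples while workers loop on take("task", null) and write results back. Because take removes the tuple, each task is handed to exactly one worker.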

Jini is composed of discrete objects that can run anywhere; more to the point, they don’t care where they run. It also expects failure, as opposed to many other technologies that simply assume that things will run correctly. Jini is a LAN based technology, though Owen mentioned that there are ways to turn it into a WAN technology and cited several examples. I am not competent to give a general overview of Jini–please check out this tutorial for more information.

One thing that really struck me is that all of the complexity that EJB and other JEE technologies hide (clustering, transaction management, thread management, lifecycle), JavaSpaces revels in. Owen actually mentioned that JavaSpaces brings skills that JEE developers currently rarely need to use, like threading and classloading, back into the toolbox, rather than depending on a vendor. That can be a plus and a minus, right? The whole point of not trusting servlet threading to a business developer is that it allows them to focus on the business logic. The problem with much of JEE is that it hasn’t done a very good job of doing this. Do you remember the ‘deployer’ role?

Jini has only interfaces; the named implementations are shipped around transparently. Ha ha, just like EJB remote calls are transparent. However, one very nice aspect of Jini is that when you register an implementation of an interface, you say how long the implementation is going to be available (the lease length). As a service provider, you can keep track of that lease and re-register yourself when it is close to expiring. Of course, if the service is no longer available, for whatever reason, it is not provided to clients; there’s no need for the JVM to garbage collect. The clients do need to be a bit smart about things though.
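The lease idea can be sketched in a few lines. This is a toy under stated assumptions, not Jini's lookup service API: ToyLeaseRegistry, register, and isAvailable are hypothetical names, and the clock is passed in explicitly so the behavior is easy to follow.

```java
import java.util.HashMap;
import java.util.Map;

// Toy lease-based registry (hypothetical; NOT the Jini lookup API).
class ToyLeaseRegistry {
    // Service name -> time (in millis) at which its lease expires.
    private final Map<String, Long> expiryByService = new HashMap<>();

    // Register a service, or renew it, for leaseMillis starting at nowMillis.
    public void register(String name, long nowMillis, long leaseMillis) {
        expiryByService.put(name, nowMillis + leaseMillis);
    }

    // A service is offered to clients only while its lease is unexpired.
    public boolean isAvailable(String name, long nowMillis) {
        Long expiry = expiryByService.get(name);
        if (expiry == null) {
            return false;
        }
        if (nowMillis >= expiry) {
            // A lapsed lease simply drops out of the registry.
            expiryByService.remove(name);
            return false;
        }
        return true;
    }
}
```

A provider that registers at time 0 with a 1000 ms lease and renews at 900 ms stays visible until 1900 ms. A provider that dies without renewing disappears on its own once the lease runs out, with no explicit cleanup step.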

As for licensing, version 2.0 has been released under an Apache license, as opposed to the Sun Community Source License, which was the previous license. This should grow the jini.org community significantly.

Configuration of Jini takes place with a Java syntax, which can be a bit confusing, since you don’t compile and execute it. The names of the services (reggie, webster) are a bit cutesy. Webster is the web server which serves implementation classes, but shouldn’t be used in a production environment. Use Tomcat.

Spring and JavaSpaces are complementary; work is in progress to integrate them and completion is expected in the next few months. GigaSpaces has scaled implementations (linearly!) to 2000 CPUs on 500 machines….

At this point Owen began talking about various architectural patterns that could be used with Jini; he also covered some war stories. However, I didn’t take any notes–you’ll have to see him talk sometime.

Issues include (so my friend says) versioning. Owen mentioned that debugging isn’t a strong suit. And I did some parallel computing for my senior thesis so I know that splitting up problems so they can be parallelized is not always as easy as you’d like. However, the web paradigm is actually rather suited to parallelization, since you do have the request/response model. The problem is, as it so often is, state.


apachebench drops hits when the concurrency switch is used

I’ve used apachebench (or ab), a free load testing tool written in C and often distributed with the Apache Web Server, to load test a few sites. It’s simple to configure and gives tremendous throughput. (I was seeing about 4 million hits an hour over 1 gigabit ethernet. I saw about 10% of that from jmeter on the same machine; however, the tests that jmeter was running were definitely more complex.)

Apachebench is not perfect, though. The downsides are that you can only hit one url at a time (per ab process). And if you’re trying to load test the path through a system (“can we have folks login, do a search, view a product and logout”), you need to map that out in your shell script carefully. Apachebench has no way to create more complicated tests (like jmeter can). Of course, apachebench doesn’t pretend to be a system test tool–it just hits a set of urls as fast as it can, as hard as it can, just like a load tool should.

However, it would be nice to be able to compare hits received on the server side and the log file generated by apachebench; the numbers should reconcile, perhaps with some fudge factor for network errors. I have found that these numbers reconcile as long as you only have one client (-c 1, or the default). Once you start adding clients, the server records more hits than apachebench. This seems to be deterministic (that is, repeatable), and worked out to around 4500 extra requests for 80 million requests. As the number of clients approached 1, the discrepancy between the server and apachebench numbers decreased as well.
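For a sense of scale, the discrepancy quoted above is small in relative terms. The sketch below is just the arithmetic; HitReconciler and discrepancyPercent are illustrative names, and the server count of 80,004,500 is simply the 80 million requests plus the roughly 4500 extra hits.

```java
// Arithmetic helper for comparing client-side and server-side hit counts.
// The class and method names are made up for this example.
class HitReconciler {
    // Extra hits seen by the server, as a percentage of what the client reported.
    static double discrepancyPercent(long clientReported, long serverLogged) {
        return 100.0 * (serverLogged - clientReported) / clientReported;
    }

    public static void main(String[] args) {
        // 4500 extra hits on 80 million requests is about 0.0056 percent.
        System.out.println(discrepancyPercent(80_000_000L, 80_004_500L));
    }
}
```

That is roughly one extra server-side hit per 18,000 requests: negligible for throughput testing, but enough to matter if the two counts have to match exactly.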

This offset happened with Tomcat 5 and Apache 2, so I don’t think that the issue is with the server. I think apachebench is at fault. I searched the httpd bug database but wasn’t able to find anything related. Just be aware that apachebench is very useful for generating large HTTP request loads, but if you need the counts to reconcile exactly, skip the concurrency option.


Eclipse impressions

I have previously espoused opinions about IDEs. But, I’ve heard great things about Eclipse, including this rather direct statement from a developer who I respect:

Having a solid IDE like IntelliJ or Eclipse so radically improves your productivity that I quite simply don’t see how you can call yourself a professional developer without using one.

So, I thought I’d give Eclipse a try. Again. The latest version is Eclipse 3.1. This time I wasn’t going to try to get by with just the free tutorials. I did some browsing on Amazon and found Eclipse Distilled. This book, while aimed at Eclipse 3.0, is eminently readable and quite informative on the Eclipse way of doing things. All the views and perspectives and projects and jargon can be a bit confusing, so I was happy to pay $35 for this guide.

After using Eclipse for a few weeks, I have some likes and dislikes:

Likes:

  • Code completion: huge. Hitting control-space and choosing a method rather than having to remember exactly what it is named is big. (Charles Petzold talks about a similar feature called IntelliSense in Visual Studio and some of the ramifications. Not sure if all of them apply to Eclipse.)
  • Integration with existing projects: while you can easily start new projects with Eclipse, I was also very impressed with how easy it was to bring an existing codebase into the system and begin using Eclipse to modify it.
  • Refactoring: again, huge. I find that I use the ‘rename method’ refactoring most often. The ability to just change the name of a method in one place and have it propagate allows you much more flexibility.

Dislikes:

  • Using CVS externally confuses Eclipse: I consider myself a power user of CVS. This means it’s often easier for me to drop down and run commands from the prompt. This seems to confuse Eclipse, especially when I’m adding files.
  • No support for local CVS repositories: it’s a known bug, with some workarounds available.
  • Memory usage: 150M of memory is used, even when it is doing nothing. Now, I realize that most new boxes are shipped with gigs and gigs of memory, but if you run Eclipse inside VMware with Oracle and Tomcat, eventually things start to get a bit crowded.
  • I have a few other quibbles, but above are the main ones I’ve run into so far.

So, ok, ok. I was wrong. Those of you who have used Eclipse or NetBeans or VisualStudio or IntelliJ or Visual SlickEdit are snoozing right now, but I’ve learned something important. IDEs can be very good and when a free cross platform IDE is available and paired with an external build tool, the results can be powerful indeed.

“Eclipse Distilled” at Amazon.



JSVC and large log files

jsvc, which is used for daemonizing Tomcat and other Java applications on Unix, takes filenames for stdout and stderr as arguments. One thing to be aware of is that when either of these files reaches a size of just over 2 gigabytes, jsvc simply fails. No error message. If you restart the application, it will note that it can’t write to the file and proceed to write to the console. I saw this behavior using Tomcat 5 on Fedora Core 4 with jsvc 1.0.1 (described here).

I am not sure exactly what the problem is, but when I started tomcat via the normal shell script, it was able to write to that file. The user that jsvc runs as had no limits on file size:

-bash-3.00$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
file size               (blocks, -f) unlimited
pending signals                 (-i) 1024
max locked memory       (kbytes, -l) 32
max memory size         (kbytes, -m) unlimited
open files                      (-n) 1024
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
stack size              (kbytes, -s) 10240
cpu time               (seconds, -t) unlimited
max user processes              (-u) 32764
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

Therefore, it might be an issue with jsvc. Do note that there are nightly snapshots of jsvc, which might solve the issue. The solution I found is to use the copytruncate option of logrotate.



The Eolas Matter, or How IE is Going to Change ‘Real Soon Now’

Do you use <object> functionality in your web application? Do you support Microsoft Internet Explorer?

If so, you might want to take a look at this: Microsoft’s Active X D-Day, which details Microsoft’s plans to change IE to deal with the Eolas lawsuit. Apparently the update won’t be critical, but eventually will be rolled into every version of IE.

Here’s a more technical explanation of how to fix object embedding from Microsoft, and a bit of history from 2003.

Via Mezzoblue.



© Moore Consulting, 2003-2017