Observations on a Writing a Custom Report with Java, Quickbooks, Jasper Reports, Google Spreadsheets and Google Drive

A recently released project is using java and spring to pull data from quickbooks, a mysql database and google spreadsheets, munging the data in various ways, and using jasper reports and jfreechart to generate a good looking report and a CSV of transactions that will give our brokers weekly updates on how they are doing compared to their goals for the year. I then upload it to Google Drive and send an email notifying each realtor that they have a new file.  It’s always nice to release useful software–feels like a new day dawns.

A few observations from this project:

  • The tech was interesting, but it was actually more interesting to see how the needs of tech drove the business to ‘tighten up’ their processes. Whether that was making sure there was one place to look for user type data, or defining exactly what constituted achieving a goal, or making sure that any new realtors who joined created business goals (and committed them to writing), the binary nature of software forced the business (or, more accurately, people in the business) to make decisions. Just another example of business process crystallization. This also meant deferring some software development. Where the business couldn’t answer a question definitively, rather than force them to do so, we chose to defer development.
  • I’m glad that jasper reports makes it so easy to generate PDFs–you basically create an XML file (I was unable to find a spec, but there were plentiful examples) and then put tokens for dynamic content. Then you compile the XML file, give it a map of said tokens and values (which can be text, numbers, dates or images), and then export the object to PDF. Note that I was not using Jasper in a typical way–reporting from large amounts of similar data via a data connection–because I wanted different PDFs for each user. Perhaps there was a way to do this elegantly, but I was just trying to get stuff done. Creating a grid in jasper was interesting, though.
  • JFreechart had a very weird issue where on stage the graph labels were bolded and italicized, but not on production. Since we make every effort to keep these two environments in sync and they were running exactly the same code, this was a mystery. Finally solved it when we discovered java was different (same version, different vendors: openjdk vs sun java). Had been running this way for years. Oops.
  • Interacting with google spreadsheets is great for the business but a pain in the butt for developers. It’s great for our business because it is extremely easy for someone who is not a programmer to create a ‘database’ that is versioned, backed up, almost always accessible and helps them ‘get stuff done’ in a measured way. It’s a pain for developers because it is a ‘database’ and not a database–no referential integrity or data typing. Also, google provides cell based access and row based access, forcing you to choose. And the java libraries are old. Beats excel though–at least it is accessible from a server. We ended up writing a library to wrap Google’s Java SDK to make some common operations easier.
  • Pushing to google drive is interesting–I alluded to this in my last post but you have to be ready for failure. I ended up using a BlockingQueue for this–I throw files (a data structure defining the file, actually) to be uploaded on the queue, then consumers executing in a different thread each take one off, try to upload it, and if it fails, put it back on. I considered using a third party durable queue like IronMQ, but thought it was overkill.
  • Using the Quickbooks SDK, with all the accounting data exposed, lets you build some pretty powerful graphs that are useful to the business. But the docs are jumbled, with a lot of them aimed at developers who are building integrations to sell to Quickbooks users. Support is OK for standard operations, but for things like renewing your token, you have to drop down to the REST API (see my SO question) This article does a good job of outlining the various projects but as a dev you’ll have to ignore certain sets of information–never fun when getting up to speed.
  • We do a lot of backend processing and spring and maven and a custom assembler that generates a tarball when using ‘maven install’ have been great. I also finally figured out how to work with maven to use ‘release:prepare’ and ‘release:perform’ for releasing libraries, as opposed to going my own way, and that has made things much much easier. Learn your tools, folks!
  • I’m once again astounded by the richness of the java library ecosystem. There doesn’t seem to be very much that I can think of doing that doesn’t have at least one, and probably three, java implementations.

Review of Emergent One

A few weeks ago I sat down with some folks at Emergent One and got a demo of their product.  The reason I reached out to them is because I heard great things about their demo at GlueCon.  My company is considering building a mobile app (who isn’t, right!) and I thought that what EmergentOne offered was a great way to accelerate the server side development of that app.

We are slightly outside of their target market–our API would be solely for internal use of a small team, while it seems like they are aiming at companies who want to make an API available to a larger audience (either external or a larger internal development staff).

Regardless of target market, they have a slick product.  They have a self service web application which can generate APIs directly from database tables.  You allow the app select access to your database.  (I believe only mysql and postgresql are supported at the moment, but I know things are moving fast over there as well.)  You then work within the app to build an API based on the tables in your database.  You can have derived fields as well as fields that map to columns in your database directly.  You can filter your data (so if you only want to expose a subset of your data, you can create an endpoint that only displays that: “users over 40”, for example).  You can also add comments to fields.

After you define as many API endpoints as you want, you can manage access to them with application keys or usernames and passwords.  Automatic documentation with datatypes and whatever comments you have added is generated, and there is a developer portal where it is easy to play around with the APIs and see what you missed.

What they showed me is great, but this product is still in private beta.  That means there are some rough edges.  The biggest hole (a fix for which the demoer promised was coming very soon in their development plan) is that you can’t search against the API.  So, if you have an API exposing your pets table, you can create an endpoint to retrieve all pets, and you can retrieve one pet based on id.  You can create a ‘dogs-only’ endpoint, and get all dogs or one dog.  But you can’t query the dogs-only endpoint for dogs that weigh over 25 pounds and have short hair (or any other type of querying).  I only played with read-only APIs, so I’m not sure how the write-access APIs work.

There’s also always the issue of introducing another vendor into the system.  Since we are looking at this for mobile apps, performance is very important.  It seemed like the demoer was well aware of this issue.  He mentioned an SLA would be likely when they went public, and also talked about some of the steps they are taking to make sure their app uses indexes and other metadata about the tables being exposed to execute as quickly as possible.

I haven’t built an API with any of the other tools out there, so I can’t compare the ease of Emergent One with, say, tools like jboss resteasy that work with a java layer, or usergrid, which autogenerates an API but requires moving data into it.  But I can say that this was a very easy way to go database to API in less than an hour (with iptables troubleshooting mixed in!).  If they get searching right–making it easy to use and performant–this will be a fantastic product.

Using Google Apps to generate a simple site from a form

I was blue skying with a friend the other day and came up with checklist to build a simple canned website from a form submission, using Google Apps and Google Apps Scripting.

  1. User goes to a form.  (More on creating a form)
  2. User submits images, video links, text, email, background color, etc.
  3. That all goes into a google spreadsheet, which triggers the building of a site.  Here’s trigger info.
  4. You can automatically build a google site.  Here’s the api: site creation, page creation.  If you want your own domain name (example.com) I think you can do that, but you’d need to integrate with a DNS service to set that up–this part bears a little more investigating.
  5. If desired, you can generate the QR code for the site.
  6. If you need to tell the form submitter that the site is created, you can email them a link to the QR code so they can print it out.
  7. You can even remove content from the site via cron (see triggers link up above), though you can’t automatically delete the site.
  8. It’s conceivable you could even use google checkout to charge users beforehand (though this is a bit less supported).

What’s great is that you don’t need to have any tool other than a browser to build this entire stack.  Google is truly creating a toolkit/macro language in the cloud.

Image Recognition Automation Fail

I had a friend who asked me to take a quick look at a business problem he was having.  He had a set of photos (of vinyl record albums) that he was looking at to identify the artist.  After finding the artist, he’d do some additional categorization work and then push the image and metadata to a ecommerce site he is running.

He wanted a way to more quickly identify the artist and album, preferably without his intervention.

The first thing I suggested was Mechanical Turk, as this seemed like a great example of a Human Interaction Task.  However, my friend tried this and found it to be more work (mostly proofing, I think) than it was help.

He also pointed out that Google Image Search does exactly what he wants.  You can post an image or URL to it (you have to use the camera in the search box), and it will give you back like images, best guess for the image matching, and links related to the image.  Pretty cool!

However, there is no API for the new Google Image Search.  Sure, you have the old, deprecated Images API, but it doesn’t have access to post an image or a URL, just keywords.  A bit of looking around revealed this StackOverflow post, which deconstructs the new Image Search parameters.  However, the deconstructed URL gives back javascript which needs to be executed by a browser.  I looked briefly at Selenium and webdriver to do that, but couldn’t figure it out.

I also looked at Kooaba’s API, but they didn’t get back to me when I signed up for a free developer account, and their API only covers books, CDs and DVDs.  Also from a StackOverflow post, I looked at MacroGlossa and IQEngines.  Neither of them seemed to work–MacroGlossa wanted a category (and, shocker, vinyl record albums was not a category) and IQEngines let me submit an image wasn’t successful in identifying it.

I had to admit I was defeated.

Running a Google Apps Script Once a Month

I needed a way to email a Google spreadsheet to my boss once a month, for some reporting purposes.  I could have put an entry in my calendar reminding me to do it, but I thought it would be a great time to try out the Google Docs scripting that I had read about for a year or two, and seen an AppSumo video about.  (I got the AppSumo video for free, from an ad on HARO.)

It was laughably easy to get write the actual script (here’s a great set of tutorials).  The only rub was Google doesn’t allow you to run scripts in month intervals, only hourly, daily or weekly.  A small bit of scripting got around that.

Here’s the final script (edited to remove sensitive data):

function myFunction() {
  var dayOfMonth = Utilities.formatDate(new Date(), "GMT", "dd");
  if (dayOfMonth == 05){
    MailApp.sendEmail("email@example.com", "Spreadsheet Report Subject", 

I set up a daily trigger for this script and installed it within the spreadsheet I needed to send.

I really really like Google Apps Script.  I think it has the power to be the VB of the web, in the way that VB made it easy to automate MS Office, reduce drudgery, and allow non developers to build business solutions.  It also ties together some really powerful tools–check out all the APIs you can access.

Once you let non developers develop, which is what Google Apps Script does, you do run into some maintenance issues (versioning, sharing the code, testing), but the same is true with Excel Macros, and solving those issues is for greater minds than mine.

Useful Tools: StatsMix makes it easy to build a dashboard

I haven’t been to a BDNT lately, but still get their email announcements.  In August, all the 2010 TechStars folks presented, and were listed in the email.  I took a look at each company, and signed up when the company seemed to be doing something cool.  I always want to capture my preferred username, mooreds!

One that was very interesting to me was StatsMix; I signed up for their beta.  On Nov 1, I got invited to sign up.  Wahoo!

Statsmix lets users build custom dashboards.  I am developing an interest in web analytics (aside: if you are interested in this topic, I highly recommended Web Analytics 2.0, by Avinahsh Kaushik).  I’ve been playing with Piwik, an open source analytics toolkit, but Statsmix offers a slicker solution.

They have made it dead simple to create a custom dashboard for users.  They offer integration with, at this time, 29 services (twitter, mailchimp, youtube, Google Analytics, etc).  I could not find an up to date list of integration services outside of their webapplication!  The best I could find was this list from September.  While the integration interface is slick, the data integration is rudimentary.  For example, they will let you monitor the number of rows in a Google Spreadsheet, but nothing more (like rows in different columns, or the value in a particular cell–would be nice to see them integrate with Google Apps Scripting); you can track the number of likes on Facebook, but not the number of comments.

The real power of StatsMix comes from the ease of integration with your own custom stats.  They offer an API which is accessible via REST.  This means that you can push information from your database to a beautiful looking dashboard with shell scripts and a cron job.  Very cool!  It would be nice to see a plugin for Magento or other ecommerce vendors; I recently had a client, The Game Frame, that would have been a great fit for this type of dashboard, since it aggregates beyond what the ecommerce software provides.

Other cool features:

  • The whole UI is beautiful and farily intuitive.
  • The dashboard supports custom date ranges.
  • They will send you an email of stats every day, and apparently have some kind of limited version you can pass onto clients.  I didn’t play with the email feature at all, though it is extremely useful.

However, all is not perfect.  Some issues with StatsMix include:

  • As mentioned above, the integration with third party services leaves something to be desired.  What they offer is a nice start, but it’d be great to see them create some kind of marketplace where developers could build solutions.  For example, the twitter widget only tracks the number of followers.  From the TWitter API, it appears to be pretty easy to track the number of mentions, which could be a useful metric.
  • It wasn’t clear how to share a dashboard, though that may be an upcoming feature.
  • The terms of use are, as always, pretty punishing.
  • Once you develop a number of custom metrics, you are tied to their platform.  That wouldn’t be so bad, except…
  • They are planning to charge for the service, but give no insight into what to expect.  There is a tab called ‘Billing’ but all it says is: “During our beta, StatsMix is free to use. After the beta, you’ll be able to manage your billing preferences on this page.”  If I was considering using this as part of my business, I would want much more insight into possible costs before I committed much time to custom metric buildouts.  I’m fine with them making money, just want more insight into this key aspect of their web app.

All in all, it is well worth a try.  If you to, let me know by posting a comment.  I have 5 invites to give out.

BrowserMob: Load test your applications using the cloud

Via this tweet from Matt Raible, I learned of BrowserMob.  This service allows you to easily load test your web application.

I set it up in about 2 minutes to do a simple load test of a client’s site (though 5 pages).  They make it free to ‘test drive’ their service (though the free not enough to actually stress your site).  It is extremely easy to test a path through a publicly facing system.

The report was good enough; you get screen captures of pages that have failures, and they do a good job of making some of the performance data pretty and intelligible.  Again, I didn’t really load test anything, so I didn’t examine the report as closely as I would have in a real world scenario.  The service is built using Selenium, and I believe they allow you to upload full featured selenium tests (if you have already invested in this technology, but don’t want to build out a cloud network).

This service is of particular interest to me because last year I was part of a project that built a selenium grid on Amazon EC2, using these instructions.

If we’d known about BrowserMob, I’m not sure we would have used them, as I don’t know what our budget was, but it would have been nice to have that in the evaluation mix.

Technorati Tags: , ,

In source your EC2 instances

If you have built a killer application on Amazon Web Services, you may reach a point where you don’t want to continue to use them.  I can think of any number of reasons you may want to migrate your servers.

It may be because you’ve reached the 20 server instance, or because you want more control, or because you want to buy your own machines and spend money on a system administrator instead of paying Amazon, or because there’s something that you need customized that’s ‘behind the curtain’ of AWS.

For whatever reason, if you decide to move off Amazon’s elastic compute cloud,  you probably should take a look at Eucalyptus (thanks to George Fairbanks for pointing this out to me!).  From the overview, this is a AWS compatible environment, so you can continue to use the same tools (capistrano!) to manage your instances.  You also gain the same abilities to spin up or spin down servers easily.

What you don’t get is AMI compatibility.  That is, you can’t transfer your AMI to a eucalyptus server farm and expect it to run.  They have a FAQ about AMIs (for 1.5, which is an older version of the software) that points to some forum posts about turning an AMI into an EMI (Eucalyptus Machine Image), but it doesn’t look like a trivial or easy operation.  It does seem possible, though.
However, it’s good to know that it is possible, and that a company can have a migration path off AWS if need be.

Technorati Tags: , ,

© Moore Consulting, 2003-2017 +