Running a Google Apps Script Once a Month

I needed a way to email a Google spreadsheet to my boss once a month, for some reporting purposes.  I could have put an entry in my calendar reminding me to do it, but I thought it would be a great time to try out the Google Docs scripting that I had read about for a year or two, and seen an AppSumo video about.  (I got the AppSumo video for free, from an ad on HARO.)

It was laughably easy to write the actual script (here’s a great set of tutorials).  The only rub was that Google doesn’t allow you to run scripts at monthly intervals, only hourly, daily or weekly.  A small bit of scripting got around that.

Here’s the final script (edited to remove sensitive data):

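function myFunction() {
  // "dd" yields a zero-padded string such as "05", so compare against a string.
  var dayOfMonth = Utilities.formatDate(new Date(), "GMT", "dd");
  if (dayOfMonth === "05") {
    // Recipient removed; the body argument was cut off in the original, so this one is a placeholder.
    MailApp.sendEmail("", "Spreadsheet Report Subject", "(report body)");
  }
}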

I set up a daily trigger for this script and installed it within the spreadsheet I needed to send.
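
Here is a minimal sketch of that day-of-month gate, pulled out as plain JavaScript so it can be run outside Apps Script.  The names isReportDay and runMonthly are hypothetical, not part of the original script, and the check uses UTC to mirror the "GMT" argument above.

```javascript
// Hypothetical helper: true when `date` falls on `targetDay` of the month (UTC).
function isReportDay(date, targetDay) {
  return date.getUTCDate() === targetDay;
}

// Hypothetical wrapper: a daily trigger would call this once a day,
// and `action` only runs on the chosen day of the month.
function runMonthly(date, targetDay, action) {
  if (isReportDay(date, targetDay)) {
    action();
    return true;
  }
  return false;
}
```

In Apps Script, the action passed in would be the function that sends the email, and the date would simply be new Date().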

I really really like Google Apps Script.  I think it has the power to be the VB of the web, in the way that VB made it easy to automate MS Office, reduce drudgery, and allow non developers to build business solutions.  It also ties together some really powerful tools–check out all the APIs you can access.

Once you let non developers develop, which is what Google Apps Script does, you do run into some maintenance issues (versioning, sharing the code, testing), but the same is true with Excel Macros, and solving those issues is for greater minds than mine.

Useful Tools: StatsMix makes it easy to build a dashboard

I haven’t been to a BDNT lately, but still get their email announcements.  In August, all the 2010 TechStars folks presented, and were listed in the email.  I took a look at each company, and signed up when the company seemed to be doing something cool.  I always want to capture my preferred username, mooreds!

One that was very interesting to me was StatsMix; I signed up for their beta.  On Nov 1, I got invited to sign up.  Wahoo!

StatsMix lets users build custom dashboards.  I am developing an interest in web analytics (aside: if you are interested in this topic, I highly recommend Web Analytics 2.0, by Avinash Kaushik).  I’ve been playing with Piwik, an open source analytics toolkit, but StatsMix offers a slicker solution.

They have made it dead simple to create a custom dashboard for users.  They offer integration with, at this time, 29 services (Twitter, MailChimp, YouTube, Google Analytics, etc).  I could not find an up to date list of integration services outside of their web application!  The best I could find was this list from September.  While the integration interface is slick, the data integration is rudimentary.  For example, they will let you monitor the number of rows in a Google Spreadsheet, but nothing more (like rows in different columns, or the value in a particular cell–would be nice to see them integrate with Google Apps Script); you can track the number of likes on Facebook, but not the number of comments.

The real power of StatsMix comes from the ease of integration with your own custom stats.  They offer an API which is accessible via REST.  This means that you can push information from your database to a beautiful looking dashboard with shell scripts and a cron job.  Very cool!  It would be nice to see a plugin for Magento or other ecommerce vendors; I recently had a client, The Game Frame, that would have been a great fit for this type of dashboard, since it aggregates beyond what the ecommerce software provides.

Other cool features:

  • The whole UI is beautiful and fairly intuitive.
  • The dashboard supports custom date ranges.
  • They will send you an email of stats every day, and apparently have some kind of limited version you can pass on to clients.  I didn’t play with the email feature at all, though it looks extremely useful.

However, all is not perfect.  Some issues with StatsMix include:

  • As mentioned above, the integration with third party services leaves something to be desired.  What they offer is a nice start, but it’d be great to see them create some kind of marketplace where developers could build solutions.  For example, the Twitter widget only tracks the number of followers.  From the Twitter API, it appears to be pretty easy to track the number of mentions, which could be a useful metric.
  • It wasn’t clear how to share a dashboard, though that may be an upcoming feature.
  • The terms of use are, as always, pretty punishing.
  • Once you develop a number of custom metrics, you are tied to their platform.  That wouldn’t be so bad, except…
  • They are planning to charge for the service, but give no insight into what to expect.  There is a tab called ‘Billing’ but all it says is: “During our beta, StatsMix is free to use. After the beta, you’ll be able to manage your billing preferences on this page.”  If I was considering using this as part of my business, I would want much more insight into possible costs before I committed much time to custom metric buildouts.  I’m fine with them making money, just want more insight into this key aspect of their web app.

All in all, it is well worth a try.  If you do, let me know by posting a comment.  I have 5 invites to give out.

BrowserMob: Load test your applications using the cloud

Via this tweet from Matt Raible, I learned of BrowserMob.  This service allows you to easily load test your web application.

I set it up in about 2 minutes to do a simple load test of a client’s site (through 5 pages).  They make it free to ‘test drive’ their service (though the free version is not enough to actually stress your site).  It is extremely easy to test a path through a publicly facing system.

The report was good enough; you get screen captures of pages that have failures, and they do a good job of making some of the performance data pretty and intelligible.  Again, I didn’t really load test anything, so I didn’t examine the report as closely as I would have in a real world scenario.  The service is built using Selenium, and I believe they allow you to upload full featured Selenium tests (useful if you have already invested in this technology, but don’t want to build out a cloud network).

This service is of particular interest to me because last year I was part of a project that built a Selenium grid on Amazon EC2, using these instructions.

If we’d known about BrowserMob, I’m not sure we would have used them, as I don’t know what our budget was, but it would have been nice to have that in the evaluation mix.

[tags]browsermob, cloud services,load testing[/tags]

In source your EC2 instances

If you have built a killer application on Amazon Web Services, you may reach a point where you don’t want to continue to use them.  I can think of any number of reasons you may want to migrate your servers.

It may be because you’ve reached the 20 server instance limit, or because you want more control, or because you want to buy your own machines and spend money on a system administrator instead of paying Amazon, or because there’s something that you need customized that’s ‘behind the curtain’ of AWS.

For whatever reason, if you decide to move off Amazon’s elastic compute cloud, you probably should take a look at Eucalyptus (thanks to George Fairbanks for pointing this out to me!).  From the overview, this is an AWS compatible environment, so you can continue to use the same tools (capistrano!) to manage your instances.  You also gain the same abilities to spin up or spin down servers easily.

What you don’t get is AMI compatibility.  That is, you can’t transfer your AMI to a Eucalyptus server farm and expect it to run.  They have a FAQ about AMIs (for 1.5, which is an older version of the software) that points to some forum posts about turning an AMI into an EMI (Eucalyptus Machine Image), but it doesn’t look like a trivial or easy operation.

Still, it’s good to know that it is possible, and that a company can have a migration path off AWS if need be.

[tags]eucalyptus, open source, freedom in the cloud[/tags]

Tips: Deploying a web application to the cloud

I am wrapping up helping a client with a build out of a Drupal site to EC2. The site itself is a pretty standard CMS implementation–custom content types, etc. The site is an extension to an existing brand, and exists to collect email addresses and send out email newsletters. It was a team of three technical people (there were some designers and other folks involved, but I was pretty much insulated from them by my client) and I was lucky enough to do a lot of the infrastructure work, which is where a lot of the challenge, exploration and experimentation was.

The biggest attraction of the cloud was the ability to spin up and spin down extra servers as the expected traffic on the site increased or decreased. We chose Amazon’s EC2 for hosting. They seem a bit like the IBM of the cloud–no one ever got fired, etc. They have a rich set of offerings and great documentation.

Below are some lessons I learned from this project about EC2. While it was a Drupal project, I believe many of these lessons are applicable to anyone who is building a similar system in the cloud. If you are building a video processing super computer, maybe not so much.

Fork your AMI

Amazon EC2 running instances are instantiations of a machine image (AMI). Anyone can create a machine image and make it available for others to use. If you start an instance off an image, and then the owner of the image deletes the image (or otherwise removes it), your instance continues to run happily, but, if you ever need to spin up a second instance off the same AMI, you can’t. In this case, we were leveraging some of the work done by Chapter Three called Project Mercury. This was an evolving project that released several times while we were developing with it. Each time, there was a bit of suspense to see if what we’d done on top of it worked with the new release.

This was suboptimal, of course, but the solution is easy. Once you find an AMI that works, you can start up an instance, and then create your own AMI from the running instance. Then, you use that AMI as a foundation for all your instances. You can control your upgrade cycle. Unless you are running against a very generic AMI that is unlikely to go away, forking is highly recommended.

Use Capistrano

For remote deployment, I haven’t seen or heard of anything that compares to Capistrano. Even if you do have to learn a new scripting language (Ruby), the power you get from ‘cap’ is fantastic. There’s pretty good EC2 integration, though you’ll want to have the EC2 response XML documentation close by when you’re trying to parse responses. There’s also some hassle involved in getting cap to run on EC2. Mostly it involves making sure the right set of ssh keys is in the correct place. But once you’ve got it up and running, you’ll be happy. Trust me.

There’s also a direct capistrano/EC2 integration project, but I didn’t use that. It might be worth a look too.


If you are doing any kind of database driven website, there’s really no substitute for persistent storage. Amazon’s Elastic Block Storage (EBS) is relatively cheap. Here’s an article explaining setting up MySQL on EBS. I do have a friend who is using EC2 in a different, very write intensive manner, and who is having some performance issues with his database on EBS, but for a write seldom, read often website, like this one, EBS seems plenty fast.

EC2 Persistence

Some of the reasons to use Capistrano are that it forces you to script everything, and makes it easy to keep everything in version control. The primary reason to do that is that EC2 instances aren’t guaranteed to be persistent. While there is an SLA around overall EC2 availability, individual instances don’t have any such assurances. That’s why you should use EBS. But, surprisingly, the EC2 instances that we are using for the website haven’t bounced at all. I’m not sure what I was expecting, but they (between three and eight instances) have been up and running for over 30 days, and we haven’t seen a single failure.

Use ElasticFox

This is a Firefox extension that lets you do every workaday task, and almost every conceivable operation, to your EC2 instances. Don’t delay, use this today.

Consider CloudFront

For distributed images, CloudFront is a natural fit. Each instance can then reference the image, without you needing to sync files across instances. You could use this for other files as well.

Use Internal Network Addressing where possible

When you start an EC2 instance, Amazon assigns it two IP addresses–an external name that can be used to access it from the internet, and an internal name. For most contexts, the external name is more useful, but when you are communicating within the cloud (pushing files around, or a database connection), prefer the internal DNS. It looks like there are some performance benefits, but there are definitely pricing benefits. “Always use the internal address when you are communicating between Amazon EC2 instances. This ensures that your network traffic follows the highest bandwidth, lowest cost, and lowest latency path through our network.” We actually used the internal DNS, but it makes more sense to use the IP address, as you don’t get any abstraction benefits from the internal DNS, which you don’t control–that takes a bit of mental adjustment for me.

Consider reserved instances

If you are planning to use Amazon for hosting, make sure you explore reserved instance pricing. For an upfront cost, you get significant savings on your runtime costs.

On Flexibility

You have a lot of flexibility with EC2–AMIs are essentially yours to customize as you want, starting up another node takes about 5 minutes, you control your own DNS, etc. However, there are some things that are set at startup time. Make sure you spend some time thinking about security groups (built in firewall rules)–they fall into this category. Switching between AMIs requires starting up a new instance. Right now we’re using DNS round robin to distribute load across multiple nodes, but we are planning to use elastic IPs, which allow you to remap a routable IP address to a new instance without waiting for DNS timeouts. EBS volumes and the instances they attach to must be in the same availability zone. None of this is groundbreaking news; it’s really just a matter of reading all the documentation, especially the FAQs.


Be aware that there is a ton of documentation, one set for each API release, for EC2 and the other web services that Amazon provides. Rather than starting with Google, which often leads you to an outdated version of documentation, you should probably start at the AWS documentation center. This is especially true if you’re working with any of the newer systems, whose APIs are perhaps not as stable.

In the end

Remember that, apart from new tools and a few catches, using EC2 is not that different than using a managed server where you don’t have access to the hardware. The best document I found on deploying drupal to EC2 doesn’t talk about EC2 at all–it focuses on the architecture of drupal (drupal 5 at that) and how to best scale that with additional servers.

[tags]ec2,amazon web services,capistrano rocks[/tags]

Setting variables across tasks in capistrano

I am learning to love capistrano–it’s a fantastic deployment system for remote server management.  I’m even learning enough ruby to be dangerous.

One of the issues I ran into was I wanted to set a variable in one task and use it in another (or, more likely, in more than one other task).  I couldn’t find any examples of how to do this online, so here’s how I did it:

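task :set_var do
  localvar = "some value"  # placeholder; the original post did not show where localvar came from
  self[:myvar] = localvar
end

task :read_var do
  puts self[:myvar]
end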

Note that myvar and localvar need to be different identifiers–“local variables take precedence”.  Also, the variable can be anything, I think.  I use this method to create an array in one task, then iterate over it in another.

[tags]capistrano, remote deployment, ruby newbie[/tags]

Amazon AMI search

It’s interesting to me that there is no Amazon Machine Image (AMI) search.  AMIs are virtual machine images that you can run on EC2, Amazon’s cloud computing offering.  Sure, you can browse the list of AMIs, but that doesn’t really help.  Finding an image seems to be haphazard, via a Google search (how I found this Alfresco image) or via the community around a product on an image (like this image for Pressflow, a high performance Drupal).

I’m not the only person with this complaint.  The Amazon EC2 API only provides limited data about various images, but surely some kind of search mechanism wouldn’t be too hard to whip up, if only on the image owner and platform fields.

Does anyone know where this exists?  My current best solution for finding a specific AMI is to use the fantastic ElasticFox Firefox plugin and just search free form on the ‘Images’ tab.

[tags]amazon, ec2, can I get a ‘search search'[/tags]

Notes from Tom Malaher’s cloud computing presentation

A former colleague, Tom Malaher, did an online presentation about cloud computing on Mar 11 at the Calgary JUG.  You can view the recording of it now.  It was titled: Cloud Computing and Amazon Web services (AWS), and was a great survey of cloud computing and then a nice dive into AWS.  I used to work with Tom and always enjoy the depth and breadth of his presentations.

Below are some of my notes.

  • This was their first online meeting, due to cash flow issues (lack of sponsorship), and to make it easier for speakers from outside the Calgary area.  It was put on using Elluminate (this client was installed using JNLP; very easy to install and set up).  You can use Elluminate for up to three participants for free (but you cannot record your session).
  • The definition of cloud computing is in a tug of war in vendor land.  According to the Infrastructure Executive Council, cloud computing is elastic, multi-tenant, on-demand, metered by usage (no long term contracts), and self service.

Tom outlined a number of variations on cloud computing:

  • Infrastructure as a service (s3, ec2)
  • Platform as a service (Google app engine, Microsoft Azure)
  • Software as a service (Google docs,
  • Grid computing–more homogenous, but lots of overlap

Diving into Amazon Web Services, he outlined all the webservices that Amazon provides.  I had already heard of a number of these, but two caught my eye:

  • DevPay–pass through payment for Amazon Web Services.
  • Public Data Sets–public domain data sets easily available for computation on the AWS platform

Composing AWS services makes sense, since there are no bandwidth charges between Amazon service calls within Amazon’s data centers (e.g. EC2->S3).

He had some interesting figures from the IEC: 70% of those surveyed are not using cloud computing (40% aren’t even considering it).  Only 10% are hosting an ‘app’ on the cloud (with no definition of an app).  I asked Tom what is considered an app.  I have a client who is hosting backups and images on S3, and friends who regularly back up servers to S3.  Is that an ‘app’?  I don’t think so, but Tom didn’t have a definition of ‘app’ for this survey.

Tom also did an interesting cost analysis when he was looking at pros and cons for AWS.

The high end hosting agreement: 1 GB RAM, 50 GB hard drive, 2000 GB transfer: $59/month.

For a comparable AWS instance, with an EC2 image, 1.7 GB RAM, 160 GB hard drive (ephemeral), 2000 GB transfer, persistent 50 GB hard drive: worst case $479.50/month, but for one day: ~$16.

In my opinion, this is the key con of AWS right now, at least for full fledged applications. It’s simply not cost competitive with some of the hosting you can find out there.
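
As a rough check on those numbers, this sketch converts the worst case monthly AWS figure to a daily one and compares it with the $59/month hosting plan.  The 30 day month is my assumption; the dollar figures are the ones quoted above.

```javascript
// Figures quoted from the presentation notes above (USD per month).
const hostingPerMonth = 59.0;       // high end hosting agreement
const awsWorstCasePerMonth = 479.5; // comparable AWS setup, worst case

// Assumption: a 30 day month, used only to turn the monthly figure into a daily one.
const daysPerMonth = 30;

const awsPerDay = awsWorstCasePerMonth / daysPerMonth;
// Number of AWS days you can buy for the price of one month of hosting.
const breakEvenDays = hostingPerMonth / awsPerDay;
// How many times more expensive a full month on AWS is.
const costRatio = awsWorstCasePerMonth / hostingPerMonth;

console.log("AWS per day: $" + awsPerDay.toFixed(2));
console.log("Break-even days per month: " + breakEvenDays.toFixed(1));
console.log("AWS month vs. hosting month: " + costRatio.toFixed(1) + "x");
```

With these inputs the daily figure comes out at about $16, matching the estimate above, and the hosting plan is cheaper once you need the server for more than roughly four days a month.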

And with regular hosts, you don’t have to deal with as much infrastructure overhead. Tools like ElasticFox and S3Fox can help.  I’ve used S3Fox and love it.

The development model is surprisingly similar (Tom mentioned building his demo on his home machine and using some of the more exotic services, like SQS; then, when he was ready for the full cloud deployment, he just moved his war file to the appropriate image after some setup).

Then Tom demoed an app built by composing a number of Amazon web services.  Starting an EC2 machine image (AMI) takes a long time (but still less than building a machine from scratch :).  For the entire presentation and demo (1 hour, 3 instances, some messaging), he was charged only 50 cents.

Other interesting uses: The NY Times used it to build a bunch of web friendly PNGs from TIFFs of papers past.

You can use a regular RDBMS, with Elastic Block Storage.

Someone asked: where does AWS fit in larger organizations?  Tom thought it was a good fit for small organizations…  But he was not really sure about large organizations.

In my opinion, many of the technical decision makers I know are willing to use S3 as a storage mechanism, but they still want a backup solution, in case Amazon is unavailable (as it sometimes is).  This unavailability would be even more damning if you had an entire webapp running off ec2 and the other services.

Buying your own dedicated server has its own risks, but many people are still used to that paradigm.  But, for quickly scaling, or for a special one time project that needs a lot of firepower (like the NYTimes project above), AWS makes sense.

Stepping back from AWS, the idea of cloud computing seems to be continuing to make progress and attack the issues of network connectivity, security and cost that make it a hard sell at the present.  I love the delineation of the variations (infrastructure as a service, etc), and not all cloud computing will look like AWS.

Overall, a great presentation.  If you have the time (I stayed for some of the Q&A, and left at the 90 minute mark), it’s worth a listen. Go ahead, check it out.

[tags]cloud computing,cjug[/tags]

© Moore Consulting, 2003-2021