
Building a System with IronMQ and Python

Photo by andrewrennie

One of my most recent projects was writing a system to deliver real estate listing data to a content management system. The CMS was not in my control. Since the listing data source was bursty and I wasn’t sure how the CMS would handle the load, I decided to use a message queue, where the messages would have a JSON payload. Message queues are great at decoupling components of a system.

For the queue, I used IronMQ. The company was already using it, it has a free tier (up to 24 messages a second), the service has been stable and reliable, it has great language SDKs, and setting up a durable message queue is something I’d rather outsource. (I do wish Zapier supported it.) In other situations (when posting messages from mobile apps), we ran Varnish in front of IronMQ so that it could be replaced easily. In this case we didn’t, because there were fewer moving pieces (it was server to server communication, and it would be easier to swap out IronMQ later should that be required).

I wrote the bridge code from the listing database to the message queue in python. The shop was mostly Java and some python, and python seemed a better fit for a small ‘pull from here, push to there’ application. I used pytest for unit testing, jenkins to run the unit tests in a CI environment, and autopep8 for formatting. My colleague was a more experienced python programmer, so I was able to lean on him for questions. I didn’t find python hard to pick back up (I’d scripted in python a little years ago), and it was a fun language to code in. Reminded me of perl w/r/t packages and quick developer feedback. I did miss Java’s dependency management, though (my colleague recommended virtualenv as a possible solution).

The JSON payload would allow developers writing the message consumer to use almost any language they wanted–any language if they used the IronMQ REST API rather than an SDK.
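The push side of the bridge boils down to something like the sketch below. This assumes the iron_mq python client and its IronMQ/queue/post interface; the queue name and the listing fields are invented for illustration.

import json

from iron_mq import IronMQ

mq = IronMQ(project_id="PROJECT_ID", token="TOKEN")  # credentials from your Iron.io account
queue = mq.queue("listings")  # invented queue name


def push_listing(listing):
    # Serialize the listing dict to a JSON payload and post it to the queue.
    payload = json.dumps({
        "listing_id": listing["id"],  # invented field names
        "price": listing["price"],
        "status": listing["status"],
    })
    queue.post(payload)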

I can’t share the code, but for this kind of problem, python was a great solution. And I’ll reach for IronMQ any time I need a message queue. This pair of technologies was quick to implement and easy to deploy, and high performance wasn’t really a requirement, since the frequency of the listing delivery was the real bottleneck.

On the benefits of a private, internal API

Photos by Double–M

A few years ago, the company for which I worked went through the monumental task of defining neighborhoods for a number of cities in the area where they had real estate agents.  Neighborhood data is hard to get, and this task required a lot of back and forth between the person responsible for the mapping and the people who knew the neighborhoods.  The maps were captured in Google’s My Maps feature, and exported as KML to a vendor who would then build neighborhood pages and maps with the data.  Much of the neighborhood page would be driven off data entered in an admin back end system (it was a custom CMS, essentially).

Almost as an afterthought, I asked the vendor to provide an API for the neighborhoods, including the polygon data.  I wrote up an API spec, had it reviewed by my team, and obtained approval for the vendor to build it. If I recall, it was in the neighborhood of a couple thousand dollars, and the vendor had never been asked to build something like this before.

This one API allowed the company to apply dearly won neighborhood information in so many ways:

  • generate statistics by neighborhood against any lat/lng coded data
  • tag any geocoded content with neighborhood meta data
  • find new and sold listings by neighborhood
  • understand who the top listing agents were in each neighborhood
  • create internal BI tools
  • write internal recruiting tools
  • pull other geocoded data by neighborhood
  • tag transactions with neighborhood meta data

Many of these were accomplished with a plugin to the data processing tool (Pentaho Kettle) that used the Java Topology Suite. Creating JTS geometries is expensive, so the plugin caches them in a simple hashmap. The plugin’s objects are all garbage collected at the end of each data load run, so this simple cache is appropriate, rather than a more complex LRU cache.
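The plugin itself was Java inside a Kettle step, but the core idea (parse each neighborhood polygon once, cache it, then do point-in-polygon tests) is easy to sketch. Here it is in python with shapely rather than JTS; the names and the WKT source are invented.

from shapely import wkt
from shapely.geometry import Point

# Cache of neighborhood name -> parsed polygon. Parsing is the expensive part,
# so do it once per run; the whole cache goes away when the run ends, so no
# eviction policy is needed.
_polygon_cache = {}


def get_polygon(name, wkt_string):
    poly = _polygon_cache.get(name)
    if poly is None:
        poly = wkt.loads(wkt_string)
        _polygon_cache[name] = poly
    return poly


def neighborhood_for(lat, lng, neighborhoods):
    # neighborhoods is a dict of name -> WKT polygon string (invented format).
    point = Point(lng, lat)  # note the lng/lat (x, y) order
    for name, wkt_string in neighborhoods.items():
        if get_polygon(name, wkt_string).contains(point):
            return name
    return None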

However, this solution isn’t perfect. If a property was on a boundary, the JTS code would often put it in the wrong neighborhood. Neighborhood boundaries are sometimes incorrect or overlap, and points are sometimes off because geocoding isn’t perfect. Human review is still required.

But, the very fact that the neighborhood data was so accessible meant that the company could ask questions (how many homes are in each neighborhood, what are the three newest listings in this neighborhood) that simply couldn’t have been asked if there was no API. Having an internal API that exposed hard won business knowledge within the company was beneficial, even if it will never be exposed or monetized outside the company.

Posting to REST APIs from mysql triggers

This was a fun crazy idea that turned out not to be so good in practice. If you have a mysql table you are monitoring for changes, you can use a trigger to do so (as long as you have a semi-modern version of mysql).

Sometimes you might want to notify another service of any change (remote logging service, message queue, etc). For instance, at 8z, when the price of a listing changes, this is an interesting event that other software should be notified of.

The first step is to install the mysql http UDF (all commands below are for centos).

$ sudo yum install curl-devel

$ CPPFLAGS="-I/usr/include/mysql"  ./configure --with-mysql=/etc/my.cnf   --enable-shared

$ make

This gives us a .la file (built with libtool) but luckily there’s a .so file hiding:

sudo cp .libs/mysql-udf-http.so.0.0.0 /usr/lib64/mysql/plugin/mysql-udf-http.so

Then, your remote rest service will probably want JSON, so you’ll want to visit the mysql udf hub. There lives a JSON function. I just grabbed the C file from github and compiled it:

$ gcc -fPIC -shared -o lib_mysqludf_json.so -I/usr/include/mysql/ lib_mysqludf_json.c

After these are compiled you have to copy the .so files to the appropriate directory and then add functions as a privileged user, per usual UDF convention.

Now you can create a trigger that calls these functions.

delimiter $
CREATE TRIGGER upd_check BEFORE UPDATE ON account
FOR EACH ROW
  BEGIN
    IF NEW.amount > 0 THEN
      -- build a JSON payload from the new row, then POST it to the service
      SET @json = json_object(NEW.account_id, NEW.amount);
      SELECT http_post('http://restservice.example.com/account/post', @json);
    END IF;
  END$

delimiter ;

Note that I didn’t actually implement this, so the code up above is based on my memory. If you try this and have some corrections, please leave them in the comments and I’ll update this post.

Why didn’t I actually implement this, after doing the better part of a day’s worth of research? The database is not the best place for this kind of logic. Error handling for triggers is weak–if the trigger failed for any reason (like the remote service was down), you would need to build an error logging system, or send an email. Also, if there are a number of updated rows, which all trigger outbound http calls, you might run into performance issues which would be difficult to replicate or analyze, and, most importantly, might impact your database’s ability to act as a database. The three tier architecture exists for a reason.
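For comparison, here’s roughly what the same notification looks like when it lives in the application tier instead. This is a python sketch with requests, reusing the table and endpoint from the made-up trigger above; the function and error handling are illustrative.

import logging

import requests

logger = logging.getLogger(__name__)


def update_amount_and_notify(cursor, account_id, amount):
    # Update the row as usual...
    cursor.execute("UPDATE account SET amount = %s WHERE account_id = %s",
                   (amount, account_id))
    # ...then notify the remote service from application code, where a failure
    # can be logged, retried, or queued without affecting the database.
    try:
        resp = requests.post("http://restservice.example.com/account/post",
                             json={"account_id": account_id, "amount": amount},
                             timeout=5)
        resp.raise_for_status()
    except requests.RequestException:
        logger.exception("Notification failed for account %s", account_id)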

But, it was fun to investigate, and, as my colleague said, would have been cool if it had worked. If you are still interested, there’s more on this topic here.

Update Jan 2021: The HTTP UDF is now here and a reader pointed out they had trouble with the above instructions, but that a golang UDF implementation worked fine.

Building an automated postcard mailing system with Lob and Zapier

Courtesy of smoothfluid

I was looking at automated paper mailing systems recently (and listed what I found), and was especially impressed with Lob, particularly the ease of its API.

Among other printing services, Lob will let you mail a postcard with a custom PDF on both sides, or a custom PDF on one side and a text message on the other, anywhere in the USA for $0.94.  (Sorry, not sure about international postcards.) The company for which I work sends out tens of thousands of postcards every quarter. The vendor we use charges a similar fee (less, but in the same ballpark), but there’s a manual process to deliver the collateral and no API. So an on-demand, one-by-one postcard sending system is very interesting to me.

Note that I haven’t received the Lob postcard which I sent myself, so I can’t speak to quality. Yet.

The Lob API is a bit weird, because the request is form encoded rather than a JSON payload.  It also uses basic auth, but only the username, not the password. But the API seems to have all the pieces you’d need to generate all kinds of postcards–reminder postcards, direct mail postcards, photo postcards, etc.
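Here’s roughly what a call looks like from python with requests: a form encoded body, and the API key as the basic auth username with an empty password. The field names below are from memory of the v1 postcards endpoint, so treat this as a sketch rather than gospel.

import requests

LOB_API_KEY = "test_xxxxxxxxxxxx"  # your test or live key

resp = requests.post(
    "https://api.lob.com/v1/postcards",
    auth=(LOB_API_KEY, ""),  # basic auth: key as username, no password
    data={                   # form encoded, not JSON
        "description": "Demo postcard",
        "to[name]": "Jane Doe",
        "to[address_line1]": "123 Main St",
        "to[address_city]": "Boulder",
        "to[address_state]": "CO",
        "to[address_zip]": "80301",
        "front": "https://example.com/front.pdf",  # custom PDF for the front
        "message": "Thanks for reading!",          # or a 'back' PDF instead
    },
)
resp.raise_for_status()
print(resp.json()["id"])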

After testing out the service via the web interface and cURL examples, I thought that it’d be fun to build a Zapier zap. In particular, being able to send a postcard for an entry in a Google spreadsheet seemed like a useful use case. Plus, Zapier is awesome, and I’d wanted to test out their integration environment for myself.

So, I built a Zapier integration for Lob, using the Zapier developer docs in combination with the Lob developer docs. It was actually easy. The most complicated step was translating the Zapier action data, which is a one or two dimensional array of typed data, into the Lob data format, which wanted a couple of text fields and two address arrays. Zapier has a scripting environment that let me modify data from APIs pre and post send, and even had an example about form encoded APIs. Zapier’s JavaScript scripting development environment was full featured, including syntax and error highlighting. It had no real debugging available, but I could use the venerable debug-by-log-statement method fairly easily.

Where could I take this next? Everywhere people use postcards in real life. The postcards depend on PDF files (see a sample), so if you are generating a custom postcard for each interaction things become more complex, but there are a few APIs (based on a 30 second google search, here and here) available for dynamic PDF generation. There are also limits on API call throughput, if I stuck to the Zapier integration–I could send at most 300 postcards a day, unless I managed multiple spreadsheets.

I see reminders of high value events (dentist, house maintenance, etc), contests and marketing as key opportunities for this type of service. And if I had a product where direct mail was a key component, using Lob directly would be worth serious consideration.

Regarding the Zap, I believe I cannot make this Zap available to anyone else. Since I’m not a representative of Lob, I couldn’t commit to maintaining this Zap, and Zapier doesn’t want to have any of their customers depending on an integration that could disappear or be unsupported at any time–a fair position.

If the Zapier or Lob folks want to take this integration and run with it, I’d be happy to share my code–just leave a comment. If anyone else is interested in being able to generate Lob postcards from a Google spreadsheet (or any other compatible API) via Zapier integration, let me know and I’ll see what I can do.

Zapier is awesome!

APIs are sprouting up everywhere. This is great for developers (and for end users) because it allows all kinds of automation. However, there are times when the investment of writing code to connect two APIs is too much.

Why might writing code be too much?

  • The problem is still fluid and writing code will lock in a solution.
  • There are more pressing business problems to solve.
  • There are no engineering resources available, and/or no money to hire a dev.

If any of these reasons apply to a problem you are facing, consider Zapier (or a competitor like IFTTT). These services are much like Excel macros–they require less software engineering expertise but can leverage some of the power of programming to automate away work. I’m only going to write about Zapier, since that is the solution with which I am familiar.

Zapier runs the connections between each service (called ‘Zaps’) at regular intervals. Zaps are built using a web only interface that leverages the APIs in a manner that, while not completely intuitive, is thoroughly manageable by anyone who can sum up a column in Excel.

Here’s the class of problems for which Zapier is good:

  • Connecting two services for which Zapier has connectors–this list is quite extensive.
  • Syncing needs to happen no more than every 5 minutes (15 for the free account).
  • No processing of the data during the transfer is needed (except possibly omitting some fields)–you are simply moving data from one place to another; no Yahoo! Pipes-like transformations are possible in transit.
  • One way sync is OK (though there are workarounds).
  • You don’t need to bulk load initial data via Zapier–you either can disregard initial data or load it in some other fashion.
  • You have a reasonable number of sources and sinks–each linked source and sink will take up one Zap.
  • You have to have valid accounts with each source and sink.

That’s a fair number of limitations, but even so there are a large number of common problems that are solvable by Zapier. Some examples:

  • Syncing a list of contacts from one source to another.
  • Taking a google form submission and adding a user to a mailing list.
  • Moving a row from one Google spreadsheet to another.
  • Taking an email and adding it to a database.
  • Adding a customer who you have just invoiced in QuickBooks to an email list.

These are all types of problems that can be done manually, but if frequency or scale increases, the process can run people ragged.

I want to call special attention to the email processing ability of Zapier. If you have a well-formatted email that you often receive that you want to further process, Zapier can parse it into interesting fields and send that data along to a sink like a Google Spreadsheet. Examples of well-formatted emails include order confirmations, contact us forms, and newsletter subscriptions.

I have found the Zapier support folks very responsive, whether that was troubleshooting an issue with Google docs, finding out how to pronounce the company name, or explaining why having 100 Zaps reading from one Google spreadsheet was a bad idea. Having responsive support staff reassures me, because once Zapier gets embedded into business processes, ripping it out is going to be very painful and a lot of work.

You can also write your own Zaps if you have custom APIs that you’d like to integrate with. I haven’t explored this much–it does seem like a developer centric task.

Zapier is not an all purpose tool nor a total replacement for developers, but it is definitely a great app to have in your toolbox. Take a look and see what little (or big) niggling problem it might solve that you haven’t had time to write code for (or, if you can’t write code, haven’t been able to get an engineer to write code for).

Observations on Writing a Custom Report with Java, Quickbooks, Jasper Reports, Google Spreadsheets and Google Drive

Image courtesy of Sean MacEntee: https://flic.kr/p/9P4CwC

A recently released project uses java and spring to pull data from quickbooks, a mysql database and google spreadsheets, munges the data in various ways, and uses jasper reports and jfreechart to generate a good looking report and a CSV of transactions that will give our brokers weekly updates on how they are doing compared to their goals for the year. I then upload it to Google Drive and send an email notifying each realtor that they have a new file.  It’s always nice to release useful software–feels like a new day dawns.

A few observations from this project:

  • The tech was interesting, but it was actually more interesting to see how the needs of tech drove the business to ‘tighten up’ their processes. Whether that was making sure there was one place to look for user type data, or defining exactly what constituted achieving a goal, or making sure that any new realtors who joined created business goals (and committed them to writing), the binary nature of software forced the business (or, more accurately, people in the business) to make decisions. Just another example of business process crystallization. This also meant deferring some software development. Where the business couldn’t answer a question definitively, rather than force them to do so, we chose to defer development.
  • I’m glad that jasper reports makes it so easy to generate PDFs–you basically create an XML file (I was unable to find a spec, but there were plentiful examples) and then put tokens for dynamic content. Then you compile the XML file, give it a map of said tokens and values (which can be text, numbers, dates or images), and then export the object to PDF. Note that I was not using Jasper in a typical way–reporting from large amounts of similar data via a data connection–because I wanted different PDFs for each user. Perhaps there was a way to do this elegantly, but I was just trying to get stuff done. Creating a grid in jasper was interesting, though.
  • JFreechart had a very weird issue where on stage the graph labels were bolded and italicized, but not on production. Since we make every effort to keep these two environments in sync and they were running exactly the same code, this was a mystery. Finally solved it when we discovered java was different (same version, different vendors: openjdk vs sun java). Had been running this way for years. Oops.
  • Interacting with google spreadsheets is great for the business but a pain in the butt for developers. It’s great for our business because it is extremely easy for someone who is not a programmer to create a ‘database’ that is versioned, backed up, almost always accessible and helps them ‘get stuff done’ in a measured way. It’s a pain for developers because it is a ‘database’ and not a database–no referential integrity or data typing. Also, google provides cell based access and row based access, forcing you to choose. And the java libraries are old. Beats excel though–at least it is accessible from a server. We ended up writing a library to wrap Google’s Java SDK to make some common operations easier.
  • Pushing to google drive is interesting–I alluded to this in my last post, but you have to be ready for failure. I ended up using a BlockingQueue for this–I throw files (a data structure defining the file, actually) to be uploaded on the queue, then consumers executing in a different thread each take one off, try to upload it, and if it fails, put it back on. I considered using a third party durable queue like IronMQ, but thought it was overkill. (There’s a rough sketch of this retry pattern just after this list.)
  • Using the Quickbooks SDK, with all the accounting data exposed, lets you build some pretty powerful graphs that are useful to the business. But the docs are jumbled, with a lot of them aimed at developers who are building integrations to sell to Quickbooks users. Support is OK for standard operations, but for things like renewing your token, you have to drop down to the REST API (see my SO question). This article does a good job of outlining the various projects, but as a dev you’ll have to ignore certain sets of information–never fun when getting up to speed.
  • We do a lot of backend processing, and spring, maven, and a custom assembler that generates a tarball on ‘maven install’ have been great. I also finally figured out how to work with maven to use ‘release:prepare’ and ‘release:perform’ for releasing libraries, as opposed to going my own way, and that has made things much much easier. Learn your tools, folks!
  • I’m once again astounded by the richness of the java library ecosystem. There doesn’t seem to be very much that I can think of doing that doesn’t have at least one, and probably three, java implementations.
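The real upload code was Java (a BlockingQueue with consumer threads and the Google Drive SDK), but the shape of that retry loop is easy to sketch in python with the standard library queue module; the Drive upload itself is stubbed out here.

import queue
import threading

uploads = queue.Queue()


def upload_to_drive(item):
    # Placeholder for the real Google Drive upload call; raise on failure.
    pass


def consumer():
    while True:
        item = uploads.get()   # blocks until a file is available
        try:
            upload_to_drive(item)
        except Exception:
            uploads.put(item)  # upload failed: put it back on to try again later
        finally:
            uploads.task_done()


threading.Thread(target=consumer, daemon=True).start()
uploads.put({"path": "/tmp/report.pdf", "realtor": "jane@example.com"})  # producer side
uploads.join()                 # wait until everything has been uploaded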

Want your next software purchase to ‘talk’ to your existing systems?

I wrote a post over at the Geek Estate Blog (one of the best real estate tech blogs, imho) about APIs and why, if you are a realtor or a broker, you should look for software with an API.

Well, first, let’s talk about what an API is. API stands for “Application Programming Interface”–not very illuminating. What that means is that the software providing the API has a way to allow other software (applications) to interact (interface) with it, with no human involvement. That is, the two software systems can ‘talk’ to each other.

Full post here.

How to access your QuickBooks Online data via API from the command line

As is so often the case with this blog, I’m documenting something that took me a long time to figure out.

The problem: the company I work for has data in QuickBooks Online (QBO) that we’d like to distribute elsewhere in our organization.

The answer: QBO has an API. Yahoo! They have several different SDKs (for .NET, Java and PHP) to make access even easier. (The APIs and surrounding docs are called the Intuit Partner Platform, or IPP.)

However…

The issue is that requests to the QBO API must be authenticated with OAuth (I believe it is OAuth 1.0a). And the entire Intuit Partner Platform documentation is focused on needs of developers who are building SaaS applications that consume or augment QBO data and are accessed via the QBO webapp. Which means there’s a heavy focus on web applications and OAuth flows via web applications.

But all I wanted was a command line client that could use the API’s query interface.

I was stymied by OAuth. In particular, I couldn’t find a way to get the accessToken and accessTokenSecret. I tried a number of different tacks (I beat my head against this for the better part of a day).

But I just couldn’t find a way to generate the needed tokens, with either the PHP or Java SDK clients (most of the links in this post are for the Java client, because that’s more mature–the PHP client doesn’t support JSON output or have reference documentation).

Desperate, I turned to the IPP forums, which were full of advanced questions. I did stumble on this gem: “Simple way to integrate with our Quickbooks Account”, which took me to the OAuth Playground for IPP Developers. And, voila, if you follow the steps in the playground (I used IE because FireFox failed), you will end up with a valid accessToken and accessTokenSecret.

So, with that sad story told, here’s exactly how you can access your own QBO data via the command line (I only cover a trial account here, but I believe the process is much the same with a paid account):

  1. Start your IE browser (I’m using IE 10 on windows 8)
  2. Go to https://developer.intuit.com/us
  3. Sign up (click the ‘join’ link), click ‘remember me’
  4. Go to your email and find the verify link that was sent to you. Paste it into your IE browser.
  5. Sign in again
  6. Click on ‘My Apps’
  7. Click on ‘Create New App’, then ‘QuickBooks API’
  8. Fill out the name of the app, and the other required items. You can change these all later (I think). I know you can change the URLs later.
  9. Select the level of data access you need. Since this is a test app, you can select ‘All Accounting’
  10. Click ‘Save’
  11. Open up another tab in IE and go to the QuickBooks Online site (We are just adding some dummy data here, so if you have an account, you can skip this.)
  12. Click on ‘Free Trial’
  13. Click on ‘QuickBooks Online Plus’
  14. Click on ‘Already have an Intuit user ID’
  15. Fill out the username and password you used when you signed up for your developer account.
  16. Ignore the upsell, if any
  17. Click the customers tab
  18. Click on the ‘new customer’ button
  19. Enter a first name and last name then press save
  20. Open a new tab and go to the API Console
  21. Choose the company that you want to access, and note the number next to that name. That is the company ID or the Realm ID.
  22. Open a new tab and go to the OAuth playground
  23. Go back to the developer.intuit.com tab
  24. Grab your app token (looks like b3197323bda36333b4b5fb17774440fe34d6)
  25. Go to the OAuth playground tab and put your app token in the proper field (called ‘App Token’). You’ll also want to have that later, so note it now.
  26. Click ‘Get Key and Secret using App Token’
  27. Note the consumer key and consumer secret, you’ll need them later.
  28. Click ‘Get Request Token Using Key and Secret’
  29. Click ‘Authorize Request Token’
  30. You should see a message like ‘testapp3 would like to access your Intuit company data’
  31. Click ‘Authorize’
  32. You should see a message like ‘You are securely connected to testapp3’
  33. Click ‘Return to TestApp3’
  34. Scroll down to the bottom, and you should see entries in the ‘Access Token’ and ‘Access Token Secret’ fields. Copy those, as you’ll need them later as well.
  35. Go to the SDKs page of developer.intuit.com
  36. Pick your language of choice, and follow the installation instructions.
  37. Follow the instructions in the ‘Data Service APIs’ section about setting up your environment. For Java, you’ll need to pull a few jar files into your classpath. Here’s my list of jar files in my Eclipse build path: ipp-java-qbapihelper-1.2.0-jar-with-dependencies.jar, ipp-v3-java-devkit-2.0.3-jar-with-dependencies.jar
  38. Write and run a class (like the one below) that runs a query, plugging the values you captured above into the six variables.
import static com.intuit.ipp.query.GenerateQuery.$;
import static com.intuit.ipp.query.GenerateQuery.select;

import com.intuit.ipp.core.Context;
import com.intuit.ipp.core.ServiceType;
import com.intuit.ipp.data.Customer;
import com.intuit.ipp.exception.FMSException;
import com.intuit.ipp.query.GenerateQuery;
import com.intuit.ipp.security.OAuthAuthorizer;
import com.intuit.ipp.services.DataService;
import com.intuit.ipp.services.QueryResult;

public class TestQBO {

	public static void main(String[] args) throws FMSException {
		String consumerKey = "...";
		String consumerSecret = "...";
		String accessToken = "...";
		String accessTokenSecret = "...";
		String appToken = "...";
		String companyId = "...";
		
		OAuthAuthorizer oauth = new OAuthAuthorizer(consumerKey, consumerSecret, accessToken, accessTokenSecret);           
		
		Context context = new Context(oauth, appToken, ServiceType.QBO, companyId);
		
		DataService service = new DataService(context);
		
		Customer customer = GenerateQuery.createQueryEntity(Customer.class);
		
		String query = select($(customer.getId()), $(customer.getGivenName())).generate();
		
		QueryResult queryResult = service.executeQuery(query);
	
		System.out.println("from query: "+((Customer)queryResult.getEntities().get(0)).getGivenName());      
	}
}

This code gets the first name and id of the first customer in your database, and prints it to stdout. Obviously just a starting point.

I am also not sure how long the accessToken and accessTokenSecret are good for, but this will actually give you command line access to your QBO data.

(Of course, I could have just used zapier, but that has its own limitations, limited ability to query data in an ad hoc manner being the primary one.)

Google Spreadsheets as REST API sources

I recently built out a read only JSON API with a Google spreadsheet as the back end data source.

Why do such a thing? I didn’t need to modify the back end, but wanted to make the data available to other software. The people who maintain this data are very comfortable using Google spreadsheets. While I could have written a custom CRUD app, this didn’t seem like a good use of time when Google spreadsheets had served us well in the past.

How did I do this? I first of all created a sanitized spreadsheet, using importrange and regexreplace. This level of indirection assures me that if the source data changes, I can adjust fairly easily. If the user managing the spreadsheet wants to rearrange columns, I can adjust my sanitized spreadsheet easily.

Then I created a Google apps script and used doGet to respond to requests and the spreadsheet service to retrieve the data. The content service lets you serve JSON with the appropriate mime type. I used qunit for Google apps script (invaluable when you are working in the loosely typed javascript world and relying on cloud resources). I also worked with the parameters to build the querying needed for our application.

Then, in order to make it look like a normal API call, I fronted the script with Varnish and did some regsub magic in my VCL file (as well as some light authentication).
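From the client’s side it then looks like any other read only JSON endpoint. A trivial python consumer might look like this (the hostname and query parameter are invented):

import requests

resp = requests.get("https://api.example.com/data", params={"city": "Boulder"}, timeout=10)
resp.raise_for_status()  # requests follows the Apps Script redirect automatically
data = resp.json()
print(data)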

This approach has the benefit of keeping everything in Google’s cloud, and allowing you to access the spreadsheet data easily.

This approach has significant limitations, however.

  • Google apps scripts calls are slow, especially when accessing spreadsheet data. Using the cache service can help.
  • You cannot return anything other than a 200 response code. None of the other response codes are available.
  • The actual content is served after a redirect, so caching it at the Varnish level is difficult (though possible), and clients must be able to follow redirects.
  • Google changes the ip of the server running the script. This is not such a big problem, unless your version of Varnish only takes IP addresses in the VCL file, not hostnames. Like ours.

Was this a good idea? Well, it let me build out the API relatively quickly without affecting the users managing the data or finding any other place to put it. But we’ll probably move away from this due to the limitations listed above. One we’ve found particularly painful is the IP address switching, which usually only shows up in our automated testing.

We’ll probably start pushing the data daily (it doesn’t change all that often) to a local JDBC database using the JDBC service and use either RestSQL or DropWizard to generate an API for it. (RestSQL is quicker, but DropWizard lets us maintain format compatibility.)

A tutorial for the Census Data API

I’ve been exploring APIs ever since attending the hackathon at Gluecon last year.  Since I work for a real estate firm, I’m interested in APIs with a geographic component.  I had heard that USA Today had APIs for census data.  They do.  I also wondered if the census bureau had any APIs available.  They do.  However, the USA Today APIs are much much friendlier.

But the USA Today APIs have some limitations.  Also, the data sets of these two APIs intersect, but neither is a subset of the other.  So, I thought it’d be useful to have a quick start for the census data API.

  1. First, register for an API key.  This only requires an email address and a name.  The key got sent to my spam folder, so make sure you check that if you don’t see the email about five minutes after you submit.
  2. Decide what type of information you want to find.  There are two sets of information–the 2010 census summary file or the American community survey (for the last five or six years).  Since the ACS is not available in the USA Today API, that’s what I’ll explore, but the methodology is similar for the summary file.
  3. The first place to look is the variables XML file showing you what data is available.  Here is the ACS variables file for the most recent five years, and here is the census summary variables file.  For a high level overview of what you can find, you’ll want to focus on the ‘concept’ tags.  You’ll see names like “C24070.  Industry by Class of Worker for the Civilian Employed Population 16 Years and Over” and “B24123.  Detailed Occupation by Median Earnings for the Full-Time, Year-Round Civilian Employed Female Population 16 Yrs and Over”.  While the Census has documents explaining what each of these mean, this website seems to have aggregated the metadata that will let you explore the types of data available.
  4. OK, you’ve found your interesting data.  I’m going to look at the concept “B08126.  MEANS OF TRANSPORTATION TO WORK BY INDUSTRY” for the rest of this example.  You need to pick a variable from the concept, so I’ll pick B08126_003E/Construction and some subsets: B08126_018E/Car, truck, or van – drove alone, B08126_033E/Car, truck, or van – carpooled, and B08126_048E/Public transportation (excluding taxicab). As you see, you can request multiple variables at once.
  5. Now you need to find the area that you want to look at.  The census developer site has examples which I found to be massively confusing when I first looked at them.  What you need to do is decide what level of geography you want, and then find the appropriate code.  Here are state codes.  If you want anything more detailed, you’ll probably want a FIPS code, which ranges from the state level down to the smallest areas for which Census data is gathered.  Here’s a list of valid geographic delineations.  You may already have the information if you are interested in a county, but if all you have is a lat/lng (as if, say, you are looking at a property listing), you can look up the FIPS code.  If you need to get a lat/lng for a city, I’d suggest google.
  6. I’m going to look up Lafayette, CO, which google tells me is at 39.9936, -105.0892.  So, the FIPS request looks like http://data.fcc.gov/api/block/find?latitude=39.9936&longitude=-105.0892&showall=false&format=json and reveals that the FIPS code for this lat/lng is 080130608005010.
  7. A FIPS code is broken into sections.  The first two numbers are the state, so 08 is Colorado’s code.  The next three are the county, so 013 represents Boulder county.  The next six are the tract code, so 060800 is the tract that we are looking at (tracts “provide a stable set of geographic units for the presentation of decennial census data”).  You can see tracts on maps, and our specific tract [pdf].  After the tract comes the block, which is an even smaller subsection of homes.  The block is 5010.  Apparently FIPS codes are on their way out, in favor of GNIS codes.  I couldn’t find a lat/lng to GNIS code service, however.
  8. Now you know what area you are looking at.  Rather than the county, I want to examine census data based on the tract I’ve found (good old tract 060800).
  9. Next step is to build the URL for the API request.  Here’s where the examples come in handy.  For tract level data for the ACS, we look at this example, and see the format of the call: http://api.census.gov/data/2011/acs5?get=[the data code that you found above]&for=tract:[tract code or *]&in=state:[state code]+county:[county code]&key=[your key]
  10. Therefore, my API request looks like http://api.census.gov/data/2011/acs5?get=B08126_003E,B08126_033E,B08126_018E,B08126_048E&for=tract:060800&in=state:08+county:013&key=… and I get something like this: [["B08126_003E","B08126_033E","B08126_018E","B08126_048E","state","county","tract"],["309","57","188","0","08","013","060800"]], stating that 309 construction workers in this tract answered the transportation question, that none of them took public transportation, and that more drove alone than carpooled.  (The same two calls, made from python, are sketched just after this list.)
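Pulling steps 6 through 10 together, here’s a small python script that makes both calls with the requests library. The URLs and variables are the ones from the steps above; the 'Block'/'FIPS' field names in the FCC response are from memory, so double check them against a real response.

import requests

CENSUS_KEY = "your-census-api-key"

# Step 6: look up the FIPS code for a lat/lng via the FCC block API.
fcc_url = ("http://data.fcc.gov/api/block/find"
           "?latitude=39.9936&longitude=-105.0892&showall=false&format=json")
block_fips = requests.get(fcc_url).json()["Block"]["FIPS"]  # e.g. 080130608005010

# Step 7: split the FIPS code into state, county, and tract.
state, county, tract = block_fips[0:2], block_fips[2:5], block_fips[5:11]

# Steps 9 and 10: query the ACS five year data for that tract.
acs_url = ("http://api.census.gov/data/2011/acs5"
           "?get=B08126_003E,B08126_033E,B08126_018E,B08126_048E"
           "&for=tract:" + tract +
           "&in=state:" + state + "+county:" + county +
           "&key=" + CENSUS_KEY)
rows = requests.get(acs_url).json()
header, values = rows[0], rows[1]
print(dict(zip(header, values)))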

Note that I ignored the margin of error fields, but they are there.  Now that you have the data, you can use something like d3.js to display it.

Note that the TOS of the Census APIs don’t mention explicit limits but if you are planning to make a lot of calls, consider downloading the full dataset and using that instead.

Finally, you can sign up for the Census API developer forums, though this seems to be more of a feature request forum and less of a technical help arena.  And there doesn’t appear to be anonymous browsing of forums–you have to sign in to view anything.

Update 1/22: From the developer forums comes the comment that the forums are meant to be both technical support and feature requests.