Don’t forget to deactivate inactive items in easyrec

I wrote before about easyrec, a recommendation system with an easy-to-integrate JavaScript API, but just recently realized that I was still showing inactive items as ‘recommended’.  This is because I was marking items inactive in my database, but not in my easyrec system.

Luckily, there’s an API call to mark items inactive.  You could of course manually log in and mark them as inactive, but using the API and a bit of SQL lets me run this check for all the items.  Right now I’m just doing this manually, but will probably put it in a cron job to make sure all inactive items are marked so in easyrec.

Here’s the SQL (escaped so it doesn’t wrap):

select concat('wget "http://hostname/api/1.0/json/setitemactive?apikey=apikey&tenantid=tenantid&\
active=false&itemtype=ITEM&itemid=',id,'"; sleep 5;')
from itemtable where enabled = 0;

I have an item table that looks like this:

CREATE TABLE `itemtable` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
...
`enabled` tinyint(1) NOT NULL DEFAULT '1',
...

Where enabled is set to 0 for disabled items and 1 for enabled items. The id column is numeric and also happens to be the easyrec item id number, in a happy coincidence.

The end output is a series of wget and sleep commands that I can run in the shell. I added in the sleep commands because I’m on the demo host of easyrec and didn’t want to overwhelm their server with updates.
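
Since this will eventually live in a cron job, here’s a minimal sketch of the same check as a standalone Python script. It assumes the requests and MySQLdb packages, and uses the same placeholder hostname, apikey and tenantid as the wget command above:

import time

import MySQLdb   # assumption: the item table lives in MySQL
import requests  # assumption: pip install requests

# Placeholder values, same as in the generated wget commands above.
EASYREC_URL = "http://hostname/api/1.0/json/setitemactive"
BASE_PARAMS = {"apikey": "apikey", "tenantid": "tenantid",
               "active": "false", "itemtype": "ITEM"}

conn = MySQLdb.connect(host="localhost", user="user",
                       passwd="password", db="mydb")
cursor = conn.cursor()
cursor.execute("select id from itemtable where enabled = 0")

for (item_id,) in cursor.fetchall():
    # one API call per disabled item, mirroring the wget commands
    requests.get(EASYREC_URL, params=dict(BASE_PARAMS, itemid=item_id))
    time.sleep(5)  # don't overwhelm the easyrec demo host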



Dropwizard vs Spring Boot

Photo by seanmcgrath

I just rolled off a project where I chose to use Spring Boot to create a number of microservices.  I have also written a number of Dropwizard services, and wanted to compare the two while they were fresh in my mind.

They have a number of similarities, of course.  Both Spring Boot and Dropwizard create standalone jarfiles that can be deployed without needing a container.  Both favor convention over configuration and want to help you do the right thing out of the box.  Both are Java based.  Both have monitoring, health checks, logging and other production niceties built in.  Both are opinionated, making a lot of choices for the developer rather than forcing the developer to choose.  Both make it easy to leverage existing libraries.  Both have a focus on performance.

However, there were a number of reasons I chose Spring Boot over Dropwizard for the recent project, and these highlight the differences.  The first is that dependency injection is built into Spring Boot in a way that it simply isn’t with Dropwizard.  Of course, there are third party solutions for bolting DI onto Dropwizard, but we also needed a Java DI framework that would handle lifecycle events, which pretty much means Spring.  The second is that this project wasn’t all about REST services; while Dropwizard has some support for other types of services, it is really designed as a performant HTTP/REST layer, and almost all the questions about Dropwizard online are about REST and APIs.  Spring Boot, on the other hand, aims to provide support for a plethora of different types of services.  Plus, you can leverage the rest of the large Spring codebase.

There are some other differences as well.  Dropwizard uses shading to build fat jars while Spring Boot uses nested jars.  As far as support, Spring, as usual, wins on the documentation front, with loads of accurate docs.  But Dropwizard definitely has a larger community around it (compare the activity of the DW google group to the Spring Boot forums).

If you are writing a REST API in Java, Dropwizard is a great choice (here’s a review of other options I did a few months ago).  If you want to build microservices that integrate with other types of components (queues, nosql, etc), Spring Boot is what I’d recommend.

Update 12/8: Per this tweet, the Spring forums aren’t used because of spam, but you can find plenty of support on StackOverflow with questions tagged ‘spring-boot’.


Lob Postcard Review

A few months ago, I wrote a Zapier app to integrate with the Lob postcard API. I actually spent the 94 cents to get a postcard delivered to me (I paid 24 cents too much, as Lob has since dropped their price). The text of the postcard doesn’t really matter; it was an idea I had for a SaaS that would verify someone lived where they said they lived, using postal mail. Here are the front and back of the postcard (address is blacked out).

Lob sample postcard, address side

Lob sample postcard, front (from PDF)

Here is the PDF that Lob generated from both a PDF file I generated for the front (the QR code was created using this site) and a text message for the back.

A few observations about the postcard.

  • The card is matte and feels solid.
  • The QR code is smudged, but still works.
  • The text message on the back appears a bit closer to the edge on the actual postcard than it does on the PDF image.
  • The front of the postcard appears exactly as it was on the PDF.
  • It took about 5 business days (sorry, working from memory) for delivery.

So, if I were going to use Lob for production, I would send a few more test mailings to make sure that the smudge was a one-off and not a systemic issue. I would definitely generate PDFs for both the front and back sides–the control you have is worth the hassle. Luckily, there are many ways to generate a PDF nowadays (including, per Atwood’s Law, javascript). I also would not use it for time sensitive notifications. To be fair, any postal mail has this limitation. For such notifications, services like Twilio or email are better fits.

In the months since I discovered Lob, I’ve been looking for a standalone business case. However, business needs that are:

  • high value enough to spend significant per notification money and
  • slow enough to make sending mail a viable alternative to texting or emailing and
  • split apart from a larger service (like dentist appointment scheduling)

seem pretty few and far between. You can see a short discussion I kicked off on Hacker News.  However, Lob has raised plenty of money, so the company doesn’t appear to be going anywhere soon.

But the non-standalone business cases for direct postcard mail are numerous (just look in your mailbox).


Java REST API Framework Options

Photo by shioshvili

I’ve been working with a couple of REST API solutions that exist in the Java tech stack.  I haven’t seen any great analysis of REST API solutions (though Matt Raible does mention some in this exhaustive slide deck about Java frameworks [pdf]), so wanted to share my on the ground experience.

First up is restSQL.  This framework makes it easy to get data from a database to a JSON or XML REST API and back.  If you have a servlet container available, you write two configuration files, one with a SQL query and one with db connection information, and you have a RESTful API.  For prototyping and database access, it is hard to beat.

Pros:

  • Quick to set up
  • Only SQL knowledge is required
  • No programming required
  • Allows simple mapping of db table to resource, but can include one-to-one and one-to-many mappings
  • Supports all four REST operations out of the box
  • Supports XML as well as JSON
  • Is an embeddable java library as well as a standalone framework
  • Project maintainer is engaged and the project is moving forward

Cons:

  • Requires a servlet engine, and you have to restart it for changes to your configuration to be picked up
  • Output format has limited customization
  • Only works with mysql and postgresql databases (though there is some experimental support for Oracle and MS SQL)
  • Doesn’t work with views
  • The security model, while fine grained, isn’t modern/OAuth (this can be solved with an API gateway like 3scale, Tyk or ApiAxle, or a proxy)

The next framework I have experience with is Dropwizard.  This is a powerful framework that creates uberjars that you can run on any port as a standalone service.  It’s not limited to providing a JSON representation of database tables–if you can create a Java object, Dropwizard can serve it up as a JSON resource.

Pros:

  • Community support
  • Extreme output formatting flexibility, but be prepared to write a custom deserializer if you want to handle anything other than reads of custom formatted objects
  • Supports any database that hibernate supports
  • Built in testing support
  • Brings together ‘best of breed’ tools like Jersey, Jackson and Hibernate, so you don’t have to do the integration yourself
  • Great documentation

Cons:

  • Have to roll your own deployment solution (tarball, chef, puppet)
  • No service startup script provided
  • Shading can slow down development
  • Not yet at 1.0 release

The last framework, Sparkjava, I haven’t used myself, but a colleague used it in the past.  It is a lightweight framework that fits when you have an existing Java library with functionality you want to expose.  I’m not competent to write pros/cons for this framework, but wanted to mention it.

The gorilla in the room that I haven’t had experience with (in terms of writing RESTful web services) is Spring.  I would definitely include this in any greenfield solutions review.


I Want to Pay You Money! (Except When I Don’t)

Photo by 401(K) 2013

I saw this post from Kin Lane talking about Zapier and how one of the many advantages it has over similar services is its pricing.  I completely agree.  While I like free as much as the next person, when I’m building on something, I want to pay for it, or at least have it monetized in some fashion (Kin has a nice list of ways for API providers to monetize).  Paying for a service means:

  • the company can survive
  • great employees can be paid
  • when I complain, the company has an incentive to listen
  • the value I get from the service is above what I’m paying (aka consumer surplus), if I’m a reasonable facsimile of homo economicus.

All of these are really nice attributes of a technology I’m going to build on (not just ‘date’ as Kin says).  This is an interesting dichotomy, because the fastest way to grow is to provide a free service; then there’s no friction to signing up.

I guess the answer, at least for software products where the marginal cost is very very low, is a freemium offering, like Zapier.  Get the user in, show them how your value proposition works, and then ask them for money when they are hooked.  Just don’t make the freemium level unusable!


Consolidate external dependency notifications using Zapier

Photo by M1key.me

As I wrote over at the Geek Estate Blog, if you build your business on vendors, you should monitor them.  In the past, I’ve used a variety of services to monitor vendor services, from pingdom to wget/cron to nagios.  These services are great about telling you when some external service is unavailable, but are not so hot at telling you when a service is going to be down (for planned maintenance) or back up.

For that, you need to be monitoring, and reading, vendor announcements, however the vendor has decided to provide them, whether that is a blog/RSS feed, Twitter feed, email newsletter, status page or something else.

However, it can be tough to monitor and read announcements in two or more places.  Here, Zapier or a similar service can help.  Pick one place to be notified.  For me, that’s typically an email inbox, because, frankly, other data sources can be ignored (except phone texts), but I’ll always check my email.

Then, use Zapier’s zaps to transform any announcements from the other sources to emails.  For instance, there is an RSS trigger for new items in a feed and a Twitter trigger for tweets from a user.  Status pages often provide RSS feeds (Google’s does).  If the service provider doesn’t provide a structured method like an RSS feed to notify you of changes, but does provide a webpage of announcements, you could look at a service like changedetection.com and have the email sent to your inbox or parsed by Zapier and pushed to your notification location.

And for the output side, you can just use Zapier’s ‘send outbound email’ action.  If you want to have all notifications pushed to your phone, an RSS reader or Twitter account, you can use Zapier to send texts, create RSS items or tweets as well.
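
For the curious, here’s a rough sketch in Python of what a feed-to-email zap is doing for you under the hood.  The feed URL and addresses are hypothetical, and it assumes the feedparser package and a local mail server:

import smtplib
from email.mime.text import MIMEText

import feedparser  # assumption: pip install feedparser

seen = set()  # in a real script, persist this between runs

feed = feedparser.parse("http://status.example.com/feed.rss")  # hypothetical status feed
for entry in feed.entries:
    entry_id = entry.get("id", entry.get("link"))
    if entry_id in seen:
        continue
    seen.add(entry_id)
    # turn the announcement into an email in my inbox
    msg = MIMEText(entry.get("summary", ""))
    msg["Subject"] = "Vendor announcement: " + entry.get("title", "(no title)")
    msg["From"] = "alerts@example.com"
    msg["To"] = "me@example.com"
    smtplib.SMTP("localhost").sendmail(msg["From"], [msg["To"]], msg.as_string())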


Building a System with IronMQ and Python

Photo by andrewrennie

One of my most recent projects was writing a system to deliver real estate listing data to a content management system. The CMS was not in my control. Since the listing data source was bursty and I wasn’t sure how the CMS would handle the load, I decided to use a message queue, where the messages would have a JSON payload. Message queues are great at decoupling components of a system.

For the queue, I used IronMQ. The company already was using it, it has a free tier (up to 24 messages a second), the service has been stable and reliable, it has great language SDKs, and setting up a durable message queue is something I’d rather outsource. (I do wish Zapier supported it.) In other situations (when posting messages from mobile apps), we ran Varnish in front of IronMQ so that it could be replaced easily. In this case, we didn’t because there were fewer moving pieces (it was server to server communication and it would be easier to swap out IronMQ should that be required).

I wrote the bridge code from the listing database to the message queue in python. The shop was mostly Java and some python, and python seemed a better fit for a small ‘pull from here, push to there’ application. I used pytest for unit testing, jenkins to run the unit tests in a CI environment, and autopep8 for formatting. My colleague was a more experienced python programmer, so I was able to lean on him for questions. I didn’t find python hard to pick back up (I’d scripted in python a little years ago), and it was a fun language to code in. Reminded me of perl w/r/t packages and quick developer feedback. I did miss Java’s dependency management, though (my colleague recommended virtualenv as a possible solution).

The JSON payload would allow developers writing the message consumer to use almost any language they wanted–any language if they used the IronMQ REST API rather than an SDK.

I can’t share the code, but for this kind of problem, python was a great solution. And I’ll reach for IronMQ any time I need a message queue. This pair of technologies was quick to implement and easy to deploy, and high performance wasn’t really a requirement, since the frequency of the listing delivery was the real bottleneck.
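
For flavor, here’s a minimal sketch of the shape of that bridge.  The queue, table and column names are hypothetical, and it uses the iron-mq and MySQLdb packages:

import json

import MySQLdb              # assumption: the listing data source is MySQL
from iron_mq import IronMQ  # assumption: pip install iron-mq

# Hypothetical credentials; these normally live in a config file.
mq = IronMQ(project_id="PROJECT_ID", token="TOKEN")
queue = mq.queue("listings")

last_run = "2014-01-01 00:00:00"  # hypothetical high-water mark, persisted between runs

conn = MySQLdb.connect(host="localhost", user="user",
                       passwd="password", db="listingdb")
cursor = conn.cursor()
cursor.execute("select listing_id, price, status from listing where updated_at > %s",
               (last_run,))

for listing_id, price, status in cursor.fetchall():
    # each message is a self-describing JSON payload, so the consumer
    # can be written in any language the CMS team likes
    payload = {"listing_id": listing_id, "price": str(price), "status": status}
    queue.post(json.dumps(payload))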


On the benefits of a private, internal API

Photos by Double–M

A few years ago, the company for which I worked went through the monumental task of defining neighborhoods for a number of cities in the area where they had real estate agents.  Neighborhood data is hard to get, and this task required a lot of back and forth between the person responsible for the mapping and the people who knew the neighborhoods.  The maps were captured in Google’s My Maps feature, and exported as KML to a vendor who would then build neighborhood pages and maps with the data.  Much of the neighborhood page would be driven off data entered in an admin back end system (it was a custom CMS, essentially).

Almost as an afterthought, I asked the vendor to provide an API for the neighborhoods, including the polygon data.  I wrote up an API spec, had it reviewed by my team, and obtained approval for the vendor to build it. If I recall, it was in the neighborhood of a couple thousand dollars, and the vendor had never been asked to build something like this before.

This one API allowed the company to apply dearly won neighborhood information in so many ways:

  • generate statistics by neighborhood against any lat/lng coded data
  • tag any geocoded content with neighborhood meta data
  • find new and sold listings by neighborhood
  • understand who were top listing agents in each neighborhood
  • create internal BI tools
  • write internal recruiting tools
  • pull other geocoded data by neighborhood
  • tag transactions with neighborhood meta data

Many of these were accomplished with a plugin to the data processing tool (Pentaho Kettle) that used the Java Topology Suite. Creating JTS geometries is expensive, so the plugin caches them with a simple hashmap cache. The plugin’s Java objects are fully garbage collected on each data load run, so this simple solution is appropriate, rather than a more complex LRU cache.
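
The plugin itself was Java, built on JTS, but the caching idea is easy to sketch.  Here’s the equivalent in Python using shapely, with a made-up neighborhood polygon standing in for the real boundary data:

from shapely.geometry import Point, shape  # assumption: pip install shapely

# Hypothetical neighborhood boundaries; the real ones came from the
# vendor's neighborhood API as polygon data.
neighborhoods = {
    "Whittier": {"type": "Polygon",
                 "coordinates": [[[-105.28, 40.02], [-105.26, 40.02],
                                  [-105.26, 40.03], [-105.28, 40.03],
                                  [-105.28, 40.02]]]},
}

# Building geometry objects is the expensive part, so build each one
# once and cache it in a plain dict (the analog of the hashmap cache).
geometry_cache = {}

def neighborhood_for(lat, lng):
    point = Point(lng, lat)  # note: shapely points are (x, y), i.e. (lng, lat)
    for name, geojson in neighborhoods.items():
        if name not in geometry_cache:
            geometry_cache[name] = shape(geojson)
        if geometry_cache[name].contains(point):
            return name
    return None  # no match; boundary cases still need human review

print(neighborhood_for(40.025, -105.27))  # Whittier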

However, this solution isn’t perfect. If a property was on the boundary, the JTS code would often put it in the wrong neighborhood. Boundaries of neighborhoods are incorrect or overlap. Points are incorrect because geocoding isn’t perfect. Human review is still required.

But, the very fact that the neighborhood data was so accessible meant that the company could ask questions (how many homes are in each neighborhood, what are the three newest listings in this neighborhood) that simply couldn’t have been asked if there was no API. Having an internal API that exposed hard won business knowledge within the company was beneficial, even if it will never be exposed or monetized outside the company.


Posting to REST APIs from mysql triggers

This was a fun crazy idea that turned out not to be so good in practice. If you have a mysql table you are monitoring for changes, you can use a trigger to capture them (as long as you have a semi-modern version of mysql).

Sometimes you might want to notify another service of any change (remote logging service, message queue, etc). For instance, at 8z, when the price of a listing changes, this is an interesting event that other software should be notified of.

The first step is to install the mysql http UDF (all commands below are for centos; the configure and make steps are run from the unpacked mysql-udf-http source directory).

$ sudo yum install curl-devel

$ CPPFLAGS="-I/usr/include/mysql"  ./configure --with-mysql=/etc/my.cnf   --enable-shared

$ make

This gives us a .la file (built with libtool) but luckily there’s a .so file hiding:

sudo cp .libs/mysql-udf-http.so.0.0.0 /usr/lib64/mysql/plugin/mysql-udf-http.so

Then, your remote rest service will probably want JSON, so you’ll want to visit the mysql udf hub. There lives a JSON function. I just grabbed the C file from github and compiled it: gcc -fPIC -shared -o lib_mysqludf_json.so -I/usr/include/mysql/  lib_mysqludf_json.c

After these are compiled you have to copy the .so files to the appropriate directory and then add functions as a privileged user, per usual UDF convention.

Now you can create a trigger that calls these functions.

delimiter $$
CREATE TRIGGER upd_check BEFORE UPDATE ON account
FOR EACH ROW
  BEGIN
    IF NEW.amount > 0 THEN
      -- build a JSON payload from the updated row's values
      SET @json = json_object(NEW.account_id, NEW.amount);
      -- post it to the remote REST service, capturing the response
      SET @resp = http_post('http://restservice.example.com/account/post', @json);
    END IF;
  END$$

delimiter ;

Note that I didn’t actually implement this, so the code up above is based on my memory. If you try this and have some corrections, please leave them in the comments and I’ll update this post.

Why didn’t I actually implement this, after doing the better part of a day’s worth of research? The database is not the best place for this kind of logic. Error handling for triggers is weak–if the trigger failed for any reason (like the remote service was down), you would need to build an error logging system, or send an email. Also, if there are a number of updated rows, which all trigger outbound http calls, you might run into performance issues which would be difficult to replicate or analyze, and, most importantly, might impact your database’s ability to act as a database. The three tier architecture exists for a reason.

But, it was fun to investigate, and, as my colleague said, would have been cool if it had worked. If you are still interested, there’s more on this topic here.


Building an automated postcard mailing system with Lob and Zapier

Courtesy of smoothfluid

I was looking at automated paper mailing systems recently (and listed what I found), and was especially impressed with Lob and the ease of its API.

Among other printing services, Lob will let you mail a postcard with a custom PDF on both sides, or a custom PDF on one side and a text message on the other, anywhere in the USA for $0.94 (sorry, not sure about international postcards). The company for which I work sends out tens of thousands of postcards every quarter. The vendor we use charges a similar fee (less, but in the same ballpark), but there’s a manual process to deliver the collateral and no API. So an on-demand, one-by-one postcard sending system is very interesting to me.

Note that I haven’t received the Lob postcard which I sent myself, so I can’t speak to quality. Yet.

The Lob API is a bit weird, because the request is form encoded rather than a JSON payload.  It also uses basic auth, but only the username, not the password. But the API seems to have all the pieces you’d need to generate all kinds of postcards–reminder postcards, direct mail postcards, photo postcards, etc.
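
To make that concrete, here’s a minimal sketch of a form-encoded postcard request in Python with the requests library.  The endpoint and parameter names are my best recollection of the Lob docs of the time, and the key and addresses are placeholders:

import requests  # assumption: pip install requests

LOB_API_KEY = "test_YOUR_KEY"  # placeholder; the API key goes in the basic auth username

response = requests.post(
    "https://api.lob.com/v1/postcards",  # postcards endpoint, per the Lob docs
    auth=(LOB_API_KEY, ""),              # basic auth with an empty password
    data={                               # form encoded, not a JSON payload
        "name": "Test postcard",
        "to[name]": "Jane Doe",
        "to[address_line1]": "123 Main St",
        "to[address_city]": "Boulder",
        "to[address_state]": "CO",
        "to[address_zip]": "80301",
        "front": "http://example.com/front.pdf",          # PDF for the front
        "message": "This is the text message for the back.",
    },
)
print(response.status_code, response.json())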

After testing out the service via the web interface and cURL examples, I thought that it’d be fun to build a Zapier zap. In particular, being able to send a postcard for an entry in a Google spreadsheet seemed like a useful use case. Plus, Zapier is awesome, and I’d wanted to test out their integration environment for myself.

So, I built a Zapier integration for Lob, using the Zapier developer docs in combination with the Lob developer docs. It was actually easy. The most complicated step was translating the Zapier action data, which is a one or two dimensional array of typed data, into the Lob data format, which wanted a couple of text fields and two address arrays. Zapier has a scripting environment that let me modify data from APIs pre and post send, and even had an example about form encoded APIs. Zapier’s JavaScript scripting development environment was full featured, including syntax and error highlighting. It had no real debugging available, but I could use the venerable debug-by-log-statement method fairly easily.

Where could I take this next? Everywhere people use postcards in real life. The postcards depend on PDF files (see a sample), so if you are generating a custom postcard for each interaction things become more complex, but there are a few APIs (based on a 30 second google search, here and here) available for dynamic PDF generation. There are also limits on API call throughput, if I stuck to the Zapier integration–I could send at most 300 postcards a day, unless I managed multiple spreadsheets.

I see reminders of high value events (dentist, house maintenance, etc), contests and marketing as key opportunities for this type of service. And if I had a product where direct mail was a key component, using Lob directly would be worth serious consideration.

Regarding the Zap, I believe I cannot make this Zap available to anyone else. Since I’m not a representative of Lob, I couldn’t commit to maintaining this Zap, and Zapier doesn’t want to have any of their customers depending on an integration that could disappear or be unsupported at any time–a fair position.

If the Zapier or Lob folks want to take this integration and run with it, I’d be happy to share my code–just leave a comment. If anyone else is interested in being able to generate Lob postcards from a Google spreadsheet (or any other compatible API) via Zapier integration, let me know and I’ll see what I can do.



© Moore Consulting, 2003-2017