Skip to content

Building a Bridge as Your Clients Walk Across It

There was an interesting article posted to hacker news about the nuts and bolts of a SaaS product that you might not expect (article, discussion).  I commented based on my experience that the early days of a SaaS product are like building a bridge while your clients are walking across it.  You want the bridge to be far enough ahead of your clients that they won’t fall off it.  But, not so far that if you or they want to go a different direction you’ve wasted time and materials building a useless walkway section.

So, don’t build features your customers aren’t going to use. But do build features they are going to need. How do you know what the difference is?

  • you can ask them.  This is the only way to start unless you are a target user of your SaaS product (in which case, ask yourself).  Depending on the technical sophistication of your users, you may or may not get good requirements, but there’s no better way to understand their pain.  They will speak very confidently about their pain, however they will also try to give you suggested solutions.  Don’t take those as gospel, as they may not have thought through the ramifications of said solutions.  Find them by looking where they congregate online (facebook groups, forums, reddit).  Targeted email may be OK if you have a relationship.
  • you can build a placeholder.  This is a great way to see if folks want the feature, if you have some folks using your app.  We built a placeholder for document management: “email us and we’ll upload your documents”.  After a few emails, we knew it would be worth it to build out some way for folks to self serve.
  • you can build a MVF (minimum viable feature).  A feature does not need to spring from your mind fully tested, polished and automated.  Sprinkle in manual steps, use emails to people instead of automation, or release only a subset of a feature.  The goal here is again to see usage before you fully develop it.  Another benefit is that the MVF may be all that is needed.
  • you can wait until clients ask for it.  The value of this depends on when they need what they ask for.  If they need it when they ask for it, then it’s just another data point (“thanks for the request.  We’ve noted it in our roadmap”).  If they need it a week or a month after they ask for it, then you can actually build it for them.

It actually can be quite helpful to checkpoint feature usage every so often.  I’ve seen this done two different ways, though I’m sure there are more.  The first is to look at the data and see what features clients are using.  This is nice because it just takes developer time, digging through your OLTP database.  Make sure you write down the results and the queries.  However, this won’t work until you have some users who’ve been using your system for some period of time.  The second is to schedule user interviews and watch your clients or prospects use your system.  This is time intensive, but can lead to many many insights and gives you definite user empathy.

Now, this type of development doesn’t free you from having a strategy. You need to pop your head up every three months or so and revisit the strategy and see if your business is working toward it. But if you are a completionist than early stage SaaS is not for you.

All the ways usernames can go wrong, or a story about why you favor third party solutions

This post on usernames is hitting the top of Hacker News right now. It’s worth a read for an in depth examination of how many ways something that seems fairly simple, usernames, can go wrong. Whether that is allowing impersonation due to unicode code points (see a related XKCD), how to handle email addresses, or what usernames should be prohibited, the simple idea of having someone use a text string as part of their authentication scheme is not so simple.

Here’s a great quote from the post talking about the right way to do it:

So if you’re building an account system from scratch today in 2018, I would suggest reading up on [the tripartite identity] pattern and using it as the basis of your implementation. The flexibility it will give you in the future is worth a little bit of work, and one of these days someone might even build a good generic reusable implementation of it (I’ve certainly given thought to doing this for Django, and may still do it one day).

What I took away from this is that usernames are hard. And that, paradoxically, the amount of business value that I create by doing usernames correct is minimal, until it’s a vector for attack. That means that I’m far better off choosing a third party library or service that focuses on authentication than rolling my own. That third party service is much more likely to have read, understood and implemented the tripartite identity pattern (in addition to any other benefits) than I will. (If I build many systems which require authentication, then maybe I can write my own library of username checks. But then I’d be better off open sourcing it, unless it was a competitive advantage.)

Now, this doesn’t mean I should blindly use any open source authentication library. I still need to examine it, see if it is used and maintained, and determine if it meets my requirements. This is not an “open source good, roll my own bad” post.

But the default should be to find an open source or third party system when I’m working on this type of software plumbing. (For rails and authentication, devise is where I’d start.) If I look around for an hour or two and can’t find anything that meets the needs of the project (either directly, with configuration or with minor code modification), then, and only then, should I start to think about rolling my own. Yes, it’s less fun to configure a third party library than it is to roll your own, but the kind of edge cases that a third party library or service will handle make it a better choice.

This has been an issue for a long time. I still remember walking into a contracting engagement a decade ago and seeing that they had their own database pooling solution. For Java. In 2004. When Apache DBCP had been around for a few years. I’ve been guilty of this myself, often at the beginning of a project when it feels like I need to get up to speed quickly, the problem space seems simpler and sometimes I’m not as familiar with the ecosystem. So this issue isn’t going away, but this post is my plea to my future self to default to third party solutions.

Serverless Framework

I had coffee with an acquaintance who is doing a lot of event driven data processing. Whereas ten years ago to tackle this problem you might use an ETL tool like Pentaho or Talend, now his process runs entirely on AWS Lambda functions. He is leveraging the Serverless framework to manage and deploy these applications. As I understand it there is a thin shim layer between the business logic and the lambda event handler, but the business logic is isolated and knows nothing about its environment. That makes the business logic very testable.

His description of the Serverless framework intrigued me. As he described it, the framework is driven by a simple yaml file and takes care of, among other tasks, the complicated infrastructure set up to tie Lambda functions to a variety of AWS events. I haven’t done it myself, but I’ve heard that setting up a lambda to API Gateway link is a real bear. Doing so allows a lambda function respond to a web requests without any AWS authentication, and is a key use case.

You can write and deploy lambda functions in any language that AWS Lambda supports (unfortunately, not java 9 at the moment). Here’s a java/maven/serverless tutorial. It also supports multiple cloud providers, though I haven’t done much beyond note that the documentation exists.

However, using Serverless does require writing code. If evaluating a a complicated ETL process which non developers needed to be able to understand and support, Serverless would not be a good fit. I’m not aware of any abstraction layers on top of it, though I guess you could run, for example, Pentaho Kettle jobs within lambda. There’s also an issue around cold start times–when your code hasn’t been invoked for a while, it can take longer to start up when a request or event occurs. Apparently there are partial solutions, but your lambdas still get cycled every few hours regardless.

I worked through some of the tutorials and was impressed at just how easy it was to get started. If I had a simple API or data processing pipeline to build, Serverless would definitely be on my short list of possible implementation options. It is very inexpensive, scales easily and encourages encapsulation.

Incidentally, my acquaintance’s company is hosting a lunch and learn on this technology at the end of the month. More details here.

Smashing: A Quick Dashboard Solution

I’m putting together a business metrics dashboard for The Food Corridor (what is old is new again, I remember a project at XOR, my first job out of school, that was all about creating a dashboard). I could have just thrown together some rails views, but I looked around and saw Smashing, which is a fork of Dashing, a dashboard project that came out of shopify.

Smashing is a sinatra app and is fairly simple to set up. It looks gorgeous, a lot better than anything I could hack together. I could install it on a free heroku dyno. Even though it will take a bit of time to spin up, it is now running for free. Smashing has nice MVC separation–you have dashboards which assemble widgets, and then jobs which push data to widgets on a schedule. Sending data looks something like this: send_event('val', { current: current }) where val is referenced in the widget.

You can create more than one dashboard (I did only one). They aren’t customizable by non developers, but once the widgets are written, they can be created by someone with a modicum of experience editing HTML.

Some tips:

  • Smashing stores its state in a file. If you are running on heroku, the filesystem is ephemeral. You have two options. You can store the state in an external data store like redis (patch mentioned here, I didn’t try it). Or you can rely on the systems you are polling for metrics to maintain the state. That’s the path I took.
  • The number widget has the ability to display percentage changed since last updated: send_event('val', { current: current, last: last }). Make sure that val is an integer–I sent a string like “100000” and that was treated as a zero for purposes of calculation.
  • If you are accessing any external systems, make sure you inject any secrets via environment variables.  For local development, I used dotenv.
  • You’ll want some kind of authentication system.
  • The widgets that come with Smashing aren’t complicated, but neither are they documented, so prepare to spend some time understanding what they expect.
  • I grouped jobs, which gather the data, by data source.  You can send multiple events per job, and I thought that made it clearer.  Connections to APIs or databases only needed to happen once as well.
  • The business metrics which I was displaying really only change on a monthly basis.  So I wanted to run the data gathering immediately, then in a week or two weeks.  Because of the ephemeral state, I expect the second run will never happy, but wanted to be prepared for it.  I did so by creating a function and calling it once on job load and then in the scheduler.

Here’s pseudo code for the job that pulls data from stripe:


Stripe.api_key = ENV['STRIPE_SECRET_KEY']

def stripe
  # pull data from stripe...
  send_event('stripeval', { current: current })
end

stripe

SCHEDULER.interval '1w' do
  stripe
end

Smashing is no full on technical metrics solution (like Scout or New Relic), but can be useful for displaying limited data in a beautiful format with a minimum of developmetn effort. If you’re looking for a dead simple dashboard, Smashing will work for you.

Getting access to the very nice date functionality in non Rails Ruby

I am doing some small ruby scripts for a dashboard and need to do some date calculations, like the timestamp of the first of the previous month and the timestamp of the end of the previous month. Rails makes this so easy with (DateTime.now - 1.month).beginning_of_month. I looked around for a way to do it with straight up Ruby, but didn’t see a good solution.

Luckily, some of the nice parts of Rails have been broken out into the active support gem. You need to add it to your Gemfile or however else you are managing your gems, of course. Confusingly, the gem is activesupport and the require statement is require active_support/... (see the underscore?).

There’s an entire guide on how to pull in just the active support functionality you need. Unfortunately, I couldn’t make the targeted includes work (I was trying to pull in both numeric and date extensions precisely, but kept getting the error message undefined method `month' for 1:Integer (NoMethodError).

Finally, I just pulled in active_support/time and everything worked.

Blast from the past: 5 worlds

5 Worlds is a Joel Spolsky classic. This article needs to be updated (it’s from 2002, when shrinkwrap software was still A Thing) but it still has a lot of wisdom and illustrates just how large the scope of work available to software developers is (even more so now that software is eating the world).

Whenever you read one of those books about programming methodologies written by a full time software development guru/consultant, you can rest assured that they are talking about internal, corporate software development. Not shrinkwrapped software, not embedded software, and certainly not games. Why? Because corporations are the people who hire these gurus. They’re paying the bill.

Note that assuming a software developer is a webdev is like assuming a lawyer is a trial attorney. Just like there’s lots of ways to practice law, there are lots of ways to build software. And, to be honest, this is probably true of every profession. If you go to a party and ask someone “what do you do” and really really listen, chances are you’ll get a startling view of the world, because everyone does something interesting.

Switching Scales

Developers need the ability to switch scales. I don’t know if this is crucial for any other profession (maybe a general contractor building a home?). For developers the ability to zoom in and focus on a specific problem for a few hours, and then zoom out and focus on the bigger picture, whether that is project management and reporting, or system architecture, is crucial.

The way I do it is to compartmentalize rigorously. This means that when I am focused on a bug, I’m focused on the bug. Sometimes I’ll timebox such investigation so I don’t get wrapped around the axle of the problem. If I encounter other issues, I file them off in the issue tracker (you do have an issue tracker, don’t you)?

If, instead, I’m focusing on the big picture, paradoxically it can be harder to focus. I think best when writing, so I’ll start out with a document outlining the issue. This has the added benefit of allowing me to easily collaborate (google docs FTW), refind the problem definition and solution, and revisit it over time. Other folks work better with other artifacts (app click throughs, diagrams) but the information density and flexibility of written communication work best for me. However, either way I need to set aside some time to focus on the question at hand.

Once you compartmentalize successfully, you can start to have the each scale inform decisions made at the other. Bugs that may feel urgent to fix can be deferred because that system is not accessed often, or will be rewritten soon. System diagrams may be reworked because you know that sub components belong together (or don’t). You may know where the “dark, scary” areas of the application are and make time to rework them. There may be patterns of code or services worth extraction.

One of the hardest parts of management, from personal experience and discussions with others, is letting go of the details. As an engineering manager, you should absolutely let go of critical path implementation items–you shouldn’t be working on the hardest problem or any key components. Depending on the size and needs of your team, your focus may be on recruiting, providing political cover or knocking down impediments to team goals. But I think you need to have a touchpoint at the lowest level of your application. Whether that is achieved through non blocking code reviews, fixing non critical bugs, or knocking out a customer support request, zooming in to the implementation level will inform your worldview. Just don’t let it be too large a focus; definitely don’t let it block your team.

If the idea of letting go of product implementation doesn’t appeal to you, you may want to dive in to management more. There are plenty of details and challenges to management (read High Output Management for a great intro). In fact, there are similarities between building a successful piece of software and building a team that builds a successful piece of software. People aren’t code though–imagine if when you ran a function you got a slightly different result based on how the function felt, how its home life was, what previous work the function had done, etc, and you’ll have some idea of the complexity of management.

If you’ve tried management and it doesn’t work for you, you can remain an individual contributor. Realize this will have an impact on your compensation (in most orgs) and your leverage (link) to effect change.

But it may be worth it to remain a builder.

A Few SQL Database Links

So, here are two different relational database links that I am currently interested in.

The first is PostgreSQL Exercises. This is a fun way to sharpen your SQL. While I’m not a big user of PostgreSQL, I have enjoyed it the few times I’ve touched it. And the basics of thinking in SQL are useful across any database.

The second is Modern SQL. I was pointed to this site (in particular this page on pivots) after asking a question on a slack (so I can’t link to it). My question was about how to transform a set of event values (X happened Y ago, then Z happened at Y+1, and so on) up into counts (X happened N times, Z happened M times). But the entire site is worth reading if you are a SQL nerd, especially the use cases.

Bonus, check out this Twitter thread about using a single SQL database as a data source for multiple microservices. Pragmatism!

Useful rails gem: rack mini profiler

Sometimes you need to do some profiling of your application to see where performance issues lie. I haven’t done an extensive survey of the options, but a friend recommended rack mini profiler. As the name suggests, this can be used with any rack based framework (so rails, hanami, etc).

There are a number of reasons to use this profiler. Learning to use a profiler, any profiler and how to performance tune is a key part of being a software developer, as opposed to programmer.

It can run in production. You can limit who can access the profiler, and it usually sits in the upper left hand corner with a “page load time”, but you can jump into more detail as needed.

It runs all the time. This lets you have an ambient view of the application (at least the parts that you are touching regularly).

When you are investigating a performance issue, you can easily dive into the views, controllers and SQL queries that your application is making.

It is simple to install, just a simple gem in your Gemfile.

Well worth getting to know, if you are supporting a rails application.