This post on usernames is hitting the top of Hacker News right now. It’s worth a read for an in depth examination of how many ways something that seems fairly simple, usernames, can go wrong. Whether that is allowing impersonation due to unicode code points (see a related XKCD), how to handle email addresses, or what usernames should be prohibited, the simple idea of having someone use a text string as part of their authentication scheme is not so simple.
Here’s a great quote from the post talking about the right way to do it:
So if you’re building an account system from scratch today in 2018, I would suggest reading up on [the tripartite identity] pattern and using it as the basis of your implementation. The flexibility it will give you in the future is worth a little bit of work, and one of these days someone might even build a good generic reusable implementation of it (I’ve certainly given thought to doing this for Django, and may still do it one day).
What I took away from this is that usernames are hard. And that, paradoxically, the amount of business value that I create by doing usernames correct is minimal, until it’s a vector for attack. That means that I’m far better off choosing a third party library or service that focuses on authentication than rolling my own. That third party service is much more likely to have read, understood and implemented the tripartite identity pattern (in addition to any other benefits) than I will. (If I build many systems which require authentication, then maybe I can write my own library of username checks. But then I’d be better off open sourcing it, unless it was a competitive advantage.)
Now, this doesn’t mean I should blindly use any open source authentication library. I still need to examine it, see if it is used and maintained, and determine if it meets my requirements. This is not an “open source good, roll my own bad” post.
But the default should be to find an open source or third party system when I’m working on this type of software plumbing. (For rails and authentication, devise is where I’d start.) If I look around for an hour or two and can’t find anything that meets the needs of the project (either directly, with configuration or with minor code modification), then, and only then, should I start to think about rolling my own. Yes, it’s less fun to configure a third party library than it is to roll your own, but the kind of edge cases that a third party library or service will handle make it a better choice.
This has been an issue for a long time. I still remember walking into a contracting engagement a decade ago and seeing that they had their own database pooling solution. For Java. In 2004. When Apache DBCP had been around for a few years. I’ve been guilty of this myself, often at the beginning of a project when it feels like I need to get up to speed quickly, the problem space seems simpler and sometimes I’m not as familiar with the ecosystem. So this issue isn’t going away, but this post is my plea to my future self to default to third party solutions.