Multi-tenancy is a key part of building a SaaS product. You want to amortize your software investment across different paying customers, but customers should never be able to access any other customer’s data. It is also often the case that customers don’t want their data intermingled with other customers’ data, though that depends on the type of data and the customer’s needs.
By separating customers into tenants, you can achieve this data separation. There are a number of levels of multi-tenancy. In increasing order of isolation, they are:
- no multi-tenancy. In this situation, all the customer data is commingled in the database. Any data that is unique to a customer is tied to that customer with an ID.
- logical multi-tenancy. Isolation is enforced in code and in the database: every table has a ‘tenant id’ key, and every query and join filters on it. When you are using this type of isolation, you want to resolve the tenant as early in the request as possible. This is often done with a different hostname or path for each tenant. You also want to ensure that any users accessing tenant data are part of that tenant. If you use this approach, mistakes in your code can be exploited to ‘escape’ the tenant limitation and read other customers’ data. However, one advantage of this approach is operational simplicity: one version of the code to maintain and one version of the database. There may be support for this in your framework, or you may be rolling your own.
- logical multi-tenancy supported by the database. Some databases support row-level security. In this scenario, each tenant has a different database user, and the data isolation is enforced by the database. Your code only has to look up the correct database user for a given tenant.
- container level multi-tenancy. In this scenario, you run separate containers for each tenant. If you are using a solution like Kubernetes, you can run them in different namespaces to increase the isolation. The operational complexity increases (I did mention Kubernetes, did I not?) but it becomes far more difficult for an attacker to use the access of one tenant to get another tenant’s data. However, now you can have multiple versions of the codebase running. This can be a blessing and a curse, as it allows each client to control their version (if you enable it). This can increase support burden depending on the complexity of your application. You could also choose to run the latest code on every container, upgrading all containers every time a change is made to your software.
- virtual machine multi-tenancy. Here you use different databases and virtual machines for each tenant. You can leverage common defense-in-depth security practices at the network level, using network access controls and firewalls. This stronger isolation makes it even harder for an attacker to escape and view other tenants’ data. However, it increases your operational costs both in terms of complexity (are you going to force everyone to upgrade across the entire fleet?) and support (there may be configuration and/or code drift between the different VMs). If you pursue this, it behooves you to automate the creation of these virtual machines.
- physical hardware isolation. With this choice, you actually run different hardware for each tenant, possibly in different data centers. This is the most secure option, but also the most operationally intensive. There are some options for API-driven hardware setup, but the isolation, while a boon for security, makes updates and upgrades more difficult.
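To make the logical multi-tenancy option above more concrete, here is a minimal sketch in Python. It resolves the tenant from the request hostname, checks that the user belongs to that tenant, and filters every query on the tenant id. Everything named here is an assumption made for illustration: the hostnames, the `TENANTS_BY_HOST` mapping, the `users` and `invoices` tables, and the helper functions are hypothetical, and an in-memory SQLite database stands in for whatever database you actually run.

```python
import sqlite3
from urllib.parse import urlsplit

# Hypothetical hostname-to-tenant mapping; a real application would
# load this from configuration or a tenants table.
TENANTS_BY_HOST = {"acme.example.com": 1, "globex.example.com": 2}


class TenantError(Exception):
    """Raised when a request cannot be tied to a tenant the user belongs to."""


def resolve_tenant(url: str) -> int:
    """Resolve the tenant from the hostname, before any data access."""
    host = (urlsplit(url).hostname or "").lower()
    try:
        return TENANTS_BY_HOST[host]
    except KeyError:
        raise TenantError(f"unknown tenant host: {host!r}") from None


def setup_db() -> sqlite3.Connection:
    """Create illustrative tables; every table carries a tenant_id column."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE users (
            id INTEGER PRIMARY KEY, tenant_id INTEGER NOT NULL, name TEXT NOT NULL
        );
        CREATE TABLE invoices (
            id INTEGER PRIMARY KEY, tenant_id INTEGER NOT NULL, amount INTEGER NOT NULL
        );
        INSERT INTO users VALUES (10, 1, 'alice'), (20, 2, 'bob');
        INSERT INTO invoices VALUES (100, 1, 250), (101, 1, 75), (200, 2, 900);
        """
    )
    return conn


def list_invoices(conn: sqlite3.Connection, url: str, user_id: int) -> list:
    """Return the invoices for the tenant named by the URL, and no others."""
    tenant_id = resolve_tenant(url)

    # The user must be a member of the resolved tenant.
    member = conn.execute(
        "SELECT 1 FROM users WHERE id = ? AND tenant_id = ?",
        (user_id, tenant_id),
    ).fetchone()
    if member is None:
        raise TenantError("user is not a member of this tenant")

    # Every query filters on tenant_id. Forgetting this filter is exactly
    # the kind of mistake that lets a caller read another tenant's data.
    return conn.execute(
        "SELECT id, amount FROM invoices WHERE tenant_id = ? ORDER BY id",
        (tenant_id,),
    ).fetchall()
```

With this setup, calling `list_invoices(conn, "https://acme.example.com/invoices", 10)` returns only the first tenant's rows, while the same user asking through `globex.example.com` gets a `TenantError`. Note that the isolation here is only as good as the discipline of adding the `tenant_id` filter to every query, which is the weakness of this approach described above.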
What is the best option for your SaaS solution? It depends on the security needs of your customers as well as your cost structure and your operational maturity. The higher the level of isolation, the harder it is to run and upgrade the various systems.