
All posts by moore - page 3

20 years

Wow, it’s been 20 years since I started this blog. The world and my life have changed quite a bit, but I still like to muse on technology, business and more.

Thanks for reading! I appreciate everyone who takes the time to read, whether you scan a post, share it, or reach out to let me know what you thought.

I still think blogging is one of the best things you can do if you want to:

  • engage with people
  • understand a subject
  • build a business
  • create a name for yourself
  • think deeply about a topic
  • find a job (but not quickly!)

Thanks again for 20 great years!

Docker thoughts

Docker is amazing tech for developer productivity.

You can package up the dependencies for your application into one file (a Dockerfile) or more than one file (a docker-compose file). You might use the former for an application or service and the latter if your application depends on a database, cache, or other external architectural component.
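For example, a minimal setup might look something like this (a sketch only; the Python base image, app.py entry point, and postgres service are hypothetical, not from any particular project):

# Dockerfile: packages the application and its dependencies into a single image
FROM python:3.8.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
CMD ["python", "app.py"]

# docker-compose.yml: wires the application to external components it depends on
version: "3.8"
services:
  app:
    build: .
    ports:
      - "8000:8000"
    depends_on:
      - db
  db:
    image: postgres:14
    environment:
      POSTGRES_PASSWORD: example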

I am definitely late to the Docker party, as I remember Eric Norlin interviewing Solomon Hykes in the early 2010s at Gluecon. In 2018 I used it to stand up some consulting projects, but wondered at the value and the required investment.

I’ve recently been using it extensively at work to set up quickstarts. These tutorials let folks stand up our software quickly for evaluation purposes. Docker is a fantastic fit for this use case.

I’ve been a big fan of detailed documentation and readmes for setting up development environments, but Docker takes all that documentation (including tedious things like ‘install python 3.8.12; 3.8.11 doesn’t work’) and puts it into code that is accessible with a single command.

I have a foggy memory of detailed readmes, hours of setup, and dependency hell; Docker removes all of that. That doesn’t mean your Docker images/compose files can’t get stale and won’t need love and attention, but it does mean you can control when you test and upgrade such dependencies, rather than discovering problems when a new developer joins the team and can’t get their environment set up.

Shipping software to production via containers (Docker or otherwise) has operational benefits, but I’m less familiar with those. One of the nice things about developer relations is you’re separated a bit from operations. (It’s one of the downsides as well.) I haven’t recently worked with a production system where containers were used, but I have read nice things about the approach. Here’s an article from Google which covers the technical benefits.

What about the licensing? This business model change rocked the world of happy Docker users, but when you take funding, you have to make money. Actually, if you want to survive as a company, you have to make money, but if you take funding, you have to try to make a LOT of money. That’s the game; make sure you know it before you start to play. But that’s a different post.

As far as Docker licensing goes, as a Docker user you have two options:

  • pay Docker when you reach the licensing threshold
  • use a Docker alternative such as Finch or Podman
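For what it’s worth, podman in particular aims to be CLI-compatible with Docker, so switching is often mechanical. A quick sketch (your mileage may vary depending on which Docker features you rely on):

# podman accepts most of the same subcommands and flags as the docker CLI
podman build -t myapp .
podman run -p 8000:8000 myapp

# some folks go so far as to alias it
alias docker=podman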

As these alternatives imply, Docker is just one flavor of container technology. The benefits I mention above apply to container-based systems, not just Docker, but Docker is the solution I’m most familiar with.

Docker, and containerization in general, feels like as big a leap as Stack Overflow, git, or memory-managed languages in terms of developer productivity.

Ending a side project

I think side projects are great. They let you experiment, scratch an itch, and learn in a safe, low pressure environment. They can raise your professional profile too and are great fodder for a blog. Of course, they also take precious, precious free time.

These all mean that it is entirely okay to shut down a side project when it is no longer serving you.

The side project could be:

  • less fun than it used to be
  • taking more time than you want
  • an area you used to be interested in but are no longer
  • related to a job you have left

In all of these cases, you have NO obligation to keep doing something simply because you used to.

So, shut it down.

How you do so depends on what the side project is. A blog should be treated differently than a webapp which should be treated differently than a library.

There are some things you can do for any side project:

  • Set a date to end your efforts–this is something for you internally.
  • Run through the finish line–keep up the good work.
  • Announce the end and what it means for the project.
  • Thank everyone who helped you.
  • Optionally make it available for a while longer.

What a “while” is depends on the effort and money required to make the project available, as well as how useful it is in the end state. A library can live on GitHub forever, especially if you archive it so it is clear that it won’t be getting updates from you. It’ll still be available to fork and may be useful to others. A directory of local restaurants offering patios, on the other hand, will decay in usefulness pretty quickly as businesses start and close.

Ending a project is just as natural as starting it. Make sure you do it right.

Protecting a CDN source using basic auth

I have a website that is behind a content delivery network (CDN). I want to protect it from being crawled by any robots. I want all access to go through the CDN for reasons. There may be errant links to the source; I don’t care if they continue to work.

.htaccess and basic auth are perfect for this situation.

I added an .htaccess file that looks like this:

AuthType Basic
AuthName "Secure Content"
AuthUserFile /path/to/.htpasswd
require valid-user

I needed to make sure the path and the file were readable by the web server user.

Then, I added a .htpasswd entry that looks like this:

user:passwdvalue

If you don’t have access to htpasswd, the typical program used to generate the password value, this site will generate one for you.
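If you do have htpasswd available (it ships with the Apache httpd utilities), generating the entry looks something like this; the username is whatever you choose:

# create the file and add a user with a bcrypt-hashed password
htpasswd -cB /path/to/.htpasswd user

# or print an entry to stdout and paste it into the file yourself
htpasswd -nB user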

Then, I had to configure my CDN to give it the appropriate header.

Use the Authorization header, and make sure to pass the username and the password. This site will generate the appropriately base64 encoded values.
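If you’d rather generate the value yourself, the header is just the base64 encoding of username:password (the plaintext password, not the hash stored in .htpasswd). A quick sketch with made-up credentials and a placeholder origin hostname:

# base64 encode the credentials; -n avoids including a trailing newline
echo -n 'user:plaintextpassword' | base64

# verify the origin accepts the header (origin.example.com stands in for your source host)
curl -H "Authorization: Basic $(echo -n 'user:plaintextpassword' | base64)" https://origin.example.com/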

Voila. Only the CDN has access.

Now, the flaws:

  • Depending on how the CDN accesses the site, it may be possible to snoop out the username and password
  • If you ever want to get the origin site over HTTP, you’ll need the username/password

Setting up password gorilla on an ARM Mac

I love password gorilla. It’s a portable locally hosted password manager that is compatible with passwordsafe. It has all the features I need in a password manager:

  • account groups
  • password generation
  • metadata storage (so you can add a note)
  • keyboard shortcuts for copy and pasting the url, username, and password
  • local storage of secrets

It doesn’t have fancy integration with browsers, remote backup or TOTP, but I don’t need these. For browser integration, I use the clipboard. For remote backup, pwsafe files can be copied anywhere. And for TOTP, the whole point of MFA is that each factor is separate.

But recently, I started setting up an Apple M1 machine, and the password gorilla downloads failed to start. I looked at the GitHub issues for the repository and didn’t find a ton of help, though there were some related issues.

After some poking around, here’s what I did that worked for running password gorilla on my Apple M1:

  • installed tcl-tk using brew: brew install tcl-tk
  • cloned the password gorilla repository: git clone https://github.com/zdia/gorilla.git
  • wrote a small shell script and added the directory where it lived to my path.
#!/bin/sh

# change to the sources directory of the cloned repository
cd <cloned repository directory>/sources

# run password gorilla with the Homebrew-installed Tcl/Tk
/opt/homebrew/opt/tcl-tk/bin/tclsh gorilla.tcl
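If you follow the same approach, you’ll want to make the script executable and put its directory on your PATH; something like this, assuming the script is saved as ~/bin/gorilla:

chmod +x ~/bin/gorilla

# add this line to ~/.zshrc (the default shell on recent macOS) so the script is found
export PATH="$HOME/bin:$PATH"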

That’s it. Now I can run this script any time and password gorilla starts right up.

LLM training data, or, a broken virtuous cycle

Have you used ChatGPT?

I have, and it’s amazing.

From the whimsical (I asked it to write a sonnet about SAML and OIDC) to the helpful (I asked for an example of a script with async calls using Typescript), it’s impressive.

However, one issue I haven’t seen mentioned is training data. Right now, there are sets of training data used to teach these large language models (LLMs) what is correct about the world and what is not. This podcast is a great intro to the model families and training process.

But where does that training data come from? I mentioned that question here, but the answer is humans provide it. Human effort and knowledge are gathered on reddit, wikipedia, and other places.

Why did humans spend so much time and effort publishing that knowledge? Lots of reasons, but some include:

  • Making money (establishing yourself as an expert can lead to jobs and consulting)
  • Helping other humans (feels good)
  • Internet points (also feels good)

In each case, the human contributing is acknowledged in some way. Maybe not by the end user, who doesn’t, for example, read through the Wikipedia edit history. But someone knows. Wikipedia editors know and celebrate each other. Here’s a list of folks who have edited that site for a decade or more.

What about search engines? Google reifies knowledge in a manner similar to ChatGPT. But, cards notwithstanding, Google offers a reputational reward to publishers. It may be in money (Adwords) or site authority. Other applications like Ahrefs help you understand that authority and I can tell you as a devrel, high search engine ranking is valuable.

ChatGPT offers none of that, at least not out of the box. You can ask for links to sources, but the end user must choose to do so. I doubt most do, and, in my minimal experience, the links are often broken or made up.

This fact breaks the fundamental virtuous cycle of internet knowledge sharing.

Before, with search engines:

  • Publisher/author writes good stuff
  • Search engine discovers it
  • User reads/enjoys/learns from it on the publisher’s site
  • Publisher/author gains value, so publishes more
  • Search engine “sees” people are enjoying the publisher’s content, so promotes it
  • More users read it
  • Back to step one

After, with LLMs:

  • Publisher writes good stuff
  • LLM trains on it
  • User reads/enjoys/learns from it via ChatGPT
  • … crickets …

The feedback loop is broken.

Now, some say that the feedback loop is already broken because Google over-optimized AdWords. Content farms, SEO-focused garbage sites, and tricks to rank are hard to stomach, but they do make money from Google’s traffic. This is especially acute with products and product reviews because the path to monetization is so clear; end users are looking to buy and being on page 1 will result in money. I agree with this critique; I’m not sure the current knowledge sharing experience is optimal, but humans have been working around Google’s limitations.

More human labor helps with this. I’ve seen this happen in two ways, especially around products.

  • Social media, where searchers are relying on curation from experts. Here end users aren’t searching so much as browsing from a subset of answers.
  • Reddit, where searchers are relying on the moderators and groups of redditors to keep spam out of the system. Who among us hasn’t searched for “<product name> review reddit” to avoid trash SEO sites? This also works with other sites like Stack Overflow (for programming expertise).

In contrast, the knowledge product disintermediation of ChatGPT is complete. I’ll never know who helped me with Typescript. Perhaps I can’t know, because it was one million little pieces of data all broken up and coalesced by the magic of matrix algebra.

This will work fine for now, because a large corpus of training data is out there and available. But will it work forever? I dunno. The cycle has been broken, and we will eventually feel the effects.

In the short term, I predict that within the next three months, there will be a creative commons type license which prohibits the usage of published content by LLMs.

Using GPT to automate translation of locale messages files

At my current employer, FusionAuth, we have extracted all the user-facing messages into properties files. These files are maintained by the community and cover over fifteen languages.

We maintain the English language version. Whenever new user facing messages are added, the properties file is updated. Sometimes, the community contributed messages files are out of date.

In addition, there are a number of common languages that we simply haven’t had a community member offer a translation for.

These include:

  • Korean (80M speakers)
  • Hindi (691M)
  • Punjabi (113M)
  • Greek (13.5M)
  • Many others

(All numbers from Wikipedia.)

While I have some doubts and concerns about AI, I have been using ChatGPT for personal projects and thought it would be interesting to use OpenAI APIs to automate translation of these properties files.

I threw together some Ruby code using ruby-openai, the community Ruby OpenAI library that had been updated most recently.

I also used ChatGPT for a couple of programming queries (“how do I load a properties file into a ruby hash”) because, in for a penny, in for a pound.

The program

Here are the results:


require "openai"
key = "...KEY..."

client = OpenAI::Client.new(access_token: key)

# load a Java-style properties file into a Ruby hash, skipping blank lines and comments
def properties_to_hash(file_path)
  properties = {}
  File.open(file_path, "r") do |f|
    f.each_line do |line|
      line = line.strip
      next if line.empty? || line.start_with?("#")
      key, value = line.split("=", 2)
      properties[key] = value
    end
  end
  properties
end

# write a hash back out in properties file format
def hash_to_properties(hash, file_path)
  File.open(file_path, "w") do |file|
    hash.each do |key, value|
      file.write("#{key}=#{value}\n")
    end
  end
end

def build_translation(properties_in, properties_out, errkeys, language, client)
  properties_in.each do |key, value|
    # crude rate limiting; see the discussion of the sleep call below
    sleep 1
    # puts "# translating #{key}"
    message = value
    content = "Translate the message '#{message}' into #{language}"
    response = client.chat(
      parameters: {
        model: "gpt-3.5-turbo", # Required.
        messages: [{ role: "user", content: content }], # Required.
        temperature: 0.7,
      }
    )
    if response["error"].nil?
      # success: record the translation and echo it in properties format
      translated_val = response.dig("choices", 0, "message", "content")
      properties_out[key] = translated_val
      puts "#{key}=#{translated_val}"
    else
      # failure: remember the key so it can be retried later
      errkeys << key
    end
  end
end

# start the actual translation 
file_path = "messages.properties" 
properties = properties_to_hash(file_path) 
#puts properties.inspect 
properties_hi = {} 
language = "Hindi" 
errkeys = [] 

build_translation(properties, properties_hi, errkeys, language, client) 
puts "# errkeys has length: " + errkeys.length.to_s 

while errkeys.length > 0
  # retry the keys that errored before
  newprops = {}
  errkeys.each do |key|
    newprops[key] = properties[key]
  end

  # reset errkeys
  errkeys = []

  build_translation(newprops, properties_hi, errkeys, language, client)
  # puts "# errkeys has length: " + errkeys.length.to_s
end

# save file
hash_to_properties(properties_hi, "messages_hi.properties")

More about the program

This script translates 482 English messages into a different language. It takes about 28 minutes to run; about 8 minutes of that is the sleep statement, of which more below. To run this, I signed up for an OpenAI key and a paid plan. The total cost was about $0.02.

I tested it with two languages, French and Hindi. I used French because we have a community provided French translation. Therefore, I was able to spot check messages against that. There was a lot of overlap and similarity. I also used Google Translate to check where they differed, and GPT seemed to be more in keeping with the English than the community translation.

I can definitely see places to improve this script. For one, I could augment it with a loop over different languages, letting me support five or ten more languages with one execution. I also had the messages file present in my current directory, but using Ruby to retrieve it from GitHub or running this code in the cloned project would be easy.
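That loop might look something like the following sketch, which reuses the functions and variables defined above; the language list and output file naming are arbitrary choices of mine:

# sketch: translate the same English properties into several languages in one run
languages = { "Hindi" => "hi", "Korean" => "ko", "Greek" => "el" }

languages.each do |language, code|
  translated = {}
  errkeys = []

  build_translation(properties, translated, errkeys, language, client)

  # retry keys that errored, just like the single-language version above
  while errkeys.length > 0
    retry_props = errkeys.each_with_object({}) { |key, h| h[key] = properties[key] }
    errkeys = []
    build_translation(retry_props, translated, errkeys, language, client)
  end

  hash_to_properties(translated, "messages_#{code}.properties")
end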

The output occasionally needed to be reviewed and edited. Here’s an example:

[blank]=आवश्यक (āvaśyak)
[blocked]=अनुमति नहीं है (Anumati nahi hai)
[confirm]=पुष्टि करें (Pushṭi karen)

Now, I’m no expert on Hindi, but I believe I should remove the English/Latin letters above. One option would be to exclude certain keys or to refine the prompt I provided. Another would be to find someone who knows Hindi who could review it.
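If I went the prompt route, the change might be as small as tightening the request, though I have not tested how reliably the model honors it:

# a more constrained prompt, asking for the translation only, with no transliteration or commentary
content = "Translate the message '#{message}' into #{language}. " \
          "Respond with only the translated text, with no Latin transliteration or explanation."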

About that sleep call. I built it in because in my initial attempt I saw error messages from the OpenAI API, and I was trying to slow down my requests so as not to trigger them. I didn’t dig too deep into the reason for the exception below; at first glance it appears to be a networking issue.


C:/Ruby31-x64/lib/ruby/3.1.0/net/protocol.rb:219:in `rbuf_fill': Net::ReadTimeout with #<TCPSocket:(closed)> (Net::ReadTimeout)
        from C:/Ruby31-x64/lib/ruby/3.1.0/net/protocol.rb:193:in `readuntil'
        from C:/Ruby31-x64/lib/ruby/3.1.0/net/protocol.rb:203:in `readline'
        from C:/Ruby31-x64/lib/ruby/3.1.0/net/http/response.rb:42:in `read_status_line'
        from C:/Ruby31-x64/lib/ruby/3.1.0/net/http/response.rb:31:in `read_new'
        from C:/Ruby31-x64/lib/ruby/3.1.0/net/http.rb:1609:in `block in transport_request'
        from C:/Ruby31-x64/lib/ruby/3.1.0/net/http.rb:1600:in `catch'
        from C:/Ruby31-x64/lib/ruby/3.1.0/net/http.rb:1600:in `transport_request'
        from C:/Ruby31-x64/lib/ruby/3.1.0/net/http.rb:1573:in `request'
        from C:/Ruby31-x64/lib/ruby/3.1.0/net/http.rb:1566:in `block in request'
        from C:/Ruby31-x64/lib/ruby/3.1.0/net/http.rb:985:in `start'
        from C:/Ruby31-x64/lib/ruby/3.1.0/net/http.rb:1564:in `request'
        from C:/Ruby31-x64/lib/ruby/gems/3.1.0/gems/httparty-0.21.0/lib/httparty/request.rb:156:in `perform'
        from C:/Ruby31-x64/lib/ruby/gems/3.1.0/gems/httparty-0.21.0/lib/httparty.rb:612:in `perform_request'
        from C:/Ruby31-x64/lib/ruby/gems/3.1.0/gems/httparty-0.21.0/lib/httparty.rb:542:in `post'
        from C:/Ruby31-x64/lib/ruby/gems/3.1.0/gems/httparty-0.21.0/lib/httparty.rb:649:in `post'
        from C:/Ruby31-x64/lib/ruby/gems/3.1.0/gems/ruby-openai-3.7.0/lib/openai/client.rb:63:in `json_post'
        from C:/Ruby31-x64/lib/ruby/gems/3.1.0/gems/ruby-openai-3.7.0/lib/openai/client.rb:11:in `chat'
        from translate.rb:33:in `block in build_translation'
        from translate.rb:28:in `each'
        from translate.rb:28:in `build_translation'
        from translate.rb:60:in `

(Yes, I’m on Windows, don’t hate.)

Given this was a quick and dirty program, I added the sleep call, but then, later, added the while errkeys.length > 0 loop, which should help recover from any network issues. I’ll probably remove the sleep in the future.

I signed up for a paid account because I was receiving “quota exceeded” messages. To their credit, they have some great billing features. I was able to limit my monthly spend to $10, an amount I feel comfortable with.

As I mentioned above, translating every message into Hindi using GPT-3.5 cost about $0.02. Well worth it.

I used GPT-3.5 because GPT-4 was only in beta when I wrote this code. I didn’t spend too much time mulling that over, but it would be interesting to see if GPT-4 is materially better at this task.

Worries

Translating these messages was a great exploration of the power of the OpenAI API, but I think it was also a great illustration of this tweet.

I had to determine what the problem was, how to get the data into the model, and how to pull it out. As Reid Hoffman says in Impromptu, GPT was a great undergraduate assistant, but no professor.

Could I have dumped the entire properties file into ChatGPT and asked for a translation? I tried a couple of times and it timed out. When I reduced the number of messages, I was unable to figure out how to get it to ignore comments in the file.

One of my other worries is around licensing. I’m not alone. This is prototype code running on my personal laptop and the license for all the localization properties files is Apache2. But even with that, I’m not sure my company would integrate this process given the unknown legal ramifications of using OpenAI GPT models.

In conclusion

OpenAI APIs expose large language models and make them easy to integrate into your application. They are a super powerful tool, but I’m not sure where they fit into the legal landscape. Where have we heard that before?

Definitely worth exploring more.

How to have great guest posts on a blog

I have another blog that I’ve been running for a few years, called Letters to a New Developer.

It is full of advice I wish I had had at the beginning of my software development career. I even turned it into a book.

One thing I’ve done the entire time I’ve been writing this is to ask for guest posts. No advice is generic; it’s all based on where and who you are. By asking for guests to share their experiences, I expose my readers to a wider variety of viewpoints. This leads to better understanding of the wide world of software.

After all, if you want to work for a FAANG, it’s better to hear from someone who prepped, interviewed, and was hired at such a company, rather than from me, who will never work for one. Or if you are looking to understand what it’s like to work in open source, it’s better to hear from someone who has made it their career. Have imposter syndrome? Others have struggled with it too. The same goes for techniques; it’s great to hear from someone who is a methodical journaler, for example.

I bet you get the point. There’s no way I could have written any of those articles.

Guest posting also helps spread the word about my blog, since a guest poster usually shares their writing with their friends and network.

For the right kind of blog, guest posts can be great. Here are my criteria:

  • Does the blog have decent traffic? If not, folks probably won’t want to guest post. I’d say at least 25 visits a day.
  • Have you done this for a while? A year’s worth of regular posting shows you’re serious.
  • Does the blog present a variety of viewpoints? I wouldn’t want a guest post for this blog, because it’s all me, all the time.
  • Is the blog non-corporate? If it is a company blog, guest posts should be paid for with money, or be part of a co-marketing project.

If you want to have great guest posters, here’s my recipe for finding them and fostering them:

  • Don’t accept inbounds, unless you know the person or can vet what else they’ve written.
  • Don’t accept any content that is off-topic.
  • Set a target guest. For me, it is someone who is or was a developer with a different viewpoint and perspective. I was especially interested in highlighting voices of people who were not white men.
  • Keep an eye out for interesting posts or interviews. Reach out to the authors and see if they might be game to guest post.
  • Be okay with a cross post. I’d say about 60% of my guest posts are original content, but more recently authors want to post the content on their blog first. Offer a link back.
  • Be okay with a repost, especially if the author is prominent. Review existing content and find one or two articles that fit with your blog’s theme. Ask if you can re-publish. Offer a link back.
  • Some folks will say no. They can do so for a variety of reasons (don’t want to write, don’t syndicate their content, etc), and you have to be okay with it.
  • If someone says “I can’t right now, sorry,” reply with something like “Totally get it. I’ll follow up in 6 months, unless you’d rather I didn’t.” Then, follow up. Sometimes they’ll ignore you, sometimes they’ll have more time.
  • If they are writing original content for you, offer to edit it. Find out how much editing they want. I’ve over-edited a few times and that’s a bad thing if someone is volunteering their time and knowledge.
  • If they ask about topics, advise them to focus. Lots of people want to write about “5 things you need to do”, but in my experience, articles focused on one topic are better received.
  • Write up guidelines so you can easily share. This can include audience, benefits, formatting, delivery mechanism, word count, topic ideas, and more.

Hope this helps. Happy guest blogging!

Why Developer Relations Jobs at Startups are Not Entry Level Positions

Developer relations jobs at startups are sometimes seen as suitable for entry-level developers, but this is a common misconception. In reality, these positions require a diverse skill set that can be overwhelming for new engineers.

The Variety of Tasks

Developer relations jobs at startups require a variety of tasks, including writing documentation, creating tutorials, writing example apps, giving presentations, and attending events. You’ll need to be able to prioritize these tasks, drive them to completion (often with the help of other startup colleagues) and re-order them based on changing needs and input.

This is all on top of the normal chaos at a startup, when you will be either searching for product market fit, pivoting or scaling. The combination of these tasks can be overwhelming for new engineers who are still learning the ropes.

The Solitude

At a startup, you will often be the only person doing developer relations. This means that you need to be able to navigate the often chaotic environment of a startup on your own, without much in the way of project guidance or career development. You will need to rely on yourself and your own skills to be successful. This can be intimidating for new engineers who are just starting out in their careers, but with the right attitude and knowledge it is possible to thrive.

Credibility with Developers

Developer relations jobs at startups require credibility with both beginner and expert developers. This means that you need to be able to communicate effectively with developers who are just starting out, as well as those who are experienced in the field.

While entry level developers can empathize with other beginning developers, technically connecting with experienced developers is a tall order. This requires a deep understanding of the technology and the ability to explain it in a way that is accessible to everyone.

Credibility with Founders

Developer relations is a long game, so you need to have buy-in from the founders of the company. They need to invest in you, in the community, and in your activities that strengthen the company’s ties to the community. This requires a deep understanding of the technology and an ability to communicate effectively with developers of all levels. It also requires demonstrating technical prowess and expertise that will be respected by the founders so they can trust that their investment in you and your activities will pay off. Having a few years of experience under your belt can help build this credibility, as it shows that you are familiar with the technology and have built up a base of knowledge that can be used to benefit the company.

Learning New Technologies

Developer relations jobs at startups require you to get up to speed quickly on a variety of technologies. This means that you need to be able to learn new things quickly and be able to explain them to others.

You need to be able to understand the abstractions your product sits upon, as well as the tools it can integrate with. The more experience you have with different technologies, the easier this is. Conversely, this can be a daunting task for new engineers who are still learning the basics.

Foundation of Production Level Code Experience

Finally, developer relations jobs at startups require a foundation of production code experience. This is because you need to be able to connect with developers who will be evaluating your tool.

You need to be able to speak their language and understand their needs. This requires a deep understanding of the technology and the ability to write production quality code.

Where This Advice Doesn’t Apply

If you are joining a larger startup or company with an established developer relations team, much of this post does not apply. In this case, you’ll have founder buy-in, support from team members, and more defined tasks.

You may still have trouble connecting with experienced developers who have complex questions, but you may be able to connect them to other team members who can help.

What To Do

If you are interested in developer relations, play the long game with your career. Spend a few years as a software developer, working on a team shipping code that users will enjoy. Learn how developers think and approach problems.

You can also engage in the communities that are important to you, either online (slacks, reddit, hackernews, etc) or in-person (conferences, meetups, etc). Volunteer or speak at events, which can help you understand the nuts and bolts of what goes on.

After a couple of years, you’ll have the software engineering foundation as well as the community experience to set yourself up for a fantastic devrel career.

In Conclusion

In conclusion, developer relations jobs at startups are simply not entry-level positions. They require a diverse skill set that can be overwhelming for new engineers. They require credibility with both beginner and expert developers, the ability to get up to speed quickly on a variety of technologies, and a foundation of production level code experience.

 

Multi-tenancy options

Multi-tenancy is a key part of building a SaaS product. You want to amortize your software investment across different paying customers. Customers should never be able to access any other customer’s data. And it is often the case that customers don’t want their data intermingled with other customers’ data, though that depends on the type of data and the customer needs.

By separating customers into tenants, you can achieve this data separation. There are a number of levels of multi-tenancy. In increasing order of isolation, they are:

  • no multi-tenancy. In this situation, all the customer data is co-mingled in the database. Any data that is unique to a customer is tied to that customer with an id.
  • logical multi-tenancy. Isolation is enforced in code and the database (every table has a ‘tenant id’ key, and you’re running joins); there is a sketch of this approach after this list. When you are using this type of isolation, you want to resolve the tenant as soon as possible. This is often done with a different hostname or path. You also want to ensure that any users accessing tenant data are part of the tenant. If you use this approach, mistakes in your code can be used to ‘escape’ the tenant limitation and read other customers’ data. However, one advantage of this approach is operational simplicity: one version of the code to maintain and one version of the database. There may be support for this in your framework, or you may be rolling your own.
  • logical multi-tenancy supported by the database. Some databases support row level security isolation. In this scenario, each tenant has a different user, and the data isolation is enforced by the database. Your code is limited to looking up the correct user for a given tenant.
  • container level multi-tenancy. In this scenario, you run separate containers for each tenant. If you are using a solution like Kubernetes, you can run them in different namespaces to increase the isolation. The operational complexity increases (I did mention Kubernetes, did I not?) but it becomes far more difficult for an attacker to use the access of one tenant to get another tenant’s data. However, now you can have multiple versions of the codebase running. This can be a blessing and a curse, as it allows each client to control their version (if you enable it). This can increase support burden depending on the complexity of your application. You could also choose to run the latest code on every container, upgrading all containers every time a change is made to your software.
  • virtual machine multi-tenancy. Here you use different databases and virtual machines for each tenant. You can leverage common security defense-in-depth practices at the network level, using network access controls and firewalls. This physical isolation makes it even harder for an attacker to escape and view other tenants’ data. However, it increases your operational costs both in terms of complexity (are you going to force everyone to upgrade across the entire fleet?) and support (there may be configuration and/or code drift between the different VMs). If you pursue this, it behooves you to automate the creation of these virtual machines.
  • physical hardware isolation. With this choice, you actually run different hardware for each tenant, possibly in different data centers. This is the most secure, but the most operationally intensive. There are some options for API driven hardware setup, but the isolation, while a boon for security, makes updates and upgrades more difficult.

What is the best option for your SaaS solution? It depends on the security needs of your customers as well as your cost structure and your operational maturity. The higher the level of isolation, the harder it is to run and upgrade the various systems.