Skip to content

Using GPT to automate translation of locale messages files

At my current employer, FusionAuth, we have extracted out all the user facing messages to properties files. These files are maintained by the community, and cover over fifteen languages.

We maintain the English language version. Whenever new user facing messages are added, the properties file is updated. Sometimes, the community contributed messages files are out of date.

In addition, there are a number of common languages that we simply haven’t had a community member offer a translation for.

These include:

  • Korean (80M speakers)
  • Hindi (691M)
  • Punjabi (113M)
  • Greek (13.5M)
  • Many others

(All numbers from Wikipedia.)

While I have some doubts and concerns about AI, I have been using ChatGPT for personal projects and thought it would be interesting to use OpenAI APIs to automate translation of these properties files.

I threw together some ruby code, using ruby-openai, the ruby OpenAI community library that had been updated most recently.

I also used ChatGPT for a couple of programming queries (“how do I load a properties file into a ruby hash”) because, in for a penny, in for a pound.

The program

Here’s the results:


require "openai"
key = "...KEY..."

client = OpenAI::Client.new(access_token: key)

def properties_to_hash(file_path)
  properties = {}
  File.open(file_path, "r") do |f|
    f.each_line do |line|
      line = line.strip
      next if line.empty? || line.start_with?("#")
      key, value = line.split("=", 2)
      properties[key] = value
    end
  end
  properties
end

def hash_to_properties(hash, file_path)
  File.open(file_path, "w") do |file|
    hash.each do |key, value|
      file.write("#{key}=#{value}\n")
    end
  end
end

def build_translation(properties_in, properties_out, errkeys, language, client)
  properties_in.each do |key, value|
    sleep 1
# puts "# translating #{key}"
    message = value
    content = "Translate the message '#{message}' into #{language}"
    response = client.chat(
      parameters: {
        model: "gpt-3.5-turbo", # Required.
        messages: [{ role: "user", content: content}], # Required.
        temperature: 0.7,
      }
    )
    if not response["error"].nil?
      errkeys << key #puts response 
    end 

    if response["error"].nil? 
      translated_val = response.dig("choices", 0, "message", "content") 
      properties_out[key] = translated_val 
      puts "#{key}=#{translated_val}" 
    end 
  end 
end

# start the actual translation 
file_path = "messages.properties" 
properties = properties_to_hash(file_path) 
#puts properties.inspect 
properties_hi = {} 
language = "Hindi" 
errkeys = [] 

build_translation(properties, properties_hi, errkeys, language, client) 
puts "# errkeys has length: " + errkeys.length.to_s 

while errkeys.length > 0
# retry again with keys that errored before
  newprops = {}
  errkeys.each do |key|
    newprops[key] = properties[key]
  end

  # reset errkeys
  errkeys = []

  build_translation(newprops, properties_hi, errkeys, language, client)
  # puts "# errkeys has length: " + errkeys.length.to_s
end

# save file
hash_to_properties(properties_hi, "messages_hi.properties")

More about the program

This script translates 482 English messages into a different language. It takes about 28 minutes to run. 8 minutes of that are the sleep statement, of which more below. To run this, I signed up for an OpenAI key and a paid plan. The total cost was about $0.02.

I tested it with two languages, French and Hindi. I used French because we have a community provided French translation. Therefore, I was able to spot check messages against that. There was a lot of overlap and similarity. I also used Google Translate to check where they differed, and GPT seemed to be more in keeping with the English than the community translation.

I can definitely see places to improve this script. For one, I could augment it with a set of loops over different languages, letting me support five or ten more languages with one execution. I also had the messages file present in my current directory, but using ruby to retrieve them from GitHub or running this code in the cloned project would be easy.

The output occasionally needed to be reviewed and edited. Here’s an example:

[blank]=आवश्यक (āvaśyak)
[blocked]=अनुमति नहीं है (Anumati nahi hai)
[confirm]=पुष्टि करें (Pushṭi karen)

Now, I’m no expert on Hindi, but I believe I should remove the English/Latin letters above. One option would be to exclude certain keys or to refine the prompt I provided. Another would be to find someone who knows Hindi who could review it.

About that sleep call. I built it in because in my initial attempt, I saw error messages from the OpenAI API and was trying to slow down my requests so as not to trigger that. I didn’t dig too deep into the reason for the below exception; at first glance it appears to be a networking issue.


C:/Ruby31-x64/lib/ruby/3.1.0/net/protocol.rb:219:in `rbuf_fill': Net::ReadTimeout with #<TCPSocket:(closed)> (Net::ReadTimeout)
        from C:/Ruby31-x64/lib/ruby/3.1.0/net/protocol.rb:193:in `readuntil'
        from C:/Ruby31-x64/lib/ruby/3.1.0/net/protocol.rb:203:in `readline'
        from C:/Ruby31-x64/lib/ruby/3.1.0/net/http/response.rb:42:in `read_status_line'
        from C:/Ruby31-x64/lib/ruby/3.1.0/net/http/response.rb:31:in `read_new'
        from C:/Ruby31-x64/lib/ruby/3.1.0/net/http.rb:1609:in `block in transport_request'
        from C:/Ruby31-x64/lib/ruby/3.1.0/net/http.rb:1600:in `catch'
        from C:/Ruby31-x64/lib/ruby/3.1.0/net/http.rb:1600:in `transport_request'
        from C:/Ruby31-x64/lib/ruby/3.1.0/net/http.rb:1573:in `request'
        from C:/Ruby31-x64/lib/ruby/3.1.0/net/http.rb:1566:in `block in request'
        from C:/Ruby31-x64/lib/ruby/3.1.0/net/http.rb:985:in `start'
        from C:/Ruby31-x64/lib/ruby/3.1.0/net/http.rb:1564:in `request'
        from C:/Ruby31-x64/lib/ruby/gems/3.1.0/gems/httparty-0.21.0/lib/httparty/request.rb:156:in `perform'
        from C:/Ruby31-x64/lib/ruby/gems/3.1.0/gems/httparty-0.21.0/lib/httparty.rb:612:in `perform_request'
        from C:/Ruby31-x64/lib/ruby/gems/3.1.0/gems/httparty-0.21.0/lib/httparty.rb:542:in `post'
        from C:/Ruby31-x64/lib/ruby/gems/3.1.0/gems/httparty-0.21.0/lib/httparty.rb:649:in `post'
        from C:/Ruby31-x64/lib/ruby/gems/3.1.0/gems/ruby-openai-3.7.0/lib/openai/client.rb:63:in `json_post'
        from C:/Ruby31-x64/lib/ruby/gems/3.1.0/gems/ruby-openai-3.7.0/lib/openai/client.rb:11:in `chat'
        from translate.rb:33:in `block in build_translation'
        from translate.rb:28:in `each'
        from translate.rb:28:in `build_translation'
        from translate.rb:60:in `

(Yes, I’m on Windows, don’t hate.)

Given this was a quick and dirty program, I added the sleep call, but then, later, added the while errkeys.length > 0 loop, which should help recover from any network issues. I’ll probably remove the sleep in the future.

I signed up for a paid account because I was receiving “quota exceeded” messages. To their credit, they have some great billing features. I was able to limit my monthly spend to $10, an amount I feel comfortable with.

As I mentioned above, translating every message into Hindi using GPT-3.5 cost about $0.02. Well worth it.

I used GPT-3.5 because GPT-4 was only in beta when I wrote this code. I didn’t spend too much time mulling that over, but it would be interesting to see if GPT4 is materially better at this task.

Worries

Translating these messages was a great exploration of the power of the OpenAI API, but I think it was also a great illustration of this tweet.

I had to determine what the problem was, and how to get the data into the model, and how to pull it out. As Reid Hoffman says in Impromptu, GPT was a great undergraduate assistant, but no professor.

Could I have dumped the entire properties file into ChatGPT and asked for a translation? I tried a couple of times and it timed out. When I shortened the number of messages, I was unable to figure out how to get it to ignore comments in the file.

One of my other worries is around licensing. I’m not alone. This is prototype code running on my personal laptop and the license for all the localization properties files is Apache2. But even with that, I’m not sure my company would integrate this process given the unknown legal ramifications of using OpenAI GPT models.

In conclusion

OpenAI APIs expose large language models and make them easy to integrate into your application. They are a super powerful tool, but I’m not sure where they fit into the legal landscape. Where have we heard that before?

Definitely worth exploring more.

Books and other resources to level up as a software developer

A while ago there was an HN post asking for suggestions on [r]eading material on how to be a better software engineer.

Here’s my list based in part off a comment I made there.

First, books:

  • Secrets of Consulting by Gerald Weinberg because every problem is a people problem.
  • Refactoring by Martin Fowler et. al. discusses how and why to refactor, as well as providing a nomenclature for the process.
  • Code Complete by Steve McConnell is a bit dated (the last version I could find was from 2004) but a great overview of the entire software process, from requirements to maintenance.
  • The Mythical Man-Month by Fred Brookes covers best practices about software development, written about a project from the 1960 and 1970s. Nothing new under the sun.
  • The Joel On Software Strategy Letters cover different aspects of software strategy. The link is the first one, but all of them (I think there are five) are great.
  • Letters to a New Developer is a collection of essays helpful to new developers. Note I wrote this book, but I think it does a good job of discussing the “soft skills” in an easily digestible format.
  • The Pragmatic Programmer by Dave Thomas and Andy Hunt. I haven’t read the revised 20th anniversary edition, but the first one opened my eyes to the craft of software.
  • High Output Management by Andy Grove illustrates how think about throughput.
  • The Phoenix Project by Gene Kim et. al. is a fun novel(!) about applying lean management principles to software engineering.
  • Good to Great by Jim Collins focuses on what great companies bring to the table. Helps me evaluate where to work.
  • Managing Humans by Michael Lopp. The whole Rands site is worth reading, but I enjoyed this book about how to manage teams and build software. See above.
  • Don’t Make Me Think by Steve Krug shows ways to think about usability, focusing on webapps. Short and easy.
  • Badass: Making Users Awesome, by Kathy Sierra helps you put yourself in the shoes of your users and think about how to build software they will love. Short and easy.

Then, podcasts and videos:

  • Mastery Autonomy and Purpose, a great video about what people really want in work.
  • The Manager’s Toolbox podcast, focuses on nuts and bolts skills for managing people. You didn’t say you wanted to be a people manager, but knowing what managers think about will make you more effective in any org.
  • Screaming in the Cloud is useful for keeping up with AWS and other cloud provider offerings.
  • SE Radio is a bit dry, but has a great back catalog of software engineering focused episodes.

A few other resources:

  • The Rands Leadership Slack has over 10,000 engineering leaders discussing all kinds of software related topics.
  • CTO lunches is an email list of engineering leaders. The discussions aren’t consistent, but when they happen, they’re great. Plus, it comes to your email.
  • HackerNews is a great way to burn time, but also a great way to keep on top of topics that are top of mind of some of the best developers in the world.

Reading up on software practices can help you level up as a software engineer because you’ll be able to avoid mistakes others have made before you. I can also offer a view of the big picture; knowing how your software helps your organization or company will only make you more valuable as a developer.

GitHub Actions Are Amazingly Easy

GitHub Workflows are automated jobs that can be triggered by various events against a GitHub repository. They are pretty awesome.

GitHub Actions are a way to encapsulate configuration and functionality in a way that can be easily reused in GitHub Workflows.

I was thinking it’d be fun to create some GitHub Actions (yes, I’m the life of the party), so I sat down a few mornings ago to do this. I was shocked at how easy it was.

I followed a few lines of this tutorial to create a workflow. Then I created an action by following this tutorial. Finally, I edited my workflow to use the new action. That was it.

It was amazingly simple and took me about 30 minutes. I ran into one unrelated issue (to set the executable bit on a shell script in windows, I had to modify the shell script contents in order to ensure the change was sent to the remote repo).

If you take a look, you’ll see these are both toy repositories, to be sure. However, the ability to write jobs which will be executed on a git push, pull request or other events is great and removes toil. Being able to extract common functionality to an action is even better. Finally, the ability to share the action publicly by adding it to the GitHub marketplace is fantastic.

I’ve liked CircleCI for a long time, but if I were them I’d be worried.

One issue I found is that the testing/release cycle is pretty tedious (I’ve mentioned that action debugging to be an issue for a while).

While I was troubleshooting my executable bit error, I had to do the following every time I wanted to test a change:

  • make a change in the action repository
  • create a new tag
  • push it to the remote
  • switch to the workflow repository
  • bump the action version
  • push to the remote
  • wait for the workflow to complete

Not horrific, but pretty tedious. I don’t know if there are other options such as local deployment which would reduce that cycle, but that would be swell.

Other than that, 10 out of 10, would write more actions.

How To Start Improving a Legacy App

An interesting question appeared on HN recently: “Ask HN: Inherited the worst code and tech team I have ever seen. How to fix it?

You can read that post and the answers there. I’m going to address a related, but different question in this post.

If you run encounter an application that has tremendous business value and yet is not following any modern software management processes, what are concrete steps you can take to help improve the application?

That is, what if you have (or are hired to be responsible  for) an app like the poster of the HN thread:

  • making plenty of money for the company
  • no version control
  • old school structure
  • a ball of mud architecture
  • no code deleted, just commented out
  • multiple versions of libraries on the front and back end
  • etc

First off, you need to convince someone that it is going to be worthwhile to invest in this process. If you can’t do that, you are dead in the water. So look for the pain points that occur when best practices are lacking:

  • slow delivery of features
  • catastrophic bugs which lose money, hurt the brand or impact data
  • talent hard to hire
  • hard to improve the application

If you can pinpoint pain caused by the app, you can start to build a case to improve it.

If you can’t, well then, maybe you shouldn’t touch it. If it ain’t broke, don’t fix it!

With that said, here’s my list of what to implement, in rough priority order. Don’t worry about best of breed for the tools, just pick what the company uses. If the tool isn’t in use at the company, pick something you and the team are familiar with. If there is nothing in that set, pick the industry standard. I include a recommendation for the latter.

1. Get the app under version control. Git is best if you don’t have any existing solution. GitHub or GitLab are great places to store your git repositories.

2. Start up a bug tracker. You have to have a place to keep track of all issues. GitHub issues is adequate, but there are a ton of options. This would be an awesome place to get buy-in from the existing team about whichever one they prefer. The truth it is doesn’t matter which particular bug tracker you use, just that you use one.

3. A way to get one click or zero click deploys. A SaaS tool like CircleCI, GitHub actions is fine. If you require “on prem”, Jenkins is a fine place to start. But you want to be able to deploy changes quickly.

4. Set up a staging environment. With one, you can manually test changes and debug issues without affecting production. Building this will also give you confidence that you understand how the system is deployed. Then you can can include that in the build tool process.

5. Unit and system/end to end testing. End to end testing can give you confidence in changes. However, it is overwhelming to add testing to an existing large, crufty codebase. I’d focus on two things: unit testing some of the weird logic; this is a relatively quick win. Second, setting up at least one or two end to end tests through core flows (login, purchase path, etc). In my experience, setting up the first instance of each of these is the toughest process, then it gets progressively easier. There’s usually an ‘xUnit’ framework in any language for unit testing. Look for that. I’m not sure what best practice is in end to end testing, but selenium or cypress are good for browser based applications.

6. Capture documentation. This might be a higher priority, depending on what your relationship with the existing team is. Few teams will say no to someone helping out with doc. Document high level architecture, deployment processes, key APIs, interfaces, data stores, and more. Capture this in google docs or a wiki if you don’t have an existing solution.

7. Start using data migrations. Having some way to automatically roll database changes forward and back is a huge help for moving faster.

None of these are about changing the code (except maybe the last one), but they all wrap the code in a blanket of safety.

After implementing one or more of these, the team should be able to move faster and with more confidence. This will build trust and allow you to suggest bigger changes, such as bringing in a framework or building abstraction layers.

GitHub actions and workflows

I recently wrote my first real GitHub action workflow at work. It was to publish our website after a merge or push to our main branch.

After this experience, I think these workflows are perfect for simple automation tasks. Things like:

  • Running a linter like rubocop on your code
  • Deploying a simple application (one or a few artifacts).
  • Running unit and integration tests.

I didn’t use self hosted actions, though that seems like a nice escape valve if you want to run things within your own network or run over limit. GitHub publishes the action and workflow limits (storage, runtime) and that’s definitely worth reviewing.

You also can easily stand up a couple of different service containers (right now only postgresql and resdis) for easy integration testing. You can also abstract out your commonly used workflow segments to versioned actions.

It was really a pain to write the workflow, however. I had to push repeatedly to our mainline branch, and there were times I screwed up the YAML or didn’t have my script correct. The feedback loop was slooow. Ouch. There are solutions to run them locally, but I didn’t try it yet.

Other than that, it was a positive experience. If you are using GitHub and have automation needs, take a look at GitHub actions. I am a big fan of CircleCI and have been for years. GitHub actions covers a lot of the same ground. GitHub actions are less sophisticated, but it seems like a definite “innovators dilemma” play. So I expect to see actions to get more and more sophisticated.

Book Review: The Cuckoo’s Egg

I recently finished “The Cuckoo’s Egg”, by Clifford Stoll. It was a fascinating non-fiction book exploring the foundations of computer security in a personable format.

The setting is the mid 1980s. The author discovers something weird on his academic computer system. There’s an unexplained charge of 75 cents. He digs deeper, discovers that someone who’s left the university is logging in.

After further investigation, he discovers that the user who is logging in is an intruder. After discussing the situation with his boss, he gets three weeks to find out who they are. He figures that’s plenty of time.

The investigation ends up taking a year.

It also extends far beyond his academic systems, both in scope and effort. Stoll talks to numerous government agencies and private organizations, letting them know they’ve been attacked and getting their assistance tracing the hacker. He sleeps under his desk. He rigs up a pager so that he can know which accounts the hacker is using. Stoll sets up printers so that every word the hacker types is recorded, unbeknownst to him.

It’s quite the tale. As someone who has worked with software for years, I really appreciated the historical nature of it. When I became aware of the internet, in my youth, some of the groups and communities he mentions were still around; I remember reading and posting to usenet. But many of the systems were before my time. I’ve never touched a computer running VMS, for instance.

But, for all the history, the people problems were the same: users not changing passwords, system managers not locking their software down, bureaucrats happy to take information but not willing to share. Let’s just say, mistakes were made.

I also enjoyed the author’s interspersal of lived experiences. We don’t simply follow one computer nerd tracking another. We also learn about milkshakes, parties in San Francisco, curry nights and his first experience with the microwave. While some phrases and analogies are repeated (“should we thank someone who goes to a little town and robs people to illustrate they should lock their doors” pops up at least twice), in general the book is pretty readable. Stoll’s personal stories and musings help that readability immensely.

All in all, a great book if you are interested in the history of computing or modern security practices. If you’re interested in learning more, you can check out a paper he wrote based on the same experience for ACM.

Book Review: Algorithms to live by

I finished Algorithms to Live By, by Brian Christian and Tom Griffiths. I enjoyed it immensely.

The premise of the book is that computer programs make decisions with algorithms all the time. There’s math behind how they do so, including tradeoffs and time considerations. Human beings face some of the same decisions and there’s no reason we can’t use this knowledge to live better lives.

Yes, this is kind of a self help book—“you can live a better life by thinking like a computer”. But it’s math, folks.

I actually recommended it to my SO because I feel like she’d understand me better after reading it. I’m always talking about how much of the stress in our life is due to resource contention.

The book covers a wide swathe of decision making. Here are some examples of the broad categories and some specific references:

  • when to decide to stop looking for a house or a partner
  • when to explore new knowledge vs exploit current knowledge in the context of clinical trials
  • how randomness can lead to better outcomes
  • how caches can help you determine what of your wardrobe to keep
  • how overfitting warps sports like fencing

As illustrated above, this is not a book about theory, but is actually hands on. I don’t recall seeing a single equation, though there are graphs. And the authors do mention plenty of researcher names, provide footnotes and have a twenty page bibliography. So if you want to learn more about the formal proofs, the info is there.

It’s hard to choose just one takeaway from this book, but if I had to pick one, it’d be the fact that game theory shows that you need external forces to avoid the tragedy of the commons, and that emotions may play a role in providing that.

Here’s an excerpt if you want a deeper look.

If you spend any time thinking about how you can make decisions better, Algorithms to Live By is worth reading.

What I wish I had known when I was starting out as a developer

As I get older, I wish I could reach back and give myself advice. Some of it is life advice (take that leap, kiss that girl, take that trip) but a lot of it is career advice.

There’s so much about being a software developer that you learn along the way. This includes technical knowledge like how to tune a database query and how to use a text editor. It also includes non technical knowledge, what some people term “soft skills”. I tend to think they are actually pretty hard, to be honest. Things like:

  • How to help your manager help you
  • Why writing code isn’t always the best solution
  • Mistakes and how to handle them
  • How to view other parts of the business

These skills took me a long time to learn (and I am still learning them to this day) but have had a material impact on my happiness as a developer.

I am so passionate about this topic that I’ve written over 150 blog posts. Where are these? They’re over at another site I’ve been updating for a few years.

And then I went and wrote a book. I learned a bunch of lessons during that process, including the fact that it’s an intense effort. I wrote it with three audiences in mind:

  • The new developer, with less than five years of experience. This book will help them level up.
  • The person considering software development. This book will give them a glimpse of what being a software developer is like.
  • The mentor. This book will serve as a discussion guide for many interesting aspects of development with a mentee.

The book arrives in August. I hope you’ll check it out.

Full details, including ordering information, over at the Letters To A New Developer website.

What senior engineers do: fix knowledge holes

I had a bad week a few weeks back, and got together with a friend and former colleague. We started trading war stories. He mentioned one time when he was working for a company which had both a desktop client and a server component. The communication between the two pieces of the system was completely, utterly undocumented. The original engineers who had written each piece of the app had departed. The system was key to the company’s business (it was what they sold!).

This “knowledge hole” was identified by my friend. He was a software engineer. This was a key piece of software, but was unknown. The problem was not getting solved.

So, he solved it.

He installed the client and a TCP sniffer, and clicked around the client. He recorded all the traffic. He literally reverse engineered the client/server communication protocol. He then documented what the communication protocol was, so the the next group of engineers could benefit. He was was a software developer who was responsible for the back end systems. I don’t think he was directly responsible for the client/server protocol and the client was, I believe, outside his purview. (Updated 7/20) Note, he was a software engineer, not QA or a network engineer. But he didn’t let the role definition stop him.

I thought to myself, this is the textbook definition of a senior engineer. You see a problem, you solve it (thoroughly), you document it and you level up your team.

It was pretty awesome to hear about.

Exercism.io: Level up your coding skills

Children at playI’ve really enjoyed Exercism.io. This is an online learning platform for coding. It’s similar to HackerRank or Codewars.

The main difference is that for a subset of the problems, you actually have a mentor give you feedback. You download a problem (with tests) and code up the solution however you like. Then you push the solution up to a central repository where a volunteer mentor reviews it. I’ve had a different mentor for every problem. The same mentor stays with you for the duration of one single problem, and gives you feedback on style and API usage. They also provide accountability in a way that a computer grading my code just doesn’t do it for me.

Each language has a track (as well as some specialized frameworks like React). I’m working on two tracks right now. The first is a language I’m intermediate in (ruby). I’ve learned a lot about the standard library as well as ruby idiom. The second is a language I’m at best a beginner at: javascript (specially modern javascript which has changed a lot).

But there are a lot of other languages I’m looking forward to looking at when I finish these tracks (functional languages like OCaml, for one).

A final benefit of Exercism is that each problem is a challenge but not an impossible one. I find coding up a naive solution to the challenge takes about 15 minutes, and I’ll often do that first before I try to refine it to make it a bit better. It’s kinda like TDD but someone has written all the tests for you, so you just get to do the fun part.

Check it out.