The business for which I work, 8z Real Estate, runs on Google spreadsheets–they are everywhere, and are especially powerful when combined with Google Forms. (Except in the tech department–we use more specialized tools like wikis and bug trackers.)
Recently I was cleaning up one of these spreadsheets and making it more efficient. This spreadsheet was an ongoing list of items that should be charged to various real estate agents. There was a fairly clear set of rules (person A should be charged this for item B, but not for item C, and person D should not be charged for items E, but should for item F). The rules were also fairly constant–not a lot of change. While they were clear, there were some intricacies that had tripped up some folks. And since there was real money involved, there was often a lot of time expended to make sure the charges were correct.
For all these reasons, it made sense to automate the charge calculations. Given the data was already in a Google spreadsheet, I decided a custom function was the best way to do this. The custom function could read from a configuration tab in the same spreadsheet with a list of people and items which would represent the charge rules. In this way, if there was a new item, a new row could be added to the configuration spreadsheet without requiring any help from a developer.
The initial implementation pulled data from the configuration tab with each custom function call. This naive implementation worked fine up to a couple of hundred rows. However, eventually this caused
Exceeded maximum execution time errors. I believe this is because when the spreadsheet was calculating all the charge values, it was accessing the configuration spreadsheet range hundreds of times a second.
My next step was to try to cache the configuration data. I used stringified JSON stored in the cache service. Unfortunately, this caused a different issue:
Service invoked too many times in a short time: cacheService rateMax. Try Utilities.sleep(1000) between calls.
Third time is the charm: have the function return multiple values. The configuration data is only read once, as is the list of items and names. These are iterated and the charges for all of them are calculated and returned in a double array to fill entire columns. Using this technique avoided the above issues, but created another one. Adding new rows with new items wouldn’t update the charges columns. Apparently there is additional, ill documented caching of custom function values. Luckily StackOverflow had an explanation: spreadsheets “evaluate your [custom] functions only when a parameter changes.” The way I worked around this was to pass a parameter to the custom function of the number of non blank rows of the spreadsheet:
=calculateCosts(counta(A:A)). When a new row is added, the custom function is re-evaluated.
In the end I have written unit tested code that works in the way the business wants to work (in Google Spreadsheets), that runs on someone else’s infrastructure (Google’s), that can be configured by non technical employees and that will increase the accuracy of the charge calculations. Wahoo!