Skip to content

All posts by moore - 27. page

Re:Invent Videos

AWS Re:Invent is supposed to be a great conference.  I have thus far been unable to attend, but the videos of the presentations are posted online with about a day’s lag.  So, like most conferences, you really should be networking and meeting people face to face rather than attending the presentations.

Here’s the AWS Youtube channel where you can watch all the videos, or just sample them.

I’ve found the talks to be of varying quality.  Some just rehash the docs, but others, especially the deep dives, discuss interesting aspects of the AWS infrastructure that I haven’t found to be documented anywhere (here’s a great talk about Elastic Block Storage from 2016).  The talks by real customers also give a great viewpoint into how AWS’s offerings are actually implemented to provide business value (here’s a great talk from 2016 about using Amazon Machine Learning to predict real estate transactions).

It’s a sprawling conference, well suited to AWS’s sprawling offering, and I bet no matter what your interest, you will be able to find a video worth watching.

Leverage

As a software developer, and especially as a senior (expensive) software developer, you need leverage. Leverage makes you more productive.  I find it also makes the job more fun.

Some forms of leverage:

  • test suite. A suite provides leverage both by serving as a living form of documentation (allowing others to understand the code) and a regression suite so that changes to underlying code can be made with assurances that external code behavior don’t happen.
  • libraries and frameworks, like Rails. By solving common problems, libraries, open source or not, can accelerate the building of your product. Depending on the maturity of the library or framework, they may cover edge cases that you would have to discover via user feedback.
  • iaas solutions, like AWS EC2. By giving you IT infrastructure that you can manipulate via software, you can apply software engineering techniques to ensure validity of your infrastructure and make deployments replicable.
  • paas solutions, like Heroku. These may force your application to conform to certain limitations, but take a whole host of operations tasks off your plate (deployments, patching servers). When a new bug comes out affecting Nginx, you don’t have to spend time checking your servers–your provider does. When you reach a certain scale, thost limitations may come back to bite you, but in the early days of a project or company, having them off your plate allow you to focus on business logic.
  • saas solutions, like Google Apps or Delighted. You can have an entire business solution available for a monthly fee. These can be large in scope, like Google Apps, or small in scope, like Delighted, but either way they solve an entire business problem. You can trade time for money.
  • experience, aka the mistakes you’ve made on someone else’s dime. This allows you leverage by pruning the universe of possibilities for solving problems, based on what’s worked in the past. You don’t spend time doing exploration or spikes. Note that experience may guide you toward or away from any of the points of leverages mentioned above. And that experience needs to be tempered with learning, as the software world changes.
  • team. There’s only so much software you can write yourself. A team can help, both in terms of executing against a software design/architecture and improving it via their own experience.

Leverage allows you to be more productive and the more experienced you get, the more you should seek it.

A blog post every work day for December

Dear Reader,

I’m planning to do a personal challenge the month of December.  I’m going to write a blog post every Monday, Tuesday, Wednesday, Thursday and Friday of the month (excepting any holidays).  Most of the topics will be technical, but some may focus on leadership.

I’m writing this blog post and publishing it in late November to give you a ‘heads up’.  Some of you receive my blog via email, and I wanted to give you a warning and let you unsubscribe if this flurry of posts wasn’t intriguing.

Thanks!

Dan

Useful Tool: Delighted for NPS reports

So, I was looking to add a simple widget to collect net promoter scores (often called NPS).  I was astonished at the dearth of options for standalone NPS tracking.  I assume that most customer relationship software has an NPS tracker embedded in it, but we didn’t want to use anything other than a simple standalone widget.

Two options that popped up: Delighted and Murm.  I did a quick spike to evaluate each of these tools.  At the time I reviewed it, I couldn’t get Murm to work, and they didn’t respond to my customer support request.  Delighted, on the other hand, did so.  They let me have access to web display, which was a beta feature at the time and worked with us on price.  It was trivial to install, though I’m still a bit unclear how the form determines when to display.  Highly recommended.

The nice thing about having a NPS tracker on your website is you can get direct feedback from your users.  This has led to numerous useful conversations and feature requests, as someone who is using our software for the first time brings new clarity to confusing features or user interface.  Plus, it is a great number to track.

The three stages where you can transform data for Amazon Machine Learning

When creating an AML system, there are three places where you can transform your data. Data transformation and representation are very important for an effective AML system.  I’d suggest watching about five minutes of this re:Invent video (from 29:14 on) to see how they leveraged Redshift to transform purchase data from a format that AML had a hard time “understanding” to one that was “easier” for the system to grok.

The first time to transform your data is before the data ever gets to an AML datasource like s3 or redshift.  You can preprocess the data with whatever technology you want (Redshift/SQL, as above, EMR, bash, python, etc).  Some sample transformations might be:

At this step you have tremendous flexibility, but it requires staging your data.  That may be an issue depending on how much data you have, and may affect which technology you use to do the preprocessing.

The next place you can modify the data is at datasource creation.  You can omit features (but only using the API by providing your own schema with an ‘excludedAttributeNames’ value, not the AWS console), which could speed up processing and lower the total model size.  It could also protect sensitive data.  You do want to provide AML with as much data as you can, however.

As long as a feature is valid in both types, you can create multiple data sources with different data types for a feature.  The only type of feature that I know of that is a valid in multiple AML datatypes is an integer number, which, as long as it only has N values (like human age), could be represented as either a numeric value or a categorical value.

The final place you can modify your data before the model sees it is in the ML recipe. You have about ten or so functions that AML provides that you can apply to your data as it is read from the data source and fed to the model.  You can also create intermediate representations and make the available to your model (lowercase a string feature, for example).

Using a recipe allows you to modify your data before the model sees it, but requires no staging on your part of the source or destination data.  However, the number of transformations is relatively limited.

You can of course combine all three of these methods when building AML models, to give you maximum flexibility.  As always, it’s best to try different options and test the results.

Talking to your customers

One of my favorite parts of The Food Corridor is talking to customers.  As the main technical force there, it’s a great opportunity for me to interact with folks whose lives my work is (hopefully) making better (and sometimes making worse).

The two main ways I do this:

  1. I do customer service.  We have zendesk and a common email inbox and I take time away from developing to answer emails.  This gives me a feel for the rough edges of our product and helps me build empathy for our users (“why couldn’t they see that you just click here and then here and then… oh, that’s why”).  It also has led to a number of bug reports that make the product better.  I also answer phone calls from our google voice number.
  2. I schedule a monthly meeting with some of our bigger customers.  These tend to be 15-60 minutes long.  This meeting lets me hear directly from them what they like about the platform, and more importantly, what is missing or broken.  I rotate among a number of different customers because I don’t want to be pulled too far in one direction, and they all have slightly different needs, but hearing from them regularly helps me triangulate.  It’s important to capture what they say in some kind of tracking system, even if you don’t execute against them for a while.  This in person call also lets me let the customer know of certain other features that are new and/or may be of interest.  Frankly, this meeting can be exhausting because of the wish list aspect of it (“oh man, what would it take to implement that?”) but I try to avoid that and just be an open listener.  I think that the customers also enjoy direct access to a developer.  This certainly doesn’t scale as well as option #1.

If you are going to pursue this:

  • make it a priority and realize it is going to affect your ability to deliver code
  • don’t get defensive when your product is criticized
  • take notes
  • seek out customers with a variety of perspectives
  • don’t commit to anything new in this call, but do let them know high level roadmap if they ask
  • ask them about items outside of your product if you have time.  This can clue you in to other problems they may have.

Customer service a different activity from developing software.  It’s very choppy, and you can encounter folks that are … having a rough day and perhaps taking it out on you.  But it also is one of the best ways to make sure that you, the person building a solution, stays in touch with the people using the solution.

Meetup talk outline

If you are thinking about doing a tech talk at a meetup, you should!  It’s a great way to deepen your experience, try a different skill and learn a lot.  It also has the benefit of making you a higher profile developer.

I was coercing a friend into talking at a meetup and he asked if I had any questions for his talk.  ‘X’ is what he was talking about.  (Where ‘X’ in this case was webhooks, but it could be any technology or protocol that is of interest to you.)

I rattled off the following set of questions that would be of interest.  I thought they might make a good template for any future meetup talks, so wanted to record them here for posterity.

  • what is X?
  • why does X exist?
  • what are prominent apps that use this tech?
  • how do you use it?
  • how would you test it?
  • how do you deal with dev/test/prod environments?
  • are there any gotchas?  Have any war stories?
  • how do you troubleshoot?
  • alternatives?  strengths and weaknesses of this solution or the alternatives?
  • any third party libraries that someone should be aware of?  How about tools?

What do you want to hear from presenters?

Things I wish I knew about Stripe

Caterpillar
Striped, but not charging your credit card

So, at The Food Corridor, we’ve been using Stripe happily since we launched in June of 2016.  As a developer, I’d used Stripe before in a couple of different ways, but this has definitely been my most sustained use of the payment service.  (If you don’t know what Stripe is, it is an API that makes charging customers as easy as an API call.  More here.)

I wanted to outline some of the things I’ve learned from months of using Stripe.

  • Stripe supports pulling money directly from bank accounts, via ACH, but it really isn’t the same ACH as your bank lets you do.  This is because Stripe isn’t a bank.  The biggest thing to be aware of here is that Stripe ACH takes 7 days to arrive in your bank account.  Another issue is that you have to do verification.  They have two ways of doing verification–micro deposits and Plaid.  Plaid is instant, but only supports major banks, which was a non starter for us (updated 9/8: Plaid supports around 1000 banks now).  The code for micro deposits is straightforward, but be prepared for some customer support issues.  Stripe deposits two amounts and withdraws just one amount, which was confusing for some of our users.  It also takes a couple of days, so if your users are hot to spend money, Stripe ACH may not be a fit.  The win?  Definitely cheaper.  (And I didn’t find any other service that would support both credit card and ACH transactions that was developer friendly.)
  • Don’t forget to set up your webhooks out of the gate.  Stripe mentions this, but I glossed over it in the early days, and missed some events that were important.  (The most relevant is that ACH is asynchronous, so when an ACH transfer fails, it is reported via webhook.  If bank account verification doesn’t work, you’ll get a different kind of webhook.  Review the docs and set up webhooks for all the ACH events.)  If you don’t have time for a full featured webhook processing implementation, Zapier can just send the webhook data to your email. This can be a great stopgap solution.  Or you can use stripe_event.
  • Per support, if a webhook post fails (because your app is down, for example), they are retried once an hour for 72 hours.
  • Speaking of stopgap solutions, the Stripe Dashboard is fantastic for manual processes.  Just because you can automate everything via an API, doesn’t mean you should.  There can be some complicated edge cases with payment processing, especially around refunds, but they can easily be handled with a google doc of instructions and the Stripe Dashboard.  I have found only one use case that the API can handle that the dashboard cannot (a partial refund of an ACH transaction).
  • I have found Stripe support to be excellent, quick and knowledgeable.
  • Occasionally customer charges will be declined because of bank fraud triggers.  Expect to occasionally ask your customers to call their bank.  (I think this has happend about once every third month).
  • Disputes are a total pain, because the process is opaque and slow (expect a resolution in about two months and know you are not in possession of the payment during that time).
  • Make sure to capture the payment id anytime you charge a card or run ACH.  It will make future automation a lot easier.
  • Monthly plans are complicated, so if you can lean on Stripe for management, even if you are doing manual plan management (applying coupons, adding, or removing users from plans via the dashboard), do that.
  • The first payment you charge takes 7 days to move from stripe to your​ bank account.  This is for fraud protection.  Payments thereafter typically take 2 days (but it depends on your country and industry).

And here are some special tips if you are using Stripe Connect (their marketplace product).

  • Read the docs!
  • Remember that first payment timeline?  It applies to every one of the connected accounts.  Think about charging your own credit card as soon as you connect an account to help with customer cash flow.
  • Consider whether you want to use managed vs standalone accounts.  Managed accounts are a lot more work but allow you to have a seamless UX that you control.  Standalone accounts, which we use, are far quicker to setup.  I think this depends on the number of sellers you have in your marketplace.
  • You also want to think about whether to place the charges on the platform account or on the connected accounts.  A major factor there is who bears the Stripe fees, the platform or the sellers.  We charged on the platform account because we wanted all our data in one place.  If you are selling plans, you can’t charge on the platform and use Stripe plans.
  • If you are charging on the platform account, and are using standalone accounts (where the sellers have to set up a stripe account) your sellers won’t see charge descriptions unless you manually copy the description over.  The code looks like:

# this will let the sellers know what invoice the charge was for
transfer_id = charge.transfer
transfer = Stripe::Transfer.retrieve(transfer_id, expand: ['destination_
payment'])
payment_id = transfer.destination_payment
payment = Stripe::Charge.retrieve(payment_id, {stripe_account: destinati
on_account_id})
payment.description = description
payment.save

Happy charging!

Announcing the Introduction to Amazon Machine Learning Video Course

Would you like an easy introduction to machine learning?  Without downloading any open source software, reading documentation and blog posts, and/or installing and configuring the system?

Amazon Machine Learning is a great way to explore machine learning without having to run any infrastructure.  It lets you build high performance cost effective systems to predict outcomes based on past data.

If you’re interested in learning more about Amazon Machine Learning, you can view my video course on O’Reilly Safari.  Over an hour and a half of video talking about all aspects of Amazon Machine Learning.

This course shows you how to build a model using Amazon Machine Learning (Amazon ML) and use it to make predictions. AWS expert Dan Moore covers the basic types of machine learning, how to prepare your data, and how to make your data available to the Amazon Machine Learning processes. You’ll also learn about evaluating a model for accuracy, using it both for batch and real-time predictions, and using tags to manage environments. Designed for developers and technical marketers new to machine learning and for data scientists interested in using the AWS Amazon ML platform, the course provides hands-on experience building a working predictive model using real data. Learners should obtain an AWS account (free from Amazon) and a basic understanding of AWS concepts before beginning the course.

Heroku 503 errors

One day recently I woke up to a text from our monitoring service (part of minimal heroku operations tasks) saying our application was down.

Darn it.

Here’s the recreation of my steps in troubleshooting this issue:

  • Login, see that the app is indeed down.
  • Look at newrelic.
  • Look at the logs (papertrail!).
  • Look at the deployment history.
  • Note when the issue started–curses, not when I did a deploy.
  • Open a ticket with heroku (after doing some research). (Love their support.)
  • Double check that the database is good and hasn’t hiccuped.
  • Look at the logs more.
  • Add more dynos, see if that helps.
  • Google the error message.
    •  <app-name> heroku/router: at=info method=POST path=”<url>” host=app.thefoodcorridor.com request_id=8a17648f-2d84-46ea-abf2-5903be894a2c fwd=”216.191.191.58″ dyno=web.3 connect=1ms service=4ms status=503 bytes=477 protocol=https“Notice that the issue is being stated right in the log file (passenger request queue filling up).  Here are sample error messages?
    • <app-name> app/web.3: [ 2017-07-12 14:47:44.3688 65/7f652dffd700 age/Cor/Con/CheckoutSession.cpp:261 ]: [Client 3-281] Returning HTTP 503 due to: Request queue full (configured max. size: 100)
  • Find some posts about the error message.  Here and here.
  • Start researching how to increase request queue size.
  • Talk a walk to clear my head.
  • Think about what external services we call, as that seems to be what might cause the request queue to back up.
  • Read another post that says restarting passenger helped.
  • Restart all dynos.
  • Problem disappears.
  • Look at logs more closely.
  • Last dyno to be restarted was the only problematic dyno.
  • Add comment to ticket about this being the cause.
  • Heroku confirms that the issue may have been the dyno: “sometimes individual dynos will hang and cause errors with 503 responses”
  • Write note to customers about the issue explaining how access to app was affected.
  • Lower number of dynos.
  • Breath a sigh of relief.