Heroku 503 errors

One day recently I woke up to a text from our monitoring service (part of our minimal Heroku operations tasks) saying our application was down.

Darn it.

Here’s a recreation of my troubleshooting steps:

Log in, see that the app is indeed down.
Look at New Relic.
Look at the logs (Papertrail!).
Look at the deployment history.
Note when the issue started: curses, it wasn’t when I did a deploy.
Open a ticket with Heroku (after doing some research). (Love their support.)
Double check that the database is good and hasn’t hiccuped.
Look at the logs more.
Add more dynos, see if that helps.
Google the error message. Notice that the issue is stated right in the log file (the Passenger request queue is filling up). Here are sample error messages:
- Jul 12 08:47:37 <app-name> heroku/router: at=info method=POST path="<url>" host=app.thefoodcorridor.com request_id=8a17648f-2d84-46ea-abf2-5903be894a2c fwd="216.191.191.58" dyno=web.3 connect=1ms service=4ms status=503 bytes=477 protocol=https
- Jul 12 08:47:44 <app-name> app/web.3: [ 2017-07-12 14:47:44.3688 65/7f652dffd700 age/Cor/Con/CheckoutSession.cpp:261 ]: [Client 3-281] Returning HTTP 503 due to: Request queue full (configured max. size: 100)
Find some posts about the error message.
Start researching how to increase request queue size.
Take a walk to clear my head.
Think about what external services we call, as that seems to be what might cause the request queue to back up.
Read another post that says restarting Passenger helped.
Restart all dynos.
Problem disappears.
Look at logs more closely.
Last dyno to be restarted was the only problematic dyno.
Add comment to ticket about this being the cause.
Heroku confirms that the issue may have been the dyno: “sometimes individual dynos will hang and cause errors with 503 responses.”
Write a note to customers explaining how access to the app was affected.
Lower number of dynos.
Breathe a sigh of relief.
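
The turning point in the steps above was noticing that only one dyno was misbehaving. One way to spot that pattern sooner is to tally 503 responses per dyno from the router logs. Here is a minimal Python sketch of that idea, assuming log lines in the `key=value` router format shown above. The function names, the sample lines, and the `my-app` and `example.com` values are all made up for illustration; this is not what I actually ran that morning.

```python
import re
from collections import Counter

# Matches key=value pairs in a router log line; values may be double-quoted.
PAIR_RE = re.compile(r'(\w+)=("[^"]*"|\S+)')


def parse_router_line(line):
    """Return the key=value fields of a heroku/router line as a dict, else None."""
    if "heroku/router" not in line:
        return None
    return {key: value.strip('"') for key, value in PAIR_RE.findall(line)}


def count_503s_by_dyno(lines):
    """Count router-reported 503 responses per dyno."""
    counts = Counter()
    for line in lines:
        fields = parse_router_line(line)
        if fields and fields.get("status") == "503":
            counts[fields.get("dyno", "unknown")] += 1
    return counts


# Hypothetical sample lines, modeled on the log excerpts above.
SAMPLE_LINES = [
    'Jul 12 08:47:36 my-app heroku/router: at=info method=GET path="/" '
    'host=example.com dyno=web.1 connect=1ms service=40ms status=200 bytes=900',
    'Jul 12 08:47:37 my-app heroku/router: at=info method=POST path="/orders" '
    'host=example.com dyno=web.3 connect=1ms service=4ms status=503 bytes=477',
    'Jul 12 08:47:39 my-app heroku/router: at=info method=GET path="/orders" '
    'host=example.com dyno=web.3 connect=1ms service=3ms status=503 bytes=477',
    # App (Passenger) lines are skipped; only router lines are counted.
    'Jul 12 08:47:44 my-app app/web.3: Returning HTTP 503 due to: Request queue full',
]

for dyno, total in count_503s_by_dyno(SAMPLE_LINES).most_common():
    print(dyno, total)
```

If every 503 lands on a single dyno, as it did here with web.3, restarting just that dyno is probably a better first move than adding more dynos or restarting everything.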
