{"id":2043,"date":"2015-04-10T16:16:00","date_gmt":"2015-04-10T22:16:00","guid":{"rendered":"http:\/\/www.mooreds.com\/wordpress\/?p=2043"},"modified":"2015-04-15T06:49:41","modified_gmt":"2015-04-15T12:49:41","slug":"five-rules-for-troubleshooting-an-unfamiliar-system","status":"publish","type":"post","link":"https:\/\/www.mooreds.com\/wordpress\/archives\/2043","title":{"rendered":"Five rules for troubleshooting an unfamiliar system"},"content":{"rendered":"<figure style=\"width: 240px\" class=\"wp-caption alignleft\"><img decoding=\"async\" class=\"alignleft\" title=\"Stick Men in Trouble! by Ken and Nyetta\" src=\"http:\/\/www.mooreds.com\/wordpress\/wp-content\/uploads\/2015\/04\/6339854333_ebb3956897_m_trouble.jpg\" alt=\"trouble photo\" width=\"240\" \/><figcaption class=\"wp-caption-text\"><small>Photo by <a href=\"http:\/\/www.flickr.com\/photos\/71279764@N00\/6339854333\" target=\"_blank\">Ken and Nyetta<\/a> <a title=\"Attribution License\" href=\"http:\/\/creativecommons.org\/licenses\/by\/2.0\/\" target=\"_blank\" rel=\"nofollow\"><img decoding=\"async\" src=\"http:\/\/www.mooreds.com\/wordpress\/wp-content\/plugins\/wp-inject\/images\/cc.png\" alt=\"\" \/><\/a><\/small><\/figcaption><\/figure>\n<p>A few weeks ago, I engaged with a client who had a real issue.\u00a0 They sold a variety of goods via a website (if this was the 90s, they would have been called an &#8216;e-tailer&#8217;), and had been receiving intermittent double orders through their ecommerce system.\u00a0 Some customers were charged two times for one order.\u00a0 This led, as you can imagine, to very unhappy customers.\u00a0 This had been happening for a while and, unfortunately, due to some external obstacles, internal staff were not available to investigate the issue&#8211;they had their hands full with an existing higher priority project.<\/p>\n<p>I was called in to see if I could solve this issue.\u00a0 I had absolutely no familiarity with the system.\u00a0 But in less than ten 
hours of time, I was able to find the issue and resolve it.\u00a0 How I approached the situation can be summed up in five rules:<\/p>\n<p>Number one: <strong>define the problem<\/strong>.\u00a0 Ask questions, and capture the answers.\u00a0 What is the exact undesired behavior?\u00a0 When is the undesired behavior happening?\u00a0 What seems to trigger it?\u00a0 When did it start?\u00a0 Were there any changes that happened recently?\u00a0 Does the client have reproduction steps?<\/p>\n<p>I gathered as much information as I could, but kept it high level.\u00a0 I asked for architecture and system diagrams.\u00a0 For the history of the application.\u00a0 For access to all systems that could possibly be relevant (this will save you time in the future).\u00a0 For locations of log files, source repositories, configuration files.\u00a0 For database credentials and credentials for third party systems like credit card processors.\u00a0 It is important to resist the temptation to dive in&#8211;at this point the job is to get a high level understanding so I can be efficient in the next steps.<\/p>\n<p>You will get speculation about what the solution is when you are asking about the problem.\u00a0 Feel free to capture that, but don&#8217;t be influenced by it.<\/p>\n<p>Number two&#8211;<strong>find the finish line<\/strong>.\u00a0 After getting a clear definition of the problem, I looked in the orders database to find out whether the double orders were showing up there.\u00a0 They were, which was a clue as to which part of the system was malfunctioning, but more importantly let me see the effectiveness of any changes I was making.\u00a0 It also let the customer know the objective end goal, which can be important if this is a T&amp;M (time and materials) project, and it let me know the end state to which I was headed&#8211;important for morale.\u00a0 (BTW, don&#8217;t do fixed bids for this type of project&#8211;overruns will be unpleasant, and there will be overruns.)<\/p>\n<p>I was able to write 
a SQL script to find double orders over a given time frame.\u00a0 I ended up writing a script which emailed the results of this query to myself and the client nightly, as an easy way to track progress.\u00a0 The results of this query were a quantifiable, objective measure of the problem.<\/p>\n<p>Number three&#8211;<strong>start where you are familiar<\/strong>.\u00a0 I could have dived in and looked at the codebase, but thanks to my problem definition, I knew that there had been no changes to the checkout portion of the codebase for years.\u00a0 I also was unfamiliar with the particular software that managed the ecommerce site and could have wasted a lot of time getting up to speed on the control flow.\u00a0 Instead, once I had the SQL query, I could find users who had been double charged, and look at their sessions in the web server logs.\u00a0 I&#8217;ve been looking at Apache HTTP logs for over a decade and was very familiar with this piece of the system.<\/p>\n<p>Number four&#8211;<strong>follow your nose<\/strong>. 
I followed a few of the user sessions using grep and noticed some weirdness in the logs.\u00a0 There were an awful lot of messages that indicated the server had been restarted, and all the double orders I looked at had completed 5-6 seconds after the minute changed.\u00a0 (It&#8217;s hard to define weirdness explicitly, which is why it behooved me to start with a portion of the system that I was experienced with&#8211;it made the &#8220;weirdness&#8221; more obvious.)\u00a0 From here, I ended up looking at why or how the server was being restarted regularly.\u00a0 I ended up finding an errant cron job which was restarting the server often enough that the ecommerce system was getting confused and double booking orders&#8211;once before the restart and once after.\u00a0 This was easily fixed by commenting out the cron job.<\/p>\n<p>Number five&#8211;<strong>know when to stop<\/strong>.\u00a0 This ecommerce system obviously had a logic flaw&#8211;after all, restarting the web server shouldn&#8217;t cause an order to be entered twice, whether you restart it every hour or once a year.\u00a0 I could have dug through the code to find that out.\u00a0 But instead, I commented out the cron job, let the system run for a week or so and waited for more double orders.\u00a0 There were none, indicating that the site was low traffic enough that whatever flaw was present didn&#8217;t get exercised often, if at all.\u00a0 I confirmed with the client that this situation met his expectations of completeness, and called it good.<\/p>\n<p>Being thrown into a new system, especially when troubleshooting, is a difficult task.\u00a0 I am thankful the client was relatively responsive to my questions, and that pressure, while present, wasn&#8217;t intense.\u00a0 These five rules should help you if you are put in any troubleshooting situation.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>A few weeks ago, I engaged with a client who had a real issue.\u00a0 They sold a variety of goods 
via a website (if this was the 90s, they would [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[12,37,20],"tags":[],"class_list":["post-2043","post","type-post","status-publish","format-standard","hentry","category-consulting","category-tips","category-web-applications"],"_links":{"self":[{"href":"https:\/\/www.mooreds.com\/wordpress\/wp-json\/wp\/v2\/posts\/2043","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.mooreds.com\/wordpress\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.mooreds.com\/wordpress\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.mooreds.com\/wordpress\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/www.mooreds.com\/wordpress\/wp-json\/wp\/v2\/comments?post=2043"}],"version-history":[{"count":3,"href":"https:\/\/www.mooreds.com\/wordpress\/wp-json\/wp\/v2\/posts\/2043\/revisions"}],"predecessor-version":[{"id":2049,"href":"https:\/\/www.mooreds.com\/wordpress\/wp-json\/wp\/v2\/posts\/2043\/revisions\/2049"}],"wp:attachment":[{"href":"https:\/\/www.mooreds.com\/wordpress\/wp-json\/wp\/v2\/media?parent=2043"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.mooreds.com\/wordpress\/wp-json\/wp\/v2\/categories?post=2043"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.mooreds.com\/wordpress\/wp-json\/wp\/v2\/tags?post=2043"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}