Announcing the Introduction to Amazon Machine Learning Video Course

Would you like an easy introduction to machine learning?  Without downloading any open source software, reading documentation and blog posts, and/or installing and configuring the system?

Amazon Machine Learning is a great way to explore machine learning without having to run any infrastructure.  It lets you build high performance cost effective systems to predict outcomes based on past data.

If you’re interested in learning more about Amazon Machine Learning, you can view my video course on O’Reilly Safari.  Over an hour and a half of video talking about all aspects of Amazon Machine Learning.

This course shows you how to build a model using Amazon Machine Learning (Amazon ML) and use it to make predictions. AWS expert Dan Moore covers the basic types of machine learning, how to prepare your data, and how to make your data available to the Amazon Machine Learning processes. You’ll also learn about evaluating a model for accuracy, using it both for batch and real-time predictions, and using tags to manage environments. Designed for developers and technical marketers new to machine learning and for data scientists interested in using the AWS Amazon ML platform, the course provides hands-on experience building a working predictive model using real data. Learners should obtain an AWS account (free from Amazon) and a basic understanding of AWS concepts before beginning the course.


Heroku 503 errors

One day recently I woke up to a text from our monitoring service (part of minimal heroku operations tasks) saying our application was down.

Darn it.

Here’s the recreation of my steps in troubleshooting this issue:

  • Login, see that the app is indeed down.
  • Look at newrelic.
  • Look at the logs (papertrail!).
  • Look at the deployment history.
  • Note when the issue started–curses, not when I did a deploy.
  • Open a ticket with heroku (after doing some research). (Love their support.)
  • Double check that the database is good and hasn’t hiccuped.
  • Look at the logs more.
  • Add more dynos, see if that helps.
  • Google the error message.
    •  <app-name> heroku/router: at=info method=POST path=”<url>” host=app.thefoodcorridor.com request_id=8a17648f-2d84-46ea-abf2-5903be894a2c fwd=”216.191.191.58″ dyno=web.3 connect=1ms service=4ms status=503 bytes=477 protocol=https“Notice that the issue is being stated right in the log file (passenger request queue filling up).  Here are sample error messages?
    • <app-name> app/web.3: [ 2017-07-12 14:47:44.3688 65/7f652dffd700 age/Cor/Con/CheckoutSession.cpp:261 ]: [Client 3-281] Returning HTTP 503 due to: Request queue full (configured max. size: 100)
  • Find some posts about the error message.  Here and here.
  • Start researching how to increase request queue size.
  • Talk a walk to clear my head.
  • Think about what external services we call, as that seems to be what might cause the request queue to back up.
  • Read another post that says restarting passenger helped.
  • Restart all dynos.
  • Problem disappears.
  • Look at logs more closely.
  • Last dyno to be restarted was the only problematic dyno.
  • Add comment to ticket about this being the cause.
  • Heroku confirms that the issue may have been the dyno: “sometimes individual dynos will hang and cause errors with 503 responses”
  • Write note to customers about the issue explaining how access to app was affected.
  • Lower number of dynos.
  • Breath a sigh of relief.

AWS Questions: Certification

Lots of times folks in my class are interested in pursuing AWS certifications.  The classes I teach are good at preparing you to be certified, but are definitely not certification classes.  Here’s the answers I give to students interested in being certified:

To get certified, you should review the page for the cert you want.  Here’s the page for the AWS Architect – Associate certification.

When I got certified, I re-read the student guide from the class and made sure I understood everything covered in it.  I didn’t just look at what the student guide had–I went to the AWS documentation as well.  I also read some whitepapers as outlined in the exam guide (found on the certificate page linked above).  I then took the sample questions (answers not provided, but you can find them via googling, anda again, on the certificate page) and then the practice exam (which costs $20, I believe, and gives you familiarity with the test format, but can only be taken once.).  Those gave me feedback that I was on track to pass the exam.

Note that the course is more hands on than the exam and doesn’t map strictly to the exam.  However, AWS does a good job of explaining what they are looking for in the exam guide (on, you guessed it, the certificate page).

Some of my students and colleagues have also had good luck with acloudguru, but I have no personal experience with that service.  The company for which I work (but for which I do not speak) also offers a course that is designed to help folks pass certain certs, but I have no experience with the course.

Finally, it’s worth noting that all the certs I have taken have been proctored.  Depending on where you live, you may have a number of test centers available, or one (or none).  Find that out before hand!  I also found that the exams I wanted were never available next day–I had to schedule them out a few weeks in advance.  YMMV.


AWS machine learning talk

I enjoyed giving my “Intro to Amazon Machine Learning” talk at the AWS Denver Boulder meetup.   (Shout out to an old friend and colleague who came out to see it.) I didn’t get through the whole pipeline demonstration (I didn’t get a chance to do the batch prediction), but the demo gods were kind and the demo went well.

We also had a good discussion.  A few folks present had used machine learning before, so we talked about where AML made sense (hint, it’s not a fit for every problem).  Also had some good questions about AML, about performance and pricing.  One of the members shared a reinvent anecdote: the AML team looked at all the machine learning used in Amazon and graphed the use cases and solved for the most common ones.

As, usual, I also learned something. OpenRefine is a tool to help you prepare data for machine learning.  And when you change the score cut-off, you need to restart your real-time end point.

The “Intro to Amazon Machine Learning” slides are up on SlideShare, and big thanks to the Meetup organizers.



Advice for the bootstrapped developer

I participated in a panel for Boulder Startup Week, but previous to the panel, thought I’d be giving a 15 minute presentation.  I was going to talk about rules for the bootstrapped developer.  At The Food Corridor, we recently celebrated one year of having customers.  In celebration of that, here are my seven guidelines:

  1. It will take longer than you think.  For any meaningful value of “it”. expect the process to take longer than planned.  Whether that is acquiring your first customer, building your product, finding help, or anything else, plan for it to take longer than you think.  Heck, this blog post is taking longer than I thought it would.
  2. Know your runway.  It’s important to know your financial runway (both for the company and for yourself).  This is fairly easy to calculate–just find out your burn rate and your money in the bank.  If you are looking at your personal financial runway, don’t forget to allow some buffer for you to find a different job–typically you won’t be able to step from a cratered startup on Friday into a full time job on Monday.  However, more important than financial runway is emotional runway.  Talk about the stress with your spouse (if applicable), think about the other emotional difficulties you may encounter, and plan for some high highs and some low lows.
  3. Extend your runway.  Again, for the financial runway, lower your burn rate as much as you can.  This will mean cutting back on spending and savings.  Make sure the family is on board.  Then you will look to find other sources of money to feed and clothe yourself.  This may include digging into savings, borrowing against assets (HELOCs work for this), borrowing from relatives, pitch competitions (TFC won two), selling assets, taking contract work, having a spouse who earns enough for the household, and/or moonlighting.
  4. Talk to your customer. This was one of the great assets that one of my co-founders brought to the table.  She had deep domain expertise and had many connections to potential customers.  We had numerous customers give us feedback, including via interviews, a month long beta test, regular product advisory councils and surveys.  You will note that this point implies you can find your customer and communicate with them in a cost effective manner.  Find where your customer congregates online.  If you don’t find a place, make one and invite your potential customers to join.
  5. Everything is borked all the time. In a startup, you just don’t have time to do everything correctly.  It can be embarrassing, but if you are building something that solves customer pain, and you’ve found the right early adopters, the customers will stick around.  Just keep improving things.  And accept a certain amount of brokenness.
  6. Know your market. This is related to talking to your customer, as that is one fantastic way to learn about your market.  There are other ways too–market research, online reading, etc.  However, as you build your product, you will surprised by what your market wants.  For instance, we thought our market would want a better UX, but were surprised when someone said our v1 UX was “great”.
  7. Make your own rules. Know that my advice and rules above are based on my experience, with my co-founders, product and skillset, in this time.  You need to do the reading and figure out what your rules are.

DynamoDb: What’s left to manage

AWS recently announced that DynamoDb will now scale read and write capacity automatically.  While there was already a lot of database administration that DynamoDb took care of (backups, underlying infrastructure provisioning), setting the proper capacities initially, and updating them as your application changed, was a key task that fell to the user. No more.

I posted a link to the news to a discussion channel I participate in, and someone asked: “what’s left to manage?”. Drawing from that discussion, here are a few items remaining:

  • Appropriate partition keys.  Make sure they are spread uniformly.
  • Choosing the right primary key. Since you typically want to avoid table scans and can only query by primary key, making sure you pick the right one is important.  (I would also call this “data model design”.)
  • Enforcing data integrity, initially and through time.  This is a challenge with every nosql solution.
  • Creating the appropriate secondary global indices for your application.
  • Securing and controlling access to your data.

These are all still important tasks, but DynamoDb is getting easier and easier to use for high performance applications for which nosql is a good fit.  (And for which you don’t mind being tied to AWS.)


AWS Questions: SQS

More questions, this time about SQS, the simple queue service that AWS provides.

  • What was the first AWS service?
  • Are there upper limits on limits on SQS in terms of message/second?
    • FIFO Queues have a limit (300/s), but I wasn’t able to file any hard limits for standard SQS.  In the developer guide they have some examples that reach 2500 messages/second.  I found some benchmarks from 2014, which were able to get to 108k messages/second.
  • Can you create alarms based on the number of messages in a queue?
    • Yes, that is a metric that Cloudwatch tracks: “NumberOfMessagesSent”.  You can use this in combination with an auto scaling group to handle batch processes in a dynamic manner (scale out when you have more work in the queue, scale in when you have less).
  • What is the maximum visibility timeout for SQS?

Interview with a boot camp grad

Lots of folks are moving into development and technical fields these days.  I remember that happening during the dot com book, but back then folks just read one of those “Learn Java in 24 Hours” books.

Nowadays there are a profusion of boot camps that help people gain the skills they need to be a developer. I have interacted with a few of these grads and was interested in learning more about their experience. Noel Worden, one of the organizers of Boulder.rb and a blogger, agreed to an interview.

– what was your background before you were a developer?

I got my degree in fine art photography, spent 5 years working in NYC as a Digital Technician in the photo industry, then moved to Colorado and was a cabinetmaker for 3 years.

– how did you land your first job?

I found the posting on the Denver Full Stack meetup whiteboard (https://www.meetup.com/fullstack/)

– what surprised you about the software industry?

How willing everyone is to help. It’s very different in the photo world, [which is] much more competitive, it’s hard to find guidance and mentorship.

– how did you pick your boot camp?

Bloc had one of the longest curriculums I could find in an online program. I figured when push came to shove, having more experience under my belt couldn’t hurt when competing with other junior [developers] looking for that first job. Bloc also has a part-time track, which allowed me to still work [a] full-time job while going to school.

– what is good about the job? What is challenging?

I appreciate the balance of being challenged but knowing I have the full support of any other engineer on the team if I need to reach out for assistance. Canvas United [ed: his current employer] has a lot of projects that I’ve been maintaining lately, and all are running different versions of Rails, which makes for interesting challenges. 

– what do you see current boot camp students doing that you’d advise against?

Not getting out and networking while going to school. This industry is all about networking and if you’re hoping to capitalize on the advantages of a good network you have to be building it while still in school.

– why did you want to transition into technology and development?

I wanted/needed a career path where the leaning curve wouldn’t plateau. I’m not a ‘cruise control’ kind of person, as soon as the challenge isn’t there for me anymore I lose interest.

– how can employers help boot camp grads in their first job?

Be ok with answering questions, and be transparent with the employee about the proper channels to ask those questions. Also, be upfront about the fact that it’s ok to fail (assuming that’s the case) a few time before you get to the best solution [ed: if it isn’t ok to fail, find a new job!]. Also also, a healthy balance of low hanging fruit and multi-day problems. Nothing kills my morale faster than ticket after ticket of problems that grind me into the ground.

– should employers have any different expectations of a boot camp grad vs someone who just graduated from college or high school?

I definitely think that it takes a particular type of management style to successfully level up a boot camp grad. If you’ve hired them you must have liked something about them, play to those strengths, but also sprinkle in challenges to help that developer evolve.

[This content has been edited for grammar and clarity.]


AWS Questions: Windows Servers

Windows servers are supported on AWS, but recently I had students ask a bevy of questions about them.  Here are some answers.  As a reminder, I speak solely for myself with these blog posts, not for AWS or any employer.

  • What versions of Windows are supported?
  • Can I create an AMI from an EBS snapshot of a Windows root volume?
    • Unlike with a linux EBS snapshot, you cannot create an AMI from a root volume.  You can create an AMI from a running instance, however.  The reason for the limitation is that sysprep must be run on the Windows server, and you can’t run sysprep on a EBS volume that is not running.
  • In order to take an accurate snapshot, I need to quiesce the disk.  How can I do so?
    • This is a thorny problem and I don’t think there’s a great answer. You want to shut down as many apps as you can. You also may find the Volume Shadow Copy Service useful. You may want to review the answers here on this reddit thread.
  • I have a Windows bastion host, and I want to allow more than two users to access this host at one time.  How can I do this?
    • You need to purchase additional Remote Desktop Services licenses.  From the FAQ: “Amazon EC2 instances come with two Remote Desktop Services (aka Terminal Services) licenses for administration purposes. If additional Remote Desktop Services licenses are needed, they should be purchased from Microsoft or a Microsoft license reseller. Remote Desktop Services licenses purchased with Software Assurance have license mobility benefits and can be brought to AWS multi-tenant environments.”
  • Is powershell a first class citizen with the same functionality as the CLI or the supported SDKs?
    • Nope.  From the Powershell page: “The AWS Tools for Windows PowerShell lets you perform many of the same actions available in the AWS SDK for .NET. You can use it from the command line for quick tasks, like controlling your Amazon EC2 instances.”  (Emphasis added.)
  • Do you have any example userdata scripts for Windows AMIs?


© Moore Consulting, 2003-2017 +