Last week in AWS

If you are interested in AWS, you have to subscribe to “Last Week in AWS”, a newsletter which covers, well, much of what has happened in AWS recently.  It covers both posts from the AWS team and news from external sources.  There is a lot of snarkiness, but a ton of knowledge as well.

The newsletter is out every Monday and is run by Corey Quinn, an experienced AWS consultant.


Hacker News melted my server

My post about Founding Engineers caught the attention of some folks on Hacker News, which is kinda like a focused subreddit for tech folks.  There were some great comments.  My server melted down, though, and some folks had to read the Google-cached version.  I think the post peaked at around #25 on the front page.  Can’t imagine what a pounding the number one post gets.

I was able to restart my web server later in the day (the traffic unfortunately peaked at the same time I was doing a release of The Food Corridor application), and saw from my web stats that I had as many people visit on that one day as I get in a typical week.  That was only the folks that my stats system was able to capture, so I’m guessing there were a lot more.

I have thought for a long time about making my WordPress site publish to S3, using a tool like Simply Static.  I haven’t done that yet, but was able to leverage WP Super Cache’s CDN integration to serve up the static assets of the blog from CloudFront, AWS’s CDN.  This post was very helpful.
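
If you’re curious what setting that up looks like outside the WordPress admin, here’s a minimal sketch of creating a CloudFront distribution with boto3.  The origin domain, caller reference, and TTL are all made up for illustration; WP Super Cache only needs the resulting CloudFront domain name.

    # Minimal sketch: put a CloudFront distribution in front of an existing
    # blog. The origin domain and caller reference below are hypothetical.
    import boto3

    cloudfront = boto3.client('cloudfront')

    response = cloudfront.create_distribution(DistributionConfig={
        'CallerReference': 'blog-cdn-001',  # any unique string
        'Comment': 'CDN for blog static assets',
        'Enabled': True,
        'Origins': {
            'Quantity': 1,
            'Items': [{
                'Id': 'blog-origin',
                'DomainName': 'blog.example.com',  # hypothetical origin server
                'CustomOriginConfig': {
                    'HTTPPort': 80,
                    'HTTPSPort': 443,
                    'OriginProtocolPolicy': 'http-only',
                },
            }],
        },
        'DefaultCacheBehavior': {
            'TargetOriginId': 'blog-origin',
            'ViewerProtocolPolicy': 'allow-all',
            'TrustedSigners': {'Enabled': False, 'Quantity': 0},
            'ForwardedValues': {'QueryString': False, 'Cookies': {'Forward': 'none'}},
            'MinTTL': 3600,  # cache objects for at least an hour
        },
    })

    # The assigned dxxxx.cloudfront.net domain goes in the CDN settings.
    print(response['Distribution']['DomainName'])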


Running servers in AWS without SSH access

When you allow SSH access to your server, the user SSHing in can do many things.  You can restrict their access with a tool like sudo or chroot, but at the end of the day, the user has access to the system and may be able to find a way to escalate their privileges.  It’d be simpler if no one could log in to the server at all, but how would you configure the server to actually be useful?

With AWS and the AWS Systems Manager, you can install an agent (open source, under the Apache License) on your EC2 servers (perhaps via userdata at boot time) and run all your commands via this AWS managed service.  That means you never have to have an SSH server running.
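
Here’s a rough sketch of what that looks like with boto3.  The AMI id and instance profile name are made up, and the instance profile is assumed to already exist with an SSM policy (AmazonEC2RoleforSSM or equivalent) attached; the install command follows the documented Amazon Linux/RHEL pattern.

    # Sketch: launch an instance that installs the SSM agent at boot via
    # userdata, so it is manageable without any SSH daemon running.
    import boto3

    USERDATA = """#!/bin/bash
    # Install and start the SSM agent (Amazon Linux / RHEL style install).
    cd /tmp
    yum install -y https://s3.amazonaws.com/ec2-downloads-windows/SSMAgent/latest/linux_amd64/amazon-ssm-agent.rpm
    start amazon-ssm-agent
    """

    ec2 = boto3.client('ec2')
    ec2.run_instances(
        ImageId='ami-0123456789abcdef0',   # hypothetical AMI id
        InstanceType='t2.micro',
        MinCount=1,
        MaxCount=1,
        UserData=USERDATA,
        IamInstanceProfile={'Name': 'ssm-managed-instance'},  # hypothetical profile
        TagSpecifications=[{
            'ResourceType': 'instance',
            'Tags': [{'Key': 'Environment', 'Value': 'staging'}],
        }],
    )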

What about limiting what users can do?  You have the full power of IAM to limit who can do what to which servers.  Here’s how you can use tagging to limit which servers someone can run a command on.
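
As a sketch, an IAM policy like the following only lets its holder run commands against instances carrying a particular tag.  The policy name and tag values are just examples.

    # Sketch: allow ssm:SendCommand only on instances tagged
    # Environment=staging. Names and tag values are examples.
    import json
    import boto3

    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Allow running commands, but only on instances with the tag.
                "Effect": "Allow",
                "Action": "ssm:SendCommand",
                "Resource": "arn:aws:ec2:*:*:instance/*",
                "Condition": {
                    "StringEquals": {"ssm:resourceTag/Environment": "staging"}
                },
            },
            {
                # The document being run is also a resource, so allow it too.
                "Effect": "Allow",
                "Action": "ssm:SendCommand",
                "Resource": "arn:aws:ssm:*:*:document/AWS-RunShellScript",
            },
        ],
    }

    iam = boto3.client('iam')
    iam.create_policy(
        PolicyName='run-command-staging-only',
        PolicyDocument=json.dumps(policy),
    )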

What about installing applications?  You can use userdata or the EC2 Run Command.
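
With boto3, an install via Run Command is just a few lines; the instance id and package here are placeholders.

    # Sketch: install a package on a running instance via EC2 Run Command.
    import boto3

    ssm = boto3.client('ssm')
    response = ssm.send_command(
        InstanceIds=['i-0123456789abcdef0'],   # placeholder instance id
        DocumentName='AWS-RunShellScript',
        Parameters={'commands': [
            'yum install -y nginx',
            'service nginx start',
        ]},
    )
    # Hang on to the command id to fetch output later (see below).
    command_id = response['Command']['CommandId']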

What about logfiles of those applications?  You can send your logfiles up to a log aggregation service like CloudWatch Logs or Splunk.  It’ll be easier to manage logfiles centrally anyway.  If you use CloudWatch Logs, don’t forget to move your logfiles to S3 and then expire them, otherwise you’ll pay more than you should.
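
A sketch of that cleanup, with hypothetical log group and bucket names (the bucket also needs a policy allowing CloudWatch Logs to write to it):

    # Sketch: cap CloudWatch Logs retention, export older logs to S3, and
    # expire the S3 copies with a lifecycle rule.
    import time
    import boto3

    logs = boto3.client('logs')

    # Keep only two weeks of logs in CloudWatch Logs itself.
    logs.put_retention_policy(logGroupName='/myapp/production', retentionInDays=14)

    # Export last week's logs to S3 (times are milliseconds since the epoch).
    now_ms = int(time.time() * 1000)
    logs.create_export_task(
        logGroupName='/myapp/production',
        fromTime=now_ms - 7 * 24 * 3600 * 1000,
        to=now_ms,
        destination='myapp-log-archive',    # hypothetical S3 bucket
        destinationPrefix='production',
    )

    # Expire the archived logs after a year.
    s3 = boto3.client('s3')
    s3.put_bucket_lifecycle_configuration(
        Bucket='myapp-log-archive',
        LifecycleConfiguration={'Rules': [{
            'ID': 'expire-old-logs',
            'Status': 'Enabled',
            'Filter': {'Prefix': 'production/'},
            'Expiration': {'Days': 365},
        }]},
    )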

What about system updates (patches, etc.)?  There’s Patch Manager for that.
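
Patching rides on the same Run Command machinery, and you can target instances by tag; a sketch (tag values are examples):

    # Sketch: scan for and install patches on all staging instances using
    # the AWS-RunPatchBaseline document. Tag values are examples.
    import boto3

    ssm = boto3.client('ssm')
    ssm.send_command(
        Targets=[{'Key': 'tag:Environment', 'Values': ['staging']}],
        DocumentName='AWS-RunPatchBaseline',
        Parameters={'Operation': ['Install']},
    )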

What about troubleshooting?  You can use the EC2 Run Command to execute arbitrary commands and get the response back.
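
Continuing the send_command sketch above, fetching the response looks roughly like this (the command and instance ids are placeholders):

    # Sketch: retrieve the output of a previously sent command.
    import boto3

    ssm = boto3.client('ssm')
    command_id = '11111111-2222-3333-4444-555555555555'  # from send_command's response
    invocation = ssm.get_command_invocation(
        CommandId=command_id,
        InstanceId='i-0123456789abcdef0',
    )
    print(invocation['Status'])
    print(invocation['StandardOutputContent'])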

If you lock down the EC2 Run Command, then suddenly you have a lot less attack surface.  No one can log in to your AWS instances and nose around or run arbitrary commands to see what software is present or what security measures are in place.


Debugging options for AWS Lambda functions

AWS Lambda lets you write a ‘function as a service’ and run code for anywhere from 100ms to 5 minutes of execution time without maintaining any servers.  This code has few limitations, but one of the issues I’ve always encountered is debugging Lambda functions.

I mentioned this at a past meetup and heard about a number of possible solutions.

I think the right debugging option depends on the complexity of your code and the urgency of the situation, but I could see using all of these at different times.
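
The lowest-tech option is plain logging: anything a Lambda function writes to stdout (or via the logging module, in Python) ends up in the function’s CloudWatch Logs log group.  A minimal sketch; do_work is a stand-in for your real logic:

    # Minimal sketch: log-based debugging for a Python Lambda function.
    # Everything logged here shows up in the function's CloudWatch Logs
    # log group (/aws/lambda/<function-name>).
    import json
    import logging

    logger = logging.getLogger()
    logger.setLevel(logging.INFO)

    def handler(event, context):
        logger.info('received event: %s', json.dumps(event))
        try:
            result = do_work(event)
        except Exception:
            # logger.exception records the full stack trace.
            logger.exception('failed processing event')
            raise
        logger.info('returning: %s', json.dumps(result))
        return result

    def do_work(event):
        # Placeholder for the real function body.
        return {'ok': True}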

Bonus: here’s a post on how to continuously deploy your Lambda functions.


AWS Advent Calendar

There’s an AWS advent calendar, where new articles about various aspects of AWS will be posted each day starting Dec 1.  If you’re interested in writing or reviewing the articles, feel free to sign up.  There are also some great posts from 2016, covering topics such as how to analyze VPC flow logs, cost control, Lambda and building AMIs with Packer.

There are also articles from years past.  I haven’t examined them closely, but I’d be wary of them, simply because the AWS landscape is changing so rapidly, and an article from three or four years ago may or may not be applicable.


The wonders of outsourcing devops

I have maintained a Jenkins server (actually, it was Hudson, the precursor). I’ve run my own database server.  I’ve installed a bug tracking system, and even extended it. I’ve set up web servers (Apache and nginx).

And I’ll tell you what, if I never have to do any of these again, I’ll be happy as a clam. There are so many tools out there that let you outsource your infrastructure.  Often they start out free and begin charging once you reach a certain scale.

By outsourcing the infrastructure to a service provider, you let specialists focus on maintaining that infrastructure. They achieve scale that you’d be hard pressed to. They hire experts that you won’t be able to hire. They respond to vulnerabilities like it is their job (which it is).

Using one of these services also lets you punch above your weight. If you want, with AWS or GCP you can run your application in multiple data centers around the globe. With Heroku, you can scale out during busy times, and you can scale in during slow times. With CircleCI or GitHub or many of the other devtool offerings, you can have your CI/CD/source repository environment continually improved, without any effort on your part (besides paying the credit card bill).  Specialization wins.

What is the downside? You lose control: the ability to fine-tune your infrastructure in ways that the service provider may not have thought of.  You have to conform to their view of the world.  You also may, depending on the service provider, see performance suffer.

At a certain scale, you may need that control and that performance.  But, you may or may not reach that scale.

It can be frustrating to have to work around issues that, if you just had the appropriate level of access, you would be able to fix quickly.  It’s also frustrating to have to come up to speed on the docs and environment that the service provider makes available.

That said, I try to remember all the other tasks that these services are taking off my plate, and the focus they allow on the unique business differentiators.


re:Invent videos

AWS re:Invent is supposed to be a great conference.  I have thus far been unable to attend, but the videos of the presentations are posted online with about a day’s lag.  So, as at most conferences, you really should be networking and meeting people face to face rather than attending the presentations.

Here’s the AWS YouTube channel where you can watch all the videos, or just sample them.

I’ve found the talks to be of varying quality.  Some just rehash the docs, but others, especially the deep dives, discuss interesting aspects of the AWS infrastructure that I haven’t found documented anywhere else (here’s a great talk about Elastic Block Store from 2016).  The talks by real customers also give a great viewpoint into how AWS’s offerings are actually implemented to provide business value (here’s a great talk from 2016 about using Amazon Machine Learning to predict real estate transactions).

It’s a sprawling conference, well suited to AWS’s sprawling offering, and I bet no matter what your interest, you will be able to find a video worth watching.


AWS Questions: Certification

Lots of times folks in my class are interested in pursuing AWS certifications.  The classes I teach are good at preparing you to be certified, but are definitely not certification classes.  Here are the answers I give to students interested in being certified:

To get certified, you should review the page for the cert you want.  Here’s the page for the AWS Certified Solutions Architect – Associate certification.

When I got certified, I re-read the student guide from the class and made sure I understood everything covered in it.  I didn’t just look at what the student guide had; I went to the AWS documentation as well.  I also read some whitepapers as outlined in the exam guide (found on the certificate page linked above).  I then took the sample questions (answers not provided, but you can find them via googling, and, again, they’re on the certificate page) and then the practice exam (which costs $20, I believe, and gives you familiarity with the test format, but can only be taken once).  Those gave me feedback that I was on track to pass the exam.

Note that the course is more hands-on than the exam and doesn’t map strictly to the exam.  However, AWS does a good job of explaining what they are looking for in the exam guide (on, you guessed it, the certificate page).

Some of my students and colleagues have also had good luck with acloudguru, but I have no personal experience with that service.  The company for which I work (but for which I do not speak) also offers a course that is designed to help folks pass certain certs, but I have no experience with the course.

Finally, it’s worth noting that all the certs I have taken have been proctored.  Depending on where you live, you may have a number of test centers available, or one (or none).  Find that out beforehand!  I also found that the exams I wanted were never available the next day; I had to schedule them out a few weeks in advance.  YMMV.


AWS machine learning talk

I enjoyed giving my “Intro to Amazon Machine Learning” talk at the AWS Denver Boulder meetup.  (Shout out to an old friend and colleague who came out to see it.)  I didn’t get through the whole pipeline demonstration (I didn’t get a chance to do the batch prediction), but the demo gods were kind and the demo went well.

We also had a good discussion.  A few folks present had used machine learning before, so we talked about where AML made sense (hint: it’s not a fit for every problem).  There were also some good questions about AML’s performance and pricing.  One of the members shared a re:Invent anecdote: the AML team looked at all the machine learning used in Amazon, graphed the use cases, and solved for the most common ones.

As usual, I also learned something: OpenRefine is a tool to help you prepare data for machine learning, and when you change the score cut-off, you need to restart your real-time endpoint.
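
For reference, here’s roughly what that dance looks like with boto3’s machinelearning client; the model id and threshold are hypothetical.

    # Sketch: change an Amazon Machine Learning model's score threshold and
    # recreate its real-time endpoint so predictions use the new cutoff.
    import boto3

    ml = boto3.client('machinelearning')

    MODEL_ID = 'ml-0123456789a'  # hypothetical model id

    # Move the cutoff between "positive" and "negative" predictions.
    ml.update_ml_model(MLModelId=MODEL_ID, ScoreThreshold=0.75)

    # Tear down and recreate the real-time endpoint.
    ml.delete_realtime_endpoint(MLModelId=MODEL_ID)
    ml.create_realtime_endpoint(MLModelId=MODEL_ID)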

The “Intro to Amazon Machine Learning” slides are up on SlideShare, and big thanks to the Meetup organizers.




© Moore Consulting, 2003-2017