AWS Questions: Windows Servers

Windows servers are supported on AWS, but recently I had students ask a bevy of questions about them.  Here are some answers.  As a reminder, I speak solely for myself with these blog posts, not for AWS or any employer.

  • What versions of Windows are supported?
  • Can I create an AMI from an EBS snapshot of a Windows root volume?
    • Unlike with a Linux EBS snapshot, you cannot create an AMI from a snapshot of a Windows root volume.  You can create an AMI from a running instance, however.  The reason for the limitation is that sysprep must be run on the Windows server, and you can’t run sysprep on an EBS volume that isn’t attached to a running instance.
  • In order to take an accurate snapshot, I need to quiesce the disk.  How can I do so?
    • This is a thorny problem and I don’t think there’s a great answer. You want to shut down as many apps as you can. You may also find the Volume Shadow Copy Service useful, and you may want to review the answers on this Reddit thread.
  • I have a Windows bastion host, and I want to allow more than two users to access this host at one time.  How can I do this?
    • You need to purchase additional Remote Desktop Services licenses.  From the FAQ: “Amazon EC2 instances come with two Remote Desktop Services (aka Terminal Services) licenses for administration purposes. If additional Remote Desktop Services licenses are needed, they should be purchased from Microsoft or a Microsoft license reseller. Remote Desktop Services licenses purchased with Software Assurance have license mobility benefits and can be brought to AWS multi-tenant environments.”
  • Is PowerShell a first-class citizen with the same functionality as the CLI or the supported SDKs?
    • Nope.  From the PowerShell page: “The AWS Tools for Windows PowerShell lets you perform many of the same actions available in the AWS SDK for .NET. You can use it from the command line for quick tasks, like controlling your Amazon EC2 instances.”  (Emphasis added.)
  • Do you have any example userdata scripts for Windows AMIs?

From my book, Amazon Machine Learning: An Introduction:

Amazon Machine Learning, or AML, provides you access to widely applicable machine learning algorithms without having to run any servers.  This type of learning is useful for making predictions based on a set of data for which answers are known.  AML supports supervised learning with the stochastic gradient descent algorithm.  The end goal of AML is to create a model, which is what will allow you to make further predictions based on past data.

AML supports three different kinds of predictions.  For binary outcomes, where observations lead to a yes/no result, AML supports binary classification.  An example would be whether or not a prospect is likely to sign up for a new account, given their past interactions with your company.  For multi-valued results, where observations lead to one of N results, AML supports multiclass classification.  A good example of this would be which product to show a customer, given what they’ve looked at and bought in the past.  And, for numeric values, AML supports regression.  An example of that would be predicting house prices based on sales data and house attributes.
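
To make the binary classification case a bit more concrete, here is a minimal sketch of stochastic gradient descent fitting a yes/no model in plain Python.  This is not AML's implementation and the data is made up; the two features (scaled to between 0 and 1) and the function names are assumptions purely for illustration.

```python
import math
import random

# Hypothetical observations: two scaled features per prospect, and whether
# the prospect signed up (1) or not (0).
ROWS = [[0.1, 0.0], [0.2, 0.1], [0.3, 0.0], [0.8, 0.5], [0.9, 0.4], [1.0, 0.6]]
LABELS = [0, 0, 0, 1, 1, 1]


def train_logistic_sgd(rows, labels, epochs=500, learning_rate=0.5, seed=0):
    """Fit logistic-model weights, updating after every single observation."""
    rng = random.Random(seed)
    weights = [0.0] * (len(rows[0]) + 1)  # the last entry is the bias term
    order = list(range(len(rows)))
    for _ in range(epochs):
        rng.shuffle(order)  # visit the observations in a random order
        for i in order:
            features = rows[i] + [1.0]
            score = sum(w * x for w, x in zip(weights, features))
            predicted = 1.0 / (1.0 + math.exp(-score))
            error = predicted - labels[i]
            # Step each weight against the gradient of the log loss.
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
    return weights


def predict(weights, row):
    """Return 1 (yes) or 0 (no) for a single new observation."""
    score = sum(w * x for w, x in zip(weights, row + [1.0]))
    return 1 if score >= 0 else 0


model = train_logistic_sgd(ROWS, LABELS)
print(predict(model, [0.05, 0.0]), predict(model, [0.95, 0.6]))
```

The list of weights is the "model": once you have it, a prediction for a new observation is just arithmetic on past data.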

If you are not trying to use existing data and create predictions out of it using supervised learning, but are trying to instead recognize images or tease out patterns in text, you may want to consider alternatives to AML.


AWS Questions: EBS

So, more questions about AWS from students (and my own research/curiosity):

  • What happens when general purpose EBS volumes run out of credits because you’ve used too many IOPS?
    • Your disk performance reverts to the baseline performance: “If your gp2 volume uses all of its I/O credit balance, the maximum IOPS performance of the volume will remain at the baseline IOPS performance level (the rate at which your volume earns credits) and the volume’s maximum throughput is reduced to the baseline IOPS multiplied by the maximum I/O size.”
  • Before you take a snapshot you should quiesce the disk.  Can you do that via an AWS command?
    • You need to use whatever operating system or application command is recommended.  You can use EC2 Run Command to execute that command, but you must determine what that command is.  From the backup and recovery whitepaper: “For this reason, you must quiesce the file system or database in order to make a clean backup. The way in which you do this depends on your database or file system.”  Note also that the disk only needs to be quiesced for a few seconds, as the EBS snapshot process can be started quickly, though it may take a while to complete.
  • Can you be notified of snapshot completion via an event or do you have to poll?
  • Can you automate your EBS snapshots?
  • Does EBS encryption cost extra?
    • Nope: “This frequently requested feature provides you with seamless support for data encryption on block-level storage, at no additional cost.”
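
For the gp2 credit question above, the arithmetic can be sketched in a few lines.  The constants here are assumptions taken from the gp2 documentation as I remember it (3 IOPS per GiB with a floor of 100, bursting to 3,000 IOPS, a full bucket of 5.4 million I/O credits, and a 256 KiB maximum I/O size), and the sketch ignores the upper cap on baseline IOPS, so check the current docs before relying on the numbers.

```python
# Assumed gp2 figures; verify against the current EBS documentation.
BASELINE_IOPS_PER_GIB = 3
MIN_BASELINE_IOPS = 100
BURST_IOPS = 3000
FULL_CREDIT_BALANCE = 5_400_000
MAX_IO_SIZE_KIB = 256


def baseline_iops(size_gib):
    """Rate at which the volume earns credits (upper cap not modeled)."""
    return max(MIN_BASELINE_IOPS, BASELINE_IOPS_PER_GIB * size_gib)


def seconds_until_credits_exhausted(size_gib, credits=FULL_CREDIT_BALANCE):
    """How long a sustained full burst lasts before credits run out."""
    drain_rate = BURST_IOPS - baseline_iops(size_gib)
    if drain_rate <= 0:
        return None  # baseline already matches the burst rate
    return credits / drain_rate


def baseline_throughput_mib_per_s(size_gib):
    """Baseline IOPS multiplied by the maximum I/O size, per the quote."""
    return baseline_iops(size_gib) * MAX_IO_SIZE_KIB / 1024


size = 100  # GiB
print(baseline_iops(size))
print(seconds_until_credits_exhausted(size))
print(baseline_throughput_mib_per_s(size))
```

Under those assumptions, a 100 GiB volume earns 300 IOPS and drains a full bucket in 2,000 seconds of sustained bursting, after which it is held to that 300 IOPS baseline.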

Amazon Machine Learning Video and Book

I’m working on a video series and an ebook about Amazon Machine Learning, or AML.

AML is a great way to get started with machine learning, since you can focus on the key concepts of building and using a model and not worry about any infrastructure.  AWS takes care of provisioning all the underlying IT infrastructure; you just worry about getting your data to S3, choosing how to build the model, and then using the model.  Which, trust me, is quite enough to tackle if you are a machine learning newbie.

You can use the model to get predictions either in real time (with a default soft limit of 200 requests per second) or via batch processing, where you can upload up to 1 TB of predictions to S3.  Like everything in AWS, you can control the entire process via a well-documented API or from various SDKs.

AML isn’t a fit for all machine learning needs–it processes text that is in CSV format and supports only supervised learning.  There are other options on AWS (and other places as well).

The book is currently in progress, and I’ll be starting on the video soon.  If you’d like to follow along as the book gets written, you can at Leanpub: Amazon Machine Learning: An Introduction.

AWS Questions: CloudFormation

So, more questions from students.  This time about CloudFormation, the very cool way to build AWS infrastructure declaratively.  I would hate to have to pick a favorite AWS service, but if I had to, CloudFormation would be it.

  • By default a stack rolls back on failure.  You can also keep any successful stack elements by setting disable rollback to true.  Can you have some elements of a stack that must have successful creation, and others that may fail without rollback?
    • Nope.  I’d break this up into two stacks and chain them.
  • Why is YAML now supported for CloudFormation templates?
  • Can you do a dry run of a CloudFormation template?

AWS Questions: DynamoDB

Here are some questions and answers about DynamoDB, Amazon’s managed NoSQL database offering.

  • What are options for dynamically scaling DynamoDB provisioned throughput?
    • Hard to beat the options outlined in this Stack Overflow post.  You can do it via scripting, the DynamicDynamoDB open source library, a Lambda function, or CloudWatch.  There are lots of different ways.
  • Do DynamoDB streams support multiple readers?
  • How does optimistic concurrency control work?
    • Nicely outlined here but the long and the short of it is you need to make sure you associate a version with your items, read that version when you prepare to update, and then update if and only if the version is the same as the one you read.
  • Do you have any insight into the internals of DynamoDB?
  • How do you connect to DynamoDB?  Is there an IP address?
    • You use the SDK or CLI, which connects to an endpoint in a region.  You don’t know any further details about what sits behind that endpoint.
  • What is the difference between eventual and strong consistency with respect to DynamoDB reads?
  • Does DynamoDB have any automatic encryption-at-rest options?
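
The optimistic concurrency answer above (read the version, then update only if it hasn't changed) can be sketched without touching AWS at all.  The in-memory table below is a hypothetical stand-in, not the DynamoDB API; with the real service the version check would be expressed as a condition on the write.

```python
class VersionConflict(Exception):
    """Raised when the stored version differs from the one that was read."""


class InMemoryTable:
    """Hypothetical stand-in for a table whose items carry a version."""

    def __init__(self):
        self._items = {}

    def put_new(self, key, attributes):
        self._items[key] = dict(attributes, version=1)

    def get(self, key):
        return dict(self._items[key])  # return a copy, as a read would

    def update_if_version_matches(self, key, changes, expected_version):
        current = self._items[key]
        if current["version"] != expected_version:
            raise VersionConflict(key)
        updated = dict(current, **changes)
        updated["version"] = expected_version + 1
        self._items[key] = updated


def add_to_balance(table, key, amount, max_attempts=3):
    """Read, modify, and conditionally write; re-read and retry on conflict."""
    for _ in range(max_attempts):
        item = table.get(key)
        try:
            table.update_if_version_matches(
                key, {"balance": item["balance"] + amount}, item["version"])
            return
        except VersionConflict:
            continue  # someone else wrote first, so start over
    raise VersionConflict(key)


table = InMemoryTable()
table.put_new("account-1", {"balance": 10})
add_to_balance(table, "account-1", 5)
print(table.get("account-1"))
```

The key point is that a writer holding a stale version fails loudly instead of silently overwriting someone else's change.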


AWS Questions: Elastic Load Balancer

More questions answered from an AWS course.

  • Does the AWS ELB have the ability to throttle requests to stop invalid/illegal traffic?  For example, if someone refreshes a page 10 times in 5 seconds, can I block the unnecessary requests from the refreshes?
  • What is the availability of the ELB component?
    • I couldn’t find firm numbers, but here’s an interesting article about ELB best practices.
  • In a DDoS attack, since there is a lot of traffic to your environment, do you get charged for the additional traffic?
    • Depending on the attack type, no, as long as you are fronted by an ELB or have set up your security groups/NACLs to discard the traffic.  From the DDoS whitepaper: “When [an ELB detects certain types of attacks], it will automatically scale to absorb the additional traffic but you will not incur any additional charges.”
  • When an instance is decommissioned from an ASG, does the ELB know not to send new sessions to that instance because it is getting ready to shut down?

AWS Questions: CloudWatch

CloudWatch is Amazon’s monitoring and alerting service.

Some questions and answers re: this awesome service.

  • How can you create custom metrics?
    • CloudWatch doesn’t limit you to the metrics it collects by default (at the hypervisor level).  You can push any metric that makes sense up to the statistics repository, using custom metrics.
  • What protocol does CloudWatch use?  ICMP or SNMP?
  • What are CloudWatch logs?
    • A way for you to push logfiles from EC2 instances up to CloudWatch, parse them, and create metrics out of them (“how many 404 errors has this application had in the last 30 minutes?”).  More here.
  • Does CloudWatch allow you to set up different metric thresholds at different times? For example, set an alarm at 70% CPU on Wed night but 90% on Sat night?
    • No, but you could do this with custom metrics.  You could read the CloudWatch default metrics and publish a ‘cpualarm’ metric which would be 1 or 0 depending on whether CPU usage is above a threshold that you control.  Then you could vary that threshold over time and set an alarm on the ‘cpualarm’ metric.
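
Here is a rough sketch of the ‘cpualarm’ idea from the last answer.  The schedule, the namespace, and the function names are all assumptions for illustration, and the schedule is simplified to whole days rather than nights.  The client passed in is assumed to be a boto3 CloudWatch client (boto3.client("cloudwatch")), whose put_metric_data call accepts a Namespace and a MetricData list.

```python
from datetime import datetime

# Hypothetical schedule: weekday index (Monday is 0) -> CPU threshold in percent.
THRESHOLDS_BY_WEEKDAY = {2: 70.0, 5: 90.0}  # Wednesday and Saturday
DEFAULT_THRESHOLD = 80.0


def cpu_alarm_value(cpu_percent, when):
    """Return 1 if CPU is above the threshold in effect at `when`, else 0."""
    threshold = THRESHOLDS_BY_WEEKDAY.get(when.weekday(), DEFAULT_THRESHOLD)
    return 1 if cpu_percent > threshold else 0


def publish_cpu_alarm(cloudwatch_client, cpu_percent, when):
    """Push the 0/1 flag as a custom metric; an alarm can then watch it."""
    value = cpu_alarm_value(cpu_percent, when)
    cloudwatch_client.put_metric_data(
        Namespace="Custom/Alarms",  # assumed namespace, pick your own
        MetricData=[{
            "MetricName": "cpualarm",
            "Timestamp": when,
            "Value": value,
            "Unit": "Count",
        }],
    )
    return value


# The same 80% reading on a Wednesday night and on a Saturday night:
print(cpu_alarm_value(80.0, datetime(2017, 3, 1, 22, 0)))
print(cpu_alarm_value(80.0, datetime(2017, 3, 4, 22, 0)))
```

You would run something like this on a schedule, feeding it the CPU value read from the default metrics, and set a plain alarm for ‘cpualarm’ being 1.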


AWS Questions: VPC

Amazon VPC lets you create a virtual network in the cloud that you control: subnets, IP ranges, internet access, routing, etc.  At recent classes, I was asked some questions about VPC that I dug into to find answers.

  • Does AWS VPC support multicast or broadcast?
    • No, per the FAQs.  But there are some projects to overlay multicast functionality on top of the unicast network within a VPC.
  • Do VPC flow logs cost extra?
    • There is no additional charge, but they go into CloudWatch Logs and you are charged at the normal rate for that usage.
  • Is the NAT Gateway (used to provide internet access to IPv4 private subnets) highly available?
    • It is redundant within an availability zone.  But, from the docs: “If you have resources in multiple Availability Zones and they share one NAT gateway, in the event that the NAT gateway’s Availability Zone is down, resources in the other Availability Zones lose Internet access. To create an Availability Zone-independent architecture, create a NAT gateway in each Availability Zone and configure your routing to ensure that resources use the NAT gateway in the same Availability Zone.”  See also the Egress-only Internet Gateway, if you are using IPv6.

AWS Questions: Kinesis and IAM

  • What happens if you push AWS Kinesis (a high volume managed streaming solution from AWS) past the provisioned shard limits (as specified here)?
    • You start getting exceptions if you are trying to write to or read from the stream.  You can back off or you can increase the number of shards, which increases your throughput.
  • Any planned support for .NET with the Kinesis libraries (Kinesis Producer Library, Kinesis Client Library) which have some nice features?
    • I’m not aware of any future plans.  However, both are available on GitHub (KPL, KCL) and are open source(ish) under the Amazon Software License.  I say “ish” because of some concerns about section 3.3, limits of use.  So you could port the code to .NET.  In addition, there is support for running the KCL with other languages (Ruby, .NET, etc.), but you still need to run a Java daemon.
  • Can someone create an IAM group with more permissions than the group they are in?
    • Yes, if the IAM system is misconfigured.  If a user is in group A which has the attach group policy permission, and has no other limits, they can attach an arbitrary policy to group B.  As part of the AWS shared responsibility model, you are responsible for your IAM setup.
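
Going back to the first Kinesis question, "backing off" usually means retrying with exponentially growing, randomized delays.  Below is a minimal sketch of that pattern.  The ThroughputExceeded class is a hypothetical stand-in for whatever throttling exception your SDK raises, and the function name and default delays are assumptions, not anything from the Kinesis libraries.

```python
import random
import time


class ThroughputExceeded(Exception):
    """Stand-in for the throttling error raised when a shard is over its limit."""


def call_with_backoff(operation, max_attempts=5, base_delay=0.1, max_delay=5.0,
                      sleep=time.sleep, rng=random.random):
    """Call `operation`, retrying throttled calls with jittered backoff."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except ThroughputExceeded:
            if attempt == max_attempts - 1:
                raise  # out of attempts, let the caller decide what to do
            # The delay ceiling doubles on each attempt, up to max_delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            sleep(ceiling * rng())  # random fraction of the ceiling (jitter)


# Usage: a fake write that is throttled twice before it succeeds.
attempts = {"count": 0}


def flaky_put_record():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ThroughputExceeded()
    return "written"


print(call_with_backoff(flaky_put_record, base_delay=0.01))
```

If you find yourself backing off constantly, that is the signal to add shards instead.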

© Moore Consulting, 2003-2017