Business Continuity in the Cloud with Multiple AWS Accounts

We’ve all heard the story of Code Spaces by now – on June 17th, 2014, in only 12 hours, the company went from a highly respected organization with a viable business to rubble. The event was covered by many – check ThreatPost, InfoWorld, or NetworkComputing for a few views on what went wrong, or read the official account on the company website.

It is easy to blame the folks at Code Spaces – with a little more discipline and a more organized response to the attack, they likely could have prevented the outcome. After all, AWS does provide the equivalent of an Emergency Power Off: the root account user can delete all IAM users at once, preventing access to all AWS resources by anyone but the root account itself, which is presumably controlled by a select few and heavily locked down with security questions and Multi-Factor Authentication (MFA) devices. However, we can imagine the folks at Code Spaces torn between keeping their service running and shutting everything down to protect the data. Maybe they had IAM keys embedded in code deployed to EC2 servers, so locking IAM down would have caused an immediate service outage or even some damage. Of course, the actions they did take led to an even worse outcome, so it is easy to say after the fact that they were the wrong ones.
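As a minimal sketch of that lock-out, the script below revokes every IAM user’s access keys and console password rather than deleting the users outright (deleting a user also requires first detaching policies and group memberships). It assumes boto3 and break-glass root credentials; it is illustrative, not something to run lightly:

```python
import boto3

# Run only with root / break-glass credentials: this revokes console and
# API access for every IAM user in the account.
iam = boto3.client("iam")

for page in iam.get_paginator("list_users").paginate():
    for user in page["Users"]:
        name = user["UserName"]
        # Cut off API access by deleting all of the user's access keys.
        for key in iam.list_access_keys(UserName=name)["AccessKeyMetadata"]:
            iam.delete_access_key(UserName=name, AccessKeyId=key["AccessKeyId"])
        # Cut off console access by removing the login profile, if any.
        try:
            iam.delete_login_profile(UserName=name)
        except iam.exceptions.NoSuchEntityException:
            pass
```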

If one follows the steps in Securing Your Amazon Web Services Account Using IAM and works with a company like TriNimbus – bringing in expertise in implementing industry best practices and in enforcing or automating processes for proper use of the AWS cloud – the discipline that Code Spaces may have lacked can readily be put in place and followed. For example, one of the rules we at TriNimbus push for is that creating passwords for IAM users to access the AWS Management Console requires IT authorization. On top of that, we often suggest to our customers that these users be placed in specific IAM Group(s) according to organizational structure and functional role, granting them access only to specific resources. Advice like this, paired with simple rules such as never enabling IAM permissions for anyone but super admins, forcing super admins to use MFA for their IAM logins, and never creating access keys for named users with console access, can greatly reduce the likelihood of a hacker or a rogue (ex-)employee gaining access to the console or the AWS API and causing damage.
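The MFA rule, for one, can be enforced in code. The sketch below attaches an inline policy to a hypothetical "super-admins" IAM group that denies every action on requests not authenticated with MFA; the group and policy names are our assumptions, but the aws:MultiFactorAuthPresent condition key is standard IAM:

```python
import json

import boto3

iam = boto3.client("iam")

# Deny everything when the request was not MFA-authenticated. BoolIfExists
# also catches requests where the MFA key is absent (e.g. plain access keys).
deny_without_mfa = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Deny",
        "Action": "*",
        "Resource": "*",
        "Condition": {"BoolIfExists": {"aws:MultiFactorAuthPresent": "false"}},
    }],
}

iam.put_group_policy(
    GroupName="super-admins",        # hypothetical group of super admins
    PolicyName="deny-without-mfa",
    PolicyDocument=json.dumps(deny_without_mfa),
)
```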

But the story of being protected in the cloud is not complete with just a few best practices. One of the golden rules in the IT world is that you need an offsite backup – a complete backup of all of your data to a location outside of your production location. Many people think that a copy of the data in a second region outside their primary region is sufficient, but as the Code Spaces case shows, that does not prevent hackers with console access from deleting your backup. Some argue for a backup outside AWS altogether. While that is certainly a reasonable choice, it can be very costly and is often done improperly – for example, with backups taken at rates too slow to support the organization’s business continuity requirements. An alternative to both is to use a second AWS account. In a previous post, we recommend and discuss other best practices for Backup and Recovery in AWS.

Here is a simple strategy that uses a secondary AWS account as the equivalent of an offsite backup and ensures that the backup stays protected even if the primary account is compromised:

  • Treat the secondary AWS account as you would a third-party escrow – the people with access to that account should be different from the people with access to the primary account.
  • Do not grant the primary account access to any resources in the secondary account – all actions below should be performed from the secondary account, which should be granted access to resources in the primary account as required.
  • Clone the data in S3 buckets and DynamoDB tables from the primary account to backup buckets and tables in the secondary account – this can easily be automated with Amazon’s Data Pipeline service (see the first sketch after this list).
  • Clone AMIs and EBS, RDS or Redshift snapshots from the primary account to backup AMIs and snapshots in the secondary account – this can be scripted with the AWS APIs and scheduled with Amazon’s Data Pipeline service (see the second sketch after this list).
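As a first sketch, here is what the S3 half of the cloning could look like if scripted with boto3 rather than Data Pipeline. It runs under secondary-account credentials and assumes the primary account’s bucket policy grants that account s3:ListBucket and s3:GetObject; both bucket names are hypothetical:

```python
import boto3

# Run under secondary-account credentials. The primary account's bucket
# policy must grant this account s3:ListBucket and s3:GetObject on the
# source bucket; the primary account never gets access to the escrow bucket.
s3 = boto3.resource("s3")
source = s3.Bucket("prod-data")          # hypothetical primary-account bucket
escrow = s3.Bucket("prod-data-escrow")   # hypothetical secondary-account bucket

for obj in source.objects.all():
    # copy() performs a server-side copy into the escrow-owned bucket.
    escrow.copy({"Bucket": source.name, "Key": obj.key}, obj.key)
```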
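As a second sketch, cloning an EBS snapshot is a two-step dance across the accounts: the primary account shares the snapshot, and the secondary account copies it so that it owns an independent copy the primary account can no longer delete. The account ID and snapshot ID below are placeholders, and this form applies to unencrypted snapshots:

```python
import boto3

REGION = "us-east-1"
ESCROW_ACCOUNT = "222222222222"   # hypothetical secondary account ID
SNAPSHOT_ID = "snap-1234abcd"     # hypothetical snapshot to protect

# Step 1, from the primary account: share the (unencrypted) snapshot
# with the escrow account.
primary_ec2 = boto3.client("ec2", region_name=REGION)
primary_ec2.modify_snapshot_attribute(
    SnapshotId=SNAPSHOT_ID,
    Attribute="createVolumePermission",
    OperationType="add",
    UserIds=[ESCROW_ACCOUNT],
)

# Step 2, from the secondary account: copy the shared snapshot so the
# escrow account owns a copy the primary account cannot touch.
escrow_ec2 = boto3.client("ec2", region_name=REGION)
result = escrow_ec2.copy_snapshot(
    SourceRegion=REGION,
    SourceSnapshotId=SNAPSHOT_ID,
    Description="Escrow copy of " + SNAPSHOT_ID,
)
print("Escrow-owned snapshot:", result["SnapshotId"])
```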

You can go further and move from escrow to a complete Business Continuity Planning/Disaster Recovery (BCP/DR) implementation by building CloudFormation templates and other automation that launch your infrastructure from the backups, potentially enabling your escrow account to act as your recovered production account.
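The launch itself could then be a single API call from the escrow account. In this sketch the stack name, template URL and parameter names are all hypothetical; the template would rebuild the environment from the escrowed AMIs and snapshots:

```python
import boto3

# Run under escrow-account credentials during a DR event.
cloudformation = boto3.client("cloudformation", region_name="us-east-1")
cloudformation.create_stack(
    StackName="bcp-recovery",
    TemplateURL="https://s3.amazonaws.com/escrow-templates/recovery.template",
    Parameters=[
        {"ParameterKey": "WebAmiId", "ParameterValue": "ami-1234abcd"},
        {"ParameterKey": "DbSnapshotId", "ParameterValue": "snap-1234abcd"},
    ],
    Capabilities=["CAPABILITY_IAM"],  # needed if the template creates IAM roles
)
```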

Even if you don’t think it is worth going through all of this to move your organization’s backups to a secondary account, we highly recommend enabling Amazon’s CloudTrail service and either completely locking down access to the S3 bucket where CloudTrail delivers the logs or cloning that bucket to a separate account as described above. You will appreciate being able to do forensics in the aftermath of a problem caused by a person or rogue code with access to your account. You should also consider letting one of the CloudTrail partners regularly analyse your CloudTrail logs and raise alerts, or let you query the data without having to understand the JSON format in which the logs are stored on S3.
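Turning the trail on is the easy part – a sketch, with hypothetical names, assuming the target bucket already carries the standard CloudTrail bucket policy (s3:GetBucketAcl and s3:PutObject for the CloudTrail service):

```python
import boto3

# The bucket should be dedicated to audit logs, tightly locked down, and
# ideally cloned to the escrow account as described above.
cloudtrail = boto3.client("cloudtrail", region_name="us-east-1")
cloudtrail.create_trail(Name="audit-trail", S3BucketName="audit-logs-escrow")
cloudtrail.start_logging(Name="audit-trail")
```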