Conducting Cyber Breach Investigations in AWS Environments - re:Invent talk transcript

Dec 12, 2020

Introduction

Recently, Matt May and I gave a talk at AWS re:Invent about conducting cyber breach investigations in AWS environments. The talk consisted of two parts:

Overview of incident response in AWS (Andrew Gorecki)
A client case (Matt May)

In this blog post, I provide a transcript of the first part and additional information I did not include in the talk.

AWS Security Framework

Over the years, various organizations and standardization bodies have defined frameworks that allow cybersecurity professionals to develop threat models, organize their cybersecurity functions, and communicate cybersecurity requirements to multiple audiences, including senior management. These frameworks include the NIST Cybersecurity Framework, various ISO standards, the NIST incident response lifecycle, and MITRE ATT&CK.

Amazon Web Services organized its strategic approach to security around the following functions:

Prevent
Detect
Respond
Remediate

Each of those functions is supported by an ecosystem of AWS security services and tools available to AWS customers on a subscription basis. In this blog post, I primarily focus on the Respond function and discuss incident response in the context of the shared responsibility model.

AWS Partner Network

The AWS security ecosystem is supported and further enhanced by the AWS Partner Network. Incident response firms are an essential element in supporting the Respond function by assisting AWS customers with additional services, including:

Preparation through custom incident response plans, playbooks, maturity assessments, and threat intelligence.
Incident analysis leveraging AWS-native tools and more traditional approaches to incident response and digital forensics.
Assistance with tactical and strategic remediation of security breaches.

Third-party incident response firms often combine AWS-native services and tools with traditional incident response and digital forensics approaches to help their customers determine the full scope and extent of a breach.

Shared Responsibility

By migrating workloads to the AWS cloud, customers inherit security controls provided by AWS, including the security of the physical network, virtualization layer, and resources managed by AWS.

Customers' responsibility increases as they provision unmanaged resources at lower levels of the application stack. For example, when an organization provisions an EC2 instance, its team is responsible for securing the virtual private cloud (VPC), the EC2 instance itself, and the upper layers of the application stack, including middleware, applications, and data. Even with the Software-as-a-Service consumption model, customers retain security responsibilities of their own.

No matter what the consumption model is, customers are always responsible for securing the following:

Data
Identity and access management
End-user computing
On-premises systems connecting to AWS
Applications
Platforms provisioned and controlled by the customer

Security weaknesses in the resources controlled by customers commonly lead to breaches in AWS environments.

Breaches in AWS Environments

The following list provides common causes of compromises in AWS environments, based on my observations from client incident response engagements:

Access and secret key exposure through public code repositories. This scenario occurs when developers hard-code credentials in the application's codebase.
Application exploits resulting from vulnerabilities in applications and unpatched platforms provisioned by customers.
Non-Public Information (NPI) exposure through unsecured resources, such as internal S3 buckets unintentionally exposed to the Internet.
Threat actors pivoting from an on-premises environment to AWS, including compromised end-user computers.
Third-party compromises resulting in threat actors harvesting AWS credentials.
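
The first cause in the list, hard-coded access keys, is something a team can look for before code reaches a public repository. The following is a minimal sketch rather than a complete secret scanner. It assumes that long-term access key IDs start with AKIA and temporary ones with ASIA, followed by 16 uppercase letters or digits, which matches the commonly documented format. The function name find_access_key_ids is hypothetical, and the sample key is the placeholder value AWS uses in its own documentation.

```python
import re

# Assumed format: AKIA (long-term) or ASIA (temporary) prefix followed by
# 16 uppercase letters or digits. Other key prefixes are ignored here.
ACCESS_KEY_ID = re.compile(r"\b(?:AKIA|ASIA)[0-9A-Z]{16}\b")


def find_access_key_ids(text):
    """Return (line_number, key_id) pairs for every match in the text."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in ACCESS_KEY_ID.finditer(line):
            hits.append((line_number, match.group(0)))
    return hits


if __name__ == "__main__":
    sample = (
        "region = 'us-east-1'\n"
        "aws_access_key_id = 'AKIAIOSFODNN7EXAMPLE'\n"
    )
    for line_number, key_id in find_access_key_ids(sample):
        print(f"line {line_number}: possible access key ID {key_id}")
```

A match only indicates a string that looks like an access key ID. Whether the key is live, and what it was used for, still has to be established from the account's logs.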

The AWS client base is so vast that MITRE released an AWS-specific ATT&CK framework. It includes nine tactics, each with multiple techniques, to help organizations model threats against their AWS resources and plan security controls.

Incident Response

Incident response in AWS environments combines traditional forensic methods with AWS-native tools and capabilities. The type of data analysts can acquire depends on the provisioned resources. For example, if an organization provisions EC2 instances, analysts can acquire system memory and snapshots of system volumes. With PaaS and SaaS resources, analysts must rely on event logs generated by AWS services, such as CloudTrail.
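
To make the CloudTrail point concrete, here is a hedged sketch of triaging one delivered log file. It assumes the file has already been downloaded and decompressed, and that it follows the documented layout: a top-level Records array whose entries carry fields such as eventTime, eventSource, eventName, sourceIPAddress, and userIdentity.accessKeyId. The function name and all sample values are hypothetical.

```python
import json


def events_for_access_key(log_text, access_key_id):
    """Return a time-ordered summary of events made with one access key."""
    records = json.loads(log_text).get("Records", [])
    matches = [
        {
            "time": record.get("eventTime"),
            "source": record.get("eventSource"),
            "name": record.get("eventName"),
            "ip": record.get("sourceIPAddress"),
        }
        for record in records
        if record.get("userIdentity", {}).get("accessKeyId") == access_key_id
    ]
    # ISO 8601 UTC timestamps sort correctly as plain strings.
    return sorted(matches, key=lambda event: event["time"] or "")


if __name__ == "__main__":
    # Hypothetical two-record log file, for illustration only.
    sample_log = json.dumps({
        "Records": [
            {
                "eventTime": "2020-12-01T10:05:00Z",
                "eventSource": "s3.amazonaws.com",
                "eventName": "ListBuckets",
                "sourceIPAddress": "198.51.100.7",
                "userIdentity": {"accessKeyId": "AKIAIOSFODNN7EXAMPLE"},
            },
            {
                "eventTime": "2020-12-01T10:01:00Z",
                "eventSource": "iam.amazonaws.com",
                "eventName": "ListUsers",
                "sourceIPAddress": "198.51.100.7",
                "userIdentity": {"accessKeyId": "AKIAIOSFODNN7EXAMPLE"},
            },
        ]
    })
    for event in events_for_access_key(sample_log, "AKIAIOSFODNN7EXAMPLE"):
        print(event["time"], event["source"], event["name"], event["ip"])
```

Pivoting on an exposed access key in this way gives a quick timeline of what the key was used for. A real investigation would run the same filter across every log file in the relevant time window.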

Let’s take a brief look at AWS-native services that analysts can leverage during investigations before discussing more traditional approaches.

AWS Tools and Capabilities

CloudTrail, VPC Flow Logs, and CloudWatch are the three primary services analysts can leverage to investigate incidents in AWS environments.

CloudWatch allows organizations to collect, search, and visualize data from multiple AWS resources. Administrators can configure specific resources to forward data to CloudWatch, such as security event logs for correlation, analytics, and searching.
CloudTrail, on the other hand, provides detailed event logs of account activity on AWS resources, including actions taken through the AWS Management Console, the AWS command-line interface, and API calls. The service also provides an optional log file integrity validation feature that analysts can use to establish whether an attacker tampered with log files.
Finally, VPC Flow Logs capture information about the IP traffic going to and from network interfaces in your VPC. Customers can enable flow logging at the VPC, subnet, or network interface level.
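
As a rough illustration of what analysts work with in VPC Flow Logs, the sketch below parses records in the default (version 2) format, which is a space-separated line of 14 fields. It assumes the default format only; custom formats use a different field order. The names FlowRecord and parse_flow_record are hypothetical, and the sample line is the kind of record shown in the AWS documentation.

```python
from collections import namedtuple

# Field order of the default (version 2) flow log format.
FIELDS = (
    "version account_id interface_id srcaddr dstaddr srcport dstport "
    "protocol packets bytes start end action log_status"
)
FlowRecord = namedtuple("FlowRecord", FIELDS)


def parse_flow_record(line):
    """Split one default-format flow log line into named string fields."""
    parts = line.split()
    if len(parts) != len(FlowRecord._fields):
        raise ValueError(f"expected {len(FlowRecord._fields)} fields, got {len(parts)}")
    return FlowRecord(*parts)


if __name__ == "__main__":
    sample = (
        "2 123456789010 eni-1235b8ca123456789 172.31.16.139 172.31.16.21 "
        "20641 22 6 20 4249 1418530010 1418530070 ACCEPT OK"
    )
    record = parse_flow_record(sample)
    # Protocol 6 is TCP, so this line is accepted traffic to port 22 (SSH).
    print(record.srcaddr, "->", record.dstaddr, record.dstport, record.action)
```

All values stay as strings because records with a log status of NODATA or SKIPDATA carry a dash in place of the traffic fields.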

AWS also offers several other services, including Inspector, Macie, GuardDuty, and Security Hub, that greatly simplify incident investigations. I encourage you to explore those services on the AWS website.

Traditional Incident Response Methods

Unfortunately, AWS does not provide its customers with a facility to create point-in-time snapshots of EC2 instance state, as is the case with VMware and other virtualization platforms commonly leveraged in on-premises environments. However, customers can still acquire the contents of a virtual machine in the following ways:

Memory: use a third-party tool to acquire the contents of system memory. Several tools exist to collect Windows memory. To acquire Linux memory, analysts commonly use a loadable kernel module such as LiME.
System volume: create a point-in-time snapshot of the Elastic Block Store (EBS) volume containing the operating system files.
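
The system volume step can be scripted. Below is a minimal sketch, assuming the boto3 SDK and an EC2 client created by the caller. It uses the describe_volumes and create_snapshot calls to snapshot every EBS volume attached to an instance. The function name, the case-id tag key, and the instance ID in the usage comment are hypothetical.

```python
def snapshot_instance_volumes(ec2_client, instance_id, case_id):
    """Snapshot every EBS volume attached to an instance; return snapshot IDs."""
    # A single describe_volumes call is assumed to be enough here, since one
    # instance has few volumes; larger queries would need pagination.
    response = ec2_client.describe_volumes(
        Filters=[{"Name": "attachment.instance-id", "Values": [instance_id]}]
    )
    snapshot_ids = []
    for volume in response["Volumes"]:
        volume_id = volume["VolumeId"]
        snapshot = ec2_client.create_snapshot(
            VolumeId=volume_id,
            Description=f"IR case {case_id}: {instance_id} {volume_id}",
            TagSpecifications=[
                {
                    "ResourceType": "snapshot",
                    # Hypothetical tag key for tracking evidence by case.
                    "Tags": [{"Key": "case-id", "Value": case_id}],
                }
            ],
        )
        snapshot_ids.append(snapshot["SnapshotId"])
    return snapshot_ids


# Usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   ec2 = boto3.client("ec2", region_name="us-east-1")
#   print(snapshot_instance_volumes(ec2, "i-0123456789abcdef0", "case-001"))
```

Passing the client in as a parameter keeps the function easy to exercise with a stub before running it against a live account. Snapshots complete asynchronously, so a real workflow would wait for each one to finish before using it.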

To facilitate forensic analysis of the disk volume, analysts have two options:

Provision another EC2 instance in the AWS cloud, attach the system EBS volume to it, and acquire a forensic image of that volume for offline analysis using tools such as the dd utility.
Alternatively, provision a forensic analysis system in the AWS cloud and attach a volume created from the EBS snapshot to that system for analysis in the cloud.
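
For the first option, the imaging step itself is a chunked copy, much like what dd does. The sketch below is a hedged Python equivalent that also computes a SHA-256 digest of the bytes it reads, on the assumption that the analyst wants a hash to verify the image later (the talk itself only mentions dd). It assumes the attached volume is readable as a block device. The function name and the device path in the usage comment are hypothetical.

```python
import hashlib


def image_with_sha256(source_path, image_path, chunk_size=4 * 1024 * 1024):
    """Copy source to image in chunks; return the SHA-256 of the bytes read."""
    digest = hashlib.sha256()
    with open(source_path, "rb") as source, open(image_path, "wb") as image:
        while True:
            chunk = source.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
            image.write(chunk)
    return digest.hexdigest()


# Usage on the analysis instance (hypothetical device name, run as root):
#   print(image_with_sha256("/dev/xvdf", "/evidence/case-001.img"))
```

Hashing while copying means the source is read only once. Re-hashing the image file afterwards and comparing the two digests confirms that the copy on disk matches what was read from the volume.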

To analyze system memory, analysts can leverage traditional approaches and tools, such as the Volatility framework.

Of course, customers can also install an endpoint detection and response (EDR) agent on EC2 instances and collect system telemetry while the system is running. There are also various open-source tools that analysts can run to perform targeted forensic acquisitions from running systems.

Conclusion

Incident response in AWS environments combines traditional incident response and digital forensics methods with AWS-native tools and capabilities. Moreover, the type of data analysts can acquire depends on the provisioned resources and the auditing enabled on those services. Consequently, analysts must be familiar with both AWS and traditional forensic methods to investigate incidents in AWS environments effectively.