AWS: EKS Security Best Practices

Amazon Elastic Kubernetes Service (EKS) is a powerful platform for running containerized applications using Kubernetes. Like many of the managed services AWS offers, it abstracts away much of the backend infrastructure so that developers and infrastructure engineers can focus on the business logic of their applications. Because EKS clusters run in a cloud environment and often host sensitive workloads, securing them is paramount. This guide covers best practices and strategies for securing AWS EKS clusters to protect your containerized workloads and sensitive data.

Guide

  1. Introduction

  2. Understanding AWS EKS Security Model

  3. Securing the EKS Control Plane

  4. Securing EKS Worker Nodes

  5. Implementing Pod Security

  6. IAM and EKS

  7. Networking and EKS Security

  8. Data Protection

  9. Monitoring and Auditing

  10. Incident Response and Recovery

  11. Compliance and Governance

  12. Ongoing Education and Best Practices

1. Introduction

Amazon Elastic Kubernetes Service (EKS) simplifies the deployment and management of containerized applications with Kubernetes. However, as with any cloud service, securing your EKS clusters is a shared responsibility between AWS and the customer.

This shared responsibility model is the cornerstone of the AWS security model: AWS takes on part of the patching and configuration management that we, the customers, would otherwise handle ourselves for on-prem systems. As detailed later in this guide, for EKS this takes the form of an AWS-managed control plane that serves as an endpoint developers can connect to directly with kubectl.

2. Understanding AWS EKS Security Model

To secure your EKS clusters effectively, you need a good grasp of the AWS EKS security model:

  • EKS Control Plane Security: AWS manages and secures the control plane, ensuring that it is always up to date and protected against threats.

  • Worker Node Security: Securing your worker nodes is your responsibility. Regularly update the OS and node software, enforce security policies, and employ monitoring tools.

    • Note that this does not apply to Fargate deployments, which are considered serverless: AWS manages the underlying infrastructure entirely, and users have no visibility into it.
  • Pod Security: Implement best practices for pod security by using security contexts, network policies, and container runtime security features.

    • It is important to audit your pods periodically, as this is where the heavy lifting of your application takes place. Scanning code and catching issues before they are deployed into a pod is just as important, and falls under the application security (AppSec) side of the security space.

3. Securing the EKS Control Plane

  • Access Control: Secure the control plane by using AWS Identity and Access Management (IAM) roles and policies. Implement strong authentication mechanisms and Multi-Factor Authentication (MFA). Ensure routine audits take place and permissions follow the least-privilege principle.

  • Encryption: Use Transport Layer Security (TLS) for encrypting communications between the control plane and worker nodes. Employ AWS Key Management Service (KMS) for encryption keys.

  • Audit Logging: Enable audit logs for the control plane. Store these logs in a place such as CloudWatch or an external logging system, and monitor them for potential security incidents.

  • GuardDuty: Enable GuardDuty's EKS protection to gain insights into potentially malicious control plane events.
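
The control plane checks above can be partly automated. The following is a minimal sketch, assuming you already have the cluster description as a Python dict shaped like the output of `aws eks describe-cluster` (the `cluster` key of that response). The function name `audit_control_plane` and the wording of the findings are illustrative, not part of any AWS tooling.

```python
from __future__ import annotations

from typing import Any

# Control plane log types that EKS can deliver to CloudWatch Logs.
EXPECTED_LOG_TYPES = {"api", "audit", "authenticator", "controllerManager", "scheduler"}


def audit_control_plane(cluster: dict[str, Any]) -> list[str]:
    """Return findings for a DescribeCluster-style dict (hypothetical helper)."""
    findings: list[str] = []

    # Collect every log type that is switched on.
    enabled: set[str] = set()
    for entry in cluster.get("logging", {}).get("clusterLogging", []):
        if entry.get("enabled"):
            enabled.update(entry.get("types", []))
    missing = EXPECTED_LOG_TYPES - enabled
    if missing:
        findings.append("control plane logs disabled: " + ", ".join(sorted(missing)))

    # A public API endpoint reachable from any address is worth flagging.
    vpc = cluster.get("resourcesVpcConfig", {})
    if vpc.get("endpointPublicAccess") and "0.0.0.0/0" in vpc.get("publicAccessCidrs", []):
        findings.append("API endpoint is public and open to 0.0.0.0/0")

    # Envelope encryption of Kubernetes secrets with a KMS key.
    encrypts_secrets = any(
        "secrets" in cfg.get("resources", []) and cfg.get("provider", {}).get("keyArn")
        for cfg in cluster.get("encryptionConfig", [])
    )
    if not encrypts_secrets:
        findings.append("secrets are not envelope-encrypted with a KMS key")

    return findings
```

An empty list means none of these three checks fired. It does not mean the control plane is fully hardened, since IAM permissions and MFA are outside what this sketch inspects.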

4. Securing EKS Worker Nodes

  • OS Hardening: Keep the underlying OS updated with the latest security patches. Disable unnecessary services and applications. Follow guidance for your operating system, such as the CIS benchmarks for Linux systems.

  • Container Runtime Security: Keep the container runtime regularly updated to patch vulnerabilities, and use runtime security tooling to detect suspicious behavior in running containers. Examples of such tooling include Falco and CrowdStrike.

  • Access Control: If possible, do not enable SSH access to worker nodes. Where it is required, limit who can connect, monitor access logs, and use secure access mechanisms such as EC2 key pairs.

  • Network Access: Harden network access to the nodes. Only workloads explicitly designed to be public should live in a public subnet. Apply all appropriate network security controls.
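
As one concrete check for the SSH and network access points above, the sketch below flags a node security group that allows port 22 from anywhere. It assumes the input dict has the shape of one entry from `aws ec2 describe-security-groups` (an `IpPermissions` list with `IpRanges` and `Ipv6Ranges`). The helper names are illustrative.

```python
from __future__ import annotations

from typing import Any

# CIDR blocks that mean "any address" for IPv4 and IPv6.
OPEN_CIDRS = {"0.0.0.0/0", "::/0"}


def rule_covers_port(rule: dict[str, Any], port: int) -> bool:
    """True if an IpPermissions entry allows TCP traffic on `port`."""
    protocol = rule.get("IpProtocol")
    if protocol == "-1":  # "-1" means all protocols and all ports
        return True
    if protocol != "tcp":
        return False
    return rule.get("FromPort", 0) <= port <= rule.get("ToPort", 65535)


def world_open_ssh(security_group: dict[str, Any]) -> bool:
    """True if any inbound rule exposes port 22 to the whole internet."""
    for rule in security_group.get("IpPermissions", []):
        if not rule_covers_port(rule, 22):
            continue
        cidrs = {r.get("CidrIp") for r in rule.get("IpRanges", [])}
        cidrs |= {r.get("CidrIpv6") for r in rule.get("Ipv6Ranges", [])}
        if cidrs & OPEN_CIDRS:
            return True
    return False
```

A rule with a port range such as 0 to 1024 is also caught, because the check tests whether 22 falls inside the range rather than looking for an exact match.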

5. Implementing Pod Security

  • Security Contexts: Utilize Kubernetes security contexts to set security parameters for pods. Apply the principle of least privilege to restrict capabilities and access.

  • Pod Network Policies: Implement Kubernetes Network Policies to control communication between pods. Enforce policies that follow the principle of least privilege.

  • Pod Identity and Secrets: Utilize service accounts to control pod identities. Manage and rotate secrets securely using Kubernetes Secrets. Ensure secrets are always encrypted and access is audited. For tighter security, use KMS customer-managed keys and custom key policies. See: https://docs.aws.amazon.com/eks/latest/userguide/enable-kms.html
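
To make the security context advice concrete, here is a small sketch that reviews a pod spec (as a Python dict, for example loaded from a manifest) against a handful of least-privilege settings. The field names (`runAsNonRoot`, `allowPrivilegeEscalation`, `readOnlyRootFilesystem`, `capabilities.drop`, `privileged`) are standard Kubernetes fields. The helper names and the choice of which settings to require are assumptions for illustration, and your own baseline may differ.

```python
from __future__ import annotations

from typing import Any


def container_findings(container: dict[str, Any], pod_context: dict[str, Any]) -> list[str]:
    """Check one container against a hypothetical least-privilege baseline."""
    name = container.get("name", "<unnamed>")
    ctx = container.get("securityContext") or {}
    findings: list[str] = []

    if ctx.get("privileged"):
        findings.append(f"{name}: privileged mode is enabled")
    # Treat an unset value as "allowed", which is the conservative reading.
    if ctx.get("allowPrivilegeEscalation", True):
        findings.append(f"{name}: allowPrivilegeEscalation is not set to false")
    # runAsNonRoot may be set on the container or inherited from the pod.
    if not ctx.get("runAsNonRoot", pod_context.get("runAsNonRoot", False)):
        findings.append(f"{name}: runAsNonRoot is not set to true")
    if not ctx.get("readOnlyRootFilesystem"):
        findings.append(f"{name}: root filesystem is writable")
    if "ALL" not in (ctx.get("capabilities") or {}).get("drop", []):
        findings.append(f"{name}: does not drop ALL capabilities")
    return findings


def pod_findings(pod_spec: dict[str, Any]) -> list[str]:
    """Check every container and init container in a pod spec."""
    pod_context = pod_spec.get("securityContext") or {}
    containers = pod_spec.get("containers", []) + pod_spec.get("initContainers", [])
    findings: list[str] = []
    for container in containers:
        findings.extend(container_findings(container, pod_context))
    return findings
```

A check like this fits naturally into a CI step that runs before manifests are applied, so that overly permissive pods are caught before they reach the cluster.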

6. IAM and EKS

  • IAM Roles and Policies: Implement the principle of least privilege for IAM roles associated with your EKS clusters. Regularly audit and update IAM policies. Keep track of user access, ensure access keys are limited and EKS permissions are hardened.

  • IAM Roles for Service Accounts (IRSA): Use IRSA to associate IAM roles directly with Kubernetes service accounts. This provides fine-grained control over pod permissions and reduces reliance on static AWS credentials.
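
With IRSA, the link between an IAM role and a service account lives in the role's trust policy. The sketch below builds such a trust policy as a Python dict. It assumes the standard `aws` partition and that the cluster's OIDC provider is already registered in IAM. The account ID, provider URL, namespace, and service account name in the usage example are placeholders, and `irsa_trust_policy` is an illustrative helper rather than an AWS API.

```python
import json


def irsa_trust_policy(account_id: str, oidc_provider: str, namespace: str, service_account: str) -> dict:
    """Build a trust policy that lets one service account assume a role."""
    # The provider is referenced without its URL scheme.
    provider = oidc_provider
    if provider.startswith("https://"):
        provider = provider[len("https://"):]

    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{provider}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        # Pin the role to exactly one namespace and service account.
                        f"{provider}:sub": f"system:serviceaccount:{namespace}:{service_account}",
                        f"{provider}:aud": "sts.amazonaws.com",
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    policy = irsa_trust_policy(
        "111122223333",  # placeholder account ID
        "https://oidc.eks.us-east-1.amazonaws.com/id/EXAMPLE",  # placeholder provider URL
        "default",
        "s3-reader",
    )
    print(json.dumps(policy, indent=2))
```

Using `StringEquals` on the `sub` claim, rather than a wildcard match, keeps the role scoped to a single service account, which is in line with the least-privilege guidance above.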

7. Networking and EKS Security

  • VPC Security: Leverage Amazon Virtual Private Cloud (VPC) features like security groups and network ACLs to control inbound and outbound traffic to EKS clusters. Enable VPC flow logs and send them to a logging system so that they can be searched quickly in the event of a security incident.

  • Network Policies: Implement Kubernetes Network Policies to control pod-to-pod communication. Define and enforce policies that align with the principle of least privilege.

  • Service Mesh: Implement a service mesh like Istio for enhanced security, including traffic encryption and access control.
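
A common way to apply least privilege with Network Policies is to start from a default-deny policy in each namespace and then add narrow allow rules. The sketch below generates both as JSON manifests, which `kubectl apply -f` accepts alongside YAML. The policy names, labels, and port are placeholders, and enforcement depends on the cluster running a network plugin that supports NetworkPolicy.

```python
import json


def default_deny(namespace: str) -> dict:
    """Deny all ingress and egress for every pod in the namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        # An empty podSelector selects every pod in the namespace.
        "spec": {"podSelector": {}, "policyTypes": ["Ingress", "Egress"]},
    }


def allow_ingress(namespace: str, name: str, to_labels: dict, from_labels: dict, port: int) -> dict:
    """Allow TCP traffic on one port from one set of pods to another."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "podSelector": {"matchLabels": to_labels},
            "policyTypes": ["Ingress"],
            "ingress": [
                {
                    "from": [{"podSelector": {"matchLabels": from_labels}}],
                    "ports": [{"protocol": "TCP", "port": port}],
                }
            ],
        },
    }


if __name__ == "__main__":
    # Placeholder namespace and labels, for illustration only.
    print(json.dumps(default_deny("payments"), indent=2))
    print(json.dumps(
        allow_ingress("payments", "api-from-frontend", {"app": "api"}, {"app": "frontend"}, 8080),
        indent=2,
    ))
```

Note that a default-deny egress policy also blocks DNS lookups, so in practice an additional egress rule for the cluster DNS service is usually needed.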

8. Data Protection

  • Data Encryption: Use encryption mechanisms like AWS Key Management Service (KMS) and TLS to protect sensitive data in transit and at rest.

  • Backup and Disaster Recovery: Regularly back up critical data and create disaster recovery plans to ensure data resilience.

9. Monitoring and Auditing

  • CloudWatch and CloudTrail: Utilize Amazon CloudWatch and AWS CloudTrail for monitoring and auditing EKS clusters. Set up alarms and notifications for suspicious activities.

  • Prometheus and Grafana: Implement monitoring solutions like Prometheus and Grafana for real-time insights into cluster health.
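
As a rough illustration of what "suspicious activity" can mean in practice, the sketch below scans Kubernetes audit log events (one JSON object per line, as exported from a log store) for a few patterns. The event fields used (`verb`, `user.username`, `objectRef.resource`, `responseStatus.code`) come from the Kubernetes audit event format. The three heuristics themselves are assumptions chosen for illustration and will need tuning, since reads of secrets by controllers, for example, are often legitimate.

```python
import json
from typing import Iterable, List, Tuple

READ_VERBS = {"get", "list", "watch"}


def suspicious_events(lines: Iterable[str]) -> List[Tuple[str, str, str, str]]:
    """Return (reason, username, verb, resource) for events worth reviewing."""
    hits = []
    for line in lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not JSON
        if not isinstance(event, dict):
            continue

        user = (event.get("user") or {}).get("username", "")
        verb = event.get("verb", "")
        resource = (event.get("objectRef") or {}).get("resource", "")
        code = (event.get("responseStatus") or {}).get("code")

        if user == "system:anonymous":
            reason = "anonymous request"
        elif code == 403:
            reason = "forbidden request"
        elif resource == "secrets" and verb in READ_VERBS:
            reason = "secret read"
        else:
            continue
        hits.append((reason, user, verb, resource))
    return hits
```

The same conditions could instead be expressed as metric filters or alarms in your logging system. The Python version is mainly useful for ad hoc review during an investigation.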

10. Incident Response and Recovery

  • Incident Response Plan: Develop an incident response plan detailing how to identify, contain, eradicate, and recover from security incidents.

  • Backups and Rollbacks: Maintain backups of cluster configurations and applications. Be prepared to roll back to a stable state if needed.

11. Compliance and Governance

  • Compliance Standards: Align your EKS clusters with relevant compliance standards and best practices, such as HIPAA, GDPR, or CIS benchmarks.

  • Resource Tagging: Use AWS resource tagging to manage and track resources efficiently.
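
Tagging only helps governance if it is applied consistently. The sketch below checks a resource's tags against a required set. The tag keys in `REQUIRED_TAGS` are a hypothetical policy, not an AWS standard, and the input is assumed to be a plain key-to-value dict, which is how EKS returns cluster tags.

```python
from typing import Dict, Iterable, List

# Hypothetical tagging policy; replace with your organization's own keys.
REQUIRED_TAGS = ("owner", "environment", "data-classification")


def missing_tags(tags: Dict[str, str], required: Iterable[str] = REQUIRED_TAGS) -> List[str]:
    """Return required tag keys that are absent or have an empty value."""
    # Compare keys case-insensitively and ignore blank values.
    present = {key.lower() for key, value in tags.items() if value and value.strip()}
    return [key for key in required if key.lower() not in present]
```

For example, `missing_tags({"Owner": "platform-team", "environment": ""})` reports `environment` and `data-classification` as missing, because an empty value is treated the same as an absent tag.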

12. Ongoing Education and Best Practices

  • Stay Informed: Keep up with the latest AWS updates, security best practices, and emerging threats. Follow multiple sources of cyber news, including AWS themselves for security bulletins.

    • Technology moves very fast, so it's important to keep up to date on not only AWS but also Kubernetes bulletins that will provide updates on security issues and related announcements.
  • Training and Certification: Invest in training and AWS certifications to enhance your knowledge and skills in EKS security.

  • Plan & Simulate: Routinely run tabletop incident response exercises to simulate real-world attack scenarios, such as access key compromise or data loss. Improve the exercises incrementally and track progress over time, making sure key stakeholders are present and aware of the outcomes.