AWS

Takeaway

Takeaway - Security

  • Data Encryption / Decryption

    • Usually refers to encryption of data at rest, where users explicitly specify a (symmetric) encryption key.
    • Because the key must be specified explicitly, encryption at rest is generally not enabled by default.
    • Encryption in transit is enabled by default and needs no user intervention, provided clients connect to TLS-enabled endpoints.

AWS Architecture

AWS Architecture Center

Migrate & Modernize

Migrate & Modernize

AWS Whitepapers

AWS Well-Architected Framework

AWS Well-Architected Framework

Operational excellence

Security

Reliability

Disaster Recovery (DR)

Disaster Recovery (DR)

  • Recovery Time Objective (RTO) is the maximum acceptable delay between the interruption of service and restoration of service. This determines what is considered an acceptable time window when service is unavailable.

  • Recovery Point Objective (RPO) is the maximum acceptable amount of time since the last data recovery point. This determines what is considered an acceptable loss of data between the last recovery point and the interruption of service.

  • DR strategies

    • Backup & Restore

      • RPO / RTO : Hours
      • Lower priority use cases
      • Provision all AWS resources after event
      • Restore backups after event
      • Cost $
    • Pilot Light

      • RPO / RTO : 10s of minutes
      • Data live
      • Services idle
      • Provision some AWS resources and scale after event
      • Cost $$
    • Warm standby

      • RPO / RTO : Minutes
      • Always running, but smaller
      • Business critical
      • Scale AWS resources after event
      • Cost $$$
    • Multi-site

      • RPO / RTO : Real-time
      • Zero downtime
      • Near zero data loss
      • Mission Critical Services
      • Cost $$$$
  • Resources

Performance efficiency

Cost optimization

Sustainability

CLI

CLI - Pagination

Pagination

  • By default, the AWS CLI uses a page size of 1000 and retrieves all available items.

  • If more items are available than the page size, multiple API calls are made until all items have been returned.

  • Parameters

    • --no-paginate

      Returns only the first page of results, so only a single API call is made

    • --page-size

      Specifies the number of items per page (1000 by default)

    • --max-items

      Specifies the total number of items to return (all available items by default)

    • --starting-token

      When --max-items is smaller than the number of available items, the output includes a NextToken; pass it to --starting-token to retrieve the remaining items
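
The NextToken flow can be sketched with any paginated command; the bucket name and the jq extraction below are illustrative, not from the source:

```shell
# Fetch only the first 50 items; the response includes "NextToken" when more remain.
aws s3api list-objects-v2 --bucket my-bucket --max-items 50 > page1.json

# Resume where the previous call stopped.
token=$(jq -r '.NextToken' page1.json)
aws s3api list-objects-v2 --bucket my-bucket --max-items 50 --starting-token "$token"
```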

CLI - Tagging

Find resources by specified tags in the specific Region

aws resourcegroupstaggingapi get-resources \
  --tag-filters Key=Environment,Values=Production \
  --tags-per-page 100

CLI - Filter

CLI - Cheatsheet

CLI - CloudWatch - Get Log Groups

  • aws logs describe-log-groups

CLI - CloudWatch - Get Log Streams

  • aws logs describe-log-streams --log-group-name <log-group-name>

CLI - CloudWatch - Get Log Events

  • aws logs get-log-events --log-group-name <log-group-name> --log-stream-name <log-stream-name> --limit 100

  • aws logs get-log-events --log-group-name <log-group-name> --log-stream-name <log-stream-name> --start-time <start-time> --end-time <end-time>

CLI - CloudWatch - Get paginated all log events of a log group in text output

  • aws logs filter-log-events --log-group-name <log-group-name> --output text

    Suitable for general browsing

CLI - CloudWatch - Search keyword in log events of a log group

  • aws logs filter-log-events --log-group-name <log-group-name> --limit 100 --filter-pattern %Keyword%

CLI - S3 - Listing all user owned buckets

  • aws s3 ls

Cost Management

AWS Docs - Cost Management

Savings Plans

AWS Docs - Savings Plans

  • Besides EC2, also applicable to Fargate and Lambda

  • Aims to simplify savings planning for compute usage (compared to Reserved Instances)

  • Types

    • Compute Savings Plans

      • Most flexible

        • EC2
        • ECS Fargate
        • Lambda
      • Up to 66% off of On-Demand rates

    • EC2 Instance Savings Plans

      • Provide the lowest prices, offering savings up to 72% in exchange for commitment to usage of individual instance families in a Region (e.g. M5 usage in N. Virginia)

      • Up to 72% off of On-Demand rates

    • SageMaker Savings Plans

      • Up to 64% off of On-Demand rates
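
As a worked example of the advertised discounts (the $0.10/hr On-Demand rate is hypothetical):

```shell
# Effective hourly rates at the maximum advertised discounts
on_demand=0.10
compute=$(awk -v r="$on_demand" 'BEGIN { printf "%.4f", r * (1 - 0.66) }')  # Compute Savings Plans
ec2=$(awk -v r="$on_demand" 'BEGIN { printf "%.4f", r * (1 - 0.72) }')      # EC2 Instance Savings Plans
echo "Compute SP: \$${compute}/hr, EC2 Instance SP: \$${ec2}/hr"
```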

VPC

AWS Docs - VPC

  • A VPC spans all AZs in the Region.

  • CLI

    • aws ec2 create-default-vpc

      create a default VPC

    • aws ec2 create-default-subnet --availability-zone <AZ>

      create a default subnet

  • Recipes

    • Calculate subnet CIDR block based on VPC CIDR block

      Use ipcalc

  • References

VPC - Subnet

  • A subnet always belongs to one VPC once created.

  • A subnet is associated with only one AZ.

  • Subnet CIDR block must be a subset of the VPC CIDR block.

  • 172.16.0.0/21 means the first 21 bits identify the network (subnet) and the remaining bits identify hosts. Here, 21 bits are used for network identification, while 32 - 21 = 11 bits are used for host identification. When assigning IP addresses, the first 21 bits stay fixed, while the host bits increment until all addresses are allocated.

  • A public subnet is a subnet associated with a route table that has a route to an internet gateway.

  • You can make a default subnet into a private subnet by removing the route from the destination 0.0.0.0/0 to the internet gateway.

  • Resources
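
The /21 host-bit arithmetic above can be checked in any POSIX shell:

```shell
prefix=21
host_bits=$((32 - prefix))     # bits left for host addressing
addresses=$((1 << host_bits))  # total addresses in the block (2^11)
echo "/$prefix -> $host_bits host bits, $addresses addresses"
# -> /21 -> 11 host bits, 2048 addresses
```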

VPC - Route Table

  • A route table always belongs to one VPC once created.
  • A subnet can only be associated with one route table at a time, but you can associate multiple subnets with the same route table.
  • Each subnet in your VPC must be associated with a route table, which controls the routing for the subnet (subnet route table).
  • If not explicitly specified, the subnet is implicitly associated with the main route table.
  • Your VPC has an implicit router, and you use route tables to control where network traffic is directed.
  • If your route table has multiple routes, we use the most specific route (longest prefix match) that matches the traffic to determine how to route the traffic.

VPC - Static IP Address

  • When you stop an EC2 instance, its public IP address is released. When you start it again, a new public IP address is assigned.

VPC - Elastic IP Address

  • If you require a public IP address to be persistently associated with the instance, allocate an Elastic IP address (essentially a reserved public IP address).
  • An Elastic IP address is free of charge while associated with a running EC2 instance; a charge applies while it is allocated but not in use.

VPC - Network ACL

  • One Network ACL always belongs to one VPC once created.

  • Operates at the subnet level; one Network ACL can be associated with multiple subnets within the same VPC. Rules act as stateless filters.

  • Supports both allow (white list) and deny (black list) rules

  • Return traffic must be explicitly allowed by rules

  • Rules evaluation order

    • By rule number in ascending order
    • First match wins, like an if/else chain

VPC - Security Group

  • Operates at the instance level, so it is only in effect when associated with instance(s); connection tracking makes it stateful.
  • By default, a security group includes an outbound rule that allows all outbound traffic.
  • White list only, you can specify allow rules, but not deny rules.
  • Return traffic is automatically allowed, regardless of Inbound or Outbound
  • Inbound rules only specify source IP, while Outbound rules only specify destination IP.
  • All rules are evaluated before a decision is made.
  • By default, at most 5 Security Groups can be associated with an instance's network interface, and the union of rules from all associated Security Groups applies to the instance.
  • When you specify a security group as the source for an inbound or outbound rule, traffic is allowed from the network interfaces that are associated with the source security group for the specified protocol and port. Incoming traffic is allowed based on the private IP addresses of the network interfaces that are associated with the source security group (and not the public IP or Elastic IP addresses). Adding a security group as a source does not add rules from the source security group.
  • Default security group cannot be deleted.

VPC - Security Group - CLI Cheatsheet

VPC - Security Group - Get all Security Group rules permitting inbound traffic on the given TCP port

aws_ec2_describe_security_groups_rules_ingress () {
  local protocol=$1
  local port=$2
  # JMESPath filter: ingress rules whose protocol matches (or is -1, meaning all protocols)
  # and whose port range covers the given port (or is -1..-1, meaning all ports)
  local filters='!IsEgress && (IpProtocol == `'${protocol}'` || IpProtocol == `-1`) && (FromPort <= `'${port}'` && ToPort >= `'${port}'` || FromPort == `-1` && ToPort == `-1`)'
  aws ec2 describe-security-group-rules \
    --query "sort_by(SecurityGroupRules, &GroupId)[? $filters].{GroupID: GroupId, From: FromPort, To: ToPort, CIDR: CidrIpv4, RuleID: SecurityGroupRuleId}" \
    --output table
}

aws_ec2_describe_security_groups_rules_ingress tcp 22
VPC - Security Group - Create a Security Group in the given VPC

aws ec2 create-security-group \
  --group-name $group_name \
  --description $description \
  --vpc-id $vpc_id

VPC - Security Group - Add an inbound rule to the given Security Group

aws ec2 authorize-security-group-ingress \
  --group-id $group_id \
  --protocol $protocol \
  --port $port \
  --cidr $cidr

# e.g. allow SSH from a single host (a /32 CIDR matches exactly one IP)
# aws ec2 authorize-security-group-ingress \
#   --group-id sg-1234567890abcdef0 \
#   --protocol tcp \
#   --port 22 \
#   --cidr 10.64.1.121/32

VPC - ENI (Elastic network interface)

AWS Docs - ENI (Elastic network interface)

  • Once created, an ENI is specific to a subnet, but an Elastic IP can be disassociated from an ENI and become available for reuse.
  • ENI can be detached from an EC2 instance, and attached to another instance.
  • The primary ENI cannot be detached from an EC2 instance.

VPC Connection Options

VPC - Internet Gateway

AWS Docs - Internet Gateway

  • Only one Internet Gateway can be attached to one VPC at a time.
  • Instances must have public IPs.
  • Attaching an Internet Gateway to a VPC allows instances with public IPs to access the internet.

VPC - Egress-only Internet Gateway

AWS Docs - Egress-only Internet Gateway

  • IPv6

    • An egress-only internet gateway is for use with IPv6 traffic only.
    • IPv6 addresses are globally unique, and are therefore public by default.
  • IPv4

    • To enable outbound-only internet communication over IPv4, use a NAT gateway instead.

VPC - NAT Gateway

NAT Gateway

  • Fully managed, highly available NAT service (in contrast to a self-managed NAT instance)
  • NAT Gateway allows instances in a private subnet to access the internet.
  • NAT Gateway must have an EIP.
  • NAT Gateway traffic must be routed to Internet Gateway in the route table.
  • It only works one way. The internet cannot get through your NAT to your private resources unless you explicitly allow it.
  • EIP cannot be detached.
  • Bandwidth up to 45 Gbps
  • Cannot be associated with a Security Group
  • Cannot function as a Bastion host

VPC - NAT Instance

  • Self managed, but with more flexibility and customization
  • An EC2 instance configured to perform NAT
  • EIP can be detached.
  • Can be associated with a Security Group
  • Can function as a Bastion host

VPC - VPC endpoint

  • A VPC endpoint enables you to privately connect your VPC to supported AWS services and VPC endpoint services powered by AWS PrivateLink without requiring an internet gateway, NAT device, VPN connection, or AWS Direct Connect connection.

  • VPC endpoint types

    • Interface endpoint
    • Gateway Load Balancer endpoint
    • Gateway endpoint
  • Key points

    • Pros

      • Secure and private connection
      • No internet needed
    • Cons

      • Not all services are supported
      • Not all Regions are supported
      • Cross region not supported
VPC - Interface endpoint

  • An interface endpoint is an ENI with a private IP address from the IP address range of your subnet that serves as an entry point for traffic destined to a supported service.
  • Interface endpoints are powered by AWS PrivateLink, which bills you for each hour that your VPC endpoint remains provisioned in each AZ, irrespective of the state of its association with the service.

VPC - Gateway endpoint

  • A gateway endpoint is a gateway that you specify as a target for a route in your route table for traffic destined to a supported AWS service.
  • Doesn't use PrivateLink, therefore no hourly charge.
  • Only works within the same Region
  • Only S3 and DynamoDB are supported
  • Gateway endpoints do not allow access from on-premises networks, from peered VPCs in other Regions, or through a transit gateway.

VPC peering

AWS Docs - VPC peering

  • A VPC peering connection is a networking connection between two VPCs that enables you to route traffic between them using private IPv4 addresses or IPv6 addresses. Instances in either VPC can communicate with each other as if they are within the same network. You can create a VPC peering connection between your own VPCs, or with a VPC in another AWS account. The VPCs can be in different Regions (also known as an inter-Region VPC peering connection).

EC2

AWS Docs - EC2

EC2 - Cheatsheet

Use metadata service to get instance metadata within the instance

Get instances by keyword in name

  • aws ec2 describe-instances --filters "Name=tag:Name,Values=*<keyword>*"

    Server-side filter with AWS CLI v2

  • aws ec2 describe-instances | jq '.Reservations[].Instances[] | select(any(.Tags[]?; .Key == "Name" and (.Value | contains("<keyword>"))))'

    Client-side filter with jq

Get instances by state

  • aws ec2 describe-instances --filters "Name=instance-state-name,Values=running,stopped"

Get instance types and specification

  • aws ec2 describe-instance-types

Get public key of SSH key pair

  • aws ec2 describe-key-pairs --key-names <key-pair-name> --include-public-key

Create/Update a tag of an instance

  • aws ec2 create-tags --resources <instance-id> --tags 'Key=<key>,Value=<value>'

List all tags of an instance

  • aws ec2 describe-tags --filters "Name=resource-id,Values=<instance-id>"

ELB (Elastic Load Balancing)

  • To distribute traffic between instances (often in an Auto Scaling group)

  • ELB can be enabled within a single AZ or across multiple AZs to maintain consistent application performance.

  • Sticky Session

    AKA session affinity: the load balancer binds a user's session to a specific instance. This ensures that all requests from the user during the session are sent to the same instance, so the user doesn't need to keep re-authenticating.

  • Load balancers

    • Application Load Balancer

      • Operate at OSI Layer 7

      • Supports WebSocket and HTTP/2

      • Register targets in target groups and route traffic to target groups.

      • Cross-zone load balancing is always enabled.

      • Access logs capture detailed information about requests sent to the ALB.

      • ALB exposes a static DNS for access.

      • Listeners

        • A listener is a process that checks for connection requests, using the protocol and port that you configure. The rules that you define for a listener determine how the load balancer routes requests to its registered targets.

        • Listener rule condition types

          • host-header
          • http-header
          • http-request-method
          • path-pattern
          • query-string
          • source-ip
        • Authenticate users

          • You can configure an ALB to securely authenticate users as they access your applications. This enables you to offload the work of authenticating users to your load balancer so that your applications can focus on their business logic.
    • Network Load Balancer

      • Operate at OSI Layer 4
      • Exposes a static IP per AZ (optionally an Elastic IP) for access.
      • Cross-zone load balancing is by default disabled.
      • Target type
        • EC2 Instances
        • IP addresses
    • Classic Load Balancer

      • Only for EC2 Instances
      • CLB exposes a static DNS for access.
      • A CLB with HTTP or HTTPS listeners might route more traffic to higher-capacity instance types.
  • Target group

    • Target type

      • One to many EC2 Instances

        • Supports load balancing to EC2 instances within a specific VPC.
        • Facilitates the use of EC2 Auto Scaling to manage and scale your EC2 capacity.
      • One to many IP addresses

        • Supports load balancing to VPC and on-premises resources.
        • Facilitates routing to multiple IP addresses and network interfaces on the same instance.
        • Offers flexibility with microservice based architectures, simplifying inter-application communication.
        • Supports IPv6 targets, enabling end-to-end IPv6 communication, and IPv4-to-IPv6 NAT.
      • Single Lambda function

        • Facilitates routing to a single Lambda function.
        • Accessible to ALB only.
      • Application Load Balancer

        • Offers the flexibility for a NLB to accept and route TCP requests within a specific VPC.
        • Facilitates using static IP addresses and PrivateLink with an ALB.
    • Protocol

      • HTTP/1.1
      • HTTP/2
      • gRPC
  • Health Check
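
Sticky sessions (see above) are configured as target group attributes; a sketch with a placeholder target group ARN:

```shell
# Enable duration-based (lb_cookie) stickiness on a target group
aws elbv2 modify-target-group-attributes \
  --target-group-arn arn:aws:elasticloadbalancing:us-east-1:123456789012:targetgroup/my-targets/0123456789abcdef \
  --attributes Key=stickiness.enabled,Value=true \
               Key=stickiness.type,Value=lb_cookie \
               Key=stickiness.lb_cookie.duration_seconds,Value=86400
```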

ELB - Cheatsheet

Describe all load balancers

aws elbv2 describe-load-balancers \
  --query 'sort_by(LoadBalancers,&LoadBalancerName)[].{LoadBalancer:LoadBalancerName,Type:Type,DNS:DNSName}' \
  --output table

Describe all listeners and their target group of the given load balancer

aws elbv2 describe-listeners \
  --load-balancer-arn <load-balancer-arn> \
  --query 'sort_by(Listeners,&ListenerArn)[].{Protocol:Protocol,Port:Port,TargetGroup:DefaultActions[0].TargetGroupArn}' \
  --output table

Describe the given target groups

aws elbv2 describe-target-groups \
  --names <target-group-name> \
  --query 'sort_by(TargetGroups,&TargetGroupName)[].{TargetGroup:TargetGroupName,Protocol:Protocol,Port:Port,VPC:VpcId}' \
  --output table

Associate a Security Group with the given Load Balancer

aws elbv2 set-security-groups \
  --load-balancer-arn $load_balancer_arn \
  --security-groups $security_group_id

Show health state of all target groups

#!/bin/bash
 
# Get a list of all target groups
target_group_arns=($(aws elbv2 describe-target-groups --query "TargetGroups[].TargetGroupArn" --output text))
 
# Loop through the target groups and check if there are running instances
for arn in "${target_group_arns[@]}"; do
    echo "Checking target group: $arn"
    aws elbv2 describe-target-health \
      --target-group-arn "$arn" \
      --query 'TargetHealthDescriptions[].{"Target ID":Target.Id, Port:Target.Port, State:TargetHealth.State} | sort_by(@, &State)' \
      --output table
done

EC2 - Auto Scaling

AWS Docs - Auto Scaling

  • Auto Scaling group can span across multiple AZs within a Region, but not across multiple Regions.
  • Auto Scaling works with all 3 load balancers.
  • CloudWatch Alarms can be used to trigger Auto Scaling actions.

EC2 - Launch Template

AWS Docs - Launch Template

  • Improvements over Launch Configuration

    • Supports versioning, while Launch Configuration is immutable

    • Supports multiple instance types and purchase options

    • More EC2 options

      • Systems Manager parameters (AMI ID)
      • The current generation of EBS Provisioned IOPS volumes (io2)
      • EBS volume tagging
      • T2 Unlimited instances
      • Elastic Inference
      • Dedicated Hosts

EC2 - ASG Capacity limits

ASG Capacity limits

  • After you have created your Auto Scaling group, the Auto Scaling group starts by launching enough EC2 instances to meet its minimum capacity (or its desired capacity, if specified).
  • The minimum and maximum capacity are required to create an Auto Scaling group.
  • Desired capacity (either by manual scaling or automatic scaling) must fall between the minimum and maximum capacity.
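
The capacity limits map directly to the CLI; the group name below is a placeholder:

```shell
# Desired capacity must satisfy: min-size <= desired-capacity <= max-size
aws autoscaling update-auto-scaling-group \
  --auto-scaling-group-name my-asg \
  --min-size 2 \
  --desired-capacity 4 \
  --max-size 10
```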

EC2 - Scaling policy

AWS Docs - Scaling policy

  • A scaling policy instructs Auto Scaling to track a specific CloudWatch metric, and it defines what action to take when the associated CloudWatch alarm is in ALARM. The metrics that are used to trigger an alarm are an aggregation of metrics coming from all of the instances in the Auto Scaling group.

  • Target tracking scaling

    • The scaling policy adds or removes capacity as required to keep the metric at, or close to, the specified target value.
    • Triggered by an automatically created and managed CloudWatch Alarm by EC2 Auto Scaling, which users shouldn't modify.
    • You don't need to specify scaling action.
    • eg: Configure a target tracking scaling policy to keep the average aggregate CPU utilization of your Auto Scaling group at 40 percent.
  • Step scaling

    • Triggered by a specified existing CloudWatch Alarm
    • Scaling action (add, remove, set) is based on multiple step adjustments
  • Simple scaling

    • Triggered by a specified existing CloudWatch Alarm
    • Scaling action (add, remove, set) is based on a single scaling adjustment
  • Scaling cooldown

    A scaling cooldown helps you prevent your Auto Scaling group from launching or terminating additional instances before the effects of previous activities are visible.
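
The 40 percent CPU target tracking example above might be configured like this (group and policy names are placeholders):

```shell
aws autoscaling put-scaling-policy \
  --auto-scaling-group-name my-asg \
  --policy-name cpu40-target-tracking \
  --policy-type TargetTrackingScaling \
  --target-tracking-configuration '{
    "PredefinedMetricSpecification": { "PredefinedMetricType": "ASGAverageCPUUtilization" },
    "TargetValue": 40.0
  }'
```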

EC2 - Scheduled Actions

Scheduled actions

  • Set up your own scaling schedule according to predictable load changes
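
A scheduled action for a predictable weekday peak might look like this (names and schedule are illustrative):

```shell
# Scale out every weekday at 09:00 UTC
aws autoscaling put-scheduled-update-group-action \
  --auto-scaling-group-name my-asg \
  --scheduled-action-name weekday-scale-out \
  --recurrence "0 9 * * 1-5" \
  --min-size 4 \
  --max-size 12 \
  --desired-capacity 8
```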

EC2 - Termination Policy

Termination Policy

  • Default termination policy

    1. Determine whether any of the instances eligible for termination use the oldest launch template or launch configuration.
    2. After applying the preceding criteria, if there are multiple unprotected instances to terminate, determine which instances are closest to the next billing hour.

EC2 Monitoring

  • Instances
    • By default, basic monitoring is enabled when you create a launch template or when you use the AWS Management Console to create a launch configuration.
    • By default, detailed monitoring is enabled when you create a launch configuration using the AWS CLI or an SDK.
  • Health check
    • Auto Scaling can determine the health status of an instance using one or more of the following:
      • EC2 Status Checks
      • ELB Health Checks
      • Custom Health Checks
    • The default health checks for an Auto Scaling group are EC2 status checks only.

EBS (Elastic Block Store)

  • Can only be attached to an instance in the same AZ

  • Snapshot backup and restore can be used to share data with instances in another AZ.

  • Usually one volume can only be attached to one instance at a time (Multi-Attach is not common)

  • Block-level storage can only be used when attached to an EC2 instance with a running OS

  • After you attach an EBS volume to your instance, it is exposed as a block device. You must create a file system if there isn't one and then mount it before you can use it.

    • New volumes are raw block devices without a file system.
    • Volumes that were created from snapshots likely have a file system on them already.
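
The create-and-mount steps above, assuming the volume is attached as /dev/xvdf (the device name varies by instance type):

```shell
# "data" in the output means the device is raw (no file system yet)
sudo file -s /dev/xvdf

# Create an XFS file system only if the device is raw
sudo mkfs -t xfs /dev/xvdf

# Mount it
sudo mkdir -p /data
sudo mount /dev/xvdf /data
```
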
  • Amazon Data Lifecycle Manager

    • Automate the creation, retention, and deletion of EBS snapshots and EBS-backed AMIs
  • Snapshot

    • Incremental, tracking changes only

    • A restored volume becomes available as soon as the restore operation begins, even though the actual data has not yet been fully copied to the disk

    • Backups occur asynchronously; the point-in-time snapshot is created immediately, but the status of the snapshot is pending until the snapshot is complete

    • Stored in S3

    • Be aware of the performance penalty when initializing volumes from snapshots

    • Fast Snapshot Restore

      enables you to create a volume from a snapshot that is fully initialized at creation. This eliminates the latency of I/O operations on a block when it is accessed for the first time.

  • Volume types

  • Performance Characteristics

    • Throughput = Size per IO Operation * IOPS
    • Size per IO Operation
      • the amount of data written/read in a single IO request.
      • data / request
      • EBS merges smaller, sequential I/O operations that are 32 KiB or over to form a single I/O of 256 KiB before processing.
      • EBS splits I/O operations larger than the maximum 256 KiB into smaller operations.
    • IOPS
      • the number of IO requests on a single block that can be completed by the storage device in a second
      • requests / second
    • Throughput
      • the amount of data transferred from/to a storage device in a second. Typically stated in KB/MB/GB/s
      • data / second
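
A quick sanity check of the throughput formula above (the I/O size and IOPS values are illustrative):

```shell
io_size_kib=16  # size per I/O operation
iops=3000       # I/O operations per second
throughput_kib_s=$((io_size_kib * iops))
echo "Throughput: ${throughput_kib_s} KiB/s (~$((throughput_kib_s / 1024)) MiB/s)"
# -> Throughput: 48000 KiB/s (~46 MiB/s)
```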
  • Network bandwidth limits

    • EC2 instances access EBS volumes over network connections.
    • EBS volumes can be accessed using dedicated networks (available on EBS-optimized instances) and shared networks (non EBS-optimized instances).
  • Encryption

    • You encrypt EBS volumes by enabling encryption, either using encryption by default or by enabling encryption when you create a volume that you want to encrypt.
    • EBS encryption uses KMS CMK when creating encrypted volumes and snapshots.
    • Encryption operations occur on the servers that host EC2 instances, ensuring the security of both data-at-rest and data-in-transit between an instance and its attached EBS storage.
    • Encryption by default is a Region-specific setting. If you enable it for a Region, you cannot disable it for individual volumes or snapshots in that Region.
    • Volumes
      • Can only be encrypted upon creation
      • Encrypted volumes cannot be unencrypted.
    • Snapshots
      • Snapshots created from an encrypted volume are always encrypted.
      • Encrypted snapshots cannot be unencrypted.
      • Unencrypted snapshots can only be encrypted when being copied.
    • Encrypted data include:
      • Data at rest inside the volume
      • Data in transit between the volume and the instance
      • All snapshots created from the volume
      • All volumes created from those snapshots

EFS (Elastic File System)

  • Region-specific
  • Traditional filesystem hierarchy
  • The main difference between EBS and EFS is that an EBS volume is accessible from a single EC2 instance in a particular AZ, while EFS can be mounted by multiple instances across AZs.
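
From an instance, mounting EFS is typically a plain NFSv4.1 mount (the file system ID and Region are placeholders; the amazon-efs-utils mount helper is an alternative):

```shell
sudo mkdir -p /mnt/efs
sudo mount -t nfs4 -o nfsvers=4.1 \
  fs-12345678.efs.us-east-1.amazonaws.com:/ /mnt/efs
```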

Elastic Beanstalk

  • PaaS based on EC2, using CloudFormation under the hood.

  • Application

    • Application version lifecycle settings
      • If you don't delete versions that you no longer use, you will eventually reach the application version quota and be unable to create new versions of that application.
      • You can avoid hitting the quota by applying an application version lifecycle policy to your applications.
    • Removing an application also triggers removal of all associated resources such as environments, EC2 instances, etc.
  • Environment

    • You can run either a web server environment or a worker environment.
    • Use Validate VPC Settings button in Environment tab to troubleshoot network.
    • If you associate an existing RDS instance to an existing EB environment, the RDS instance must be launched from a snapshot.
    • Environment type can be Load Balanced or Single Instance.
    • When you terminate an environment, you can save its configuration to recreate it later.
    • HTTPS
      • The simplest way to use HTTPS with an Elastic Beanstalk environment is to assign a server certificate to your environment's load balancer.
  • Configuration (all under project root)

    • .ebextensions directory

      • Configuration files are YAML or JSON-formatted documents with a .config file extension.

      • Options can be specified as below, and are overridden according to precedence rules

        option_settings:
          - namespace: namespace
            option_name: option name
            value: option value
          - namespace: namespace
            option_name: option name
            value: option value
    • .elasticbeanstalk directory

      • Saved configuration

        • Saved configurations are YAML formatted templates that define an environment's platform version, tier, configuration option settings, and tags.
        • Saved configurations are located under .elasticbeanstalk > saved_configs in project directory.
    • Config files in the project directory

      • env.yaml

        You can include a YAML formatted environment manifest in the root of your application source bundle to configure the environment name, solution stack and environment links to use when creating your environment.

      • cron.yaml (Worker environment)

        You can define periodic tasks in a file named cron.yaml in your source bundle to add jobs to your worker environment's queue automatically at a regular interval.

    • Elastic Beanstalk supports CloudFormation functions (Ref, Fn::GetAtt, Fn::Join), and one Elastic Beanstalk-specific function, Fn::GetOptionSetting.

  • Platform

    • Docker

      • Single-container
      • Multi-container
    • Custom platform

      • A custom platform lets you develop an entire new platform from scratch, customizing the operating system, additional software, and scripts that Elastic Beanstalk runs on platform instances.
      • To create a custom platform, you build an AMI from one of the supported operating systems and add further customizations.
  • EB CLI

    • Installation

      • Install python3
      • Install pip3
      • Install awsebcli
    • Useful commands

      • eb status

        Gets environment information and status

      • eb printenv

        Shows the environment variables

      • eb list

        Lists all environments

      • eb setenv <env-variable-value-pairs>

        Sets environment variables

        eg: eb setenv HeapSize=256m Site_Url=mysite.elasticbeanstalk.com

      • eb ssh

        Opens the SSH client to connect to an instance

  • AWS CLI

  • Deployment Strategies

    • Update existing instances

      • All-at-once

        Deploy the new version to all instances simultaneously.

      • Rolling

        Updates are applied in a batch to running instances. The batch will be out of service while being updated. Once the batch is completed, the next batch will be started.

      • Rolling with an additional batch

        The same as Rolling, except launching an additional batch of instances of the old version to rollback in case of failure. This option can maintain full capacity. When the deployment completes, Elastic Beanstalk terminates the additional batch of instances.

    • Deploying to new instances

      • Immutable

        Instances of the new version are deployed as instances of the old version are terminated. There's no update to existing instances.

      • Traffic-splitting

        Elastic Beanstalk launches a full set of new instances just like during an immutable deployment. It then forwards a specified percentage of incoming client traffic to the new application version for a specified evaluation period. If the new instances stay healthy, Elastic Beanstalk forwards all traffic to them and terminates the old ones.

    • Blue/Green deployment (opens in a new tab)

      A new environment will be created for the new version (Green) independent of the current version (Blue). When the Green environment is ready, you can swap the CNAMEs of the environments to redirect traffic to the newer running environment.

      • Blue/green deployments require that your environment runs independently of your production database, if your application uses one.
    • Summary

      | Method | Impact of failed deployment | Deploy time | Zero downtime | No DNS change | Rollback process | Code deployed to |
      | --- | --- | --- | --- | --- | --- | --- |
      | All-at-once | Downtime | ⌚ | ✗ | ✓ | Redeploy | Existing instances |
      | Rolling | Single batch out of service; any successful batches before failure running new application version | ⌚⌚ | ✓ | ✓ | Redeploy | Existing instances |
      | Rolling with additional batch | Minimal if first batch fails; otherwise, similar to Rolling | ⌚⌚⌚ | ✓ | ✓ | Redeploy | Existing instances |
      | Blue/Green | Minimal | ⌚⌚⌚⌚ | ✓ | ✗ | Swap URL | New instances |
      | Immutable | Minimal | ⌚⌚⌚⌚ | ✓ | ✓ | Redeploy | New instances |
  • Java

    • The default port is 5000; to change it, set the PORT environment variable.
    • When uploading from the Management Console, the application must be an executable JAR file containing all the compiled bytecode, packaged in a ZIP archive.

CodeCommit

  • Region specific

  • No public access

  • Authentication

    • SSH

      Dedicated SSH key pair of current user for CodeCommit only

    • HTTPS

      Dedicated HTTPS Git credentials of current user for CodeCommit only

    • MFA

  • Authorization

    • IAM

      You must have a CodeCommit managed policy attached to your IAM user, belong to a CodeStar project team, or have the equivalent permissions.

  • Cross-account access to a repository in a different account

    • Create a policy for access to the repository
    • Attach this policy to a role in the same account
    • Allow other users to assume this role
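The repository-access policy from the first step above might look like the following sketch; the Region, account ID, and repository name are hypothetical placeholders.

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["codecommit:GitPull", "codecommit:GitPush"],
      "Resource": "arn:aws:codecommit:us-east-1:111111111111:MySharedRepo"
    }
  ]
}
```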
  • Notifications

    • Events that trigger notifications (opens in a new tab) (CloudWatch Events)
      • Comments
        • On commits
        • On pull requests
      • Approvals
        • Status changed
        • Rule override
      • Pull request
        • Source updated
        • Created
        • Status changed
        • Merged
      • Branches and tags
        • Created
        • Deleted
        • Updated
    • Targets
      • SNS topic
      • AWS Chatbot (Slack)
  • Triggers

    • Triggers do not use CloudWatch Events rules to evaluate repository events. They are more limited in scope.
    • Use case
      • Send emails to subscribed users every time someone pushes to the repository.
      • Notify an external build system to start a build after someone pushes to the main branch of the repository.
    • Events
      • Push to existing branch
      • Create branch or tag
      • Delete branch or tag
    • Target
      • SNS
      • Lambda

CodeBuild

  • When setting up CodeBuild projects to access VPC, choose private subnets only.

  • CodeBuild needs access to S3 for the code source, so there are two approaches:

    1. NAT Gateway (additional charge)
    2. S3 Gateway Endpoint
  • Caching Dependencies (opens in a new tab)

    • S3

      stores the cache in an S3 bucket that is available across multiple build hosts

    • Local

      stores a cache locally on a build host that is available to that build host only

      • Docker layer cache

        Caches existing Docker layers so they can be reused. Requires privileged mode.

      • Source cache

        Caches .git metadata so subsequent builds only pull the change in commits.

      • Custom cache

        Caches directories specified in the buildspec file.
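A custom cache is declared in the buildspec's cache section. This minimal sketch assumes a Maven project whose local repository lives under /root/.m2 (both the build command and path are illustrative):

```yaml
version: 0.2
phases:
  build:
    commands:
      - mvn -B package
cache:
  paths:
    # Directories listed here are saved and restored between builds
    - '/root/.m2/**/*'
```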

CodeDeploy

  • Application Revision

    • A revision contains a version of the source files CodeDeploy will deploy to your instances or scripts CodeDeploy will run on your instances.
  • AppSpec (opens in a new tab)

    • Configuration: appspec.yml must be present in the root directory of the application revision archive.
    • files section (opens in a new tab)
      • The paths used in source are relative to the appspec.yml file, which should be at the root of your revision.
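A minimal appspec.yml for the EC2/On-Premises platform might look like the following sketch; the destination path and hook script name are hypothetical.

```yaml
version: 0.0
os: linux
files:
  # Copy the entire revision archive to the web root
  - source: /
    destination: /var/www/html
hooks:
  AfterInstall:
    - location: scripts/restart_server.sh  # hypothetical script in the revision
      timeout: 300
      runas: root
```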
  • Compute platforms

  • Deployment types

    • In-place

    • Blue/green

      • Only EC2 instances (not on-premises instances) support blue/green deployments.

      • All Lambda and ECS deployments are blue/green.

      • Deployment configurations

        • EC2

          • One at a time

            Routes traffic to one instance in the replacement environment at a time.

        • ECS (opens in a new tab)

          • All at once

          • Canary

            Traffic is shifted in two increments, 10% in the first increment, and the remaining 90% after 5 / 15 minutes.

          • Linear

            Traffic is shifted in equal increments (10%) with a fixed interval (1 / 3 minutes).

        • Lambda (opens in a new tab)

          • All at once

          • Canary

            Traffic is shifted in two increments, 10% in the first increment, and the remaining 90% after 5 / 10 / 15 / 30 minutes.

          • Linear

            Traffic is shifted in equal increments (10%) with a fixed interval (1 / 2 / 3 / 10 minutes).

  • Deployment Group

    • A deployment group contains individually tagged instances, EC2 instances in EC2 Auto Scaling groups, or both.
    • EC2 instances must have tags to be added into a deployment group.
  • CodeDeploy agent

  • Deployment

    • Rollback (opens in a new tab)

      • CodeDeploy rolls back deployments by redeploying a previously deployed revision of an application as a new deployment.

      • CodeDeploy first tries to remove from each participating instance all files that were last successfully installed; this starts with the instances that caused the deployment failure, and the remaining, untouched instances are handled afterwards.

      • Automatic rollback

        • The last known good version of an application revision is deployed.
      • Steps

        1. First tries to remove from each participating instance all files that were last successfully installed.

        2. If existing files are detected, the options are as follows.

          1. Fail the deployment
          2. Overwrite the content
          3. Retain the content
  • Resources

CodePipeline

  • In a default setup, a pipeline is kicked-off whenever a change in the configured pipeline source is detected. CodePipeline currently supports sourcing from CodeCommit, GitHub, ECR, and S3.

  • When using CodeCommit, ECR, or S3 as the source for a pipeline, CodePipeline uses a CloudWatch Event to detect changes in the source and immediately kick off a pipeline.

  • When using GitHub as the source for a pipeline, CodePipeline uses a webhook to detect changes in a remote branch and kick off the pipeline.

  • CodePipeline also supports beginning pipeline executions based on periodic checks, although this is not a recommended pattern.

  • To customize the logic that controls pipeline executions in the event of a source change, you can introduce a custom CloudWatch Event.

  • The pipeline stops when it reaches the manual approval action. If an SNS topic ARN was included in the configuration of the action, a notification is published to the SNS topic, and a message is delivered to any subscribers to the topic or subscribed endpoints, with a link to review the approval action in the console.

  • Resources

ECR

ECR - Cheatsheet

Docker login to ECR

aws ecr get-login-password --region <region> | \
  docker login \
    --username AWS \
    --password-stdin <account-id>.dkr.ecr.<region>.amazonaws.com

Describe repositories

aws ecr describe-repositories \
  --query 'sort_by(repositories,&repositoryName)[].{Repo:repositoryName,URI:repositoryUri}' \
  --output table

Describe images

repoName=<repo-name>
aws ecr describe-images --repository-name $repoName \
  --query 'reverse(sort_by(imageDetails,&imagePushedAt))[].{Repo:repositoryName,Tag:imageTags[] | [0],Digest:imageDigest,PushedAt:imagePushedAt}' \
  --output table

Find images with the given digest

repoName=<repo-name>
sha256Hash=<sha256-hash>
aws ecr describe-images --repository-name $repoName \
  --query "imageDetails[?imageDigest=='sha256:${sha256Hash}'].{Repo:repositoryName,Tag:imageTags[] | [0],Digest:imageDigest,PushedAt:imagePushedAt}" \
  --output table

Find images with the given tag

repoName=<repo-name>
tagKeyword=<tagKeyword>
aws ecr describe-images --repository-name $repoName \
  --query "imageDetails[?contains(imageTags, '${tagKeyword}')].{Repo:repositoryName,Tag:imageTags[] | [0],Digest:imageDigest,PushedAt:imagePushedAt}" \
  --output table

ECS

  • Container Instance

    • If you terminate a container instance in the RUNNING state, that container instance is automatically removed, or deregistered, from the cluster. However, if you terminate a container instance in the STOPPED state, that container instance isn't automatically removed from the cluster.
  • ECS Container Agent (opens in a new tab)

    • ECS_ENABLE_TASK_IAM_ROLE

      Whether IAM roles for tasks should be enabled on the container instance for task containers with the bridge or default network modes.

  • EC2 Launch Type

    • An ECS cluster is a logical group of EC2 instances, also called container instances.

    • Each container instance has an ECS container agent (a Docker container) installed.

    • Container instances typically run an ECS-optimized AMI (e.g. Amazon Linux)

    • ECS container agent registers the container instance to the cluster.

    • ECS container agent configuration

      • /etc/ecs/ecs.config
    • Load balancing

      • ALB and NLB support dynamic host port mapping (opens in a new tab), allowing you to have multiple tasks from a single service on the same container instance.

      • To enable dynamic host port mapping, host port must be set to 0 or empty in task definition.

      • CLB does not allow you to run multiple copies of a task on the same instance because the ports conflict.
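In the task definition, dynamic host port mapping comes down to a container-definition fragment like this sketch (the container port is arbitrary):

```json
"portMappings": [
  { "containerPort": 80, "hostPort": 0, "protocol": "tcp" }
]
```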

    • Task definition (opens in a new tab)

      • A task is similar to a pod in Kubernetes.
      • Container definitions (opens in a new tab)
        • Define one or multiple containers
        • Standard parameters: Name, Image, Memory, Port Mappings
      • Every container in a task definition must land on the same container instance.
      • Need to specify resources needed
      • Need to specify configuration specific to the task
      • Need to specify the IAM role that your task should use
    • Task placement (opens in a new tab)

      • Strategy (opens in a new tab)

        • binpack

          Tasks are placed on container instances so as to leave the least amount of unused CPU or memory to minimize the number of container instances in use.

        • random

          Random places tasks on instances at random. This still honors the other constraints that you specified, implicitly or explicitly. Specifically, it still makes sure that tasks are scheduled on instances with enough resources to run them.

        • spread

          Tasks are placed evenly based on the specified value.

      • Constraint (opens in a new tab)

        • distinctInstance

          Place each task on a different container instance.

        • memberOf

          Place tasks on container instances that satisfy a cluster query expression.

      • Cluster query language (opens in a new tab)

        • Cluster queries are expressions for targeting container instances, which can be used in task placement memberOf constraint.
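A memberOf constraint in a task definition could be sketched as follows, using an attribute-based cluster query expression from the AWS docs:

```json
"placementConstraints": [
  {
    "type": "memberOf",
    "expression": "attribute:ecs.instance-type =~ t2.*"
  }
]
```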
  • Fargate Launch Type

    • Fully managed
    • Serverless
  • IAM

  • Resources

EKS

Lambda

Lambda - Invocation Models

Lambda - Invocation Models - synchronous

Synchronous invocation (default) (opens in a new tab)

  • RPC style

  • Invocation Type: RequestResponse

  • Services

    • ELB (Application Load Balancer)
    • Cognito
    • Lex
    • Alexa
    • API Gateway
    • CloudFront (Lambda@Edge)
    • Kinesis Data Firehose
  • Details about the function response, including errors, are included in the response body and headers.

Lambda - Invocation Models - asynchronous

Asynchronous invocation (opens in a new tab)

  • Invocation Type: Event

  • Services

    • S3
    • SNS
    • SES
    • CloudFormation
    • CloudWatch Logs
    • CloudWatch Events
    • CodeCommit
    • AWS Config
  • Lambda adds events to a queue before sending them to your function. If your function does not have enough capacity to keep up with the queue, events may be lost.

  • Suitable for services producing events at a lower rate than the function can process; there is usually no message retention, so messages would be lost if the function is overwhelmed.

  • For higher throughput, consider using SQS or Kinesis and Lambda event source mapping.

  • DLQ (opens in a new tab)

    • Either an SNS topic or an SQS queue, serving as the destination for all failed invocation events.
    • An alternative to an on-failure destination, but part of a function's version-specific configuration, so it is locked in when you publish a version.
  • Destinations for asynchronous invocation (opens in a new tab)

    • Types

      • SQS – A standard SQS queue
      • SNS – An SNS topic
      • Lambda – A Lambda function
      • EventBridge – An EventBridge event bus
    • You can configure the destination's condition to be on success or on failure.
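As a sketch, the destination configuration passed to PutFunctionEventInvokeConfig might look like this (the ARNs are hypothetical):

```json
{
  "DestinationConfig": {
    "OnSuccess": { "Destination": "arn:aws:sqs:us-east-1:111111111111:success-queue" },
    "OnFailure": { "Destination": "arn:aws:sns:us-east-1:111111111111:alerts-topic" }
  }
}
```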

Lambda - event source mapping

Event source mapping (poll-based) (opens in a new tab)

  • A Lambda integration setup for poll-based event sources (with data in potentially large volume) such as queues and streams.

  • Lambda pulls records from the data stream of event sources and invokes your function synchronously with an event that contains stream records. Lambda reads records in batches and invokes your function to process records from the batch.

  • Process items from a stream or queue in services that don't invoke Lambda functions directly

  • Event source mappings that read from a stream are limited by the number of shards in the stream.

  • Services

    • SQS
    • DynamoDB Streams
    • Kinesis
    • MQ
    • MSK (Managed Streaming for Apache Kafka)
    • Self-managed Apache Kafka
  • Parallelization Factor

    • Kinesis and DynamoDB Streams only

Lambda - authorization

  • Execution permissions

    • Assigned to Lambda function
    • Enable the Lambda function to access other AWS resources in your account.
  • Invocation permissions

    • Assigned to event source
    • Enable the event source to communicate with your Lambda function.

Lambda - runtime

Custom runtime (opens in a new tab)

  • You can implement a Lambda custom runtime in any programming language.

  • A runtime is a program that runs a Lambda function's handler method when the function is invoked. You can include a runtime in your function's deployment package in the form of an executable file named bootstrap.

  • A runtime is responsible for running the function's setup code, reading the handler name from an environment variable, and reading invocation events from the Lambda runtime API. The runtime passes the event data to the function handler, and posts the response from the handler back to Lambda.

  • The runtime can be included in your function's deployment package, or in a layer.

  • Scripting-language runtimes such as Node.js and Python have better native support than Java, since tooling allows deploying source code directly.

  • Resources

Lambda - execution environment lifecycle

Execution environment lifecycle (opens in a new tab)

  • Init

    • Happens at the time of the first function invocation

    • In advance of function invocations if you have enabled provisioned concurrency.

    • 3 Tasks

      • Extension Init

      • Runtime Init

      • Function Init

        Runs the function’s initialization code (the code outside the main handler)

  • Invoke

  • Shutdown

Lambda - function deployment

Lambda - function handler

Lambda - function configuration

  • The total size of all environment variables doesn't exceed 4 KB.

  • Memory

    • From 128 MB to 3008 MB in 64-MB increments
    • You can only directly configure the memory for your function, and Lambda allocates CPU power in proportion to the amount of memory configured.
  • Timeout

    • Default is 3 seconds, and max is 15 minutes (900 seconds).
    • AWS charges based on execution time in 100-ms increments.
  • Network

    • Network configuration
      • default
      • VPC
    • A Lambda function in your VPC has no internet access.
    • Deploying a Lambda function in a public subnet doesn't give it internet access or a public IP.
    • Deploying a Lambda function in a private subnet gives it internet access if you have a NAT Gateway / Instance.
    • Use VPC endpoints to privately access AWS services without a NAT.
  • Concurrency

    • By default, the concurrent execution limit is enforced against the sum of the concurrent executions of all functions.

    • By default, the account-level concurrency within a given Region is set with 1000 concurrent execution as a maximum to provide you 1000 concurrent functions to execute. You can open a support ticket with AWS to request an increase in your account level concurrency limit.

    • Lambda requires at least 100 unreserved concurrent executions per account.

    • Concurrency = (average requests per second) * (average request duration in seconds)

    • Reserved concurrency

      Applies to the entire function, including all versions and aliases

    • Provisioned concurrency (opens in a new tab)

      • To enable a function to scale without fluctuations in latency.
      • Provisioned concurrency cannot exceed reserved concurrency.
      • Provisioned concurrency initializes the assigned capacity upfront to avoid cold starts, hence no noticeable latency.
    • Parallelization Factor (opens in a new tab)

      • For stream processing (event source mapping), one Lambda function invocation processes one shard at a time, namely Parallelization Factor is 1.
      • Parallelization Factor can be set to increase concurrent Lambda invocations for each shard, which by default is 1.
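The concurrency formula above can be checked with a quick shell sketch (the request rate and duration are hypothetical numbers):

```shell
# Concurrency = (average requests per second) * (average request duration in seconds)
rps=100          # average requests per second (hypothetical)
duration_ms=500  # average request duration, in milliseconds (hypothetical)
concurrency=$(( rps * duration_ms / 1000 ))
echo "$concurrency"   # → 50
```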
  • Version (opens in a new tab)

    • Each Lambda function version has a unique ARN. After you publish a version, it is immutable, so you cannot change it.

    • A function version includes:

      • function code and all associated dependencies
      • Lambda runtime that invokes the function
      • All of the function settings, including the environment variables
      • A unique ARN to identify the specific version of the function
  • Alias (opens in a new tab)

    • An alias is a pointer to a version, and therefore it also has a unique ARN. Assign an alias to a particular version and use that alias in the application to avoid updating all references to the old version.

    • An alias cannot point to $LATEST.

    • Weighted alias

      • An alias allows you to shift traffic between 2 versions based on specified weights (%).
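As a sketch, an UpdateAlias routing configuration that sends 10% of traffic to version 2 (and the remaining 90% to version 1) would look like this; the alias name is hypothetical:

```json
{
  "Name": "live",
  "FunctionVersion": "1",
  "RoutingConfig": {
    "AdditionalVersionWeights": { "2": 0.1 }
  }
}
```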
  • Layers (opens in a new tab)

    • A layer is a .zip file archive that contains libraries, a custom runtime, or other dependencies. With layers, you can use libraries in your function without needing to include them in your deployment package.
    • A function can use up to 5 layers at a time. The total unzipped size of the function and all layers can't exceed the unzipped deployment package size limit of 250 MB.
    • Layers are extracted to the /opt directory in the function execution environment. Each runtime looks for libraries in a different location under /opt, depending on the language.
  • Environment variables (opens in a new tab)

    • X-Ray

      • _X_AMZN_TRACE_ID

        X-Ray tracing header

      • AWS_XRAY_CONTEXT_MISSING: RUNTIME_ERROR (default), LOG_ERROR

        Lambda sets this to LOG_ERROR to avoid throwing runtime errors from the X-Ray SDK.

Lambda - monitoring

  • Metrics (opens in a new tab)

    • Invocations

      the number of requests billed

    • Duration

      the amount of time that your function code spends processing an event

Lambda - service integration

Using AWS Lambda with other services (opens in a new tab)

Step Functions

  • Workflow type is either Standard or Express (opens in a new tab), and cannot be changed once created.

  • Standard Workflow

    • Maximum execution time: 1 year
    • Priced per state transition. A state transition is counted each time a step in your execution is completed.
  • Express Workflow

    • Maximum execution time: 5 minutes

    • Priced by the number of executions you run, their duration, and memory consumption.

    • Types

      • Synchronous
      • Asynchronous
  • States (opens in a new tab)

IAM

IAM - Access Analyzer

IAM - Access Advisor

IAM - User

  • Uniquely identified identity

  • Long-term effective

  • Access

    • Programmatic (Access key ID and Secret Access key)
    • Web (Web Management Console)

IAM - Role

  • Similar to a User with attached Permissions policies

  • Not uniquely identified, but a distinct identity with its own permissions

  • Temporarily effective for a designated timeframe

  • If an IAM user assumes a Role, only the policies of the assumed Role are evaluated. The user's own policies wouldn't be evaluated.

  • Cannot be added to IAM groups

  • Trust policy specifies who can assume a Role.

  • An IAM role is both an identity and a resource that supports resource-based policies (Trust policy).

  • Service-Linked Role

  • Cross account access (opens in a new tab) can be given by allowing principals in account A to assume roles in account B.

    • When the principal and the resource are in different AWS accounts, an IAM administrator in the trusted account must also grant the principal entity (user or role) permission to access the resource.

    • Trust policy to authorize the specified account to assume the role. For roles from a different account, the Principal ARN contains its AWS account ID.

      {
        "Version": "2012-10-17",
        "Statement": [
          {
            "Effect": "Allow",
            "Principal": {
              "AWS": ["arn:aws:iam::<another-Account-ID>:role/<DesiredRoleName>"]
            },
            "Action": "sts:AssumeRole"
          }
        ]
      }
    • Example

      • Account A

        • Trust policy to authorize a Role in Account B
      • Account B

        • Identity-based policy to authorize a User in Account B to access the resource in Account A
  • Instance profile (opens in a new tab)

    • EC2 uses an instance profile as a container for an IAM role.
    • If you use the AWS Management Console to create a role for EC2, the console automatically creates an instance profile and gives it the same name as the role.
    • An instance profile is not an AWS CLI profile.

IAM - Policy

AWS Organizations (opens in a new tab)

  • Features

    • Centralized management of all of your AWS accounts
    • Consolidated billing for all member accounts
    • Hierarchical grouping of your accounts to meet your budgetary, security, or compliance needs
    • Service control policies (SCPs)
    • Tag policies
    • AI services opt-out policies
    • Backup policies
    • Free to use

Service control policies (SCP) (opens in a new tab)

  • Affect only the member accounts in an Organization
  • SCPs offer central control over the maximum available permissions for all accounts in an Organization.
  • SCPs are similar to IAM permission policies and use almost the same syntax. However, an SCP never grants permissions. Instead, SCPs are JSON policies that specify the maximum permissions for the affected accounts.
  • SCPs can be used to restrict even the root user of member accounts.
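As a sketch, an SCP that prevents any member account from leaving the Organization could look like this:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyLeavingOrganization",
      "Effect": "Deny",
      "Action": "organizations:LeaveOrganization",
      "Resource": "*"
    }
  ]
}
```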

STS (Security Token Service) (opens in a new tab)

  • GetSessionToken (opens in a new tab)

    • Returns a set of temporary credentials for an AWS account or IAM user. The credentials consist of an access key ID, a secret access key, and a security token.
    • Using the temporary credentials that are returned from the call, IAM users can then make programmatic calls to API operations that require MFA authentication.
    • Credentials based on account credentials can range from 900 seconds (15 minutes) up to 3600 seconds (1 hour), with a default of 1 hour.
  • AssumeRole (opens in a new tab)

    Returns a set of temporary security credentials that you can use to access AWS resources that you might not normally have access to. These temporary credentials consist of an access key ID, a secret access key, and a security token.

  • DecodeAuthorizationMessage (opens in a new tab)

    Decodes additional information about the authorization status of a request from an encoded message returned in response to an AWS request.

STS - Cheatsheet

STS - Get Caller Identity

  • GetCallerIdentity returns details about the IAM user or role whose credentials are used to call the operation.

    aws sts get-caller-identity

STS - View the maximum session duration setting for a role

S3

  • Data Consistency (opens in a new tab)

    • Strong read-after-write (GET or LIST) consistency for PUTs and DELETEs of objects
    • Strong read consistency for S3 Select, S3 Access Control Lists, S3 Object Tags, and object metadata
    • Updates to a single object key are atomic, and there is no way to make atomic updates across keys.
    • High availability by replicating data across multiple servers within AWS data centers.
    • Bucket configurations have an eventual consistency model.
    • Wait for 15 minutes after enabling versioning before issuing write operations (PUT or DELETE) on objects in the bucket.
    • S3 does not support object locking for concurrent writers.

S3 - Bucket

  • S3 lists all buckets globally, but each bucket is created in a specific Region; Cross-Region Replication (CRR) can replicate objects (with their metadata and tags) into other Regions.
  • Flat structure; folders in S3 are simply shared name prefixes.
  • Bucket names must be globally unique and cannot be changed once created.
  • Bucket names can consist only of lowercase letters, numbers, dots (.), and hyphens (-).
  • To ensure Bucket names are DNS-friendly, it's preferable to avoid dots in names.
  • Objects in Bucket are private by default.
  • There are no limits to the number of prefixes in a bucket.

S3 - Bucket - Versioning

  • Buckets can be in one of 3 states

    • Unversioned (default)
    • Versioning-enabled
    • Versioning-suspended
  • Once you enable versioning on a bucket, it can never return to the unversioned state. You can, however, suspend versioning on that bucket.

  • If you have not enabled versioning, S3 sets the value of the version ID to null.

  • Objects stored in your bucket before you set the versioning state have a version ID of null.

  • Suspend

    This suspends the creation of object versions for all operations but preserves any existing object versions.

S3 - Bucket - Lifecycle

  • Multiple lifecycle rules

    • Precedence when rules conflict: permanent deletion > transition > creation of delete markers (versioned bucket)

    • Transition

      S3 Glacier Flexible Retrieval > S3 Standard-IA / S3 One Zone-IA
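A lifecycle configuration combining transitions and expiration might be sketched as follows (the prefix and day counts are hypothetical):

```json
{
  "Rules": [
    {
      "ID": "ArchiveThenExpire",
      "Status": "Enabled",
      "Filter": { "Prefix": "logs/" },
      "Transitions": [
        { "Days": 30, "StorageClass": "STANDARD_IA" },
        { "Days": 90, "StorageClass": "GLACIER" }
      ],
      "Expiration": { "Days": 365 }
    }
  ]
}
```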

S3 - Bucket - Object Lock

Object Lock (opens in a new tab)

  • Prevent objects from being deleted or overwritten for a fixed amount of time or indefinitely.

  • Object Lock works only in versioned buckets, and retention periods and legal holds apply to an individual object version.

  • Use Object Lock to meet regulatory requirements that require WORM storage, or add an extra layer of protection against object changes and deletion.

  • Retention mode

    • Compliance mode

      The protected object version can't be overwritten or deleted by any user, including the root user in your AWS account. When an object is locked in compliance mode, its retention mode can't be changed, and its retention period can't be shortened.

    • Governance mode

      You protect objects against being deleted by most users, but you can still grant some users permission to alter the retention settings or delete the objects if necessary. You can also use governance mode to test retention-period settings before creating a compliance-mode retention period.

S3 - Bucket - Replication

Replication (opens in a new tab)

  • Both source and destination buckets must have versioning enabled.

  • Destination buckets can be in different Regions or within the same Region as the source bucket.

  • New objects

    • Replicate new objects as they are written to the bucket
    • Use live replication such as CRR or SRR
    • CRR and SRR are implemented with the same API, and differentiated by the destination bucket configuration.
  • Existing objects

    • Use S3 Batch Operations

S3 - Bucket - Static Website Hosting

Static website hosting (opens in a new tab)

  • An index document must be specified; an error document is optional.
  • If you create a folder structure in your bucket, you must have an index document at each level. In each folder, the index document must have the same name, for example, index.html.
  • S3 website endpoints do not support HTTPS. Use CloudFront in that case.
  • Access a website hosted in a S3 bucket with a custom domain
    • The Bucket is configured as a static website.
    • Bucket name must match the domain name exactly.
    • Add an alias record in Route53 to route traffic for the domain to the S3 Bucket

S3 - Bucket - Event Notifications

S3 Event Notifications (opens in a new tab)

  • Destination
    • Lambda function
    • SNS topic
    • SQS standard queue (FIFO queue not supported)
    • EventBridge event bus
  • If two writes are made to a single non-versioned object at the same time, it is possible that only a single event notification will be sent.
  • If you want to ensure that an event notification is sent for every successful write, you can enable versioning on your bucket. With versioning, every successful write will create a new version of your object and will also send an event notification.
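A notification configuration targeting a Lambda function might be sketched as follows (the function ARN, Id, and suffix filter are hypothetical):

```json
{
  "LambdaFunctionConfigurations": [
    {
      "Id": "thumbnail-on-upload",
      "LambdaFunctionArn": "arn:aws:lambda:us-east-1:111111111111:function:MakeThumbnail",
      "Events": ["s3:ObjectCreated:*"],
      "Filter": {
        "Key": { "FilterRules": [{ "Name": "suffix", "Value": ".jpg" }] }
      }
    }
  ]
}
```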

S3 - Bucket - Management - Inventory

S3 Inventory (opens in a new tab)

  • Audit and report on the replication and encryption status of your objects for business, compliance, and regulatory needs.
  • Generates inventories of the objects in the bucket on a daily or weekly basis, and the results are published to a flat file.
  • The bucket that is inventoried is called the source bucket, and the bucket where the inventory flat file is stored is called the destination bucket.
  • The destination bucket must be in the same Region as the source bucket.
  • S3 Inventory gives you a complete list of your objects, published to the destination bucket in Parquet, ORC, or CSV format, so it can be analyzed with Athena.

S3 - Bucket - Select

S3 Select (opens in a new tab)

  • Use a subset of SQL statements to filter the contents of S3 objects and retrieve just the subset of data that you need.
  • By using S3 Select to filter this data, you can reduce the amount of data that S3 transfers, which reduces the cost and latency to retrieve this data.
  • S3 Select works on objects stored in CSV, JSON, or Apache Parquet format with compression of GZIP or BZIP2.
  • You can only query one object at a time.
  • If you use FileHeaderInfo.USE, you can only reference columns by column name.
  • Column name must be quoted with " if it contains special characters or is a reserved word. e.g. SELECT s."column name" FROM S3Object s

S3 - Bucket - Transfer Acceleration

Transfer Acceleration (opens in a new tab)

  • Use the edge locations of CloudFront network to accelerate transfer between your client and the specified S3 bucket.
  • Not recommended for small files or when the client is already close to the S3 Region.

S3 - Bucket - Analytics

S3 Analytics (opens in a new tab)

  • You use storage class analysis to observe your data access patterns over time to gather information to help you improve the lifecycle management of your STANDARD_IA storage.
  • Analyze storage access patterns to help you decide when to transition the right data to the right storage class.

S3 - Bucket - Access Points

Access Points (opens in a new tab)

  • Simplify managing data access at scale for shared datasets in S3, enabling different teams to access shared data with different permissions.

  • Traits

    • Access points are named network endpoints attached to buckets that you can use to perform S3 object operations, such as GetObject and PutObject.

    • For S3 object operations, you can use the access point ARN in place of a bucket name.

    • Each access point has distinct permissions and network controls that S3 applies for any request that is made through that access point.

    • You can only use access points to perform operations on objects.

    • S3 operations compatible with access points

      Access point compatibility with S3 operations (opens in a new tab)

S3 - Bucket - Access Points - Object Lambda

S3 Object Lambda (opens in a new tab)

S3 - Object

  • At-Rest Encryption

    • S3 only supports symmetric CMKs, not asymmetric CMKs.

    • Server-side Encryption (opens in a new tab)

      • Add the x-amz-server-side-encryption header to the HTTP request to request server-side encryption.

      • SSE-S3 (opens in a new tab)

        • Uses keys that S3 creates and manages for you to generate the data key for encryption; no user intervention needed
        • x-amz-server-side-encryption: AES256
      • SSE-KMS (opens in a new tab)

        • Use a CMK you created in KMS to generate data key for encryption, requiring permission for KMS access

        • x-amz-server-side-encryption: aws:kms

        • When you upload an object, you can specify the AWS KMS CMK using the x-amz-server-side-encryption-aws-kms-key-id header. If the header is not present in the request, S3 assumes the AWS managed CMK.

        • Permissions

          • kms:GenerateDataKey (opens in a new tab)

            Returns a plaintext copy of the data key and a copy that is encrypted under a customer master key (CMK) that you specify.

          • kms:Decrypt (opens in a new tab)

            Multipart uploading needs this permission to decrypt the encrypted data key kept with the encrypted data as the plain text one is deleted after the first part is uploaded.

      • SSE-C (opens in a new tab)

        • Provide your own encryption key with every encryption and decryption request

        • Must use HTTPS

        • S3 does not store the encryption key you provide. Instead, it stores a randomly salted HMAC value of the encryption key to validate future requests.

        • x-amz-server-side-encryption-customer-algorithm

          must be AES256

        • x-amz-server-side-encryption-customer-key

          the 256-bit, base64-encoded encryption key

        • x-amz-server-side-encryption-customer-key-MD5

          message integrity check to ensure that the encryption key was transmitted without error
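The three header values can be derived with the standard library alone. A minimal sketch (illustrative, not an S3 client; it only shows how the base64 key and its MD5 check value relate to the raw key bytes):

```python
import base64
import hashlib
import os

# Generate a 256-bit customer-provided key and derive the SSE-C header
# values described above.
key = os.urandom(32)  # 256-bit AES key

headers = {
    "x-amz-server-side-encryption-customer-algorithm": "AES256",
    # base64 encoding of the raw key bytes
    "x-amz-server-side-encryption-customer-key": base64.b64encode(key).decode(),
    # base64 encoding of the MD5 digest of the raw (unencoded) key bytes
    "x-amz-server-side-encryption-customer-key-MD5": base64.b64encode(
        hashlib.md5(key).digest()
    ).decode(),
}
```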

    • Client-side Encryption (opens in a new tab)

      • Encryption and decryption happen on the client side with S3 only saving your data.
      • You can use your CMK stored locally or CMK stored in KMS.
    • As an analogy, suppose you go to work on any business day, and need to figure out how to have lunch.

      • Client-side encryption is like having lunch at home.
      • SSE-S3 is like ordering takeaway from your office.
      • SSE-KMS is like having lunch at your company's onsite canteen.
      • SSE-C is like bringing your lunch from home to work.
  • S3 Batch Operations

    S3 Batch Operations (opens in a new tab)

    • Large-scale batch operations on S3 objects
    • EB scale
    • Uses a manifest, such as an S3 Inventory report or a CSV list of objects, to identify the objects to act on
  • Uploading

    • When a file is over 100 MB, multipart upload is recommended as it will upload many parts in parallel, maximizing the throughput of your bandwidth and also allowing for a smaller part to retry in case that part fails.
    • A single PUT operation can upload an object up to 5 GB; above 5 GB, you must use multipart upload.
    • Part size: 5 MB to 5 GB. The last part of a multipart upload has no minimum size limit.
    • Object size: 0 to 5 TB
    • To perform a multipart upload with encryption using an AWS KMS key, the requester must have kms:GenerateDataKey permissions to initiate the upload, and kms:Decrypt permissions to upload object parts. The requester must have kms:Decrypt permissions so that newly uploaded parts can be encrypted with the same key used for previous parts of the same object.
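The part-size limits above translate into simple arithmetic for how many parts an upload needs. A small sketch (the 10,000-part maximum is an additional S3 quota not listed above; the function name is made up):

```python
import math

MIB = 1024 * 1024

def part_count(object_size: int, part_size: int = 100 * MIB) -> int:
    """Number of parts a multipart upload needs at the given part size."""
    if not (5 * MIB <= part_size <= 5 * 1024 * MIB):
        raise ValueError("part size must be between 5 MB and 5 GB")
    parts = math.ceil(object_size / part_size)
    if parts > 10_000:
        raise ValueError("S3 allows at most 10,000 parts per upload")
    return parts

# A 5 GiB object in 100 MiB parts: 51 full parts plus one smaller final part
print(part_count(5 * 1024 * MIB))
```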
  • Quota

    • 3500 PUT/COPY/POST/DELETE and 5500 GET/HEAD requests per second per prefix in a bucket
    • No limits to the number of prefixes in a bucket

S3 - Object - Presigned URL

Presigned URL (opens in a new tab)

  • Grants the URL holder temporary access to the specified S3 object without requiring separate authentication and authorization.

  • Generated programmatically

  • GET for downloading and PUT for uploading

  • As a general rule, AWS recommends using bucket policies or IAM policies for access control. ACLs are a legacy access control mechanism that predates IAM.

  • S3 stores access logs as objects in a bucket. Athena supports analysis of S3 objects and can be used to query S3 access logs.

S3 - Security

S3 - Security - Block public access

Block public access (opens in a new tab)

A shortcut switch to block all public access granted in Bucket Policy or ACLs.

S3 - Security - ACL

Access Control List (opens in a new tab)

  • Can define which AWS accounts or groups are granted access and the type of access.
  • Can manage permissions of Objects.

S3 - Bucket - Permissions - CORS

CORS (opens in a new tab)

  • To configure your bucket to allow cross-origin requests, you create a CORS configuration.
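A minimal CORS configuration in its JSON form might look like the following (the allowed origin is a placeholder):

```json
[
  {
    "AllowedHeaders": ["*"],
    "AllowedMethods": ["GET", "PUT"],
    "AllowedOrigins": ["https://www.example.com"],
    "ExposeHeaders": ["ETag"],
    "MaxAgeSeconds": 3000
  }
]
```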

S3 - Storage Lens

Cloud storage analytics solution with support for AWS Organizations to give you organization-wide visibility into object storage, with point-in-time metrics and trend lines as well as actionable recommendations.

All these things combined in an interactive dashboard will help you discover anomalies, identify cost efficiencies, and apply data protection best practices across accounts.

S3 - Storage classes

Storage classes (opens in a new tab)

S3 - Storage classes - S3 Standard

S3 - Storage classes - S3 Intelligent-Tiering

S3 Intelligent-Tiering (opens in a new tab)

  • Characteristics

    • No retrieval charges

    • Automatic storage cost savings when data access patterns change, without performance impact or operational overhead

    • Access tiers

      • Frequent Access tier

        Objects uploaded to S3 Intelligent-Tiering are stored in the Frequent Access tier.

      • Infrequent Access tier

        Objects not accessed for 30 consecutive days are automatically moved to the Infrequent Access tier.

      • Archive Instant Access tier

        Objects not accessed for 90 consecutive days are automatically moved to the Archive Instant Access tier.

    • Frequent Access, Infrequent Access, and Archive Instant Access tiers have the same low-latency and high-throughput performance of S3 Standard

    • The Infrequent Access tier saves up to 40% on storage costs

    • The Archive Instant Access tier saves up to 68% on storage costs

  • Use cases

    • Suitable for objects with unknown or changing access patterns
    • Suitable for objects equal to or larger than 128 KB
  • Anti patterns

    • Objects smaller than 128 KB will not be monitored and will always be charged at the Frequent Access tier rates, with no monitoring and automation charge.
    • Data retrieval or modification is more frequent than the transition intervals.
    • Access patterns are predictable and you can manage the storage classes transitions explicitly.

S3 - Storage classes - S3 Standard-IA

  • For data that is accessed less frequently, but requires rapid access when needed.
  • Incurs a data retrieval fee

S3 - Storage classes - S3 One Zone-IA (S3 One Zone-Infrequent Access)

  • Stores data in a single AZ and costs 20% less than S3 Standard-IA
  • Incurs a data retrieval fee

S3 on Outposts

S3 - CLI Cheatsheet

S3 Glacier

S3 Glacier (opens in a new tab)

S3 Glacier - Instant Retrieval

  • Ideal for long-lived archive data accessed once or twice per quarter with instant retrieval in milliseconds
  • The lowest cost archive storage with milliseconds retrieval
  • Offers cost savings compared to S3 Standard-IA, with the same latency and throughput performance as S3 Standard-IA.
  • Higher data access costs than S3 Standard-IA
  • Min storage duration of 90 days

S3 Glacier - Flexible Retrieval

  • Ideal for long-lived archive data accessed once a year with retrieval times of minutes to hours

  • Min storage duration of 90 days

  • Archive Retrieval Options

    • Expedited: 1–5 minutes

      • Incurs a data retrieval fee
    • Standard: 3–5 hours

      • Incurs a data retrieval fee
    • Bulk: 5–12 hours

      • Free data retrieval

S3 Glacier - Deep Archive

  • Ideal for long-lived archive data accessed less than once a year with retrieval times of hours
  • Default retrieval time of 12 hours
  • Min storage duration of 180 days
  • Incurs a data retrieval fee

CloudFront

  • Distribution (opens in a new tab)

  • Lambda@Edge (opens in a new tab)

    • Lambda functions with Python and Node.js runtimes can be deployed at CloudFront edge locations
    • Lambda@Edge allows you to pass each request through a Lambda to change the behaviour of the response.
    • Authorization@Edge (opens in a new tab): You can use Lambda@Edge to help authenticate and authorize users for the premium pay-wall content on your website, filtering out unauthorized requests before they reach your origin infrastructure.
  • Origin access (opens in a new tab)

    • Benefits

      • Restricts access to the AWS origin so that it's not publicly accessible
    • Origin type

      • S3

        • OAC / Origin Access Control

          • S3 SSE-KMS

          • Dynamic requests (PUT and DELETE) to S3

        • OAI / Origin Access Identity (legacy)

          • Restricting Access to S3 content by using an Origin Access Identity, a special CloudFront user, which the target S3 bucket can reference in bucket policy. Once set up, users can only access files through CloudFront, not directly from the S3 bucket.
      • MediaStore
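The OAI setup described above is enforced through the bucket policy. A typical policy shape (bucket name and OAI ID are placeholders):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "AWS": "arn:aws:iam::cloudfront:user/CloudFront Origin Access Identity EXAMPLE-OAI-ID"
      },
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::example-bucket/*"
    }
  ]
}
```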

  • Serving private content (opens in a new tab)

    • To use signed URLs or signed cookies, you need a signer. A signer is either a trusted key group (Recommended) that you create in CloudFront, or an AWS account that contains a CloudFront key pair (can only be created by root user).

    • You cannot use either signed URLs or signed cookies if the original URL contains the Expires, Policy, Signature, or Key-Pair-Id query parameters.

    • Signed URL (opens in a new tab)

      • Uses a JSON policy statement (canned or custom) to specify the restrictions of the signed URL
      • Use signed URLs when you want to restrict access to individual files.
      • Use signed URLs when your users are using a client that doesn't support cookies.
    • Signed cookies (opens in a new tab)

      • Use signed cookies when you want to provide access to multiple restricted files.
      • Use signed cookies when you don't want to change your current URLs.
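A custom policy statement for a signed URL or signed cookie has roughly the following shape (resource URL, epoch time, and IP range are placeholders):

```json
{
  "Statement": [
    {
      "Resource": "https://d111111abcdef8.cloudfront.net/private/*",
      "Condition": {
        "DateLessThan": { "AWS:EpochTime": 1767225600 },
        "IpAddress": { "AWS:SourceIp": "192.0.2.0/24" }
      }
    }
  ]
}
```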
  • Using HTTPS with CloudFront (opens in a new tab)

    • Both connections between viewers and CloudFront, and connections between CloudFront and origin can be encrypted by using HTTPS.
    • You can't use a self-signed SSL certificate for HTTPS communication between CloudFront and your origin; the certificate must be issued by a trusted certificate authority.
    • You don't need to add an SSL certificate if you only require HTTPS for communication between the viewers and CloudFront (default certificate provided by CloudFront).
  • Availability

    • Origin failover (opens in a new tab)
      • An origin group pairs two origins: a primary and a secondary. If the primary origin is unavailable, or returns specific HTTP response status codes that indicate a failure, CloudFront automatically switches to the secondary origin.
      • To set up origin failover, you must have a distribution with at least 2 origins.

RDS

AWS Docs - RDS (Relational Database Service) (opens in a new tab)

  • Authentication

    • IAM database authentication (opens in a new tab)
      • Only works with MySQL and PostgreSQL.
      • Instead of password, an authentication token is generated by RDS when you connect to a DB instance.
      • Each authentication token has a lifetime of 15 minutes.
      • Recommended as a temporary and personal access
  • Read Replicas (opens in a new tab) (for Scalability)

    • Operates as a DB instance that only allows read-only connections; applications can connect to a read replica just as they would to any DB instance.

    • Asynchronous replication to a Read Replica

    • Uses a different DB connection string than the one used by the master instance

      To switch at runtime, the application needs two connection pools, one for the master and one for the read replica.

    • Can be promoted to the master

    • Support Cross-Region read replicas (opens in a new tab)

  • Multi-AZ deployments (opens in a new tab) (for High Availability)

    • Synchronous replication to a standby instance in a different AZ

    • In case of an infrastructure failure, RDS performs an automatic failover to the standby instance (or to a read replica in the case of Amazon Aurora), so that you can resume database operations as soon as the failover is complete.

    • The endpoint for your DB instance remains the same after a failover

    • The failover mechanism automatically changes the DNS CNAME record of the DB instance to point to the standby instance.

    • The standby instance cannot be used as a read replica.

    • Multi-AZ DB instance deployment

      • 1 standby DB instance
      • failover support
      • no read traffic support
    • Multi-AZ DB cluster deployment

      • 3 DB instances
      • failover support
      • read traffic support
    • Resources

  • Snapshot

    • When you perform a restore operation to a point in time or from a DB snapshot, a new DB instance is created with a new endpoint (the old DB instance can be deleted if so desired). This is done to enable you to create multiple DB instances from a specific DB snapshot or point in time.

    • Automated backups are limited to a single Region while manual snapshots and read replicas are supported across multiple Regions.

    • Manual snapshot

      • When you delete a DB instance, you can create a final DB snapshot upon deletion.
      • Manual snapshots are kept after the deletion of the DB instance.
    • Automated snapshot

      • Configurable retention period: 7 days by default, up to 35 days
      • Cannot be manually deleted, automatically deleted when the DB instance is deleted
      • Stored in S3
      • Storage of automated snapshots is free while the DB instance is running. If the DB instance is stopped, the snapshot storage is charged at standard pricing.
  • Encryption

  • Monitoring

    • Enhanced Monitoring (opens in a new tab)
      • RDS provides metrics in real time for the OS that your DB instance runs on.
      • Enhanced Monitoring metrics are stored in CloudWatch Logs instead of in CloudWatch Metrics.
      • After you have enabled Enhanced Monitoring for your DB instance, you can view the metrics for your DB instance using CloudWatch Logs, with each log stream representing a single DB instance being monitored.
      • CloudWatch gathers metrics about CPU utilization from the hypervisor for a DB instance, and Enhanced Monitoring gathers its metrics from an agent on the instance.

RDS - Aurora

AWS Docs - Aurora (opens in a new tab)

  • Fully managed RDBMS compatible with MySQL and PostgreSQL, with a serverless option.
  • Up to 5 times the throughput of MySQL and up to 3 times the throughput of PostgreSQL without requiring changes to most of your existing applications.
  • Up to 15 read replicas
  • Automatic backup

RDS - RDS Proxy

RDS Proxy (opens in a new tab)

  • Establishes a database connection pool and reuses connections in this pool.
  • Makes applications more resilient to database failures by automatically connecting to a standby DB instance while preserving application connections.

RDS - Cheatsheet

List clusters

aws rds describe-db-clusters \
--query 'sort_by(DBClusters,&DBClusterIdentifier)[].{ClusterID:DBClusterIdentifier, ClusterARN:DBClusterArn, Port:Port, Engine:Engine, Version:EngineVersion, Status:Status}' \
--output table

List DB instances

aws rds describe-db-instances \
--query 'sort_by(DBInstances,&DBInstanceIdentifier)[].{InstanceID:DBInstanceIdentifier, InstanceARN:DBInstanceArn, Engine:Engine, Version:EngineVersion, Status:DBInstanceStatus}' \
--output table

DynamoDB

  • Schemaless, you can only specify keys upon creation of tables, non-key attributes can only be added as part of new records.

DynamoDB - Availability

  • Region specific
  • Data replicated among multiple AZs in a Region

DynamoDB - Table Class

  • Standard

    • Offers lower throughput costs than DynamoDB Standard-IA and is the most cost-effective option for tables where throughput is the dominant cost.
  • Standard-IA

    • Offers lower storage costs than DynamoDB Standard, and is the most cost-effective option for tables where storage is the dominant cost.
    • When storage exceeds 50% of the throughput (reads and writes) cost of a table using the DynamoDB Standard table class, the DynamoDB Standard-IA table class can help you reduce your total table cost.

DynamoDB - Primary Key

DynamoDB - GSI

  • To speed up queries on non-key attributes
  • An index with a partition key and a sort key that can be different from those on the base table
  • It is considered global because queries on the index can span all of the data in the main table across all partitions.
  • The main table's primary key attributes are always projected into an index.
  • Up to 20 GSI / table (soft limit)
  • Can be created after table creation
  • RCU and WCU provisioned independently of main table, and therefore a Query operation on a GSI consumes RCU from the GSI, not the main table. When you change items in a table, the GSI on that table are also updated. These index updates consume WCU from the GSI, not from the main table.
  • If the writes are throttled on the GSI, the write activity on the main table will also be throttled.
  • Only supports eventually consistent reads (cannot provide strong consistency)
  • In a DynamoDB table, each key value must be unique. However, the key values in a GSI do not need to be unique.

DynamoDB - LSI

  • An index with the same Partition key but a different Sort key
  • Up to 5 LSI / table (hard limit)
  • Cannot be created after table creation
  • Use the WCU and RCU of the base table
  • No special throttling considerations
  • Supports both strongly and eventually consistent reads
  • A LSI lets you query over a single partition, as specified by the partition key value in the query.

DynamoDB - Read Consistency

  • Read committed isolation level

  • base table

    • Strongly consistent read
    • Eventually consistent read
  • LSI

    • Strongly consistent read
    • Eventually consistent read
  • GSI

    • Eventually consistent read
  • DynamoDB streams

    • Eventually consistent read

DynamoDB - Capacity

  • Throughput mode: Provisioned or On-Demand

  • Read Capacity Unit (RCU)

    • 1 RCU = 1 strongly consistent read/s or 2 eventually consistent read/s, for an item up to 4 KB in size.
    • For items larger than 4 KB, additional RCUs are consumed (one per additional 4 KB, rounded up).
    • For items smaller than 4 KB, one full RCU is still consumed.
    • Calculation
      • strongly consistent
        1. Round data up to nearest 4
        2. Divide data by 4
        3. Multiplied by number of reads
      • eventual consistent
        1. Round data up to nearest 4
        2. Divide data by 4
        3. Multiplied by number of reads
        4. Divide final number by 2
        5. Round up to the nearest whole number
  • Write Capacity Unit (WCU)

    • 1 WCU = 1 write/s for an item up to 1 KB in size.
    • For items larger than 1 KB, additional WCUs are consumed (one per additional 1 KB, rounded up).
    • For items smaller than 1 KB, one full WCU is still consumed.
    • Calculation
      1. Round data up to nearest 1
      2. Multiplied by number of writes
  • If your application consumes more throughput than configured in the provisioned throughput settings, requests are throttled.

  • Adaptive Capacity (opens in a new tab)

    • Boost Throughput Capacity to High-Traffic Partitions
      • Enables your application to continue reading and writing to hot partitions without being throttled, provided that traffic does not exceed your table’s total provisioned capacity or the partition maximum capacity.
    • Isolate Frequently Accessed Items
      • If your application drives disproportionately high traffic to one or more items, adaptive capacity rebalances your partitions such that frequently accessed items don't reside on the same partition.
  • To retrieve consumed capacity by an operation, parameter ReturnConsumedCapacity (opens in a new tab) can be included in the request to API, with 3 options: INDEXES, TOTAL, NONE.
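The RCU/WCU calculation steps above can be sketched as small helper functions (illustrative only; the function names are made up):

```python
import math

def rcus(item_size_kb: float, reads_per_sec: int,
         eventually_consistent: bool = False) -> int:
    """RCUs needed: one RCU per 4 KB (rounded up) per strongly consistent read."""
    units = math.ceil(item_size_kb / 4) * reads_per_sec
    if eventually_consistent:
        units = math.ceil(units / 2)  # eventually consistent reads cost half
    return units

def wcus(item_size_kb: float, writes_per_sec: int) -> int:
    """WCUs needed: one WCU per 1 KB (rounded up) per write."""
    return math.ceil(item_size_kb) * writes_per_sec

# 10 strongly consistent reads/s of 6 KB items: ceil(6/4) = 2 RCUs each
print(rcus(6, 10))        # 20
print(rcus(6, 10, True))  # 10 eventually consistent
print(wcus(1.5, 12))      # ceil(1.5) = 2 WCUs each -> 24
```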

DynamoDB - Query

Query (opens in a new tab)

  • Query requires the partition key value and returns all items with it. Optionally, you can provide a sort key attribute and use a comparison operator to refine the search results.
  • A filter expression determines which items within the Query results are returned to you. Filtering happens after the items are read, so it does not reduce consumed capacity or improve performance.
  • A single Query operation can retrieve a maximum of 1 MB of data.
  • Query results are always sorted by the sort key value, by default in ascending order.

DynamoDB - Scan

Scan (opens in a new tab)

  • Reads every item in a table or a secondary index
  • By default, a Scan operation returns all of the data attributes for every item in the table or index.
  • If the total number of scanned items exceeds the maximum dataset size limit of 1 MB (default page size), the scan stops and results are returned together with a LastEvaluatedKey value, which you use to continue the scan in a subsequent operation.
  • Because a Scan operation reads an entire page (by default, 1 MB), you can reduce its impact by setting a smaller page size.
  • Each Query or Scan request that has a smaller page size uses fewer read operations and creates a "pause" between each request.
  • Scan uses the Limit parameter to set the page size for your request.
  • Parallel Scan is preferable when:
    • The table size is 20 GB or larger.
    • The table's provisioned RCU is not being fully used.
    • Default sequential Scan operations are too slow.
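The LastEvaluatedKey pagination contract can be sketched with a stubbed scan function. Everything below is illustrative: a real application would call the DynamoDB Scan API through an SDK, but the loop shape is the same.

```python
# Stub data standing in for a DynamoDB table.
TABLE = [{"pk": i} for i in range(10)]

def fake_scan(limit: int, exclusive_start_key=None):
    """Mimics Scan: returns a page of Items, plus LastEvaluatedKey
    when more data remains (absent on the final page)."""
    start = 0 if exclusive_start_key is None else exclusive_start_key + 1
    page = TABLE[start:start + limit]
    resp = {"Items": page}
    last = start + len(page) - 1
    if last < len(TABLE) - 1:
        resp["LastEvaluatedKey"] = last
    return resp

def scan_all(limit=4):
    """Drain every page by feeding LastEvaluatedKey back in."""
    items, key = [], None
    while True:
        resp = fake_scan(limit, key)
        items.extend(resp["Items"])
        key = resp.get("LastEvaluatedKey")
        if key is None:  # absent key marks the final page
            break
    return items

print(len(scan_all()))  # 10 items across 3 pages
```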

DynamoDB - TTL

TTL (opens in a new tab)

  • Must identify a specific attribute name that the service will look for when determining if an item is eligible for expiration.
  • The attribute should be of the Number data type, containing the expiration time in Unix epoch format (seconds).
  • Once the timestamp expires, the corresponding item is deleted from the table in the background.
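Computing the TTL attribute value is just epoch arithmetic. A minimal sketch (the attribute name "expireAt" is a hypothetical choice; DynamoDB only requires that the configured TTL attribute hold a Number in epoch seconds):

```python
import time

def item_with_ttl(pk: str, days: int) -> dict:
    """Build an item whose TTL attribute expires the given number of days from now."""
    return {"pk": pk, "expireAt": int(time.time()) + days * 24 * 60 * 60}

item = item_with_ttl("session#123", days=7)
```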

DynamoDB - Data type

DynamoDB - DAX

AWS Docs - DynamoDB Accelerator (DAX) (opens in a new tab)

  • Characteristics

    • A fully managed, in-memory, write-through cache for DynamoDB that runs as a cluster in your VPC.
    • Should be provisioned in the same VPC as the EC2 instances that are accessing it.
  • Pros

    • Reduces read response times to microseconds
    • Apps that read a small number of items more frequently
    • Apps that are read intensive
  • Cons

    • Reads must be eventually consistent, therefore apps requiring strongly consistent reads cannot use DAX
    • Not suitable for apps that do not require microsecond read response times
    • Not suitable for apps that are write intensive, or that do not perform much read activity
  • Supports the following read operations in eventually consistent read mode: GetItem, BatchGetItem, Query, Scan

  • The following DAX API operations are considered write-through

    • BatchWriteItem
    • UpdateItem
    • DeleteItem
    • PutItem
  • Misc

    • ElastiCache can be used with other DBs and applications, while DAX is for DynamoDB only.

DynamoDB - Transaction

  • Supports transactions via the TransactWriteItems and TransactGetItems API calls.
  • Transactions group reads or writes, possibly across multiple tables, into a single all-or-nothing operation.

DynamoDB - Global table

Global table (opens in a new tab)

  • HA and fault tolerance
  • Lower latency for users in different Regions
  • With global tables you can specify the Regions where you want the table to be available. DynamoDB performs all of the necessary tasks to create identical tables in these Regions and propagate ongoing data changes to all of them.
  • DynamoDB global tables use a “last writer wins” reconciliation between concurrent updates, and therefore doesn't support optimistic locking.

DynamoDB - Streams

  • Capture item-level changes in your table, and push the changes to a DynamoDB stream. You then can access the change information through the DynamoDB Streams API.

  • View type

    • Keys only

      Only the key attributes of the modified item

    • New image

      The entire item, as it appears after it was modified

    • Old image

      The entire item, as it appeared before it was modified

    • New and old image

      Both the new and the old images of the item

  • Streams do not consume RCUs.

  • All data in DynamoDB Streams is subject to a 24-hour lifetime.

DynamoDB - Conditional operations

DynamoDB - Atomic counter

Atomic counter (opens in a new tab)

  • A numeric attribute that is incremented unconditionally, without interfering with other write requests
  • The numeric value increments each time you call UpdateItem.
  • An atomic counter would not be appropriate where overcounting or undercounting can't be tolerated.

DynamoDB - Quota

  • The maximum item size is 400 KB, which includes both attribute name lengths (UTF-8 binary length) and attribute value lengths (binary length). Attribute names count towards the size limit.
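For string attributes the accounting is straightforward to sketch (illustrative only: numbers, sets, and other types have their own encodings, so this helper handles strings alone):

```python
MAX_ITEM_BYTES = 400 * 1024

def string_item_size(item: dict) -> int:
    """Approximate item size for string-only items:
    UTF-8 bytes of each attribute name plus UTF-8 bytes of each value."""
    return sum(len(k.encode("utf-8")) + len(v.encode("utf-8"))
               for k, v in item.items())

item = {"pk": "user#1", "note": "a" * 1000}
size = string_item_size(item)   # (2 + 6) + (4 + 1000) = 1012 bytes
assert size <= MAX_ITEM_BYTES
```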

DynamoDB - Point-in-time recovery (PITR)

  • Continuous backup with per-second granularity so that you can restore to any given second in the preceding 35 days.
  • Using PITR, you can back up tables with hundreds of TB of data, with no impact on the performance or availability of your production applications.

DynamoDB - Resources

ElastiCache

  • ElastiCache is only accessible to resources operating within the same VPC, to ensure low latency.

  • Caching Strategies (opens in a new tab)

    • Lazy Loading

      • On-demand loading of data from database if a cache miss occurs
    • Write-Through

      • Update cache whenever data is written to the database, ensuring cache is never stale.
    • TTL specifies the number of seconds until the key expires.
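The two strategies above can be sketched in-process, using plain dicts in place of ElastiCache and a backing database (illustrative only):

```python
# Stand-ins for ElastiCache and the backing database.
db = {"k1": "v1"}
cache = {}

def lazy_get(key):
    """Lazy loading: populate the cache only on a miss."""
    if key in cache:
        return cache[key]       # cache hit
    value = db.get(key)         # cache miss: load from the database
    if value is not None:
        cache[key] = value      # cache on demand
    return value

def write_through(key, value):
    """Write-through: every database write also updates the cache."""
    db[key] = value
    cache[key] = value          # cache is never stale for written keys

print(lazy_get("k1"))           # v1 (miss, then cached)
write_through("k2", "v2")
print(cache["k2"])              # v2
```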

  • Memcached

    • Simple key/value store that only supports strings, making it suitable for small, static data such as HTML code fragments
    • Multi-threaded; scaling in or out causes loss of cached data
    • Marginal performance advantage because of its simplicity
  • Redis

    • Supports advanced data structures
    • Single-threaded, scaling causes no loss of data
    • Finer-grained control over eviction
    • Supports persistence, transactions and replication
  • Use case

  • Resources

Route 53

  • Supported DNS record types (opens in a new tab)

    • A (Address) records

      Associate a domain name or subdomain name with the IPv4 address of the corresponding resource

    • AAAA (Address) records

      Associate a domain name or subdomain name with the IPv6 address of the corresponding resource

    • CAA

      A CAA record specifies which certificate authorities (CAs) are allowed to issue certificates for a domain or subdomain. Creating a CAA record helps to prevent the wrong CAs from issuing certificates for your domains.

    • CNAME (opens in a new tab)

      • Reroute traffic from one domain name (example.net) to another domain name (example.com)
      • The DNS protocol does not allow you to create a CNAME record for the top node of a DNS namespace (zone apex).
    • DS

      A delegation signer (DS) record refers a zone key for a delegated subdomain zone. You might create a DS record when you establish a chain of trust when you configure DNSSEC signing.

    • MX (Mail server) records

      Route traffic to mail servers

    • NAPTR

      A Name Authority Pointer (NAPTR) is a type of record that is used by Dynamic Delegation Discovery System (DDDS) applications to convert one value to another or to replace one value with another.

    • NS

      An NS record identifies the name servers for the hosted zone.

    • PTR

      A PTR record maps an IP address to the corresponding domain name.

    • SOA

      A start of authority (SOA) record provides information about a domain and the corresponding Amazon Route 53 hosted zone.

    • SPF

      Deprecated, TXT is recommended instead.

    • SRV

      SRV records are used for accessing services, such as a service for email or communications.

    • TXT

      A TXT record contains one or more strings that are enclosed in double quotation marks (").

  • Alias records (opens in a new tab)

    • Unlike a CNAME record, you can create an alias record at the top node of a DNS namespace (zone apex).
    • To route domain traffic to an ELB load balancer, use Route 53 to create an alias record that points to your load balancer.
    • A zone apex record is a DNS record at the root of a DNS zone, and the zone apex must be an A record.
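An alias A record pointing the zone apex at an ELB has the following change-batch shape (domain, load balancer DNS name, and hosted-zone ID are placeholders; each ELB type and Region has its own canonical hosted-zone ID):

```json
{
  "Changes": [
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "example.com",
        "Type": "A",
        "AliasTarget": {
          "HostedZoneId": "Z35SXDOTRQ7X7K",
          "DNSName": "my-load-balancer-1234567890.us-east-1.elb.amazonaws.com",
          "EvaluateTargetHealth": false
        }
      }
    }
  ]
}
```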
  • Routing policy

    • Simple routing policy

      Use for a single resource that performs a given function for your domain, for example, a web server that serves content for the example.com website.

    • Failover routing policy

      Use when you want to configure active-passive failover.

    • Geolocation routing policy

      Use when you want to route traffic based on the location of your users.

    • Geoproximity routing policy

      Use when you want to route traffic based on the location of your resources and, optionally, shift traffic from resources in one location to resources in another.

    • Latency routing policy

      Use when you have resources in multiple Regions and you want to route traffic to the region that provides the best latency.

    • Multivalue answer routing policy

      Use when you want Route 53 to respond to DNS queries with up to eight healthy records selected at random.

    • Weighted routing policy

      Use to route traffic to multiple resources in specified proportions.

  • TTL

    • DNS records cache has a TTL. Any DNS update will not be visible until TTL has elapsed.
    • TTL should be set to strike a balance between how long the value should be cached and how much query load goes to the DNS servers.
  • Health checks

    • Health checks that monitor an endpoint
    • Health checks that monitor other health checks (calculated health checks)
    • Health checks that monitor CloudWatch alarms

Route 53 Resolver (opens in a new tab)

  • A Route 53 Resolver automatically answers DNS queries for:

    • Local VPC domain names for EC2 instances

      e.g. ec2-192-0-2-44.compute-1.amazonaws.com

    • Records in private hosted zones

      e.g. acme.example.com

    • For public domain names, Route 53 Resolver performs recursive lookups against public name servers on the internet.

Route53 - Cheatsheet

Update the given DNS record(s)

aws route53 change-resource-record-sets \
--hosted-zone-id <hosted-zone-id> \
--change-batch \
'{
  "Changes": [
    {
      "Action": "UPSERT",
      "ResourceRecordSet": {
        "Name": "<old-DNS-name>",
        "Type": "CNAME",
        "TTL": 300,
        "ResourceRecords": [
          {
            "Value": "<new-DNS-name>"
          }
        ]
      }
    }
  ]
}'

Get the key-signing key (KSK) public key and the DS record to add to your parent hosted zone

# Reference: https://repost.aws/knowledge-center/route-53-configure-dnssec-domain
aws route53 get-dnssec --hosted-zone-id <hosted-zone-id>

CloudWatch

CloudWatch Events (Amazon EventBridge)

  • Rule

    • Event Source

      • Timing
        • Event Pattern
        • Schedule
      • Supported services
    • Target

      • A variety of AWS services
  • AWS service events are free

  • Custom events (PutEvents actions) may incur additional charges.

  • EventBridge

    • supports many more targets, meaning you can integrate a wider variety of services
    • Its cross-account delivery capability further amplifies its reach. It’s easy to distribute events to Kinesis, Step Functions, and many other services running in another AWS account.
    • supports native AWS events as well as third-party partner events.
    • supports content-based filtering.
    • supports input transformation.
    • has built-in schema discovery capabilities.

CloudWatch Metrics

  • Metrics are Region based.

  • Default CloudWatch metrics (opens in a new tab)

  • Namespace

  • Dimension

    • A dimension is a unique identifier of a metric, such as InstanceId.
    • Up to 10 dimensions per metric, and each dimension is defined by a name and value pair.
  • Custom Metrics (opens in a new tab)

    • Can only be published to CloudWatch using the AWS CLI or an API.
    • Use PutMetricData API action programmatically
  • Metric Math (opens in a new tab)

    • Enables you to query multiple CloudWatch metrics and use math expressions to create new time series based on these metrics.
  • Resolution

    • Predefined Metrics produced by AWS services are standard resolution.
    • When you publish a Custom Metric, you can define it as either standard resolution or high resolution.
    • Standard resolution: 1 minute granularity
    • High resolution: 1 second granularity
  • EC2 (opens in a new tab)

    • CloudWatch AWS/EC2 namespace (opens in a new tab)

      • These metrics are collected by CloudWatch Metrics under the namespace AWS/EC2. There are 2 modes for metrics collection: basic monitoring and detailed monitoring.

        • Basic Monitoring

          • EC2 sends metric data to CloudWatch in 5-minute periods at no charge.
        • Detailed Monitoring

          • EC2 sends metric data to CloudWatch in 1-minute periods for an additional charge.

          • Enable detailed monitoring using AWS CLI

            aws ec2 monitor-instances --instance-ids <instance-IDs>

    • Metrics collected by the CloudWatch Agent (opens in a new tab)

      • Metrics not available under the namespace AWS/EC2 can be collected by the CloudWatch Agent.
      • The collected metrics are available under the namespace CWAgent in CloudWatch Metrics.
      • CloudWatch Agent also can collect logs.
  • List AWS services publishing CloudWatch Metrics (opens in a new tab)

    • aws cloudwatch list-metrics [--namespace <namespace>] [--metric-name <metric-name>]
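As a sketch of publishing a custom metric via the PutMetricData action: the request below is built as a plain dict so it runs without AWS credentials (the boto3 call is shown commented out; the namespace, metric name, and dimension are hypothetical). Note StorageResolution, which marks the custom metric as high resolution (1) or standard (60):

```python
# Sketch: parameters for a CloudWatch PutMetricData call (hypothetical metric).
metric_data = {
    "Namespace": "MyApp",                        # custom namespace
    "MetricData": [{
        "MetricName": "RequestLatency",
        "Dimensions": [{"Name": "Service", "Value": "checkout"}],
        "Value": 123.0,
        "Unit": "Milliseconds",
        "StorageResolution": 1,                  # 1 = high resolution, 60 = standard
    }],
}
# import boto3
# boto3.client("cloudwatch").put_metric_data(**metric_data)
assert metric_data["MetricData"][0]["StorageResolution"] in (1, 60)
```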

CloudWatch Alarms

  • Metric

    • An Alarm watches a single metric over a specified time period, and performs one or more specified actions, based on the value of the metric relative to a threshold over time.

    • A value of the metric is a data point.

    • Period for AWS Metrics cannot be lower than 1 minute.

    • Alarm on High Resolution Custom Metrics

      | Alarm Period | Metrics Standard Resolution (60 Seconds) | Metrics High Resolution (1 Second) |
      | --- | --- | --- |
      | 10 Seconds | — | ✅ (additional charge) |
      | 30 Seconds | — | ✅ (additional charge) |
      | 60 Seconds | ✅ | ✅ |
  • Evaluation (opens in a new tab)

    • Period

      The length of time in seconds to evaluate the metric or expression to create each individual data point for an alarm

    • Evaluation Periods

      The number of the most recent periods, or data points, to evaluate when determining alarm state.

    • Data points to alarm

      Define the number of data points within the evaluation period that must be breaching to cause the alarm to go to ALARM state.

  • Action

    • a notification sent to a SNS topic
    • Auto Scaling actions
    • EC2 actions (only applicable to EC2 Per-Instance Metrics)
  • States

    • OK

      The metric is within the defined threshold

    • ALARM

      The metric is beyond the defined threshold

    • INSUFFICIENT_DATA

      The alarm has only just been configured, the metric is unavailable, or there is not enough data for the metric to determine the alarm state.
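The Period / Evaluation Periods / Data points to alarm interaction above can be sketched as an "M out of N" rule: look at the last N data points and go to ALARM when at least M of them breach the threshold (a simplification — real alarms also handle missing data):

```python
def evaluate_alarm(datapoints, threshold, evaluation_periods, datapoints_to_alarm):
    """Simplified 'M out of N' alarm evaluation (ignores missing-data treatment)."""
    recent = datapoints[-evaluation_periods:]            # the N most recent data points
    breaching = sum(1 for v in recent if v > threshold)  # count threshold breaches
    return "ALARM" if breaching >= datapoints_to_alarm else "OK"

# 3 of the last 5 data points must breach 80 for the alarm to fire
assert evaluate_alarm([10, 90, 95, 20, 85], 80, 5, 3) == "ALARM"
assert evaluate_alarm([10, 90, 95, 20, 30], 80, 5, 3) == "OK"
```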

CloudWatch Logs

CloudWatch - Logs Insights

CloudWatch - Application Signals

CloudWatch - Application Signals - Synthetics Canaries

  • Synthetic monitoring works by issuing automated, simulated transactions from a robot client to your application in order to mimic what a typical user might do.
  • Based on Puppeteer

CloudWatch - Cheatsheet

List all metrics

  • aws cloudwatch list-metrics

List all metrics of a namespace

  • aws cloudwatch list-metrics --namespace <namespace>

    e.g. aws cloudwatch list-metrics --namespace "AWS/Route53"

CloudTrail

  • Trail
    • Applies to all Regions, recording events in all Regions
    • Applies to one Region, recording events in that Region only
    • Organization trail (opens in a new tab)
      • If you have created an Organization, you can also create a trail that will log all events for all AWS accounts in that Organization.
      • Organization trails can apply to all Regions or one Region.
      • Organization trails must be created in the management account.
      • Member accounts will be able to see the Organization trail, but cannot modify or delete it.
      • By default, member accounts will not have access to the log files for the Organization trail in the S3 bucket.
  • Events (opens in a new tab)
    • Management events
    • Data events (additional charges apply)
    • CloudTrail Insights events

CloudTrail - Data Events

  • High-volume activities, including operations such as S3 object-level API operations and the Lambda function Invoke API.

CloudTrail - CloudTrail Lake

CloudTrail Lake (opens in a new tab)

  • Converts existing events in row-based JSON format to ORC format

X-Ray

  • A distributed tracing solution, especially for apps built using a microservices architecture

  • Segment

    • At a minimum, a segment records the name, ID, start time, trace ID, and end time of the request.
    • A segment document can be up to 64 KB and contain a whole segment with subsegments, a fragment of a segment that indicates that a request is in progress, or a single subsegment that is sent separately. You can send segment documents directly to X-Ray by using the PutTraceSegments API.
    • When you instrument your application with the X-Ray SDK, the SDK generates segment documents for you. Instead of sending segment documents directly to X-Ray, the SDK transmits them over a local UDP port to the X-Ray daemon.
  • Subsegment

    • Subsegment provides more granular timing information and details about downstream calls that your app made to fulfill the original request.

    • Subsegments can contain other subsegments, so a custom subsegment that records metadata about an internal function call can contain other custom subsegments and subsegments for downstream calls.

    • A subsegment records a downstream call from the point of view of the service that calls it.

    • Field

      • namespace - aws for AWS SDK calls; remote for other downstream calls.
  • Service Graph is a flow chart visualization of average response for microservices and to visually pinpoint failure.

  • Trace collects all Segments generated by a single request so you can track the path of requests through multiple services.

    • Trace ID in HTTP header (Tracing header) is named X-Amzn-Trace-Id.
  • Sampling is an algorithm that decides which requests should be traced. By default, X-Ray records the first request each second and 5% of any additional requests.

  • Annotations

    • Use Annotations (opens in a new tab) to record information on Segments or Subsegments that you want indexed for search.
    • Annotations support 3 data types: String, Number and Boolean.
    • Keys must be alphanumeric (underscore is allowed) in order to work with filters; other symbols and whitespace are ignored.
    • X-Ray indexes up to 50 annotations per trace.
  • Use Metadata to record data you want to store in the trace but don't need to use for searching traces.

  • Daemon

    • The X-Ray daemon gathers raw segment data and relays it to the X-Ray API

    • The daemon works in conjunction with the X-Ray SDKs and must be running so that data sent by the SDKs can reach the X-Ray service.

    • By default listens on UDP port 2000

    • -r, --role-arn: Assume the specified IAM role to upload segments to a different account.

    • ECS

      create a Docker image that runs the X-Ray daemon, upload it to a Docker image repository, and then deploy it to your ECS cluster.

  • Environment variables (opens in a new tab)

  • Instrumentation (opens in a new tab)

    • Automatic
    • Manual
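The default sampling rule above (the first request each second, plus 5% of any additional requests) can be sketched like this — a simplification of the real reservoir-based sampler, with an injected random source so the behavior is reproducible:

```python
import random

def make_default_sampler(rng):
    """Sketch of X-Ray's default rule: trace the first request each second,
    then 5% of the rest (reservoir size 1, fixed rate 0.05)."""
    state = {"second": None}
    def should_sample(now_seconds):
        second = int(now_seconds)
        if state["second"] != second:
            state["second"] = second
            return True               # first request in this second: always traced
        return rng.random() < 0.05    # remaining requests: 5% chance
    return should_sample

sampler = make_default_sampler(random.Random(42))
assert sampler(10.0) is True    # first request of second 10 is traced
assert sampler(10.1) is False   # with this seed, the 5% roll fails (0.639 >= 0.05)
```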

KMS

  • Multi-tenant key store management service operated by AWS.

  • KMS can use its own hardware security modules (HSMs) or a customer managed CloudHSM key store.

  • Region specific, a key that is created in one region can't be used in another region

  • KMS centrally stores and manages encryption keys called KMS keys; KMS keys never leave KMS unencrypted and are symmetric by default.

  • The Encrypt, Decrypt and ReEncrypt API actions use a KMS key directly and can only operate on up to 4 KB of data, so they are primarily designed to encrypt and decrypt data keys.

  • Data over 4 KB can only be encrypted with Envelope Encryption using a data key.

  • Types of KMS Key

    | Description | Customer-managed | AWS-managed | AWS-owned |
    | --- | --- | --- | --- |
    | Key creation | customer | AWS on behalf of customer | AWS |
    | Key usage | customer can control key usage through the KMS and IAM policy | can be used only with specific AWS services where KMS is supported | implicitly used by AWS to protect customer data; customer can't explicitly use it |
    | Key rotation | manually configured by customer | rotated automatically once a year | rotated automatically by AWS without any explicit mention of the rotation schedule |
    | Key deletion | can be deleted | can't be deleted | can't be deleted |
    | User access | controlled by the IAM policy | controlled by the IAM policy | can't be accessed by users |
    | Key access policy | managed by customer | managed by AWS | N/A |
  • Encryption options in KMS

    • AWS managed keys

      • Encryption Method (AWS managed)
      • Keys Storage (AWS managed)
      • Keys Management (AWS managed)
    • Customer managed keys

      • Encryption Method (Customer managed)
      • Keys Storage (AWS managed, CloudHSM)
      • Keys Management (Customer managed)
    • Custom key stores

      • Encryption Method (Customer managed)
      • Keys Storage (Customer managed)
      • Keys Management (Customer managed)
  • API

    • Encrypt (opens in a new tab)

      Encrypts plaintext into ciphertext by using a KMS CMK.

    • Decrypt (opens in a new tab)

      Decrypts ciphertext that was encrypted by a KMS CMK.

    • GenerateDataKey (opens in a new tab)

      • Generates a unique symmetric data key for client-side encryption, including a plaintext copy of the data key and a copy that is encrypted under a CMK that you specify.

      • To encrypt data outside of KMS:

        • Use the GenerateDataKey operation to get a data key.
        • Use the plaintext data key (in the Plaintext field of the response) to encrypt your data outside of KMS (Using any 3rd party cryptography library)
        • Erase the plaintext data key from memory.
        • Store the encrypted data key (in the CiphertextBlob field of the response) with the encrypted data.
      • To decrypt data outside of KMS:

        • Use the Decrypt operation to decrypt the encrypted data key. The operation returns a plaintext copy of the data key.
        • Use the plaintext data key to decrypt data outside of KMS.
        • Erase the plaintext data key from memory.
    • GenerateDataKeyWithoutPlaintext (opens in a new tab)

      The same result as GenerateDataKey, only without the plaintext copy of the data key.

  • Symmetric and asymmetric CMKs (opens in a new tab)

    • All AWS services that encrypt data on your behalf require a symmetric CMK.

    • Symmetric key

      • Encrypt / Decrypt
    • Asymmetric key

      • Encrypt / Decrypt
      • Sign / Verify
      • Doesn't support automatic key rotation
      • The standard asymmetric encryption algorithms that KMS uses do not support an encryption context.
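The GenerateDataKey flow described above can be sketched end to end. KMS is replaced by a toy in-memory stub, and the "cipher" is a throwaway XOR keystream (never use it for real data) — the point is the envelope pattern: encrypt locally with the plaintext data key, store only the encrypted data key, and call Decrypt to unwrap it later.

```python
import hashlib, secrets

def keystream_xor(key, data):
    """Toy cipher (XOR with a SHA-256-derived keystream) — for illustration only."""
    out, counter = bytearray(), 0
    while len(out) < len(data):
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(b ^ k for b, k in zip(data, out))

class ToyKMS:
    """Stand-in for the KMS GenerateDataKey / Decrypt API actions."""
    def __init__(self):
        self._cmk = secrets.token_bytes(32)   # the KMS key never leaves "KMS"
    def generate_data_key(self):
        plaintext_key = secrets.token_bytes(32)
        # returns (Plaintext, CiphertextBlob), like the real API response fields
        return plaintext_key, keystream_xor(self._cmk, plaintext_key)
    def decrypt(self, ciphertext_blob):
        return keystream_xor(self._cmk, ciphertext_blob)

kms = ToyKMS()
# Encrypt outside of "KMS": use the plaintext data key, keep only the encrypted copy.
data_key, encrypted_data_key = kms.generate_data_key()
ciphertext = keystream_xor(data_key, b"attack at dawn")
del data_key                                  # erase the plaintext key from memory
# Decrypt later: unwrap the stored encrypted data key, then decrypt locally.
recovered_key = kms.decrypt(encrypted_data_key)
assert keystream_xor(recovered_key, ciphertext) == b"attack at dawn"
```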

KMS - Cross account access

Allowing users in other accounts to use a KMS key (opens in a new tab)

  • Cross-account access requires permission in the key policy of the KMS key and in an IAM policy in the external user's account.

    • Add a key policy statement in the local account
    • Add IAM policies in the external account
  • Cross-account permission is effective only for certain API operations

CloudHSM

AWS Config

  • By default, the configuration recorder records all supported resources in the Region where AWS Config is running.
  • AWS Config Rules (opens in a new tab)
    • AWS Config Rules represent your ideal configuration settings. AWS Config continuously tracks the configuration changes. Any resource violating a rule will be flagged as non-compliant.
  • Costs
    • You are charged service usage fees when AWS Config starts recording configurations.
    • To control costs, you can stop recording by stopping the configuration recorder. After you stop recording, you can continue to access the configuration information that was already recorded. You will not be charged AWS Config usage fees until you resume recording.

Secrets Manager

  • Automatic secrets rotation without disrupting applications

Service Catalog

Systems Manager (formerly SSM)

Automation (opens in a new tab)

  • Automation helps you to build automated solutions to deploy, configure, and manage AWS resources at scale.

Parameter Store (opens in a new tab)

  • Centralized configuration data management and secrets management
  • You can store values as plain text (String) or encrypted data (SecureString).
  • For auditing and logging, CloudTrail captures Parameter Store API calls.
  • Parameter Store uses KMS CMKs (opens in a new tab) to encrypt and decrypt the parameter values of SecureString parameters when you create or change them.
  • You can use the AWS managed CMK that Parameter Store creates for your account or specify your own customer managed CMK.

Parameter Store - Cheatsheet

Search for a parameter with a name containing the given keyword

keyword=<keyword>
aws ssm describe-parameters --parameter-filters "Key=Name,Option=Contains,Values=$keyword" \
--query 'sort_by(Parameters,&Name)[]' --output table

CloudFormation

  • Template (opens in a new tab)

    • Use a JSON or YAML file called a Template to specify a declarative, static definition of an AWS service stack.

    • The Template file must be uploaded to S3 before being used.

    • Parameters

      • Parameter Type (opens in a new tab)
      • You use the Ref intrinsic function to reference a Parameter, and AWS CloudFormation uses the Parameter's value to provision the stack. You can reference Parameter from the Resources and Outputs sections of the same template.
      • Pseudo parameters
        • Pseudo parameters are Parameters that are predefined by CloudFormation.
        • Use them the same way as you would a Parameter, as the argument for the Ref function.
        • Their names start with AWS:: such as AWS::Region.
    • Resources

      • The only mandatory section
    • Conditions

      • The optional Conditions section contains statements that define the circumstances under which entities are created or configured.
      • Other sections such as Resources and Outputs can reference the conditions defined in the Conditions section.
      • Use Condition functions (opens in a new tab) to define conditions.
    • Mappings

      • The optional Mappings section matches a key to a corresponding set of named values, essentially a Map using String as key.
      • Fn::FindInMap
        • !FindInMap [ MapName, TopLevelKey, SecondLevelKey ]
    • Outputs

      • To share information between stacks, export a stack's output values. Other stacks that are in the same AWS account and Region can import the exported values.
      • To export a stack's output value, use the Export field in the Output section of the stack's template. To import those values, use the Fn::ImportValue function in the template for the other stacks.
      • Exported output names must be unique within your Region.
    • Intrinsic function (opens in a new tab)

      • Fn::Ref (opens in a new tab)
        • The intrinsic function Ref returns the value of the specified Parameter or Resource.
        • When you Ref the logical ID of another Resource in your template, Ref returns what you could consider a default attribute for that type of Resource: Ref on an EC2 instance returns the instance ID, and Ref on an S3 bucket returns the bucket name.
      • Fn::GetAtt (opens in a new tab): The Fn::GetAtt intrinsic function returns the value of an attribute from a resource in the template.
      • Fn::FindInMap (opens in a new tab): The intrinsic function Fn::FindInMap returns the value corresponding to keys in a two-level map that is declared in the Mappings section.
      • Fn::ImportValue (opens in a new tab): The intrinsic function Fn::ImportValue returns the value of an output exported by another stack. You typically use this function to create cross-stack references.
      • Fn::Join (opens in a new tab): The intrinsic function Fn::Join appends a set of values into a single value, separated by the specified delimiter. If a delimiter is the empty string, the set of values are concatenated with no delimiter.
      • Fn::Sub (opens in a new tab): The intrinsic function Fn::Sub substitutes variables in an input string with values that you specify.
    • Helper scripts (opens in a new tab)

      • CloudFormation provides Python helper scripts that you can use to install software and start services on an EC2 instance that you create as part of your stack.
  • Stack (opens in a new tab)

    • Change set (opens in a new tab)

      • Change sets allow you to preview how proposed changes to a stack might impact your running resources.
      • Similar to a diff to the stack.
  • StackSet (opens in a new tab)

    • StackSets extends the functionality of stacks by enabling you to create, update, or delete stacks across multiple accounts and regions with a single operation.
  • CLI

    • package (opens in a new tab)

      • This command is only needed when there are local artifacts.
      • The command performs the following tasks:
        • Packages the local artifacts (local paths) that your CloudFormation template references.
        • Uploads local artifacts, such as source code for a Lambda function or a Swagger file for an API Gateway REST API, to an S3 bucket. Note it is the local artifacts being uploaded, not the template.
        • Returns a copy of your template, replacing references to local artifacts with the S3 location where the command uploaded the local artifacts.
    • deploy (opens in a new tab)

      Deploys the specified CloudFormation template by creating and then executing a change set.

  • Resources
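Tying the template sections above together, a minimal hypothetical template using Parameters, a pseudo parameter, Fn::Sub, Fn::GetAtt, and an exported Output might look like:

```yaml
AWSTemplateFormatVersion: "2010-09-09"
Parameters:
  BucketPrefix:
    Type: String
Resources:                       # the only mandatory section
  MyBucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketName: !Sub "${BucketPrefix}-${AWS::Region}"   # AWS::Region is a pseudo parameter
Outputs:
  BucketArn:
    Value: !GetAtt MyBucket.Arn  # Ref would return the bucket name instead
    Export:
      Name: my-bucket-arn        # importable in other stacks via Fn::ImportValue
```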

SQS (Simple Queue Service)

  • A queue from which consumers pull data pushed by producers.

  • Messages larger than 256 KB (opens in a new tab) must be sent with the SQS Extended Client Library for Java, which uses S3 for message storage, supporting payload sizes up to 2 GB.

  • The number of messages to retrieve (up to 10) can be specified per receive request.

  • SQS message retention period ranges from 1 minute to 14 days, by default 4 days.

  • Visibility timeout (opens in a new tab)

    • After a message is polled by a consumer, it becomes invisible to other consumers.
    • The message visibility timeout is the time the consumer has to process the message; it is 30 seconds by default.
    • If not deleted within the visibility timeout window, the message will become visible to other consumers again.
    • The ChangeMessageVisibility action can be used to prolong the visibility timeout window.
    • If the visibility timeout is too high and the consumer crashes in the meantime, reprocessing is delayed.
    • If the visibility timeout is too low, consumers may receive duplicate messages.
  • Delivery delay (opens in a new tab)

    • The delay happens before a message can be consumed.
    • If you create a delay queue, any messages that you send to the queue remain invisible to consumers for the duration of the delay period. The default (minimum) delay for a queue is 0 seconds. The maximum is 15 minutes.
  • Polling (opens in a new tab)

    • SQS provides short polling and long polling to receive messages from a queue. By default, queues use short polling.
    • Long polling decreases the number of API calls made to SQS and increases efficiency by reducing empty responses.
    • Long polling is preferable to short polling.
    • Long polling can have a wait time from 1 to 20 seconds.
  • Queue type

    • Standard queues

      • Default queue type
      • Almost unlimited throughput, up to 120,000 in-flight messages
      • at-least-once message delivery, requiring manual deduplication
      • Out-of-order message delivery
    • FIFO queue

      • Throughput: 3,000 messages / second (with batching), up to 20,000 in-flight messages
      • Queue name must end with .fifo.
      • exactly-once message delivery
      • Message ordering via message grouping
        • Ordering across groups is not guaranteed.
        • Messages that share a common message group ID will be in order within the group.
      • Deduplication
        • If you retry the SendMessage action within the 5-minute deduplication interval, SQS doesn't introduce any duplicates into the queue.
        • If a message with a particular message deduplication ID is sent successfully, any messages sent with the same message deduplication ID are accepted successfully but aren't delivered during the 5-minute deduplication interval.
        • If your application sends messages with unique message bodies, you can enable content-based deduplication.
      • Cannot subscribe to a standard SNS topic
  • Dead-letter queue (DLQ)

    • The DLQ of a FIFO queue must also be a FIFO queue.
    • The DLQ of a standard queue must also be a standard queue.
    • The DLQ and its corresponding queue must be in the same region and created by the same AWS account.
    • Redrive policy
      • The redrive policy specifies the source queue, the DLQ, and the conditions under which SQS moves messages from the former to the latter when the consumer of the source queue fails to process a message a specified number of times.
      • Whenever a consumer polls a message, its Receive count increments by 1 whether or not processing succeeds, so Receive count is essentially a receive-attempt count.
      • If a message's Receive count exceeds the specified Maximum receives, the message is sent to the specified DLQ.
      • SQS counts a message you view in the AWS Management Console against the queue's redrive policy, because viewing a message in the console polls the queue, which increments the Receive count.
  • Resources
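The redrive behavior above can be sketched as a toy model (no timers, network, or visibility timeout): every receive attempt increments the receive count, and once it exceeds Maximum receives the message moves to the DLQ.

```python
class ToyQueue:
    """Toy SQS model: receive increments ReceiveCount; exceeding maxReceiveCount redrives to the DLQ."""
    def __init__(self, max_receive_count, dlq=None):
        self.max_receive_count, self.dlq = max_receive_count, dlq
        self.messages = []   # each message: {"body": ..., "receive_count": int}

    def send(self, body):
        self.messages.append({"body": body, "receive_count": 0})

    def receive(self):
        if not self.messages:
            return None
        msg = self.messages[0]
        msg["receive_count"] += 1            # counted whether or not processing succeeds
        if msg["receive_count"] > self.max_receive_count:
            self.messages.pop(0)
            self.dlq.send(msg["body"])       # redrive to the dead-letter queue
            return None
        return msg

dlq = ToyQueue(max_receive_count=10)
q = ToyQueue(max_receive_count=2, dlq=dlq)
q.send("payment-123")
q.receive(); q.receive()        # two failed processing attempts (message never deleted)
assert q.receive() is None      # third attempt exceeds Maximum receives = 2
assert dlq.messages[0]["body"] == "payment-123"
```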

SNS

  • Max message size: 256 KB, extended client library supporting 2 GB.

SNS - Topic

  • A Topic allows multiple receivers of the message to subscribe dynamically for identical copies of the same notification.
  • By default, SNS offers 10 million subscriptions per Topic and 100,000 Topics per account.

SNS - Subscription

  • A subscriber receives messages that are published only after they have subscribed to the Topic. The Topics do not buffer messages.

  • When several SQS queues act as subscribers, a publisher sends a message to an SNS topic, which distributes identical copies of the message to all subscribed SQS queues in parallel. This pattern is called fanout.
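The fanout pattern above, sketched with in-memory stand-ins for a topic and its queue subscribers (all names hypothetical):

```python
class Topic:
    """Toy SNS topic: publish delivers an identical copy to every subscriber."""
    def __init__(self):
        self.subscribers = []
    def subscribe(self, queue):
        self.subscribers.append(queue)
    def publish(self, message):
        for queue in self.subscribers:   # fanout: each subscribed queue gets its own copy
            queue.append(message)

orders, analytics = [], []               # stand-ins for two SQS queues
topic = Topic()
topic.subscribe(orders)
topic.subscribe(analytics)
topic.publish("order-created")
assert orders == ["order-created"] and analytics == ["order-created"]
```

Note the topic does not buffer: a queue subscribed after `publish` receives nothing.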

Cognito

API Gateway

  • REST API

    • Stage variables

      • A stage is a named reference to a deployment, which is a snapshot of the API.
      • Stage variables are name-value pairs that you can define as configuration attributes associated with a deployment stage of a REST API. They act like environment variables and can be used in your API setup and mapping templates.
      • A stage variable can be used anywhere in a mapping template: ${stageVariables.<variable_name>}
    • Integration type (opens in a new tab)

      • AWS (Lambda custom integration)

        expose AWS service actions, must configure both the integration request and integration response.

      • AWS_PROXY (Lambda proxy integration)

        • This is the preferred integration type to call a Lambda function through API Gateway and is not applicable to any other AWS service actions, including Lambda actions other than the function-invoking action.

        • In Lambda proxy integration (opens in a new tab), API Gateway requires the backend Lambda function to return output according to the following JSON format.

          {
              "isBase64Encoded": true|false,
              "statusCode": httpStatusCode,
              "headers": { "headerName": "headerValue", ... },
              "multiValueHeaders": { "headerName": ["headerValue", "headerValue2", ...], ... },
              "body": "..."
          }
      • HTTP

        expose HTTP endpoints in the backend, must configure both the integration request and integration response.

      • HTTP_PROXY

        expose HTTP endpoints in the backend, but you do not configure the integration request or the integration response.

      • MOCK

        API Gateway return a response without sending the request further to the backend, useful for testing integration set up.

    • Quota (opens in a new tab)

      • Integration timeout: 50 milliseconds to 29 seconds for all integration types.
    • API Gateway responses (opens in a new tab)

      • 502 Bad Gateway

        • Usually an incompatible output returned from a Lambda proxy integration backend
        • Occasionally for out-of-order invocations due to heavy loads.
      • 504 INTEGRATION_TIMEOUT

      • 504 INTEGRATION_FAILURE

  • Canary release (opens in a new tab)

    Total API traffic is separated at random into a production release and a canary release with a pre-configured ratio.

  • Mapping template

    • A script expressed in Velocity Template Language (VTL) and applied to the payload using JSONPath expressions to perform data transformation.
  • API cache

    • API Gateway caches responses from your endpoint for a specified TTL period, in seconds.
    • Default TTL is 300 seconds, and TTL=0 means caching is disabled.
    • Client can invalidate an API Gateway cache entry by specifying Cache-Control: max-age=0 header, and authorization can be enabled to ignore unauthorized requests.
  • Throttling

    • Server-side throttling limits are applied across all clients.
    • Per-client throttling limits are applied to clients that use API keys associated with your usage plan as client identifier.
  • Usage plan

    • Uses API keys to identify API clients and meters access to the associated API stages for each key.

    • Configure throttling limits and quota limits that are enforced on individual client API keys.

    • Throttling

      • Rate

        • Number of requests per second that can be served
        • The rate is evenly distributed across a given time period.
      • Burst

        • Maximum number of concurrent request submissions that API Gateway can fulfill at any moment without returning 429 Too Many Requests error responses
        • Burst is essentially the maximum number of requests that can be queued for processing; once Burst is exceeded, requests are dropped.
      • As an analogy, imagine you are in a bank branch waiting to be served. Rate is the number of customers being served at the same time; Burst is the number of customers that can wait in a queue in the branch lobby. The queue length is limited by the lobby space, so any further customers must wait outside or come back to the branch another time.

  • Security

    • IAM (opens in a new tab)

    • Cognito user pool (opens in a new tab)

      • Authentication: Cognito user pool
      • Authorization: API Gateway methods
      • Seamless integration, no custom code needed
    • Lambda authorizer (opens in a new tab)

      • Authentication: 3rd-party (invoked by Lambda authorizer)

      • Authorization: Lambda function

      • Authorizer type

        • TOKEN authorizer

          Token-based Lambda authorizer receives the caller's identity in a bearer token, such as a JWT or an OAuth token.

        • REQUEST authorizer

          Request parameter-based Lambda authorizer receives the caller's identity in a combination of headers, query string parameters, stageVariables, and $context variables. WebSocket only supports REQUEST authorizer.

  • Metrics (opens in a new tab)

    • 4XXError

      number of client-side errors captured in a given period

    • 5XXError

      number of server-side errors captured in a given period

    • Count

      total number of API requests in a given period

    • IntegrationLatency

      the responsiveness of the backend

    • Latency

      the overall responsiveness of your API calls

    • CacheHitCount & CacheMissCount

      optimize cache capacities to achieve a desired performance.

  • CORS

    • To enable CORS support, you may or may not need to implement the CORS preflight response depending on the situation.

      • Lambda or HTTP non-proxy integrations and AWS service integrations

        Manually adding CORS response headers may be needed

      • Lambda or HTTP proxy integrations

        Manually adding CORS response headers is required

  • Resources
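The Rate and Burst semantics under Usage plan above follow the token-bucket model; a minimal sketch (a simplification of API Gateway's actual throttling, with hypothetical numbers):

```python
class TokenBucket:
    """Token bucket: capacity = Burst, refill rate = Rate (requests/second)."""
    def __init__(self, rate, burst):
        self.rate, self.burst = rate, burst
        self.tokens, self.last = float(burst), 0.0

    def allow(self, now):
        # refill tokens for the elapsed time, capped at the burst capacity
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True          # request served
        return False             # throttled: API Gateway returns 429 Too Many Requests

bucket = TokenBucket(rate=10, burst=5)
# 6 simultaneous requests: the first 5 fit in the burst, the 6th is throttled
results = [bucket.allow(now=0.0) for _ in range(6)]
assert results == [True] * 5 + [False]
assert bucket.allow(now=0.1) is True   # 0.1 s later, one token has refilled
```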

SAM

  • The declaration Transform: AWS::Serverless-2016-10-31 is required for SAM template files.

  • Globals section is unique to SAM templates.

  • Resource type

    • AWS::Serverless::Api
      • API Gateway
    • AWS::Serverless::Application
      • Embeds a serverless application
    • AWS::Serverless::Function
      • Lambda function
    • AWS::Serverless::HttpApi
      • API Gateway HTTP API
    • AWS::Serverless::LayerVersion
      • Creates a Lambda LayerVersion that contains library or runtime code needed by a Lambda Function.
    • AWS::Serverless::SimpleTable
      • A DynamoDB table with a single-attribute primary key
    • AWS::Serverless::StateMachine
      • A Step Functions state machine
  • Installation

  • Notes

    • Use SAM CLI for local Lambda function development. (sam local invoke)
    • Don't use SAM CLI for deployment as it creates additional resources.
    • Use CloudFormation for unified deployment and provisioning.
    • Use a container image for deployment but not for local development, as it's slow to build the image; IntelliJ also does not support debugging a Lambda function packaged as an image.
  • Resources

CDK (Cloud Development Kit)

  • Assets

    Assets are local files, directories, or Docker images that can be bundled into AWS CDK libraries and apps, e.g. a directory that contains the handler code for an AWS Lambda function. Assets can represent any artifact that the app needs to operate.

  • Bootstrapping

    • Deploying AWS CDK apps into an AWS environment (a combination of an AWS account and region) may require that you provision resources the AWS CDK needs to perform the deployment. These resources include an S3 bucket for storing files and IAM roles that grant permissions needed to perform deployments. The process of provisioning these initial resources is called bootstrapping.
    • cdk bootstrap aws://<Account-ID>/<Region>

Billing and Cost Management

Savings Plans (opens in a new tab)

  • Types

    • Compute
    • EC2 Instance
    • SageMaker
  • Pricing

    • No upfront
    • Partial upfront
    • All upfront

Code Samples

Java project scaffolding

  • Maven

    mvn -B archetype:generate \
        -DarchetypeGroupId=software.amazon.awssdk \
        -DarchetypeArtifactId=archetype-lambda \
        -Dservice=s3 \
        -Dregion=US_EAST_1 \
        -DgroupId=cq.aws \
        -DartifactId=playground-aws

Best Practices

  • Tagging (opens in a new tab)

    • Both keys and values are case sensitive.
    • Use Tags to index resources so they can be found easily.
    • Typical tags
      • Name
      • Project
      • Environment
      • Version
      • Owner

Resources