AWS SAP Notes 04 - Compute, Scaling and Load Balancing
aws sap

Written by Nguyễn Huy Hoàng on 10/10/2021

Regional and Global AWS Architecture

  • There are 3 main types of architectures:
    • Small scale architectures: one region/one country
    • Small architecture with DR: one region + backup region for disaster recovery
    • Multiple region based systems
  • Architectural components at global level:
    • Global Service Location and Discovery
    • Content Delivery (CDN) and optimization
    • Global health checks and Failover
  • Regional components:
    • Regional entry point
    • Scaling and resilience
    • Application services and components


EC2 Purchase Options (Launch Types)

  • On-Demand (default):
    • The middle-of-the-road option, with no specific pros or cons
    • On-demand instances are isolated, but instances of multiple customers run on shared hardware
    • Multiple instance types (different sizes) can run on the same EC2 hosts, consuming a different allocation of resources
    • Billing: per-second billing while an instance is running; if an instance is stopped, we are not billed for it
    • Associated resources such as storage still consume capacity, so we are billed for them regardless of whether the instance is running or stopped
    • We should always start the evaluation process using on-demand
    • With on-demand there are no interruptions. We start an instance and it should run as long as we don't decide to shut it down
    • In case of a capacity shortage, reserved instances receive the highest priority; consider them instead of on-demand for business-critical systems
    • On-demand offers predictable pricing without any discount options
  • Spot instances:
    • Cheapest way to get EC2 compute capacity
    • Spot pricing sells EC2 capacity at a lower price in order to make use of spare capacity on the host machines
    • If the spot price goes above selected maximum price, our instances are terminated
    • We should never use the spot instances for workloads which can't tolerate interruptions
    • Anything which can tolerate interruptions and can be re-triggered is good for spot
  • Standard Reserved Instances:
    • On-demand is generally used for unknown or short term usage, reserved is for long term consistent usage of EC2
    • Reservations:
      • They are commitments that we will use an instance or a set of instances for a longer period of time
      • The effect of a reservation is to reduce the per-second cost or remove it entirely
      • Reservation needs to be planned in advance
      • We pay for unused reservations
      • Reservations are bought for a specific instance type. They can be locked to a region or to an AZ
      • AZ-locked reservations also reserve EC2 capacity
      • Reservations can have a partial effect: if we run a larger instance than the one the reservation was purchased for, we get a discount on part of its cost
      • We can commit to reservations of 1 year or 3 years
      • Payment structures:
        • No upfront: we pay a lower per-second fee compared to on-demand. We pay even if the instance is not used
        • All upfront: we pay the whole cost of the 1 or 3 years in advance. No per-second fee is charged. Offers the greatest discount
        • Partial upfront: we pay a reduced fee upfront in exchange for a lower per-second fee
    • Reserved instances are good for components which have known usage, require consistent access for compute for a long term basis
  • Scheduled Reserved Instances:
    • Great for long-term requirements which do not run constantly, e.g. batch processing running 5 hours/day
    • For scheduled reserved instances we specify a time window. The capacity can be used only during the time window
    • Minimum purchase per year is 1200 hours, minimum commitment is 1 year
  • Dedicated Hosts:
    • They are EC2 hosts allocated to a customer entirely
    • They are hosts designed for specific instances, ex. A, C, R, etc.
    • Hosts come with all of the resources we expect from a physical machine: number of cores and CPUs, memory, local storage and connectivity
    • A dedicated host has a fixed capacity; we can launch different sizes of instances as long as capacity is available
    • Reasons for dedicated hosts: we want to use software which is licensed per core or per socket
    • Host affinity: links instances to hosts, if we stop and start the instance, it will remain on the same host
  • Dedicated Instances:
    • Our instances run on an EC2 host with other instances of ours. The host is not shared with other AWS customers
    • We don't pay for the host, nor do we share the host
    • Extra fees:
      • A one-off hourly fee for each region in which we are using dedicated instances, regardless of how many we run
      • There is a fee for the dedicated instances themselves
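
To make the trade-off between on-demand and the reservation payment structures above more concrete, here is a minimal sketch of the arithmetic. The prices are made up for illustration (they are not real AWS prices), amounts are in cents so the arithmetic stays exact, and the function names are our own.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours in a 1 year term


def on_demand_cost(hourly_rate, hours_used):
    """On-demand bills only for the hours the instance actually runs."""
    return hourly_rate * hours_used


def reserved_cost(upfront, hourly_rate, term_hours=HOURS_PER_YEAR):
    """A reservation bills the upfront fee plus the reduced hourly rate for
    every hour of the term, whether or not the instance is used."""
    return upfront + hourly_rate * term_hours


def break_even_hours(od_rate, upfront, ri_rate, term_hours=HOURS_PER_YEAR):
    """Hours of usage at which on-demand would cost as much as the reservation."""
    return reserved_cost(upfront, ri_rate, term_hours) / od_rate


# Hypothetical prices in cents: on-demand 10/hour, no-upfront reservation 6/hour
print(break_even_hours(od_rate=10, upfront=0, ri_rate=6))
```

With these made-up numbers the reservation only pays off if the instance runs for more than 5256 of the 8760 hours in the year (60%), which is why reservations suit known, consistent usage.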

Capacity Reservations

  • AWS prioritizes any scheduled commitment for delivering EC2 capacity
  • After scheduled instances on-demand is prioritized
  • The leftover capacity can be used for spot instances
  • Capacity reservation is different compared to reserved instances
  • Regional reservation provides a billing discount for valid instances launched in any AZ in that region
  • While this is flexible, regional reservations don't reserve capacity within an AZ, which is risky if capacity is limited during a major fault
  • Zonal reservation: same billing discount as for a regional reservation, but the reservation applies only to a specific AZ
  • Regional/zonal reservation commitment is 1 or 3 years
  • On-Demand capacity reservation: can be booked to ensure we always have access to capacity in an AZ when we need it, but at full on-demand price. No term limits, but we pay whether or not we consume the reservation

EC2 Savings Plan

  • An hourly commitment for a 1 or 3 year term
  • Savings Plans come in 2 different types:
    • General compute Savings Plan: a committed dollar amount per hour, we can save up to 66% versus on-demand
    • EC2 Savings Plan: up to 72% saving for EC2
  • The general compute Savings Plan currently applies to EC2, Fargate and Lambda
  • Resource usage consumes the Savings Plan commitment at the reduced Savings Plan rate; beyond the commitment, on-demand billing is used
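
The way usage consumes the commitment can be sketched as a small calculation for a single hour. This is a simplified model under our own assumptions: one flat discount for all usage, a hypothetical `hourly_bill` function, and the commitment being charged in full every hour even when it is not used up.

```python
def hourly_bill(commitment, usage_at_sp_rate, discount):
    """Split one hour of usage into the Savings Plan charge and the on-demand charge.

    commitment       -- committed spend per hour (assumed to be always charged)
    usage_at_sp_rate -- cost of the hour's usage at the Savings Plan rate
    discount         -- Savings Plan discount versus on-demand, e.g. 0.5 for 50%
    """
    covered = min(commitment, usage_at_sp_rate)
    uncovered_at_sp_rate = usage_at_sp_rate - covered
    # Usage beyond the commitment is billed at the full on-demand rate
    on_demand_charge = uncovered_at_sp_rate / (1 - discount)
    return commitment, on_demand_charge


# Hypothetical figures: $10/hour commitment, 50% discount
print(hourly_bill(10.0, 15.0, 0.5))  # usage exceeds the commitment
print(hourly_bill(10.0, 4.0, 0.5))   # usage stays below the commitment
```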

EC2 Networking

  • Instances are created with a primary ENI, this can not be removed or detached from the instance
  • Secondary ENIs can be added to an instance which can be in different subnets (NOT AZs!)
  • Secondary ENIs can be detached and attached to other instances
  • Security Groups are associated with an ENI, not with an EC2 instance
  • Every instance is allocated a primary private IPv4 address from the subnet range. This IP address remains the same for the lifetime of the EC2 instance
  • The primary IP address is exposed to the OS
  • ENIs can also have one or more secondary IP addresses depending on the instance type
  • A public IP address is allocated to the instance if we launch it in a subnet where this is enabled or if we explicitly enable a public address for the instance
  • Public IPs are not static
  • Public IPs are not visible to the OS
  • In order to get static public IP addresses, we can associate an Elastic IP to the instance
  • We can allocate one public IP per private IP
  • We get charged if the Elastic IPs are not associated to instances
  • IPv6 addresses are always visible to the OS
  • Source/destination checks: each ENI has a flag which can be disabled
  • By default the source/destination check is enabled; if disabled, the ENI can process traffic which was not created by the EC2 instance or for which the EC2 instance is not the destination

Bootstrapping and AMI Baking

  • Bootstrapping:
    • Is a way of building EC2 instances in a flexible way
    • Flexible, automated building of EC2 instances
    • We provision EC2 instances and add a script to the user data
    • cloud-init runs the script on the instance when the instance is launched
    • This process can take longer at launch, although it is very flexible
  • AMI Baking:
    • We front-load the time and effort required to configure an instance
    • We launch an EC2 instance and perform the necessary tasks from which we can create an AMI
    • We can use the AMI to deploy many instances quickly

Placement Groups

  • Allow us to influence EC2 instance placement, ensuring that instances are placed close together or kept apart
  • There are 3 types of placement groups in AWS:
    • Cluster: all instances in a single placement group are physically close
    • Spread: instances are all using different underlying hardware
    • Partition: groups of instances which are spread apart

Cluster Placement Groups

  • Used for highest possible performance
  • Best practice is to launch all of the instances at the same time which will be part of the placement group. This ensures that AWS allocates capacity in the same location
  • Cluster placement groups are located in the same AZ, when the first instance is launched, the AZ is locked
  • Ideally the instances in a cluster placement group are located on the same rack, often on the same EC2 host
  • All instances have fast bandwidth between each other (max 10 Gbps)
  • They offer the lowest latency possible and max PPS possible in AWS
  • Cluster placement group should be used for highest performance. They offer no HA and very little resilience
  • Considerations for cluster placement groups:
    • We can not span AZs, the AZ is locked when the first instance is launching
    • We can span VPC peers, but this will impact performance negatively
    • Cluster placement groups are not supported for every instance type
    • Recommended: use the same type of instances and launch them at the same time
    • They offer 10 Gbps for single stream performance

Spread Placement Groups

  • They offer the maximum possible availability and resiliency
  • They can span multiple AZs
  • Instances in the same spread placement group are located on different racks, having isolated networking and power supplies
  • There is a limit of 7 instances per AZ for spread placement groups
  • Considerations:
    • Spread placement provides infrastructure isolation
    • Hard limit: 7 instances per AZ
    • We can not use dedicated instances or hosts

Partition Placement Groups

  • Similar to spread placement groups
  • Designed for cases when we need more than 7 instances per AZ but still need separation
  • Can be created across multiple AZs in a region
  • At creation we specify the number of partitions per AZ (max 7 per AZ)
  • Each partition has its own rack with isolated power and networking
  • We can launch as many instances as we need in a partition group
  • Use cases for partition groups: HDFS, HBase, Cassandra, topology aware applications
  • Instances can be placed in a specific partition or we can let AWS decide
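
The idea behind partitions and topology-aware applications can be sketched in a few lines. This is only an illustration: the round-robin assignment is our assumption (AWS does not document how it distributes instances when we let it decide), and all function names and instance IDs are hypothetical.

```python
MAX_PARTITIONS_PER_AZ = 7


def assign_partitions(instance_ids, partition_count):
    """Spread instances over partitions round-robin (partitions are numbered from 1)."""
    if not 1 <= partition_count <= MAX_PARTITIONS_PER_AZ:
        raise ValueError("a placement group allows 1 to 7 partitions per AZ")
    placement = {p: [] for p in range(1, partition_count + 1)}
    for index, instance_id in enumerate(instance_ids):
        placement[index % partition_count + 1].append(instance_id)
    return placement


def partitions_of(placement, instance_ids):
    """Return the set of partitions that hold any of the given instances."""
    return {p for p, members in placement.items()
            for i in instance_ids if i in members}


def survives_rack_failure(placement, replica_ids):
    """Replicas survive the loss of one partition if they sit in at least two."""
    return len(partitions_of(placement, replica_ids)) >= 2


placement = assign_partitions([f"i-{n}" for n in range(10)], partition_count=3)
print(placement)
print(survives_rack_failure(placement, ["i-0", "i-1"]))
```

A topology-aware system such as HDFS or Cassandra would use the partition information in the same spirit: keep the replicas of one piece of data in different partitions, so the failure of a single rack cannot take out all copies.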

EC2 Spot Instances

  • Can get a discount of up to 90% compared to On Demand instances
  • We can define a max spot price and get the instance if our max price is higher than the current spot price
  • If the current spot price goes above our max price, we can choose to stop or terminate the instance, with a 2 minute grace period
  • If we don't want our spot instance to be reclaimed by AWS, we can use a Spot Block
    • We can block a spot instance during a specified time frame (1 to 6 hours) without interruptions
    • In rare situations the instance may be reclaimed
  • Use cases for spot instances: batch jobs or workloads that are resilient to failure
  • We can launch spot instances with a spot request. A spot request contains the following information:
    • Maximum price
    • Desired number of instances
    • Launch specification
    • Request type: one-time or persistent
    • Valid from, valid until
  • Request types:
    • One time request: as soon as the request is fulfilled, the request will go away
    • Persistent request: the number of instances is attempted to be kept even if some instances are reclaimed, meaning that the request will not go away as soon as it is completed first time
  • Cancelling a spot request: in order to cancel a spot request, it has to be in an open, active or disabled state
  • Cancelling a spot request will not terminate the instances themselves. In order to terminate the instances for good, first we have to cancel the spot request, if there is an active one, and then terminate the instances
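
The difference between one-time and persistent requests can be sketched as a tiny simulation over a series of spot prices. This is a toy model with assumptions of our own: a single instance, a hypothetical `simulate` function, no grace period, and an instance that keeps running when the price exactly equals the max price.

```python
def simulate(prices, max_price, persistent):
    """Return what happens to a single-instance spot request at each price tick."""
    running = False
    request_open = True
    events = []
    for price in prices:
        if running and price > max_price:
            # The price rose above our maximum: the instance is reclaimed
            running = False
            # Only a persistent request is re-opened after an interruption
            request_open = persistent
            events.append("interrupted")
        elif not running and request_open and price <= max_price:
            running = True
            events.append("launched")
        else:
            events.append("running" if running else "waiting")
    return events


prices = [0.03, 0.06, 0.04]  # hypothetical spot prices in $/hour
print(simulate(prices, max_price=0.05, persistent=True))
print(simulate(prices, max_price=0.05, persistent=False))
```

With the persistent request the instance comes back once the price drops again; with the one-time request nothing is launched after the interruption.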

Spot Fleets

  • Spot Fleet - set of spot instances + (optional) on-demand instances
  • The spot fleet will try to meet the target capacity with price constraints
  • Launch pools can differ in instance type, OS and AZ
  • We can have multiple launch pools, so the fleet can choose the best
  • Spot fleet will stop launching instances once the target capacity is reached
  • Strategies to allocate spot instances:
    • lowestPrice: the spot fleet will launch instances from the pool with the lowest price
    • diversified: distribute instances across all pools
    • capacityOptimized: launch instances from the pools with optimal capacity for the number of instances being launched
  • Spot fleets allow us to automatically request spot instances with the lowest price
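
The three allocation strategies can be compared with a rough sketch. The pool names, prices and the `spare` capacity figures are invented (AWS does not expose spare capacity as a number), and the `allocate` function is our own simplification that ignores price constraints and on-demand capacity.

```python
def allocate(pools, target, strategy):
    """Return how many instances each pool receives for the given strategy.

    pools -- dict: pool name -> {"price": $/hour, "spare": spare capacity}
    """
    names = sorted(pools)
    if strategy == "lowestPrice":
        chosen = min(names, key=lambda n: pools[n]["price"])
    elif strategy == "capacityOptimized":
        chosen = max(names, key=lambda n: pools[n]["spare"])
    elif strategy == "diversified":
        # Spread the target as evenly as possible over all pools
        base, extra = divmod(target, len(names))
        return {n: base + (1 if i < extra else 0) for i, n in enumerate(names)}
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return {n: (target if n == chosen else 0) for n in names}


pools = {
    "c5.large/az-a": {"price": 0.04, "spare": 10},
    "m5.large/az-a": {"price": 0.03, "spare": 5},
    "m5.large/az-b": {"price": 0.05, "spare": 50},
}
for strategy in ("lowestPrice", "diversified", "capacityOptimized"):
    print(strategy, allocate(pools, 6, strategy))
```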

ELB - Elastic Load Balancer

ELB Architecture

  • It is the job of the load balancer to accept connections from a user base and distribute them to the underlying services
  • ELBs support many different types of compute services
  • Initial configurations for ELB:
    • IPv4 or dual stack (IPv4 and IPv6)
    • We have to pick the AZs which the LB will use, specifically we are picking one subnet in each of 2 or more AZs
    • When we pick a subnet, AWS places one or more load balancer nodes in that subnet
    • When an LB is created it gets a DNS A record. The DNS name resolves to all the nodes located in multiple AZs. The nodes are HA: if a node fails, a different one is created. If the load is too high, additional nodes are created
    • We have to decide on creation if the LB is internal or internet facing (have public IP addresses or not)
  • Listener configuration: what the LB is listening to (what protocols, ports etc.)
  • An internet-facing load balancer can connect to both public and private instances
  • Minimum subnet size for an LB is /28 with 8+ free addresses per subnet (AWS suggests a minimum of /27)
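
The subnet sizing rule can be checked with Python's standard `ipaddress` module. The sketch relies on the fact that AWS reserves 5 addresses in every subnet (the first four and the last one); the function names and the example CIDR ranges are our own.

```python
import ipaddress

AWS_RESERVED_PER_SUBNET = 5  # first four addresses and the last one
LB_MIN_FREE = 8              # free addresses the load balancer needs


def usable_addresses(cidr):
    """Addresses left in a subnet after the AWS-reserved ones are removed."""
    return ipaddress.ip_network(cidr).num_addresses - AWS_RESERVED_PER_SUBNET


def can_host_lb(cidr, addresses_in_use=0):
    """True if the subnet still has enough free addresses for LB nodes."""
    return usable_addresses(cidr) - addresses_in_use >= LB_MIN_FREE


print(usable_addresses("10.0.0.0/28"))                 # 16 - 5
print(can_host_lb("10.0.0.0/28"))                      # empty /28
print(can_host_lb("10.0.0.0/28", addresses_in_use=4))  # /28 with 4 addresses taken
print(can_host_lb("10.0.0.0/27", addresses_in_use=4))  # /27 with 4 addresses taken
```

A /28 leaves only 11 usable addresses, so a handful of other resources in the subnet is enough to drop below the 8 free addresses the LB needs. This is a plausible reason for the /27 recommendation.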

Cross-Zone Load Balancing

  • Initially each LB node could distribute traffic only to instances in the same AZ
  • Cross-Zone Load Balancing: allows any LB node to distribute connections equally across all registered instances in all AZs
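
The effect of cross-zone load balancing is easiest to see with unevenly sized AZs. The sketch below assumes one LB node per AZ, DNS spreading clients evenly across the nodes, and at least one instance in every AZ; `traffic_share` and the AZ names are hypothetical.

```python
from fractions import Fraction


def traffic_share(instances_per_az, cross_zone):
    """Return, per AZ, the share of total traffic that one instance in it receives."""
    azs = list(instances_per_az)
    if cross_zone:
        # Every node spreads its connections over all registered instances
        total = sum(instances_per_az.values())
        return {az: Fraction(1, total) for az in azs}
    # Each node receives an equal share and splits it over its own AZ only
    node_share = Fraction(1, len(azs))
    return {az: node_share / count for az, count in instances_per_az.items()}


layout = {"az-a": 4, "az-b": 1}
print(traffic_share(layout, cross_zone=False))
print(traffic_share(layout, cross_zone=True))
```

Without cross-zone load balancing the single instance in az-b handles half of all traffic while each instance in az-a handles one eighth; with it enabled every instance handles one fifth.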

User Session State

  • Session state:
    • A piece of server side information specific to one single user of one application
    • It persists while the user interacts with the application
    • Examples of session state: shopping cart, workflow position, login state
  • The data representing a session state is stored either internally (on the back-end instance) or externally (stateless applications)
  • Externally hosted session:
    • Session data is hosted outside of the back-end instances => application becomes stateless
    • Makes it possible to load balance across the back-end instances: the session won't get lost if the LB redirects the user to a different instance

ELB Evolution

  • Currently there are 3 different types of LB in AWS
  • Load balancers are split between v1 and v2 (preferred)
  • LB product started with Classic Load Balancers (v1)
  • CLBs can load balance HTTP and HTTPS and lower level protocols as well, although they do not really understand the HTTP protocol
  • CLBs can have only 1 SSL certificate
  • They can not be considered a true layer 7 product
  • Application Load Balancers (ALB - v2 LB) are layer 7 products supporting HTTP(S) and WebSocket
  • Network Load Balancers (NLB) are also v2 load balancers supporting lower level protocols such as TCP, TLS and UDP

Application and Network Load Balancers

  • Consolidation of load balancers:
    • Classic load balancers do not scale well across applications: they do not support multiple SSL certificates (no SNI support) => a new load balancer is required for every application
    • v2 load balancers support rules and target groups
    • v2 load balancers can have host based rules using SNI
  • Application Load Balancer (ALB):
    • True layer 7 LB, configured to listen on either HTTP or HTTPS
    • ALB can not understand any other layer 7 protocols
    • ALB listeners must be HTTP or HTTPS
    • It can understand layer 7 content, such as cookies, custom headers, user location, app behavior, etc.
    • Any incoming connection (HTTP, HTTPS) is always terminated on the ALB - no unbroken SSL
    • All ALBs using HTTPS must have SSL certificates installed
    • ALBs are slower than NLBs because they have to process more layers of the networking stack
    • ALBs offer health check evaluation at the application layer
    • Application Load Balancer Rules:
      • Rules direct connections which arrive at a listener
      • Rules are processed in a priority order, default rule being a catch all
      • Rule conditions: host-header, http-header, http-request-method, path-pattern, query-string and source-ip
      • Rule actions: forward, redirect, fixed-response, authenticate-oidc and authenticate-cognito
    • The connection between the LB and the instance is a separate connection
  • Network Load Balancer (NLB):
    • NLBs are layer 4 load balancers, meaning they support TCP, TLS, UDP and TCP_UDP connections
    • They have no understanding of HTTP or HTTPS => no concept of network stickiness
    • They are really fast: they can handle millions of requests per second with about 25% of the latency of ALBs
    • Recommended for SMTP, SSH, game servers, financial apps (not HTTP(S))
    • Health checks by default only check the TCP handshake (HTTP and HTTPS health checks can also be configured on the target group)
    • They can be allocated with static IP addresses
    • They can forward TCP straight through to the instances => unbroken encryption
    • NLBs can be used for PrivateLink
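
The ALB rule processing described above (priority order, with the default rule as a catch-all) can be sketched as follows. Only the host-header and path-pattern conditions are modelled, wildcard matching is approximated with Python's `fnmatchcase`, and the rule dictionaries, target names and `pick_action` function are hypothetical rather than the real ALB API shape.

```python
from fnmatch import fnmatchcase


def pick_action(rules, default_action, host, path):
    """Return the action of the first matching rule, lowest priority number first."""
    for rule in sorted(rules, key=lambda r: r["priority"]):
        # A condition that is not set on the rule matches everything
        host_ok = fnmatchcase(host, rule.get("host-header", "*"))
        path_ok = fnmatchcase(path, rule.get("path-pattern", "*"))
        if host_ok and path_ok:
            return rule["action"]
    # No rule matched: the default rule catches the request
    return default_action


rules = [
    {"priority": 20, "path-pattern": "/api/*", "action": "forward:api"},
    {"priority": 10, "host-header": "admin.example.com", "action": "forward:admin"},
]
print(pick_action(rules, "forward:web", "admin.example.com", "/api/users"))
print(pick_action(rules, "forward:web", "www.example.com", "/api/users"))
print(pick_action(rules, "forward:web", "www.example.com", "/index.html"))
```

The first request matches both rules, but the rule with priority 10 is evaluated first, so it goes to the admin target group.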

Session Stickiness

  • Stickiness: allows us to control which backend instance is used for a given connection
  • With no stickiness connections are distributed across all backend services
  • Enabling stickiness:
    • CLB: we can enable it per LB
    • ALB: we can enable it per target group
  • When stickiness is enabled, the LB generates a cookie (AWSALB for ALBs, AWSELB for CLBs) which is delivered to the end-user
  • This cookie has a duration defined between 1 sec and 7 days
  • When the user accesses the LB, it provides the cookie to the LB
  • The LB can then route the connection to the same backend instance every time while the cookie has not expired
  • Cases when the backend instance changes for a client holding a cookie:
    • The instance to which the cookie maps fails: a new instance will be selected
    • The cookie expires: the cookie is removed and a new cookie is created while a new instance is chosen
  • Session stickiness problems: load can become unbalanced
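
The cookie behaviour described above can be modelled with a small class. This is a toy model, not how the load balancer is implemented: the real cookie is opaque to the client, whereas here it is a plain `(target, expires_at)` tuple, new targets are picked round-robin, and the expiry is fixed when the cookie is issued. All names are hypothetical.

```python
import itertools


class StickyBalancer:
    def __init__(self, targets, duration):
        self.healthy = list(targets)
        self.duration = duration          # cookie lifetime in seconds
        self._counter = itertools.count()

    def route(self, cookie, now):
        """Return (target, cookie) for a request arriving at time `now`."""
        if cookie is not None:
            target, expires_at = cookie
            # Stick to the same target while the cookie is valid and the target is healthy
            if now < expires_at and target in self.healthy:
                return target, cookie
        # No usable cookie: pick a target and issue a fresh cookie
        target = self.healthy[next(self._counter) % len(self.healthy)]
        return target, (target, now + self.duration)


lb = StickyBalancer(["a", "b"], duration=60)
target, cookie = lb.route(None, now=0)
print(target, lb.route(cookie, now=30)[0])  # same target while the cookie is valid
print(lb.route(cookie, now=61)[0])          # cookie expired, a new target is chosen
```

The sketch also hints at the problem mentioned above: as long as clients keep presenting valid cookies, their load stays on the same instance, no matter how busy it is.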

ASG - Auto Scaling Groups

  • Auto Scaling Groups provide auto scaling for EC2
  • Provide the ability to implement a self-healing architecture
  • ASGs make use of configurations defined in launch templates or launch configurations
  • ASGs are using one version of a launch template/configuration
  • ASGs have 3 important values defined: Minimum, Desired and Maximum size
  • ASG has one foundational job: keep the number of running instances at the desired size
  • Scaling Policies: update the desired capacity based on some metric (CPU usage, number of connections, etc.)
    • They are essentially rules defined by us which can adjust the values of an ASG
    • Scaling types:
      • Manual Scaling
      • Scheduled Scaling: scaling based on a known time window
      • Dynamic Scaling
  • Dynamic Scaling has 3 subtypes:
    • Simple Scaling: example "CPU above 50% +1", "CPU Below 50% -1"
    • Step Scaling: scaling based on the size of the difference, allowing us to react quicker
    • Target Tracking: example desired aggregate CPU = 40%. Not all metrics are supported by target tracking scaling
  • Cooldown Period: a value in seconds, controls how long to wait after a scaling action happened before starting another action
  • ASGs monitor the health of instances, by default using the EC2 health checks
  • ASG can integrate with load balancers: ASG can add/remove instances from an LB target group
  • ASG can use the LB health checks instead of the EC2 health checks
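
How scaling policies move the desired capacity between the minimum and maximum can be sketched as plain functions. The thresholds, step boundaries and function names are made up for illustration; real policies are driven by CloudWatch alarms and also honour the cooldown period, which this sketch leaves out.

```python
def clamp(desired, minimum, maximum):
    """The desired capacity can never leave the [minimum, maximum] range."""
    return max(minimum, min(desired, maximum))


def simple_scaling(desired, cpu, threshold=50):
    """Simple scaling: CPU above the threshold adds 1, below removes 1."""
    if cpu > threshold:
        return desired + 1
    if cpu < threshold:
        return desired - 1
    return desired


def step_scaling(desired, cpu, steps):
    """Step scaling: the bigger the breach, the bigger the adjustment.

    steps -- list of (cpu_lower_bound, adjustment) pairs
    """
    adjustment = 0
    for lower_bound, step_adjustment in sorted(steps):
        if cpu >= lower_bound:
            adjustment = step_adjustment
    return desired + adjustment


steps = [(50, 1), (70, 2), (90, 3)]  # hypothetical step boundaries
print(clamp(simple_scaling(2, cpu=85), minimum=1, maximum=4))
print(clamp(step_scaling(2, cpu=85, steps=steps), minimum=1, maximum=4))
```

At 85% CPU simple scaling adds a single instance while step scaling adds two at once, which is what lets it react quicker to large spikes.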

Scaling Processes

  • Launch and Terminate: if Launch is suspended, the ASG won't scale out; if Terminate is suspended, the ASG won't scale in
  • AddToLoadBalancer: controls whether instances are added to the LB
  • AlarmNotification: controls whether the ASG reacts to CloudWatch alarms
  • AZRebalance: balances instances evenly across all of the AZs
  • HealthCheck: controls if instance health checks are on/off
  • ReplaceUnhealthy: controls if instances are replaced in case they are unhealthy
  • ScheduledActions: controls if scheduled actions are on/off
  • Standby: suspends any ASG activities on a specific instance

ASG Considerations

  • ASGs are free, we pay only for the instances provisioned
  • We should use cooldowns to avoid rapid scaling
  • We should use smaller instances for granularity
  • ASG integrates with ALBs
  • ASG defines when and where, the launch template (LT) defines what

ASG Lifecycle Hooks

  • Allow us to configure custom actions which can occur during ASG actions
  • When an ASG scales out/in, instances may pause within the flow to allow execution of lifecycle hooks
  • We can specify a timeout for the lifecycle action; after the pause the system decides whether the ASG process continues or is abandoned
  • We can resume the ASG process by calling CompleteLifecycleAction
  • Lifecycle event hooks can be integrated with EventBridge, SQS or SNS
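
The pause introduced by a launch lifecycle hook can be pictured as a small state walk. This is a simplified sketch: the function and its parameters are hypothetical, the abandon path is collapsed into a single `Terminated` state, and treating `ABANDON` as the result when the timeout expires is an assumption based on the default hook configuration.

```python
def launch_with_hook(completed_with=None, default_result="ABANDON"):
    """Return the states an instance passes through while the ASG scales out.

    completed_with -- result passed to CompleteLifecycleAction ("CONTINUE" or
                      "ABANDON"), or None if the timeout expired first
    default_result -- result applied when the timeout expires
    """
    states = ["Pending", "Pending:Wait"]  # the instance pauses in the wait state
    result = completed_with if completed_with is not None else default_result
    if result == "CONTINUE":
        states += ["Pending:Proceed", "InService"]
    else:
        # The launch is abandoned and the instance does not go into service
        states.append("Terminated")
    return states


print(launch_with_hook("CONTINUE"))
print(launch_with_hook())  # nobody completed the action before the timeout
```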

