AWS SAP Notes 06 - Databases
aws sap

Written by Nguyễn Huy Hoàng on 10/10/2021

RDS - Relational Database Service

  • RDS is often described as a Database-as-a-service (DBaaS)
  • RDS provides managed database instances, which can themselves hold one or more databases
  • The benefit of RDS is that we don't need to manage the physical hardware, the server operating system or the database system itself
  • RDS supports MySQL, MariaDB, PostgreSQL, Oracle, Microsoft SQL Server
  • Amazon Aurora: a database engine created by AWS, which we can also select

RDS Database Instance

  • Runs one of the few types of db engine mentioned above
  • Can contain multiple user created databases
  • A database instance after creation can be accessed using its hostname (CNAME)
  • RDS instances come in various types and share many features with EC2 instances. Examples of instance types: db.m5, db.r5, db.t3
  • RDS instances can be single AZ or multi AZ (active-passive failover)
  • When an instance is provisioned, it also gets dedicated storage allocated (usually EBS)
  • Storage allocated can be based on SSD storage (IO1, GP2) or magnetic (mainly for compatibility)
  • Billing for RDS:
    • We are billed based on instance size at an hourly rate
    • We are also billed for storage (GB/month), plus an extra charge per IOPS in case of provisioned IOPS (IO1)
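
The billing model above can be sketched as a small estimate function. This is a minimal sketch, not an official pricing formula: the function name, the 730-hour month and every rate used in the example call are assumptions chosen for illustration, not real AWS prices.

```python
def estimate_rds_monthly_cost(
    instance_hourly_rate: float,
    storage_gb: float,
    storage_rate_gb_month: float,
    provisioned_iops: int = 0,
    iops_rate_month: float = 0.0,
    hours_in_month: int = 730,  # assumption: average hours in a month
) -> float:
    """Rough monthly estimate: compute + storage + provisioned IOPS."""
    compute = instance_hourly_rate * hours_in_month
    storage = storage_gb * storage_rate_gb_month
    iops = provisioned_iops * iops_rate_month  # only relevant for io1 storage
    return round(compute + storage + iops, 2)


# Hypothetical rates: 0.10 per instance hour, 0.10 per GB-month.
print(estimate_rds_monthly_cost(0.10, 100, 0.10))
```

Passing `provisioned_iops` and `iops_rate_month` adds the per-IOPS charge that applies to provisioned IOPS storage.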

RDS Multi AZ

  • Used to add resilience to an RDS instance
  • Multi AZ is an option which can be enabled on an RDS instance. When enabled, secondary hardware is allocated in another AZ (the standby replica)
  • RDS enables synchronous replication between primary and standby instances
  • RDS is accessed via the provided endpoint address (CNAME)
  • With a single instance the endpoint address points to the instance itself. With multi AZ, the endpoint points to the primary instance by default
  • We cannot directly access the standby instance
  • If an error occurs with the primary instance, RDS automatically changes the endpoint to point to the standby replica. This failover takes around 60-120 seconds
  • Multi AZ is not available in the Free-tier (it generally costs double the price of single AZ)
  • Backups are taken from the standby instance (removes performance impact)
  • Failovers can be triggered by:
    • AZ outage
    • Primary instance failure
    • Manual failover
    • Instance type change
    • Software patching

RDS Backups and Restores

  • RPO (Recovery Point Objective): time between the last working backup and the failure. The lower the RPO value, the more expensive the solution usually is
  • RTO (Recovery Time Objective): time between the failure and the system being fully recovered. Can be reduced with spare hardware, predefined processes, etc. The lower the RTO value, the more expensive the system usually is
  • RDS backup types:
    • Manual snapshots:
      • Have to be run manually
      • The first snapshot is a full copy of the DB, later snapshots are incremental
      • When a snapshot occurs, there is a brief interruption in the flow of data
      • Manual snapshots do not expire
      • When we delete an RDS instance, AWS offers to make one final snapshot
    • Automatic backups:
      • Snapshots which occur automatically, the first being a full snapshot, incremental afterwards
      • Every 5 minutes transaction logs are written to S3
      • Automatic backups are not retained indefinitely: we can set the retention period between 0 and 35 days (0 disables automatic backups)
      • Automatic backups can be retained after a DB is deleted, but they still expire after the retention period
  • Backups are stored in AWS-managed S3 buckets
  • When we restore from a backup, RDS creates a new instance
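
To make the RPO and RTO definitions above concrete, here is a minimal sketch that measures both for a single incident. The function name and the timestamps are hypothetical; the 4 minute gap only illustrates that, with transaction logs shipped every 5 minutes, the last restorable point is at most about 5 minutes old.

```python
from datetime import datetime, timedelta


def recovery_metrics(last_backup: datetime, failure: datetime, recovered: datetime):
    """Return (data-loss window, downtime) for one incident."""
    rpo_achieved = failure - last_backup  # data written in this window is lost
    rto_achieved = recovered - failure    # how long the system was unavailable
    return rpo_achieved, rto_achieved


failure = datetime(2021, 10, 10, 12, 0)
rpo, rto = recovery_metrics(
    last_backup=failure - timedelta(minutes=4),
    failure=failure,
    recovered=failure + timedelta(minutes=45),
)
print("RPO achieved:", rpo)
print("RTO achieved:", rto)
```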

RDS Read-Replicas

  • Provide 2 main benefits: performance and availability
  • Read replicas are read-only replicas of an RDS instance
  • They can only be used for read operations
  • The primary instance and the read replica are kept in sync using asynchronous replication
  • Because replication is asynchronous, there can be a small amount of replication lag
  • Read replicas can be created in a different AZ or a different region (CRR - Cross-Region Replication)
  • We can have 5 direct read-replicas per DB instance
  • Each read-replica provides an additional instance of read performance
  • Read-replicas can also have read-replicas, but lag starts to be a problem in this case
  • Read-replicas can provide global performance improvements
  • Snapshots and backups improve RPO but not RTO. Read-replicas offer near 0 RPO
  • Read-replicas can be promoted to primary in case of a failure. This offers low RTO as well (lags of minutes)
  • Read-replicas can replicate data corruption

Data Security

  • With all the RDS engines we can use encryption in transit (SSL/TLS). This can be set to be mandatory
  • For encryption at rest, RDS supports EBS volume encryption using KMS. It is handled by the host and EBS, and it is invisible to the database engine
  • We can use a customer managed or an AWS managed CMK to generate the data keys for encryption at rest
  • Storage, logs and snapshots will be encrypted with the same customer master key
  • Encryption cannot be removed after it is activated
  • In addition to encryption at rest, MSSQL and Oracle support TDE (Transparent Data Encryption) - encryption at the database engine level
  • Oracle supports TDE with CloudHSM, offering much stronger key controls
  • IAM authentication with RDS:
    • Normally login is controlled with local database users (username/password)
    • We can configure RDS to allow IAM authentication (authentication only, not authorization: authorization is still handled inside the database!)

Aurora Architecture

  • Aurora uses a Cluster:
    • Made up of a single primary instance and 0 or more replicas
    • The replicas can be used for reads (unlike the standby replica in RDS)
    • Storage: the cluster uses a shared cluster volume (SSD based by default). Provides faster provisioning, improved availability and better performance
    • The storage has 6 replicas across AZs. The data is replicated synchronously. Replication happens at the storage level
    • Self-healing: Aurora can repair its data if a replica or part of the replica fails
    • With Aurora we can have up to 15 replicas, and any of them can be a failover target
    • Billing for Aurora storage:
      • Storage is billed based on what we consume, up to the 64 TiB limit
      • High water mark: we are billed for the maximum amount of storage used at any point in time. If we free up data, we are still billed for that maximum
    • Aurora clusters use endpoints, providing multiple endpoints:
      • Cluster endpoint: always points to the primary instance, can be used for reads and writes
      • Reader endpoint: load balances read connections across the read replicas. If the cluster has no replicas, it points to the primary instance
      • Custom endpoint: created by us
  • Aurora cost:
    • No free-tier option, Aurora does not support micro instances
    • Beyond RDS single AZ (micro), Aurora offers better value compared to the other RDS options
    • Compute: hourly rate, billed per second, 10 minute minimum
    • Storage: GB/month consumed, plus an IO cost per request
    • Backups: backup storage up to 100% of the DB size is included
  • Aurora restore, clone and backtrack:
    • Backups in Aurora work the same way as in the other RDS engines
    • Restores create a new cluster
    • Backtrack can be enabled per cluster. It allows in-place rewinds to a previous point in time
    • Fast clone: makes a new database much faster than copying all the data (the clone references the original storage and only stores the modifications made afterwards)
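
The high water mark billing described in these notes can be sketched as a running maximum over the storage usage. This is only an illustration of the idea as stated above, not AWS's actual metering; the function name and the usage figures are made up.

```python
def billed_storage_gib(usage_history_gib):
    """Billed storage after each measurement under high-water-mark billing."""
    billed = []
    high_water_mark = 0
    for used in usage_history_gib:
        # Freeing data lowers usage, but never lowers the billed amount.
        high_water_mark = max(high_water_mark, used)
        billed.append(high_water_mark)
    return billed


# Usage grows to 250 GiB, shrinks, then grows past the old peak.
print(billed_storage_gib([100, 250, 180, 120, 300]))
# -> [100, 250, 250, 250, 300]
```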

Aurora Serverless

  • It provides a version of Aurora without the need to statically provision the database instance
  • Removes admin overhead for managing db instances
  • Aurora Serverless uses the concept of ACU - Aurora Capacity Units: represent a certain amount of compute and a corresponding amount of memory
  • We can set minimum and maximum ACU values per cluster, the minimum can go down to 0
  • Consumption billing is on a per-second basis
  • Aurora Serverless provides the same resilience as Aurora provisioned (6 copies across AZs)
  • The Aurora cluster architecture still exists, but instead of provisioned instances we have capacity units
  • Capacity units are allocated from a warm pool of Aurora instances managed by AWS
  • ACUs are stateless, shared across multiple AWS customers
  • If the load increases beyond what the allocated ACUs can handle and the pool allows it, then more ACUs are allocated, up to the maximum we set
  • In Aurora Serverless we have shared proxy fleet for connection management
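
The minimum/maximum ACU limits and the per-second billing can be sketched as follows. This is a simplified model, assuming capacity follows demand instantly; the function names and the demand figures are hypothetical, and real scaling happens in steps rather than second by second.

```python
def allocated_acu(demand_acu, min_acu, max_acu):
    """Capacity the cluster would run at for a given demand, within its limits."""
    return max(min_acu, min(demand_acu, max_acu))


def billed_acu_seconds(demand_per_second, min_acu, max_acu):
    """Per-second consumption billing: sum the allocated ACUs over each second."""
    return sum(allocated_acu(d, min_acu, max_acu) for d in demand_per_second)


# Hypothetical ACUs needed in six consecutive seconds; 12 is capped at the maximum of 8.
demand = [0, 0, 2, 6, 12, 4]
print(billed_acu_seconds(demand, min_acu=0, max_acu=8))  # -> 20
```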

Aurora Multi-Master

  • In contrast with default mode for Aurora, multi-master offers multiple endpoints which can be used for reads and writes
  • There is no cluster endpoint to use, the application is responsible for connecting to the instances in the cluster
  • There is no load balancing for the node endpoints


DynamoDB

  • NoSQL, wide column, Database-as-a-Service (DBaaS) product
  • DynamoDB can handle key/value data or document data
  • It requires no self-managed servers or infrastructure
  • Supports a range of scaling options:
    • Provisioned performance, scaled in/out manually or automatically
    • On-Demand mode
  • DynamoDB is highly resilient across AZs and optionally globally
  • DynamoDB is really fast, provides single-digit millisecond data retrieval
  • Provides automatic backups, point-in-time recovery and encryption at rest
  • Supports event-driven integration: actions can be triggered when data is modified inside a table

DynamoDB Tables

  • A table in DynamoDB is a grouping of items which share the same primary key structure
  • Primary key can be a simple primary key (Partition Key - PK) or composite primary key (Partition Key + Sort Key - SK)
  • In a table there are no limits to the number of items
  • In case of composite keys, the combination of PK and SK should be unique
  • Besides the primary key, items can have other data, named attributes
  • Every item can be different as long as it has the same primary key structure
  • An item can be at most 400 KB
  • DynamoDB can be configured with provisioned and on-demand capacity (capacity = speed)
  • For provisioned capacity, we have to set:
    • Write Capacity Units (WCU): 1 WCU = 1 KB per second
    • Read Capacity Units (RCU): 1 RCU = 4 KB per second

DynamoDB Backups

  • On-demand backups:
    • Full copy of the table is retained until the backup is removed
    • On-demand backups can be used to restore data and config to the same region or cross-region
    • We can retain or remove indexes
    • We can adjust encryption settings
  • Point-in-time Recovery:
    • Not enabled by default, has to be enabled
    • It is a continuous record of changes
    • Allows replay to any point in the window (35 days recovery window)
    • From this 35 day window we can restore to another table with a 1 second granularity

DynamoDB Operation, Consistency and Performance

  • We can choose between two different capacity modes at table creation: on-demand and provisioned
  • We can switch between these capacity modes afterwards, with some restrictions
  • On-demand capacity mode:
    • Designed for unknown, unpredictable load
    • Requires low administration
    • We don't have to explicitly set capacity settings, all handled by DynamoDB
    • We pay a price per million read or write units
  • Provisioned capacity mode:
    • We set the RCU/WCU per table
  • Every operation consumes at least 1 RCU/WCU
  • 1 RCU is one 4 KB read operation per second for strongly consistent reads, or two 4 KB read operations per second for eventually consistent reads
  • 1 WCU is one 1 KB write operation per second
  • Every table has an RCU and WCU burst pool (300 seconds)
  • DynamoDB operations:
    • Query:
      • When a query is performed we need to provide a partition key
      • A query can return 0 or more items, but we have to specify the partition key every time
      • We can specify which attributes we want to be returned, but we will be charged for querying the whole item anyway
    • Scan:
      • Least efficient operation, but the most flexible
      • Scan moves through a table consuming the capacity of every item
      • Any attribute can be used and any filters can be applied, but scan will consume the capacity for every item scanned through
  • DynamoDB can operate using two different consistency modes:
    • Eventually consistent
    • Strongly consistent
  • DynamoDB replicates data across AZs using storage nodes. One of the storage nodes is elected as the leader node
  • Writes are directed to the leader node
  • The leader node replicates data to the other nodes, typically finishing within a few milliseconds
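
The difference between Query and Scan described above can be sketched with an in-memory list standing in for a table. This is a toy model, not the DynamoDB API: the table contents, the function names and the "items read" counter are all made up to show which items each operation has to read (and therefore pay capacity for).

```python
TABLE = [
    {"pk": "user#1", "sk": "order#001", "total": 20},
    {"pk": "user#1", "sk": "order#002", "total": 75},
    {"pk": "user#2", "sk": "order#001", "total": 75},
    {"pk": "user#3", "sk": "order#001", "total": 10},
]


def query(table, pk):
    """Model of Query: only the items under one partition key are read."""
    matched = [item for item in table if item["pk"] == pk]
    return matched, len(matched)  # (results, items read)


def scan(table, predicate):
    """Model of Scan: every item is read, the filter is applied afterwards."""
    matched = [item for item in table if predicate(item)]
    return matched, len(table)  # every item counts, matched or not


orders, read_by_query = query(TABLE, "user#1")
expensive, read_by_scan = scan(TABLE, lambda item: item["total"] == 75)
print(len(orders), read_by_query)     # 2 results, 2 items read
print(len(expensive), read_by_scan)   # 2 results, 4 items read
```

Both calls return two items, but the scan had to read the whole table to find them.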

WCU/RCU Calculation

  • Example: we need to store 10 items per second, 2.5 KB average size per item
    • WCU required: ROUND UP(item size / 1 KB) = 3 per item, multiplied by the average writes per second (10) => WCU required = 30
  • Example: we need to retrieve 10 items per second, 2.5 KB average size per item
    • RCU required: ROUND UP(item size / 4 KB) = 1 per item, multiplied by the average reads per second (10) => strongly consistent reads = 10 RCU, eventually consistent reads = 5 RCU
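
The two calculations above can be written as small helper functions. The function names are my own; the rounding rules are the ones stated in these notes (1 KB per WCU, 4 KB per RCU, eventually consistent reads cost half).

```python
import math


def wcu_required(item_size_kb: float, writes_per_second: int) -> int:
    """Each write costs ceil(size / 1 KB) WCU."""
    return math.ceil(item_size_kb / 1) * writes_per_second


def rcu_required(item_size_kb: float, reads_per_second: int,
                 strongly_consistent: bool = True) -> int:
    """Each strongly consistent read costs ceil(size / 4 KB) RCU."""
    units = math.ceil(item_size_kb / 4) * reads_per_second
    # Eventually consistent reads need half the capacity.
    return units if strongly_consistent else math.ceil(units / 2)


print(wcu_required(2.5, 10))                             # -> 30
print(rcu_required(2.5, 10))                             # -> 10
print(rcu_required(2.5, 10, strongly_consistent=False))  # -> 5
```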

DynamoDB Indexes

  • Indexes are a way to improve the efficiency of data retrieval from DynamoDB
  • Indexes are alternative views on table data, allowing the query operation to work in ways that it couldn't otherwise
  • Local Secondary Indexes allow us to create a view using a different sort key, Global Secondary Indexes allow us to create a view using a different partition and sort key

Local Secondary Indexes (LSI)

  • Must be created with the base table!
  • We can have at most 5 LSIs per base table
  • LSIs use an alternative sort key with the same partition key
  • They share the same RCU and WCU with the main table
  • Attributes which can be projected into LSIs: ALL, KEYS_ONLY, INCLUDE (we can specifically pick which attributes to include)
  • Indexes are sparse: only items which have a value for the index's alternative sort key are added to the index

Global Secondary Indexes (GSI)

  • They can be created at any time
  • There is a default limit of 20 GSIs per base table
  • We can define different partition and sort keys
  • GSIs have their own RCU and WCU allocations
  • Attributes projected into the index: ALL, KEYS_ONLY, INCLUDE
  • GSIs are also sparse, only items which have values in the new PK and optional SK are added to the index
  • GSIs are always eventually consistent, the data is replicated asynchronously from the main table

LSI and GSI Considerations

  • We have to be careful with the projection, more capacity is consumed if we project unnecessary attributes
  • If we don't project a specific attribute and then require it when querying the index, the data has to be fetched from the main table, which makes the query inefficient
  • AWS recommends using GSIs as default, LSI only when strong consistency is required
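
The "sparse" behaviour of both index types can be sketched in plain Python. This is a toy model of the idea, not how DynamoDB builds indexes internally; the function name, the items and the attribute names are hypothetical.

```python
from collections import defaultdict


def build_sparse_index(items, index_pk, index_sk=None):
    """Group items by the index partition key, skipping items without the key attributes."""
    index = defaultdict(list)
    for item in items:
        if index_pk not in item:
            continue  # no value for the index partition key: not indexed
        if index_sk is not None and index_sk not in item:
            continue  # no value for the index sort key: not indexed
        index[item[index_pk]].append(item)
    return dict(index)


orders = [
    {"pk": "user#1", "sk": "order#001", "status": "open"},
    {"pk": "user#1", "sk": "order#002"},  # no "status" attribute
    {"pk": "user#2", "sk": "order#001", "status": "open"},
]
by_status = build_sparse_index(orders, index_pk="status")
print(len(by_status["open"]))  # -> 2, the item without "status" is left out
```

A sparse index like this can be used on purpose: only the items we care about (here, orders with a status) take up space and capacity in the index.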

DynamoDB Streams and Triggers

  • A stream is a time ordered list of item changes in a DynamoDB table
  • A stream is a 24H rolling window
  • Streams have to be enabled on a per-table basis
  • Streams record inserts, updates and deletes
  • We can create different view types influencing what is in the stream
  • Available view types:
    • KEYS_ONLY: the stream will only record the partition key and, where applicable, the sort key of the items which changed
    • NEW_IMAGE: stores the entire item with the new state after the change
    • OLD_IMAGE: stores the entire state of the item before the change
    • NEW_AND_OLD_IMAGES: stores the before and after states of the item in case of a change
  • In some cases the new/old state recorded can be empty, for example in case of a deletion the new state of an item is blank
  • Streams are the foundation for database triggers
  • An item change inside a table generates an event, which contains the data which changed
  • An action is taken using that data when the event occurs
  • We can use streams together with Lambda to react to changes and events
  • Streams and triggers are useful for data aggregation, messaging, notifications, etc.
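
What each view type puts into the stream can be sketched with a small function. This is a simplified model, not the real record format: actual stream records carry more metadata, and only the `Keys`, `NewImage` and `OldImage` field names and the view type names (such as `NEW_AND_OLD_IMAGES`) are taken from the DynamoDB API. The function name and sample items are hypothetical.

```python
VIEW_TYPES = {"KEYS_ONLY", "NEW_IMAGE", "OLD_IMAGE", "NEW_AND_OLD_IMAGES"}


def stream_record(view_type, keys, old_item, new_item):
    """Build a simplified stream record for one item change."""
    if view_type not in VIEW_TYPES:
        raise ValueError(f"unknown view type: {view_type}")
    record = {"Keys": keys}  # keys are always present
    if view_type in ("NEW_IMAGE", "NEW_AND_OLD_IMAGES") and new_item is not None:
        record["NewImage"] = new_item
    if view_type in ("OLD_IMAGE", "NEW_AND_OLD_IMAGES") and old_item is not None:
        record["OldImage"] = old_item
    return record


keys = {"pk": "user#1"}
before = {"pk": "user#1", "visits": 1}

# A delete has no new state, so only the old image is recorded.
print(stream_record("NEW_AND_OLD_IMAGES", keys, old_item=before, new_item=None))
```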

DynamoDB Accelerator (DAX)

  • It is an in-memory cache directly integrated with DynamoDB
  • DAX operates within a VPC, designed to be deployed in multiple AZs in a VPC
  • DAX is a cluster service, nodes are placed in different AZs. There is a primary node, from which data is replicated to the replica nodes
  • DAX maintains 2 different caches:
    • Items cache: holds results of (Batch)GetItem calls
    • Query cache: holds the collection of items based on query/scan parameters
  • DAX is accessed via an endpoint. This endpoint load balances across nodes
  • Cache hits are returned in microseconds, cache misses in milliseconds
  • When writing data to DynamoDB, DAX uses write-through caching: the data is written to the cache at the same time as it is written to the DB
  • DAX is not suitable for applications requiring strongly consistent reads
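
Write-through caching, as DAX uses it, can be sketched with a dictionary in front of another dictionary. This is a minimal model of the caching pattern only, not the DAX client: the class name is made up, a plain `dict` stands in for the DynamoDB table, and there is no eviction or expiry.

```python
class WriteThroughCache:
    """Toy item cache: writes go to the table and the cache together."""

    def __init__(self, table: dict):
        self.table = table  # stands in for the DynamoDB table
        self.cache = {}
        self.hits = 0
        self.misses = 0

    def put(self, key, item):
        self.table[key] = item  # write to the database...
        self.cache[key] = item  # ...and to the cache at the same time

    def get(self, key):
        if key in self.cache:
            self.hits += 1      # served from memory
            return self.cache[key]
        self.misses += 1        # fall back to the table
        item = self.table.get(key)
        if item is not None:
            self.cache[key] = item  # keep it for the next read
        return item


dax = WriteThroughCache(table={"user#1": {"name": "An"}})
dax.get("user#1")                  # miss: loaded from the table
dax.get("user#1")                  # hit
dax.put("user#2", {"name": "Binh"})
dax.get("user#2")                  # hit: the write already populated the cache
print(dax.hits, dax.misses)        # -> 2 1
```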

DynamoDB Global Tables

  • Global tables provide multi-master cross-region replication
  • Tables are created in multiple regions and added to the same global table (becoming replica tables)
  • DynamoDB uses last writer wins for conflict resolution
  • We can read and write to any region, updates are generally replicated within a second
  • Strongly consistent reads are only supported in the same region as writes
  • Global tables provide global HA and global DR/BC
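
Last writer wins conflict resolution can be sketched in one line: among conflicting versions of the same item, the one with the latest write time is kept. The function name, the `updated_at` attribute and the sample data are hypothetical; DynamoDB tracks the write time internally.

```python
def last_writer_wins(versions):
    """Pick the version of an item with the most recent write timestamp."""
    return max(versions, key=lambda version: version["updated_at"])


# The same item was updated in two regions before replication caught up.
conflicting = [
    {"region": "us-east-1", "updated_at": 1633860000.120, "email": "old@example.com"},
    {"region": "ap-southeast-1", "updated_at": 1633860000.450, "email": "new@example.com"},
]
print(last_writer_wins(conflicting)["email"])  # -> new@example.com
```

The earlier write is silently discarded, so applications which cannot tolerate that need to avoid writing the same item from multiple regions at once.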

DynamoDB TTL

  • TTL = Time-to-Live
  • In order to use TTL we have to enable it on a table and select a specific attribute for the TTL
  • The attribute should contain a number representing an epoch timestamp (in seconds)
  • A per-partition process runs periodically, comparing the current time to the value in the TTL attribute
  • Items where the TTL attribute is older than the current time are marked as expired
  • Another per-partition background process scans for expired items and removes them from tables and indexes, adding a delete event to the stream if streams are enabled
  • These processes run in the background without affecting the performance of the table and without any additional charge
  • We can create a dedicated stream linked to the TTL processes, having 24h rolling window for deletes
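
Setting and checking a TTL value can be sketched as follows. This is a minimal sketch: the attribute name `expires_at` and both function names are hypothetical, and the expiry check only mirrors the comparison described above (the real check runs inside DynamoDB).

```python
import time


def ttl_epoch(seconds_from_now, now=None):
    """Epoch timestamp (whole seconds) to store in the TTL attribute."""
    now = time.time() if now is None else now
    return int(now) + seconds_from_now


def is_expired(item, ttl_attribute, now=None):
    """True if the item's TTL attribute is a number older than the current time."""
    now = time.time() if now is None else now
    value = item.get(ttl_attribute)
    return isinstance(value, (int, float)) and value < now


# A session item that should disappear one hour after it is written.
session = {"pk": "session#42", "expires_at": ttl_epoch(3600)}
print(is_expired(session, "expires_at"))                           # -> False
print(is_expired(session, "expires_at", now=time.time() + 7200))   # -> True
```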

AWS ElasticSearch - ES

  • Is a managed implementation of ElasticSearch (open source search solution)
  • ElasticSearch is part of the ELK stack (ElasticSearch, Logstash, Kibana)
  • ElasticSearch is not serverless, it runs in a VPC using compute
  • ES can be an alternative to AWS-native services for the use cases below
  • Can be used for log analytics, monitoring, security analytics, full text search and indexing, clickstream analytics

ELK Stack

  • ElasticSearch: search and indexing services
  • Kibana: visualization and dashboard tool
  • Logstash: similar to CloudWatch Logs, needs a Logstash agent installed on every source we want to ingest data from

Amazon Athena

  • It is a serverless interactive querying service
  • We can take data stored in S3 and perform ad-hoc queries on the data paying only for the data consumed
  • Athena uses a process named schema-on-read: a table-like translation applied when the data is read
  • Original data in S3 is never changed, it remains in its original form. It is translated to the predefined schema when it is read for processing
  • Formats supported by Athena: XML, JSON, CSV/TSV, AVRO, PARQUET, ORC, Apache, CloudTrail, VPC Flowlogs, etc. Supports standard formats of structured data, semi-structured and unstructured data
  • "Tables" are defined in advance in a data catalog and data is projected through when read. It allows SQL-like queries on data without transforming source data
  • Athena has no infrastructure. We don't need to set up anything in advance
  • Athena is ideal for situations where loading/transforming data isn't desired
  • It is preferred for querying AWS logs: VPC Flow Logs, CloudTrail, ELB logs, cost reports, etc
  • Athena Federated Query: Athena now supports querying data sources other than S3. Requires a data source connector (AWS Lambda)
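
Schema-on-read can be sketched with the standard library: the raw text is stored untouched, and a schema is applied only while reading it. This illustrates the concept only and is not how we query Athena (which uses SQL); the sample log lines, the schema and the function name are made up.

```python
import csv
import io

# Raw data as it would sit in S3: untyped text, never modified.
RAW = "2021-10-10,GET,200,512\n2021-10-10,POST,500,128\n"

# The "table" definition: column names and types, kept separately from the data.
SCHEMA = [("day", str), ("method", str), ("status", int), ("bytes", int)]


def read_with_schema(raw_text, schema):
    """Apply the schema to each row while reading, leaving the source unchanged."""
    for row in csv.reader(io.StringIO(raw_text)):
        yield {name: cast(value) for (name, cast), value in zip(schema, row)}


# Roughly: SELECT * FROM logs WHERE status >= 500
errors = [row for row in read_with_schema(RAW, SCHEMA) if row["status"] >= 500]
print(errors)
```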

Amazon Neptune

  • Neptune is a managed graph database in AWS
  • A graph database is a database type where the relationships between the data are as important as the data itself
  • Neptune runs in a VPC, private by default
  • It is resilient, it can be deployed in multiple AZs and scales via read replicas
  • It does continuous backups and allows point-in-time recovery
  • Common use cases for graph based data models:
    • Social media databases - anything involving fluid relationships
    • Fraud prevention
    • Recommendation engines
    • Network and IT Operations
    • Biology and other life sciences
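
Why relationships matter for a use case like recommendation engines can be sketched with a tiny graph in plain Python. This is not Neptune code (Neptune is queried with graph query languages such as Gremlin or SPARQL); the graph, the names and the function are hypothetical.

```python
# Who follows whom: the relationships are the data we query.
FOLLOWS = {
    "alice": {"bob", "carol"},
    "bob": {"dave"},
    "carol": {"dave", "erin"},
    "dave": set(),
    "erin": set(),
}


def recommend(graph, user):
    """Suggest accounts followed by the people the user follows, most common first."""
    direct = graph.get(user, set())
    counts = {}
    for friend in direct:
        for candidate in graph.get(friend, set()):
            if candidate != user and candidate not in direct:
                counts[candidate] = counts.get(candidate, 0) + 1
    # Sort by number of shared connections, then by name for a stable order.
    return sorted(counts, key=lambda name: (-counts[name], name))


print(recommend(FOLLOWS, "alice"))  # -> ['dave', 'erin']
```

In a relational database this "two hops away" question needs self-joins, which get more expensive with every extra hop; a graph database is built to traverse such relationships directly.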

Amazon Quantum Ledger Database - QLDB

  • Part of the AWS blockchain family of products
  • It is an immutable, append-only ledger database
  • It provides a cryptographically verifiable transaction log
  • It is transparent: full history is always accessible
  • It is a serverless product, it provides Ledgers and Tables. We have no servers to manage
  • It is resilient across 3 AZs, replicating data within each of those AZs
  • It can stream any changes to the data into Amazon Kinesis in real time
  • It uses a document database model, storing JSON documents
  • Provides ACID transactions
  • Use cases for QLDB:
    • Anything related to finance: account balances and transactions
    • Medical applications: the full history of data changes matters
    • Logistics: track movement of objects
    • Legal: track the usage and change of data (custody)
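
The idea behind a cryptographically verifiable, append-only log can be sketched with a simple hash chain: every entry stores the hash of the previous one, so changing old data breaks the chain. This is a minimal illustration of the concept, not QLDB's actual journal format or API; the function names and the sample documents are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(previous_hash, document):
    """SHA-256 over the previous hash plus the document's canonical JSON."""
    payload = json.dumps(document, sort_keys=True)
    return hashlib.sha256((previous_hash + payload).encode()).hexdigest()


def append_entry(ledger, document):
    """Append a document, linking it to the hash of the entry before it."""
    previous_hash = ledger[-1]["hash"] if ledger else GENESIS_HASH
    ledger.append({
        "document": document,
        "previous_hash": previous_hash,
        "hash": entry_hash(previous_hash, document),
    })


def verify(ledger):
    """Recompute every hash in order; any edit to history makes this fail."""
    previous_hash = GENESIS_HASH
    for entry in ledger:
        expected = entry_hash(previous_hash, entry["document"])
        if entry["previous_hash"] != previous_hash or entry["hash"] != expected:
            return False
        previous_hash = entry["hash"]
    return True


ledger = []
append_entry(ledger, {"account": "A", "balance": 100})
append_entry(ledger, {"account": "A", "balance": 70})
print(verify(ledger))                          # -> True
ledger[0]["document"]["balance"] = 1_000_000   # tamper with history
print(verify(ledger))                          # -> False
```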

