AWS SAP Notes 10 - Migrations and Extensions

Written by Nguyễn Huy Hoàng on 10/10/2021

The 6R's of Cloud Migrations

  • The 6R's are a set of 6 different strategies for migrating systems into the cloud
  • The starting point of a migration is to discover, assess and prioritize all the applications to migrate
  • Once we have done this, there are 6 different ways of doing migrations:
    • Rehosting: lift and shift
    • Replatforming: lift and shift with optimizations
    • Repurchase: migrate the application while using something newer, for example SaaS
    • Refactoring / Re-architecting: take advantage of the features offered by the cloud
    • Retire: dump the applications which are no longer needed
    • Retain: do not migrate the application because it is not worth the time and money, or it is too risky to migrate
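
The six strategies can be read as a rough decision sequence. The sketch below is only an illustration: the `App` fields and the order of the checks are assumptions made for this example, not an AWS-defined procedure.

```python
from dataclasses import dataclass


@dataclass
class App:
    """Hypothetical inventory record produced by the discovery phase."""
    name: str
    still_used: bool                # does anyone still need it?
    too_risky_to_move: bool         # complex or business-critical, leave for later
    saas_alternative: bool          # an equivalent SaaS product exists
    budget_for_redesign: bool       # time and money for a full architecture review
    can_use_managed_services: bool  # e.g. RDS instead of a self-managed database


def choose_strategy(app: App) -> str:
    """Return one of the 6R strategy names for the given application."""
    if not app.still_used:
        return "Retire"
    if app.too_risky_to_move:
        return "Retain"
    if app.saas_alternative:
        return "Repurchase"
    if app.budget_for_redesign:
        return "Refactor"
    if app.can_use_managed_services:
        return "Replatform"
    return "Rehost"


billing = App("billing", True, False, False, False, True)
print(choose_strategy(billing))  # Replatform
```

In practice the order of the checks is a judgment call; the point is only that every application ends up in exactly one of the six buckets.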


Rehosting

  • Lift and shift or migrate as is: move the application with the least amount of changes into the cloud
  • Generally used with legacy or monolithic applications
  • Reasons to do application rehosting:
    • Reduce admin overhead using IaaS
    • Potentially easier to optimize the application when it is running in the cloud compared to working with legacy tooling
    • Cost savings, for example by consuming certain types of instances
  • Negatives:
    • We won't be able to take advantage of the full Cloud offerings
    • Potentially "kicking the can down the road" - delaying what we can do today until tomorrow
  • For doing rehosting we can use VM Import/Export tools and Server Migration Service


Replatforming

  • Similar to rehosting with the addition of applying certain optimizations to the applications as part of the migration process
  • We might decide to use RDS instead of self-managed database instances
  • We might use ELBs instead of self-managed load balancers
  • We might use S3 as a backup or media storage
  • Replatforming approach brings no real negatives but also no world-changing benefits
  • Potential benefits of migration:
    • Admin overhead reduction
    • Performance benefits
    • More effective backups
    • Improved HA/FT


Repurchase

  • Unless we have a reason to use a self-managed application, we would rather use XaaS products
  • Examples of this kind of migrations:
    • MS Exchange => Microsoft 365
    • Self-managed CRM => Salesforce
    • Self-managed payroll => Xero
  • Using a managed service reduces admin overhead, costs and risks. Almost always a preferred option

Refactoring / Re-architecting

  • Requires a full review of the architecture of an application
  • The aim is to adopt cloud-native architecture and products
  • We might look at adopting service-oriented and microservice-based architectures
  • We might adopt a more API based architecture, event-driven architecture or serverless architecture
  • This approach is initially very expensive and time-consuming
  • In the long term it does offer the best benefits compared to other types of migrations


Retire

  • Systems are often running for no reason
  • Auditing their usage is often more work than leaving them to run
  • A migration is the perfect time to re-evaluate the usefulness of an application. If we don't need it, we should switch it off
  • Often saves 10% to 20% of the cost in large-scale migrations


Retain

  • Essentially do nothing
  • For some applications the usage is uncertain, with complicating factors that prevent us from either retiring them or migrating them to the cloud
  • In other cases some old applications might have some usage, but it won't be worth the effort to move them to the cloud
  • Or we might have a complex application which is better left until later
  • Or a super-important application which is risky to move
  • The best advice is to complete the migration of other applications and swing back to focus on the left-overs


DMS - Database Migration Service

  • Database migrations are complex actions to perform
  • DMS is a managed database migration service
  • It starts with a replication instance running on EC2
  • This instance runs one or more replication tasks
  • For these tasks we have to define source and destination endpoints, which point at the source and target databases. One endpoint must be on AWS!
  • Databases supported are: MySQL, Aurora, Microsoft SQL, MariaDB, MongoDB, PostgreSQL, Oracle, Azure SQL, etc.
  • Jobs can be one of 3 types:
    • Full load migrations: used to migrate existing data, simply migrates the data from source to target. Great if we can afford an outage for the source DB
    • Full load + CDC: migrates the existing data and replicates any ongoing changes
    • CDC only: designed to replicate only data changes. In some situations it might be more efficient to use other tools for the full migration and use CDC only for ongoing changes afterwards
  • DMS does not perform schema conversion itself; for this we should use the Schema Conversion Tool (SCT) provided by AWS
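
The choice between the three job types can be sketched as a small helper. The returned strings are the values the DMS API uses for a replication task's migration type (`full-load`, `full-load-and-cdc`, `cdc`); the function name and its two parameters are hypothetical and exist only for this example.

```python
def pick_migration_type(outage_acceptable: bool, existing_data_already_copied: bool) -> str:
    """Pick a DMS migration type for a replication task.

    Both inputs are assumptions of this sketch, not DMS settings.
    """
    if existing_data_already_copied:
        # Another tool moved the bulk data; DMS only replicates ongoing changes.
        return "cdc"
    if outage_acceptable:
        # The source can be taken offline, so a one-off copy is enough.
        return "full-load"
    # The source stays live: copy existing data, then keep replicating changes.
    return "full-load-and-cdc"


print(pick_migration_type(outage_acceptable=False, existing_data_already_copied=False))
# full-load-and-cdc
```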

SCT - Schema Conversion Tool

  • SCT is a standalone app used for converting one database engine to another including conversion of schema from a DB to S3
  • SCT is not used when migrating between DBs of the same type
  • SCT works with OLTP databases (MySQL, Oracle, Aurora, etc.) and OLAP databases (Teradata, Oracle, Vertica, Greenplum, etc.)

DMS and Snowball

  • Larger migrations might involve moving multi-TB databases
  • Moving data over networks takes time and consumes capacity
  • DMS is able to utilise Snowball products to migrate databases
  • Migration steps:
    1. Use SCT to extract the data locally and move it to a Snowball
    2. Ship the device back to AWS. They will load the data into an S3 bucket
    3. DMS migrates the data from S3 into the target database
    4. Change Data Capture (CDC) can capture ongoing changes; these are also written to the target database via the S3 intermediary
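
To get a feel for why a physical device can beat the network for multi-TB databases, the following sketch estimates how long a transfer would take over a given link. The 80% usable-bandwidth figure is an assumption chosen for illustration, not a measured value.

```python
def transfer_days(data_tb: float, link_mbps: float, utilisation: float = 0.8) -> float:
    """Days needed to push data_tb terabytes over a link of link_mbps megabits/s.

    utilisation is the assumed share of the link the migration may consume.
    """
    total_bits = data_tb * 1e12 * 8              # decimal terabytes to bits
    usable_bits_per_second = link_mbps * 1e6 * utilisation
    return total_bits / usable_bits_per_second / 86_400


# A 50 TB database over a 100 Mbps link at 80% utilisation:
print(round(transfer_days(50, 100), 1))  # 57.9
```

At almost two months of sustained transfer, shipping a Snowball and replicating only the changes afterwards becomes attractive.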

VM Migrations AWS <=> On-Premises

Application Discovery Service (ADS)

  • Allows us to discover on-premises infrastructure:
    • What VMs we have
    • What CPU and memory they are allocated
    • MAC addresses
    • Resource utilization
    • etc.
  • It also tracks these properties over time for more effective migration
  • ADS runs in 2 modes:
    • Agentless (Application Discovery Agentless Connector):
      • OVA appliance integrating with VMware
      • Measures performance and resource usage, information which can be obtained from outside of a VM
    • Agent-based mode: offers additional information from inside of a VM
      • Offers data gathering for network, processes, performance
      • We can see applications running on a VM
      • We can even see dependencies between VMs based on network activity
  • ADS integrates with AWS Migration Hub and Athena
  • AWS Migration Hub: tracks migrations of different types in AWS
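
Deriving dependencies between VMs from network activity boils down to grouping observed connections by source. The sketch below shows the idea on made-up data; the `(source, destination, port)` record format is an assumption for this example and not the export format of the discovery service.

```python
from collections import defaultdict


def build_dependencies(connections):
    """Map each source VM to the sorted list of VMs it opens connections to."""
    deps = defaultdict(set)
    for src, dst, _port in connections:
        if src != dst:  # ignore traffic a VM sends to itself
            deps[src].add(dst)
    return {vm: sorted(targets) for vm, targets in deps.items()}


# Hypothetical connection records gathered by agents inside the VMs
observed = [
    ("web-01", "app-01", 8080),
    ("app-01", "db-01", 3306),
    ("web-01", "app-01", 8080),
    ("app-01", "cache-01", 6379),
]
print(build_dependencies(observed))
# {'web-01': ['app-01'], 'app-01': ['cache-01', 'db-01']}
```

A map like this suggests which servers should be migrated together: here `app-01` would ideally move in the same wave as `db-01` and `cache-01`.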

Server Migration Service (SMS)

  • Used to migrate whole VMs into AWS (including OS, Data, Apps, etc.)
  • This is the tool which actually performs the migration
  • It runs in agentless mode using a connector which runs on-premises
  • Integrates with VMware, Hyper-V and Azure VMs
  • SMS does incremental replication of live volumes
  • Offers orchestration of multi-server migrations
  • Creates AMIs which can be used to create EC2 instances
  • Integrates with AWS Migration Hub

Storage Gateway

  • Normally runs as a VM on-premises (or hardware appliance)
  • Acts as bridge between storage that exists on-premises and AWS
  • Presents storage using iSCSI, NFS or SMB
  • On AWS integrates with EBS, S3 and Glacier
  • Storage Gateway is used for migrations, extensions, storage tiering, DR and replacement of backup systems

Volume Gateway

  • Offers 2 different types of operation:
    • Stored Mode:
      • The VM appliance presents volumes over iSCSI to servers
      • Servers can create file systems on these volumes and use them in the normal way
      • These volumes consume capacity on-premises
      • Storage Gateway has local storage, used as primary storage; everything is stored locally
      • Upload buffer: any data written to the local storage is also copied into the upload buffer and uploaded to the cloud asynchronously via the Storage Gateway endpoint
      • The uploaded data is copied into S3 as EBS snapshots which can be converted into EBS volumes
      • It is great for full disk backups, offering excellent RTO and RPO values
      • Stored Mode does not allow extending the data center capacity! The full copy of the data is stored locally
    • Cached Mode:
      • Cached Mode shares the same basic architecture with Stored Mode
      • The main location of data is no longer on-premises, it is on AWS S3
      • It has a local cache for the data only storing the frequently accessed data, the primary data will be in S3
      • The data will be stored in an AWS-managed area of S3, meaning it won't be visible in the S3 console. It can be viewed from the Storage Gateway console
      • The data is stored in raw block state
      • We can create EBS volumes out of the data
      • Cached Mode allows for an architecture known as datacenter extension
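
The difference between the two modes is where the primary copy lives. The toy model below illustrates Cached Mode only: a dictionary stands in for S3 and a small least-recently-used cache stands in for the local disk. The class and its methods are hypothetical, and unlike the real gateway (which uploads asynchronously through the upload buffer) this sketch writes to the backend immediately.

```python
from collections import OrderedDict


class CachedVolume:
    """Toy model of Cached Mode: primary data remote, hot blocks kept locally."""

    def __init__(self, cache_blocks: int):
        self.backend = {}            # primary copy (stands in for S3)
        self.cache = OrderedDict()   # recently used blocks kept on-premises
        self.cache_blocks = cache_blocks

    def _remember(self, block_id, data):
        self.cache[block_id] = data
        self.cache.move_to_end(block_id)
        while len(self.cache) > self.cache_blocks:
            self.cache.popitem(last=False)  # evict the least recently used block

    def write(self, block_id, data):
        self.backend[block_id] = data
        self._remember(block_id, data)

    def read(self, block_id):
        if block_id in self.cache:
            self.cache.move_to_end(block_id)  # cache hit: mark as recently used
            return self.cache[block_id]
        data = self.backend[block_id]         # cache miss: fetch the primary copy
        self._remember(block_id, data)
        return data


vol = CachedVolume(cache_blocks=2)
for block_id, data in enumerate([b"a", b"b", b"c"]):
    vol.write(block_id, data)
print(len(vol.backend), list(vol.cache))  # 3 [1, 2]
```

Three blocks are stored while only two fit locally, which is the datacenter extension idea in miniature. In Stored Mode the local side would hold all three blocks and the remote side would only receive backups.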

Tape - VTL Mode

  • VTL - Virtual Tape Library
  • Example of tape backup media: LTO-9, which can hold 18 TB of raw (uncompressed) data per tape
  • Tape Loader (Robot): robot arm can insert/remove/swap tapes
  • A Library is 1 or more drives, 1 or more loaders and slots
  • A Virtual tape can be from 100 GiB to 5 TiB
  • A Storage Gateway can handle at most 1 PB of data across 1500 virtual tapes
  • When virtual tapes are not used, they can be exported from the backup software, which marks them as no longer being in the library (the equivalent of ejecting them and moving them to offsite storage)
  • When exported, the virtual tape is archived in the Virtual Shelf which is backed by Glacier
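
The limits above can be turned into a quick sanity check for a planned set of virtual tapes. This is a minimal sketch: the function name is hypothetical, and treating the 1 PB ceiling as a binary pebibyte is an assumption of this example.

```python
GIB = 1024 ** 3
TIB = 1024 ** 4
PIB = 1024 ** 5

MIN_TAPE = 100 * GIB   # smallest virtual tape
MAX_TAPE = 5 * TIB     # largest virtual tape
MAX_TAPES = 1500       # tapes per gateway
MAX_TOTAL = 1 * PIB    # total capacity per gateway (assumed binary unit)


def validate_tape_plan(tape_sizes_bytes):
    """Return a list of problems; an empty list means the plan fits the limits."""
    problems = []
    if len(tape_sizes_bytes) > MAX_TAPES:
        problems.append("too many tapes")
    if sum(tape_sizes_bytes) > MAX_TOTAL:
        problems.append("total capacity above 1 PiB")
    for i, size in enumerate(tape_sizes_bytes):
        if not MIN_TAPE <= size <= MAX_TAPE:
            problems.append(f"tape {i} size out of range")
    return problems


print(validate_tape_plan([2 * TIB] * 100))  # []
print(validate_tape_plan([50 * GIB]))       # ['tape 0 size out of range']
```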

File Mode

  • Storage Gateway manages files in File Mode
  • File Gateway bridges on-premises file storage and S3
  • With File Gateway we create one or more mount points (shares) available via NFS or SMB
  • A File Gateway share maps directly onto an S3 bucket, and the resulting objects are visible from the AWS console
  • File Mode uses read and write caching, ensuring LAN-like performance
  • For Windows environments we can use AD authentication to access the File Gateway
  • File Mode can be used for multiple contributors (multiple shares on-premises)
  • NotifyWhenUploaded: API to notify other gateways when objects are changed
  • File Gateway does not support any kind of object locking!
  • The bucket backing the File Gateway can be used with Cross-Region Replication (CRR)
  • Lifecycle policies can also be used to move objects automatically between storage classes
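
As an illustration of the lifecycle point, the sketch below builds a lifecycle configuration in the shape S3 expects (a `Rules` list with `Filter`, `Status` and `Transitions`). The rule ID, the day counts and the helper name are assumptions for this example; the dictionary would still have to be applied to the bucket separately.

```python
def lifecycle_rules(ia_after_days: int = 30, glacier_after_days: int = 90) -> dict:
    """Build an S3 lifecycle configuration that tiers objects over time."""
    if glacier_after_days <= ia_after_days:
        raise ValueError("the Glacier transition must come after the IA transition")
    return {
        "Rules": [
            {
                "ID": "tier-file-gateway-objects",  # hypothetical rule name
                "Filter": {"Prefix": ""},           # empty prefix: whole bucket
                "Status": "Enabled",
                "Transitions": [
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                ],
            }
        ]
    }


config = lifecycle_rules()
print([t["StorageClass"] for t in config["Rules"][0]["Transitions"]])
# ['STANDARD_IA', 'GLACIER']
```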

Snowball and Snowmobile

  • The Snow series of products is designed to move large amounts of data in or out of AWS
  • The products in Snow series are physical storage units: suitcases and trucks
  • We can order them empty, load them up and return them or vice-versa


Snowball

  • It is a device ordered from AWS: we log a job and the device is delivered to us
  • Any data stored in Snowball is encrypted using KMS
  • There are 2 types of devices with 50TB and 80TB capacity
  • In terms of network connectivity we can have 1Gbps (RJ45 1GBase-TX) or 10Gbps (LR/SR) networking
  • Economical range for a Snowball is 10TB to 10PB range of data (multiple devices can be used)
  • Multiple devices can be ordered and be sent to multiple business premises
  • Snowball only includes storage capability

Snowball Edge

  • Includes both storage capability and compute capability
  • It has a larger capacity compared to the classic Snowball and has faster network connectivity
  • There are 3 different types of Snowball Edge:
    • Storage optimized (with EC2 capability): 80 TB, 24 vCPU, 32 GiB RAM, 1 TB SSD
    • Compute optimized: 100 TB + 7.68 TB NVMe, 52 vCPU, 208 GiB RAM
    • Compute optimized with GPU: 100 TB + 7.68 TB NVMe, 52 vCPU, 208 GiB RAM, GPU
  • Ideal for remote sites or where data processing on ingestion is needed


Snowmobile

  • Portable data center within a shipping container on a truck
  • Needs to be specially ordered from AWS
  • Ideal for single location when 10 PB+ is required
  • Can store up to 100PB of data per Snowmobile
  • Not economical for multi-site or sub 10PB
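
The economical ranges quoted in this section can be summarised as a rule of thumb. The sketch below encodes them directly; the function names are hypothetical, falling back to a network transfer below 10 TB is an assumption (the notes only give the lower bound for Snowball), and the device count uses the raw 80 TB capacity rather than usable capacity.

```python
import math


def suggest_snow_option(data_tb: float, sites: int) -> str:
    """Suggest a transfer option from the data volume (in TB) and number of sites."""
    if data_tb < 10:
        return "network transfer"   # assumption: too small for a device
    if data_tb >= 10_000 and sites == 1:
        return "Snowmobile"         # 10 PB+ in a single location
    return "Snowball"               # 10 TB to 10 PB, or multi-site


def snowballs_needed(data_tb: float, device_tb: float = 80) -> int:
    """Number of devices needed, using raw capacity as a rough estimate."""
    return math.ceil(data_tb / device_tb)


print(suggest_snow_option(500, sites=3), snowballs_needed(500))  # Snowball 7
```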

AWS DataSync

  • It is a data transfer service which allows data to be transferred into or out of AWS
  • Can be used for workflows such as migrations, data processing transfers, archival, cost effective storage, DR/BC
  • Each agent can handle 10Gbps transfer speed, each job can handle 50 million files
  • It also handles the transfer of metadata (permissions, timestamps)
  • It provides built in data validation

Key Features

  • Scalable: 10Gbps per agent (~100TB of data per day)
  • Bandwidth Limiters: used to avoid link saturation
  • Incremental and scheduled transfer options
  • Compression and encryption
  • Automatic recovery from transit errors
  • Service integration: S3, EFS, FSx, service-to-service transfer
  • Pay as you use service: per GB of data transferred
  • The DataSync agent runs on a virtualization platform such as VMware
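
The "~100TB of data per day" figure follows from the 10Gbps per-agent limit. A short calculation, assuming the link is fully saturated for 24 hours and using decimal units:

```python
def tb_per_day(gbps: float) -> float:
    """Terabytes moved in 24 hours at a sustained rate of gbps gigabits/s."""
    bytes_per_second = gbps * 1e9 / 8
    return bytes_per_second * 86_400 / 1e12


print(tb_per_day(10))  # 108.0
```

The theoretical ceiling is therefore about 108 TB per day; real transfers, especially with a bandwidth limiter configured, will land below that.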

DataSync Components

  • Task: a job within DataSync, defines what is being synced, how quickly, from where and to where
  • Agent: software used to read or write to on-premises data stores using NFS or SMB
  • Location: every task has two locations from and to, examples: Network File System (NFS), Server Message Block (SMB), Amazon EFS, Amazon FSx and Amazon S3

