Migrating to Google Cloud SQL using Database Migration Service
Overview
Database Migration Service provides options for one-time and continuous jobs to migrate data to Cloud SQL using different connectivity options, including IP allowlists, VPC peering, and reverse SSH tunnels.
In this blog, I will detail all the steps needed to migrate a stand-alone PostgreSQL database (running on a virtual machine) to Cloud SQL for PostgreSQL using a continuous Database Migration Service job and VPC peering for connectivity.
Migrating a database via Database Migration Service requires some preparation of the source database, including creating a dedicated user with replication rights, adding the pglogical database extension to the source databases, and granting that user the required rights to the schemata and tables in the databases to be migrated, as well as to the postgres database.
After we create and run the migration job, we confirm that an initial copy of our database has been successfully migrated to our Cloud SQL for PostgreSQL instance. We also explore how continuous migration jobs apply data updates from our source database to our Cloud SQL instance. To conclude the migration job, we promote the Cloud SQL instance to be a stand-alone database for reading and writing data.
Objectives
After reading this blog, we will have learned how to configure a continuous Database Migration Service job to migrate databases from a PostgreSQL instance to Cloud SQL for PostgreSQL. Specifically, we will:
- Prepare the source database for migration. For demo purposes, we will migrate the orders and gmemegen_db databases.
- Create a profile for a source connection to a PostgreSQL instance (e.g., stand-alone PostgreSQL).
- Configure connectivity between the source and destination database instances using VPC peering.
- Configure firewall and database access rules to allow access to the source database for migration.
- Create, run, and verify a continuous migration job using Database Migration Service.
- Promote the destination instance (Cloud SQL for PostgreSQL) to be a stand-alone database for reading and writing data.
Setup
Activate Cloud Shell
Cloud Shell is a virtual machine that is loaded with development tools. It offers a persistent 5GB home directory and runs on Google Cloud. Cloud Shell provides command-line access to our Google Cloud resources.
- Click Activate Cloud Shell at the top of the Google Cloud console.
When we are connected, we are already authenticated, and the project is set to our Project_ID. The output contains a line that declares the Project_ID for this session:
Our Cloud Platform project in this session is set to “PROJECT_ID”
gcloud is the command-line tool for Google Cloud. It comes pre-installed on Cloud Shell and supports tab-completion.
- (Optional) We can list the active account name with this command:
gcloud auth list
- Click Authorize.
Output:
ACTIVE: *
ACCOUNT: "ACCOUNT"
To set the active account, run:
$ gcloud config set account `ACCOUNT`
- (Optional) We can list the project ID with this command:
gcloud config list project
Output:
[core]
project = "PROJECT_ID"
Verify that the Database Migration API is enabled
- In the Google Cloud console, enter Database Migration API in the top search bar. Click on the result for Database Migration API.
This page will either show status information or give us the option to enable the API.
- If necessary, enable the API.
Verify that the Service Networking API is enabled
The Service Networking API is required to configure Cloud SQL to support VPC peering and connections over a private IP address.
- In the Google Cloud Console, enter Service Networking API in the top search bar. Click on the result for Service Networking API.
This page will either show status information or give us the option to enable the API.
- If necessary, enable the API.
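Alternatively, both APIs can be enabled directly from Cloud Shell. The commands below use the standard service names for these APIs:
gcloud services enable datamigration.googleapis.com
gcloud services enable servicenetworking.googleapis.com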
Step 1. Prepare the source database for migration
In this step, we will add the supporting features to the source database that Database Migration Service requires in order to perform a migration. These are:
- Installing and configuring the pglogical database extension.
- Configuring the stand-alone PostgreSQL database to allow access from Cloud Shell and Cloud SQL.
- Adding the pglogical database extension to the postgres, orders, and gmemegen_db databases on the stand-alone server.
- Creating a migration_admin user (with replication permissions) for database migration and granting that user the required permissions on schemata and relations.
Upgrade the database with the pglogical extension
In this step we will download and add the pglogical database extension to the postgres, orders, and gmemegen_db databases on the postgresql-vm VM instance.
- In the Google Cloud Console, on the Navigation menu, click Compute Engine > VM instances.
- In the entry for postgresql-vm, under Connect, click SSH.
- If prompted, click Authorize.
- In the terminal in the new browser window, install the pglogical database extension:
sudo apt install postgresql-13-pglogical
Note: pglogical is a logical replication system implemented entirely as a PostgreSQL extension. Fully integrated, it requires no triggers or external programs. This alternative to physical replication is a highly efficient method of replicating data using a publish/subscribe model for selective replication.
- Download and apply some additions to the PostgreSQL configuration files (to enable pglogical extension) and restart the postgresql service:
sudo su - postgres -c "gsutil cp gs://cloud-training/Migration/pg_hba_append.conf ."
sudo su - postgres -c "gsutil cp gs://cloud-training/Migration/postgresql_append.conf ."
sudo su - postgres -c "cat pg_hba_append.conf >> /etc/postgresql/13/main/pg_hba.conf"
sudo su - postgres -c "cat postgresql_append.conf >> /etc/postgresql/13/main/postgresql.conf"
sudo systemctl restart postgresql@13-main
In pg_hba.conf, these commands added a rule to allow access from all hosts:
#MIGRATION - allow access to all hosts
host all all 0.0.0.0/0 md5
In postgresql.conf, these commands set the minimal configuration required for pglogical and configure the server to listen on all addresses:
#MIGRATION - added configuration for pglogical database extension
wal_level = logical # minimal, replica, or logical
max_worker_processes = 10 # one per database needed on provider node
# one per node needed on subscriber node
max_replication_slots = 10 # one per node needed on provider node
max_wal_senders = 10 # one per node needed on provider node
shared_preload_libraries = 'pglogical'
max_wal_size = 1GB
min_wal_size = 80MB
listen_addresses = '*' # what IP address(es) to listen on, '*' is all
The above code snippets were appended to the relevant files and the PostgreSQL service restarted.
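As an optional sanity check, we can confirm that the appended settings were picked up after the restart by querying the running server:
sudo su - postgres -c "psql -c 'SHOW wal_level;'"
sudo su - postgres -c "psql -c 'SHOW shared_preload_libraries;'"
The first command should return logical and the second should include pglogical.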
- Launch the psql tool:
sudo su - postgres
psql
- Add the pglogical database extension to the postgres, orders, and gmemegen_db databases:
\c postgres;
CREATE EXTENSION pglogical;
\c orders;
CREATE EXTENSION pglogical;
\c gmemegen_db;
CREATE EXTENSION pglogical;
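To confirm that the extension was registered, we can list the installed extensions in any of the databases (an optional check):
\c orders;
\dx pglogical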
- List the PostgreSQL databases on the server:
\l
Here we can see, besides the default PostgreSQL databases, the orders and gmemegen_db databases provided for demo purposes. We will not use the gmemegen_db database in this step, but we will include it in the migration for use in a later step.
List of databases:
Name | Owner | Encoding | Collate | Ctype | Access privileges
-------------+----------+----------+---------+---------+-----------------------
gmemegen_db | postgres | UTF8 | C.UTF-8 | C.UTF-8 |
orders | postgres | UTF8 | C.UTF-8 | C.UTF-8 |
postgres | postgres | UTF8 | C.UTF-8 | C.UTF-8 |
template0 | postgres | UTF8 | C.UTF-8 | C.UTF-8 | =c/postgres +
| | | | | postgres=CTc/postgres
template1 | postgres | UTF8 | C.UTF-8 | C.UTF-8 | =c/postgres +
| | | | | postgres=CTc/postgres
(5 rows)
Create the database migration user
In this step we will create a dedicated user for managing database migration.
- In psql, enter the commands below to create a new user with the replication role:
CREATE USER migration_admin PASSWORD 'DMS_1s_cool!';
ALTER DATABASE orders OWNER TO migration_admin;
ALTER ROLE migration_admin WITH REPLICATION;
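As an optional check, psql's \du metacommand will display the new role; its attributes should include Replication:
\du migration_admin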
Assign permissions to the migration user
In this step we will assign the necessary permissions to the migration_admin user to enable Database Migration Service to migrate our database.
- In psql, grant permissions on the pglogical schema and tables for the postgres database:
\c postgres;
GRANT USAGE ON SCHEMA pglogical TO migration_admin;
GRANT ALL ON SCHEMA pglogical TO migration_admin;
GRANT SELECT ON pglogical.tables TO migration_admin;
GRANT SELECT ON pglogical.depend TO migration_admin;
GRANT SELECT ON pglogical.local_node TO migration_admin;
GRANT SELECT ON pglogical.local_sync_status TO migration_admin;
GRANT SELECT ON pglogical.node TO migration_admin;
GRANT SELECT ON pglogical.node_interface TO migration_admin;
GRANT SELECT ON pglogical.queue TO migration_admin;
GRANT SELECT ON pglogical.replication_set TO migration_admin;
GRANT SELECT ON pglogical.replication_set_seq TO migration_admin;
GRANT SELECT ON pglogical.replication_set_table TO migration_admin;
GRANT SELECT ON pglogical.sequence_state TO migration_admin;
GRANT SELECT ON pglogical.subscription TO migration_admin;
- In psql, grant permissions on the pglogical schema and tables for the orders database:
\c orders;
GRANT USAGE ON SCHEMA pglogical TO migration_admin;
GRANT ALL ON SCHEMA pglogical TO migration_admin;
GRANT SELECT ON pglogical.tables TO migration_admin;
GRANT SELECT ON pglogical.depend TO migration_admin;
GRANT SELECT ON pglogical.local_node TO migration_admin;
GRANT SELECT ON pglogical.local_sync_status TO migration_admin;
GRANT SELECT ON pglogical.node TO migration_admin;
GRANT SELECT ON pglogical.node_interface TO migration_admin;
GRANT SELECT ON pglogical.queue TO migration_admin;
GRANT SELECT ON pglogical.replication_set TO migration_admin;
GRANT SELECT ON pglogical.replication_set_seq TO migration_admin;
GRANT SELECT ON pglogical.replication_set_table TO migration_admin;
GRANT SELECT ON pglogical.sequence_state TO migration_admin;
GRANT SELECT ON pglogical.subscription TO migration_admin;
- In psql, grant permissions on the public schema and tables for the orders database:
GRANT USAGE ON SCHEMA public TO migration_admin;
GRANT ALL ON SCHEMA public TO migration_admin;
GRANT SELECT ON public.distribution_centers TO migration_admin;
GRANT SELECT ON public.inventory_items TO migration_admin;
GRANT SELECT ON public.order_items TO migration_admin;
GRANT SELECT ON public.products TO migration_admin;
GRANT SELECT ON public.users TO migration_admin;
- In psql, grant permissions on the pglogical schema and tables for the gmemegen_db database:
\c gmemegen_db;
GRANT USAGE ON SCHEMA pglogical TO migration_admin;
GRANT ALL ON SCHEMA pglogical TO migration_admin;
GRANT SELECT ON pglogical.tables TO migration_admin;
GRANT SELECT ON pglogical.depend TO migration_admin;
GRANT SELECT ON pglogical.local_node TO migration_admin;
GRANT SELECT ON pglogical.local_sync_status TO migration_admin;
GRANT SELECT ON pglogical.node TO migration_admin;
GRANT SELECT ON pglogical.node_interface TO migration_admin;
GRANT SELECT ON pglogical.queue TO migration_admin;
GRANT SELECT ON pglogical.replication_set TO migration_admin;
GRANT SELECT ON pglogical.replication_set_seq TO migration_admin;
GRANT SELECT ON pglogical.replication_set_table TO migration_admin;
GRANT SELECT ON pglogical.sequence_state TO migration_admin;
GRANT SELECT ON pglogical.subscription TO migration_admin;
- In psql, grant permissions on the public schema and tables for the gmemegen_db database:
GRANT USAGE ON SCHEMA public TO migration_admin;
GRANT ALL ON SCHEMA public TO migration_admin;
GRANT SELECT ON public.meme TO migration_admin;
The source databases are now prepared for migration. The permissions we have granted to the migration_admin user are all that is required for Database Migration Service to migrate the postgres, orders and gmemegen_db databases.
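Before moving on, we can optionally spot-check the grants with psql's privilege metacommands, for example:
\c orders;
\dp public.users
\dn+ pglogical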
Make the migration_admin user the owner of the tables in the orders database, so that we can edit the source data later, when we test the migration.
- In psql, run the following commands:
\c orders;
\dt
ALTER TABLE public.distribution_centers OWNER TO migration_admin;
ALTER TABLE public.inventory_items OWNER TO migration_admin;
ALTER TABLE public.order_items OWNER TO migration_admin;
ALTER TABLE public.products OWNER TO migration_admin;
ALTER TABLE public.users OWNER TO migration_admin;
\dt
List of relations
Schema | Name | Type | Owner
--------+----------------------+-------+-------
public | distribution_centers | table | migration_admin
public | inventory_items | table | migration_admin
public | order_items | table | migration_admin
public | products | table | migration_admin
public | users | table | migration_admin
(5 rows)
- Exit psql and the postgres user session:
\q
exit
Step 2. Create a Database Migration Service connection profile for a stand-alone PostgreSQL database
In this step, we will create a connection profile for the PostgreSQL source instance.
Get the connectivity information for the PostgreSQL source instance
In this step, we identify the internal IP address of the source database instance that we will migrate to Cloud SQL.
- In the Google Cloud Console, on the Navigation menu, click Compute Engine > VM instances.
- Locate the line with the instance called postgresql-vm.
- Copy the value for Internal IP (e.g., 10.128.0.2).
Create a new connection profile for the PostgreSQL source instance
A connection profile stores information about the source database instance (e.g., stand-alone PostgreSQL) and is used by Database Migration Service to migrate data from the source to our destination Cloud SQL database instance. After we create a connection profile, it can be reused across migration jobs.
In this step we will create a new connection profile for the PostgreSQL source instance.
- In the Google Cloud Console, on the Navigation menu, click Database Migration > Connection profiles.
- Click + Create Profile.
- For Database engine, select PostgreSQL.
- For Connection profile name, enter postgres-vm.
- For Hostname or IP address, enter the internal IP for the PostgreSQL source instance that we copied in the previous step (e.g., 10.128.0.2).
- For Port, enter 5432.
- For Username, enter migration_admin.
- For Password, enter DMS_1s_cool!.
- For Region, select the region we have chosen for the demo.
- For all other values leave the defaults.
- Click Create.
A new connection profile named postgres-vm will appear in the Connection profiles list.
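For scripted setups, a roughly equivalent profile can also be created with the gcloud database-migration command group. A minimal sketch, assuming the region chosen for the demo and the internal IP copied above (verify the flag names against your installed SDK version):
gcloud database-migration connection-profiles create postgresql postgres-vm \
  --region=<REGION> \
  --display-name=postgres-vm \
  --host=10.128.0.2 \
  --port=5432 \
  --username=migration_admin \
  --password='DMS_1s_cool!'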
Step 3. Create and start a continuous migration job
When we create a new migration job, we first define the source database instance using a previously created connection profile. Then we create a new destination database instance and configure connectivity between the source and destination instances.
In this step, we use the migration job interface to create a new Cloud SQL for PostgreSQL database instance and set it as the destination for the continuous migration job from the PostgreSQL source instance.
Create a new continuous migration job
In this step we will create a new continuous migration job.
- In the Google Cloud Console, on the Navigation menu, click Database Migration > Migration jobs.
- Click + Create Migration Job.
- For Migration job name, enter vm-to-cloudsql.
- For Source database engine, select PostgreSQL.
- For Destination region, select the region we chose before.
- For Destination database engine, select Cloud SQL for PostgreSQL.
- For Migration job type, select Continuous.
Leave the defaults for the other settings.
- Click Save & Continue.
Define the source instance
In this step, we will define the source instance for the migration.
- For Source connection profile, select postgres-vm.
Leave the defaults for the other settings.
- Click Save & Continue.
Create the destination instance
In this step, we will create the destination instance for the migration.
- For Destination Instance ID, enter postgresql-cloudsql.
- For Password, enter supersecret!.
- For Choose a Cloud SQL edition, select Enterprise.
- For Database version, select Cloud SQL for PostgreSQL 13.
- In the Choose region and zone section, select Single zone and select a zone as the primary zone.
- For Instance connectivity, select Private IP and Public IP.
- Select Use an automatically allocated IP range.
Leave the defaults for the other settings.
- Click Allocate & Connect.
Note: This step may take a few minutes. If asked to retry the request, click the Retry button to refresh the Service Networking API.
When this step is complete, an updated message notifies us that the instance will use the existing managed service connection.
We will need to edit the pg_hba.conf file on the VM instance to allow access from the IP range that was automatically allocated in the previous step. We will do this later, before testing the migration configuration at the end of this step.
Enter the additional information needed to create the destination instance on Cloud SQL.
- For Machine shapes, check 1 vCPU, 3.75 GB.
- For Storage type, select SSD.
- For Storage capacity, select 10 GB.
- Click Create & Continue.
If prompted to confirm, click Create Destination & Continue. A message will state that our destination database instance is being created. Continue to the next step while we wait.
Define the connectivity method
In this step, we will define the connectivity method for the migration.
- For Connectivity method, select VPC peering.
- For VPC, select default.
VPC peering is configured by Database Migration Service using the information provided for the VPC network (the default network in this example).
When we see an updated message that the destination instance was created, proceed to the next step.
- Click Configure & Continue.
Allow access to the postgresql-vm instance from the automatically allocated IP range
In this step we will edit the pg_hba.conf PostgreSQL configuration file to allow the Database Migration Service to access the stand-alone PostgreSQL database.
- Get the allocated IP address range. In the Google Cloud Console, on the Navigation menu, right-click VPC network > VPC network peering and open it in a new tab.
- Click on the servicenetworking-googleapis-com entry.
- In the Imported routes tab, select and copy the Destination IP range (e.g., 10.107.176.0/24). A gcloud alternative for this lookup is sketched at the end of this section.
- In the Terminal session on the VM instance, edit the pg_hba.conf file as follows:
sudo nano /etc/postgresql/13/main/pg_hba.conf
- On the last line of the file:
#MIGRATION - allow access to all hosts
host all all 0.0.0.0/0 md5
replace the "all IP addresses" range (0.0.0.0/0) with the range copied above.
#MIGRATION - allow access to all hosts
host all all 10.107.176.0/24 md5
Note: The above step is not required to make the migration work, but it is good practice: it makes the source database more secure during the migration process and also restricts access after the migration, when the migrated database becomes the source of truth.
- Save and exit the nano editor with Ctrl-O, Enter, Ctrl-X
- Restart the PostgreSQL service to make the changes take effect. In the VM instance Terminal session:
sudo systemctl restart postgresql@13-main
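As an alternative to the console lookup above, the automatically allocated peering range can also be listed from Cloud Shell. A sketch, assuming the allocation was created with the purpose VPC_PEERING on the default network:
gcloud compute addresses list --global \
  --filter="purpose=VPC_PEERING" \
  --format="value(name,address,prefixLength)"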
Test and start the continuous migration job
In this step, we will test and start the migration job.
- In the Database Migration Service tab we opened earlier, review the details of the migration job.
- Click Test Job.
- After a successful test, click Create & Start Job.
If prompted to confirm, click Create & Start.
Review the status of the continuous migration job
In this step, we will confirm that the continuous migration job is running.
- In the Google Cloud Console, on the Navigation menu(), click Database Migration > Migration jobs.
- Click the migration job vm-to-cloudsql to see the details page.
- Review the migration job status.
- If we have not started the job, the status will show as Not started. We can choose to start or delete the job.
- After the job has started, the status will show as Starting and then transition to Running Full dump in progress to indicate that the initial database dump is in progress.
- After the initial database dump has been completed, the status will transition to Running CDC in progress to indicate that continuous migration is active.
When the job status changes to Running CDC in progress, proceed to the next step.
Note: Continuous migration jobs remain in a running status to ensure that the destination database continues to receive data updates from the source.
A completed status is achieved after the destination database is promoted to be a stand-alone database for reading and writing data, which we will do in the final step.
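The job status can also be polled from Cloud Shell. A sketch, assuming the region chosen earlier and the state field exposed by the Database Migration API:
gcloud database-migration migration-jobs describe vm-to-cloudsql \
  --region=<REGION> --format="value(state)"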
Step 4. Confirm the data in Cloud SQL for PostgreSQL
Check the PostgreSQL databases in Cloud SQL
- In the Google Cloud Console, on the Navigation menu, click Databases > SQL.
- Expand the instance ID called postgresql-cloudsql-master.
- Click on the instance postgresql-cloudsql (PostgreSQL read replica).
- In the Replica Instance menu, click Databases.
Notice that the databases called postgres, orders and gmemegen_db have been migrated to Cloud SQL.
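The same check can also be performed from Cloud Shell with gcloud:
gcloud sql databases list --instance=postgresql-cloudsql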
Connect to the PostgreSQL instance
- In the Replica Instance menu, click Overview.
- Scroll down to the Connect to this instance section and click Open Cloud Shell.
The command to connect to PostgreSQL will pre-populate in Cloud Shell:
gcloud sql connect postgresql-cloudsql --user=postgres --quiet
- Run the pre-populated command.
If prompted, click Authorize for the API.
- When prompted for a password, which we previously set, enter:
supersecret!
We have now activated the PostgreSQL interactive console for the destination instance.
Review the data in the Cloud SQL for PostgreSQL instance
- To select the database in the PostgreSQL interactive console, run the following command:
\c orders;
- When prompted for a password, enter:
supersecret!
- Query the distribution_centers table:
select * from distribution_centers;
(Output)
longitude | latitude | name | id
-----------+----------+---------------------------------------------+----
-89.9711 | 35.1174 | Memphis TN | 1
-87.6847 | 41.8369 | Chicago IL | 2
-95.3698 | 29.7604 | Houston TX | 3
-118.25 | 34.05 | Los Angeles CA | 4
-90.0667 | 29.95 | New Orleans LA | 5
-73.7834 | 40.634 | Port Authority of New York/New Jersey NY/NJ | 6
-75.1667 | 39.95 | Philadelphia PA | 7
-88.0431 | 30.6944 | Mobile AL | 8
-79.9333 | 32.7833 | Charleston SC | 9
-81.1167 | 32.0167 | Savannah GA | 10
- Exit the PostgreSQL interactive console by typing:
\q
Update stand-alone source data to test continuous migration
- In Cloud Shell, type the following commands to connect to the source PostgreSQL instance:
export VM_NAME=postgresql-vm
export PROJECT_ID=$(gcloud config list --format 'value(core.project)')
export POSTGRESQL_IP=$(gcloud compute instances describe ${VM_NAME} \
  --zone=<ZONE> --format="value(networkInterfaces[0].accessConfigs[0].natIP)")
echo $POSTGRESQL_IP
psql -h $POSTGRESQL_IP -p 5432 -d orders -U migration_admin
Note: The above is an alternative approach to accessing the stand-alone database on the VM instance.
- When prompted for a password, enter:
DMS_1s_cool!
- In psql, enter the following commands:
\c orders;
insert into distribution_centers values(-80.1918,25.7617,'Miami FL',11);
- Close the interactive psql session:
\q
Connect to the Cloud SQL PostgreSQL database to check that updated data has been migrated
- In Cloud Shell, type the following commands to connect to the destination Cloud SQL PostgreSQL instance:
gcloud sql connect postgresql-cloudsql --user=postgres --quiet
- When prompted for a password, which we previously set, enter the password for the Cloud SQL instance:
supersecret!
We have now activated the PostgreSQL interactive console for the destination instance.
Review the data in the Cloud SQL for PostgreSQL database
- In Cloud Shell, select the active database in the PostgreSQL interactive console:
\c orders;
- When prompted for a password, which we previously set, enter:
supersecret!
- Query the distribution_centers table:
select * from distribution_centers;
(Output)
longitude | latitude | name | id
-----------+----------+---------------------------------------------+----
-89.9711 | 35.1174 | Memphis TN | 1
-87.6847 | 41.8369 | Chicago IL | 2
-95.3698 | 29.7604 | Houston TX | 3
-118.25 | 34.05 | Los Angeles CA | 4
-90.0667 | 29.95 | New Orleans LA | 5
-73.7834 | 40.634 | Port Authority of New York/New Jersey NY/NJ | 6
-75.1667 | 39.95 | Philadelphia PA | 7
-88.0431 | 30.6944 | Mobile AL | 8
-79.9333 | 32.7833 | Charleston SC | 9
-81.1167 | 32.0167 | Savannah GA | 10
-80.1918 | 25.7617 | Miami FL | 11
Note that the new row added to the stand-alone orders database is now present in the migrated database.
- Exit the PostgreSQL interactive console:
\q
Step 5. Promote Cloud SQL to be a stand-alone instance for reading and writing data
- In the Google Cloud Console, on the Navigation menu, click Database Migration > Migration jobs.
- Click the migration job name vm-to-cloudsql to see the details page.
- Click Promote.
If prompted to confirm, click Promote.
When the promotion is complete, the status of the job will update to Completed.
- In the Google Cloud Console, on the Navigation menu, click Databases > SQL.
Note that postgresql-cloudsql is now a stand-alone instance for reading and writing data.
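We can also confirm the promotion from Cloud Shell; after promotion, the instance type should report CLOUD_SQL_INSTANCE rather than READ_REPLICA_INSTANCE (an optional check, assuming the instanceType field in the Cloud SQL Admin API output):
gcloud sql instances describe postgresql-cloudsql \
  --format="value(instanceType)"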
Fawad Arshad
Sr. Consultant