Easily launch RDS and EC2 instance, with SSH access, using Terraform

; Date: April 6, 2020

Tags: Terraform »»»» AWS »»»» AWS EC2 »»»» AWS RDS

We commonly must deploy on AWS both running code and a database service. AWS offers a cloud based database system, RDS, and in this tutorial we will look at launching an AWS EC2 instance along with an RDS database inside a custom-built AWS Virtual Private Cluster (VPC). The RDS database will not be reachable from the public Internet, and only from within the VPC we will create. Our tool for this will be Terraform, a powerful tool for configuring infrastructure on a long list of cloud-based systems

An AWS Virtual Private Cluster carves out an area hosted by AWS into which you deploy AWS infrastructure items. A VPC can contain EC2 instances, deployments of containerized application using Docker (ECS) or Kubernetes (EKS) ECS service and a long list of other possible services. There are so many services offered by AWS that it's like stumbling through a maze of twisty passages all alike.

It is complex building AWS infrastructure using the tools provided by AWS. In this example we will use Terraform, since it offers a much simpler way to implement cloud based systems on AWS or many other services. We still have to learn how the AWS infrastructure pieces work, but using Terraform simplifies everything.

This example builds on a previous example - Easily launch EC2 instance, with SSH access, using Terraform

In that example we used Terraform to set up a VPC with EC2 instances, and in one case one of the EC2 instances had a self-hosted MySQL database.

The AWS RDS service offers many advantages over self-hosted MySQL. It is easy to implement a massively scaled database, with replication, fail-over in case a database crashes, automatic backups, automatic upgrades, since all of that is available by clicking some buttons in the AWS management console.

The code for the example shown here is on Github at (github.com) robogeek / terraform-aws-modules

Specifically, the VPC cluster is built using a Terraform module (github.com) robogeek /terraform-aws-modules / modules / vpc-double

And the example is at (github.com) robogeek / terraform-aws-modules / examples / rds

To understand configuration of the VPC cluster, I suggest reading Easily launch EC2 instance, with SSH access, using Terraform. That tutorial focused on a VPC cluster with one private subnet, which was built using the vpc-simple module. This tutorial uses the vpc-double module, which is the same as the vpc-simple module but contains two subnets.

Initializing a Terraform project, and an AWS VPC, for deploying RDS instances

In the rds example you will find several Terraform scripts comprising the tutorial. Let's start with main.tf, where we initialize the Virtual Private Cluster (VPC) using the vpc-double module:

provider "aws" {
    shared_credentials_file = "$HOME/.aws/credentials"
    profile                 = "notes-app"
    region                  = var.aws_region
}

module "vpc" {
    source = "../../modules/vpc-double"
    vpc_name = "db-sample"
    vpc_cidr = "192.168.0.0/16"
    public_cidr = "192.168.1.0/24"
    private1_cidr = "192.168.2.0/24"
    private2_cidr = "192.168.3.0/24"
    private1_az = data.aws_availability_zones.available.names[0]
    private2_az = data.aws_availability_zones.available.names[1]
    public_az = data.aws_availability_zones.available.names[0]
}

data "aws_availability_zones" "available" {
  state = "available"
}

This sets up the AWS Provider, configuring it with our access credentials, and our preferred AWS region.

At the bottom we use aws_availability_zones to load a list of AWS Availability Zones for the selected region.

In the middle we invoke the vpc-double module. As we said, this module is a simple VPC, containing one public subnet, and two private subnets. Using the arguments we declare the CIDR's and availability zones for those subnets.

To support this project we need a file, variables.tf, to hold the project variables. For the var.aws_region variable add this declaration.

variable "aws_region" {
    default = "us-west-2"
}

Initializing the EC2 instance for the cluster

We'll shortly set up a MySQL instance using AWS RDS. That MySQL instance won't be visible from the public Internet. To have a chance of interacting with the database we will add an EC2 instance so we can log in and run the MySQL CLI client.

In practice you might want to deploy an application service on the EC2 instance.


resource "aws_instance" "public-ec2" {
    ami           = var.ami_id
    instance_type = var.instance_type
    subnet_id     = module.vpc.subnet_public_id
    key_name      = "notes-app-key-pair"
    vpc_security_group_ids = [ aws_security_group.ec2-sg.id ]
    associate_public_ip_address = true

    tags = {
        Name = "ec2-main"
    }

    depends_on = [ module.vpc.vpc_id, module.vpc.igw_id ]

    user_data = <<EOF
#!/bin/sh
sudo apt-get update
sudo apt-get install -y mysql-client
EOF
}

resource "aws_security_group" "ec2-sg" {
  name        = "security-group"
  description = "allow inbound access to the EC2 instance"
  vpc_id      = module.vpc.vpc_id

  ingress {
    protocol    = "-1"
    from_port   = 0
    to_port     = 0
    cidr_blocks = [ "0.0.0.0/0" ]
  }

  egress {
    protocol    = "-1"
    from_port   = 0
    to_port     = 0
    cidr_blocks = ["0.0.0.0/0"]
  }
}

This EC2 deployment is extremely similar to what we did in Easily launch EC2 instance, with SSH access, using Terraform.

We have created a Security Group that allows any traffic to enter (ingress) or leave (egress) the EC2 instance. Obviously for a production deployment the security instance must be tightly specified. But this is just a demonstration, so it's okay to leave the barn doors wide open.

We are attaching the EC2 instance to the public subnet, and have required that it allocate a public IP address. It is configured with our AWS "key pair" so we can use SSH to log in.

We are adding the mysql-client package to the EC2 instance so that we can access the database using the MySQL CLI. The user_data attribute lets us attach a shell script that will execute when the EC2 instance launches. As you see, that shell script runs Ubuntu commands to install the mysql-client package. In the previous tutorial we had a script that instead installed the mysql package so that the EC2 instance had a pre-installed database.

To support this we need two additional variables in variables.tf

variable "ami_id" {
    // Ubuntu Server 18.04 LTS (HVM), SSD Volume Type in us-west-2 
    default = "ami-0d1cd67c26f5fca19"
}

variable "instance_type" {
  default = "t2.micro"
}

Initializing the RDS database instance

The scripts so far have created a VPC, with a set of public and private subnets, and then attached an EC2 instance. Our goal is to attach an RDS instance to the VPC, and then access the database from the EC2 instance. We now have enough ready to go about attaching an RDS instance to the VPC.

While RDS offers huge flexibility, such as scaling, replication, automated backups, and more, we'll use the simplest possible configuration. A single instance, attached to a single availability zone.

In the workspace you'll find a file, rds.tf, containing the RDS configuration.

resource "aws_db_parameter_group" "default" {
  name   = "rds-pg"
  family = "mysql8.0"

  parameter {
    name  = "character_set_server"
    value = "utf8"
  }

  parameter {
    name  = "character_set_client"
    value = "utf8"
  }
}

resource "aws_db_subnet_group" "default" {
  name       = "main"
  subnet_ids = [ module.vpc.subnet_private1_id , module.vpc.subnet_private2_id ]

  tags = {
    Name = "My DB subnet group"
  }
}

resource "aws_security_group" "rds-sg" {
  name        = "rds-security-group"
  description = "allow inbound access to the database"
  vpc_id      = module.vpc.vpc_id

  ingress {
    // protocol    = "tcp"
    // from_port   = 0
    // to_port     = 3306
    protocol    = "-1"
    from_port   = 0
    to_port     = 0
    cidr_blocks = [ module.vpc.vpc_cidr ]
  }

  egress {
    protocol    = "-1"
    from_port   = 0
    to_port     = 0
    cidr_blocks = [ module.vpc.vpc_cidr ]
  }
}

To create an RDS instance we first start with creating:

  • A database subnet group - containing the subnets to be used by the RDS instance.
  • A parameter group - containing the the configuration parameters for the database.
  • A security group - containing the security firewall around the RDS instance.

Using the parameter group, AWS creates a MySQL configuration file. It is also possible to specify an option group for even more options. For documentation see (www.terraform.io) aws_db_parameter_group and aws_db_option_group on the Terraform website.

The aws_db_subnet_group declaration controls which subnets the database will be used in. More importantly this associates the RDS instance with the VPC configured for the project. Notice that we're using the two private subnets created in the vpc-double module.

For the security group notice that we're leaving the barn doors wide open again. There is an obvious ingress rule to use, limiting traffic to the 3306 port number that is standard for MySQL.

resource "aws_db_instance" "default" {
  allocated_storage    = 100
  storage_type         = "gp2"
  engine               = "mysql"
  engine_version       = "8.0"
  instance_class       = "db.t2.micro"
  identifier           = "mydb"
  name                 = "mydb"
  username             = "root"
  password             = "foobarbaz"
  parameter_group_name = aws_db_parameter_group.default.id
  db_subnet_group_name = aws_db_subnet_group.default.id
  vpc_security_group_ids = [ aws_security_group.rds-sg.id ]
  publicly_accessible  = false
  skip_final_snapshot  = true
  multi_az             = false
}

Then we finally set up the RDS instance. For documentation see (www.terraform.io) aws_rds_instance.

In this case we are configuring a db.t2.micro instance to stay in the free tier, containing 100GB of data storage, and running MySQL 8.0.

The name parameter is the name for the initial database, and the username and password parameters create a login identity to use.

The parameter_group_name, db_subnet_group_name and vpc_security_group_ids connect to the options declared above.

The publicly_accessible setting makes sure this database is not publicly accessible. The skip_final_snapshot setting means that when deleting the database instance no database snapshot will be made. The multi_az setting means the database will appear in only one availability zone.

Clearly some of these settings are not what we'd use on a production server. The security group should me limited to port 3306 for instance. And there should be settings for backups, updating the software, multiple availability zones, taking a final database snapshot, and more.

Output variables

The final thing to handle is outputting useful bits of data. The outputs.tf file contains these declarations:

output "ec2-public-dns" {
    value = aws_instance.public-ec2.public_dns
}

output "ec2-public-ip" {
    value = aws_instance.public-ec2.public_ip
}

output "ec2-public-private-dns" {
    value = aws_instance.public-ec2.private_dns
}

output "ec2-public-private-ip" {
    value = aws_instance.public-ec2.private_ip
}

output "db-address" {
    value = aws_db_instance.default.address
}

output "db-arn" {
    value = aws_db_instance.default.arn
}

output "db-domain" {
    value = aws_db_instance.default.domain
}

output "db-id" {
    value = aws_db_instance.default.id
}

output "db-name" {
    value = aws_db_instance.default.name
}

output "db-port" {
    value = aws_db_instance.default.port
}

This gives us quite a bit of output when we launch the service.

Launching the RDS and EC2 services on AWS using Terraform

We're now ready to launch this thing. As always the first step is to run terraform init to initialize the Terraform plugins and state file. We follow that by running terraform plan to give this a once-over and ensure that it does what we want.

Once ready we run terraform apply:

$ terraform apply

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

...

Plan: 18 to add, 0 to change, 0 to destroy.

Do you want to perform these actions?
  Terraform will perform the actions described above.
  Only 'yes' will be accepted to approve.

  Enter a value: yes

...

Apply complete! Resources: 18 added, 0 changed, 0 destroyed.

Outputs:

db-address = mydb.chuhqysisf6f.us-west-2.rds.amazonaws.com
db-arn = arn:aws:rds:us-west-2:098106984154:db:mydb
db-domain = 
db-id = mydb
db-name = mydb
db-port = 3306
ec2-public-dns = ec2-34-215-225-103.us-west-2.compute.amazonaws.com
ec2-public-ip = 34.215.225.103
ec2-public-private-dns = ip-192-168-1-72.us-west-2.compute.internal
ec2-public-private-ip = 192.168.1.72

If all goes well the output will look like this. The ec2-public-dns item tells us where the EC2 instance is located, while the db-address item tells us where the database is located.

From our laptop, the database is not accessible, but the EC2 instance can be accessed:

$ ping mydb.chuhqysisf6f.us-west-2.rds.amazonaws.com
PING mydb.chuhqysisf6f.us-west-2.rds.amazonaws.com (192.168.2.199): 56 data bytes
Request timeout for icmp_seq 0
Request timeout for icmp_seq 1
^C
--- mydb.chuhqysisf6f.us-west-2.rds.amazonaws.com ping statistics ---
3 packets transmitted, 0 packets received, 100.0% packet loss
$ ping ec2-34-215-225-103.us-west-2.compute.amazonaws.com
PING ec2-34-215-225-103.us-west-2.compute.amazonaws.com (34.215.225.103): 56 data bytes
64 bytes from 34.215.225.103: icmp_seq=0 ttl=40 time=44.286 ms
64 bytes from 34.215.225.103: icmp_seq=1 ttl=40 time=44.586 ms
^C
--- ec2-34-215-225-103.us-west-2.compute.amazonaws.com ping statistics ---
2 packets transmitted, 2 packets received, 0.0% packet loss
round-trip min/avg/max/stddev = 44.286/44.436/44.586/0.150 ms

This is what we wanted - the EC2 instance can be accessed, but the database is not accessible.

Let's log in to the EC2 instance and see what's what.

$ ssh -i ~/Downloads/notes-app-key-pair.pem ubuntu@ec2-34-215-225-103.us-west-2.compute.amazonaws.com
The authenticity of host 'ec2-34-215-225-103.us-west-2.compute.amazonaws.com (34.215.225.103)' can't be established.
ECDSA key fingerprint is SHA256:nSJSNfFHeV4UtFRgqQ6Nyyd5WxokxxxI15JjzVJjtU0.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'ec2-34-215-225-103.us-west-2.compute.amazonaws.com,34.215.225.103' (ECDSA) to the list of known hosts.
Welcome to Ubuntu 18.04.3 LTS (GNU/Linux 4.15.0-1057-aws x86_64)

...
ubuntu@ip-192-168-1-72:~$ mysql -u root -h mydb.chuhqysisf6f.us-west-2.rds.amazonaws.com -p
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 11
Server version: 8.0.11 Source distribution

Copyright (c) 2000, 2020, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> show databases;
+--------------------+
| Database           |
+--------------------+
| information_schema |
| innodb             |
| mydb               |
| mysql              |
| performance_schema |
| sys                |
+--------------------+
6 rows in set (0.00 sec)

mysql> 

We can log in and access the database server.

The security group attached to the RDS instance allows traffic from anywhere inside the VPC. This means we can install any software we like on the EC2 instance to use the database - for example set up a web server with Wordpress - and use a MySQL API client to access the database. Or we can install other facilities in the VPC like an ECS cluster to use the database.

Examining the AWS Management Console

Let's look at what was created.

We see here all the same settings and parameters being shown in the RDS dashboard for this instance. The security group, VPC, and subnets matches what we declared earlier.

Likewise in the VPC dashboard area we can inspect the VPC, and the configuration of various things attached to it like the subnets and route tables.

Simplifying finding the RDS instance

You may have noticed that the domain name assigned by AWS is not exactly user-friendly.

db-address = mydb.chuhqysisf6f.us-west-2.rds.amazonaws.com

According to the AWS documentation there is no way to change this domain name into one that is more usable. For example it would be excellent if the RDS instance domain name were db.mydomain.com or maybe db.local. The best offered by AWS is to assign a CNAME to the AWS-provided domain name.

But let's think of another method. If we change the declaration for the EC2 instance a little, the creation of the EC2 instance can occur after the creation of the RDS instance, and then the script for installing mysql-server can also save the RDS instance domain name.

resource "aws_instance" "public-ec2" {
    ami           = var.ami_id
    instance_type = var.instance_type
    subnet_id     = module.vpc.subnet_public_id
    key_name      = "notes-app-key-pair"
    vpc_security_group_ids = [ aws_security_group.ec2-sg.id ]
    associate_public_ip_address = true

    tags = {
        Name = "ec2-main"
    }

    depends_on = [ module.vpc.vpc_id, module.vpc.igw_id, aws_db_instance.default ]

    user_data = <<EOF
#!/bin/sh
sudo apt-get update
sudo apt-get install -y mysql-client
echo ${aws_db_instance.default.address} >/tmp/dbdomain.txt
sudo mv /tmp/dbdomain.txt /dbdomain.txt
EOF
}

This is the modified EC2 instance declaration. There are two changes here that are significant.

First, the depends_on clause adds aws_db_instance.default. The depends_on clause tells Terraform that creating a given resource must occur after some other resource exists. Therefore we tell Terraform in this case that creating the EC2 instance must wait until the DB instance is created.

Secondly, the user_data script now saves the value of aws_db_instance.default.address into a file inside the EC2 instance file-system.

Indeed, if we launch the system after this change, we get the following result:

ubuntu@ip-192-168-1-171:~$ cat /dbdomain.txt 
mydb.chuhqysisf6f.us-west-2.rds.amazonaws.com

This implementation may or may not be usable for your application. But it demonstrates we can capture this domain name, and place it somewhere that it will be useful for the application.

In other words the goal is to automate configuring the database connection in whatever application we have created. For example if our EC2 instance is housing a Wordpress installation, then the user_data script could ensure the database domain name gets into a Wordpress configuration file.

But it's really silly to have to use this kind of subterfuge.

For example in the AWS ECS service (for deploying Docker containers) it is possible to connect ECS Service instances to Route53 such that AWS automatically assigns a useful domain to the ECS container. That means ECS containers can automatically find each other by a simple domain name like my-service.local. For an example see Terraform deployment on AWS ECS a multi-tier Node.js and Nginx system using Service Discovery .

Deleting the AWS RDS infrastructure

Since we don't have any immediate need for this system, we can destroy it.

$ terraform destroy

...

Plan: 0 to add, 0 to change, 18 to destroy.

Do you really want to destroy all resources?
  Terraform will destroy all your managed infrastructure, as shown above.
  There is no undo. Only 'yes' will be accepted to confirm.

  Enter a value: yes

...

Destroy complete! Resources: 18 destroyed.

Summary

We've seen in this tutorial how to set up a simple deployment of an RDS instance, and other AWS facilities, into an AWS VPC environment.

RDS offers not just MySQL but other database engines, and offers the capability to scale across AWS infrastructure.

About the Author(s)

David Herron : David Herron is a writer and software engineer focusing on the wise use of technology. He is especially interested in clean energy technologies like solar power, wind power, and electric cars. David worked for nearly 30 years in Silicon Valley on software ranging from electronic mail systems, to video streaming, to the Java programming language, and has published several books on Node.js programming and electric vehicles.