Users of Amazon Web Services are used to having Route53 available to provide DNS records within their clouds. However, if you are a government contractor or an agency, your deployments likely live in an AWS GovCloud environment, and there Route53 is not yet available. So does that mean you should forgo deploying any services that need custom DNS records on GovCloud? Of course not!
We will show how to circumvent this restriction by deploying our own DNS server inside a GovCloud VPC in a way that makes it easy to update and manage zone records.
We use Terraform to ensure reproducible environments across engineers. You can read more about how it and other technologies help streamline operations productivity here.
We assume you will be using a terminal emulator under a UNIX or UNIX-like operating system, and that your GovCloud account has enough headroom in its EC2 quota for at least one more instance. If you have never had to deal with quotas before, the extra instance is unlikely to exceed your quota and you can ignore this point. Lines preceded by $ are meant to be typed and run from your terminal emulator:
$ terraform version
Terraform vX.YY.Z
$ aws --version
aws-cli/X.YY.ZZ Python/X.Y.Z Linux/X.YY.Z botocore/X.Y.ZZ
We will create an empty environment for deployment and add fpco-terraform-aws, which includes the module you will need. FP Complete created and open-sourced fpco-terraform-aws as a collection of modules that ease managing resources in an AWS cloud. As you will see below, a few of its modules are used, and they considerably shorten the amount of code you need to build your environment. We will need a git repository to bring in fpco-terraform-aws, and it is also good practice to always version your infrastructure. This way, not only you but your whole team can collaborate on developing and operating it:
$ r="dns-on-govcloud"; git init $r; cd $r; unset r $ echo '# DNS on GovCloud' > README.md $ git add README.md && git commit -m 'doc: add README.md' $ mkdir vendor vpc $ git subtree add --prefix vendor/fpco-terraform-aws git://github.com/fpco/fpco-terraform-aws.git master --squash $ touch vpc/main.tf vpc/terraform.tfvars
Add the usual AWS credentials and your cloud environment variables to main.tf; a sketch of the provider and variable declarations follows the module code below. Now we need to create a VPC and add our DNS server to it. Include the following:
module "vpc" { source = "../vendor/fpco-terraform-aws/tf-modules/vpc" region = "${var.region}"
cidr = "${var.cidr}"
name_prefix = "${var.name_prefix}"
enable_dns_hostnames = true
enable_dns_support = true
dns_servers = ["AmazonProvidedDNS"]
}
module "public-subnets" {
source = "../vendor/fpco-terraform-aws/tf-modules/subnets"
azs = "${var.azs}"
name_prefix = "${var.name_prefix}-public"
cidr_blocks = "${var.public_subnet_cidrs}"
}
module "public-gateway" {
source = "../vendor/fpco-terraform-aws/tf-modules/route-public"
vpc_id = "${module.vpc.vpc_id}"
name_prefix = "${var.name_prefix}-public"
public_subnet_ids = ["${module.public-subnets.ids}"]
}
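The module arguments above also reference an AWS provider and several variables that main.tf needs to declare. Here is a minimal sketch: the GovCloud region name is real, but the defaults for the CIDRs, availability zones, and the dns_ips list (used later by dns.tf) are placeholders you should replace with values for your own environment. Credentials are assumed to come from your environment or shared credentials file, as usual.

# Placeholder values; adjust for your own environment.
provider "aws" {
  region = "${var.region}"
}

variable "region" {
  default = "us-gov-west-1"
}

variable "name_prefix" {
  default = "dns-on-govcloud"
}

variable "cidr" {
  default = "10.0.0.0/16"
}

variable "azs" {
  type    = "list"
  default = ["us-gov-west-1a", "us-gov-west-1b"]
}

variable "public_subnet_cidrs" {
  type    = "list"
  default = ["10.0.1.0/24", "10.0.2.0/24"]
}

variable "dns_ips" {
  type    = "list"
  default = ["10.0.1.10"]
}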
This is our basic VPC, still with only the Amazon-provided DNS available and no Route53 in sight. It already gives us access to the internet, but has no private subnets. This is enough for our scenario, but in a real setting you will want to protect your resources with private subnets and additional machinery. Next, we will add an EC2 instance with BIND pre-installed, which will receive our configuration changes.
We will now add BIND. Create a file named dns.tf and add it alongside main.tf in the repo, with the following contents:
resource "template-dir" "db-records" { source_dir = "${path.module}/data/dns/templates/db_records" destination_dir = "${path.module}/data/dns/rendered/db_records"
}
data "template_file" "etc_bind_named_conf_local" {
template = "${file("data/dns/templates/named.conf.local")}"
}
data "template_file" "etc_bind_named_conf_options" {
template = "${file("data/dns/templates/named.conf.options")}"
}
module "dns" {
source = "../vendor/fpco-terraform-aws/tf-modules/bind-server"
aws_cloud = "aws-us-gov"
…
named_conf_options = "${data.template_file.etc_bind_named_conf_options.rendered}"
named_conf_local = "${data.template_file.etc_bind_named_conf_local.rendered}"
db_records_folder = "${template_dir.db_records.destination_dir}"
…
private_ips = "${var.dns_ips}"
}
You will also need a few other parameters on the dns module, and security groups for ports used by BIND, as well as internet access. Adding these is easy, and it’s a good exercise to fill in the gaps before deploying the examples we’re giving here.
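As a starting point, here is one possible sketch of such a security group. It uses a plain aws_security_group resource rather than any particular fpco-terraform-aws module, and the CIDR choices are assumptions that should be tightened for a real deployment:

# Sketch only: open DNS to the VPC and SSH to the world for initial testing.
# Restrict the SSH ingress to your own address range before real use.
resource "aws_security_group" "dns" {
  name_prefix = "${var.name_prefix}-dns-"
  vpc_id      = "${module.vpc.vpc_id}"

  # DNS queries from inside the VPC; BIND listens on both UDP and TCP port 53.
  ingress {
    from_port   = 53
    to_port     = 53
    protocol    = "udp"
    cidr_blocks = ["${var.cidr}"]
  }

  ingress {
    from_port   = 53
    to_port     = 53
    protocol    = "tcp"
    cidr_blocks = ["${var.cidr}"]
  }

  # SSH for the manual tests shown later.
  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  # Outbound access so the instance can reach package mirrors and the
  # Amazon-provided forwarder.
  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }
}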
As the dns module snippet shows, the module expects to receive at least named.conf.options, named.conf.local, and a folder of pre-generated zone record files. For the BIND options, we want to ensure that requests for zones for which we are not authoritative are forwarded upstream to Amazon's DNS on the VPC. The following is an example of how you could accomplish that with data/dns/templates/named.conf.options:
options {
    directory "/var/cache/bind";
    dnssec-validation no;
    auth-nxdomain no;
    allow-query { any; };
    allow-recursion { any; };
    recursion yes;
    version none;

    forwarders {
        169.254.169.253;
    };
};
You will need to double-check allow-query in a real environment. The IP under forwarders is a fixed address that is always available inside AWS VPCs; it ensures that names other than those under our own management can still be resolved.
You will then need to tell BIND which zones we answer for authoritatively. This happens in both named.conf.local and the DNS resource record (RR) files. Let's start with named.conf.local:
zone "yourdomain.com" { type master; file "/etc/bind/db.yourdomain.com"; }; zone "us-gov-west-1.compute.internal" { type forward; forward only; Forwarders { 169.254.169.253; }; };
Great. Now, whenever a query comes in for yourdomain.com, BIND will look it up locally and return whatever it has in its RRs. This covers the part of Route53 that answers DNS queries for our own records. It still does not give us the dynamic updates that the Route53 API provides. For that, we will change the RRs directly: whenever we edit them and run Terraform, BIND will pick up the changes, achieving the same functionality. Here is an example of data/dns/templates/db_records/db.yourdomain.com:
$ORIGIN yourdomain.com.
$TTL 1h
@    IN  SOA  yourdomain.com. mail.yourdomain.com. (1 15m 3m 1d 1m)
@    IN  NS   ns1
@    IN  A    127.0.0.1
ns1  IN  A    127.0.0.1
RRs and directives for DNS follow a straightforward format set by RFC 1035 and a few later amendments. The snippet above is the text representation of that format used by BIND. Whenever you would have called the Route53 API to add a record on your behalf, you will instead add the resource record directly to those files and have Terraform apply the changes.
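Because the zone files are rendered through template_dir, you can also let Terraform fill in addresses rather than hard-coding every one. The following is only a sketch: aws_instance.app and the app_ip template variable are hypothetical, and the db.yourdomain.com template would contain a matching line such as app IN A ${app_ip}.

# Sketch: extends the template_dir resource already declared in dns.tf
# (do not declare it twice). "aws_instance.app" and "app_ip" are hypothetical.
resource "template_dir" "db_records" {
  source_dir      = "${path.module}/data/dns/templates/db_records"
  destination_dir = "${path.module}/data/dns/rendered/db_records"

  vars {
    app_ip = "${aws_instance.app.private_ip}"
  }
}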
Once you have run terraform apply with the configuration above (alongside the security groups and other changes not shown), you should SSH into the host and test that it is capable of resolving queries for your own zone as well as for other zones:
$ ssh ${SSH_USER}@${SSH_HOST}
$ dig +short any fpcomplete.com
$ dig +short any yourdomain.com
Both queries should return an IP. If the first does not, then you cannot resolve names using the Amazon-provided DNS. If the second fails, there is an error in your RRs. Make sure that BIND is running on your instance by using systemctl status bind9 and checking your system journal. You can later turn this into a test suite and add an alarm whenever some names fail to resolve, prompting a DevOps engineer to verify the situation.
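If you want to automate that check, one option is a small smoke test driven by Terraform itself. This is only a sketch: it assumes the machine running terraform apply can reach the DNS instance (for example over a VPN), has dig installed, and it leaves monitoring and alerting out entirely.

# Hypothetical smoke test: fails the apply if either our own zone or an
# external name returns no answer from the first address in var.dns_ips.
resource "null_resource" "dns_smoke_test" {
  triggers {
    dns_instance = "${join(",", var.dns_ips)}"
  }

  provisioner "local-exec" {
    command = "dig +short @${element(var.dns_ips, 0)} yourdomain.com | grep -q . && dig +short @${element(var.dns_ips, 0)} fpcomplete.com | grep -q ."
  }
}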
Finally, we need to change who provides DNS for the VPC by changing the vpc module parameters:
dns_servers = ["${module.dns.private_ips}"]
From that point on, other resources will start resolving names through the BIND instance. Use the templates as your interface to DNS, and run terraform apply to keep everything in sync.
The previous configuration works just fine, but quite a few things are missing to make it a production-grade environment. For a start, it has only a public subnet, meaning your DNS server is exposed to the entire internet; without further security, it could be used for attacks on this and other environments. Secondly, if the instance fails, there is some time until auto-recovery kicks in, during which the entire VPC remains unable to resolve names. There are various strategies to mitigate both issues, but they are beyond the scope of this article. Lastly, you might have sensitive data that needs to live on the BIND server. For those cases, you should take a look at another of our blog posts.
In case you need further support with GovCloud to ensure important government-sensitive data is kept reliable, available and manageable, get in touch with us.