I recently started a new project where we’re using Terraform to specify our infrastructure as code. If you know a programming language (or two) or have used other <X>-as-Code tools, Terraform shouldn’t be a big lift. That having been said, there are a few concepts which I found to be unintuitive and/or under-documented. Hopefully this saves someone else from having to discover these things empirically.
For reference, my background is Ansible, Python, and bash/shell, with a smattering of other languages for good measure (e.g. Java, JavaScript). If your background is similar, you may find these guidelines helpful.
project structure
There is a lot of noise out there regarding how to structure your project, and this is not intended to replace that guidance. Specifically, if you have multiple environments, configurations, or subsystems, you may want to investigate a tool like Terragrunt to help keep your code DRY. With that caveat out of the way, I settled on the structure below for a small- to medium-sized project with multiple subsystems, knowing that I can refactor in the future if/when I want to adopt a tool like Terragrunt.
I have a repository for my Terraform code, along with any supporting code, where each language gets a top-level directory:
# terraform-root
cf/                # <-- AWS CloudFormation templates
  *.cform.yml      # <---- example CFT to facilitate syntax highlighting, etc.
ssh/               # <-- shared SSH keys (if using)
tf/                # <-- Terraform code
  modules/         # <---- reusable modules
  subsystems/      # <---- inventories
Within the tf directory, I have two subdirectories: one for reusable modules and one for subsystem inventories, where each module or subsystem gets a subdirectory. For example, my “management” subsystem inventory files would live under tf/subsystems/management.
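To make the relationship between the two subdirectories concrete, a subsystem's main.tf might call a reusable module through a relative path. This is only a sketch: the module name network and its cidr_block input are hypothetical and stand in for whatever lives under tf/modules.

```hcl
# tf/subsystems/management/main.tf
# Assumes a hypothetical module at tf/modules/network that declares
# an input variable named cidr_block.
module "network" {
  source = "../../modules/network"

  cidr_block = "10.0.0.0/16" # placeholder value
}
```

Because the source path is relative to the subsystem directory, the same module can be shared by every subsystem under tf/subsystems.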
inventory
As with project structure, there is a lot written about how to declare Terraform files; this attempts to distill those guidelines and provide an opinionated approach to creating repeatable Terraform inventories.
Without using extra switches on the command line, Terraform scopes everything to the current working directory. Using the project structure above, this implies that all terraform commands are run from a specific inventory directory (e.g. tf/subsystems/management).
Within that subsystem directory, Terraform will automatically parse any file whose name ends in .tf as “code” (in Ansible parlance, this would roughly equate to a playbook). I take advantage of this by splitting my code into multiple files.
# tf/subsystems/**/
locals.tf # <---- local (e.g. pre-computed) variables
main.tf # <---- main "playbook"
outputs.tf # <---- output variables
providers.tf # <---- infrastructure provider(s)
variables.tf # <---- input variables
versions.tf # <---- provider versions
There is no inherent ordering; Terraform computes the dependency graph at runtime. Additionally, there are no constraints on where you place resources, data, etc. In particular, I use locals.tf (or another non-normative *.tf file) to specify lookup tables and other computed variables that I don’t want to specify in main.tf. For example:
# locals.tf
locals {
  # lookup table: a list of objects with string values
  my_defaults = [
    {
      keyA = "value1"
      keyB = "value2"
    },
    {
      keyA = "value3"
      keyB = "value4"
    },
    {
      keyA = "value5"
      keyB = "value6"
    },
  ]
}
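The other files in the listing above tend to be just as small. As a rough sketch, versions.tf and providers.tf might look like the following; the AWS provider, the version constraints, and the region are assumptions chosen for illustration, not a recommendation.

```hcl
# versions.tf
terraform {
  required_version = ">= 1.4.0" # placeholder constraint

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # placeholder constraint
    }
  }
}

# providers.tf
provider "aws" {
  region = "us-east-1" # placeholder region
}
```

Since Terraform loads every *.tf file in the directory together, it makes no difference to the result whether these blocks live in two files or one; the split is purely for readability.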
inputs
Unlike “code” (which is a specification), input variable values are specified in *.tfvars files. Per the Terraform docs, Terraform automatically parses both of the following as “inputs”:
- files named exactly terraform.tfvars or terraform.tfvars.json
- files with names ending in .auto.tfvars or .auto.tfvars.json
By convention, local or sensitive inputs are specified in terraform.tfvars, which is excluded from version control via a .gitignore entry:
# .gitignore
**/terraform.tfvars
By convention, subsystem inputs are specified in inputs.auto.tfvars; any variables which are common across subsystems are specified in globals.auto.tfvars. By default, Terraform will prompt you interactively for any variable that has no default value and was not given a value elsewhere (which can be useful for user prompts).
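As a small illustration of how declarations and values fit together, the sketch below declares two variables and assigns them in inputs.auto.tfvars. The variable names environment and instance_count are hypothetical.

```hcl
# variables.tf
variable "environment" {
  description = "Name of the subsystem environment (hypothetical)"
  type        = string
  # no default: Terraform prompts for this if no value is supplied
}

variable "instance_count" {
  description = "How many instances to create (hypothetical)"
  type        = number
  default     = 1
}
```

```hcl
# inputs.auto.tfvars
environment    = "management"
instance_count = 2
```

With both files in the subsystem directory, a plain terraform plan picks up the values without any -var or -var-file switches.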
loops
At some point, you’re going to want to have a loop in your code to generate more than one of something. I find that Terraform is generally well-documented except for the count and for_each meta-arguments. After much wringing of hands, I stumbled on this simple guideline:
count takes a number (usually the length of a list) and returns a list; for_each takes a map (or a set of strings) and returns a map.
That’s it. That’s the rule. As it turns out, the return is far more important than the input thanks to Terraform’s support for for expressions. When used in combination with meta-arguments, you can construct (almost) any arbitrary loop.
# Ex.1 -- starting with a list...
locals {
  # input
  my_list = ["foo", "bar", "baz"] # list
  # outputs
  list_to_list = [for item in local.my_list : upper(item)]            # list
  list_to_map  = { for item in local.my_list : item => local.my_list } # map
}
# returns
# list_to_list = ["FOO", "BAR", "BAZ"]
# list_to_map = {
#   bar = ["foo", "bar", "baz"]
#   baz = ["foo", "bar", "baz"]
#   foo = ["foo", "bar", "baz"]
# }
# Ex.2 -- starting with a map...
locals {
  # input
  my_map = {
    foo = "Foo"
    bar = "Bar"
    baz = "Baz"
  }
  # outputs
  map_to_list = [for key, value in local.my_map : value]
  map_to_map  = { for key, value in local.my_map : value => key }
}
# returns (map elements are visited in lexical key order: bar, baz, foo)
# map_to_list = ["Bar", "Baz", "Foo"]
# map_to_map = {
#   Bar = "bar"
#   Baz = "baz"
#   Foo = "foo"
# }
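Note that ordering depends on the input to the for expression: a list keeps its order, while a map (or a set of strings) is visited in lexical order of its keys (or values). Sets of other types have an arbitrary order that may change in future versions; in that case, to ensure forward-compatibility, it is best practice to wrap list outputs in the toset() function to explicitly declare the result as unordered. Note that toset() removes duplicate values, so be certain that this is compatible with your use case.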
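To tie the rule back to the meta-arguments themselves, here is a minimal sketch that creates the same three items once with count and once with for_each. It assumes Terraform 1.4 or later, since it uses the built-in terraform_data resource to avoid needing a provider; the resource names by_index and by_key are hypothetical.

```hcl
locals {
  names = ["foo", "bar", "baz"]
}

# count: one instance per index, addressed as a list
resource "terraform_data" "by_index" {
  count = length(local.names)
  input = local.names[count.index]
}

# for_each: one instance per key, addressed as a map
resource "terraform_data" "by_key" {
  for_each = toset(local.names)
  input    = upper(each.key)
}

output "first_by_index" {
  value = terraform_data.by_index[0].output
}

output "foo_by_key" {
  value = terraform_data.by_key["foo"].output
}
```

The practical difference shows up when the input changes: removing "bar" from the list shifts the index of "baz" under count, whereas the for_each instances keep their keys.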
extending loops
The Terraform documentation covers two advanced extensions for loops: filtering (projecting only a subset of the input into the output) and grouping (aggregating values by key). Note that grouping applies only to for expressions that produce a map (since it addresses the issue of non-unique map keys). The documentation on both is pretty good, so I won’t reproduce it in full.
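For quick reference, a minimal sketch of both extensions follows. The users map and its values are made up for illustration.

```hcl
locals {
  # hypothetical input: user name => role
  users = {
    alice = "admin"
    bob   = "dev"
    carol = "dev"
  }

  # filtering: the trailing "if" clause keeps only matching elements
  admins = [for name, role in local.users : name if role == "admin"]

  # grouping: the "..." after the value collects every value that
  # shares a key into a list instead of raising a duplicate-key error
  users_by_role = { for name, role in local.users : role => name... }
}
# returns
# admins = ["alice"]
# users_by_role = {
#   admin = ["alice"]
#   dev   = ["bob", "carol"]
# }
```

Without the trailing "...", the second expression would fail because "dev" appears as a key twice.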
closing thoughts
If you made it this far, thanks for sticking with this rather lengthy post. Hopefully this saves someone else from the confusion that I faced when starting out with Terraform. If you have recommendations (or counter-recommendations), I’d love to hear about your use cases. Hit me up on Twitter (or elsewhere) to start a conversation.