
Using Terraform on a Scaleway infrastructure with Continuous Delivery

Infrastructure as code

Infrastructure as code is the practice of declaring the elements of an infrastructure (compute resources, storage, flow rules, etc.) in code files. This approach has several advantages:

  • Any change is tracked (via a version control system like git).
  • This can help maintain a consistent state of the infrastructure by ensuring that the configuration does not change over time.
  • Rolling back to a working infrastructure after a problem is easier.
  • To some extent, the code also serves as documentation of the infrastructure.
  • It is easier to share infrastructure source code than a set of documents (not necessarily up-to-date) describing how an infrastructure, an environment, or an application was built.

I see this practice more and more often around me. That is a good thing, and I try to encourage it as well.

In the rest of this article, Terraform is the tool and the language we use to write infrastructure as code.

Terraform

To start, here is a short excerpt from Wikipedia:

Terraform is an open-source infrastructure as code software tool created by HashiCorp. Users define and provision data center infrastructure using a declarative configuration language known as HashiCorp Configuration Language (HCL), or optionally JSON.

To complete this definition, note that Terraform is a declarative tool. The code describes which resources we want and with which parameters. At no point does it say how those resources should be created.

One of Terraform's great strengths is that it works with many well-known vendors (which the tool calls "providers"): Amazon AWS, Google GCP, Microsoft Azure, IBM, VMware, OVH or Scaleway. The last one is the one we focus on in the rest of this article.

For the record, Terraform is written in Go. To describe our infrastructure as code, we use HCL (HashiCorp Configuration Language), the configuration language of HashiCorp, the publisher of the tool.

Last but not least, Terraform is idempotent. Applying the Terraform code a first time makes the necessary changes to the infrastructure. Applying it a second time reports that there is nothing to do, because the infrastructure is already in the target state.
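One way to observe this idempotence from a script is the -detailed-exitcode option of terraform plan, which returns 0 when there is nothing to change, 1 on error and 2 when changes are pending. The helper below is only a sketch: the function name describe_plan_status and its messages are made up for this illustration.

```shell
# Translate the exit status of `terraform plan -detailed-exitcode` into a message.
# Statuses documented by Terraform: 0 = no changes, 1 = error, 2 = changes present.
# describe_plan_status is a hypothetical helper name, not part of Terraform.
describe_plan_status() {
  case "$1" in
    0) echo "nothing to do: the infrastructure matches the code" ;;
    2) echo "changes pending: an apply would modify the infrastructure" ;;
    *) echo "plan failed" ;;
  esac
}

# Usage, from an initialised project directory:
#   terraform plan -detailed-exitcode >/dev/null; describe_plan_status "$?"
```

Run right after a successful apply, the usage line above should print the first message.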

Let's stop the introduction here and get to the heart of the matter.

Please note that this article is not an introduction to Terraform. To better understand what follows, it is best to have a basic knowledge of how Terraform is used and how it works. Here is a link to the official HashiCorp tutorial: https://learn.hashicorp.com/terraform

Terraform in Continuous Delivery

As mentioned in the introduction of this article, we want to version the Terraform code. For that, we will use GitLab and gitlab-ci for the automation part.

Continuous Delivery, abbreviated CD, is the practice of automating a number of actions around a code project. Each time new code is pushed, actions are launched automatically: a build if necessary, then tests (unit, functional, integration, etc.), in short everything needed to prepare the application and validate that it is ready to be deployed. The deployment itself remains manual: the aim is to automate everything except the release to production, which stays under human control.

There is also Continuous Deployment, which aims to automate everything, including the deployment stage.

In this case, it is Continuous Delivery that interests us. We need to remain in control of the applications of Terraform code on our infrastructure.

Another advantage of setting this up with GitLab is a simpler user workstation: terraform no longer needs to be installed locally. Everything is triggered and executed from the gitlab-ci runner, which uses containers to perform the tasks assigned to it. If you need to investigate or analyze a problem, you can do the same thing on your workstation and launch terraform commands from a container. This lets you focus a little more on the code and less on how the terraform commands are launched.
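As a rough sketch of what launching terraform from a container on a workstation can look like, the wrapper below mounts the current directory into the official hashicorp/terraform image, whose entrypoint is the terraform binary. The function name tf_in_container, the /workspace mount point and the image tag are assumptions chosen for this illustration. Credentials are deliberately left out and would still need to be passed in, for instance with docker's -e option.

```shell
# Hypothetical wrapper: run a terraform command inside a container.
# DOCKER and TF_IMAGE can be overridden; the defaults are assumptions.
tf_in_container() {
  "${DOCKER:-docker}" run --rm \
    -v "$PWD":/workspace -w /workspace \
    "${TF_IMAGE:-hashicorp/terraform:latest}" "$@"
}

# Usage, from the directory containing the .tf files:
#   tf_in_container init
#   tf_in_container plan
```

Pinning TF_IMAGE to the Terraform version used by the pipeline avoids differences between local runs and CI runs.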

A few implementation details

Terraform stores the state of the infrastructure and its configuration in a state file. As mentioned earlier, commands are executed from containers, so this state file must be stored somewhere successive containers can share it. Several solutions are possible; here we made the arbitrary choice of storing it on the Scaleway object storage service, which is compatible with Amazon's S3 protocol. Later in this article, we will look at the configuration for this state file.

For the GitLab integration, there is no need to reinvent the wheel: a template exists and we will draw heavily on it. Let's get started!

Practical example

In the following example, we declare an instance security group, that is, a set of flow rules governing incoming and outgoing traffic for one or more instances.

Without further ado, here is the tree structure of our mini project.

.
+-- .gitlab-ci.yml
+-- backend.tf
+-- provider.tf
+-- security-group.tf

For the .gitlab-ci.yml file, GitLab provides a template that only needs a little customization. We'll come back to this later. Here is the file provided:

# This file is a template, and might need editing before it works on your project.
# Official image for Hashicorp's Terraform. It uses light image which is Alpine
# based as it is much lighter.
#
# Entrypoint is also needed as image by default set `terraform` binary as an
# entrypoint.
image:
  name: registry.gitlab.com/gitlab-org/gitlab-build-images:terraform
  entrypoint:
    - '/usr/bin/env'
    - 'PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin'

# Default output file for Terraform plan
variables:
  PLAN: plan.tfplan
  JSON_PLAN_FILE: tfplan.json

cache:
  paths:
    - .terraform

before_script:
  - alias convert_report="jq -r '([.resource_changes[]?.change.actions?]|flatten)|{\"create\":(map(select(.==\"create\"))|length),\"update\":(map(select(.==\"update\"))|length),\"delete\":(map(select(.==\"delete\"))|length)}'"
  - terraform --version
  - terraform init

stages:
  - validate
  - build
  - test
  - deploy

validate:
  stage: validate
  script:
    - terraform validate

plan:
  stage: build
  script:
    - terraform plan -out=$PLAN
    - terraform show --json $PLAN | convert_report > $JSON_PLAN_FILE
  artifacts:
    paths:
      - $PLAN
    reports:
      terraform: $JSON_PLAN_FILE

# Separate apply job for manual launching Terraform as it can be destructive
# action.
apply:
  stage: deploy
  environment:
    name: production
  script:
    - terraform apply -input=false $PLAN
  dependencies:
    - plan
  when: manual
  only:
    - master

The backend.tf file contains the information needed to access the tfstate. As a reminder, this is the Terraform state file, which records the current state of the infrastructure declared in the code. In our CD approach it must be stored separately: each run may land on a different container, so the state has to be kept in a place accessible to all of them. In our case, we chose the Scaleway object storage, which Terraform treats as S3 storage:

terraform {
  backend "s3" {
    bucket = "mybucket"
    key = "terraform.tfstate"
    region = "fr-par"
    endpoint = "https://s3.fr-par.scw.cloud"
    access_key = "__BACKEND_ACCESS_KEY__"
    secret_key = "__BACKEND_SECRET_KEY__"
    skip_credentials_validation = true
    skip_region_validation = true
  }
}

It is important to note that the access_key and the secret_key are not given in the code. This is sensitive information that we replace on the fly so that it does not appear directly in the source code. It is in the .gitlab-ci.yml file that this work is done with the sed tool:

(...)
before_script:
  - alias convert_report="jq -r '([.resource_changes[]?.change.actions?]|flatten)|{\"create\":(map(select(.==\"create\"))|length),\"update\":(map(select(.==\"update\"))|length),\"delete\":(map(select(.==\"delete\"))|length)}'"
  - sed -i "s~__BACKEND_ACCESS_KEY__~${BACKEND_ACCESS_KEY}~" backend.tf
  - sed -i "s~__BACKEND_SECRET_KEY__~${BACKEND_SECRET_KEY}~" backend.tf
  - terraform --version
  - terraform init
(...)
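The substitution can be tried outside the pipeline on a scratch copy of the file. The sketch below uses obviously fake credentials and a temporary directory. It writes through a temporary file instead of sed -i so that it also runs with non-GNU sed, and it adds a guard, which is not in the original pipeline, that stops if a placeholder was left unreplaced.

```shell
# Scratch copy standing in for backend.tf (only the two credential lines).
workdir=$(mktemp -d)
cat > "$workdir/backend.tf" <<'EOF'
    access_key = "__BACKEND_ACCESS_KEY__"
    secret_key = "__BACKEND_SECRET_KEY__"
EOF

# Fake values for the demo; in the pipeline they come from GitLab CI/CD variables.
BACKEND_ACCESS_KEY="SCWXXXXXXXXXXXXXXXXX"
BACKEND_SECRET_KEY="11111111-2222-3333-4444-555555555555"

# Same substitutions as in before_script. "~" is used as the sed delimiter,
# so values containing "/" are safe; a value containing "~" or "&" would not be.
sed -e "s~__BACKEND_ACCESS_KEY__~${BACKEND_ACCESS_KEY}~" \
    -e "s~__BACKEND_SECRET_KEY__~${BACKEND_SECRET_KEY}~" \
    "$workdir/backend.tf" > "$workdir/backend.tf.new"
mv "$workdir/backend.tf.new" "$workdir/backend.tf"

# Extra guard (an addition for this sketch): fail if a placeholder survived.
if grep -q '__BACKEND_[A-Z_]*__' "$workdir/backend.tf"; then
  echo "unreplaced placeholder in backend.tf" >&2
  exit 1
fi
cat "$workdir/backend.tf"
```

A guard like this catches a missing or misnamed CI/CD variable before terraform init fails with a less explicit authentication error.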

As explained in the introduction, Terraform is a very powerful tool that can do a lot with a large number of providers. It is therefore essential to tell it which ones we want to use. This is the role of our provider.tf file.

terraform {
    required_providers {
        scaleway = {
            source = "scaleway/scaleway"
            version = "~> 1.17"
        }
    }
}

provider "scaleway" {
    organization_id = "00000000-0000-0000-0000-000000000000"
    zone = "fr-par-1"
    region = "fr-par"
}

This file declares the Scaleway provider together with the information it needs to work. The project is quite simple, so we chose to write the organization_id directly in the code. On the other hand, as with the backend, the API credentials (access_key and secret_key) do not appear here. This sensitive information is stored in variables that are injected on the fly when the pipeline runs.
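The article does not name the variables used for the provider credentials. The Scaleway provider documents the environment variables SCW_ACCESS_KEY and SCW_SECRET_KEY for this purpose, so the sketch below assumes those names. The require_env helper is hypothetical: it simply fails early, with a readable message, when a variable is missing or empty.

```shell
# Hypothetical helper: check that every named variable is set and non-empty.
require_env() {
  missing=0
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage (variable names are an assumption, see above):
#   require_env SCW_ACCESS_KEY SCW_SECRET_KEY && terraform plan
```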

The last file (security-group.tf) is the one containing the declaration of the resource we are interested in: the flow rules, or security group at Scaleway.

resource "scaleway_instance_security_group" "my_instance_security_group" {
    inbound_default_policy = "drop"
    outbound_default_policy = "accept"

    inbound_rule {
        action = "accept"
        ip = "1.2.3.4"
    }

    inbound_rule {
        action = "accept"
        port = 80
    }

    inbound_rule {
        action = "accept"
        port = 443
    }
}

The default policies are declared at the very top. The example then declares three rules: one allows a single IP address on all ports; the other two specify a port but no address, so the whole internet is allowed on those ports.

[Screenshot: pipeline]

As soon as code is pushed, a pipeline with three steps is triggered on GitLab:

  • Validate: runs the terraform validate command to check that the project's syntax is correct.
  • Plan: runs terraform plan to simulate a run and show everything Terraform would do (creations, modifications and destructions).
  • Apply: the last step, which is manual. Once the plan has been reviewed, it can be triggered and runs the terraform apply command.
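The three steps above can be sketched as two shell functions that mirror the split between the automatic part and the manual part. The function names and the TERRAFORM override are assumptions made for this illustration; the terraform commands and the plan file name are the ones used in the .gitlab-ci.yml file.

```shell
# TERRAFORM can be overridden (for example with "echo") to preview the commands.
TERRAFORM=${TERRAFORM:-terraform}
PLAN=${PLAN:-plan.tfplan}

# Automatic part of the pipeline: validate, then plan into a file.
check_and_plan() {
  "$TERRAFORM" init -input=false &&
  "$TERRAFORM" validate &&
  "$TERRAFORM" plan -input=false -out="$PLAN"
}

# Manual part: apply exactly the plan file that was reviewed, nothing else.
apply_reviewed_plan() {
  "$TERRAFORM" apply -input=false "$PLAN"
}
```

Applying the saved plan file, rather than running a fresh plan at apply time, means that what is executed is what was reviewed.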

Adding code

To make this more concrete, let's add an Object Storage resource to the code base and see what happens.

The first step is to create an issue.

[Screenshot: issue creation]

Once the issue is created, the next logical step is to propose a Merge Request or MR. A new git branch allows us to propose our code.

A short aside: in the open source world, proposing a contribution to a project usually means creating a copy (or fork) of the project, working on it and proposing the changes via a Merge Request. This article simplifies things: no fork is needed because we have the rights on the project.

It is in this branch that we will add a new storage.tf file. The file contains the following lines:

resource "scaleway_object_bucket" "test_bucket" {
  name = "scw-area51-bucket"
  acl = "private"
  tags = {
    key = "test"
  }
}

A new scaleway_object_bucket resource is declared. Only its name is required; the other properties are optional.

[Screenshot: MR overview]

The code is pushed to the project, in its own branch. Here is the MR overview. Several tabs are visible at the top: Overview, Commits, Pipelines and Changes. The first one gives an overview of the proposed code. The Changes tab is particularly interesting.

[Screenshot: MR changes]

As the previous screenshot shows, all the proposed changes can be viewed quickly. In the example, a new file is added, containing the Object Storage declaration.

When the code was pushed into the project, a pipeline was automatically triggered.

[Screenshot: pipeline, validate step]

The first step is a validate that checks the correct syntax of the project's terraform files. In our example, the task is successful, so the pipeline can continue its execution with the next step.

[Screenshot: pipeline, plan step]

The step shown above is a plan. It displays a detailed list of the actions to be performed, followed by a summary of those actions: the number of objects to create, modify or destroy.

This summary is very handy. When adding Object Storage, as in the example, there is no reason for terraform to destroy any resource. On the MR overview, a block displays this summary and a link leads directly to the plan job.

[Screenshot: MR overview with the plan report]

The plan reported no errors and showed what it would change in our infrastructure.

The proposed code can be merged into the main branch of the project. This automatically triggers a new pipeline:

[Screenshot: MR overview, pipeline on master]

The pipeline launched the validate and plan jobs. The Terraform plan is exactly the same as the one detailed above. A new, final step is visible in the pipeline, and this one did not start automatically. As explained previously, this is a Continuous Delivery process: we want to keep full control over production deployment.

Here, everything has been validated upstream, so we can apply the modifications.

[Screenshot: pipeline on master, apply step]

Everything went well and our new Object Storage was deployed!

Conclusion

That's the end of this rather long article. We have seen, in broad outline, how such a project works and how to contribute to it.

It is a good basis, but there is still room for improvement. In particular, secret management in GitLab has its limits. It would be better to use a dedicated secrets manager such as Vault. But that's another story...

As it stands, the project remains functional. It allows you to create and manage infrastructures with Terraform. And you, how do you do it?
