Kelvin Tay

Many CI/CD providers, like GitHub Actions and CircleCI, offer the option to run your CI/CD jobs using Docker images.

This is a useful feature since you can ensure your job always runs with the same pre-installed dependencies. Teams may choose to bring their own Docker image, or use readily available community images from Docker Hub, for instance.

One challenge is that the Docker image in use may not have been intended for CI/CD automation. Your team may thus find itself debugging puzzles like:

  • I'm pretty sure XYZ is installed. Why does the CI/CD job fail to find XYZ?
  • Why is the FOOBAR environment variable different from what we have defined in the Docker image?

Docker image 101

When you execute the docker container run ... command, Docker will, by default, run a Docker container as a process based on the ENTRYPOINT and CMD definitions of your image. Docker will also load the environment variables declared in the ENV definitions.

Your ENTRYPOINT and CMD may be defined to run a long-running process (e.g., web application), or a short process (e.g., running Speccy to validate your OpenAPI spec). This will depend on the intended use of your image.
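
As a sketch, here is a minimal, hypothetical Dockerfile showing all three definitions (the base image and tool choices are illustrative):

```dockerfile
FROM alpine:3.19

# ENV values are baked into the image and loaded for every container
ENV APP_ENV=production

RUN apk add --no-cache curl

# ENTRYPOINT + CMD together form the default process:
# here, a short-lived process that prints the installed curl version
ENTRYPOINT ["curl"]
CMD ["--version"]
```

Running `docker container run --rm <image>` would then execute `curl --version` with `APP_ENV=production` in its environment; arguments passed after the image name replace only the CMD portion.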

In addition, Docker images are designed to come with just enough tools to serve their intended purpose. For example, the wework/speccy image understandably does not come with git or curl installed (see its Dockerfile).

Docker images may also be published for specific OS architectures only (e.g., linux/amd64). You will want to confirm which OS and architecture the image can run on.
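
For example, you can read the OS and architecture off the image locally (the jq filter below assumes the standard `docker image inspect` output shape):

```shell
# check which OS/architecture the image was built for
docker image inspect docker.io/amazon/aws-glue-libs:glue_libs_2.0.0_image_01 \
  | jq -r '.[0] | "\(.Os)/\(.Architecture)"'
```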

These are important considerations when designing CI/CD jobs that use Docker images.

Understanding Docker images for CI/CD

Generally, for CI/CD automation, your job will run a series of shell commands in the build environment.

CI/CD providers like GitLab CI and CircleCI achieve this by overriding your Docker image's entrypoint with /bin/sh or /bin/bash when executing it as a container.

This is why you would want to use the -debug tag variant of Kaniko's Docker image when using it on CircleCI, for instance: it is the variant that ships with a shell.
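
You can reproduce this override locally to check whether an image would behave under such a CI/CD provider. This is a sketch, assuming the image ships with /bin/sh:

```shell
# mimic CI/CD providers: ignore the image's ENTRYPOINT/CMD
# and run arbitrary shell commands instead
docker container run --rm \
  --entrypoint /bin/sh \
  docker.io/amazon/aws-glue-libs:glue_libs_2.0.0_image_01 \
  -c "git --version && python3 --version"
```

If this fails with an error like "no such file or directory", the image has no shell at that path, and a CI/CD provider that forces a shell entrypoint will fail the same way.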

Additionally, your Docker image may not come with the tools required for your CI/CD automation. For example, you would need git in order to clone the repository as part of the CI/CD job steps.

Debug Cheatsheet

With this information in mind, here is a list of commands you can run locally to debug your chosen Docker image.

# inspect the "built-in" environment variables of an image
$ docker image inspect docker.io/amazon/aws-glue-libs:glue_libs_2.0.0_image_01 | jq ".[0].Config.Env"
[
  "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
  "LANG=en_US.UTF-8",
  "PYSPARK_PYTHON=python3",
  "SPARK_HOME=/home/glue_user/spark",
  "SPARK_CONF_DIR=/home/glue_user/spark/conf",
  "PYTHONPATH=/home/glue_user/aws-glue-libs/PyGlue.zip:/home/glue_user/spark/python/lib/py4j-0.10.7-src.zip:/home/glue_user/spark/python/",
  "PYSPARK_PYTHON_DRIVER=python3",
  "HADOOP_CONF_DIR=/home/glue_user/spark/conf"
]

# check the default entrypoint
$ docker image inspect docker.io/amazon/aws-glue-libs:glue_libs_2.0.0_image_01 | jq ".[0].Config.Entrypoint"
[
  "bash",
  "-lc"
]

# check the default cmd
$ docker image inspect docker.io/amazon/aws-glue-libs:glue_libs_2.0.0_image_01 | jq ".[0].Config.Cmd" 
[
  "pyspark"
]

# check tools installed
# (the quoted string is passed as one argument to the image's "bash -lc" entrypoint)
$ docker container run --rm docker.io/amazon/aws-glue-libs:glue_libs_2.0.0_image_01 "git --version"
...
git version 2.37.1

$ docker container run --rm docker.io/amazon/aws-glue-libs:glue_libs_2.0.0_image_01 "python --version"
...
Python 2.7.18

You can also find an example of this debugging for a CircleCI use-case here: https://github.com/kelvintaywl-cci/docker-executor-explore/blob/main/.circleci/config.yml

#docker #cicd #debug #cheatsheet

Preface

In a CircleCI pipeline, workflows run independently of one another. As such, there is no built-in feature to ensure workflow B runs after workflow A.

However, you can still achieve ordering, through some trickery.

💡 You can control the sequence of jobs within a workflow. I recommend you first consider whether you can merge workflow B into workflow A itself. This article is for cases where you truly require separate workflows.

How

To achieve ordering, we simply set an approval job as the first job for workflow B.

# contrived snippet of a .circleci/config.yml

workflows:
  aaa:
    jobs:
      - one
      - two
  bbb:
    jobs:
      - start:
          type: approval
      - next:
          requires:
            - start

Subsequent jobs in workflow B will only run when the approval job is approved. As such, you can “force” a wait, and only approve this job when workflow A is completed.

Note that this requires manual intervention, of course.

However, a benefit of this approach is that your team can take the time to confirm the outcomes of workflow A. For example, workflow A may have deployed some infrastructure changes (e.g., terraform apply) that you prefer to inspect before running workflow B.

One Step Further

You can automate this approval, at the end of workflow A, via the Approve a job API.

Specifically, you would need to create a job that does the following:

  1. Find workflow B's ID from the current pipeline.
  2. Find the approval job's ID from the invoked workflow B.
  3. Approve the job.

jobs:
  ...
  approve-workflow:
    parameters:
      workflow-name:
        type: string
        description: workflow name
      job-name:
        type: string
        description: name of approval job in workflow
    docker:
      - image: cimg/base:current
    steps:
      - run:
          name: Find Workflow ID for << parameters.workflow-name >>
          command: |
            curl -H "Circle-Token: $CIRCLE_TOKEN" https://circleci.com/api/v2/pipeline/<< pipeline.id >>/workflow > workflows.json
            WORKFLOW_ID=$(jq -r '.items | map(select(.name == "<< parameters.workflow-name >>")) | .[0].id' workflows.json)
            echo "export WORKFLOW_ID='${WORKFLOW_ID}'" >> $BASH_ENV
      - run:
          name: Find Job ID for << parameters.job-name >>
          command: |
            curl -H "Circle-Token: $CIRCLE_TOKEN" "https://circleci.com/api/v2/workflow/${WORKFLOW_ID}/job" > jobs.json
            APPROVAL_JOB_ID=$(jq -r '.items | map(select(.name == "<< parameters.job-name >>" and .type == "approval")) | .[0].id' jobs.json)
            echo "export APPROVAL_JOB_ID='${APPROVAL_JOB_ID}'" >> $BASH_ENV
      - run:
          name: Approve job
          command: |
            curl -X POST -H "Circle-Token: $CIRCLE_TOKEN" "https://circleci.com/api/v2/workflow/${WORKFLOW_ID}/approve/${APPROVAL_JOB_ID}" | jq .
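
Workflow A can then invoke this job as its last step. A hypothetical wiring (the job and workflow names are illustrative) could look like:

```yaml
workflows:
  aaa:
    jobs:
      - one
      - two
      - approve-workflow:
          # unblock workflow bbb once aaa's own jobs are done
          requires:
            - two
          workflow-name: bbb
          job-name: start
```

Note that the job assumes a CircleCI API token is exposed as $CIRCLE_TOKEN in its environment.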

In the spirit of sharing, I have created a CircleCI Orb that codifies the above job for your convenience.

https://circleci.com/developer/orbs/orb/kelvintaywl/control-flow

I hope this article and the Orb will be useful. Keep on building, folks!

#circleci #cicd #workflow

Before Diving in

This is an attempt to explain and explore how teams can use Docker Buildx for delivering Docker images.

Since we will not be covering all features around Docker Buildx, this is a wide snorkel rather than a deep dive.

This is a quick article for developers who have yet to use Docker Buildx but are curious about its use cases.

What is Docker Buildx?

Let's take a few steps back before plunging in.

We use Docker Build to build Docker images from Dockerfiles.

Since Docker 18.09, BuildKit has been available as an improved version of the previous builder. For example, we can mount secrets when building our images with BuildKit, and BuildKit ensures that these secrets are not exposed within the built image's layers.
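
As a sketch, a secret can be mounted only for the duration of a single RUN instruction, so it never lands in a layer (the secret id and file names below are illustrative):

```dockerfile
# syntax=docker/dockerfile:1
FROM node:20-alpine
WORKDIR /app
COPY package.json .
# the token is readable at /run/secrets/npm_token only while this RUN executes;
# it is not persisted into the resulting image layer
RUN --mount=type=secret,id=npm_token \
    NPM_TOKEN="$(cat /run/secrets/npm_token)" npm install
```

You would then build with something like `DOCKER_BUILDKIT=1 docker build --secret id=npm_token,src=./npm_token.txt .`.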

Buildx builds (no pun intended) on top of BuildKit. It comes with more operations besides image-building, as you can see from its available commands. Importantly, Buildx provides features for caching and cross-platform image builds.

Why should we use Docker Buildx?

For software teams shipping Docker images often, Docker Buildx can be an important tool in the box.

Caching image layers ensures that subsequent rebuilds of the image are faster.

Previously, teams would need machines on different platforms to build images for each platform. For example, we would need an ARM64 machine to build a Docker image for ARM64 architectures.

With Docker Buildx's cross-platform feature, we can now use the same AMD64 machine to build both AMD64 and ARM64 Docker images.

Why is it relevant in CI/CD?

Many teams are building Docker images as part of their CI/CD pipelines. Hence, they can lean on the build cache and cross-platform capabilities of Docker Buildx to build various images faster and cheaper.

Let's discuss the two mentioned features a little deeper.

Caching

This pertains to the cache-from and cache-to options with the docker buildx build command.

Docker Buildx allows you to choose your caching strategy (e.g., inline, local, registry), and each comes with its pros and cons.

Your choice will depend largely on your team's philosophy and the CI/CD provider.

For example, you can leverage GitHub's Cache service when running Docker Buildx on GitHub Actions.
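
As a sketch, the registry-backed strategy looks like this (the image names are illustrative); on GitHub Actions, you could swap type=registry for type=gha to use GitHub's Cache service:

```shell
# reuse layers cached in a registry; mode=max also caches intermediate stages
docker buildx build \
  --cache-from type=registry,ref=registry.example.com/myapp:buildcache \
  --cache-to type=registry,ref=registry.example.com/myapp:buildcache,mode=max \
  --tag registry.example.com/myapp:latest \
  --push .
```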

For CircleCI users, you may find my exploratory project here useful.

Cross-platform

Without Buildx, you would need an ARM64-based machine runner to build an ARM64 Docker image in your CI/CD pipeline.

Depending on your CI/CD provider, there may not be ARM64 support.

This can be worked around if your CI/CD provider allows you to “bring your own runners” (also known as self-hosted runners). GitHub Actions and CircleCI support self-hosted runners. However, it does mean someone in your team now has to manage these runners on your infrastructure.

With Docker Buildx, we can now build cross-platform images on any machine runner, regardless of its architecture.
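
A sketch of a cross-platform build from a single AMD64 machine (the image names are illustrative):

```shell
# register QEMU emulators for non-native architectures
docker run --privileged --rm tonistiigi/binfmt --install all

# create and select a builder that supports multi-platform builds
docker buildx create --name multiarch --use

# build and push one multi-arch image covering AMD64 and ARM64
docker buildx build \
  --platform linux/amd64,linux/arm64 \
  --tag registry.example.com/myapp:latest \
  --push .
```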

This can be a big win for teams that prefer not to own additional infrastructure.

Resurfacing to Shore

We have explored the appeal of Docker Buildx, particularly in a CI/CD context. As mentioned, it is ultimately a tool. For teams building Docker images in their CI/CD pipelines, I encourage you to look into Docker Buildx if you have not!

#docker #buildx #cicd #performance
