The fundamental architecture of GitHub Actions is designed around the concept of linear step execution within jobs. However, modern continuous integration and continuous deployment (CI/CD) requirements frequently demand repetitive execution patterns—specifically, the ability to iterate over a set of data, a list of files, or a variety of environment configurations. Achieving a "for loop" functionality in GitHub Actions is not as straightforward as in a standard programming language because the YAML syntax of the workflow file does not natively support traditional loop constructs. Instead, developers must employ specific architectural strategies to simulate iterative behavior. These strategies range from leveraging matrix builds for parallel execution to utilizing specialized community actions or delegating the loop logic to shell scripts. Understanding the nuance between these methods is critical for optimizing build times, managing resource allocation, and ensuring that the automation pipeline is both scalable and maintainable.
Matrix Builds as Parallel Iterators
Matrix builds represent the most powerful native method for implementing iterative logic within GitHub Actions. While a traditional for loop in a language like Python or JavaScript executes sequentially, a matrix build functions as a concurrent loop. It allows a developer to define a set of variables, and GitHub Actions automatically generates a unique job for every possible combination of those variables.
The technical mechanism behind a matrix build involves the strategy key within a job definition. By defining a matrix object, the workflow engine expands the job into multiple instances. For example, if a matrix is defined with three different operating systems and three different versions of a runtime like Node.js, the engine creates nine distinct jobs. This is functionally equivalent to a nested for loop: for each OS, iterate through each version of Node.js.
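As a minimal sketch of that expansion, the workflow fragment below defines a 3 x 3 matrix. The job name test, the Node.js versions, and the npm commands are illustrative assumptions rather than settings from a specific project; actions/checkout and actions/setup-node are the standard actions for these steps.

```yaml
jobs:
  test:
    # Assumed job name; one copy of this job runs per matrix combination.
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
        os: [ubuntu-latest, windows-latest, macos-latest]
        node: [18, 20, 22]   # illustrative versions
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node }}
      # Assumes the repository has a package.json with a test script.
      - run: npm ci
      - run: npm test
```

Three values for os and three for node yield nine jobs, and each job reads its own pair through matrix.os and matrix.node.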
The practical impact of this approach is a large reduction in total wall-clock time for testing. Because these jobs run in parallel across different runners, the time it takes to complete the "loop" approaches the duration of the longest single job rather than the sum of all iterations, provided enough runners are available to start every job at once. This is especially valuable for cross-platform compatibility testing, where a web application must be verified on Windows, macOS, and various Linux distributions simultaneously.
The contextual relationship between matrix builds and other looping methods is that matrices are intended for job-level iteration. When the "iteration" requires a separate virtual machine or a clean environment for each item, the matrix is the correct choice. This contrasts with shell-based loops, which are intended for step-level iteration within a single runner.
| Matrix Component | Technical Role | Impact on Workflow |
|---|---|---|
| Strategy | Defines the matrix configuration | Enables the expansion of a single job into many |
| Matrix Variable | Acts as the iterator value | Provides the current item value to the job via ${{ matrix.variable }} |
| Parallelism | Executes jobs concurrently | Drastically reduces total execution time |
| Configuration | Combines multiple dimensions (OS, Version) | Ensures comprehensive environmental coverage |
Dynamic Matrix Generation via Job Outputs
A common challenge in CI/CD is when the list of items to iterate over is not known until the workflow is actually running. This is often the case when looping over a dynamic list of files, such as all .png images in a directory. Since the strategy.matrix field normally requires static values, developers must use a two-job architecture to achieve dynamic looping.
In this pattern, the first job acts as the "generator." This job executes a shell command to identify the target items and then formats that list as a JSON array. For instance, a command like ls *.png | jq -R -s -c 'split("\n")[:-1]' can be used to capture all PNG files and transform them into a JSON string. This string is then passed to the GitHub Actions environment using an output variable.
The second job then consumes this output. By using the fromJson() function within the matrix definition, the workflow converts the JSON string back into a list that the matrix can iterate over. This allows the workflow to dynamically scale based on the content of the repository.
The technical flow for this implementation is as follows:
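- The generator job runs on a runner (e.g., ubuntu-latest).
- It executes a shell command to gather the list.
- It sets a step output by appending to the $GITHUB_OUTPUT file, using the syntax echo "file=$(...)" >> "$GITHUB_OUTPUT". The older echo "::set-output name=file::$(...)" command is deprecated.
- The generator job exposes that step output through its own outputs map so that other jobs can read it.
- The subsequent job declares a dependency on the first job using the needs keyword.
- The matrix is defined as file: ${{ fromJson(needs.list-png-files.outputs.file) }}.
- Each instance of the job can then access the specific file via ${{ matrix.file }}.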
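The flow above can be sketched as a two-job workflow. This is a sketch under stated assumptions: the step id list, the consumer job name process-png, and the echo placeholder are hypothetical, and the sketch assumes at least one .png file exists in the repository root and that jq is available on the runner. It writes the output to $GITHUB_OUTPUT, the current mechanism for step outputs, which replaced the deprecated ::set-output command.

```yaml
jobs:
  list-png-files:
    runs-on: ubuntu-latest
    outputs:
      # Expose the step output at job level so other jobs can read it.
      file: ${{ steps.list.outputs.file }}
    steps:
      - uses: actions/checkout@v4
      - id: list   # hypothetical step id
        run: |
          # Build a compact JSON array of file names, e.g. ["a.png","b.png"].
          echo "file=$(ls *.png | jq -R -s -c 'split("\n")[:-1]')" >> "$GITHUB_OUTPUT"

  process-png:   # hypothetical consumer job
    needs: list-png-files
    runs-on: ubuntu-latest
    strategy:
      matrix:
        file: ${{ fromJson(needs.list-png-files.outputs.file) }}
    steps:
      - uses: actions/checkout@v4
      # Placeholder for the real per-file work; the value is passed through
      # an environment variable instead of being interpolated into the script.
      - env:
          FILE: ${{ matrix.file }}
        run: echo "Processing $FILE"
```

The sketch does not handle an empty result. If no files match, ls exits with an error, so a real workflow would likely need a guard before the matrix job runs.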
This method is the native way to repeat work for each item in a list that is only known at run time while keeping separate job logs and parallel execution for each item.
The Command-Loop Community Action
For simpler use cases where the overhead of creating multiple jobs is unnecessary, community-developed actions provide a way to run a shell command in a loop within a single step. An example of this is the cliffano/command-loop-action. This action abstracts the complexity of shell scripting by providing a declarative way to iterate over a string of items.
The technical implementation involves providing three primary inputs to the action: items, command, and delimiters. The items input is a string containing the values to be iterated over. While the default behavior supports comma and space-separated strings, the action allows for custom delimiters. For example, if the items are separated by colons, the delimiters input can be set to :.
The loop executes the provided shell command for each item in the list. Within the command, the current item is accessed using the $ITEM environment variable. For example, providing the command echo "Count $ITEM" with a list of numbers from 1 to 10 will result in ten sequential echo commands.
The impact for the user is a simplified YAML file. Instead of writing complex bash loops with string manipulation and quote handling, the user simply defines the list and the command. This is particularly useful for lightweight tasks, such as pinging a list of servers or triggering a series of API calls, where the cost of spinning up multiple matrix jobs would be prohibitive in terms of time and resource usage.
The specific configuration for this action is detailed below:
- items: A required string containing the list of items. Example: 1 2 3 4 5 6 7 8 9 10.
- command: A required string containing the shell command to execute. Example: echo "Count $ITEM".
- delimiters: An optional string specifying the characters that separate items. The default covers commas and spaces. Example: |.
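A usage sketch built only from the inputs described above might look as follows. The @main ref and the step names are assumptions; check the action's repository for the release tag or commit SHA to pin in real use.

```yaml
steps:
  - name: Count to ten
    # The @main ref is an assumption; pin a tag or commit SHA in practice.
    uses: cliffano/command-loop-action@main
    with:
      items: 1 2 3 4 5 6 7 8 9 10
      command: echo "Count $ITEM"

  - name: Loop with a custom delimiter
    uses: cliffano/command-loop-action@main
    with:
      items: a|b|c
      command: echo "Item $ITEM"
      # Quoted so YAML reads the pipe as a plain string.
      delimiters: '|'
```

Both steps run on the same runner as the rest of the job, so no additional jobs are created.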
Shell-Based Iteration and Environment Variable Passing
When neither a matrix nor a third-party action is appropriate, the most flexible method for looping is to delegate the logic to a Bash script. This approach is necessary when complex string manipulation or conditional logic is required during the iteration process that exceeds the capabilities of a simple command-loop action.
This method involves defining a list as an environment variable within the workflow step and then calling an external script or an inline bash block to process that list. For example, a variable MY_LIST might be set to a comma-separated string of quoted values: "a","b","c".
The technical challenge in this approach is that environment variables are passed as strings, and Bash does not natively treat a comma-separated string as an array. To resolve this, developers must use string replacement techniques. The syntax ${MY_LIST//,/ } is used in Bash to replace all commas with spaces, which allows a for loop to iterate over the items correctly.
Furthermore, since the items may be enclosed in double quotes, a cleaning process is required to ensure the data is usable. This is often achieved using the sed command. The command sed -e 's/^"//' -e 's/"$//' is used to remove the leading and trailing quotes from each item during the iteration.
The implementation process follows these steps:
- Define the environment variable in the workflow YAML:
```yaml
- name: Print List
  env:
    MY_LIST: '"a","b","c"'
  run: |
    bash .github/scripts/print-list.sh
  shell: bash
```
- Create the bash script (.github/scripts/print-list.sh) with the following logic:
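```bash
#!/bin/bash
# Fail early if the list was not passed in from the workflow step.
if [[ -z "${MY_LIST:-}" ]]; then
  echo "ERROR: Missing env var MY_LIST"
  exit 1
fi

# Replace commas with spaces so the unquoted expansion splits into items.
for i in ${MY_LIST//,/ }
do
  echo "$i"    # item as passed in, still wrapped in double quotes
  # Strip the leading and trailing double quote from the item.
  i_without_quotes=$(sed -e 's/^"//' -e 's/"$//' <<<"$i")
  echo "$i_without_quotes"
done
```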
The impact of this method is total control over the execution environment. Because the loop is happening inside a standard Linux shell, developers can use all the power of the GNU toolchain, including grep, awk, and jq, to process the items. This is the preferred method for complex data transformations where the "loop" is just one part of a larger data processing pipeline.
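As one illustration of using those tools, the step below sketches an inline variant that lets jq do the parsing. It assumes the list is supplied as a valid JSON array, which differs from the quoted comma-separated string used earlier, and that jq is installed on the runner. The step name is hypothetical.

```yaml
- name: Print list with jq
  env:
    # Assumption: the list is a valid JSON array.
    MY_LIST: '["a","b","c"]'
  shell: bash
  run: |
    # jq -r prints one raw (unquoted) element per line.
    jq -r '.[]' <<<"$MY_LIST" | while IFS= read -r item; do
      echo "$item"
    done
```

With real JSON as input, neither the comma replacement nor the sed quote stripping is needed. Items that contain spaces also survive intact, because read consumes whole lines.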
Comparative Analysis of Looping Strategies
Choosing the correct looping mechanism depends on the specific requirements of the workflow, such as the size of the list, the need for parallelism, and the complexity of the operations being performed.
| Strategy | Scope | Execution | Best Use Case | Resource Cost |
|---|---|---|---|---|
| Matrix Build | Job-Level | Parallel | Environmental testing, large datasets | High (Multiple Runners) |
| Dynamic Matrix | Job-Level | Parallel | Filesystem-based iteration (e.g., all PNGs) | High (Multiple Runners) |
| Community Action | Step-Level | Sequential | Simple commands on a static list | Low (Single Runner) |
| Bash Scripting | Step-Level | Sequential | Complex logic, string manipulation | Low (Single Runner) |
From a technical perspective, the Matrix Build is the most "GitHub-native" way to iterate. It integrates directly with the UI, showing each iteration as a separate job with its own status and logs. This makes debugging much easier, as a failure in one specific item of the loop does not necessarily stop the other items from completing.
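Whether the other items keep running after a failure depends on the strategy settings. By default, fail-fast is true, so a failed matrix job cancels the in-progress and queued jobs of the same matrix. The sketch below, with a hypothetical job name and item list, turns that off and also caps concurrency.

```yaml
jobs:
  process:   # hypothetical job name
    runs-on: ubuntu-latest
    strategy:
      # Let the remaining matrix jobs finish when one of them fails.
      fail-fast: false
      # Optional cap on concurrent jobs; the value 2 is arbitrary.
      max-parallel: 2
      matrix:
        item: [a, b, c]
    steps:
      - env:
          ITEM: ${{ matrix.item }}
        run: echo "Processing $ITEM"
```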
In contrast, Step-Level loops (Community Actions and Bash scripts) are "opaque" to the GitHub Actions UI. If a bash loop fails on the 5th item of a 100-item list, the entire step is marked as failed, and the developer must sift through a single large log file to find the specific error. This makes them less suitable for tasks that are prone to intermittent failures.
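The log-sifting problem can be reduced, though not removed, with workflow commands. The following sketch wraps each iteration in a collapsible log group, emits an error annotation for each failing item, and fails the step only after every item has been attempted. The ITEMS list and process-item.sh are hypothetical stand-ins for the real work.

```yaml
- name: Process items with per-item log groups
  shell: bash
  env:
    ITEMS: "a b c"   # hypothetical space-separated list
  run: |
    failed=()
    for item in $ITEMS; do
      echo "::group::Item $item"
      # process-item.sh is a hypothetical script standing in for the real work.
      if ! bash .github/scripts/process-item.sh "$item"; then
        echo "::error::Processing failed for item $item"
        failed+=("$item")
      fi
      echo "::endgroup::"
    done
    # Fail the step only after every item has been attempted.
    if (( ${#failed[@]} > 0 )); then
      echo "Failed items: ${failed[*]}"
      exit 1
    fi
```

This still yields a single step status rather than one status per item, so it improves log readability without matching the per-job visibility of a matrix.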
Conclusion
The implementation of "for loops" in GitHub Actions is a study in adapting to the constraints of YAML-based orchestration. While there is no for keyword in the workflow schema, the platform provides multiple avenues to achieve the same result. Matrix builds offer a high-performance, parallelized approach for environment and configuration testing, while dynamic matrices allow for flexible, data-driven workflows. For those requiring simplicity, community actions like cliffano/command-loop-action provide a streamlined interface for sequential execution. Finally, for the highest level of control, delegating logic to Bash scripts allows for sophisticated string processing and system-level manipulation.
The strategic choice between these methods depends on the trade-off between visibility and resource efficiency. Job-level iteration (Matrix) provides maximum visibility and speed but consumes more runner minutes. Step-level iteration (Bash/Actions) is resource-efficient but offers less granular reporting. By mastering these patterns (parallel matrix expansion, dynamic output consumption, and step-level looping through a community action or shell script), developers can build robust, scalable CI/CD pipelines that handle repetitive tasks with precision and efficiency.