Conditional Execution Strategies in GitHub Actions Using Git Diff

GitHub Actions provides a robust infrastructure for continuous integration and deployment, but it has no built-in construct for running individual jobs or steps based on which files have changed within a repository. The platform excels at triggering workflows on push or pull request events, yet developers often need finer control to keep pipelines efficient. Running comprehensive test suites, linters, or deployment tasks on every commit, regardless of which files were modified, wastes compute and slows feedback. To address this, engineers combine git diff with shell scripting, environment variables, and GitHub Actions expressions so that execution is restricted to relevant code changes.

The Limitations of Native GitHub Actions

GitHub Actions can filter whole workflows by path through the paths and paths-ignore keys of the push and pull_request triggers, but it offers no native way to skip individual jobs or steps based on file changes without external tooling or custom scripting. Once a workflow is triggered by an event such as a push or a pull request, every job it defines runs by default. This becomes inefficient in large monorepos or projects with distinct components, such as documentation, frontend assets, and backend services. For instance, a change to a markdown file in a documentation folder should not trigger a heavy backend integration test suite.

To implement file-based conditional logic, developers must leverage Git itself, which is preinstalled on GitHub-hosted runners. The strategy has three stages: compare the current commit with a previous state (such as the parent commit or the base branch of a pull request), collect the list of modified files, and use that list to decide whether subsequent steps should execute. This approach typically relies on PowerShell Core, Bash, or another shell available on the runner to parse the output of git diff and set step outputs or environment variables.
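The three stages can be sketched in Bash as follows. This is a minimal illustration under stated assumptions: the function name any_path_matches and the sample pattern are hypothetical, and the list of changed files is read from standard input so that the matching logic stays independent of how the diff was produced.

```shell
# Hypothetical helper: succeed (exit 0) when at least one path read from
# standard input matches the extended regular expression given as $1.
any_path_matches() {
  local pattern="$1"
  grep -Eq -- "$pattern"
}

# Intended use inside a workflow step (the parent commit must be fetched):
#   if git diff --name-only HEAD^ HEAD | any_path_matches '^docs/'; then
#     echo "documentation changed"
#   fi
```

Keeping the comparison (git diff) separate from the decision (the pattern match) means the same helper can serve both push and pull request workflows, which differ only in the revisions being compared.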

Implementing Conditional Steps with PowerShell

One common method for implementing conditional steps involves using PowerShell to check for specific file patterns. This approach is particularly useful when the condition depends on the file path or extension, such as detecting changes in documentation files.

The workflow begins by checking out the repository code. Crucially, the fetch-depth parameter must be configured to retrieve enough commit history to perform the diff. If only the latest commit is fetched, which is the default (fetch-depth: 1), git diff cannot compare the current HEAD against the previous commit. Setting fetch-depth: 2 ensures that both the current commit and its parent are available.
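A diff step can also guard against a checkout that is too shallow before it runs the comparison. The sketch below is one possible guard, not something the checkout action provides; the function name has_parent_commit is hypothetical.

```shell
# Hypothetical guard: succeed only when the parent of HEAD exists in the
# local clone, i.e. when "git diff HEAD^ HEAD" can be resolved.
has_parent_commit() {
  git rev-parse --verify --quiet "HEAD^" > /dev/null
}

# Intended use at the top of a diff step:
#   has_parent_commit || { echo "parent commit missing; raise fetch-depth" >&2; exit 1; }
```

Failing early with an explicit message is easier to diagnose than the "unknown revision" error that git diff reports when the parent is absent.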

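```yaml
name: demo
on:
  push:
    branches:
      - 'main'
jobs:
  conditional_step:
    # ubuntu-20.04 hosted runners have been retired, so a current label is used
    runs-on: 'ubuntu-latest'
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 2
      - shell: pwsh
        id: check_file_changed
        run: |
          # List the files that differ between the parent commit and HEAD
          $diff = git diff --name-only HEAD^ HEAD
          # Keep paths under docs/ or ending in .md (the dot is escaped to match literally)
          $SourceDiff = $diff | Where-Object { $_ -match '^docs/' -or $_ -match '\.md$' }
          $HasDiff = $SourceDiff.Length -gt 0
          # Legacy workflow command, deprecated in favor of writing to $env:GITHUB_OUTPUT
          Write-Host "::set-output name=docs_changed::$HasDiff"
      - shell: pwsh
        if: steps.check_file_changed.outputs.docs_changed == 'True'
        run: echo publish
```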

In this example, the step with the id check_file_changed executes a PowerShell script. The command git diff --name-only HEAD^ HEAD retrieves the names of files that differ between the parent commit (HEAD^) and the current commit (HEAD). The script then filters this list using Where-Object to find files matching specific regular expressions, such as those under the docs/ directory or ending with the .md extension. If any matches are found, the variable $HasDiff is set to $true, which PowerShell renders as the string True when it is interpolated into the output.

The result is then passed to subsequent steps using the legacy workflow command Write-Host "::set-output name=docs_changed::$HasDiff". This command sets a step output that later steps can reference in their if conditions. Because ::set-output is deprecated, current workflows would instead append the line docs_changed=$HasDiff to the file named by $env:GITHUB_OUTPUT. The final step runs only if steps.check_file_changed.outputs.docs_changed equals 'True', so publishing or documentation-related tasks execute only when relevant files are modified.
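For comparison, the same check could be written in Bash using the GITHUB_OUTPUT file rather than the legacy command. This is a sketch under stated assumptions: the function name write_docs_changed is hypothetical, and the value written is the lowercase string true or false, so the later if condition would compare against 'true' instead of 'True'.

```shell
# Hypothetical Bash counterpart of the PowerShell step: read changed paths
# from standard input and record docs_changed=true|false in the file named
# by GITHUB_OUTPUT (the runner sets this variable for every step).
write_docs_changed() {
  local changed=false
  # Match paths under docs/ or ending in a literal ".md"
  if grep -Eq -- '^docs/|\.md$'; then
    changed=true
  fi
  echo "docs_changed=$changed" >> "$GITHUB_OUTPUT"
}

# Intended use in a step with "id: check_file_changed":
#   git diff --name-only HEAD^ HEAD | write_docs_changed
# A later step would then test:
#   if: steps.check_file_changed.outputs.docs_changed == 'true'
```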

Pull Request Diffing with Bash and GitHub Context

For pull request workflows, the comparison logic differs slightly because the target is usually the base branch of the pull request rather than the immediate parent commit. Developers frequently use Bash scripts together with GitHub’s context variables to achieve this. github.event.pull_request.base.sha provides the SHA of the base branch commit that the pull request is compared against, while github.sha provides the SHA of the commit being checked out, which for pull_request events is the temporary merge commit GitHub creates for the pull request. Both commits must be present in the local clone, so a deeper checkout than the default is typically needed.

To obtain a list of changed files that are still present (excluding deleted files), the --diff-filter argument can be used. The filter ACMRT includes files that are Added, Copied, Modified, Renamed, or have their type changed. This ensures that the subsequent operations only target files that exist in the current working tree.

```bash
git diff --name-only --diff-filter=ACMRT ${{ github.event.pull_request.base.sha }} ${{ github.sha }}
```

This approach avoids the need for third-party actions that might require repository tokens, leveraging instead the native Git capabilities available in the runner. By combining this with standard shell commands, developers can create lightweight, efficient checks that integrate seamlessly into their existing workflows.
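To make the effect of --diff-filter=ACMRT concrete, the sketch below applies the same selection to the output of git diff --name-status. It is illustrative only: the function name keep_existing_paths and the BASE_SHA and HEAD_SHA variables are hypothetical, and in practice the built-in filter is the simpler choice.

```shell
# Hypothetical re-implementation of "--name-only --diff-filter=ACMRT" over the
# tab-separated output of "git diff --name-status", read from standard input.
# Status codes: A added, C copied, D deleted, M modified, R renamed, T type changed.
# Renames and copies carry a similarity score (e.g. R100) and two paths; the
# last field is the path that exists after the change.
keep_existing_paths() {
  awk -F '\t' '$1 ~ /^[ACMRT]/ { print $NF }'
}

# Intended use:
#   git diff --name-status "$BASE_SHA" "$HEAD_SHA" | keep_existing_paths
```

Deleted entries (status D) are dropped, which is why a linter or formatter fed with this list never receives a path that no longer exists.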

Advanced File Filtering with Grep and Environment Variables

A more complex scenario involves checking for changes in specific directories, such as model files in a Django application, and using that information to branch the test execution strategy. This method uses the GITHUB_ENV file to pass environment variables between steps. GITHUB_ENV superseded the older ::set-env command, while the replacement for the deprecated ::set-output command is the separate GITHUB_OUTPUT file.

In a Python/Django project, it is common to want to run a full test suite only when model files change, while running a lighter subset of tests otherwise. This can be achieved by using git diff to list changed files and grep to count matches for a specific path.

```yaml
name: Run Pytest without Model
on:
  pull_request:
    types: [opened, reopened, synchronize, ready_for_review]
env:
  WORKING_DIRECTORY: application
jobs:
  Setup:
    name: Run Test Code
    # ubuntu-20.04 hosted runners have been retired, so a current label is used
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: application
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: Check for model changes
        run: |
          MODEL_CHANGED=false
          # grep exits 0 when at least one path matches; the printed count is discarded
          # (the original "> 0" wrote the count to a file named 0 instead of comparing it)
          if git diff --name-only origin/${{ github.base_ref }} | grep -c ${{ env.WORKING_DIRECTORY }}/application/models/ > /dev/null; then
            MODEL_CHANGED=true
          fi
          echo "MODEL_CHANGED=$MODEL_CHANGED" >> "$GITHUB_ENV"
      - name: Run Pytest Without Model
        if: env.MODEL_CHANGED == 'false'
        run: |
          set -o pipefail
          poetry run pytest ${{ env.WORKING_DIRECTORY }}/tests/serializers ${{ env.WORKING_DIRECTORY }}/tests/views --junitxml=pytest.xml -x -n auto --cov --no-cov-on-fail --suppress-no-test-exit-code | tee pytest-coverage.txt
      - name: Run Pytest
        if: env.MODEL_CHANGED == 'true'
        run: |
          set -o pipefail
          poetry run pytest --junitxml=pytest.xml -x -n auto --cov --no-cov-on-fail --suppress-no-test-exit-code | tee pytest-coverage.txt
```

The checkout step uses fetch-depth: 0 to retrieve the full history for all branches and tags. This is critical because the subsequent diff command references origin/${{ github.base_ref }}. The github.base_ref context variable contains the name of the base branch of the pull request (e.g., develop). With a shallow checkout, the local clone might not contain the corresponding remote-tracking ref, leading to errors such as fatal: ambiguous argument 'origin': unknown revision or path not in the working tree.

The "Check for model changes" step initializes MODEL_CHANGED to false. It then executes git diff --name-only origin/${{ github.base_ref }}, which compares the checked-out working tree against the tip of the base branch and lists the files that differ. The output is piped to grep -c, which counts the lines matching the pattern ${{ env.WORKING_DIRECTORY }}/application/models/ and exits with status 0 only when that count is non-zero. The if statement tests this exit status rather than the printed number, so MODEL_CHANGED is set to true whenever at least one model file appears in the diff. The value is appended to $GITHUB_ENV, making it available as an environment variable for subsequent steps.

The following steps use the if condition env.MODEL_CHANGED == 'false' or env.MODEL_CHANGED == 'true' to determine which test command to run. This allows for significant optimization of CI resources by avoiding unnecessary full test runs when only non-critical files are modified.
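The same branching can be collapsed into a single step by letting a helper choose the pytest targets. The sketch below is one possible variant, not the original author's method: the function name pytest_targets and the BASE_REF variable are hypothetical, the directory names are taken from the workflow above, and grep -q is used because only the exit status matters.

```shell
# Hypothetical helper: read changed paths from standard input and print the
# pytest targets to run. Directory names are taken from the workflow above.
MODELS_PREFIX="application/application/models/"

pytest_targets() {
  # grep -q stops at the first match and reports the result via its exit status
  if grep -q -- "^${MODELS_PREFIX}"; then
    # A model changed: print nothing so that pytest collects the whole suite
    echo ""
  else
    echo "application/tests/serializers application/tests/views"
  fi
}

# Intended use (the three-dot form diffs against the merge base with the base branch):
#   targets=$(git diff --name-only "origin/${BASE_REF}...HEAD" | pytest_targets)
#   poetry run pytest $targets --junitxml=pytest.xml
```

The three-dot range limits the diff to changes made on the pull request branch, whereas comparing directly against the tip of the base branch also reports files that changed on the base branch after the two diverged.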

Automating Dependency Changes with Diff-Check

Beyond conditional test execution, git diff logic can be employed to enforce correctness in automated workflows, such as those triggered by Dependabot. When a dependency is updated, it may cause changes in generated files, such as Podfile.lock in iOS projects or lock files in other ecosystems. If these generated changes are not committed alongside the dependency bump, the build may fail or require manual intervention.

To prevent this, a custom GitHub Action called "diff-check" can be used. This action runs a specific command (such as an install or build command) and then checks if any files have changed as a result. If files are modified, the action fails, alerting the user that the generated files need to be committed. This automates the detection of side-effects from dependency updates, ensuring that the repository remains in a consistent state and reducing the burden on developers to manually verify lock file changes.
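The core idea can be sketched in a few lines of Bash. This is an illustration of the technique only and does not reproduce the diff-check action's actual code; the function name fail_if_dirty is hypothetical, and pod install merely stands in for whatever command regenerates files.

```shell
# Hypothetical sketch of the idea, not the diff-check action's actual code.
# Reads "git status --porcelain" output from standard input; an empty listing
# means the command left the working tree untouched.
fail_if_dirty() {
  local changes
  changes=$(cat)
  if [ -n "$changes" ]; then
    echo "Generated files changed and must be committed:" >&2
    echo "$changes" >&2
    return 1
  fi
  return 0
}

# Intended use after running the command under test:
#   pod install
#   git status --porcelain | fail_if_dirty
```

Because the function's exit status is the status of the pipeline, a non-empty listing fails the workflow step and surfaces the list of uncommitted files in the log.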

Best Practices and Considerations

Implementing file-based conditional logic in GitHub Actions requires careful attention to several factors:

  • Fetch Depth: Always ensure that fetch-depth is set appropriately. A depth of 1 (the default) is insufficient for diffing against previous commits. A depth of 2 is sufficient for comparing with the parent commit, while 0 (full history for all branches and tags) is the simplest way to make base branches available for comparison in pull requests.
  • Event Context: The availability of certain context variables, such as github.base_ref, is limited to pull_request and pull_request_target events. For push events, developers must rely on HEAD vs HEAD^ or other commit SHAs.
  • Shell Compatibility: Choose the appropriate shell (bash, pwsh, sh) based on the runner’s operating system and the complexity of the logic required. PowerShell offers powerful filtering capabilities, while Bash with grep is lightweight and universal.
  • Environment Variables vs. Outputs: Modern workflows should write to the GITHUB_OUTPUT file to pass step outputs and to the GITHUB_ENV file to pass environment variables between steps, as the legacy ::set-output command is deprecated and may be removed in future versions of the runner.
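The Event Context consideration can be captured in a small helper that picks the revisions to compare from the triggering event. This is a sketch covering only push and pull_request events; the function name diff_range is hypothetical, while GITHUB_EVENT_NAME and GITHUB_BASE_REF are default environment variables set by the runner.

```shell
# Hypothetical helper: print the git diff revision arguments suited to the
# triggering event. $1 is the event name (GITHUB_EVENT_NAME on a runner) and
# $2 is the base branch (GITHUB_BASE_REF, empty outside pull request events).
diff_range() {
  local event="$1" base_ref="$2"
  case "$event" in
    pull_request)
      # Three-dot form: changes since the merge base with the base branch
      echo "origin/${base_ref}...HEAD"
      ;;
    push)
      # Compare HEAD with its first parent
      echo "HEAD^ HEAD"
      ;;
    *)
      echo "unsupported event: $event" >&2
      return 1
      ;;
  esac
}

# Intended use:
#   git diff --name-only $(diff_range "$GITHUB_EVENT_NAME" "$GITHUB_BASE_REF")
```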

By leveraging these techniques, development teams can create highly efficient, responsive, and maintainable CI/CD pipelines that adapt dynamically to the nature of the code changes.

Conclusion

The inability of GitHub Actions to natively filter jobs by changed files is a significant hurdle for large-scale projects, but it is one that can be effectively circumvented using standard Git tools. By combining git diff with shell scripting, developers can implement sophisticated conditional logic that restricts workflow execution to relevant code changes. Whether using PowerShell to detect documentation updates, Bash with grep to filter model changes in a Django application, or custom actions to enforce lock file consistency, these strategies provide a robust framework for optimizing CI/CD pipelines. As repositories grow in complexity, the ability to selectively execute tests and builds based on file changes becomes not just a convenience, but a necessity for maintaining rapid feedback loops and efficient resource utilization.

