Deterministic Pipeline Execution via GitLab CI/CD when: on_success and manual Policies

Orchestrating continuous integration and continuous deployment (CI/CD) pipelines requires a granular understanding of job execution logic to ensure stability, security, and efficiency. Within GitLab CI/CD, the when keyword is the primary mechanism for defining the conditions under which a job enters the execution queue. Among the available values, on_success and manual represent two fundamentally different philosophies of pipeline progression: one driven by automated continuity and the other by human intervention. Understanding how these two values interact is critical for DevOps engineers who must balance the speed of automated testing with the safety of controlled production deployments. Misconfiguring them can break pipelines in two ways: later stages may attempt to execute before their dependencies exist, or sensitive production environments may be left exposed to unauthorized triggers.

The Mechanics of on_success and Automatic Progression

In the standard GitLab CI/CD lifecycle, on_success is the default behavior for jobs within a pipeline. This policy establishes a dependency chain based on the outcome of preceding stages. When a job is configured with when: on_success (or omits when entirely), GitLab evaluates the status of the jobs in all earlier stages and starts the job only if none of them failed. A failed job marked allow_failure: true does not count as a failure for this purpose. Jobs that were skipped, such as an optional manual job that nobody triggered, do not count as failures either, a detail that becomes important in the failure modes discussed later.

The impact of this policy is profound in complex microservices architectures or multi-stage build pipelines. It ensures that a deployment job, for instance, never attempts to pull an artifact that failed to build or a container image that failed a security scan. By enforcing this logical gate, on_success maintains the integrity of the software supply chain. However, the reliance on this default behavior requires a precise understanding of how GitLab interprets stage transitions.
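
The default gate described above can be sketched with a small three-stage pipeline. This is a minimal illustration, not a recommended layout: the job names (build_app, unit_tests, deploy_app) and the echo commands are hypothetical placeholders, and when: on_success is written out only to make the default visible.

```yaml
stages:
  - build
  - test
  - deploy

build_app:
  stage: build
  script:
    - echo "compiling"

unit_tests:
  stage: test
  script:
    - echo "running tests"

deploy_app:
  stage: deploy
  script:
    - echo "deploying"
  # Same as omitting the keyword: runs only if no job in build or test failed
  when: on_success
```

If unit_tests exits with a non-zero status, deploy_app is skipped and the pipeline is marked as failed.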

Comparison of Execution Policies

Policy	Requirement for Execution	Default Behavior	Impact on Pipeline Flow
`on_success`	No job in an earlier stage failed (allowed failures are ignored)	Yes (Standard)	Ensures data integrity and dependency availability
`on_failure`	At least one prior stage job must fail	No	Useful for error reporting and cleanup tasks
`always`	Execution occurs regardless of prior status	No	Necessary for teardown or logging tasks
`manual`	Human interaction is required	No	Pauses pipeline for verification or gatekeeping
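
The following sketch places on_failure and always side by side in a single reporting stage. The job names and commands are assumptions made for illustration; only the when values come from the table above.

```yaml
stages:
  - test
  - report

run_tests:
  stage: test
  script:
    - echo "running tests"

notify_failure:
  stage: report
  script:
    - echo "a job in an earlier stage failed"
  # Runs only when at least one job in an earlier stage failed
  when: on_failure

collect_logs:
  stage: report
  script:
    - echo "archiving logs"
  # Runs whether earlier jobs passed or failed
  when: always
```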

The Functional Nature of manual Jobs

A manual job is a specialized job type that requires an explicit trigger from a user to begin execution. This is implemented in the .gitlab-ci.yml configuration file by adding the when: manual directive. Manual jobs are most frequently employed during the deployment phase of a lifecycle, such as pushing code to a production environment, where automated execution might pose a risk to system stability without human oversight.

When a pipeline starts, manual jobs are not executed automatically. An optional manual job is shown with a manual status and a play button, and the pipeline continues past it as if the job had been skipped. A blocking manual job instead holds the pipeline in a blocked state: the automated testing and linting stages run first, and the pipeline halts at the manual gate until an authorized user intervenes.

Distinction Between Optional and Blocking Manual Jobs

The behavior of a manual job is further categorized by its impact on the overall pipeline status, determined by the allow_failure parameter. This distinction is vital for controlling the flow of the pipeline after the manual gate is passed or if the manual job is ignored.

Optional Manual Jobs

These occur when allow_failure is set to true.
This is the default setting for jobs that define when: manual outside of a rules block.
The status of an optional manual job does not influence the overall success or failure of the pipeline.
A pipeline can be marked as "passed" even if an optional manual job is skipped or fails.

Blocking Manual Jobs

These occur when allow_failure is set to false.
This is the default setting for jobs that define when: manual within a rules block.
The pipeline stops entirely at the stage where the blocking manual job is defined.
Subsequent stages will not execute until the manual job is triggered and completes successfully.
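
The three cases above can be written side by side. This is a sketch under stated assumptions: the job names and echo commands are hypothetical, and the jobs rely on the deploy stage that GitLab provides by default.

```yaml
optional_deploy:
  stage: deploy
  script:
    - echo "optional manual job"
  # Outside rules, allow_failure defaults to true: the pipeline does not wait
  when: manual

blocking_deploy:
  stage: deploy
  script:
    - echo "blocking manual job"
  when: manual
  # Explicitly blocking: later stages wait until this job is run and succeeds
  allow_failure: false

rules_based_deploy:
  stage: deploy
  script:
    - echo "manual job defined through rules"
  rules:
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
      # Inside rules, allow_failure defaults to false, so this job blocks
      when: manual
```

Setting allow_failure explicitly, as blocking_deploy does, avoids depending on which default applies.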

Critical Failure Modes in Stage Sequencing

A common point of failure for developers new to GitLab CI/CD involves the interaction between when: manual and the default on_success policy of subsequent stages. If a pipeline is designed with a manual job in an early stage (e.g., the build stage) and a default job in a later stage (e.g., the deploy stage), the pipeline can fail in ways that are hard to diagnose.

In this scenario, if the build job is set to when: manual, it is skipped by default at the start of the pipeline. Because the deploy job is left with the default on_success policy, GitLab observes that no job in the build stage failed; the build job was simply not run. The on_success condition is therefore satisfied and the deploy job starts automatically. If the deploy job depends on artifacts produced by the build job (via the dependencies keyword), the deployment will fail because the required files do not exist.

Example of a Faulty Configuration

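```yaml
stages:
  - build
  - deploy

build_example:
  stage: build
  script:
    - echo "done" > build.tar.gz
  artifacts:
    expire_in: 2h
    paths:
      - ./*.tar.gz
  # Optional manual job: allow_failure defaults to true, so it is skipped
  when: manual

deploy_example:
  stage: deploy
  # Defaults to when: on_success, which a skipped manual job does not block
  dependencies:
    - build_example
  script:
    - mv build.tar.gz deployed.tar.gz
```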

In the configuration above, the deploy_example job will attempt to run immediately after the pipeline starts because the build_example job was skipped rather than failed. The mv command will fail because build.tar.gz was never created, leading to a broken pipeline for every single commit. To resolve this, developers must ensure that subsequent stages are logically gated or that the manual job is configured to prevent the "skipped" status from being interpreted as a successful prerequisite.
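
One possible correction, sketched below, is to make the manual build job blocking so that the deploy stage cannot start until the build has been triggered and has succeeded. This is only one of several ways to gate the later stage, and it assumes that pausing the whole pipeline at the build stage is acceptable.

```yaml
stages:
  - build
  - deploy

build_example:
  stage: build
  script:
    - echo "done" > build.tar.gz
  artifacts:
    expire_in: 2h
    paths:
      - ./*.tar.gz
  when: manual
  # Blocking: the deploy stage waits until this job is run and succeeds
  allow_failure: false

deploy_example:
  stage: deploy
  dependencies:
    - build_example
  script:
    - mv build.tar.gz deployed.tar.gz
```

With this change, deploy_example runs only after build_example has produced build.tar.gz.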

Advanced Control and Security via Protected Environments

For enterprise-grade deployments, manual jobs must be coupled with security protocols to ensure that only authorized personnel can trigger critical actions. GitLab provides two primary methods for this: manual confirmation and protected environments.

Manual Confirmation

To mitigate the risk of accidental clicks or "fat-finger" errors during high-stakes deployments, GitLab provides the manual_confirmation keyword. When manual_confirmation is paired with when: manual, its value is shown as a confirmation message in a dialog that the user must accept before the job begins. This is particularly useful for sensitive operations such as production deletions or database migrations.
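
A minimal sketch of a confirmed manual job follows. It assumes a GitLab version that supports manual_confirmation and that the keyword takes the confirmation message as a string value. The job name, stage, script, and message text are hypothetical.

```yaml
delete_environment:
  stage: deploy
  script:
    - echo "Deleting the environment"
  when: manual
  # Message shown in the confirmation dialog before the job starts
  manual_confirmation: 'Are you sure you want to delete this environment?'
```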

Protected Environments and Authorization

For more robust security, GitLab offers the ability to protect specific environments. This is available in the Premium and Ultimate tiers and applies to GitLab.com, GitLab Self-Managed, and GitLab Dedicated offerings. By assigning an environment to a manual job, administrators can restrict who is allowed to trigger that job.

Environment Assignment

The job must include an environment block.
Example configuration:
```yaml
deploy_prod:
  stage: deploy
  script:
    - echo "Deploy to production server"
  environment:
    name: production
    url: https://example.com
  when: manual
  rules:
    - if: $CI_COMMIT_BRANCH == $CI_DEFAULT_BRANCH
```

Access Control

Authorized users are defined in the "Allowed to Deploy" list within the protected environments settings.
In public projects, users with Developer, Maintainer, or Owner roles are typically permitted.
In private or internal projects, roles can include Guest, Planner, Reporter, Developer, Maintainer, or Owner.
GitLab administrators maintain global authority to use protected environments regardless of specific list settings.

Blocking Pipeline Flow

When using protected environments with blocking manual jobs (allow_failure: false), the pipeline waits at that job, shown as blocked in the UI, rather than continuing.
The subsequent stages of the pipeline will only execute after an authorized user has triggered the manual job.
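
The combination can be sketched as follows. The sketch assumes that an environment named production has already been protected in the project settings, which cannot be expressed in .gitlab-ci.yml itself. The post_deploy_checks job and its .post stage placement are hypothetical additions used only to show a later stage waiting.

```yaml
deploy_prod:
  stage: deploy
  script:
    - echo "Deploy to production server"
  environment:
    name: production
    url: https://example.com
  when: manual
  # Blocking: later stages wait for an authorized user to run this job
  allow_failure: false

post_deploy_checks:
  stage: .post
  script:
    - echo "Runs only after deploy_prod has been triggered and succeeded"
```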

Variable Management and Manual Job Retries

The ability to inject specific context into a manual job is essential for flexible deployment strategies. GitLab allows users to override or provide custom CI/CD variables at the time of manual execution.

Overriding Variables

When a user interacts with a manual job through the GitLab UI, they can add, modify, or delete CI/CD variables. If a variable provided during the manual run shares a name with a variable already defined in the `.gitlab-ci.yml` file or the project's CI/CD settings, the manual variable takes precedence. It is critical to note that variables overridden through this process are expanded and are not masked, which requires caution when handling sensitive data.
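
As a hedged illustration, the job below declares a default value that a user could override when starting the job manually. The variable name DEPLOY_TARGET, its value, and the job name are hypothetical.

```yaml
deploy_custom:
  stage: deploy
  variables:
    # Default value; a variable with the same name entered at run time wins
    DEPLOY_TARGET: staging
  script:
    - echo "Deploying to $DEPLOY_TARGET"
  when: manual
```

Entering DEPLOY_TARGET with the value production in the manual job form would make the script print "Deploying to production" for that run only.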

The Retry Mechanism

If a manual job fails or needs to be executed with different parameters, GitLab provides two distinct retry pathways from the job details page:

Retry with Same Variables:
Select the Retry icon.
This re-runs the job using the exact same variable set used in the previous attempt.
Retry with Modified Values:
Select the Retry job with modified values option from the dropdown menu.
The interface will prefill the form with the variables from the previous run, allowing the user to efficiently adjust specific values without re-entering the entire set.

Alternative Temporal Control: Delayed Jobs

While `when: manual` relies on human intervention, `when: delayed` provides temporal control without requiring a user. This is useful for jobs that must wait for external systems to stabilize or for specific time windows. A delayed job uses the `start_in` keyword to define the wait period. The value of `start_in` is a period of time, interpreted as seconds when no unit is given; a unit such as minutes, hours, days, or weeks can also be specified explicitly. The timer for a delayed job begins immediately after the previous stage completes.
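
A delayed job might look like the sketch below. The job name, script, and 30 minute delay are illustrative assumptions; only `when: delayed` and `start_in` are taken from the description above.

```yaml
timed_rollout:
  stage: deploy
  script:
    - echo "Rolling out after the delay"
  when: delayed
  # Timer starts when the previous stage completes
  start_in: 30 minutes
```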

Configuration Parameters for Delayed Jobs

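Unit	Valid Example
Seconds (default when no unit is given)	`'5'`
Minutes	`30 minutes`
Hours	`2 hours`
Days	`1 day`
Weeks	`1 week`

The following constraints apply to every `start_in` value: a value with no unit must be surrounded by single quotes, the minimum delay is 1 second, and the maximum delay is 1 week.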

The use of `when: delayed` ensures that a stage does not progress until the timer expires, providing a built-in buffer between CI/CD actions.

Conclusion: Synthesizing Manual and Automated Logic

The strategic implementation of `when: on_success` and `when: manual` determines the reliability and security of a DevOps pipeline. While `on_success` provides the backbone for automated, high-velocity delivery, it requires strict adherence to dependency management to avoid the pitfalls of automated execution on uninitialized artifacts. Conversely, `manual` jobs provide the necessary friction required for production stability, but they must be implemented with a clear understanding of whether they are intended to be optional or blocking. By integrating advanced features such as protected environments, manual confirmation, and variable overrides, organizations can move beyond simple "pass/fail" logic toward a sophisticated, human-in-the-loop deployment model. The ultimate goal is to create a pipeline that is both resilient to failure and controlled enough to prevent unauthorized or accidental changes to the production landscape.