---
name: github-actions
description: "GitHub Actions best practices and debugging workflows. Use when the user asks to: (1) Debug a failing GitHub Actions workflow, (2) Create a new CI/CD pipeline, (3) Fix flaky tests in CI, (4) Speed up slow workflows, (5) Audit an existing workflow for security or performance issues, or (6) Pass data between jobs. NOT for: learning GitHub Actions from scratch, complex multi-repo monorepo setups, self-hosted runner configuration, or GitHub Enterprise-specific features."
---

# GitHub Actions

Help users build, debug, and maintain GitHub Actions workflows using patterns professionals use — not beginner tutorials.

## Critical First Steps

When a user reports a failing workflow, immediately run these commands. Do not send them to the browser UI.
```bash
gh run list --limit 10
gh run view <run-id> --log-failed
```

When creating or modifying a workflow, validate before committing:
```bash
actionlint .github/workflows/
act -l
```

NEVER commit a workflow that hasn't passed actionlint.

## Core Rules

**Logs:** Always use `gh run view --log-failed`. Never send the user to the browser UI.

**Validation:** Always run `actionlint` before committing. It catches type errors, undefined context references, and expression mistakes that `act` won't catch until runtime.

**SHA pinning:** Always pin third-party actions to a full commit SHA, never a mutable tag. Use `gh api` to resolve the SHA for a specific release tag — never guess one.

**⚠️ Note:** The SHAs in examples below are snapshots and will become outdated. Always resolve current SHAs via:
```bash
# Resolve a release tag to its commit SHA (also dereferences annotated tags)
gh api repos/actions/checkout/commits/v4.1.1 --jq '.sha'
```

```yaml
# Bad — mutable tag, can change under you
- uses: actions/checkout@v4

# Good — immutable SHA with human-readable comment
- uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
```

**Caching:** Always include cache config when adding a language setup step. Uncached workflows waste minutes re-downloading dependencies on every run.
```yaml
- uses: actions/setup-node@60edb5dd545a775178f52524783378180af0d1f8 # v4.0.2
  with:
    cache: npm
```

**Concurrency:** Add concurrency groups to test and build workflows. Never use `cancel-in-progress: true` on deploy workflows — canceling mid-deploy causes partial state.
```yaml
# Tests and builds — cancel stale runs
concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

# Deploys — queue, never cancel
concurrency:
  group: deploy-${{ github.ref }}
  cancel-in-progress: false
```

**Manual triggers:** Always add `workflow_dispatch` to production workflows so users can run them without pushing a commit.
```yaml
on:
  push:
    branches: [main]
  pull_request:
  workflow_dispatch:
```

**Timeouts:** Always set an explicit `timeout-minutes`. The default is 6 hours — a stuck job burns through your minutes budget silently.
```yaml
jobs:
  test:
    timeout-minutes: 15
```

**Permissions:** Use least-privilege `permissions` blocks. GitHub tightened the default `GITHUB_TOKEN` scope — many workflows that worked before now fail with 403s because they assume write access they no longer have.
```yaml
permissions:
  contents: read
  pull-requests: write
```

Only grant what the workflow actually needs. If a workflow doesn't interact with PRs, don't give it `pull-requests: write`.
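
As a rough sketch of how that scoping can look, the fragment below sets a read-only default at the workflow level and raises permissions only on the one job that touches PRs. The job names (`test`, `label`) and the label text are illustrative assumptions, not taken from any particular repo.

```yaml
# Workflow-level default: every job gets read-only access to repo contents.
# Scopes not listed in a permissions block are set to none.
permissions:
  contents: read

jobs:
  test:  # hypothetical job; inherits the read-only default
    runs-on: ubuntu-latest
    timeout-minutes: 15
    steps:
      - run: echo "runs with a read-only token"

  label:  # hypothetical job; the only one that needs PR write access
    runs-on: ubuntu-latest
    timeout-minutes: 5
    permissions:
      pull-requests: write  # job-level block replaces the workflow-level one, so contents is none here
    steps:
      - run: gh pr edit "$PR_NUMBER" --add-label "needs-review"
        env:
          GH_TOKEN: ${{ github.token }}
          GH_REPO: ${{ github.repository }}
          PR_NUMBER: ${{ github.event.pull_request.number }}
```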

**Composite actions:** Before writing inline steps, check whether the org already shares composite actions, typically in a dedicated actions repo or in a `.github/actions/` directory of the current repo. Prefer those over copy-pasting.
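
For orientation, a composite action is a directory containing an `action.yml`. The sketch below assumes a hypothetical path `.github/actions/setup-project/action.yml` and a hypothetical `node-version` input; adapt the names to whatever the org actually ships.

```yaml
# .github/actions/setup-project/action.yml (hypothetical path and name)
name: Setup project
description: Install Node with npm caching, then install dependencies
inputs:
  node-version:  # hypothetical input
    description: Node.js version to install
    required: false
    default: "20"
runs:
  using: composite
  steps:
    - uses: actions/setup-node@60edb5dd545a775178f52524783378180af0d1f8 # v4.0.2
      with:
        node-version: ${{ inputs.node-version }}
        cache: npm
    - run: npm ci
      shell: bash  # run steps inside a composite action must declare a shell
```

A workflow in the same repo would call it after checkout with `uses: ./.github/actions/setup-project`.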

**Reusable workflows:** For workflows shared across repos, use `workflow_call`:
```yaml
# .github/workflows/reusable-deploy.yml
on:
  workflow_call:
    inputs:
      environment:
        required: true
        type: string

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - run: echo "Deploying to ${{ inputs.environment }}"

# Caller workflow in another repo
jobs:
  prod-deploy:
    uses: org/repo/.github/workflows/reusable-deploy.yml@main
    with:
      environment: production
```

Prefer reusable workflows over duplicating the same workflow across repos.

**Dependabot:** If a repo doesn't have `.github/dependabot.yml` with `github-actions` ecosystem configured, suggest adding it. This keeps action SHAs current without manual tracking.
```yaml
version: 2
updates:
  - package-ecosystem: github-actions
    directory: /
    schedule:
      interval: weekly
```

## Matrix Strategy

Prefer matrix over duplicate jobs. One job definition, multiple configurations.
```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        node-version: [18, 20, 22]
      fail-fast: false  # Don't cancel other versions if one fails
    steps:
      - uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
      - uses: actions/setup-node@60edb5dd545a775178f52524783378180af0d1f8 # v4.0.2
        with:
          node-version: ${{ matrix.node-version }}
          cache: npm
      - run: npm ci
      - run: npm test
```

Use `fail-fast: false` when you need to know which versions break. Use `fail-fast: true` (the default) when any failure means the PR is broken regardless.

For cross-platform testing, combine `os` and version in the matrix:
```yaml
strategy:
  matrix:
    os: [ubuntu-latest, windows-latest, macos-latest]
    node-version: [18, 20]
```
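
The `os` axis only takes effect once `runs-on` reads from it. A minimal sketch of the full job follows; the `exclude` entry and the 20-minute timeout are added purely as illustrations.

```yaml
jobs:
  test:
    runs-on: ${{ matrix.os }}  # without this, every combination runs on the same OS
    timeout-minutes: 20
    strategy:
      fail-fast: false
      matrix:
        os: [ubuntu-latest, windows-latest, macos-latest]
        node-version: [18, 20]
        exclude:  # illustrative: drop one combination you don't need
          - os: macos-latest
            node-version: 18
    steps:
      - uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
      - uses: actions/setup-node@60edb5dd545a775178f52524783378180af0d1f8 # v4.0.2
        with:
          node-version: ${{ matrix.node-version }}
          cache: npm
      - run: npm ci
      - run: npm test
```

With the exclusion, this expands to five jobs (three operating systems times two versions, minus one).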

## Passing Data Between Jobs

Jobs run on separate runners. They don't share a filesystem.

**For files** — use artifacts:
```yaml
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - run: npm run build
      - uses: actions/upload-artifact@65c4c4a1ddee5b72f698fdd19549f0f0fb45cf08 # v4.6.0
        with:
          name: dist
          path: dist/
          retention-days: 1  # Short retention for ephemeral build artifacts; use 90 for audit trails
  deploy:
    needs: build
    runs-on: ubuntu-latest
    steps:
      - uses: actions/download-artifact@65a9edc5881444af0b9093a5e628f2fe47ea3b2e # v4.1.7
        with:
          name: dist
          path: dist/
      - run: ./deploy.sh
```

**For values** — use job outputs:
```yaml
jobs:
  version:
    runs-on: ubuntu-latest
    outputs:
      tag: ${{ steps.get.outputs.tag }}
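    steps:
      - uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
        with:
          fetch-depth: 0  # Full history; the default shallow clone has no tags for git describe
      - id: get
        run: echo "tag=$(git describe --tags)" >> "$GITHUB_OUTPUT"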

  deploy:
    needs: version
    runs-on: ubuntu-latest
    steps:
      - run: echo "Deploying ${{ needs.version.outputs.tag }}"
```

Never use the deprecated `::set-output` workflow command. Always write to `$GITHUB_OUTPUT`.

## Environment Protection and Deployment Gates

The `environment:` key triggers GitHub's deployment protection rules:
```yaml
jobs:
  deploy:
    runs-on: ubuntu-latest
    environment: production  # Triggers required reviewers, wait timers, deployment branches
    steps:
      - run: ./deploy.sh
```

When you specify an environment, the workflow will:
- Wait for required reviewers to approve
- Gain access to that environment's secrets, but only after the protection rules pass
- Respect deployment branch restrictions
- Apply any configured wait timers

If your deploy workflow hangs waiting for approval, check the environment's protection rules in repo settings.
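
Putting the deploy-related rules together, a deploy job might look like the sketch below. The `url` value and `deploy.sh` are placeholders, and the environment named `production` is assumed to already exist in repo settings with its protection rules configured.

```yaml
concurrency:
  group: deploy-${{ github.ref }}
  cancel-in-progress: false  # queue deploys, never cancel mid-deploy

jobs:
  deploy:
    runs-on: ubuntu-latest
    timeout-minutes: 30
    environment:
      name: production
      url: https://example.com  # placeholder; shown on the deployment in the GitHub UI
    steps:
      - uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
      - run: ./deploy.sh  # placeholder script
```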

## Security: Untrusted Code Triggers

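**⚠️ Critical:** `pull_request_target` and `workflow_run` run in the context of the base repository, not the PR, so they can have access to secrets and a write-capable `GITHUB_TOKEN` even when a fork triggered them. If such a workflow checks out and executes the PR's code, that untrusted code can read those secrets and use that token.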

```yaml
# DANGEROUS — untrusted PR code runs with GITHUB_TOKEN write access
on:
  pull_request_target:

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
        with:
          ref: ${{ github.event.pull_request.head.sha }}  # Checking out PR code
      - run: npm test  # PR could contain malicious package.json scripts
```

**Safe pattern:**
```yaml
# Safe: no PR code is checked out or executed; the job only calls the GitHub API
on:
  pull_request_target:

jobs:
  label:
    runs-on: ubuntu-latest
    permissions:
      pull-requests: write  # Only what's needed
    steps:
      - run: gh pr edit "$PR_NUMBER" --add-label "needs-review"
        env:
          GH_TOKEN: ${{ github.token }}
          GH_REPO: ${{ github.repository }}  # No checkout, so tell gh which repo to target
          PR_NUMBER: ${{ github.event.pull_request.number }}
```

Use `pull_request_target` only when you need write access to the base repo (labels, comments, checks). For running tests on PR code, use `pull_request` (read-only, safe).
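
For comparison, a test workflow on the plain `pull_request` trigger could be sketched as follows. The job layout mirrors the Node examples elsewhere in this document and is an assumption about the project, not a requirement.

```yaml
on:
  pull_request:

permissions:
  contents: read  # read-only token, no write scopes

jobs:
  test:
    runs-on: ubuntu-latest
    timeout-minutes: 15
    steps:
      - uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
      - uses: actions/setup-node@60edb5dd545a775178f52524783378180af0d1f8 # v4.0.2
        with:
          cache: npm
      - run: npm ci
      - run: npm test  # PR code runs here; fork PRs get no secrets and a read-only token
```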

## Expression Syntax Gotchas

GitHub Actions expression syntax is JavaScript-like but not JavaScript.

**At the step level** (where env vars are mapped):
```yaml
steps:
  - run: echo "deploying"
    if: env.DEPLOY == 'true'  # Works here because env is in scope
    env:
      DEPLOY: ${{ secrets.DEPLOY_FLAG }}
```

**At the job level** (before env vars exist):
```yaml
jobs:
  deploy:
    if: github.event_name == 'push'  # Can't use secrets.X or env.X here
    runs-on: ubuntu-latest
```

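The `secrets` context cannot be used directly in any `if:` condition, and the `env` context is not available in a job-level `if:`. To gate on a secret, map it to an environment variable with `env:` and test that variable in a step-level `if:`.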
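
One common way to gate on a secret is to map it to a job-level `env` and test it per step. The sketch below reuses the `DEPLOY_FLAG` secret name from the step-level example above; the job layout and step names are illustrative.

```yaml
jobs:
  deploy:
    runs-on: ubuntu-latest
    timeout-minutes: 15
    env:
      DEPLOY: ${{ secrets.DEPLOY_FLAG }}  # empty string if the secret is not set
    steps:
      - name: Deploy only when the flag is exactly 'true'
        if: env.DEPLOY == 'true'
        run: echo "deploying"
      - name: Explain the skip
        if: env.DEPLOY != 'true'
        run: echo "DEPLOY_FLAG is not 'true'; skipping deploy"
```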

Other gotchas:
```yaml
# WRONG: truthiness check; any non-empty string, including 'false', passes
if: env.DEPLOY

# RIGHT: explicit comparison; an unset variable evaluates to '' and fails the check
if: env.DEPLOY == 'true'
```

Always use explicit equality checks. Never rely on truthy/falsy for strings. Use `always()`, `success()`, `failure()` for status checks — `always()` runs even if a previous step was cancelled.
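
A short sketch of the status functions in use. The `logs/` directory and `cleanup.sh` script are hypothetical; steps without an `if:` behave as if they had `if: success()`.

```yaml
steps:
  - run: npm test
  - name: Upload logs only when an earlier step failed
    if: failure()
    uses: actions/upload-artifact@65c4c4a1ddee5b72f698fdd19549f0f0fb45cf08 # v4.6.0
    with:
      name: test-logs
      path: logs/  # hypothetical log directory
      retention-days: 1
  - name: Clean up regardless of outcome, including cancellation
    if: always()
    run: ./cleanup.sh  # hypothetical script
```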

## Debugging Patterns

### Tests pass locally, fail in CI

1. Check runtime version mismatch (Node, Python, Ruby)
2. Check for missing environment variables
3. Check for missing system dependencies (libxml, imagemagick, etc.)
4. Check for timing issues (tests assuming fast disk or network)
5. Check for filesystem case sensitivity (macOS is case-insensitive, Linux isn't)
6. Reproduce locally: `act -j <job-name>`
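
To rule out the first item quickly, one option is to pin the runtime from the same file developers use locally and print versions at the top of the job. This sketch assumes the repo has an `.nvmrc` file; the step name is illustrative.

```yaml
steps:
  - uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
  - uses: actions/setup-node@60edb5dd545a775178f52524783378180af0d1f8 # v4.0.2
    with:
      node-version-file: .nvmrc  # assumes this file exists in the repo
      cache: npm
  - name: Print versions for comparison with the local machine
    run: |
      node --version
      npm --version
```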

### Workflow times out

Likely causes: infinite loop, script waiting for user input, network request with no timeout, `npm install` hitting a registry outage.
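
Step-level `timeout-minutes` can narrow down which step hangs and fail it early. A minimal sketch, with illustrative limits that should be tuned to the project:

```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    timeout-minutes: 15  # hard ceiling for the whole job
    steps:
      - uses: actions/checkout@b4ffde65f46336ab88eb53be808477a3936bae11 # v4.1.1
      - run: npm ci
        timeout-minutes: 5  # fail fast if the registry hangs
      - run: npm test
        timeout-minutes: 10  # fail fast on an infinite loop or a prompt waiting for input
```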

### Secrets not working
```yaml
# WRONG — secrets are not in env automatically
- run: echo $API_KEY

# RIGHT — explicitly map secrets to env vars
- run: echo $API_KEY
  env:
    API_KEY: ${{ secrets.API_KEY }}
```

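Secret names are not case-sensitive: `secrets.api_key` and `secrets.API_KEY` resolve to the same secret. The environment variable you map it to is case-sensitive in the shell, so `$api_key` and `$API_KEY` are different. Secrets are also redacted from logs; if your secret appears as `***`, masking is working correctly. Workflows triggered by pull requests from forks do not receive secrets (apart from a read-only `GITHUB_TOKEN`), so an empty value there is expected.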

### Permission denied

Most common cause: the default `GITHUB_TOKEN` no longer has write permissions. GitHub changed the default to read-only for new repos. Add an explicit `permissions` block (see Core Rules above).

### Flaky tests

Before blaming CI, check if the test is actually flaky:
```bash
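# Re-run only the failed jobs of a completed run
gh run rerun <run-id> --failed
# Or start 3 fresh runs (needs a workflow_dispatch trigger; cancel-in-progress: true would cancel earlier ones)
for i in 1 2 3; do gh workflow run test.yml; done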
```

Common flaky causes: time-dependent assertions, port conflicts, shared state between tests, external API rate limits.

### Cloud deploys: Use OIDC, not secrets

For AWS/GCP/Azure deploys, use OIDC instead of long-lived credentials:
```yaml
permissions:
  id-token: write
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      # Pinned to a commit SHA; re-resolve it for the release you actually use
      - uses: aws-actions/configure-aws-credentials@e3dd6a429d7300a6a4c196c26e071d42e0343502 # v4.0.2
        with:
          role-to-assume: arn:aws:iam::123456789012:role/GitHubActionsRole
          aws-region: us-east-1
```

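Never store long-lived cloud credentials as repository secrets when OIDC is available. OIDC issues a short-lived token per job, so there is no static key to leak or rotate, and the cloud-side trust policy can restrict access to a specific repo, branch, or environment.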

## Local Testing

Install `act` to run workflows locally before pushing. Note: `act` is community-maintained and won't perfectly replicate GitHub's runner environment, but it catches most structural issues.

```bash
# macOS
brew install act

# Linux/WSL — verify the script before piping to bash
curl -O https://raw.githubusercontent.com/nektos/act/master/install.sh
less install.sh  # review it
sudo bash install.sh

# Run a specific job
act -j test

# Run with secrets
act -s GITHUB_TOKEN=<token>

# Run with a specific event
act pull_request
```

**⚠️ Limitations:** `act` cannot reliably run service containers (postgres, redis), doesn't support all runner features (caching behavior may differ), and has permission model differences. Use it for quick syntax validation and basic logic checks, not as a full CI replacement.

**Good for:** Syntax checks, basic script logic, catching obvious errors before pushing  
**Not good for:** Service containers, GitHub-specific features, complex permissions, or exact CI reproduction

## References

- [GitHub Actions docs](https://docs.github.com/en/actions)
- [act - Run GitHub Actions locally](https://github.com/nektos/act)
- [actionlint - Static checker for GitHub Actions](https://github.com/rhysd/actionlint)
- [Security hardening for GitHub Actions](https://docs.github.com/en/actions/security-guides/security-hardening-for-github-actions)
- [Workflow syntax reference](https://docs.github.com/en/actions/reference/workflow-syntax-for-github-actions)
