Tags: DevOps · CI/CD · GitHub Actions · Deployment · Automation

GitHub Actions for Automated Deployment: A Practical Guide

พลากร วรมงคล
18 April 2026 · 15 min read

“How to build deploy workflows that are fast, safe, and survive the 3 a.m. incident — secrets, environments, OIDC, caching, rollback, SSH vs runner-based, and the anti-patterns that catch every team once.”

GitHub Actions is the most widely used CI/CD system on the internet today, and for the majority of small-to-mid teams it is good enough that it will never need to be replaced. But “good enough” is not the same as “well set up”. Most teams’ deploy workflows are stitched-together snippets — a checkout, a setup-node, an npm ci, an scp, a systemctl restart — that work on the happy path and become difficult to debug the moment something goes sideways.

This post is the deploy workflow I’d want every team to start with: fast, safe, observable, recoverable. It covers the mental model, the concrete patterns, the security knobs that actually matter (OIDC, environments, protected rules), and the handful of anti-patterns that catch every team once.

TL;DR

  • Workflows are triggered events running jobs (in parallel by default), each made of steps on a runner.
  • The three production-critical patterns: build-in-CI → ship-artifacts, OIDC over long-lived secrets, Environments with required reviewers.
  • Cache aggressively (actions/cache, setup-* built-in caching) — most deploy pipelines spend 60%+ of their time on repeatable work.
  • Concurrency groups prevent two deploys from racing. Treat deploy as a singleton.
  • Wire rollback before you ever deploy — have the “previous version” button ready before the first release.
  • Force-pushing, ignoring status checks, and exposing self-hosted runners to the public internet are the three ways teams accidentally give away prod.

The Mental Model

Before the YAML, four concepts that explain almost everything else.

| Concept | What it is |
| --- | --- |
| Workflow | A YAML file in .github/workflows/. Has triggers and one or more jobs. |
| Job | A unit of work that runs on a single runner. Jobs in the same workflow run in parallel by default; use needs: to chain them. |
| Step | A shell command or a reusable action (a unit of logic someone else published). |
| Runner | The machine that executes a job. GitHub-hosted (ubuntu-latest, macos-latest, windows-latest) or self-hosted. |

And three more that matter for production workflows:

| Concept | What it is |
| --- | --- |
| Environment | A named deployment target (e.g. production, staging) with its own secrets, required reviewers, and protection rules. |
| OIDC | GitHub issues a short-lived identity token to your workflow. Cloud providers trust the issuer, so you never store a long-lived access key in your secrets. |
| Concurrency | A group of workflow runs that cannot execute simultaneously. Use it to serialise deploys. |

A Minimal Deploy Workflow

The shortest useful template. Triggered on push to main, runs tests, builds, deploys.

name: Deploy

on:
  push:
    branches: [main]
  workflow_dispatch: # manual trigger from the Actions tab

# Serialise deploys — never two at once on the same env
concurrency:
  group: deploy-${{ github.ref }}
  cancel-in-progress: false

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 22
          cache: npm
      - run: npm ci
      - run: npm test -- --run

  build:
    needs: test
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 22
          cache: npm
      - run: npm ci
      - run: npm run build
      - uses: actions/upload-artifact@v4
        with:
          name: build-output
          path: dist/
          retention-days: 7

  deploy:
    needs: build
    runs-on: ubuntu-latest
    environment: production    # enforces whatever protection rules are set on the env
    steps:
      - uses: actions/download-artifact@v4
        with:
          name: build-output
          path: dist/
      - name: Deploy via SSH
        uses: appleboy/ssh-action@v1
        with:
          host: ${{ secrets.DEPLOY_HOST }}
          username: ${{ secrets.DEPLOY_USER }}
          key: ${{ secrets.DEPLOY_SSH_KEY }}
          script: |
            set -euo pipefail
            cd /var/www/app
            # ... receive artifact, swap atomically, restart service ...

This is already better than most production setups:

  • Jobs are split (test → build → deploy) so failures fail fast and the build isn’t wasted on a broken test.
  • Artifacts pass cleanly between jobs (no building twice).
  • Concurrency prevents two deploys colliding on the same branch.
  • Environment routes through protection rules — optional reviewers, wait timer, whatever you configure on the env.

Build in CI, Ship Artifacts to Prod

The single biggest win most teams miss. Building on the production host works for a month — until the host runs out of memory on a large TypeScript project, or a transient npm error leaves the site half-deployed, or a build step unintentionally reads production env vars.

The clean pattern:

| | Bad | Good |
| --- | --- | --- |
| Where the build runs | On the prod server via SSH | On the CI runner |
| What gets shipped | Source + npm install --omit=dev | Pre-built artifact (dist/, .next/, standalone build, Docker image) |
| Prod server’s job | Compile + restart | Swap and restart |
| Prod server’s RAM | Must fit the build | Just needs to run the binary |

# Build once on the runner
- name: Build
  run: npm run build
# Compress
- run: tar -czf dist.tar.gz dist/
# Ship to the server (rsync or scp)
- name: Upload
  uses: appleboy/scp-action@v0.1.7
  with:
    host: ${{ secrets.DEPLOY_HOST }}
    username: ${{ secrets.DEPLOY_USER }}
    key: ${{ secrets.DEPLOY_SSH_KEY }}
    source: "dist.tar.gz"
    target: "/tmp"
# Swap + restart
- name: Activate
  uses: appleboy/ssh-action@v1
  with:
    host: ${{ secrets.DEPLOY_HOST }}
    username: ${{ secrets.DEPLOY_USER }}
    key: ${{ secrets.DEPLOY_SSH_KEY }}
    script: |
      set -euo pipefail
      tar -xzf /tmp/dist.tar.gz -C /var/www/app/
      sudo systemctl restart my-app
      rm /tmp/dist.tar.gz
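On the server side, “swap” can be a versioned directory plus a symlink flip. A sketch where the releases/<sha> layout and the current symlink (the path your service actually serves from) are assumptions, not from the workflow above:

```shell
# Sketch: each deploy unpacks into its own directory; going live is a
# symlink flip, so a half-extracted tree never serves traffic.
# Paths and the RELEASE id are illustrative.
RELEASE=abc1234                        # in CI: the git SHA of this deploy
mkdir -p "releases/$RELEASE"
# ... tar -xzf /tmp/dist.tar.gz -C "releases/$RELEASE" would go here ...
ln -sfn "releases/$RELEASE" current    # repoint the live directory
readlink current                       # -> releases/abc1234
```

This also gives you rollback for free: flipping current back to the previous release directory is the whole operation.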

Secrets: OIDC Beats Long-Lived Keys

Long-lived access keys stored in GitHub Secrets work, but they rot. Developers leave, keys get leaked, rotation is manual, and if someone runs echo $AWS_SECRET in a workflow debug step, log masking is the only thing standing between the key and the internet.

OIDC (OpenID Connect) solves this: GitHub mints a short-lived token for each workflow run, and your cloud provider trusts GitHub as an identity provider. No static secret.

# AWS with OIDC
permissions:
  id-token: write   # required for OIDC
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/github-deploy
          aws-region: us-east-1
      # Now standard AWS CLI / SDK calls authenticate automatically
      - run: aws s3 sync dist/ s3://my-bucket --delete

The IAM role’s trust policy names GitHub’s OIDC provider and restricts it to your specific repo and branch; a token minted for any other repository simply doesn’t satisfy the trust condition, so it can’t assume your role.
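A sketch of what that trust policy looks like (the account ID, org/repo, and file name are placeholders; the sub condition is the line that pins the role to one repo and branch):

```shell
# Write an example OIDC trust policy for the deploy role. In practice you'd
# attach this with `aws iam create-role --assume-role-policy-document`.
cat > trust.json <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": { "Federated": "arn:aws:iam::123456789012:oidc-provider/token.actions.githubusercontent.com" },
    "Action": "sts:AssumeRoleWithWebIdentity",
    "Condition": {
      "StringEquals": { "token.actions.githubusercontent.com:aud": "sts.amazonaws.com" },
      "StringLike":   { "token.actions.githubusercontent.com:sub": "repo:my-org/my-repo:ref:refs/heads/main" }
    }
  }]
}
EOF
grep -c AssumeRoleWithWebIdentity trust.json
```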

GCP, Azure, and Vercel all support OIDC from GitHub Actions. Use it. The setup cost is one afternoon; the maintenance cost is zero.

| Provider | Action to use |
| --- | --- |
| AWS | aws-actions/configure-aws-credentials |
| GCP | google-github-actions/auth |
| Azure | azure/login with client-id (no client-secret) |
| Cloudflare | (still API tokens at time of writing) |
| Vercel | OIDC for AWS/GCP within your deploy; the platform itself uses project tokens |

When OIDC isn’t an option, use Environment secrets over Repo secrets: they’re scoped to the env (prod vs staging), and access is recorded in the deployment log.

Environments: Where Safety Lives

Environments are underrated. They give you, for free:

  • Separate secrets per env. The production deploy can’t accidentally grab a staging URL.
  • Required reviewers. The prod env refuses to start until N specific people click approve.
  • Wait timers. “Prod deploy sits for 10 minutes before starting” gives you a window to cancel.
  • Branch rules. The prod env will only run from main.
  • Deployment history. Every deploy is a first-class event you can inspect, rerun, or roll back from.

jobs:
  deploy-staging:
    environment: staging
    # ... deploy to staging ...
  deploy-production:
    needs: deploy-staging
    environment:
      name: production
      url: https://example.com   # shown in deployment UI
    # ... deploy to prod ...

Then in the repo → Settings → Environments, configure the prod env with at least: required reviewers, allowed branch main, secrets scoped to prod. You now have a half-decent change-management system without any other tool.

Caching: Most Workflows Are 60% Repeatable

The built-in cache: option in most setup actions handles 90% of cases.

- uses: actions/setup-node@v4
  with:
    node-version: 22
    cache: npm            # caches ~/.npm based on package-lock.json

- uses: actions/setup-python@v5
  with:
    python-version: "3.12"
    cache: pip            # caches pip downloads

For anything else — Docker layer cache, Next.js .next/cache, Turbo cache, Jest cache — use actions/cache:

- uses: actions/cache@v4
  with:
    path: |
      .next/cache
      node_modules/.cache
    key: ${{ runner.os }}-next-${{ hashFiles('**/package-lock.json') }}-${{ hashFiles('src/**') }}
    restore-keys: |
      ${{ runner.os }}-next-${{ hashFiles('**/package-lock.json') }}-
      ${{ runner.os }}-next-

The restore-keys fallback chain is important: if the exact key isn’t found, use the next best match. This gives you partial cache hits when dependencies change but your source mostly hasn’t.

Don’t cache node_modules directly. Cache the package manager’s download cache (~/.npm, ~/.pnpm-store) and re-run the install. That way npm ci still runs but finishes in seconds.

Concurrency: Don’t Race Yourself

Without a concurrency block, a flurry of pushes to main triggers a flurry of deploy workflows — and they’ll race to write to the same production server.

concurrency:
  group: deploy-${{ github.ref }}
  cancel-in-progress: false   # queue, don't cancel

Two real settings worth understanding:

| Setting | Effect |
| --- | --- |
| cancel-in-progress: true | A new run cancels the one in progress. Good for PR checks, where only the latest commit matters. |
| cancel-in-progress: false | A new run waits for the current one to finish. Good for deploys, where you want every commit to actually ship, in order. |

The group key is a string — use deploy-${{ github.ref }} to serialise per branch, or deploy-production to serialise across all branches that deploy there.
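For example, serialising on the target environment rather than the branch is just a constant group name:

```yaml
# Every run that declares this group queues behind the in-flight one,
# regardless of which branch triggered it.
concurrency:
  group: deploy-production
  cancel-in-progress: false
```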

Matrix Builds: Parallel by Default

When your pipeline naturally has multiple variants (Node versions, OSes, app folders in a monorepo), use a matrix.

jobs:
  test:
    runs-on: ${{ matrix.os }}
    strategy:
      fail-fast: false
      matrix:
        os: [ubuntu-latest, macos-latest, windows-latest]
        node: [20, 22]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: ${{ matrix.node }}
          cache: npm
      - run: npm ci
      - run: npm test

fail-fast: false keeps the other cells running when one fails — valuable for flaky tests or cross-platform bugs.

For monorepo deploys, matrix across the apps:

jobs:
  detect-changes:
    # ... outputs an array of apps that changed ...

  deploy:
    needs: detect-changes
    strategy:
      matrix:
        app: ${{ fromJson(needs.detect-changes.outputs.apps) }}
    steps:
      - run: ./scripts/deploy.sh ${{ matrix.app }}

Combined with path filters on the trigger (on.push.paths), this means a change in apps/api/** triggers only the API deploy, not everything.
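What the elided detect-changes job does usually boils down to “diff the changed paths, map them to app folders, emit JSON”. A sketch in plain shell, with the file list and the apps/<name> layout as assumptions for illustration:

```shell
# Sketch: map changed files to changed apps, then emit a JSON array that
# fromJson() in the matrix can consume.
changed_files="apps/api/src/index.ts
apps/web/pages/home.tsx
docs/readme.md"                     # in CI: output of git diff --name-only

# Keep the second path segment of anything under apps/, deduplicated
apps=$(printf '%s\n' "$changed_files" | awk -F/ '$1 == "apps" { print $2 }' | sort -u)

# Build a compact JSON array like ["api","web"]
json="["
for app in $apps; do json="$json\"$app\","; done
json="${json%,}]"
echo "apps=$json"                   # in CI: >> "$GITHUB_OUTPUT"
```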

Deployment Strategies

Three strategies most teams want eventually. GitHub Actions doesn’t implement these itself — but it orchestrates them.

| Strategy | What it is | How Actions does it |
| --- | --- | --- |
| Rolling | Replace instances one at a time. | A matrix over instances, with a small batch size and health checks between batches. |
| Blue/Green | Run two prod stacks; cut traffic to the new one after smoke tests pass. | Two environments (blue, green); a “swap” step flips the DNS/load-balancer target. |
| Canary | Route a small % of traffic to the new version; graduate it if metrics are good. | The deploy action marks the new version; a monitored step (or external tool) graduates or rolls back. |

A blue/green example sketched out:

jobs:
  deploy-to-inactive:
    runs-on: ubuntu-latest
    steps:
      - name: Determine inactive color
        id: color
        run: |
          ACTIVE=$(./scripts/get-active-color.sh)   # 'blue' or 'green'
          echo "inactive=$([ "$ACTIVE" = "blue" ] && echo green || echo blue)" >> $GITHUB_OUTPUT
      - name: Deploy to inactive
        run: ./scripts/deploy.sh ${{ steps.color.outputs.inactive }}
      - name: Smoke test inactive
        run: ./scripts/smoke.sh https://${{ steps.color.outputs.inactive }}.example.com
      - name: Swap LB
        run: ./scripts/swap-lb.sh ${{ steps.color.outputs.inactive }}

The swap step is what makes it a real blue/green: until that step runs, old traffic still flows to the active side, and a failed smoke test never affects users.
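The get-active-color.sh helper can be as small as a state-file read. A sketch where the file path and format are assumptions:

```shell
# Sketch: the live color lives in a one-line state file; the inactive color
# is whichever one it isn't. Names and paths are illustrative.
STATE=active-color
echo blue > "$STATE"                     # pretend blue is currently live
ACTIVE=$(cat "$STATE")
INACTIVE=$([ "$ACTIVE" = blue ] && echo green || echo blue)
echo "deploy target: $INACTIVE"          # -> deploy target: green
```

The swap-lb step would then write the new color back to the state file after flipping the load balancer, keeping the file authoritative.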

Rollback — Wire It Before the First Deploy

The single most common reason teams “can’t roll back” is that no one planned for it. What the deploy shipped is implicit (a git commit, npm build output); the previous version is not addressable.

Rule: every deploy produces an immutable, re-deployable artifact — with the git SHA or version tag as the primary key.

  • Container images: tag with :${{ github.sha }} (never :latest for prod), push to registry, never overwrite.
  • Static sites: upload to a versioned path (s3://bucket/releases/${{ github.sha }}/) and update the active symlink/alias.
  • VPS tarballs: name app-${{ github.sha }}.tar.gz, keep last 10 on the server.
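For the VPS-tarball case, the “keep last 10” retention is a one-liner. A sketch against a hypothetical releases directory, where the touch loop just fakes a history of past deploys for the demonstration:

```shell
# Sketch: keep only the 10 newest release tarballs, using the
# app-<sha>.tar.gz naming convention above.
mkdir -p releases && cd releases
for i in $(seq 12); do touch "app-sha$i.tar.gz"; done   # fake 12 past deploys
ls -1t app-*.tar.gz | tail -n +11 | xargs -r rm --      # delete everything past #10
ls app-*.tar.gz | wc -l
```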

Rollback then becomes a manual workflow that takes a version and points prod at it:

name: Rollback

on:
  workflow_dispatch:
    inputs:
      version:
        description: Git SHA or tag to roll back to
        required: true
        type: string

jobs:
  rollback:
    runs-on: ubuntu-latest
    environment: production
    steps:
      - run: ./scripts/activate-version.sh ${{ inputs.version }}

No rebuild, no git revert, no drama — just swap the active version. This is the workflow you want to have and never need.

Observability: You Will Regret Flying Blind

Two observability things matter in the workflow itself:

  1. Write the deploy events to your observability platform. Datadog, Honeycomb, Grafana Cloud — whatever you use. A deploy marker on a latency graph has saved more debugging sessions than any post-mortem.
  2. Set up job-failure notifications. Default is email; most teams want Slack. The slackapi/slack-github-action posts to a channel on failure.

- name: Notify on failure
  if: failure()
  uses: slackapi/slack-github-action@v1
  with:
    payload: |
      {
        "text": "🚨 Deploy failed on ${{ github.ref_name }} (${{ github.sha }})",
        "blocks": [ ... ]
      }
  env:
    SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}

Also worth knowing: $GITHUB_STEP_SUMMARY lets you write rich Markdown to the job summary page. Useful for deploy timings, artifact sizes, release notes auto-generated from commits.

- name: Summary
  run: |
    echo "## Deploy summary" >> $GITHUB_STEP_SUMMARY
    echo "- Version: \`${{ github.sha }}\`" >> $GITHUB_STEP_SUMMARY
    echo "- Duration: $(date -d@$SECONDS -u +%M:%S)" >> $GITHUB_STEP_SUMMARY

Common Anti-Patterns (and the Fix)

| Anti-pattern | Why it bites | Fix |
| --- | --- | --- |
| Building on the prod server | Runs out of RAM/CPU, affects traffic, slow deploys. | Build on the runner, ship the artifact. |
| Long-lived cloud access keys in secrets | Rotation is manual; leaks are permanent. | OIDC. |
| uses: actions/checkout@master or @main | Breaks silently when upstream refactors. | Pin to a version tag (@v4), or a SHA for higher-security paths. |
| Shared secret for staging + prod | A compromise of either is a compromise of both. | Environment-scoped secrets. |
| continue-on-error: true to “ship anyway” | Hides real failures; gives a false sense of CI coverage. | Fix the test, or mark it expected-to-fail with an issue number. |
| Deploy from any branch | Developers accidentally push prod at 2 a.m. | Environment → Deployment branches → main only. |
| No concurrency group | Two deploys race; last write wins on prod. | A concurrency block. |
| Self-hosted runners on the open internet | Runs untrusted PR code; an attacker pivots into your network. | Private/ephemeral runners, and never run untrusted PRs on self-hosted. |
| Logging secrets | echo $TOKEN leaks into CI logs. | Never echo secrets; masking only works when GitHub recognises them. |

Real-World Shape: SSH-to-VPS Deploy

A complete SSH deploy workflow for a Node.js/Next.js app (loosely based on the one running palakorn.com):

name: Deploy

on:
  push:
    branches: [main]
    paths: ['apps/admin/**']
  workflow_dispatch:

concurrency:
  group: deploy-admin
  cancel-in-progress: false

jobs:
  deploy:
    runs-on: ubuntu-latest
    environment:
      name: production
      url: https://example.com/admin
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: 22
          cache: npm
          cache-dependency-path: apps/admin/package-lock.json

      - name: Install, build
        working-directory: apps/admin
        run: |
          npm ci
          npm run db:generate
          npm run build

      - name: Package artifact
        run: tar -czf admin.tar.gz -C apps/admin .next package.json package-lock.json prisma public

      - name: Upload artifact
        uses: appleboy/scp-action@v0.1.7
        with:
          host: ${{ secrets.DEPLOY_HOST }}
          username: ${{ secrets.DEPLOY_USER }}
          port: ${{ secrets.DEPLOY_SSH_PORT }}
          key: ${{ secrets.DEPLOY_SSH_KEY }}
          source: "admin.tar.gz"
          target: /tmp

      - name: Activate
        uses: appleboy/ssh-action@v1
        with:
          host: ${{ secrets.DEPLOY_HOST }}
          username: ${{ secrets.DEPLOY_USER }}
          port: ${{ secrets.DEPLOY_SSH_PORT }}
          key: ${{ secrets.DEPLOY_SSH_KEY }}
          script: |
            set -euo pipefail
            TARGET=/var/www/apps/admin
            tar -xzf /tmp/admin.tar.gz -C "$TARGET"
            cd "$TARGET" && npm ci --omit=dev && npx prisma db push --accept-data-loss
            sudo systemctl restart app-admin
            rm /tmp/admin.tar.gz
            curl -sf http://127.0.0.1:3003/admin > /dev/null  # smoke

Notes:

  • Path filter apps/admin/** means unrelated changes don’t trigger this deploy.
  • Concurrency group is per-app (deploy-admin), so a web deploy and admin deploy can still run in parallel.
  • Environment URL makes the deployment page show the live URL with a one-click “view”.
  • Smoke check at the end fails the deploy if the service didn’t come back healthy, so you find out immediately, not from your users.
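One refinement worth adding to that smoke check: retry before failing, since a restarted service can take a few seconds to accept connections. A sketch, with smoke and probe as illustrative names (the `true` stand-in replaces the real curl probe so the example is self-contained):

```shell
# Sketch: retry the health probe a few times before declaring the deploy
# dead. "$@" is the probe command; any zero exit counts as healthy.
smoke() {                                # usage: smoke <attempts> <cmd...>
  attempts=$1; shift
  for _ in $(seq "$attempts"); do
    "$@" && { echo healthy; return 0; }
    sleep 1
  done
  echo unhealthy; return 1
}
smoke 3 true    # in the workflow: smoke 5 curl -sf http://127.0.0.1:3003/admin
```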

Closing Checklist

Before declaring your deploy workflow “done”:

  • Triggered by push to main + workflow_dispatch (manual)
  • Jobs split: test / build / deploy; deploy only runs if tests pass
  • Build on the runner, ship an immutable artifact
  • Concurrency group set — two deploys never race
  • Environment configured with required reviewers + main-only branch rule
  • Secrets scoped to environment, not repo-wide
  • OIDC used where the cloud provider supports it
  • Version artifact with github.sha or a tag — never overwrite
  • Rollback workflow exists as a separate workflow_dispatch job
  • Cache the package-manager store (not node_modules)
  • Smoke check at the end of deploy — fail fast on bad releases
  • Notifications on failure (Slack/email) with commit SHA + run link
  • Actions pinned to version tags, not @main

Closing Thoughts

The goal of this setup is not to ship more often — it’s to make shipping so cheap and so boring that teams stop treating a deploy as an event. When deploys are boring, you deploy small, you deploy often, and you spend the time you saved on work that actually moves the product.



Written by พลากร วรมงคล

Software Engineer Specialist with 20+ years of experience, writing about architecture, performance, and building production systems.

More about me
