15 Days of Playwright - Day 13: CI/CD, Retries, Sharding and Reports

A test suite that only runs on your laptop is not a test suite — it's a local script. The value of automated tests comes from running them continuously, on every pull request, before every deploy. Today we wire Playwright into CI/CD and make it production-ready.

Running Playwright in GitHub Actions

The simplest GitHub Actions workflow for Playwright:

# .github/workflows/playwright.yml
name: Playwright Tests

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: 22

      - name: Install dependencies
        run: npm ci

      - name: Install Playwright browsers
        run: npx playwright install --with-deps

      - name: Run tests
        run: npx playwright test

      - name: Upload test report
        if: always() # Upload even when tests fail
        uses: actions/upload-artifact@v4
        with:
          name: playwright-report
          path: playwright-report/
          retention-days: 30

The key details:

npx playwright install --with-deps — installs browsers AND system dependencies (fonts, codecs). Without --with-deps, Playwright browsers often fail on bare CI runners.
if: always() on the artifact upload — you need the report most when tests fail. always() ensures it uploads even on failure.

Caching Browser Downloads

Browser binaries are ~300MB. Cache them across runs to save ~2 minutes per workflow:

- name: Cache Playwright browsers
  uses: actions/cache@v4
  id: playwright-cache
  with:
    path: ~/.cache/ms-playwright
    key: playwright-${{ runner.os }}-${{ hashFiles('package-lock.json') }}

- name: Install Playwright browsers
  if: steps.playwright-cache.outputs.cache-hit != 'true'
  run: npx playwright install --with-deps
  
- name: Install only system deps (cache hit)
  if: steps.playwright-cache.outputs.cache-hit == 'true'
  run: npx playwright install-deps

Retries: Handling Flakiness Intelligently

Not all test failures mean the code is broken. Network blips, race conditions in the test environment, and third-party service timeouts create intermittent failures. Retries let you recover from these without masking real bugs.

Configuring Retries

// playwright.config.js
import { defineConfig } from '@playwright/test';

export default defineConfig({
  // In CI: retry 2 times. Locally: fail immediately for fast feedback.
  retries: process.env.CI ? 2 : 0,
});

Retry Behavior

When a test fails and retries > 0:

Playwright re-runs the entire test from scratch (a fresh browser context)
If it passes on retry, it is marked as flaky (not failed) in the report
The CI job exits with code 0 only if the test passes within the retry budget

A test marked flaky is a signal to investigate — it means the test (or the application) has a timing issue that should be fixed.
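The rules above can be sketched as a small classifier. This is only a model of the reporting behavior described here, not Playwright's internal code; the names classify and exitCode are hypothetical.

```typescript
// Hypothetical model of the retry rules; not Playwright's implementation.
type AttemptResult = 'passed' | 'failed';
type Outcome = 'passed' | 'flaky' | 'failed';

// attempts[0] is the first run, attempts[1] the first retry, and so on.
function classify(attempts: AttemptResult[], retries: number): Outcome {
  const allowed = attempts.slice(0, retries + 1); // first run plus the retry budget
  const firstPass = allowed.indexOf('passed');
  if (firstPass === 0) return 'passed'; // passed on the first run
  if (firstPass > 0) return 'flaky';    // failed first, then passed on a retry
  return 'failed';                      // never passed within the budget
}

// The run exits non-zero only when at least one test ends up 'failed'.
function exitCode(outcomes: Outcome[]): number {
  return outcomes.some((o) => o === 'failed') ? 1 : 0;
}
```

With retries set to 2, a test that fails once and then passes is classified as flaky and does not fail the job. The same sequence with retries set to 0 is a plain failure.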

`trace: 'on-first-retry'`

Combine retries with trace collection so that the first retry of a failing test is recorded:

use: {
  trace: 'on-first-retry', // Only trace when a retry is triggered
  screenshot: 'only-on-failure',
  video: 'retain-on-failure',
}

This way, you don't pay the cost of tracing every test, but you always have a trace for the tests that needed retrying.
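To make the trade-off concrete, here is a minimal sketch of when a trace gets recorded for a subset of the trace modes. The function name recordsTrace and the retryIndex parameter are assumptions made for illustration; Playwright makes this decision internally.

```typescript
// A subset of Playwright's trace modes, modelled for illustration only.
type TraceMode = 'off' | 'on' | 'on-first-retry' | 'on-all-retries' | 'retain-on-failure';

// retryIndex is 0 for the first run, 1 for the first retry, and so on.
function recordsTrace(mode: TraceMode, retryIndex: number): boolean {
  if (mode === 'off') return false;
  if (mode === 'on') return true;
  if (mode === 'on-first-retry') return retryIndex === 1;
  if (mode === 'on-all-retries') return retryIndex >= 1;
  // 'retain-on-failure': every run is recorded, and the trace is dropped if the test passes.
  return true;
}
```

Note that with 'on-first-retry' the initial failing run itself is not traced. The recording comes from the retry that follows it.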

Sharding: Splitting Tests Across Machines

For large test suites (hundreds of tests), a single machine is a bottleneck. Playwright's built-in sharding splits the test suite across multiple machines:

# Machine 1 of 3
npx playwright test --shard=1/3

# Machine 2 of 3
npx playwright test --shard=2/3

# Machine 3 of 3
npx playwright test --shard=3/3
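The idea behind --shard=x/y can be sketched as slicing an ordered list into y contiguous parts and keeping part x. This is a simplified illustration under that assumption; Playwright's own balancing logic is more involved, and shardOf is a hypothetical name.

```typescript
// Simplified contiguous split; not Playwright's actual algorithm.
// current is 1-based, matching the --shard=current/total syntax.
function shardOf<T>(items: T[], current: number, total: number): T[] {
  if (total < 1 || current < 1 || current > total) {
    throw new Error(`invalid shard ${current}/${total}`);
  }
  // Integer boundaries keep shard sizes within one item of each other.
  const start = Math.floor(((current - 1) * items.length) / total);
  const end = Math.floor((current * items.length) / total);
  return items.slice(start, end);
}

const files = ['a', 'b', 'c', 'd', 'e', 'f', 'g'];
const shard1 = shardOf(files, 1, 3); // ['a', 'b']
const shard2 = shardOf(files, 2, 3); // ['c', 'd']
const shard3 = shardOf(files, 3, 3); // ['e', 'f', 'g']
```

Every item lands in exactly one shard, which is the property that makes it safe to run the shards on separate machines.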

Sharding in GitHub Actions (Matrix Strategy)

jobs:
  test:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        shard: [1, 2, 3, 4]

    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 22

      - run: npm ci
      - run: npx playwright install --with-deps

      - name: Run Playwright tests (shard ${{ matrix.shard }}/4)
        run: npx playwright test --shard=${{ matrix.shard }}/4

      - name: Upload blob report
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: blob-report-${{ matrix.shard }}
          path: blob-report
          retention-days: 1
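Each matrix job repeats the checkout, npm ci, and browser install steps, so the speedup from sharding is less than linear. A rough estimate, assuming tests split evenly and using hypothetical numbers, looks like this:

```typescript
// Back-of-the-envelope estimate; assumes an even split across shards.
// setupMinutes is the per-job overhead (checkout, install, browser download).
function shardedWallClockMinutes(
  suiteMinutes: number,
  shards: number,
  setupMinutes: number,
): number {
  if (shards < 1) throw new Error('shards must be at least 1');
  return setupMinutes + suiteMinutes / shards;
}

const unsharded = shardedWallClockMinutes(20, 1, 2); // 22
const fourShards = shardedWallClockMinutes(20, 4, 2); // 7
```

The fixed setup cost is one reason adding more shards eventually stops paying off.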

Merging Shard Reports

Each shard produces a partial report. Merge them in a follow-up job:

  merge-reports:
    if: always()
    needs: [test]
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 22
      - run: npm ci

      - name: Download blob reports
        uses: actions/download-artifact@v4
        with:
          path: all-blob-reports
          pattern: blob-report-*
          merge-multiple: true

      - name: Merge reports
        run: npx playwright merge-reports --reporter html ./all-blob-reports

      - name: Upload merged HTML report
        uses: actions/upload-artifact@v4
        with:
          name: playwright-report
          path: playwright-report/
          retention-days: 14

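This is the canonical Playwright sharding pattern from the official documentation. It relies on each shard running with the blob reporter (--reporter blob on the command line, or ['blob'] in the config's reporter list), which writes a machine-readable format designed for merging into the blob-report directory; the final merge step converts it to a human-readable HTML report.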
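Conceptually, merging combines what each shard observed into one result set. The sketch below does this for summary counts only. The ShardSummary shape and the mergeSummaries name are hypothetical, and real blob reports carry full per-test data rather than totals.

```typescript
// Hypothetical summary shape; real blob reports contain much more detail.
interface ShardSummary {
  passed: number;
  flaky: number;
  failed: number;
  durationMs: number;
}

function mergeSummaries(shards: ShardSummary[]): ShardSummary {
  const total: ShardSummary = { passed: 0, flaky: 0, failed: 0, durationMs: 0 };
  for (const s of shards) {
    total.passed += s.passed;
    total.flaky += s.flaky;
    total.failed += s.failed;
    // Shards run side by side, so wall-clock time is the slowest shard.
    total.durationMs = Math.max(total.durationMs, s.durationMs);
  }
  return total;
}
```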

Reporters: Making Results Actionable

Playwright supports multiple reporters simultaneously:

import { defineConfig } from '@playwright/test';

export default defineConfig({
  reporter: [
    ['html', { outputFolder: 'playwright-report', open: 'never' }],
    ['junit', { outputFile: 'results.xml' }], // For CI dashboards
    ['list'],                                  // Console output during run
    ['json', { outputFile: 'results.json' }],  // For custom tooling
  ],
});

The HTML Reporter

The HTML reporter produces a self-contained report website (playwright-report/index.html). It shows:

Pass/fail/flaky counts
Execution duration per test
Failure messages and stack traces
Inline traces, videos, and screenshots for failing tests

Open locally:

npx playwright show-report

JUnit Reporter for CI Dashboards

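Jenkins, GitLab CI, Azure Pipelines, and most other CI systems can parse JUnit XML and show it as a native test report. GitHub Actions has no built-in JUnit viewer, so there the file is usually stored as an artifact or handed to a third-party reporting action. Adding ['junit'] produces the file: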

reporter: [
  ['junit', { outputFile: 'results.xml' }],
],

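In GitHub Actions, keep the file as an artifact so it can be downloaded or passed to a reporting action:

- name: Upload JUnit results
  if: always() # Keep the results even when tests fail
  uses: actions/upload-artifact@v4
  with:
    name: junit-results
    path: results.xml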
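For orientation, the JUnit format is a small XML document with one testcase element per test. The sketch below builds a minimal version of it. The CaseResult shape and toJUnit function are hypothetical, and the file Playwright writes includes more attributes than shown here.

```typescript
// Hypothetical input shape for illustration.
interface CaseResult {
  name: string;
  file: string;
  seconds: number;
  error?: string; // set only when the test failed
}

// Escape characters that are not allowed inside XML attribute values.
function escapeXml(s: string): string {
  return s
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

function toJUnit(suite: string, cases: CaseResult[]): string {
  const failures = cases.filter((c) => c.error !== undefined).length;
  const lines: string[] = [
    `<testsuite name="${escapeXml(suite)}" tests="${cases.length}" failures="${failures}">`,
  ];
  for (const c of cases) {
    const open = `  <testcase name="${escapeXml(c.name)}" classname="${escapeXml(c.file)}" time="${c.seconds}"`;
    if (c.error === undefined) {
      lines.push(`${open}/>`);
    } else {
      lines.push(`${open}>`, `    <failure message="${escapeXml(c.error)}"/>`, '  </testcase>');
    }
  }
  lines.push('</testsuite>');
  return lines.join('\n');
}
```

CI dashboards read the tests and failures counts from the testsuite element and the details from each failure element.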

Environment Variables in CI

Never hardcode base URLs or credentials. Use environment variables:

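// playwright.config.js
import { defineConfig } from '@playwright/test';

export default defineConfig({
  use: {
    // Fall back to a local dev server when BASE_URL is not set
    baseURL: process.env.BASE_URL || 'http://localhost:3000',
  },
});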

In GitHub Actions:

env:
  BASE_URL: ${{ vars.STAGING_URL }}
  TEST_USER: ${{ secrets.TEST_USER_EMAIL }}
  TEST_PASSWORD: ${{ secrets.TEST_USER_PASSWORD }}

Access in tests:

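import { test } from '@playwright/test';

// The trailing "!" is a TypeScript non-null assertion, so this belongs in a .ts spec file.
test('login', async ({ page }) => {
  await page.goto('/login');
  await page.getByLabel('Email').fill(process.env.TEST_USER!);
  await page.getByLabel('Password').fill(process.env.TEST_PASSWORD!);
});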
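The non-null assertion hides a missing variable until fill() receives undefined at runtime. One alternative is a small guard that fails early with a clear message. The requireEnv helper below is hypothetical and not part of Playwright; it takes the environment as a parameter so it can be exercised without touching process.env.

```typescript
// Hypothetical helper: returns the variable or throws a descriptive error.
function requireEnv(env: { [key: string]: string | undefined }, name: string): string {
  const value = env[name];
  if (value === undefined || value === '') {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Example with an in-memory environment; in a test you would pass process.env.
const sampleEnv = { TEST_USER: 'qa@example.com', TEST_PASSWORD: '' };
const user = requireEnv(sampleEnv, 'TEST_USER'); // 'qa@example.com'
```

In a spec file this would read as fill(requireEnv(process.env, 'TEST_USER')), so a misconfigured CI job reports the missing secret by name.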

Complete Production Configuration

Here is a battle-tested playwright.config.js for a CI/CD environment:

import { defineConfig, devices } from '@playwright/test';

export default defineConfig({
  testDir: './tests',
  fullyParallel: true,
  forbidOnly: !!process.env.CI,
  retries: process.env.CI ? 2 : 0,
  workers: process.env.CI ? 4 : undefined,
  
  reporter: [
    ['html', { open: 'never' }],
    ['blob'],
    ...(process.env.CI ? [['junit', { outputFile: 'results.xml' }]] : [['list']]),
  ],

  use: {
    baseURL: process.env.BASE_URL || 'http://localhost:3000',
    trace: 'on-first-retry',
    screenshot: 'only-on-failure',
    video: 'retain-on-failure',
  },

  projects: [
    { name: 'chromium', use: { ...devices['Desktop Chrome'] } },
    { name: 'firefox', use: { ...devices['Desktop Firefox'] } },
  ],
});

Key options explained:

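fullyParallel: true - tests inside a single file also run in parallel. By default, files run in parallel with each other, but the tests within one file run in order.
forbidOnly: !!process.env.CI - fails the CI run if a test.only was left in the code, so a focused test cannot silently skip the rest of the suite (safety net)
workers: process.env.CI ? 4 : undefined - 4 workers in CI (tune this to the runner's CPU count), and Playwright's default of half the logical CPU cores locally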
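The effect of fullyParallel can be pictured as a change in the unit of scheduling. The sketch below is a simplified model with hypothetical names (TestRef, workUnits); it ignores details such as serial mode and per-describe configuration.

```typescript
interface TestRef {
  file: string;
  title: string;
}

// Each returned group runs in order inside one worker;
// separate groups may run at the same time on different workers.
function workUnits(tests: TestRef[], fullyParallel: boolean): TestRef[][] {
  if (fullyParallel) {
    return tests.map((t) => [t]); // every test is its own unit
  }
  const byFile: { [file: string]: TestRef[] } = {};
  const order: string[] = [];
  for (const t of tests) {
    if (!byFile[t.file]) {
      byFile[t.file] = [];
      order.push(t.file);
    }
    byFile[t.file].push(t);
  }
  return order.map((f) => byFile[f]); // one unit per file
}

const sampleTests: TestRef[] = [
  { file: 'login.spec.ts', title: 'valid user' },
  { file: 'login.spec.ts', title: 'wrong password' },
  { file: 'cart.spec.ts', title: 'add item' },
];
const perFile = workUnits(sampleTests, false); // 2 units
const perTest = workUnits(sampleTests, true);  // 3 units
```

More, smaller units give the workers more freedom to balance load, which matters most when a few files hold many slow tests.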

Link to GitHub project

Conclusion

CI/CD integration is what turns Playwright from a local tool into a quality gate. With today's setup you have:

GitHub Actions workflow that installs browsers, runs tests, and uploads reports on every PR
Smart retries that absorb environment flakiness without hiding real failures
Sharding that can cut a 20-minute suite to roughly 5 minutes of test time across four machines
Multi-format reporting that feeds CI dashboards, HTML viewers, and custom tooling

Your next step: add the playwright.yml workflow to your repository and watch it run on the next pull request.

Thank you for reading and see you in the next lesson! ☕💜