
Single Run

Opening a run — from the Runs view, from a commit, or from the upload confirmation link in your CI logs — shows the full report for that single upload: every test’s result with errors and artifacts, plus the execution waterfall and system telemetry when the reporter captured them.

Tests Waterfall showing parallel test execution across workers with system telemetry overlay

Certain test runners expose details about test execution, such as which parallel worker ran each test, when it started, and how long it took. In addition, certain reporters embed system telemetry, which is overlaid on the test execution timeline.

The lower section shows one horizontal lane per parallel worker (thread/process). Each test appears as a colored bar:

Color    Meaning
Green    Test passed
Red      Test failed
Gray     Test was skipped

The bar’s width represents the test’s duration. Gaps between bars show idle time on that worker. This makes it easy to spot:

  • Uneven distribution — one worker doing most of the work while others sit idle
  • Long-running tests — single tests that dominate the timeline
  • Bottlenecks — workers waiting because a slow test is blocking
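
The patterns listed above can also be expressed numerically. The following is a minimal sketch, not the viewer's implementation: it assumes a hypothetical record shape of (worker index, start in ms, duration in ms) and that tests on the same worker never overlap, then computes busy and idle time per worker over the run's wall-clock span.

```python
from collections import defaultdict

# Hypothetical records: (worker_index, start_ms, duration_ms).
# This shape is an assumption for illustration, not the Flakiness report schema.
tests = [
    (0, 0, 4000),     # one long-running test on worker 0
    (0, 4000, 1000),
    (1, 0, 500),
    (1, 3000, 500),   # worker 1 sat idle between 500 ms and 3000 ms
]

def worker_utilization(records):
    """Return {worker: (busy_ms, idle_ms)} over the run's wall-clock span.

    Assumes tests on the same worker do not overlap in time.
    """
    run_start = min(start for _, start, _ in records)
    run_end = max(start + duration for _, start, duration in records)
    span = run_end - run_start

    busy = defaultdict(int)
    for worker, _, duration in records:
        busy[worker] += duration

    return {worker: (busy_ms, span - busy_ms) for worker, busy_ms in busy.items()}

utilization = worker_utilization(tests)
for worker, (busy_ms, idle_ms) in sorted(utilization.items()):
    print(f"worker {worker}: busy {busy_ms} ms, idle {idle_ms} ms")
```

A worker whose idle time is a large share of the span corresponds to the wide gaps visible in its lane.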

Hover over a test bar to see its name, outcome, and retry number. Click a test bar to navigate to that test’s detailed results; other tests are dimmed to highlight the selection.

When available, the upper section overlays CPU and memory usage synchronized with the test execution timeline, helping you correlate test failures with resource pressure.

  • CPU — average load across all CPU cores, rendered with an orange-to-red gradient fill
  • Memory — rendered with a blue-to-purple gradient fill

Hover over any point to see the exact CPU percentage and memory usage (both as a percentage and in absolute bytes, e.g. “2.4 GB (54%)”) at that moment in the test run.

Telemetry depends on the test runner reporter. For example, @flakiness/playwright captures CPU and memory telemetry automatically, while @flakiness/pytest-flakiness does not. If no telemetry data is present in the report, the telemetry section is simply not shown.

When tests fail, the Test Report shows error details with full context. Each failed test shows its errors inline. An error includes:

  • Error message — the assertion or exception message
  • Stack trace — the full call stack with ANSI colors preserved from the terminal
  • Code snippet — when the reporter embeds source files, the viewer shows the relevant lines of code around the failure location

The Test Report includes an Errors tab that aggregates all unique errors across the report into a summary table.

The table is sorted by the number of impacted tests (most common errors first), which makes it easy to identify systemic issues, such as a shared service being down or a common assertion pattern failing across many tests.
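
As an illustration of this kind of aggregation, here is a minimal sketch under stated assumptions. The input shape (test name mapped to a list of error messages) is hypothetical, and "unique" is taken to mean exact message equality; how Flakiness actually normalizes or groups errors is not specified here.

```python
from collections import defaultdict

# Hypothetical failed-test records: test name -> list of error messages.
failed_tests = {
    "login works": ["connect ECONNREFUSED 127.0.0.1:5432"],
    "signup works": [
        "connect ECONNREFUSED 127.0.0.1:5432",
        "expected 200, got 500",
    ],
    "cart total": ["expected 3, got 4"],
}

def summarize_errors(tests):
    """Return (message, impacted_test_count) rows, most common errors first."""
    impacted = defaultdict(set)
    for name, errors in tests.items():
        for message in errors:
            # A set ensures a test is counted once per distinct error.
            impacted[message].add(name)

    rows = [(message, len(names)) for message, names in impacted.items()]
    # Sort by descending count; break ties alphabetically for stable output.
    rows.sort(key=lambda row: (-row[1], row[0]))
    return rows

summary = summarize_errors(failed_tests)
for message, count in summary:
    print(f"{count:>3}  {message}")
```

In this sample, the connection error rises to the top because it affects two tests, which is the signal that points to a shared dependency rather than a single broken test.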

Clicking an error filters the report to show only tests with that specific error.

Use FQL to search for tests by error text:

$timeout # tests with "timeout" in the error
$undefined # tests with "undefined" in the error
$"connection refused" f:api # connection errors in API test files

A single test can produce multiple errors, for example when it uses soft assertions that continue execution after a failure. The Test Report displays all errors for the test, not just the first one.

Infrastructure errors that occur outside of a specific test (e.g., during setup or teardown) are shown separately in the Test Report header.

Test reporters capture files produced during test execution — screenshots, videos, trace files, HTML snapshots, terminal output, and any other artifacts — and include them in the Flakiness report. The Report Viewer displays these inline alongside test results.

Every uploaded artifact counts toward your organization’s storage quota, which is defined by the active organization plan. Once the quota is full, new uploads to any project in the organization will fail until storage is freed or the plan is upgraded.

Flakiness reporters auto-compress compressible artifacts before upload to save storage, and all artifacts are encrypted at rest.

The Flakiness JSON report itself is stored as an artifact, so the report’s own data also contributes to the quota. On average, 1 GB fits about 13 million test results.
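
The average above implies a rough per-result cost, which can help when estimating quota usage. The sketch below is back-of-the-envelope arithmetic only. It assumes a decimal gigabyte (10^9 bytes), and it covers the JSON report data alone, not screenshots, videos, or other attachments.

```python
BYTES_PER_GB = 10**9          # assumption: decimal gigabyte
RESULTS_PER_GB = 13_000_000   # average quoted in the text above

# Roughly 77 bytes of stored report data per test result.
bytes_per_result = BYTES_PER_GB / RESULTS_PER_GB

def estimated_report_bytes(test_results: int) -> float:
    """Estimate storage used by report data for a number of test results."""
    return test_results * bytes_per_result

print(f"{bytes_per_result:.1f} bytes per test result")
print(f"{estimated_report_bytes(50_000) / 1e6:.2f} MB for 50,000 test results")
```

Because this is an average, individual reports can deviate from it.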

Where the test runner exposes step information (for example, Playwright Test’s test.step or built-in action steps), attachments are attributed to the step that produced them rather than the test as a whole. Failure context becomes much easier to scan: the screenshot taken right before an assertion failed sits next to the assertion, not at the bottom of the run.

Inline attachments in the Report Viewer, attributed to individual test steps

For runners that don’t emit step events, attachments are attributed to the test as a whole.

Flakiness.io supports several attachment types natively.

HTML attachments render inside a sandboxed <iframe> in the Report Viewer, so you can inspect the exact markup produced at test time. This is useful for DOM snapshots, rendered email templates, or any other HTML artifact the test captured.

Images are displayed inline with the test result. When multiple images are present, they are laid out in a flex-wrapped gallery.

When a visual comparison test fails, reporters typically produce three images: expected, actual, and diff. The Report Viewer auto-detects these by naming convention (name-expected.png, name-actual.png, name-diff.png) and presents them in a tabbed interface with four views:

  • Diff — the computed difference image
  • Actual — what the test produced
  • Expected — the baseline reference
  • Slider — an interactive image comparer that lets you drag between actual and expected
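
To make the naming convention concrete, here is a minimal sketch of how attachments could be grouped into comparison sets. It is not the Report Viewer's actual detection logic: the function name is hypothetical, and requiring all three images before treating a group as a comparison is an assumption.

```python
import re
from collections import defaultdict

# Matches name-expected.png, name-actual.png, and name-diff.png.
SUFFIX = re.compile(r"^(?P<base>.+)-(?P<kind>expected|actual|diff)\.png$")

def group_comparisons(names):
    """Group attachment names into {base: {kind: filename}} comparison sets.

    Only groups that contain all three kinds are returned (an assumption).
    """
    groups = defaultdict(dict)
    for name in names:
        match = SUFFIX.match(name)
        if match:
            groups[match["base"]][match["kind"]] = name
    return {base: kinds for base, kinds in groups.items() if len(kinds) == 3}

attachments = [
    "home-expected.png",
    "home-actual.png",
    "home-diff.png",
    "logo.png",        # plain image, not part of a comparison
    "nav-actual.png",  # incomplete set, ignored by this sketch
]
comparisons = group_comparisons(attachments)
print(comparisons)
```

Images that do not match the convention, such as logo.png in this sample, would simply be displayed as regular inline images.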

Video attachments render as inline <video> elements with playback controls.

Playwright traces receive special treatment. When the @flakiness/playwright reporter captures a trace file, the Report Viewer displays it with a dedicated thumbnail and a link that opens the trace in the Playwright Trace Viewer, which gives you a full timeline of actions, network requests, DOM snapshots, and console logs for that test.

Traces are automatically detected when an attachment has the name trace.

Reporters capture stdout and stderr output during test execution. Each test run shows its terminal output in a dedicated section with log entries displayed in chronological order.

Stderr entries are highlighted with a red background to make them visually distinct.

ANSI escape sequences are rendered as styled text, so colors, bold, and other formatting match what you’d see in your terminal.

Any other attachment type (logs, JSON files, data exports) is shown as a downloadable link.