How CI runs on every PR

Every PR to main triggers .github/workflows/pr_branch_tests.yml. The pipeline is structured as four jobs in two waves, with the jobs inside each wave running in parallel. This keeps wall-clock time short (~13 min total instead of ~25 min if the jobs ran serially) while still proving the regulated chain end-to-end.

The four jobs

Total wall-clock: max(dart, build) + max(system, smoke) ≈ 13 min.
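The arithmetic behind that estimate can be sketched as follows. The per-job durations are hypothetical: the document only gives the ~25 min serial and ~13 min parallel totals, so the numbers below are chosen purely to reproduce them.

```python
# Hypothetical per-job durations in minutes (not measured values).
# They are picked only so the totals match the ~25 / ~13 min estimates.
durations = {
    "dart-tests": 6,
    "build-and-package": 7,
    "system-and-rtm": 6,
    "smoke-and-perf": 6,
}


def serial_minutes(d):
    """Wall-clock time if the four jobs ran one after another."""
    return sum(d.values())


def two_wave_minutes(d):
    """Wall-clock time when each wave lasts as long as its slowest job."""
    wave1 = max(d["dart-tests"], d["build-and-package"])
    wave2 = max(d["system-and-rtm"], d["smoke-and-perf"])
    return wave1 + wave2


print(serial_minutes(durations))    # 25
print(two_wave_minutes(durations))  # 13
```

Each wave costs only as much as its slowest job, so shortening the faster job in a wave does not change the total.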

Job 1 — `dart-tests (UT + IT) + analyze`

What it proves: the Flutter codebase compiles, static analysis passes, and every Dart UT/IT (the SRS-linked ones, by tag) is green.

Steps:

Checkout
Configure git creds (private bioflow_binaries repo via BIOFLOW_BINARIES_TOKEN)
Setup Flutter + cache
flutter pub get, upgrade zsig_plugin, dart run build_runner build
flutter analyze (continue-on-error — informational)
flutter test test/integration/ test/unit/ with --file-reporter "json:test-results/flutter.json"
Convert JSON → JUnit XML via junitreport:tojunit
Run performance tests (continue-on-error — informational)
Upload flutter-junit-<run> and perf-results-<run> artefacts
Re-evaluate the JUnit XML and exit non-zero if any test failed (the gate)
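The final gate step can be pictured with a minimal sketch like the one below. It is not the workflow's actual script: the function names and the sample XML are hypothetical, and it assumes the converted report uses the common JUnit layout in which a failed test case carries a `<failure>` or `<error>` child.

```python
import xml.etree.ElementTree as ET


def count_failed(junit_xml: str) -> int:
    """Count <testcase> elements that contain a <failure> or <error> child."""
    root = ET.fromstring(junit_xml)
    failed = 0
    for case in root.iter("testcase"):
        if case.find("failure") is not None or case.find("error") is not None:
            failed += 1
    return failed


def gate(junit_xml: str) -> int:
    """Return the exit code a gate step would use: 0 if green, 1 otherwise."""
    return 1 if count_failed(junit_xml) > 0 else 0


# Hypothetical report with one passing and one failing test case.
SAMPLE = """<testsuites>
  <testsuite name="unit">
    <testcase classname="a" name="passes"/>
    <testcase classname="a" name="breaks"><failure message="boom"/></testcase>
  </testsuite>
</testsuites>"""

print(count_failed(SAMPLE))  # 1
print(gate(SAMPLE))          # 1
```

Running the gate as the last step means the JUnit and performance artefacts are already uploaded by the time the job turns red.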

Why it’s in the first wave: if the unit / integration tests fail, the PR is red whatever the later jobs report, so this job starts immediately and surfaces that failure as early as possible.

Job 2 — `build-and-package` (Flutter build + Inno Setup)

What it proves: the project produces a real release-grade installer that’s ready to install on a Windows workstation.

Steps:

Checkout
Configure git creds
Setup Flutter + cache + Inno Setup
Get version from pubspec.yaml
flutter pub get, upgrade zsig_plugin, dart run build_runner build
flutter build windows --release --obfuscate --split-debug-info=build/symbols ...
Upload debug symbols to Sentry
Build installer via Inno Setup ISCC
Calculate SHA256 of the installer
Upload installer-<run> artefact (containing the .exe + SHA256SUMS)
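For readers who want to verify a downloaded installer locally, the checksum step can be approximated with the sketch below. The helper names are hypothetical, and the `<digest>  <file name>` line layout is an assumption based on the usual sha256sum convention, not a confirmed description of the SHA256SUMS file the workflow writes.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large installers are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def sums_line(path: Path) -> str:
    """Format one checksum line as '<hex digest>  <file name>' (assumed layout)."""
    return f"{sha256_of(path)}  {path.name}"
```

Comparing the output of `sha256_of` for the downloaded .exe with the published value confirms the file is the one the pipeline built.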

The job exposes outputs (installer_name, installer_size_mb, installer_sha256, version) consumed by wave-2 jobs.
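As an illustration of how such outputs could be produced, the sketch below reads the version from pubspec text and appends `key=value` lines to an output file. GitHub Actions steps publish outputs by appending lines of that form to the file named by the `GITHUB_OUTPUT` environment variable. The helper names are hypothetical, and the workflow's real steps may do this differently.

```python
import re


def pubspec_version(pubspec_text: str) -> str:
    """Extract the top-level `version:` value from pubspec.yaml text."""
    match = re.search(r"^version:[ \t]*(\S+)[ \t]*$", pubspec_text, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no top-level version: key found")
    # Tolerate a quoted YAML scalar such as version: "1.2.3+4".
    return match.group(1).strip("'\"")


def write_outputs(outputs: dict, path: str) -> None:
    """Append key=value lines; in a workflow, path would be os.environ['GITHUB_OUTPUT']."""
    with open(path, "a", encoding="utf-8") as fh:
        for key, value in outputs.items():
            fh.write(f"{key}={value}\n")
```

A downstream job can then read the values through the `needs` context, provided the job maps its step outputs to job-level `outputs`.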

Why it runs in parallel with dart-tests: the build doesn’t depend on test results; we want both running concurrently.

Job 3 — `system-and-rtm` (VisionTrace + RTM rendering)

Needs: build-and-package (for the installer) + dart-tests (for the JUnit XML).

What it proves: the freshly-built installer installs cleanly, the running app passes every recorded ST workflow, and the regulated review state has no unauthorised drift.

Steps:

Checkout
Setup uv + install Python deps
Install VisionTrace + allure-pytest + pyperclip (via VISIONTRACE_TOKEN)
Install allure-commandline via npm
Set display resolution to 1920x1080 (so VisionTrace’s vision verifier sees the same UI proportions as a dev workstation)
Download the installer-<run> artefact
Silent-install the installer to C:\Program Files\BioFlow Pro\
Download flutter-junit-<run> artefact
Run VisionTrace ST suite via pytest visiontrace_tests/ (the autouse ensure_licensed fixture activates the license on first launch)
Upload bioflow-visiontrace-diagnostics-<run> artefact (screenshots + screen recordings)
Run StrictDoc export
Run post-processor (renders RTM, builds Allure dashboard, deep-links badges)
Upload bioflow-rtm-<run>, bioflow-allure-results-<run>, flutter-junit-<run> artefacts
Run the post-processor’s --fail-on-suspect gate (the regulated drift gate)

All steps from “Run StrictDoc export” onwards run with if: always() so the RTM is generated even when tests failed earlier — exactly the artefact a reviewer wants to inspect on a red run.

Job 4 — `smoke-and-perf`

Needs: build-and-package (installer).

What it proves: the installer launches and stays running for ≥ 10 s without crashing, and performance benchmarks haven’t regressed.

Steps:

Checkout
Slack notification “build started”
Download installer-<run> artefact
Silent install
Smoke test — launch the app, poll for the main window (timeout 120 s), soak for ≥ 10 s, terminate
Download perf-results-<run> from dart-tests
Combine into benchmark-results.json
Compare against baseline via benchmark-action/github-action-benchmark
Upload installer to Slack (#build-development)
Upload bioflow-smoke-perf-<run> artefact
Write $GITHUB_STEP_SUMMARY with PR + checksum + startup time + version
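The launch / poll / soak / terminate sequence of the smoke test can be sketched generically as below. This is an assumption-laden outline, not the workflow's script: `smoke_test` and its `is_ready` callback are hypothetical names, and the real step presumably detects the main window with a Windows-specific check that is replaced here by a caller-supplied function.

```python
import subprocess
import sys
import time


def smoke_test(cmd, is_ready, ready_timeout_s=120.0, soak_s=10.0, poll_s=0.5):
    """Launch cmd, wait until is_ready(proc) is true, soak, then terminate.

    Returns True only if the process became ready before the timeout and
    was still alive at the end of the soak period.
    """
    proc = subprocess.Popen(cmd)
    try:
        deadline = time.monotonic() + ready_timeout_s
        while not is_ready(proc):
            # Give up if the app exited early or never became ready.
            if proc.poll() is not None or time.monotonic() > deadline:
                return False
            time.sleep(poll_s)
        soak_end = time.monotonic() + soak_s
        while time.monotonic() < soak_end:
            if proc.poll() is not None:
                return False  # crashed or exited during the soak
            time.sleep(poll_s)
        return proc.poll() is None
    finally:
        if proc.poll() is None:
            proc.terminate()
        proc.wait()


# Stand-in "app": a Python process that just sleeps. The readiness check is
# a placeholder that reports ready immediately.
ok = smoke_test(
    [sys.executable, "-c", "import time; time.sleep(5)"],
    is_ready=lambda proc: True,
    soak_s=0.2,
    poll_s=0.05,
)
print(ok)
```

Keeping the readiness timeout (120 s) separate from the soak window (10 s) lets a slow first launch pass while still catching a crash shortly after startup.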

Failure semantics

| If this job fails | Effect |
| --- | --- |
| `dart-tests` | Wave-2 jobs still run: `smoke-and-perf` doesn’t `needs:` dart-tests, and `system-and-rtm` runs regardless so the RTM can show the failed UT/IT in red. Smoke tests run normally. PR is red because dart-tests failed. |
| `build-and-package` | Wave-2 jobs are skipped (no installer to install). PR is red. |
| `system-and-rtm` | Smoke runs independently. PR is red. The RTM still renders even when ST fails (post-VT steps use `if: always()`). |
| `smoke-and-perf` | RTM still renders. PR is red because the smoke test or benchmark comparison failed. |

In every case, the bioflow-rtm-<run> artefact is uploaded if system-and-rtm ran (which happens whenever build-and-package succeeded). That’s why a red CI run still produces a useful artefact for review.
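The rules in the table above can be restated as a small model, which is handy for checking a scenario by hand. It is a simplified sketch: the function names are hypothetical and it encodes only what the table states, not the workflow's actual `needs:` / `if:` expressions.

```python
JOBS = ["dart-tests", "build-and-package", "system-and-rtm", "smoke-and-perf"]
WAVE_2 = {"system-and-rtm", "smoke-and-perf"}


def resolve(outcomes):
    """Map what each job would report if it ran to its final status.

    outcomes: dict of job name -> "success" or "failure".
    Wave-2 jobs are skipped only when build-and-package did not succeed.
    """
    statuses = {}
    for job in JOBS:
        if job in WAVE_2 and outcomes["build-and-package"] != "success":
            statuses[job] = "skipped"
        else:
            statuses[job] = outcomes[job]
    return statuses


def pr_is_green(statuses):
    """The PR is green only when all four jobs succeeded."""
    return all(status == "success" for status in statuses.values())


def rtm_uploaded(statuses):
    """The RTM artefact exists whenever system-and-rtm actually ran."""
    return statuses["system-and-rtm"] != "skipped"


all_ok = {job: "success" for job in JOBS}
red_tests = resolve({**all_ok, "dart-tests": "failure"})
print(pr_is_green(red_tests), rtm_uploaded(red_tests))  # False True
```

The last line shows the case reviewers care about most: a red PR caused by failing Dart tests still yields an RTM to inspect.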

Required status checks for branch protection

For a regulated PR to be mergeable, all four of these jobs must be green. They are configured as required status checks in the branch-protection playbook.

Where the configuration lives

.github/workflows/pr_branch_tests.yml — full source. Read the comments at the top of each job for the rationale captured at the time each piece was written.
