How CI runs on every PR
Every PR to main triggers .github/workflows/pr_branch_tests.yml. The pipeline is structured as four jobs in two waves, with the jobs inside each wave running in parallel. This keeps wall-clock time short (~13 min total instead of ~25 min serial) while still proving the regulated chain end-to-end.
The four jobs
Total wall-clock: max(dart, build) + max(system, smoke) ≈ 13 min.
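As a worked example of the formula above, the sketch below plugs in per-job durations. The individual timings are hypothetical (this page only gives the ~25 min and ~13 min totals); they are chosen so that the sums match those two figures.

```python
# Hypothetical per-job durations in minutes; illustrative only, not measured values.
durations = {
    "dart-tests": 7,
    "build-and-package": 8,
    "system-and-rtm": 5,
    "smoke-and-perf": 5,
}


def serial_minutes(d):
    """Wall-clock time if all four jobs ran one after another."""
    return sum(d.values())


def two_wave_minutes(d):
    """Wall-clock time with two waves: each wave lasts as long as its slowest job."""
    wave1 = max(d["dart-tests"], d["build-and-package"])
    wave2 = max(d["system-and-rtm"], d["smoke-and-perf"])
    return wave1 + wave2


print(serial_minutes(durations))    # 25
print(two_wave_minutes(durations))  # 13
```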
Job 1 — dart-tests (UT + IT) + analyze
What it proves: the Flutter codebase compiles and every Dart UT/IT (the SRS-linked ones, by tag) is green. Static analysis also runs, but its result is informational and does not gate the job.
Steps:
- Checkout
- Configure git creds (private `bioflow_binaries` repo via `BIOFLOW_BINARIES_TOKEN`)
- Setup Flutter + cache
- `flutter pub get`, upgrade `zsig_plugin`, `dart run build_runner build`
- `flutter analyze` (continue-on-error, informational)
- `flutter test test/integration/ test/unit/` with `--file-reporter "json:test-results/flutter.json"`
- Convert JSON → JUnit XML via `junitreport:tojunit`
- Run performance tests (continue-on-error, informational)
- Upload `flutter-junit-<run>` and `perf-results-<run>` artefacts
- Re-evaluate the JUnit XML and exit non-zero if any test failed (the gate)
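The final gate step can be pictured with a minimal Python sketch. It is not the script the workflow actually runs (that lives in the workflow file); it only assumes the standard JUnit XML convention that a failed test is a `<testcase>` element with a `<failure>` or `<error>` child. The function names `count_failed` and `gate` are hypothetical.

```python
import sys
import xml.etree.ElementTree as ET


def count_failed(junit_xml_path):
    """Count <testcase> elements that contain a <failure> or <error> child."""
    root = ET.parse(junit_xml_path).getroot()
    failed = 0
    for case in root.iter("testcase"):
        if case.find("failure") is not None or case.find("error") is not None:
            failed += 1
    return failed


def gate(junit_xml_path):
    """Return the exit code for the gate: 0 if every test passed, 1 otherwise."""
    failed = count_failed(junit_xml_path)
    if failed:
        print(f"{failed} test(s) failed", file=sys.stderr)
        return 1
    return 0
```

A CI step would pass the result to `sys.exit(...)`. Running the gate as a separate last step, after the artefacts have been uploaded, is what lets the JUnit XML reach later jobs even on a red run.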
Why it’s first: if the unit / integration tests fail, the rest of the pipeline is meaningless. Failing fast saves runner minutes.
Job 2 — build-and-package (Flutter build + Inno Setup)
What it proves: the project produces a real release-grade installer that’s ready to install on a Windows workstation.
Steps:
- Checkout
- Configure git creds
- Setup Flutter + cache + Inno Setup
- Get version from `pubspec.yaml`
- `flutter pub get`, upgrade `zsig_plugin`, `dart run build_runner build`
- `flutter build windows --release --obfuscate --split-debug-info=build/symbols ...`
- Upload debug symbols to Sentry
- Build installer via Inno Setup ISCC
- Calculate SHA256 of the installer
- Upload `installer-<run>` artefact (containing the .exe + SHA256SUMS)
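The checksum step can be sketched in Python as follows. This is an illustration, not the workflow's own implementation, which is not shown on this page. The helper names are hypothetical, and the `<digest>  <file name>` line layout is an assumption based on the format that `sha256sum -c` reads.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so a large installer is never loaded fully into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_sha256sums(installer_path, sums_path):
    """Write one '<hex digest>  <file name>' line and return the digest."""
    installer_path = Path(installer_path)
    checksum = sha256_of(installer_path)
    # newline="\n" keeps the file byte-identical across Windows and Linux runners.
    with open(sums_path, "w", encoding="utf-8", newline="\n") as fh:
        fh.write(f"{checksum}  {installer_path.name}\n")
    return checksum
```

Shipping the checksum file next to the .exe in the same artefact lets a downstream job or a reviewer confirm that the installer they test is the one this job built.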
The job exposes outputs (installer_name, installer_size_mb, installer_sha256, version) consumed by wave-2 jobs.
Why it runs in parallel with dart-tests: the build doesn’t depend on test results; we want both running concurrently.
Job 3 — system-and-rtm (VisionTrace + RTM rendering)
Needs: build-and-package (for the installer) + dart-tests (for the JUnit XML).
What it proves: the freshly-built installer installs cleanly, the running app passes every recorded ST workflow, and the regulated review state has no unauthorised drift.
Steps:
- Checkout
- Setup `uv` + install Python deps
- Install VisionTrace + `allure-pytest` + `pyperclip` (via `VISIONTRACE_TOKEN`)
- Install `allure-commandline` via npm
- Set display resolution to 1920x1080 (so VisionTrace’s vision verifier sees the same UI proportions as a dev workstation)
- Download the `installer-<run>` artefact
- Silent-install the installer to `C:\Program Files\BioFlow Pro\`
- Download `flutter-junit-<run>` artefact
- Run VisionTrace ST suite via `pytest visiontrace_tests/` (the autouse `ensure_licensed` fixture activates the license on first launch)
- Upload `bioflow-visiontrace-diagnostics-<run>` artefact (screenshots + screen recordings)
- Run StrictDoc export
- Run post-processor (renders RTM, builds Allure dashboard, deep-links badges)
- Upload `bioflow-rtm-<run>`, `bioflow-allure-results-<run>`, `flutter-junit-<run>` artefacts
- Run the post-processor’s `--fail-on-suspect` gate (the regulated drift gate)
All steps from “Run StrictDoc export” onwards run with if: always() so the RTM is generated even when tests failed earlier — exactly the artefact a reviewer wants to inspect on a red run.
Job 4 — smoke-and-perf
Needs: build-and-package (installer).
What it proves: the installer launches and stays running for ≥ 10 s without crashing, and performance benchmarks haven’t regressed.
Steps:
- Checkout
- Slack notification “build started”
- Download `installer-<run>` artefact
- Silent install
- Smoke test: launch the app, poll for the main window (timeout 120 s), soak for ≥ 10 s, terminate
- Download `perf-results-<run>` from `dart-tests`
- Combine into `benchmark-results.json`
- Compare against baseline via `benchmark-action/github-action-benchmark`
- Upload installer to Slack (`#build-development`)
- Upload `bioflow-smoke-perf-<run>` artefact
- Write `$GITHUB_STEP_SUMMARY` with PR + checksum + startup time + version
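The smoke-test step (launch, poll, soak, terminate) can be sketched in Python as below. This is an assumption-laden illustration, not the workflow's script. In particular, `window_is_visible` is a hypothetical callback standing in for whatever window detection the real step uses, and `smoke_test` is a hypothetical name. The defaults mirror the 120 s timeout and 10 s soak listed above.

```python
import subprocess
import time


def smoke_test(command, window_is_visible, startup_timeout=120.0,
               soak_seconds=10.0, poll_interval=0.5):
    """Launch the app, wait for its main window, soak, then terminate.

    Returns the startup time in seconds. Raises if the app exits early
    or the window never appears.
    """
    proc = subprocess.Popen(command)
    started = time.monotonic()
    try:
        # Phase 1: poll until the window appears, the process dies, or we time out.
        while not window_is_visible(proc):
            if proc.poll() is not None:
                raise RuntimeError(f"app exited during startup (code {proc.returncode})")
            if time.monotonic() - started > startup_timeout:
                raise TimeoutError("main window did not appear in time")
            time.sleep(poll_interval)
        startup_time = time.monotonic() - started

        # Phase 2: soak. The process must stay alive for the whole period.
        soak_deadline = time.monotonic() + soak_seconds
        while time.monotonic() < soak_deadline:
            if proc.poll() is not None:
                raise RuntimeError(f"app crashed during soak (code {proc.returncode})")
            time.sleep(poll_interval)
        return startup_time
    finally:
        # Phase 3: always clean up, even when an earlier phase raised.
        if proc.poll() is None:
            proc.terminate()
            try:
                proc.wait(timeout=10)
            except subprocess.TimeoutExpired:
                proc.kill()
                proc.wait()
```

The returned startup time is the kind of value that could feed the step summary. Checking `proc.poll()` inside both loops distinguishes "crashed" from "slow to start", which gives a more useful failure message than a bare timeout.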
Failure semantics
| If this job fails | Effect |
|---|---|
| `dart-tests` | Wave-2 jobs still run (they are not gated on `dart-tests` passing). The RTM renders with the failed UT/IT shown in red. Smoke tests run normally. PR is red because `dart-tests` failed. |
| `build-and-package` | Wave-2 jobs are skipped (no installer to install). PR is red. |
| `system-and-rtm` | Smoke runs independently. PR is red. The RTM still renders even when ST fails (post-VT steps use `if: always()`). |
| `smoke-and-perf` | RTM still renders. PR is red because smoke / benchmark fails. |
In every case, the bioflow-rtm-<run> artefact is uploaded if system-and-rtm ran (which happens whenever build-and-package succeeded). That’s why a red CI run still produces a useful artefact for review.
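The failure semantics can be condensed into a small decision model. The Python sketch below is a simplification written for this page, not code from the workflow: the `evaluate` function and its status strings are hypothetical, and it encodes only the rules stated here (wave 2 is skipped when the build fails, the RTM artefact is uploaded whenever system-and-rtm ran, and the PR is green only when all four jobs are green).

```python
WAVE_1 = ("dart-tests", "build-and-package")
WAVE_2 = ("system-and-rtm", "smoke-and-perf")


def evaluate(passed):
    """Model one CI run.

    `passed` maps each job name to True/False: would the job pass if it ran?
    Returns (status per job, RTM artefact uploaded?, PR green?).
    """
    status = {job: "success" if passed[job] else "failure" for job in WAVE_1}
    build_ok = passed["build-and-package"]
    for job in WAVE_2:
        if not build_ok:
            status[job] = "skipped"  # no installer to install
        else:
            status[job] = "success" if passed[job] else "failure"
    rtm_uploaded = status["system-and-rtm"] != "skipped"
    pr_green = all(s == "success" for s in status.values())
    return status, rtm_uploaded, pr_green


# Example: unit tests fail, everything else would pass.
status, rtm_uploaded, pr_green = evaluate({
    "dart-tests": False,
    "build-and-package": True,
    "system-and-rtm": True,
    "smoke-and-perf": True,
})
print(status["system-and-rtm"], rtm_uploaded, pr_green)  # success True False
```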
Required status checks for branch protection
For a regulated PR to be mergeable, all four of these jobs must be green. They are configured as required status checks in the branch-protection playbook.
Where the configuration lives
.github/workflows/pr_branch_tests.yml: full source. Read the comments at the top of each job for the rationale captured at the time each piece was written.