How CI runs on every PR

Every PR to main triggers .github/workflows/pr_branch_tests.yml. The pipeline is structured as four jobs in two waves, with the jobs inside each wave running in parallel. This keeps wall-clock time short (~13 min total instead of ~25 min serial) while still proving the regulated chain end-to-end.

Total wall-clock: max(dart, build) + max(system, smoke) ≈ 13 min.
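
The formula above can be sketched in a few lines of Python. The per-job durations below are hypothetical placeholders, chosen only so that the totals match the ~13 min and ~25 min figures quoted here; they are not measurements from the real workflow.

```python
def wall_clock(*waves: dict[str, float]) -> float:
    """Critical-path time: each wave lasts as long as its slowest job."""
    return sum(max(wave.values()) for wave in waves)


def serial_time(*waves: dict[str, float]) -> float:
    """Time if every job ran one after another."""
    return sum(sum(wave.values()) for wave in waves)


# Hypothetical durations in minutes (assumption, not measured data).
wave1 = {"dart-tests": 7.0, "build-and-package": 8.0}
wave2 = {"system-and-rtm": 5.0, "smoke-and-perf": 5.0}

print(wall_clock(wave1, wave2))   # 13.0
print(serial_time(wave1, wave2))  # 25.0
```

The saving comes entirely from the slower job in each wave hiding the faster one, so shortening the faster job of a wave does not change the total.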

Job 1 — dart-tests

What it proves: the Flutter codebase compiles, static analysis passes, and every Dart UT/IT (the SRS-linked ones, by tag) is green.

Steps:

  1. Checkout
  2. Configure git creds (private bioflow_binaries repo via BIOFLOW_BINARIES_TOKEN)
  3. Setup Flutter + cache
  4. flutter pub get, upgrade zsig_plugin, dart run build_runner build
  5. flutter analyze (continue-on-error — informational)
  6. flutter test test/integration/ test/unit/ with --file-reporter "json:test-results/flutter.json"
  7. Convert JSON → JUnit XML via junitreport:tojunit
  8. Run performance tests (continue-on-error — informational)
  9. Upload flutter-junit-<run> and perf-results-<run> artefacts
  10. Re-evaluate the JUnit XML and exit non-zero if any test failed (the gate)
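
The final gate step can be pictured as a small script that re-reads the JUnit XML and decides the exit code. This is a minimal sketch, not the workflow's actual step: it assumes the converted report uses the common `<testcase>` / `<failure>` / `<error>` layout, and the function names and sample XML are invented for illustration.

```python
import sys
import xml.etree.ElementTree as ET


def count_failures(junit_xml: str) -> int:
    """Count <testcase> elements that carry a <failure> or <error> child."""
    root = ET.fromstring(junit_xml)
    return sum(
        1
        for case in root.iter("testcase")
        if case.find("failure") is not None or case.find("error") is not None
    )


def gate(junit_xml: str) -> int:
    """Return the exit code the gate step would use: 0 if green, 1 otherwise."""
    failed = count_failures(junit_xml)
    if failed:
        print(f"{failed} test(s) failed", file=sys.stderr)
        return 1
    return 0


# Invented sample report: one passing test, one failing test.
SAMPLE = """<testsuites>
  <testsuite name="unit" tests="2">
    <testcase classname="demo" name="passes"/>
    <testcase classname="demo" name="fails"><failure message="boom"/></testcase>
  </testsuite>
</testsuites>"""

print(gate(SAMPLE))  # 1
```

Separating the gate from the test run is what lets the artefact upload happen first: the report is published regardless, and only afterwards does the job turn red.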

Why it’s first: if the unit / integration tests fail, the rest of the pipeline is meaningless. Failing fast saves runner minutes.

Job 2 — build-and-package (Flutter build + Inno Setup)

What it proves: the project produces a real release-grade installer that’s ready to install on a Windows workstation.

Steps:

  1. Checkout
  2. Configure git creds
  3. Setup Flutter + cache + Inno Setup
  4. Get version from pubspec.yaml
  5. flutter pub get, upgrade zsig_plugin, dart run build_runner build
  6. flutter build windows --release --obfuscate --split-debug-info=build/symbols ...
  7. Upload debug symbols to Sentry
  8. Build installer via Inno Setup ISCC
  9. Calculate SHA256 of the installer
  10. Upload installer-<run> artefact (containing the .exe + SHA256SUMS)

The job exposes outputs (installer_name, installer_size_mb, installer_sha256, version) consumed by wave-2 jobs.
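
As a rough illustration of how those four outputs could be derived, the sketch below reads the version from pubspec.yaml text, hashes the installer, and formats the result as `name=value` lines of the kind GitHub Actions reads from the file named by `$GITHUB_OUTPUT`. The helper names are hypothetical, and the real job may well compute these values in PowerShell or another shell instead.

```python
import hashlib
import re
from pathlib import Path


def read_version(pubspec_text: str) -> str:
    """Return the value of the top-level `version:` key in pubspec.yaml text."""
    match = re.search(r"^version:\s*(\S+)", pubspec_text, flags=re.MULTILINE)
    if match is None:
        raise ValueError("no top-level version: key found")
    return match.group(1)


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash the file in chunks so a large installer is never fully in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def job_outputs(installer: Path, pubspec_text: str) -> dict[str, str]:
    """Collect the four values under the output names listed above."""
    size_mb = installer.stat().st_size / (1024 * 1024)
    return {
        "installer_name": installer.name,
        "installer_size_mb": f"{size_mb:.2f}",
        "installer_sha256": sha256_of(installer),
        "version": read_version(pubspec_text),
    }


def as_github_output(outputs: dict[str, str]) -> str:
    """Format outputs as the name=value lines appended to $GITHUB_OUTPUT."""
    return "".join(f"{name}={value}\n" for name, value in outputs.items())
```

Publishing the checksum as a job output, rather than only inside the artefact, lets the wave-2 jobs verify the installer they download against the value computed at build time.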

Why it runs in parallel with dart-tests: the build doesn’t depend on test results; we want both running concurrently.

Job 3 — system-and-rtm (VisionTrace + RTM rendering)

Needs: build-and-package (for the installer) + dart-tests (for the JUnit XML).

What it proves: the freshly-built installer installs cleanly, the running app passes every recorded ST workflow, and the regulated review state has no unauthorised drift.

Steps:

  1. Checkout
  2. Setup uv + install Python deps
  3. Install VisionTrace + allure-pytest + pyperclip (via VISIONTRACE_TOKEN)
  4. Install allure-commandline via npm
  5. Set display resolution to 1920x1080 (so VisionTrace’s vision verifier sees the same UI proportions as a dev workstation)
  6. Download the installer-<run> artefact
  7. Silent-install the installer to C:\Program Files\BioFlow Pro\
  8. Download flutter-junit-<run> artefact
  9. Run VisionTrace ST suite via pytest visiontrace_tests/ (the autouse ensure_licensed fixture activates the license on first launch)
  10. Upload bioflow-visiontrace-diagnostics-<run> artefact (screenshots + screen recordings)
  11. Run StrictDoc export
  12. Run post-processor (renders RTM, builds Allure dashboard, deep-links badges)
  13. Upload bioflow-rtm-<run>, bioflow-allure-results-<run>, flutter-junit-<run> artefacts
  14. Run the post-processor’s --fail-on-suspect gate (the regulated drift gate)

All steps from “Run StrictDoc export” onwards run with if: always() so the RTM is generated even when tests failed earlier — exactly the artefact a reviewer wants to inspect on a red run.

Job 4 — smoke-and-perf

Needs: build-and-package (installer).

What it proves: the installer launches and stays running for ≥ 10 s without crashing, and performance benchmarks haven’t regressed.

Steps:

  1. Checkout
  2. Slack notification “build started”
  3. Download installer-<run> artefact
  4. Silent install
  5. Smoke test — launch the app, poll for the main window (timeout 120 s), soak for ≥ 10 s, terminate
  6. Download perf-results-<run> from dart-tests
  7. Combine into benchmark-results.json
  8. Compare against baseline via benchmark-action/github-action-benchmark
  9. Upload installer to Slack (#build-development)
  10. Upload bioflow-smoke-perf-<run> artefact
  11. Write $GITHUB_STEP_SUMMARY with PR + checksum + startup time + version
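
The smoke-test step (launch, poll, soak, terminate) follows a pattern that can be sketched as below. This is an illustration under assumptions, not the job's actual script: `smoke_test` and its parameters are invented names, and detecting the main window is left to a caller-supplied `window_ready` callable because that check is platform-specific.

```python
import subprocess
import time
from typing import Callable, Sequence


def smoke_test(
    command: Sequence[str],
    window_ready: Callable[[], bool],
    ready_timeout_s: float = 120.0,
    soak_s: float = 10.0,
    poll_interval_s: float = 0.5,
) -> bool:
    """Launch the app, wait for readiness, soak, then terminate.

    Returns True only if the process became ready in time and was still
    alive at the end of the soak period.
    """
    proc = subprocess.Popen(command)
    try:
        # Phase 1: poll until the main window appears or the timeout expires.
        deadline = time.monotonic() + ready_timeout_s
        while not window_ready():
            if proc.poll() is not None or time.monotonic() > deadline:
                return False
            time.sleep(poll_interval_s)

        # Phase 2: soak, failing as soon as the process exits on its own.
        soak_end = time.monotonic() + soak_s
        while time.monotonic() < soak_end:
            if proc.poll() is not None:
                return False
            time.sleep(poll_interval_s)
        return True
    finally:
        # Always clean up, whether the test passed or failed.
        if proc.poll() is None:
            proc.terminate()
        proc.wait()
```

Checking `proc.poll()` during both phases distinguishes "crashed before the window appeared" from "window never appeared", which are different failures even though both end in a red job.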

| If this job fails | Effect |
| --- | --- |
| dart-tests | Wave-2 jobs still run (they don’t needs: dart-tests). RTM renders showing the failed UT/IT in red. Smoke tests run normally. PR is red because dart-tests failed. |
| build-and-package | Wave-2 jobs are skipped (no installer to install). PR is red. |
| system-and-rtm | Smoke runs independently. PR is red. The RTM still renders even when ST fails (post-VT steps use if: always()). |
| smoke-and-perf | RTM still renders. PR is red because smoke / benchmark fails. |

In every case, the bioflow-rtm-<run> artefact is uploaded if system-and-rtm ran (which happens whenever build-and-package succeeded). That’s why a red CI run still produces a useful artefact for review.
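
The failure table can be reproduced with a toy model of GitHub Actions' default rule that a job is skipped when any job it needs did not succeed. The sketch assumes the dependency edges the table describes (both wave-2 jobs need only build-and-package); the `NEEDS` map and function names are illustrative and are not taken from the workflow file.

```python
# Assumed dependency edges, following the failure table above.
NEEDS: dict[str, list[str]] = {
    "dart-tests": [],
    "build-and-package": [],
    "system-and-rtm": ["build-and-package"],
    "smoke-and-perf": ["build-and-package"],
}


def run_pipeline(failing: set[str]) -> dict[str, str]:
    """Resolve each job to 'success', 'failure' or 'skipped'."""
    status: dict[str, str] = {}
    for job, needs in NEEDS.items():  # insertion order is already topological
        if any(status[dep] != "success" for dep in needs):
            status[job] = "skipped"
        elif job in failing:
            status[job] = "failure"
        else:
            status[job] = "success"
    return status


def pr_is_green(status: dict[str, str]) -> bool:
    """All four jobs are required checks, so every one must succeed."""
    return all(state == "success" for state in status.values())


print(run_pipeline({"build-and-package"})["system-and-rtm"])  # skipped
print(run_pipeline({"dart-tests"})["system-and-rtm"])         # success
```

Under these assumptions the RTM job runs in every scenario except a failed build, which matches the statement that the artefact is available whenever build-and-package succeeded.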

Required status checks for branch protection

For a regulated PR to be mergeable, all four of these jobs must be green. They are configured as required status checks in the branch-protection playbook.

.github/workflows/pr_branch_tests.yml — full source. Read the comments at the top of each job for the rationale captured at the time each piece was written.