Live Observe Performance Policy
The live runtime workbench treats "low overhead" as a validation target. The targets are per-machine comparisons against a local baseline, not portable absolute latency guarantees.
Budget
| Mode | Target overhead |
|---|---|
| observe disabled | < 0.5% |
| observe summary | < 2% |
| observe detailed | < 5% |
| observe debug | no hard threshold; explicitly intrusive |
The disabled target protects the existing behavior of graph run, graph metrics,
and graph trace. Summary is the default mode for graph observe. Detailed and
debug are opt-in investigation modes.
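As a rough illustration, the budget table can be expressed as data plus a relative-overhead check. This is a minimal sketch, not the project's implementation: `BUDGET_PCT`, `overhead_pct`, and `within_budget` are hypothetical names, and the inputs are assumed to be wall-clock times for the same workload measured on the same machine.

```python
from typing import Dict, Optional

# Budget from the table above, in percent overhead relative to the local baseline.
# None stands for "no hard threshold" (debug is explicitly intrusive).
BUDGET_PCT: Dict[str, Optional[float]] = {
    "disabled": 0.5,
    "summary": 2.0,
    "detailed": 5.0,
    "debug": None,
}


def overhead_pct(baseline_s: float, observed_s: float) -> float:
    """Relative overhead of an observed run versus the baseline, in percent."""
    if baseline_s <= 0:
        raise ValueError("baseline must be positive")
    return (observed_s - baseline_s) / baseline_s * 100.0


def within_budget(mode: str, baseline_s: float, observed_s: float) -> bool:
    """True when the mode's overhead is strictly below its target."""
    limit = BUDGET_PCT[mode]
    if limit is None:
        # Debug mode has no hard threshold, so it never fails the check.
        return True
    return overhead_pct(baseline_s, observed_s) < limit
```

Because the check is relative to a local baseline, the same code gives comparable answers on fast and slow machines, which matches the per-machine framing of the targets.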
Measurement policy
- Compare modes on the same machine, same build type, and same graph workload.
- Use repeated runs and report median plus p95 where timing is available.
- Do not enforce cross-machine absolute timing thresholds in CI.
- When the environment is too noisy for timing thresholds, CI may run only smoke/output-contract checks.
- Thresholds are adjustable through environment variables for local release evidence, but docs and defaults remain conservative.
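The repeated-run comparison above could be summarized along the following lines. This is a sketch under stated assumptions: the function names are hypothetical, the samples are per-run wall-clock times collected by some other tool, and p95 uses the nearest-rank definition (other percentile definitions would give slightly different values).

```python
import statistics
from typing import Dict, List


def p95(samples: List[float]) -> float:
    """Nearest-rank 95th percentile of the samples."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # ceil(0.95 * n) computed with integers to avoid float rounding surprises.
    rank = (95 * len(ordered) + 99) // 100  # 1-based rank
    return ordered[rank - 1]


def summarize(samples: List[float]) -> Dict[str, float]:
    """Median and p95 for one mode's repeated runs."""
    return {"median": statistics.median(samples), "p95": p95(samples)}


def compare(baseline: List[float], candidate: List[float]) -> Dict[str, float]:
    """Percent overhead of candidate over baseline, for median and p95."""
    b, c = summarize(baseline), summarize(candidate)
    return {
        key + "_overhead_pct": (c[key] - b[key]) / b[key] * 100.0
        for key in ("median", "p95")
    }
```

Both sample lists should come from the same machine, build type, and graph workload; otherwise the percentages are not meaningful.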
Benchmark workloads
Checked-in live workloads should cover:
- minimal graph;
- high-frequency channel publish/commit paths;
- trigger stress;
- thread-pool lane behavior;
- CompositeLoop iteration/convergence.
Checked-in live-observe benchmark files:
- benchmarks/live_observe_minimal.yaml
- benchmarks/live_observe_high_frequency_channels.yaml
- benchmarks/live_observe_trigger_stress.yaml
- benchmarks/live_observe_thread_pool.yaml
- benchmarks/live_observe_composite_loop.yaml
Focused gates
`./scripts/goal_check.sh live` verifies functional and schema behavior:
- observe schema smoke;
- CLI NDJSON smoke;
- assertion pass/fail smoke;
- record artifact smoke;
- replay smoke;
- dashboard static asset smoke;
- observer drop summary smoke.
`./scripts/goal_check.sh live-perf` verifies low-overhead evidence:
- disabled baseline;
- summary overhead;
- detailed overhead;
- high-frequency overflow does not block;
- observer drops are visible;
- no unbounded memory growth in bounded smoke scenarios.
Local command shape
Example release-evidence loop:
topoexec graph bench benchmarks/live_observe_high_frequency_channels.yaml \
--steps 1000 --runs 20 --format json
topoexec graph observe benchmarks/live_observe_high_frequency_channels.yaml \
--steps 1000 --observe-level summary --record /tmp/topoexec-live-summary
topoexec graph observe benchmarks/live_observe_high_frequency_channels.yaml \
--steps 1000 --observe-level detailed --record /tmp/topoexec-live-detailed
./scripts/goal_check.sh live-perf
live-perf must prefer a controlled per-machine baseline. If a local environment
cannot provide stable timing, record in the goal ledger the skipped command, the
reason, the replacement smoke evidence, and whether the gap blocks release.
The default live-perf gate is intentionally smoke/output-contract only. Set
TOPOEXEC_LIVE_PERF_ENFORCE=1 to make the local per-machine thresholds
blocking for release evidence.
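The advisory-by-default behavior can be pictured as follows. This is only a sketch of the policy, not the contents of scripts/goal_check.sh: `evaluate_gate` and `LIMITS_PCT` are hypothetical names, the limits are copied from the budget table, and the only environment variable used is TOPOEXEC_LIVE_PERF_ENFORCE as documented above.

```python
import os
from typing import Dict, List, Mapping, Tuple

# Targets from the budget table, in percent. Debug has no hard threshold.
LIMITS_PCT: Dict[str, float] = {"disabled": 0.5, "summary": 2.0, "detailed": 5.0}


def evaluate_gate(
    measured_pct: Mapping[str, float],
    environ: Mapping[str, str] = os.environ,
) -> Tuple[int, List[str]]:
    """Return (exit_code, violations) for measured per-mode overheads."""
    # Assumption: only the literal value "1" turns enforcement on.
    enforce = environ.get("TOPOEXEC_LIVE_PERF_ENFORCE") == "1"
    violations = [
        f"{mode}: {measured_pct[mode]:.2f}% >= {limit}%"
        for mode, limit in LIMITS_PCT.items()
        if mode in measured_pct and measured_pct[mode] >= limit
    ]
    # Violations are always reported, but block only when enforcement is on.
    exit_code = 1 if (enforce and violations) else 0
    return exit_code, violations
```

Reporting violations even when they do not block keeps the evidence visible in noisy environments without turning timing noise into spurious failures.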
Hot-path audit checklist
Before claiming the budget, inspect or test that producer paths avoid:
- JSON serialization;
- socket or file I/O;
- UI waits;
- payload body copies;
- unbounded allocations;
- blocking queue pushes;
- contended mutex waits;
- global shared-map lookups;
- a global atomic event-sequence increment on every event.
Collectors, recorders, assertion tools, and dashboards may use richer data structures because they run outside the runtime hot path.
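To make a few checklist items concrete, here is a minimal sketch of a producer-side buffer that never blocks, bounds its memory, counts drops, and numbers events per producer instead of through a global counter. It is written in Python purely to show the shape; the class and method names are hypothetical, it is single-threaded, and it says nothing about how the actual runtime implements its hot path.

```python
from collections import deque
from typing import Deque, List, Tuple


class BoundedEventBuffer:
    """Single-producer buffer: a full buffer drops the event and counts it."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._events: Deque[Tuple[int, int]] = deque()
        self._next_seq = 0  # per-producer sequence, not a global atomic
        self.dropped = 0    # makes observer drops visible to collectors

    def try_push(self, event_id: int) -> bool:
        """Store a small fixed-size record; never wait for free space."""
        seq = self._next_seq
        self._next_seq += 1
        if len(self._events) >= self._capacity:
            self.dropped += 1
            return False
        # Only an identifier is stored, not a copy of the payload body.
        self._events.append((seq, event_id))
        return True

    def drain(self) -> List[Tuple[int, int]]:
        """Collector side: take everything buffered so far."""
        out = list(self._events)
        self._events.clear()
        return out
```

A collector that sees a gap in the sequence numbers, or a nonzero `dropped` count, can report the loss without the producer ever having waited. Serialization and I/O would then happen on the collector side after `drain()`.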