Live Observe Performance Policy

The live runtime workbench treats "low overhead" as a validation target. The targets are per-machine comparisons against a local baseline, not portable absolute latency guarantees.

Budget

Mode	Target overhead
observe disabled	< 0.5%
observe summary	< 2%
observe detailed	< 5%
observe debug	no hard threshold; explicitly intrusive

The disabled target protects existing graph run, graph metrics, and graph trace behavior. Summary is the default graph observe mode. Detailed and debug are opt-in investigation modes.
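
As a rough illustration of how the budget table might be applied, the sketch below computes relative overhead against a local baseline and compares it with the per-mode target. The function names and the dictionary layout are assumptions made for this example; they are not taken from the topoexec sources.

```python
from typing import Dict, Optional

# Budget table from this page, in percent overhead relative to the local
# baseline. None means "no hard threshold" (debug is explicitly intrusive).
BUDGET_PERCENT: Dict[str, Optional[float]] = {
    "disabled": 0.5,
    "summary": 2.0,
    "detailed": 5.0,
    "debug": None,
}


def overhead_percent(baseline_s: float, measured_s: float) -> float:
    """Relative overhead of a measured run versus the baseline, in percent."""
    if baseline_s <= 0:
        raise ValueError("baseline must be positive")
    return (measured_s - baseline_s) / baseline_s * 100.0


def within_budget(mode: str, baseline_s: float, measured_s: float) -> bool:
    """True when the measured overhead is strictly below the mode's target."""
    limit = BUDGET_PERCENT[mode]
    if limit is None:
        return True  # debug mode: nothing to enforce
    return overhead_percent(baseline_s, measured_s) < limit
```

Both timings are expected to come from the same machine, build type, and workload; comparing numbers taken on different machines would defeat the purpose of a per-machine baseline.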

Measurement policy

Compare modes on the same machine, same build type, and same graph workload.
Use repeated runs and report median plus p95 where timing is available.
Do not enforce cross-machine absolute timing thresholds in CI.
When the environment is too noisy for timing thresholds, CI may run only smoke/output-contract checks.
Thresholds can be adjusted through environment variables when gathering local release evidence, but the documented targets and the defaults remain conservative.
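
One way to summarize repeated runs is sketched below: it reports the median and a nearest-rank p95, and reads an optional threshold override from the environment. The variable name pattern `EXAMPLE_LIVE_PERF_<MODE>_MAX_PCT` is hypothetical and only stands in for whatever variables the real scripts define.

```python
import os
import statistics
from typing import Dict, Sequence


def p95(samples: Sequence[float]) -> float:
    """Nearest-rank 95th percentile of the samples."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # ceil(0.95 * n) in integer arithmetic, as a 1-based rank.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def summarize(samples: Sequence[float]) -> Dict[str, float]:
    """Median plus p95 for one mode's repeated runs."""
    return {"median": statistics.median(samples), "p95": p95(samples)}


def threshold_percent(mode: str, default: float) -> float:
    """Per-mode threshold, optionally overridden from the environment.

    The variable name is a HYPOTHETICAL placeholder for this sketch.
    """
    raw = os.environ.get(f"EXAMPLE_LIVE_PERF_{mode.upper()}_MAX_PCT")
    return float(raw) if raw else default
```

Reporting the median alongside p95 keeps a single noisy run from dominating the result while still exposing tail behavior.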

Benchmark workloads

Checked-in live workloads should cover:

minimal graph;
high-frequency channel publish/commit paths;
trigger stress;
thread-pool lane behavior;
CompositeLoop iteration/convergence.

Checked-in live-observe benchmark files:

benchmarks/live_observe_minimal.yaml
benchmarks/live_observe_high_frequency_channels.yaml
benchmarks/live_observe_trigger_stress.yaml
benchmarks/live_observe_thread_pool.yaml
benchmarks/live_observe_composite_loop.yaml

Focused gates

./scripts/goal_check.sh live verifies functional and schema behavior:

observe schema smoke;
CLI NDJSON smoke;
assertion pass/fail smoke;
record artifact smoke;
replay smoke;
dashboard static asset smoke;
observer drop summary smoke.

./scripts/goal_check.sh live-perf verifies low-overhead evidence:

disabled baseline;
summary overhead;
detailed overhead;
high-frequency overflow does not block;
observer drops are visible;
no unbounded memory growth in bounded smoke scenarios.
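
The overflow, drop-visibility, and bounded-memory items above can be pictured with a small drop-and-count buffer. This is a single-threaded sketch of the policy only, assuming a hypothetical `DroppingEventBuffer` class; it does not reflect how the runtime actually implements its queues.

```python
from collections import deque
from typing import Any, Deque, List


class DroppingEventBuffer:
    """Bounded buffer that rejects new events when full instead of blocking."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._items: Deque[Any] = deque()
        self.dropped = 0  # exposed so drops stay visible to the observer

    def offer(self, event: Any) -> bool:
        """Try to enqueue without waiting; count the event as dropped if full."""
        if len(self._items) >= self._capacity:
            self.dropped += 1
            return False
        self._items.append(event)
        return True

    def drain(self) -> List[Any]:
        """Remove and return all buffered events (collector side)."""
        items = list(self._items)
        self._items.clear()
        return items
```

Because `offer` never waits and the buffer never grows past its capacity, a high-frequency producer cannot be stalled by a slow collector, and the `dropped` counter records how much was lost.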

Local command shape

Example release-evidence loop:

topoexec graph bench benchmarks/live_observe_high_frequency_channels.yaml \
  --steps 1000 --runs 20 --format json

topoexec graph observe benchmarks/live_observe_high_frequency_channels.yaml \
  --steps 1000 --observe-level summary --record /tmp/topoexec-live-summary

topoexec graph observe benchmarks/live_observe_high_frequency_channels.yaml \
  --steps 1000 --observe-level detailed --record /tmp/topoexec-live-detailed

./scripts/goal_check.sh live-perf

live-perf must prefer a controlled per-machine baseline. If a local environment cannot provide stable timing, record the skipped command, reason, replacement smoke evidence, and whether the gap blocks release in the goal ledger.

The default live-perf gate is intentionally smoke/output-contract only. Set TOPOEXEC_LIVE_PERF_ENFORCE=1 to make the local per-machine thresholds blocking for release evidence.
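
The difference between the default and the enforcing behavior can be sketched as follows. Only the `TOPOEXEC_LIVE_PERF_ENFORCE` variable comes from this page; the function name, the message format, and the idea of passing threshold violations as strings are assumptions for illustration.

```python
import os
from typing import Mapping, Optional, Sequence


def live_perf_exit_code(
    violations: Sequence[str],
    environ: Optional[Mapping[str, str]] = None,
) -> int:
    """Return a process exit code for a list of threshold violations.

    Violations are always reported; they only fail the gate when
    TOPOEXEC_LIVE_PERF_ENFORCE is set to "1".
    """
    env = os.environ if environ is None else environ
    enforce = env.get("TOPOEXEC_LIVE_PERF_ENFORCE") == "1"
    label = "FAIL" if enforce else "WARN"
    for violation in violations:
        print(f"[{label}] {violation}")
    return 1 if enforce and violations else 0
```

In this reading, a noisy machine still produces a visible report, but a threshold miss blocks only when the operator has opted in to enforcement.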

Hot-path audit checklist

Before claiming the budget, inspect or test that producer paths avoid:

JSON serialization;
socket or file I/O;
UI waits;
payload body copies;
unbounded allocations;
blocking queue pushes;
contended mutex waits;
global shared-map lookups;
a global atomic event-sequence increment on every event.

Collectors, recorders, assertion tools, and dashboards may use richer data structures because they run outside the runtime hot path.
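
The split between producer and collector work might look like the sketch below: the producer records a small fixed-shape event with a per-producer sequence number and the payload size only, while JSON serialization happens later on the collector side. All names here (`RawEvent`, `Producer`, `to_ndjson`) are hypothetical, and the plain list stands in for a bounded buffer.

```python
import json
from dataclasses import asdict, dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class RawEvent:
    producer_id: int
    local_seq: int     # per-producer counter, not a global atomic
    kind: int
    payload_size: int  # size only; the payload body is not copied


class Producer:
    """Hot-path side: no JSON, no I/O, no shared-map lookups."""

    def __init__(self, producer_id: int, sink: List[RawEvent]) -> None:
        self._producer_id = producer_id
        self._sink = sink  # stand-in for a bounded, non-blocking buffer
        self._next_seq = 0

    def emit(self, kind: int, payload: bytes) -> None:
        self._next_seq += 1
        self._sink.append(
            RawEvent(self._producer_id, self._next_seq, kind, len(payload))
        )


def to_ndjson(events: Iterable[RawEvent]) -> str:
    """Collector side: serialization is allowed here, off the hot path."""
    return "\n".join(json.dumps(asdict(e), sort_keys=True) for e in events)
```

Events from different producers can still be ordered afterwards by the pair of producer id and local sequence, which avoids a shared counter that every producer would contend on.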