Schema Version 1¶
Schema v1 describes a single-process TopoExec runtime graph. The loader is strict: unknown fields are rejected at the root and inside known sections.
Runtime visibility rules for edge kinds, epochs, transactions, commits, triggers, and CompositeLoop ownership are defined in runtime-semantics.md. Versioned runtime meaning is tracked separately in semantic-contract.md; schema v1 controls graph shape, while the semantic contract controls behavior meaning.
Root¶
Required fields:
- schema_version: 1
- graph
- lanes
- components
- edges
Optional fields:
- composite_loops
- subgraphs
- templates
- template_instances
Allowed root fields are exactly schema_version, graph, lanes,
components, edges, composite_loops, subgraphs, templates, and
template_instances.
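As a hedged illustration of the strict root contract, the fragment below adds one key that is not in the allowed list. The key name scheduler_hints and its value are hypothetical and exist only for this example; under the rule above, the loader would be expected to reject the document instead of ignoring the key.

```yaml
schema_version: 1
graph: {name: minimal}
lanes: {main: {type: event_loop}}
components: []
edges: []
# Hypothetical key, not part of schema v1.
# Strict loading rejects unknown root fields, so this document should fail to load.
scheduler_hints: {prefer: low_latency}
```

The same strictness applies inside known sections, so a misspelled field such as a typo in a lane or edge key should surface as a load error, not a silent default.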
graph¶
graph:
  name: minimal
  kind: runnable
  config:
    profile: alpha
  clock:
    runtime_domain: steady
    event_domain: steady
Fields:
- name: required string.
- kind: optional string, default runnable; allowed values are runnable and internal_test.
- config: optional mapping of graph-level snapshot values. Nested mappings/sequences are preserved as serialized strings and marked as nested in ConfigView.
- clock.runtime_domain: optional string, default steady; only steady is currently accepted.
- clock.event_domain: optional string, default steady; allowed values are steady, system, device, and external.
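A small sketch of a graph section that uses the non-default options described above. The graph name and the config keys are made up for illustration; only the field names and enum values come from this page.

```yaml
graph:
  name: sensor_replay        # hypothetical graph name
  kind: internal_test        # the other allowed value is runnable (the default)
  config:
    profile: alpha
    limits:                  # nested mapping: preserved as a serialized string
      max_items: 8           # and marked as nested in ConfigView
  clock:
    runtime_domain: steady   # the only value currently accepted
    event_domain: device     # steady, system, device, or external
```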
lanes¶
lanes is a mapping from lane id to lane configuration:
lanes:
  main:
    type: event_loop
    hz: 100
    max_callback_ms: 5
Fields:
- type: required string; allowed values are event_loop, fixed_rate, and thread_pool.
- hz: optional number, default 0.
- priority: optional string, default empty; lane-level scheduler/OS priority intent remains advisory.
- max_callback_ms: optional integer, default 0.
- max_threads: optional integer, default 0; must be non-negative. For thread_pool, this is the persistent worker count and active worker width (0 means one worker).
- queue_capacity: optional integer, default 0; must be non-negative. For thread_pool, positive values bound pending ready invocations after active workers; 0 admits only the active worker width.
- overflow: optional string, default reject; allowed values are overwrite, drop_oldest, drop_newest, reject, reject_new, fail_fast, and block. For thread_pool, drop_oldest/overwrite discard oldest ready invocations before execution, drop_newest/reject/reject_new/block skip newest ready invocations in the non-blocking runtime, and fail_fast stops the run.
- wall_clock_enabled: optional boolean, default false; opt-in wall-clock cadence for fixed_rate.
- period_ms: optional integer, default 0; fixed-rate period override for cadence and overrun accounting.
- tick_budget_ms: optional integer, default 0; explicit per-iteration budget for overrun accounting without changing cadence.
- overrun_policy: optional string, default drop_tick; allowed values are drop_tick, skip_next, and catch_up_once. This affects the next scheduled wall-clock tick after lateness; it is not hard preemption.
- thread_name: optional string.
- cpu_affinity: optional integer array.
- nice_priority: optional integer, default 0.
- rt_policy: optional string, default none.
- rt_priority: optional integer, default 0.
- isolation_intent: optional string, default none.
Runtime support note: event_loop is the deterministic default. fixed_rate is simulated by bounded runtime ticks unless wall_clock_enabled opts into cooperative sleeping cadence; it reports tick/overrun/jitter/skipped/max-lateness metrics from hz, period_ms, tick_budget_ms, and overrun_policy. Runtime component execution.priority orders independent ready work using high > normal > low > background without changing OS priority. thread_pool uses run-scoped persistent workers plus bounded priority queue admission for ready invocations with explicit queue/worker/priority metrics; see scheduler.md and concurrency.md.
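To make the lane options concrete, here is a sketch with one lane of each type. The lane ids and numeric values are illustrative assumptions; the field names and enum values are the ones listed above.

```yaml
lanes:
  main:
    type: event_loop           # deterministic default
  control:
    type: fixed_rate
    hz: 50
    wall_clock_enabled: true   # opt in to cooperative sleeping cadence
    tick_budget_ms: 10         # overrun accounting only; does not change cadence
    overrun_policy: skip_next  # applies to the next scheduled tick, no preemption
  workers:
    type: thread_pool
    max_threads: 4             # persistent worker count (0 would mean one worker)
    queue_capacity: 16         # pending ready invocations beyond the active workers
    overflow: drop_oldest      # discard the oldest ready invocation when full
```

Leaving wall_clock_enabled at its default keeps the fixed_rate lane on simulated bounded ticks, which is usually the easier mode to test deterministically.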
components¶
components is a sequence. Each component has an id, type, event sources,
trigger policy, and execution lane; event_sources and trigger_policy fall back
to manual defaults when omitted. The sequence may be empty when all runtime
components are supplied through subgraphs[] compile-time namespace expansion.
components:
  - id: transform
    type: topoexec.transforms.Identity
    event_sources:
      - type: message
        inputs: [in]
    trigger_policy:
      type: any_input
      inputs: [in]
    execution:
      lane: main
    boundary:
      role: processing
    config:
      gain: 1
Component fields:
- id: required string; must be unique.
- type: required string.
- event_sources: optional sequence, default [{type: manual}].
- trigger_policy: optional mapping, default {type: manual}.
- execution: required mapping.
- depends_on: optional string array; lifecycle dependencies must form a DAG.
- boundary: optional mapping.
- config: optional mapping of component-specific values.
event_sources[]¶
Allowed fields:
- id: optional string.
- type: optional string, default manual; allowed values are manual, message, timer, request, task_ready, and future_ready. Action goal/cancel event sources remain deferred and are not valid schema-v1 runtime triggers.
- inputs: optional string array.
- input: optional string shorthand for one input.
- period_ms: optional integer; required positive value for timer.
message sources require at least one input.
timer sources cannot be mixed with input-driven event source types on the same
component; split periodic and input-driven behavior into separate components.
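The split recommended above might look like the following sketch. Component ids and type names are hypothetical, and pairing the timer source with an on_event trigger is an assumption made for illustration, not something this page specifies.

```yaml
components:
  - id: heartbeat               # periodic behavior only
    type: example.Heartbeat     # hypothetical type name
    event_sources:
      - type: timer
        period_ms: 100          # must be positive for timer sources
    trigger_policy:
      type: on_event            # assumed pairing for a timer-driven component
    execution:
      lane: main
  - id: filter                  # input-driven behavior only
    type: example.Filter        # hypothetical type name
    event_sources:
      - type: message
        input: in               # shorthand for inputs: [in]
    trigger_policy:
      type: any_input
      input: in                 # an edge targeting filter.in is assumed elsewhere
    execution:
      lane: main
```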
trigger_policy¶
Allowed fields:
- type: optional string, default manual; allowed values are manual, on_event, any_input, all_inputs, time_sync, batch, request, task_ready, watermark, condition, debounce, and rate_limit.
- inputs: optional string array.
- input: optional string shorthand for one input.
- batch_size: optional non-negative integer.
- batch_window_ms: optional non-negative integer.
- sync_slop_ms: optional non-negative integer.
- min_interval_ms: optional non-negative integer.
- max_latency_ms: optional non-negative integer; when positive, pending trigger input messages older than this limit are dropped before readiness is evaluated.
- watermark_lateness_ms: optional non-negative integer; watermark drops timestamped messages older than the component's observed watermark minus this allowance.
- debounce_window_ms: optional integer reserved for future wall-clock debounce windows. In schema v1 it must be 0; debounce coalesces pending inputs at the current scheduler check without sleeping.
- condition: optional string for condition triggers. Allowed values are all_inputs_ready, any_input_ready, and event_timestamp_present; arbitrary expressions or scripts are invalid.
- coalesce: optional boolean, default false.
Input-driven trigger policies require incoming edges for every listed input.
batch requires batch_size or batch_window_ms; rate_limit requires a
positive min_interval_ms.
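The two constraints above can be illustrated with a pair of components. Ids, type names, and numeric values are hypothetical; each listed input is assumed to have an incoming edge declared under edges.

```yaml
components:
  - id: batcher
    type: example.Batcher       # hypothetical type name
    event_sources:
      - type: message
        inputs: [in]
    trigger_policy:
      type: batch
      inputs: [in]
      batch_size: 8             # batch needs batch_size or batch_window_ms
      batch_window_ms: 20
    execution:
      lane: main
  - id: throttle
    type: example.Throttle      # hypothetical type name
    event_sources:
      - type: message
        input: in
    trigger_policy:
      type: rate_limit
      input: in
      min_interval_ms: 50       # must be positive for rate_limit
      coalesce: true
    execution:
      lane: main
```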
execution¶
Allowed fields:
- lane: required string; must reference a lane id.
- reentrant: optional boolean, default false.
- priority: optional string, default normal; allowed values are background, low, normal, and high. This is runtime-level component/invocation priority, not lane/OS scheduler priority.
- budget_ms: optional integer, default 0; this is a cooperative execution budget. Exceeding it records budget/timeout metrics after the invocation returns and does not preempt component code.
- on_error: optional string, default fail_fast; declared values are fail_fast, continue, and isolate, but only fail_fast is implemented in schema v1 today. Other values parse but semantic validation rejects them rather than silently emulating a policy, using diagnostic code unsupported_error_policy.
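A sketch of an execution block that sets every field explicitly. The component id, type name, and budget value are hypothetical.

```yaml
components:
  - id: controller
    type: example.Controller    # hypothetical type name
    execution:
      lane: main                # must match a key under lanes
      reentrant: false
      priority: high            # orders ready work; does not change OS priority
      budget_ms: 2              # cooperative: recorded after return, never preempts
      on_error: fail_fast       # continue and isolate parse, but semantic validation
                                # rejects them with unsupported_error_policy
```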
boundary¶
Allowed fields:
- role: optional string, default processing; allowed values are processing, input, output, and input_output.
- descriptor: optional string.
For registry-backed runnable graphs, validation requires at least one input boundary and one output boundary.
Descriptor-backed ports¶
Schema v1 does not add ports fields to YAML. Port typing remains a
registry/descriptor contract so existing v1 graph files keep their shape and
compatibility. When validate_graph(graph, registry) can instantiate component
descriptors, it checks edge endpoints against descriptor inputs/outputs and
validates:
- non-empty source/target payload schemas match;
- non-empty source/target payload_type names match;
- required descriptor inputs have an incoming edge;
- optional descriptor inputs may be unconnected, producing optional_input_unconnected advisory diagnostics;
- PortMultiplicity::kSingle inputs have at most one incoming edge;
- graph boundary.role requirements are compatible with descriptor roles.
These checks are semantic validation, not JSON Schema validation. Future schema v2 work may decide whether typed ports should become YAML fields.
subgraphs¶
subgraphs is optional. Phase-1 subgraphs are compile-time namespace expansion,
not nested runtime schedulers.
components: []
edges: []
subgraphs:
  - id: cell
    components:
      - id: source
        type: topoexec.test.Source
        event_sources: [{type: manual}]
        trigger_policy: {type: manual}
        execution: {lane: main}
      - id: sink
        type: topoexec.test.Sink
        event_sources: [{type: message, inputs: [in]}]
        trigger_policy: {type: any_input, inputs: [in]}
        execution: {lane: main}
    edges:
      - {id: source_sink, kind: immediate, from: source.out, to: sink.in}
Fields:
- id: required string; must be unique among subgraphs.
- components: required non-empty sequence using the same component schema as top-level components.
- edges: required sequence using the same edge schema as top-level edges.
- composite_loops: optional sequence using the same CompositeLoop schema as top-level composite_loops.
Expansion rules:
- local component id source under subgraph cell becomes cell.source;
- local edge id source_sink becomes cell.source_sink;
- endpoint source.out becomes cell.source.out by prefixing only the component part;
- depends_on and subgraph-local CompositeLoop component references are prefixed the same way;
- lanes stay top-level and are referenced by expanded components unchanged.
Validation runs after expansion. Immediate cycles inside or across expanded
subgraph boundaries are still rejected unless a matching expanded
composite_loops[] entry owns the SCC. Plan JSON includes hierarchy[], and
Mermaid output groups expanded components under Subgraph: <id>.
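Applying the expansion rules to the cell subgraph shown earlier would be expected to produce a flat graph along these lines. This is a hand-derived sketch of the expanded shape, not captured tool output.

```yaml
components:
  - id: cell.source                     # local id prefixed with the subgraph id
    type: topoexec.test.Source
    event_sources: [{type: manual}]
    trigger_policy: {type: manual}
    execution: {lane: main}             # lane reference is left unchanged
  - id: cell.sink
    type: topoexec.test.Sink
    event_sources: [{type: message, inputs: [in]}]
    trigger_policy: {type: any_input, inputs: [in]}
    execution: {lane: main}
edges:
  # Only the component part of each endpoint is prefixed; port names stay as-is.
  - {id: cell.source_sink, kind: immediate, from: cell.source.out, to: cell.sink.in}
```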
templates and template_instances¶
templates and template_instances are optional. They are compile-time reusable
snippets and are expanded before normal validation. See
Graph templates for the full contract.
templates:
  - id: source_sink
    parameters: [source_type, sink_type]
    components:
      - {id: source, type: "{{source_type}}", event_sources: [{type: manual}], trigger_policy: {type: manual}, execution: {lane: main}}
      - {id: sink, type: "{{sink_type}}", event_sources: [{type: message, inputs: [in]}], trigger_policy: {type: any_input, inputs: [in]}, execution: {lane: main}}
    edges:
      - {id: source_sink, kind: immediate, from: source.out, to: sink.in}
template_instances:
  - id: cell
    template: source_sink
    parameters: {source_type: topoexec.test.Source, sink_type: topoexec.test.Sink}
Template fields:
- id: required string.
- parameters: optional string array of placeholder names.
- components: required non-empty sequence using component schema.
- edges: required sequence using edge schema.
- composite_loops: optional sequence using CompositeLoop schema.
Template instance fields:
- id: required string; this becomes the expansion namespace.
- template: required string referencing a template id.
- parameters: optional mapping of string values. It must include every declared template parameter and no unknown names.
Only scalar placeholder substitution is supported. Placeholders use {{name}},
and missing or unknown parameters fail during graph loading. Expanded output then
follows the same namespace rules as subgraphs[], so the runtime remains unaware of
templates.
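For the source_sink template and the cell instance above, substitution followed by namespace expansion would be expected to give roughly the following. This is a hand-derived sketch that assumes the same prefixing rules as subgraphs[]; it is not captured tool output.

```yaml
components:
  - id: cell.source
    type: topoexec.test.Source          # {{source_type}} replaced by the instance value
    event_sources: [{type: manual}]
    trigger_policy: {type: manual}
    execution: {lane: main}
  - id: cell.sink
    type: topoexec.test.Sink            # {{sink_type}} replaced by the instance value
    event_sources: [{type: message, inputs: [in]}]
    trigger_policy: {type: any_input, inputs: [in]}
    execution: {lane: main}
edges:
  - {id: cell.source_sink, kind: immediate, from: cell.source.out, to: cell.sink.in}
```

Instantiating the same template twice with different instance ids would therefore yield two disjoint sets of component and edge ids.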
edges¶
edges is a sequence. Every edge must declare an explicit kind.
edges:
  - id: source_to_transform
    kind: immediate
    from: source.out
    to: transform.in
    policy:
      mode: queue
      capacity: 4
      overflow: drop_oldest
      copy_policy: shared_view
Edge fields:
- id: required string; must be unique.
- kind: required string; allowed values are immediate, delay, state, and async.
- from: required endpoint string, usually component.output.
- to: required endpoint string, usually component.input.
- policy: optional mapping.
Only immediate edges participate in immediate SCC analysis. Immediate cycles are invalid unless they exactly match one composite_loops[] entry. delay, state, and async edges break same-transaction feedback and become visible at a later epoch boundary.
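One way to express feedback without a CompositeLoop is to make the return edge non-immediate. The sketch below uses hypothetical components a and b; whether delay, state, or async is the right kind depends on the visibility rules in runtime-semantics.md.

```yaml
edges:
  - {id: ab, kind: immediate, from: a.out, to: b.in}
  # The feedback edge is a delay edge, so it is excluded from immediate SCC
  # analysis and its value becomes visible at a later epoch boundary.
  - {id: ba, kind: delay, from: b.out, to: a.in}
```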
policy¶
Allowed fields:
- mode: optional string, default latest; allowed values are latest, queue, ring_buffer, latched, barrier, and previous_tick.
- capacity: optional positive integer, default 1.
- overflow: optional string, default overwrite; allowed values are overwrite, drop_oldest, drop_newest, block, fail_fast, and reject.
- lifespan_ms: optional integer, default 0.
- deadline_ms: optional integer, default 0.
- max_inflight: optional non-negative integer, default 0; applies only to async edges and limits deferred completions before channel capacity.
- preserve_order: optional boolean, default true.
- allow_drop: optional boolean, default true.
- emit_health_events: optional boolean, default true; suppresses observer health-event records for this edge when false while preserving metrics.
- timestamp_domain: optional string, default steady; allowed values are steady, system, device, and external.
- copy_policy: optional string, default copy; allowed values are copy, shared_view, loaned_view, and move_only.
- owner: optional string, default runtime; allowed values are producer, runtime, and consumer.
- readers: optional string, default single; allowed values are single, multi, and multiple.
Latest-style modes (latest, latched, previous_tick) cannot use drop_newest or block. move_only requires readers: single. State edges currently reject multiple writers to the same target endpoint. max_inflight is invalid on non-async edges.
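The cross-field rules above are easier to see side by side. In this sketch the edge ids, component names, and numeric values are hypothetical; both edges are intended to satisfy the stated constraints.

```yaml
edges:
  - id: plan_request
    kind: async
    from: planner.out
    to: executor.in
    policy:
      mode: queue
      capacity: 8
      overflow: drop_oldest
      max_inflight: 2          # allowed only because kind is async
  - id: frame_handoff
    kind: immediate
    from: camera.out
    to: detector.in
    policy:
      mode: latest
      overflow: overwrite      # latest-style modes cannot use drop_newest or block
      copy_policy: move_only
      readers: single          # required by move_only
```

Moving max_inflight onto frame_handoff, or switching its overflow to block, would be expected to fail validation under the rules stated above.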
composite_loops¶
composite_loops is optional. Each entry must exactly match one immediate cyclic SCC.
composite_loops:
  - id: estimator_controller_loop
    components: [estimator, controller]
    loop_policy:
      type: fixed_point
      max_iterations: 3
      budget_ms: 5
Fields:
- id: required string.
- components: required non-empty string array.
- loop_policy: required mapping.
Loop policy fields:
- type: required string; allowed values are fixed_point, transaction, solver_iteration, coalesced_event, and async_task.
- budget_ms: optional non-negative integer; the loop checks this cooperatively between completed iterations.
- max_iterations: optional non-negative integer.
- max_inflight: optional non-negative integer.
- drop_policy: optional string.
- min_interval_ms: optional non-negative integer.
- convergence: optional string. single_pass, after_first_iteration, and always stop after one iteration; stable_state is accepted as an advisory existing-example value.
- residual_threshold: optional non-negative number used by solver_iteration.
- partial_success: optional string: commit_outputs, discard_outputs, or fail_run.
Current runtime execution is strongest for fixed_point and the bounded
solver_iteration slice. loop_policy.max_inflight is reserved for loop-level
async policies and is separate from edge-level async policy.max_inflight.
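A sketch that pairs a CompositeLoop entry with the immediate cycle it owns, using the solver_iteration policy. Edge ids, port names, and numeric values are hypothetical; the estimator and controller components are assumed to be declared under components.

```yaml
edges:
  # These two immediate edges form one cyclic SCC over {estimator, controller}.
  - {id: est_to_ctl, kind: immediate, from: estimator.out, to: controller.in}
  - {id: ctl_to_est, kind: immediate, from: controller.out, to: estimator.in}
composite_loops:
  - id: estimator_controller_loop
    components: [estimator, controller]   # must exactly match the SCC members
    loop_policy:
      type: solver_iteration
      max_iterations: 10
      budget_ms: 5                        # checked between completed iterations
      residual_threshold: 0.001           # used by solver_iteration
      partial_success: discard_outputs    # or commit_outputs / fail_run
```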
Valid Minimal Example¶
schema_version: 1
graph: {name: minimal, kind: runnable}
lanes: {main: {type: event_loop}}
components:
  - id: source
    type: topoexec.boundary.Input
    boundary: {role: input}
    event_sources: [{type: manual}]
    trigger_policy: {type: manual}
    execution: {lane: main}
  - id: sink
    type: topoexec.boundary.Output
    boundary: {role: output}
    event_sources: [{type: message, inputs: [in]}]
    trigger_policy: {type: any_input, inputs: [in]}
    execution: {lane: main}
edges:
  - id: source_sink
    kind: immediate
    from: source.out
    to: sink.in
    policy: {mode: latest, copy_policy: shared_view}
Invalid Examples¶
Missing edge kind:
edges:
  - id: source_sink
    from: source.out
    to: sink.in
Immediate cycle without a matching CompositeLoop:
edges:
  - {id: ab, kind: immediate, from: a.out, to: b.in}
  - {id: ba, kind: immediate, from: b.out, to: a.in}
Both cases are covered by CLI validation fixtures under examples/invalid_*.yaml.
Machine-Readable Schema¶
A machine-readable Draft 2020-12 JSON Schema is checked in at schema/topoexec.schema.v1.json. It mirrors the strict loader field set and enum surface documented here. The schema_v1_contract_smoke CTest parses that schema and validates representative checked-in graph fixtures through the runtime validator so schema documentation and executable validation do not silently drift.
The JSON Schema is a documentation and generation contract today; semantic rules such as SCC ownership, registry-backed port compatibility, multi-state-writer rejection, and trigger/input compatibility remain enforced by the C++ validator.
The schema also records the executable defensive limits that are cheap to express in JSON Schema: 256 lanes, 4096 components, 8192 edges, 1024 CompositeLoop entries, 128-byte ids, and 4096-byte non-config strings/endpoints. Additional parser limits such as graph text size, UTF-8 input, file read bounds, and config nesting depth are documented in Defensive input handling.
CLI validation exposes the same split:
topoexec graph validate examples/minimal.yaml --schema-only --format json
topoexec graph validate examples/minimal.yaml --semantic --format json
--schema-only checks the strict loader contract (required fields, known fields, basic scalar shapes). --semantic is the default and additionally runs the compiler/validator checks.
For editor integration and YAML-language-server setup, see Editor and Schema UX.
The standalone tooling surface is:
topoexec schema dump --format json
topoexec schema check examples/minimal.yaml --format json
schema dump reads the bundled schema (or TOPOEXEC_SCHEMA_PATH when set). schema check intentionally mirrors strict loader validation only; use topoexec graph validate for semantic graph/compiler diagnostics.
Versioning¶
Schema v1 is strict and compatibility-preserving. Additive fields require a schema update only when v1 validation or runtime meaning would change. Breaking semantic changes should bump the schema version rather than silently changing v1 behavior. See Schema v2 notes before adding fields that would require runtime nesting, graph-declared adapter/plugin discovery, arbitrary expressions, or behavior selection by schema version.