Schema introspection

Three earlier design choices — Three-stage initialization, The discriminated-union plugin registry, and per-operation @fvSchemes.add / @fvSolution.add requirements — combine to make case validation, schema export, and scheme introspection cheap. This page traces that causal chain and then lists the user-visible capabilities that fall out of it.

The three pieces (causal chain)

The capability is built on three pieces, each of which exists for an independent reason but combines to make introspection trivial:

1. The three-stage init split makes validation cheap. StagedInit.validate() walks the same code path the real solver does, but stops before BUILD. LOAD and RESOLVE do not touch the mesh and do not allocate fields — that is the entire point of the three-stage pipeline (see Three-stage initialization). A typo in constant/turbulenceProperties is caught after LOAD without paying for mesh I/O.

2. The discriminated-union plugin registry exposes the schema. Every model’s config is a Pydantic BaseConfig subclass; the plugin registry rebuilds its discriminated union every time a new plugin registers (see The discriminated-union plugin registry). Calling model_json_schema() on the rebuilt union returns a JSON schema that reflects the currently-installed plugin set — not whatever set the framework shipped with.

3. Per-operation requirements live on the operations themselves. Each operation declares its scheme/solution requirements with @fvSchemes.add(...) and @fvSolution.add(...). collect_requirements_from_models() aggregates them across every active model; verify_fvschemes() and verify_fvsolution() check the loaded case against the aggregated set.
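The second piece can be sketched with Pydantic v2 directly. This is a minimal sketch, not the framework's registry: register_plugin, current_schema, the kind discriminator field, and the three config classes (including their fields) are hypothetical names chosen for illustration. Only BaseModel, Field, and TypeAdapter(...).json_schema() are real Pydantic calls; TypeAdapter is used here because a bare union is not itself a model.

```python
from typing import Annotated, Literal, Union

from pydantic import BaseModel, Field, TypeAdapter

# Hypothetical registry: every registered config class carries a literal
# `kind` field that serves as the discriminator.
_PLUGINS: list[type[BaseModel]] = []


def register_plugin(cls: type[BaseModel]) -> type[BaseModel]:
    """Class decorator that records a plugin config class."""
    _PLUGINS.append(cls)
    return cls


def current_schema() -> dict:
    """Rebuild the discriminated union from the current plugin set and export it."""
    union = Annotated[Union[tuple(_PLUGINS)], Field(discriminator="kind")]
    return TypeAdapter(union).json_schema()


@register_plugin
class KEpsilonConfig(BaseModel):
    kind: Literal["kEpsilon"] = "kEpsilon"
    Cmu: float = 0.09  # illustrative field


@register_plugin
class KOmegaSSTConfig(BaseModel):
    kind: Literal["kOmegaSST"] = "kOmegaSST"
    betaStar: float = 0.09  # illustrative field


schema_before = current_schema()


# A third-party plugin registers later; the next export includes it.
@register_plugin
class SpalartAllmarasConfig(BaseModel):
    kind: Literal["SpalartAllmaras"] = "SpalartAllmaras"


schema_after = current_schema()
print(sorted(schema_after["discriminator"]["mapping"]))
```

The schema exported before the third registration knows two models; the one exported afterwards knows three, with no change to the export code.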

None of the three pieces was designed for schema introspection. Each came from a different goal — multi-physics resolution, third-party extensibility, declarative scheme contracts. Schema introspection is what falls out when you have all three.

The capabilities that fall out

  • StagedInit.validate() — run LOAD + RESOLVE + VERIFY against a case directory and return any VerificationError objects. No fields are allocated, no mesh is loaded.

  • StagedInit.solver_inputs() — return a dict[str, type] of Pydantic config classes for every model the solver knows about. No case directory required.

  • StagedInit.scheme_inputs() — return a Pydantic model whose fields are typed with the correct scheme unions (DdtScheme, DivScheme, …) for every @fvSchemes.add requirement declared by the active operations.
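How validate() can stop short of BUILD is easiest to see in a toy pipeline. The sketch below is stdlib-only and entirely hypothetical: requires_scheme, ToyStagedInit, the dict-shaped case, and the VerificationError fields stand in for the framework's @fvSchemes.add, StagedInit, and error types, whose real signatures this page does not give.

```python
from dataclasses import dataclass, field


# Hypothetical stand-in for @fvSchemes.add: an operation declares the scheme
# entries it needs, and the declaration is stored on the function itself.
def requires_scheme(section: str, key: str):
    def decorate(op):
        op.scheme_requirements = getattr(op, "scheme_requirements", []) + [(section, key)]
        return op
    return decorate


@requires_scheme("ddtSchemes", "ddt(U)")
@requires_scheme("divSchemes", "div(phi,U)")
def momentum_predictor():
    """Placeholder operation; never called during validation."""


@requires_scheme("laplacianSchemes", "laplacian(nu,U)")
def viscous_term():
    """Placeholder operation; never called during validation."""


@dataclass
class VerificationError:
    section: str
    key: str
    operation: str


@dataclass
class ToyStagedInit:
    case: dict        # stands in for a case directory on disk
    operations: list
    stages_run: list = field(default_factory=list)

    def load(self) -> dict:
        self.stages_run.append("LOAD")
        return self.case["fvSchemes"]

    def resolve(self) -> list:
        self.stages_run.append("RESOLVE")
        # Aggregate requirements across every active operation.
        return [(section, key, op.__name__)
                for op in self.operations
                for section, key in op.scheme_requirements]

    def validate(self) -> list:
        schemes = self.load()
        requirements = self.resolve()
        self.stages_run.append("VERIFY")
        # BUILD (mesh loading, field allocation) is never reached here.
        return [VerificationError(section, key, name)
                for section, key, name in requirements
                if key not in schemes.get(section, {})]


case = {"fvSchemes": {"ddtSchemes": {"ddt(U)": "Euler"},
                      "divSchemes": {"div(phi,U)": "Gauss linear"}}}
init = ToyStagedInit(case, [momentum_predictor, viscous_term])
errors = init.validate()
print(errors)
print(init.stages_run)
```

The toy case omits laplacianSchemes, so the only error reported points at the operation that declared that requirement, and the recorded stages show that nothing beyond VERIFY ran.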

What makes this hard (the alternative)

OpenFOAM validates by running. A misconfigured case fails at the first solver step that touches the bad entry — sometimes minutes in, after mesh decomposition has already consumed memory and walltime. The error location (a stack trace inside RASModel::correct) is rarely the same as the configuration location (a typo in RAS.RASModel in constant/turbulenceProperties). For the case author, the loop is “edit, run, wait, read traceback, guess, edit again.”

Three other consumers want something cheaper than that:

  • CI gates. A test that says “every example case in the repo passes solver_validate” needs to run in seconds across hundreds of cases. Allocating a mesh per case is too expensive.

  • UI form generation. A web form that lets a user edit a case needs to know which fields are valid and what their types are before the user has filled in any values.

  • AI-assisted templating. An LLM that drafts constant/ and system/ files for a target solver needs the schema, not the solver source code.

A “validate by running” pattern serves none of those well.
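With a cheap validator, the CI gate reduces to a loop over case directories. The sketch below assumes a hypothetical validate_case(path) callable that returns a list of error strings; in the framework that role would presumably be played by StagedInit.validate(), whose exact signature is not shown on this page. The directory layout and the toy check for system/fvSchemes are likewise illustrative only.

```python
import tempfile
from pathlib import Path
from typing import Callable


def gate(cases_root: Path,
         validate_case: Callable[[Path], list[str]]) -> dict[str, list[str]]:
    """Validate every case directory under cases_root; return only the failures."""
    failures = {}
    for case_dir in sorted(p for p in cases_root.iterdir() if p.is_dir()):
        errors = validate_case(case_dir)
        if errors:
            failures[case_dir.name] = errors
    return failures


def toy_validate(case_dir: Path) -> list[str]:
    """Hypothetical validator: only checks that system/fvSchemes exists."""
    if (case_dir / "system" / "fvSchemes").is_file():
        return []
    return ["missing system/fvSchemes"]


# Build two throwaway cases so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "cavity" / "system").mkdir(parents=True)
    (root / "cavity" / "system" / "fvSchemes").write_text("ddtSchemes { default Euler; }\n")
    (root / "pitzDaily" / "system").mkdir(parents=True)  # fvSchemes left out on purpose
    failures = gate(root, toy_validate)

print(failures)
```

A CI job would exit non-zero whenever the returned mapping is non-empty, and the per-case error lists give the author the configuration location rather than a solver traceback.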

Trade-offs

The capabilities have honest limitations:

  • Validation is only as good as the requirement declarations. An operation that reads system/fvSolution directly (instead of declaring an @fvSolution.add(...) requirement) bypasses verification. The framework cannot detect the bypass; catching it is left to the linter and code review.

  • Schema introspection holds at the model-config level, not at the resolved-runtime-value level. solver_inputs() tells you that the Boussinesq model’s thermalExpansion field is a float; it does not tell you whether the value chosen for your case is numerically reasonable.

  • Plugin registry rebuild cost. solver_inputs() is cheap per call, but the registry rebuild it depends on runs on every plugin import. A process that imports many plugins pays for one rebuild per plugin, all of them at startup and none of them per call.

  • Validation is point-in-time. A case that validated yesterday may not validate today if a plugin was upgraded and now requires a new scheme. That is the right behaviour, but it does mean a zero-error validate() result is not durable across plugin upgrades.
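The gap between the model-config level and the runtime-value level can be seen in a few lines of Pydantic v2. BoussinesqConfig here is a hypothetical stand-in with a single thermalExpansion field; the real model's fields, defaults, and any extra constraints are not given on this page.

```python
from pydantic import BaseModel, ValidationError


class BoussinesqConfig(BaseModel):
    # Hypothetical stand-in for the real model config.
    thermalExpansion: float


# The schema records the type and says nothing about physical plausibility.
field_schema = BoussinesqConfig.model_json_schema()["properties"]["thermalExpansion"]
print(field_schema["type"])

# A wildly unphysical value still validates, because it is a float.
config = BoussinesqConfig(thermalExpansion=1.0e6)

# A non-numeric value is rejected at the type level.
try:
    BoussinesqConfig(thermalExpansion="large")
    type_error_caught = False
except ValidationError:
    type_error_caught = True

print(config.thermalExpansion, type_error_caught)
```

Range constraints could narrow this gap for individual fields, but whether a value is reasonable for a particular case remains outside what a schema can express.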

When this matters in practice

The user-visible recipes are:

This capability is what delivers the “simplify usability” goal from Goals & core concepts: typed configs, IDE autocomplete, validate-before-solve, and JSON schema for external tooling. It is not a feature that was built separately; it is what falls out once the three earlier design choices are paid for.