Metabolism Modeling

jskromer
Oct 19, 2025

Fully integrated M&V

1) Define “building metabolism” as your model substrate

Treat the asset as a network of stocks and flows with conservation constraints.

Stocks (s): thermal mass (kWh_th), water in storage (m³), refrigerant mass, indoor CO₂, etc.
Flows (f): electricity (kW), fuel (kW_th), chilled/hot water (kW_th), ventilation (kg/s), water (L/s), waste heat (kW), waste mass (kg/s).
Drivers (x): weather, occupancy, schedules, tariffs, controls.
Boundary (B): the “metabolic cut” where you account for inflows/outflows (grid import/export, gas, water, sewer, district energy).

This gives you a constrained state-space where any statistical/ML model is regularized by physics (mass/energy balance) and any physics model is parameterized by measurable fluxes.

2) Formal counterfactual within the metabolic frame

Let M_0 be the baseline metabolic model (statistical, physical, or hybrid) fit on baseline data.

Dynamics: \dot{s}(t) = A\,s(t) + G\,f(t) + w(t)
Balances: C\,f(t) = d(t) (node/loop balance: energy/material must balance)
Outputs: y(t) = H\,s(t) + K\,f(t) + v(t)
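
To make the dynamics equation concrete, here is a minimal sketch that integrates \dot{s} = A s + G f with a forward-Euler step. It assumes the term w(t) is a disturbance and leaves it out; the function name simulate_stocks, the one-stock thermal store, and all numbers are hypothetical illustrations, not part of the framework itself.

```python
import numpy as np

def simulate_stocks(A, G, s0, flows, dt=1.0):
    """Forward-Euler integration of ds/dt = A s + G f (disturbance w omitted)."""
    s = np.empty((len(flows) + 1, len(s0)))
    s[0] = s0
    for k, f in enumerate(flows):
        s[k + 1] = s[k] + dt * (A @ s[k] + G @ f)
    return s

# Hypothetical one-stock example: a thermal store (kWh_th)
# with one charge flow and one discharge flow (kW_th).
A = np.array([[0.0]])          # no standing losses, so the stock is conserved
G = np.array([[1.0, -1.0]])    # +charge, -discharge
flows = np.array([[10.0, 0.0]] * 3 + [[0.0, 5.0]] * 2)  # 3 h charging, 2 h discharging

stocks = simulate_stocks(A, G, s0=np.array([0.0]), flows=flows, dt=1.0)
print(stocks.ravel())
```

With A set to zero the stock changes only through the flows, which is a quick way to check that the G matrix has the right signs before adding losses.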

Counterfactual (no-intervention) under the same drivers x(t) in the reporting period:

\hat{f}_0(t), \hat{s}_0(t) = M_0(x(t); B)

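Observed (with intervention), where M_1 stands for the building as actually operated and f_1, s_1 come from meters or state estimates rather than simulation: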

f_1(t), s_1(t) = M_1(x(t); B)

Impact in a chosen accounting metric (cost/carbon/water):

\Delta(t) = c^\top\!\big(\hat{f}_0(t) - f_1(t)\big), \quad \text{and} \quad \text{Impact} = \int_{T}\Delta(t)\,dt

where c maps flows to value (e.g., energy prices, CO₂ factors, water fees).
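
As a small numerical illustration of the impact formula, the sketch below evaluates \Delta(t) for two carriers and approximates the integral with a rectangle rule. The function name impact, the prices in c, and the flow values are made-up assumptions for the example.

```python
import numpy as np

def impact(f0_hat, f1, c, dt_hours):
    """Return per-step delta(t) = c^T (f0_hat - f1) and its rectangle-rule integral."""
    delta = (f0_hat - f1) @ c
    return delta, float(np.sum(delta) * dt_hours)

# Columns: electricity (kW), gas (kW_th). Rows: hourly time steps.
f0_hat = np.array([[100.0, 50.0], [120.0, 40.0], [110.0, 45.0]])  # counterfactual
f1 = np.array([[80.0, 50.0], [90.0, 40.0], [100.0, 40.0]])        # observed
c = np.array([0.10, 0.04])  # hypothetical prices in $/kWh for each carrier

delta, total = impact(f0_hat, f1, c, dt_hours=1.0)
print(delta, total)
```

Swapping c for a vector of CO₂ factors or water fees gives the same calculation in a different accounting metric.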

3) How metabolism strengthens the “counterfactual” leg

Identifiability: Balance constraints and stock-flow structure limit spurious correlations in ML baselines.
Transportability: The same metabolic cut B ports across buildings/campuses; only parameters change.
Completeness: You quantify all major inflows/outflows (energy, water, waste heat), not just kWh.
Counterfactual coherence: The no-intervention world respects conservation and operational limits, so the baseline cannot produce “magic” savings.

4) Tying to the other two legs

Confidence (quantified uncertainty)

Physics-informed priors: Encode balances as hard constraints or penalties in the loss:
\mathcal{L} = \mathcal{L}_{fit} + \lambda \| C f - d \|_2^2
State estimation: Kalman/particle filters to infer unmetered flows from sparse sensors → credible intervals on \hat{f}_0.
Coverage diagnostics: Probability that observed flows fall within the counterfactual predictive band.
Conservative accounting: Use lower-bound impacts at a stated confidence level (e.g., 80% one-sided).
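
One simple instance of the penalized loss is flow reconciliation at a single time step: take \mathcal{L}_{fit} = \| f - f_{meas} \|_2^2 and add the balance penalty. This is an assumed choice of fit term made for illustration; the function name reconcile, the three-flow node, and the numbers are hypothetical. For this quadratic loss the minimizer has a closed form.

```python
import numpy as np

def reconcile(f_meas, C, d, lam):
    """Minimize ||f - f_meas||^2 + lam * ||C f - d||^2 in closed form."""
    n = len(f_meas)
    lhs = np.eye(n) + lam * C.T @ C
    rhs = f_meas + lam * C.T @ d
    return np.linalg.solve(lhs, rhs)

# Hypothetical node: one inflow splits into two outflows, so f1 - f2 - f3 = 0.
C = np.array([[1.0, -1.0, -1.0]])
d = np.array([0.0])
f_meas = np.array([10.0, 6.0, 5.0])   # raw meters violate the balance by -1

f_hat = reconcile(f_meas, C, d, lam=100.0)
print("raw residual:       ", C @ f_meas - d)
print("reconciled residual:", C @ f_hat - d)
```

A larger lam pushes the solution toward exact balance (a hard constraint in the limit), while a smaller lam trusts the meters more.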

Design (choices you must specify)

Measurement boundary B: choose the metabolic cut (e.g., whole building vs. plant loops vs. end-uses).
Duration: cover the building’s metabolic rhythms (diurnal/weekly/seasonal) so M_0 is relevant to reporting.
Model type: statistical, physical, or hybrid; all must honor balances; hybrids are often best.
Meter plan (value of information, VoI): place sensors where they most reduce posterior uncertainty in key flows crossing B.

5) Minimal sensor set (observability of metabolism)

Prioritize meters at boundary flows (electric main, gas main, water in/out, district energy) and mixing nodes (AHU supply/exhaust, plant headers). Use temperature, ΔP, and flow measurements to solve for unmetered branches via balances.
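
As an illustration of solving for unmetered branches, consider an AHU mixing node where only the total supply flow and three temperatures are measured. Assuming constant specific heat, the mass and energy balances give two linear equations in the two unknown branch flows. All values below are hypothetical.

```python
import numpy as np

# Measured quantities (hypothetical values)
m_supply = 10.0   # kg/s, total mixed-air flow
t_oa = 30.0       # °C, outdoor air
t_ra = 24.0       # °C, return air
t_mix = 25.5      # °C, mixed air

# Unknowns: m_oa (outdoor-air flow) and m_ra (return-air flow), both unmetered.
# Mass balance:    m_oa        + m_ra        = m_supply
# Energy balance:  m_oa * t_oa + m_ra * t_ra = m_supply * t_mix   (constant cp assumed)
balance = np.array([[1.0, 1.0],
                    [t_oa, t_ra]])
rhs = np.array([m_supply, m_supply * t_mix])

m_oa, m_ra = np.linalg.solve(balance, rhs)
print(f"outdoor air: {m_oa:.2f} kg/s, return air: {m_ra:.2f} kg/s")
```

The system becomes ill-conditioned when t_oa and t_ra are close, which is one reason a temperature-based estimate may need a direct flow meter as backup.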

6) Implementation recipe

Select boundary B and enumerate stocks/flows.
Map drivers x (weather, occupancy, control states, tariffs).
Draft balance graph (nodes/edges), write C f = d.
Choose model:
- Statistical: GAM/GBM/GLM with penalties for balance violations.
- Physical: first-principles plant + zone models.
- Hybrid: ML for loads + physics for transfers/storage.
Fit M_0 on baseline; validate across diurnal/seasonal regimes.
Quantify uncertainty: Bayesian fit or bootstraps + state estimation.
Run counterfactual \hat{f}_0(t) for reporting drivers x(t).
Compute impact \int c^\top(\hat{f}_0 - f_1) dt.
Report confidence (e.g., 80% lower bound) and assumptions (B, duration, model spec).
Iterate meter plan using VoI to shrink the widest intervals.
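
The fit, uncertainty, counterfactual, and reporting steps can be sketched end to end on synthetic data. This is a deliberately simplified stand-in: a one-variable linear baseline with a pairs bootstrap, which captures parameter uncertainty only (not residual noise or balance constraints). The data-generating numbers and the helper fit_predict are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic baseline period: hourly kW driven by outdoor temperature (°C)
T_base = rng.uniform(15, 35, size=200)
kw_base = 50 + 3 * T_base + rng.normal(0, 2, size=200)

# Synthetic reporting period: observed load is about 20% below the baseline relation
T_rep = rng.uniform(15, 35, size=100)
kw_obs = 0.8 * (50 + 3 * T_rep) + rng.normal(0, 2, size=100)

def fit_predict(T_fit, y_fit, T_new):
    """Ordinary least-squares line fit on (T_fit, y_fit), evaluated at T_new."""
    X = np.column_stack([np.ones_like(T_fit), T_fit])
    beta, *_ = np.linalg.lstsq(X, y_fit, rcond=None)
    return beta[0] + beta[1] * T_new

# Pairs bootstrap: refit M_0 on resampled baseline hours, rerun the counterfactual
savings = []
for _ in range(500):
    idx = rng.integers(0, len(T_base), size=len(T_base))
    f0_hat = fit_predict(T_base[idx], kw_base[idx], T_rep)
    savings.append(np.sum(f0_hat - kw_obs) * 1.0)  # kWh, 1-hour steps
savings = np.array(savings)

median_kwh = np.median(savings)
lower_80 = np.quantile(savings, 0.20)  # 80% one-sided lower bound
print(f"median savings: {median_kwh:.0f} kWh, 80% lower bound: {lower_80:.0f} kWh")
```

Reporting lower_80 rather than the median is the conservative-accounting choice; a fuller treatment would also propagate residual noise and state-estimation uncertainty.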

7) Example (chiller plant retrofit, whole-building boundary)

Flows: grid kW, gas kW_th, CHW/HW kW_th, condenser water kg/s, sewer m³.
Stocks: chilled water tank kWh_th, building thermal mass.
Baseline M_0: hybrid (ML load model + physics plant COP model) with balance penalty.
Intervention: new chillers + reset strategies.
Counterfactual: simulate \hat{f}_0 with old COP curves under reporting drivers.
Impact: energy cost and CO₂ from c^\top(\hat{f}_0 - f_1), with 80% one-sided CI.
Design notes: boundary is whole-building; duration spans at least one cooling season; meter plan adds CHW ΔT/flow at headers.
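
A toy version of this example, assuming the reporting-period cooling load is measured at the CHW headers and the old plant is summarized by a single COP curve. The linear curve cop_old, the price, the CO₂ factor, and the hourly values are all hypothetical placeholders; uncertainty is omitted here.

```python
import numpy as np

def cop_old(t_out_c):
    """Hypothetical pre-retrofit COP curve, linear in outdoor temperature (°C)."""
    return 5.0 - 0.08 * t_out_c

# Two reporting-period hours (made-up values)
t_out = np.array([25.0, 31.25])        # °C
chw_load = np.array([300.0, 500.0])    # kW_th, from CHW ΔT/flow at the headers
kw_measured = np.array([60.0, 125.0])  # kW, new chillers (f_1)

# Counterfactual electricity: same load served by the old plant (f_0 hat)
kw_counterfactual = chw_load / cop_old(t_out)

saved_kwh = float(np.sum(kw_counterfactual - kw_measured) * 1.0)  # 1-hour steps
cost_saved = saved_kwh * 0.10   # assumed $/kWh
co2_saved = saved_kwh * 0.40    # assumed kg CO2/kWh
print(saved_kwh, cost_saved, co2_saved)
```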

8) KPIs you can standardize

Metabolic intensity: total inflow per m² (kWh/m², m³/m², kg/m²).
Turnover time: stock / mean through-flow (e.g., thermal storage hours).
Exergy efficiency (optional): quality-weighted use of energy carriers.
Entropy proxy: degree of mixing/irreversibility (useful for diagnosing waste heat opportunities).
Coverage: % hours where observed f_1 lies within the counterfactual band.
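
Three of these KPIs reduce to one-line calculations. The sketch below uses made-up numbers and hypothetical function names to show the arithmetic for metabolic intensity, turnover time, and coverage.

```python
import numpy as np

def metabolic_intensity(total_inflow, floor_area_m2):
    """Total inflow per unit floor area (e.g. kWh/m²)."""
    return total_inflow / floor_area_m2

def turnover_time(stock, mean_throughflow):
    """Stock divided by mean through-flow (e.g. kWh_th / kW_th = hours)."""
    return stock / mean_throughflow

def coverage(observed, band_lo, band_hi):
    """Fraction of time steps where the observed flow lies inside the band."""
    inside = (observed >= band_lo) & (observed <= band_hi)
    return float(np.mean(inside))

intensity = metabolic_intensity(1_200_000.0, 10_000.0)   # kWh/m²
storage_hours = turnover_time(2_000.0, 500.0)            # hours

f1 = np.array([90.0, 100.0, 130.0, 105.0])
lo = np.array([80.0, 95.0, 100.0, 100.0])
hi = np.array([100.0, 110.0, 120.0, 110.0])
cov = coverage(f1, lo, hi)
print(intensity, storage_hours, cov)
```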

9) Reporting template (what goes into your M&V/CF Designs doc)

Design: B, duration, model type, sensor plan.
Counterfactual: model spec, validation plots, regime coverage.
Conservation checks: residual balances at nodes/headers.
Impact results: point estimate + CI, by carrier (kWh, therms, m³) and by value (cost, CO₂, water).
Assumptions & VoI: what data would most tighten intervals.
