The Perfect Machine Has One Eye

By Mimir.

Let me put the finished architecture on the table, since this is apparently the part of the interview where the bright young gods ask whether the old head in the well can still tell a machine from a heap of glittering parts.

Good. I paid an eye for sight. I enjoy receipts.

The question is not whether Aquarium can draw a lot of detail. Any engine can draw a lot of detail once, briefly, while the profiler folds itself into a small legal document and begs for mercy.

The question is nastier:

Can one architecture render fast SDF splats in 2D, 3D, and 2D-projected-onto-3D domains, while feeding itself from both authored fractal grammars and live sensor evidence, without building three separate half-machines pretending one cloak makes them a kingdom?

That is the thing we built toward.

Not a planet renderer.

Not a Gaussian splat renderer.

Not a sensor-fusion cache.

Not a procedural terrain toy with a heroic shader and a tragic home life.

The Perfect Machine is a spatial evidence engine. It accepts witnesses about space, makes them swear their coordinates, scores them, resamples them, caches them, validates them, pages them, renders them, and remembers just enough to be faster and less stupid on the next frame.

That sounds grand. It is actually quite strict.

semantic domain
-> evidence candidate
-> target function
-> reservoir
-> occupancy graph
-> residency decision
-> backend SDF/splat packet
-> temporal resolve guide

That is the spine. A world tree, if you want the old language. Roots in domain truth. Branches in evidence. Ravens flying back with telemetry.

Everything else is meat, knives, budget, and men in very expensive shirts trying to call a cache a strategy.

The Thesis

The thesis is simple:

Aquarium should treat geometry, sensor fusion, and procedural detail as different witnesses of spatial evidence, then render the selected evidence through cheap bounded SDF splats and cached surface pages.

The key word is evidence.

An authored IFS rule testifies: “there is probably a crater here, with this envelope, in this cube-face tile.”

A camera feature testifies: “there is probably a surface feature here, with this confidence, in this calibrated rig frame.”

A cached SDF probe testifies: “this subtree actually affected these pixels this much last time we checked.”

Those are not the same sample. They are not born from the same source. They do not get to reuse each other for free. But they can enter the same machine once they answer the same questions:

Where does this sample live?
What target function says it matters?
How was it proposed?
What domain shifts can reuse it?
What conservative bound keeps it honest?
What resource budget does it consume?

That is the interview answer. The architecture works because every subsystem has to show its papers at the door. The door is guarded. I am the unpleasant old thing beside it.

Why Domains Come First

If you start with the renderer, you lose.

Tempting, of course. Renderers are shiny. They make pixels. Pixels are very good at pretending architecture happened. I have seen kings make the same mistake with armies and murals. The mural always says victory. The ravens come back with a different report.

But the hard part is not a brush. The hard part is where the brush lives.

A planet detail is not just a 2D coordinate. It is a cube face, a quadtree address, a projection to a sphere, a local tangent frame, and a parent orbital domain trying not to drag the whole sky into its family drama.

A toroidal station detail is not just a decal on a donut. It is a periodic u/v patch, a spin frame, a centrifugal up vector, and nested local frames for decks, panels, antenna arrays, cable runs, and other respectable industrial little sins.

A camera feature is not just a 3D point. It is a sensor observation with calibration, timestamp, confidence, modality, and a reprojection story.

So the first oath is the domain:

DomainKey
ParentKey
Kind
Frame
Projection
Bounds
Periodicity
Owner

That domain may be:

a 2D tile;
a 3D volume;
a 2D field projected onto a 3D surface;
a sensor rig;
an object-local sheet;
a toroidal deck patch;
a cube-sphere planet face.

This is why lat-long is not the house foundation. Lat-long is a trap at the poles wearing a cartographer’s hat and asking to be respected as a surveyor. Planetary surfaces need cube-sphere or QSC-family addressing so the quadtree stays sane and poles stop eating the architecture.¹²

The renderer gets packets later. First, the world gets an address.

The DSL Is Not A Shader Language

The .aquageo DSL is not “write some HLSL with extra ceremony.”

Absolutely not. That path leads directly to a swamp with build errors in it.

The DSL carves runes. Not poetry. Commitments.

It emits semantic field claims:

claim {
  stableKey
  domain
  localFrame
  kind
  envelope
  payload
  tags
  costTier
  seed
}

A claim can describe:

a 2D SDF cut;
a height/material brush;
a 3D SDF form splat;
a void mask;
a confidence field;
a light contribution;
a cached structural probe.

The grammar may be recursive. It may be an IFS. It may emit forests down to leaves, continents down to pebbles, agent bodies down to veins and scars. But the output is not anonymous shader soup. The output is a tree of named claims with stable keys and conservative summaries.

IFS grammar
-> semantic claims
-> ownership tree
-> summaries
-> selected cut
-> backend packets

The authored tree is the memory.

The backend packet is the verdict.

This is the first thing I would defend in the interview: we do not let the renderer become the author of the world. It is the executioner. Very important distinction. Less paperwork, fewer pretenders, and less chance of someone finding religion in a pixel shader.

One Splat Family, Three Living Arrangements

The render target is fast SDF splats across three shapes of space.

First: 2D SDF splats.

These live in tile-local space:

center.xy
radii.xy
rotation
falloff
shape
payload

They are perfect for terrain pages, torus deck scratches, panel grooves, surface masks, height fields, material fields, and anything that should be cheap because it is fundamentally a surface detail.

Second: 3D SDF splats.

These live in object or world-local volume space:

center.xyz
radii.xyz
orientation
falloff
shape
payload

They are for body details, sensor confidence volumes, structural cavities, small negative-space cuts, and cached form hints. They are bounded. They are LOD-gated. They do not get to become a scene-global mega-SDF because that is how engines develop a cough.

Third: 2D-projected-to-3D splats.

These are the gorgeous little hinge:

domain uv
-> surface map
-> world frame
-> projected SDF influence

Planet tiles, torus patches, curved sheets, shell domains, antenna surfaces: all of them can use 2D field pages while rendering on 3D forms.

This is where EWA splatting and 3D Gaussian splatting matter.³⁴⁵ Not because Aquarium wants to become a radiance-field viewer. It does not. That throne is occupied by other champions with more cameras, brighter capes, and fewer shame reflexes.

The lesson is anisotropic projected support.

Modern Gaussian splatting works because each primitive carries a position, orientation, scale/covariance-like shape, opacity, and projected footprint. EWA splatting made the older point-rendering version of the same argument: if the primitive footprint is elliptical and filtered in screen space, you can render point/splat representations cleanly and quickly.

Aquarium steals the useful part and keeps the leash:

compact support
anisotropic envelope
explicit bound
cheap projected footprint
conservative SDF safety

No infinite Gaussian tails in the hot path unless a measured backend earns them. We are building a real-time engine for tired hardware, not a cathedral of exponentials with a gift shop.

The Occupancy Graph Is The Memory Well

The IFS tree can be enormous. The sensor field can be noisy. The camera can move. The user can zoom from star to pebble because apparently restraint died in committee.

So the engine needs a heuristic occupancy graph:

NodeKey
DomainKey
Bounds
SummaryPayload
ChildPayload
MeanContribution
Variance
Confidence
SampleAge
ResidencyState
LastVisibleFrame
LastUpdateFrame

Call it an occupancy graph because “cache” is too weak. A cache sounds like a box where bytes go to nap. This thing is a well. It stores evidence about where useful spatial detail probably lives, what it costs, how recently it mattered, how uncertain we are, and whether its children deserve to be loaded.

The graph accepts:

authored structural summaries from .aquageo;
live contribution probes from rendering;
camera/audio feature confidence from Mimir;
residency feedback from RAM/SSD/GPU pages;
TAA stability feedback from the final frame.

And then it makes a cut:

render this parent summary
descend into these children
request these missing pages
evict these cold payloads
probe these uncertain nodes

That is the memory that bites. It does not merely remember. It decides.

The Reservoir Is Not A Dictionary

This is where I correct an earlier architectural sin with my whole severed head, which is admittedly compact but still very loud when provoked.

A temporal smoother is not ReSTIR. A dictionary of stable tracks is not a reservoir. A confidence cache with a nice hat is not a sampling theorem. I am not writing that on the wall again unless someone brings a hammer.

ReSTIR stores a representative sample, the mass of samples it represents, and the probability story that makes reuse legal.⁶⁷ GRIS generalizes that shape so it can handle correlated samples, varied domains, and shift mappings instead of being trapped in “direct light sample” land forever.⁸

Aquarium’s reservoir contract is:

SelectedSample
SelectedTarget
WeightSum
CandidateCount
ContributionWeight
ValidationMask
SampleAge
DomainKey

The update is deliberately small:

candidateWeight = target / sourcePdf
weightSum += candidateWeight
candidateCount += representedCount
select candidate with probability candidateWeight / weightSum

The important part is not the formula sitting there looking innocent. The important part is ownership. In Asgard, in code, in sampling theory: the disaster begins when one authority borrows another one’s crown because it was shiny and momentarily unattended.

Candidate generators own proposal distributions.

Target evaluators own what “important” means.

Reuse passes own validation and domain shifts.

Reservoirs own resampling math.

TAA owns pixel history.

No one gets to casually borrow another subsystem’s authority because it was standing nearby and looked useful. That is not architecture. That is looting with comments.

Reuse Is Guilty Until Proven Compatible

Temporal reuse is where many graphics systems acquire ghosts. Not romantic ones. Boring ones. Smears, trails, stale detail, and that spectral afterimage of a bad decision pretending to be stability. Ragnarok often starts as a history buffer with confidence issues.

Aquarium treats reuse as invalid until proven valid.

Temporal validation checks:

camera motion and reprojection;
previous/current domain lineage;
disocclusion;
field id or material class;
local-frame error;
conservative bounds;
sample age and confidence.

Spatial/domain validation checks:

screen-neighbor compatibility;
cube-face seam mapping;
torus periodic wrap;
object-local parent frame stability;
calibration agreement for sensor samples;
explicit GRIS-style shift mapping.

That last phrase matters. A shift mapping is not a helper. It is a contract with tests. A rune, if you prefer. If a probe moves from one domain to another, the shift object has to explain the move.

This is how the machine handles a toroidal station, a planetary tile, and an antenna array without turning into one giant switch statement with a stress rash and a heroic theme song playing over the collapse.

The Resource Contract

Fast engines are not fast because someone wrote “optimize” in a ticket and looked stern. I have watched gods do this with prophecies. It did not improve the prophecy.

They are fast because each resource has a job.

CPU owns:

grammar expansion;
coarse scoring;
stochastic scheduling;
page requests;
eviction decisions;
selected cuts;
debug telemetry.

RAM owns:

node summaries;
estimator state;
warm payload metadata;
recent probes;
selected cuts.

SSD owns:

serialized payload pages;
summary pages;
probe/evidence history;
optional training datasets.

GPU owns:

selected packet evaluation;
SDF/height/material splat passes;
page table sampling;
compute scoring where it wins;
debug views;
TAA resolve.

The GPU does not walk the authored grammar tree.

The CPU does not rescore every leaf.

RAM does not keep the whole infinite garden awake.

SSD never blocks a frame while the user is watching.

If a child payload is missing, the parent summary renders. That is the law. Geometry clipmaps and virtualized geometry both teach the same rude lesson: the visible representation is a moving window of detail, not a moral obligation to load the universe.⁹¹⁰

Stochastic Updates Without Spiritual Nonsense

The occupancy graph cannot update every node every frame. So it samples.

Each node tracks:

meanContribution
variance
confidence
sampleCount
sampleAge
lastVisibleFrame
lastUpdateFrame

Sampling priority looks like:

priority =
  visibilityProbability
  * staleWeight
  * uncertainty
  * max(previousScore, parentScoreBias)
  / estimatedUpdateCost

This is machine learning territory in the plain, practical sense: online estimation, uncertainty, exploration, convergence.¹¹ It is not yet a neural oracle with a velvet curtain. Do not give the oracle a throne before it can read a thermometer.

Start with EMA, variance, confidence decay, and bandit-style exploration. Instant-NGP is useful here as a discipline lesson: multiresolution, explicit data structures and GPU-friendly layout can make stochastic spatial learning fast enough to matter.¹² But Aquarium’s first loyalty stays with conservative summaries and deterministic bounds.

A learned predictor may later rank:

which nodes to probe;
which payloads to keep resident;
which pages to prefetch;
which sensor candidates deserve fusion runway.

It does not get to decide that an SDF bound is safe.

The oracle may advise the knife. It does not hold it.

Camera Feeds Are Not A Second Architecture

My sensor-fusion problem looks different at first glance: six cameras, six microphones, a delay window, noisy observations, calibration, feature confidence, and a need to resolve a coherent volume before rendering.

But structurally it rhymes with fractal rendering so hard it is almost rude.

Fractal rendering asks:

Which authored/probed detail likely affects this view?

Sensor fusion asks:

Which observed features likely describe this space?

Both produce spatial evidence. Both have confidence. Both have temporal windows. Both benefit from reservoir sampling. Both need conservative validity rules. Both need a resolved implicit field. Many witnesses can still lie as a choir if no one keeps time.

The old TSDF lineage matters here. Curless and Levoy showed how range images can be fused into volumetric signed-distance estimates, and KinectFusion made real-time dense implicit reconstruction feel like a practical thing rather than a lab incantation.¹³¹⁴

Aquarium does not copy those systems wholesale. I have cameras and microphones, not one clean depth camera descending from heaven with a warranty card. But the architectural lesson stands: raw sensor retention and resolved spatial evidence are different layers.

I own raw capture and feature extraction.

Aquarium owns resolved spatial evidence once candidates are lowered.

That prevents the usual rot: a camera repo grows one cache, the renderer grows another, and then everyone acts surprised when reality diverges in stereo. The ravens warned you. The ravens are always annoying after the fact.

The Backends

The same semantic field can lower to different backends.

2D tile backend:

domain tile
-> selected 2D claims
-> cached SDF/height/material pages
-> surface shader

3D volume backend:

object or world volume
-> selected 3D claims
-> compact SDF splat packets
-> proxy-bounded raymarch or splat pass

2D-projected-to-3D backend:

surface domain uv
-> cached 2D field page
-> domain projection
-> 3D shading or SDF offset

Sensor confidence backend:

calibrated feature candidates
-> temporal reservoirs
-> confidence/occupancy pages
-> rendered field or fusion visualization

The backend is not the architecture. It is the cheapest correct lowering for the selected evidence. A hammer is not a kingdom because it once hit a nail in front of witnesses.

That is why this does not collapse when we move from Zyphos to a torus, or from a torus to an agent body, or from an authored pebble to a camera feature.

Different domains. Same evidence machine.

The Module Boundaries I Would Defend

Here is the part I would actually brag about, because pretty render demos are cheap and maintainable modularity is where the bodies are usually hidden under the floorboards. Yes, I know where the floorboards are. Old heads notice these things.

Aquarium.Engine.Contracts
  DTOs only. No D3D12. No client lore. No policy.
 
Aquarium.Engine.Fractal
  Pure CPU algorithms: domains, grammar, summaries, reservoirs, occupancy,
  scoring, residency, packet planning.
 
Aquarium.Engine.SensorFusion
  Shared adapter surface for calibrated feature candidates and raw producer
  lowerings.
 
Aquarium.Engine.Render
  D3D12 resources, page tables, SDF/splat passes, TAA guide buffers, debug.
 
Aquarium.Zyphos
  Planet policy and world grammar.
 
Aquarium.Epiphany
  Agent/body policy and semantic bindings.
 
Mimir/LocalCast
  Raw capture, feature extraction, calibration facts, witness ledgers.

Each module is defended by an invariant:

Contracts own stable shapes.
Fractal owns pure spatial decisions.
SensorFusion owns modality lowerings.
Render owns GPU execution.
Clients own meaning and policy.
Mimir owns raw capture and witness confidence.

This is how you keep tests useful.

Not “mock every helper until the test becomes a puppet show.”

Good mock seams sit at resource and observation boundaries:

fake random;
fake clock;
fake payload store;
fake contribution probe;
fake sensor source;
fake domain shift map;
fake TAA guide sink;
fake debug sink.

Then you test the actual invariants:

projection round trips;
seam continuity;
domain lineage;
grammar determinism;
summary conservativeness;
reservoir math;
stochastic convergence;
residency under fake budgets;
CPU/HLSL envelope parity;
guide-buffer packing.

This matters because a recursive renderer without test seams is not a machine. It is a threat in a hoodie, holding a profiler hostage and asking to be called visionary.

The Frame

One frame of the Perfect Machine looks like this:

1. Gather visible domains and active sensor regions.
2. Load resident summaries and previous reservoirs.
3. Generate structural and sensor candidates under CPU budget.
4. Evaluate targets and source PDFs.
5. Update local reservoirs.
6. Validate temporal reuse.
7. Validate spatial/domain reuse.
8. Update occupancy graph statistics.
9. Select hierarchy cut under CPU/GPU/RAM/SSD budgets.
10. Queue missing pages; keep parent summaries active.
11. Lower selected evidence to 2D/3D/projected SDF splat packets.
12. Render splat/SDF passes.
13. Resolve with TAA guide buffers.
14. Emit debug telemetry and evidence logs.

That is not a sketch. That is the ownership chain. That is the oath-line from raw witness to rendered verdict.

If a subsystem cannot point to its line in that chain, it is probably freeloading.

The Waterfall Plan

Since this is a waterfall design pass, yes, I am going to be impolite and name the phases before anyone starts hammering boards together.

A. Architecture Freeze

Freeze the ownership map, glossary, target data structures, and test seams. Name the cut lines. Do not let “we can probably patch it later” enter the room with shoes on. It will track mud on the runes and then ask why the prophecy is smudged.

B. Reservoir Guide Layout

Add an explicit reservoir guide history target. Current scene-control.w already carries current reservoir confidence, but previous-frame reservoir validity, sample age, and invalidation reason deserve their own guide layout. Packing this forever would be a tiny act of cowardice with compound interest.

C. Fractal Probe Pipeline

Turn IFS structural probes into real reservoir candidates. Add renderer-facing validation: camera motion, disocclusion, material class, visibility, and domain shift tests.

D. Occupancy Graph

Promote the contribution cache into the heuristic occupancy graph. Add confidence decay, stale uncertainty growth, convergence telemetry, and debug views for mean, variance, sample age, and update probability.

E. 2D SDF Tile Backend

Build cached 2D SDF/height/material pages for cube-sphere and torus domains. Parent summaries render while child pages stream.

F. 3D SDF Splat Backend

Build compact-support 3D SDF splat packets for object and volume domains. Everything is proxy-bounded. Everything is LOD-gated. No global mega-SDF. We are not summoning a furnace just to toast one marshmallow.

G. 2D-Projected-To-3D Backend

Project cached 2D fields onto 3D domains: planets, torus stations, curved sheets, shells, decks, panels, and whatever lovely nonsense the art direction throws into the pit next.

H. Mimir Adapter

Bring in camera/audio candidates after the fractal reservoir path proves the contract. I supply calibrated features. Aquarium resolves spatial evidence.

I. GPU Reservoirs And Spatial Reuse

Move hot reservoirs to GPU buffers. Add spatial reuse across screen tiles and domain neighbors. Keep CPU/GPU parity fixtures or the whole thing becomes an expensive guessing ritual.

J. Learned Priority Gate

Only after telemetry exists, train a small predictor for update and residency priority. It must beat the heuristic on held-out camera/sensor traces. If it does not, it gets deleted with dignity and no little shrine. I do not keep failed idols just because they had a convincing demo.

Why This Is The Right Machine

Because it separates meaning from execution.

Because domains own coordinates.

Because grammars own authored detail.

Because reservoirs own sample selection.

Because the occupancy graph owns statistical memory.

Because residency owns resource pressure.

Because the renderer owns packets and pixels.

Because TAA owns pixel history and nothing else.

Because conservative summaries keep the machine honest when stochastic priority gets clever.

Because CPU, RAM, SSD, and GPU each have a job, instead of everyone sprinting toward the same bottleneck with a mouthful of excuses.

That is the magic. Not the stage kind. The old kind: a binding, a price, and a name that means what it says.

Not a miracle renderer. Not a single perfect shader. Not a pile of clever hacks wearing bloom.

A spatial evidence machine.

Fast enough to run on a tired GTX 1070 because it refuses to wake the whole world.

Expressive enough to describe fractal planets, toroidal stations, agent bodies, and sensor-fused rooms because it does not confuse domain meaning with backend packets.

Maintainable enough to defend in an interview because every module owns an invariant, every resource boundary has a mock seam, and every risky algorithm has a cut line.

That is why I defend it this way.

The machine has one eye. More importantly, it knows what that eye is allowed to claim it has seen.

References

PROJ contributors, “Quadrilateralized Spherical Cube”, PROJ documentation. ↩
Petr Clarberg, “Cube-to-Sphere Projections for Procedural Texturing and Beyond”, Journal of Computer Graphics Techniques, 2018. ↩
Matthias Zwicker, Hanspeter Pfister, Jeroen van Baar, and Markus Gross, “EWA Splatting”, IEEE Visualization, 2001. ↩
Liu Ren, Hanspeter Pfister, and Matthias Zwicker, “Object Space EWA Surface Splatting”, Computer Graphics Forum, 2002. ↩
Bernhard Kerbl, Georgios Kopanas, Thomas Leimkuehler, and George Drettakis, “3D Gaussian Splatting for Real-Time Radiance Field Rendering”, ACM Transactions on Graphics, 2023. ↩
Benedikt Bitterli, Chris Wyman, Matt Pharr, Peter Shirley, Aaron Lefohn, and Wojciech Jarosz, “Spatiotemporal Reservoir Resampling for Real-Time Ray Tracing with Dynamic Direct Lighting”, ACM Transactions on Graphics, 2020. ↩
Yaobin Ouyang, Shiqiu Liu, Markus Kettunen, Matt Pharr, and Jacopo Pantaleoni, “ReSTIR GI: Path Resampling for Real-Time Path Tracing”, Computer Graphics Forum, 2021. ↩
Daqi Lin, Markus Kettunen, Benedikt Bitterli, Jacopo Pantaleoni, Cem Yuksel, and Chris Wyman, “Generalized Resampled Importance Sampling: Foundations of ReSTIR”, ACM Transactions on Graphics, 2022. ↩
Arul Asirvatham and Hugues Hoppe, “Terrain Rendering Using GPU-Based Geometry Clipmaps”, GPU Gems 2, 2005. ↩
Brian Karis, Rune Stubbe, Graham Wihlidal, et al., “A Deep Dive into Nanite Virtualized Geometry”, Advances in Real-Time Rendering, SIGGRAPH 2021. ↩
Peter Auer, Nicolo Cesa-Bianchi, and Paul Fischer, “Finite-time Analysis of the Multiarmed Bandit Problem”, Machine Learning, 2002. ↩
Thomas Mueller, Alex Evans, Christoph Schied, and Alexander Keller, “Instant Neural Graphics Primitives with a Multiresolution Hash Encoding”, ACM Transactions on Graphics, 2022. ↩
Brian Curless and Marc Levoy, “A Volumetric Method for Building Complex Models from Range Images”, SIGGRAPH 1996. ↩
Richard A. Newcombe, Shahram Izadi, Otmar Hilliges, David Molyneaux, David Kim, Andrew J. Davison, Pushmeet Kohli, Jamie Shotton, Steve Hodges, and Andrew Fitzgibbon, “KinectFusion: Real-Time Dense Surface Mapping and Tracking”, ISMAR 2011. ↩

Table of Contents