Research Timeline — Aditya Jain / Apple Maps
Topic 36 · Feb 2026 · 3D Reconstruction · Computational Geometry

Sphere Depth Maps from Cube Faces.

A scratch-built pipeline that renders six orthographic depth maps from a 3D mesh — one per cube face — and reconstructs the silhouette as 3D polylines exported to OBJ. The geometry insight matters more than the code: a sphere's silhouette doesn't sit on the face plane; it sits at the equator.

00 — Motivation

A validation harness with a known-analytic ground truth.

The downstream Six-Plane Mesh Reconstruction pipeline (Topic 35) consumes six orthographic depth maps and emits a watertight mesh. The deployment target is Building Elevation Reconstruction (Topic 40) where those depth maps come from a generative view-synthesis frontend running on street-view photographs. Before any of that ML machinery is connected, the depth-map-to-mesh pipeline needs an independent stress test against inputs where the correct answer is mathematically known.

A sphere centred in the unit cube gives that ground truth. Every depth map is a radially-symmetric disc. The silhouette is the equatorial great circle. The pipeline should recover three perpendicular great circles passing through the origin — anything else means a bug in the reconstruction logic, the rendering, or the lift-to-3D stage. The sphere lets every stage be cross-checked against an analytic answer before any neural component or real-world image is in the loop.

The same scratch-built pipeline generalises to cube, cylinder, torus, and arbitrary OBJ meshes — each adds its own failure modes (cap rims on the cylinder, hole topology on the torus, concave silhouettes on L-shapes) that drive specific refinements to the rendering and lift stages. The sphere just goes first because its ground truth has no degrees of freedom.

What it feeds
The output is consumed directly by Topic 35 as the canonical synthetic test input. The contour-extraction algorithms developed here (subpixel marching squares, corner-preserving smoothing) are reused unchanged in Topic 40's architectural pipeline.
01 — Premise

Six orthographic views, six depth maps, one 3D wireframe.

The downstream Six-Plane Mesh Reconstruction pipeline (Topic 35) and Building Elevation Reconstruction (Topic 40) both consume six axis-aligned orthographic depth maps as input. The question this topic answers: how do you generate those depth maps from a given mesh, and how do you invert them back into a 3D wireframe to verify the pipeline is correct?

The validation target is a sphere centred in the unit cube. For a sphere we know the analytic answer: every depth map should be a radially-symmetric disc, and the reconstructed wireframe should be three perpendicular great circles passing through the sphere's centre. If the pipeline can't recover those three great circles, it has a bug. If it can, the pipeline transfers to harder shapes — cube, cylinder, torus, and ultimately arbitrary OBJ meshes.

02 — Pipeline

Five stages: render · extract · lift · stitch · export.

[Figure: pipeline diagram. INPUT: OBJ mesh, trimesh.load, centred + scaled → ORTHOGRAPHIC RAYCAST: 6 views · ±X ±Y ±Z · analytic intersection · 512² grid → MARCHING SQUARES: find_contours(level=0.01) · subpixel · 1025 pts/face → LIFT TO 3D: silhouette = equator · depth-extrapolated lift → CORNER-PRESERVING SMOOTH: turning-angle peaks · split · smooth · re-join → OUTPUT: .obj 3D polylines, original scale restored. Shape-agnostic pipeline: works on sphere, cube, cylinder, torus, arbitrary OBJ.]
Figure 1 — Five processing stages. Stage 3 (Lift to 3D) is the conceptually hard one: the silhouette extracted in 2D image space does not live on the cube face — it lives at the equatorial position of the shape inside the cube. Stages 1 + 2 are straightforward; stage 3 is where the wrong implementation reproduces six disconnected circles on the cube walls instead of one coherent 3D wireframe.
Core Insight

The silhouette lives at the equator.
Not on the face.

A sphere's silhouette under orthographic projection is the equatorial circle. Lifting the 2D contour onto the cube's face plane puts six disconnected circles at x = ±1, y = ±1, z = ±1 — wrong. The contour must be lifted to the surface at the limit of zero inward distance: a single great circle at the sphere's centre for each pair of opposite faces. Six views, three coincident great circles, one watertight wireframe.

03 — Stage 1 · Orthographic Raycast Rendering

Analytic ray intersection beats numerical for known shapes.

For each of the six cube faces, parallel rays are cast inward from a 512 × 512 grid covering that face. Ray-shape intersection is solved analytically where possible — for a sphere of radius r centred at the origin:

|o + t d|² = r²  →  t² |d|² + 2t (o · d) + |o|² − r² = 0  →  a t² + b t + c = 0  (with a = |d|² = 1 for unit directions, b = 2 (o · d), c = |o|² − r²)

The smaller positive root is the near-surface intersection; the larger is the far-surface intersection. For a cube, the intersection is a per-axis slab test. For a cylinder, it's a quadratic in the lateral coordinates plus a plane test for the caps. For a torus, no closed-form is fast enough, so we fall back to sphere-marching on the SDF — march along the ray in steps equal to the current SDF value until convergence.
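The quadratic above can be sketched in a few lines of numpy. This is an illustrative implementation, not the production code; the function name `ray_sphere_t` is mine:

```python
import numpy as np

def ray_sphere_t(origin, direction, radius=0.5):
    """Near/far ray parameters for a unit-direction ray against a sphere
    centred at the origin; returns None on a miss."""
    o = np.asarray(origin, dtype=float)
    d = np.asarray(direction, dtype=float)
    b = 2.0 * np.dot(o, d)              # a = |d|^2 = 1 for unit directions
    c = np.dot(o, o) - radius ** 2
    disc = b * b - 4.0 * c
    if disc < 0:
        return None                     # ray misses the sphere
    s = np.sqrt(disc)
    return (-b - s) / 2.0, (-b + s) / 2.0   # (t_near, t_far)

# Ray from the centre of the +Z face, pointing inward:
t_near, t_far = ray_sphere_t([0.0, 0.0, 1.0], [0.0, 0.0, -1.0])
# For r = 0.5 this hits the front surface at z = 0.5 (t = 0.5)
# and the back surface at z = -0.5 (t = 1.5).
```

The smaller root is the near-surface depth that goes into the depth map; the larger root is the far surface used later for thickness reasoning.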

Shape              | Intersection method                     | Why
Sphere             | Analytic (quadratic, 2 roots)           | Single quadratic in t — exact, fast
Cube               | Slab test (per-axis)                    | Six plane intersections, take innermost / outermost
Cylinder           | Analytic (quadratic in XZ + cap planes) | Curved wall is quadratic, caps are planes — combine both
Torus              | Sphere-marching on SDF                  | Quartic polynomial otherwise — SDF march is faster and robust
Arbitrary OBJ mesh | BVH ray-cast via trimesh.ray            | No analytic form — bounding-volume hierarchy is the production choice
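The cube row deserves a sketch too, since the slab test is the one non-quadratic analytic case. A minimal version, assuming an axis-aligned cube centred at the origin (the name `ray_box_t` and the half-extent default are illustrative):

```python
import numpy as np

def ray_box_t(origin, direction, half=0.5):
    """Per-axis slab test against the cube [-half, half]^3.
    Returns (t_near, t_far) for a unit-direction ray, or None on a miss."""
    o = np.asarray(origin, dtype=float)
    d = np.asarray(direction, dtype=float)
    with np.errstate(divide='ignore'):
        inv = 1.0 / d                       # inf on axis-parallel components
    t0 = (-half - o) * inv
    t1 = ( half - o) * inv
    t_near = np.max(np.minimum(t0, t1))     # latest entry across the three slabs
    t_far  = np.min(np.maximum(t0, t1))     # earliest exit
    if t_near > t_far or t_far < 0:
        return None
    return t_near, t_far

# Inward ray from the centre of the +Z face of a unit cube:
hit = ray_box_t([0.0, 0.0, 1.0], [0.0, 0.0, -1.0])
```

Take the innermost entry and outermost exit across the three axis slabs; if they cross, the ray misses.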

Each depth map is stored as a 512 × 512 grayscale PNG plus the metadata pair (t_min, t_range) — two floats per face, twelve floats total per shape — that lets a downstream consumer denormalise pixel values [0, 1] back to world t coordinates. Without those two floats, each face's depth would be auto-contrast-stretched per image, and the gradients across opposite faces would be incomparable.
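The round trip is two lines of arithmetic; a toy sketch with hypothetical values (the array contents here are illustrative, not taken from the pipeline):

```python
import numpy as np

# World-space t values for one face (illustrative 2x2 "depth map"):
t = np.array([[0.5, 0.7],
              [0.9, 1.5]])

t_min, t_range = t.min(), t.max() - t.min()   # the two stored metadata floats
png = (t - t_min) / t_range                   # normalised [0, 1] pixel values

# Downstream denormalisation back to world t coordinates:
t_world = png * t_range + t_min
```

Any consumer that skips the `(t_min, t_range)` pair can only compare depths within one face, never across faces.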

04 — Stage 2 · Marching Squares with Subpixel Accuracy

Treat depth as a scalar field, not as a binary mask.

Early attempts thresholded the depth map into a binary hit/no-hit mask and traced the boundary pixel-by-pixel. The result was severe zigzag artefacts: the pixel grid forces axis-aligned steps, and a diagonal silhouette boundary produces a staircase of horizontal and vertical jumps. Three observations fixed it.

First, the depth map is itself a continuous scalar field sampled on a pixel grid. Treat it as such and run marching squares on the float values rather than a thresholded mask. Second, skimage.measure.find_contours(depth, level=0.01) extracts subpixel-accurate contours by linearly interpolating each grid edge between the two adjacent pixel values, yielding fractional (row, col) coordinates. Third, the level value matters: level=0 sits exactly at the discontinuity between hit (positive depth) and no-hit (zero), which is numerically unstable. level=0.01 sits just inside the silhouette where the field is well-defined.

contours = skimage.measure.find_contours(depth_map, level=0.01)
# Result: list of (N, 2) float arrays — each contour is a sequence
# of subpixel (row, col) coordinates tracing the iso-line.
# For the sphere case: a single contour with ~1025 points.
Why subpixel matters
The contour for a sphere's silhouette is a perfect circle of radius 0.5 in UV space. Pixel-grid marching produces ~32 axis-aligned stair steps. Subpixel marching produces 1025 points lying within 0.001 of the true circle — three orders of magnitude better, no smoothing required at this stage.
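The subpixel accuracy comes from one rule applied per grid edge: if the two adjacent pixel values straddle the level, the crossing sits at the linearly interpolated fraction between them. A self-contained sketch of just that rule (the function `edge_crossing` is mine, not a skimage API):

```python
def edge_crossing(f_i, f_j, level):
    """Fractional position of the iso-level crossing along a grid edge
    whose endpoint samples are f_i and f_j (assumes they straddle level)."""
    return (level - f_i) / (f_j - f_i)

# Two adjacent depth samples: 0.0 (no hit) and 0.08 (just inside the disc).
# The level=0.01 crossing sits 1/8 of the way along the edge, not at a
# pixel centre — this is where the staircase artefact disappears.
frac = edge_crossing(0.0, 0.08, level=0.01)
```

Pixel-grid tracing would snap this crossing to 0 or 1; the interpolated fraction is what buys the three orders of magnitude.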
05 — Stage 3 · The Equator Insight

Silhouette = limit at zero inward distance.

The first working implementation placed each face's contour on the face plane: P = normal · 1.0 + right · u + up · v. For a sphere this puts the +Z contour at z = 1 and the −Z contour at z = −1 — two parallel circles on opposite walls of the bounding cube. They are geometrically correct as renderings, but they are not the silhouette of the sphere. The sphere's silhouette is the equator. From any viewpoint, the silhouette is the set of surface points where the surface normal is perpendicular to the viewing direction. For a sphere centred at the origin viewed along ±Z, that locus is the circle of radius r in the plane z = 0.

The correct lift therefore reads the depth scalar at the silhouette contour and projects to that depth. But the depth field is discontinuous exactly at the silhouette — sampling at the contour pixel gives an unstable value. Sampling just inside the contour gives a value, but that value is epsilon away from the true surface and produces gaps between opposite faces: the +Z and −Z reconstructions sit at z = +ε and z = −ε rather than coinciding at z = 0.

The fix is extrapolation to the limit. Sample depth at several inward distances — typically 1 px, 2 px, and 4 px from the contour — fit a polynomial, and extrapolate back to zero inward distance. This recovers the surface depth at the silhouette without sampling the discontinuity. Opposite-face contours then coincide at the equator, producing one coherent 3D curve rather than two parallel circles.

d1 = bilinear_sample(depth, contour_pt − 1px · inward)
d2 = bilinear_sample(depth, contour_pt − 2px · inward)
d4 = bilinear_sample(depth, contour_pt − 4px · inward)
# Fit a polynomial through (1, d1), (2, d2), (4, d4) in (distance, depth) space,
# then evaluate it at distance = 0
d_surface = polynomial_extrapolate([1, 2, 4], [d1, d2, d4], target=0)
# Lift to 3D using the face's normal/up/right basis
P_3D = origin + (1.0 − d_surface) · normal + u · right + v · up
Where it breaks
For shapes with concave silhouettes — torus profiles, L-shapes — the inward direction is locally ambiguous. The current heuristic (inward = gradient direction in the binary mask) works on convex silhouettes and simple concave ones; arbitrary concavities require a per-point inward vector estimated from the contour's local frame.
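The extrapolation step can be checked against the analytic sphere profile, where the true silhouette depth is known exactly. A sketch with numpy's polynomial fit, assuming the r = 0.5 sphere, the +Z face, and the 512-pixel grid (so one pixel spans h = 2/512 in UV units):

```python
import numpy as np

h = 2.0 / 512   # UV extent of one pixel on a 512-px face spanning [-1, 1]

def depth_at(px_inward):
    """Analytic +Z depth of the r = 0.5 sphere, sampled px_inward pixels
    inside the silhouette rim (rho = 0.5 at the rim itself)."""
    rho = 0.5 - px_inward * h
    return 1.0 - np.sqrt(0.25 - rho ** 2)

# Sample at 1, 2, 4 px inward, fit a quadratic, extrapolate to distance 0:
dists = np.array([1.0, 2.0, 4.0])
coeffs = np.polyfit(dists, depth_at(dists), deg=2)
d_surface = np.polyval(coeffs, 0.0)

# True silhouette depth is t = 1.0. The square-root profile has infinite
# slope at the rim, so the extrapolation is not exact — but it lands much
# closer to 1.0 than the raw 1-px-inward sample does.
```

This also shows why the fix is an approximation: near a silhouette the depth field behaves like a square root, which no finite polynomial matches exactly, so residual sub-pixel gaps between opposite faces remain.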
06 — Stage 3b · Near / Far from Opposite Faces

No re-rendering required.

For solid shapes the depth map gives the near surface depth — the distance from the cube face to the first intersection. The far depth — distance to the back-side intersection — is also useful for thickness reasoning. Naively this would require a second pass of ray-casting. But there's an identity: the near depth of the opposite face is the far depth of the current face, reprojected.

For a ray from the +X face at pixel (u, v): origin (1, u, v), direction (−1, 0, 0). It hits the front surface at x_front = 1 − t_near_posX and the back surface at x_back = 1 − t_far_posX. The same physical back-surface point seen from the −X face has origin (−1, u, −v) (note: −X has up = −Z, so v is flipped) and direction (+1, 0, 0); the back surface is that view's near surface, so t_near_negX = x_back − (−1) = x_back + 1 = 2 − t_far_posX. Rearranging: t_far_posX = 2.0 − t_near_negX after the appropriate coordinate flip — no re-rendering, just arithmetic on existing maps.

# Reconstruct far depth from the opposite face's near depth
t_far_posX = 2.0 − reproject(t_near_negX, axis_flip='Y')
# 6 near maps + 12 metadata floats give all 6 near + 6 far depths.
# Plus a gap map: thickness = t_far − t_near
gap = t_far − t_near   # bright = thick geometry, dark = thin
The visual sanity check that failed
When the gap reconstruction was first tested, all six gap maps for the sphere looked identical to all six near maps. That's because the sphere is symmetric — front-hemisphere depth and back-hemisphere thickness have the same radial gradient, and auto-contrast normalisation removes the offset. The maps were different in world units; only the visualisation collapsed them. The fix was to denormalise to world t values via the stored (t_min, t_range) per face before any visual comparison.
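A worked instance of the identity on the central ray of the sphere makes the arithmetic concrete (values follow directly from r = 0.5 and the face planes at x = ±1):

```python
# Central ray, pixel (u, v) = (0, 0).
# +X view: origin (1, 0, 0), direction (-1, 0, 0).
x_front, x_back = 0.5, -0.5         # sphere surface crossings on the x-axis
t_near_posX = 1.0 - x_front         # 0.5  (stored in the +X near map)
t_far_posX  = 1.0 - x_back          # 1.5  (what we want, without re-rendering)

# -X view: origin (-1, 0, 0), direction (+1, 0, 0).
# Its near surface IS the +X view's back surface:
t_near_negX = x_back - (-1.0)       # 0.5  (stored in the -X near map)

recovered = 2.0 - t_near_negX       # the identity: far depth from the opposite near map
```

The general per-pixel version only adds the up-axis flip; the depth arithmetic is exactly this.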
07 — Stage 4 · Corner-Preserving Smoothing

Detect corners by turning-angle peak, smooth between them.

The subpixel marching squares output is geometrically correct but visually noisy: at high zoom the contour has small wobbles from the pixel-level interpolation. A naive Gaussian smooth removes the wobble but also rounds sharp architectural corners — exactly the features that distinguish a rectangle from a circle and a window jamb from a wall.

The selective-smoothing algorithm: first apply a light Gaussian to remove pixel-grid noise while preserving structural features. Then compute the turning angle at each vertex — the angle between incoming and outgoing edges. Run scipy peak detection on the turning-angle series with a configurable threshold (default 30°). Vertices with turning-angle peaks above threshold are structural corners and get pinned. Apply heavy smoothing to each segment between corners independently.

corner_threshold_deg = 30

# Stage 1: light Gaussian to suppress pixel-grid noise
contour = gaussian_filter1d(contour, sigma=1.0)

# Stage 2: detect corners
turning_angles = compute_turning_angles(contour)   # in degrees
corner_idx = scipy.signal.find_peaks(turning_angles, height=corner_threshold_deg)[0]

# Stage 3: split, smooth each segment, rejoin
segments = split_at_indices(contour, corner_idx)
smoothed = []
for seg in segments:
    smoothed.append(gaussian_filter1d(seg, sigma=4.0))   # heavy smoothing
contour_out = concat_segments(smoothed, corner_points=contour[corner_idx])
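The turning-angle computation itself is small enough to show in full. A self-contained sketch on a synthetic polyline tracing three sides of a square (the function `turning_angles_deg` is my illustrative stand-in for the pipeline's `compute_turning_angles`; simple thresholding stands in for `find_peaks`):

```python
import numpy as np

def turning_angles_deg(pts):
    """Unsigned angle between the incoming and outgoing edge at each
    interior vertex of an (N, 2) polyline. Returns N - 2 angles."""
    a, b, c = pts[:-2], pts[1:-1], pts[2:]
    u, v = b - a, c - b
    dot = np.sum(u * v, axis=1)
    cross = u[:, 0] * v[:, 1] - u[:, 1] * v[:, 0]
    return np.degrees(np.abs(np.arctan2(cross, dot)))

# Densely sampled open polyline along three sides of the unit square:
side = np.linspace(0.0, 1.0, 11)
pts = np.concatenate([
    np.stack([side, np.zeros_like(side)], axis=1),           # bottom edge
    np.stack([np.ones_like(side), side], axis=1)[1:],        # right edge
    np.stack([side[::-1], np.ones_like(side)], axis=1)[1:],  # top edge
])

angles = turning_angles_deg(pts)
corners = np.flatnonzero(angles > 30)   # indices of the two 90-degree turns
```

Collinear runs give turning angles of exactly zero, so the two square corners are the only vertices above the 30° threshold — those get pinned while everything between them is smoothed.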
08 — Results · Per-Shape Verification

Validated against analytic ground truth on four primitives.

Shape                     | Contour points / face     | Reconstruction                                             | Notes
Sphere (r = 0.5)          | 1025 subpixel pts         | 3 perpendicular great circles at sphere centre             | All 6 faces produce identical discs by symmetry
Cube (unit)               | 4 corner pts + edges      | Wireframe edges of the cube                                | Near depth is flat — every pixel inside silhouette has identical t
Cylinder (h = 1, r = 0.5) | ~1000 pts / face          | Top + bottom rim circles + 4 quarter-arcs for side curvature | Side-view rectangle's top/bottom edges lift to arcs, not lines — required multi-distance extrapolation
Torus (R = 0.5, r = 0.2)  | 2 contours (top & bottom) | Outer + inner rim circles in the XZ plane                  | Hole correctly recovered; intermediate cross-section curvature still under refinement

The pipeline is shape-agnostic at the export stage: the final OBJ contains the union of all six lifted polylines, scaled back to the input mesh's original bounding box (the centring + unit-cube normalisation applied during rendering is inverted before export). The output is consumed downstream by Topic 35 (Six-Plane Mesh Reconstruction) where the polylines become the contour input for triangulation.
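The inversion mentioned above is a two-constant affine round trip. A minimal sketch, assuming the normalisation centres the mesh and scales its longest bounding-box edge into the cube with face planes at ±1 (variable names are illustrative):

```python
import numpy as np

# Illustrative input mesh vertices (here just the two bbox corners):
verts = np.array([[2.0, 4.0, 1.0],
                  [6.0, 8.0, 3.0]])

center = (verts.min(axis=0) + verts.max(axis=0)) / 2.0
scale = (verts.max(axis=0) - verts.min(axis=0)).max()   # longest bbox edge

# Applied before rendering: centre at origin, longest edge spans [-1, 1]
unit = 2.0 * (verts - center) / scale

# Inverted before OBJ export: polylines return to the original frame
restored = unit * scale / 2.0 + center
```

Storing `center` and `scale` alongside the depth maps is what lets the exported polylines land back on the source mesh rather than inside the canonical cube.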

Interactive Demo · Live

Pick a shape from the preset row, or click the input canvas to cycle. The six depth maps and the 3-D wireframe update in real time as you switch shapes. Each shape exposes a different stress test for the pipeline — sphere validates symmetry, cube validates flat faces, cylinder validates cap rims, torus validates hole topology. Drag the wireframe pane to rotate.

Related Thesis Chapters
Building Elevation Reconstruction
Direct downstream consumer. The six-orthographic-depth-map representation produced here is the input to the contour-to-mesh pipeline for architectural reconstruction.
PGN — Procedural Generator Network
Sister project on the structured-intermediate-representation thesis line. Here the intermediate is six depth maps; in PGN it is a DSL program.
ProcGen3D — Edge-Based Tokenization
Conceptually adjacent. ProcGen3D uses an autoregressive transformer to produce edge tokens; this pipeline uses analytic geometry to recover the same surface curves from depth views.
Appendix — Raw Materials
Transcripts & Source References
Restricted Access