A scratch-built pipeline that renders six orthographic depth maps from a 3D mesh — one per cube face — and reconstructs the silhouette as 3D polylines in OBJ. The geometry insight matters more than the code: a sphere's silhouette doesn't sit on the face plane, it sits at the equator.
The downstream Six-Plane Mesh Reconstruction pipeline (Topic 35) consumes six orthographic depth maps and emits a watertight mesh. The deployment target is Building Elevation Reconstruction (Topic 40) where those depth maps come from a generative view-synthesis frontend running on street-view photographs. Before any of that ML machinery is connected, the depth-map-to-mesh pipeline needs an independent stress test against inputs where the correct answer is mathematically known.
A sphere centred in the unit cube gives that ground truth. Every depth map is a radially-symmetric disc. The silhouette is the equatorial great circle. The pipeline should recover three perpendicular great circles passing through the origin — anything else means a bug in the reconstruction logic, the rendering, or the lift-to-3D stage. The sphere lets every stage be cross-checked against an analytic answer before any neural component or real-world image is in the loop.
The same scratch-built pipeline generalises to cube, cylinder, torus, and arbitrary OBJ meshes — each adds its own failure modes (cap rims on the cylinder, hole topology on the torus, concave silhouettes on L-shapes) that drive specific refinements to the rendering and lift stages. The sphere just goes first because its ground truth has no degrees of freedom.
The downstream Six-Plane Mesh Reconstruction pipeline (Topic 35) and Building Elevation Reconstruction (Topic 40) both consume six axis-aligned orthographic depth maps as input. The question this topic answers: how do you generate those depth maps from a given mesh, and how do you invert them back into a 3D wireframe to verify the pipeline is correct?
The validation target is a sphere centred in the unit cube. For a sphere we know the analytic answer: every depth map should be a radially-symmetric disc, and the reconstructed wireframe should be three perpendicular great circles passing through the sphere's centre. If the pipeline can't recover those three great circles, it has a bug. If it can, the pipeline transfers to harder shapes — cube, cylinder, torus, and ultimately arbitrary OBJ meshes.
The silhouette lives at the equator.
Not on the face.
A sphere's silhouette under orthographic projection is the equatorial circle.
Lifting the 2D contour onto the cube's face plane puts six disconnected
circles at x = ±1, y = ±1, z = ±1 — wrong.
The contour must be lifted to the surface at the limit of zero inward
distance: a single great circle at the sphere's centre for each pair of
opposite faces. Six views, three coincident great circles, one watertight
wireframe.
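For each of the six cube faces, parallel rays are cast inward from a 512 × 512 grid covering that face. Ray-shape intersection is solved analytically where possible. For a sphere of radius r centred at the origin and a ray o + t·d with unit direction d, substituting into |o + t·d|² = r² gives a single quadratic in t:
t² + 2(o·d)t + (|o|² − r²) = 0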
The smaller positive root is the near-surface intersection; the larger is the far-surface intersection. For a cube, the intersection is a per-axis slab test. For a cylinder, it's a quadratic in the lateral coordinates plus a plane test for the caps. For a torus, the ray equation is a quartic in t; the closed-form solve is not fast or robust enough, so we fall back to sphere-marching (also known as sphere tracing) on the SDF: march along the ray in steps equal to the current SDF value until convergence.
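A minimal sketch of the analytic sphere case, assuming a unit-length ray direction and a sphere centred at the origin. The function name `ray_sphere_t` and the sample rays are illustrative, not taken from the pipeline.

```python
import numpy as np

def ray_sphere_t(origin, direction, radius):
    """Return (t_near, t_far) for an origin-centred sphere, or None on a miss.

    Assumes `direction` is unit length, so the quadratic in t is
    t^2 + 2 (o.d) t + (o.o - r^2) = 0.
    """
    o = np.asarray(origin, dtype=float)
    d = np.asarray(direction, dtype=float)
    b = np.dot(o, d)                  # half of the linear coefficient
    c = np.dot(o, o) - radius ** 2    # constant term
    disc = b * b - c
    if disc < 0.0:
        return None                   # the ray misses the sphere
    root = np.sqrt(disc)
    return -b - root, -b + root       # near root first, far root second

# A ray from the centre of the +Z face, pointing inward, and one that misses.
hit = ray_sphere_t((0.0, 0.0, 1.0), (0.0, 0.0, -1.0), radius=0.5)
miss = ray_sphere_t((0.9, 0.0, 1.0), (0.0, 0.0, -1.0), radius=0.5)
```

Returning both roots in one call is a design choice for this sketch: it makes the near and far depths available from a single intersection test.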
| Shape | Intersection method | Why |
|---|---|---|
| Sphere | Analytic (quadratic, 2 roots) | Single quadratic in t — exact, fast |
| Cube | Slab test (per-axis) | Six plane intersections, take innermost / outermost |
| Cylinder | Analytic (quadratic in XZ + cap planes) | Curved wall is quadratic, caps are planes — combine both |
| Torus | Sphere-marching on SDF | 4th-degree polynomial otherwise — SDF march is faster and robust |
| Arbitrary OBJ mesh | BVH ray-cast via trimesh.ray | No analytic form — bounding-volume hierarchy is the production choice |
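The torus row can be sketched as follows. This assumes the torus axis lies along Y (so the ring sits in the XZ plane) and uses the standard torus signed-distance function; `torus_sdf`, `sphere_trace`, and the tolerance values are hypothetical names and settings chosen for illustration.

```python
import numpy as np

def torus_sdf(p, major=0.5, minor=0.2):
    """Signed distance to a torus whose ring lies in the XZ plane (assumed axis: Y)."""
    ring = np.hypot(p[0], p[2]) - major   # distance from the ring's centre circle, in-plane
    return np.hypot(ring, p[1]) - minor

def sphere_trace(origin, direction, sdf, t_max=2.0, eps=1e-6, max_steps=256):
    """March along the ray in steps equal to the current SDF value.

    Returns the hit distance t, or None if the ray exits the cube (t > t_max)
    or does not converge within max_steps.
    """
    o = np.asarray(origin, dtype=float)
    d = np.asarray(direction, dtype=float)
    t = 0.0
    for _ in range(max_steps):
        dist = sdf(o + t * d)
        if dist < eps:
            return t          # close enough to the surface: report a hit
        t += dist             # safe step: no surface is closer than dist
        if t > t_max:
            return None       # left the cube without hitting anything
    return None

down = (0.0, -1.0, 0.0)                                   # inward from the +Y face
t_tube = sphere_trace((0.6, 1.0, 0.0), down, torus_sdf)   # lands on the tube
t_hole = sphere_trace((0.0, 1.0, 0.0), down, torus_sdf)   # passes through the hole
```

The ray through the hole returns `None`, which is the per-pixel behaviour that lets the hole show up in the top and bottom depth maps.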
Each depth map is stored as a 512 × 512 grayscale PNG plus the metadata pair (t_min, t_range): two floats per face, twelve floats total per shape. The pair lets a downstream consumer denormalise pixel values in [0, 1] back to world t coordinates. Without those two floats, each face's depth would be contrast-stretched independently per image, and depths and gradients on opposite faces would no longer be comparable.
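A sketch of the normalise and denormalise round trip. The exact encoding is an assumption here: a linear map over hit pixels with the nearest hit at 0, plus a separate hit mask. The function names are illustrative, and PNG quantisation is ignored.

```python
import numpy as np

def normalise_depth(t, hit_mask):
    """Map world-space t to [0, 1] over hit pixels; return (image, t_min, t_range).

    Assumed encoding: linear, nearest hit pixel maps to 0, misses stay at 0.
    """
    t_min = float(t[hit_mask].min())
    t_range = float(t[hit_mask].max() - t_min) or 1.0   # flat faces: avoid dividing by zero
    image = np.zeros_like(t, dtype=float)
    image[hit_mask] = (t[hit_mask] - t_min) / t_range
    return image, t_min, t_range

def denormalise_depth(image, t_min, t_range):
    """Invert normalise_depth using the two stored floats."""
    return t_min + image * t_range

t = np.array([[0.5, 0.75], [1.0, 0.0]])
hit = np.array([[True, True], [True, False]])
image, t_min, t_range = normalise_depth(t, hit)
restored = denormalise_depth(image, t_min, t_range)
```

Two faces with different (t_min, t_range) pairs can produce identical images, which is why the pair has to travel with each PNG.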
Early attempts thresholded the depth map into a binary hit/no-hit mask and traced the boundary pixel-by-pixel. The result was severe zigzag artefacts: the pixel grid forces axis-aligned steps, and a diagonal silhouette boundary produces a staircase of horizontal and vertical jumps. Three observations fixed it.
First, the depth map is itself a continuous scalar field
sampled on a pixel grid. Treat it as such and run marching squares on the
float values rather than a thresholded mask. Second, skimage.measure.find_contours(depth, level=0.01)
extracts subpixel-accurate contours by linearly interpolating each grid edge
between the two adjacent pixel values, yielding fractional (row, col)
coordinates. Third, the level value matters: level=0 sits exactly
at the discontinuity between hit (positive depth) and no-hit (zero), which is
numerically unstable. level=0.01 sits just inside the silhouette
where the field is well-defined.
Take a circle of radius 0.5 in UV space. Pixel-grid marching produces ~32 axis-aligned stair steps. Subpixel marching produces 1025 points lying within 0.001 of the true circle: three orders of magnitude better, no smoothing required at this stage.
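The subpixel extraction can be tried on a synthetic disc. This sketch assumes the depth field is a height map that is positive inside the silhouette and zero on a miss, and that the grid spans [−1, 1] in both axes; only `skimage.measure.find_contours` comes from the text, the rest is scaffolding.

```python
import numpy as np
from skimage import measure

N = 512
coords = np.linspace(-1.0, 1.0, N)
U, V = np.meshgrid(coords, coords)           # U varies along columns, V along rows

# Synthetic +Z view of a sphere of radius 0.5: height above the equator, 0 on a miss.
height = np.sqrt(np.clip(0.25 - (U**2 + V**2), 0.0, None))

# Marching squares on the float field, just inside the silhouette.
contours = measure.find_contours(height, level=0.01)
contour = max(contours, key=len)             # fractional (row, col) coordinates

# Convert fractional pixel indices back to UV and measure the radial error.
scale = 2.0 / (N - 1)
u = contour[:, 1] * scale - 1.0
v = contour[:, 0] * scale - 1.0
radial_error = np.abs(np.hypot(u, v) - 0.5)
```

Inspecting `radial_error.max()` against the pixel pitch `scale` gives a quick check that the contour is tracking the circle at subpixel resolution.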
The first working implementation placed each face's contour on the face plane:
P = normal · 1.0 + right · u + up · v. For a sphere this puts the
+Z contour at z = 1 and the −Z contour at z = −1 —
two parallel circles on opposite walls of the bounding cube. They are
geometrically correct as renderings, but they are not the silhouette
of the sphere. The sphere's silhouette is the equator. From any viewpoint, the
silhouette is the set of surface points where the surface normal is
perpendicular to the viewing direction. For a sphere of radius r centred at the origin
viewed along ±Z, that locus is the circle of radius r in the XY plane at z = 0.
The correct lift therefore reads the depth scalar at the silhouette contour
and projects to that depth. But the depth field is discontinuous exactly at
the silhouette — sampling at the contour pixel gives an unstable value.
Sampling just inside the contour gives a value, but that value is
epsilon away from the true surface and produces gaps between opposite faces:
the +Z and −Z reconstructions sit at z = +ε and z = −ε
rather than coinciding at z = 0.
The fix is extrapolation to the limit. Sample depth at several inward distances — typically 1 px, 2 px, and 4 px from the contour — fit a polynomial, and extrapolate back to zero inward distance. This recovers the surface depth at the silhouette without sampling the discontinuity. Opposite-face contours then coincide at the equator, producing one coherent 3D curve rather than two parallel circles.
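A sketch of the extrapolation step, checked against the analytic sphere. It assumes three samples at 1, 2, and 4 px and an exact quadratic through them; the function name and the choice of polynomial degree are illustrative rather than the pipeline's actual settings.

```python
import numpy as np

def extrapolate_to_contour(depth_samples, inward_px=(1.0, 2.0, 4.0)):
    """Fit a polynomial through depth sampled at several inward distances
    and evaluate it at zero inward distance (the silhouette itself)."""
    x = np.asarray(inward_px, dtype=float)
    y = np.asarray(depth_samples, dtype=float)
    coeffs = np.polyfit(x, y, deg=len(x) - 1)   # exact interpolating polynomial
    return float(np.polyval(coeffs, 0.0))

# Analytic +Z view of a sphere of radius 0.5 in a cube with faces at +/-1.
r = 0.5
pixel = 2.0 / 512                               # assumed pixel pitch in world units

def t_near(d_px):
    """Near depth at d_px pixels inside the silhouette."""
    rho = r - d_px * pixel
    return 1.0 - np.sqrt(r * r - rho * rho)

samples = [t_near(d) for d in (1.0, 2.0, 4.0)]
t0 = extrapolate_to_contour(samples)

z_naive = 1.0 - samples[0]    # lift using the 1 px sample directly
z_lifted = 1.0 - t0           # lift using the extrapolated depth
```

On a smooth silhouette the depth varies like the square root of the inward distance, so a low-order polynomial narrows the gap to z = 0 rather than closing it exactly. Fitting in the square root of the distance would be one possible refinement for curved shapes.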
For solid shapes the depth map gives the near surface depth — the distance from the cube face to the first intersection. The far depth — distance to the back-side intersection — is also useful for thickness reasoning. Naively this would require a second pass of ray-casting. But there's an identity: the near depth of the opposite face is the far depth of the current face, reprojected.
For a ray from the +X face at pixel (u, v): origin (1, u, v), direction (−1, 0, 0). It hits the front surface at x_front = 1 − t_near_posX and the back surface at x_back = 1 − t_far_posX. Seen from the −X face, that back-surface point is the first hit. The −X ray through it starts on the plane x = −1 with direction (+1, 0, 0), at pixel (u, −v) in the −X face's own frame (note: −X has up = −Z so v is flipped). Along that ray x = −1 + t, so t_near_negX = x_back + 1 = 2 − t_far_posX. Rearranged, t_far_posX = 2.0 − t_near_negX after the appropriate coordinate flip: no re-rendering, just arithmetic on existing maps.
Denormalise both maps to world t values via the stored (t_min, t_range) per face before any visual comparison.
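The identity can be checked numerically. This sketch renders an off-centre sphere analytically, so the flip actually matters. It assumes the +X face maps pixel (u, v) to world (y, z) = (u, v) and the −X face maps it to (u, −v); the sphere position and grid size are arbitrary test values.

```python
import numpy as np

N = 65
coords = np.linspace(-1.0, 1.0, N)
U, V = np.meshgrid(coords, coords)           # U along columns, V along rows

centre = np.array([0.1, -0.2, 0.3])          # off-centre test sphere
radius = 0.5

def half_chord(y, z):
    """Half the chord length along X through the sphere at (y, z); NaN on a miss."""
    s = radius**2 - (y - centre[1])**2 - (z - centre[2])**2
    return np.sqrt(np.where(s >= 0.0, s, np.nan))

# +X face: pixel (u, v) -> world (y, z) = (u, v); points on the ray have x = 1 - t.
h_pos = half_chord(U, V)
t_near_posX = 1.0 - (centre[0] + h_pos)
t_far_posX = 1.0 - (centre[0] - h_pos)       # reference value from a "second pass"

# -X face: pixel (u, v) -> world (y, z) = (u, -v); points on the ray have x = -1 + t.
h_neg = half_chord(U, -V)
t_near_negX = (centre[0] - h_neg) + 1.0

# Far depth of +X from the near depth of -X: flip v (rows), then apply 2 - t.
t_far_from_opposite = 2.0 - t_near_negX[::-1, :]
```

Both maps here are already in world t units. With stored PNGs, each map would first be denormalised with its own (t_min, t_range).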
The subpixel marching squares output is geometrically correct but visually noisy: at high zoom the contour has small wobbles from the pixel-level interpolation. A naive Gaussian smooth removes the wobble but also rounds sharp architectural corners — exactly the features that distinguish a rectangle from a circle and a window jamb from a wall.
The selective-smoothing algorithm: first apply a light Gaussian to remove pixel-grid noise while preserving structural features. Then compute the turning angle at each vertex — the angle between incoming and outgoing edges. Run scipy peak detection on the turning-angle series with a configurable threshold (default 30°). Vertices with turning-angle peaks above threshold are structural corners and get pinned. Apply heavy smoothing to each segment between corners independently.
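The steps above can be sketched for a closed 2D contour. Assumptions in this sketch: `scipy.signal.find_peaks` as the peak detector, corners pinned to their lightly smoothed positions, and the sigma values shown. It also does not detect a corner that sits exactly at the first or last array index.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d
from scipy.signal import find_peaks

def turning_angles_deg(points):
    """Absolute angle between incoming and outgoing edges at each vertex (closed contour)."""
    incoming = points - np.roll(points, 1, axis=0)
    outgoing = np.roll(points, -1, axis=0) - points
    a_in = np.arctan2(incoming[:, 1], incoming[:, 0])
    a_out = np.arctan2(outgoing[:, 1], outgoing[:, 0])
    delta = (a_out - a_in + np.pi) % (2.0 * np.pi) - np.pi   # wrap to [-pi, pi)
    return np.degrees(np.abs(delta))

def selective_smooth(points, light_sigma=1.0, heavy_sigma=4.0, corner_deg=30.0):
    """Light smooth, detect corners, then heavily smooth each corner-to-corner segment."""
    light = gaussian_filter1d(points, light_sigma, axis=0, mode="wrap")
    corners, _ = find_peaks(turning_angles_deg(light), height=corner_deg)
    if len(corners) == 0:
        # No structural corners: smooth the whole loop.
        return gaussian_filter1d(light, heavy_sigma, axis=0, mode="wrap"), corners
    out = light.copy()
    n = len(light)
    for start, stop in zip(corners, np.roll(corners, -1)):
        end = stop if stop > start else stop + n             # wrap the last segment
        idx = np.arange(start, end + 1) % n
        seg = gaussian_filter1d(light[idx], heavy_sigma, axis=0, mode="nearest")
        seg[0], seg[-1] = light[idx[0]], light[idx[-1]]      # pin both corner vertices
        out[idx] = seg
    return out, corners

# Usage: a square sampled at 40 points per side, starting mid-edge.
sq = np.array([[-1, -1], [1, -1], [1, 1], [-1, 1]], dtype=float)
s = (np.arange(160) / 40.0 + 0.5) % 4.0
side = s.astype(int)
frac = (s - side)[:, None]
square = sq[side] + frac * (sq[(side + 1) % 4] - sq[side])
smoothed, corners = selective_smooth(square)
```

Smoothing each segment separately with `mode="nearest"` keeps the heavy filter from averaging across a pinned corner, which is the point of detecting corners first.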
| Shape | Contour points / face | Reconstruction | Notes |
|---|---|---|---|
| Sphere (r=0.5) | 1025 subpixel pts | 3 perpendicular great circles at sphere centre | All 6 faces produce identical discs by symmetry |
| Cube (unit) | 4 corner pts + edges | Wireframe edges of the cube | Near depth is flat — every pixel inside silhouette has identical t |
| Cylinder (h=1, r=0.5) | ~1000 pts / face | Top + bottom rim circles + 4 quarter-arcs for side curvature | Side-view rectangle's top/bottom edges lift to arcs, not lines — required multi-distance extrapolation |
| Torus (R=0.5, r=0.2) | 2 contours / top & bottom face | Outer + inner rim circles in the XZ plane | Hole correctly recovered; intermediate cross-section curvature still under refinement |
The pipeline is shape-agnostic at the export stage: the final OBJ contains the union of all six lifted polylines, scaled back to the input mesh's original bounding box (the centring + unit-cube normalisation applied during rendering is inverted before export). The output is consumed downstream by Topic 35 (Six-Plane Mesh Reconstruction) where the polylines become the contour input for triangulation.
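A sketch of the export step under stated assumptions: the normalisation is taken to be "subtract the bounding-box centre, divide by the largest half-extent", and polylines are written with OBJ `v` and `l` records. The function names and the number format are illustrative.

```python
import numpy as np

def unit_cube_transform(vertices):
    """Centre and scale that map a mesh into [-1, 1]^3 (assumed convention)."""
    lo, hi = vertices.min(axis=0), vertices.max(axis=0)
    centre = 0.5 * (lo + hi)
    scale = 0.5 * float((hi - lo).max())     # largest half-extent maps to 1
    return centre, scale

def to_original_frame(points, centre, scale):
    """Invert the normalisation before export."""
    return np.asarray(points, dtype=float) * scale + centre

def polylines_to_obj(polylines):
    """Serialise 3D polylines as OBJ text: 'v' records plus one 'l' record per polyline."""
    lines, offset = [], 1                    # OBJ vertex indices are 1-based
    for poly in polylines:
        for x, y, z in poly:
            lines.append(f"v {x:.6f} {y:.6f} {z:.6f}")
        indices = " ".join(str(offset + i) for i in range(len(poly)))
        lines.append(f"l {indices}")
        offset += len(poly)
    return "\n".join(lines) + "\n"

mesh_vertices = np.array([[0.0, 0.0, 0.0], [4.0, 2.0, 2.0]])
centre, scale = unit_cube_transform(mesh_vertices)
normalised = (mesh_vertices - centre) / scale
restored = to_original_frame(normalised, centre, scale)
obj_text = polylines_to_obj([np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 1.0, 0.0]])])
```

A uniform scale is used so that circles stay circles after the inverse transform. A per-axis scale would distort the lifted contours.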
Each shape exposes a different stress test for the pipeline: sphere validates symmetry, cube validates flat faces, cylinder validates cap rims, torus validates hole topology.