Aditya Jain / Apple Maps · 3D Reconstruction
Topic 42 · Apr 2026 · DeepSDF · Activation Probing · Unsupervised Segmentation

Activation-Space SDF Part Discovery

The key insight: when a neural network learns to represent a 3-D shape as a Signed Distance Field, its hidden-layer activations naturally encode part-level structure — without any supervision. Probe the DeepSDF decoder's hidden activations at each mesh vertex, cluster them into semantic surface types, split by mesh connectivity, and a classical balustrade railing falls apart into 1 top rail + 8 individual balusters + 1 base — correct instance segmentation, no labels. The segmentation works. The follow-on goal — train one small SDF per part and boolean-union them into sharp junctions — does not, and the project status document is candid about exactly why: interior-volume part-ownership is ambiguous.

00 — Motivation

Can a network segment a shape it was never told had parts?

DeepSDF trains a single MLP to represent a 3-D shape as a continuous signed distance function. The network is never given part labels — it only ever sees (point, sdf) pairs. The hypothesis this project tests: the network's hidden activations, even though it was trained on a single global SDF, encode the part structure of the shape implicitly. A baluster's surface and a top rail's surface produce different activation patterns because the network has learned different local geometry to represent them — so clustering the activations should recover the parts, for free.

The downstream motivation is the "3-D modelling inside neural networks" idea that runs through the whole thesis line: if a shape can be decomposed into parts, each part can be its own small neural SDF, and a boolean union of per-part SDFs gives sharp junctions at part boundaries — the kind of crisp edges a single global SDF smooths away. Unsupervised part discovery is the first step toward part-structured 3-D representation.

What it informs
This is the segmentation counterpart to the Hypernet → DeepSDF generation work (Topic 41) — both probe what a trained DeepSDF decoder has actually learned. The activation-probing result feeds the part-structured-latent direction proposed as future work in Topic 41's white paper. The unsolved interior-ownership problem (§03) is the open question that a DINO-self-distillation extension — designed but not yet built — is meant to address.
Pipeline

Mesh → DeepSDF → activation probe → segment → per-part SDF → union.

[Pipeline figure: input mesh (railing) → DeepSDF V2 (256×8, ~480K) → activation probe (per-vertex features) → k-means + conn. comp. (instance segmentation) → per-part SDF (thickened shell MLP) → boolean union (min() → sharp edges)]
Green = works. Red dashed = the unsolved core problem: per-part reconstruction fails on interior-volume ownership.
Hardware: M4 iMac (32 GB) · RTX 3060 (12 GB) · Vast.ai.
01 — What Works (Validated)

DeepSDF V2, activation probing, instance segmentation, a 93.2 % classifier.

DeepSDF V2 — high-fidelity representation. A scaled-up DeepSDF (256 hidden, 8 layers, skip connection at layer 4, ~480 K params) successfully represents complex real-world meshes with thin features. Trained on a classical balustrade railing — 8 balusters, ornate profiles, thin gaps between parts — at resolution 256 it captures every detail that the V1 network (128 hidden, 4 layers, 46 K params) at resolution 128 lost entirely. Key config: latent_dim=128, hidden=256, layers=8, skip_at=4, epochs=3000, multi-band surface sampling with SDF clamp [−0.1, 0.1].
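
To make the architecture concrete, here is a minimal NumPy sketch of a decoder with the quoted sizes that also returns the per-layer hidden activations used for probing later on. Several details are assumptions, not taken from the source: the skip connection is modelled as re-concatenating the input at layer 4, "8 layers" is read as 8 hidden layers plus a scalar output, the activation is ReLU, and the weights are random. Under those assumptions the layout has 528,129 parameters, so the real V2 network (reported at ~480 K) evidently differs in some of these choices.

```python
import numpy as np

LATENT_DIM, HIDDEN, LAYERS, SKIP_AT = 128, 256, 8, 4  # values quoted in the text
IN_DIM = LATENT_DIM + 3  # latent code concatenated with the xyz query point


def init_decoder(seed=0):
    """Random weights for the sketch; a real decoder would load trained ones."""
    rng = np.random.default_rng(seed)
    params = []
    for i in range(LAYERS):
        fan_in = IN_DIM if i == 0 else HIDDEN
        if i == SKIP_AT:
            fan_in += IN_DIM  # assumed skip style: re-concatenate the input
        w = rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, HIDDEN))
        params.append((w, np.zeros(HIDDEN)))
    params.append((rng.normal(0.0, 0.01, size=(HIDDEN, 1)), np.zeros(1)))
    return params


def decode(params, latent, points):
    """Return (sdf, activations) for an (N, 3) array of query points."""
    x_in = np.concatenate([np.tile(latent, (len(points), 1)), points], axis=1)
    h, activations = x_in, []
    for i, (w, b) in enumerate(params[:-1]):
        if i == SKIP_AT:
            h = np.concatenate([h, x_in], axis=1)
        h = np.maximum(h @ w + b, 0.0)  # ReLU (assumed)
        activations.append(h)  # one (N, HIDDEN) feature matrix per hidden layer
    w_out, b_out = params[-1]
    return (h @ w_out + b_out)[:, 0], activations


def count_params(params):
    return sum(w.size + b.size for w, b in params)
```

Calling `decode(params, latent, mesh_vertices)` and keeping one entry of `activations` gives the per-vertex feature matrix that the probing step clusters.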

Activation probing — parts emerge without supervision. Hidden-layer activations of the DeepSDF decoder naturally encode part-level structure. Validated on 9 analytical CSG shapes with an average Adjusted Rand Index (ARI) of 0.559 and a peak of 0.996.

| Shape | Parts | Best layer | ARI | Notes |
|---|---|---|---|---|
| Lollipop | 2 | all layers | 0.996 | Near-perfect: sphere vs cylinder |
| Mushroom | 2 | layer 2 | 0.960 | Cap vs stem cleanly separated |
| Snowman | 2 | layer 0 | 0.634 | Positional distinction between spheres |
| L-shape | 2 | layer 0 | 0.606 | Two boxes; all other methods failed this |
| Chair | 6 | layer 0 | 0.371 | Legs hard to individuate |
| Head (real mesh) | 5 | layer 0 | n/a | Geometric regions: forehead, side, face, nose, neck |
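
For readers unfamiliar with the metric in the table, the following is a small pure-Python sketch of the standard Adjusted Rand Index. It is not the project's evaluation code. ARI compares a predicted clustering against ground-truth part labels, is invariant to how clusters are numbered, and is corrected so that chance-level agreement scores around 0.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Standard ARI from the contingency table of two labelings."""
    n = len(labels_true)
    if n < 2:
        return 1.0  # nothing to compare
    pairs = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)
    sum_pairs = sum(comb(c, 2) for c in pairs.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())
    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = 0.5 * (sum_rows + sum_cols)
    if max_index == expected:
        return 1.0  # degenerate case: both labelings are trivial
    return (sum_pairs - expected) / (max_index - expected)


# Same partition with swapped cluster names scores a perfect 1.0.
print(adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0]))  # 1.0
# A partition that cuts across the true parts scores below zero.
print(adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1]))  # -0.5
```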

Instance segmentation — semantic + connected components. The winning approach combines two steps. Step A: cluster activations into 3–5 semantic surface types with k-means on pure activation features (no spatial coordinates) — this identifies "rail surface", "baluster surface", "base surface", and the top rail stays one continuous piece. Step B: within each semantic type, find connected components on the mesh adjacency graph by BFS — this splits individual instances, because each baluster is disconnected from its neighbours by air gaps. Result on the railing: 1 top rail + 8 individual balusters + 1 base = 10 parts, each correctly isolated.
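
Step B can be sketched in a few lines of pure Python. The sketch assumes Step A has already produced one semantic label per vertex, and it assumes one plausible reading of "within each semantic type": mesh edges that join two different surface types are simply dropped before the BFS. The function name and the toy mesh are illustrative, not from the project.

```python
from collections import defaultdict, deque


def instance_segmentation(num_vertices, faces, semantic):
    """Split each semantic surface type into connected mesh components.

    faces: iterable of vertex-index triples; semantic: one label per vertex.
    Returns one instance id per vertex.
    """
    adjacency = defaultdict(set)
    for a, b, c in faces:
        for u, v in ((a, b), (b, c), (c, a)):
            if semantic[u] == semantic[v]:  # never walk across a type boundary
                adjacency[u].add(v)
                adjacency[v].add(u)

    instance = [-1] * num_vertices
    next_id = 0
    for start in range(num_vertices):
        if instance[start] != -1:
            continue
        instance[start] = next_id
        queue = deque([start])
        while queue:  # breadth-first flood fill of one component
            u = queue.popleft()
            for v in adjacency[u]:
                if instance[v] == -1:
                    instance[v] = next_id
                    queue.append(v)
        next_id += 1
    return instance


# Toy mesh: a rail (vertices 0-3) with two balusters (4-5 and 6-7) hanging off it.
faces = [(0, 1, 2), (1, 2, 3), (0, 4, 5), (3, 6, 7)]
semantic = [0, 0, 0, 0, 1, 1, 1, 1]  # 0 = rail surface, 1 = baluster surface
print(instance_segmentation(8, faces, semantic))  # [0, 0, 0, 0, 1, 1, 2, 2]
```

In the toy mesh every vertex is reachable from every other through the rail, so connectivity alone would return a single component, and the semantic labels alone would merge both balusters. Only the combination yields three instances.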

The core insight
Activations encode WHAT, mesh topology encodes WHERE. Activations give surface-type bands; topology alone gives one undifferentiated blob. Combined, they give instance segmentation. Neither half works alone — and spatial k-means (adding XYZ to the activation features) consistently fails, because k-means can only draw straight Voronoi cuts, not the topological splits that separate balusters across air gaps.

Activation classifier — 93.2 % accuracy on the surface. For each part, the mean activation vector (centroid) is computed from its segmented surface vertices. Any new query point is classified by forward-passing it through the original DeepSDF, reading its activations, and finding the nearest centroid. This hits 93.2 % accuracy on surface vertices — but degrades in the interior volume, which is the crux of §03.
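
The classifier described above amounts to a nearest-centroid rule in activation space. A minimal sketch follows, assuming Euclidean distance (the source does not name the metric) and assuming the activation vectors have already been read out of the decoder. The function names and the two-dimensional toy vectors are hypothetical.

```python
from math import dist


def part_centroids(activations, part_ids):
    """Mean activation vector per part, from segmented surface vertices."""
    sums, counts = {}, {}
    for vec, pid in zip(activations, part_ids):
        acc = sums.setdefault(pid, [0.0] * len(vec))
        for k, value in enumerate(vec):
            acc[k] += value
        counts[pid] = counts.get(pid, 0) + 1
    return {pid: [s / counts[pid] for s in acc] for pid, acc in sums.items()}


def classify(activation, centroids):
    """Assign a query activation to the part with the nearest centroid."""
    return min(centroids, key=lambda pid: dist(activation, centroids[pid]))


# Toy 2-D "activations" for two parts.
acts = [[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [12.0, 10.0]]
parts = ["rail", "rail", "baluster", "baluster"]
centroids = part_centroids(acts, parts)
print(classify([1.0, 1.0], centroids))  # rail
print(classify([9.0, 9.0], centroids))  # baluster
```

The rule is only as good as the activations fed into it, which is why it holds up on surface vertices and degrades for interior query points.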

The Unsolved Problem

The original DeepSDF has ONE SDF for the whole shape.
Any point inside is "inside" — it does not know which part a point belongs to.

Segmentation is a surface concept — it works because mesh vertices live on the surface and the network's activations there are meaningful. Per-part reconstruction needs an interior-volume concept: to train a watertight SDF for one baluster, you must know which 3-D points are inside that baluster and not its neighbour. The original DeepSDF cannot answer that — it was only trained near the zero-level set. Every masking attempt fails on this same rock.

02 — What Doesn't Work (Attempted & Failed)

Four attempts at per-part reconstruction. All fail on interior-volume ownership.

The goal: train a small MLP per part, union them via min(), get sharp edges at the boundaries. The status document is candid — this is "the core unsolved problem". The fundamental obstacle is that interior-volume part-ownership is ambiguous, and four distinct masking strategies all break on it.
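
The union step itself is simple, which is part of its appeal. The sketch below uses two analytic spheres as stand-ins for trained per-part MLPs (an assumption for illustration only) and shows that taking the pointwise minimum keeps the crease where the two surfaces meet instead of blending it.

```python
from math import dist, sqrt


def sphere_sdf(center, radius):
    """Exact signed distance to a sphere; stands in for a per-part neural SDF."""
    return lambda p: dist(p, center) - radius


def union_sdf(p, part_sdfs):
    """Boolean union of parts: the pointwise minimum of their signed distances."""
    return min(sdf(p) for sdf in part_sdfs)


parts = [sphere_sdf((0.0, 0.0, 0.0), 0.75), sphere_sdf((1.0, 0.0, 0.0), 0.75)]

print(union_sdf((0.0, 0.0, 0.0), parts))  # -0.75, deep inside the first sphere
print(union_sdf((0.5, 1.0, 0.0), parts) > 0)  # True, outside both spheres

# On the plane x = 0.5 both spheres give the same value, so the union's zero set
# has a crease on the circle of radius sqrt(0.75**2 - 0.5**2) around the x axis.
crease_radius = sqrt(0.75 ** 2 - 0.5 ** 2)
print(abs(union_sdf((0.5, crease_radius, 0.0), parts)) < 1e-9)  # True
```

Everything in this sketch presupposes that each part already has a well-defined inside and outside. That presupposition is exactly what the four attempts below fail to establish for parts cut from a single global SDF.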

| Attempt | Method | Why it failed |
|---|---|---|
| 1. Watertight closing | Close each open part mesh via fill-holes / voxelize+fill+MC / convex hull | Voxelize wrapped entire regions; convex hull lost all concavities. Each part claimed nearly the whole volume: union had 16 M / 16 M voxels inside, marching cubes produced garbage |
| 2. Distance proximity mask | Use the original SDF if a query point is within a proximity radius of the part, else force positive | Radius impossible to tune: too large and parts leak into neighbours, too small and parts get holes. Nearest-surface-vertex doesn't determine interior ownership: a point in the air gap between balusters is near a baluster vertex but inside no baluster |
| 3. Segmentation-label mask | Find the nearest full-mesh vertex, check its part label; same part → real SDF, else force positive | Same flaw: nearest-vertex is a surface concept. Interior points get assigned to whichever surface is closest, not the correct enclosing part. 30–50 % inside ratios for small parts (should be 5–10 %), blob outputs |
| 4. Activation volume classifier | Classify each query point by its DeepSDF activation vector (the 93.2 %-on-surface classifier from §01) | Activations are unreliable off-surface: the network was never trained to produce meaningful activations in the interior. Noisy interior classification, 30–50 % inside ratios persist, union has artefacts though the railing structure is dimly visible |

The pattern across all four: segmentation is a surface operation, reconstruction needs a volume operation, and the trained DeepSDF only ever learned the surface. The thickened-shell workaround — defining each part's SDF as distance_to_part_surface − thickness/2 to turn open sheets into thin watertight solids — produces clean individual parts, but the union still inherits the interior-ownership ambiguity wherever parts are close.
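
The thickened-shell definition can be written down directly. This sketch approximates distance_to_part_surface by the distance to the nearest point in a dense surface sampling, which is an assumption made for brevity (a real pipeline would more likely query the part's triangles). The planar test sheet and the thickness value are illustrative.

```python
from math import dist


def shell_sdf(query, surface_points, thickness):
    """Unsigned distance to the nearest surface sample, minus half the thickness.

    Negative inside a slab of the given thickness centred on the open sheet.
    """
    return min(dist(query, s) for s in surface_points) - thickness / 2.0


# An open sheet: an 11 x 11 grid of samples on the plane z = 0.
sheet = [(i / 10, j / 10, 0.0) for i in range(11) for j in range(11)]

print(shell_sdf((0.5, 0.5, 0.0), sheet, thickness=0.02))  # -0.01, on the sheet
print(shell_sdf((0.5, 0.5, 0.05), sheet, thickness=0.02) > 0)  # True, outside the slab
```

Because the distance is unsigned, the result never depends on knowing which side of the sheet is the part's interior. That is why individual parts come out clean, and also why the construction says nothing about ownership where two parts sit close together.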

Interactive Demo · Live

Watch the two-step segmentation. The left pane is a stylised railing. Toggle between the activation k-means step (semantic surface types — the rail is one colour, all balusters another) and the connected-components step (each baluster split into its own instance). The right pane shows the resulting part count.


Full Technical Paper

White paper · activation-space part discovery · the WHAT/WHERE insight · the ARI validation · the unsolved interior-ownership problem · the DINO-extension design

Related Thesis Chapters
Hypernet → DeepSDF
The generation counterpart. Both probe what a trained DeepSDF decoder has actually learned — this one for segmentation, that one for image-conditioned generation.
SDF Research
The foundational SDF study. The CSG-union-for-sharp-junctions idea this project chases is the SDF compositionality documented there.
Hierarchical Triplane
The part-structured-representation goal. Activation-space part discovery is one route to the per-part decomposition the hierarchical triplane work assumes.