Five-paper survey of state-of-the-art 3-D reconstruction and CAD-generation methods, each studied from paper read → GitHub clone → inference attempt. The output is the field gap that the thesis line targets: none of the surveyed methods combines image-to-3-D with CAD-parametric (procedural) control. You get either neural reconstruction (high fidelity, no parameters) or procedural generation (low fidelity, full parameters). The thesis line's goal is to bridge that gap.
Before committing to the thesis-line architectural decisions (procedural + neural hybrid, triplane intermediate, MambaFlow3D-class generators), the obvious question is whether someone has already shipped what the thesis is trying to build. The October 2025 survey was the exercise that answered it. Five representative recent papers were each read end-to-end and, where the compute requirements allowed, cloned and run for inference on the available hardware (Intel iMac, RTX 3060 via Vast.ai, or Colab T4, depending on the paper).
The finding that justified the thesis-line scope: the field bifurcates. Neural reconstruction methods (SparC3D, TRELLIS) produce high-fidelity 3-D from images but have no parametric / procedural interface. Procedural CAD generation methods (CAPRI-Net, BrepGen, HoLa) produce fully parametric output, but from sparse / latent input rather than from images. The thesis line targets the intersection, image-to-3-D with procedural / parametric output, which no paper at the time of the survey solved end-to-end.
| Paper | Approach | Input | Output | Parametric? |
|---|---|---|---|---|
| CAPRI-Net | CSG primitive composition via learned program | Point cloud | CSG program (primitives + operations) | Yes |
| BrepGen | Diffusion on boundary-representation graph | Latent code | B-rep CAD model | Yes |
| HoLa | Hierarchical learned B-rep generation | Latent code | B-rep CAD model with topology | Yes |
| SparC3D | Sparse-cube transformer over voxel tokens | Single image | Sparse-voxel 3-D mesh | No |
| TRELLIS | Structured latent + dual decoder | Single image | Sparse-voxel + Gaussian splat output | No |
Image-to-3-D ∩ Parametric = ∅.
No paper does both. That's the thesis-line opportunity.
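The empty intersection can be checked mechanically from the comparison table. The following is a minimal sketch that transcribes the table's "Input" and "Parametric?" columns into records and intersects the two sets; the `Paper` class and `field_gap` function are illustrative names introduced here, not part of any surveyed codebase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Paper:
    """One row of the comparison table (hypothetical helper type)."""
    name: str
    input_kind: str   # "Input" column, lower-cased
    parametric: bool  # "Parametric?" column


# Rows transcribed from the comparison table above.
PAPERS = [
    Paper("CAPRI-Net", "point cloud", True),
    Paper("BrepGen", "latent code", True),
    Paper("HoLa", "latent code", True),
    Paper("SparC3D", "single image", False),
    Paper("TRELLIS", "single image", False),
]


def field_gap(papers):
    """Return (image-input set, parametric set, their intersection)."""
    image_input = {p.name for p in papers if p.input_kind == "single image"}
    parametric = {p.name for p in papers if p.parametric}
    return image_input, parametric, image_input & parametric


image_input, parametric, both = field_gap(PAPERS)
print("image input:", sorted(image_input))
print("parametric: ", sorted(parametric))
print("both:       ", sorted(both))
```

Adding a future paper as a new `Paper` row is enough to re-test whether the gap still holds.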
SparC3D and TRELLIS demonstrate state-of-the-art neural image-to-3-D with no procedural interface. CAPRI-Net, BrepGen, and HoLa demonstrate state-of-the-art procedural / parametric CAD generation with no image input. The intersection, which is the capability that the Apple Maps procedural-modelling team needs most, is empty. The thesis line targets that intersection explicitly: image → triplane → procedural DSL (PGN-class) / primitive program (SculptNet-class).
| Paper | Hardware / environment tried | Outcome |
|---|---|---|
| CAPRI-Net | Intel iMac, then Jupyter notebook with GPU | Successful inference after dependency-pin fixes |
| BrepGen | Google Colab T4 | Successful after CUDA-version compatibility fix |
| HoLa | iMac (CPU) | Inference too slow to be useful; queued for RTX 3060 retry |
| SparC3D | Hugging Face Space (cloud) | Works; analysed via the HF demo rather than local install |
| TRELLIS | Microsoft research release · A100-class required | Read-only — local hardware insufficient |
The field gap as a 2-D scatter plot. Each paper is a dot; X-axis is "image input capability", Y-axis is "parametric output". The empty top-right quadrant is the thesis-line target.
White paper · five-paper field-gap survey · image-input × parametric-output quadrant · thesis-line scope decision