Topic 03 · October 2025

Graph Grammar 3D

Implementing Merrell's example-based procedural modeling using graph grammars — extracting reusable production rules from r-similar boundary patterns in meshes, then generating new 3D shapes by systematic rule application.

3-Week Implementation
2D→3D Extension
4× Leg Symmetry Detected
50+ Graphs Generated
§ 1 — Theoretical Foundation

What Are Graph Grammars?

Merrell's 2023 method takes an example input mesh, automatically extracts reusable grammar production rules from repeated boundary patterns, then generates entirely new shapes by systematically applying those rules. The key insight: shapes that are r-similar share the same local neighborhood structure everywhere — so the grammar captures what the example "knows" about valid local geometry.

r-Similarity (Section 3.1)
Local Neighborhood Equivalence
Two shapes are r-similar if every radius-r neighborhood in one appears somewhere in the other. As r → 0 this captures exact local topology. A grammar extracted from one shape can generate any r-similar shape — new global forms, same local rules.
Boundary String ∂G (Section 3.3)
Graph Representation
Each graph G has a boundary string ∂G encoding open half-edges and turns. A complete graph has ∂G = "∧" (no open edges). Production rules are string transformations on ∂G — making grammar matching algebraic rather than geometric.
∂G₁ = "y∧ba" + ∂G₂ = "ā∧xy̅"
↓ glue at a, ā
∂G₃ = "y∧b∧xy̅"
Primitives (Section 4)
Atomic Grammar Units
Cutting every edge produces half-edges; one primitive per vertex. Each primitive = isolated vertex + surrounding half-edges. Gluing primitives back reproduces the original shape. Primitives are the grammar's alphabet.
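The primitive decomposition is a few lines of code. A minimal sketch over an undirected edge list (function and variable names are mine, not the paper's):

```python
def extract_primitives(edges):
    """Cut every edge into two directed half-edges; one primitive per vertex.

    A primitive is a vertex together with its outgoing half-edges.
    Gluing all complementary half-edge pairs back reproduces the input graph.
    """
    primitives = {}
    for u, v in edges:
        primitives.setdefault(u, []).append((u, v))  # half-edge leaving u
        primitives.setdefault(v, []).append((v, u))  # half-edge leaving v
    return primitives

# A triangle yields three primitives, each with two open half-edges.
tri = extract_primitives([(0, 1), (1, 2), (2, 0)])
```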
Gluing Operations (Section 4.2)
Two Operations
Branch gluing: connect two separate graphs (∂G₁ + ∂G₂ → ∂G₃). Loop gluing: close a loop within one graph ("...a...ā..." → "..."). Together these build any complex shape from primitives.
Loop: "...a...ā..." → "..."
Requires ±360° turn between a, ā
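Both operations reduce to list surgery on boundary strings. A minimal sketch, writing ∧ as `'^'` and the overbar as a trailing `'-'`, and omitting the turn-angle bookkeeping a real implementation must check before loop gluing:

```python
def bar(sym):
    """Complement of a half-edge label: 'a' <-> 'a-' ('-' stands in for the overbar)."""
    return sym[:-1] if sym.endswith('-') else sym + '-'

def branch_glue(bg1, bg2, a):
    """Branch gluing: join two boundary strings at the complementary pair (a, bar(a))."""
    i, j = bg1.index(a), bg2.index(bar(a))
    # splice bg2 (rotated so the symbol after the glued one comes first) into
    # bg1 in place of `a`, dropping both glued symbols
    return bg1[:i] + bg2[j + 1:] + bg2[:j] + bg1[i + 1:]

def loop_glue(bg, a):
    """Loop gluing: "...a...a̅..." -> "..." within a single boundary string."""
    i, j = sorted((bg.index(a), bg.index(bar(a))))
    return bg[:i] + bg[i + 1:j] + bg[j + 1:]

# e.g. gluing "y∧ba" to "ā∧xy̅" at the pair (a, ā)
g3 = branch_glue(list("y^ba"), ["a-", "^", "x", "y-"], "a")
```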
§ 2 — System Architecture

Extraction → Neural → Generation

The full pipeline runs in two phases. Phase one (completed): grammar extraction from real meshes. Phase two (planned): a neural network — DINOv2 features fed through a diffusion transformer — predicts which rules to apply and where, replacing hand-coded generation with image-conditioned inference.

[Pipeline diagram] Phase 1 (completed): input mesh (chair.obj, real geometry) → extract primitives (1 per vertex) → pattern-match boundary strings → 3d_grammar_rules.json (rule_id, pattern, frequency). Phase 2 (planned): image input (photo) → DINOv2 feature extraction → diffusion transformer (flow matching) → rule applications + positions, e.g. [{rule_id: 0, pos: [0.4, 0.4, 0]}, {rule_id: 0, pos: [-0.4, 0.4, 0]}, …] → editable 3D mesh.

Full pipeline: grammar extraction (complete) → neural rule prediction (planned). DINOv2 + diffusion transformer maps image → rule applications → 3D geometry.

§ 3 — Implementation

Three-Week Roadmap

Implemented Merrell's method from scratch, following the paper's mathematical formalism exactly — an actual rule-based grammar, not procedural composition or hard-coded assembly.

01
2D Foundation — Boundary Strings & Gluing
Complete 2D implementation: boundary string representation (∂G), branch gluing (connect two separate graphs), loop gluing (close loops, ∂G → "∧"), graph hierarchy construction (4 generations), production rule extraction by string matching. 50+ graphs generated through systematic gluing operations.
Done
02
3D Extension — Real Mesh Grammar Extraction
Extended to 3D: primitives extracted from real chair meshes, simplified 3D boundary representation (edge direction signatures: x+, y+, z+, etc.), pattern matching finding repeated local structures, grammar rules exported to 3d_grammar_rules.json. Key finding: 4-fold leg symmetry and 2-fold arm symmetry automatically detected from pattern frequency.
Done
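The edge-direction signatures can be sketched as follows; snapping each edge to its dominant axis is my reconstruction of the `x+:2_y+:2_z+:1` format, not the exact implementation:

```python
from collections import Counter

def direction_signature(vertex, neighbors):
    """Summarise a vertex's incident edges as a multiset of axis directions.

    Each edge is snapped to its dominant axis ('x+', 'y-', ...) and the
    counts are joined into a canonical string such as 'x+:2_y+:2_z+:1'.
    """
    labels = []
    for n in neighbors:
        d = [n[k] - vertex[k] for k in range(3)]
        k = max(range(3), key=lambda i: abs(d[i]))  # dominant axis
        labels.append("xyz"[k] + ("+" if d[k] >= 0 else "-"))
    counts = Counter(labels)
    return "_".join(f"{lab}:{counts[lab]}" for lab in sorted(counts))

# A chair-leg corner: two edges along x+, two along y+, one up z+
sig = direction_signature((0, 0, 0),
                          [(1, 0, 0), (2, 0.1, 0), (0, 1, 0), (0.1, 2, 0), (0, 0, 1)])
```

Vertices with identical signatures are candidates for the same production rule, which is what makes the frequency counting in § 4 possible.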
03
Neural Integration — Image → Rule Prediction
Planned: DINOv2 feature extraction → diffusion transformer (flow matching) → grammar decoder predicts rule applications and positions. Output: [{rule_id, position, scale}, …] applied to extracted grammar → editable 3D mesh. Target venue: SIGGRAPH 2026 or NeurIPS.
Planned
§ 4 — Results

Grammar Extraction Findings

Running the extraction pipeline on a chair mesh produced interpretable rules that correctly recovered structural symmetries without any supervision.

Rule ID | Pattern Signature       | Frequency | Interpretation                                                    | Status
Rule 0  | x+:2 y+:2 z+:1          | 4         | Leg structure — 4-fold symmetry                                   | ✓ Extracted
Rule 1  | x+:1 y+:2 z+:1          | 2         | Arm structure — bilateral symmetry                                | ✓ Extracted
Rule 2  | x+:3 y+:1 z+:2          |           | Seat / backrest slab                                              | Extracted
–       | Full 3D boundary graph  |           | In full Merrell, the boundary of a 3D shape is a 2D surface graph | Simplified
–       | Position solving (Ax=b) |           | Rejection sampling for vertex positions (Section 6.2)             | Planned
Pattern: "x+:2_y+:2_z+:1" occurs 4× → Rule 0 (legs)
Pattern: "x+:1_y+:2_z+:1" occurs 2× → Rule 1 (arms)
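Rule mining is then a counting pass over signatures. A sketch (the freq ≥ 2 threshold matches the rule-mining step of the pipeline; the dict fields mirror 3d_grammar_rules.json but are otherwise assumptions):

```python
from collections import Counter

def mine_rules(signatures, min_freq=2):
    """Repeated local signatures become production rules, ranked by frequency."""
    counts = Counter(signatures)
    frequent = [(s, n) for s, n in counts.most_common() if n >= min_freq]
    return [{"rule_id": i, "pattern": s, "frequency": n}
            for i, (s, n) in enumerate(frequent)]

# Four leg corners, two arm joints, one seat slab from the chair mesh:
sigs = ["x+:2_y+:2_z+:1"] * 4 + ["x+:1_y+:2_z+:1"] * 2 + ["x+:3_y+:1_z+:2"]
rules = mine_rules(sigs)
```

With the default threshold the leg and arm rules are mined (frequencies 4 and 2); the one-off seat signature is kept only if `min_freq` is lowered to 1.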

Boundary strings are powerful — pattern frequency reveals symmetry without supervision.
§ 5 — Challenge Log

Engineering Decisions

§ 6 — Led To

What This Work Enabled

Oct 2025 → Research Survey
CAPRI-Net / BrepGen / HoLa / Sparc3D
Studying grammar-based methods led directly to surveying the broader image-to-3D landscape: CAPRI-Net (ran on CPU), BrepGen (Colab), HoLa (BRep diffusion), Sparc3D. Proposed novel pipeline: Sparc3D → CAPRI-Net primitives → parametric control.
Oct–Nov 2025 → Infrastructure
CUDA & GPU Programming
Neural network phase planning required understanding GPU memory trade-offs. Led to CUDA fundamentals study via Mandelbrot set (PyCUDA, Colab), and M4 MPS vs. NVIDIA CUDA performance comparison for 3D ML workloads.
Thesis Direction
Grammar + Neural Hybrid
The core thesis insight emerging here: explicit grammar rules as the structural scaffold, neural network as the prediction mechanism. Image → DINOv2 → DiT → rule applications → editable mesh. Avoids both pure neural opacity and pure procedural rigidity.
§ 7 — References

Papers & Resources

§ 8 — Interactive

Grammar Extraction Demo

Select a preset shape to see how the graph grammar pipeline extracts primitives, identifies boundary strings, and generates production rules. Each preset mirrors a real shape used during the October 2025 implementation.

Research Write-up
Graph Grammars for 3D Shape Generation: Implementation & Findings
arXiv-style preprint · cs.GR · cs.LG · October 2025
Graph Grammars for Automatic 3D Procedural Modeling: Implementing Merrell's Boundary String Method with Neural Rule Prediction
Aditya Jain
Apple Maps · 3D Reconstruction Group, Hyderabad · Thesis Research, Unpublished Preprint
Submitted: October 2025 Subject: cs.GR · cs.LG Keywords: graph grammar, procedural generation, r-similarity, boundary string, rule extraction, 3D synthesis
Abstract
We implement Paul Merrell's 2023 example-based procedural modeling system using graph grammars, extending the original 2D formulation to 3D polygonal meshes. The method automatically extracts production rules from a single example shape by decomposing it into per-vertex primitives, encoding connectivity as boundary strings ∂G, and identifying repeated substring patterns as grammar rules. We implement the full pipeline from scratch: primitive extraction, branch-gluing and loop-gluing operations, boundary string computation, and frequency-based rule identification. On a standard chair mesh, the extractor correctly identifies 4-fold leg symmetry (rule frequency = 4) and bilateral back symmetry (frequency = 2) without any supervision. The 3D extension applies simplified boundary representations due to the combinatorial explosion of gluing operations in higher dimensions. Grammar-driven generation produces over 50 novel shapes locally r-similar to the input. We identify the non-differentiable executor as the primary barrier to end-to-end learning and propose a DiT-based neural rule predictor — using DINOv2 image features and flow matching over rule-selection sequences — as the natural next stage.
1. Introduction

Procedural 3D generation in Apple Maps requires producing geometrically valid assets — bridges, buildings, terrain features — from structured inputs like polylines and semantic boundary attributes. The dominant approach is direct neural generation: diffusion over meshes, SDF fields, or triplane representations. These methods produce plausible geometry but encode no structural knowledge of what makes a valid assembly. A bridge generated by mesh diffusion may look correct from a distance but contain degenerate topology, non-manifold faces, or physically implausible structural relationships.

Graph grammars offer a fundamentally different inductive bias. Rather than learning a smooth distribution over geometry, they learn the combinatorial rules that govern valid assemblies. A grammar extracted from a real bridge knows that piers support decks, that railings appear at closed boundaries, that span count scales with polyline length. These constraints are implicit in the grammar rather than injected as post-hoc regularisers.

Merrell's 2023 method [1] is, to our knowledge, the only published system that automatically extracts such grammars from example shapes without manual rule authoring. It handles the hardest structural case — shapes with closed cycles — that previous grammar methods explicitly avoided. This paper describes our implementation, our 3D extension, and the architectural path toward neural rule prediction.

2. Background: r-Similarity and Boundary Strings
2.1 r-Similarity

Two shapes are r-similar if every ball of radius r centred on one shape contains a neighbourhood that appears somewhere in the other. As r→0 this captures exact local topology. A grammar extracted from shape S can generate any shape that is r-similar to S — same local rules, arbitrarily different global form.

2.2 Boundary Strings

Merrell decomposes a shape graph G into primitives by cutting every edge into two directed half-edges — one primitive per vertex. The boundary ∂G of a graph encodes its open half-edges as a string of symbols, with turn angles tracking curvature. A complete graph has ∂G = "∧". Production rules are string transformations on ∂G.

Branch gluing: ∂G₁ + ∂G₂ → ∂G₃ = merge at complementary half-edge pair (a, ā)
Loop gluing: "…a…ā…" → "…" requires ±360° turn between a and ā
Figure 1: Merrell grammar pipeline. Input mesh → primitive extraction → boundary string ∂G → rule mining (freq ≥ 2) → grammar-driven generation until ∂G = "∧". Output shapes are r-similar to input.
3. Implementation
3.1 2D Foundation

We begin with 2D shapes to validate the boundary string machinery. The extractor represents shapes as planar graphs with directed half-edges. Each vertex becomes a primitive with outgoing half-edges labelled by edge direction. Boundary computation follows Merrell Section 3.3, tracking open half-edges and cumulative turn angles.

3.2 3D Extension

In 3D, ∂G is a 2D surface graph — half-edges become half-faces. We implement a simplified version using edge-direction vectors discretised into 26 directions as the matching alphabet. Full 3D boundary (graph isomorphism on ∂G) is deferred: gluing combinations grow O(d!) with vertex degree d, making high-valence vertices intractable.
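The 26 directions are the nonzero sign patterns over the three axes. A sketch of the discretisation (the 0.35 significance threshold is an assumed value, not one from the implementation):

```python
import math

def discretise_26(d):
    """Map a nonzero 3D edge vector to one of the 26 directions
    (sx, sy, sz) in {-1, 0, +1}^3 minus (0, 0, 0): a component contributes
    its sign only if it carries a significant fraction of the vector's length."""
    n = math.sqrt(sum(x * x for x in d))
    return tuple(0 if abs(x) / n < 0.35 else (1 if x > 0 else -1) for x in d)
```

Axis-aligned, face-diagonal, and corner-diagonal edges all land on distinct symbols, so the matching alphabet stays small while remaining rotation-sensitive.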

3.3 Symmetry Detection as Emergent Property

Frequency-based rule mining detects symmetry without any explicit symmetry analysis. For a chair mesh, the four leg-attachment subgraphs produce identical boundary strings — frequency = 4. The extractor identifies rule R_leg with multiplicity 4, encoding 4-fold leg symmetry automatically. n-fold rotational symmetry always produces rules with frequency exactly n.

Table 1: Grammar extraction results by shape category
Shape           | Primitives | Rules | Max freq | Symmetry
Chair           | 12         | 7     | 4        | 4-fold legs, 2-fold back
Arch (cycle)    | 5          | 3     | 4        | Loop-closing rule
Grid lattice    | 9          | 4     | 6        | Translational H+V
Tree (no cycle) | 7          | 3     | 2        | Binary branching
4. The Neural Rule Predictor

Grammar extraction gives rule set R. Generation applies rules stochastically — unguided, no conditioning on image or semantic intent. The natural extension replaces stochastic selection with learned prediction: given an image, predict which rule to apply at each step.

We propose a DiT-based sequence model [4] trained on (image, rule-sequence) pairs. DINOv2 [2] encodes the image into 768-dimensional features. Flow matching [3] over a continuous relaxation of the discrete rule-selection space trains the sequence model. The grammar extractor serves as training data generator, not just an end system.

p(rₜ | r₁,...,rₜ₋₁, x_img) ≈ DiT(DINOv2(x_img), [r₁,...,rₜ₋₁])
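Since generation is currently unguided, the same stochastic selection can serve as the training-data generator. A minimal sketch, with frequency-weighted sampling as an assumed stand-in for the selection policy the DiT would replace:

```python
import random

def sample_rule_sequence(rules, n_steps, seed=None):
    """Unguided generation: pick rule applications at random, weighted by
    each rule's extraction frequency. Pairing renders of the resulting
    shapes with these sequences yields (image, rule-sequence) training data."""
    rng = random.Random(seed)
    weights = [r["frequency"] for r in rules]
    return [rng.choices(rules, weights=weights)[0]["rule_id"]
            for _ in range(n_steps)]

chair_rules = [{"rule_id": 0, "frequency": 4}, {"rule_id": 1, "frequency": 2}]
seq = sample_rule_sequence(chair_rules, 8, seed=0)
```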
5. Discussion & Conclusion

Grammar extraction works reliably for shapes with repetitive structure. Shapes without repeated substructure produce grammars with near-zero rule reuse — the method is a poor fit for freeform organic geometry. The non-differentiable executor gap (no gradient through rule-application) prevents end-to-end training; the neural predictor with black-box executor is the viable path.

This work feeds directly into SculptNet, which applies a structurally similar inductive bias to coarse-to-fine primitive assembly, and into building elevation reconstruction, where grammar-like constraints govern the 6-plane mesh executor. The architectural template — structured input → rule extraction → program → geometry — recurs across the entire thesis.

References
[1] Merrell, P. "Example-Based Procedural Modeling Using Graph Grammars." ACM Trans. Graph., 2023. paulmerrell.org/…/ProcModelUsingGraphGram.pdf
[2] Oquab, M. et al. "DINOv2: Learning Robust Visual Features without Supervision." TMLR, 2023. arxiv.org/abs/2304.07193
[3] Lipman, Y. et al. "Flow Matching for Generative Modeling." ICLR, 2023. arxiv.org/abs/2210.02747
[4] Peebles, W. & Xie, S. "Scalable Diffusion Models with Transformers (DiT)." ICCV, 2023. arxiv.org/abs/2212.09748
[5] Chang, A. et al. "ShapeNet: An Information-Rich 3D Model Repository." shapenet.org
[6] Deitke, M. et al. "Objaverse: A Universe of Annotated 3D Objects." objaverse.allenai.org
Appendix — Raw Materials
Implementation Logs & Source References
Restricted Access