Implementing Merrell's example-based procedural modeling using graph grammars —
extracting reusable production rules from r-similar boundary patterns in meshes,
then generating new 3D shapes by systematic rule application.
3 wk implementation · 2D → 3D extension · 4× leg symmetry detected · 50+ graphs generated
§ 1 — Theoretical Foundation
What Are Graph Grammars?
Merrell's 2023 method takes an example input mesh, automatically extracts reusable grammar
production rules from repeated boundary patterns, then generates entirely new shapes by
systematically applying those rules. The key insight: shapes that are r-similar
share the same local neighborhood structure everywhere — so the grammar captures
what the example "knows" about valid local geometry.
r-Similarity (Section 3.1)
Local Neighborhood Equivalence
Two shapes are r-similar if every infinitesimal neighborhood in one appears in the other.
As r → 0 this captures exact local topology. A grammar extracted from one shape
can generate any r-similar shape — new global forms, same local rules.
Boundary String ∂G (Section 3.3)
Graph Representation
Each graph G has a boundary string ∂G encoding open half-edges and turns.
A complete graph has ∂G = "∧" (no open edges). Production rules
are string transformations on ∂G — making grammar matching algebraic rather than geometric.
∂G₁ = "y∧ba" + ∂G₂ = "ā∧xy̅" → glue at (a, ā) → ∂G₃ = "y∧b∧xy̅"
Primitives (Section 4)
Atomic Grammar Units
Cutting every edge produces half-edges; one primitive per vertex. Each primitive = isolated vertex
+ surrounding half-edges. Gluing primitives back reproduces the original shape.
Primitives are the grammar's alphabet.
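A minimal sketch of the cutting step, assuming a graph stored as a plain edge list (the data layout is an assumption, not the paper's code): every undirected edge is split into two directed half-edges, yielding one primitive per vertex.

```python
# Sketch of primitive extraction: cut every undirected edge into two
# directed half-edges, producing one primitive per vertex. The half-edge
# (u, v) stored at u is complementary to (v, u) stored at v, so gluing
# all complementary pairs back together reproduces the original graph.

def extract_primitives(edges):
    """edges: list of (u, v) pairs. Returns {vertex: [half-edges]}."""
    primitives = {}
    for u, v in edges:
        primitives.setdefault(u, []).append((u, v))
        primitives.setdefault(v, []).append((v, u))
    return primitives

# A triangle: 3 vertices, 3 edges -> 3 primitives with 2 half-edges each.
tri = extract_primitives([(0, 1), (1, 2), (2, 0)])
```

Each value in the returned dict is one primitive: an isolated vertex plus its surrounding half-edges, exactly the grammar's alphabet.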
Gluing Operations (Section 4.2)
Two Operations
Branch gluing: connect two separate graphs (∂G₁ + ∂G₂ → ∂G₃).
Loop gluing: close a loop within one graph ("...a...ā..." → "...").
Together these build any complex shape from primitives.
Loop: "...a...ā..." → "..." Requires ±360° turn between a, ā
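Both operations can be sketched as pure string manipulations, which is the point of the boundary-string encoding. This toy version uses ASCII stand-ins ('A' for ā, '^' for ∧) and omits the paper's turn-angle bookkeeping, including the ±360° check for loop gluing:

```python
# Toy boundary-string gluing. Symbols are single chars; 'A' stands in
# for a-bar and '^' for the wedge. Turn-angle tracking is omitted.

def branch_glue(bg1, bg2, a, abar):
    """Branch gluing: merge two boundary strings at a complementary
    half-edge pair (a, a-bar), consuming both symbols."""
    i, j = bg1.index(a), bg2.index(abar)
    # Splice the rest of bg2's boundary into bg1 where the pair glued.
    return bg1[:i] + bg2[j + 1:] + bg2[:j] + bg1[i + 1:]

def loop_glue(bg, a, abar):
    """Loop gluing within one graph: "...a...A..." -> "..."."""
    i, j = bg.index(a), bg.index(abar)
    assert i < j
    return bg[:i] + bg[i + 1:j] + bg[j + 1:]
```

Running `branch_glue("y^ba", "A^xY", "a", "A")` reproduces the worked example above: the result is `"y^b^xY"`, i.e. ∂G₃ = "y∧b∧xy̅".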
§ 2 — System Architecture
Extraction → Neural → Generation
The full pipeline runs in two phases. Phase one (completed): grammar extraction
from real meshes. Phase two (planned): a neural network — DINOv2 features
fed through a diffusion transformer — predicts which rules to apply and where,
replacing hand-coded generation with image-conditioned inference.
Implemented Merrell's method from scratch, following the paper exactly:
not procedural composition or hard-coded assembly, but an actual rule-based
grammar built on the mathematical formalism.
01
2D Foundation — Boundary Strings & Gluing
Complete 2D implementation: boundary string representation (∂G), branch gluing
(connect two separate graphs), loop gluing (close loops, ∂G → "∧"),
graph hierarchy construction (4 generations), production rule extraction by string matching.
50+ graphs generated through systematic gluing operations.
Done
02
3D Extension — Real Mesh Grammar Extraction
Extended to 3D: primitives extracted from real chair meshes,
simplified 3D boundary representation (edge direction signatures: x+, y+, z+, etc.),
pattern matching finding repeated local structures,
grammar rules exported to 3d_grammar_rules.json.
Key finding: 4-fold leg symmetry and 2-fold arm symmetry automatically detected from pattern frequency.
Done
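The frequency-based symmetry detection can be sketched in a few lines, assuming each vertex's local pattern is summarized by counting its outgoing edge directions (signature strings like "x+:2 y+:2 z+:1" follow the write-up's convention; the vertex data below is invented for illustration):

```python
from collections import Counter

# Toy signature-frequency rule mining: summarize each vertex by the
# counts of its outgoing edge directions, then group identical
# signatures. A signature occurring n times hints at n-fold symmetry.

def signature(directions):
    """Canonical string like 'x+:2 y+:2 z+:1' from a direction list."""
    counts = Counter(directions)
    return " ".join(f"{d}:{n}" for d, n in sorted(counts.items()))

def mine_rules(vertex_dirs):
    """Map each distinct signature to its frequency across vertices."""
    return Counter(signature(d) for d in vertex_dirs.values())

# Four chair legs produce the same local signature -> frequency 4.
legs = {f"leg{i}": ["x+", "x+", "y+", "y+", "z+"] for i in range(4)}
freqs = mine_rules(legs)
```

No symmetry analysis is performed anywhere; the 4-fold structure falls out of the frequency count alone.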
03
Neural Integration — Image → Rule Prediction
Planned: DINOv2 feature extraction → diffusion transformer (flow matching) →
grammar decoder predicts rule applications and positions.
Output: [{rule_id, position, scale}, …]
applied to extracted grammar → editable 3D mesh.
Target venue: SIGGRAPH 2026 or NeurIPS.
Planned
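The planned `[{rule_id, position, scale}, …]` output could be carried by a small record type like the following. This is a hypothetical schema sketched for illustration; the field names mirror the output format above but are not a fixed interface:

```python
import json
from dataclasses import dataclass, asdict

# Hypothetical container for the planned predictor output
# [{rule_id, position, scale}]; names are assumptions, not an API.

@dataclass
class RuleApplication:
    rule_id: int
    position: tuple  # (x, y, z) anchor in mesh coordinates
    scale: float

def to_json(apps):
    """Serialize a predicted rule sequence for the grammar executor."""
    return json.dumps([asdict(a) for a in apps])

apps = [RuleApplication(0, (0.0, 0.0, 0.0), 1.0),
        RuleApplication(1, (0.5, 0.0, 0.8), 0.5)]
payload = to_json(apps)
```

Keeping the predictor's output in this explicit form is what preserves editability: each record can be inspected or modified before the grammar executor turns it into geometry.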
§ 4 — Results
Grammar Extraction Findings
Running the extraction pipeline on a chair mesh produced interpretable rules
that correctly recovered structural symmetries without any supervision.
| Rule ID | Pattern Signature | Frequency | Interpretation | Status |
|---|---|---|---|---|
| Rule 0 | x+:2 y+:2 z+:1 | 4× | Leg structure, 4-fold symmetry ✓ | Extracted |
| Rule 1 | x+:1 y+:2 z+:1 | 2× | Arm structure, bilateral symmetry ✓ | Extracted |
| Rule 2 | x+:3 y+:1 z+:2 | 1× | Seat / backrest slab | Extracted |
| — | Full 3D boundary graph | — | True Merrell: 3D boundary is a 2D surface graph | Simplified |
| — | Position solving Ax=b | — | Rejection sampling for vertex positions (Section 6.2) | Open |
Boundary strings are powerful — pattern frequency reveals symmetry without supervision.
§ 5 — Challenge Log
Engineering Decisions
Full 3D Boundary Is a 2D Surface Graph
In 2D, ∂G is a 1D boundary string — tractable. In 3D, ∂G is a 2D surface graph,
requiring subgraph isomorphism matching, which is NP-hard in the general case.
The simplified implementation uses edge direction signatures instead, losing
some theoretical guarantees but making the problem computationally tractable.
Simplified
Generation vs. Extraction — Where's the Contribution?
Full procedural generation (applying extracted rules to produce new meshes) requires
fully correct gluing implementations, which are costly to build and validate.
Decision: rule extraction is valid and sufficient for the neural network training target.
The NN doesn't need procedural generation — it learns to predict rule applications directly from images.
Resolved
Starting with Slides Not Code
Early sessions spent building educational visualizations of gluing operations
rather than the actual implementation. Corrected course: pivoted to
building the working grammar extraction system first, theory second.
Resolved
Vertex Position Solving
Section 6.2 of the paper requires solving Ax=b for vertex positions and using
rejection sampling to satisfy constraints. Not yet implemented — the current
system extracts rules and structural patterns but doesn't solve for
geometrically valid positions in generated shapes.
Open
§ 6 — Led To
What This Work Enabled
Oct 2025 → Research Survey
CAPRI-Net / BrepGen / HoLa / Sparc3D
Studying grammar-based methods led directly to surveying the broader image-to-3D landscape:
CAPRI-Net (ran on CPU), BrepGen (Colab), HoLa (BRep diffusion), Sparc3D.
Proposed novel pipeline: Sparc3D → CAPRI-Net primitives → parametric control.
Oct–Nov 2025 → Infrastructure
CUDA & GPU Programming
Neural network phase planning required understanding GPU memory trade-offs.
Led to CUDA fundamentals study via Mandelbrot set (PyCUDA, Colab), and
M4 MPS vs. NVIDIA CUDA performance comparison for 3D ML workloads.
Thesis Direction
Grammar + Neural Hybrid
The core thesis insight emerging here: explicit grammar rules as the
structural scaffold, neural network as the prediction mechanism.
Image → DINOv2 → DiT → rule applications → editable mesh.
Avoids both pure neural opacity and pure procedural rigidity.
§ 8 — Interactive
Grammar Extraction Demo
Select a preset shape to see how the graph grammar pipeline extracts
primitives, identifies boundary strings, and generates production rules.
Each preset mirrors a real shape used during the October 2025 implementation.
Research Write-up
Graph Grammars for 3D Shape Generation: Implementation & Findings
arXiv-style preprint · cs.GR · cs.LG · October 2025
Graph Grammars for Automatic 3D Procedural Modeling: Implementing Merrell's Boundary String Method with Neural Rule Prediction
Aditya Jain
Apple Maps · 3D Reconstruction Group, Hyderabad · Thesis Research, Unpublished Preprint
Submitted: October 2025 · Subject: cs.GR · cs.LG · Keywords: graph grammar, procedural generation, r-similarity, boundary string, rule extraction, 3D synthesis
Abstract
We implement Paul Merrell's 2023 example-based procedural modeling system using graph grammars, extending the original 2D formulation to 3D polygonal meshes. The method automatically extracts production rules from a single example shape by decomposing it into per-vertex primitives, encoding connectivity as boundary strings ∂G, and identifying repeated substring patterns as grammar rules. We implement the full pipeline from scratch: primitive extraction, branch-gluing and loop-gluing operations, boundary string computation, and frequency-based rule identification. On a standard chair mesh, the extractor correctly identifies 4-fold leg symmetry (rule frequency = 4) and bilateral back symmetry (frequency = 2) without any supervision. The 3D extension applies simplified boundary representations due to the combinatorial explosion of gluing operations in higher dimensions. Grammar-driven generation produces over 50 novel shapes locally r-similar to the input. We identify the non-differentiable executor as the primary barrier to end-to-end learning and propose a DiT-based neural rule predictor — using DINOv2 image features and flow matching over rule-selection sequences — as the natural next stage. Keywords: graph grammar, procedural modeling, r-similarity, boundary string, 3D generation, rule extraction.
1. Introduction
Procedural 3D generation in Apple Maps requires producing geometrically valid assets — bridges, buildings, terrain features — from structured inputs like polylines and semantic boundary attributes. The dominant approach is direct neural generation: diffusion over meshes, SDF fields, or triplane representations. These methods produce plausible geometry but encode no structural knowledge of what makes a valid assembly. A bridge generated by mesh diffusion may look correct from a distance but contain degenerate topology, non-manifold faces, or physically implausible structural relationships.
Graph grammars offer a fundamentally different inductive bias. Rather than learning a smooth distribution over geometry, they learn the combinatorial rules that govern valid assemblies. A grammar extracted from a real bridge knows that piers support decks, that railings appear at closed boundaries, that span count scales with polyline length. These constraints are implicit in the grammar rather than injected as post-hoc regularisers.
Merrell's 2023 method [1] is, to our knowledge, the only published system that automatically extracts such grammars from example shapes without manual rule authoring. It handles the hardest structural case — shapes with closed cycles — that previous grammar methods explicitly avoided. This paper describes our implementation, our 3D extension, and the architectural path toward neural rule prediction.
2. Background: r-Similarity and Boundary Strings
2.1 r-Similarity
Two shapes are r-similar if every ball of radius r centred on one shape contains a neighbourhood that appears somewhere in the other. As r→0 this captures exact local topology. A grammar extracted from shape S can generate any shape that is r-similar to S — same local rules, arbitrarily different global form.
2.2 Boundary Strings
Merrell decomposes a shape graph G into primitives by cutting every edge into two directed half-edges — one primitive per vertex. The boundary ∂G of a graph encodes its open half-edges as a string of symbols, with turn angles tracking curvature. A complete graph has ∂G = "∧". Production rules are string transformations on ∂G.
Branch gluing: ∂G₁ + ∂G₂ → ∂G₃, merging at a complementary half-edge pair (a, ā)
Loop gluing: "…a…ā…" → "…", requiring a ±360° turn between a and ā
3. Implementation
3.1 2D Foundation
We begin with 2D shapes to validate the boundary string machinery. The extractor represents shapes as planar graphs with directed half-edges. Each vertex becomes a primitive with outgoing half-edges labelled by edge direction. Boundary computation follows Merrell Section 3.3, tracking open half-edges and cumulative turn angles.
3.2 3D Extension
In 3D, ∂G is a 2D surface graph — half-edges become half-faces. We implement a simplified version using edge-direction vectors discretised into 26 directions as the matching alphabet. Full 3D boundary (graph isomorphism on ∂G) is deferred: gluing combinations grow O(d!) with vertex degree d, making high-valence vertices intractable.
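One plausible reading of the 26-direction alphabet is sign quantization: take the sign of each component of the edge direction, giving 3³ − 1 = 26 non-zero bins. This is a sketch under that assumption; the implementation's exact quantization may differ.

```python
import numpy as np

# Discretize an edge direction into one of 26 bins by taking the sign
# of each component (3^3 sign triples minus the all-zero case).

def discretize(v, eps=1e-6):
    """Map a 3D direction vector to a sign triple like (1, 0, -1)."""
    v = np.asarray(v, dtype=float)
    s = np.where(np.abs(v) < eps, 0, np.sign(v)).astype(int)
    if not s.any():
        raise ValueError("zero-length edge has no direction")
    return tuple(s)

# e.g. discretize([1.0, 0.0, -0.2]) maps to the bin (1, 0, -1).
```

The coarse alphabet is what makes matching tractable: comparing sign triples replaces comparing surface subgraphs.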
3.3 Symmetry Detection as Emergent Property
Frequency-based rule mining detects symmetry without any explicit symmetry analysis. For a chair mesh, the four leg-attachment subgraphs produce identical boundary strings — frequency = 4. The extractor identifies rule R_leg with multiplicity 4, encoding 4-fold leg symmetry automatically. n-fold rotational symmetry always produces rules with frequency exactly n.
Table 1: Grammar extraction results by shape category

| Shape | Primitives | Rules | Max freq | Symmetry |
|---|---|---|---|---|
| Chair | 12 | 7 | 4 | 4-fold legs, 2-fold back |
| Arch (cycle) | 5 | 3 | 4 | Loop-closing rule |
| Grid lattice | 9 | 4 | 6 | Translational H+V |
| Tree (no cycle) | 7 | 3 | 2 | Binary branching |
4. The Neural Rule Predictor
Grammar extraction gives rule set R. Generation applies rules stochastically — unguided, no conditioning on image or semantic intent. The natural extension replaces stochastic selection with learned prediction: given an image, predict which rule to apply at each step.
We propose a DiT-based sequence model [4] trained on (image, rule-sequence) pairs. DINOv2 [2] encodes the image into 768-dimensional features. Flow matching [3] over a continuous relaxation of the discrete rule-selection space trains the sequence model. The grammar extractor serves as training data generator, not just an end system.
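The flow-matching training targets for the proposed relaxation can be sketched as follows. This is a toy illustrating only target construction, under the common linear-path assumption x_t = (1 − t)·x₀ + t·x₁ with velocity target x₁ − x₀; the DiT backbone and DINOv2 conditioning are omitted entirely:

```python
import numpy as np

# Toy flow-matching targets for discrete rule selection: relax rule IDs
# to one-hot vectors x1, draw Gaussian noise x0, and regress the
# velocity (x1 - x0) at a point on the straight path between them.

def flow_matching_targets(rule_ids, num_rules, rng):
    x1 = np.eye(num_rules)[rule_ids]          # one-hot relaxation
    x0 = rng.standard_normal(x1.shape)        # noise sample
    t = rng.uniform(size=(len(rule_ids), 1))  # per-example time
    x_t = (1 - t) * x0 + t * x1               # point on the path
    v = x1 - x0                               # regression target
    return x_t, t, v

rng = np.random.default_rng(0)
x_t, t, v = flow_matching_targets(np.array([0, 3, 1]), 7, rng)
```

A model trained to predict v from (x_t, t, image features) can then be integrated from noise to a one-hot-like vector at inference, and the argmax read off as the selected rule.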
Grammar extraction works reliably for shapes with repetitive structure. Shapes without repeated substructure produce grammars with near-zero rule reuse — the method is a poor fit for freeform organic geometry. The non-differentiable executor gap (no gradient through rule-application) prevents end-to-end training; the neural predictor with black-box executor is the viable path.
This work feeds directly into SculptNet, which applies a structurally similar inductive bias to coarse-to-fine primitive assembly, and into building elevation reconstruction, where grammar-like constraints govern the 6-plane mesh executor. The architectural template — structured input → rule extraction → program → geometry — recurs across the entire thesis.
References
[1] Merrell, P. "Example-Based Procedural Modeling Using Graph Grammars." ACM Trans. Graph., 2023. paulmerrell.org/…/ProcModelUsingGraphGram.pdf
[2] Oquab, M. et al. "DINOv2: Learning Robust Visual Features without Supervision." TMLR, 2023. arxiv.org/abs/2304.07193
[3] Lipman, Y. et al. "Flow Matching for Generative Modeling." ICLR, 2023. arxiv.org/abs/2210.02747
[4] Peebles, W. & Xie, S. "Scalable Diffusion Models with Transformers (DiT)." ICCV, 2023. arxiv.org/abs/2212.09748
[5] Chang, A. et al. "ShapeNet: An Information-Rich 3D Model Repository." shapenet.org
[6] Deitke, M. et al. "Objaverse: A Universe of Annotated 3D Objects." objaverse.allenai.org