Assembly Theory and Its Adaptation for Curating

Roulette

Aug 16

What follows is ChatGPT’s description of Assembly Theory, a concept created by the physicist Sara Walker and the chemist Leroy Cronin. ChatGPT was then prompted to adapt its summary of Assembly Theory to the practice of curating.

ChatGPT 5 (Pro) • 8/16/25

Roulette is our section for essays, portfolios, videos, conversations, and more that range across a broader purview of culture and politics, not within our current thematic sections. It is, in a sense, a chance encounter with a subject of interest within the extended realm of visual culture.
Assembly theory models objects as being built from basic parts, quantifying the minimal number of construction steps (the assembly index) and pairing this with how many copies appear (the copy number). A high assembly index together with a high copy number signals selection-driven processes rather than undirected dynamics, and the index can be estimated empirically from spectral data. Adapted to curating, artworks and texts become building blocks, curatorial relations (adjacency, motifs, sightlines, programs) are the joins, and the exhibition is the assembled structure. A Curatorial Assembly Index, combining a minimal motif “basis” (structural) with required interventions like walls or didactics (practical), captures how few moves are needed for coherence, while copy numbers track recurrent themes that act as spines. In practice, curators form reusable “constellations,” pace complexity across rooms, and watch for overburdened bridges to balance coherence and novelty, using the metrics as guides rather than verdicts.

Assembly Theory—what it is (in plain terms)
Assembly Theory is a way to describe how hard it is for the universe to build a given object. It treats an object (a molecule, a polymer, a word, even a gadget) not just as a finished thing, but as something that could be assembled step‑by‑step from simpler building blocks. Two key quantities do the work:

Assembly index (AI) – the smallest number of join operations on building blocks needed to make the object (reusing substructures you’ve already built is allowed).
Copy number – how many identical copies of that object you actually observe in a sample or environment.

High assembly index objects that also appear in many copies are taken as evidence that some selective, history‑bearing process made them (e.g., evolution or technology), rather than undirected chemistry alone. (Nature)

Core ideas, a bit more formally

Assembly space. For any object, you can map the network of possible construction pathways from basic building blocks; the assembly index is the length of the shortest such path. In chemistry, the “joins” correspond to forming bonds; in strings, to concatenating substrings. (Nature)
Selection via abundance. One-off occurrences of a complex object could, in principle, happen by chance. But finding many identical copies of a high‑AI object is unlikely without a directed (selected) process. (Nature)
An ensemble measure. A sample’s overall “assembly” can be computed from object AIs and copy numbers, for example
$$A = \sum_{i=1}^{N} e^{a_i}\, \frac{n_i - 1}{N_T},$$
where $a_i$ is the assembly index of object $i$, $n_i$ is its copy number, and $N_T$ is the total number of objects. A larger $A$ indicates that more selection is required to explain the sample. (Nature)
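
As a rough numerical illustration of the ensemble formula above, the sketch below evaluates A for a list of (assembly index, copy number) pairs. The function name `ensemble_assembly` and the sample values are hypothetical, and the code assumes that the total number of objects is simply the sum of all copy numbers.

```python
import math

def ensemble_assembly(objects):
    """Evaluate A = sum_i exp(a_i) * (n_i - 1) / N_T.

    `objects` is a list of (assembly_index, copy_number) pairs.
    Assumption: N_T is the sum of all copy numbers in the sample.
    """
    total = sum(n for _, n in objects)
    if total == 0:
        return 0.0
    return sum(math.exp(a) * (n - 1) / total for a, n in objects)

# Hypothetical samples: complex objects seen once vs. seen many times.
one_offs = [(12, 1), (15, 1)]
repeated = [(12, 50), (15, 50)]
print(ensemble_assembly(one_offs))   # 0.0: single copies contribute nothing
print(ensemble_assembly(repeated) > ensemble_assembly([(3, 50), (4, 50)]))  # True
```

The (n − 1) factor is what makes one-off objects count for nothing, while the exponential makes repeated high-index objects dominate the total.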

Concrete example. In the Nature paper that set out the framework, a specific small molecule (diethyl phthalate) is shown to have AI = 8 under the bond‑formation counting scheme—illustrating how motifs can be reused along the shortest pathway. (Nature)
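
To make the idea of a shortest pathway with reuse concrete, here is a minimal brute-force sketch for short strings, where a join is the concatenation of two pieces already available. It is not the algorithm used in the cited papers; the function name `string_assembly_index` is hypothetical, and the search is exponential, so it is only practical for toy inputs.

```python
from itertools import product

def string_assembly_index(target):
    """Smallest number of concatenations needed to build `target`
    from its single characters, reusing anything already built."""
    if len(target) <= 1:
        return 0
    # Only substrings of the target can be useful intermediates.
    substrings = {target[i:j]
                  for i in range(len(target))
                  for j in range(i + 1, len(target) + 1)}
    basics = frozenset(target)

    def reachable(pool, steps_left):
        if target in pool:
            return True
        if steps_left == 0:
            return False
        for left, right in product(pool, repeat=2):
            joined = left + right
            if joined in substrings and joined not in pool:
                if reachable(pool | {joined}, steps_left - 1):
                    return True
        return False

    # Iterative deepening: try 1 join, then 2, and so on.
    depth = 1
    while not reachable(basics, depth):
        depth += 1
    return depth

print(string_assembly_index("abcd"))    # 3: no piece can be reused
print(string_assembly_index("abab"))    # 2: build "ab", then join it to itself
print(string_assembly_index("abcabc"))  # 3: "ab", "abc", then "abc" + "abc"
```

The six-character string costs no more than the four-character one because the built piece "abc" is reused, which is the effect the diethyl phthalate example illustrates for bonds.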

How the numbers are measured in practice (not just calculated)

Mass spectrometry (MS/MS). Empirically, the number of peaks in a molecule’s fragmentation spectrum correlates with its AI, letting you estimate AI directly from data. This correlation was validated across diverse real samples and proposed as the basis for a life‑detection test. (Nature)
IR and NMR spectroscopy. Later work showed you can also infer AI from infrared and NMR spectra (and do even better by combining methods), and provided faster algorithms to compute exact AIs on molecular graphs. (PMC)

What it’s for

Life detection (“agnostic” biosignatures). The 2021 experiments argued that complex molecules with sufficiently high AI appearing in detectable abundance are strong biosignatures, independent of Earth‑specific biochemistry; this idea has been flagged by NASA’s astrobiology program. (Nature, NASA Astrobiology)
Linking physics and biology. The 2023 Nature paper reframed “objects” in physics as things with build histories, offering a way to quantify novelty and selection without changing the laws of physics. (Nature)
Mapping chemical space & discovery. “Assembly trees” group molecules by shared sub‑construction pathways, which can guide searches through huge chemical spaces (e.g., for drug‑like molecules). (PMC)

Debates, limitations, and current status

Where’s the life‑threshold? Early studies suggested that organic, covalently bonded molecules with experimentally measured AI ≳ 15 do not arise in detectable abundance in abiotic mixtures, motivating a “threshold for life” in those contexts. A 2024 reply emphasized this threshold was experimental and system‑specific, not a universal axiom. (PMC)
– In 2024, Hazen and colleagues calculated high AIs for inorganic mineral heteropolyanions, arguing that some abiotic structures can exceed the proposed threshold; the assembly authors countered that those were theoretical AIs for different building blocks and that thresholds must be measured for each class. (Royal Society Publishing, PMC)
Is AT new, and does it explain selection? Critical perspectives argue that AT’s indices reduce to familiar notions from compression/algorithmic complexity and do not, by themselves, identify Darwinian selection; proponents respond that AT’s novelty is the experimentally measurable link between construction history and abundance. For balanced critiques and counterpoints, see Jaeger (J. Mol. Evol. 2024) and two 2024 papers by Abrahão, Zenil and colleagues. (PMC, PLOS)

The take‑home

What AT gives you: a measurable “how hard was this to make?” score (AI) plus “how many copies exist?” (copy number). High AI + high copy number → evidence of selection/history beyond undirected dynamics. (Nature)
Where it’s useful now: analyzing complex chemical mixtures, proposing agnostic biosignatures, and organizing chemical space for discovery. (Nature, PMC)
What to keep in mind: thresholds are empirical and domain‑specific; there’s an active debate about novelty and interpretation; and measurement protocols (MS/IR/NMR) and algorithms are rapidly improving. (PMC)

Below is a practical way to “port” assembly theory into curatorial work—treating an exhibition as something built from parts, with explicit joins and reusable sub‑structures.

1) Map the concepts to curating

Building blocks → individual works, archival items, wall texts, sounds, architectural elements.
Joins (operations) → the curatorial moves that connect blocks: adjacency, sequencing, sightlines, shared motifs, material/process links, historical context, public programs, or a text that explicitly yokes pieces together.
Assembly index (AI) → the smallest number of joins needed to turn a set of works into a coherent, navigable exhibition (at room scale, show scale, or for a subsection).
Copy number → how often a motif/device recurs across the show (e.g., “repair,” “code,” “weaving,” “diaspora,” a specific sound or color discipline).
Reuse of substructures → “constellations” or micro‑clusters that can appear (or echo) in multiple rooms—like a triad that recurs with one element swapped.
Assembly space → the network of all feasible ways the show could be built from the available works and joins.

Intuition: High AI + high copy number of certain motifs/devices signals deliberate curatorial selection rather than a loose grouping; low AI gives a sparse, minimalist hang; high AI with low copy numbers produces a dense, possibly over‑novel experience.

2) A step‑by‑step framework

Step A — Define your join “grammar”

Create a short menu of allowable joins. Example categories:

Formal/material (shape, palette, medium, fabrication technique)
Process (rules, code, craft method, performance score)
Conceptual/thematic (care, borders, extraction, language)
Historical/biographical (school, place, period, collaboration)
Spatial/phenomenological (line of sight, sound bleed, light conditions)
Programmatic (talks, screenings, workshops that connect works separated in space/time)

Step B — Tag the works

For each piece, list 2–4 motifs and rate strength (1–5). You now have a work × motif matrix. Copy number of a motif = how many works carry it.
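
The tagging step can be sketched in a few lines. The example below borrows the six tagged works from the toy example in section 3; the variable and function names (`works`, `copy_numbers`, `matrix`) are illustrative assumptions, not part of any existing tool.

```python
from collections import Counter

# Work -> {motif: strength (1-5)}, taken from the toy example in section 3.
works = {
    "W1": {"repair": 5, "body": 3},
    "W2": {"repair": 4, "archive": 2},
    "W3": {"archive": 5},
    "W4": {"body": 4, "code": 3},
    "W5": {"code": 5, "repair": 2},
    "W6": {"body": 3, "archive": 3},
}

def copy_numbers(works):
    """Copy number of a motif = number of works that carry it."""
    return Counter(motif for tags in works.values() for motif in tags)

# Work x motif matrix: one row per work, one column per motif, 0 = absent.
motifs = sorted({m for tags in works.values() for m in tags})
matrix = {w: [tags.get(m, 0) for m in motifs] for w, tags in works.items()}

counts = copy_numbers(works)
print(motifs)         # ['archive', 'body', 'code', 'repair']
print(matrix["W1"])   # [0, 3, 0, 5]
print(dict(counts))
```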

Step C — Build the “assembly graph”

Nodes = works.
Draw an edge when two works can be joined by at least one allowed relation.
Optionally weight edges by join strength (e.g., the strongest shared motif).
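
One possible way to build this graph is sketched below, using three of the works from the toy example in section 3. The weighting rule is an assumption made for illustration: the strength of a shared motif is taken as the weaker of the two ratings, and the edge keeps the strongest shared motif. `assembly_graph` is a hypothetical helper name.

```python
from itertools import combinations

# Three tagged works (motif -> strength) from the toy example in section 3.
works = {
    "W1": {"repair": 5, "body": 3},
    "W2": {"repair": 4, "archive": 2},
    "W3": {"archive": 5},
}

def assembly_graph(works):
    """Return {(work_a, work_b): (motif, weight)} for every joinable pair."""
    edges = {}
    for a, b in combinations(sorted(works), 2):
        shared = works[a].keys() & works[b].keys()
        if not shared:
            continue  # no allowed relation, so no edge
        # Assumed rule: a join is only as strong as its weaker side.
        motif = max(sorted(shared), key=lambda m: min(works[a][m], works[b][m]))
        edges[(a, b)] = (motif, min(works[a][motif], works[b][motif]))
    return edges

for pair, (motif, weight) in assembly_graph(works).items():
    print(pair, motif, weight)
# ('W1', 'W2') repair 4
# ('W2', 'W3') archive 2
```

W1 and W3 share no motif, so they are connected only through W2.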

Step D — Estimate a Curatorial Assembly Index (CAI)

Two useful versions:

Structural CAI: the minimal number of distinct motifs you need to make the work‑graph connected (think: a smallest “motif basis” that ties the show together). You can approximate this with a greedy set‑cover: repeatedly pick the motif that links the largest number of as‑yet disconnected works/clusters until the whole graph connects.
Practical CAI: Structural CAI plus the count of physical/interpretive interventions required to realize those joins (e.g., partitions, long labels, media zones, program events).

$$\text{Practical CAI} \approx |\text{motif basis}| + \#(\text{walls} + \text{long texts} + \text{AV zones} + \text{program links})$$
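
The greedy approximation described for Structural CAI could look something like the following sketch. It uses the tagged works from the toy example in section 3 (strengths omitted); the helper names `components`, `greedy_motif_basis`, and `practical_cai` are hypothetical, and ties are broken alphabetically, which is an arbitrary choice.

```python
def components(works, basis):
    """Count clusters when only motifs in `basis` are allowed to join works."""
    parent = {w: w for w in works}

    def find(w):
        while parent[w] != w:
            parent[w] = parent[parent[w]]
            w = parent[w]
        return w

    for motif in basis:
        carriers = [w for w, tags in works.items() if motif in tags]
        for other in carriers[1:]:
            parent[find(other)] = find(carriers[0])
    return len({find(w) for w in works})

def greedy_motif_basis(works):
    """Repeatedly add the motif that leaves the fewest disconnected clusters."""
    motifs = sorted({m for tags in works.values() for m in tags})
    basis = []
    while components(works, basis) > 1:
        current = components(works, basis)
        candidates = [m for m in motifs if m not in basis]
        best = min(candidates, key=lambda m: components(works, basis + [m]),
                   default=None)
        if best is None or components(works, basis + [best]) == current:
            raise ValueError("works cannot be connected by the available motifs")
        basis.append(best)
    return basis

def practical_cai(basis, interventions):
    """Structural CAI plus the count of physical/interpretive interventions."""
    return len(basis) + sum(interventions.values())

works = {
    "W1": {"repair", "body"}, "W2": {"repair", "archive"}, "W3": {"archive"},
    "W4": {"body", "code"}, "W5": {"code", "repair"}, "W6": {"body", "archive"},
}
basis = greedy_motif_basis(works)
print(basis, len(basis))
print(practical_cai(basis, {"threshold texts": 1, "AV zones": 1}))  # 5
```

Greedy selection is only an approximation and depends on tie-breaking: it can return a different basis than a curator would pick by eye, even when the size is the same.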

Step E — Design constellations (reusable substructures)

Aim for 3–7 works per constellation.
Anchor each constellation in a spine motif (high copy number) and include 1–2 bridge works that carry a second motif reaching into another constellation/room.
Reuse the same constellation in another room by swapping a single work; this reduces overall CAI (you’re reusing built substructures).

Step F — Lay out the sequence

Place high‑copy‑number motifs near entries as “on‑ramps.”
Ensure each room has at least two bridges (sightline or text) to others.
Use the edge motifs (those that appear in both adjoining rooms) to title thresholds—these are literal joins.
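
Edge motifs can be read off mechanically once works are assigned to rooms. In the sketch below the tags come from the toy example in section 3, while the room assignment and the function names are purely hypothetical.

```python
# Motifs per work, from the toy example in section 3 (strengths omitted).
works = {
    "W1": {"repair", "body"}, "W2": {"repair", "archive"}, "W3": {"archive"},
    "W4": {"body", "code"}, "W5": {"code", "repair"}, "W6": {"body", "archive"},
}

# Hypothetical hang: which works sit in which room, and the visitor route.
rooms = {"A": ["W1", "W2"], "B": ["W3", "W6"], "C": ["W4", "W5"]}
route = ["A", "B", "C"]

def room_motifs(room, works):
    """All motifs carried by at least one work in the room."""
    return set().union(*(works[w] for w in room))

def threshold_motifs(rooms, route, works):
    """Motifs present on both sides of each threshold along the route."""
    return {
        (a, b): sorted(room_motifs(rooms[a], works) & room_motifs(rooms[b], works))
        for a, b in zip(route, route[1:])
    }

print(threshold_motifs(rooms, route, works))
# {('A', 'B'): ['archive', 'body'], ('B', 'C'): ['body']}
```

An empty list for a threshold would mean the two rooms share no motif, so that doorway has nothing to title and the join would have to come from a text or program instead.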

Step G — Evaluate and iterate

Use quick proxies for coherence vs. overload:

Encounter spectrum: list all interpretive “fragments” a visitor can hit (labels, audio stops, screenshots, timelines). If fragments balloon but the graph still has islands, you’re adding explanation without structural joins.
Bridge centrality: in your assembly graph, check whether a few works are over‑burdened as the only bridges; duplicate that bridge via an additional join or program.
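
A simple proxy for the bridge check is to remove each work in turn and see whether the remaining works fall apart into islands. The sketch below does exactly that; it is a stand-in for a fuller centrality measure, and the three-work chain and the name `overburdened_bridges` are hypothetical.

```python
def overburdened_bridges(works):
    """Works whose removal disconnects the rest (sole bridges)."""
    names = sorted(works)

    def is_connected(subset):
        if len(subset) < 2:
            return True
        seen, stack = {subset[0]}, [subset[0]]
        while stack:
            current = stack.pop()
            for other in subset:
                # Two works are joined if they share at least one motif.
                if other not in seen and works[current] & works[other]:
                    seen.add(other)
                    stack.append(other)
        return len(seen) == len(subset)

    return [w for w in names if not is_connected([n for n in names if n != w])]

# Hypothetical chain: X2 is the only link between X1 and X3.
chain = {"X1": {"repair"}, "X2": {"repair", "archive"}, "X3": {"archive"}}
print(overburdened_bridges(chain))     # ['X2']

# Relieve the bridge by giving X1 a second join (here, an archive link).
relieved = dict(chain, X1={"repair", "archive"})
print(overburdened_bridges(relieved))  # []
```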

3) A tiny worked example (toy numbers)

Works (W1–W6) with motifs M = {repair, archive, body, code}.

W1: repair(5), body(3)
W2: repair(4), archive(2)
W3: archive(5)
W4: body(4), code(3)
W5: code(5), repair(2)
W6: body(3), archive(3)

Copy numbers: repair=3, archive=3, body=3, code=2 (balanced—good spines).
Edges exist wherever a motif is shared (e.g., W1–W2 via repair; W2–W3 via archive; W4–W5 via code; W4–W6 via body; W1–W6 via body).

Structural CAI (motif basis): {repair, archive, body} already connects all six works; “code” is optional spice. Basis size = 3.
Practical CAI: if you need 1 threshold text + 1 AV zone for the code pair, then Practical CAI ≈ 3 + 2 = 5.
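
For a show this small, the basis size can be checked exhaustively rather than estimated. The sketch below tries every motif subset in order of size and reports the smallest ones that connect all six works; the function names are hypothetical, and the approach only scales to a handful of motifs.

```python
from itertools import combinations

works = {
    "W1": {"repair", "body"}, "W2": {"repair", "archive"}, "W3": {"archive"},
    "W4": {"body", "code"}, "W5": {"code", "repair"}, "W6": {"body", "archive"},
}

def is_connected(works, basis):
    """True if every work is reachable using only joins on motifs in `basis`."""
    basis = set(basis)
    names = sorted(works)
    seen, stack = {names[0]}, [names[0]]
    while stack:
        current = stack.pop()
        for other in names:
            if other not in seen and works[current] & works[other] & basis:
                seen.add(other)
                stack.append(other)
    return len(seen) == len(names)

def minimal_bases(works):
    """All smallest motif sets that connect the whole work-graph."""
    motifs = sorted(set().union(*works.values()))
    for size in range(1, len(motifs) + 1):
        found = [c for c in combinations(motifs, size) if is_connected(works, c)]
        if found:
            return found
    return []

bases = minimal_bases(works)
print(len(bases[0]))  # 3
print(bases)
# [('archive', 'body', 'code'), ('archive', 'body', 'repair'), ('archive', 'code', 'repair')]
print(len(bases[0]) + 2)  # Practical CAI with 1 threshold text + 1 AV zone: 5
```

No pair of motifs connects all six works, so 3 is the true minimum here. Every minimal basis contains “archive,” because W3 carries no other motif.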

Constellations:

Constellation A (repair spine): W1–W2–W5 (bridge to code via W5).
Constellation B (archive spine): W2–W3–W6 (bridge to body via W6).
Constellation C (body spine): W1–W4–W6 (bridge to code via W4).

These three can populate two rooms by reusing A with W5 swapped for W4, echoing ideas while introducing novelty.
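
The three constellations above can be recovered directly from the tags. In this sketch a constellation is taken to be all works carrying a spine motif, where a spine is assumed to mean a copy number of at least 3; that threshold and the function names are illustrative assumptions.

```python
works = {
    "W1": {"repair", "body"}, "W2": {"repair", "archive"}, "W3": {"archive"},
    "W4": {"body", "code"}, "W5": {"code", "repair"}, "W6": {"body", "archive"},
}

def constellations(works, min_copy=3):
    """Group works by motif and keep motifs that recur often enough to be spines."""
    by_motif = {}
    for work, tags in works.items():
        for motif in tags:
            by_motif.setdefault(motif, []).append(work)
    return {m: sorted(ws) for m, ws in by_motif.items() if len(ws) >= min_copy}

def bridges(works, spine, members):
    """For each member, the other motifs it carries (possible bridges outward)."""
    return {w: sorted(works[w] - {spine}) for w in members}

groups = constellations(works)
for spine in sorted(groups):
    print(spine, groups[spine], bridges(works, spine, groups[spine]))
# archive ['W2', 'W3', 'W6'] {'W2': ['repair'], 'W3': [], 'W6': ['body']}
# body ['W1', 'W4', 'W6'] {'W1': ['repair'], 'W4': ['code'], 'W6': ['archive']}
# repair ['W1', 'W2', 'W5'] {'W1': ['body'], 'W2': ['archive'], 'W5': ['code']}
```

The listing shows every candidate bridge; the constellations in the text name just one per group, which is the curatorial choice the numbers leave open.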

4) Heuristics and targets (tune to context)

Balance spines and rarities. In a 25–40 work show, aim for 3–5 spine motifs, each carried by roughly 20–50% of the works; keep a few rare motifs for surprise.
Limit intervention load. If Practical CAI grows fast while Structural CAI is flat, you’re over‑engineering; look for cleaner bridges in the selection rather than more walls/texts.
Pace complexity. Start with low‑AI clusters at entry, escalate to higher‑AI constellations mid‑show, and release toward the end.
Reuse with variation. Repeated substructures create rhythm; change one element to keep attention.

5) The “Exhibition Assembly Canvas” (copy/paste template)

A. Join grammar (max 8 items):

1) Formal/material … 2) Process … 3) Concept … 4) Historical … 5) Spatial … 6) Programmatic … 7) … 8) …

B. Work × motif table (excerpt):

| Work ID | Media/Year | Motifs (ranked) | Strengths (1–5) | Bridge Notes |
|---|---|---|---|---|
| W1 | | repair, body | 5, 3 | bridges to W6 via body |
| W2 | | repair, archive | 4, 2 | spine for repair |
| … | | | | |

C. Metrics:

Copy numbers: repair=?, archive=?, body=?, code=?
Structural CAI (motif basis size): ___
Practical CAI (basis + interventions): ___
Over‑burdened bridges (works/thresholds to relieve): ___

D. Constellations:

Constellation 1 (spine motif ___): works ___; bridges to ___ via ___
Constellation 2 …

E. Threshold titles (edge motifs): Room A→B: ___ ; B→C: ___

6) Adapting by exhibition type

Monographic: Treat series/process phases as motifs; CAI reveals how few moves you need to make a career arc legible without over‑explaining.
Thematic group show: Start from a broad motif list, then down‑select to a small basis that truly connects the selection; cut works that require bespoke joins you can’t reuse.
Collection rehang: Use high‑copy‑number material/process motifs (frame types, techniques, supports) as hidden spines to surface fresh narratives.
Media‑heavy shows: Count acoustic/light containment zones as interventions in Practical CAI early—these are costly joins.

7) Pitfalls to avoid

Metric‑worship. The CAI is a design aid, not a verdict on quality. Keep room for intuition and negative space.
Token bridges. Don’t rely on a single overworked essay or program to connect islands; structural joins should be visible in the hang.
Motif inflation. Too many low‑copy motifs raise complexity without coherence; merge or drop weak motifs.

GPT‑5 Pro is OpenAI’s advanced ChatGPT reasoning model, built to deliver precise answers and creative help while keeping its chain‑of‑thought hidden. It excels at step‑by‑step problem solving and, when enabled, can use tools such as web research and Python to provide cited, structured outputs for complex tasks. Designed for reliability and safety, it adapts tone and depth to the user’s needs, is transparent about uncertainty, and avoids unverifiable personal claims.


ChatGPT 5 (Pro)

Emily Roemer
