Graph Types and Formats

Tetrad works with several types of causal graphs and provides multiple ways to save and load them. This page explains:

What the main graph types mean (DAG, CPDAG, MAG, PAG)
How Tetrad represents and exports them
How to interpret PAG edge-specialization markings

The goal is to clarify what a graph is telling you and how it is serialized when you save or share it.

1. Core Graph Types in Tetrad 

Tetrad’s search algorithms output four main graph types:

DAG – Directed Acyclic Graph
CPDAG – Completed Partially Directed Acyclic Graph
MAG – Maximal Ancestral Graph
PAG – Partial Ancestral Graph

The types differ in whether they represent a single model or an equivalence class of models, and in whether they allow for latent (hidden) variables and selection bias.

1.1 DAG — Directed Acyclic Graph 

A DAG contains only directed edges and no directed cycles.

X → Y means X is a direct cause of Y, relative to the variables in the graph.
No bidirected edges, no undirected edges, no circle endpoints.
A DAG represents one specific causal model, not an equivalence class.
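
The "no directed cycles" condition can be checked mechanically. The following is a minimal sketch in plain Python, not Tetrad's own implementation; the function name `is_dag` and the edge-list representation (pairs of parent and child names) are assumptions made for illustration.

```python
from collections import defaultdict, deque


def is_dag(nodes, directed_edges):
    """Return True if the directed edges over `nodes` contain no directed cycle.

    `directed_edges` is a list of (parent, child) pairs (hypothetical format).
    """
    children = defaultdict(list)
    indegree = {n: 0 for n in nodes}
    for parent, child in directed_edges:
        children[parent].append(child)
        indegree[child] += 1

    # Kahn's algorithm: repeatedly remove nodes that have no remaining parents.
    queue = deque(n for n in nodes if indegree[n] == 0)
    removed = 0
    while queue:
        node = queue.popleft()
        removed += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)

    # Every node gets removed exactly when there is no directed cycle.
    return removed == len(nodes)


acyclic = is_dag(["X", "Y", "Z"], [("X", "Y"), ("Y", "Z"), ("X", "Z")])
cyclic = is_dag(["X", "Y", "Z"], [("X", "Y"), ("Y", "Z"), ("Z", "X")])
```

Here `acyclic` is `True` and `cyclic` is `False`, because the second edge list contains the cycle X → Y → Z → X.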

Most Tetrad algorithms output equivalence classes, but some (e.g., LiNGAM, DAGMA) output actual DAGs.

DAGs are often used as inputs as well; you can create them by editing a Graph Box or loading them from a file.

1.2 CPDAG — Completed Partially Directed Acyclic Graph 

A CPDAG represents a Markov equivalence class of DAGs that share the same conditional independence structure.

X → Y — the orientation is invariant across all DAGs in the class
X — Y — orientation is ambiguous, meaning it could be X → Y or X ← Y
No circle endpoints (o)
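
To make "Markov equivalence class" concrete: by a standard result (Verma and Pearl), two DAGs over the same variables are Markov equivalent exactly when they have the same adjacencies and the same unshielded colliders (v-structures). The sketch below checks this in plain Python. It does not use Tetrad's API, the helper names are hypothetical, and it assumes both inputs are valid DAGs given as (parent, child) pairs over the same node set.

```python
from itertools import combinations


def skeleton(edges):
    """Adjacencies of the graph, ignoring edge direction."""
    return {frozenset(edge) for edge in edges}


def v_structures(edges):
    """Unshielded colliders a -> c <- b, where a and b are not adjacent."""
    skel = skeleton(edges)
    parents = {}
    for parent, child in edges:
        parents.setdefault(child, set()).add(parent)
    found = set()
    for child, pars in parents.items():
        for a, b in combinations(sorted(pars), 2):
            if frozenset((a, b)) not in skel:
                found.add((a, child, b))
    return found


def markov_equivalent(edges1, edges2):
    """Same skeleton and same v-structures (assumes both inputs are DAGs)."""
    return (skeleton(edges1) == skeleton(edges2)
            and v_structures(edges1) == v_structures(edges2))


chain = [("X", "Y"), ("Y", "Z")]          # X -> Y -> Z
reverse_chain = [("Z", "Y"), ("Y", "X")]  # X <- Y <- Z
collider = [("X", "Y"), ("Z", "Y")]       # X -> Y <- Z

same_class = markov_equivalent(chain, reverse_chain)
different_class = markov_equivalent(chain, collider)
```

The two chains fall in the same class, so their CPDAG leaves both edges undirected. The collider is in a class of its own, so both of its edges stay directed in its CPDAG.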

Algorithms that output CPDAGs include:

PC, PC-Max
FGES, BOSS
Several score-based hybrids

1.3 MAG — Maximal Ancestral Graph 

A MAG represents the ancestral relationships among the measured variables when latent confounders, and possibly selection bias, may be present.
Edges may be:

X → Y — directed
X ↔ Y — bidirected (latent confounder)
X — Y — undirected (selection structure)

A MAG is fully oriented (it has no circle endpoints), but a single MAG can correspond to many underlying DAGs that differ in their latent and selection variables.

1.4 PAG — Partial Ancestral Graph 

A PAG represents a Markov equivalence class of MAGs, that is, the set of MAGs that entail the same conditional independencies among the measured variables.

The edge endpoints can be:

Tail: -
Arrowhead: >
Circle: o (undetermined endpoint)

Examples:

X → Y (tail–arrowhead)
X o→ Y (circle–arrowhead)
X o–o Y (circle–circle)
X ↔ Y (arrowhead–arrowhead)
X — Y (tail–tail; selection structure)

PAGs are produced by:

FCI
RFCI
GFCI
BOSS-FCI
FCIT (targeted-testing hybrid)

2. Endpoint Marking System 

Tetrad uses the following endpoint symbols:

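Endpoint	Meaning
`-` (tail)	The node at this endpoint is an ancestor of the node at the other end (or of a selection variable) in every MAG in the class
`>` (arrowhead)	The node at this endpoint is not an ancestor of the node at the other end in any MAG in the class
`o` (circle)	Undetermined: a tail in some MAGs in the class and an arrowhead in others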

These markings summarize what is invariant across the entire equivalence class.
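
The endpoint reading can be applied mechanically to an edge written in ASCII form. The sketch below is illustrative only and is not Tetrad's parser. The tokens `-->`, `o->`, and `<->` are the ones used in the plain-text example later on this page; the remaining tokens (`<--`, `<-o`, `o-o`, `---`) are assumed ASCII spellings of the other edge types, and the function names are hypothetical.

```python
# ASCII edge token -> (endpoint at the left node, endpoint at the right node).
# Tokens other than "-->", "o->", "<->" are assumed spellings for illustration.
EDGE_TOKENS = {
    "-->": ("tail", "arrowhead"),
    "<--": ("arrowhead", "tail"),
    "o->": ("circle", "arrowhead"),
    "<-o": ("arrowhead", "circle"),
    "o-o": ("circle", "circle"),
    "<->": ("arrowhead", "arrowhead"),
    "---": ("tail", "tail"),
}

ENDPOINT_MEANING = {
    "tail": "is an ancestor of {other} (or of a selection variable) in every MAG in the class",
    "arrowhead": "is not an ancestor of {other} in any MAG in the class",
    "circle": "has an undetermined endpoint (tail in some MAGs, arrowhead in others)",
}


def parse_edge(text):
    """Split 'X o-> Y' into (left node, left endpoint, right endpoint, right node)."""
    left, token, right = text.split()
    left_end, right_end = EDGE_TOKENS[token]
    return left, left_end, right_end, right


def describe(text):
    """Return one sentence per endpoint of the edge."""
    left, left_end, right_end, right = parse_edge(text)
    return [
        f"{left} " + ENDPOINT_MEANING[left_end].format(other=right),
        f"{right} " + ENDPOINT_MEANING[right_end].format(other=left),
    ]


parsed = parse_edge("X o-> Y")
sentences = describe("X --> Y")
```

For `X o-> Y`, the parse gives a circle at X and an arrowhead at Y: Y is not an ancestor of X, while the mark at X is left open.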

3. PAG Edge-Specialization Markup (Optional GUI Feature)

When edge-specialization markup is enabled in the GUI, directed edges in a PAG may carry additional visual cues to indicate two independent properties: visibility and directness.

3.1 Two Independent Attributes 

PAG directed edges may vary along two independent dimensions:

(A) Visibility

“Visibility” is a technical notion introduced by Jiji Zhang (2008) in his paper
“On the Completeness of Orientation Rules for Causal Discovery in the Presence of Latent Confounders and Selection Bias.”

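Visibility describes whether a directed edge X → Y is guaranteed to be unconfounded: a visible edge cannot be accompanied by a latent common cause of X and Y in any causal structure compatible with the PAG. Visibility says nothing about whether the edge is direct; that is covered by the separate directness attribute.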

Solid arrow: Visible edge
- Cannot be explained away by a latent confounder
- Corresponds to an identifiable causal effect in linear SEMs
Dashed arrow: Possibly invisible
- A latent confounder may exist
- The direct effect may be non-identifiable

Visibility is central to interpreting the output of the FCI family of algorithms, because it determines which directed edges can be read as unconfounded.

(B) Directness

This describes whether the directed edge is guaranteed to be direct (parent → child) in all MAGs represented by the PAG.

Thick arrow: Definitely direct
- Must be a direct parent in every compatible MAG
Thin arrow: Possibly indirect
- Some MAGs may contain intermediate variables along the causal chain

Directness is independent of visibility.

3.2 The Four Directed-Edge Types 

Because visibility and directness are orthogonal, a PAG directed edge may appear in one of four stylized forms:

Visibility	Directness	Appearance	Meaning
Solid	Thick	solid + thick	Visible and definitely direct
Solid	Thin	solid + thin	Visible but possibly indirect
Dashed	Thick	dashed + thick	Possibly confounded but definitely direct
Dashed	Thin	dashed + thin	Possibly confounded and possibly indirect

This visualization does not change the semantics of the PAG; it only makes explicit certain implications of the orientation rules.
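
The four combinations can be written as a small lookup. This is only a sketch of the mapping described in the table; the function name and the boolean flags are hypothetical and do not correspond to any Tetrad API.

```python
def edge_style(visible: bool, definitely_direct: bool) -> tuple:
    """Map the two independent attributes to (appearance, meaning)."""
    line = "solid" if visible else "dashed"
    weight = "thick" if definitely_direct else "thin"
    confounding = "Visible" if visible else "Possibly confounded"
    directness = "definitely direct" if definitely_direct else "possibly indirect"
    # "and" when both attributes point the same way, "but" when they differ,
    # matching the wording of the table.
    joiner = "and" if visible == definitely_direct else "but"
    return f"{line} + {weight}", f"{confounding} {joiner} {directness}"


all_styles = {
    (v, d): edge_style(v, d)
    for v in (True, False)
    for d in (True, False)
}
```

Because the two flags vary independently, `all_styles` has exactly four entries, one per row of the table.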

3.3 Undirected Edges Represent Selection Bias

In a PAG, an undirected edge:

X — Y    (tail–tail)

represents a selection effect, which arises when conditioning on some variable(s) during data collection or post-processing.

A common conceptual form is:

X → S ← Y    where S is a (possibly latent) selection variable.

Conditioning on a selection variable induces an association between X and Y that is non-causal and cannot be removed by adjusting for measured variables.
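
A small simulation illustrates the effect. The sketch below uses only the Python standard library and makes illustrative assumptions: X and Y are independent standard normal variables, the selection variable is S = X + Y + noise, and a unit is kept only if S > 1. None of these choices come from Tetrad.

```python
import random


def pearson(xs, ys):
    """Sample Pearson correlation of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 20000

# X and Y are generated independently: neither causes the other.
xs = [rng.gauss(0, 1) for _ in range(n)]
ys = [rng.gauss(0, 1) for _ in range(n)]

# X -> S <- Y: keep only units whose selection variable exceeds a threshold.
selected = [
    (x, y) for x, y in zip(xs, ys)
    if x + y + rng.gauss(0, 0.5) > 1.0
]

full_corr = pearson(xs, ys)
selected_corr = pearson([x for x, _ in selected], [y for _, y in selected])
```

In the full sample the correlation is close to zero, while in the selected subsample it is clearly negative, even though X and Y have no causal connection and no common cause.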

In the Tetrad Graph Box, any node may be designated as a selection variable, and this affects orientation rules and reachability computations in FCI-style methods.

4. Saving and Loading Graphs 

Tetrad supports several graph interchange mechanisms:

Plain-text export formats
JSON-based graph formats (GUI + programmatic)
Session files containing graphs, data, and parameters
Programmatic construction from Java, Py-Tetrad, or RPy-Tetrad

4.1 Conceptual Plain-Text Format 

A minimal conceptual export looks like:

# Nodes
X, Y, (Z)

# Edges
1. X --> Y
2. Y o-> Z
3. X <-> Z

Where:

Nodes in parentheses, e.g., (Z), are latent.
Edge numbering is optional, used for clarity.
Commas or semicolons may be used to separate node names.

This is meant to be human-readable and easy to exchange.
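
As an illustration of how simple this conceptual format is to read, here is a minimal parser sketch in plain Python. It handles only the conceptual layout shown above, not necessarily the exact files Tetrad writes. The function name `parse_graph_text` is hypothetical, and the sketch assumes node names contain no spaces.

```python
import re


def parse_graph_text(text):
    """Parse the conceptual format into (nodes, latents, edges).

    Edges are returned as (left node, edge token, right node) triples.
    """
    nodes, latents, edges = [], set(), []
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("#"):
            # Section header such as "# Nodes" or "# Edges".
            section = line.lstrip("#").strip().lower()
            continue
        if section == "nodes":
            # Commas or semicolons separate node names.
            for token in re.split(r"[;,]", line):
                name = token.strip()
                if not name:
                    continue
                if name.startswith("(") and name.endswith(")"):
                    # Parentheses mark a latent node.
                    name = name[1:-1].strip()
                    latents.add(name)
                nodes.append(name)
        elif section == "edges":
            # Drop the optional "1. " style numbering.
            line = re.sub(r"^\d+\.\s*", "", line)
            left, token, right = line.split()
            edges.append((left, token, right))
    return nodes, latents, edges


example = """# Nodes
X, Y, (Z)

# Edges
1. X --> Y
2. Y o-> Z
3. X <-> Z
"""

nodes, latents, edges = parse_graph_text(example)
```

For the example above, this yields the nodes X, Y, and Z, with Z marked latent, and three edges in the order listed.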

5. Graphs and Data: Name Matching 

Graphs refer to variables by name, so:

Node names must match dataset column names exactly.
Renaming a variable in the data requires renaming the node accordingly.

To avoid issues:

Avoid spaces
Avoid unusual punctuation
Ensure all names are unique
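
These checks are easy to automate before running a search or fitting a model. The sketch below is plain Python with hypothetical helper names. The pattern used for a "safe" name (letters, digits, and underscores, not starting with a digit) is an assumed convention, not a rule enforced by Tetrad.

```python
import re

# Assumed convention for a "safe" variable name; adjust to taste.
SAFE_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def name_mismatches(node_names, column_names):
    """Return (nodes missing from the data, columns missing from the graph)."""
    nodes, columns = set(node_names), set(column_names)
    return sorted(nodes - columns), sorted(columns - nodes)


def problem_names(names):
    """Flag duplicate names and names with spaces or unusual punctuation."""
    seen, problems = set(), []
    for name in names:
        if name in seen:
            problems.append((name, "duplicate"))
        elif not SAFE_NAME.fullmatch(name):
            problems.append((name, "unusual characters"))
        seen.add(name)
    return problems


graph_nodes = ["age", "income", "blood pressure"]
data_columns = ["age", "income", "blood_pressure"]

missing_in_data, missing_in_graph = name_mismatches(graph_nodes, data_columns)
flagged = problem_names(graph_nodes)
```

Here the node `blood pressure` does not match the column `blood_pressure`, so it shows up both as a mismatch and as a name with unusual characters. Matching is exact and case-sensitive. Latent nodes have no data column by design and can be excluded before the comparison.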

6. Summary 

DAG: Directed, acyclic, no latents.
CPDAG: Equivalence class of DAGs (no circles).
MAG: Allows latent confounding & selection; fully oriented.
PAG: Equivalence class of MAGs; uses tails, circles, and arrowheads.
PAG Markup:
- Solid/dashed = visible vs. possibly confounded
- Thick/thin = direct vs. possibly indirect

These concepts provide the foundation needed to interpret and exchange causal graphs across Tetrad’s GUI and its Python/R/Java interfaces.