Graph Types and Formats

Tetrad works with several types of causal graphs and provides multiple ways to save and load them. This page explains:

  • What the main graph types mean (DAG, CPDAG, MAG, PAG)

  • How Tetrad represents and exports them

  • How to interpret PAG edge-specialization markings

The goal is to clarify what a graph is telling you and how it is serialized when you save or share it.


1. Core Graph Types in Tetrad

Tetrad’s search algorithms output four main graph types:

  • DAG – Directed Acyclic Graph

  • CPDAG – Completed Partially Directed Acyclic Graph

  • MAG – Maximal Ancestral Graph

  • PAG – Partial Ancestral Graph

Each type expresses different assumptions about ancestors, descendants, hidden variables, and selection.


1.1 DAG — Directed Acyclic Graph

A DAG contains only directed edges and no directed cycles.

  • X Y means X is (possibly indirectly) a cause of Y.

  • No bidirected edges, no undirected edges, no circle endpoints.

  • A DAG represents one specific causal model, not an equivalence class.

Most Tetrad algorithms output equivalence classes, but some (e.g., LiNGAM, DAGMA) output actual DAGs.

DAGs are often used as inputs as well; you can create them by editing a Graph Box or loading them from a file.


1.2 CPDAG — Completed Partially Directed Acyclic Graph

A CPDAG represents a Markov equivalence class of DAGs that share the same conditional independence structure.

  • X Y — the orientation is invariant across all DAGs in the class

  • X Y — orientation is ambiguous, meaning it could be X Y or X Y

  • No circle endpoints (o)

Common CPDAG outputs include:

  • PC, PC-Max

  • FGES, BOSS

  • Several score-based hybrids


1.3 MAG — Maximal Ancestral Graph

A MAG encodes causal structure with latent confounding and possibly selection bias.
Edges may be:

  • X Y — directed

  • X Y — bidirected (latent confounder)

  • X Y — undirected (selection structure)

A MAG represents one fully specified causal structure involving latent and/or selection variables.


1.4 PAG — Partial Ancestral Graph

A PAG represents an equivalence class of MAGs consistent with the observed data.

The edge endpoints can be:

  • Tail: -

  • Arrowhead: >

  • Circle: o (undetermined endpoint)

Examples:

  • X Y (tail–arrowhead)

  • X o→ Y (circle–arrowhead)

  • X o–o Y (circle–circle)

  • X Y (arrowhead–arrowhead)

  • X Y (tail–tail; selection structure)

PAGs are produced by:

  • FCI

  • RFCI

  • GFCI

  • BOSS-FCI

  • FCIT (targeted-testing hybrid)


2. Endpoint Marking System

Tetrad uses the following endpoint symbols:

Endpoint

Meaning

- (tail)

This endpoint could be an ancestor

> (arrowhead)

This endpoint is not an ancestor

o (circle)

Uncertain: could be tail or arrowhead in some MAGs

These markings summarize what is invariant across the entire equivalence class.


3. PAG Edge-Specialization Markup (Optional GUI Feature)

When edge-specialization markup is enabled in the GUI, directed edges in a PAG may carry additional visual cues to indicate two independent properties: visibility and directness.

3.1 Two Independent Attributes

PAG directed edges may vary along two independent dimensions:


(A) Visibility

“Visibility” is a technical notion introduced by Jiji Zhang (2008) in his paper
“On the Completeness of Orientation Rules for Causal Discovery in the Presence of Latent Confounding.”

Visibility describes whether a directed edge must represent a direct causal influence that is not explainable by a latent confounder in any compatible MAG.

  • Solid arrow: Visible edge

    • Cannot be explained away by a latent confounder

    • Corresponds to an identifiable causal effect in linear SEMs

  • Dashed arrow: Possibly invisible

    • A latent confounder may exist

    • The direct effect may be non-identifiable

Visibility is a core component of the FCI family of algorithms.


(B) Directness

This describes whether the directed edge is guaranteed to be direct (parent → child) in all MAGs represented by the PAG.

  • Thick arrow: Definitely direct

    • Must be a direct parent in every compatible MAG

  • Thin arrow: Possibly indirect

    • Some MAGs may contain intermediate variables along the causal chain

Directness is independent of visibility.


3.2 The Four Directed-Edge Types

Because visibility and directness are orthogonal, a PAG directed edge may appear in one of four stylized forms:

Visibility

Directness

Appearance

Meaning

Solid

Thick

solid + thick

Visible and definitely direct

Solid

Thin

solid + thin

Visible but possibly indirect

Dashed

Thick

dashed + thick

Possibly confounded but definitely direct

Dashed

Thin

dashed + thin

Possibly confounded and possibly indirect

This visualization does not change the semantics of the PAG—it only makes explicit certain implications of the orientation rules.


3.3 Undirected Edges Represent Selection Bias

In a PAG, an undirected edge:

X — Y    (tail–tail)

represents a selection effect, which arises when conditioning on some variable(s) during data collection or post-processing.

A common conceptual form is:

X → S ← Y    where S is a (possibly latent) selection variable.

Conditioning on a selection variable induces an association between X and Y that is non-causal and cannot be removed by adjusting for measured variables.

In the Tetrad Graph Box, any node may be designated as a selection variable, and this affects orientation rules and reachability computations in FCI-style methods.


4. Saving and Loading Graphs

Tetrad supports several graph interchange mechanisms:

  • Plain-text export formats

  • JSON-based graph formats (GUI + programmatic)

  • Session files containing graphs, data, and parameters

  • Programmatic construction from Java, Py-Tetrad, or RPy-Tetrad

4.1 Conceptual Plain-Text Format

A minimal conceptual export looks like:

# Nodes
X, Y, (Z)

# Edges
1. X --> Y
2. Y o-> Z
3. X <-> Z

Where:

  • Nodes in parentheses, e.g., (Z), are latent.

  • Edge numbering is optional, used for clarity.

  • Commas or semicolons may be used to separate node names.

This is meant to be human-readable and easy to exchange.


5. Graphs and Data: Name Matching

Graphs refer to variables by name, so:

  • Node names must match dataset column names exactly.

  • Renaming a variable in the data requires renaming the node accordingly.

To avoid issues:

  • Avoid spaces

  • Avoid unusual punctuation

  • Ensure all names are unique


6. Summary

  • DAG: Directed, acyclic, no latents.

  • CPDAG: Equivalence class of DAGs (no circles).

  • MAG: Allows latent confounding & selection; fully oriented.

  • PAG: Equivalence class of MAGs; uses tails, circles, and arrowheads.

  • PAG Markup:

    • Solid/dashed = visible vs. possibly confounded

    • Thick/thin = direct vs. possibly indirect

These concepts provide the foundation needed to interpret and exchange causal graphs across Tetrad’s GUI and its Python/R/Java interfaces.