Tetrad Manual
  • About
    • 📚 Project Background
    • 👥 Contributors
    • 📄 Papers and Books
    • 📬 Questions or Suggestions?
  • Workflows
    • Causal Analysis Workflows
      • 🧭 What You’ll Learn
      • 📌 Why a Workflow Matters
      • 🗺️ How the Workflow Is Organized
      • 🧠 Practical Advice Before You Begin
      • 🙌 Where to Start
    • Data Exploration: Understanding Your Data Before Causal Discovery
      • 1. Load and Inspect Your Data
      • 2. Review Variable Types
      • 3. Examine Marginal Distributions with Histograms
      • 4. Explore Pairwise Relationships with the Plot Matrix
      • 5. Consider Linearity and Gaussianity (Informally)
      • 6. Reflect on Causal Sufficiency and Latent Variables
      • 7. Clarify Your Modeling Goals
      • 8. Moving Forward
      • Practical Notes
    • Algorithm Selection and Assumptions
      • What This Page Covers
      • 1. Which Assumptions Matter?
        • 1.1. Causal Sufficiency
        • 1.2. Functional Form and Distribution
        • 1.3. Modeling Goal
        • 1.4. Sample Size and Dimensionality
      • 2. Major Algorithm Families in Tetrad
        • 2.1. Constraint-Based Methods
        • 2.2. Score-Based Methods
        • 2.3. Hybrid Methods
        • 2.4 Time Series Data (Lagged Variables)
      • 3. Mapping Assumptions to Starting Choices
      • 4. Choosing Tests and Scores
        • 4.1. Independence Tests
        • 4.2. Scores
      • 5. What If You’re Unsure?
      • 6. Using Grid Search Effectively
      • 7. Summary
      • 🧭 Next Step
    • Manual Exploration: Try Searches Interactively
      • Why Use Manual Exploration?
      • When Manual Exploration Is Useful
      • Pipelines: The Interactive Workflow
      • Building a Simple Pipeline
      • Examples of Manual Exploration
        • A. Varying Test Sensitivity
        • B. Comparing Algorithms
        • C. Adding Background Knowledge
        • D. Exploring Nonlinearity or Non-Gaussianity
      • Inspecting Results
      • How Manual Exploration Leads to Grid Search
      • Tips for Effective Manual Exploration
      • Summary
      • 🧭 Next Step
    • Running Searches and Grid Search Tips
      • Why Use Grid Search?
      • From Single Runs to Systematic Search
      • Running a Basic Search
      • What to Sweep in Grid Search
        • 1. Significance Level (α) — Test-Based Methods
        • 2. Penalty or Discount — Score-Based Methods
        • 3. Algorithm Choice
        • 4. Tests and Scores
      • Interpreting Grid Search Results
        • 1. Markov Consistency
        • 2. Model Complexity
      • A Practical Starter Pattern
      • Reading Grid Search Output
      • Common Pitfalls to Avoid
        • Sweeping Too Many Parameters at Once
        • Changing Background Knowledge Too Early
        • Delaying Diagnostics
        • Not Recording What Was Tried
      • Where Grid Search Fits in the Workflow
      • 🧭 Next Step
    • Model Evaluation and Markov Checking
      • Why Model Evaluation Matters
      • What the Markov Checker Does
        • Intuition
      • Running the Markov Checker in Tetrad
      • Interpreting Markov Checker Output
        • Key Outputs
        • How to Read the Results
      • Minimal Markov-Consistent Models
      • Comparing Models from Grid Search
      • Important Caveats
        • Markov Checking Is Not a Proof
        • Test Choice Matters
        • Sampling Variability Exists
      • Beyond Markov Checking
      • Practical Tips
      • Summary
      • 🧭 Next Step
    • Interpreting Results
      • 1. What a Discovered Graph Represents
      • 2. Types of Output and Their Meaning
        • 2.1 Fully Directed Acyclic Graphs (DAGs)
        • 2.2 Completed Partially Directed Acyclic Graphs (CPDAGs)
        • 2.3 Partial Ancestral Graphs (PAGs)
      • 3. Interpreting Common Edge Marks
      • 4. Robustness and Stability
      • 5. What You Can Say (With Care)
      • 6. What You Should Avoid Saying Unqualified
      • 7. Using Background Knowledge
      • 8. Communicating Uncertainty Clearly
      • 9. Documenting Your Analysis
      • 10. Summary
      • 🧭 What’s Next
    • Example: Auto MPG Analysis with Grid Search
      • 1. The Auto MPG Dataset
        • Data Preparation
      • 2. Loading and Exploring the Data in Tetrad
        • Visual Exploration
      • 3. Algorithm Choice and Assumptions
        • Causal Sufficiency
        • Algorithm: BOSS
        • Score: Degenerate Gaussian BIC
      • 4. Setting Up the Grid Search
        • Step 1: Connect the Data
        • Step 2: Algorithms Tab
        • Step 3: Table Columns Tab
        • Step 4: Comparison Tab (Initial Setup)
        • Step 5: Set Parameter Ranges
      • 5. Running the Comparison
      • 6. Interpreting the Comparison Results
        • Choosing a Model
      • 7. Viewing the Selected Graph
      • 8. What This Example Illustrates
      • 9. Next Steps
  • Tetrad Interface
    • Overview
      • Main Window
        • Project tree
        • Work area and tabs
        • Menus and toolbar
        • Status bar, logging pane, and messages
      • Working with Data
        • Importing data
        • Viewing and editing data
        • Linking data and graphs
        • Saving and exporting data
      • Graph Editor
        • Opening and creating graphs
        • Basic editing operations
        • Layout and visualization
        • Background knowledge and tiers
        • Saving and exporting graphs
      • Running Algorithms
        • Launching a search
        • Choosing tests and scores
        • Setting parameters
        • Running and monitoring
        • Re-running with modified settings
      • Estimate model parameters
        • Basic workflow
        • Inspecting the fitted model
        • Relationship to graphs and search
        • Where to look next
      • Viewing and Exporting Results
        • Graph results
        • Tabular and numeric results
        • Exporting graphs and tables
        • Reusing results in pipelines
      • Simulation and Utilities
        • Simulating data on the workbench
        • Resampling and bootstrap workflows
        • Grid Search (overview)
        • Other utilities
    • Box by Box
      • Graph Box
        • Purpose
        • Typical workflow
        • Key controls
        • Common patterns & tips
        • Related pages
      • Compare Box
        • Purpose
        • Typical workflow
        • Types of comparisons
        • Key controls
        • Common patterns & tips
        • Related pages
      • Grid Search Box (Data)
        • Purpose of Data-Based Grid Search
        • When This Mode Is Used
        • Basic Setup
        • Algorithms Tab
        • Table Columns Tab
        • Comparison Tab
        • Interpreting Results
        • View Graphs Tab
        • Notes and Best Practices
        • Summary
      • Grid Search (Simulation)
        • When to Use Simulation-Based Grid Search
        • Key Difference from Data-Based Grid Search
        • Step 1: Select a Simulation
        • Step 2: Algorithms Tab
        • Step 3: Table Columns Tab
        • Step 4: Comparison Tab
        • Step 5: Run Counts and Randomness
        • Running the Comparison
        • Interpreting Simulation Results
        • Common Pitfalls
        • Summary
        • 🧭 Next Steps
      • Parametric Model Box
        • Purpose
        • Typical workflow
        • Key controls
        • Common patterns & tips
        • Related pages
      • Instantiated Model Box
        • Purpose
        • Typical workflow
        • Key controls
        • Common patterns & tips
        • Related pages
      • Estimator Box
        • Purpose
        • Typical workflow
        • Key controls
        • Common patterns & tips
        • Estimator types and detail pages
        • Related pages
      • Data Box
        • Purpose
        • Typical workflow
        • Key controls
        • Common patterns & tips
        • Related pages
      • Simulation Box
        • Purpose
        • Simulation setup
        • Running a simulation
        • Using simulated graphs and data in other boxes
        • Common patterns & tips
        • Related pages
      • Search Box
        • Purpose
        • Wizard workflow
        • Connecting data, knowledge, and outputs
        • Common patterns & tips
        • Related pages
      • Latent Clusters Box
        • Purpose
        • Typical workflow
        • Key controls
        • Common patterns & tips
        • Related pages
      • Latent Structure Box
        • Purpose
        • Wizard workflow
        • Connecting data, clusters, knowledge, and outputs
        • Common patterns & tips
        • Related pages
      • Knowledge Box
        • Purpose
        • Typical workflow
        • Key controls
        • Common patterns & tips
        • Related pages
      • Updater Box
        • Purpose
        • Typical workflow
        • Updater types and detail pages
        • Connecting the Updater with other boxes
        • Common patterns & tips
        • Related pages
      • Regression box
        • Multiple Linear Regression
        • Logistic Regression
        • Adjustment Total Effects
        • IDA Check
        • Interpretation and workflow notes
        • Summary
      • Note Box
        • Purpose
        • Typical workflow
        • Key controls
        • Common patterns & tips
        • Related pages
    • Data Preparation
      • Where data preparation happens in Tetrad
      • Typical data preparation workflow
      • What the rest of this section covers
    • Detail Callouts
      • Data subset / resample
        • Inputs and outputs
        • Variable selection
        • Rows and sampling
        • Typical use cases
      • Detail: Graph Menu (Graph Box)
        • Random Graph
        • Graph Properties
        • Underlinings
        • Paths
        • Highlight
        • Check Graph Type
        • Manipulate Graph
        • PAG Edge Specialization Markups
        • Summary
      • Detail: Display Subgraphs
        • Purpose
        • Basic workflow
        • Subgraph types
        • Summary
      • Detail: Markov Checker
        • Purpose
        • Basic workflow
        • Outputs
        • Interpreting results
      • Detail: Bootstrapping and Ensemble Graphs
        • What Bootstrapping Does
        • Enabling Bootstrapping
        • Running a Bootstrapped Search
        • The Edges Tab: Bootstrap Frequencies
        • Ensemble Graph Display Options
        • How to Use Bootstrapping Effectively
        • Important Caveats
        • Summary
      • Detail: Parametric & Instantiated Model Types
        • Model families
        • Interaction with Estimator and Simulation
      • Detail: Simulation types
        • Bayes net
        • Linear structural equation model
        • Linear Fisher model
        • Nonlinear additive SEM (CAM)
        • General noise SEM
        • Additive noise SEM
        • Lee and Hastie
        • Conditional Gaussian
        • Time series
        • Choosing a simulator
      • Detail: Bayes (Multinomial) Parametric Model
        • When to use Bayes models
        • Main panel layout
        • Typical workflow
        • Tips and caveats
      • Detail: Bayes (Multinomial) Instantiated Model
        • How Bayes instantiated models are created
        • Instantiated Model box layout (Bayes)
        • Typical uses
        • Tips
      • Detail: ML Bayes Estimator
        • Purpose
        • Inputs and requirements
        • How it works (conceptually)
        • Output
        • Tips and common issues
        • Related pages
      • Detail: Dirichlet Estimator
        • Purpose
        • Inputs and requirements
        • How it works (conceptually)
        • Output
        • Tips and common issues
        • Related pages
      • Detail: EM Bayes Estimator
        • Purpose
        • Inputs and requirements
        • How it works (conceptually)
        • Output
        • Tips and common issues
        • Related pages
      • Detail: SEM (Linear) Parametric Model
        • When to use SEM models
        • Main panel layout
        • Typical workflow
        • Tips and caveats
      • Detail: SEM (Linear) Instantiated Model
        • How SEM instantiated models are created
        • Instantiated Model box layout (SEM)
        • File menu options (SEM instantiated model)
      • Detail: SEM (Linear) Estimator
        • Purpose
        • Inputs and requirements
        • How it works (conceptually)
        • Output
        • File menu options (SEM Estimator)
      • Detail: Hybrid (Conditional Gaussian) Parametric Model
        • When to use Hybrid models
        • Main panel layout
        • Typical workflow
        • Tips and caveats
      • Detail: Hybrid (Conditional Gaussian) Instantiated Model
        • How Hybrid instantiated models are created
        • Instantiated Model box layout (Hybrid)
        • Typical uses
        • Tips
      • Detail: Hybrid CG Estimator
        • Purpose
        • Inputs and requirements
        • How it works (conceptually)
        • Output
        • Tips and common issues
        • Related pages
      • Detail: Generalized Parametric Model
        • When to use Generalized models
        • Main panel layout
        • Typical workflow
        • Tips and caveats
      • Detail: Generalized Instantiated Model
        • How Generalized instantiated models are created
        • Instantiated Model box layout (Generalized)
        • Typical uses
        • Tips
      • Detail: Generalized SEM Estimator
        • Purpose
        • Inputs and requirements
        • How it works (conceptually)
        • Output
        • Tips and common issues
        • Related pages
      • Detail: Junction Tree Updater
        • Purpose
        • Inputs and setup
        • How it works (conceptually)
        • Output
        • Tips
        • Related pages
      • Detail: Approximate Updater
        • Purpose
        • Inputs and setup
        • How it works (conceptually)
        • Output
        • Tips
        • Related pages
      • Detail: Row Summing Updater
        • Purpose
        • Inputs and setup
        • How it works (conceptually)
        • Output
        • Tips
        • Related pages
      • Detail: SEM Updater
        • Purpose
        • Inputs and setup
        • How it works (conceptually)
        • Output
        • Tips
        • Related pages
      • Detail: Adjustment and Total Effects: Amenability and Discrete Variables
        • What Is an Amenable Pair?
        • Amenability via Visible Edges
        • How Amenability Is Reported in the Tool
        • Discrete Variables and Regression Output
        • Amenability and Refining Equivalence Classes
        • Summary
      • Detail: IDA Check (Regression box)
        • Layout and controls
        • Table columns
        • Summary statistics (bottom)
        • Typical usage
        • Notes and references
      • Detail: N-tad Explorer
        • Basic workflow
        • Interpretation
        • Tips and notes
        • Using N-tad Explorer with SEMs
  • Python and R Bindings
    • py-tetrad (Python Binding)
    • rpy-tetrad (R Binding)
    • When to Use These Bindings
    • Related Python Ecosystem Tools
      • Relationship to Tetrad
      • Recommendation
  • Graphs and DataSets
    • Graph Types and Formats
      • 1. Core Graph Types in Tetrad
        • 1.1 DAG — Directed Acyclic Graph
        • 1.2 CPDAG — Completed Partially Directed Acyclic Graph
        • 1.3 MAG — Maximal Ancestral Graph
        • 1.4 PAG — Partial Ancestral Graph
      • 2. Endpoint Marking System
      • 3. PAG Edge-Specialization Markup (Optional GUI Feature)
        • 3.1 Two Independent Attributes
        • (A) Visibility
        • (B) Directness
        • 3.2 The Four Directed-Edge Types
        • 3.3 Undirected Edges Represent Selection Bias
      • 4. Saving and Loading Graphs
        • 4.1 Conceptual Plain-Text Format
      • 5. Graphs and Data: Name Matching
      • 6. Summary
    • Data Types and Formats
      • 1. Overview of Supported Formats
      • 2. Dataset Format (Tabular Data)
        • Notes
      • 3. Discrete Data
      • 4. Continuous Data
      • 5. Covariance and Correlation Matrices
        • 5.1 Required Structure
        • 5.2 Lower Triangle Covariance Matrix Example
        • 5.3 Full Square Covariance Matrix Example (Current Default)
        • 5.4 Correlation Matrices
        • 5.5 Common Parsing Errors for Covariance/Correlation Files
      • 6. Lower-Triangular Format
        • 6.1 Note on GUI Display
      • 7. Exporting Data from Tetrad
      • 8. Summary
  • Search Algorithms
    • Choosing an Algorithm
      • 🔍 Choosing an Algorithm
      • 🧭 Recommended Algorithms (At a Glance)
      • 🔍 DAG / CPDAG Methods (No Latent Confounders)
      • 🌀 PAG Methods (Hidden Confounders Allowed)
      • 🔧 Other Useful Algorithm Classes
      • 🎛 Choosing CI Tests & Scores (Quick Guide)
      • ⚠️ Common Pitfalls and Fixes
    • Search Algorithms — By Type
      • Legend — Algorithm Categories
        • Extra Structural Badges
      • 🔍 Constraint-Based Algorithms (CPDAG / PAG)
      • 📏 Score-Based Algorithms (CPDAG)
      • 🌀 Hybrid Algorithms (Score + FCI)
      • 🎨 Non-Gaussian, Moment-Based, and Orientation Algorithms
      • Nonlinear & Distribution-Shift Algorithms
      • 📦 Stability / Resampling / Ensemble Wrappers
      • 🧪 Specialized / Utility Algorithms
      • Latent Clustering (Measurement Block Discovery)
      • Latent Structure / Measurement-Model Construction
    • Search Algorithms — Alphabetical
      • 1. BOSS — Best Order Score Search
        • 1.1. Key idea
        • 1.2. When to use
        • 1.3. How it works (at a glance)
        • 1.4. Strengths
        • 1.5. Limitations
        • 1.6. How it relates to other Tetrad algorithms
        • 1.7. Prior knowledge support
        • 1.8. Parameters
        • 1.9. Reference
        • 1.10. Summary
      • 2. BOSS-FCI — Best-Order Score Search + FCI Refinement
        • 2.1. Key Idea
        • 2.2. When to Use
        • 2.3. Strengths
        • 2.4. Limitations
        • 2.5. How It Differs From Related Algorithms
        • 2.6. Prior Knowledge Support
        • 2.7. Key Parameters in Tetrad
        • 2.8. Reference
        • 2.9. Summary
      • 3. BPC — Build Pure Clusters
        • 3.1. Basic Assumptions
        • 3.2. High-Level Algorithm
        • 3.3. Output and Interpretation
        • 3.4. Parameters in Tetrad
        • 3.5. Strengths
        • 3.6. Limitations
        • 3.7. Reference
        • 3.8. Summary
      • 4. CAM — Causal Additive Model
        • 4.1. Key Idea
        • 4.2. When to Use CAM
        • 4.3. Prior Knowledge Support
        • 4.4. Strengths
        • 4.5. Limitations
        • 4.6. Key Parameters in Tetrad
        • 4.7. Reference
        • 4.8. Summary
      • 5. CCD — Cyclic Causal Discovery
        • 5.1. Key Idea
        • 5.2. When to Use
        • 5.3. Prior Knowledge Support
        • 5.4. Strengths
        • 5.5. Limitations
        • 5.6. Key Parameters in Tetrad
        • 5.7. Reference
        • 5.8. Summary
      • 6. CD-NOD — Causal Discovery from Nonstationary / Distribution-Shifted Data
        • 6.1. Key Idea
        • 6.2. When to Use
        • 6.3. Prior Knowledge Support
        • 6.4. Strengths
        • 6.5. Limitations
        • 6.6. Key Parameters in Tetrad / Scripting
        • 6.7. Reference
        • 6.8. Summary
      • 7. Conservative PC (CPC) — Conservative Collider Orientation
        • 7.1. Key Idea
        • 7.2. When to Use
        • 7.3. Prior Knowledge Support
        • 7.4. Strengths
        • 7.5. Limitations
        • 7.6. Key Parameters in Tetrad
        • 7.7. Reference
        • 7.8. Summary
      • 8. CStaR (Causal Stability Ranking)
        • 8.1. High-level idea
        • 8.2. Inputs
        • 8.3. Outputs
        • 8.4. Parameters
        • 8.5. When to use CStaR
        • 8.6. References
        • 8.7. Summary
      • 9. DAGMA — Learning DAGs via M-Matrices and Log-Determinant Acyclicity
        • 9.1. Key Idea
        • 9.2. When to Use
        • 9.3. Prior Knowledge Support
        • 9.4. Strengths
        • 9.5. Limitations
        • 9.6. Key Parameters in Tetrad
        • 9.7. Reference
        • 9.8. Summary
      • 10. DirectLiNGAM
        • 10.1. Key Idea
        • 10.2. When to Use
        • 10.3. Prior Knowledge Support
        • 10.4. Strengths
        • 10.5. Limitations
        • 10.6. Key Parameters in Tetrad
        • 10.7. Reference
        • 10.8. Summary
      • 11. DM (Detect–Mimic)
        • 11.1. DM-PC
        • 11.2. DM-FCIT
      • 12. Factor Analysis
        • 12.1. Purpose
        • 12.2. When to Use
        • 12.3. How It Works (Conceptual)
        • 12.4. Strengths
        • 12.5. Limitations
        • 12.6. Relation to Other Latent Tools
        • 12.7. References
        • 12.8. Summary
      • 13. FAS — Fast Adjacency Search
        • 13.1. Key Idea
        • 13.2. When to Use
        • 13.3. Case Study: High-dimensional fMRI Preprocessing
        • 13.4. Prior Knowledge Support
        • 13.5. Strengths
        • 13.6. Limitations
        • 13.7. Key Parameters in Tetrad
        • 13.8. Reference
        • 13.9. Summary
      • 14. FASK — Fast Adjacency Skewness
        • 14.1. Key Idea
        • 14.2. When to Use
        • 14.3. Prior Knowledge Support
        • 14.4. Strengths
        • 14.5. Limitations
        • 14.6. Key Parameters in Tetrad
        • 14.7. Reference
        • 14.8. Summary
      • 15. FASK-Vote — Multi-Dataset FASK Voting over IMaGES
        • 15.1. Key Idea
        • 15.2. When to Use
        • 15.3. Prior Knowledge Support
        • 15.4. Strengths
        • 15.5. Limitations
        • 15.6. ImagES Parameters
        • 15.7. FASK Parameters
        • 15.8. Reference
        • 15.9. Summary
      • 16. FCI — Fast Causal Inference
        • 16.1. Key idea
        • 16.2. When to use FCI
        • 16.3. Assumptions
        • 16.4. How it works (at a glance)
        • 16.5. How it relates to other Tetrad algorithms
        • 16.6. Strengths
        • 16.7. Limitations
        • 16.8. Prior knowledge
        • 16.9. Key parameters in Tetrad
        • 16.10. References
      • 17. FCI-IOD — FCI with Independent Overlapping Datasets
        • 17.1. Key Idea
        • 17.2. When to Use
        • 17.3. Prior Knowledge Support
        • 17.4. Strengths
        • 17.5. Limitations
        • 17.6. Key Parameters in Tetrad
        • 17.7. Reference
        • 17.8. Summary
      • 18. FCIT — FCI with Targeted Testing
        • 18.1. Key Idea
        • 18.2. When to Use
        • 18.3. Strengths
        • 18.4. Limitations
        • 18.5. How It Differs From Related Algorithms
        • 18.6. Prior Knowledge Support
        • 18.7. Key Parameters in Tetrad
        • 18.8. Reference
        • 18.9. Summary
      • 19. FGES — Fast Greedy Equivalence Search
        • 19.1. Key Idea
        • 19.2. A Nuanced View of Scalability and Sparsity
        • 19.3. When to Use FGES
        • 19.4. Prior Knowledge Support
        • 19.5. Strengths
        • 19.6. Limitations
        • 19.7. Key Parameters in Tetrad
        • 19.8. Reference
        • 19.9. Summary
      • 20. FGES-MB — FGES Markov Blanket Search
        • 20.1. Key idea
        • 20.2. When to use FgesMb
        • 20.3. Prior knowledge support
        • 20.4. Strengths
        • 20.5. Limitations
        • 20.6. Key parameters in Tetrad
        • 20.7. Reference
        • 20.8. Summary
      • 21. FOFC — Find One-Factor Clusters
        • 21.1. Key Idea
        • 21.2. When to Use
        • 21.3. Prior Knowledge Support
        • 21.4. Strengths
        • 21.5. Limitations
        • 21.6. Key Parameters in Tetrad
        • 21.7. Reference
        • 21.8. Summary
      • 22. FTFC — Find Two-Factor Clusters (Sextad-Based)
        • 22.1. Key Idea
        • 22.2. Relation to FOFC and GFFC
        • 22.3. When to Use FTFC
        • 22.4. Strengths
        • 22.5. Limitations
        • 22.6. Parameters in Tetrad
        • 22.7. Reference
        • 22.8. Summary
      • 23. GFCI — Greedy Fast Causal Inference
        • 23.1. 🔍 Key Idea
        • 23.2. 🎯 When to Use GFCI
        • 23.3. 🧠 Prior Knowledge
        • 23.4. ⭐ Strengths
        • 23.5. ⚠️ Limitations
        • 23.6. 🔧 Key Parameters (Tetrad)
        • 23.7. ⛓ Relation to Other Algorithms
        • 23.8. 📚 Reference
      • 24. GFFC — Generalized Find Factor Clusters
        • 24.1. Key Idea
        • 24.2. Algorithm Overview
        • 24.3. Why Use GFFC?
        • 24.4. Strengths
        • 24.5. Limitations
        • 24.6. Parameters in Tetrad
        • 24.7. Reference
        • 24.8. Summary
      • 25. GIN (Generalized Independent Noise)
        • 25.1. Overview
        • 25.2. Requirements
        • 25.3. Parameters
        • 25.4. How the Algorithm Works
        • 25.5. Output
        • 25.6. When to Use
        • 25.7. When Not to Use
        • 25.8. Notes
        • 25.9. References
      • 26. GRaSP — Greedy Relaxations of the Sparsest Permutation
        • 26.1. Key idea
        • 26.2. When to use
        • 26.3. How it works (at a glance)
        • 26.4. Strengths
        • 26.5. Limitations
        • 26.6. How it relates to other Tetrad algorithms
        • 26.7. Prior knowledge support
        • 26.8. Key parameters in Tetrad
        • 26.9. Reference
        • 26.10. Summary
      • 27. GRaSP-FCI — Greedy Relaxations of Sparsest Permutation + FCI Refinement
        • 27.1. Key Idea
        • 27.2. When to Use
        • 27.3. Strengths
        • 27.4. Limitations
        • 27.5. How It Differs From Related Algorithms
        • 27.6. Prior Knowledge Support
        • 27.7. Key Parameters in Tetrad
        • 27.8. Reference
        • 27.9. Summary
      • 28. ICA Lingam — ICA-Based LiNGAM
        • 28.1. Key Idea
        • 28.2. When to Use
        • 28.3. Prior Knowledge Support
        • 28.4. Strengths
        • 28.5. Limitations
        • 28.6. Key Parameters in Tetrad
        • 28.7. Reference
        • 28.8. Summary
      • 29. ICA LingD — Cyclic LiNGAM (Lacerda et al.)
        • 29.1. Key Idea
        • 29.2. When to Use
        • 29.3. Prior Knowledge Support
        • 29.4. Strengths
        • 29.5. Limitations
        • 29.6. Key Parameters in Tetrad
        • 29.7. Reference
        • 29.8. Summary
      • 30. IMaGES — Independent Multiple-sample Greedy Equivalence Search
        • 30.1. Key Idea
        • 30.2. Variants
        • 30.3. When to Use
        • 30.4. Prior Knowledge Support
        • 30.5. Strengths
        • 30.6. Limitations
        • 30.7. Key Parameters in Tetrad
        • 30.8. Reference
        • 30.9. Summary
      • 31. Latent Clusters
        • 31.1. Key Idea
        • 31.2. When to Use
        • 31.3. Prior Knowledge Support
        • 31.4. Strengths
        • 31.5. Limitations
        • 31.6. Latent Cluster Algorithms in Tetrad
        • 31.7. Relationship to Latent Structure Algorithms
        • 31.8. Summary
      • 32. LV-Heuristic — Heuristic Latent-Variable PAG from a Single DAG
        • 32.1. What LV-Heuristic Is (and Is Not)
        • 32.2. Key Idea
        • 32.3. When to Use LV-Heuristic
        • 32.4. Strengths
        • 32.5. Limitations
        • 32.6. How LV-Heuristic Differs From Other Mixed-Strategy Algorithms
        • 32.7. Prior Knowledge Support
        • 32.8. Key Parameters in Tetrad
        • 32.9. Reference
        • 32.10. Summary
      • 33. Mimbuild Bollen
        • 33.1. Purpose
        • 33.2. How It Works (Conceptual)
        • 33.3. Strengths
        • 33.4. Limitations
        • 33.5. Relation to Other Latent Tools
        • 33.6. References
        • 33.7. Summary
      • 34. Mimbuild PCA
        • 34.1. Purpose
        • 34.2. How It Works (Conceptual)
        • 34.3. Strengths
        • 34.4. Limitations
        • 34.5. Relation to Other Latent Tools
        • 34.6. References
        • 34.7. Summary
      • 35. PagSamplingRfci
        • 35.1. Key Idea
        • 35.2. When to Use
        • 35.3. Prior Knowledge Support
        • 35.4. Strengths
        • 35.5. Limitations
        • 35.6. Key Parameters in Tetrad
        • 35.7. Reference
        • 35.8. Summary
      • 36. Pairwise Orientation Methods — FaskPw & RSkew
        • 36.1. Overview
        • 36.2. FaskPw — FASK Pairwise Left–Right Orientation
        • 36.3. Key Idea
        • 36.4. When to Use
        • 36.5. Strengths
        • 36.6. Limitations
        • 36.7. Parameters in Tetrad
        • 36.8. RSkew — Robust Skewness Orientation (Hyvärinen & Smith, 2013)
        • 36.9. Key Idea (informal)
        • 36.10. When to Use
        • 36.11. Strengths
        • 36.12. Limitations
        • 36.13. Parameters in Tetrad
        • 36.14. Prior Knowledge Support
        • 36.15. Summary
      • 37. PC — Peter–Clark Algorithm
        • 37.1. Key Idea
        • 37.2. When to Use
        • 37.3. Prior Knowledge Support
        • 37.4. Strengths
        • 37.5. Limitations
        • 37.6. Key Parameters in Tetrad
        • 37.7. Historical Notes
        • 37.8. Additional Reference
        • 37.9. Summary
      • 38. PC-Max — PC with Maximum-p Collider Orientation
        • 38.1. Key Idea
        • 38.2. When to Use
        • 38.3. Relation to Standard PC
        • 38.4. Prior Knowledge Support
        • 38.5. Strengths
        • 38.6. Limitations
        • 38.7. Key Parameters in Tetrad
        • 38.8. Reference
        • 38.9. Summary
      • 39. PCD — PC for Deterministic Relations
        • 39.1. Key Idea
        • 39.2. When to Use
        • 39.3. Prior Knowledge Support
        • 39.4. Strengths
        • 39.5. Limitations
        • 39.6. Key Parameters in Tetrad
        • 39.7. Summary
      • 40. PC-MB — PC Markov Blanket Search
        • 40.1. Key Idea
        • 40.2. When to Use
        • 40.3. Prior Knowledge Support
        • 40.4. Strengths
        • 40.5. Limitations
        • 40.6. Key Parameters in Tetrad
        • 40.7. Reference
        • 40.8. Summary
      • 41. PCMCI — Time-Series Causal Discovery (Runge et al.)
        • 41.1. Key Idea
        • 41.2. When to Use
        • 41.3. Prior Knowledge Support
        • 41.4. Strengths
        • 41.5. Limitations
        • 41.6. Key Parameters in Tetrad
        • 41.7. Reference
        • 41.8. Summary
      • 42. Restricted BOSS — Target-Focused Best Order Score Search
        • 42.1. Key Idea
        • 42.2. When to Use
        • 42.3. Prior Knowledge Support
        • 42.4. Strengths
        • 42.5. Limitations
        • 42.6. Key Parameters in Tetrad
        • 42.7. Reference
        • 42.8. Summary
      • 43. RFCI — Really Fast Causal Inference
        • 43.1. Key Idea
        • 43.2. When to Use
        • 43.3. Prior Knowledge Support
        • 43.4. Strengths
        • 43.5. Limitations
        • 43.6. Key Parameters in Tetrad
        • 43.7. Reference
        • 43.8. Summary
      • 44. RFCI-BSC
        • 44.1. Key Idea
        • 44.2. When to Use
        • 44.3. Prior Knowledge Support
        • 44.4. Strengths
        • 44.5. Limitations
        • 44.6. Key Parameters in Tetrad
        • 44.7. Reference
        • 44.8. Summary
      • 45. SingleGraphAlg (Imported Graph Wrapper)
        • 45.1. What it does
        • 45.2. Typical workflow
        • 45.3. When to use (and when not to)
      • 46. SP — Sparsest Permutation
        • 46.1. Key idea
        • 46.2. When to use
        • 46.3. How it works (at a glance)
        • 46.4. Strengths
        • 46.5. Limitations
        • 46.6. How it relates to other Tetrad algorithms
        • 46.7. Prior knowledge support
        • 46.8. Reference
        • 46.9. Summary
      • 47. SP-FCI — Sparsest-Permutation FCI
        • 47.1. Key Idea
        • 47.2. When to Use
        • 47.3. Strengths
        • 47.4. Limitations
        • 47.5. Key Parameters in Tetrad
        • 47.6. Knowledge Support
        • 47.7. Relation to Other Algorithms
        • 47.8. References
        • 47.9. Summary
      • 48. StabilitySelection
        • 48.1. Key Idea
        • 48.2. When to Use
        • 48.3. Prior Knowledge Support
        • 48.4. Strengths
        • 48.5. Limitations
        • 48.6. Key Parameters in Tetrad
        • 48.7. Reference
        • 48.8. Summary
      • 49. StARS
        • 49.1. Key Idea
        • 49.2. When to Use
        • 49.3. Prior Knowledge Support
        • 49.4. Strengths
        • 49.5. Limitations
        • 49.6. Key Parameters in Tetrad
        • 49.7. Reference
        • 49.8. Summary
      • 50. TSC — Trek Separation Clusters
        • 50.1. Intended use
        • 50.2. Model assumptions (NOLAC version)
        • 50.3. High-level algorithm sketch
        • 50.4. Inputs and outputs
        • 50.5. Key parameters
        • 50.6. Practical guidance
        • 50.7. Limitations
        • 50.8. Related methods
        • 50.9. Summary
  • Tests & Scores
    • Choosing Tests & Scores
      • 1. Continuous, Approximately Gaussian Data
        • Recommended Tests
        • Recommended Scores
        • Best-Fit Algorithms
      • 2. Discrete Data (Binary / Ordinal / Categorical)
        • Recommended Tests
        • Recommended Scores
        • Best-Fit Algorithms
      • 3. Mixed Continuous/Discrete Data
        • A. Conditional Gaussian (CG)
        • B. Degenerate Gaussian (DGC)
        • C. Basis Function (BF) Tests/Scores
      • 4. Non-Gaussian Linear Models
        • Recommended Tests
        • Recommended Scores
        • Best-Fit Algorithms
      • 5. Nonlinear Models
        • A. Kernel Conditional Independence Test (KCI)
        • B. Random Conditional Independence Test (RCIT)
        • B. Basis Function Test / Score (Recommended for scalability)
      • 6. Latent Variable Workflows (Block-Based Search)
        • Block-Based Tests/Scores
        • Compatible Algorithms
        • Typical Workflow
      • Summary Table (Practical Defaults)
      • Next Steps
    • Tests and Scores: By Type
      • Independence Tests
        • Independence Tests Overview
      • Scores
        • Scores Overview
      • How Tests and Scores Are Used in Algorithms
    • Tests and Scores — Alphabetical
      • 1. Basis Function BIC Score
        • 1.1. Summary
        • 1.2. When to use
        • 1.3. Model class
        • 1.4. Score form (conceptual)
        • 1.5. Parameters
        • 1.6. Strengths
        • 1.7. Limitations
        • 1.8. References
      • 2. Basis Function Likelihood Ratio Test
        • 2.1. Summary
        • 2.2. When to use
        • 2.3. Assumptions
        • 2.4. Test details (conceptual)
        • 2.5. Parameters
        • 2.6. Strengths
        • 2.7. Limitations
        • 2.8. References
      • 3. BDeu Score
        • 3.1. Summary
        • 3.2. When to use
        • 3.3. Model class
        • 3.4. Score form (conceptual)
        • 3.5. Parameters
        • 3.6. Strengths
        • 3.7. Limitations
        • 3.8. References
      • 4. Chi-Square Test
        • 4.1. Summary
        • 4.2. When to use
        • 4.3. Assumptions
        • 4.4. Test details (conceptual)
        • 4.5. Parameters
        • 4.6. Strengths
        • 4.7. Limitations
        • 4.8. References
      • 5. Conditional Gaussian BIC Score
        • 5.1. Summary
        • 5.2. When to use
        • 5.3. Model class
        • 5.4. Score form (conceptual)
        • 5.5. Parameters
        • 5.6. Strengths
        • 5.7. Limitations
        • 5.8. References
      • 6. Conditional Gaussian Likelihood Ratio Test
        • 6.1. Summary
        • 6.2. When to use
        • 6.3. Assumptions
        • 6.4. Test details (conceptual)
        • 6.5. Parameters
        • 6.6. Strengths
        • 6.7. Limitations
        • 6.8. References
        • 6.9. References
      • 7. Degenerate Gaussian BIC Score
        • 7.1. Summary
        • 7.2. When to use
        • 7.3. Model class
        • 7.4. Score form (conceptual)
        • 7.5. Parameters
        • 7.6. Strengths
        • 7.7. Limitations
        • 7.8. References
      • 8. Degenerate Gaussian Likelihood Ratio Test
        • 8.1. Summary
        • 8.2. When to use
        • 8.3. Assumptions
        • 8.4. Test details (conceptual)
        • 8.5. Parameters
        • 8.6. Strengths
        • 8.7. Limitations
        • 8.8. References
      • 9. Discrete BIC Score
        • 9.1. Summary
        • 9.2. When to use
        • 9.3. Model class
        • 9.4. Score form (conceptual)
        • 9.5. Parameters
        • 9.6. Strengths
        • 9.7. Limitations
      • 10. Extended BIC (EBIC) Score
        • 10.1. Summary
        • 10.2. When to use
        • 10.3. Model class
        • 10.4. Score form (conceptual)
        • 10.5. Parameters
        • 10.6. Strengths
        • 10.7. Limitations
        • 10.8. References
      • 11. Fisher Z Test
        • 11.1. Summary
        • 11.2. When to use
        • 11.3. Assumptions
        • 11.4. Test details (conceptual)
        • 11.5. Parameters
        • 11.6. Strengths
        • 11.7. Limitations
        • 11.8. References
      • 12. G-Square Test
        • 12.1. Summary
        • 12.2. When to use
        • 12.3. Assumptions
        • 12.4. Test details (conceptual)
        • 12.5. Parameters
        • 12.6. Strengths
        • 12.7. Limitations
        • 12.8. References
      • 13. Generalized Information Criterion (GIC) Scores
        • 13.1. Summary
        • 13.2. When to use
        • 13.3. Model class
        • 13.4. Score form (conceptual)
        • 13.5. Parameters
        • 13.6. Strengths
        • 13.7. Limitations
        • 13.8. References
      • 14. Kernel Conditional Independence Test (KCI)
        • 14.1. Summary
        • 14.2. When to use
        • 14.3. Assumptions
        • 14.4. Test details (conceptual)
        • 14.5. Parameters
        • 14.6. Strengths
        • 14.7. Limitations
        • 14.8. References
      • 15. m-Separation Test
        • 15.1. Summary
        • 15.2. When to use
        • 15.3. Assumptions
        • 15.4. Test details (conceptual)
        • 15.5. Parameters in Tetrad
        • 15.6. Strengths
        • 15.7. Limitations
        • 15.8. References
      • 16. m-Separation Score
        • 16.1. Summary
        • 16.2. When to use
        • 16.3. Model class
        • 16.4. Score form (conceptual)
        • 16.5. Parameters in Tetrad
        • 16.6. Strengths
        • 16.7. Limitations
      • 17. MVP BIC Score
        • 17.1. Summary
        • 17.2. When to use
        • 17.3. Model class
        • 17.4. Score form (conceptual)
        • 17.5. Parameters
        • 17.6. Strengths
        • 17.7. Limitations
      • 18. Multivariate Polynomial Likelihood Ratio Test (MVPLRT)
        • 18.1. Summary
        • 18.2. When to use
        • 18.3. Assumptions
        • 18.4. Test details (conceptual)
        • 18.5. Parameters
        • 18.6. Strengths
        • 18.7. Limitations
      • 19. Poisson BIC Test
        • 19.1. Summary
        • 19.2. When to use
        • 19.3. Relation to Poisson Prior Score
        • 19.4. Test details (conceptual)
        • 19.5. Parameters
        • 19.6. Strengths
        • 19.7. Limitations
      • 20. Poisson Prior Score
        • 20.1. Summary
        • 20.2. When to use
        • 20.3. Model class
        • 20.4. Score form (conceptual)
        • 20.5. Parameters
        • 20.6. Strengths
        • 20.7. Limitations
        • 20.8. Relation to other penalties
      • 21. Probabilistic Independence Test
        • 21.1. Summary
        • 21.2. When to use
        • 21.3. Assumptions
        • 21.4. Test details (conceptual)
        • 21.5. Parameters
        • 21.6. Strengths
        • 21.7. Limitations
      • 22. Random Conditional Independence Test (RCIT)
        • 22.1. Summary
        • 22.2. When to use
        • 22.3. Assumptions
        • 22.4. Test details (conceptual)
        • 22.5. Parameters
        • 22.6. Strengths
        • 22.7. Limitations
        • 22.8. Relationship to other CI tests in Tetrad
        • 22.9. References
      • 23. SEM BIC Score
        • 23.1. Summary
        • 23.2. When to use
        • 23.3. Model class
        • 23.4. Score form (conceptual)
        • 23.5. Parameters
        • 23.6. Strengths
        • 23.7. Limitations
      • 24. SEM BIC Test
        • 24.1. Summary
        • 24.2. When to use
        • 24.3. Relation to SEM BIC Score
        • 24.4. Test details (conceptual)
        • 24.5. Strengths
        • 24.6. Limitations
      • 25. Zhang–Shen Bound Score
        • 25.1. Summary
        • 25.2. When to use
        • 25.3. Model class
        • 25.4. Score form (conceptual)
        • 25.5. Parameters
        • 25.6. Strengths
        • 25.7. Limitations
        • 25.8. References
  • Parameters
  • Contributors
    • 🌟 Founders & Early Leadership
    • 🧭 Project Direction & Architecture
    • 🔬 Algorithmic & Research Contributions
    • 🛠 Software Engineering & Infrastructure
    • 🏛 Funding Acknowledgment
  • Papers and Books
  • Change Log
Tetrad Manual
  • Search Algorithms
  • Search Algorithms — Alphabetical
  • 1. BOSS — Best Order Score Search
  • View page source

1. BOSS — Best Order Score Search

Type: Score-based CPDAG search (order-based)
Output: CPDAG (equivalence class of DAGs)
Category: 📏 Score-based

BOSS is a fast, scalable order-based score search that finds high-scoring DAGs by exploring the space of variable orderings with aggressive pruning via Grow–Shrink Trees (GSTs).


1.1. Key idea

Instead of searching directly over DAG structures, BOSS searches over variable permutations:

  • For a given ordering, the best DAG consistent with that order can be found by local score optimization over parent sets.

  • BOSS uses Grow–Shrink Trees to reuse computations across similar orders and aggressively prune the search space.

  • The result is a high-scoring DAG (and CPDAG) at a fraction of the cost of exhaustive order search.

Compared to classical score-based methods like FGES, BOSS trades equivalence-class moves for a permutation-based view with sophisticated caching and pruning.


1.2. When to use

BOSS is a good default when:

  • You want high structural accuracy in moderate to large models.

  • You want a score-based DAG/CPDAG search with strong empirical performance.

  • You are comfortable assuming that a good score (e.g., BIC/BDeu/BF-BIC/BF-LRT) will identify high-quality structures.

  • You want a method that scales well even for moderately or very dense graphs (e.g., average degree up to 20, as demonstrated in the paper).

  • You may later feed the DAG into downstream algorithms such as:

    • BOSS-FCI for latent-variable PAGs.

    • LV-Heuristic for a fast preliminary PAG.

Less suitable when:

  • The graph is extremely dense and the sample size is very small (scores become unstable, and many DAGs tie).

  • You explicitly need a latent-variable–capable PAG search; in that case consider
    FCI,
    GFCI, or
    FCIT.


1.3. How it works (at a glance)

Very roughly:

  1. Initialize an ordering
    Start from one or several candidate permutations of the variables (e.g., random or heuristic).

  2. Best DAG for a given order
    For each variable in the order, select a parent set from earlier variables that maximizes a decomposable score (e.g., BIC), subject to sparsity constraints.

  3. Grow–Shrink Trees (GSTs)

    • Organize candidate parent sets in a tree where nearby orders share nodes.

    • Prune branches that cannot improve the score (monotonicity + local bounds).

    • Reuse computations across similar orders to avoid re-scoring all parent sets.

  4. Order search with pruning
    Iteratively propose local changes in the order (e.g., swaps) and accept moves that improve the global score. GSTs make this feasible for larger ( p ).

  5. Convert to CPDAG
    The final DAG is mapped to its equivalence class (CPDAG), which is the algorithm’s reported structure.

The BOSS paper shows that GSTs give near-SP-level accuracy with far better scalability.


1.4. Strengths

  • High accuracy
    Often more accurate than purely greedy equivalence-class searches when tuned appropriately.

  • Scalable order search
    GST-based pruning and caching make order-based search practical for substantially larger graphs than naive SP-style permutation search.

  • Good foundation for latent-variable extensions
    Serves as the backbone for BOSS-FCI and as the DAG input to LV-Heuristic.

  • Parallelizable
    Scoring of different candidate parents/orders can be parallelized across threads.


1.5. Limitations

  • No explicit latent variables
    BOSS outputs a CPDAG over the observed variables; latent confounding must be handled by downstream methods (e.g., BOSS-FCI, FCIT, etc.).

  • Score and penalty sensitive
    As with other score-based methods, performance depends on:

    • Choice of score (Gaussian / discrete / mixed).

    • Penalty strength (e.g., BIC multiplier).

  • Assumes DAG structure
    Feedback/cycles are not allowed; use methods like CCD for cyclic models.


1.6. How it relates to other Tetrad algorithms

  • Versus FGES

    • FGES performs equivalence-class moves on CPDAGs.

    • BOSS searches in ordering space, with GSTs providing strong pruning.

    • Empirically, BOSS can be more accurate in many regimes, at similar or slightly higher computational cost.

  • Versus SP / GRaSP

    • SP is the “ideal” sparsest-permutation method but is combinatorial in ( p! ) and only feasible for very small ( p ). oai_citation:0‡arXiv

    • GRaSP relaxes SP into a greedy local search; BOSS uses a related order-based perspective but with GSTs and design choices that improve scalability.

  • As a building block for latent-variable PAGs

    • BOSS-FCI and FCIT use BOSS as their score-based component.


1.7. Prior knowledge support

BOSS is fully knowledge-aware in Tetrad:

  • Forbidden edges (never include).

  • Required edges (always include if consistent with a DAG).

  • Tiers / partial orders (restrict permissible permutations).

  • Grouping / intervention metadata where supported by the score.

These constraints are enforced at the ordering and parent-selection levels, pruning incompatible permutations and parent sets.


1.8. Parameters

Parameter (camelCase)

Description

useBes

Boolean. If true, use the Bounded Equivalence Search (BES) phase after greedy search to refine the final graph. BES can improve accuracy by removing locally suboptimal edges.

numStarts

Integer ≥ 1. Number of random restarts for the hill-climbing phase. More starts increase the chance of escaping local optima but increase runtime. Typical values: 1–20.

timeLag

Boolean. If true, treat the data as time-lagged for score calculations (used in longitudinal or time-indexed models).

timeLagReplicatingGraph

Boolean. If true, replicate learned graph structures across time-lagged slices. Requires timeLag = true.

numThreads

Integer ≥ 1. Number of threads for parallel evaluation of candidate moves. Larger values increase speed on multicore systems.

useDataOrder

Boolean. If true, the initial variable order is taken from the dataset columns rather than randomized.

outputCpdag

Boolean. If true, the result is returned as a completed partially directed acyclic graph (CPDAG). If false, produce the raw search graph.

seed

Integer. Random seed for reproducible starts and randomized moves.

verbose

Boolean. If true, print detailed diagnostic output during search. Useful for debugging or teaching, but slows execution.


1.9. Reference

  • Primary paper
    Andrews, B., Ramsey, J., Sanchez-Romero, R., Camchong, J., & Kummerfeld, E. (2023).
    Fast scalable and accurate discovery of DAGs using the best order score search and grow shrink trees.
    NeurIPS 36, 63945–63956.


1.10. Summary

BOSS is Tetrad’s workhorse order-based score search: it leverages Grow–Shrink Trees to explore permutations efficiently, yielding accurate DAGs/CPDAGs at meaningful scales and serving as a core building block for modern latent-variable methods like BOSS-FCI and FCIT.

Previous Next

© Copyright 2025.

Built with Sphinx using a theme provided by Read the Docs.