Grid Search (Simulation)
This page describes how to use Grid Search in Tetrad when working from a simulation rather than a fixed dataset.
In simulation-based Grid Search, Tetrad repeatedly generates data from a specified simulation model and evaluates causal discovery algorithms across multiple parameter settings. This workflow is especially useful for method comparison, sensitivity analysis, and benchmarking under controlled conditions.
When to Use Simulation-Based Grid Search
Simulation-based Grid Search is appropriate when:
You want to compare algorithms under known ground truth
You want to understand how performance changes with sample size, noise level, graph density, functional form, or latent structure
You are evaluating robustness or failure modes of algorithms
You are developing or testing new methods
Unlike data-based Grid Search, simulation-based Grid Search is stochastic: results vary with the random draws from one execution to the next unless a random seed is fixed.
Key Differences from Data-Based Grid Search
| Aspect | Data-Based | Simulation-Based |
|---|---|---|
| Data source | Fixed dataset | Generated repeatedly |
| Randomness | None | Yes (unless seeded) |
| Ground truth | Unknown | Known |
| Truth-based statistics | Hidden | Available |
| Typical use | Applied analysis | Method evaluation |
Step 1: Select a Simulation
Add a Grid Search box to the workspace.
Connect it to a Simulation box.
In the Simulation editor:
Choose a graph type (e.g., random DAG, scale-free)
Choose a simulation model (e.g., linear Gaussian, nonlinear)
Set simulation parameters (number of variables, sample size, noise level, etc.)
Only one simulation may be active at a time.
Step 2: Algorithms Tab
In the Algorithms tab:
Click Add Algorithm
Select one or more causal discovery algorithms
Choose compatible tests or scores
Optionally edit algorithm, test, or score parameters
As with data-based Grid Search, parameters may be specified as comma-separated lists, and all combinations will be explored.
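To make "all combinations will be explored" concrete, the following is a minimal Python sketch of how comma-separated value lists expand into a grid. It is an illustration only, not Tetrad's implementation; the parameter names `alpha` and `depth` and the helper functions are hypothetical.

```python
from itertools import product

def parse_values(text):
    """Split a comma-separated string such as '0.01, 0.05' into floats."""
    return [float(item) for item in text.split(",") if item.strip()]

def expand_grid(raw):
    """Return one dict per combination of the listed values."""
    names = list(raw)
    value_lists = [parse_values(raw[name]) for name in names]
    return [dict(zip(names, combo)) for combo in product(*value_lists)]

# Hypothetical parameter names, written the way one might type them into a list field.
raw_settings = {
    "alpha": "0.01, 0.05",
    "depth": "1, 2, 3",
}

grid = expand_grid(raw_settings)
print(len(grid))  # 2 values x 3 values = 6 combinations
```

The number of combinations is the product of the list lengths, so each additional swept parameter multiplies the total run time.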
Step 3: Table Columns Tab
In the Table Columns tab, select statistics and parameters to report.
Because the true graph is known in simulation mode, you may include:
Adjacency precision / recall
Arrowhead precision / recall
Structural Hamming Distance (SHD)
Other truth-based performance measures
You may also include:
Markov checking statistics
Estimated graph properties (e.g., number of edges)
Parameter values
Choose a small, interpretable set of columns to keep comparisons readable.
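As a rough illustration of what the truth-based statistics above measure, here is a small Python sketch that compares an estimated DAG with the true DAG. It assumes both graphs are given as sets of directed `(parent, child)` edges and uses one common convention for SHD (one unit per node pair that is missing, extra, or oriented differently). Tetrad's own statistics, especially for CPDAGs and PAGs, may be defined differently; the function names here are hypothetical.

```python
def skeleton(edges):
    """Unordered node pairs for a set of directed (parent, child) edges."""
    return {frozenset(edge) for edge in edges}

def adjacency_precision_recall(true_edges, est_edges):
    """Precision and recall of adjacencies, ignoring orientation."""
    true_adj, est_adj = skeleton(true_edges), skeleton(est_edges)
    correct = len(true_adj & est_adj)
    precision = correct / len(est_adj) if est_adj else float("nan")
    recall = correct / len(true_adj) if true_adj else float("nan")
    return precision, recall

def shd(true_edges, est_edges):
    """Count node pairs whose edge is missing, extra, or oriented differently."""
    true_edges, est_edges = set(true_edges), set(est_edges)

    def marks(edges, pair):
        # All directed edges in `edges` that connect this node pair.
        return {e for e in edges if frozenset(e) == pair}

    pairs = skeleton(true_edges) | skeleton(est_edges)
    return sum(1 for pair in pairs
               if marks(true_edges, pair) != marks(est_edges, pair))

# Toy graphs: the estimate reverses B -> C, misses C -> D, and adds A -> D.
true_graph = {("A", "B"), ("B", "C"), ("C", "D")}
est_graph = {("A", "B"), ("C", "B"), ("A", "D")}

precision, recall = adjacency_precision_recall(true_graph, est_graph)
print(precision, recall)           # both 2/3: two of three adjacencies match
print(shd(true_graph, est_graph))  # 3: one reversal, one missing, one extra
```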
Step 4: Comparison Tab
In the Comparison tab:
Choose a comparison graph type (e.g., DAG, CPDAG, PAG)
Select a truth graph or derived graph for evaluation
Configure utilities for truth-based statistics if sorting by utility
Choose Markov checking options if desired
Truth-based utilities are meaningful here because the ground truth is known.
Step 5: Run Counts and Randomness
Simulation-based Grid Search allows you to specify how many times each configuration is run.
Key options include:
Number of runs per configuration
Random seed (if reproducibility is desired)
Aggregation method (e.g., mean statistics across runs)
Increasing the number of runs makes the aggregated statistics more stable but increases computation time.
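The interaction between run count, seed, and aggregation can be sketched in plain Python. This is a conceptual illustration, not Tetrad code: `one_run` is a hypothetical stand-in for "simulate, search, and score", and its numbers are arbitrary.

```python
import random
import statistics

def one_run(rng, sample_size):
    """Stand-in for one simulate-search-score cycle; returns a value in [0, 1]."""
    # Assumed behaviour for illustration: noise shrinks as sample size grows.
    noise = rng.gauss(0.0, 1.0 / sample_size ** 0.5)
    return min(1.0, max(0.0, 0.8 + noise))

def run_configuration(seed, num_runs, sample_size):
    """Run one configuration num_runs times and aggregate by mean."""
    rng = random.Random(seed)  # one seeded generator makes the batch repeatable
    values = [one_run(rng, sample_size) for _ in range(num_runs)]
    mean = statistics.mean(values)
    std_err = statistics.stdev(values) / num_runs ** 0.5
    return mean, std_err

first = run_configuration(seed=42, num_runs=20, sample_size=100)
second = run_configuration(seed=42, num_runs=20, sample_size=100)
print(first == second)  # True: same seed, same draws, same aggregate
```

The standard error of the mean falls roughly with the square root of the number of runs, which is why extra runs stabilize the table but with diminishing returns.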
Running the Comparison
Click Run Comparison to begin.
For each algorithm and parameter combination, Grid Search will:
Generate data from the simulation
Run the algorithm
Compute selected statistics
Aggregate results across runs
Progress and detailed logs appear in the Verbose Output tab.
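The four steps above can be sketched as a small, self-contained Python loop. Everything here is a toy assumption made for illustration: the ground truth is a three-variable linear Gaussian model (X causes Y, Z is isolated), the "algorithm" is a simple correlation threshold rather than a real causal search, and none of the function names correspond to Tetrad's API.

```python
import random
import statistics
from itertools import combinations, product

TRUE_ADJACENCIES = {frozenset(("X", "Y"))}  # toy ground truth: X -> Y, Z isolated

def simulate(rng, sample_size):
    """Draw data from the toy linear Gaussian model."""
    data = {"X": [], "Y": [], "Z": []}
    for _ in range(sample_size):
        x = rng.gauss(0, 1)
        data["X"].append(x)
        data["Y"].append(0.5 * x + rng.gauss(0, 1))
        data["Z"].append(rng.gauss(0, 1))
    return data

def correlation(a, b):
    """Pearson correlation of two equal-length lists."""
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / (var_a * var_b) ** 0.5

def toy_search(data, threshold):
    """Stand-in 'algorithm': keep a pair when |correlation| exceeds the threshold."""
    return {frozenset((a, b)) for a, b in combinations(sorted(data), 2)
            if abs(correlation(data[a], data[b])) > threshold}

def grid_search(thresholds, sample_sizes, num_runs, seed):
    rng = random.Random(seed)
    rows = []
    for threshold, n in product(thresholds, sample_sizes):
        recalls, extras = [], []
        for _ in range(num_runs):
            data = simulate(rng, n)              # 1. generate data
            found = toy_search(data, threshold)  # 2. run the algorithm
            # 3. compute statistics against the known truth
            recalls.append(len(found & TRUE_ADJACENCIES) / len(TRUE_ADJACENCIES))
            extras.append(len(found - TRUE_ADJACENCIES))
        # 4. aggregate across runs
        rows.append({"threshold": threshold, "n": n,
                     "adj_recall": statistics.fmean(recalls),
                     "extra_adj": statistics.fmean(extras)})
    return rows

for row in grid_search([0.1, 0.3], [50, 500], num_runs=10, seed=0):
    print(row)
```

Each printed row corresponds to one parameter combination, with statistics averaged over its runs, which mirrors the shape of the results table.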
Interpreting Simulation Results
Simulation-based Grid Search is best interpreted comparatively:
Compare algorithms under identical conditions
Examine trade-offs between accuracy, complexity, robustness, and consistency
Identify regimes where methods perform well or fail
Avoid focusing on single rows; patterns across conditions are more informative.
Common Pitfalls
Sweeping too many parameters at once
Using too few simulation runs
Over-interpreting small differences
Ignoring failure cases
Simulation studies are most valuable when they reveal limitations, not just successes.
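One way to guard against over-interpreting small differences is to compare the gap between two algorithms with its run-to-run variability. The sketch below assumes per-run scores are available for both algorithms on the same simulated datasets; the scores are made-up illustrative numbers, not results from Tetrad, and the two-standard-error rule is only a rough heuristic.

```python
import statistics

def paired_difference(scores_a, scores_b):
    """Mean per-run difference (A minus B) and its standard error."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    mean_diff = statistics.mean(diffs)
    std_err = statistics.stdev(diffs) / len(diffs) ** 0.5
    return mean_diff, std_err

# Made-up per-run scores for two hypothetical algorithms on the same five datasets.
algo_a = [0.82, 0.79, 0.85, 0.80, 0.83]
algo_b = [0.81, 0.80, 0.83, 0.81, 0.82]

mean_diff, std_err = paired_difference(algo_a, algo_b)
# A difference within about two standard errors of zero is hard to
# distinguish from run-to-run noise.
print(abs(mean_diff) > 2 * std_err)  # False for these numbers
```

Here the mean difference is about 0.004 with a standard error of about 0.006, so the apparent advantage of the first algorithm is not distinguishable from noise with only five runs.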
Summary
Simulation-based Grid Search allows you to:
Evaluate causal discovery methods under controlled conditions
Use truth-based performance metrics responsibly
Understand sensitivity to modeling choices
Compare algorithms systematically
It complements data-based Grid Search by answering methodological questions rather than applied ones.
Next Steps
Compare results across multiple simulations
Vary assumptions systematically
Use insights to guide applied analyses on real data