# Grid Search (Simulation)

This page describes how to use **Grid Search** in Tetrad when working from a **simulation** rather than a fixed dataset.

In simulation-based Grid Search, Tetrad repeatedly generates data from a specified simulation model and evaluates causal discovery algorithms across multiple parameter settings.

This workflow is especially useful for **method comparison**, **sensitivity analysis**, and **benchmarking** under controlled conditions.

---

## When to Use Simulation-Based Grid Search

Simulation-based Grid Search is appropriate when:

- You want to compare algorithms under **known ground truth**
- You want to understand how performance changes with:
  - sample size,
  - noise level,
  - graph density,
  - functional form,
  - or latent structure
- You are evaluating **robustness or failure modes** of algorithms
- You are developing or testing new methods

Unlike data-based Grid Search, simulation-based Grid Search is **stochastic**: results depend on random draws unless otherwise fixed.

---

## Key Difference from Data-Based Grid Search

| Aspect | Data-Based | Simulation-Based |
|------|-----------|------------------|
| Data source | Fixed dataset | Generated repeatedly |
| Randomness | None | Yes (unless seeded) |
| Ground truth | Unknown | Known |
| Truth-based statistics | Hidden | Available |
| Typical use | Applied analysis | Method evaluation |

---

## Step 1: Select a Simulation

1. Add a **Grid Search** box to the workspace.
2. Connect it to a **Simulation** box.
3. In the **Simulation** editor:
   - Choose a **graph type** (e.g., random DAG, scale-free)
   - Choose a **simulation model** (e.g., linear Gaussian, nonlinear)
   - Set simulation parameters (number of variables, sample size, noise level, etc.)

Only **one simulation** may be active at a time.

---

## Step 2: Algorithms Tab

In the **Algorithms** tab:

1. Click **Add Algorithm**
2. Select one or more causal discovery algorithms
3. Choose compatible tests or scores
4. Optionally edit algorithm, test, or score parameters

As with data-based Grid Search, parameters may be specified as **comma-separated lists**, and all combinations will be explored.

---

## Step 3: Table Columns Tab

In the **Table Columns** tab, select statistics and parameters to report.

Because the true graph is known in simulation mode, you may include:

- Adjacency precision / recall
- Arrowhead precision / recall
- Structural Hamming Distance (SHD)
- Other truth-based performance measures

You may also include:

- Markov checking statistics
- Estimated graph properties (e.g., number of edges)
- Parameter values

Choose a **small, interpretable set** of columns to keep comparisons readable.

---

## Step 4: Comparison Tab

In the **Comparison** tab:

- Choose a **comparison graph type** (e.g., DAG, CPDAG, PAG)
- Select a **truth graph** or derived graph for evaluation
- Configure utilities for truth-based statistics if sorting by utility
- Choose Markov checking options if desired

Truth-based utilities are meaningful here because the ground truth is known.

---

## Step 5: Run Counts and Randomness

Simulation-based Grid Search allows you to specify how many times each configuration is run.

Key options include:

- **Number of runs per configuration**
- **Random seed** (if reproducibility is desired)
- **Aggregation method** (e.g., mean statistics across runs)

Increasing the number of runs improves stability but increases computation time.

---

## Running the Comparison

Click **Run Comparison** to begin.

For each algorithm and parameter combination, Grid Search will:

1. Generate data from the simulation
2. Run the algorithm
3. Compute selected statistics
4. Aggregate results across runs

Progress and detailed logs appear in the **Verbose Output** tab.

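
The generate, run, score, aggregate loop described above can be illustrated with a small, self-contained Python sketch. This is not Tetrad's implementation or API. The toy chain simulation, the correlation-threshold "algorithm", and every function and parameter name below are hypothetical stand-ins chosen only to show how parameter lists expand into all combinations, how a fixed seed makes runs reproducible, and how adjacency precision and recall are averaged across runs.

```python
import itertools
import random
import statistics


def simulate_chain(n_vars, sample_size, noise_sd, rng):
    """Toy linear Gaussian chain X0 -> X1 -> ... (hypothetical simulation)."""
    cols = [[rng.gauss(0.0, 1.0) for _ in range(sample_size)]]
    for j in range(1, n_vars):
        cols.append([0.8 * x + rng.gauss(0.0, noise_sd) for x in cols[j - 1]])
    truth = {(j, j + 1) for j in range(n_vars - 1)}  # true adjacencies
    return cols, truth


def pearson(x, y):
    """Sample Pearson correlation of two equal-length lists."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def threshold_search(cols, threshold):
    """Stand-in 'algorithm': link pairs whose |correlation| exceeds threshold."""
    pairs = itertools.combinations(range(len(cols)), 2)
    return {(i, j) for i, j in pairs if abs(pearson(cols[i], cols[j])) > threshold}


def adjacency_precision_recall(est, truth):
    """Truth-based statistics; empty sets score 1.0 by this sketch's convention."""
    tp = len(est & truth)
    precision = tp / len(est) if est else 1.0
    recall = tp / len(truth) if truth else 1.0
    return precision, recall


def grid_search(sample_sizes, thresholds, runs, seed):
    """Explore all parameter combinations, averaging statistics over runs."""
    rng = random.Random(seed)  # fixed seed -> reproducible results
    rows = []
    for n, t in itertools.product(sample_sizes, thresholds):
        precisions, recalls = [], []
        for _ in range(runs):
            cols, truth = simulate_chain(5, n, 1.0, rng)  # 1. generate data
            est = threshold_search(cols, t)               # 2. run the algorithm
            p, r = adjacency_precision_recall(est, truth) # 3. compute statistics
            precisions.append(p)
            recalls.append(r)
        rows.append({                                     # 4. aggregate across runs
            "sample_size": n,
            "threshold": t,
            "AP": statistics.fmean(precisions),
            "AR": statistics.fmean(recalls),
        })
    return rows


rows = grid_search([100, 500], [0.3, 0.6], runs=5, seed=42)
for row in rows:
    print(row)
```

Two sample sizes and two thresholds yield four rows, one per combination, which mirrors how comma-separated parameter lists multiply the number of configurations. Calling `grid_search` again with the same seed reproduces the same table, while changing the seed gives a different set of random draws.
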

---

## Interpreting Simulation Results

Simulation-based Grid Search is best interpreted comparatively:

- Compare algorithms under identical conditions
- Examine trade-offs between:
  - accuracy,
  - complexity,
  - robustness,
  - and consistency
- Identify regimes where methods perform well or fail

Avoid focusing on single rows; patterns across conditions are more informative.

---

## Common Pitfalls

- Sweeping too many parameters at once
- Using too few simulation runs
- Over-interpreting small differences
- Ignoring failure cases

Simulation studies are most valuable when they reveal **limitations**, not just successes.

---

## Summary

Simulation-based Grid Search allows you to:

- Evaluate causal discovery methods under controlled conditions
- Use truth-based performance metrics responsibly
- Understand sensitivity to modeling choices
- Compare algorithms systematically

It complements data-based Grid Search by answering *methodological* questions rather than applied ones.

---

## 🧭 Next Steps

- Compare results across multiple simulations
- Vary assumptions systematically
- Use insights to guide applied analyses on real data
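
When comparing results across runs or simulations, one rough guard against over-interpreting small differences is to look at run-to-run variability alongside the mean. The sketch below is an assumption-laden illustration in plain Python, not a Tetrad feature: the per-run recall values are made up for demonstration, and `summarize` and `clearly_different` are hypothetical helper names. It treats two configurations as distinguishable only when their means differ by more than about two combined standard errors.

```python
import statistics


def summarize(values):
    """Return (mean, standard error of the mean) for per-run statistics."""
    mean = statistics.fmean(values)
    se = statistics.stdev(values) / len(values) ** 0.5
    return mean, se


def clearly_different(a, b, k=2.0):
    """Rough screen: do the means differ by more than k combined standard errors?"""
    mean_a, se_a = summarize(a)
    mean_b, se_b = summarize(b)
    return abs(mean_a - mean_b) > k * (se_a ** 2 + se_b ** 2) ** 0.5


# Hypothetical per-run adjacency recall values (illustrative only, not real results)
alg_a = [0.80, 0.85, 0.75, 0.90, 0.80]
alg_b = [0.82, 0.78, 0.88, 0.79, 0.83]
alg_c = [0.50, 0.55, 0.45, 0.52, 0.48]

print(clearly_different(alg_a, alg_b))  # similar means, overlapping spread
print(clearly_different(alg_a, alg_c))  # large gap relative to the spread
```

This is a heuristic screen rather than a formal test. With only a handful of runs the standard error itself is noisy, which is one more reason to increase the number of runs before drawing conclusions.
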