# IMaGES — Independent Multiple-sample Greedy Equivalence Search **Type:** Score-based (multi-dataset) **Output:** CPDAG IMaGES extends GES/FGES to the case where you have **multiple datasets** measured on the *same set of variables*, assuming that **all datasets share the same causal structure** even if their noise distributions differ. Typical use cases: - fMRI data from multiple subjects performing the same task - Multi-site or multi-scanner datasets - Repeated measurements or multiple sessions - Any situation with several independent datasets measuring the same variables IMaGES modifies the scoring step of GES/FGES: ## Key Idea For each candidate edge addition or removal, compute **one BIC score per dataset**, then **average the scores**: Score_IMaGES = (1/M) * Σ_{m=1}^M BIC_m This produces: - More stable structure estimates - Fewer false positives due to dataset-specific noise - The ability to use multiple small datasets without concatenating them The greedy search (forward + backward phases) proceeds just like FGES, but uses this averaged score. --- ## Variants ### IMaGES Uses **FGES** as the greedy engine with the IMaGES averaged score. Output: CPDAG. ### IMaGES-BOSS Uses **BOSS** as the search engine but applies the IMaGES score at each step. Often more scalable and robust than FGES-based IMaGES, especially for mixed data or larger models. --- ## When to Use - You have **multiple independent datasets** with identical variable sets - You believe the underlying **causal structure is shared** - You want a more **stable CPDAG estimate** than running searches separately - Particularly effective for: - fMRI studies - clinical multi-site data - repeated measurements - large cohorts with multiple sessions --- ## Prior Knowledge Support **IMaGES fully supports background knowledge.** You may connect a **Knowledge** box to the IMaGES Search box in the GUI, or supply a `Knowledge` object programmatically (Java, py-tetrad, or rpy-tetrad). IMaGES will enforce: - **Forbidden edges** - **Required edges** - **Tier/temporal ordering constraints** - Any other structural constraints supported by FGES/BOSS Because IMaGES performs a **score-based CPDAG search** (via FGES or BOSS) but aggregates scores across multiple datasets, **all knowledge constraints are applied uniformly to each scoring step**, ensuring that the pooled/averaged score respects the user’s assumptions across all datasets. If a constraint is incompatible with any dataset in the collection, IMaGES will still enforce it globally, just as standard FGES/BOSS would. --- ## Strengths - Highly stable compared to single-dataset GES/FGES - Avoids incorrect concatenation of datasets - Reduces sampling noise - Parallelizable (each dataset’s score can be computed independently) - Still outputs a clean, interpretable **CPDAG** --- ## Limitations - Requires the assumption of a **shared causal structure** across datasets - Not appropriate if datasets differ meaningfully in mechanisms - Cannot by itself handle latent confounding (use GFCI/BOSS-FCI/FCIT for that) --- ## Key Parameters in Tetrad IMaGES exposes several parameters that control both the FGES/BOSS search step and the multi-dataset scoring behavior. All parameters appear in the Tetrad GUI and scripting interfaces using the **camelCase** versions shown here. | Parameter (camelCase) | Description | |------------------------|-------------| | `penaltyDiscount` | Multiplies the BIC penalty term. Larger values → sparser graphs. | | `semBicStructurePrior` | Prior over graph structures used in SEM-BIC scoring. | | `semBicRule` | Choice of local scoring rule (e.g., “SEM-BIC Score Rule”). | | `precomputeCovariances` | Whether to cache covariances for efficiency across datasets. | | `singularityLambda` | Regularization constant applied when covariance matrices are near-singular. | | `effectiveSampleSize` | Overrides sample size used in scoring (used with weighting or meta-designs). | | `symmetricFirstStep` | Whether the initial forward step uses a symmetric score evaluation. | | `maxDegree` | Maximum allowed degree per node (structural regularization). | | `numThreads` | Number of threads for parallel FGES/BOSS operations. | | `faithfulnessAssumed` | If true, assumes perfect faithfulness during adjacency search. | | `timeLag` | Lag (τ) when IMaGES is applied to time-series or lagged datasets. | | `timeLagReplicatingGraph` | Whether lagged graphs should be structurally replicated across time slices. | | `randomSelectionSize` | Size of random subsets considered during order or edge evaluation. | | `imagesMetaAlg` | Which score-driven meta-algorithm to use for multi-dataset merging. | | `verbose` | Print detailed scoring/decision information during search. | These parameters allow IMaGES to: - share information across multiple datasets, - encourage stability and reduce noise in scoring, - scale to large multi-subject studies (e.g., neuroimaging), - and enforce structural constraints across datasets or time-slices. --- ## Reference Ramsey, J. D., Hanson, S. J., & Glymour, C. (2011). **Multi-subject search correctly identifies causal connections and most causal directions in the DCM models of the Smith et al. simulation study.** *NeuroImage*, 58(3), 838–848. --- ## Summary **IMaGES = GES/FGES/BOSS + averaged BIC scores across datasets.** A robust and stable multi-sample CPDAG algorithm for repeated-measure or multi-subject data.