30. IMaGES: Independent Multiple-sample Greedy Equivalence Search
Type: Score-based (multi-dataset)
Output: CPDAG
IMaGES extends GES/FGES to the case where you have multiple datasets measured on the same set of variables, assuming that all datasets share the same causal structure even if their noise distributions differ.
Typical use cases:
fMRI data from multiple subjects performing the same task
Multi-site or multi-scanner datasets
Repeated measurements or multiple sessions
Any situation with several independent datasets measuring the same variables
IMaGES modifies only the scoring step of GES/FGES; the greedy search itself is unchanged.
30.1. Key Idea
For each candidate edge addition or removal, compute one BIC score per dataset, then average the scores:
Score_IMaGES = (1/M) * Σ_{m=1}^{M} BIC_m
This produces:
More stable structure estimates
Fewer false positives due to dataset-specific noise
The ability to use multiple small datasets without concatenating them
The greedy search (forward + backward phases) proceeds just like FGES, but uses this averaged score.
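The averaging step can be illustrated with a minimal sketch. It assumes a linear-Gaussian local score of the common form `-n * ln(residual variance) - c * k * ln(n)` (higher is better), restricted to zero or one parent so that no linear algebra library is needed. The function names `gaussian_bic`, `images_score_gain`, and `simulate` are hypothetical and are not part of Tetrad; the exact score Tetrad computes may differ in detail.

```python
import math
import random

def gaussian_bic(y, x=None, penalty_discount=1.0):
    """BIC-style local score for y with zero or one parent x (higher is better).

    Assumed form: -n * ln(residual variance) - c * k * ln(n),
    where k is the number of parents and c is the penalty multiplier.
    """
    n = len(y)
    mean_y = sum(y) / n
    var_y = sum((v - mean_y) ** 2 for v in y) / n
    if x is None:
        resid_var, k = var_y, 0
    else:
        mean_x = sum(x) / n
        var_x = sum((v - mean_x) ** 2 for v in x) / n
        cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
        # Residual variance of the least-squares regression of y on x.
        resid_var, k = var_y - cov_xy ** 2 / var_x, 1
    return -n * math.log(resid_var) - penalty_discount * k * math.log(n)

def images_score_gain(datasets, penalty_discount=1.0):
    """Average, over datasets, of the score gain from adding the edge X -> Y."""
    gains = [
        gaussian_bic(y, x, penalty_discount) - gaussian_bic(y, None, penalty_discount)
        for x, y in datasets
    ]
    return sum(gains) / len(gains)

def simulate(n, coef, noise_sd, rng):
    """Draw n samples from X -> Y with the given coefficient and noise level."""
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [coef * a + rng.gauss(0, noise_sd) for a in x]
    return x, y

rng = random.Random(0)
# Three datasets with the same structure X -> Y but different noise levels.
datasets = [simulate(200, 0.8, sd, rng) for sd in (0.5, 1.0, 2.0)]
gain = images_score_gain(datasets)
print("averaged gain for adding X -> Y:", gain)
```

A positive averaged gain means the greedy search would consider adding the edge. Each dataset contributes its own score, so a noisy dataset lowers the average without being merged into the others.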
30.2. Variants
30.2.1. IMaGES
Uses FGES as the greedy engine with the IMaGES averaged score.
Output: CPDAG.
30.2.2. IMaGES-BOSS
Uses BOSS as the search engine but applies the IMaGES score at each step.
Often more scalable and robust than FGES-based IMaGES, especially for mixed data or larger models.
30.3. When to Use
You have multiple independent datasets with identical variable sets
You believe the underlying causal structure is shared
You want a more stable CPDAG estimate than running searches separately
Particularly effective for:
fMRI studies
clinical multi-site data
repeated measurements
large cohorts with multiple sessions
30.4. Prior Knowledge Support
IMaGES fully supports background knowledge.
You may connect a Knowledge box to the IMaGES Search box in the GUI, or supply a Knowledge object programmatically (Java, py-tetrad, or rpy-tetrad). IMaGES will enforce:
Forbidden edges
Required edges
Tier/temporal ordering constraints
Any other structural constraints supported by FGES/BOSS
IMaGES performs a score-based CPDAG search (via FGES or BOSS) and aggregates scores across datasets, so knowledge constraints are applied uniformly at each scoring step. The averaged score therefore respects the user's assumptions across all datasets.
If a constraint is incompatible with any dataset in the collection, IMaGES will still enforce it globally, just as standard FGES/BOSS would.
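The global nature of these constraints can be sketched as a filter on candidate edges that looks only at variable names, never at any dataset. The class `SimpleKnowledge` and the function `allowed_candidates` below are hypothetical illustrations, not the Tetrad Knowledge API, and the tier rule shown (a later tier may not cause an earlier tier) is the usual reading of temporal ordering, stated here as an assumption.

```python
from dataclasses import dataclass, field

@dataclass
class SimpleKnowledge:
    """Hypothetical stand-in for a knowledge object; not the Tetrad class."""
    forbidden: set = field(default_factory=set)  # directed pairs (a, b): a -> b not allowed
    required: set = field(default_factory=set)   # directed pairs (a, b): a -> b must be present
    tier: dict = field(default_factory=dict)     # variable name -> tier index

    def allows(self, a, b):
        """Return True if the directed edge a -> b is permitted."""
        if (a, b) in self.forbidden:
            return False
        # A required edge b -> a rules out the reverse direction.
        if (b, a) in self.required:
            return False
        # A variable in a later tier may not cause one in an earlier tier.
        if a in self.tier and b in self.tier and self.tier[a] > self.tier[b]:
            return False
        return True

def allowed_candidates(variables, knowledge):
    """List every directed edge the search may consider under the constraints."""
    return [
        (a, b)
        for a in variables
        for b in variables
        if a != b and knowledge.allows(a, b)
    ]

knowledge = SimpleKnowledge(
    forbidden={("X", "Z")},
    required={("X", "Y")},
    tier={"X": 0, "Y": 1, "Z": 1},
)
candidates = allowed_candidates(["X", "Y", "Z"], knowledge)
print(candidates)
```

Because the filter never touches the data, the same set of permitted edges applies no matter how many datasets contribute to the averaged score.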
30.5. Strengths
Highly stable compared to single-dataset GES/FGES
Avoids incorrect concatenation of datasets
Reduces sampling noise
Parallelizable (each dataset's score can be computed independently)
Still outputs a clean, interpretable CPDAG
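To see why concatenation can be incorrect, consider a small constructed example in which X and Y are exactly uncorrelated within each of two datasets, but the second dataset has both variables shifted by a constant (for instance, a scanner or site offset). The sketch below assumes the score gain for adding X → Y takes the linear-Gaussian form `-n * ln(1 - r²) - ln(n)`; the function `edge_gain` and the data are hypothetical and chosen only to make the arithmetic transparent.

```python
import math

def edge_gain(x, y):
    """Score gain from adding X -> Y under an assumed Gaussian BIC:
    BIC(Y | X) - BIC(Y | no parents) = -n * ln(1 - r^2) - ln(n)."""
    n = len(y)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    r_squared = cov ** 2 / (var_x * var_y)
    return -n * math.log(1.0 - r_squared) - math.log(n)

# Within each site, X and Y are exactly uncorrelated.
base_x = [-1, 1] * 4
base_y = [-1, -1, 1, 1] * 2
site_a = (base_x, base_y)
# Site B has the same pattern, with both variables shifted by a constant offset.
site_b = ([v + 5 for v in base_x], [v + 5 for v in base_y])

# IMaGES-style: score each dataset separately, then average.
averaged = (edge_gain(*site_a) + edge_gain(*site_b)) / 2

# Naive alternative: concatenate the rows and score once.
pooled = edge_gain(site_a[0] + site_b[0], site_a[1] + site_b[1])

print("averaged gain:", averaged)
print("pooled gain:  ", pooled)
```

The averaged gain is negative, so no edge would be added, while the pooled gain is positive because the offset between sites shows up as a correlation between X and Y in the concatenated data.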
30.6. Limitations
Requires the assumption of a shared causal structure across datasets
Not appropriate if datasets differ meaningfully in mechanisms
Cannot by itself handle latent confounding (use GFCI/BOSS-FCI/FCIT for that)
30.7. Key Parameters in Tetrad
IMaGES exposes several parameters that control both the FGES/BOSS search step and the multi-dataset scoring behavior. All parameters appear in the Tetrad GUI and scripting interfaces using the camelCase versions shown here.
| Parameter (camelCase) | Description |
|---|---|
|  | Multiplies the BIC penalty term. Larger values → sparser graphs. |
|  | Prior over graph structures used in SEM-BIC scoring. |
|  | Choice of local scoring rule (e.g., "SEM-BIC Score Rule"). |
|  | Whether to cache covariances for efficiency across datasets. |
|  | Regularization constant applied when covariance matrices are near-singular. |
|  | Overrides sample size used in scoring (used with weighting or meta-designs). |
|  | Whether the initial forward step uses a symmetric score evaluation. |
|  | Maximum allowed degree per node (structural regularization). |
|  | Number of threads for parallel FGES/BOSS operations. |
|  | If true, assumes perfect faithfulness during adjacency search. |
|  | Lag (τ) when IMaGES is applied to time-series or lagged datasets. |
|  | Whether lagged graphs should be structurally replicated across time slices. |
|  | Size of random subsets considered during order or edge evaluation. |
|  | Which score-driven meta-algorithm to use for multi-dataset merging. |
|  | Print detailed scoring/decision information during search. |
These parameters allow IMaGES to:
share information across multiple datasets,
encourage stability and reduce noise in scoring,
scale to large multi-subject studies (e.g., neuroimaging),
and enforce structural constraints across datasets or time-slices.
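The effect of the parameter that multiplies the BIC penalty term can be shown with a few lines of arithmetic. The sketch assumes the score gain for adding a single edge has the linear-Gaussian form `-n * ln(1 - r²) - c * ln(n)`, where `c` is the penalty multiplier; the function `edge_gain` and the chosen values of `n` and `r²` are hypothetical, and Tetrad's exact formula may include further terms such as a structure prior.

```python
import math

def edge_gain(r_squared, n, penalty_discount):
    """Assumed score gain for adding one edge: fit improvement minus penalty."""
    return -n * math.log(1.0 - r_squared) - penalty_discount * math.log(n)

# A weak dependence: squared correlation 0.08 at sample size 100.
n, r_squared = 100, 0.08
gains = {c: edge_gain(r_squared, n, c) for c in (1.0, 2.0, 4.0)}

# The edge is only worth adding while the gain stays positive.
kept = [c for c, g in gains.items() if g > 0]

for c, g in gains.items():
    print(f"penalty multiplier {c}: gain {g:.2f}")
```

With these numbers the weak edge is added only at the smallest multiplier, which is the sense in which larger values yield sparser graphs.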
30.8. Reference
Ramsey, J. D., Hanson, S. J., & Glymour, C. (2011).
Multi-subject search correctly identifies causal connections and most causal directions in the DCM models of the Smith et al. simulation study.
NeuroImage, 58(3), 838–848.
30.9. Summary
IMaGES = GES/FGES/BOSS + averaged BIC scores across datasets.
A robust and stable multi-sample CPDAG algorithm for repeated-measure or multi-subject data.