# Regression box The **Regression box** is used to estimate numerical relationships between variables *after* a causal model (or adjustment strategy) has been selected. Unlike search or graph-editing boxes, the Regression box does **not** modify the graph. Instead, it provides regression-based summaries that help interpret causal effects implied by the current model. The tools in this box are primarily used to: - estimate regression coefficients for selected variables, - compute total causal effects using valid adjustment sets, - sanity-check effect estimates under different graph orientations. The Regression box currently provides four main tools: - Multiple Linear Regression\ - Logistic Regression\ - Adjustment Total Effects\ - IDA Check Each is described below. ------------------------------------------------------------------------ ## Multiple Linear Regression **Purpose**\ Estimate linear relationships between a continuous outcome variable and one or more predictor variables. **Model** For outcome Y and predictors X₁,...,Xₖ, the fitted model is: Y = β₀ + β₁X₁ + ... + βₖXₖ + ε, where ε is an error term. **Typical use** - Exploratory analysis of associations. - Baseline comparison for causal effect estimates. - Regression-based effect summaries after choosing a graph. **Output** - Estimated coefficients (β values). - Standard errors and test statistics. - Model fit diagnostics (when available). **Notes** - This tool estimates *associational* regressions unless the predictors and covariates are chosen using a valid adjustment set. - Interpretation as a causal effect requires appropriate adjustment for confounding. ------------------------------------------------------------------------ ## Logistic Regression **Purpose**\ Estimate the effect of predictors on a binary outcome variable. **Model** For a binary outcome Y ∈ {0,1}, the model is: logit(P(Y=1)) = β₀ + β₁X₁ + ... + βₖXₖ. **Typical use** - Modeling binary outcomes (e.g., success/failure). - Estimating log-odds effects under adjustment. **Output** - Regression coefficients on the log-odds scale. - Standard errors and test statistics. **Notes** - As with linear regression, causal interpretation requires appropriate covariate adjustment. - Coefficients represent changes in log-odds, not probabilities. ------------------------------------------------------------------------ ## Adjustment Total Effects **Purpose**\ Estimate **total causal effects** of one or more treatment variables on an outcome variable using valid adjustment sets derived from the graph. **Conceptual approach** 1. A valid adjustment set is identified using the graph. 2. A regression model is fit with the treatment(s) and adjustment variables as predictors. 3. The coefficient(s) corresponding to the treatment variable(s) are reported as total effect estimates. **Typical use** - Estimating causal effects implied by a DAG or PAG. - Comparing effect sizes across different adjustment sets. - Handling single or multiple treatments. **Notes** - This tool relies on standard regression models (linear or logistic, depending on the outcome). - The validity of the effect estimates depends on the correctness of the adjustment set. *(See the Adjustment Total Effects detail page for full definitions and examples.)* ------------------------------------------------------------------------ ## IDA Check **Purpose**\ Evaluate the stability of effect estimates across Markov-equivalent graphs using the IDA (Intervention Calculus when the DAG is Absent) framework. **Conceptual approach** - For each DAG consistent with the estimated equivalence class: - Compute a valid adjustment set. - Estimate the causal effect using regression. - Compare the resulting effect estimates across DAGs. **Typical use** - Assess sensitivity of effect estimates to graph uncertainty. - Identify effects that are robust across equivalence classes. **Output** - A range or set of effect estimates rather than a single number. **Notes** - IDA Check does not assert that any single estimate is correct. - Instead, it highlights how much conclusions may vary given graph ambiguity. *(See the IDA Check detail page for definitions and interpretation.)* ------------------------------------------------------------------------ ## Interpretation and workflow notes - The Regression box is typically used **after** graph estimation or adjustment set selection. - Results should be interpreted in light of the assumptions encoded in the graph. - Regression estimates are only causal when the chosen covariates block all backdoor paths. ------------------------------------------------------------------------ ## Summary The Regression box provides regression-based tools for translating causal structure into numerical effect estimates. It bridges the gap between graphical causal models and quantitative interpretation, while making explicit the assumptions required for causal claims.