14. Kernel Conditional Independence Test (KCI)

14.1. Summary

The Kernel Conditional Independence (KCI) test is a nonparametric CI test based on reproducing kernel Hilbert spaces (RKHS). It tests the null hypothesis that X is independent of Y given S by embedding the variables into RKHSs and measuring the conditional cross-covariance between X and Y given S.

14.2. When to use

Data may involve complex nonlinear relationships and non-Gaussian distributions.
You want a general-purpose nonparametric CI test, without assuming a specific parametric model.
Sample sizes are moderate (KCI is computationally more intensive than simple parametric tests).

14.3. Assumptions

Data are i.i.d. samples from a joint distribution over (X, Y, S).
The chosen kernels are characteristic (e.g., Gaussian RBF), so that the RKHS embeddings can capture arbitrary forms of dependence.
Kernel bandwidths and regularization parameters are reasonably tuned.

14.4. Test details (conceptual)

For each X ⟂ Y | S query, the KCI test:

Constructs kernel matrices for X, Y, and S using a chosen kernel (such as Gaussian RBF).
Forms an estimator of the conditional cross-covariance operator of X and Y given S in the RKHS.
Uses a trace or norm of this operator as the test statistic.
Calibrates the null distribution using asymptotic approximations, permutations, or resampling to obtain a p-value.
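
The steps above can be illustrated with a minimal NumPy sketch. It is not the implementation behind this test; it follows the general form of the statistic in Zhang et al. (2012), where kernel matrices are centered, the effect of S is removed with the regularized matrix R = ε(K_S + εI)^-1, and the statistic is the scaled trace of the product of the two residual kernel matrices. The function names, the fixed kernel `width`, and the simulated data are assumptions made for illustration, and the calibration step (the p-value) is deliberately left out.

```python
import numpy as np

def gaussian_gram(a, width=1.0):
    """Gaussian RBF Gram matrix for the rows of a (shape n x d)."""
    sq = np.sum(a**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * a @ a.T, 0.0)
    return np.exp(-d2 / (2.0 * width**2))

def center(k):
    """Double-center a Gram matrix: H K H with H = I - 11^T / n."""
    n = k.shape[0]
    h = np.eye(n) - np.ones((n, n)) / n
    return h @ k @ h

def kci_statistic(x, y, s, width=1.0, epsilon=1e-3):
    """Illustrative KCI statistic: trace(K_{(X,S)|S} K_{Y|S}) / n."""
    n = x.shape[0]
    ks = center(gaussian_gram(s, width))
    # The X kernel is built on (X, S) jointly, as in Zhang et al. (2012).
    kx = center(gaussian_gram(np.hstack([x, s]), width))
    ky = center(gaussian_gram(y, width))
    # Regularized residual-forming matrix; epsilon plays the role of kciEpsilon.
    r = epsilon * np.linalg.inv(ks + epsilon * np.eye(n))
    return np.trace((r @ kx @ r) @ (r @ ky @ r)) / n

# Hypothetical simulated data (assumption): S drives both X and Y.
rng = np.random.default_rng(0)
n = 200
s = rng.normal(size=(n, 1))
x = np.sin(s) + 0.3 * rng.normal(size=(n, 1))
y_indep = np.cos(s) + 0.3 * rng.normal(size=(n, 1))  # X independent of Y given S
y_dep = x + 0.3 * rng.normal(size=(n, 1))            # X dependent on Y given S

stat_indep = kci_statistic(x, y_indep, s)
stat_dep = kci_statistic(x, y_dep, s)
print(stat_indep, stat_dep)
```

In this sketch the statistic should come out noticeably larger for `y_dep` than for `y_indep`. Turning the statistic into a p-value requires an estimate of its null distribution, which the sketch does not attempt.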

14.5. Parameters

Parameter (camelCase)	Description
`kciUseApproximation`	Boolean. If `true`, uses the Gamma approximation algorithm from Zhang et al. (2012); if `false`, uses the exact (more computationally intensive) procedure.
`alpha`	Significance level for the test. The null hypothesis is conditional independence; smaller values make the test more conservative (fewer rejections).
`scalingFactor`	Scaling factor for the Gaussian kernel. Larger values effectively broaden the kernel; smaller values make it more localized.
`kciNumBootstraps`	Number of bootstrap samples used for the KCI null distribution (Theorem 4 and Proposition 5 in Zhang et al. 2012). Must be a positive integer.
`kciEpsilon`	Small positive epsilon used in Proposition 5 of Zhang et al. (2012) to regularize the test statistic. Typically left at its default unless you know you need to tune it.
`kernelType`	Which kernel to use: `1 = Gaussian`, `2 = Linear`, `3 = Polynomial`. Controls the feature space in which independence is tested.
`polynomialDegree`	For the polynomial kernel: the degree (order) of the polynomial feature map. Higher degree emphasizes higher-order interactions.
`polynomialConstant`	For the polynomial kernel: the additive constant term controlling the tradeoff between lower-order and higher-order terms.
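
To make the three `kernelType` options concrete, the following sketch computes each kernel matrix with NumPy. It is an illustration under stated assumptions, not this software's code: the function `kernel_matrix` is hypothetical, and its `bandwidth` argument is a stand-in for the Gaussian kernel width, since how `scalingFactor` is converted into a width is not specified here.

```python
import numpy as np

def kernel_matrix(data, kernel_type=1, bandwidth=1.0,
                  polynomial_degree=2, polynomial_constant=1.0):
    """Hypothetical helper: Gram matrix for the rows of data (n x d).

    kernel_type follows the table above: 1 = Gaussian, 2 = Linear,
    3 = Polynomial. bandwidth is an assumed stand-in for the kernel width.
    """
    inner = data @ data.T  # pairwise inner products <x_i, x_j>
    if kernel_type == 1:
        sq = np.diag(inner)
        d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * inner, 0.0)
        return np.exp(-d2 / (2.0 * bandwidth**2))
    if kernel_type == 2:
        return inner
    if kernel_type == 3:
        return (inner + polynomial_constant) ** polynomial_degree
    raise ValueError("kernel_type must be 1, 2, or 3")

data = np.array([[1.0, 2.0], [3.0, 4.0]])
linear = kernel_matrix(data, kernel_type=2)
poly = kernel_matrix(data, kernel_type=3, polynomial_degree=2,
                     polynomial_constant=1.0)
gauss = kernel_matrix(data, kernel_type=1, bandwidth=1.0)
print(linear)  # [[ 5. 11.] [11. 25.]]
print(poly)    # [[ 36. 144.] [144. 676.]]
```

A larger `bandwidth` pushes the off-diagonal Gaussian entries toward 1 (a broader kernel), while a larger `polynomial_constant` gives more weight to the lower-order terms in the expansion of the polynomial kernel.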

14.6. Strengths

Very flexible: can capture highly nonlinear and non-Gaussian patterns of dependence.
Suitable as a gold-standard nonparametric CI test for small- to moderate-sized problems.

14.7. Limitations

Computationally expensive for large sample sizes (N) or many CI tests.
Requires kernel and hyperparameter choices; performance can be sensitive to bandwidth settings.
Less scalable than basis-function or parametric tests for large-scale graph discovery.
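
A rough sense of the cost can be had from the size of the kernel matrices alone. The sketch below assumes dense N x N matrices stored in double precision (8 bytes per entry), which is an assumption about the implementation rather than a documented fact; the helper name is hypothetical. Dense matrix inversions and eigendecompositions on such matrices additionally scale on the order of N^3 operations.

```python
def kernel_matrix_megabytes(n, bytes_per_entry=8):
    """Memory for one dense n x n kernel matrix, in megabytes (10^6 bytes)."""
    return n * n * bytes_per_entry / 1e6

for n in (1_000, 10_000, 50_000):
    print(n, kernel_matrix_megabytes(n))
# 1000 8.0
# 10000 800.0
# 50000 20000.0
```

Each CI query needs several such matrices (for X, Y, and S), and a graph search may issue many queries, which is why KCI is usually reserved for moderate N.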

14.8. References

Zhang, K., Peters, J., Janzing, D., & Schölkopf, B. (2012). Kernel-based conditional independence test. Journal of Machine Learning Research, 13, 555–587.