22. Random Conditional Independence Test (RCIT)

22.1. Summary

The Random Conditional Independence Test (RCIT) is a nonparametric, kernel-based conditional independence test designed to scale to large datasets.

RCIT approximates kernel conditional independence testing using random Fourier features, allowing it to retain much of the flexibility of kernel methods while being far more computationally efficient than KCI.

RCIT tests whether
X ⟂ Y | S
by estimating conditional cross-covariances in a randomized feature space.


22.2. When to use

RCIT is appropriate when:

  • Relationships may be nonlinear or non-Gaussian

  • Sample sizes are large, making KCI too slow

  • You want a general-purpose CI test that scales well in causal discovery

  • You need a practical alternative to kernel CI tests in algorithms such as PC, FCI, or FCIT

RCIT is often preferred over KCI for medium to large datasets due to its substantially lower computational cost.


22.3. Assumptions

  • Data are i.i.d. samples from a joint distribution over (X, Y, S)

  • Conditional independence can be captured via kernel embeddings

  • Random feature approximations are sufficiently accurate for the chosen feature dimension

As with all CI tests, results depend on sample size and parameter choices.


22.4. Test details (conceptual)

For each conditional independence query X ⟂ Y | S, RCIT:

  1. Uses random Fourier features to approximate Gaussian kernel mappings for X, Y, and S.

  2. Projects variables into a randomized finite-dimensional feature space.

  3. Estimates the conditional cross-covariance between X and Y given S in this feature space.

  4. Uses a chi-square or Gamma approximation to obtain a p-value for the null hypothesis of conditional independence.

By avoiding full kernel matrix construction, RCIT achieves substantial speedups compared to KCI.


22.5. Parameters

Parameter (camelCase)

Description

alpha

Significance level for the test. Smaller values make the test more conservative.

rcitNumFeatures

Number of random Fourier features used in the approximation. Larger values improve accuracy but increase runtime.

rcitRegularization

Regularization parameter added to covariance estimates for numerical stability.

rcitKernelScale

Bandwidth (scale) parameter for the underlying Gaussian kernel.

rcitUseGammaApproximation

If true, uses a Gamma approximation to the null distribution for faster computation.

In practice, the defaults provide a good balance between accuracy and speed.


22.6. Strengths

  • Captures nonlinear and non-Gaussian dependencies

  • Much faster than KCI for large sample sizes

  • Scales well to high-dimensional conditioning sets

  • Well suited for large-scale causal discovery


22.7. Limitations

  • Approximate: accuracy depends on the number of random features

  • Less sensitive than KCI in very small samples

  • Still more expensive than simple parametric CI tests

RCIT trades a small amount of statistical efficiency for a large gain in computational efficiency.


22.8. Relationship to other CI tests in Tetrad

  • KCI: More exact, more computationally expensive

  • RCIT: Approximate, scalable alternative to KCI

  • Fisher Z / Partial Correlation: Fast but assumes linear-Gaussian structure

  • Basis-function tests (BF-LRT, BF-BIC): Efficient under additive-noise or basis-expansion assumptions

RCIT is a good default choice when nonlinear structure is suspected but kernel tests are too slow.


22.9. References

  • Strobl, E. V., Zhang, K., & Visweswaran, S. (2019).
    Approximate kernel-based conditional independence tests for fast nonparametric causal discovery.
    Journal of Causal Inference, 7(1).

  • Original implementation adapted from the causal-learn Python package.