Data Types and Formats
Tetrad accepts several file formats for loading datasets, covariance matrices,
and correlation matrices. This page documents all currently supported formats.
The goal is to eliminate common parsing errors and clarify exactly what Tetrad expects.
1. Overview of Supported Formats
Tetrad supports:
Tabular data files (continuous, discrete, or mixed)
Covariance matrices (full square or legacy lower-triangular)
Correlation matrices (full square or legacy lower-triangular)
This page covers these data formats.
2. Dataset Format (Tabular Data)
Tabular datasets must follow this structure:
Header row: variable names
One data row per case
Values separated by tabs, commas, or spaces
Example:
X Y Z
1.2 5.0 3
0.9 4.8 3
1.4 5.1 2
Notes
Missing values are written as * by default, though other missing-value markers are also supported.
Mixed data (continuous + discrete) is allowed.
Variable order in the header determines column order throughout the file.
Do not end a line with a delimiter; a trailing delimiter can cause parsing errors.
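The rules above can be sketched as a small standard-library parser. This is an illustration only, not Tetrad's loader: the function name parse_tabular is hypothetical, and the delimiter guess (tab or comma if present in the header, otherwise whitespace) is an assumption made for the example.

```python
def parse_tabular(text, missing="*"):
    """Parse a small tabular dataset into (names, rows).

    Assumption: the delimiter is a tab or comma if one appears in the
    header line, otherwise runs of whitespace. Missing cells become None.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0]
    if "\t" in header:
        split = lambda ln: ln.split("\t")
    elif "," in header:
        split = lambda ln: ln.split(",")
    else:
        split = lambda ln: ln.split()

    names = [c.strip() for c in split(header)]
    rows = []
    for lineno, ln in enumerate(lines[1:], start=2):
        cells = [c.strip() for c in split(ln)]
        # A trailing tab or comma yields an extra empty cell and is caught here.
        if len(cells) != len(names):
            raise ValueError(
                f"line {lineno}: expected {len(names)} values, got {len(cells)}"
            )
        rows.append([None if c == missing else c for c in cells])
    return names, rows
```

Values are kept as strings so that the same sketch works for continuous, discrete, and mixed columns; converting them to numbers is left to the caller.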
3. Discrete Data
Discrete data follows the same structure but the values must be integers or strings
representing categories.
Example:
A B C
0 1 yes
1 0 no
2 1 yes
4. Continuous Data
Continuous data uses the same tabular format, but all values should be numeric:
HEIGHT WEIGHT AGE
66.1 150.2 33
70.3 182.4 44
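Before loading a file as continuous data, it can help to confirm that every cell really is numeric. The helper below is a hypothetical pre-check, not part of Tetrad; it assumes the data has already been split into a list of names and a list of string rows, and that * marks a missing value.

```python
def non_numeric_cells(names, rows, missing="*"):
    """Return (row_number, column_name, value) for each cell that is not a number."""
    bad = []
    for r, row in enumerate(rows, start=1):
        for name, value in zip(names, row):
            if value == missing:
                continue  # missing markers are allowed
            try:
                float(value)
            except ValueError:
                bad.append((r, name, value))
    return bad
```

An empty result means every non-missing cell parsed as a number.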
5. Covariance and Correlation Matrices
Tetrad now supports two equivalent file formats for covariance and correlation matrices:
A full square matrix, preceded by a sample-size line and a variable-name line (current default)
A lower-triangular matrix preceded by a sample-size line and a variable-name line (still fully supported for backward compatibility)
Both formats use the same layout:
Sample size (integer, on its own line)
Variable names (space-separated or tab-separated)
Matrix values (either full square or lower-triangular)
Any deviation from this structure will cause Tetrad to reject the file.
5.1 Required Structure
A valid covariance/correlation file must contain:
Sample size (integer, on its own line)
Variable names (space-separated or tab-separated)
Matrix body, in one of the two supported shapes:
Here p is the number of variables (the number of names on the header line).
Lower-triangular format
Row 1: 1 value
Row 2: 2 values
…
Row p: p values
Full square format (current default)
Row 1: p values
Row 2: p values
…
Row p: p values
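The two shapes differ only in how many values appear on each row. A minimal sketch of that rule follows; the function names are hypothetical, and p is taken here from the number of matrix rows rather than from the header line.

```python
def expected_row_lengths(p, shape):
    """Number of values expected on each of the p matrix rows."""
    if shape == "lower":
        return list(range(1, p + 1))  # 1, 2, ..., p
    if shape == "square":
        return [p] * p                # p, p, ..., p
    raise ValueError(f"unknown shape: {shape!r}")


def detect_shape(row_lengths):
    """Return 'square', 'lower', or None for a list of per-row value counts."""
    p = len(row_lengths)
    for shape in ("square", "lower"):
        if row_lengths == expected_row_lengths(p, shape):
            return shape
    return None
```

For a single variable the two shapes coincide, and this sketch reports 'square'.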
5.2 Lower Triangle Covariance Matrix Example
Below is a complete, valid lower-triangular covariance file.
This is a classic example that has been used by Tetrad historically.
164
ABILITY GPQ PREPROD QFJ SEX CITES PUBS
1.0
.62 1.0
.25 .09 1.0
.16 .28 .07 1.0
-.10 .00 .03 .10 1.0
.29 .25 .34 .37 .13 1.0
.18 .15 .19 .41 .43 .55 1.0
Important Characteristics of the Lower Triangular Format
Line 1: sample size (164)
Line 2: variable names (7 variables)
Lines 3–9: lower triangle of the covariance matrix
Whitespace before numbers is acceptable
Diagonal entries must be included (1.0 above)
No commas (1,0 is invalid)
No extra blank lines at the end
The number of rows in the lower triangle must equal the number of variables
Do not end a line with a delimiter; a trailing delimiter can cause parsing errors.
This format is still accepted for both covariance and correlation matrices.
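To show how a lower-triangular file maps onto a full symmetric matrix, here is a minimal Python sketch. It is not Tetrad's reader: read_lower_triangular is a hypothetical name, and the sketch assumes whitespace-separated values.

```python
def read_lower_triangular(text):
    """Parse a lower-triangular file into (sample_size, names, full_matrix)."""
    lines = text.splitlines()
    sample_size = int(lines[0].strip())
    names = lines[1].split()
    p = len(names)
    body = lines[2:]
    # A blank line at the end makes the row count wrong and is rejected here.
    if len(body) != p:
        raise ValueError(f"expected {p} matrix rows, got {len(body)}")

    full = [[0.0] * p for _ in range(p)]
    for i, ln in enumerate(body):
        values = [float(v) for v in ln.split()]
        if len(values) != i + 1:
            raise ValueError(
                f"row {i + 1}: expected {i + 1} values, got {len(values)}"
            )
        for j, v in enumerate(values):
            full[i][j] = v
            full[j][i] = v  # mirror across the diagonal
    return sample_size, names, full
```

Values written without a leading zero, such as .62 or -.10, are accepted by Python's float and need no special handling.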
5.3 Full Square Covariance Matrix Example (Current Default)
When a covariance matrix is exported from Tetrad, it is now written in full square format.
The same example as above, written as a full 7 × 7 matrix:
164
ABILITY GPQ PREPROD QFJ SEX CITES PUBS
1.00 0.62 0.25 0.16 -0.10 0.29 0.18
0.62 1.00 0.09 0.28 0.00 0.25 0.15
0.25 0.09 1.00 0.07 0.03 0.34 0.19
0.16 0.28 0.07 1.00 0.10 0.37 0.41
-0.10 0.00 0.03 0.10 1.00 0.13 0.43
0.29 0.25 0.34 0.37 0.13 1.00 0.55
0.18 0.15 0.19 0.41 0.43 0.55 1.00
Values may be separated by tabs, spaces, or the chosen delimiter. The key requirements are:
The matrix is square, with exactly p values on each of p lines.
The order of variables matches the header line.
Do not end a line with a delimiter; a trailing delimiter can cause parsing errors.
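When producing a full square file from another tool, the requirements above can be enforced at write time. The following is a sketch under stated assumptions: write_square is a hypothetical helper, values are space-separated, and the two-decimal formatting simply mirrors the example in this section.

```python
def write_square(sample_size, names, matrix, decimals=2):
    """Format a symmetric matrix as full-square file text."""
    p = len(names)
    if len(matrix) != p or any(len(row) != p for row in matrix):
        raise ValueError(f"matrix must have {p} rows of {p} values")
    for i in range(p):
        for j in range(i):
            if abs(matrix[i][j] - matrix[j][i]) > 1e-12:
                raise ValueError(f"matrix is not symmetric at ({i}, {j})")

    lines = [str(sample_size), " ".join(names)]
    for row in matrix:
        # join() places a separator between values only, never after the last one
        lines.append(" ".join(f"{v:.{decimals}f}" for v in row))
    return "\n".join(lines)
```

The returned text has no trailing delimiter and no trailing blank line; the caller decides how to write it to disk.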
5.4 Correlation Matrices
Correlation matrices use the same two formats as covariance matrices:
Full square matrix (current export default)
Lower triangle (still accepted)
Example (full square):
500
X Y Z
1.0 0.20 -0.10
0.20 1.0 0.35
-0.10 0.35 1.0
Lower-triangular example:
500
X Y Z
1.0
.20 1.0
-.10 .35 1.0
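A correlation matrix has a unit diagonal and off-diagonal entries between -1 and 1, so a file intended as a correlation matrix can be sanity-checked before loading. The check below is a hypothetical helper operating on an already-parsed full matrix; whether Tetrad performs the same checks is not stated here, and the sketch does not test positive semidefiniteness.

```python
def is_correlation_matrix(matrix, tol=1e-9):
    """Check shape, symmetry, unit diagonal, and entries within [-1, 1]."""
    p = len(matrix)
    for i in range(p):
        if len(matrix[i]) != p:
            return False                      # not square
        if abs(matrix[i][i] - 1.0) > tol:
            return False                      # diagonal must be 1
        for j in range(i):
            if abs(matrix[i][j] - matrix[j][i]) > tol:
                return False                  # not symmetric
            if abs(matrix[i][j]) > 1.0 + tol:
                return False                  # outside [-1, 1]
    return True
```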
5.5 Common Parsing Errors for Covariance/Correlation Files
Users often encounter:
Missing sample size (first line must be an integer)
Variable names not matching matrix dimensions
For square format: not exactly p values per row, or not exactly p rows
For lower-triangular format: wrong number of values in a given row
Mixing square and triangular conventions in a single file
Trailing blank lines
Use of commas instead of decimal points
Extra text or comments in the matrix region
Each of these produces a “Could not parse covariance matrix” error.
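Because Tetrad reports the same message for all of these problems, a small pre-flight check can help locate the specific cause. The sketch below is a hypothetical standalone checker, not Tetrad code; its messages are its own, and it assumes whitespace-separated values.

```python
def check_cov_file(text):
    """Return a list of problems found; an empty list means none were found."""
    problems = []
    lines = text.split("\n")
    if lines and lines[-1] == "":
        lines.pop()  # a single final newline is not a blank line
    if any(not ln.strip() for ln in lines):
        problems.append("blank line in file")
        lines = [ln for ln in lines if ln.strip()]
    if len(lines) < 3:
        return problems + ["file needs a sample size, variable names, and a matrix"]

    if not lines[0].strip().isdigit():
        problems.append("first line is not an integer sample size")
    p = len(lines[1].split())
    body = [ln.split() for ln in lines[2:]]
    if len(body) != p:
        problems.append(f"{p} variable names but {len(body)} matrix rows")

    for i, row in enumerate(body, start=1):
        for tok in row:
            try:
                float(tok)
            except ValueError:
                hint = " (comma used as decimal point?)" if "," in tok else ""
                problems.append(f"matrix row {i}: cannot parse {tok!r}{hint}")

    counts = [len(row) for row in body]
    if len(body) == p and counts not in ([p] * p, list(range(1, p + 1))):
        problems.append("row lengths match neither square nor lower-triangular shape")
    return problems
```

Running the checker on a file that fails to load should narrow the failure down to one or more items from the list above.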
6. Lower-Triangular Format
The lower-triangular matrix format is kept as a legacy option:
All existing covariance and correlation files using the old lower-triangular convention
will continue to load without modification, provided the user selects the Lower Triangular option.
New users are encouraged to use the square format, which aligns with conventions in R and Python.
Internally, Tetrad treats both formats as representing the same symmetric matrix.
There is no need to convert existing lower-triangular files unless you prefer the square style.
6.1 Note on GUI Display
In the Tetrad GUI, covariance and correlation matrices are currently displayed using only the lower triangle, even if they were loaded from a full square matrix file. This is purely a display choice:
The matrix is still treated as a full symmetric covariance/correlation matrix internally.
Only the lower triangle is shown to reduce redundancy and to emphasize that the object is a covariance or correlation matrix.
This behavior is independent of whether the file itself was in square or lower-triangular format.
7. Exporting Data from Tetrad
Tetrad can export:
Tabular datasets
Covariance matrices
Correlation matrices
Export behavior:
Covariance and correlation matrices are now exported in full square format by default.
The lower-triangular format is no longer used for export, but is still supported for import.
This means:
You can load either full square or lower-triangular files.
When you save from Tetrad, you will get the square form.
8. Summary
| Format Type | Supported? | Notes |
|---|---|---|
| Tabular data | ✓ | Standard space/tab/comma-separated file |
| Lower Triangle covariance/correlation matrix | ✓ (legacy) | Fully supported for backward compatibility |
| Full covariance/correlation square matrix | ✓ | Current default import/export format |
| Graph formats | ✓ | See graph file section |