Multivariate Normality Testing via 3D Projections

This tool explores a projection-based method for assessing multivariate normality. It takes random 3D projections of a test dataset, fits an ellipsoid to each, and compares the resulting distribution of fit errors (mean squared error, MSE) against the distribution obtained the same way from a reference multivariate normal (MVN) dataset.

Controls & Configuration

Dimensions (N):

Samples:

Tests:

Upload CSV (Numeric, No Header):

3D Projection Visualization (Last Tested Projection)

Reference (Normal) MSE Distribution

Test (Non-Normal/Uploaded) MSE Distribution

Methodology Explained

This method leverages the property that **any linear projection of a multivariate normal (MVN) distribution is itself normal**, and a normal distribution has ellipsoidal density contours. Departures from normality in the high-dimensional space therefore often manifest as non-ellipsoidal shapes in lower-dimensional projections.
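
Stated as a formula, this is a standard result. The notation is introduced here for illustration only: X is the N-dimensional random vector, A is a 3 × N projection matrix of full rank, and Σ is assumed positive definite.

$$
X \sim \mathcal{N}_N(\mu, \Sigma) \;\Longrightarrow\; AX \sim \mathcal{N}_3\!\left(A\mu,\; A\Sigma A^{\top}\right)
$$

When the projection simply selects three coordinates, the rows of A are three rows of the identity matrix, so Aμ is the matching sub-vector of μ and AΣAᵀ is the matching 3 × 3 sub-block of Σ. The density contours of the projected data are then the ellipsoids

$$
(y - A\mu)^{\top} \left(A\Sigma A^{\top}\right)^{-1} (y - A\mu) = c, \qquad c > 0.
$$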

Workflow:

Generate or upload N-dimensional data. The "Normal" dataset serves as a **reference** for comparison.
For a chosen dataset, repeatedly perform the following 'test':

Randomly sample 3 dimensions.
Project the N-dimensional data onto these 3 dimensions.
Fit an ellipsoid to the 3D projected points (based on the sample covariance matrix).
Calculate a goodness-of-fit metric: the Mean Squared Error (MSE) between the points and the fitted ellipsoid surface.

Collect the MSE values from many tests (e.g., 100+).
Compare the **distribution** of MSE values from the test dataset against the reference Normal dataset's MSE distribution using visualization (histograms) and a statistical test (Mann-Whitney U).
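
The workflow above can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the tool's actual code: the page does not give the exact MSE formula, so the sketch assumes the fitted ellipsoid is the covariance ellipsoid at Mahalanobis radius √3 and measures each point's radial distance to it after whitening. The function names `projection_mse` and `mse_distribution` are hypothetical.

```python
import numpy as np

def projection_mse(data, rng):
    """Run one 'test': pick 3 coordinates, fit an ellipsoid, return an MSE."""
    n_samples, n_dims = data.shape
    # Randomly sample 3 distinct dimensions and project onto them.
    dims = rng.choice(n_dims, size=3, replace=False)
    proj = data[:, dims]
    centered = proj - proj.mean(axis=0)
    # The sample covariance defines the shape of the fitted ellipsoid.
    cov = np.cov(centered, rowvar=False)
    # Whitening with the Cholesky factor maps that ellipsoid to a sphere,
    # so the norm of a whitened point is its Mahalanobis distance.
    chol = np.linalg.cholesky(cov)
    whitened = np.linalg.solve(chol, centered.T).T
    radial = np.linalg.norm(whitened, axis=1)
    # ASSUMPTION: the ellipsoid surface sits at Mahalanobis radius sqrt(3),
    # because the mean squared Mahalanobis distance in 3D is about 3.
    radius = np.sqrt(3.0)
    return float(np.mean((radial - radius) ** 2))

def mse_distribution(data, n_tests=100, seed=0):
    """Collect the MSE from many random 3D projections of one dataset."""
    rng = np.random.default_rng(seed)
    return np.array([projection_mse(data, rng) for _ in range(n_tests)])
```

Calling `mse_distribution` once on the reference dataset and once on the test dataset yields the two samples that the histograms and the Mann-Whitney U test compare. Under this particular error definition, the MSE values for normal data cluster near one value, while heavier-tailed or multimodal data shifts them.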

Statistical Rationale & Advantages:

**Dimensionality Reduction:** Each test operates in 3D regardless of N, sidestepping the "curse of dimensionality" that affects many tests applied directly in high dimensions.
**Sensitivity:** Can detect various types of departures from multivariate normality (e.g., skewness, multimodality, non-linear dependencies) that affect projection shapes.
**Comparison-Based:** The conclusion relies on comparing the test distribution to a known normal reference, rather than an absolute threshold.
**Visualization:** Provides intuitive visual feedback through the 3D plot and MSE histograms.

Generated Dataset Details:

Normal: Generated from an N-dimensional standard MVN distribution.
Non-Normal: Generated from a mixture of two MVN distributions with slightly different means and covariances to introduce non-normality.
Sample counts > 5000 may impact browser performance.
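
A rough sketch of how two such datasets could be generated in Python with NumPy follows. The mixture parameters (`shift`, `scale`, `weight`) and the function names are assumptions chosen for illustration; the page does not state the values the tool uses.

```python
import numpy as np

def generate_normal(n_samples, n_dims, rng):
    """Draw from an N-dimensional standard MVN (zero mean, identity covariance)."""
    return rng.standard_normal((n_samples, n_dims))

def generate_mixture(n_samples, n_dims, rng, shift=1.5, scale=1.5, weight=0.5):
    """Draw from a two-component MVN mixture.

    ASSUMPTION: the first component is standard normal and the second has
    mean `shift` in every coordinate and covariance scale**2 * I.
    These values are hypothetical, not taken from the tool.
    """
    data = rng.standard_normal((n_samples, n_dims))
    # Each row joins the second component with probability `weight`.
    from_second = rng.random(n_samples) < weight
    data[from_second] = data[from_second] * scale + shift
    return data

rng = np.random.default_rng(42)
normal_data = generate_normal(1000, 8, rng)
non_normal_data = generate_mixture(1000, 8, rng)
```

Because both components differ in mean and spread, the mixture is non-normal even though each component is MVN on its own.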

User Guide:

Use "Generate Datasets" first (adjust N/Samples if needed).
Run "Test Normal" to establish the reference MSE distribution.
Run "Test Non-Normal" or Upload and Test your own data.
Use "Compare & Conclude" for statistical comparison.
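The Mann-Whitney U test assesses whether the two MSE distributions (Reference vs. Test) differ significantly. A low p-value (< 0.05) indicates a significant difference and is taken as evidence that the test data deviates from the reference normal. A higher p-value means no difference was detected; it does not prove the data is multivariate normal.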
"New Projection" shows another random 3D view of the *last dataset tested*.
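
For readers who want to reproduce the comparison step outside the browser, the following is a minimal pure-Python sketch of a two-sided Mann-Whitney U test. It assumes the large-sample normal approximation with no tie or continuity correction, so its p-values may differ slightly from the tool's or from an established implementation such as `scipy.stats.mannwhitneyu`. The function name `mann_whitney_u` is hypothetical.

```python
import math

def mann_whitney_u(x, y):
    """Return (U statistic for x, two-sided p-value) via the normal approximation."""
    n1, n2 = len(x), len(y)
    # Pool both samples, remembering which group each value came from.
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y],
                      key=lambda t: t[0])
    # Assign 1-based ranks, giving tied values their average rank.
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        j = i
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = avg_rank
        i = j + 1
    rank_sum_x = sum(r for r, (_, g) in zip(ranks, combined) if g == 0)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2
    # Under the null hypothesis U is approximately normal for large samples.
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # ASSUMPTION: no tie correction
    z = (u_x - mean_u) / sd_u
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u_x, p_value
```

Passing the reference MSE values as `x` and the test MSE values as `y` gives the p-value to compare against the 0.05 threshold described above.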