OVERVIEW OF MULTIVARIATE STATISTICAL DATA ANALYSIS


 

EXPLORATORY/DESCRIPTIVE             CONFIRMATORY/INFERENTIAL
MULTIVARIATE ANALYSIS               MULTIVARIATE ANALYSIS

DATA MINING                         DATA CRAFTING

DATA EXPLORATION                    STATISTICAL INFERENCE
PATTERN RECOGNITION

LOOKING FOR PATTERNS                TEST HYPOTHESES
EXPLORING RELATIONSHIPS             FIT & TEST THE MODELS
FORM HYPOTHESES
SELECT MODELS

PATTERN RECOGNITION

UNSUPERVISED (NO PRIOR KNOWLEDGE)
  PATTERNS OF "SIMILARITY" BETWEEN VARIABLES
    ORDINATION: PCA, FACTOR ANALYSIS
  PATTERNS OF "SIMILARITY" BETWEEN INDIVIDUALS
    ORDINATION: PRINCIPAL COMPONENT ANALYSIS, MDS
    CLUSTER ANALYSIS

SUPERVISED (PRIOR KNOWLEDGE)
  DISCRIMINANT ANALYSIS, GLM, REGRESSION, PATH ANALYSIS

 
 

 

MULTIVARIATE TECHNIQUE
EXPLORATORY VS CONFIRMATORY
DATA TYPES (Dependent/Independent)
USE

Basic Numerical Multivariate Data Exploration

 

Sample Mean Vector
Sample Covariance
Sample Correlation

 

EXPLORATORY

Interval, ratio

* Data exploration, description, understanding relationships
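The three numerical summaries listed in this row can be computed directly from a data matrix. The following is a minimal sketch in Python with NumPy; the data values are hypothetical and serve only as an illustration.

```python
import numpy as np

# Hypothetical data: 5 individuals (rows) measured on 3 variables (columns).
X = np.array([
    [2.0, 4.1, 1.0],
    [3.0, 6.2, 0.5],
    [4.0, 7.9, 1.5],
    [5.0, 10.1, 0.8],
    [6.0, 12.0, 1.2],
])

n = X.shape[0]
mean_vector = X.mean(axis=0)            # sample mean vector
centered = X - mean_vector              # deviations from the mean
S = centered.T @ centered / (n - 1)     # sample covariance matrix
d = np.sqrt(np.diag(S))                 # sample standard deviations
R = S / np.outer(d, d)                  # sample correlation matrix

print(mean_vector)
print(S)
print(R)
```

The same matrices are returned by `np.cov(X, rowvar=False)` and `np.corrcoef(X, rowvar=False)`; the explicit version is shown to make the definitions visible.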

Basic Graphical Multivariate Data Exploration

 

The scatterplot

EXPLORATORY

interval, ratio/interval, ratio

* Data exploration, description, understanding relationships

Scatterplot Matrix

interval, ratio/interval, ratio

* Assess many bivariate relationships at the same time

Enhanced Scatterplots

 

* Add displays of univariate behaviour (boxplots, histograms, density estimates)
* Simplify the functional relationship (data smoothing)
* Summarize bivariate behaviour (bi-boxplots)

Coplots and Trellis Graphics

interval, ratio/any

* Understand the joint relationship of two variables conditional on another set of variables (coplots)
* Understand higher-dimensional dependence structure by using lower-dimensional graphs (trellis graphics)

Probability Plots

interval, ratio

* Check distributional assumptions
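As one illustration of checking a distributional assumption, the coordinates of a normal probability (Q-Q) plot can be computed without any plotting library. This is a sketch only: the plotting positions (i - 0.5)/n are one common convention among several, and the simulated sample is hypothetical.

```python
from statistics import NormalDist

import numpy as np


def normal_qq(values):
    """Return (theoretical, observed) quantiles for a normal probability plot."""
    observed = np.sort(np.asarray(values, dtype=float))
    n = observed.size
    # Plotting positions (i - 0.5) / n; other conventions exist.
    probs = (np.arange(1, n + 1) - 0.5) / n
    theoretical = np.array([NormalDist().inv_cdf(p) for p in probs])
    return theoretical, observed


rng = np.random.default_rng(0)
sample = rng.normal(loc=10.0, scale=2.0, size=200)  # hypothetical data
theo, obs = normal_qq(sample)

# Points close to a straight line (correlation near 1) are consistent
# with the normality assumption.
qq_corr = np.corrcoef(theo, obs)[0, 1]
print(round(qq_corr, 4))
```

Plotting `obs` against `theo` gives the usual probability plot; the correlation is just a numerical summary of how straight that plot is.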

Other Plots: Star plots, Chernoff's Faces etc.

interval, ratio/any

* View multivariate data in a form that is easier to understand

Principal Components Analysis

 

 

EXPLORATORY

interval, ratio

* Reduce the dimension of the data so that fewer variables need to be handled
* Seek a one- or two-dimensional projection of the data that maximizes some measure of "interestingness" (Projection Pursuit)
* Ease the interpretation
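Dimension reduction by principal components can be sketched as an eigen-decomposition of the sample covariance matrix, where the "interesting" projections are those with the largest variance. The function name `pca` and the simulated data below are assumptions made for illustration.

```python
import numpy as np


def pca(X, n_components=2):
    """Principal components from the sample covariance matrix."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    S = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(S)        # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]           # sort by decreasing variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    loadings = eigvecs[:, :n_components]
    scores = centered @ loadings                # projected data
    explained = eigvals[:n_components] / eigvals.sum()
    return scores, loadings, explained


# Hypothetical data: three variables driven by one latent direction plus noise.
rng = np.random.default_rng(1)
latent = rng.normal(size=(100, 1))
X = np.hstack([latent, 2 * latent, -latent]) + 0.1 * rng.normal(size=(100, 3))

scores, loadings, explained = pca(X, n_components=2)
print(explained)  # proportion of total variance carried by each component
```

Here almost all of the variance falls on the first component, so the three variables could reasonably be replaced by a single score.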

Correspondence Analysis

    EXPLORATORY

nominal, ordinal/nominal, ordinal

* Display the association among a set of categorical variables in a type of scatterplot or map.
* Obtain a low-dimensional representation of multivariate categorical data
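One way to obtain such a map for a two-way contingency table is simple correspondence analysis through the singular value decomposition of the standardized residuals. The sketch below assumes this standard formulation; the function name and the table counts are hypothetical.

```python
import numpy as np


def correspondence_analysis(table, n_dims=2):
    """Simple correspondence analysis of a two-way contingency table."""
    N = np.asarray(table, dtype=float)
    P = N / N.sum()                      # correspondence matrix
    r = P.sum(axis=1)                    # row masses
    c = P.sum(axis=0)                    # column masses
    # Standardized residuals from the independence model.
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sing, Vt = np.linalg.svd(S, full_matrices=False)
    row_coords = (U * sing) / np.sqrt(r)[:, None]      # principal coordinates
    col_coords = (Vt.T * sing) / np.sqrt(c)[:, None]
    inertia = sing ** 2                  # principal inertias
    return row_coords[:, :n_dims], col_coords[:, :n_dims], inertia


# Hypothetical 3 x 3 table of counts.
table = np.array([[20, 10, 5],
                  [10, 20, 10],
                  [5, 10, 20]])
rows, cols, inertia = correspondence_analysis(table)
print(rows)
print(cols)
print(inertia.sum())  # total inertia = chi-square statistic / grand total
```

Plotting `rows` and `cols` on the same axes gives the usual correspondence map.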

Multidimensional Scaling (MDS)

    EXPLORATORY

any/any

* Extract structure from observed proximity matrices
* Identify the dimensions on which the subjects make their similarity judgements
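As an illustration of recovering coordinates from a proximity matrix, the sketch below implements classical (metric) MDS by double-centring the squared distances. Classical MDS is only one variant and is chosen here as an assumption; the four points are hypothetical.

```python
import numpy as np


def classical_mds(D, n_dims=2):
    """Classical (Torgerson) scaling of a matrix of distances D."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centring matrix
    B = -0.5 * J @ (D ** 2) @ J                  # double-centred inner products
    eigvals, eigvecs = np.linalg.eigh(B)
    order = np.argsort(eigvals)[::-1][:n_dims]   # largest eigenvalues first
    lam = np.clip(eigvals[order], 0.0, None)     # guard against tiny negatives
    return eigvecs[:, order] * np.sqrt(lam)


# Hypothetical configuration: corners of a 3 x 4 rectangle.
points = np.array([[0.0, 0.0], [3.0, 0.0], [3.0, 4.0], [0.0, 4.0]])
D = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)

coords = classical_mds(D, n_dims=2)
D_hat = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
print(np.round(D_hat, 6))
```

The recovered configuration matches the original only up to rotation, reflection, and translation, which is why the check compares distances rather than coordinates.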

Cluster Analysis

    EXPLORATORY

any/any

* Classify individuals into clusters
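Many clustering methods exist and this overview does not single one out. As a minimal sketch, the following uses k-means with a deterministic "farthest point" initialisation; both choices are assumptions made for illustration, and the two simulated groups are hypothetical.

```python
import numpy as np


def kmeans(X, k, n_iter=100):
    """Basic k-means (Lloyd's algorithm) with farthest-point initialisation."""
    X = np.asarray(X, dtype=float)
    centers = [X[0]]
    for _ in range(1, k):
        # Distance from each point to its nearest centre chosen so far.
        dist = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(dist)])
    centers = np.array(centers)
    for _ in range(n_iter):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)                # assign to the nearest centre
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):    # assignments have stabilised
            break
        centers = new_centers
    return labels, centers


# Hypothetical data: two well-separated groups of 30 individuals each.
rng = np.random.default_rng(2)
X = np.vstack([rng.normal(0.0, 0.5, size=(30, 2)),
               rng.normal(5.0, 0.5, size=(30, 2))])
labels, centers = kmeans(X, k=2)
print(labels)
print(centers)
```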

 

Generalized Linear Models (GLM)

    CONFIRMATORY

interval, ratio/any

* Predict the response and/or explain its relationship to the explanatory variables through a linear predictor.

Regression and MANOVA

    CONFIRMATORY

* Explain the relationship between explanatory and response variables by using a GLM with an identity link function and a normal error term
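With an identity link and normal errors, fitting the GLM reduces to ordinary least squares. The sketch below shows the univariate-response case only (not MANOVA); the simulated data and the "true" coefficients are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])    # design matrix with intercept

beta_true = np.array([1.0, 2.0, -0.5])       # hypothetical coefficients
y = X @ beta_true + rng.normal(scale=0.3, size=n)

# Least-squares estimate, which is also the maximum likelihood estimate
# under the identity link with normal errors.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ beta_hat
sigma2_hat = residuals @ residuals / (n - X.shape[1])   # error variance estimate

print(beta_hat)
print(sigma2_hat)
```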

Log-Linear and Logistic Models

 

 

CONFIRMATORY

nominal, ordinal/nominal, ordinal

* Examine the relationship between categorical variables
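For two binary variables, a logistic model makes the relationship concrete: the fitted slope is the log odds ratio of the 2 x 2 table. The sketch below fits the model by Newton-Raphson; the function name `fit_logistic` and the counts are hypothetical.

```python
import numpy as np


def fit_logistic(X, y, n_iter=25):
    """Maximum likelihood logistic regression by Newton-Raphson."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        mu = 1.0 / (1.0 + np.exp(-(X @ beta)))   # fitted probabilities
        W = mu * (1.0 - mu)                      # binomial variance weights
        grad = X.T @ (y - mu)                    # score vector
        hess = X.T @ (X * W[:, None])            # information matrix
        step = np.linalg.solve(hess, grad)
        beta = beta + step
        if np.max(np.abs(step)) < 1e-10:
            break
    return beta


# Hypothetical 2 x 2 table: 30 of 100 successes in group 0, 60 of 100 in group 1.
x = np.repeat([0.0, 1.0], [100, 100])
y = np.concatenate([np.ones(30), np.zeros(70), np.ones(60), np.zeros(40)])
X = np.column_stack([np.ones_like(x), x])

beta = fit_logistic(X, y)
odds_ratio = (60 / 40) / (30 / 70)
print(beta[1], np.log(odds_ratio))   # slope and log odds ratio
```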

Multivariate Response Models

 

Repeated Measures

CONFIRMATORY  

* Predict a multivariate response, rather than a single response, given multiple explanatory variables

Random Effects

Logistic Models

Marginal Models for Binary Response

Marginal Modelling

Generalized Random Effects

Discrimination, Classification, and Pattern Recognition

 

Allocation Rules

CONFIRMATORY

* For known groups, devise rules that allocate previously unclassified objects or individuals to these groups
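One classical allocation rule for two known groups is Fisher's linear discriminant. The sketch below assumes equal prior probabilities, equal misclassification costs, and a common covariance matrix; the function names and the simulated training data are hypothetical.

```python
import numpy as np


def fit_lda(X0, X1):
    """Fisher's linear discriminant for two groups with a pooled covariance."""
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    n0, n1 = len(X0), len(X1)
    S_pooled = ((n0 - 1) * np.cov(X0, rowvar=False)
                + (n1 - 1) * np.cov(X1, rowvar=False)) / (n0 + n1 - 2)
    w = np.linalg.solve(S_pooled, m1 - m0)   # discriminant direction
    cutoff = w @ (m0 + m1) / 2.0             # midpoint between projected means
    return w, cutoff


def allocate(x, w, cutoff):
    """Allocate each row of x to group 1 if its score exceeds the cutoff, else 0."""
    return (np.atleast_2d(x) @ w > cutoff).astype(int)


# Hypothetical training samples from two known groups.
rng = np.random.default_rng(4)
X0 = rng.normal(0.0, 1.0, size=(50, 2))
X1 = rng.normal(3.0, 1.0, size=(50, 2))
w, cutoff = fit_lda(X0, X1)

new_points = np.array([[0.0, 0.0], [3.0, 3.0]])   # previously unclassified
print(allocate(new_points, w, cutoff))
```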

Logistic Discrimination

Pattern Recognition, Neural Networks

Exploratory Factor Analysis

    EXPLORATORY

interval, ratio

* Investigate the relationship between measured/manifest variables and factors without making any prior assumptions about which manifest variables are related to which factors

Confirmatory Factor Analysis
    CONFIRMATORY

interval, ratio

* Test a specific factor structure in which particular manifest variables relate to particular factors

NOTE: Factor analysis postulates a model for the data; PCA does not
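The model referred to in the note can be written down explicitly. In the usual orthogonal factor model (the notation here is an assumption, not taken from the table), the observed vector x is a linear function of unobserved common factors f plus specific errors u, which implies a particular structure for the covariance matrix:

```latex
\[
\mathbf{x} = \boldsymbol{\mu} + \boldsymbol{\Lambda}\mathbf{f} + \mathbf{u},
\qquad
\operatorname{Cov}(\mathbf{x}) = \boldsymbol{\Lambda}\boldsymbol{\Lambda}^{\top} + \boldsymbol{\Psi}
\]
```

Here the common factors are assumed to have mean zero and identity covariance, the specific errors are assumed to have mean zero and diagonal covariance matrix Psi, and f and u are assumed uncorrelated. PCA, by contrast, is only a transformation of the data and imposes no such structure.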

Covariance Structure Models
  Path Analysis

    CONFIRMATORY

interval, ratio

* Design a factor analysis (FA) model in which particular manifest variables are allowed to relate to particular latent variables

 

 


Engin A. Sungur, Spring 2005