Principal Component Analysis


Written by Sébastien MEIGE
Updated over a week ago

Introduction to PCA

Principal component analysis (PCA) is a multivariate statistical method based on dimensionality reduction. Its objective is to understand what discriminates between groups of individuals.

It consists of studying multidimensional data by transforming them into new, decorrelated variables. These new variables are called the principal components, or principal axes.
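
As a minimal sketch of this transformation, the example below computes principal components with NumPy by diagonalizing the covariance matrix of centered data. The function name `principal_components` and the simulated data are illustrative assumptions and are not taken from the study.

```python
import numpy as np

def principal_components(X):
    """Return (scores, eigenvalues, axes) for the rows of X."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                  # center each variable
    cov = np.cov(Xc, rowvar=False)           # covariance between variables
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigh: symmetric matrix, ascending order
    order = np.argsort(eigvals)[::-1]        # sort axes by decreasing variance
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    scores = Xc @ eigvecs                    # coordinates on the principal axes
    return scores, eigvals, eigvecs

# Simulated data (assumption): two strongly correlated variables plus one independent
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 1))
X = np.hstack([
    base + 0.1 * rng.normal(size=(200, 1)),
    2 * base + 0.1 * rng.normal(size=(200, 1)),
    rng.normal(size=(200, 1)),
])

scores, eigvals, axes = principal_components(X)
score_cov = np.cov(scores, rowvar=False)
print(np.round(score_cov, 6))  # off-diagonal terms are numerically zero
```

The covariance matrix of the new variables is diagonal, which is what "decorrelated" means here: each principal component carries variance that is not shared with the others.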

One of the advantages of PCA is that it allows multidimensional data to be visualized in a 2-dimensional plane while minimizing the loss of information, which is known in statistics as inertia.
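
To make the notion of retained inertia concrete, here is a hedged sketch that computes the share of total variance carried by each principal axis, and by the first two axes together. The helper `inertia_ratios` and the simulated data are assumptions made for illustration only.

```python
import numpy as np

def inertia_ratios(X):
    """Share of total inertia (variance) carried by each principal axis."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)                       # center each variable
    singular = np.linalg.svd(Xc, compute_uv=False)
    variances = singular ** 2                     # proportional to axis variances
    return variances / variances.sum()

# Simulated data (assumption): 5 variables driven by 2 latent factors plus noise
rng = np.random.default_rng(1)
latent = rng.normal(size=(150, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.05 * rng.normal(size=(150, 5))

ratios = inertia_ratios(X)
plane_share = ratios[:2].sum()
print("inertia per axis:", np.round(ratios, 3))
print("inertia kept by the first plane:", round(float(plane_share), 3))
```

When the first two axes retain most of the inertia, as in this simulated case, the 2-dimensional plot is a faithful summary of the data.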

Benefits of PCA

PCA is a very visual statistical method that makes it possible:

  • to study the correlations between variables,

  • to identify groups of individuals that are homogeneous internally and clearly differentiated from one another.

It can be applied in many fields and produces visual results that are easy to interpret and explain.
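
The correlations between variables are usually read from the correlation of each variable with the principal axes. The sketch below computes these coordinates for a PCA on standardized variables. The function name `variable_axis_correlations` and the simulated data are assumptions for illustration.

```python
import numpy as np

def variable_axis_correlations(X):
    """Correlation of each original variable (rows) with each principal axis (columns)."""
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize each variable
    corr = np.corrcoef(Z, rowvar=False)        # correlation matrix of the variables
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]          # decreasing inertia
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    eigvals = np.clip(eigvals, 0.0, None)      # guard against tiny negative round-off
    return eigvecs * np.sqrt(eigvals)          # loadings[j, k] = corr(variable j, axis k)

# Simulated data (assumption): 4 variables, the first two strongly related
rng = np.random.default_rng(2)
shared = rng.normal(size=(100, 1))
X = np.hstack([
    shared + 0.2 * rng.normal(size=(100, 1)),
    shared + 0.2 * rng.normal(size=(100, 1)),
    rng.normal(size=(100, 1)),
    rng.normal(size=(100, 1)),
])

loadings = variable_axis_correlations(X)
print(np.round(loadings[:, :2], 3))  # coordinates of each variable on the first plane
```

Plotting the first two columns inside a unit circle gives the usual correlation plot: variables pointing in the same direction are positively correlated, and variables close to the circle are well represented on that plane.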

Application of PCA to the seroprevalence study SEROCOV56

In the SEROCOV56 study, PCA was used to observe the differences in positive test results between the different kits.

One of the underlying aims was to approximate the kinetics with which the different test results become positive.

Before applying PCA, the optical densities were normalized so that they could be compared with one another.
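
The article does not state which normalization was used. As one common possibility, the sketch below centers and scales each kit's optical densities (z-score standardization) so that kits measured on different scales become comparable. The values and the function name `standardize` are hypothetical.

```python
import numpy as np

def standardize(od):
    """Center each column on 0 and scale it to unit standard deviation."""
    od = np.asarray(od, dtype=float)
    mean = od.mean(axis=0)
    std = od.std(axis=0, ddof=1)   # sample standard deviation per kit
    return (od - mean) / std

# Hypothetical optical densities: rows = samples, columns = kits
od = np.array([
    [0.12, 1.5, 0.30],
    [0.45, 2.9, 0.80],
    [0.08, 0.7, 0.10],
    [0.90, 4.1, 1.60],
])

z = standardize(od)
print(np.round(z, 3))
```

After this step every kit has mean 0 and unit variance, so no kit dominates the principal axes simply because its readings span a wider numeric range.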

The correlation plot on the principal axes gave the following results:

Moving from the top-right quadrant to the bottom-right quadrant, we observe a kinetics of appearance (identical to the one observed in the concordance analysis of the tests in the same study), namely:

1. IgM,

2. IgA,

3. IgG "Spike",

4. IgG "Nucleocapsid".

These first results seem to indicate a kinetics of appearance (IgM > IgA > IgG "Spike" > IgG "Nucleocapsid"), which supports the value of using all the kits.


In our case, PCA was not combined with a risk factor analysis. Such an analysis relates the components to variables of interest and can be carried out with reduced-rank regression (the RRR method).

For more information on PCA and data analysis
