Partial least squares (PLS) regression is a well-known method to
find the relationship between predictor variables X and dependent
variables y. In a PLS model, not only the variance of X, but also the
covariance between X and y is taken into account. Therefore, the
central point of PLS is to find latent variables in the feature space
that have maximum covariance with y. PLSDA is a variant of
PLS that improves the separation between classes by using a categorical
response variable y. In this study, X is the matrix of gene expression
values, and the values of y are set to 1 and -1 for the positive
and negative classes, respectively. Each row of the X matrix represents
the gene expression values of all the genes for each sample, and
each column corresponds to the gene expression values of all samples
for a gene. PLSDA is used to model the gene expression
data (X) and the response variable (y) on the training set. In
the calculations, the optimal latent variable (LV) number used in
the modeling is determined by Monte Carlo cross validation
(MCCV). In prediction, samples with predicted values above
zero are assigned to the positive class, and all others to the negative
class. Accuracy (Acc), precision (P), recall (R) and F-measure
(F) are used to evaluate classification performance.