normalize.features.Rd
This function performs feature normalization according to user-specified parameters.
an object of class siamcat-class
string, normalization method, can be one of these:
c('rank.unit', 'rank.std', 'log.std', 'log.unit', 'log.clr', 'std', 'pass')
list, specifying the parameters of the different normalization methods, see Details for more information
string, the type of features the function should work on. Can be either "original", "filtered", or "normalized". Please only change this parameter if you know what you are doing!
integer, controls output: 0 for no output at all, 1 for only information about progress and success, 2 for normal level of information, and 3 for full debug information; defaults to 1
an object of class siamcat-class with normalized features
Seven normalization methods are available. Some of them need additional parameters, which are passed via the norm.param list:
'rank.unit' - converts features to ranks and normalizes each column (=sample) by the square root of the sum of ranks. This method does not require additional parameters.
'rank.std' - converts features to ranks and applies z-score standardization. This method requires sd.min.q (the quantile of the standard deviations that is added to the standard deviation of every feature in order to avoid underestimating it) as an additional parameter.
'log.clr' - centered log-ratio transformation. This method requires a pseudocount (log.n0), which is added before log-transformation.
'log.std' - log-transforms features and applies z-score standardization. This method requires both a pseudocount (log.n0) and sd.min.q.
'log.unit' - log-transforms features and normalizes by features or samples with different norms. This method requires a pseudocount (log.n0) and, additionally, the parameters norm.margin (margin over which to normalize, similar to the apply syntax: allowed values are 1 for normalization over features, 2 for normalization over samples, and 3 for normalization by the global maximum) and n.p (vector norm to be used, either 1 for x/sum(x) or 2 for x/sqrt(sum(x^2))).
'std' - z-score standardization without any other transformation. This method only requires the sd.min.q parameter.
'pass' - pass-through normalization, which does not change the features.
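The building blocks named in the method list can be illustrated with a few lines of base R. This is a minimal sketch and not the SIAMCAT implementation: the function names (zscore_sketch, clr_sketch, unit_norm_sketch) and the toy matrix are hypothetical, features are assumed to be in rows and samples in columns, the natural logarithm is an assumption, and the unit-norm helper leaves out the log step and the norm.margin=3 case.

```r
# Toy relative-abundance matrix: rows are features, columns are samples
# (assumed orientation; each column sums to 1).
feat <- matrix(c(0.00, 0.10, 0.30,
                 0.20, 0.00, 0.10,
                 0.50, 0.40, 0.20,
                 0.30, 0.50, 0.40),
               nrow = 4, byrow = TRUE,
               dimnames = list(paste0("feat", 1:4), paste0("s", 1:3)))

# z-score per feature; the sd.min.q quantile of all feature standard
# deviations is added to every denominator (hypothetical helper).
zscore_sketch <- function(x, sd.min.q = 0.1) {
  m <- rowMeans(x)
  s <- apply(x, 1, sd)
  q <- quantile(s, probs = sd.min.q, names = FALSE)
  (x - m) / (s + q)   # vectors recycle down columns, i.e. per row
}

# Centered log-ratio per sample: log of (value + pseudocount) minus the
# mean log value of that sample (natural log is an assumption here).
clr_sketch <- function(x, log.n0 = 1e-05) {
  lx <- log(x + log.n0)
  sweep(lx, 2, colMeans(lx), FUN = "-")
}

# Vector-norm step only: n.p = 1 divides by sum(x), n.p = 2 by
# sqrt(sum(x^2)); norm.margin = 1 works per feature, 2 per sample.
unit_norm_sketch <- function(x, n.p = 2, norm.margin = 1) {
  stopifnot(n.p %in% c(1, 2), norm.margin %in% c(1, 2))
  norm_fun <- if (n.p == 1) {
    function(v) sum(v)
  } else {
    function(v) sqrt(sum(v^2))
  }
  sweep(x, norm.margin, apply(x, norm.margin, norm_fun), FUN = "/")
}

feat_z    <- zscore_sketch(feat)                       # each feature has mean 0
feat_clr  <- clr_sketch(feat)                          # each sample has mean 0
feat_unit <- unit_norm_sketch(feat, n.p = 1, norm.margin = 2)  # each sample sums to 1
```

Because the quantile is added to every denominator, features with a very small standard deviation are not inflated as strongly as they would be by a plain z-score.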
The function also supports frozen normalization of a different dataset. After normalizing the first dataset, the norm_feat slot in the SIAMCAT object contains all parameters of the normalization, which you can access via the norm_params accessor.
To perform a frozen normalization of a new dataset, run the function and supply the normalization parameters as the norm.param argument: norm.param=norm_params(siamcat_reference). See also the 'Frozen normalization' example below.
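Independently of the package, the idea behind frozen normalization can be sketched in base R: parameters are estimated once on the reference data and then reused unchanged on the new data. This is a conceptual sketch under stated assumptions, not SIAMCAT code: fit_std_sketch, apply_std_sketch, and the list entries ref_mean and ref_sd are hypothetical names, and features are assumed to be in rows.

```r
# Estimate per-feature parameters on the reference data and keep them
# (hypothetical helper; mirrors a 'std'-style z-score with sd.min.q).
fit_std_sketch <- function(x, sd.min.q = 0.1) {
  s <- apply(x, 1, sd)
  list(ref_mean = rowMeans(x),
       ref_sd   = s + quantile(s, probs = sd.min.q, names = FALSE))
}

# Reuse the stored parameters on another dataset; features are matched
# by name so that row order in the new data does not matter.
apply_std_sketch <- function(x, params) {
  stopifnot(all(names(params$ref_mean) %in% rownames(x)))
  x <- x[names(params$ref_mean), , drop = FALSE]
  (x - params$ref_mean) / params$ref_sd
}

# Reference data: three features (rows) in three samples (columns)
reference <- matrix(c(1, 2, 3,
                      2, 4, 9,
                      5, 5, 8),
                    nrow = 3, byrow = TRUE,
                    dimnames = list(c("a", "b", "c"), c("r1", "r2", "r3")))

# New data with the same features in a different row order
new_data <- matrix(c(4, 6, 2,
                     0, 7, 1),
                   nrow = 3,
                   dimnames = list(c("c", "a", "b"), c("n1", "n2")))

params <- fit_std_sketch(reference)          # estimated once
frozen <- apply_std_sketch(new_data, params) # no re-estimation on new_data
```

The new data are scaled with the means and standard deviations of the reference, so their normalized values are directly comparable to the normalized reference values rather than being re-centered on their own distribution.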
# Example data
data(siamcat_example)
# Simple example
siamcat_norm <- normalize.features(siamcat_example,
    norm.method='rank.unit')
#> Features normalized successfully.
# log.unit example
siamcat_norm <- normalize.features(siamcat_example,
    norm.method='log.unit',
    norm.param=list(log.n0=1e-05, n.p=1, norm.margin=1))
#> Features normalized successfully.
# log.std example
siamcat_norm <- normalize.features(siamcat_example,
    norm.method='log.std',
    norm.param=list(log.n0=1e-05, sd.min.q=.1))
#> Features normalized successfully.
# Frozen normalization
# normalize the object siamcat with the same parameters as used in
# siamcat_reference
#
# this is not run
# siamcat_norm <- normalize.features(siamcat,
# norm.param=norm_params(siamcat_reference))