Perform feature normalization — normalize.features • SIAMCAT

This function performs feature normalization according to user-specified parameters.

normalize.features(siamcat,
    norm.method = c("rank.unit", "rank.std", "log.std", "log.unit", "log.clr",
        "std", "pass"),
    norm.param = list(log.n0 = 1e-06, sd.min.q = 0.1, n.p = 2, norm.margin = 1),
    feature.type = "filtered", verbose = 1)

Arguments

siamcat: an object of class siamcat-class
norm.method: string, normalization method, can be one of these: c('rank.unit', 'rank.std', 'log.std', 'log.unit', 'log.clr','std', 'pass')
norm.param: list, specifying the parameters of the different normalization methods, see Details for more information
feature.type: string, the type of features the function should work on. Can be either "original", "filtered", or "normalized". Please only change this parameter if you know what you are doing!
verbose: integer, control output: 0 for no output at all, 1 for only information about progress and success, 2 for normal level of information and 3 for full debug information, defaults to 1

Value

an object of class siamcat-class with normalized features

Implemented methods

Seven normalization methods are available. Some of them need additional parameters, which are passed via the norm.param list:

'rank.unit' - converts features to ranks and normalizes each column (=sample) by the square root of the sum of ranks. This method does not require additional parameters.
'rank.std' - converts features to ranks and applies z-score standardization. This method requires sd.min.q (minimum quantile of the standard deviation to be added to all features in order to avoid underestimation of standard deviation) as an additional parameter.
'log.clr' - centered log-ratio transformation. This method requires a pseudocount (log.n0), which is added before log-transformation.
'log.std' - log-transforms features and applies z-score standardization. This method requires both a pseudocount (log.n0) and sd.min.q.
'log.unit' - log-transforms features and normalizes by features or samples with different norms. This method requires a pseudocount (log.n0) as well as the parameters norm.margin (margin over which to normalize, similar to the apply syntax: allowed values are 1 for normalization over features, 2 for normalization over samples, and 3 for normalization by the global maximum) and n.p (vector norm to be used: either 1 for x/sum(x) or 2 for x/sqrt(sum(x^2))).
'std' - z-score standardization without any other transformation. This method only requires the sd.min.q parameter.
'pass' - pass-through normalization, which does not change the features.
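
To make the descriptions above more concrete, the following base R sketch works through the ideas behind 'log.std', 'log.unit', and 'log.clr' on a toy matrix. It is not SIAMCAT's implementation: the toy data, the helper unit_norm, all variable names, the base-10 logarithm, and the textbook per-sample form of the centered log-ratio are assumptions made for illustration, and the package may differ in details.

```r
# Toy abundance matrix: rows are features, columns are samples
# (values are made up for illustration).
feat <- matrix(c(0.50, 0.30, 0.00,
                 0.20, 0.00, 0.10,
                 0.30, 0.70, 0.90),
               nrow = 3, byrow = TRUE,
               dimnames = list(paste0("feat", 1:3), paste0("sample", 1:3)))

log_n0   <- 1e-06  # pseudocount, mirrors the log.n0 default
sd_min_q <- 0.1    # quantile of the feature SDs, mirrors the sd.min.q default

# Add the pseudocount before the log so that zeros stay finite.
# Assumption: base-10 logarithm.
feat_log <- log10(feat + log_n0)

# Idea behind 'log.std': z-score each feature (row), adding a quantile of
# all feature SDs to every SD so that low-variance features are not inflated.
feat_mean    <- rowMeans(feat_log)
feat_sd      <- apply(feat_log, 1, sd)
sd_offset    <- quantile(feat_sd, sd_min_q, names = FALSE)
feat_log_std <- (feat_log - feat_mean) / (feat_sd + sd_offset)

# Idea behind 'log.unit': divide by a vector norm, here over features
# (norm.margin = 1) with the 2-norm (n.p = 2).
# unit_norm is a hypothetical helper, not a SIAMCAT function.
unit_norm <- function(x, n_p) {
  if (n_p == 1) x / sum(x) else x / sqrt(sum(x^2))
}
# apply() over rows returns the result transposed, so transpose it back.
feat_log_unit <- t(apply(feat_log, 1, unit_norm, n_p = 2))

# Idea behind 'log.clr' (textbook form): subtract the mean log-abundance
# of each sample (column).
feat_clr <- apply(feat_log, 2, function(x) x - mean(x))
```

In this sketch, each row of feat_log_std has mean zero, each row of feat_log_unit has unit 2-norm, and each column of feat_clr has mean zero.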

Frozen normalization

The function can also perform a frozen normalization on a different dataset. After normalizing the first dataset, the norm_feat slot in the SIAMCAT object contains all parameters of the normalization, which you can access via the norm_params accessor.

To perform a frozen normalization of a new dataset, run the function with the normalization parameters of the reference object supplied to norm.param: norm.param=norm_params(siamcat_reference). See also the example below.
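
The concept of a frozen normalization can be sketched in base R without SIAMCAT: parameters are estimated once on a reference dataset and then reused unchanged on new data. The helpers fit_zscore and apply_zscore, the toy matrices, and the choice of a plain z-score are hypothetical and only illustrate the principle; they do not show what norm_params actually stores.

```r
# Toy matrices: rows are features, columns are samples
# (values are made up for illustration).
ref <- matrix(c(1,   2,   3,   4,
                10,  20,  30,  40,
                0.1, 0.2, 0.4, 0.8),
              nrow = 3, byrow = TRUE,
              dimnames = list(paste0("feat", 1:3), paste0("ref", 1:4)))
new <- matrix(c(5,   6,
                15,  25,
                0.3, 0.5),
              nrow = 3, byrow = TRUE,
              dimnames = list(paste0("feat", 1:3), paste0("new", 1:2)))

# Hypothetical helper: estimate per-feature z-score parameters on a dataset.
fit_zscore <- function(feat, sd_min_q = 0.1) {
  feat_sd <- apply(feat, 1, sd)
  list(mean = rowMeans(feat),
       sd   = feat_sd + quantile(feat_sd, sd_min_q, names = FALSE))
}

# Hypothetical helper: apply previously estimated parameters to any dataset
# with the same features in the same row order.
apply_zscore <- function(feat, params) {
  (feat - params$mean) / params$sd
}

params   <- fit_zscore(ref)           # estimated on the reference only
ref_norm <- apply_zscore(ref, params)
new_norm <- apply_zscore(new, params) # "frozen": nothing is re-estimated
```

Because the parameters come from the reference data, the features of ref_norm are centered at zero, while those of new_norm generally are not. This is the intended behavior: both datasets end up on the same scale, which matters when a model trained on the reference data is applied to the new data.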

Examples

# Example data
data(siamcat_example)

# Simple example
siamcat_norm <- normalize.features(siamcat_example,
    norm.method='rank.unit')
#> Features normalized successfully.

# log.unit example
siamcat_norm <- normalize.features(siamcat_example,
    norm.method='log.unit',
    norm.param=list(log.n0=1e-05, n.p=1, norm.margin=1))
#> Features normalized successfully.

# log.std example
siamcat_norm <- normalize.features(siamcat_example,
    norm.method='log.std',
    norm.param=list(log.n0=1e-05, sd.min.q=.1))
#> Features normalized successfully.

# Frozen normalization
# normalize the object siamcat with the same parameters as used in 
# siamcat_reference
# 
# this is not run
# siamcat_norm <- normalize.features(siamcat,
#   norm.param=norm_params(siamcat_reference))