This function performs unsupervised feature filtering. Features can be filtered based on abundance, prevalence, or variance. Additionally, unmapped reads may be removed.
filter.features(siamcat, filter.method = "abundance", cutoff = 0.001, rm.unmapped = TRUE, feature.type='original', verbose = 1)
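siamcat: an object of class siamcat-class
filter.method: string, method used for filtering the features, can be one of these: 'abundance', 'cum.abundance', 'prevalence', 'variance', 'pass', defaults to 'abundance'
cutoff: float, abundance, prevalence, or variance cutoff, defaults to 0.001
rm.unmapped: boolean, should unmapped reads be discarded?, defaults to TRUE
feature.type: string, on which type of features should the function work?, defaults to 'original'
verbose: integer, controls the amount of output, defaults to 1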
Returns an object of class siamcat-class.
This function filters the features in a siamcat-class object in an unsupervised manner.
The different filter methods work in the following way:
'abundance' - remove features whose abundance is never above the cutoff value in any of the samples
'cum.abundance' - remove features with very low abundance in all samples, i.e. those that are never among the most abundant entities that collectively make up (1-cutoff) of the reads in any sample
'prevalence' - remove features with low prevalence across samples, i.e. those that are undetected (relative abundance of 0) in more than a fraction (1 - cutoff) of the samples
'variance' - remove features with low variance across samples, i.e. those that have a variance lower than the cutoff
'pass' - pass-through filtering will not change the features
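To make the filter rules more concrete, the following base R sketch applies three of the simpler ones ('abundance', 'prevalence', and 'variance') to a small made-up matrix of relative abundances. It only illustrates the rules as described above and is not the code SIAMCAT runs internally. The matrix feat, the helper functions keep_abundance, keep_prevalence, and keep_variance, and the cutoff values are all assumptions chosen for illustration, including the choice to treat values exactly at the cutoff as passing for prevalence and variance.

```r
# Toy relative-abundance matrix (invented values):
# rows are features, columns are samples, each column sums to 1.
feat <- matrix(
  c(0.6000, 0.50, 0.70, 0.55,   # f1: abundant in every sample
    0.3900, 0.45, 0.30, 0.45,   # f2: abundant in every sample
    0.0005, 0.00, 0.00, 0.00,   # f3: rare, seen in one sample only
    0.0095, 0.05, 0.00, 0.00),  # f4: moderate, seen in two samples
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("f", 1:4), paste0("s", 1:4))
)

# 'abundance': keep features whose maximum abundance exceeds the cutoff.
keep_abundance <- function(feat, cutoff) {
  apply(feat, 1, max) > cutoff
}

# 'prevalence': drop features that are undetected (abundance of 0)
# in more than a fraction (1 - cutoff) of the samples.
keep_prevalence <- function(feat, cutoff) {
  rowMeans(feat == 0) <= 1 - cutoff
}

# 'variance': drop features whose variance across samples is below the cutoff.
keep_variance <- function(feat, cutoff) {
  apply(feat, 1, var) >= cutoff
}

kept_abundance  <- rownames(feat)[keep_abundance(feat, cutoff = 0.001)]
kept_prevalence <- rownames(feat)[keep_prevalence(feat, cutoff = 0.3)]
kept_variance   <- rownames(feat)[keep_variance(feat, cutoff = 0.001)]

print(kept_abundance)
print(kept_prevalence)
print(kept_variance)
```

In this toy data the rare feature f3 is dropped by all three rules, while f4 survives the abundance and prevalence rules but not the variance rule, which shows that the methods are not interchangeable.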
Features can also be filtered repeatedly with different methods, e.g. first using the maximum abundance filtering and then using prevalence filtering. However, if a filtering method has already been applied to the dataset, SIAMCAT will by default fall back to the original features for filtering (see the feature.type argument).
# Load the package that provides filter.features
library(SIAMCAT)

# Example dataset
data(siamcat_example)

# Simple examples
siamcat_filtered <- filter.features(siamcat_example,
    filter.method='abundance',
    cutoff=1e-03)

# 5% prevalence filtering
siamcat_filtered <- filter.features(siamcat_example,
    filter.method='prevalence',
    cutoff=0.05)