This function provides a wrapper around umap::umap() that exposes the umap defaults as function arguments.
Usage
UMAP(x, ...)
# S3 method for class 'pca'
UMAP(
x,
n_neighbors = 15,
n_components = 2,
metric = "euclidean",
n_epochs = 200,
input = "data",
init = "spectral",
min_dist = 0.1,
set_op_mix_ratio = 1,
local_connectivity = 1,
bandwidth = 1,
alpha = 1,
gamma = 1,
negative_sample_rate = 5,
a = NA,
b = NA,
spread = 1,
random_state = NA,
transform_state = NA,
knn = NA,
knn_repeats = 1,
verbose = FALSE,
umap_learn_args = NA
)
# S3 method for class 'prcomp'
UMAP(
x,
metadata = NULL,
n_neighbors = 15,
n_components = 2,
metric = "euclidean",
n_epochs = 200,
input = "data",
init = "spectral",
min_dist = 0.1,
set_op_mix_ratio = 1,
local_connectivity = 1,
bandwidth = 1,
alpha = 1,
gamma = 1,
negative_sample_rate = 5,
a = NA,
b = NA,
spread = 1,
random_state = NA,
transform_state = NA,
knn = NA,
knn_repeats = 1,
verbose = FALSE,
umap_learn_args = NA
)
# S3 method for class 'matrix'
UMAP(
x,
metadata = NULL,
n_neighbors = 15,
n_components = 2,
metric = "euclidean",
n_epochs = 200,
input = "data",
init = "spectral",
min_dist = 0.1,
set_op_mix_ratio = 1,
local_connectivity = 1,
bandwidth = 1,
alpha = 1,
gamma = 1,
negative_sample_rate = 5,
a = NA,
b = NA,
spread = 1,
random_state = NA,
transform_state = NA,
knn = NA,
knn_repeats = 1,
verbose = FALSE,
umap_learn_args = NA
)
# S3 method for class 'data.frame'
UMAP(
x,
metadata = NULL,
n_neighbors = 15,
n_components = 2,
metric = "euclidean",
n_epochs = 200,
input = "data",
init = "spectral",
min_dist = 0.1,
set_op_mix_ratio = 1,
local_connectivity = 1,
bandwidth = 1,
alpha = 1,
gamma = 1,
negative_sample_rate = 5,
a = NA,
b = NA,
spread = 1,
random_state = NA,
transform_state = NA,
knn = NA,
knn_repeats = 1,
verbose = FALSE,
umap_learn_args = NA
)
# S3 method for class 'dist'
UMAP(
x,
metadata = NULL,
n_neighbors = 15,
n_components = 2,
metric = "euclidean",
n_epochs = 200,
input = "dist",
init = "spectral",
min_dist = 0.1,
set_op_mix_ratio = 1,
local_connectivity = 1,
bandwidth = 1,
alpha = 1,
gamma = 1,
negative_sample_rate = 5,
a = NA,
b = NA,
spread = 1,
random_state = NA,
transform_state = NA,
knn = NA,
knn_repeats = 1,
verbose = FALSE,
umap_learn_args = NA
)
Arguments
- x
pca object (e.g. from PCAtools::pca()), prcomp object, dist object, or numeric matrix/data.frame that can be converted to a numeric matrix
- n_neighbors
Number of nearest neighbors. Default 15
- n_components
Dimension of target (output) space. Default 2
- metric
character or function; determines how distances between data points are computed. When using a string, available metrics are: euclidean, manhattan. Other available generalized metrics are: cosine, pearson, pearson2. Note that the triangle inequality may not be satisfied by some generalized metrics, hence knn search may not be optimal. When supplying a function, the signature must be function(matrix, origin, target) and it should compute a distance between the origin column and the target columns. Default "euclidean"
- n_epochs
Number of iterations performed during layout optimization. Default 200
- input
character, use either "data" or "dist"; determines whether the primary input argument to umap() is treated as a data matrix or as a distance matrix. Default "data" ("dist" for the dist method)
- init
character or matrix. The default string "spectral" computes an initial embedding using eigenvectors of the connectivity graph matrix. An alternative is the string "random", which creates an initial layout based on random coordinates. This setting can also be set to a matrix, in which case layout optimization begins from the provided coordinates. Default "spectral"
- min_dist
numeric; determines how close points appear in the final layout. Default 0.1
- set_op_mix_ratio
numeric in range [0,1]; determines how the knn-graph is used to create a fuzzy simplicial graph. Default 1
- local_connectivity
numeric; used during construction of fuzzy simplicial set. Default 1
- bandwidth
numeric; used during construction of fuzzy simplicial set. Default 1
- alpha
numeric; initial value of "learning rate" of layout optimization. Default 1
- gamma
numeric; determines, together with alpha, the learning rate of layout optimization. Default 1
- negative_sample_rate
integer; determines how many non-neighbor points are used per point and per iteration during layout optimization. Default 5
- a
numeric; contributes to gradient calculations during layout optimization. When left at NA, a suitable value will be estimated automatically. Default NA
- b
numeric; contributes to gradient calculations during layout optimization. When left at NA, a suitable value will be estimated automatically. Default NA
- spread
numeric; used during automatic estimation of a/b parameters. Default 1
- random_state
integer; seed for random number generation used during umap(). Default NA
- transform_state
integer; seed for random number generation used during predict(). Default NA
- knn
object of class umap.knn; precomputed nearest neighbors. Default NA
- knn_repeats
number of times to restart knn search. Default 1
- verbose
logical or integer; determines whether to show progress messages. Default FALSE
- umap_learn_args
vector of arguments to python package umap-learn. Default NA
- metadata
Optional data.frame with sample-level metadata. Used if a prcomp object, dist object, or data.frame/matrix is supplied. Default NULL
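The metric argument accepts a function with the signature function(matrix, origin, target). A minimal sketch of such a function is shown below, using the Chebyshev (maximum) distance purely as an illustration. It assumes that samples are stored in columns, that origin is a single column index, and that target is a vector of column indices; the name chebyshev_metric is hypothetical and not part of any package.

```r
# Hypothetical custom metric: Chebyshev (maximum coordinate difference) distance.
# Assumption: `matrix` holds one sample per column, `origin` is one column
# index, and `target` is a vector of column indices.
chebyshev_metric <- function(matrix, origin, target) {
  # Absolute differences between the origin column and each target column
  diffs <- abs(matrix[, target, drop = FALSE] - matrix[, origin])
  # Largest difference per target column
  unname(apply(diffs, 2, max))
}

# Small check on three 2-dimensional points stored as columns
m <- cbind(c(0, 0), c(3, 1), c(1, 5))
chebyshev_metric(m, origin = 1, target = c(2, 3))
#> [1] 3 5

# Assumed usage (not run here): UMAP(x, metric = chebyshev_metric)
```

As noted for the generalized metrics, a custom function may not satisfy the triangle inequality, so the knn search may not be optimal.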
Value
data.frame with the UMAP embeddings. If metadata was supplied then metadata columns are added to the results.
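To illustrate the shape of such a result, the sketch below attaches metadata columns to a small embedding by matching row names. This is only one way the combination could be done and is not necessarily what the function does internally; the column names UMAP1 and UMAP2 and the toy values are assumptions made for the example.

```r
# Toy embedding: three samples, two UMAP dimensions (values are made up)
embedding <- matrix(
  c(0.1, 0.5, -0.3,
    1.2, -0.7, 0.4),
  ncol = 2,
  dimnames = list(c("s1", "s2", "s3"), NULL)
)

# Sample-level metadata, deliberately in a different row order
metadata <- data.frame(
  Group = c("A", "B", "A"),
  row.names = c("s3", "s1", "s2")
)

# Build the result and align metadata rows to the embedding by row name
res <- data.frame(
  UMAP1 = embedding[, 1],
  UMAP2 = embedding[, 2],
  row.names = rownames(embedding)
)
res <- cbind(res, metadata[rownames(res), , drop = FALSE])
res
```

Matching on row names rather than on position keeps each sample paired with its own metadata even when the two inputs are ordered differently.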
Examples
# Create metadata for plotting
metadata <- data.frame(row.names = colnames(GSE161650_lc))
metadata$Group <- rep(c("DMSO", "THZ1"), each = 3)
# PCA with PCAtools
p <- PCAtools::pca(GSE161650_lc, metadata, center = TRUE, scale = TRUE)
# PCA with prcomp
pr <- prcomp(t(GSE161650_lc), center = TRUE, scale. = FALSE)
# Pre-calculated distance matrix
d <- dist(t(GSE161650_lc))
# Perform UMAP on each data type
udata <- UMAP(p, n_neighbors = 2)
#> Warning: failed creating initial embedding; using random embedding instead
#> Warning: failed creating initial embedding; using random embedding instead
udata2 <- UMAP(pr, metadata, n_neighbors = 2)
#> Warning: failed creating initial embedding; using random embedding instead
#> Warning: failed creating initial embedding; using random embedding instead
udata3 <- UMAP(d, metadata, n_neighbors = 2)
#> Warning: failed creating initial embedding; using random embedding instead
#> Warning: failed creating initial embedding; using random embedding instead
# Also on raw data
udata4 <- UMAP(t(GSE161650_lc), metadata, n_neighbors = 2)
#> Warning: failed creating initial embedding; using random embedding instead
#> Warning: failed creating initial embedding; using random embedding instead
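The warnings above report that the initial embedding could not be created and that a random embedding was used instead. When a random layout is the starting point, fixing random_state is one way to make repeated runs comparable. The sketch below calls umap::umap() directly on simulated data rather than on GSE161650_lc, and assumes the umap package is installed; the wrapper documented here is assumed to pass init and random_state through unchanged.

```r
# Sketch only: simulated data, calling umap::umap() directly.
if (requireNamespace("umap", quietly = TRUE)) {
  set.seed(1)
  x <- matrix(rnorm(30 * 5), nrow = 30, ncol = 5)  # 30 samples, 5 features

  # init = "random" requests a random starting layout explicitly;
  # random_state fixes the seed used during the embedding
  fit1 <- umap::umap(x, n_neighbors = 5, init = "random", random_state = 42)
  fit2 <- umap::umap(x, n_neighbors = 5, init = "random", random_state = 42)

  # With the same seed, the two layouts are expected to agree
  same_layout <- isTRUE(all.equal(fit1$layout, fit2$layout))
  print(same_layout)
}
```

Requesting init = "random" up front states the intent explicitly when the data set is very small, instead of relying on the fallback.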