% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/perFeatureQCMetrics.R
\name{perFeatureQCMetrics}
\alias{perFeatureQCMetrics}
\alias{perFeatureQCMetrics,ANY-method}
\alias{perFeatureQCMetrics,SummarizedExperiment-method}
\title{Per-feature quality control metrics}
\usage{
perFeatureQCMetrics(x, ...)

\S4method{perFeatureQCMetrics}{ANY}(
  x,
  subsets = NULL,
  threshold = 0,
  BPPARAM = SerialParam(),
  flatten = TRUE,
  detection_limit = NULL
)

\S4method{perFeatureQCMetrics}{SummarizedExperiment}(x, ..., assay.type = "counts", exprs_values = NULL)
}
\arguments{
\item{x}{A numeric matrix of counts with cells in columns and features in rows.

Alternatively, a \linkS4class{SummarizedExperiment} or \linkS4class{SingleCellExperiment} object containing such a matrix.}

\item{...}{For the generic, further arguments to pass to specific methods.

For the SummarizedExperiment and SingleCellExperiment methods, further arguments to pass to the ANY method.}

\item{subsets}{A named list containing one or more vectors 
(a character vector of cell names, a logical vector, or a numeric vector of indices),
used to identify interesting sample subsets such as negative control wells.}

\item{threshold}{A numeric scalar specifying the threshold above which a feature is considered to be detected.}

\item{BPPARAM}{A \linkS4class{BiocParallelParam} object specifying how parallelization should be performed.}

\item{flatten}{Logical scalar indicating whether the nested \linkS4class{DataFrame}s in the output should be flattened.}

\item{detection_limit, exprs_values}{Soft-deprecated equivalents of \code{threshold} and \code{assay.type}, respectively.}

\item{assay.type}{A string or integer scalar indicating which of the \code{assays} in \code{x} contains the count matrix.}
}
\value{
A \linkS4class{DataFrame} of QC statistics where each row corresponds to a row in \code{x}.
This contains the following fields:
\itemize{
\item \code{mean}: numeric, the mean counts for each feature.
\item \code{detected}: numeric, the percentage of observations above \code{threshold}.
}

If \code{flatten=FALSE}, the output DataFrame also contains the \code{subsets} field.
This is a nested DataFrame containing per-feature QC statistics for each subset of columns.

If \code{flatten=TRUE}, \code{subsets} is flattened to remove the hierarchical structure.
}
\description{
Compute per-feature quality control metrics for a count matrix or a \linkS4class{SummarizedExperiment}.
}
\details{
This function calculates useful QC metrics for features, including the mean count across all cells
and the percentage of cells in which each feature is detected (i.e., has a count above \code{threshold}).

If \code{subsets} is specified, the same statistics are computed for each subset of cells.
This is useful for obtaining statistics for cell sets of interest, e.g., negative control wells.
These statistics are stored as nested \linkS4class{DataFrame}s in the output.
For example, if \code{subsets} contained \code{"empty"} and \code{"cellpool"}, the output would look like:
\preformatted{  output 
  |-- mean 
  |-- detected
  +-- subsets
      |-- empty
      |   |-- mean 
      |   |-- detected
      |   +-- ratio
      +-- cellpool 
          |-- mean
          |-- detected
          +-- ratio
}
The \code{ratio} field contains the ratio of the mean within each subset to the mean across all cells.

If \code{flatten=TRUE}, the nested DataFrames are flattened by concatenating the column names with underscores.
This means that, say, the \code{subsets$empty$mean} nested field becomes the top-level \code{subsets_empty_mean} field.
A flattened structure is more convenient for end-users performing interactive analyses,
but less convenient for programmatic access, as column names must then be constructed by pasting strings together.
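
As an illustration of the naming rule only, for the \code{"empty"} and \code{"cellpool"} example above,
the flattened output would be expected to contain columns named along these lines:
\preformatted{  output
  |-- mean
  |-- detected
  |-- subsets_empty_mean
  |-- subsets_empty_detected
  |-- subsets_empty_ratio
  |-- subsets_cellpool_mean
  |-- subsets_cellpool_detected
  +-- subsets_cellpool_ratio
}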
}
\examples{
example_sce <- mockSCE()
stats <- perFeatureQCMetrics(example_sce)
stats
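
# A minimal sketch of what the two top-level fields represent, computed by
# hand. This assumes that the mock data keeps its count matrix in the
# "counts" assay (the default for 'assay.type') and that the default
# threshold of 0 was used above. 'counts_mat' is a name introduced here
# purely for illustration.
counts_mat <- as.matrix(SummarizedExperiment::assay(example_sce, "counts"))
head(rowMeans(counts_mat))           # compare to head(stats$mean)
head(rowMeans(counts_mat > 0) * 100) # compare to head(stats$detected)

# The same matrix can be passed directly, here with a stricter threshold.
stats_mat <- perFeatureQCMetrics(counts_mat, threshold=5)
stats_mat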

# With subsets.
stats2 <- perFeatureQCMetrics(example_sce, subsets=list(Empty=1:10))
stats2
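
# Keeping the nested structure, which is easier to traverse in code.
# This is a sketch; the access paths follow the layout shown in the Details.
stats3 <- perFeatureQCMetrics(example_sce, subsets=list(Empty=1:10),
    flatten=FALSE)
head(stats3$subsets$Empty$mean)

# In the flattened output, the corresponding values sit under a single
# underscore-joined column name.
head(stats2$subsets_Empty_mean)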

}
\seealso{
\code{\link{addPerFeatureQC}}, to add the QC metrics to the row metadata.
}
\author{
Aaron Lun
}
