The primary entrypoint for parsing and QC of the cell hashing count matrix.
ProcessCountMatrix(
  rawCountData = NA,
  minCountPerCell = 5,
  barcodeWhitelist = NULL,
  barcodeBlacklist = c("no_match", "total_reads", "unmapped"),
  cellbarcodeWhitelist = NULL,
  doPlot = TRUE,
  simplifyBarcodeNames = TRUE,
  saveOriginalCellBarcodeFile = NULL,
  metricsFile = NULL,
  minCellsToContinue = 25,
  datatypeName = NULL
)
rawCountData: The input barcode file or umi_count folder.
minCountPerCell: Cells (columns) will be dropped if their total count is less than this value.
barcodeWhitelist: A vector of barcode names to retain.
barcodeBlacklist: A vector of barcode names to discard. An example would be an input library generated with both CITE-seq and cell hashing. In this case, it may make sense to discard the CITE-seq markers.
cellbarcodeWhitelist: If provided, the raw count matrix will be subset to include only these cells. This allows one to use the cellranger unfiltered matrix as an input, but filter based on target cells, such as those with GEX data. This can be either a character vector of barcodes or a file with one cell barcode per line.
doPlot: If true, QC plots will be generated.
simplifyBarcodeNames: If true, the sequence tag portion will be removed from the barcode names (e.g. HTO-1-ATGTGTGA -> HTO-1).
saveOriginalCellBarcodeFile: An optional file path to which the set of original cell barcodes, prior to filtering, will be written. The primary use case is when the count matrix was generated using a cell whitelist (such as cells with passing gene expression data). Preserving this list allows downstream reporting.
metricsFile: If provided, summary metrics will be written to this file.
minCellsToContinue: Demultiplexing generally requires a minimum number of cells. If the matrix contains fewer than this many cells, processing will abort.
datatypeName: For output from CellRanger >= 3.0 with multiple data types, the result of Seurat::Read10X is a list. You need to supply the name of the Antibody Capture
The updated count matrix
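
The filtering arguments described above can be illustrated with a small base R sketch. This is not the function's actual implementation: the toy matrix, the cell names, the order of the steps, and the regular expression used to strip the sequence tag are all assumptions made for illustration. Only the argument names and the default blacklist come from the usage above.

```r
# Toy count matrix: rows are barcodes, columns are cells.
# All names and values are invented for illustration.
counts <- matrix(
  c(10, 0, 1, 7,
     2, 9, 0, 0,
     3, 1, 1, 4),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    c("HTO-1-ATGTGTGA", "HTO-2-CCGTATAG", "unmapped"),
    c("cellA", "cellB", "cellC", "cellD")
  )
)

barcodeBlacklist <- c("no_match", "total_reads", "unmapped")
cellbarcodeWhitelist <- c("cellA", "cellB", "cellC")
minCountPerCell <- 5
minCellsToContinue <- 2  # lowered from the default of 25 to suit the toy data

# Discard blacklisted barcode rows (barcodeBlacklist).
counts <- counts[!(rownames(counts) %in% barcodeBlacklist), , drop = FALSE]

# Keep only the target cells (cellbarcodeWhitelist).
counts <- counts[, colnames(counts) %in% cellbarcodeWhitelist, drop = FALSE]

# Drop cells whose total count is below the threshold (minCountPerCell).
counts <- counts[, colSums(counts) >= minCountPerCell, drop = FALSE]

# Remove a trailing sequence tag from barcode names (simplifyBarcodeNames).
# The pattern is an assumption about how tags are formatted.
rownames(counts) <- sub("-[ACGTN]+$", "", rownames(counts))

# Abort if too few cells remain (minCellsToContinue).
if (ncol(counts) < minCellsToContinue) {
  stop("Too few cells remain after filtering")
}

print(counts)
```

In this toy case, the "unmapped" row is removed by the blacklist, cellD is excluded because it is not in the cell whitelist, and cellC is dropped because its total count falls below minCountPerCell.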