Runs the default processing pipeline

CallAndGenerateReport(
  rawCountData,
  reportFile,
  callFile,
  rawFeatureMatrixH5 = NULL,
  barcodeWhitelist = NULL,
  barcodeBlacklist = c("no_match", "total_reads", "unmapped"),
  cellbarcodeWhitelist = "inputMatrix",
  methods = c("bff_cluster", "gmm_demux", "dropletutils"),
  methodsForConsensus = NULL,
  minCountPerCell = 5,
  title = NULL,
  metricsFile = NULL,
  rawCountsExport = NULL,
  skipNormalizationQc = FALSE,
  keepMarkdown = FALSE,
  molInfoFile = NULL,
  majorityConsensusThreshold = NULL,
  callerDisagreementThreshold = NULL,
  doTSNE = FALSE,
  datatypeName = NULL,
  maxAllowableDoubletRate = "auto",
  minAllowableDoubletRateFilter = 0.15
)
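
As a rough illustration of the signature above, a minimal call might look like the sketch below. The file paths and the report title are placeholders, and the sketch assumes the function is exported by the cellhashR package; neither is stated in this page. All other arguments are left at their defaults.

```r
# All paths and the title below are placeholders; replace them with your own values.
args <- list(
  rawCountData = "path/to/umi_count",
  reportFile = "report.html",
  callFile = "calls.txt",
  methods = c("bff_cluster", "gmm_demux", "dropletutils"),
  minCountPerCell = 5,
  title = "Cell hashing report"
)

# Only attempt the call when the package (assumed name: cellhashR) and the
# input are actually available, so the sketch is safe to source as-is.
if (requireNamespace("cellhashR", quietly = TRUE) && file.exists(args$rawCountData)) {
  do.call(cellhashR::CallAndGenerateReport, args)
}
```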

Arguments

rawCountData

The input barcode file or umi_count folder

reportFile

The file to which the HTML report will be written

callFile

The file to which the table of calls will be written

rawFeatureMatrixH5

The 10x h5 gene expression count file. This is required only when demuxEM or demuxmix is among the methods used.

barcodeWhitelist

A vector of barcode names to retain.

barcodeBlacklist

A vector of barcode names to discard. For example, if the input library was generated with both CITE-seq and cell hashing, it may make sense to discard the CITE-seq markers.
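
The effect of the two barcode filters can be sketched on a toy matrix. This is only an illustration in base R, not the package's internal code: the matrix, the "CITE-CD3" marker name, and the treatment of a NULL whitelist as "no restriction" are assumptions.

```r
# Toy count matrix: rows are barcode (feature) names, columns are cells.
# All names and counts are made up for illustration.
counts <- matrix(
  c(120,   3,  0,
      2,  95,  1,
     40,  38,  2,
    500, 480, 20),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    c("HTO-1", "HTO-2", "CITE-CD3", "unmapped"),
    c("cell1", "cell2", "cell3")
  )
)

# Default blacklist plus one hypothetical CITE-seq marker.
barcodeBlacklist <- c("no_match", "total_reads", "unmapped", "CITE-CD3")
barcodeWhitelist <- NULL  # assumed to mean "no whitelist restriction"

# Drop blacklisted rows, then (if a whitelist is given) keep only listed rows.
keep <- !(rownames(counts) %in% barcodeBlacklist)
if (!is.null(barcodeWhitelist)) {
  keep <- keep & rownames(counts) %in% barcodeWhitelist
}
filtered <- counts[keep, , drop = FALSE]
rownames(filtered)
```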

cellbarcodeWhitelist

Either a vector of expected barcodes (such as all cells with passing gene expression data), a file with one cellbarcode per line, or the string 'inputMatrix'. If 'inputMatrix' is provided, the set of cellbarcodes present in the original unfiltered count matrix will be stored and used for reporting. This allows the report to count cells that were filtered due to low counts separately from negative/non-callable cells.

methods

The set of methods to use for calling. See GenerateCellHashingCalls for options.

methodsForConsensus

By default, a consensus call will be generated using all methods; however, if this parameter is provided, all algorithms specified by methods will be run, but only the list here will be used for the final consensus call. This allows one to see the results of a given caller without using it for the final calls.

minCountPerCell

Cells (columns) will be dropped if their total count is less than this value.
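
The filter described above amounts to a column-sum cutoff. A minimal base R sketch on made-up counts (not the package's own implementation):

```r
# Toy matrix: rows are hashing barcodes, columns are cells (values are made up).
counts <- matrix(
  c(120,  3, 0,
      2, 95, 1),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("HTO-1", "HTO-2"), c("cell1", "cell2", "cell3"))
)

minCountPerCell <- 5

# Total count per cell; cells below the cutoff are dropped.
totals <- colSums(counts)
kept <- counts[, totals >= minCountPerCell, drop = FALSE]
colnames(kept)
```

Here cell3 has a total count of 1, so it is the only column removed.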

title

A title for the HTML report

metricsFile

If provided, summary metrics will be written to this file.

rawCountsExport

If provided, the raw count matrix, after processing, will be written as an RDS object to this file. This can be useful for debugging.

skipNormalizationQc

If true, the normalization/QC plots will be skipped. These can be time consuming on large input data.

keepMarkdown

If true, the markdown file will be saved, in addition to the HTML file

molInfoFile

An optional path to the 10x molecule_info.h5.

majorityConsensusThreshold

This applies to calculating a consensus call when multiple algorithms are used. If NULL, then all non-negative calls must agree or that cell is marked discordant. If non-NULL, then the number of algorithms returning the top call is divided by the total number of non-negative calls. If this ratio is above the majorityConsensusThreshold, that call is selected. For example, when majorityConsensusThreshold=0.6 and the calls are: HTO-1,HTO-1,Negative,HTO-2, then 2 of the 3 non-negative calls are for HTO-1, giving a ratio of approximately 0.67. This is greater than the majorityConsensusThreshold of 0.6, so HTO-1 is returned. This can be useful for situations where most algorithms agree, but a single caller fails.
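
The rule above can be sketched for a single cell as follows. The function name consensus_call and the labels "Negative" and "Discordant" are illustrative assumptions, and ties for the top call are not handled; this is not the package's internal code.

```r
# Consensus for one cell, given one call per algorithm.
# consensus_call and the label strings are hypothetical names for illustration.
consensus_call <- function(calls, majorityConsensusThreshold = NULL,
                           negativeLabel = "Negative") {
  nonNegative <- calls[calls != negativeLabel]
  if (length(nonNegative) == 0) {
    return(negativeLabel)
  }

  # Count each distinct non-negative call and take the most frequent one.
  tally <- sort(table(nonNegative), decreasing = TRUE)
  topCall <- names(tally)[1]
  ratio <- tally[[1]] / length(nonNegative)

  if (is.null(majorityConsensusThreshold)) {
    # No threshold: every non-negative call must agree.
    if (ratio == 1) topCall else "Discordant"
  } else {
    if (ratio > majorityConsensusThreshold) topCall else "Discordant"
  }
}

calls <- c("HTO-1", "HTO-1", "Negative", "HTO-2")
consensus_call(calls, majorityConsensusThreshold = 0.6)  # ratio is 2/3
consensus_call(calls)                                    # strict agreement required
```

With the threshold of 0.6 the first call yields HTO-1; without a threshold the same calls are discordant because HTO-2 disagrees.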

callerDisagreementThreshold

If provided, the disagreement rate will be calculated between each caller and the simple majority call, ignoring discordant and no-call cells. If any caller has a disagreement rate above this threshold, it will be dropped and the consensus call re-calculated. The general idea is to drop a caller that is systematically discordant.
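
One way to picture this check is the sketch below. It assumes that a cell is ignored when either the majority call or the caller's own call is "Discordant" or "Negative"; the exact set of ignored labels, the function name disagreement_rate, and the toy calls are all assumptions made for illustration.

```r
# Fraction of usable cells where a caller differs from the majority call.
# Cells are skipped when either side carries an ignored label (assumption).
disagreement_rate <- function(callerCalls, majorityCalls,
                              ignored = c("Discordant", "Negative")) {
  usable <- !(majorityCalls %in% ignored) & !(callerCalls %in% ignored)
  if (!any(usable)) {
    return(NA_real_)
  }
  mean(callerCalls[usable] != majorityCalls[usable])
}

# Made-up majority calls and per-caller calls for five cells.
majority <- c("HTO-1", "HTO-2", "HTO-1", "Discordant", "HTO-2")
calls <- list(
  callerA = c("HTO-1", "HTO-2", "HTO-1", "HTO-1", "HTO-2"),
  callerB = c("HTO-2", "HTO-1", "HTO-1", "HTO-2", "Negative")
)

rates <- vapply(calls, disagreement_rate, numeric(1), majorityCalls = majority)

# Callers above the threshold would be dropped before re-computing consensus.
callerDisagreementThreshold <- 0.5
toDrop <- names(rates)[!is.na(rates) & rates > callerDisagreementThreshold]
toDrop
```

In this toy data callerB disagrees on two of its three usable cells, so it is the only caller flagged.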

doTSNE

If true, tSNE will be run on results as part of QC. This can be memory intensive and is not strictly needed, so it can be skipped if desired.

datatypeName

For output from CellRanger >= 3.0 with multiple data types, the result of Seurat::Read10X is a list. You need to supply the name of the Antibody Capture

maxAllowableDoubletRate

Per caller, the doublet rate will be computed as the total doublets / total droplets (including negatives). Any individual caller with a doublet rate above this value will be converted to NoCall. Note: if 'auto' is chosen, the value will be selected as 3x the theoretical doublet rate.

minAllowableDoubletRateFilter

This is the lower bound allowed for maxAllowableDoubletRate. This is primarily used to avoid excessively low values when selecting 'auto' for maxAllowableDoubletRate.
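
The interaction between maxAllowableDoubletRate and minAllowableDoubletRateFilter can be sketched as below. The theoretical doublet rate is taken as an input because this page does not say how it is derived. The helper name resolve_max_doublet_rate, the "Doublet" label, and applying the lower bound to explicit (non-'auto') values as well are assumptions.

```r
# Resolve the effective cutoff. theoreticalDoubletRate is supplied by the
# caller of this sketch; how the pipeline estimates it is not covered here.
resolve_max_doublet_rate <- function(maxAllowableDoubletRate,
                                     theoreticalDoubletRate,
                                     minAllowableDoubletRateFilter = 0.15) {
  rate <- if (identical(maxAllowableDoubletRate, "auto")) {
    3 * theoreticalDoubletRate
  } else {
    as.numeric(maxAllowableDoubletRate)
  }
  # Never go below the configured lower bound.
  max(rate, minAllowableDoubletRateFilter)
}

# Made-up calls from one caller across ten droplets.
callerCalls <- c("HTO-1", "Doublet", "Negative", "HTO-2", "Doublet",
                 "HTO-1", "HTO-2", "Negative", "HTO-1", "HTO-2")

# Doublets divided by all droplets, negatives included.
doubletRate <- mean(callerCalls == "Doublet")

cutoff <- resolve_max_doublet_rate("auto", theoreticalDoubletRate = 0.04)
exceeds <- doubletRate > cutoff
```

With a theoretical rate of 0.04, three times that value (0.12) falls below the lower bound, so the cutoff becomes 0.15. This caller's rate of 0.2 exceeds it, which under the rule above would convert the caller's results to NoCall.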