Calls the functions to run each of the three steps of the pipeline (similarity calculation, noise quantification, noise removal), with the specified parameters. See the individual function documentation for more details and required arguments. Required steps: calculate_expression_similarity_transcript, calculate_noise_threshold. remove_noise_from_bams. Optional steps: calculate_noise_threshold_method_statistics

noisyr_transcript(
  bams = NULL,
  path.bams = ".",
  genes = NULL,
  path.gtf = list.files(".", pattern = "\\.g[tf]f$"),
  ncores = 1,
  similarity.threshold = 0.25,
  method.chosen = "Boxplot-IQR",
  ...
)

Arguments

bams, path.bams

either a path to the directory where the BAM files are or a vector of paths to each individual file; if a path is specified, it extracts all files that end in .bam; looks in the working directory by default

genes

a tibble of the exons extracted from the gtf file; this is meant for speed if the output of cast_gtf_to_genes is already generated, or if the user wants to only calculate similarity for a subset of exons

path.gtf

the path to the gtf/gff annotation file (only used if genes is not provided); if unspecified, looks for one in the working directory

ncores

Number of cores for parallel computation; defaults to sequential computation, but parallelisation is highly encouraged; it is set to detectCores() if higher

similarity.threshold, method.chosen

parameters passed on to calculate_noise_threshold; they can be single values or vectors; if they are vectors optimal values are computed by calling calculate_noise_threshold_method_statistics and minimising the coefficient of variation across samples; all possible values for method.chosen can be viewed by get_methods_calculate_noise_threshold; only boxplot based methods are accepted for the transcript approach due to the number of observations and high variance

...

arguments to be passed on to individual pipeline steps; see their documentation for more details and required arguments

Value

The denoised BAM files are created, as specified by the destination.files argument of remove_noise_from_bams()

See also

Examples

bams <- rep(system.file("extdata", "ex1.bam", package="Rsamtools", mustWork=TRUE), 2) genes <- data.frame("id" = 1:2, "gene_id" = c("gene1", "gene2"), "seqid" = c("seq1", "seq2"), "start" = 1, "end" = 1600) noisyr_transcript( bams = bams, genes = genes, destination.files = paste0(tempdir(), "/", basename(bams), ".noisefiltered.bam") )
#> >>> noisyR transcript approach pipeline <<<
#> Calculating expression similarity for 2 genes...
#> this process may take a long time...
#> ncores=1, running sequentially...
#> Warning: the standard deviation is zero
#> Warning: the standard deviation is zero
#> Finished! Time elapsed: 0.12 secs
#> Similarity calculation produced too many NAs, returning zero...
#> Denoising 2 BAM files...
#> filtering genes using the noise thresholds
#> doing filtering by gene
#> kept 2 entries out of 2
#> denoising BAM file 1 of 2
#> denoising BAM file 2 of 2
#> >>> Done! <<<